The Case for A Data Mining Approach to Technical Analysis
If I’m so smart, how come I’m not rich yet??
The Case for Data Mining
You Before Finance 9790
You After Finance 9790
1. TA Is a Multivariate Recurrent Prediction Problem
2. The Five Tasks of a Recurrent Prediction Problem
1) Define the Target (Y)
2) Propose a List of Candidate Predictors (X’s)
3) Build a Data Base of Solved Examples
4) Select X’s
5) Determine the Prediction Function
3. Humans & Computers: Complementary Information Processing Abilities
• Humans Uniquely Able to Handle Tasks 1, 2 & 3, But Poor at Tasks 4 & 5
• Data Mining Algorithms Optimal for Tasks 4 & 5
4. TA Practitioners Should Partner Up With Data Mining Algorithms
5. TA Practitioners Should Abandon Outdated Methods & Focus on Their Proper Role in a Human / Machine Partnership
[Diagram: Data Mining Practitioner, Data Mining Software, Data Bases]
There Are Two Kinds of Prediction Problems
1. Regression: predicting the FUTURE value of a continuous variable
2. Classification: predicting the class of an object (situation)
In Both Regression & Classification
• The target variable concerns something that is not yet known!!
• We use information that is known to make the prediction
Two Kinds of Prediction Problems
1. Regression: we wish to predict the FUTURE value of a continuous variable
• This variable is referred to as the dependent variable, the target variable, or Y
• The target variable in a regression problem is a continuous variable: it can assume any value within a range
– Example: the % change in the S&P500 from now (t0) to a point in time 90 days into the future (t+90)
2. Classification: we wish to predict the class of an object whose class is not yet known
• The target variable in a classification problem is a discrete variable: it assumes a limited number of discrete values or names, e.g. (0, 1), (+1, 0, -1), (benign / malignant)
– Example 1: the future class of a company with respect to solvency (bankrupt / non-bankrupt)
– Example 2: the future trend of the market over the next 90 days (up / down)
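The same price history can feed either kind of target. The sketch below is only an illustration: the function names and the toy price list are assumptions, and the horizon is counted in observations rather than calendar days.

```python
def forward_return(prices, t0, horizon):
    """Regression target: % change from t0 to t0 + horizon."""
    return 100.0 * (prices[t0 + horizon] - prices[t0]) / prices[t0]


def forward_class(prices, t0, horizon):
    """Classification target: +1 if the forward return is positive, else -1."""
    return 1 if forward_return(prices, t0, horizon) > 0 else -1


# Toy price series (made up for illustration)
prices = [100.0, 102.0, 101.0, 105.0, 99.0]

print(forward_return(prices, 0, 3))  # 5.0  (100 -> 105)
print(forward_class(prices, 1, 3))   # -1   (102 -> 99)
```

The regression target keeps the size of the move; the classification target keeps only its direction.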
What Is A Recurrent Multivariate Prediction Problem?
1. The same type of prediction is required over and over again.
2. The same set of information is available each time a prediction is required
• The information is a set of values for each of a multitude of variables
• These variables are referred to as independent variables, predictors, candidate predictors, indicators, etc.
Recurrent Decision Problems
Examples: Classification Problems (Does the Object Belong to Class 1 or Class 2?)
1. The same type of prediction is required over and over again.
– Medicine: Is a given tumor malignant or benign?
– Oil Exploration: At a given location, is there oil or no oil? (Drill / Don’t Drill)
– Marketing: Is a given consumer a likely buyer or non-buyer for our product or service?
– Credit Approval: Is a given loan applicant likely to repay or default? (Lend / Don’t Lend)
– Technical Analysis: Is the market more likely to advance or decline? (Buy / Sell)
Recurrent Decision Problems
Examples: Regression Problems (The Future Value of a Continuous Y Variable)
1. The same type of prediction is required over and over again.
– Medicine: survival time for someone with disease X
– Oil Exploration: amount of oil a new well is likely to produce
– Marketing: What are the likely sales of a product?
– Technical Analysis:
• How much will the S&P500 appreciate over the next month?
• By how much will stock A beat the market over the next month?
Recurrent Decision Problem
2. The same set of information is available each time a decision is required
• Information is a set of values for a multitude of variables
Multivariate Information Set: measured values for a multitude of variables
• Medicine: set of results on medical tests (blood pressure, cholesterol level, blood sugar, etc.)
• Oil Exploration: set of values for various geological parameters
• Marketing: set of demographic factors describing the person (zip code, owns car yes/no, etc.)
• Credit Approval: set of credit factors describing the loan applicant (# years at current address, number of credit cards, payment history)
Technical Analysis Information Set: a multitude of indicator readings at a given point in time
1. Close / moving average = 1.075
2. 10 day ma / 50 day ma = 1.067
3. RSI Indicator = 74
4. 5 day ma volume / 25 day ma volume
5. VIX (Implied Volatility on Stock Options)
6. Ratio of Insider Sales / Purchases
7. Ratio of Upside / Downside Volume
[Chart: two points in time, each characterized by its indicator values]
• One point in time is characterized by the indicator values 75.5, -2.1, -.55
• Another point in time is characterized by the indicator values 62.1, +0.1, -.02
In Other Words: There Are 3 Candidate Predictor Variables.
We Can Treat This as a Classification Problem
Class 1: Market Return over the next 20 days is > 0
Class 2: Market Return over the next 20 days is < 0
The Target Variable: the thing we wish to predict is a discrete variable that can assume 2 values, > 0 or < 0 (we can call these Class 1 and Class 2)
Do these predictors (indicators) enable us to classify (discriminate) future up-moves from future down-moves, Class 1 from Class 2?
[Chart: indicator values are read at t0; the outcome is measured at t+20]
Getting Matters of Time Straight: t0 and t+20
• t0 refers to the date on which the prediction or classification is made
– This is the date of the most recent values of the predictor variables
• t+20 or t+n refers to a time in the future that the target variable (Y) refers to
– In the bankruptcy prediction problem it is any time over the following two years.
– So the forward-looking horizon of the target need not be a fixed date.
Value of Y Is Based on Future Information; Values of X’s Are Based on Past and Current Information
[Timeline: Past | t0 | Future]
• Value of Target (Y) is based on what happens in the future, from t0 until t+n
• Values of Predictors (X) are based on what happens in the past and up to the present, from t-n until t0
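One way to keep the two sides of t0 apart in code is to slice the history explicitly. This is a minimal sketch: the function name, the parameters, and the toy series are assumptions, and time is measured in observations.

```python
def split_at_t0(series, t0, lookback, horizon):
    """Return (past, future) windows around the prediction date t0.

    past   : values from t0 - lookback through t0, the only data X's may use
    future : values after t0 through t0 + horizon, usable only to compute Y
    """
    if t0 - lookback < 0 or t0 + horizon >= len(series):
        raise ValueError("not enough history or future data for this t0")
    past = series[t0 - lookback : t0 + 1]
    future = series[t0 + 1 : t0 + horizon + 1]
    return past, future


closes = [10, 11, 12, 13, 14, 15, 16, 17]  # made-up closing prices
past, future = split_at_t0(closes, t0=4, lookback=2, horizon=3)
print(past)    # [12, 13, 14]
print(future)  # [15, 16, 17]
```

An indicator computed only from `past` cannot leak future information into the X's.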
Task 1: Define The Target Variable (Y), The Single Variable We Wish to Predict
1. Define the type of the problem: Classification or Regression
A. Classification (Discrimination): Y is defined as one of 2 or more distinct classes
• Benign / Malignant
• Lend / Don’t Lend
• Buy / Sell
• Strong Buy / Weak Buy / Weak Sell / Strong Sell
B. Regression: Y is a continuous quantity (linear regression)
• Future % increase in the market
• Predicted amount of future purchases
Task 2: Propose Candidate Predictors (X’s)
• These are merely candidates because we don’t know yet if any will be useful for predicting the target Y
• Predictors must be based on data known at the time the prediction is made: look back in time from the present
– Tomorrow’s closing price: No
– Today’s closing price or prior closing prices: Yes
• Not all indicators need to be useful, but some must be. Success in predictive modeling requires that some candidate predictors have useful information about the quantity or class to be predicted (Y)
Task 2 Is Crucial!!!!! If Not Done Well… All Is Lost
1. It is the TASK of the domain expert… (YOU)
2. The expert must know which raw data series may contain relevant information
– Price
– Volume
– Open interest
– Interest rates, etc.
3. The expert proposes useful ways to transform raw series into indicators
– That expose the information in the raw data series to the data mining algorithm
– For example, in our problem the X’s must be stationary.
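As an illustration of such a transformation, the raw closing price can be divided by its own moving average. The sketch below uses a hypothetical helper name and made-up prices. It shows only that the ratio no longer depends on the price level, which is one ingredient of stationarity, not a guarantee of it.

```python
def close_to_ma_ratio(closes, t0, window):
    """Close at t0 divided by the mean of the last `window` closes ending at t0."""
    recent = closes[t0 - window + 1 : t0 + 1]
    return closes[t0] / (sum(recent) / window)


low_priced = [10.0, 10.5, 11.0, 11.5, 12.0]   # made-up prices
high_priced = [c * 100 for c in low_priced]   # same shape at 100x the level

r_low = close_to_ma_ratio(low_priced, t0=4, window=5)
r_high = close_to_ma_ratio(high_priced, t0=4, window=5)

# Raw closes differ by a factor of 100, yet the indicator reads the same.
print(round(r_low, 4), round(r_high, 4))
```

A raw price of 12 and a raw price of 1200 would look like different situations to an algorithm; the ratio presents them as the same one.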
Skipping Task 3 For a Moment
(Building the Data Base of Solved Examples From Which the DM Algorithm Learns the Model)
Tasks 4 & 5
4. Selecting indicators from the candidate list that warrant a place in the prediction model
– Determining which candidates contain relevant, non-redundant information about Y
– The set of indicators that work synergistically
5. Determining the prediction function
– What is the mathematical or logical formula for combining the values of the X’s to best estimate the value of Y?
– A complex configural reasoning problem
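To make "relevant, non-redundant" concrete, here is a deliberately simple stand-in for Task 4: a greedy correlation filter. It is an assumption for illustration, not the method of any particular data mining package. The thresholds and toy data are invented, and plain correlation only detects linear relevance, so it would miss synergistic (configural) effects.

```python
from statistics import mean


def pearson(u, v):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mu, mv = mean(u), mean(v)
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = (sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v)) ** 0.5
    return num / den if den else 0.0


def select_indicators(candidates, y, min_relevance=0.3, max_redundancy=0.9):
    """Keep candidates correlated with y but not too correlated with each other."""
    ranked = sorted(candidates, key=lambda k: abs(pearson(candidates[k], y)), reverse=True)
    chosen = []
    for name in ranked:
        if abs(pearson(candidates[name], y)) < min_relevance:
            break  # every remaining candidate is even less relevant
        if all(abs(pearson(candidates[name], candidates[c])) < max_redundancy for c in chosen):
            chosen.append(name)
    return chosen


y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
candidates = {
    "X1": [3.0, 5.0, 7.0, 9.0, 11.0, 13.0],   # relevant
    "X2": [1.2, 1.8, 3.3, 3.7, 5.2, 5.8],     # relevant but redundant with X1
    "X3": [1.0, -1.0, -1.0, 1.0, 1.0, -1.0],  # little relation to y
}
print(select_indicators(candidates, y))  # ['X1']
```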
What Is A Prediction Function?
• A mathematical or logical formula for combining the selected indicators to produce a best estimate of the target variable.
• Simplest:
– 1-predictor model
– linear shape: y = ax1 + b
– b is the value of the Y intercept of the line; a is the slope of the line
Simplest Prediction Model: 1 predictor & flat (no hills or valleys) in the model’s surface
[Chart: a straight line of Y against X1 with Y intercept = b. For a given value of X1, the model predicts the corresponding value of Y.]
Multiple Linear Regression: Combines Two or More X’s in a Linear Way to Predict the Value of Y
• In multiple linear regression the combining function is assumed to be linear (weighted sum)
• Y = a1X1 + a2X2 + a3X3 + … + anXn + c
• Regression coefficients (weights) are found by the method of Least-Squares
Modern Data-Miners Need Not Assume a Linear Form. They Allow the Data Mining Algorithm to Discover It.
It May Be Non-Linear & Arbitrarily Complex
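For readers who want to see the least-squares step, the sketch below finds the weights a1 … an and the constant c by solving the normal equations in plain Python. The helper names and the noise-free toy data are assumptions for illustration. Real work would normally use a numerical library.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_least_squares(X, y):
    """Return [a1, ..., an, c] minimizing the sum of squared prediction errors."""
    rows = [list(x) + [1.0] for x in X]  # the appended 1.0 carries the intercept c
    k = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(XtX, Xty)


# Toy data generated from Y = 2*X1 - 1*X2 + 0.5, with no noise
X = [(1.0, 2.0), (2.0, 1.0), (3.0, 5.0), (4.0, 2.0), (5.0, 7.0)]
y = [2 * x1 - x2 + 0.5 for x1, x2 in X]

a1, a2, c = fit_least_squares(X, y)
print(round(a1, 6), round(a2, 6), round(c, 6))  # approximately 2.0 -1.0 0.5
```

Because the toy data were built from a known linear formula, the fit should recover that formula's weights up to rounding error.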
Linear Model: Flat Response (Y) Surface. Y Is a Linear Function of Two Features X1, X2
Y = A X1 + B X2 + C (“A” and “B” are slopes, “C” is the intercept)
[Chart: a tilted flat surface over the X1, X2 plane, with Y on the vertical axis]
The Linear Model Is the Best-Fitting Tilted Flat Surface to the Data
The Model’s Prediction Is the Altitude of the Y Surface Corresponding to the Values of X1 and X2
Given this value of X1 and this value of X2, the model predicts this value of Y
Thinking of A Prediction Model’s Output As A Super Indicator
A new indicator that condenses & combines the information in two or more indicators (variables) into a new or super indicator
Model Output As a “Super Indicator”
• The output of a prediction model is a new variable, produced by a function found by regression analysis
• The function is a weighted sum of the indicators serving as inputs to the model (X1, X2, etc.)
• The function’s weights have been optimized to transform the values of the inputs into a best estimate of the target (Y).
– The method of least-squares is used to find the optimal weights
– The weights cause the line or plane to fit the historical data
Multiple Linear Regression: Combines Two or More X’s in a Linear Way to Predict the Value of Y
• In multiple linear regression the combining function is assumed to be linear (additive)
• Y = a1X1 + a2X2 + a3X3 + … + anXn + c
But what if the true shape of the relationship between the indicators (X1 … Xn) and Y is not a tilted flat surface… but something more complex????
Modern Data-Miners Do Not Assume the Model Surface Is Linear (free of hills and valleys)
They Allow the Data Mining Algorithm to Discover Its Shape, Which May Be Non-Linear
Suppose the Authentic Relationship Between X1 & X2 and Y Looks Like This: Y = f(X1, X2)
[Chart: a curved surface with hills and valleys. Axes: X1, X2 (TA indicator X2), Y (future trend)]
Financial Markets Are Most Likely to Be Complex Non-Linear Systems
Forcing a Linear Model to Describe a Non-Linear Phenomenon Misses the Boat!
• In some regions the linear model’s predictions are too low
• In other regions the linear model’s predictions are too high
The Model Fails to Capture the Authentic Patterns in the Data
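A one-predictor toy case shows how badly a straight line can miss. Everything in this sketch is an assumption made for illustration: the valley-shaped data, the helper names, and above all the fact that the curvature is handed to the model as a squared term, whereas a data mining algorithm would be expected to discover the shape itself.

```python
def fit_line(x, y):
    """Least-squares slope a and intercept b for y = a*x + b."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, my - a * mx


def r_squared(y, y_hat):
    """Share of the variance in y explained by the predictions y_hat."""
    my = sum(y) / len(y)
    ss_res = sum((yi - hi) ** 2 for yi, hi in zip(y, y_hat))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


x = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
y = [xi ** 2 for xi in x]  # a valley: high at both extremes, low in the middle

a, b = fit_line(x, y)
r2_linear = r_squared(y, [a * xi + b for xi in x])

x_sq = [xi ** 2 for xi in x]  # let the model see the curvature
a2, b2 = fit_line(x_sq, y)
r2_curved = r_squared(y, [a2 * v + b2 for v in x_sq])

print(r2_linear, r2_curved)  # 0.0 1.0
```

The straight line comes out flat: it predicts too high in the valley and too low at the extremes, and it explains none of the variation.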
Tasks 4 & 5 Must Be Performed by Data Mining Software
[Diagram: candidate predictors X1, X2, X3, X4, X5, … Xn (a set of indicators proposed by the human expert) feed a complex system, Y = f(x), whose outcome Y is to be predicted. The combining function f is unknown.]
Task 4: Which, if any, of the candidate predictors contain information relevant to Y?
Task 5: What is the shape of the mathematical function that best combines the indicators into a predicted value of Y?
Task 5 Note!!
When the DM Method Used Is Multiple Linear Regression, the Prediction Function Is Assumed to Be Linear
Human Experts & Data Mining Algorithms Have Different But Complementary Information Processing Abilities
They Synergize
Where Humans Are Strong, DM Algorithms Are Weak
Where Human Experts Are Weak, DM Algorithms Are Strong
Definition: Configural Thinking
A multitude of variables (indicators) must be considered simultaneously as an inseparable configuration (pattern).
Considering each variable individually will not provide the correct conclusion.
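A small made-up case shows what "inseparable configuration" means. The rule below, that the market rises only when two indicators disagree, is purely hypothetical and chosen for illustration. Each indicator alone predicts no better than a coin flip, yet the pair predicts perfectly.

```python
# Four equally likely situations: (X1 state, X2 state) -> outcome.
# Hypothetical rule: the market rises only when the two indicators disagree.
cases = [
    ((+1, +1), "down"),
    ((+1, -1), "up"),
    ((-1, +1), "up"),
    ((-1, -1), "down"),
]


def accuracy(rule):
    """Fraction of cases in which rule(x1, x2) names the actual outcome."""
    hits = sum(1 for (x1, x2), outcome in cases if rule(x1, x2) == outcome)
    return hits / len(cases)


only_x1 = accuracy(lambda x1, x2: "up" if x1 > 0 else "down")
only_x2 = accuracy(lambda x1, x2: "up" if x2 > 0 else "down")
both = accuracy(lambda x1, x2: "up" if x1 != x2 else "down")

print(only_x1, only_x2, both)  # 0.5 0.5 1.0
```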
Human Intelligence Strengths & Weaknesses:
• Creative
– Posing problems (Y)
– Proposing candidate indicators (X’s)
• Weak Configural Reasoning
– Distinguishing relevant from irrelevant X’s
– Combining multiple variables
Machine Intelligence (Data Mining) Weaknesses & Strengths
• Lack Creativity
– Unable to pose questions (define Y)
– Unable to propose candidate indicators (define X’s)
• Excellent ability to handle numerous variables simultaneously (configural)
– Can identify relevant non-redundant indicators
– Can formulate multivariate prediction functions
Who or What Should Handle the 5 Tasks?
1. Define Y
2. Propose Candidate Indicators (X’s)
3. Build Data Base of Solved Cases
4. Indicator Selection: which candidate X’s are relevant and non-redundant
5. Determining the optimal combining function: a mathematical model that combines useful X’s into a prediction or classification decision
Tasks 4 & 5: A Task for Automated Data Mining Algorithms
The Evidence
Studies of Human Experts Solving Multivariate Recurrent Prediction Problems Show…
1. Experts realize the necessity for configural reasoning (combining variables in a complex non-linear fashion)
2. Experts are under the impression that they are combining information in a complex configural manner, but studies show…
3. Experts rely primarily on simple linear rules for combining information
4. Their performance is poor
– Inconsistent: the same set of information elicits different decisions on different occasions (correlation .6)
– Correlation among experts is also low
Technical Analyst Faced With Large Set Of Conflicting Indicators
• 5 bullish factors & 3 bearish factors
• Let each bullish factor = +1 & each bearish factor = -1
• Sum of bullish factors = +5; sum of bearish factors = -3
Human Experts (Technical Analysts) Rely on Intuitive Linear Combining
+5 – 3 = +2: “I’m bullish”
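Written out as code, the analyst's tally is a weighted sum in which every weight is fixed at 1. The function name and the "neutral" case are assumptions added for completeness.

```python
# +1 for each bullish reading, -1 for each bearish reading (5 bullish, 3 bearish)
readings = [+1, +1, +1, +1, +1, -1, -1, -1]


def unit_weight_vote(signals):
    """Sum the signals with equal weights and map the total to a stance."""
    total = sum(signals)
    if total > 0:
        return total, "bullish"
    if total < 0:
        return total, "bearish"
    return total, "neutral"


print(unit_weight_vote(readings))  # (2, 'bullish')
```

Every factor counts the same here, whether or not it is relevant or duplicates another factor. A least-squares fit would instead estimate a separate weight for each indicator from historical data.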
Comparing the Subjective Predictions of Experts With Multiple Linear Regression Models
Studies Began in 1954
The Question: How accurate are the predictions of humans compared to multiple linear regression models given the same set of indicators?
Expert’s Subjective Predictions vs. Multiple Regression Models
[Bar chart: r² of predicted vs. actual outcomes for Expert and Model in each of 9 studies; the vertical axis runs from -0.1 to 0.9. Fields labeled: Academic, Cancer survival, Stocks, Mental ill., Student Att., Business Failure, Teach. effective, Sales Effective.]
Expert mean r² = 0.11
Model mean r² = 0.38
Meta-Analysis of 135 Similar Studies
[Diagram: Study 1, Study 2, Study 3, … Study n]
Draws a Conclusion From Multiple Independent Studies
Swets, Monahan & Dawes, 2000
• Meta-analysis of >135 studies comparing 3 decision making methods.
1. Expert / intuitive (subjective) judgment based on anecdotal experience & informal reasoning.
2. Statistical models.
3. Combination of methods #1 & #2.
Wide Variety of Disciplines Were Examined in the 135 Studies.
• Fields
– Medical diagnosis
– Penology (parole recidivism, violence)
– Psychology (diagnosis and treatment selection)
– Education (predicting success in academics)
– Predicting football game outcomes
• Results were quite consistent across fields
Results of Meta-Analysis: 135 Studies
• In 96% of the studies, regression models beat or were equal to expert judgment.
• In medical diagnosis expert judgment was always worse than regression model.
• Experts beat statistical models in only 6 studies.
The Question: With All This Evidence, Why Do Experts Insist on Making Subjective / Intuitive Predictions & Decisions?
Bottom Line For Technical Analysis
Aronson’s Editorial Opinion: When Making Predictions, Rely On Objective Statistical Models, Not Subjective Judgment
Task #3: Build Data Base Of Solved Examples
The Data (Experience) Base Is Used By the Data Mining Algorithms to Learn How to Build The Prediction Model
This task often takes 90-95% of the time when developing a data-mined model
Data Base of Solved Examples: Known Values of “Y”
• What is a “solved example”? A case (situation, example, etc.) for which the value of the target variable is known, as well as the values of the X’s (candidate predictors)
– The value of Y is known because the case happened in the past
– Even though Y is forward-looking, the case occurred long enough ago that the value of Y is known.
• Each case in the data base is described by 2 kinds of information
1. The value for the target variable Y
2. The values for the candidate predictors
Examples of A Solved Case
A. 1 day of market history for the S&P500
1. Y value: % change over the month following the date of the case (regression)
2. X values: values of the indicators on the date of the case
B. An oil drilling site
1. Y value: did the site produce oil or not (class)
2. X values: values of 10 geophysical parameters characterizing the site
C. 1 company
1. Y value: company failed or did not fail within the next 2 years
2. X values: values of various financial ratios taken from the most recent balance sheet and income statement
Data Base of Solved Examples
• Contains many cases (typically thousands)
– Why so many? Data density.
• From the many cases the DM algorithm tries to discover
– Which, if any, of the candidate predictors can solve the regression or classification problem (Task #4)
– How the selected predictors should be combined mathematically or logically to give the most accurate estimate possible of the value of the target (Y) (Task #5)
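A data base of solved examples can be pictured as one row per past date, holding that date's indicator values and the outcome that later became known. The sketch below assumes a single price series, two illustrative indicators, and an up / not-up class target. The function name, the indicators, and the prices are all hypothetical.

```python
def build_solved_examples(closes, window, horizon):
    """One row per date t0 with enough history for the X's and enough future for Y.

    Assumes window >= 2 so that the one-period return X2 is defined.
    """
    rows = []
    for t0 in range(window - 1, len(closes) - horizon):
        ma = sum(closes[t0 - window + 1 : t0 + 1]) / window
        x1 = closes[t0] / ma                               # close / moving average
        x2 = closes[t0] / closes[t0 - 1] - 1.0             # one-period return
        y = 1 if closes[t0 + horizon] > closes[t0] else 0  # class: up (1) or not (0)
        rows.append({"t0": t0, "X1": x1, "X2": x2, "Y": y})
    return rows


closes = [100.0, 102.0, 101.0, 104.0, 103.0, 100.0, 106.0, 99.0]  # made-up closes
rows = build_solved_examples(closes, window=3, horizon=2)

for row in rows:
    print(row["t0"], round(row["X1"], 4), round(row["X2"], 4), row["Y"])
```

The two most recent dates produce no rows: their indicator values exist, but their Y is not yet known, so they are not solved examples.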
[Table: Matrix of Examples With Known Values of Both X’s & Y. Rows are solved examples (Examp. 1, Examp. 2, Examp. 3, Examp. 4, … Examp. N); columns are the candidate indicators X1 … XN, holding values such as 1.2, -2.5, -5.1, plus the target Y, holding 1 or 0.]
Human Intelligence: Unchanging
Computer Power & Machine Intelligence: Growing Exponentially
Moore’s Law: An Increasing Competitive Advantage to the Data Miners
[Chart: Power (arithmetic scale) vs. Time]
The A,B,C’s of Being An Intelligent Technical Analyst
A. Know How to Use Data Mining Tools
B. Know How to Define Data Mining Problems (Define Y)
C. Know How to Define a List of Information-Rich Candidate Predictors (X’s)