The Case for A Data Mining Approach to Technical Analysis

If I’m so smart, how come I’m not rich yet?

The Case for Data Mining

You Before Finance 9790

You After Finance 9790

1. TA Is a Multivariate Recurrent Prediction Problem

2. The Five Tasks of a Recurrent Prediction Problem
1) Define the Target (Y), 2) Propose a List of Candidate Predictors (X’s), 3) Build a Data Base of Solved Examples, 4) Select X’s, 5) Determine the Prediction Function

3. Humans & Computers: Complementary Information Processing Abilities
Humans Uniquely Able to Handle Tasks 1, 2 & 3 But Poor at Tasks 4 & 5
Data Mining Algorithms Optimal for Tasks 4 & 5

4. TA Practitioners Should Partner Up With Data Mining Algorithms

5. TA Practitioners Should Abandon Outdated Methods & Focus On Their Proper Role in a Human / Machine Partnership

[Diagram: Data Mining, linking the Practitioner, Data Mining Software, and Data Bases]



There Are Two Kinds of Prediction Problems

1. Regression: predicting the FUTURE value of a continuous variable

2. Classification: predicting the class of an object (situation)

In Both Regression & Classification

The target variable concerns something that is not yet known!

We use information that is known to make the prediction

Two Kinds of Prediction Problems

1. Regression: we wish to predict the FUTURE value of a continuous variable

• This variable is referred to as the dependent variable, the target variable, or Y

• The target variable in a regression problem is a continuous variable: it can assume any value within a range
– Example: the % change in the S&P500 from now (t0) to a point in time 90 days into the future (t+90)

2. Classification: we wish to predict the class of an object whose class is not yet known

• The target variable in a classification problem is a discrete variable: it assumes a limited number of discrete values or names, e.g. (0, 1), (+1, 0, -1), (benign / malignant)
– Example 1: the future class of a company with respect to solvency (bankrupt / non-bankrupt)
– Example 2: the future trend of the market over the next 90 days (up / down)

What Is A Recurrent Multivariate Prediction Problem?

1. The same type of prediction is required over and over again.

2. The same set of information is available each time a prediction is required

• The information is a set of values for each of a multitude of variables

• These variables are referred to by the names “independent variables”, “predictors”, “candidate predictors”, “indicators”, etc.

Recurrent Decision Problems

Examples: Classification Problem. Does the Object Belong to Class 1 or Class 2?

1. The same type of prediction is required over and over again.
– Medicine: Is a given tumor malignant or benign?
– Oil Exploration: At a given location, is there Oil or No Oil? (Drill / Don’t Drill)
– Marketing: Is a given consumer a likely buyer or non-buyer for our product or service?
– Credit Approval: Is a given loan applicant likely to Repay or Default? (Lend / Don’t Lend)
– Technical Analysis: Is the market more likely to advance or decline? (Buy / Sell)

Recurrent Decision Problems

Examples: Regression Problem. The Future Value of a Continuous Y Variable

1. The same type of prediction is required over and over again.
– Medicine: survival time for someone with disease X
– Oil Exploration: amount of oil a new well is likely to produce
– Marketing: What are the likely sales of a product?
– Technical Analysis:
• How much will the S&P500 appreciate over the next month?
• By how much will stock A beat the market over the next month?

Recurrent Decision Problem

2. The same set of information is available each time a decision is required
• Information is a set of values for a multitude of variables

Multivariate Information Set: measured values for a multitude of variables

Medicine: set of results on medical tests (blood pressure, cholesterol level, blood sugar, etc.)

Oil Exploration: set of values for various geological parameters

Marketing: set of demographic factors describing the person (zip code, owns car yes/no, etc.)

Credit Approval: set of credit factors describing the loan applicant (# years at current address, number of credit cards, payment history)

Technical Analysis Information Set: multitude of Indicator Readings at a given point in time

1. close / moving average = 1.075
2. 10 day ma / 50 day ma = 1.067
3. RSI Indicator = 74
4. 5 day ma volume / 25 day ma volume
5. VIX (Implied Volatility on Stock Options)
6. Ratio of Insider Sales / Purchases
7. Ratio of Upside / Downside Volume

[Figure: two points in time, each characterized by its indicator values: (75.5, -2.1, -.55) and (62.1, +0.1, -.02)]

In Other Words: There Are 3 Candidate Predictor Variables.

We can treat this as a Classification Problem

Class 1: Market Return over the next 20 days is > 0

Class 2: Market Return over the next 20 days is < 0

The Target Variable: The Thing We Wish To Predict. It Is a Discrete Variable that can Assume 2 Values: > 0 or < 0 (we can call these Class 1 and Class 2)

Do These Predictors (Indicators) Enable Us to Classify (Discriminate) Future Up-Moves from Future Down-Moves? Class 1 from Class 2?

[Figure: the point in time t0 is characterized by its indicator values, (75.5, -2.1, -.55) or (62.1, +0.1, -.02); the outcome is measured from t0 to t+20]

Getting Matters of Time Straight: t0 and t+20

• t0 refers to the date on which the prediction or classification is made
– This is the date of the most recent values of the predictor variables

• t+20 or t+n refers to a time in the future that the target variable (Y) refers to
– In the bankruptcy prediction problem it is any time over the following two years.
– So the forward-looking horizon of the target need not be a fixed date.

Value of Y is based on Future Information; Values of X’s are based on Past and Current Information

[Timeline: Past | t0 | Future]

Value of Target (Y) is based on what happens out in the future, from t0 until t+n

Values of Predictors (X) are based on what happens in the past and up to the present, from t-n until t0
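The timing rule can be made concrete with a small sketch. This is only an illustration under assumptions: the function name `make_example`, the 10-day lookback, the use of a close / moving average ratio as the single predictor, and the price list are all hypothetical. The point it shows is that X reads only prices from t-n up to t0, while Y reads only the price at t+n.

```python
def make_example(closes, t0, lookback=10, horizon=20):
    """Return (x, y) for the date index t0, or None if data is missing.

    x uses only closes[t0 - lookback + 1 .. t0]  (past and present).
    y uses closes[t0 + horizon]                  (the future).
    """
    if t0 - lookback + 1 < 0 or t0 + horizon >= len(closes):
        return None  # not enough history, or the future is not yet known
    window = closes[t0 - lookback + 1 : t0 + 1]
    moving_average = sum(window) / lookback
    x = closes[t0] / moving_average  # predictor, knowable at t0
    future_return = closes[t0 + horizon] / closes[t0] - 1.0
    # Assumption: 1 stands for Class 1 (return > 0), 0 for everything else.
    y = 1 if future_return > 0 else 0
    return x, y


# Hypothetical, steadily rising price series (not real market data).
closes = [100.0 + i for i in range(40)]
example = make_example(closes, t0=15)
```

For the most recent dates the function returns None, because the value of Y for those dates has not happened yet.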








Task 1: Define The Target Variable (Y), The Single Variable We Wish to Predict

1. Define the type of the problem: Classification or Regression

A. Classification (Discrimination): Y is defined as a class, with 2 or more distinct classes
• Benign / Malignant
• Lend / Don’t Lend
• Buy / Sell
• Strong Buy / Weak Buy / Weak Sell / Strong Sell

B. Regression: Y is a continuous quantity (linear regression)
• Future % increase in the market
• Predicted amount of future purchases

Task 2: Propose Candidate Predictors (X’s)

These are merely candidates because we don’t know yet if any will be useful for predicting the target Y

Predictors must be based on data known at the time the prediction is made: look back in time from the present
– Tomorrow’s closing price: No
– Today’s closing price or prior closing prices: Yes

Not all indicators need to be useful, but some must be. Success in predictive modeling requires that some candidate predictors have useful information about the quantity or class to be predicted (Y)

Task 2 is crucial! If not done well, all is lost

1. This is the TASK of the domain expert (YOU)

2. The expert must know which raw data series may contain relevant information
– Price
– Volume
– Open interest
– Interest rates, etc.

3. The expert proposes useful ways to transform raw series into indicators that expose the information in the raw data series to the data mining algorithm
– For example, in our problem the X’s must be stationary.
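As a rough illustration of such a transform, the sketch below turns a trending price series into a close / moving average ratio, one of the indicator types listed earlier in the deck. The function names and the synthetic price series are assumptions made for the example; the ratio is shown only as one plausible way to remove the drift in price level, not as a prescribed method.

```python
def moving_average(series, n):
    """Trailing n-period simple moving average; None until n values exist."""
    out = []
    for i in range(len(series)):
        if i + 1 < n:
            out.append(None)
        else:
            out.append(sum(series[i + 1 - n : i + 1]) / n)
    return out


def close_to_ma_ratio(closes, n):
    """Indicator: close divided by its own trailing moving average."""
    ma = moving_average(closes, n)
    return [None if m is None else c / m for c, m in zip(closes, ma)]


# Hypothetical price that grows 1% per bar, so its level keeps drifting up.
closes = [100.0 * 1.01 ** i for i in range(60)]
ratio = close_to_ma_ratio(closes, 10)
```

In this synthetic series the raw close rises by more than half between bar 9 and bar 59, while the ratio takes the same value at both bars, so readings from different dates are comparable.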

Skipping Task 3 For a Moment

Building the Data Base of Solved Examples From Which the DM Algorithm Learns the Model

Tasks 4 & 5

4. Selecting indicators from the Candidate List that warrant a place in the prediction model
– Determining which candidates contain relevant, non-redundant information about Y
– The set of indicators that work synergistically

5. Determining the prediction function
– What is the mathematical or logical formula for combining the values of the X’s to best estimate the value of Y?
– A complex configural reasoning problem
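To give a feel for what "relevant" and "non-redundant" mean in Task 4, here is a deliberately simple sketch. It is a stand-in, not what data mining software does: it screens candidates one at a time by correlation, so it cannot find indicators that are only useful in combination. The function names, thresholds, indicator names and data are all hypothetical.

```python
from statistics import mean


def pearson(a, b):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / (va * vb) ** 0.5


def select_indicators(candidates, y, min_relevance=0.3, max_redundancy=0.9):
    """Greedy filter: keep candidates correlated with y but not with each other."""
    ranked = sorted(candidates,
                    key=lambda name: abs(pearson(candidates[name], y)),
                    reverse=True)
    chosen = []
    for name in ranked:
        if abs(pearson(candidates[name], y)) < min_relevance:
            continue  # irrelevant: carries little information about y
        if any(abs(pearson(candidates[name], candidates[c])) > max_redundancy
               for c in chosen):
            continue  # redundant: repeats an indicator already chosen
        chosen.append(name)
    return chosen


# Hypothetical solved examples: six cases, three candidate indicators.
y = [1, 2, 3, 4, 5, 6]
candidates = {
    "indicator_a": [1, 2, 3, 4, 5, 6],         # relevant
    "indicator_a_x2": [2, 4, 6, 8, 10, 12],    # relevant but redundant with a
    "noise": [1, -1, -1, 1, 1, -1],            # nearly unrelated to y
}
chosen = select_indicators(candidates, y)
```

Only one of the two duplicated indicators survives, and the noise series is dropped.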

What Is A Prediction Function?

• A mathematical or logical formula for combining the selected indicators to produce a best estimate of the target variable.

• Simplest:
– 1 predictor model
– linear shape: Y = aX1 + b
– b is the value of the Y intercept of the line; a is the slope of the line

Simplest Prediction Model: 1 predictor & flat (no hills or valleys) in the model’s surface

[Figure: straight line of Y against X1 with Y intercept = b. For a given value of X1, the model predicts the corresponding value of Y on the line.]
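A one-predictor linear model can be fitted and used in a few lines. The sketch below uses the standard closed-form least-squares solution for a single predictor; the function names and the four data points are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares slope a and intercept b for Y = a*X1 + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx            # slope of the line
    b = mean_y - a * mean_x  # Y intercept of the line
    return a, b


def predict(a, b, x1):
    """For this value of X1, the model predicts this value of Y."""
    return a * x1 + b


# Hypothetical points that lie exactly on Y = 2*X1 + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
a, b = fit_line(xs, ys)
y_hat = predict(a, b, 4.0)
```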

Multiple Linear Regression Combines Two or More X’s in a Linear Way to Predict the Value of Y

• In multiple linear regression the combining function is assumed to be linear (weighted sum)

• Y = a1X1 + a2X2 + a3X3 + … + anXn + c

Regression coefficients (weights) are found by the method of Least-Squares
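For readers who want to see the least-squares step, here is a minimal sketch that solves the normal equations with plain Gaussian elimination. It assumes a small, well-conditioned problem; the helper names and the data are hypothetical, and a real project would normally rely on an established statistics library instead.

```python
def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z


def fit_multiple(rows, ys):
    """Return [a1, ..., an, c] minimizing squared error of Y = sum(ai*Xi) + c."""
    design = [list(r) + [1.0] for r in rows]  # constant column carries c
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical cases generated from Y = 2*X1 - 3*X2 + 5.
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
ys = [5, 7, 2, 4, 6]
a1, a2, c = fit_multiple(rows, ys)
```

Because the made-up data lie exactly on a plane, the fit recovers the weights 2 and -3 and the constant 5 up to rounding.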

Modern Data-Miners Need Not Assume a Linear Form. They Allow the Data Mining Algorithm to Discover It.

It May Be Non-Linear & Arbitrarily Complex

Linear Model: Flat Response (Y) Surface. Y Is a Linear Function of Two Features X1, X2

Y = A X1 + B X2 + C, where “A” is the slope along X1, “B” is the slope along X2, and “C” is the intercept

The Linear Model Is the Best-Fitting Tilted Flat Surface to the Data

The Model’s Prediction is the Altitude of the Y Surface Corresponding to the Values of X1 and X2

[Figure: tilted plane over the X1, X2 axes. Given a value of X1 and a value of X2, the model predicts the value of Y at that point on the plane.]

Thinking of a Prediction Model’s Output as a Super Indicator

A new indicator that condenses & combines the information in two or more indicators (variables) into a new or super indicator

Model Output as a “Super Indicator”

• The output of a prediction model is a new variable, produced by a function found by regression analysis

• The function is a weighted sum of the indicators serving as inputs to the model (X1, X2, etc.)

• The function’s weights have been optimized to transform values of the inputs into a best estimate of the target (Y).
– The method of least-squares is used to find the optimal weights
– The weights cause the line or plane to fit the historical data


But What If the True Shape of the Relationship Between the Indicators (X1 … Xn) and Y Is Not a Tilted Flat Surface, But Something More Complex?


Modern Data-Miners Do Not Assume the Model Surface Is Linear (Free of Hills and Valleys)

They Allow the Data Mining Algorithm to Discover Its Shape, Which May Be Non-Linear

Suppose the Authentic Relationship Between X1 & X2 and Y Looks Like This: Y = f(X1, X2)

[Figure: non-linear response surface with hills and valleys. Axes: X1 and X2 (TA indicators); Y (future trend).]

Forcing a Linear Model to Describe a Non-Linear Phenomenon Misses the Boat!

Financial Markets Are Most Likely to Be Complex Non-Linear Systems

In some regions the Linear Model’s Predictions are Too Low; in others they are Too High

The Model Fails to Capture the Authentic Patterns in the Data
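A toy case makes the failure visible. In the sketch below the true relationship is assumed to be Y = X squared, a hypothetical choice made only because it has an obvious valley. The best-fitting straight line comes out flat: it predicts too high in the valley, too low at the edges, and explains none of the variation in Y.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for a one-predictor linear model."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def r_squared(ys, predictions):
    """Share of the variation in Y captured by the predictions."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Hypothetical non-linear relationship: Y depends on X, but not linearly.
xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [x ** 2 for x in xs]

a, b = fit_line(xs, ys)
predictions = [a * x + b for x in xs]
r2 = r_squared(ys, predictions)

too_high_at_center = predictions[3] > ys[3]  # model says 4, truth is 0
too_low_at_edge = predictions[0] < ys[0]     # model says 4, truth is 9
```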

Tasks 4 & 5 Must Be Performed by Data Mining Software

[Diagram: Candidate Predictors, a set of indicators proposed by the human expert (X1, X2, X3, X4, X5, … Xn), feed a complex system Y = f(x) whose outcome Y is to be predicted. The combining function f is unknown.]

Task 4: Which, if any, of the candidate predictors contain information relevant to Y?

Task 5: What is the shape of the mathematical function that best combines the indicators into a predicted value of Y?


Task 5 Note!!

When the DM Method Used Is Multiple Linear Regression, the Prediction Function Is Assumed to Be Linear








Human Experts & Data Mining Algorithms Have Different But Complementary Information Processing Abilities

They Synergize

Where Humans Are Strong, DM Algorithms Are Weak
Where Human Experts Are Weak, DM Algorithms Are Strong

Definition: Configural Thinking

A multitude of variables (indicators) must be considered simultaneously as an inseparable configuration (pattern).

Considering each variable individually will not provide the correct conclusion.
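A tiny invented example shows what "inseparable configuration" means. Assume, purely for illustration, that the market rises (1) only when two indicators agree in sign and falls (0) when they disagree. No rule that looks at one indicator alone does better than a coin flip, while a rule that looks at both together is always right.

```python
# Hypothetical cases: ((X1 sign, X2 sign), outcome). Not real market data.
cases = [
    ((+1, +1), 1),
    ((+1, -1), 0),
    ((-1, +1), 0),
    ((-1, -1), 1),
]


def accuracy(rule):
    """Fraction of cases where the rule's prediction matches the outcome."""
    return sum(rule(x) == y for x, y in cases) / len(cases)


# Every single-indicator rule: "predict up when indicator i has sign s".
single_rules = [
    (lambda x, i=i, s=s: 1 if s * x[i] > 0 else 0)
    for i in (0, 1)
    for s in (+1, -1)
]
best_single = max(accuracy(rule) for rule in single_rules)

# Configural rule: looks at the pattern formed by both indicators together.
joint = accuracy(lambda x: 1 if x[0] * x[1] > 0 else 0)
```

Here `best_single` is 0.5 and `joint` is 1.0: the information lives in the combination, not in either indicator on its own.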

Human Intelligence Strengths & Weaknesses:

• Creative
– Posing Problems (Y)
– Proposing candidate indicators (X’s)

• Weak Configural Reasoning
– Distinguishing relevant from irrelevant X’s
– Combining multiple variables

Machine Intelligence (Data Mining) Weaknesses & Strengths

• Lack Creativity
– Unable to pose questions (define Y)
– Unable to propose candidate indicators (define X’s).

• Excellent ability to handle numerous variables simultaneously (Configural)
– Can identify relevant non-redundant indicators.
– Can formulate multivariate prediction functions.

Who or What Should Handle the 5 Tasks?

1. Define Y
2. Propose Candidate Indicators (X’s)
3. Build Data Base of Solved Cases
4. Indicator Selection: which Candidate X’s are relevant and non-redundant
5. Determining the optimal combining function: a mathematical model that combines useful X’s into a prediction or classification decision

Tasks 4 & 5: A Task for Automated Data Mining Algorithms

The Evidence

Studies of Human Experts Solving Multivariate Recurrent Prediction Problems Show:

1. Experts realize the necessity for configural reasoning (combining variables in a complex non-linear fashion)

2. Experts are under the impression that they are combining information in a complex configural manner, but studies show otherwise

3. Experts rely primarily on simple linear rules for combining information

4. Their performance is poor
– Inconsistent: the same set of information elicits different decisions on different occasions (correlation .6)
– Correlation among experts is also low

Technical Analyst Faced With a Large Set of Conflicting Indicators

5 bullish factors & 3 bearish factors
Let each bullish factor = +1 & each bearish factor = -1
Sum of bullish factors = +5; sum of bearish factors = -3

Human Experts (Technical Analysts) Rely on Intuitive Linear Combining

+5 – 3 = +2: “I’m bullish”
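The tally above is itself a linear model in which every weight equals 1. A short sketch, with hypothetical function and variable names, makes that explicit: the analyst's count and a regression-style weighted sum are the same formula, differing only in how the weights are chosen.

```python
def net_score(signals, weights=None):
    """Weighted sum of indicator signals; defaults to equal weights of 1."""
    if weights is None:
        weights = [1.0] * len(signals)
    return sum(w * s for w, s in zip(weights, signals))


# 5 bullish factors (+1 each) and 3 bearish factors (-1 each).
signals = [+1] * 5 + [-1] * 3
score = net_score(signals)

if score > 0:
    view = "bullish"
elif score < 0:
    view = "bearish"
else:
    view = "neutral"
```

Passing a `weights` list fitted from historical data, instead of leaving the default, would turn the same function into the weighted-sum form of a multiple linear regression without its constant term.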

Comparing the Subjective Predictions of Experts With Multiple Linear Regression Models

Studies Began in 1954

The Question: How accurate are the predictions of humans compared to multiple linear regression models given the same set of indicators?

Expert’s Subjective Predictions vs. Multiple Regression Models

[Chart: r2 of predicted vs. actual for Expert and Model across studies: Academic, Cancer survival, Stocks, Mental ill., Student Att., Business Failure, Teach. effective, Sales Effective. Expert mean r2 = 0.11; Model mean r2 = 0.38.]

Meta-Analysis of 135 Similar Studies

Draws a Conclusion From Multiple Independent Studies (Study 1, Study 2, Study 3, … Study n)

Swets, Monahan & Dawes, 2000

• meta-analysis of >135 studies comparing 3 decision making methods.

1. Expert / intuitive (subjective) judgment based on anecdotal experience & informal reasoning.

2. Statistical models.

3. Combination of methods #1 & #2.

Wide Variety of Disciplines Were Examined in the 135 Studies.

• Fields
– Medical diagnosis
– Penology (parole recidivism, violence)
– Psychology (diagnosis and treatment selection)
– Education (predicting success in academics)
– Predicting football game outcomes

• Results were quite consistent across fields

Results of Meta-Analysis: 135 Studies

• In 96% of the studies, regression models beat or were equal to expert judgment.

• In medical diagnosis expert judgment was always worse than regression model.

• Experts beat statistical models in only 6 studies.

The Question: With All This Evidence, Why Do Experts Insist on Making Subjective / Intuitive Predictions & Decisions?

Bottom Line For Technical Analysis

Aronson’s Editorial Opinion

When Making Predictions, Rely On Objective Statistical Models, Not Subjective Judgment








Task #3

Build Data Base of Solved Examples

The Data (Experience) Base Is Used By the Data Mining Algorithms to Learn How to Build The Prediction Model

This task often takes 90-95% of the time when developing a Data Mined Model

Data Base of Solved Examples: Known Values of “Y”

• What is a “solved example”? A case (situation, example, etc.) for which the value of the target variable is known as well as the values of the X’s (candidate predictors)
– The value of Y is known because the case happened in the past
– Even though Y is forward-looking, the case occurred long enough ago that the value of Y is known.

• Each case in the data base is described by 2 kinds of information
1. The value for the target variable Y.
2. The values for the candidate predictors

Examples of A Solved Case

A. 1 day of market history for the S&P500
1. Y value: % change over the month following the date of the case (regression)
2. X values: values of the indicators on the date of the case

B. An oil drilling site
1. Y value: did the site produce oil or not (class)
2. X values: values of 10 geophysical parameters characterizing the site

C. 1 company
1. Y value: company failed or did not fail within the next 2 years
2. X values: values of various financial ratios taken from the most recent balance sheet and income statement

Data Base of Solved Examples

• Contains many cases (typically thousands)
– Why so many? Data density.

• From the many cases the DM algorithm tries to discover
– Which, if any, of the candidate predictors can solve the regression or classification problem (Task #4)
– How the selected predictors should be combined mathematically or logically to give the most accurate estimate possible of the value of the target Y (Task #5)

[Table: Matrix of Examples With Known Values of Both X’s & Y. Rows are the cases (Examp. 1, Examp. 2, Examp. 3, Examp. 4, … Examp. N); columns are the Candidate Indicators X1 … XN plus the target Y. Each cell holds an indicator reading such as 1.2, -2.5 or -5.1; each Y is 0 or 1.]

Human Intelligence: Unchanging

Computer Power & Machine Intelligence: Growing Exponentially

Moore’s Law: An Increasing Competitive Advantage to the Data Miners

[Chart: Power (arithmetic scale) vs. Time]

The A,B,C’s of Being An Intelligent Technical Analyst

A. Know How to Use Data Mining Tools

B. Know How to Define Data Mining Problems (Define Y)

C. Know How to Define a List of Information-Rich Candidate Predictors (X’s)
