Comparing Performance of Distributed
Computing Platforms in Backtesting
FINRA's Limit Up/Down Rules
An InformationWeek Financial Services Webcast
Sponsored by
Webcast Logistics
Today’s Presenters
Michael Kane,
Associate Research Scientist,
Yale Center for Analytical Sciences
Casey King,
Executive Director,
Yale Center for Analytical Sciences
© 2011 IBM Corporation
Information Management
IBM Netezza Analytics Appliance + Revolution R Enterprise vs. the Cloud and R
Comparing Performance of Distributed Computing Platforms using Applications in Backtesting FINRA’s Limit Up/Down Rules
“One of the Most Terrifying Moments in Wall Street History. . .”
“A bad day in the stock market turned into one of the most terrifying moments in Wall Street history. . .It lasted just 16 minutes but left Wall Street experts and ordinary investors alike struggling to come to grips with what had happened -- and fearful of where the markets might go from here.” (New York Times, May 7, 2010)
Flash Crash May 6, 2010
Trader Steven Rickard reacts in the S&P 500 futures pit at the CME Group in Chicago
near the close of trading on Thursday, May 6, 2010. The stock market that day had
one of its most turbulent sessions ever, with the Dow Jones Industrial Average plunging
nearly 1,000 points in a half-hour before recovering two-thirds of its losses. (AP)
Agenda
• Flash Crash
  • What was the Flash Crash?
  • What was the reaction from the market and policy makers?
  • What is the best way to evaluate SEC policy?
• Backtesting
  • How do you go about backtesting?
  • What are the challenges?
  • What is the model for backtesting SEC policy?
  • Three technological approaches for evaluating SEC policy:
    • Workstation + R
    • Cloud + R
    • IBM Netezza + Revolution R Enterprise
• Wrap-up
  • Conclusions
  • Resources
  • Q&A
SEC Institutes Circuit Breaker Rules
• Volatility Spurs Market Fear
• Governing Board Assesses
• SEC Institutes Circuit Breaker Rules
SEC Approves New Stock-by-Stock Circuit Breaker Rules
FOR IMMEDIATE RELEASE
2010-98
Washington, D.C., June 10, 2010 — The Securities and Exchange
Commission today approved rules that will require the exchanges and
FINRA to pause trading in certain individual stocks if the price
moves 10 percent or more in a five-minute period.
Rationale Behind Circuit Breakers
Goal
To control volatility during extreme trading conditions
Halt
Stop trading in the event of extreme swings in stock price
Intervention
Gives human traders time to intervene
Market Response to SEC Policy
“What the S.E.C. has recommended is working. Had they done this two months ago, there never would have been a Flash Crash.”
- Patrick J. Healy, Issuer Advisory Group (advisor to public companies on how and where to list their shares for trading)
Evaluating Volatility Rules
The Circuit Breaker rules were created based on the opinion and
experience of experts
These rules are evaluated through pilot programs
Can disrupt “normal” market behavior
May not be tested in extreme volatility
Should we be evaluating these rules with live markets?
A Call to Action
Anecdotal evidence and policy makers’ “opinions” should be considered insufficient to determine SEC policy.
Policy decisions must be data driven.
Alternative: Utilize Market Data to Make Data-Driven Policy Decisions
Trade data has been collected for decades now
Records of the exchange of trillions of individual stocks
Provides insight into market behavior over a wide variety of
conditions
Used by hedge-funds and banks to evaluate and calibrate trading
strategies
“Standard practice” in financial service industry
Proposal: Backtest Rules for Controlling Volatility
Evaluate the rules based on historical data
Ensures bad rules don't negatively affect market behavior
Provides a quantitative approach for evaluating market policy
More efficient mechanism for evaluating and refining rules
Polling Question 1
• Does your organization use backtesting as part of its standard operating
procedure for testing analytical models?
• A. yes
• B. no
• C. I don’t know
Circuit Breakers “Exposed”: The illusion of safety is often more dangerous than the surety of risk
Conclusion: The rules are not effective in stopping catastrophic
events like the Flash Crash
We also showed that circuit breakers tend to trigger during normal
market conditions
Was this more a symbolic than a substantive regulatory
measure in the face of intense political pressure?
Should circuit breaker rules be modified to address broader market
volatility?
SEC Announces Filing of Limit Up-Limit Down Proposal to Address Extraordinary Market Volatility
FOR IMMEDIATE RELEASE 2011-84
Washington, D.C., April 5, 2011 – The Securities and Exchange Commission today announced that national securities exchanges and the Financial Industry Regulatory
Authority (FINRA) today filed a proposal to establish a new “limit up-limit down” mechanism to address extraordinary market volatility in U.S. equity markets.
Limit Up/Down
• What is Limit Up/Down and how will the SEC evaluate whether it’s a
proposal worth making policy?
• First Challenge: Getting a clear articulation of exactly what the rules are.
Limit Up/Down Requirements Per the SEC Website
• The proposed “Limit Up-Limit Down” mechanism would prevent trades in
listed equity securities from occurring outside of a specified price band,
which would be set at a percentage level above and below the average
price of the security over the immediately preceding five-minute period.
For stocks currently subject to the circuit breaker pilot, the percentage
would be 5 percent, and for those not subject to the pilot, the percentage
would be 10 percent.
• The percentage bands would be doubled during the opening and closing
periods, and broader price bands would apply to stocks priced below
$1.00. To accommodate more fundamental price moves, there would be a
five-minute trading pause – similar to the pause triggered by the current
circuit breakers – if trading is unable to occur within the price band for
more than 15 seconds.
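The band arithmetic in the SEC description can be sketched in a few lines of R. This is only an illustration: the function name `price_band`, its arguments, and the use of a simple (unweighted) mean of trade prices as the "average price" are assumptions, since the quoted text does not define the average.

```r
# Sketch only: price band around a reference price.
# Assumption: the reference is the simple mean of the trade prices seen in
# the preceding five minutes (the SEC text does not say which average).
price_band <- function(prices, pct, open_or_close = FALSE) {
  stopifnot(length(prices) > 0, pct > 0)
  # Bands are doubled during the opening and closing periods.
  if (open_or_close) pct <- 2 * pct
  ref <- mean(prices)
  c(lower = ref * (1 - pct), reference = ref, upper = ref * (1 + pct))
}

# Hypothetical prices from the last five minutes, 5% band (pilot stock)
band <- price_band(c(100, 101, 99), pct = 0.05)
print(band)

# Same prices during the opening period: the band widens to 10%
band_open <- price_band(c(100, 101, 99), pct = 0.05, open_or_close = TRUE)
print(band_open)
```

A trade would be allowed only if its price falls between `lower` and `upper`.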
Questions Unanswered By SEC Website
1. What is the average price? Is it computed from the bid, the
ask, or the trades executed?
2. Is the 5 minute window a sliding window, or is it contiguous
blocks?
3. If the window slides, by what increments does it slide?
(millisecond, second, minute?)
4. Is the average a volume weighted average, or simple
average?
It’s challenging to build a model to evaluate the rules given this
much uncertainty and fluidity. This requires a flexible analytic platform.
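Each of these open questions becomes a parameter of the model. The sketch below shows how two of them (simple vs. volume-weighted average, contiguous blocks vs. a sliding window) change the reference price. The trade data and all function names are hypothetical, chosen only to illustrate the difference.

```r
# Hypothetical trades: seconds since the open, price, and size.
trades <- data.frame(
  t     = c(10, 100, 250, 320, 410, 590),
  price = c(20.0, 20.2, 19.8, 20.4, 20.6, 20.1),
  size  = c(100, 300, 100, 500, 200, 100)
)

# Question 4: simple average vs. volume-weighted average
simple_avg <- function(d) mean(d$price)
vwap       <- function(d) sum(d$price * d$size) / sum(d$size)

# Question 2, option A: contiguous 300-second blocks
by_block <- split(trades, trades$t %/% 300)

# Question 2, option B: a sliding window covering the 300 seconds ending
# at `now`; how often `now` advances (question 3) is left to the caller.
window_at <- function(d, now, width = 300) d[d$t > now - width & d$t <= now, ]

block_simple <- sapply(by_block, simple_avg)
block_vwap   <- sapply(by_block, vwap)
slide_vwap   <- vwap(window_at(trades, now = 400))

print(block_simple)
print(block_vwap)
print(slide_vwap)
```

Keeping the averaging function and the windowing function separate makes it cheap to rerun a backtest under each interpretation of the rule.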
The SEC Response:
Dear Dr. King:
Thank you for your message. In consultation with the Division experts who have been working on this matter, we would like to refer you to certain releases that may provide specific information for you. In particular, you may wish to review FINRA and other SRO rules concerning circuit breakers, including the rules of the NYSE and Nasdaq, among others. The rule books of the exchanges and FINRA are available on the respective SRO websites, and most have a search mechanism, which can be helpful. In addition, there are publicly available SEC orders relating to approved SRO rule filings on this matter. The link to the SRO rule-filing page on the SEC’s website is at http://www.sec.gov/rules/sro.shtml. The individual SROs may also have parallel postings on their pages, as well as their SEC filing history and submissions relating to their rule proposals. The initial SEC approval order for this matter was issued in June 2010 and can be found at http://www.sec.gov/rules/sro/bats/2010/34-62252.pdf. This order may provide useful background information and further references for your research. In addition, an order expanding the list of securities covered by the pilot was issued in September 2010 and can be found at http://www.sec.gov/rules/sro/bats/2010/34-62884.pdf. We hope you find this helpful. Please let us know if you have additional questions.
Sincerely,
Marie Ito
Senior Special Counsel
Division of Trading and Markets
Limit Up, Limit Down
Policy Goal: To mitigate market volatility.
Scope: All stocks and ETFs traded on the US equity markets. Does not
apply to first and last 15 minutes of trading.
Method (In General Terms): Setting an acceptable range on both the
upside and downside, or “price band” within a specific time.
Limit Up, Limit Down
• Q: What about stocks that are not in the S&P or Russell 1000 and are not ETFs?
• A1: Stocks that are not in the S&P or Russell indices and are not ETFs will have a
plus/minus band of 10%, provided that they do not trade for less than $1.
• A2: Stocks that trade for less than $1 are subject to a 75% plus/minus
band (based on the previous day's close).
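The tiers on this slide amount to a small lookup. The sketch below is one reading of them; the function name `band_pct`, the `in_pilot` flag, and the choice to let the sub-$1 tier take precedence over the pilot tier are assumptions, not something the slides state.

```r
# Sketch: band percentage per security.
# in_pilot = TRUE for securities covered by the circuit breaker pilot
# (S&P, Russell 1000, ETFs on the slide). Assumption: the under-$1 tier
# applies regardless of pilot membership.
band_pct <- function(price, in_pilot) {
  ifelse(price < 1, 0.75, ifelse(in_pilot, 0.05, 0.10))
}

# Hypothetical securities: a pilot stock, a non-pilot stock, a sub-$1 stock
pcts <- band_pct(price = c(34.88, 12.50, 0.80),
                 in_pilot = c(TRUE, FALSE, FALSE))
print(pcts)
```

Note that for the sub-$1 tier the slide bases the band on the previous day's close rather than on the five-minute average, so the reference price would also have to switch.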
Limit Up, Limit Down
If price bands are exceeded, the stock may or may not stop trading.
1. Pause: This is like a 15 second “probation.” If an additional trade is
executed within the bands, the stock continues to trade without a
stop. But if trades continue to exceed the bands, or if no trade brings
the stock back within the bands, then a stop is issued.
• Rationale: Markets don’t want stocks to stop trading because of “fat fingers,” or
some anomaly.
2. Halt: If no trade is executed that brings the stock back within the
acceptable bands within 15 seconds, trading on the stock stops for 5
minutes.
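The pause/halt logic can be sketched as a scan over one symbol's trades in time order. This is a simplified illustration, not the presenters' implementation: the function name `first_halt` is hypothetical, and the band is held fixed for the whole scan, whereas under the rule it moves with the five-minute average.

```r
# Sketch of the pause/halt logic for one symbol.
# Assumptions: trades are sorted by time, `secs` is seconds since some
# origin, and the band [lower, upper] is fixed during the scan.
first_halt <- function(secs, prices, lower, upper, grace = 15) {
  outside <- prices < lower | prices > upper
  idx <- seq_along(prices)
  i <- 1
  while (i <= length(prices)) {
    if (!outside[i]) {
      i <- i + 1
      next
    }
    # Trade i breaches the band: the 15-second "probation" begins.
    rescue <- which(idx > i & secs <= secs[i] + grace & !outside)
    if (length(rescue) == 0) {
      # No in-band trade within the grace period: trading would halt.
      return(secs[i] + grace)
    }
    # An in-band trade ended the pause; resume scanning after it.
    i <- rescue[1] + 1
  }
  NA_real_  # no halt in this sequence
}

# Hypothetical trades with a band of 95 to 105
halt_at <- first_halt(secs   = c(0, 5, 6, 20, 21, 50),
                      prices = c(100, 94, 99, 93, 92, 100),
                      lower = 95, upper = 105)
print(halt_at)
```

In this made-up sequence the breach at second 5 is rescued by the in-band trade at second 6, while the breach at second 20 is not rescued within 15 seconds, so the sketch reports a halt at second 35.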
Example: May 6, 2010 Apple (AAPL)
• At 2:41 P.M. EDT on the day of the “flash crash,” Apple was trading at
$239.96.
• 2:45:37 P.M. EDT: a trade on AAPL was executed at $225.10. If limit
up, limit down had been in place, this would have triggered a pause.
• 2:45:37: another trade was executed at $229.50. Therefore,
there would be no “halt” and trading would have continued.
• 2:45:38: a trade on AAPL is executed at $225.00.
• 2:45:38: a trade on AAPL is executed at $227.48.
• 2:45:39: AAPL at $225. No trade is executed within 5% of the average of
the last 5 minutes within the following 15 seconds. Therefore a halt would
have been issued.
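The slide does not give the five-minute average used for the band, but the first two trades bound it. The arithmetic below is a sketch under two assumptions: a 5% band, and a lower limit that is inclusive (a trade exactly at the limit counts as inside).

```r
# Sketch: which five-minute averages are consistent with the narrative?
pct <- 0.05
breach_price <- 225.10  # triggered a pause, so it was below the lower band
rescue_price <- 229.50  # ended the pause, so it was inside the band

# breach: price <  ref * (1 - pct)  implies  ref >  price / (1 - pct)
ref_min <- breach_price / (1 - pct)
# rescue: price >= ref * (1 - pct)  implies  ref <= price / (1 - pct)
ref_max <- rescue_price / (1 - pct)

implied <- round(c(ref_min = ref_min, ref_max = ref_max), 2)
print(implied)
```

Under these assumptions the average must lie between roughly $236.95 and $241.58, a range that contains the $239.96 price quoted for 2:41 P.M.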
AAPL continues to decline in value. The lowest trade of the day is at 199.25. It opened that day at 253.83, a loss in value of 21.5 percent.
[Figure: AAPL stock price (axis range 200 to 250) versus time on 05/06/10, 14:37:00 to 14:47:00]
BACKTESTING
Our Research
• Run the FINRA rules on historic stock data (backtest)
• Determine if the circuit breaker rules are effective for controlling volatility
during catastrophic events like the Flash Crash
• Perform backtest calculations in a timely manner
What are the Challenges?
• Inability to course correct
• Time consuming
• High total cost
• Inefficient processing
Three Technological Approaches
Workstation
• Created model using open source R and Revolution Analytics’ parallel packages
• Circuit breaker calculations only
Cloud
• Added 3rd Year
• Added calculations for limit up/down
• Had to move data to computation
IBM Netezza
• Revolution R Enterprise
• Moved computation to data
• No security issues
• Ability to manage data within Netezza
• Scale to multi-terabyte range and thousands of variables
• Quickly adapt to evolving market conditions
Polling Question #2
Which of the following best completes the sentence for your organization?
When building a new model for backtesting purposes:
A. most of the development time is spent implementing the business logic
B. most of the development time is spent optimizing the model to improve
performance
Processing 24 Billion Transactions
Breaking up the data
754 Files
Approximately 7800 symbols for each day
Approximately 3800 trades per symbol
An embarrassingly parallel problem!
Days can be processed independently
For a given day, symbols can be processed independently
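The slide's figures give a rough sense of how finely the work can be divided. The arithmetic below uses only the approximate numbers above; the variable names are illustrative.

```r
# Back-of-the-envelope task count from the slide's approximate figures
n_days    <- 754    # one file per trading day
n_symbols <- 7800   # approximate symbols per day
n_trades  <- 3800   # approximate trades per symbol per day

n_tasks <- n_days * n_symbols   # independent (day, symbol) units of work
n_rows  <- n_tasks * n_trades   # approximate total number of trades

print(format(c(tasks = n_tasks, rows = n_rows), big.mark = ","))
```

That is about 5.9 million independent tasks and roughly 22 billion trades, the same order of magnitude as the 24 billion in the slide title, which is expected given that the per-day and per-symbol figures are rounded averages.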
The Approach
• Retrieve data for a single day
• Within that day retrieve all transactions for a given symbol
• Return 5 minute windows of trade data
• Windows are passed to a function that detects limit up/down and halt
conditions
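The windowing step can be sketched in base R as follows. Everything here is illustrative: the trades are made up, the windows are contiguous five-minute blocks (a sliding window is an equally plausible reading of the rule), and the detector simply compares each window's trades with the mean price of the window before it.

```r
# Hypothetical trades for one symbol on one day
trades <- data.frame(
  TIME  = c("9:30:00", "9:31:10", "9:34:59", "9:35:00", "9:39:30"),
  PRICE = c(10.00, 10.10, 10.05, 9.20, 9.25),
  stringsAsFactors = FALSE
)

# Convert "H:MM:SS" strings to seconds since midnight
to_secs <- function(hms) {
  parts <- do.call(rbind, strsplit(hms, ":", fixed = TRUE))
  as.numeric(parts[, 1]) * 3600 + as.numeric(parts[, 2]) * 60 +
    as.numeric(parts[, 3])
}

# Assumption: contiguous 5-minute (300-second) windows
windows <- split(trades, to_secs(trades$TIME) %/% 300)

# Mean price of each window, used as the reference for the next window
means <- vapply(windows, function(w) mean(w$PRICE), numeric(1))

# Flag windows containing a trade more than 5% away from that reference
breach <- mapply(function(w, ref) any(abs(w$PRICE / ref - 1) > 0.05),
                 windows[-1], means[-length(means)])
print(breach)
```

Here the second window is flagged because its trades sit more than 5% below the first window's mean of 10.05.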
The Data
SYMBOL,DATE,TIME,PRICE,SIZE
A,20101029,9:30:00,34.88,37
A,20101029,9:30:11,34.86,100
A,20101029,9:30:11,34.82,200
A,20101029,9:30:24,34.82,200
A,20101029,9:30:24,34.82,100
A,20101029,9:30:24,34.82,100
A,20101029,9:30:24,34.82,100
A,20101029,9:30:27,34.8496,209
A,20101029,9:30:27,34.82,1700
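A file in this layout can be read with base R's `read.csv`. The sketch below uses the first four rows shown above as inline text; setting `colClasses` explicitly is a suggestion rather than something the slides mention, but it keeps ticker symbols such as `T` or `F` from being read as logical values.

```r
# First rows of the sample above, supplied as inline text
csv <- "SYMBOL,DATE,TIME,PRICE,SIZE
A,20101029,9:30:00,34.88,37
A,20101029,9:30:11,34.86,100
A,20101029,9:30:11,34.82,200
A,20101029,9:30:24,34.82,200"

# Read symbols, dates and times as text; prices and sizes as numbers
taq <- read.csv(text = csv,
                colClasses = c("character", "character", "character",
                               "numeric", "numeric"))

total_size <- sum(taq$SIZE)
vwap <- sum(taq$PRICE * taq$SIZE) / total_size  # volume-weighted price

str(taq)
print(total_size)
print(vwap)
```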
The Tools: foreach and iterators
The “iterators” package (Steve Weston, Revolution Analytics) allows a
programmer to define how a program traverses a data set
Separates the data extraction from the data source
Easily add new sources to a given analysis
The “foreach” package (Steve Weston, Revolution Analytics) provides a
platform independent method for defining embarrassingly parallel loops
Single process, multiple cores on a single machine, or distributed
across a cluster
Packages are provided to exploit different parallel mechanisms
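A minimal sketch of how the two packages fit together, assuming `foreach` and `iterators` are installed. The data are made up; `isplit()` from iterators hands the loop one chunk per symbol, and `.combine = c` collects the per-chunk results into a vector.

```r
library(foreach)
library(iterators)

# Hypothetical prices for two symbols
prices  <- c(34.88, 34.86, 10.00, 10.50)
symbols <- factor(c("A", "A", "B", "B"))

# isplit() yields one element per symbol; each has $value (the prices
# for that symbol) and $key (the symbol).
ranges <- foreach(chunk = isplit(prices, symbols), .combine = c) %do% {
  diff(range(chunk$value))  # high minus low for this symbol
}
print(ranges)
```

The loop body does not know where the chunks come from, so the iterator can be swapped for one that streams from files or a database. Replacing `%do%` with `%dopar%` after registering a parallel backend runs the same loop in parallel.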
The Implementation
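library(foreach)
# time.window.iter, limitUpDownInWindow and writeLimitUpDownInfo are the
# presenters' helpers; their definitions are not shown in the slides.
# Scan one symbol's trades for one day, window by window.
findLimitUpDown <- function(taqSymbolDayData) {
  foreach(w = time.window.iter(taqSymbolDayData)) %do% {
    if (limitUpDownInWindow(w))
      writeLimitUpDownInfo(w)
  }
}
# Outer loop: one task per daily TAQ file.
foreach(file = taqFiles) %dopar% {
  taqData <- read.csv(file)
  # Row indices of each symbol's trades within the day
  symbolIndexList <- split(seq_len(nrow(taqData)), taqData$SYMBOL)
  # Inner loop: one task per symbol
  foreach(inds = symbolIndexList) %dopar% {
    findLimitUpDown(taqData[inds, ])
  }
}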
Workstation
• Where most development is done
• Used the “doMC” package (Steve Weston, Revolution Analytics) which
provides a link between "foreach" and a parallel programming backend --
in this case, the “multicore” package (from Simon Urbanek)
• foreach and iterators packages minimize the code changes required to
move to a distributed environment
• After the analysis is tested, can be moved to the cloud or IBM Netezza for
better performance
• Most policy reviews allow only 21 days for assessment, which is not possible on a
workstation. Time is of the essence! Need scale and performance.
Polling Question # 3
In your day-to-day operation, what is the typical time required to backtest a
model or strategy?
A. intra-day
B. overnight
C. weekly
D. quarterly
E. time isn’t a consideration
The Cloud
• Used the doRedis package (Bryan Lewis) which uses the Redis key-value
store to provide distributed computing capabilities
• Calculations can be made to run faster simply by adding more machines
to a cluster
• Machines can be added as a calculation is being performed (dynamic
scalability)
IBM Netezza
• Objective: Compare backtesting performance between IBM Netezza and the Cloud
• Model Migration: Minimize refactoring (less than one hour of work)
• Data Ingestion: two hours to load three years of TAQ trade data
• Porting: Replaced reading of compressed files with data streaming that reads partitioned data by stock symbol and trade date
The Results
• IBM Netezza + Revolution Analytics performed 43% faster than the cloud, with no tuning of the analytic model
• Very quick to model and load the data in IBM Netezza architecture
• Moving the analytics next to the data saves significant time
• Business logic of the R code remained intact
• Speed at which data can be interrogated allows users to play with many models
• IBM Netezza is much easier to set up than a cloud infrastructure
• Plug IBM Netezza in, connect to network, load data and run queries

                             Cloud              IBM Netezza TwinFin-24
Nodes                        60 CPU/240 Core    48 CPU/184 Core
Memory                       900 GB             384 GB
Observations (Rows)          24.9 Billion       24.9 Billion
Variables (Columns)          6                  6
Time to Ingest Data          24 hours           2 hours
Model Execution Time         108 hours          96 hours
Normalized Time (by Core)    108 hours          73.6 hours
Elapsed Time                 132 hours          75.6 hours
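The derived rows of the table can be reproduced from the raw ones. The slides do not state the formulas, so the relationships below are an inference: normalized time scales Netezza's model time by the ratio of core counts, elapsed time adds ingest time to normalized time, and the 43% figure is the reduction in elapsed time.

```r
# Raw figures from the results table
cloud   <- list(cores = 240, ingest_h = 24, model_h = 108)
netezza <- list(cores = 184, ingest_h = 2,  model_h = 96)

# Assumed formula: scale Netezza's model time to the cloud's core count
netezza_norm <- netezza$model_h * netezza$cores / cloud$cores

# Assumed formula: elapsed = ingest + (normalized) model time
elapsed_cloud   <- cloud$ingest_h + cloud$model_h
elapsed_netezza <- netezza$ingest_h + netezza_norm

# Fractional reduction in elapsed time
reduction <- 1 - elapsed_netezza / elapsed_cloud

print(c(normalized = netezza_norm,
        elapsed_cloud = elapsed_cloud,
        elapsed_netezza = elapsed_netezza,
        reduction = round(reduction, 3)))
```

These formulas reproduce the 73.6, 132 and 75.6 hour entries, and the reduction rounds to 43%.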
IBM Netezza Data Warehouse Appliance
The true data warehousing appliance.
• Purpose-built analytics engine
• Integrated database, server and storage
• Standard interfaces
• Low total cost of ownership
• Speed: 10-100x faster than traditional system
• Simplicity: Minimal administration and tuning
• Scalability: Peta-scale user data capacity
• Smart: High-performance advanced analytics
Exploiting In-Database Analytics with Revolution R Enterprise
[Diagram: an R Client and the IBM Netezza data warehouse appliance (Host, S-Blades™, Disk Enclosures); labels show the LARGE DATA SET, Analytics "crunching…", and Results]
Exploiting In-Database Analytics with Revolution R Enterprise
[Diagram: an R Client and the IBM Netezza data warehouse appliance (Host, S-Blades™, Disk Enclosures); labels show the LARGE DATA SET and Analytics]
[Diagram: IBM Netezza Analytics architecture]
• Client: Eclipse Client Plug-in, IBM SPSS Modeler, Revolution Analytics, SAS, 3rd Party Packages
• IBM Netezza appliance: IBM Netezza Analytics on the IBM Netezza AMPP™ Platform
• In-database analytics: Revolution Analytics, Spatial, Hadoop MR, Matrix, Data Prep, Predictive Analytics, Data Mining
• IBM SPSS In-Database Analytics
• 3rd Party In-Database Analytics: SAS, Fuzzy Logix
• Development Environment: Software Development Kit (SDK); User-Defined Extensions (UDF, UDA, UDFT, UDAP); Language Support & Adaptors (R, Hadoop MR, Java, C, C++, Python, Fortran)
• SQL Pushdown
CONCLUSIONS
Summary
• IBM Netezza + Revolution Analytics Advantage
• Performance
• Value
• Simplicity
• Future Optimizations
• Refactor business logic to allow data partitioning for previous/forward
aggregation inside database
• Minimize memory management in R
Conclusions
• Although Limit Up/Down rules are an improvement over circuit breaker
rules to mitigate market volatility, these rules require further refinement.
• Policy should be data-driven rather than opinion-driven.
• The illusion of safety is often more dangerous than the surety of risk.
• Additional research required
QUESTIONS?
& Revolution
Analytics
www.netezza.com/testdrive