"Three Dimensional Time: Working with Alternative Data" by Kathryn Glowinski, Engineer at...

Three-Dimensional Time: Working with Alternative Data

Now You’re Thinking with Perspectives!

Kathryn Glowinski, Engineer, Quantopian

Disclaimer

This presentation is for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation for any security; nor does it constitute an offer to provide investment advisory or other services by Quantopian, Inc. ("Quantopian"). Nothing contained herein constitutes investment advice or offers any opinion with respect to the suitability of any security, and any views expressed herein should not be taken as advice to buy, sell, or hold any security or as an endorsement of any security or company. In preparing the information contained herein, Quantopian has not taken into account the investment needs, objectives, and financial circumstances of any particular investor. Additionally, this presentation is being provided on the express basis that it and any related communications (whether written or oral) will not cause Quantopian to become an investment advice fiduciary under ERISA or the Internal Revenue Code with respect to any retirement plan or IRA investor, as the recipients are fully aware that the Quantopian (i) is not undertaking to provide impartial investment advice, make a recommendation regarding the acquisition, holding or disposal of an investment, act as an impartial adviser, or give advice in a fiduciary capacity, and (ii) has a financial interest in the offering and sale of one or more products and services, which may depend on a number of factors relating to Quantopian’s internal business objectives, and which has been disclosed to the recipient. Nothing set forth herein or any information conveyed (in writing or orally) in connection with this presentation is intended to constitute a recommendation that any person take or refrain from taking any course of action within the meaning of U.S. Department of Labor Regulation §2510.3-21(b)(1), including without limitation buying, selling or continuing to hold any security. 
No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. You are advised to contact your own financial advisor or other fiduciary unrelated to Quantopian about whether any given course of action may be appropriate for your circumstances. The information provided herein is intended to be used solely by the recipient in considering the products or services described herein and may not be used for any other reason, personal or otherwise. Any views expressed and data illustrated herein were prepared based upon information, believed to be reliable, available to Quantopian at the time of publication. Quantopian makes no guarantees as to their accuracy or completeness. All information is subject to change and may quickly become unreliable for various reasons, including changes in market conditions or economic circumstances.

What’ll It Be?

Let’s chat about how data is typically viewed through the lens of time.

Because the typical way is usually at least somewhat wrong. Let me tell you why.

Lessons learned at Q along the way.

What does “multidimensional” data mean for MY data?

More importantly, what does it mean for ANYONE’S data?

Quantopian.com

“Alternative” Data?

“Fundamental” Data?

My Very Special Dataset

Early Attempts

Consider the Following

So, We Can Just Change the Data, Right?

Enter, Fundamentals

What if we captured, every day, what we knew the latest value to be for every piece of known information?

It should now be the corrected state of the world for any present moment.

[Figure: timeline from 3/1/17 to 3/13/17 marking First Known values (10, 12) and Revisions (11).]

Seen: 10 10 10 10 10 10 11 12 12 12 12 12 12 12
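Here is a minimal sketch of that daily capture, assuming a hypothetical list of (date learned, value) events. The names `events` and `daily_snapshot` are illustrative, and the learned dates are assumptions chosen so the result lines up with the "Seen" row; they are not taken from a real system.

```python
from datetime import date, timedelta

# Hypothetical (date learned, value) events. The dates are assumed so that
# the forward-filled result matches the "Seen" row on the slide.
events = [
    (date(2017, 3, 1), 10),
    (date(2017, 3, 7), 11),
    (date(2017, 3, 8), 12),
]

def daily_snapshot(events, start, end):
    """Return {day: latest value known on that day}, forward-filled."""
    ordered = sorted(events)
    snapshot = {}
    latest = None
    i = 0
    day = start
    while day <= end:
        # Consume every event learned on or before this day.
        while i < len(ordered) and ordered[i][0] <= day:
            latest = ordered[i][1]
            i += 1
        snapshot[day] = latest
        day += timedelta(days=1)
    return snapshot

seen = daily_snapshot(events, date(2017, 3, 1), date(2017, 3, 14))
print(list(seen.values()))
```

Note what this layout keeps and what it loses: it records what was seen on each day, but not which earlier date a revision was actually correcting.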

Still Not Quite Right

But we still have the same problem for revisions to data that has already been updated.

And what happens if you have 250GB of sparse data alone, before you even forward fill those values?

Dueling Problems

Lookahead Bias

Using data for a backtest that we didn’t know at the time is called lookahead bias.

We try VERY hard to avoid this, because it corrupts evaluation of any strategy.

“I know that Apple did well in the past, so I’m going to backtest a strategy that just holds Apple after 2005.”
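One way to guard against this can be sketched as a filter on when each record became known. This assumes every record carries a hypothetical `known_at` date alongside the date it describes; the tuple layout and the function name are illustrative only.

```python
from datetime import date

# Hypothetical records: (asof_date, known_at, value).
# asof_date is the day the value describes; known_at is the day we received it.
records = [
    (date(2017, 3, 1), date(2017, 3, 2), 10.0),
    (date(2017, 3, 2), date(2017, 3, 3), 11.0),
    (date(2017, 3, 3), date(2017, 3, 6), 12.0),  # delivered late
]

def visible_to_backtest(records, simulation_date):
    """Keep only records the strategy could have known on simulation_date."""
    return [r for r in records if r[1] <= simulation_date]

visible = visible_to_backtest(records, date(2017, 3, 3))
print([value for _, _, value in visible])
```

Filtering on `asof_date` instead would let the late-delivered value leak into the 3/3 simulation, which is exactly the lookahead we are trying to avoid.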

Stale Data

This can be equally disastrous!

If the data is never updated, you may be stuck with hilariously incorrect values.

“My vendor told me that company ABCD announced a split of 1:25, so on that day I traded 25 times what I normally would. But when it actually happened, it was only 1:5!”

So, What Do We Do?

This is referred to as point-in-time data.

It’s a BIG deal.

Personally, I think it’s the BIGGEST deal.

(I’m really biased because I do this for a living.)

Let’s Talk Timelines!

Timelines

http://www.csusmhistory.org/faulk006/thresholds/

Though Some People Think About This

Or...This.

Perspective Matters

It’s not just about when the data happened.

It’s also about when you’re observing the data.

If you have a dataset that ever has updates, revisions, or corrections, the data for a single date can change as you move through time.

5 8 7 9

5 29 7 9

>>> mean(four_days.values())  # values 5, 8, 7, 9
7.25

>>> mean(four_days.values())  # values 5, 29, 7, 9
12.5
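The same comparison as a runnable sketch. The day labels are hypothetical, and which of the two views came first is an assumption made here for illustration; the slide only shows that one value differs between them.

```python
from statistics import mean

# Four days of values as seen from two perspectives.
# Day labels are hypothetical; the ordering of the two views is assumed.
seen_first = {"day1": 5, "day2": 8, "day3": 7, "day4": 9}
seen_later = {**seen_first, "day2": 29}  # one value differs after a revision

mean_first = mean(seen_first.values())
mean_later = mean(seen_later.values())
print(mean_first)  # 7.25
print(mean_later)  # 12.5
```

Same four days, same question, two different answers. Neither is wrong; they just come from different perspectives.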

Bi-Temporal Data

Separates the concepts of when the information HAPPENED from when we KNOW it.

Maintains accuracy with regards to data changes through history.

Allows questions asked to be answered with regards to perspective.
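A minimal sketch of a bi-temporal record, assuming a hypothetical `Observation` type with one field for when the information happened and one for when we learned it. The field names, the helper `value_as_seen`, and the sample values are all illustrative.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional

@dataclass(frozen=True)
class Observation:
    asof_date: date       # when the information HAPPENED
    timestamp: datetime   # when we came to KNOW it
    value: float

def value_as_seen(observations, asof_date, perspective) -> Optional[float]:
    """Latest value for asof_date among observations known by `perspective`."""
    known = [
        o for o in observations
        if o.asof_date == asof_date and o.timestamp <= perspective
    ]
    if not known:
        return None  # nothing was known about that date yet
    return max(known, key=lambda o: o.timestamp).value

history = [
    Observation(date(2017, 3, 2), datetime(2017, 3, 3, 9, 0), 8.0),
    Observation(date(2017, 3, 2), datetime(2017, 3, 7, 9, 0), 29.0),  # revision
]

early = value_as_seen(history, date(2017, 3, 2), datetime(2017, 3, 5))
late = value_as_seen(history, date(2017, 3, 2), datetime(2017, 3, 10))
not_yet = value_as_seen(history, date(2017, 3, 2), datetime(2017, 3, 1))
print(early, late, not_yet)
```

Because old observations are never overwritten, the same query can be answered from any perspective just by changing one argument.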

Deltas

Base Table:

Deltas Table:

5 29 7 9

5 8 7 9

as-of date 1, timestamp 1

as-of date 1, timestamp 2
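One possible reading of the base-plus-deltas layout, sketched with plain dictionaries. The integer dates and timestamps, the tuple layout, and the name `apply_deltas` are assumptions for illustration. As a simplification, the sketch also treats every base row as already known at any perspective.

```python
def apply_deltas(base, deltas, perspective):
    """Return the base table with every delta known by `perspective` applied."""
    view = dict(base)
    # Apply deltas in the order they were learned so later ones win.
    for asof_date, timestamp, value in sorted(deltas, key=lambda d: d[1]):
        if timestamp <= perspective:
            view[asof_date] = value
    return view

# Hypothetical tables: as-of dates and timestamps are small integers here.
base = {1: 5, 2: 8, 3: 7, 4: 9}   # as-of date -> first known value
deltas = [(2, 6, 29)]             # (as-of date, timestamp learned, new value)

before = apply_deltas(base, deltas, perspective=5)
after = apply_deltas(base, deltas, perspective=6)
print(list(before.values()))
print(list(after.values()))
```

The base table stays untouched, so storage grows with the number of revisions instead of with a full copy of the data per day.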

But...Do We Care?

PROS

Reproduces events EXACTLY as they occurred.

Allows for accurate modeling of simulations from the past.

Easily allows for vendor updates to the most accurate known data.

CONS

Can force modeling of atypical past events that wouldn’t happen in the modern day.

System errors can be propagated even into past data.

Data shown can be “imperfect” because of vendor or ingestor error.

Data Analyses Should be Replicable

Realistic view of data delivery, instead of the optimized view.

The world isn’t perfect, and your data is DEFINITELY not perfect.

But, it should at least be consistently imperfect.

Different Users, Different Needs

Point-in-time data adds a layer of complexity.

In evaluation, you only care about one thing: does it have alpha?

Users further on have the luxury of checking survivability.

99% of users don’t want to see a platform’s mistakes.

(I made that statistic up, but I’m pretty sure it’s accurate.)

What Else Can this Do for Me?

TWTR

Perspective    Actual Time: 2006    Actual Time: 2017
2006           Tweeter              -------
2017           Tweeter              Twitter
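The TWTR grid can be sketched as a lookup that takes both an actual time and a perspective. The mapping and the function name `resolve` are hypothetical and cover only the two years shown on the slide.

```python
# Hypothetical mapping for ticker TWTR, limited to the two years on the slide.
TWTR_BY_YEAR = {2006: "Tweeter", 2017: "Twitter"}

def resolve(actual_year, perspective_year):
    """Who TWTR refers to at actual_year, as seen from perspective_year."""
    if actual_year > perspective_year:
        return None  # the future is unknown from this perspective
    return TWTR_BY_YEAR.get(actual_year)

for perspective in (2006, 2017):
    print(perspective, [resolve(actual, perspective) for actual in (2006, 2017)])
```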

Verify data model assumptions

“We never change our data after the fact. We wouldn’t do that” ~A Quantopian Data Vendor
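With both timestamps stored, a claim like that becomes checkable. A rough sketch, assuming observations are hypothetical (as-of date, timestamp learned, value) tuples with ISO date strings; the function name `revised_dates` is illustrative.

```python
from collections import defaultdict

def revised_dates(observations):
    """Return as-of dates whose value changed after it was first recorded."""
    values_by_date = defaultdict(list)
    # Walk observations in the order we learned them.
    for asof_date, timestamp, value in sorted(observations, key=lambda o: o[1]):
        values_by_date[asof_date].append(value)
    return sorted(d for d, vals in values_by_date.items() if len(set(vals)) > 1)

# Hypothetical vendor deliveries: (as-of date, date learned, value).
observations = [
    ("2017-03-01", "2017-03-02", 10),
    ("2017-03-01", "2017-03-07", 11),  # same as-of date, different value
    ("2017-03-09", "2017-03-10", 12),
]

changed = revised_dates(observations)
print(changed)
```

An empty result supports the vendor's claim for the data you hold; anything else is a list of dates worth a conversation.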

Is that All?

Split Adjustments

Why Did You Call this 3D Time at All?

Look Into the Future

Fundamentals, Redux

We just deployed a new system.

Capture not only the first known value, but also the adjustments to those values.

[Figure: timeline from 3/1/17 to 3/13/17 marking First Known values (10, 12) and Revisions (11).]

Perspective of 3/6/17: 10 10 10 10 10 10 - - - - - - - -

Perspective of 3/14/17: 11 11 11 11 11 11 11 11 12 12 12 12 12 12
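The two perspective rows can be reproduced with a small sketch. This is not the deployed system; the record layout, the name `view_from`, and the dates on which each value was learned are assumptions chosen to be consistent with the rows on the slide.

```python
from datetime import date, timedelta

# Hypothetical records: (effective date, date learned, value).
records = [
    (date(2017, 3, 1), date(2017, 3, 1), 10),  # first known
    (date(2017, 3, 1), date(2017, 3, 7), 11),  # revision of the same value
    (date(2017, 3, 9), date(2017, 3, 9), 12),  # next first-known value
]

def view_from(records, perspective, start, end):
    """Forward-filled daily values as they would look from `perspective`."""
    # Keep the most recently learned value per effective date.
    latest = {}
    for effective, learned, value in sorted(records, key=lambda r: r[1]):
        if learned <= perspective:
            latest[effective] = value
    row = []
    day = start
    while day <= end:
        if day > perspective:
            row.append("-")  # not yet observable from this perspective
        else:
            past = [e for e in latest if e <= day]
            row.append(latest[max(past)] if past else "-")
        day += timedelta(days=1)
    return row

start, end = date(2017, 3, 1), date(2017, 3, 14)
from_mar_6 = view_from(records, date(2017, 3, 6), start, end)
from_mar_14 = view_from(records, date(2017, 3, 14), start, end)
print(*from_mar_6)
print(*from_mar_14)
```

The revision to 11 rewrites the early days only for perspectives that could have seen it, so a backtest pinned to 3/6 still sees 10.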

Point in Timeness as a Service

Raw data history for all (ingested) time

But we’d like to update the way that users can give us data too!

But Wait, there’s More

Point-in-time data doesn’t have to be just stock-specific data.

This should be applicable to any field, any data.

Data is the Future

Thank You!

Kathryn Glowinski, Engineer, Quantopian
