SDS PODCAST
EPISODE 405:
THE WORK OF
QUANTS AND
DATA SCIENTISTS
IN THE FINANCIAL
SPACE
Kirill Eremenko: 00:00:00 This is episode number 405, with Lead Data Scientist at
Axpo Group, Thomas Obrist.
Kirill Eremenko: 00:00:12 Welcome to the SuperDataScience podcast. My name is
Kirill Eremenko, Data Science Coach, and Lifestyle
Entrepreneur. Each week we bring you inspiring people,
and ideas to help you build your successful career in data
science. Thanks for being here today, and now let's make
the complex simple.
Kirill Eremenko: 00:00:44 Welcome to the SuperDataScience podcast everybody.
Super excited to have you back here on the show. Today's
episode is going to be more of the advanced type. We've
got Thomas Obrist joining us, who is a lead data scientist
at Axpo Group. Now, while Thomas's title is lead data scientist, the work that he does more resembles the work of a quant, a quantitative analyst in a financial firm. But in this case the difference is that this is not stock trading, this is not financial trading, this is energy trading. But the principles are the same.
Kirill Eremenko: 00:01:18 Why is this episode quite advanced? This episode is more
advanced because we're going to be talking about how
you can analyze data as a data scientist, versus how you
can analyze the same data as a quant, as a quantitative
analyst. What are the differences? What are the
approaches? How do they differ? We'll be mentioning
things like Monte Carlo simulations for example,
stochastic principles and things like that.
Kirill Eremenko: 00:01:47 This episode will be useful to you if you're specifically
interested in analyzing data in the space of trading, of
stochastic processes, of financial markets, analysis like
that, or if you're specifically interested in the energy
sector. If you're interested in the energy markets, and
what's going on there, this episode will also be useful
to you. If you're in one of those two groups you might find
some very valuable insights in this episode. Just keep in
mind that it's quite specific to those areas.
Kirill Eremenko: 00:02:21 Things that we'll talk about: long versus short, long trading versus short trading; psychology in trading; quantitative analysis versus data science.
We'll touch on the Monte Carlo simulation. We'll learn
about the energy industry. Thomas is going to share a
use case called the grid losses for one of the European
countries, an analysis that he was doing, very interesting.
Kirill Eremenko: 00:02:47 You'll hear about how he has to deal with uncertainty that comes from other uncertainty: a lot of inputs like wind data, solar data, and weather data feed into his model, and he has to model them to find out what the prices are going to be, but the data that's coming in is, in the first place, actually the output of a model itself.
Kirill Eremenko: 00:03:11 He doesn't know the wind data, the solar data for the next
day. Dealing with uncertainty driven by more uncertainty,
how he goes about that. We'll talk about out of sample
testing, and shadow trading. We'll talk about the trade-off
between testing and trading, and we'll talk a bit about
organizing hackathons, something that Thomas has
experience in.
Kirill Eremenko: 00:03:32 We've got this advanced episode coming up. Hope you
enjoy and without further ado I bring to you Thomas
Obrist, lead data scientist at Axpo Group, Switzerland.
Kirill Eremenko: 00:03:48 Welcome back to SuperDataScience podcast everybody.
Super excited to have you back on the show. Today we've
got a special guest calling in from Switzerland, Thomas
Obrist. Thomas, how are you doing?
Thomas Obrist: 00:03:57 Hi Kirill, thanks a lot for having me, very good. How about
you?
Kirill Eremenko: 00:04:02 Very good as well. Super pumped to finally have this
podcast. We've known each other for quite some time,
right? Like what? It's been a year and a half or two years?
Thomas Obrist: 00:04:13 I think around two years. Two years ago we met.
Kirill Eremenko: 00:04:18 Yeah. You've had quite an interesting career growth since
then. You've moved from ... Were you still finishing your
university back then when we met?
Thomas Obrist: 00:04:33 I think we just met after my master's degree, when I started in trading at Axpo. Since then, I'm now the quant for short term trading for Axpo Origination.
Kirill Eremenko: 00:04:46 Got you.
Kirill Eremenko: 00:04:47 How are you feeling about this podcast?
Thomas Obrist: 00:04:50 I mean, it's great. I'm a bit nervous, but it's going to be
fine.
Kirill Eremenko: 00:04:55 I'm sure it's going to be fine. Lots of cool topics to cover
off. Before we get started, before we dive into your
profession, your role, tell us a bit about, what your
background is. What did you study at uni? Have you
always been in Switzerland? I forget. Are you originally
from Switzerland?
Thomas Obrist: 00:05:16 Yes, born in Switzerland, I grew up in Switzerland, and I
studied in Switzerland. I studied mathematics at ETH, in
my bachelor's. Mostly focused on probability theory and
statistics. Then I worked for one year as a consultant, the
year between bachelor's and master's. Then I did my
master's in quant finance.
Thomas Obrist: 00:05:40 With my math background I mostly focused again on the math part, on probability, and deepened my knowledge in probability theory. During my master's I actually got really interested in data science. Back then, it was not long ago, but still, in my year any IT lectures, or data science or machine learning courses, were not part of my curriculum. I took them anyway, because at ETH you can basically take more or less every class, you just don't get the credit points.
Thomas Obrist: 00:06:16 I mean, they write it on your diploma, but it doesn't count toward your degree. I took a lot of IT lectures during my master's because I thought they were really fun. I was using it for my master thesis. My focus was probability theory, a bit of IT, and then some finance lectures on top of it.
Kirill Eremenko: 00:06:39 What was the thesis?
Thomas Obrist: 00:06:41 My thesis was ... I actually don't remember the full name. The topic was, I used deep reinforcement learning to predict bitcoin prices.
Kirill Eremenko: 00:06:56 That's so exciting. Were you able to predict it?
Thomas Obrist: 00:06:59 I would say not really. I should actually look at it again. The issue was, it was during the hype. During the hype, everything went up. It went up until, I think, January-
Kirill Eremenko: 00:07:18 2018, end of 2017, start of 2018.
Thomas Obrist: 00:07:22 It went up to 21,000?
Kirill Eremenko: 00:07:22 Mm-hmm (affirmative).
Thomas Obrist: 00:07:23 Yeah, end of 17. During this time, I mean, the algo was, I thought, nice, because if you only go long, and everything goes up, nothing can fail. Then at the end of-
Kirill Eremenko: 00:07:39 Kind of like Tesla stock prices right now.
Thomas Obrist: 00:07:42 Exactly, you cannot fail at Tesla for the last half year,
because-
Kirill Eremenko: 00:07:46 This is not trading advice for everybody listening to this
podcast, right? We're not advising to buy or sell any kind
of stocks, it's just speculation I guess.
Thomas Obrist: 00:07:57 It's a huge move. During this time period you can run an algo, and the algo basically cannot fail if it can only go long. Because in 17, 18, a lot of the exchanges didn't allow you to go short. You could not design an algorithm that would short bitcoin. Now there are way more exchanges where you can do that. Therefore, I designed this algo that always goes long, and closes back to dollars. During the sample, like the backtest or the out of sample testing, it was-
Kirill Eremenko: 00:08:34 Thomas, can you explain long and short? I just realized
that it's not common terms that maybe some people are
not familiar with.
Thomas Obrist: 00:08:45 Of course. I mean, simply speaking, without all the financial technicalities: if you go long on an asset like Tesla, you're basically betting that the stock price will go up, and you profit from that movement. If you go short you're betting that the price goes down.
Thomas Obrist: 00:09:03 Assuming you would have shorted bitcoin at 21,000, and you would have closed your position at 10,000, for one bitcoin you would have made 11K by bitcoin going down. You can bet on both directions. That's basically long and short, I mean, it's just a directional view. My backtest, and my out of sample testing for my master thesis, were really good.
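The long and short payoff Thomas describes can be sketched in a few lines. This is a minimal illustration using the prices from his bitcoin example; real short positions involve borrowing costs and margin that are ignored here, and none of this is trading advice:

```python
def pnl(entry_price, exit_price, quantity, side):
    """Profit and loss of a simple directional position.

    side = +1 for long (betting the price goes up),
    side = -1 for short (betting the price goes down).
    """
    return side * (exit_price - entry_price) * quantity

# Long: buy one bitcoin at 10,000, sell at 21,000 -> profit of 11,000
print(pnl(10_000, 21_000, 1, side=+1))

# Short: sell one bitcoin at 21,000, buy it back at 10,000 -> profit of 11,000
print(pnl(21_000, 10_000, 1, side=-1))
```

The same function shows why a long-only algo "automatically loses" in a falling market: with `side=+1` and a falling exit price, the P&L is negative on every trade.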
Thomas Obrist: 00:09:35 Then, whatever algo you have, during the period afterwards, where it falls from 21,000, I think the lowest was $4,800 per bitcoin. I mean, during that time period, if you can only go long, you automatically lose all the time. As soon as you do something, you basically lose. The algo was not that nice.
Thomas Obrist: 00:10:01 I think the issue as well, for deep reinforcement learning, was that the time period you have where bitcoin actually does something wasn't huge. There was not that much data. I mean, now I'm more advanced, I would say, after some years of actually using data science in a real work environment. I would say my algorithm was overfit quite heavily, because there was not so much data.
Thomas Obrist: 00:10:30 Another thing is, there is not so much fundamental data where you can actually see what it depends on. What should you use as input? Yeah, you could use all the indicators, and build a lot of stuff based on price data, but that's nothing fundamental, like oil prices being correlated to Bitcoin. At least myself, I never tested that.
Thomas Obrist: 00:10:56 It makes it really difficult to actually fit such a heavy structure like a deep reinforcement learning framework to bitcoin prices. Now, with more experience, it looks to me less like heavy overfitting and more like a generalization problem. The issue is, for bitcoin you have limited data, and then this data is kind of thin as well, because actually it's just one realization of reality. It's a stochastic process, but you only observe it along one timeline. I mean, this is how life is.
Thomas Obrist: 00:11:30 It's difficult for trading, because actually the situation when you want to predict bitcoin at 2K, like a $2,000 value, is a completely different story than at 20,000. But if you use a deep learning neural network, you assume the data points are independent, so that you have more samples to train on. This is actually not true, because they are heavily correlated.
Thomas Obrist: 00:11:58 I mean, to some extent people behave differently if Bitcoin is at $10,000 than they would if Bitcoin was at $10. Even in theory, the points are not independent of each other across the whole run. I mean, this is then difficult to generalize on.
Kirill Eremenko: 00:12:20 I totally understand. The way I understood it is that your deep reinforcement learning algorithm is looking at prices as price points that are compared to each other, and at movements in the price. It doesn't really care whether it's 20,000 or 20 euros, but for people that's a big difference in terms of psychology.
Thomas Obrist: 00:12:46 Exactly. The issue was [inaudible 00:12:48] I had as inputs the price itself. I mean, those were standardized, normalized and so on. So there was no level like 20,000. The algorithm never knew exactly that it was at 20,000; it was more looking at the differences, how it moved. The psychological factor, I mean, there was a lot of sentiment around Bitcoin and the 10,000 level, because just the change from four digits to five digits had a big impact on how people behaved. Bitcoin was traded heavily with a psychology approach.
Thomas Obrist: 00:13:25 There was a lot of emotion in the market basically, mostly dependent on the level where it was. An algo that got standardized inputs, I mean, it wasn't aware of this, and how could it be? Because how do you teach emotions to a trading bot? You can build features: based on 10,000 it could be a one or a zero or something like this, and you can mark borders, or you could build features around this.
Thomas Obrist: 00:13:55 But then you need to know where they are, and if you already know, then why should you build an algo for it? You can just trade it yourself; you don't need such a heavy structure. I mean, if you know where the borders are, then there is no point in using an algo, then you just trade it.
Kirill Eremenko: 00:14:11 Yeah. Absolutely.
Kirill Eremenko: 00:14:14 This episode is brought to you by SuperDataScience, our
online membership platform for learning data science at
any level. We've got over two and a half thousand video
tutorials, over 200 hours of content, and 30 plus courses
with new courses being added on average once per
month.
Kirill Eremenko: 00:14:34 All that and more you get as part of your membership at
SuperDataScience. Don't hold off, sign up today at
www.superdatascience.com. Secure your membership
and take your data science skills to the next level. Very
interesting. Let's move on to what you do now. Tell us a
bit about your role. So you're the lead data scientist at
Axpo Origination for west and east Europe. What is Axpo,
and what does the company do?
Thomas Obrist: 00:15:10 Axpo Group is a Swiss utility. In Switzerland, we have a lot of assets, like river plants, pumped storage, as well as some nuclear plants, which are partially or mostly owned by Axpo and operated by it. Myself, I work for Axpo Trading, or Axpo Solutions as it's called. This is a part of the group, and what we do, we don't have any assets ourselves; we do the trading. We bring the assets essentially to the market. Because Axpo Group, they just produce energy, but we manage their energy.
Thomas Obrist: 00:15:57 Actually, myself, I'm in Axpo Origination. I actually have nothing to do with Switzerland. Axpo also has trading activities in other parts of Europe, and in the U.S. Myself, for example, I am the quant for origination, for the short-term part. I do everything around data science and quant stuff for several European countries like Belgium, France, the Netherlands, Austria, Czechia, Slovakia, up to Turkey.
Thomas Obrist: 00:16:31 What origination is: origination is basically, if you were a steel client, my department could offer you a contract to supply energy for your production for the next one, two, or three years. And that's actually your hedge. If you're a steel producer, this is enough, because you don't need to worry about power prices. I mean, you can produce as much as you want, because you don't have any power price risk. We take the risk for these companies, and we manage this risk.
Kirill Eremenko: 00:17:04 Got you.
Thomas Obrist: 00:17:07 We have these PPAs, as they're called, Power Purchase Agreements, where we buy the power from wind parks and solar parks. If you've got huge wind parks, then you don't want to worry about production and price risk either. I mean, production and power price risk. You just want to have a good power price for your plants, and then you want to produce as much as possible.
Thomas Obrist: 00:17:34 We take care of this risk as well. We manage these wind parks on the market for the people who have built them. I mean, there is a part of the company that builds wind parks and solar parks for Axpo as well, but it's not the trading part. So after they are built, we manage their production on the market. We go and sell.
Kirill Eremenko: 00:17:58 [crosstalk 00:17:58]. So Axpo is a massive company that on one hand produces energy itself, from different kinds of sources. But then on the other hand you also purchase energy from other companies out there? Wind parks, solar parks, and other energy producers. And you also supply and sell that energy to clients, not mom-and-pop clients, but big companies like you said, a steel production plant, which requires lots and lots of energy per year.
Kirill Eremenko: 00:18:35 You create agreements with them, so that they know what
they will be paying for energy in the next year, three
years, or five years. Is that about right?
Thomas Obrist: 00:18:45 Exactly. We manage their risk for all these things. Yes.
Kirill Eremenko: 00:18:55 Good. You said you're a quant. What is the difference
between a quant, a data scientist, and a data analyst?
Thomas Obrist: 00:19:00 For example, data analysts, we have a lot of data analysts. They study the market really deeply. They read newspapers, browsing through them, and try to see where gas prices might be going, or they read all the news. We've got news developments, like a plant breakdown in Germany, or in France, what's going to happen in France? What do politics decide, politics and regulations?
Thomas Obrist: 00:19:34 I mean, it can be quantitative, but it's a lot of seeing
where markets are going based on news, events, and all
that stuff. It doesn't need to be quantitative because they
have a lot of experience, and they read and see well.
Thomas Obrist: 00:19:53 Differentiating now between data scientists and quants ... I mean, I think they're kind of a mixture, and a good quant can use data science, while a good data scientist has quant skills. I see data scientists as people who go and have a training and test set, and they build a machine learning model on top of it. While quants, they can run simulations, like Monte Carlo simulations, and then they can calculate a probability, and based on the probability they can make a price. If this price is better than what you get at the market, you go and buy or sell.
Thomas Obrist: 00:20:27 It's kind of a different approach. They're both very data heavy. Like, you both do descriptive analytics of the data. I would say the mathematical methods they use are a bit different. I mean, traditionally, you see a lot of quants in risk management doing pricing analysis. But you can do quant-related models for trading as well, like predictive models.
Thomas Obrist: 00:20:54 An easy quant model to differentiate, just as an example, would be: you may get stochastic outliers. You calculate the probability of something like this happening, and if it's an outlier, you assume something will follow afterwards, because if you look at the outliers, there you have ... You might have, you don't need to, but you might have higher correlations between different prices.
Thomas Obrist: 00:21:17 A data scientist, I mean, he can think of these things as well, he's not excluded, but I would say the approach is a bit different. You go, and you try to build the models, you fit them, you try to build features. The framing is a bit different.
Thomas Obrist: 00:21:33 At the end, the models don't necessarily need to be hugely different, but I would say the language is a bit different. I think right now, since data science is quite new to a lot of companies, it's a little bit split, what a quant does and what a data scientist does. It depends on the field; of course they will mix a bit, because I believe ... I mean, if you want to be a good quant or data scientist, you don't want to use just a hammer. If you need to saw something, you need to have a saw.
Thomas Obrist: 00:22:08 I see one tool or the other as good; a good worker can use both tools. That's why I differentiate a bit between these three groups. I mean, the analyst, he can do a lot of things as well. He can do data science models as well. For his daily work, most of the time he just doesn't need it. That's how I would differentiate the three groups.
Kirill Eremenko: 00:22:37 Interesting. I was specifically interested in the quant versus data scientist part. Let's dive into that a bit more, the difference between the two approaches. For a data scientist, let's say you want to do a simple prediction of price based on a linear regression: you have your training data, you have your test data, very straightforward, you pass it through your model, and you have a model.
Kirill Eremenko: 00:23:08 Simplifying things, how would you say a quant approach,
you said Monte Carlo, how is that different? What is the
principle thinking behind it that is different?
Thomas Obrist: 00:23:24 I would say in general a quant goes and first looks at, for example, different quantiles. Assume in linear regression you would just take your price data. Let's say power price data for ... I mean, not that this will work, so don't try it, but you take ten years of price data for Germany, you put it in a linear regression, and then you make a prediction based on this.
Thomas Obrist: 00:23:49 A quant would go about it differently; they might say: okay, I normalize my data. I look at all the quantiles. Say power prices moved yesterday by 10 euros; for power prices in Germany, that is a lot. This would be a really high move, a historic move. Then you compare historically: you look at years of data, and you see what happened afterwards when you were in this high-move environment, this high-volatility environment. What happened the day afterwards?
Thomas Obrist: 00:24:21 Then if you see that normally, which I believe you could, you could say, "Okay, with 60% probability, after a [inaudible 00:24:28] move, the price reverts down." You could then build a model and say, "Okay, if I observe this high move, I go short." So I will sell, and I bet that the prices go down. I mean, this is one idea which could come to the same conclusion as a data science model, but it's a different approach to the data, and a different kind of training. Because you're depending on values that you choose; it could be a 10% quantile move, or a 5% quantile move, and so on.
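The quantile-based reasoning Thomas sketches could be written roughly as follows: flag a day whose move sits in an extreme historical quantile, then look at what happened the next day. Everything here is illustrative (the data is synthetic noise, the 95% quantile is an arbitrary choice) and, as he says about the real market, don't trade on it:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy stand-in for daily power price changes (synthetic, not market data)
moves = rng.normal(0.0, 1.0, 1000)

# Mark days whose move exceeds the historical 95% quantile
threshold = np.quantile(moves, 0.95)
extreme_days = np.where(moves > threshold)[0]
extreme_days = extreme_days[extreme_days < len(moves) - 1]  # need a next day

# What fraction of the time did the price move down the day after?
next_day = moves[extreme_days + 1]
revert_prob = np.mean(next_day < 0)
print(f"P(down after extreme up-move) ~ {revert_prob:.2f}")
```

On independent noise like this, the probability hovers around 0.5; a value reliably above that, like the 60% in Thomas's example, is what would justify a short signal after an extreme up-move.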
Thomas Obrist: 00:24:58 So you have a training and a fitting phase, but it can be a different approach, in my view. It's a bit different, I would say. I mean, this is less about Monte Carlo. Monte Carlo you would use more for pricing, as a quant. You run a simulation, and you get a price. For example, in my work, as I said, my department does this power purchasing from wind parks; not the wind parks themselves, the power of the wind parks.
Thomas Obrist: 00:25:34 The question there is, what will the short term risk be? Because I'm doing just the short term. What will the short term risk be in two years? For this, I need to know where the whole wind buildout in Europe is going, and what the price of this could be in two years. Because I need to make a quote for, let's say, our originator, our sales person, the one who goes to the client. He needs to have a price. I give him the price; therefore, I need to run a simulation of what could happen. Because it's kind of a probability. It's less a prediction, it's more an expected value, for example. It's not a forecast, because I know it's going to be wrong; it's more like a risk view.
Kirill Eremenko: 00:26:26 Could you explain Monte Carlo in a few sentences? How
does the simulation work? I find it quite interesting.
Thomas Obrist: 00:26:36 I would say it's rather easy. You simulate different stochastic processes; you can assume distributions, you can take historical distributions, or other things. Then you run the simulation, and you look at how they interact. In the end, it's based on a numerical approach to mixing distributions, simply said. Then you can see how it converges.
Thomas Obrist: 00:27:02 For example, the issue here is that you only observe one year of data, or two years, which is relevant. I mean, let's say in Europe, the short term has changed a lot during the last few years. There have been way more wind parks and solar parks, because Germany, every European country, is building more wind and solar.
Thomas Obrist: 00:27:27 Germany was a front runner; they built a lot of wind and solar power. Since it's changed so much, you cannot go back 10 years. I mean, I have data for 10 years perhaps, but the data from 10 years ago is useless, because it's such a different environment now. With Monte Carlo simulations, I don't just take the historical cost of one year; I can ask what a fair price would have been if this had happened, based on different stochastic processes, which I can infer and see, "Okay, this is an expected value." Because one year just has a huge variance.
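A stripped-down version of the Monte Carlo idea: instead of relying on the single historical year (one realization with huge variance), simulate many paths from an assumed stochastic process and average to get an expected value. The process and every parameter below are placeholders for illustration, not Axpo's actual pricing model:

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_mean_price(base=50.0, vol=0.02, days=365, n_paths=5_000):
    """Simulate daily prices as geometric Brownian-style noise
    around an assumed base level (e.g. EUR/MWh) and return the
    Monte Carlo estimate of the expected average price.

    One historical year corresponds to a single path; averaging
    over many simulated paths filters out the single-path variance."""
    steps = np.arange(1, days + 1)
    shocks = rng.normal(0.0, vol, size=(n_paths, days))
    # The -0.5*vol^2*t drift correction keeps E[price] at the base level
    paths = base * np.exp(np.cumsum(shocks, axis=1) - 0.5 * vol**2 * steps)
    return paths.mean()

print(f"fair price estimate: {simulate_mean_price():.2f}")
```

With these parameters the estimate hovers near the base level of 50, while any individual path can drift far from it, which is exactly the variance a one-year history would bake into a quote.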
Thomas Obrist: 00:28:07 You need to filter out this variance, because once you have variance, it's like ... For example, the cost on the short term for a wind park in 2019 was huge. Perhaps you were just unlucky. I mean, wind costs are still big, but this was extremely big; you don't want to quote for 2021, for example, the 2019 price you observed. Because this might be way too high, and then you don't get the contract.
Thomas Obrist: 00:28:43 I mean, the goal is to sign contracts, and to manage more assets. We want to give a fair price. 2021 could be low again, perhaps, but we want to know what the expected price is, and then we can manage the risk.
Kirill Eremenko: 00:29:01 It sounds like you've got a lot of things going on, it
sounds very complex. You don't only have to think about
the data, but all the contracts and managing. I wanted to
ask you about this, you are responsible for trading at
Axpo. That's trading energy as I understand on a daily
basis. To put it into perspective, what is the amount of
funds that you're responsible for trading every year?
Thomas Obrist: 00:29:35 I would say it's ... I mean, the models which are live on the market, I basically only trade through models. Currently it's at, let's say, 20,000,000 a year that gets traded through these models.
Kirill Eremenko: 00:29:48 20,000,000 euros?
Thomas Obrist: 00:29:51 Yes.
Kirill Eremenko: 00:29:52 That is a huge amount. How do you approach this? For
instance, what kind of things do you look at? Tell us
what's your day to day? What is involved in your day to
day as a quant on the trading space?
Thomas Obrist: 00:30:16 Since my department is mainly origination, or let's say contact with clients, we don't have a huge department with a lot of quants. I do a lot of different stuff, which I really like, so there's a lot of variety. I started as a data scientist, but as a mathematician I use more and more quant models; it's really varied.
Thomas Obrist: 00:30:41 Sometimes I do pricings, which is more quant related, where I run simulations and see what will happen in three years, or what my view is on two-year short term pricings. Sometimes there's a new contract, for example for a steel client, or let's say a grid loss client. Then we need to forecast these grid losses.
Thomas Obrist: 00:31:05 I get a lot of data, and then the thing is, I need to build a model which goes to the market every day and buys the energy to supply this client. I spend days working on this model, fine tuning it, checking that it works well. Then I put it live on the market, and it starts trading. It can be hugely varied, and I myself sometimes trade manually as well. Sometimes I get calls to execute some trades on the market. It's really a huge variety, which I really love about my job.
Thomas Obrist: 00:31:41 My day to day job looks a little bit different each day. It's always about short term trading, but in some sense real-time trading, like going to the market and trading imbalances; we get live updates from a lot of wind parks, and sometimes we need to go and manage them manually. I build models, let's say data-driven machine learning models. We try to predict different client profiles as well as possible. Then I do the pricing on the short term, so it's really quantitative.
Thomas Obrist: 00:32:17 I would say these are the main three things. Most of my time I spend doing descriptive analytics; I try to understand what's happening. Often, if you really understand what actually happened, then you can add better features to your models, or make really simple adjustments for the future. It's really a lot; so many things going on every day, you get a lot of feedback. There is so much data coming back every day from each European country, so much price data, things that happened.
Thomas Obrist: 00:32:51 Things might go wrong, and then you need to understand what happened. For example, COVID was an interesting time period in several respects. I mean, just on the market: at the beginning of the lockdown, everything shut down. For short term energy trading, power trading, you trade the next day. I trade tomorrow's delivery at 12 o'clock today. I need to make a forecast today at 12:00 for tomorrow's 24 hours.
Thomas Obrist: 00:33:26 During COVID what happened was that all your demand forecasts were off, because you don't know which factory shut down when, and which machines, or which homes, used more energy during this time. Nobody knew exactly what was going to happen. I mean, everybody expected that there would be less demand, but when and how? You need to forecast on an hourly basis. It gets traded on an hourly basis in most European countries.
Thomas Obrist: 00:33:56 For example, in Germany you can trade down to 15 minutes. What I mean with 15 minutes: you can trade 15-minute delivery periods. So you really need to be precise. During this time period, all your data, the demand data, was wrong, but you never knew how wrong. What happened in the market was kind of that ... Not everywhere, but there was too much energy produced, because people thought, "Okay, they will need it." But at that time they didn't need it. The market was loaded with energy. Then the balancing mechanism needed to take energy out. There was too much energy around on the short term.
Thomas Obrist: 00:34:41 Another factor you need to understand and think about: is this just for now, or will it continue into the future? How long will this trend last? For the period when everything shut down, it was really short, because it happened over a weekend; then you knew, "Now we're at the new levels," and markets got normal again. With other trends, you're thinking: why is this happening? What could be the cause of it? How could I adapt to it? How could I position myself to not get harmed by it, and manage the risk?
Kirill Eremenko: 00:35:19 You mentioned you do several different things. Are you
able to share a use case with us to give an example?
Thomas Obrist: 00:35:30 Yes, for example. I think one really interesting use case
was grid loss.
Kirill Eremenko: 00:35:34 What is grid loss?
Thomas Obrist: 00:35:39 Exactly. If you pump energy through a cable, the energy
gets lost on the way. If you have a starting point, and an
end point, you pump energy in at the starting point, and
you take out energy on the endpoint, there will be a
difference, because during transport you lose energy.
Kirill Eremenko: 00:35:57 How much energy do you lose?
Thomas Obrist: 00:36:02 Actually, in percentage terms, I'm not so sure. I mean, it's not that much. It depends on the cable, whether it's high voltage, low voltage-
Kirill Eremenko: 00:36:11 The distance.
Thomas Obrist: 00:36:13 A lot of stuff: how many transformers you have, and so on. The actual level, to be honest, I don't know. I mean, what I did as a use case was, we needed to supply this energy day-ahead. So I got two years of data, and the idea was to forecast this for the next day, over different time periods.
Thomas Obrist: 00:36:44 There are a lot of physical factors it depends on: high voltage, low voltage, or other things. But those are just the static factors. They are really easy to spot in your data; you can really mark down these levels. Then there are variable technical losses in the grid losses. These variable losses basically depend on how much energy runs through the grid.
Thomas Obrist: 00:37:15 The longer the cable and the more energy runs through it,
the higher the grid loss. This is where the problem starts.
It's really difficult, we're not speaking about grid loss in a
small home, it's on a country level of one of the European
countries. It's a big grid loss with a big cable.
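As a rough sketch of why the variable losses rise with flow and cable length, the standard resistive-loss intuition (loss scales with the square of the current, P = I²R) can be put in code. All numbers here are made up for illustration, not real engineering values:

```python
# Illustrative toy model (not from the episode): variable transmission losses
# are often roughly quadratic in the power flowing through a line, since
# resistive loss is P_loss = I^2 * R and current scales with load.

def variable_grid_loss(flow_mw, length_km, loss_coeff=1e-6):
    """Toy model: loss grows with line length and with the square of the flow.
    `loss_coeff` is a made-up constant, purely for illustration."""
    return loss_coeff * length_km * flow_mw ** 2

# Doubling the flow roughly quadruples the variable loss:
low = variable_grid_loss(1000, 500)   # 1 GW over a 500 km line
high = variable_grid_loss(2000, 500)  # 2 GW over the same line
print(high / low)  # 4.0
```

This is why "how much energy runs through it" dominates the variable part of the loss, and why long north-to-south transport in a country like Germany matters so much.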
Thomas Obrist: 00:37:39 How much energy runs through it? It's temperature
dependent, which temperature do you take in a whole
country? I mean, there are many factors. Then, with more
renewable energy, normally renewable energy is not
produced where people live. A good example is Germany,
there are a lot of wind parks in the north of Germany, but
a lot of people live in the south of Germany.
Thomas Obrist: 00:38:05 If wind power is produced it needs to be basically transported from north to south. There is more grid loss because of the distance. But for example, on the other side, if you have
more solar panels on your rooftops, these people have
less grid losses, because they don't need energy from the
grid, and so on.
Thomas Obrist: 00:38:25 Wind parks for example, you have huge offshore wind parks in parts of the Netherlands, Belgium, Germany, and so on. When they produce, the energy has a long way to travel to get to the people, because they're offshore. They're out in the ocean basically, out in the sea. That means much more distance.
Thomas Obrist: 00:38:47 I would say, this makes it really interesting. It's not just
price data, there are fundamental things why this grid
loss is happening. It's temperature dependent, then it's
dependent on solar production, wind production, wind
speed, and other things.
Thomas Obrist: 00:39:04 Then another difficulty is the demand itself, like how
much energy is actually needed? It's difficult to pin down.
For example, for a country, if you look at grid losses near a city, this is really dependent on how much energy the city consumes. For example, if temperature goes up, the city perhaps heats more. So it needs more power to actually heat. There might be more grid losses.
All those factors come together-
Kirill Eremenko: 00:39:34 You mean if the temperature goes down they need more
heat?
Thomas Obrist: 00:39:38 Exactly. Sorry. If temperature goes down they need more
heat, so grid losses might go up. For example, in summer,
if it's too hot, they turn on AC and [inaudible 00:39:51].
There are other factors to take into account which are not
linear. The issue comes with ... I mean, if you would know
all these inputs exactly, it would not be that big of an
issue. It would still be difficult, because which
temperature?
Thomas Obrist: 00:40:04 There is a lot of questions around wind production, the
one in the north, or the one in the south, or which solar?
And so on. There is a lot of uncertainty, but the worst
thing is all this data we're using, we need to decide it
today for tomorrow. For example, wind and solar-
Kirill Eremenko: 00:40:22 So, you don't know the wind speed tomorrow, you don't
know the temperature in different areas tomorrow? All of
your inputs are unknown as well?
Thomas Obrist: 00:40:31 Exactly, all of these inputs are themselves forecasts. Wind production has a MAPE of around 15 to 20%.
Kirill Eremenko: 00:40:44 What's a MAPE?
Thomas Obrist: 00:40:46 Mean absolute percentage error. If you look at the forecast model, wind production can be wrong by up to 20%. I mean, it can be wrong even more, but on average, the absolute error is about 20% of your wind production on the day ahead. Today for tomorrow, up to 15 to 20% error in my forecast. This is just for wind, and solar is extremely wrong as well.
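For reference, a minimal sketch of how a MAPE figure like the 15 to 20% above is computed. The arrays are hypothetical, not real wind data:

```python
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error: average of |error| / |actual|, in %."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return float(np.mean(np.abs(actual - forecast) / np.abs(actual)) * 100)

# Hypothetical day-ahead wind production forecasts (MWh) vs. metered values:
forecast = [120, 80, 200, 150]
actual = [100, 100, 180, 160]
print(round(mape(actual, forecast), 1))  # 14.3
```

Note that MAPE divides by the actual value, so hours with near-zero production blow the metric up; that is one reason forecast errors for wind and solar are often quoted relative to installed capacity instead.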
Thomas Obrist: 00:41:13 It's difficult to forecast solar. I mean, just imagine, it's no
clouds at all tomorrow. You don't see anything on your
weather models, but sometimes there is a small cloud.
You don't spot them, but they might be exactly when you
want to produce at noon, so you have your high peak.
You want to produce as much as possible, and perhaps
then, exactly then, there is a small cloud over your solar
panel. This is impossible to forecast. There is no way to
forecast this on the day ahead.
Thomas Obrist: 00:41:52 Therefore, all these inputs I take, they're hugely wrong. I know they're wrong. I need to deal with: how wrong will they be? And what can I learn from the data itself? What I did was really describe what I see: if I look at wind, can I spot how big the error in wind generation is?
Thomas Obrist: 00:42:16 For example, I don't receive only data for the next day. I
receive wind data, yesterday for tomorrow, three days ago
for tomorrow, four days ago for tomorrow. One idea could
be like-
Kirill Eremenko: 00:42:29 So you receive the wind forecast?
Thomas Obrist: 00:42:33 Yes.
Kirill Eremenko: 00:42:34 Which were in place a day ago, for tomorrow, two days
ago for tomorrow, and so on?
Thomas Obrist: 00:42:38 Exactly.
Kirill Eremenko: 00:42:41 So you can observe how the forecast changed over time?
Thomas Obrist: 00:42:45 Exactly. This could be a feature to study. Assuming, if it
changes a lot, does it make the wind forecast worse or
better? The same thing for solar. Can you spot, perhaps I
know now wind might be wrong tomorrow. Should I
position myself differently? Should I really just look at the
day plus one, or should I look at day plus two, day plus
three, day plus four? And look at the different things. It's not just one time perspective where I focus on the whole 24 hours. I need to focus on each hour: 12:00, 1:00, 2:00, 3:00 and so on.
Thomas Obrist: 00:43:21 I need to look at the data itself. Like, for the forecast for one o'clock tomorrow, I have one today, I had one yesterday, I had one two days ago, three days ago, and so on. I can study this as well. I have a lot of data. The same thing holds true for solar, and for the temperature forecast.
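A minimal sketch of this "forecast vintages as features" idea, where how much the forecast moved between issue days can itself be a feature. The column names and numbers are made up:

```python
import pandas as pd

# Hypothetical table: one row per forecast vintage for the same delivery hour.
vintages = pd.DataFrame({
    "delivery_hour": ["2020-01-10 13:00"] * 4,
    "days_ahead":    [4, 3, 2, 1],          # issued 4, 3, 2, 1 days before delivery
    "wind_mw":       [510.0, 540.0, 480.0, 495.0],
})

# Order from oldest to newest issue, then measure how much each new
# forecast revised the previous one.
vintages = vintages.sort_values("days_ahead", ascending=False)
vintages["revision"] = vintages["wind_mw"].diff()

# Total absolute drift across vintages: a possible proxy for uncertainty.
total_drift = vintages["revision"].abs().sum()
print(total_drift)  # 30 + 60 + 15 = 105.0
```

A large drift might suggest the day-ahead number is less trustworthy, which is exactly the kind of signal described above.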
Thomas Obrist: 00:43:46 Then there is the demand forecast, like how much energy each city needs, and so on. It's not just per city per se, it changes over the year. All these things change, and they have variance and they have uncertainties. You need to think about, is there a way to analyze the different wind inputs? All of these things have an impact on the grid loss. In the end, this is what's interesting about my job: I start a model and it gets feedback immediately.
Thomas Obrist: 00:44:19 I start trading today, and basically tomorrow during the
day I can see if I was right. Metering takes a bit of time,
but let's say I start trading today, and in five days I got
my feedback. If it was right or wrong, or was my action
good or bad?
Thomas Obrist: 00:44:37 This is really I would say rewarding, and challenging,
since it's shorter term. To think about what is right or
wrong? You get immediate feedback. There is a lot of data
to think about, all the different wind forecasts and solar
production. Then you can think of more things. For
example, grid losses could increase as well if, just a
random example, Switzerland, if they buy energy from
Germany, or import energy from France, it's a different grid loss than if they produce it inside of Switzerland.
Thomas Obrist: 00:45:16 You need not just to think about one country, you need to
think about several countries. One big country we always
think about is Germany. There is so much wind
production. What's happening if Germany doesn't
produce that much wind? They import a lot of energy. If
they have too much energy and they produce a lot with
their wind production, it flows to other European markets. So cross border trading activity is really high. It's not like you need to focus on just one country. It's basically all of Europe you have to worry about, and to think about how this could have an impact.
Thomas Obrist: 00:45:56 Of course, if you look at Spain, you don't need to worry
too much let's say about [inaudible 00:46:06]. Actually I
never looked at this data, but I suppose there is not much
correlation going on. Between countries like, let's say, Belgium and France, there is a huge impact on each other, or mostly the price impact is on Belgium, because Belgium is a small country in relation to France of course. But there are so many things going on. So much data to consider, and all of this data I have is wrong, because there is high uncertainty in each data point.
Thomas Obrist: 00:46:38 I think this is really interesting to study, because you can
spend an eternity just going through, "Okay, what happens if the wind forecast from a few days ago was on a hugely different quantile level than the one from one day ago, or two or three days ago?" And so on. There is so much data around, it makes it really interesting.
Kirill Eremenko: 00:47:02 How did your grid loss case study end?
Thomas Obrist: 00:47:07 I mean, I produced a model. Actually, this is really
difficult to produce a model. I think I got a good model,
which generalizes. I did all the testing, but actually there
was a third party who claimed they could do better. Then
there was a challenge, my management wanted, "Of
course, we should do the same thing. We should be even
better than they are. Why are we worse?"
Thomas Obrist: 00:47:35 We had what we call shadow trading, where we get the third party's inputs on a day to day basis, but we actually don't trade them. The issue is, nobody sends a bad back test to us. If you're a third party you need to send your back test to us, and you say, "Okay, this is what we would have done." As I said, nobody sends a bad back test, so you never know if it's overfit or not.
Thomas Obrist: 00:48:00 We have the shadow trading where we go and see, "What is the real out of sample performance?" Because the in sample and out of sample performance should be more or less the same, if you receive the data day to day and then replicate their back test.
Kirill Eremenko: 00:48:23 What is schedule trading?
Thomas Obrist: 00:48:24 Shadow trading is like-
Kirill Eremenko: 00:48:28 Shadow trading? [crosstalk 00:48:29].
Thomas Obrist: 00:48:29 Shadow trading.
Kirill Eremenko: 00:48:30 I'm sorry, I heard schedule. Shadow trading. You're
trading a demo version, you're not trading real money?
Thomas Obrist: 00:48:39 Exactly, we are not trading it, but we receive the data from this other company which says like, "You would have done that." Because if you receive it on a day to day basis, they cannot cheat, because we receive it before the outcome, so they don't actually know what's going to happen.
Kirill Eremenko: 00:48:55 You can evaluate?
Thomas Obrist: 00:48:59 Exactly. We can do real out of sample testing. The issue with this is ... I mean, you cannot do this forever. Trading is really short term, markets are changing a lot. If you have something which works, you don't want to shadow trade it for two years.
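To make the shadow-trading idea concrete, here is a toy sketch (all names and numbers invented): each day the prediction is locked in before the outcome exists, then scored once the real value arrives, so nobody can cheat:

```python
# Toy shadow-trading evaluator: predictions[i] is committed before
# outcomes[i] is known, so the scoring is genuinely out of sample.

def shadow_trade(predictions, outcomes):
    """Walk day by day and return the mean absolute error of the run."""
    daily_errors = []
    for pred, actual in zip(predictions, outcomes):
        daily_errors.append(abs(pred - actual))
    return sum(daily_errors) / len(daily_errors)

# Four days of hypothetical grid-loss forecasts (MWh) vs. metered values:
preds = [42.0, 39.5, 45.0, 41.0]
actuals = [40.0, 41.0, 44.0, 43.0]
print(shadow_trade(preds, actuals))  # (2.0 + 1.5 + 1.0 + 2.0) / 4 = 1.625
</```

The open question discussed next, how many days of this you need before the error estimate is trustworthy, is exactly the testing-versus-trading trade-off.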
Thomas Obrist: 00:49:19 You want to go fast to market. There is the issue like, you produce [inaudible 00:49:23] for one year, but then your out of sample testing comes up. Let's say, just as an example, one month where you can evaluate out of sample, and perhaps it was a bad month. Why was it bad? Is this still a good idea, or was it actually bad? There are so many things to consider.
Thomas Obrist: 00:49:46 It could be just a bad month, and then you will say, "We would trade it anyway," because we can explain why it was a bad month. But perhaps it was a really good month, and then you start actually trading, and it goes south. It's really hard to evaluate, because if you do it for too long you lose value, and your ideas change.
Thomas Obrist: 00:50:11 If you do it for too short, you don't have the statistical sample to actually extrapolate. How long are you out of sample testing? As a data scientist you want to have as much data as possible. As a trader, you want to generate value as soon as possible. You have this trade-off between testing and actually generating money.
Thomas Obrist: 00:50:34 With this grid loss case it was difficult. Actually, I tried a lot of things there. The reason it is so difficult is that there is a lot of uncertainty in the market, and a lot of things that could go wrong, because you have so many wrong inputs to your model. You really don't want to overfit.
Thomas Obrist: 00:50:55 You need to think really deeply. Of course, I had four months where my model didn't behave really well. The question was why. Because if I just build a model which takes this into account, it could be an overfit, because perhaps this situation will not generalize in the future. I need to know why it performed badly. I need to know why it happened, because if it's just about building more features, perhaps we can fix it.
Thomas Obrist: 00:51:25 There is always high complexity, and you can introduce more features, build a more complex model. This will fix your issue on your test set and your training set. But if you look at your test set several times, because your manager came back and said, "Do it again," then you might overfit. You need to balance between overfitting and generalization. This is always the case.
Thomas Obrist: 00:51:56 The difficulty was the third party said, "Our model generalizes better." In the end I improved the model a little bit with new features and more analysis on the data to capture more of the uncertainty in the inputs. Then we said as well, nobody sends a bad back test. To the third party we said, "Your back test was too good to be true." I mean, I don't say they did a bad job or wanted to trick us, but it's really difficult to always generalize well.
Thomas Obrist: 00:52:35 I mean, this is, you know, the problem, the issue ... Even if you think your training and test errors are in balance, they might not be, because there might be some factor which you don't consider.
Kirill Eremenko: 00:52:51 A very interesting case study. I like the trade-off you
described about testing and trading. How the markets
change really fast. It's a different thing, not something
you often see in data science, this trade-off. I guess it's
specific to applying data techniques in market conditions.
Tell us a bit about your hackathon. On LinkedIn I read that you won an international hackathon on predictive modeling of spot prices. Can you tell us a bit about that?
Thomas Obrist: 00:53:36 Exactly. I mean, at Axpo, I did a hackathon for Axpo. I mean, Axpo organized everything. It was for students. Any student could have come, but mostly it was ETH students who study mathematics, mostly machine learning and data science actually. It was a hackathon mainly for students. It was really nice. We went for three days to a power plant of Axpo somewhere in the mountains.
Kirill Eremenko: 00:54:10 This was before you worked at Axpo?
Thomas Obrist: 00:54:12 Actually, I joined once as a student, and once I was the
organizer. I mean, I was not part of renting a room or
something like this, but I got a use case, and then I
prepared the use case. I gathered all the data. I wrote the environment for how students could submit their models. The use case was, we have a lot of wind parks which we manage energy for.
Thomas Obrist: 00:54:43 Some of these wind parks, especially, were in the Nordics region, in the north of Europe. Not all of them, but some of those wind parks send quite real live data. You have a feed, I'd say every 15 minutes, of measurement data of how much the park is producing. You can recalibrate your model for the rest of the day. There is an intra-day market as well.
Thomas Obrist: 00:55:12 You could see if you were really wrong on the day ahead, and perhaps you should adjust with intra-day updates, so you can trade better. Since we have a lot of parks, the idea of the use case was, perhaps there is a correlation between different parks that we don't see.
Thomas Obrist: 00:55:31 For example, in the east of this Nordic country, there was a huge error, but in the west not yet. Perhaps in an hour the error will be there as well.
Kirill Eremenko: 00:55:50 Interesting.
Thomas Obrist: 00:55:50 That is an interesting example, but there could be
different correlations which we don't see yet. This is just
one example like, those are things we don't consider yet
in our data, because we have so much data. Normally, our wind forecasting works like this: you have the park positions, and you give these to a third party. They do a mapping with a wind model, like they look at weather data and everything. They do a mapping of how much your wind park produces, based on the location it is in, which wind turbine it is, and so on.
Thomas Obrist: 00:56:22 Each park is treated on a kind of standalone basis, let's say. Perhaps they miss the correlation between the mappings. If there was an error in the east, this error could happen an hour later in the west, or in the south, or the north, or other things. Perhaps you can infer one park's error from it impacting the other one. What we did: I got all the data from each country for all the wind parks we have.
Thomas Obrist: 00:56:48 Then we gave all this data to the students. We gave wind speed data, temperature measurements, forecasts, measurements. The task was just to try to improve the wind forecast itself. Like, how much is this turbine going to produce? This was the use case. It was a really interesting hackathon. I mean, it's really fundamental, you need to start to think about how a wind turbine produces energy, how it's dependent on wind speed, and other things.
Thomas Obrist: 00:57:23 The funny thing in the Nordics countries is, turbines can
freeze. If the temperature is low enough, even if you have wind, if they're frozen they're not going to produce anything. I think
this makes it really interesting, if they freeze there is a
really huge decrease in production.
Kirill Eremenko: 00:57:45 Interesting. How did that go? Did the students solve the
hackathon?
Thomas Obrist: 00:57:51 I mean, actually there was again one model, which was a really simple one with a good idea, which was a bit better than the baseline. Honestly, I would say the students at the hackathon tried a lot. I think it was interesting for everyone. I would say to some extent it was perhaps a bit difficult as well. First, you need to understand how everything works. We introduced them to the day ahead market price, the intra-day market price, when you can trade something, how a wind turbine is built, how it produces based on wind inputs, the dependencies. Everything was in Python. They knew Python, but we built libraries for them so that they could access the data and other things.
Thomas Obrist: 00:58:45 I would say three days was just not enough to solve this problem. They tried a lot, and I think it was really interesting to see how they progressed. Every half day, I would say, we did a stop where I evaluated all the current models. They submitted something, I rated them and gave them feedback, and we did a round of discussions.
Thomas Obrist: 00:59:11 It was really fun to see how they worked in three days. At the beginning, the first models were like, "Okay, let's just try to load the data, do something, and submit something." They used models like linear regressions or something very simple, with some inputs. The second models then were like, I mean, they used all their techniques. They took a lot of data, built features, and put it in huge models with a high degree of complexity.
Thomas Obrist: 00:59:54 Then the other thing, and this was the second round, and then everyone was really disappointed. I had a hidden data set, on which I evaluated all of the models, and they didn't have access to it. They could basically only test four times. I mean, they had one data set that they split themselves into training and test, but there was one set which only I had.
Thomas Obrist: 01:00:19 Then the second round was ... They tried so much. With one student I think I stayed in the room until three o'clock in the morning, just so that he was able to finish his training. Then it was disappointing, because the second round was worse than the first. What happened was that most models were too high in complexity; they didn't generalize well outside of their set.
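What happened in that second round can be reproduced in miniature. A hedged sketch on synthetic data (not the hackathon's): a higher-degree model always fits its own training data at least as well, but that says nothing about data it never saw:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the hackathon data: a noisy linear relationship,
# e.g. wind speed -> production.
x_train = np.linspace(0.0, 1.0, 20)
y_train = 3.0 * x_train + rng.normal(0.0, 0.3, 20)
x_test = np.linspace(0.02, 0.98, 20)  # held-out points the models never saw
y_test = 3.0 * x_test

simple = np.polyfit(x_train, y_train, 1)    # low-complexity model
complex_ = np.polyfit(x_train, y_train, 9)  # high-complexity model

def mse(coeffs, x, y):
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

# The complex model wins on the data it trained on...
print(mse(simple, x_train, y_train), mse(complex_, x_train, y_train))
# ...but compare on the held-out set before declaring a winner.
print(mse(simple, x_test, y_test), mse(complex_, x_test, y_test))
```

This is the same reason the hidden data set existed: evaluation on points the students never touched is the only score that counts.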
Thomas Obrist: 01:00:46 In the third round they all cut back. So they went and filtered on features, they filtered on data, they reduced the complexity. It was really interesting to see how well it worked. It was really fun. We had one model which was better than the baseline, but overall perhaps we should have chosen an easier use case, rather than trying to solve the issue of European wind production.
Kirill Eremenko: 01:01:16 Very interesting. Interesting to see how people adjust
their thinking, and change the models with your
feedback. This didn't work, make it more complex, less
complex, and so on. That was fun.
Kirill Eremenko: 01:01:30 Thomas, we're actually running out of time. It's been an hour. It's flown by real quick. Before we finish up, just
one final question for you. What's your recommendation
for somebody who wants to get into this space that you're
in? Into energy trading, somebody who is going to be a
data scientist, or starting into this space of data science?
What would you say is an important thing for them to
look into as a first step?
Thomas Obrist: 01:01:58 I mean, if you want to go into energy trading itself, I would say as a data scientist you really need to want to do this. It's trading. First, be interested in trading, start a little bit of trading yourself. This always looks really nice, even if you're a data scientist, if you have the feel of being a trader, or you know what it is to press the button and actually do a trade. I think this is always welcome. So be interested in finance. Also, it's really good if you have some knowledge about quantitative approaches, as I discussed in the beginning: what's a Monte Carlo simulation?
Thomas Obrist: 01:02:39 I mean, quant people know it, but if you're really IT heavy, let's say, and you came from the IT side to data science, it's not necessarily something you have done. This could be something which is a plus on your CV [inaudible 01:02:56].
Kirill Eremenko: 01:02:55 Awesome. That's a cool idea. Look into what trading is all
about. All right, thanks a lot Thomas for coming. It's been
a pleasure. Before we wrap up, where can our listeners
get in touch with you and find you? What's the best
places to connect?
Thomas Obrist: 01:03:18 I mean, just drop me a message on LinkedIn and I try to
respond.
Kirill Eremenko: 01:03:23 Awesome. One final question for you. What's a book or
books that you can recommend for our listeners?
Thomas Obrist: 01:03:31 I would say I recommend two books. One is Systematic Trading by Robert Carver. This is not about data science, it's more about trading in general. It gets you thinking about, how could I use a data approach, or a quantitative approach, for trading. It's a really nice read-and-apply book about how to build a framework for quantitative trading. It starts you thinking about how to generalize ideas.
Thomas Obrist: 01:03:58 The other book, I mean, most of the time I just read papers, but as a student I went through Deep Learning by Ian Goodfellow. It's long, but I thought it was perfect. It's really detailed, and I really liked reading through it. It takes some time, I think it's 800 pages, but once you get through it, I think it's really nice.
Kirill Eremenko: 01:04:23 Is that the one that's for free?
Thomas Obrist: 01:04:25 Yeah. I think it's from MIT Press. You can buy it on Amazon, but there is an HTML version as well, where you can access it for free.
Kirill Eremenko: 01:04:36 Yeah, I think it's deeplearningbook.org. That's the
website. It's been recommended a few times. Ian
Goodfellow, Yoshua Bengio, and Aaron Courville. You can
access it there for free if you're interested. Is it a good
book?
Thomas Obrist: 01:04:56 I think it is a really nice book. If you read through it you'll
know everything about networks, and deep learning. It's a
nice book to read through.
Kirill Eremenko: 01:05:07 It's about four years old though. Do you think it's still up
to date? Is it still relevant?
Thomas Obrist: 01:05:14 It depends on which level you are at, I would say. If you are a student ... I think it's still on the reading list for the ETH deep learning lecture. I've not looked this year, but this book was on the lecture list, and they go through a part of this book in the lecture.
Thomas Obrist: 01:05:37 I would say if you want to get into deep learning, this
book covers it very well. I mean, if you are a front runner
in the research, perhaps not. Then I would recommend
something different. Depending on which level you are. I
thought as a book it gives you a really good overview of a
lot of the concepts.
Kirill Eremenko: 01:05:56 Awesome.
Kirill Eremenko: 01:06:02 Well, thank you for the recommendations, and on that
note we're going to wrap up. Thanks a lot Thomas for
coming on the show. It was real fun.
Thomas Obrist: 01:06:11 Thank you very much.
Kirill Eremenko: 01:06:17 There you go everybody. I hope you enjoyed this episode.
As mentioned at the beginning it was quite advanced, and covered a lot of topics. I'm sure we could have dived deeper into many of them, but we touched on quite a lot, very briefly. My favorite part was the trade-off between testing and trading. It resembles the trade-off between exploration and exploitation. In this case, once you have a model that you've back tested, and you've verified that it works, then you want to forward test it. Basically you want to put it onto the market and shadow trade it for a bit, to make sure that your model wasn't overfitting. Not just an out of sample test, but an out of sample test on live data that comes in with all the glitches, delays and lags, and everything else that resembles real world markets, things that are sometimes quite hard to recreate in a back test, even an out of sample back test.
Kirill Eremenko: 01:07:19 You want to put it on, and shadow trade it for a bit, but the question is for how long? If you shadow trade it for four months, you might get your validation, but by then markets might have changed, and as soon as you switch to real trading, it's no longer working. On the other hand, if you shadow trade for too short, say a week, you might not get enough data to validate that it's working, and when you switch to live trading, again it's not working. An interesting balance. I love these situations where you have to decide on a balance, and they show you there isn't one right answer. It's on a case by case basis. Maybe there are some guiding principles, but it's ultimately an art that data scientists have to practice.
Kirill Eremenko: 01:08:03 I'm sure you had your own favorite parts from this
episode. As always, the show notes are available at
superdatascience.com/405 where you can find the
transcript for this episode, any materials we mentioned,
and URLs to connect with Thomas. Hit him up on
LinkedIn, especially if you're interested in the space of
energy or quantitative analysis of markets and trading.
I'm sure he'll be happy to help out. If you know somebody in this space, it's very easy to send them the episode. Just share the link superdatascience.com/405.
Kirill Eremenko: 01:08:40 On that note, thank you so much for being here today, I
look forward to seeing you back here next time. Until
then, happy analyzing.