10
Research Article A New Kernel of Support Vector Regression for Forecasting High-Frequency Stock Returns Hui Qu 1 and Yu Zhang 2 1 School of Management and Engineering, Nanjing University, No. 22 Hankou Road, Nanjing 210093, China 2 School of Physics, Nanjing University, No. 22 Hankou Road, Nanjing 210093, China Correspondence should be addressed to Hui Qu; [email protected] Received 22 December 2015; Accepted 31 March 2016 Academic Editor: Yannis Dimakopoulos Copyright © 2016 H. Qu and Y. Zhang. is is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. is paper investigates the value of designing a new kernel of support vector regression for the application of forecasting high- frequency stock returns. Under the assumption that each return is an event that triggers momentum and reversal periodically, we decompose each future return into a collection of decaying cosine waves that are functions of past returns. Under realistic assumptions, we reach an analytical expression of the nonlinear relationship between past and future returns and introduce a new kernel for forecasting future returns accordingly. Using high-frequency prices of Chinese CSI 300 index from January 4, 2010, to March 3, 2014, as empirical data, we have the following observations: (1) the new kernel significantly beats the radial basis function kernel and the sigmoid function kernel out-of-sample in both the prediction mean square error and the directional forecast accuracy rate. (2) Besides, the capital gain of a simple trading strategy based on the out-of-sample predictions with the new kernel is also significantly higher. erefore, we conclude that it is statistically and economically valuable to design a new kernel of support vector regression for forecasting high-frequency stock returns. 1. Introduction Although the efficient market hypothesis is one of the most influential theories in the past few decades, researchers have never given up examining the predictability of stock returns. e complexity of the market makes the relationship between past and future financial data nonlinear [1, 2]. Linear statistical models, such as the autoregressive (AR) model, the autoregressive moving average (ARMA) model, and the autoregressive integrated moving average (ARIMA) model, are apparently powerless when compared to non- linear approaches such as the generalized autoregressive conditional heteroskedasticity (GARCH) model, the artificial neural network (ANN), and the support vector machine for regression (SVR). Atsalakis and Valavanis [3] provide a very comprehensive review of the nonlinear models used in stock market forecasting. With its remarkable generalization performance, support vector machine (SVM), firstly designed for pattern recognition by Vapnik [4], has gained extensive applications in regression estimation (in which it is called SVR) and is thus introduced to time series forecasting problems. SVR is compared with multiple other models such as the backpropa- gation (BP) neural network [5–7], the regularized radial basis function (RBF) neural network [6], the case-based reasoning (CBR) approach [7], the GARCH-class models [8] and shows superior forecast performance. e kernel function used in SVR plays a crucial role in capturing the nonlinear dynamics of the time series under study. Several commonly used kernels, for example, the radial basis function kernel, are firstly derived mathematically and are widely applied in time series forecasting problems. Parameters are empirically tuned by researchers to achieve good performance of prediction. In addition, several researchers argue that using a single kernel may not solve a complex problem satisfactorily and thus pro- pose the multiple-kernel SVR approach [9–11]. Yeh et al. [11] show that multikernel SVR outperforms single-kernel SVR in forecasting the daily closing prices of Taiwan capitalization weighted stock index. Besides, Huang et al. [12] linearly combine the predicting result of SVR with those of other Hindawi Publishing Corporation Mathematical Problems in Engineering Volume 2016, Article ID 4907654, 9 pages http://dx.doi.org/10.1155/2016/4907654

Research Article A New Kernel of Support Vector Regression ...downloads.hindawi.com/journals/mpe/2016/4907654.pdf · neural network (ANN), and the support vector machine for regression

  • Upload
    others

  • View
    8

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Research Article A New Kernel of Support Vector Regression ...downloads.hindawi.com/journals/mpe/2016/4907654.pdf · neural network (ANN), and the support vector machine for regression

Research ArticleA New Kernel of Support Vector Regression forForecasting High-Frequency Stock Returns

Hui Qu1 and Yu Zhang2

1School of Management and Engineering Nanjing University No 22 Hankou Road Nanjing 210093 China2School of Physics Nanjing University No 22 Hankou Road Nanjing 210093 China

Correspondence should be addressed to Hui Qu linda59qunjueducn

Received 22 December 2015 Accepted 31 March 2016

Academic Editor Yannis Dimakopoulos

Copyright copy 2016 H Qu and Y ZhangThis is an open access article distributed under the Creative Commons Attribution Licensewhich permits unrestricted use distribution and reproduction in any medium provided the original work is properly cited

This paper investigates the value of designing a new kernel of support vector regression for the application of forecasting high-frequency stock returns Under the assumption that each return is an event that triggers momentum and reversal periodicallywe decompose each future return into a collection of decaying cosine waves that are functions of past returns Under realisticassumptions we reach an analytical expression of the nonlinear relationship between past and future returns and introduce a newkernel for forecasting future returns accordingly Using high-frequency prices of Chinese CSI 300 index from January 4 2010 toMarch 3 2014 as empirical data we have the following observations (1) the new kernel significantly beats the radial basis functionkernel and the sigmoid function kernel out-of-sample in both the predictionmean square error and the directional forecast accuracyrate (2) Besides the capital gain of a simple trading strategy based on the out-of-sample predictions with the new kernel is alsosignificantly higherTherefore we conclude that it is statistically and economically valuable to design a new kernel of support vectorregression for forecasting high-frequency stock returns

1 Introduction

Although the efficient market hypothesis is one of the mostinfluential theories in the past few decades researchershave never given up examining the predictability of stockreturnsThe complexity of the market makes the relationshipbetween past and future financial data nonlinear [1 2]Linear statistical models such as the autoregressive (AR)model the autoregressive moving average (ARMA) modeland the autoregressive integrated moving average (ARIMA)model are apparently powerless when compared to non-linear approaches such as the generalized autoregressiveconditional heteroskedasticity (GARCH)model the artificialneural network (ANN) and the support vector machine forregression (SVR) Atsalakis and Valavanis [3] provide a verycomprehensive review of the nonlinear models used in stockmarket forecasting

With its remarkable generalization performancesupport vector machine (SVM) firstly designed for patternrecognition by Vapnik [4] has gained extensive applications

in regression estimation (in which it is called SVR) and isthus introduced to time series forecasting problems SVR iscompared withmultiple other models such as the backpropa-gation (BP) neural network [5ndash7] the regularized radial basisfunction (RBF) neural network [6] the case-based reasoning(CBR) approach [7] the GARCH-class models [8] and showssuperior forecast performance The kernel function used inSVR plays a crucial role in capturing the nonlinear dynamicsof the time series under study Several commonly usedkernels for example the radial basis function kernel arefirstly derived mathematically and are widely applied in timeseries forecasting problems Parameters are empirically tunedby researchers to achieve good performance of prediction Inaddition several researchers argue that using a single kernelmay not solve a complex problem satisfactorily and thus pro-pose the multiple-kernel SVR approach [9ndash11] Yeh et al [11]show thatmultikernel SVR outperforms single-kernel SVR inforecasting the daily closing prices of Taiwan capitalizationweighted stock index Besides Huang et al [12] linearlycombine the predicting result of SVR with those of other

Hindawi Publishing CorporationMathematical Problems in EngineeringVolume 2016 Article ID 4907654 9 pageshttpdxdoiorg10115520164907654

2 Mathematical Problems in Engineering

classifiers trained with the same data set and realize betterforecast performance of the weekly movement direction ofNIKKEI 225 index However current applications seldomtouch the inner structure of the existing kernels whichdepicts the nonlinear relationship between past and futuredata Thus it is reasonable to argue that if the kernel isdesigned according to the specific nonlinear dynamics of theseries under study improvement in forecast accuracy can beexpected

In addition the above-mentioned studiesmostly use daily(or lower frequency) data in their empirical experimentsSince high-frequency trading has gained its popularity inrecent years the ability to forecast intraday stock returnsis becoming increasingly important Thus in this study weinstead consider the forecasting of high-frequency stockreturns Matıas and Reboredo [13] and Reboredo et al[14] empirically show the forecast ability of SVR for high-frequency stock returns by directly using the radial basisfunction kernel Instead of directly applying some conven-tional kernel or some combination of conventional kernelswe design a kernel for the specific forecasting problemSpecifically under the assumption that each high-frequencystock return is an event that triggers momentum and rever-sal periodically we decompose each future return into acollection of decaying cosine waves that are functions ofpast returns After taking several realistic assumptions wereach an analytical expression of the nonlinear relationshipbetween past and future returns and design a new kernelaccordingly One-minute returns of Chinese CSI 300 indexare used as empirical data to evaluate the new kernel ofSVR We show that the new kernel significantly beats theconventional radial basis function and sigmoid functionkernels in both the prediction mean square error and thedirectional forecast accuracy rate Besides the capital gain ofa practically simple trading strategy based on the predictionswith the new kernel is also significantly higher

The remainder of this paper is organized as followsSection 2 introduces the basic idea of SVR Section 3 presentsour basic assumptions and designs the new kernel Section 4determines the newly introduced kernel parameters andcompares the new kernel with two commonly used kernelsin terms of the forecast performance of the SVR Finally theconclusions are drawn in Section 5

2 Support Vector Machine for Regression

First designed by Vapnik as a classifier [4] SVR is featuredwith the capability of capturing nonlinear relationship inthe feature space and thus is also considered as an effectiveapproach to regression analysis The following sketches thebasic idea of SVR For more detailed illustration of SVRplease refer to Burges [15]

21 SVR for Linear Regression In a regression problem givena finite data set 119865 = (x119896 119910119896)

119899

119896=1derived from an unknown

function 119910 = 119892(x) with noise we need to determine afunction 119910 = 119891(x) solely based on 119865 and to minimize thedifference between119891 and the unknown function 119892 For linear

regression 119892 is assumed to be a linear relationship between xand 119910

119910 = 119892 (xw 119887) = w sdot x + 119887 =119898

sum

119895=1

119908119895119909119895 + 119887 (1)

where x is called feature vector and the space X it lives inis named as feature space 119898 is the dimension of the featurevector x and the feature space X 119910 is referred to as the labelfor each (x 119910) Now that the relationship to be determined isassumed linear our goal is to find a hyperplane 119910 = 119891(x) inthe119898+1dimension space where (x119896 119910119896)

119899

119896=1are plotted and

to minimize the fitting errors by adjusting the parameters Asis proven by Vapnik the hyperplane is given as

119910 = 119891 (x120572 119887) = sum119896

120572119896119910119896x119896 sdot x + 119887 (2)

where x119896rsquos are support vectors in the given data set 119865 and119910119896rsquos are the corresponding labels ldquosdotrdquo represents the innerproduct in the feature space X Finding the support vectorsand determining the parameters 120572 and 119887 turn out to be alinearly constrained quadratic programming problem thatcan be solved in multiple ways (eg the sequential minimaloptimization algorithm [16]) Such a process conducted onthe given data set 119865 is called learning Once the learningphase is done the model built can be used to predict thecorresponding label 119910 from any feature vector x in the featurespace X

22 SVR for Nonlinear Regression However the linear rela-tionship assumption is often too simple to characterize thedynamics of the time series and thus it is necessary toconsider the case when 119892 is nonlinear The idea of SVR fornonlinear regression is to build amapping x rarr 120601(x) from theoriginal 119898 dimension feature space X to a new feature spaceX1015840 whose dimension depends on the mapping scheme and isnot necessarily finite In the new space X1015840 the relationshipbetween the new feature vector 120601(x) and label 119910 is believedto be in a linear form By building a proper mapping thenonlinear relationship can be approximated by doing in thenew feature spaceX1015840 exactly the same thing as is done for thelinear case and it can be proven that the nonlinear version of(2) is

119910 = 119891 (x120572 119887) = sum119896

120572119896119910119896119870(x119896 x) + 119887 (3)

where 119870(x119896 x) = 120601(x119896) sdot 120601(x) is the kernel function and ldquosdotrdquorepresents the inner product in the new feature space X1015840

The new feature 120601(x) which can be an infinite dimensionvector is usually not necessary to be computed explicitlysince we normally work with the kernel function in thetraining and forecasting phases Accordingly the kernelfunction is essential to the performance of SVR Any functionsatisfying Mercerrsquos condition can be used as the kernelfunction Commonly used kernels include the radial basisfunction kernel119870(x y) = exp(minusxminus y21205902) and the sigmoidfunction kernel 119870(x y) = tanh(120581x sdot y minus 120575) where 120590 120581 and 120575are kernel parameters that can be tuned

Mathematical Problems in Engineering 3

3 A New Kernel for ForecastingHigh-Frequency Stock Returns

Most applications of SVR directly apply the commonlyused kernels and tune the kernel parameters for improvedforecast performance However we argue that a specificallydesigned kernel function which builds on the properties ofthe underlying data can enable the SVR to better capturethe nonlinear relationship between the original feature vectorx and label 119910 Thus in this study we develop a new kernelspecifically for forecasting high-frequency stock returns fromsome basic assumptions about the stock market

31 High-Frequency Stock Return Series Forecasting ProblemHigh-frequency stock return series refers to the return timeseries with one-minute or comparatively small time intervalsGiven a return time series 119868119899 = 119903119899 119903119899minus1 119903119899minus2 frompresent119905 = 119899 to some time point in the history a forecastingproblem is to find the one-step-ahead return 119903119899+1 based onthe knowledge of 119868119899

To put it in another way we need to determine thefunction 119903119899+1 = 119891(119903119899 119903119899minus1 119903119899minus2 ) that best fits the givenreturn series 119868119899 (Some studies introduce exogenous variablesin stock return series forecasting However since we focuson the high-frequency return series where the inefficiencyof the market is obvious [17 18] it is reasonable to believethat the return series itself can provide enough informationfor its forecasting) The vector r119899 = (119903119899 119903119899minus1 119903119899minus2 ) is thusthe feature vector and 119903119899+1 is the label The training data set119865 includes every single return as label and all the returnsbefore it as the elements of the corresponding feature vectorAccording to former studies 119891 is believed to be nonlinearconsidering the complexity of the financial market

32 The New Kernel for SVR It is straightforward that everysingle return has impact on the market Due to behavioraleffects such as overreaction andunderreaction such impact isnot unidirectional during its remaining period In this studywe assume that each high-frequency stock return is an eventthat triggers momentum and reversal periodically and thusexpress the impact generated by the return 119903119894 as

119860(119903119894 119905119901) = 119903119894 cos [120596 (119894 119905119901)10038161003816100381610038161199031198941003816100381610038161003816 119905119901 + 120593 (119894)] 119890

minus120582119905119901 (4)

where 119894 is the time point when the return 119903119894 first occurs 119905119901is the time past since 119905 = 119894 and 120582 is the parameter thatcontrols the decay rate 120596(119894 119905119901)|119903119894| is the frequency of thecosine wave where 120596(119894 119905119901) is a nonzero factor and 120593(119894) is thephase factor Such an expression ensures that the same levelof return occurring at different time points can have differentimpact waves

According to (4) a larger return leads to greater (ampli-tude) and faster (frequency) price change in the future andas time goes by such an impact wave will gradually fadeto vanity This exactly meets the basic intuition about how

the market reacts to events For subsequent deduction (4) istransformed as follows

119860(119903119894 119905119901) = 119903119894 119886119894 cos [120596 (119894 119905119901)10038161003816100381610038161199031198941003816100381610038161003816 119905119901]

+ (1 minus 1198862

119894)12

sin [120596 (119894 119905119901)10038161003816100381610038161199031198941003816100381610038161003816 119905119901] 119890

minus120582119905119901

(5)

where 119886119894 isin [minus1 1]It is natural to assume that every return is comprised

of all the impact waves generated by the past returns (Weonly consider the predictable part of every future returnand ignore the innovation part) Thus we decompose everyreturn 119903119899 into a collection of decaying cosine waves

119903119899 =

infin

sum

119894=1

119860 (119903119899minus119894 119894Δ119905) (6)

where Δ119905 denotes the time interval length which is 1 minutein this study and thus the time passed since event 119903119899minus119894 is 119905119901 =119894Δ119905

Now we substitute (5) into (6) Taylor serial expansionis used on the ldquosinrdquo and ldquocosrdquo terms and the coefficients arerearranged to generate the following

119903119899 =

infin

sum

119894=1

infin

sum

119896=0

119862119894119896119903119899minus119894 [120596 (119899 minus 119894 119894Δ119905)1003816100381610038161003816119903119899minus119894

1003816100381610038161003816 119894Δ119905]119896119890minus120582119894Δ119905

=

infin

sum

119894=1

infin

sum

119896=0

1198621015840

119894119896119903119899minus119894 (

1003816100381610038161003816119903119899minus1198941003816100381610038161003816 119894Δ119905)119896119890minus120582119894Δ119905

(7)

where 1198621015840119894119896119894119896 is the coefficient set to be determined

Therefore based on the assumptions above a nonlinearrelationship between past returns and the one-step-aheadfuture return is derived It is easy to see that 119903119899 is a linearcombination of 119903119899minus119894(|119903119899minus119894|119894Δ119905)

119896119890minus120582119894Δ119905

119894119896Thus it is possible tomap the original feature vectors to a proper form to make therelationship between label and feature vector linear Howeverwe once again fall into the dilemma that (7) is a collection ofinfinite series which makes the mapped feature vectors haveinfinite dimension and hard to compute It is also unlikely toderive the kernel function instead like before

To solve the above-mentioned problem we first considerthe decay property of the impact waves which is presented bythe factor 119890minus120582119894Δ119905 Since the decay rate 120582 is constant we believethat the impact of an event that is119898Δ119905 before the time beingis negligible and 119898 is the minimum integer that satisfies thecondition 119890

minus120582119898Δ119905119890minus120582Δ119905

lt 120575 where 120575 rarr 0 Since we donot know how little 120575 should be 119898 is determined throughexperiments in Section 42

It is also necessary to consider the high-frequency prop-erty of the data Such a property ensures that the scale of thetime interval Δ119905 is subtle compared to the time scale of thefluctuation of the market Thus it is reasonable to assumeΔ119905 ≪ 2120587120596(119894 119905119901)|119903119894| forall119894 where the right hand side is theperiod of the impact wave of return 119903119894 and ldquo≪rdquo representsat least 2 orders of magnitude smaller Furthermore sincethe Taylor serial expansion reserved till order 119902 has an error

4 Mathematical Problems in Engineering

err (119909) = (119891(119902+1)

(120585)(119902 + 1))119909119902+1 the error from truncating

the Taylor expansions in (7) satisfies

err le[120596 (119899 minus 119894 119894Δ119905)

1003816100381610038161003816119903119899minus1198941003816100381610038161003816 119894Δ119905]119902+1

(119902 + 1)

= (120596 (119899 minus 119894 119894Δ119905)

1003816100381610038161003816119903119899minus1198941003816100381610038161003816 Δ119905

2120587⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟

≪1

)

119902+1

sdot(2120587119894)119902+1

(119902 + 1)

(8)

Thus setting 119902 = 3 can well ensure the accuracy of theapproximation

Now a much more accessible approximation of (7) isderived

119903119899 =

119898

sum

119894=1

119902

sum

119896=0

1198621015840

119894119896119903119899minus119894 (

1003816100381610038161003816119903119899minus1198941003816100381610038161003816 119894Δ119905)119896119890minus120582119894Δ119905

(9)

Equation (9) explicitly states how the mapping is con-structed For any past return series 119903119899minus119894

119898

119894=1 the original

feature vector is r119899minus1 = (119903119899minus1 119903119899minus2 119903119899minus119898) and themapping120601 is defined as

120601 (r119899minus1) = 119903119899minus119894 (1003816100381610038161003816119903119899minus119894

1003816100381610038161003816 119894Δ119905)119896119890minus120582119894Δ119905

119898119902

119894=1 119896=0 (10)

It is important to understand that the new feature vector120601(r) is 119898 times (119902 + 1) dimension with each element 120601119894119896 givenas 120601119894119896 = 119903119899minus119894(|119903119899minus119894|119894Δ119905)

119896119890minus120582119894Δ119905 Now that 120601(r) has finite

dimension the dimension of the new feature space X1015840 is alsofinite and the kernel function is naturally built as the innerproduct in such a space

119870(r(1) r(2)) =119898

sum

119894=1

119902

sum

119896=0

120601(1)

119894119896120601(2)

119894119896 (11)

Although it is theoretically important to examinewhetherthe newkernel satisfiesMercerrsquos condition or not consideringthat some kernels which fail to meet the condition stilllead to perfectly converged results [15] we will examine theappropriateness of the new kernel through experiments

4 Empirical Experiments

41 Data One-minute prices of Chinese CSI 300 index fromJanuary 4 2010 to March 3 2014 (1000 trading days) areused as empirical data in this study The data are obtainedfromWind Financial Terminal and days with missing pricesare deleted The official trading hours are from 930 to 1130and from 1300 to 1500 and thus we have 240 one-minutereturns per day The returns within the same trading day areused for learning and forecasting since the continuousnessof the time is important in the above derivation (Althoughthere is a 15-hour break in each trading day buy and sellorders submitted in themorning are still valid and neworderscan still be submitted during this period thus we deem thetime as continuous in each trading day) Specifically the first100 + 119898 returns within each trading day are set as the in-sample data (119898 le 30) which are used for learning and the

last 110 returns are set as the out-of-sample data which areused for prediction and evaluationThe average performanceof the SVR during the first 500 trading days that is fromJanuary 4 2010 to February 1 2012 is used for determiningthe kernel parametersThe last 500 trading days that is fromFebruary 2 2012 to March 3 2014 is used for performancecomparison against the commonly used kernels To improvethe performance the logarithmic returns are normalized to[minus1 1] before being input into the SVR

42 Determining the Kernel Parameters Before evaluatingthe performance of the new kernel multiple parameters stillneed to be determined to make best use of the kernel Theundetermined parameters are119898 which measures how manyhistorical data are used 120582 which measures how fast theimpact of one event decays with time andΔ119905 which indicateshow we represent 1 minute numerically We optimize theparameters according to the out-of-sample forecast perfor-mance of the corresponding SVR which is evaluated byMSEand hit rate respectively MSE is the mean square error of thepredictions and is computed using the normalized returnsHit rate is the proportion of predicted returns that have thesame sign with the actual ones that is the directional forecastaccuracy rate

To determine 119898 all the other parameters are fixed and119898 varies from 1 to 30 The average MSE and hit rate in thefirst 500 trading days are plotted in Figures 1(a) and 1(b)respectively (Figures 1(a) and 1(b) are plotted with 120582 and Δ119905set to the optimal values Actually the trends of the plotsdo not vary with the other kernel parameters) The averageMSE decreases sharply with 119898 when 119898 le 21 is relativelyconstant when 119898 isin [21 26] and decreases slowly with 119898

when119898 isin [26 30]The average hit rate increases sharply with119898when119898 le 21 is relatively constant when119898 isin [21 25] andgets smaller afterwards Considering that smaller MSE andgreater hit rate are preferred 119898 = 30 and 119898 isin [21 25] areall appropriate choices suggested by the experiments In thefollowing comparative experiments 119898 is set to 25 (We havealso done comparative experiments with 119898 set to the othersuggested values and the results are consistent)

To determine 120582 all the other parameters are fixed and120582 varies from 1 to 50 The average MSE and hit rate in thefirst 500 trading days are plotted in Figures 2(a) and 2(b)respectively It is quite clear that the minimum average MSEand maximum average hit rate are both achieved at 120582 =

10 and thus we set 120582 to 10 in the following comparativeexperiments

The same method is used to optimize Δ119905 and the averageMSE and hit rate are plotted in Figures 3(a) and 3(b)respectively We can see that the minimum average MSE andmaximum average hit rate are both achieved at Δ119905 = 005and thus we set Δ119905 to 005 in the following comparativeexperiments

43 NewKernel versus CommonlyUsed Kernels Thenewker-nel is compared with the commonly used kernels in terms ofthe out-of-sample forecast performance of the correspondingSVR Although any function satisfying Mercerrsquos condition

Mathematical Problems in Engineering 5

003

0031

0032

0033

0034

0035

003650

0-da

y av

erag

e MSE

5 10 15 20 25 300(m)

(a) Average MSE versus119898

069

07

071

072

500-

day

aver

age h

it ra

te

5 10 15 20 25 300(m)

(b) Average hit rate versus119898

Figure 1 500-day average MSE and hit rate achieved with different values of119898

0031

0032

0033

0034

0035

0036

0037

500-

day

aver

age M

SE

10 20 30 40 500120582

(a) Average MSE versus 120582

067

068

069

07

071

072

07350

0-da

y av

erag

e hit

rate

10 20 30 40 500120582

(b) Average hit rate versus 120582

Figure 2 500-day average MSE and hit rate achieved with different values of 120582

can be used as the kernel function radial basis functionand sigmoid function are two widely used kernels Theformer tends to outperformothers under general smoothnessassumption [12] and the latter gives a particular kind of two-layer sigmoidal neural network [15] Thus these two kernelfunctions are used for performance comparison We realizeeach SVR by LIBSVM-320 [19] and the related codes aremodified for the new kernel

The out-of-sample MSE and hit rate are computed foreach of the three kernels on each of the last 500 trading daysThe results of the new kernel are plotted against those of theradial basis function kernel and the sigmoid function kernelin Figures 4 and 5 respectively

In Figure 4(a) the vertical and horizontal axes representthe MSE of the new kernel and the radial basis functionkernel respectively while in Figure 4(b) the vertical andhorizontal axes represent the hit rate of the newkernel and the

radial basis function kernel respectivelyThere are 500 pointscorresponding to the 500 trading days plotted in each subplotand the diagonal line representing 119910 = 119909 is for referenceWe can see that most points lie below the line 119910 = 119909 inFigure 4(a) and lie above the line 119910 = 119909 in Figure 4(b) Thisindicates that the newkernel leads to smallerMSE and greaterhit rate than the radial basis function kernel in most of the500 trading days Similarly Figures 5(a) and 5(b) indicate thatthe new kernel leads to smaller MSE and greater hit rate thanthe sigmoid function kernel in most of the 500 trading daysTherefore the SVR with the new kernel has obviously betterforecast performance in terms of both the MSE and the hitrate

In addition a simple trading strategy is carried out basedon the out-of-sample forecasts of the SVR with differentkernel specifications The initial capital is set as 100 oneach trading day The index is bought if the one-step-ahead

6 Mathematical Problems in Engineering

20 40 60 80 100 1200Δt(times100)

0031

0032

0033

0034

0035

0036

0037

500-

day

aver

age M

SE

(a) Average MSE versus Δ119905

20 40 60 80 100 1200Δt(times100)

069

07

071

072

073

500-

day

aver

age h

it ra

te

(b) Average hit rate versus Δ119905

Figure 3 500-day average MSE and hit rate achieved with different values of Δ119905

002 004 006 008 01 012 0140MSE-radial basis kernel

0

002

004

006

008

01

012

014

MSE

-new

ker

nel

(a) MSE

045 055 065 075 085035Hit rate-radial basis kernel

035

045

055

065

075

085H

it ra

te-n

ew k

erne

l

(b) Hit rate

Figure 4 New kernel versus radial basis function kernel from February 2 2012 to March 3 2014

predicted return is positive and exceeds a threshold and soldif the one-step-ahead predicted return is negative and below athreshold and no action is performed otherwiseThe thresh-old is set as the average of the (100 + 119898) normalized returnsin the training period divided by the scale coefficient ofln(1000) (The scale coefficient controls the trading strategyrsquossensitivity to index price change and a higher value leads tomore frequent tradingThe value of ln(1000) is arbitrarily setand can be adjusted The results are consistent as the scalecoefficient varies)

The variation of capital under such a strategy in the 110-minute out-of-sample period on February 2 2012 is plottedin Figure 6(a) We can see that the new kernel leads to highercapital gain than the other two kernels most of the timeAlso the average capital variation in the last 500 tradingdays is plotted in Figure 6(b) Unlike the fluctuant plots inFigure 6(a) the capital increases steadily when it is averaged

over the 500-day period no matter which kernel is usedThis confirms the effectiveness of SVR in forecasting high-frequency stock returns Once again the new kernel leads tothe highest capital gain and the advantage gets more obviousas the trading period is prolonged At the end of a day the newkernel leads to a return at about 06110min on averagewhile the resulting returns of the other two kernels are bothless than 04110min on average

Furthermore Studentrsquos 119905-test is used to test whether thenew kernel significantly outperforms the commonly usedkernels Specifically we calculate the differences of MSEhit rate and 110-minute capital gain between the forecastswith the new kernel and those with each comparative kernelrespectively on each of the last 500 trading days Table 1reports the mean (119909119889) and the standard deviation (119904119889) ofthese differences in these 500 trading days We test the nullhypothesis that ldquothe forecasts with the new kernel and those

Mathematical Problems in Engineering 7

002 004 006 008 01 012 0140MSE-sigmoid kernel

0

002

004

006

008

01

012

014M

SE-n

ew k

erne

l

(a) MSE

045 055 065 075 085035Hit rate-sigmoid kernel

035

045

055

065

075

085

Hit

rate

-new

ker

nel

(b) Hit rate

Figure 5 New kernel versus sigmoid function kernel from February 2 2012 to March 3 2014

New kernelRadial basis kernelSigmoid kernel

20 40 60 80 100 1200Time (min)

100

1005

101

1015

102

1025

Capi

tal

(a) Capital variation on February 2 2012

New kernelRadial basis kernelSigmoid kernel

100

1001

1002

1003

1004

1005

1006Av

erag

e cap

ital

20 40 60 80 100 1200Time (min)

(b) Average capital variation from February 2 2012 to March 3 2014

Figure 6 Capital variation of a simple trading strategy

with the comparative kernel have the same accuracy in termsof the specified criterionrdquo with the comparative kernel andthe criterion specified in the first row and the second row ofTable 1 respectively The 119905-statistics are reported in the lastrow of Table 1

We can easily see that the six null hypotheses are all signif-icantly rejected The new kernel has smaller MSE greater hitrate and higher 110-minute capital gain than both the radialbasis function kernel and the sigmoid function kernel Andthe differences are all significant at the 1 level Thereforethe results of Studentrsquos 119905-test indicate that the improvementin out-of-sample forecast accuracy brought about by the newkernel is significant both statistically and economically and

thus the newkernel is preferred in forecasting high-frequencystock returns

5 Summary and Conclusion

Support vector machine for regression is now widely appliedin time series forecasting problems Commonly used kernelssuch as the radial basis function kernel and the sigmoidfunction kernel are first derived mathematically for patternrecognition problems Although their direct applications intime series forecasting problems can generate remarkableperformance we argue that using a kernel designed according

8 Mathematical Problems in Engineering

Table 1 Out-of-sample forecast performance comparison results from February 2 2012 to March 3 2014 (119899 = 500)

Criterion New kernel versus (minus) radial basis kernel New kernel versus (minus) sigmoid kernelMSE Hit rate 110-min gain MSE Hit rate 110min gain

119909119889 minus15201119890 minus 3 48709119890 minus 2 20250119890 minus 3 minus13917119890 minus 3 53945119890 minus 2 23879119890 minus 3

119904119889 53684119890 minus 3 60785119890 minus 2 28530119890 minus 3 53658119890 minus 3 63875119890 minus 2 32202119890 minus 3

119905 =119909119889radic119899

119904119889

minus633lowastlowastlowast

1792lowastlowastlowast

1587lowastlowastlowast

minus580lowastlowastlowast

1888lowastlowastlowast

1658lowastlowastlowast

Note lowastlowastlowast indicates significance at the 1 level

to the specific nonlinear dynamics of the series under studycan further improve the forecast accuracy

Under the assumption that each high-frequency stockreturn is an event that triggers momentum and reversalperiodically we decompose each future return into a col-lection of decaying cosine waves that are functions of pastreturns Under realistic assumptions we reach an analyticalexpression of the nonlinear relationship between past andfuture returns and thus design a new kernel specificallyfor forecasting high-frequency stock returns Using high-frequency prices of Chinese CSI 300 index as empirical datawe determine the optimal parameters of the new kernel andthen compare the new kernel with the radial basis functionkernel and the sigmoid function kernel in terms of the SVRrsquosout-of-sample forecast accuracy It turns out that the newkernel significantly outperforms the other two kernels interms of the MSE the hit rate and the capital gain from asimple trading strategy

Our empirical experiments confirm that it is statisticallyand economically valuable to design a new kernel of SVRspecifically characterizing the nonlinear dynamics of thetime series under study Thus our results shed light onan alternative direction for improving the performance ofSVR Current study only utilizes past returns to predictfuture returns A natural extension is to introduce intradaytrading volumes and intraday highlow prices into the featurevector of the SVR and develop kernels characterizing thecorresponding nonlinear relationship between future returnand feature vector Another possible extension is to apply theSVR with new kernel in the energy markets where trading iscontinuous 24 hours We leave them for future work

Competing Interests

The authors declare that they have no competing interests

Acknowledgments

The authors are grateful for the financial support offered bytheNational Natural Science Foundation of China (71201075)and the Specialized Research Fund for the Doctoral Programof Higher Education of China (20120091120003)

References

[1] D A Hsieh ldquoTesting for nonlinear dependence in daily foreignexchange ratesrdquoThe Journal of Business vol 62 no 3 pp 339ndash368 1989

[2] J A Scheinkman and B LeBaron ldquoNonlinear dynamics andstock returnsrdquoThe Journal of Business vol 62 no 3 pp 311ndash3371989

[3] G S Atsalakis and K P Valavanis ldquoSurveying stock marketforecasting techniques-part II soft computingmethodsrdquoExpertSystems with Applications vol 36 no 3 pp 5932ndash5941 2009

[4] V N Vapnik The Nature Of Statistical Learning TheorySpringer New York NY USA 1995

[5] L Cao and F E H Tay ldquoFinancial forecasting using supportvector machinesrdquoNeural Computing amp Applications vol 10 no2 pp 184ndash192 2001

[6] L J Cao and F E H Tay ldquoSupport vector machine withadaptive parameters in financial time series forecastingrdquo IEEETransactions on Neural Networks vol 14 no 6 pp 1506ndash15182003

[7] K-J Kim ldquoFinancial time series forecasting using supportvector machinesrdquo Neurocomputing vol 55 no 1-2 pp 307ndash3192003

[8] V V Gavrishchaka and S Banerjee ldquoSupport vector machine asan efficient framework for stock market volatility forecastingrdquoComputational Management Science vol 3 no 2 pp 147ndash1602006

[9] S Sonnenburg G Ratsch C Schafer and B Scholkopf ldquoLargescale multiple kernel learningrdquo Journal of Machine LearningResearch vol 7 pp 1531ndash1565 2006

[10] A Rakotomamonjy F R Bach S Canu and Y GrandvaletldquoSimple MKLrdquo Journal of Machine Learning Research vol 9 pp2491ndash2521 2008

[11] C-Y Yeh C-W Huang and S-J Lee ldquoA multiple-kernelsupport vector regression approach for stock market priceforecastingrdquo Expert Systems with Applications vol 38 no 3 pp2177ndash2186 2011

[12] W Huang Y Nakamori and S-Y Wang ldquoForecasting stockmarket movement direction with support vector machinerdquoComputers and Operations Research vol 32 no 10 pp 2513ndash2522 2005

[13] J M Matıas and J C Reboredo ldquoForecasting performanceof nonlinear models for intraday stock returnsrdquo Journal ofForecasting vol 31 no 2 pp 172ndash188 2012

[14] J C Reboredo JMMatıas and R Garcia-Rubio ldquoNonlinearityin forecasting of high-frequency stock returnsrdquo ComputationalEconomics vol 40 no 3 pp 245ndash264 2012

[15] C J C Burges ldquoA tutorial on support vector machines forpattern recognitionrdquo Data Mining and Knowledge Discoveryvol 2 no 2 pp 121ndash167 1998

[16] J C Platt ldquoFast training of support vector machines usingsequential minimal optimizationrdquo in Advances in Kernel Meth-ods Support Vector Learning B Scholkopf C J C Burgesand A J Smola Eds chapter 12 pp 185ndash208 MIT PressCambridge Mass USA 1999

Mathematical Problems in Engineering 9

[17] T Chordia R Roll and A Subrahmanyam ldquoEvidence on thespeed of convergence to market efficiencyrdquo Journal of FinancialEconomics vol 76 no 2 pp 271ndash292 2005

[18] D Y Chung and K Hrazdil ldquoSpeed of convergence to marketefficiency the role of ECNsrdquo Journal of Empirical Finance vol19 no 5 pp 702ndash720 2012

[19] C W Hsu C C Chang and C J Lin ldquoA practical guideto support vector classificationrdquo Tech Rep Department ofComputer Science National Taiwan University 2003

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 2: Research Article A New Kernel of Support Vector Regression ...downloads.hindawi.com/journals/mpe/2016/4907654.pdf · neural network (ANN), and the support vector machine for regression

2 Mathematical Problems in Engineering

classifiers trained with the same data set and realize betterforecast performance of the weekly movement direction ofNIKKEI 225 index However current applications seldomtouch the inner structure of the existing kernels whichdepicts the nonlinear relationship between past and futuredata Thus it is reasonable to argue that if the kernel isdesigned according to the specific nonlinear dynamics of theseries under study improvement in forecast accuracy can beexpected

In addition the above-mentioned studiesmostly use daily(or lower frequency) data in their empirical experimentsSince high-frequency trading has gained its popularity inrecent years the ability to forecast intraday stock returnsis becoming increasingly important Thus in this study weinstead consider the forecasting of high-frequency stockreturns Matıas and Reboredo [13] and Reboredo et al[14] empirically show the forecast ability of SVR for high-frequency stock returns by directly using the radial basisfunction kernel Instead of directly applying some conven-tional kernel or some combination of conventional kernelswe design a kernel for the specific forecasting problemSpecifically under the assumption that each high-frequencystock return is an event that triggers momentum and rever-sal periodically we decompose each future return into acollection of decaying cosine waves that are functions ofpast returns After taking several realistic assumptions wereach an analytical expression of the nonlinear relationshipbetween past and future returns and design a new kernelaccordingly One-minute returns of Chinese CSI 300 indexare used as empirical data to evaluate the new kernel ofSVR We show that the new kernel significantly beats theconventional radial basis function and sigmoid functionkernels in both the prediction mean square error and thedirectional forecast accuracy rate Besides the capital gain ofa practically simple trading strategy based on the predictionswith the new kernel is also significantly higher

The remainder of this paper is organized as followsSection 2 introduces the basic idea of SVR Section 3 presentsour basic assumptions and designs the new kernel Section 4determines the newly introduced kernel parameters andcompares the new kernel with two commonly used kernelsin terms of the forecast performance of the SVR Finally theconclusions are drawn in Section 5

2 Support Vector Machine for Regression

First designed by Vapnik as a classifier [4] SVR is featuredwith the capability of capturing nonlinear relationship inthe feature space and thus is also considered as an effectiveapproach to regression analysis The following sketches thebasic idea of SVR For more detailed illustration of SVRplease refer to Burges [15]

21 SVR for Linear Regression In a regression problem givena finite data set 119865 = (x119896 119910119896)

119899

119896=1derived from an unknown

function 119910 = 119892(x) with noise we need to determine afunction 119910 = 119891(x) solely based on 119865 and to minimize thedifference between119891 and the unknown function 119892 For linear

regression 119892 is assumed to be a linear relationship between xand 119910

119910 = 119892 (xw 119887) = w sdot x + 119887 =119898

sum

119895=1

119908119895119909119895 + 119887 (1)

where x is called feature vector and the space X it lives inis named as feature space 119898 is the dimension of the featurevector x and the feature space X 119910 is referred to as the labelfor each (x 119910) Now that the relationship to be determined isassumed linear our goal is to find a hyperplane 119910 = 119891(x) inthe119898+1dimension space where (x119896 119910119896)

119899

119896=1are plotted and

to minimize the fitting errors by adjusting the parameters Asis proven by Vapnik the hyperplane is given as

119910 = 119891 (x120572 119887) = sum119896

120572119896119910119896x119896 sdot x + 119887 (2)

where x119896rsquos are support vectors in the given data set 119865 and119910119896rsquos are the corresponding labels ldquosdotrdquo represents the innerproduct in the feature space X Finding the support vectorsand determining the parameters 120572 and 119887 turn out to be alinearly constrained quadratic programming problem thatcan be solved in multiple ways (eg the sequential minimaloptimization algorithm [16]) Such a process conducted onthe given data set 119865 is called learning Once the learningphase is done the model built can be used to predict thecorresponding label 119910 from any feature vector x in the featurespace X

22 SVR for Nonlinear Regression However the linear rela-tionship assumption is often too simple to characterize thedynamics of the time series and thus it is necessary toconsider the case when 119892 is nonlinear The idea of SVR fornonlinear regression is to build amapping x rarr 120601(x) from theoriginal 119898 dimension feature space X to a new feature spaceX1015840 whose dimension depends on the mapping scheme and isnot necessarily finite In the new space X1015840 the relationshipbetween the new feature vector 120601(x) and label 119910 is believedto be in a linear form By building a proper mapping thenonlinear relationship can be approximated by doing in thenew feature spaceX1015840 exactly the same thing as is done for thelinear case and it can be proven that the nonlinear version of(2) is

119910 = 119891 (x120572 119887) = sum119896

120572119896119910119896119870(x119896 x) + 119887 (3)

where 119870(x119896 x) = 120601(x119896) sdot 120601(x) is the kernel function and ldquosdotrdquorepresents the inner product in the new feature space X1015840

The new feature 120601(x) which can be an infinite dimensionvector is usually not necessary to be computed explicitlysince we normally work with the kernel function in thetraining and forecasting phases Accordingly the kernelfunction is essential to the performance of SVR Any functionsatisfying Mercerrsquos condition can be used as the kernelfunction Commonly used kernels include the radial basisfunction kernel119870(x y) = exp(minusxminus y21205902) and the sigmoidfunction kernel 119870(x y) = tanh(120581x sdot y minus 120575) where 120590 120581 and 120575are kernel parameters that can be tuned

Mathematical Problems in Engineering 3

3 A New Kernel for ForecastingHigh-Frequency Stock Returns

Most applications of SVR directly apply the commonlyused kernels and tune the kernel parameters for improvedforecast performance However we argue that a specificallydesigned kernel function which builds on the properties ofthe underlying data can enable the SVR to better capturethe nonlinear relationship between the original feature vectorx and label 119910 Thus in this study we develop a new kernelspecifically for forecasting high-frequency stock returns fromsome basic assumptions about the stock market

31 High-Frequency Stock Return Series Forecasting ProblemHigh-frequency stock return series refers to the return timeseries with one-minute or comparatively small time intervalsGiven a return time series 119868119899 = 119903119899 119903119899minus1 119903119899minus2 frompresent119905 = 119899 to some time point in the history a forecastingproblem is to find the one-step-ahead return 119903119899+1 based onthe knowledge of 119868119899

To put it in another way we need to determine thefunction 119903119899+1 = 119891(119903119899 119903119899minus1 119903119899minus2 ) that best fits the givenreturn series 119868119899 (Some studies introduce exogenous variablesin stock return series forecasting However since we focuson the high-frequency return series where the inefficiencyof the market is obvious [17 18] it is reasonable to believethat the return series itself can provide enough informationfor its forecasting) The vector r119899 = (119903119899 119903119899minus1 119903119899minus2 ) is thusthe feature vector and 119903119899+1 is the label The training data set119865 includes every single return as label and all the returnsbefore it as the elements of the corresponding feature vectorAccording to former studies 119891 is believed to be nonlinearconsidering the complexity of the financial market

32 The New Kernel for SVR It is straightforward that everysingle return has impact on the market Due to behavioraleffects such as overreaction andunderreaction such impact isnot unidirectional during its remaining period In this studywe assume that each high-frequency stock return is an eventthat triggers momentum and reversal periodically and thusexpress the impact generated by the return 119903119894 as

119860(119903119894 119905119901) = 119903119894 cos [120596 (119894 119905119901)10038161003816100381610038161199031198941003816100381610038161003816 119905119901 + 120593 (119894)] 119890

minus120582119905119901 (4)

where 119894 is the time point when the return 119903119894 first occurs 119905119901is the time past since 119905 = 119894 and 120582 is the parameter thatcontrols the decay rate 120596(119894 119905119901)|119903119894| is the frequency of thecosine wave where 120596(119894 119905119901) is a nonzero factor and 120593(119894) is thephase factor Such an expression ensures that the same levelof return occurring at different time points can have differentimpact waves

According to (4) a larger return leads to greater (ampli-tude) and faster (frequency) price change in the future andas time goes by such an impact wave will gradually fadeto vanity This exactly meets the basic intuition about how

the market reacts to events For subsequent deduction (4) istransformed as follows

119860(119903119894 119905119901) = 119903119894 119886119894 cos [120596 (119894 119905119901)10038161003816100381610038161199031198941003816100381610038161003816 119905119901]

+ (1 minus 1198862

119894)12

sin [120596 (119894 119905119901)10038161003816100381610038161199031198941003816100381610038161003816 119905119901] 119890

minus120582119905119901

(5)

where 119886119894 isin [minus1 1]It is natural to assume that every return is comprised

of all the impact waves generated by the past returns (Weonly consider the predictable part of every future returnand ignore the innovation part) Thus we decompose everyreturn 119903119899 into a collection of decaying cosine waves

119903119899 =

infin

sum

119894=1

119860 (119903119899minus119894 119894Δ119905) (6)

where Δ119905 denotes the time interval length which is 1 minutein this study and thus the time passed since event 119903119899minus119894 is 119905119901 =119894Δ119905

Now we substitute (5) into (6) Taylor serial expansionis used on the ldquosinrdquo and ldquocosrdquo terms and the coefficients arerearranged to generate the following

119903119899 =

infin

sum

119894=1

infin

sum

119896=0

119862119894119896119903119899minus119894 [120596 (119899 minus 119894 119894Δ119905)1003816100381610038161003816119903119899minus119894

1003816100381610038161003816 119894Δ119905]119896119890minus120582119894Δ119905

=

infin

sum

119894=1

infin

sum

119896=0

1198621015840

119894119896119903119899minus119894 (

1003816100381610038161003816119903119899minus1198941003816100381610038161003816 119894Δ119905)119896119890minus120582119894Δ119905

(7)

where 1198621015840119894119896119894119896 is the coefficient set to be determined

Therefore based on the assumptions above a nonlinearrelationship between past returns and the one-step-aheadfuture return is derived It is easy to see that 119903119899 is a linearcombination of 119903119899minus119894(|119903119899minus119894|119894Δ119905)

119896119890minus120582119894Δ119905

119894119896Thus it is possible tomap the original feature vectors to a proper form to make therelationship between label and feature vector linear Howeverwe once again fall into the dilemma that (7) is a collection ofinfinite series which makes the mapped feature vectors haveinfinite dimension and hard to compute It is also unlikely toderive the kernel function instead like before

To solve the above-mentioned problem we first considerthe decay property of the impact waves which is presented bythe factor 119890minus120582119894Δ119905 Since the decay rate 120582 is constant we believethat the impact of an event that is119898Δ119905 before the time beingis negligible and 119898 is the minimum integer that satisfies thecondition 119890

minus120582119898Δ119905119890minus120582Δ119905

lt 120575 where 120575 rarr 0 Since we donot know how little 120575 should be 119898 is determined throughexperiments in Section 42

It is also necessary to consider the high-frequency prop-erty of the data Such a property ensures that the scale of thetime interval Δ119905 is subtle compared to the time scale of thefluctuation of the market Thus it is reasonable to assumeΔ119905 ≪ 2120587120596(119894 119905119901)|119903119894| forall119894 where the right hand side is theperiod of the impact wave of return 119903119894 and ldquo≪rdquo representsat least 2 orders of magnitude smaller Furthermore sincethe Taylor serial expansion reserved till order 119902 has an error

4 Mathematical Problems in Engineering

err (119909) = (119891(119902+1)

(120585)(119902 + 1))119909119902+1 the error from truncating

the Taylor expansions in (7) satisfies

err le[120596 (119899 minus 119894 119894Δ119905)

1003816100381610038161003816119903119899minus1198941003816100381610038161003816 119894Δ119905]119902+1

(119902 + 1)

= (120596 (119899 minus 119894 119894Δ119905)

1003816100381610038161003816119903119899minus1198941003816100381610038161003816 Δ119905

2120587⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟

≪1

)

119902+1

sdot(2120587119894)119902+1

(119902 + 1)

(8)

Thus setting 119902 = 3 can well ensure the accuracy of theapproximation

Now a much more accessible approximation of (7) isderived

119903119899 =

119898

sum

119894=1

119902

sum

119896=0

1198621015840

119894119896119903119899minus119894 (

1003816100381610038161003816119903119899minus1198941003816100381610038161003816 119894Δ119905)119896119890minus120582119894Δ119905

(9)

Equation (9) explicitly states how the mapping is con-structed For any past return series 119903119899minus119894

119898

119894=1 the original

feature vector is r119899minus1 = (119903119899minus1 119903119899minus2 119903119899minus119898) and themapping120601 is defined as

120601 (r119899minus1) = 119903119899minus119894 (1003816100381610038161003816119903119899minus119894

1003816100381610038161003816 119894Δ119905)119896119890minus120582119894Δ119905

119898119902

119894=1 119896=0 (10)

It is important to understand that the new feature vector120601(r) is 119898 times (119902 + 1) dimension with each element 120601119894119896 givenas 120601119894119896 = 119903119899minus119894(|119903119899minus119894|119894Δ119905)

119896119890minus120582119894Δ119905 Now that 120601(r) has finite

dimension the dimension of the new feature space X1015840 is alsofinite and the kernel function is naturally built as the innerproduct in such a space

119870(r(1) r(2)) =119898

sum

119894=1

119902

sum

119896=0

120601(1)

119894119896120601(2)

119894119896 (11)

Although it is theoretically important to examinewhetherthe newkernel satisfiesMercerrsquos condition or not consideringthat some kernels which fail to meet the condition stilllead to perfectly converged results [15] we will examine theappropriateness of the new kernel through experiments

4 Empirical Experiments

41 Data One-minute prices of Chinese CSI 300 index fromJanuary 4 2010 to March 3 2014 (1000 trading days) areused as empirical data in this study The data are obtainedfromWind Financial Terminal and days with missing pricesare deleted The official trading hours are from 930 to 1130and from 1300 to 1500 and thus we have 240 one-minutereturns per day The returns within the same trading day areused for learning and forecasting since the continuousnessof the time is important in the above derivation (Althoughthere is a 15-hour break in each trading day buy and sellorders submitted in themorning are still valid and neworderscan still be submitted during this period thus we deem thetime as continuous in each trading day) Specifically the first100 + 119898 returns within each trading day are set as the in-sample data (119898 le 30) which are used for learning and the

last 110 returns are set as the out-of-sample data which areused for prediction and evaluationThe average performanceof the SVR during the first 500 trading days that is fromJanuary 4 2010 to February 1 2012 is used for determiningthe kernel parametersThe last 500 trading days that is fromFebruary 2 2012 to March 3 2014 is used for performancecomparison against the commonly used kernels To improvethe performance the logarithmic returns are normalized to[minus1 1] before being input into the SVR

42 Determining the Kernel Parameters Before evaluatingthe performance of the new kernel multiple parameters stillneed to be determined to make best use of the kernel Theundetermined parameters are119898 which measures how manyhistorical data are used 120582 which measures how fast theimpact of one event decays with time andΔ119905 which indicateshow we represent 1 minute numerically We optimize theparameters according to the out-of-sample forecast perfor-mance of the corresponding SVR which is evaluated byMSEand hit rate respectively MSE is the mean square error of thepredictions and is computed using the normalized returnsHit rate is the proportion of predicted returns that have thesame sign with the actual ones that is the directional forecastaccuracy rate

To determine 119898 all the other parameters are fixed and119898 varies from 1 to 30 The average MSE and hit rate in thefirst 500 trading days are plotted in Figures 1(a) and 1(b)respectively (Figures 1(a) and 1(b) are plotted with 120582 and Δ119905set to the optimal values Actually the trends of the plotsdo not vary with the other kernel parameters) The averageMSE decreases sharply with 119898 when 119898 le 21 is relativelyconstant when 119898 isin [21 26] and decreases slowly with 119898

when119898 isin [26 30]The average hit rate increases sharply with119898when119898 le 21 is relatively constant when119898 isin [21 25] andgets smaller afterwards Considering that smaller MSE andgreater hit rate are preferred 119898 = 30 and 119898 isin [21 25] areall appropriate choices suggested by the experiments In thefollowing comparative experiments 119898 is set to 25 (We havealso done comparative experiments with 119898 set to the othersuggested values and the results are consistent)

To determine 120582 all the other parameters are fixed and120582 varies from 1 to 50 The average MSE and hit rate in thefirst 500 trading days are plotted in Figures 2(a) and 2(b)respectively It is quite clear that the minimum average MSEand maximum average hit rate are both achieved at 120582 =

10 and thus we set 120582 to 10 in the following comparativeexperiments

The same method is used to optimize Δ119905 and the averageMSE and hit rate are plotted in Figures 3(a) and 3(b)respectively We can see that the minimum average MSE andmaximum average hit rate are both achieved at Δ119905 = 005and thus we set Δ119905 to 005 in the following comparativeexperiments

43 NewKernel versus CommonlyUsed Kernels Thenewker-nel is compared with the commonly used kernels in terms ofthe out-of-sample forecast performance of the correspondingSVR Although any function satisfying Mercerrsquos condition

Mathematical Problems in Engineering 5

003

0031

0032

0033

0034

0035

003650

0-da

y av

erag

e MSE

5 10 15 20 25 300(m)

(a) Average MSE versus119898

069

07

071

072

500-

day

aver

age h

it ra

te

5 10 15 20 25 300(m)

(b) Average hit rate versus119898

Figure 1 500-day average MSE and hit rate achieved with different values of119898

0031

0032

0033

0034

0035

0036

0037

500-

day

aver

age M

SE

10 20 30 40 500120582

(a) Average MSE versus 120582

067

068

069

07

071

072

07350

0-da

y av

erag

e hit

rate

10 20 30 40 500120582

(b) Average hit rate versus 120582

Figure 2 500-day average MSE and hit rate achieved with different values of 120582

can be used as the kernel function radial basis functionand sigmoid function are two widely used kernels Theformer tends to outperformothers under general smoothnessassumption [12] and the latter gives a particular kind of two-layer sigmoidal neural network [15] Thus these two kernelfunctions are used for performance comparison We realizeeach SVR by LIBSVM-320 [19] and the related codes aremodified for the new kernel

The out-of-sample MSE and hit rate are computed foreach of the three kernels on each of the last 500 trading daysThe results of the new kernel are plotted against those of theradial basis function kernel and the sigmoid function kernelin Figures 4 and 5 respectively

In Figure 4(a) the vertical and horizontal axes representthe MSE of the new kernel and the radial basis functionkernel respectively while in Figure 4(b) the vertical andhorizontal axes represent the hit rate of the newkernel and the

radial basis function kernel respectivelyThere are 500 pointscorresponding to the 500 trading days plotted in each subplotand the diagonal line representing 119910 = 119909 is for referenceWe can see that most points lie below the line 119910 = 119909 inFigure 4(a) and lie above the line 119910 = 119909 in Figure 4(b) Thisindicates that the newkernel leads to smallerMSE and greaterhit rate than the radial basis function kernel in most of the500 trading days Similarly Figures 5(a) and 5(b) indicate thatthe new kernel leads to smaller MSE and greater hit rate thanthe sigmoid function kernel in most of the 500 trading daysTherefore the SVR with the new kernel has obviously betterforecast performance in terms of both the MSE and the hitrate

In addition a simple trading strategy is carried out basedon the out-of-sample forecasts of the SVR with differentkernel specifications The initial capital is set as 100 oneach trading day The index is bought if the one-step-ahead

6 Mathematical Problems in Engineering

20 40 60 80 100 1200Δt(times100)

0031

0032

0033

0034

0035

0036

0037

500-

day

aver

age M

SE

(a) Average MSE versus Δ119905

20 40 60 80 100 1200Δt(times100)

069

07

071

072

073

500-

day

aver

age h

it ra

te

(b) Average hit rate versus Δ119905

Figure 3 500-day average MSE and hit rate achieved with different values of Δ119905

002 004 006 008 01 012 0140MSE-radial basis kernel

0

002

004

006

008

01

012

014

MSE

-new

ker

nel

(a) MSE

045 055 065 075 085035Hit rate-radial basis kernel

035

045

055

065

075

085H

it ra

te-n

ew k

erne

l

(b) Hit rate

Figure 4 New kernel versus radial basis function kernel from February 2 2012 to March 3 2014

predicted return is positive and exceeds a threshold and soldif the one-step-ahead predicted return is negative and below athreshold and no action is performed otherwiseThe thresh-old is set as the average of the (100 + 119898) normalized returnsin the training period divided by the scale coefficient ofln(1000) (The scale coefficient controls the trading strategyrsquossensitivity to index price change and a higher value leads tomore frequent tradingThe value of ln(1000) is arbitrarily setand can be adjusted The results are consistent as the scalecoefficient varies)

The variation of capital under such a strategy in the 110-minute out-of-sample period on February 2 2012 is plottedin Figure 6(a) We can see that the new kernel leads to highercapital gain than the other two kernels most of the timeAlso the average capital variation in the last 500 tradingdays is plotted in Figure 6(b) Unlike the fluctuant plots inFigure 6(a) the capital increases steadily when it is averaged

over the 500-day period no matter which kernel is usedThis confirms the effectiveness of SVR in forecasting high-frequency stock returns Once again the new kernel leads tothe highest capital gain and the advantage gets more obviousas the trading period is prolonged At the end of a day the newkernel leads to a return at about 06110min on averagewhile the resulting returns of the other two kernels are bothless than 04110min on average

Furthermore Studentrsquos 119905-test is used to test whether thenew kernel significantly outperforms the commonly usedkernels Specifically we calculate the differences of MSEhit rate and 110-minute capital gain between the forecastswith the new kernel and those with each comparative kernelrespectively on each of the last 500 trading days Table 1reports the mean (119909119889) and the standard deviation (119904119889) ofthese differences in these 500 trading days We test the nullhypothesis that ldquothe forecasts with the new kernel and those

Mathematical Problems in Engineering 7

002 004 006 008 01 012 0140MSE-sigmoid kernel

0

002

004

006

008

01

012

014M

SE-n

ew k

erne

l

(a) MSE

045 055 065 075 085035Hit rate-sigmoid kernel

035

045

055

065

075

085

Hit

rate

-new

ker

nel

(b) Hit rate

Figure 5 New kernel versus sigmoid function kernel from February 2 2012 to March 3 2014

New kernelRadial basis kernelSigmoid kernel

20 40 60 80 100 1200Time (min)

100

1005

101

1015

102

1025

Capi

tal

(a) Capital variation on February 2 2012

New kernelRadial basis kernelSigmoid kernel

100

1001

1002

1003

1004

1005

1006Av

erag

e cap

ital

20 40 60 80 100 1200Time (min)

(b) Average capital variation from February 2 2012 to March 3 2014

Figure 6 Capital variation of a simple trading strategy

with the comparative kernel have the same accuracy in termsof the specified criterionrdquo with the comparative kernel andthe criterion specified in the first row and the second row ofTable 1 respectively The 119905-statistics are reported in the lastrow of Table 1

We can easily see that the six null hypotheses are all signif-icantly rejected The new kernel has smaller MSE greater hitrate and higher 110-minute capital gain than both the radialbasis function kernel and the sigmoid function kernel Andthe differences are all significant at the 1 level Thereforethe results of Studentrsquos 119905-test indicate that the improvementin out-of-sample forecast accuracy brought about by the newkernel is significant both statistically and economically and

thus the newkernel is preferred in forecasting high-frequencystock returns

5 Summary and Conclusion

Support vector machine for regression is now widely appliedin time series forecasting problems Commonly used kernelssuch as the radial basis function kernel and the sigmoidfunction kernel are first derived mathematically for patternrecognition problems Although their direct applications intime series forecasting problems can generate remarkableperformance we argue that using a kernel designed according

8 Mathematical Problems in Engineering

Table 1 Out-of-sample forecast performance comparison results from February 2 2012 to March 3 2014 (119899 = 500)

Criterion New kernel versus (minus) radial basis kernel New kernel versus (minus) sigmoid kernelMSE Hit rate 110-min gain MSE Hit rate 110min gain

119909119889 minus15201119890 minus 3 48709119890 minus 2 20250119890 minus 3 minus13917119890 minus 3 53945119890 minus 2 23879119890 minus 3

119904119889 53684119890 minus 3 60785119890 minus 2 28530119890 minus 3 53658119890 minus 3 63875119890 minus 2 32202119890 minus 3

119905 =119909119889radic119899

119904119889

minus633lowastlowastlowast

1792lowastlowastlowast

1587lowastlowastlowast

minus580lowastlowastlowast

1888lowastlowastlowast

1658lowastlowastlowast

Note lowastlowastlowast indicates significance at the 1 level

to the specific nonlinear dynamics of the series under studycan further improve the forecast accuracy

Under the assumption that each high-frequency stockreturn is an event that triggers momentum and reversalperiodically we decompose each future return into a col-lection of decaying cosine waves that are functions of pastreturns Under realistic assumptions we reach an analyticalexpression of the nonlinear relationship between past andfuture returns and thus design a new kernel specificallyfor forecasting high-frequency stock returns Using high-frequency prices of Chinese CSI 300 index as empirical datawe determine the optimal parameters of the new kernel andthen compare the new kernel with the radial basis functionkernel and the sigmoid function kernel in terms of the SVRrsquosout-of-sample forecast accuracy It turns out that the newkernel significantly outperforms the other two kernels interms of the MSE the hit rate and the capital gain from asimple trading strategy

Our empirical experiments confirm that it is statisticallyand economically valuable to design a new kernel of SVRspecifically characterizing the nonlinear dynamics of thetime series under study Thus our results shed light onan alternative direction for improving the performance ofSVR Current study only utilizes past returns to predictfuture returns A natural extension is to introduce intradaytrading volumes and intraday highlow prices into the featurevector of the SVR and develop kernels characterizing thecorresponding nonlinear relationship between future returnand feature vector Another possible extension is to apply theSVR with new kernel in the energy markets where trading iscontinuous 24 hours We leave them for future work

Competing Interests

The authors declare that they have no competing interests

Acknowledgments

The authors are grateful for the financial support offered bytheNational Natural Science Foundation of China (71201075)and the Specialized Research Fund for the Doctoral Programof Higher Education of China (20120091120003)

References

[1] D A Hsieh ldquoTesting for nonlinear dependence in daily foreignexchange ratesrdquoThe Journal of Business vol 62 no 3 pp 339ndash368 1989

[2] J A Scheinkman and B LeBaron ldquoNonlinear dynamics andstock returnsrdquoThe Journal of Business vol 62 no 3 pp 311ndash3371989

[3] G S Atsalakis and K P Valavanis ldquoSurveying stock marketforecasting techniques-part II soft computingmethodsrdquoExpertSystems with Applications vol 36 no 3 pp 5932ndash5941 2009

[4] V N Vapnik The Nature Of Statistical Learning TheorySpringer New York NY USA 1995

[5] L Cao and F E H Tay ldquoFinancial forecasting using supportvector machinesrdquoNeural Computing amp Applications vol 10 no2 pp 184ndash192 2001

[6] L J Cao and F E H Tay ldquoSupport vector machine withadaptive parameters in financial time series forecastingrdquo IEEETransactions on Neural Networks vol 14 no 6 pp 1506ndash15182003

[7] K-J Kim ldquoFinancial time series forecasting using supportvector machinesrdquo Neurocomputing vol 55 no 1-2 pp 307ndash3192003

[8] V V Gavrishchaka and S Banerjee ldquoSupport vector machine asan efficient framework for stock market volatility forecastingrdquoComputational Management Science vol 3 no 2 pp 147ndash1602006

[9] S Sonnenburg G Ratsch C Schafer and B Scholkopf ldquoLargescale multiple kernel learningrdquo Journal of Machine LearningResearch vol 7 pp 1531ndash1565 2006

[10] A Rakotomamonjy F R Bach S Canu and Y GrandvaletldquoSimple MKLrdquo Journal of Machine Learning Research vol 9 pp2491ndash2521 2008

[11] C-Y Yeh C-W Huang and S-J Lee ldquoA multiple-kernelsupport vector regression approach for stock market priceforecastingrdquo Expert Systems with Applications vol 38 no 3 pp2177ndash2186 2011

[12] W Huang Y Nakamori and S-Y Wang ldquoForecasting stockmarket movement direction with support vector machinerdquoComputers and Operations Research vol 32 no 10 pp 2513ndash2522 2005

[13] J M Matıas and J C Reboredo ldquoForecasting performanceof nonlinear models for intraday stock returnsrdquo Journal ofForecasting vol 31 no 2 pp 172ndash188 2012

[14] J C Reboredo JMMatıas and R Garcia-Rubio ldquoNonlinearityin forecasting of high-frequency stock returnsrdquo ComputationalEconomics vol 40 no 3 pp 245ndash264 2012

[15] C J C Burges ldquoA tutorial on support vector machines forpattern recognitionrdquo Data Mining and Knowledge Discoveryvol 2 no 2 pp 121ndash167 1998

[16] J C Platt ldquoFast training of support vector machines usingsequential minimal optimizationrdquo in Advances in Kernel Meth-ods Support Vector Learning B Scholkopf C J C Burgesand A J Smola Eds chapter 12 pp 185ndash208 MIT PressCambridge Mass USA 1999

Mathematical Problems in Engineering 9

[17] T Chordia R Roll and A Subrahmanyam ldquoEvidence on thespeed of convergence to market efficiencyrdquo Journal of FinancialEconomics vol 76 no 2 pp 271ndash292 2005

[18] D Y Chung and K Hrazdil ldquoSpeed of convergence to marketefficiency the role of ECNsrdquo Journal of Empirical Finance vol19 no 5 pp 702ndash720 2012

[19] C W Hsu C C Chang and C J Lin ldquoA practical guideto support vector classificationrdquo Tech Rep Department ofComputer Science National Taiwan University 2003

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 3: Research Article A New Kernel of Support Vector Regression ...downloads.hindawi.com/journals/mpe/2016/4907654.pdf · neural network (ANN), and the support vector machine for regression

Mathematical Problems in Engineering 3

3 A New Kernel for ForecastingHigh-Frequency Stock Returns

Most applications of SVR directly apply the commonlyused kernels and tune the kernel parameters for improvedforecast performance However we argue that a specificallydesigned kernel function which builds on the properties ofthe underlying data can enable the SVR to better capturethe nonlinear relationship between the original feature vectorx and label 119910 Thus in this study we develop a new kernelspecifically for forecasting high-frequency stock returns fromsome basic assumptions about the stock market

31 High-Frequency Stock Return Series Forecasting ProblemHigh-frequency stock return series refers to the return timeseries with one-minute or comparatively small time intervalsGiven a return time series 119868119899 = 119903119899 119903119899minus1 119903119899minus2 frompresent119905 = 119899 to some time point in the history a forecastingproblem is to find the one-step-ahead return 119903119899+1 based onthe knowledge of 119868119899

To put it in another way we need to determine thefunction 119903119899+1 = 119891(119903119899 119903119899minus1 119903119899minus2 ) that best fits the givenreturn series 119868119899 (Some studies introduce exogenous variablesin stock return series forecasting However since we focuson the high-frequency return series where the inefficiencyof the market is obvious [17 18] it is reasonable to believethat the return series itself can provide enough informationfor its forecasting) The vector r119899 = (119903119899 119903119899minus1 119903119899minus2 ) is thusthe feature vector and 119903119899+1 is the label The training data set119865 includes every single return as label and all the returnsbefore it as the elements of the corresponding feature vectorAccording to former studies 119891 is believed to be nonlinearconsidering the complexity of the financial market

32 The New Kernel for SVR It is straightforward that everysingle return has impact on the market Due to behavioraleffects such as overreaction andunderreaction such impact isnot unidirectional during its remaining period In this studywe assume that each high-frequency stock return is an eventthat triggers momentum and reversal periodically and thusexpress the impact generated by the return 119903119894 as

119860(119903119894 119905119901) = 119903119894 cos [120596 (119894 119905119901)10038161003816100381610038161199031198941003816100381610038161003816 119905119901 + 120593 (119894)] 119890

minus120582119905119901 (4)

where 119894 is the time point when the return 119903119894 first occurs 119905119901is the time past since 119905 = 119894 and 120582 is the parameter thatcontrols the decay rate 120596(119894 119905119901)|119903119894| is the frequency of thecosine wave where 120596(119894 119905119901) is a nonzero factor and 120593(119894) is thephase factor Such an expression ensures that the same levelof return occurring at different time points can have differentimpact waves

According to (4) a larger return leads to greater (ampli-tude) and faster (frequency) price change in the future andas time goes by such an impact wave will gradually fadeto vanity This exactly meets the basic intuition about how

the market reacts to events For subsequent deduction (4) istransformed as follows

119860(119903119894 119905119901) = 119903119894 119886119894 cos [120596 (119894 119905119901)10038161003816100381610038161199031198941003816100381610038161003816 119905119901]

+ (1 minus 1198862

119894)12

sin [120596 (119894 119905119901)10038161003816100381610038161199031198941003816100381610038161003816 119905119901] 119890

minus120582119905119901

(5)

where 119886119894 isin [minus1 1]It is natural to assume that every return is comprised

of all the impact waves generated by the past returns (Weonly consider the predictable part of every future returnand ignore the innovation part) Thus we decompose everyreturn 119903119899 into a collection of decaying cosine waves

119903119899 =

infin

sum

119894=1

119860 (119903119899minus119894 119894Δ119905) (6)

where Δ119905 denotes the time interval length which is 1 minutein this study and thus the time passed since event 119903119899minus119894 is 119905119901 =119894Δ119905

Now we substitute (5) into (6) Taylor serial expansionis used on the ldquosinrdquo and ldquocosrdquo terms and the coefficients arerearranged to generate the following

119903119899 =

infin

sum

119894=1

infin

sum

119896=0

119862119894119896119903119899minus119894 [120596 (119899 minus 119894 119894Δ119905)1003816100381610038161003816119903119899minus119894

1003816100381610038161003816 119894Δ119905]119896119890minus120582119894Δ119905

=

infin

sum

119894=1

infin

sum

119896=0

1198621015840

119894119896119903119899minus119894 (

1003816100381610038161003816119903119899minus1198941003816100381610038161003816 119894Δ119905)119896119890minus120582119894Δ119905

(7)

where 1198621015840119894119896119894119896 is the coefficient set to be determined

Therefore based on the assumptions above a nonlinearrelationship between past returns and the one-step-aheadfuture return is derived It is easy to see that 119903119899 is a linearcombination of 119903119899minus119894(|119903119899minus119894|119894Δ119905)

119896119890minus120582119894Δ119905

119894119896Thus it is possible tomap the original feature vectors to a proper form to make therelationship between label and feature vector linear Howeverwe once again fall into the dilemma that (7) is a collection ofinfinite series which makes the mapped feature vectors haveinfinite dimension and hard to compute It is also unlikely toderive the kernel function instead like before

To solve the above-mentioned problem we first considerthe decay property of the impact waves which is presented bythe factor 119890minus120582119894Δ119905 Since the decay rate 120582 is constant we believethat the impact of an event that is119898Δ119905 before the time beingis negligible and 119898 is the minimum integer that satisfies thecondition 119890

minus120582119898Δ119905119890minus120582Δ119905

lt 120575 where 120575 rarr 0 Since we donot know how little 120575 should be 119898 is determined throughexperiments in Section 42

It is also necessary to consider the high-frequency prop-erty of the data Such a property ensures that the scale of thetime interval Δ119905 is subtle compared to the time scale of thefluctuation of the market Thus it is reasonable to assumeΔ119905 ≪ 2120587120596(119894 119905119901)|119903119894| forall119894 where the right hand side is theperiod of the impact wave of return 119903119894 and ldquo≪rdquo representsat least 2 orders of magnitude smaller Furthermore sincethe Taylor serial expansion reserved till order 119902 has an error

4 Mathematical Problems in Engineering

err (119909) = (119891(119902+1)

(120585)(119902 + 1))119909119902+1 the error from truncating

the Taylor expansions in (7) satisfies

err le[120596 (119899 minus 119894 119894Δ119905)

1003816100381610038161003816119903119899minus1198941003816100381610038161003816 119894Δ119905]119902+1

(119902 + 1)

= (120596 (119899 minus 119894 119894Δ119905)

1003816100381610038161003816119903119899minus1198941003816100381610038161003816 Δ119905

2120587⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟

≪1

)

119902+1

sdot(2120587119894)119902+1

(119902 + 1)

(8)

Thus setting 119902 = 3 can well ensure the accuracy of theapproximation

Now a much more accessible approximation of (7) isderived

119903119899 =

119898

sum

119894=1

119902

sum

119896=0

1198621015840

119894119896119903119899minus119894 (

1003816100381610038161003816119903119899minus1198941003816100381610038161003816 119894Δ119905)119896119890minus120582119894Δ119905

(9)

Equation (9) explicitly states how the mapping is con-structed For any past return series 119903119899minus119894

119898

119894=1 the original

feature vector is r119899minus1 = (119903119899minus1 119903119899minus2 119903119899minus119898) and themapping120601 is defined as

120601 (r119899minus1) = 119903119899minus119894 (1003816100381610038161003816119903119899minus119894

1003816100381610038161003816 119894Δ119905)119896119890minus120582119894Δ119905

119898119902

119894=1 119896=0 (10)

It is important to understand that the new feature vector120601(r) is 119898 times (119902 + 1) dimension with each element 120601119894119896 givenas 120601119894119896 = 119903119899minus119894(|119903119899minus119894|119894Δ119905)

119896119890minus120582119894Δ119905 Now that 120601(r) has finite

dimension the dimension of the new feature space X1015840 is alsofinite and the kernel function is naturally built as the innerproduct in such a space

119870(r(1) r(2)) =119898

sum

119894=1

119902

sum

119896=0

120601(1)

119894119896120601(2)

119894119896 (11)

Although it is theoretically important to examinewhetherthe newkernel satisfiesMercerrsquos condition or not consideringthat some kernels which fail to meet the condition stilllead to perfectly converged results [15] we will examine theappropriateness of the new kernel through experiments

4 Empirical Experiments

41 Data One-minute prices of Chinese CSI 300 index fromJanuary 4 2010 to March 3 2014 (1000 trading days) areused as empirical data in this study The data are obtainedfromWind Financial Terminal and days with missing pricesare deleted The official trading hours are from 930 to 1130and from 1300 to 1500 and thus we have 240 one-minutereturns per day The returns within the same trading day areused for learning and forecasting since the continuousnessof the time is important in the above derivation (Althoughthere is a 15-hour break in each trading day buy and sellorders submitted in themorning are still valid and neworderscan still be submitted during this period thus we deem thetime as continuous in each trading day) Specifically the first100 + 119898 returns within each trading day are set as the in-sample data (119898 le 30) which are used for learning and the

last 110 returns are set as the out-of-sample data which areused for prediction and evaluationThe average performanceof the SVR during the first 500 trading days that is fromJanuary 4 2010 to February 1 2012 is used for determiningthe kernel parametersThe last 500 trading days that is fromFebruary 2 2012 to March 3 2014 is used for performancecomparison against the commonly used kernels To improvethe performance the logarithmic returns are normalized to[minus1 1] before being input into the SVR

42 Determining the Kernel Parameters Before evaluatingthe performance of the new kernel multiple parameters stillneed to be determined to make best use of the kernel Theundetermined parameters are119898 which measures how manyhistorical data are used 120582 which measures how fast theimpact of one event decays with time andΔ119905 which indicateshow we represent 1 minute numerically We optimize theparameters according to the out-of-sample forecast perfor-mance of the corresponding SVR which is evaluated byMSEand hit rate respectively MSE is the mean square error of thepredictions and is computed using the normalized returnsHit rate is the proportion of predicted returns that have thesame sign with the actual ones that is the directional forecastaccuracy rate

To determine 119898 all the other parameters are fixed and119898 varies from 1 to 30 The average MSE and hit rate in thefirst 500 trading days are plotted in Figures 1(a) and 1(b)respectively (Figures 1(a) and 1(b) are plotted with 120582 and Δ119905set to the optimal values Actually the trends of the plotsdo not vary with the other kernel parameters) The averageMSE decreases sharply with 119898 when 119898 le 21 is relativelyconstant when 119898 isin [21 26] and decreases slowly with 119898

when119898 isin [26 30]The average hit rate increases sharply with119898when119898 le 21 is relatively constant when119898 isin [21 25] andgets smaller afterwards Considering that smaller MSE andgreater hit rate are preferred 119898 = 30 and 119898 isin [21 25] areall appropriate choices suggested by the experiments In thefollowing comparative experiments 119898 is set to 25 (We havealso done comparative experiments with 119898 set to the othersuggested values and the results are consistent)

To determine 120582 all the other parameters are fixed and120582 varies from 1 to 50 The average MSE and hit rate in thefirst 500 trading days are plotted in Figures 2(a) and 2(b)respectively It is quite clear that the minimum average MSEand maximum average hit rate are both achieved at 120582 =

10 and thus we set 120582 to 10 in the following comparativeexperiments

The same method is used to optimize Δ119905 and the averageMSE and hit rate are plotted in Figures 3(a) and 3(b)respectively We can see that the minimum average MSE andmaximum average hit rate are both achieved at Δ119905 = 005and thus we set Δ119905 to 005 in the following comparativeexperiments

43 NewKernel versus CommonlyUsed Kernels Thenewker-nel is compared with the commonly used kernels in terms ofthe out-of-sample forecast performance of the correspondingSVR Although any function satisfying Mercerrsquos condition

Mathematical Problems in Engineering 5

003

0031

0032

0033

0034

0035

003650

0-da

y av

erag

e MSE

5 10 15 20 25 300(m)

(a) Average MSE versus119898

069

07

071

072

500-

day

aver

age h

it ra

te

5 10 15 20 25 300(m)

(b) Average hit rate versus119898

Figure 1 500-day average MSE and hit rate achieved with different values of119898

0031

0032

0033

0034

0035

0036

0037

500-

day

aver

age M

SE

10 20 30 40 500120582

(a) Average MSE versus 120582

067

068

069

07

071

072

07350

0-da

y av

erag

e hit

rate

10 20 30 40 500120582

(b) Average hit rate versus 120582

Figure 2 500-day average MSE and hit rate achieved with different values of 120582

can be used as the kernel function radial basis functionand sigmoid function are two widely used kernels Theformer tends to outperformothers under general smoothnessassumption [12] and the latter gives a particular kind of two-layer sigmoidal neural network [15] Thus these two kernelfunctions are used for performance comparison We realizeeach SVR by LIBSVM-320 [19] and the related codes aremodified for the new kernel

The out-of-sample MSE and hit rate are computed foreach of the three kernels on each of the last 500 trading daysThe results of the new kernel are plotted against those of theradial basis function kernel and the sigmoid function kernelin Figures 4 and 5 respectively

In Figure 4(a) the vertical and horizontal axes representthe MSE of the new kernel and the radial basis functionkernel respectively while in Figure 4(b) the vertical andhorizontal axes represent the hit rate of the newkernel and the

radial basis function kernel respectivelyThere are 500 pointscorresponding to the 500 trading days plotted in each subplotand the diagonal line representing 119910 = 119909 is for referenceWe can see that most points lie below the line 119910 = 119909 inFigure 4(a) and lie above the line 119910 = 119909 in Figure 4(b) Thisindicates that the newkernel leads to smallerMSE and greaterhit rate than the radial basis function kernel in most of the500 trading days Similarly Figures 5(a) and 5(b) indicate thatthe new kernel leads to smaller MSE and greater hit rate thanthe sigmoid function kernel in most of the 500 trading daysTherefore the SVR with the new kernel has obviously betterforecast performance in terms of both the MSE and the hitrate

In addition a simple trading strategy is carried out basedon the out-of-sample forecasts of the SVR with differentkernel specifications The initial capital is set as 100 oneach trading day The index is bought if the one-step-ahead

6 Mathematical Problems in Engineering

20 40 60 80 100 1200Δt(times100)

0031

0032

0033

0034

0035

0036

0037

500-

day

aver

age M

SE

(a) Average MSE versus Δ119905

20 40 60 80 100 1200Δt(times100)

069

07

071

072

073

500-

day

aver

age h

it ra

te

(b) Average hit rate versus Δ119905

Figure 3 500-day average MSE and hit rate achieved with different values of Δ119905

002 004 006 008 01 012 0140MSE-radial basis kernel

0

002

004

006

008

01

012

014

MSE

-new

ker

nel

(a) MSE

045 055 065 075 085035Hit rate-radial basis kernel

035

045

055

065

075

085H

it ra

te-n

ew k

erne

l

(b) Hit rate

Figure 4 New kernel versus radial basis function kernel from February 2 2012 to March 3 2014

predicted return is positive and exceeds a threshold and soldif the one-step-ahead predicted return is negative and below athreshold and no action is performed otherwiseThe thresh-old is set as the average of the (100 + 119898) normalized returnsin the training period divided by the scale coefficient ofln(1000) (The scale coefficient controls the trading strategyrsquossensitivity to index price change and a higher value leads tomore frequent tradingThe value of ln(1000) is arbitrarily setand can be adjusted The results are consistent as the scalecoefficient varies)

The variation of capital under such a strategy in the 110-minute out-of-sample period on February 2 2012 is plottedin Figure 6(a) We can see that the new kernel leads to highercapital gain than the other two kernels most of the timeAlso the average capital variation in the last 500 tradingdays is plotted in Figure 6(b) Unlike the fluctuant plots inFigure 6(a) the capital increases steadily when it is averaged

over the 500-day period no matter which kernel is usedThis confirms the effectiveness of SVR in forecasting high-frequency stock returns Once again the new kernel leads tothe highest capital gain and the advantage gets more obviousas the trading period is prolonged At the end of a day the newkernel leads to a return at about 06110min on averagewhile the resulting returns of the other two kernels are bothless than 04110min on average

Furthermore Studentrsquos 119905-test is used to test whether thenew kernel significantly outperforms the commonly usedkernels Specifically we calculate the differences of MSEhit rate and 110-minute capital gain between the forecastswith the new kernel and those with each comparative kernelrespectively on each of the last 500 trading days Table 1reports the mean (119909119889) and the standard deviation (119904119889) ofthese differences in these 500 trading days We test the nullhypothesis that ldquothe forecasts with the new kernel and those

Mathematical Problems in Engineering 7

002 004 006 008 01 012 0140MSE-sigmoid kernel

0

002

004

006

008

01

012

014M

SE-n

ew k

erne

l

(a) MSE

045 055 065 075 085035Hit rate-sigmoid kernel

035

045

055

065

075

085

Hit

rate

-new

ker

nel

(b) Hit rate

Figure 5 New kernel versus sigmoid function kernel from February 2 2012 to March 3 2014

New kernelRadial basis kernelSigmoid kernel

20 40 60 80 100 1200Time (min)

100

1005

101

1015

102

1025

Capi

tal

(a) Capital variation on February 2 2012

New kernelRadial basis kernelSigmoid kernel

100

1001

1002

1003

1004

1005

1006Av

erag

e cap

ital

20 40 60 80 100 1200Time (min)

(b) Average capital variation from February 2 2012 to March 3 2014

Figure 6 Capital variation of a simple trading strategy

with the comparative kernel have the same accuracy in termsof the specified criterionrdquo with the comparative kernel andthe criterion specified in the first row and the second row ofTable 1 respectively The 119905-statistics are reported in the lastrow of Table 1

We can easily see that the six null hypotheses are all signif-icantly rejected The new kernel has smaller MSE greater hitrate and higher 110-minute capital gain than both the radialbasis function kernel and the sigmoid function kernel Andthe differences are all significant at the 1 level Thereforethe results of Studentrsquos 119905-test indicate that the improvementin out-of-sample forecast accuracy brought about by the newkernel is significant both statistically and economically and

thus the newkernel is preferred in forecasting high-frequencystock returns

5 Summary and Conclusion

Support vector machine for regression is now widely appliedin time series forecasting problems Commonly used kernelssuch as the radial basis function kernel and the sigmoidfunction kernel are first derived mathematically for patternrecognition problems Although their direct applications intime series forecasting problems can generate remarkableperformance we argue that using a kernel designed according

8 Mathematical Problems in Engineering

Table 1 Out-of-sample forecast performance comparison results from February 2 2012 to March 3 2014 (119899 = 500)

Criterion New kernel versus (minus) radial basis kernel New kernel versus (minus) sigmoid kernelMSE Hit rate 110-min gain MSE Hit rate 110min gain

119909119889 minus15201119890 minus 3 48709119890 minus 2 20250119890 minus 3 minus13917119890 minus 3 53945119890 minus 2 23879119890 minus 3

119904119889 53684119890 minus 3 60785119890 minus 2 28530119890 minus 3 53658119890 minus 3 63875119890 minus 2 32202119890 minus 3

119905 =119909119889radic119899

119904119889

minus633lowastlowastlowast

1792lowastlowastlowast

1587lowastlowastlowast

minus580lowastlowastlowast

1888lowastlowastlowast

1658lowastlowastlowast

Note lowastlowastlowast indicates significance at the 1 level

to the specific nonlinear dynamics of the series under studycan further improve the forecast accuracy

Under the assumption that each high-frequency stockreturn is an event that triggers momentum and reversalperiodically we decompose each future return into a col-lection of decaying cosine waves that are functions of pastreturns Under realistic assumptions we reach an analyticalexpression of the nonlinear relationship between past andfuture returns and thus design a new kernel specificallyfor forecasting high-frequency stock returns Using high-frequency prices of Chinese CSI 300 index as empirical datawe determine the optimal parameters of the new kernel andthen compare the new kernel with the radial basis functionkernel and the sigmoid function kernel in terms of the SVRrsquosout-of-sample forecast accuracy It turns out that the newkernel significantly outperforms the other two kernels interms of the MSE the hit rate and the capital gain from asimple trading strategy

Our empirical experiments confirm that it is statisticallyand economically valuable to design a new kernel of SVRspecifically characterizing the nonlinear dynamics of thetime series under study Thus our results shed light onan alternative direction for improving the performance ofSVR Current study only utilizes past returns to predictfuture returns A natural extension is to introduce intradaytrading volumes and intraday highlow prices into the featurevector of the SVR and develop kernels characterizing thecorresponding nonlinear relationship between future returnand feature vector Another possible extension is to apply theSVR with new kernel in the energy markets where trading iscontinuous 24 hours We leave them for future work

Competing Interests

The authors declare that they have no competing interests

Acknowledgments

The authors are grateful for the financial support offered bytheNational Natural Science Foundation of China (71201075)and the Specialized Research Fund for the Doctoral Programof Higher Education of China (20120091120003)

References

[1] D A Hsieh ldquoTesting for nonlinear dependence in daily foreignexchange ratesrdquoThe Journal of Business vol 62 no 3 pp 339ndash368 1989

[2] J A Scheinkman and B LeBaron ldquoNonlinear dynamics andstock returnsrdquoThe Journal of Business vol 62 no 3 pp 311ndash3371989

[3] G S Atsalakis and K P Valavanis ldquoSurveying stock marketforecasting techniques-part II soft computingmethodsrdquoExpertSystems with Applications vol 36 no 3 pp 5932ndash5941 2009

[4] V N Vapnik The Nature Of Statistical Learning TheorySpringer New York NY USA 1995

[5] L Cao and F E H Tay ldquoFinancial forecasting using supportvector machinesrdquoNeural Computing amp Applications vol 10 no2 pp 184ndash192 2001

[6] L J Cao and F E H Tay ldquoSupport vector machine withadaptive parameters in financial time series forecastingrdquo IEEETransactions on Neural Networks vol 14 no 6 pp 1506ndash15182003

[7] K-J Kim ldquoFinancial time series forecasting using supportvector machinesrdquo Neurocomputing vol 55 no 1-2 pp 307ndash3192003

[8] V V Gavrishchaka and S Banerjee ldquoSupport vector machine asan efficient framework for stock market volatility forecastingrdquoComputational Management Science vol 3 no 2 pp 147ndash1602006

[9] S Sonnenburg G Ratsch C Schafer and B Scholkopf ldquoLargescale multiple kernel learningrdquo Journal of Machine LearningResearch vol 7 pp 1531ndash1565 2006

[10] A Rakotomamonjy F R Bach S Canu and Y GrandvaletldquoSimple MKLrdquo Journal of Machine Learning Research vol 9 pp2491ndash2521 2008

[11] C-Y Yeh C-W Huang and S-J Lee ldquoA multiple-kernelsupport vector regression approach for stock market priceforecastingrdquo Expert Systems with Applications vol 38 no 3 pp2177ndash2186 2011

[12] W Huang Y Nakamori and S-Y Wang ldquoForecasting stockmarket movement direction with support vector machinerdquoComputers and Operations Research vol 32 no 10 pp 2513ndash2522 2005

[13] J M Matıas and J C Reboredo ldquoForecasting performanceof nonlinear models for intraday stock returnsrdquo Journal ofForecasting vol 31 no 2 pp 172ndash188 2012

[14] J C Reboredo JMMatıas and R Garcia-Rubio ldquoNonlinearityin forecasting of high-frequency stock returnsrdquo ComputationalEconomics vol 40 no 3 pp 245ndash264 2012

[15] C J C Burges ldquoA tutorial on support vector machines forpattern recognitionrdquo Data Mining and Knowledge Discoveryvol 2 no 2 pp 121ndash167 1998

[16] J C Platt ldquoFast training of support vector machines usingsequential minimal optimizationrdquo in Advances in Kernel Meth-ods Support Vector Learning B Scholkopf C J C Burgesand A J Smola Eds chapter 12 pp 185ndash208 MIT PressCambridge Mass USA 1999

Mathematical Problems in Engineering 9

[17] T Chordia R Roll and A Subrahmanyam ldquoEvidence on thespeed of convergence to market efficiencyrdquo Journal of FinancialEconomics vol 76 no 2 pp 271ndash292 2005

[18] D Y Chung and K Hrazdil ldquoSpeed of convergence to marketefficiency the role of ECNsrdquo Journal of Empirical Finance vol19 no 5 pp 702ndash720 2012

[19] C W Hsu C C Chang and C J Lin ldquoA practical guideto support vector classificationrdquo Tech Rep Department ofComputer Science National Taiwan University 2003

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 4: Research Article A New Kernel of Support Vector Regression ...downloads.hindawi.com/journals/mpe/2016/4907654.pdf · neural network (ANN), and the support vector machine for regression

4 Mathematical Problems in Engineering

err (119909) = (119891(119902+1)

(120585)(119902 + 1))119909119902+1 the error from truncating

the Taylor expansions in (7) satisfies

err le[120596 (119899 minus 119894 119894Δ119905)

1003816100381610038161003816119903119899minus1198941003816100381610038161003816 119894Δ119905]119902+1

(119902 + 1)

= (120596 (119899 minus 119894 119894Δ119905)

1003816100381610038161003816119903119899minus1198941003816100381610038161003816 Δ119905

2120587⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟

≪1

)

119902+1

sdot(2120587119894)119902+1

(119902 + 1)

(8)

Thus setting 119902 = 3 can well ensure the accuracy of theapproximation

Now a much more accessible approximation of (7) isderived

119903119899 =

119898

sum

119894=1

119902

sum

119896=0

1198621015840

119894119896119903119899minus119894 (

1003816100381610038161003816119903119899minus1198941003816100381610038161003816 119894Δ119905)119896119890minus120582119894Δ119905

(9)

Equation (9) explicitly states how the mapping is con-structed For any past return series 119903119899minus119894

119898

119894=1 the original

feature vector is r119899minus1 = (119903119899minus1 119903119899minus2 119903119899minus119898) and themapping120601 is defined as

120601 (r119899minus1) = 119903119899minus119894 (1003816100381610038161003816119903119899minus119894

1003816100381610038161003816 119894Δ119905)119896119890minus120582119894Δ119905

119898119902

119894=1 119896=0 (10)

It is important to understand that the new feature vector120601(r) is 119898 times (119902 + 1) dimension with each element 120601119894119896 givenas 120601119894119896 = 119903119899minus119894(|119903119899minus119894|119894Δ119905)

119896119890minus120582119894Δ119905 Now that 120601(r) has finite

dimension the dimension of the new feature space X1015840 is alsofinite and the kernel function is naturally built as the innerproduct in such a space

119870(r(1) r(2)) =119898

sum

119894=1

119902

sum

119896=0

120601(1)

119894119896120601(2)

119894119896 (11)

Although it is theoretically important to examinewhetherthe newkernel satisfiesMercerrsquos condition or not consideringthat some kernels which fail to meet the condition stilllead to perfectly converged results [15] we will examine theappropriateness of the new kernel through experiments

4 Empirical Experiments

41 Data One-minute prices of Chinese CSI 300 index fromJanuary 4 2010 to March 3 2014 (1000 trading days) areused as empirical data in this study The data are obtainedfromWind Financial Terminal and days with missing pricesare deleted The official trading hours are from 930 to 1130and from 1300 to 1500 and thus we have 240 one-minutereturns per day The returns within the same trading day areused for learning and forecasting since the continuousnessof the time is important in the above derivation (Althoughthere is a 15-hour break in each trading day buy and sellorders submitted in themorning are still valid and neworderscan still be submitted during this period thus we deem thetime as continuous in each trading day) Specifically the first100 + 119898 returns within each trading day are set as the in-sample data (119898 le 30) which are used for learning and the

last 110 returns are set as the out-of-sample data which areused for prediction and evaluationThe average performanceof the SVR during the first 500 trading days that is fromJanuary 4 2010 to February 1 2012 is used for determiningthe kernel parametersThe last 500 trading days that is fromFebruary 2 2012 to March 3 2014 is used for performancecomparison against the commonly used kernels To improvethe performance the logarithmic returns are normalized to[minus1 1] before being input into the SVR

42 Determining the Kernel Parameters Before evaluatingthe performance of the new kernel multiple parameters stillneed to be determined to make best use of the kernel Theundetermined parameters are119898 which measures how manyhistorical data are used 120582 which measures how fast theimpact of one event decays with time andΔ119905 which indicateshow we represent 1 minute numerically We optimize theparameters according to the out-of-sample forecast perfor-mance of the corresponding SVR which is evaluated byMSEand hit rate respectively MSE is the mean square error of thepredictions and is computed using the normalized returnsHit rate is the proportion of predicted returns that have thesame sign with the actual ones that is the directional forecastaccuracy rate

To determine 119898 all the other parameters are fixed and119898 varies from 1 to 30 The average MSE and hit rate in thefirst 500 trading days are plotted in Figures 1(a) and 1(b)respectively (Figures 1(a) and 1(b) are plotted with 120582 and Δ119905set to the optimal values Actually the trends of the plotsdo not vary with the other kernel parameters) The averageMSE decreases sharply with 119898 when 119898 le 21 is relativelyconstant when 119898 isin [21 26] and decreases slowly with 119898

when119898 isin [26 30]The average hit rate increases sharply with119898when119898 le 21 is relatively constant when119898 isin [21 25] andgets smaller afterwards Considering that smaller MSE andgreater hit rate are preferred 119898 = 30 and 119898 isin [21 25] areall appropriate choices suggested by the experiments In thefollowing comparative experiments 119898 is set to 25 (We havealso done comparative experiments with 119898 set to the othersuggested values and the results are consistent)

To determine 120582 all the other parameters are fixed and120582 varies from 1 to 50 The average MSE and hit rate in thefirst 500 trading days are plotted in Figures 2(a) and 2(b)respectively It is quite clear that the minimum average MSEand maximum average hit rate are both achieved at 120582 =

10 and thus we set 120582 to 10 in the following comparativeexperiments

The same method is used to optimize Δ119905 and the averageMSE and hit rate are plotted in Figures 3(a) and 3(b)respectively We can see that the minimum average MSE andmaximum average hit rate are both achieved at Δ119905 = 005and thus we set Δ119905 to 005 in the following comparativeexperiments

43 NewKernel versus CommonlyUsed Kernels Thenewker-nel is compared with the commonly used kernels in terms ofthe out-of-sample forecast performance of the correspondingSVR Although any function satisfying Mercerrsquos condition

Mathematical Problems in Engineering 5

003

0031

0032

0033

0034

0035

003650

0-da

y av

erag

e MSE

5 10 15 20 25 300(m)

(a) Average MSE versus119898

069

07

071

072

500-

day

aver

age h

it ra

te

5 10 15 20 25 300(m)

(b) Average hit rate versus119898

Figure 1 500-day average MSE and hit rate achieved with different values of119898

0031

0032

0033

0034

0035

0036

0037

500-

day

aver

age M

SE

10 20 30 40 500120582

(a) Average MSE versus 120582

067

068

069

07

071

072

07350

0-da

y av

erag

e hit

rate

10 20 30 40 500120582

(b) Average hit rate versus 120582

Figure 2 500-day average MSE and hit rate achieved with different values of 120582

can be used as the kernel function radial basis functionand sigmoid function are two widely used kernels Theformer tends to outperformothers under general smoothnessassumption [12] and the latter gives a particular kind of two-layer sigmoidal neural network [15] Thus these two kernelfunctions are used for performance comparison We realizeeach SVR by LIBSVM-320 [19] and the related codes aremodified for the new kernel

The out-of-sample MSE and hit rate are computed foreach of the three kernels on each of the last 500 trading daysThe results of the new kernel are plotted against those of theradial basis function kernel and the sigmoid function kernelin Figures 4 and 5 respectively

In Figure 4(a) the vertical and horizontal axes representthe MSE of the new kernel and the radial basis functionkernel respectively while in Figure 4(b) the vertical andhorizontal axes represent the hit rate of the newkernel and the

radial basis function kernel respectivelyThere are 500 pointscorresponding to the 500 trading days plotted in each subplotand the diagonal line representing 119910 = 119909 is for referenceWe can see that most points lie below the line 119910 = 119909 inFigure 4(a) and lie above the line 119910 = 119909 in Figure 4(b) Thisindicates that the newkernel leads to smallerMSE and greaterhit rate than the radial basis function kernel in most of the500 trading days Similarly Figures 5(a) and 5(b) indicate thatthe new kernel leads to smaller MSE and greater hit rate thanthe sigmoid function kernel in most of the 500 trading daysTherefore the SVR with the new kernel has obviously betterforecast performance in terms of both the MSE and the hitrate

In addition a simple trading strategy is carried out basedon the out-of-sample forecasts of the SVR with differentkernel specifications The initial capital is set as 100 oneach trading day The index is bought if the one-step-ahead

6 Mathematical Problems in Engineering

20 40 60 80 100 1200Δt(times100)

0031

0032

0033

0034

0035

0036

0037

500-

day

aver

age M

SE

(a) Average MSE versus Δ119905

20 40 60 80 100 1200Δt(times100)

069

07

071

072

073

500-

day

aver

age h

it ra

te

(b) Average hit rate versus Δ119905

Figure 3 500-day average MSE and hit rate achieved with different values of Δ119905

002 004 006 008 01 012 0140MSE-radial basis kernel

0

002

004

006

008

01

012

014

MSE

-new

ker

nel

(a) MSE

045 055 065 075 085035Hit rate-radial basis kernel

035

045

055

065

075

085H

it ra

te-n

ew k

erne

l

(b) Hit rate

Figure 4 New kernel versus radial basis function kernel from February 2 2012 to March 3 2014

predicted return is positive and exceeds a threshold and soldif the one-step-ahead predicted return is negative and below athreshold and no action is performed otherwiseThe thresh-old is set as the average of the (100 + 119898) normalized returnsin the training period divided by the scale coefficient ofln(1000) (The scale coefficient controls the trading strategyrsquossensitivity to index price change and a higher value leads tomore frequent tradingThe value of ln(1000) is arbitrarily setand can be adjusted The results are consistent as the scalecoefficient varies)

The variation of capital under such a strategy in the 110-minute out-of-sample period on February 2 2012 is plottedin Figure 6(a) We can see that the new kernel leads to highercapital gain than the other two kernels most of the timeAlso the average capital variation in the last 500 tradingdays is plotted in Figure 6(b) Unlike the fluctuant plots inFigure 6(a) the capital increases steadily when it is averaged

over the 500-day period no matter which kernel is usedThis confirms the effectiveness of SVR in forecasting high-frequency stock returns Once again the new kernel leads tothe highest capital gain and the advantage gets more obviousas the trading period is prolonged At the end of a day the newkernel leads to a return at about 06110min on averagewhile the resulting returns of the other two kernels are bothless than 04110min on average

Furthermore Studentrsquos 119905-test is used to test whether thenew kernel significantly outperforms the commonly usedkernels Specifically we calculate the differences of MSEhit rate and 110-minute capital gain between the forecastswith the new kernel and those with each comparative kernelrespectively on each of the last 500 trading days Table 1reports the mean (119909119889) and the standard deviation (119904119889) ofthese differences in these 500 trading days We test the nullhypothesis that ldquothe forecasts with the new kernel and those

Mathematical Problems in Engineering 7

002 004 006 008 01 012 0140MSE-sigmoid kernel

0

002

004

006

008

01

012

014M

SE-n

ew k

erne

l

(a) MSE

045 055 065 075 085035Hit rate-sigmoid kernel

035

045

055

065

075

085

Hit

rate

-new

ker

nel

(b) Hit rate

Figure 5 New kernel versus sigmoid function kernel from February 2 2012 to March 3 2014

New kernelRadial basis kernelSigmoid kernel

20 40 60 80 100 1200Time (min)

100

1005

101

1015

102

1025

Capi

tal

(a) Capital variation on February 2 2012

New kernelRadial basis kernelSigmoid kernel

100

1001

1002

1003

1004

1005

1006Av

erag

e cap

ital

20 40 60 80 100 1200Time (min)

(b) Average capital variation from February 2 2012 to March 3 2014

Figure 6 Capital variation of a simple trading strategy

with the comparative kernel have the same accuracy in termsof the specified criterionrdquo with the comparative kernel andthe criterion specified in the first row and the second row ofTable 1 respectively The 119905-statistics are reported in the lastrow of Table 1

We can easily see that the six null hypotheses are all signif-icantly rejected The new kernel has smaller MSE greater hitrate and higher 110-minute capital gain than both the radialbasis function kernel and the sigmoid function kernel Andthe differences are all significant at the 1 level Thereforethe results of Studentrsquos 119905-test indicate that the improvementin out-of-sample forecast accuracy brought about by the newkernel is significant both statistically and economically and

thus the newkernel is preferred in forecasting high-frequencystock returns

5 Summary and Conclusion

Support vector machine for regression is now widely appliedin time series forecasting problems Commonly used kernelssuch as the radial basis function kernel and the sigmoidfunction kernel are first derived mathematically for patternrecognition problems Although their direct applications intime series forecasting problems can generate remarkableperformance we argue that using a kernel designed according

8 Mathematical Problems in Engineering

Table 1 Out-of-sample forecast performance comparison results from February 2 2012 to March 3 2014 (119899 = 500)

Criterion New kernel versus (minus) radial basis kernel New kernel versus (minus) sigmoid kernelMSE Hit rate 110-min gain MSE Hit rate 110min gain

119909119889 minus15201119890 minus 3 48709119890 minus 2 20250119890 minus 3 minus13917119890 minus 3 53945119890 minus 2 23879119890 minus 3

119904119889 53684119890 minus 3 60785119890 minus 2 28530119890 minus 3 53658119890 minus 3 63875119890 minus 2 32202119890 minus 3

119905 =119909119889radic119899

119904119889

minus633lowastlowastlowast

1792lowastlowastlowast

1587lowastlowastlowast

minus580lowastlowastlowast

1888lowastlowastlowast

1658lowastlowastlowast

Note lowastlowastlowast indicates significance at the 1 level

to the specific nonlinear dynamics of the series under studycan further improve the forecast accuracy

Under the assumption that each high-frequency stockreturn is an event that triggers momentum and reversalperiodically we decompose each future return into a col-lection of decaying cosine waves that are functions of pastreturns Under realistic assumptions we reach an analyticalexpression of the nonlinear relationship between past andfuture returns and thus design a new kernel specificallyfor forecasting high-frequency stock returns Using high-frequency prices of Chinese CSI 300 index as empirical datawe determine the optimal parameters of the new kernel andthen compare the new kernel with the radial basis functionkernel and the sigmoid function kernel in terms of the SVRrsquosout-of-sample forecast accuracy It turns out that the newkernel significantly outperforms the other two kernels interms of the MSE the hit rate and the capital gain from asimple trading strategy

Our empirical experiments confirm that it is statisticallyand economically valuable to design a new kernel of SVRspecifically characterizing the nonlinear dynamics of thetime series under study Thus our results shed light onan alternative direction for improving the performance ofSVR Current study only utilizes past returns to predictfuture returns A natural extension is to introduce intradaytrading volumes and intraday highlow prices into the featurevector of the SVR and develop kernels characterizing thecorresponding nonlinear relationship between future returnand feature vector Another possible extension is to apply theSVR with new kernel in the energy markets where trading iscontinuous 24 hours We leave them for future work

Competing Interests

The authors declare that they have no competing interests

Acknowledgments

The authors are grateful for the financial support offered bytheNational Natural Science Foundation of China (71201075)and the Specialized Research Fund for the Doctoral Programof Higher Education of China (20120091120003)

References

[1] D A Hsieh ldquoTesting for nonlinear dependence in daily foreignexchange ratesrdquoThe Journal of Business vol 62 no 3 pp 339ndash368 1989

[2] J A Scheinkman and B LeBaron ldquoNonlinear dynamics andstock returnsrdquoThe Journal of Business vol 62 no 3 pp 311ndash3371989

[3] G S Atsalakis and K P Valavanis ldquoSurveying stock marketforecasting techniques-part II soft computingmethodsrdquoExpertSystems with Applications vol 36 no 3 pp 5932ndash5941 2009

[4] V N Vapnik The Nature Of Statistical Learning TheorySpringer New York NY USA 1995

[5] L Cao and F E H Tay ldquoFinancial forecasting using supportvector machinesrdquoNeural Computing amp Applications vol 10 no2 pp 184ndash192 2001

[6] L J Cao and F E H Tay ldquoSupport vector machine withadaptive parameters in financial time series forecastingrdquo IEEETransactions on Neural Networks vol 14 no 6 pp 1506ndash15182003

[7] K-J Kim ldquoFinancial time series forecasting using supportvector machinesrdquo Neurocomputing vol 55 no 1-2 pp 307ndash3192003

[8] V V Gavrishchaka and S Banerjee ldquoSupport vector machine asan efficient framework for stock market volatility forecastingrdquoComputational Management Science vol 3 no 2 pp 147ndash1602006

[9] S Sonnenburg G Ratsch C Schafer and B Scholkopf ldquoLargescale multiple kernel learningrdquo Journal of Machine LearningResearch vol 7 pp 1531ndash1565 2006

[10] A Rakotomamonjy F R Bach S Canu and Y GrandvaletldquoSimple MKLrdquo Journal of Machine Learning Research vol 9 pp2491ndash2521 2008

[11] C-Y Yeh C-W Huang and S-J Lee ldquoA multiple-kernelsupport vector regression approach for stock market priceforecastingrdquo Expert Systems with Applications vol 38 no 3 pp2177ndash2186 2011

[12] W Huang Y Nakamori and S-Y Wang ldquoForecasting stockmarket movement direction with support vector machinerdquoComputers and Operations Research vol 32 no 10 pp 2513ndash2522 2005

[13] J M Matıas and J C Reboredo ldquoForecasting performanceof nonlinear models for intraday stock returnsrdquo Journal ofForecasting vol 31 no 2 pp 172ndash188 2012

[14] J C Reboredo JMMatıas and R Garcia-Rubio ldquoNonlinearityin forecasting of high-frequency stock returnsrdquo ComputationalEconomics vol 40 no 3 pp 245ndash264 2012

[15] C J C Burges ldquoA tutorial on support vector machines forpattern recognitionrdquo Data Mining and Knowledge Discoveryvol 2 no 2 pp 121ndash167 1998

[16] J C Platt ldquoFast training of support vector machines usingsequential minimal optimizationrdquo in Advances in Kernel Meth-ods Support Vector Learning B Scholkopf C J C Burgesand A J Smola Eds chapter 12 pp 185ndash208 MIT PressCambridge Mass USA 1999

Mathematical Problems in Engineering 9

[17] T Chordia R Roll and A Subrahmanyam ldquoEvidence on thespeed of convergence to market efficiencyrdquo Journal of FinancialEconomics vol 76 no 2 pp 271ndash292 2005

[18] D Y Chung and K Hrazdil ldquoSpeed of convergence to marketefficiency the role of ECNsrdquo Journal of Empirical Finance vol19 no 5 pp 702ndash720 2012

[19] C W Hsu C C Chang and C J Lin ldquoA practical guideto support vector classificationrdquo Tech Rep Department ofComputer Science National Taiwan University 2003

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 5: Research Article A New Kernel of Support Vector Regression ...downloads.hindawi.com/journals/mpe/2016/4907654.pdf · neural network (ANN), and the support vector machine for regression

Mathematical Problems in Engineering 5

003

0031

0032

0033

0034

0035

003650

0-da

y av

erag

e MSE

5 10 15 20 25 300(m)

(a) Average MSE versus119898

069

07

071

072

500-

day

aver

age h

it ra

te

5 10 15 20 25 300(m)

(b) Average hit rate versus119898

Figure 1 500-day average MSE and hit rate achieved with different values of119898

0031

0032

0033

0034

0035

0036

0037

500-

day

aver

age M

SE

10 20 30 40 500120582

(a) Average MSE versus 120582

067

068

069

07

071

072

07350

0-da

y av

erag

e hit

rate

10 20 30 40 500120582

(b) Average hit rate versus 120582

Figure 2 500-day average MSE and hit rate achieved with different values of 120582

can be used as the kernel function radial basis functionand sigmoid function are two widely used kernels Theformer tends to outperformothers under general smoothnessassumption [12] and the latter gives a particular kind of two-layer sigmoidal neural network [15] Thus these two kernelfunctions are used for performance comparison We realizeeach SVR by LIBSVM-320 [19] and the related codes aremodified for the new kernel

The out-of-sample MSE and hit rate are computed foreach of the three kernels on each of the last 500 trading daysThe results of the new kernel are plotted against those of theradial basis function kernel and the sigmoid function kernelin Figures 4 and 5 respectively

In Figure 4(a) the vertical and horizontal axes representthe MSE of the new kernel and the radial basis functionkernel respectively while in Figure 4(b) the vertical andhorizontal axes represent the hit rate of the newkernel and the

radial basis function kernel respectivelyThere are 500 pointscorresponding to the 500 trading days plotted in each subplotand the diagonal line representing 119910 = 119909 is for referenceWe can see that most points lie below the line 119910 = 119909 inFigure 4(a) and lie above the line 119910 = 119909 in Figure 4(b) Thisindicates that the newkernel leads to smallerMSE and greaterhit rate than the radial basis function kernel in most of the500 trading days Similarly Figures 5(a) and 5(b) indicate thatthe new kernel leads to smaller MSE and greater hit rate thanthe sigmoid function kernel in most of the 500 trading daysTherefore the SVR with the new kernel has obviously betterforecast performance in terms of both the MSE and the hitrate

In addition a simple trading strategy is carried out basedon the out-of-sample forecasts of the SVR with differentkernel specifications The initial capital is set as 100 oneach trading day The index is bought if the one-step-ahead

6 Mathematical Problems in Engineering

20 40 60 80 100 1200Δt(times100)

0031

0032

0033

0034

0035

0036

0037

500-

day

aver

age M

SE

(a) Average MSE versus Δ119905

20 40 60 80 100 1200Δt(times100)

069

07

071

072

073

500-

day

aver

age h

it ra

te

(b) Average hit rate versus Δ119905

Figure 3 500-day average MSE and hit rate achieved with different values of Δ119905

002 004 006 008 01 012 0140MSE-radial basis kernel

0

002

004

006

008

01

012

014

MSE

-new

ker

nel

(a) MSE

045 055 065 075 085035Hit rate-radial basis kernel

035

045

055

065

075

085H

it ra

te-n

ew k

erne

l

(b) Hit rate

Figure 4 New kernel versus radial basis function kernel from February 2 2012 to March 3 2014

predicted return is positive and exceeds a threshold and soldif the one-step-ahead predicted return is negative and below athreshold and no action is performed otherwiseThe thresh-old is set as the average of the (100 + 119898) normalized returnsin the training period divided by the scale coefficient ofln(1000) (The scale coefficient controls the trading strategyrsquossensitivity to index price change and a higher value leads tomore frequent tradingThe value of ln(1000) is arbitrarily setand can be adjusted The results are consistent as the scalecoefficient varies)

The variation of capital under such a strategy in the 110-minute out-of-sample period on February 2 2012 is plottedin Figure 6(a) We can see that the new kernel leads to highercapital gain than the other two kernels most of the timeAlso the average capital variation in the last 500 tradingdays is plotted in Figure 6(b) Unlike the fluctuant plots inFigure 6(a) the capital increases steadily when it is averaged

over the 500-day period no matter which kernel is usedThis confirms the effectiveness of SVR in forecasting high-frequency stock returns Once again the new kernel leads tothe highest capital gain and the advantage gets more obviousas the trading period is prolonged At the end of a day the newkernel leads to a return at about 06110min on averagewhile the resulting returns of the other two kernels are bothless than 04110min on average

Furthermore Studentrsquos 119905-test is used to test whether thenew kernel significantly outperforms the commonly usedkernels Specifically we calculate the differences of MSEhit rate and 110-minute capital gain between the forecastswith the new kernel and those with each comparative kernelrespectively on each of the last 500 trading days Table 1reports the mean (119909119889) and the standard deviation (119904119889) ofthese differences in these 500 trading days We test the nullhypothesis that ldquothe forecasts with the new kernel and those

Mathematical Problems in Engineering 7

002 004 006 008 01 012 0140MSE-sigmoid kernel

0

002

004

006

008

01

012

014M

SE-n

ew k

erne

l

(a) MSE

045 055 065 075 085035Hit rate-sigmoid kernel

035

045

055

065

075

085

Hit

rate

-new

ker

nel

(b) Hit rate

Figure 5 New kernel versus sigmoid function kernel from February 2 2012 to March 3 2014

New kernelRadial basis kernelSigmoid kernel

20 40 60 80 100 1200Time (min)

100

1005

101

1015

102

1025

Capi

tal

(a) Capital variation on February 2 2012

New kernelRadial basis kernelSigmoid kernel

100

1001

1002

1003

1004

1005

1006Av

erag

e cap

ital

20 40 60 80 100 1200Time (min)

(b) Average capital variation from February 2 2012 to March 3 2014

Figure 6 Capital variation of a simple trading strategy

with the comparative kernel have the same accuracy in termsof the specified criterionrdquo with the comparative kernel andthe criterion specified in the first row and the second row ofTable 1 respectively The 119905-statistics are reported in the lastrow of Table 1

We can easily see that the six null hypotheses are all signif-icantly rejected The new kernel has smaller MSE greater hitrate and higher 110-minute capital gain than both the radialbasis function kernel and the sigmoid function kernel Andthe differences are all significant at the 1 level Thereforethe results of Studentrsquos 119905-test indicate that the improvementin out-of-sample forecast accuracy brought about by the newkernel is significant both statistically and economically and

thus the newkernel is preferred in forecasting high-frequencystock returns

5 Summary and Conclusion

Support vector machine for regression is now widely appliedin time series forecasting problems Commonly used kernelssuch as the radial basis function kernel and the sigmoidfunction kernel are first derived mathematically for patternrecognition problems Although their direct applications intime series forecasting problems can generate remarkableperformance we argue that using a kernel designed according

8 Mathematical Problems in Engineering

Table 1 Out-of-sample forecast performance comparison results from February 2 2012 to March 3 2014 (119899 = 500)

Criterion New kernel versus (minus) radial basis kernel New kernel versus (minus) sigmoid kernelMSE Hit rate 110-min gain MSE Hit rate 110min gain

119909119889 minus15201119890 minus 3 48709119890 minus 2 20250119890 minus 3 minus13917119890 minus 3 53945119890 minus 2 23879119890 minus 3

119904119889 53684119890 minus 3 60785119890 minus 2 28530119890 minus 3 53658119890 minus 3 63875119890 minus 2 32202119890 minus 3

119905 =119909119889radic119899

119904119889

minus633lowastlowastlowast

1792lowastlowastlowast

1587lowastlowastlowast

minus580lowastlowastlowast

1888lowastlowastlowast

1658lowastlowastlowast

Note lowastlowastlowast indicates significance at the 1 level

to the specific nonlinear dynamics of the series under studycan further improve the forecast accuracy

Under the assumption that each high-frequency stockreturn is an event that triggers momentum and reversalperiodically we decompose each future return into a col-lection of decaying cosine waves that are functions of pastreturns Under realistic assumptions we reach an analyticalexpression of the nonlinear relationship between past andfuture returns and thus design a new kernel specificallyfor forecasting high-frequency stock returns Using high-frequency prices of Chinese CSI 300 index as empirical datawe determine the optimal parameters of the new kernel andthen compare the new kernel with the radial basis functionkernel and the sigmoid function kernel in terms of the SVRrsquosout-of-sample forecast accuracy It turns out that the newkernel significantly outperforms the other two kernels interms of the MSE the hit rate and the capital gain from asimple trading strategy

Our empirical experiments confirm that it is statisticallyand economically valuable to design a new kernel of SVRspecifically characterizing the nonlinear dynamics of thetime series under study Thus our results shed light onan alternative direction for improving the performance ofSVR Current study only utilizes past returns to predictfuture returns A natural extension is to introduce intradaytrading volumes and intraday highlow prices into the featurevector of the SVR and develop kernels characterizing thecorresponding nonlinear relationship between future returnand feature vector Another possible extension is to apply theSVR with new kernel in the energy markets where trading iscontinuous 24 hours We leave them for future work

Competing Interests

The authors declare that they have no competing interests

Acknowledgments

The authors are grateful for the financial support offered bytheNational Natural Science Foundation of China (71201075)and the Specialized Research Fund for the Doctoral Programof Higher Education of China (20120091120003)

References

[1] D A Hsieh ldquoTesting for nonlinear dependence in daily foreignexchange ratesrdquoThe Journal of Business vol 62 no 3 pp 339ndash368 1989

[2] J A Scheinkman and B LeBaron ldquoNonlinear dynamics andstock returnsrdquoThe Journal of Business vol 62 no 3 pp 311ndash3371989

[3] G S Atsalakis and K P Valavanis ldquoSurveying stock marketforecasting techniques-part II soft computingmethodsrdquoExpertSystems with Applications vol 36 no 3 pp 5932ndash5941 2009

[4] V N Vapnik The Nature Of Statistical Learning TheorySpringer New York NY USA 1995

[5] L Cao and F E H Tay ldquoFinancial forecasting using supportvector machinesrdquoNeural Computing amp Applications vol 10 no2 pp 184ndash192 2001

[6] L J Cao and F E H Tay ldquoSupport vector machine withadaptive parameters in financial time series forecastingrdquo IEEETransactions on Neural Networks vol 14 no 6 pp 1506ndash15182003

[7] K-J Kim ldquoFinancial time series forecasting using supportvector machinesrdquo Neurocomputing vol 55 no 1-2 pp 307ndash3192003

[8] V V Gavrishchaka and S Banerjee ldquoSupport vector machine asan efficient framework for stock market volatility forecastingrdquoComputational Management Science vol 3 no 2 pp 147ndash1602006

[9] S Sonnenburg G Ratsch C Schafer and B Scholkopf ldquoLargescale multiple kernel learningrdquo Journal of Machine LearningResearch vol 7 pp 1531ndash1565 2006

[10] A Rakotomamonjy F R Bach S Canu and Y GrandvaletldquoSimple MKLrdquo Journal of Machine Learning Research vol 9 pp2491ndash2521 2008

[11] C-Y Yeh C-W Huang and S-J Lee ldquoA multiple-kernelsupport vector regression approach for stock market priceforecastingrdquo Expert Systems with Applications vol 38 no 3 pp2177ndash2186 2011

[12] W Huang Y Nakamori and S-Y Wang ldquoForecasting stockmarket movement direction with support vector machinerdquoComputers and Operations Research vol 32 no 10 pp 2513ndash2522 2005

[13] J M Matıas and J C Reboredo ldquoForecasting performanceof nonlinear models for intraday stock returnsrdquo Journal ofForecasting vol 31 no 2 pp 172ndash188 2012

[14] J C Reboredo JMMatıas and R Garcia-Rubio ldquoNonlinearityin forecasting of high-frequency stock returnsrdquo ComputationalEconomics vol 40 no 3 pp 245ndash264 2012

[15] C J C Burges ldquoA tutorial on support vector machines forpattern recognitionrdquo Data Mining and Knowledge Discoveryvol 2 no 2 pp 121ndash167 1998

[16] J C Platt ldquoFast training of support vector machines usingsequential minimal optimizationrdquo in Advances in Kernel Meth-ods Support Vector Learning B Scholkopf C J C Burgesand A J Smola Eds chapter 12 pp 185ndash208 MIT PressCambridge Mass USA 1999

Mathematical Problems in Engineering 9

[17] T Chordia R Roll and A Subrahmanyam ldquoEvidence on thespeed of convergence to market efficiencyrdquo Journal of FinancialEconomics vol 76 no 2 pp 271ndash292 2005

[18] D Y Chung and K Hrazdil ldquoSpeed of convergence to marketefficiency the role of ECNsrdquo Journal of Empirical Finance vol19 no 5 pp 702ndash720 2012

[19] C W Hsu C C Chang and C J Lin ldquoA practical guideto support vector classificationrdquo Tech Rep Department ofComputer Science National Taiwan University 2003

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 6: Research Article A New Kernel of Support Vector Regression ...downloads.hindawi.com/journals/mpe/2016/4907654.pdf · neural network (ANN), and the support vector machine for regression

6 Mathematical Problems in Engineering

20 40 60 80 100 1200Δt(times100)

0031

0032

0033

0034

0035

0036

0037

500-

day

aver

age M

SE

(a) Average MSE versus Δ119905

20 40 60 80 100 1200Δt(times100)

069

07

071

072

073

500-

day

aver

age h

it ra

te

(b) Average hit rate versus Δ119905

Figure 3 500-day average MSE and hit rate achieved with different values of Δ119905

002 004 006 008 01 012 0140MSE-radial basis kernel

0

002

004

006

008

01

012

014

MSE

-new

ker

nel

(a) MSE

045 055 065 075 085035Hit rate-radial basis kernel

035

045

055

065

075

085H

it ra

te-n

ew k

erne

l

(b) Hit rate

Figure 4 New kernel versus radial basis function kernel from February 2 2012 to March 3 2014

predicted return is positive and exceeds a threshold and soldif the one-step-ahead predicted return is negative and below athreshold and no action is performed otherwiseThe thresh-old is set as the average of the (100 + 119898) normalized returnsin the training period divided by the scale coefficient ofln(1000) (The scale coefficient controls the trading strategyrsquossensitivity to index price change and a higher value leads tomore frequent tradingThe value of ln(1000) is arbitrarily setand can be adjusted The results are consistent as the scalecoefficient varies)

The variation of capital under such a strategy in the 110-minute out-of-sample period on February 2 2012 is plottedin Figure 6(a) We can see that the new kernel leads to highercapital gain than the other two kernels most of the timeAlso the average capital variation in the last 500 tradingdays is plotted in Figure 6(b) Unlike the fluctuant plots inFigure 6(a) the capital increases steadily when it is averaged

over the 500-day period no matter which kernel is usedThis confirms the effectiveness of SVR in forecasting high-frequency stock returns Once again the new kernel leads tothe highest capital gain and the advantage gets more obviousas the trading period is prolonged At the end of a day the newkernel leads to a return at about 06110min on averagewhile the resulting returns of the other two kernels are bothless than 04110min on average

Furthermore Studentrsquos 119905-test is used to test whether thenew kernel significantly outperforms the commonly usedkernels Specifically we calculate the differences of MSEhit rate and 110-minute capital gain between the forecastswith the new kernel and those with each comparative kernelrespectively on each of the last 500 trading days Table 1reports the mean (119909119889) and the standard deviation (119904119889) ofthese differences in these 500 trading days We test the nullhypothesis that ldquothe forecasts with the new kernel and those

Mathematical Problems in Engineering 7

002 004 006 008 01 012 0140MSE-sigmoid kernel

0

002

004

006

008

01

012

014M

SE-n

ew k

erne

l

(a) MSE

045 055 065 075 085035Hit rate-sigmoid kernel

035

045

055

065

075

085

Hit

rate

-new

ker

nel

(b) Hit rate

Figure 5 New kernel versus sigmoid function kernel from February 2 2012 to March 3 2014

New kernelRadial basis kernelSigmoid kernel

20 40 60 80 100 1200Time (min)

100

1005

101

1015

102

1025

Capi

tal

(a) Capital variation on February 2 2012

New kernelRadial basis kernelSigmoid kernel

100

1001

1002

1003

1004

1005

1006Av

erag

e cap

ital

20 40 60 80 100 1200Time (min)

(b) Average capital variation from February 2 2012 to March 3 2014

Figure 6 Capital variation of a simple trading strategy

with the comparative kernel have the same accuracy in termsof the specified criterionrdquo with the comparative kernel andthe criterion specified in the first row and the second row ofTable 1 respectively The 119905-statistics are reported in the lastrow of Table 1

We can easily see that the six null hypotheses are all signif-icantly rejected The new kernel has smaller MSE greater hitrate and higher 110-minute capital gain than both the radialbasis function kernel and the sigmoid function kernel Andthe differences are all significant at the 1 level Thereforethe results of Studentrsquos 119905-test indicate that the improvementin out-of-sample forecast accuracy brought about by the newkernel is significant both statistically and economically and

thus the newkernel is preferred in forecasting high-frequencystock returns

5 Summary and Conclusion

Support vector machine for regression is now widely appliedin time series forecasting problems Commonly used kernelssuch as the radial basis function kernel and the sigmoidfunction kernel are first derived mathematically for patternrecognition problems Although their direct applications intime series forecasting problems can generate remarkableperformance we argue that using a kernel designed according

8 Mathematical Problems in Engineering

Table 1 Out-of-sample forecast performance comparison results from February 2 2012 to March 3 2014 (119899 = 500)

Criterion New kernel versus (minus) radial basis kernel New kernel versus (minus) sigmoid kernelMSE Hit rate 110-min gain MSE Hit rate 110min gain

119909119889 minus15201119890 minus 3 48709119890 minus 2 20250119890 minus 3 minus13917119890 minus 3 53945119890 minus 2 23879119890 minus 3

119904119889 53684119890 minus 3 60785119890 minus 2 28530119890 minus 3 53658119890 minus 3 63875119890 minus 2 32202119890 minus 3

119905 =119909119889radic119899

119904119889

minus633lowastlowastlowast

1792lowastlowastlowast

1587lowastlowastlowast

minus580lowastlowastlowast

1888lowastlowastlowast

1658lowastlowastlowast

Note lowastlowastlowast indicates significance at the 1 level

to the specific nonlinear dynamics of the series under studycan further improve the forecast accuracy

Under the assumption that each high-frequency stockreturn is an event that triggers momentum and reversalperiodically we decompose each future return into a col-lection of decaying cosine waves that are functions of pastreturns Under realistic assumptions we reach an analyticalexpression of the nonlinear relationship between past andfuture returns and thus design a new kernel specificallyfor forecasting high-frequency stock returns Using high-frequency prices of Chinese CSI 300 index as empirical datawe determine the optimal parameters of the new kernel andthen compare the new kernel with the radial basis functionkernel and the sigmoid function kernel in terms of the SVRrsquosout-of-sample forecast accuracy It turns out that the newkernel significantly outperforms the other two kernels interms of the MSE the hit rate and the capital gain from asimple trading strategy

Our empirical experiments confirm that it is statisticallyand economically valuable to design a new kernel of SVRspecifically characterizing the nonlinear dynamics of thetime series under study Thus our results shed light onan alternative direction for improving the performance ofSVR Current study only utilizes past returns to predictfuture returns A natural extension is to introduce intradaytrading volumes and intraday highlow prices into the featurevector of the SVR and develop kernels characterizing thecorresponding nonlinear relationship between future returnand feature vector Another possible extension is to apply theSVR with new kernel in the energy markets where trading iscontinuous 24 hours We leave them for future work

Competing Interests

The authors declare that they have no competing interests

Acknowledgments

The authors are grateful for the financial support offered bytheNational Natural Science Foundation of China (71201075)and the Specialized Research Fund for the Doctoral Programof Higher Education of China (20120091120003)

References

[1] D A Hsieh ldquoTesting for nonlinear dependence in daily foreignexchange ratesrdquoThe Journal of Business vol 62 no 3 pp 339ndash368 1989

[2] J A Scheinkman and B LeBaron ldquoNonlinear dynamics andstock returnsrdquoThe Journal of Business vol 62 no 3 pp 311ndash3371989

[3] G S Atsalakis and K P Valavanis ldquoSurveying stock marketforecasting techniques-part II soft computingmethodsrdquoExpertSystems with Applications vol 36 no 3 pp 5932ndash5941 2009

[4] V N Vapnik The Nature Of Statistical Learning TheorySpringer New York NY USA 1995

[5] L Cao and F E H Tay ldquoFinancial forecasting using supportvector machinesrdquoNeural Computing amp Applications vol 10 no2 pp 184ndash192 2001

[6] L J Cao and F E H Tay ldquoSupport vector machine withadaptive parameters in financial time series forecastingrdquo IEEETransactions on Neural Networks vol 14 no 6 pp 1506ndash15182003

[7] K-J Kim ldquoFinancial time series forecasting using supportvector machinesrdquo Neurocomputing vol 55 no 1-2 pp 307ndash3192003

[8] V V Gavrishchaka and S Banerjee ldquoSupport vector machine asan efficient framework for stock market volatility forecastingrdquoComputational Management Science vol 3 no 2 pp 147ndash1602006

[9] S Sonnenburg G Ratsch C Schafer and B Scholkopf ldquoLargescale multiple kernel learningrdquo Journal of Machine LearningResearch vol 7 pp 1531ndash1565 2006

[10] A Rakotomamonjy F R Bach S Canu and Y GrandvaletldquoSimple MKLrdquo Journal of Machine Learning Research vol 9 pp2491ndash2521 2008

[11] C-Y Yeh C-W Huang and S-J Lee ldquoA multiple-kernelsupport vector regression approach for stock market priceforecastingrdquo Expert Systems with Applications vol 38 no 3 pp2177ndash2186 2011

[12] W Huang Y Nakamori and S-Y Wang ldquoForecasting stockmarket movement direction with support vector machinerdquoComputers and Operations Research vol 32 no 10 pp 2513ndash2522 2005

[13] J M Matıas and J C Reboredo ldquoForecasting performanceof nonlinear models for intraday stock returnsrdquo Journal ofForecasting vol 31 no 2 pp 172ndash188 2012

[14] J C Reboredo JMMatıas and R Garcia-Rubio ldquoNonlinearityin forecasting of high-frequency stock returnsrdquo ComputationalEconomics vol 40 no 3 pp 245ndash264 2012

[15] C J C Burges ldquoA tutorial on support vector machines forpattern recognitionrdquo Data Mining and Knowledge Discoveryvol 2 no 2 pp 121ndash167 1998

[16] J C Platt ldquoFast training of support vector machines usingsequential minimal optimizationrdquo in Advances in Kernel Meth-ods Support Vector Learning B Scholkopf C J C Burgesand A J Smola Eds chapter 12 pp 185ndash208 MIT PressCambridge Mass USA 1999

Mathematical Problems in Engineering 9

[17] T Chordia R Roll and A Subrahmanyam ldquoEvidence on thespeed of convergence to market efficiencyrdquo Journal of FinancialEconomics vol 76 no 2 pp 271ndash292 2005

[18] D Y Chung and K Hrazdil ldquoSpeed of convergence to marketefficiency the role of ECNsrdquo Journal of Empirical Finance vol19 no 5 pp 702ndash720 2012

[19] C W Hsu C C Chang and C J Lin ldquoA practical guideto support vector classificationrdquo Tech Rep Department ofComputer Science National Taiwan University 2003

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 7: Research Article A New Kernel of Support Vector Regression ...downloads.hindawi.com/journals/mpe/2016/4907654.pdf · neural network (ANN), and the support vector machine for regression

Mathematical Problems in Engineering 7

002 004 006 008 01 012 0140MSE-sigmoid kernel

0

002

004

006

008

01

012

014M

SE-n

ew k

erne

l

(a) MSE

045 055 065 075 085035Hit rate-sigmoid kernel

035

045

055

065

075

085

Hit

rate

-new

ker

nel

(b) Hit rate

Figure 5 New kernel versus sigmoid function kernel from February 2 2012 to March 3 2014

New kernelRadial basis kernelSigmoid kernel

20 40 60 80 100 1200Time (min)

100

1005

101

1015

102

1025

Capi

tal

(a) Capital variation on February 2 2012

New kernelRadial basis kernelSigmoid kernel

100

1001

1002

1003

1004

1005

1006Av

erag

e cap

ital

20 40 60 80 100 1200Time (min)

(b) Average capital variation from February 2 2012 to March 3 2014

Figure 6 Capital variation of a simple trading strategy

with the comparative kernel have the same accuracy in termsof the specified criterionrdquo with the comparative kernel andthe criterion specified in the first row and the second row ofTable 1 respectively The 119905-statistics are reported in the lastrow of Table 1

We can easily see that the six null hypotheses are all signif-icantly rejected The new kernel has smaller MSE greater hitrate and higher 110-minute capital gain than both the radialbasis function kernel and the sigmoid function kernel Andthe differences are all significant at the 1 level Thereforethe results of Studentrsquos 119905-test indicate that the improvementin out-of-sample forecast accuracy brought about by the newkernel is significant both statistically and economically and

thus the newkernel is preferred in forecasting high-frequencystock returns

5 Summary and Conclusion

Support vector machine for regression is now widely appliedin time series forecasting problems Commonly used kernelssuch as the radial basis function kernel and the sigmoidfunction kernel are first derived mathematically for patternrecognition problems Although their direct applications intime series forecasting problems can generate remarkableperformance we argue that using a kernel designed according

8 Mathematical Problems in Engineering

Table 1 Out-of-sample forecast performance comparison results from February 2 2012 to March 3 2014 (119899 = 500)

Criterion New kernel versus (minus) radial basis kernel New kernel versus (minus) sigmoid kernelMSE Hit rate 110-min gain MSE Hit rate 110min gain

119909119889 minus15201119890 minus 3 48709119890 minus 2 20250119890 minus 3 minus13917119890 minus 3 53945119890 minus 2 23879119890 minus 3

119904119889 53684119890 minus 3 60785119890 minus 2 28530119890 minus 3 53658119890 minus 3 63875119890 minus 2 32202119890 minus 3

119905 =119909119889radic119899

119904119889

minus633lowastlowastlowast

1792lowastlowastlowast

1587lowastlowastlowast

minus580lowastlowastlowast

1888lowastlowastlowast

1658lowastlowastlowast

Note lowastlowastlowast indicates significance at the 1 level

to the specific nonlinear dynamics of the series under studycan further improve the forecast accuracy

Under the assumption that each high-frequency stockreturn is an event that triggers momentum and reversalperiodically we decompose each future return into a col-lection of decaying cosine waves that are functions of pastreturns Under realistic assumptions we reach an analyticalexpression of the nonlinear relationship between past andfuture returns and thus design a new kernel specificallyfor forecasting high-frequency stock returns Using high-frequency prices of Chinese CSI 300 index as empirical datawe determine the optimal parameters of the new kernel andthen compare the new kernel with the radial basis functionkernel and the sigmoid function kernel in terms of the SVRrsquosout-of-sample forecast accuracy It turns out that the newkernel significantly outperforms the other two kernels interms of the MSE the hit rate and the capital gain from asimple trading strategy

Our empirical experiments confirm that it is statisticallyand economically valuable to design a new kernel of SVRspecifically characterizing the nonlinear dynamics of thetime series under study Thus our results shed light onan alternative direction for improving the performance ofSVR Current study only utilizes past returns to predictfuture returns A natural extension is to introduce intradaytrading volumes and intraday highlow prices into the featurevector of the SVR and develop kernels characterizing thecorresponding nonlinear relationship between future returnand feature vector Another possible extension is to apply theSVR with new kernel in the energy markets where trading iscontinuous 24 hours We leave them for future work

Competing Interests

The authors declare that they have no competing interests

Acknowledgments

The authors are grateful for the financial support offered bytheNational Natural Science Foundation of China (71201075)and the Specialized Research Fund for the Doctoral Programof Higher Education of China (20120091120003)

References

[1] D A Hsieh ldquoTesting for nonlinear dependence in daily foreignexchange ratesrdquoThe Journal of Business vol 62 no 3 pp 339ndash368 1989

[2] J A Scheinkman and B LeBaron ldquoNonlinear dynamics andstock returnsrdquoThe Journal of Business vol 62 no 3 pp 311ndash3371989

[3] G S Atsalakis and K P Valavanis ldquoSurveying stock marketforecasting techniques-part II soft computingmethodsrdquoExpertSystems with Applications vol 36 no 3 pp 5932ndash5941 2009

[4] V N Vapnik The Nature Of Statistical Learning TheorySpringer New York NY USA 1995

[5] L Cao and F E H Tay ldquoFinancial forecasting using supportvector machinesrdquoNeural Computing amp Applications vol 10 no2 pp 184ndash192 2001

[6] L J Cao and F E H Tay ldquoSupport vector machine withadaptive parameters in financial time series forecastingrdquo IEEETransactions on Neural Networks vol 14 no 6 pp 1506ndash15182003

[7] K-J Kim ldquoFinancial time series forecasting using supportvector machinesrdquo Neurocomputing vol 55 no 1-2 pp 307ndash3192003

[8] V V Gavrishchaka and S Banerjee ldquoSupport vector machine asan efficient framework for stock market volatility forecastingrdquoComputational Management Science vol 3 no 2 pp 147ndash1602006

[9] S Sonnenburg G Ratsch C Schafer and B Scholkopf ldquoLargescale multiple kernel learningrdquo Journal of Machine LearningResearch vol 7 pp 1531ndash1565 2006

[10] A Rakotomamonjy F R Bach S Canu and Y GrandvaletldquoSimple MKLrdquo Journal of Machine Learning Research vol 9 pp2491ndash2521 2008

[11] C-Y Yeh C-W Huang and S-J Lee ldquoA multiple-kernelsupport vector regression approach for stock market priceforecastingrdquo Expert Systems with Applications vol 38 no 3 pp2177ndash2186 2011

[12] W Huang Y Nakamori and S-Y Wang ldquoForecasting stockmarket movement direction with support vector machinerdquoComputers and Operations Research vol 32 no 10 pp 2513ndash2522 2005

[13] J M Matıas and J C Reboredo ldquoForecasting performanceof nonlinear models for intraday stock returnsrdquo Journal ofForecasting vol 31 no 2 pp 172ndash188 2012

[14] J C Reboredo JMMatıas and R Garcia-Rubio ldquoNonlinearityin forecasting of high-frequency stock returnsrdquo ComputationalEconomics vol 40 no 3 pp 245ndash264 2012

[15] C J C Burges ldquoA tutorial on support vector machines forpattern recognitionrdquo Data Mining and Knowledge Discoveryvol 2 no 2 pp 121ndash167 1998

[16] J C Platt ldquoFast training of support vector machines usingsequential minimal optimizationrdquo in Advances in Kernel Meth-ods Support Vector Learning B Scholkopf C J C Burgesand A J Smola Eds chapter 12 pp 185ndash208 MIT PressCambridge Mass USA 1999

Mathematical Problems in Engineering 9

[17] T Chordia R Roll and A Subrahmanyam ldquoEvidence on thespeed of convergence to market efficiencyrdquo Journal of FinancialEconomics vol 76 no 2 pp 271ndash292 2005

[18] D Y Chung and K Hrazdil ldquoSpeed of convergence to marketefficiency the role of ECNsrdquo Journal of Empirical Finance vol19 no 5 pp 702ndash720 2012

[19] C W Hsu C C Chang and C J Lin ldquoA practical guideto support vector classificationrdquo Tech Rep Department ofComputer Science National Taiwan University 2003

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 8: Research Article A New Kernel of Support Vector Regression ...downloads.hindawi.com/journals/mpe/2016/4907654.pdf · neural network (ANN), and the support vector machine for regression

8 Mathematical Problems in Engineering

Table 1 Out-of-sample forecast performance comparison results from February 2 2012 to March 3 2014 (119899 = 500)

Criterion New kernel versus (minus) radial basis kernel New kernel versus (minus) sigmoid kernelMSE Hit rate 110-min gain MSE Hit rate 110min gain

119909119889 minus15201119890 minus 3 48709119890 minus 2 20250119890 minus 3 minus13917119890 minus 3 53945119890 minus 2 23879119890 minus 3

119904119889 53684119890 minus 3 60785119890 minus 2 28530119890 minus 3 53658119890 minus 3 63875119890 minus 2 32202119890 minus 3

119905 =119909119889radic119899

119904119889

minus633lowastlowastlowast

1792lowastlowastlowast

1587lowastlowastlowast

minus580lowastlowastlowast

1888lowastlowastlowast

1658lowastlowastlowast

Note lowastlowastlowast indicates significance at the 1 level

to the specific nonlinear dynamics of the series under studycan further improve the forecast accuracy

Under the assumption that each high-frequency stockreturn is an event that triggers momentum and reversalperiodically we decompose each future return into a col-lection of decaying cosine waves that are functions of pastreturns Under realistic assumptions we reach an analyticalexpression of the nonlinear relationship between past andfuture returns and thus design a new kernel specificallyfor forecasting high-frequency stock returns Using high-frequency prices of Chinese CSI 300 index as empirical datawe determine the optimal parameters of the new kernel andthen compare the new kernel with the radial basis functionkernel and the sigmoid function kernel in terms of the SVRrsquosout-of-sample forecast accuracy It turns out that the newkernel significantly outperforms the other two kernels interms of the MSE the hit rate and the capital gain from asimple trading strategy

Our empirical experiments confirm that it is statisticallyand economically valuable to design a new kernel of SVRspecifically characterizing the nonlinear dynamics of thetime series under study Thus our results shed light onan alternative direction for improving the performance ofSVR Current study only utilizes past returns to predictfuture returns A natural extension is to introduce intradaytrading volumes and intraday highlow prices into the featurevector of the SVR and develop kernels characterizing thecorresponding nonlinear relationship between future returnand feature vector Another possible extension is to apply theSVR with new kernel in the energy markets where trading iscontinuous 24 hours We leave them for future work

Competing Interests

The authors declare that they have no competing interests

Acknowledgments

The authors are grateful for the financial support offered bytheNational Natural Science Foundation of China (71201075)and the Specialized Research Fund for the Doctoral Programof Higher Education of China (20120091120003)

References

[1] D A Hsieh ldquoTesting for nonlinear dependence in daily foreignexchange ratesrdquoThe Journal of Business vol 62 no 3 pp 339ndash368 1989

[2] J A Scheinkman and B LeBaron ldquoNonlinear dynamics andstock returnsrdquoThe Journal of Business vol 62 no 3 pp 311ndash3371989

[3] G S Atsalakis and K P Valavanis ldquoSurveying stock marketforecasting techniques-part II soft computingmethodsrdquoExpertSystems with Applications vol 36 no 3 pp 5932ndash5941 2009

[4] V N Vapnik The Nature Of Statistical Learning TheorySpringer New York NY USA 1995

[5] L Cao and F E H Tay ldquoFinancial forecasting using supportvector machinesrdquoNeural Computing amp Applications vol 10 no2 pp 184ndash192 2001

[6] L J Cao and F E H Tay ldquoSupport vector machine withadaptive parameters in financial time series forecastingrdquo IEEETransactions on Neural Networks vol 14 no 6 pp 1506ndash15182003

[7] K-J Kim ldquoFinancial time series forecasting using supportvector machinesrdquo Neurocomputing vol 55 no 1-2 pp 307ndash3192003

[8] V V Gavrishchaka and S Banerjee ldquoSupport vector machine asan efficient framework for stock market volatility forecastingrdquoComputational Management Science vol 3 no 2 pp 147ndash1602006

[9] S Sonnenburg G Ratsch C Schafer and B Scholkopf ldquoLargescale multiple kernel learningrdquo Journal of Machine LearningResearch vol 7 pp 1531ndash1565 2006

[10] A Rakotomamonjy F R Bach S Canu and Y GrandvaletldquoSimple MKLrdquo Journal of Machine Learning Research vol 9 pp2491ndash2521 2008

[11] C-Y Yeh C-W Huang and S-J Lee ldquoA multiple-kernelsupport vector regression approach for stock market priceforecastingrdquo Expert Systems with Applications vol 38 no 3 pp2177ndash2186 2011

[12] W Huang Y Nakamori and S-Y Wang ldquoForecasting stockmarket movement direction with support vector machinerdquoComputers and Operations Research vol 32 no 10 pp 2513ndash2522 2005

[13] J M Matıas and J C Reboredo ldquoForecasting performanceof nonlinear models for intraday stock returnsrdquo Journal ofForecasting vol 31 no 2 pp 172ndash188 2012

[14] J C Reboredo JMMatıas and R Garcia-Rubio ldquoNonlinearityin forecasting of high-frequency stock returnsrdquo ComputationalEconomics vol 40 no 3 pp 245ndash264 2012

[15] C J C Burges ldquoA tutorial on support vector machines forpattern recognitionrdquo Data Mining and Knowledge Discoveryvol 2 no 2 pp 121ndash167 1998

[16] J C Platt ldquoFast training of support vector machines usingsequential minimal optimizationrdquo in Advances in Kernel Meth-ods Support Vector Learning B Scholkopf C J C Burgesand A J Smola Eds chapter 12 pp 185ndash208 MIT PressCambridge Mass USA 1999

Mathematical Problems in Engineering 9

[17] T Chordia R Roll and A Subrahmanyam ldquoEvidence on thespeed of convergence to market efficiencyrdquo Journal of FinancialEconomics vol 76 no 2 pp 271ndash292 2005

[18] D Y Chung and K Hrazdil ldquoSpeed of convergence to marketefficiency the role of ECNsrdquo Journal of Empirical Finance vol19 no 5 pp 702ndash720 2012

[19] C W Hsu C C Chang and C J Lin ldquoA practical guideto support vector classificationrdquo Tech Rep Department ofComputer Science National Taiwan University 2003

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 9: Research Article A New Kernel of Support Vector Regression ...downloads.hindawi.com/journals/mpe/2016/4907654.pdf · neural network (ANN), and the support vector machine for regression

Mathematical Problems in Engineering 9

[17] T Chordia R Roll and A Subrahmanyam ldquoEvidence on thespeed of convergence to market efficiencyrdquo Journal of FinancialEconomics vol 76 no 2 pp 271ndash292 2005

[18] D Y Chung and K Hrazdil ldquoSpeed of convergence to marketefficiency the role of ECNsrdquo Journal of Empirical Finance vol19 no 5 pp 702ndash720 2012

[19] C W Hsu C C Chang and C J Lin ldquoA practical guideto support vector classificationrdquo Tech Rep Department ofComputer Science National Taiwan University 2003

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 10: Research Article A New Kernel of Support Vector Regression ...downloads.hindawi.com/journals/mpe/2016/4907654.pdf · neural network (ANN), and the support vector machine for regression

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of