Optimal variables for describing the evolution of NO2 concentration

arX

iv:1

111.

2008

v2 [

phys

ics.

data

-an]

14

May

201

2

Searching for optimal variables in real multivariate stochastic data

F. Raischela, A. Russob, M. Haasec, D. Kleinhansd,e, P.G. Linda, f

aCenter for Theoretical and Computational Physics, University of Lisbon, Av. Prof. Gama Pinto 2, 1649-003 Lisbon, Portugal

bCenter for Geophysics , IDL, University of Lisbon 1749-016 Lisboa, Portugal

cInstitute for High Performance Computing, University of Stuttgart, Nobelstr. 19, D-70569 Stuttgart, Germany

dInstitute for Biological and Environmental Sciences, University of Gothenburg, Box 461, SE-405 30 Goteborg, Sweden

eInstitute of Theoretical Physics, University of Munster,D-48149 Munster, Germany

f Departamento de Fısica, Faculdade de Ciencias da Universidade de Lisboa, 1649-003 Lisboa, Portugal

Abstract

By implementing a recent technique for the determination ofstochastic eigendirections of two coupled stochastic variables, weinvestigate the evolution of fluctuations of NO2 concentrations at two monitoring stations in the city of Lisbon, Portugal. We analyzethe stochastic part of the measurements recorded at the monitoring stations by means of a method where the two concentrationsare considered as stochastic variables evolving accordingto a system of coupled stochastic differential equations. Analysis of theirstructure allows for transforming the set of measured variables to a set of derived variables, one of them with reduced stochasticity.For the specific case of NO2 concentration measures, the set of derived variables are well approximated by a global rotation of theoriginal set of measured variables. We conclude that the stochastic sources at each station are independent from each other andtypically have amplitudes of the order of the deterministiccontributions. Such findings show significant limitations when predictingsuch quantities. Still, we briefly discuss how predictive power can be increased in general in the light of our methods.

Keywords: Stochastic Systems, Environmental Research, Pollutants,Langevin EquationPACS:[2010] 02.50.Ga, 02.50.Ey, 92.60.Sz,

Preprint submitted to arxiv December 6, 2013

http://arxiv.org/abs/1111.2008v2

Searching for optimal variables in real multivariate stochastic data

F. Raischela, A. Russob, M. Haasec, D. Kleinhansd,e, P.G. Linda, f

aCenter for Theoretical and Computational Physics, University of Lisbon, Av. Prof. Gama Pinto 2, 1649-003 Lisbon, Portugal

bCenter for Geophysics , IDL, University of Lisbon 1749-016 Lisboa, Portugal

cInstitute for High Performance Computing, University of Stuttgart, Nobelstr. 19, D-70569 Stuttgart, Germany

dInstitute for Biological and Environmental Sciences, University of Gothenburg, Box 461, SE-405 30 Goteborg, Sweden

eInstitute of Theoretical Physics, University of Munster,D-48149 Munster, Germany

f Departamento de Fısica, Faculdade de Ciencias da Universidade de Lisboa, 1649-003 Lisboa, Portugal

Contents

1 Introduction 2

2 NO2 measurements in Lisbon 3

3 Modeling stochasticity in series of NO2 concentra-tions: Langevin processes 4

4 Analysis of Markov properties for the series of NO2

concentrations 6

5 Deriving optimal variables: eigensystem for NO2

measurements at different stations 6

6 Transform of NO2 concentrations to the stochasticeigendirections 7

7 Comparison with standard methodologies 9

8 Discussion and Conclusion 10

1. Introduction

The industrial and urban development during the last decadeshas led to a general decrease of air quality, drastically affectingurban environmental and human life quality. Although, accord-ing to the European Environment Agency report[1], air qualityhas improved in general during the last years, this enhancementwas not significant enough to ensure good air quality in all ur-ban areas. One of the pollutants with negative impact on healthand environment is NO2. Anthropogenic NO2 is mainly emittedby vehicles and industrial processes. NO2 has not only severeeffects on health causing e.g. respiratory and cardiovasculardis-eases, it also affects the environment[2] as nitrogen depositionleads to eutrophication[3]. A better understanding of the mech-anisms that influence production, transport, and decompositionof NO2 is therefore important. Previous studies revealed thattemperature, wind speed and direction, relative humidity,cloudcover, dew point temperature, sea level pressure, precipitation,

Figure 1: NO2 measurement stations in the region of Lisbon (Portugal) at theSouthwestern coast of Europe. In this paper we focus on the set of measure-ments taken at the stations of Chelas and Avenida da Liberdade with approxi-mately 105 data points each extracted in the period between 1995 and 2006.

and mixing layer height are relevant meteorological variablesto model the concentrations of air pollutants[2, 4, 5, 6, 7].Inparticular, approaches that deal with the evolution of the NO2

concentration at individual city locations are important for fore-casting the air quality of urban regions.

Recently a framework[9, 10] for analyzing measurements oncomplex systems was introduced, aiming for a quantitative es-timation of drift and diffusion functions from measured data.These functions can be identified with the deterministic andstochastic contributions to the dynamics, respectively, and givea considerable insight into the underlying systems. The frame-work was already successfully applied for instance to describeturbulent flows[9] and the evolution of climate indices[11,29],stock market indices[12], and oil prices[13]. At the same time,the basic method has been refined in particular with respect tothe impact of finite sampling effects [14, 15], the impact of mea-surement noise[16, 17, 18], and the role of local eigendirections

Preprint submitted to arxiv December 6, 2013

of the diffusion matrices [19].

In this paper, we aim to apply some recent methods for de-riving variables with reduced stochastic fluctuations to empiri-cal data. Namely, we adapt this framework for analyzing mea-surements of NO2 concentrations in the metropolitan region ofLisbon, Portugal (see Fig. 1), taken over several years. We ar-gue that the temporal fluctuations of these concentrations re-sult from two independent contributions: one periodic and onestochastic. The periodic part describes daily, weekly, seasonaland yearly variations of the concentration, which is an acceptedand well-studied result[8]. The stochastic contribution can bemodelled through a stochastic differential equation[23] havingtwo terms, one drift forcing (deterministic) and one diffusivefluctuation (stochastic).

When addressing stochastic higher-dimensional systems itistypically difficult to identify the variables most relevant for aproper description of the system’s evolution. In geophysicalapplications, the reduction of the full set of variables to only afew variables often is achieved by means of the so-called Princi-pal Component Analysis (PCA)[21] or other standard reductionmethods, such as stepwise regression or ARIMA[22]. However,the inherent fluctuations are not so commonly investigated.

In this paper we will apply a recent method for reconstruct-ing the phase space of two stochastic variables, which evolveaccording to a set of two coupled stochastic equations definedthrough drift vectors and diffusion matrices [19]. The method isbased on the eigenvalues of the diffusion matrices, from whichit is possible to derive a path in phase space through which thedeterministic contribution is enhanced. This technique impliesa transformation of variables and allows for the investigationof the minimal number of independent sources of stochasticforcing in the system. In particular, a rather small eigenvalueof the diffusion matrix, compared to the average value of allthe other, corresponds to one eigendirection in which stochasticfluctuations may be neglected, reducing the number of stochas-tic variables taken for describing the system’s evolution.On thecontrary, having all eigenvalues of the same order of magnitudemeans that the number of independent stochastic forces equalsthe number of variables. Moreover, as we will see, a direct in-spection of the diffusion functions enables one to ascertain ifthe stochastic contributions, one for each variable, are coupledamong them or not. Therefore, we argue that the diagonaliza-tion of the diffusion matrices gives insight into the system.

We start in Sec. 2 by describing the properties and the prepa-ration of the data set. Consecutively, in Sec. 3 and 4, the mod-eling of the time series as a Langevin process is carried out andits transformation to a new coordinate system are describedinSecs. 5 and 6 respectively. In Sec. 7 we discuss the perfor-mance of the transformation of the coordinates obtained by ourapproach compared to other techniques commonly used for sta-tistical analysis of measured data. Section 8 closes this Letterwith a general summary and ideas on the interpretation of thetransformed time series with respect to the underlying environ-mental processes.

Figure 2: (a) Time series of the NO2 concentration at the station of Chelas,before detrending according to Eq. (2), and(b) a zoom-in of these “raw” timeseriesy1 compared to the detrended seriesx1, which takes averages of 52-weeksperiods, and then a second detrend with daily averages. Vertical offset of sameplots are done for clarity. For the station at Avenida da Liberdade similar fea-tures are found (not shown).

2. NO2 measurements in Lisbon

In this section we briefly describe the sets of data analyzed inthis paper as well as its preparation for analyzing the stochasticcomponents of the measurements.

The data set covers hourly measurements of NO2 concentra-tion, taken at 22 stations in the urban center of Lisbon recordedfrom of 1995 to 2006. For this study we choose the datafrom 1995 to 2005 for the monitoring stations at Chelas andat Avenida da Liberdade. These stations are located at a dis-tance of∼ 4 km from each other, see Fig. 1. In the following,the NO2 concentrations at the stations of Chelas and Avenidada Liberdade will be designated asy1(t) andy2(t), respectively,omitting the temporal dependency when not necessary.

Increments in time are always of 1 hour. Each of the datasets contains 105 measurement points approximately, includingsome periods of incomplete or erroneous measurements that aredisregarded for our analysis. In the case of the chosen stations,the series of measurementsy1 andy2 contain 3726 and 4548instances of measurement errors, respectively.

The concentration of NO2 is strongly driven by daily, weekly,monthly and yearly anthropogenic routines, and also by peri-

3

odic atmospheric processes. For instance the rush hours onworking days have an almost immediate impact on the NO2

concentration, and thus, on air quality. The 24 hours and oneweek cycles are both traffic related and mirror daily and weeklycycles. The measurements of NO2 are therefore influenced bydifferent periodic forcings and, since we are interested in thefluctuations of NO2 concentrations, the periodic behavior mustbe first detrended. The detrended series fory1 andy2, repre-sented below asx1 andx2 respectively, are obtained as follows.

One first partitions the data in segments of lengthN, whichwe suppose to be a multiple of relevant periodic fluctuationsinthe data set. As a second step, a mean segment is calculated byaveraging measurements with the same position in the segmentover the entire data set according to

Ni(n) ≡ 〈yi(t)|t = n+mN,m= 0, 1, . . .〉 (1)

for n = 0, 1, ...,N − 1. The detrended data setxi is then calcu-lated by subtracting the respective values of the mean segmentfrom the measured data,

xi(t) ≡ yi(t) − Ni(t modN) . (2)

for t = 1, . . . ,T with T > N. If T is the size of the data set,our simulations have shown that averages overN = 52 weeksis the best choice for the entire data set, to take into accountall known periodicities mentioned above. With this detrendingmethod, some periodicities with variable phase remain. To filteralso these periodicities, a second detrending withN = 1 day isthen performed on consecutive periods ofT = 14 days.

Figure 2a shows the original datay1 for the station of Chelas.A zoom-in of a small time interval is plot in Fig. 2b togetherwith the corresponding detrended datax1. From now on, if notstated explicitly otherwise, we will only consider the detrendedtime seriesx1 and x2. Next describe their characteristics bymeans of a stochastic process.

3. Modeling stochasticity in series of NO2 concentrations:Langevin processes

The detrended seriesxi in Eq. (2) reflect the remainingstochastic components of the measurements at the respectivestations of Chelas and Avenida da Liberdade. In this sectionwe assume that, with two variables, the stochastic process ismodeled by a system of two coupled Langevin equations, con-taining a deterministic and a stochastic part, described througha drift vector and a diffusion matrix, respectively.

For the general case of aK-dimensional state vectorX =(x1, ..., xK), the Ito-Langevin equations describing the evolutionof a particular trajectory in time read [23, 24]:

dXdt= h(X) + g(X)Γ(t), (3)

whereΓ = (Γ1, . . . , ΓK) is a set ofK independent stochasticforces with Gaussian distribution fulfilling

〈Γi(t)〉 = 0 (4a)

〈Γi(t)Γ j(t′)〉 = 2δi jδ(t − t′). (4b)

τ

−2

0

2

M(1)/τ

Raw Data

τ

Detrended

0 2 4 6 8

τ

0

100

200

M(2)/τ τc =5

0 2 4 6 8

τ

(a) (b)

(c) (d)

(−12.6,−18.5)(−6.3,−9.3)(0.0,0.0)

(9.5,13.9)

Figure 3: First conditional momentsM(1) for (a) the original series and(b) thedetrended series, with different NO2 concentrations (x1, x2) at each one of thetwo stations (see legend). The corresponding second conditional momentsM(2)

are shown in(c) and(d), respectively. These moments are computed accordingto Eqs. (9a) and (9b). While the original data presents oscillations beyond agiven time intervalτc ∼ 5, the detrended time series does not (see text). Thevalue of the corresponding Kramers-Moyal coefficient at the value of (x1, x2)chosen is given by Eq. (8) for the lowest value ofτ, i.e. one.

The two terms on the right hand side of Eq. (3) include both thedeterministic contribution,h = {hi}, and the stochastic contri-bution,g = {gi j }. The deterministic contribution describes thephysical forces which drive the system, while functionsg ac-count for the amplitudes of the different sources of fluctuationsΓ[10].

The coefficientsh andg are directly related to the drift vec-tors and diffusion matrices[23]

D(1)i (X) = hi(X) (5)

D(2)i j (X) =

K∑

k=1

gik(X)g jk(X) (6)

for i, j = 1, . . . ,K, describing the evolution of the joint proba-bility density function (PDF)f (X, t) by means of the Fokker-Planck equation [23, 24]:

∂

∂tf (X, t) = −

K∑

k=1

∂

∂xkD(1)

k (X) f (X, t)

+

K∑

k=1

K∑

m=1

∂2

∂xk∂xmD(2)

km(X) f (X, t). (7)

As done previously in other contexts[10, 11, 12, 13, 14, 16,17], the drift vector and the diffusion matrix can be derived di-rectly from the data.

Statistically, the drift and diffusion coefficients coefficientsD(1)

i andD(2)i j are defined as

D(k)(X) = limτ→0

1τ

M (k)(X, τ)k!

, (8)

4

x1

−15−10

−50

510

15x2

−20−10

010

20

h1

−3

0

3

a)

x1

−15−10

−50

510

15x2

−20−10

010

20

h2

−3

0

3

b)

x1

−15−10

−50

510

15

x2

−20

−10

0

10

20

D11

0

50

c)

x1

−15−10

−50

510

15

x2

−20

−10

0

10

20

D12

0

50

d)

x1

−15−10

−50

510

15

x2

−20

−10

0

10

20

D22

0

50

e)

Figure 4: For the detrended series we plot(a-b) both components of the drift vectorh = (h1, h2) and(c-e) the components of diffusion matrixD(2) = {D(2)i j }. The

corresponding fitted surfaces (black) are vertically offset for clarity. SinceD(2) is symmetric (see text) one hasD(2)12 = D(2)

21 .

with the first and second conditional moments given by

M(1)i (X, τ) = 〈Yi(t + τ) − Yi(t)|Y(t) = X〉 (9a)

M(2)i j (X, τ) = 〈(Yi(t + τ) − Yi(t))

·(Yj(t + τ) − Yj(t))|Y(t) = X⟩

, (9b)

Here 〈·|Y(t) = X〉 symbolizes conditional averaging over allevents that fulfill the conditionY(t) = X.

To determine the underlying Langevin equations, defined inEq. (3), one additionally needs to solve Eqs. (5) and (6). In par-ticular, the calculation of matricesg from the diffusion matricesrequires to solveD(2) = ggT, e.g. by means of diagonalization,D(2)

diag = PD(2)P−1, with P the orthogonal matrix of eigenvectors

of D(2). The family of solutions is given byg = PT√

D(2)diagPO,

whereO is an arbitrary orthogonal matrix, obeyingOOT = 1.The matricesD(2) are symmetric and positive semi-definite withall their eigenvalues real and non-negative (see Eq. (9b)),and

therefore√

D(2)diag is well-defined. For any choice ofO the anal-

ysis below does not change, and therefore we choose for sim-plicity O as the identity matrix.

The computation of the conditional moments is based ontheir statisticalτ-dependence for smallτ[10, 17]. Previousworks showed that Eqs. (9a) and (9b) are an operational defini-tion of the conditional moments that can easily be implementedfor the direct estimation of the drift and diffusion coefficientsfrom the data[10, 17]. In some practical situations, the limitin Eq. (8) can be approximated by the slope of a linear fit ofthe corresponding conditional moments at smallτ. When suchlinear fit is not possible, an alternative estimate is to consider

the first value ofM(τ)/τ at the lowest value ofτ[14]. We willuse this latter estimate for deriving the drift and diffusion coef-ficients, underlying the evolution of NO2 concentration in Lis-bon.

Within this framework, we consider the two-dimensionalsystem of NO2 concentrationsX = (x1, x2) describing the fluc-tuations at the stations of Chelas and Avenida da Liberdade,seeFig. 1. In order to comply with a Langevin process, as definedin Eq. (3), we first verify that both data sets exhibit Markovianproperties, which we show next for componentx1 only, for sakeof clarity. Forx2 the results are similar.

As Fig. 3 indicates, the conditional moments of the time se-ries show no evidence of measurement noise asτ approacheszero[16]: M(1)/τ do not diverge whenτ → 0. This is true,both before and after detrending. Forτ smaller than a limitingvalueτc, some oscillations are observed in the case without de-trending, although they have no impact on the estimate of thecorresponding Kramers-Moyal coefficients, as compared to ourmethod of using the value atτ = 1 as estimate. For details seeRef. [14].

The resulting components of the drift and diffusion coeffi-cients are plotted in Fig. 4. As one sees all surfaces are ade-quately fitted by a quadratic polynomial

p(x1, x2) = a1x21 + a2x2

2 + a3x1x2 + a4x1 + a5x2 + a6 , (10)

wherep denotes the drift and diffusion components,D(1)i and

D(2)i j respectively, and the coefficientsai are computed from a

least-square procedure on the drift and diffusion components asfunctions of the detrended variablesx1 andx2.

5

−15 0 15x

(1)1

0.0

0.1

0.2

0.3

p

x(2)1 =0.00

c)

−15 0 15x

(1)1

0.0

0.1

0.2

0.3

p

x(2)1 =3.92

d)

−15

0

15

x(2)

1

a)

0.2

00

0.100

0.100

0.050

0.050

0.010

0.010

0.005

0.005

−15

0

15

x(2)

1

b)

0.2

00

0.100

0.100

0.050

0.050

0.010

0.010

0.005

0.005

p(x(1)1 ,t|x (2)

1 ,t−τ2 )

p(x(1)1 ,t|x (2)

1 ,t−τ2 ;x(3)1 ,t−τ3 )

Figure 5: Contour plots of conditional probabilities (solid curves) and con-ditional two-point probabilities (dashed curves) computed from the detrendedtime seriesx1 and x2 with τ2 = 1 hour for(a) τ3 = 2 hours and(b) τ3 = 10hours. The corresponding cuts through contour planes, indicated by the hori-zontal dashed lines, are shown in(c) and(d) with a good matching between therespective one-point and two-point conditional probabilities. The distributionswere computed with 13 bins for each variable using a sample of105 data points.

4. Analysis of Markov properties for the series of NO2 con-centrations

The Markovian nature of the variablex1 can be investigatedby considering the differences between the conditional one-point probabilityp(x(1)

1 , t|x(2)1 , t − τ2) and the conditional two-

point probabilityp(x(1)1 , t|x

(2)1 , t − τ2; x(3)

1 , t − τ3). If the processis Markovian on time scales larger thanτ2, then these prob-ability distributions should not differ significantly[10] for anychoice ofτ3. Indeed, as can be seen from Fig. 5, the Markovianproperties seem to be fulfilled forτ2 = 1h and bothτ3 = 2handτ3 = 10h . We therefore observe strong indications that theprocess is Markovian already at the sampling rate of the datapoints of 1h and for time lags longer than 1h.

Further, it is also necessary to check the Gaussian nature ofthe stochastic forceΓ and ascertain it indeed obeys Eqs. (4). Us-ing the measured time series and the estimated KM-coefficients,the noiseΓ(t) can be reconstructed from a numerical discretiza-tion of Eq. (3) solved with respect toΓ[25], namely solving

Γ = g−1(X)(

X − h(X))

, (11)

whereX = X(t + 1)− X(t) andh(X) andg(X) are evaluated atX = X(t).

The resulting noise is analyzed with respect to its autocorre-lation, shown in Fig. 6: the autocorrelation decays to zero forthe very first values ofτ, which strongly supports to treatΓ1 andΓ2 as a white,δ-correlated noise source.

For ascertaining the Gaussian nature of the stochastic sourceswe plot in the inset of Fig. 6 the PDF of the reconstructed noisetime seriesΓ1 andΓ2 (solid lines) against a Gaussian distribu-tion (dashed lines).

0 10 20 30 40 50τ

0.0

0.5

1.0

ac

−10 −3 0 3 10noise

10-6

10-4

10-2

100

pdf

Gauss

Figure 6: Autocorrelation of the reconstructed dynamical noises Γ1,Γ2(stochastic fluctuation), indicating that they areδ-correlated. The inset showsthe probability density function (PDF) of the reconstructed noise normalized tovariance 1 (lines) and a normal distribution for comparison(dashed line).

As one sees from the inset, in the range comprising over95% of the Gaussian noise, the distributions for the stochasticsources are well approximated by a Gaussian distribution. Wefind it reasonable to assume, therefore, that the data seriescanbe approximated sufficiently well by a Fokker-Planck equation.The deviations observed for the extreme values, are common inthe analysis of long-term field measurements, showing tailsforclose to exponential decay.

From the tests described in this section one may satisfacto-rily take the seriesx1 andx2 as a set described by two coupledLangevin Equations, Eq. (3) withK = 2. Next we derive theseequations from the sets of measurementsx1 andx2.

5. Deriving optimal variables: eigensystem for NO2 mea-surements at different stations

Having successfully determined the drift and diffusionconstants describing the respective deterministic forcing andstochastic fluctuations of the system of NO2 concentration mea-surements, we now determine the eigensystem of the diffusionmatrices and investigate its principal directions. This procedurewas described in detail in [19] and was previously applied toatwo-dimensional sub-critical bifurcation[26] and to the analysisof human movement[27]. It will be briefly outlined here, forKvariables.

Diffusion matrices are numerically estimated on a mesh ofpoints in phase space, as shown for example in Fig. 4c-e. Thenat each mesh point theK eigenvalues and corresponding eigen-vectors of the estimated matrices are calculated. The diffusionmatrices contain information about the stochastic fluctuationsacting on the system and we use the local eigensystems of thematrix for a further characterization of these forces. In par-ticular, a vanishing eigenvalue indicates that the correspondingstochastic force may be neglected.

6

We are looking for a transform of the original coordinatesX = {xi} into new onesX = {xi}, such that the new coordinatesare aligned in the directions of the eigenvectors of the diffusionmatrix in each mesh point, i. e. the principal direction in whichthe diffusion matrix is diagonal. Diagonalizing the diffusionmatrix decouples the stochastic contribution in the set of vari-ables, and if the eigenvalues in the transformed coordinates aresignificantly different, we are able to restrict our investigationto the coordinates with lower stochasticity.

We therefore look for a two-times continuously differentiablefunctionF with

X = F(X, t), (12)

for which[23, 24] the deterministic and stochastic parts intheLangevin systems of equations, transform respectively as[19]

h(1)i (X) =

N∑

k=1

(

h(1)k (X)

∂Fi

∂xk

+

N∑

l=1

N∑

j=1

gl j (X)gk j(X)∂2Fi

∂xk∂xl

)

(13)

gi j (X) =

N∑

k=1

gk j(X)∂Fi

∂xk(14)

where the second equation readsg(X) = J(X)g(X), with J(X)the Jacobian of our transformationF. For reasons of clarity inthe following we do not explicitly notate the dependence onXandX.

The eigenvectorsuk of matrices g with coordinates inlocal bases ei , can be incorporated in matricesU =

[u1 u2 . . . uK ]. Defining U asU = JTU one then obtains(see Eq. (14))

UT ggTU = UTgTgU. (15)

By definition the inverse transformF−1(X) is chosen suchthat the normalized eigenvectors are given by

uk =1sk

∂F−1

∂xk(16)

with

sk =

∣

∣

∣

∣

∣

∣

∂F−1

∂xk

∣

∣

∣

∣

∣

∣

. (17)

i.e. the respective square sum of the columns in the Jaco-bian of the inverse transform. Taking into account this scal-ing factor the eigenvalues in the new coordinate system can becalculated[19],

D(2) = diag

λ1

s21

λ2

s22

. . .λK

s2K

, (18)

whereλi (i = 1, . . . ,K) are the eigenvalues of the diagonalizedmatrixD(2)

diag.In general, the eigenvalues of the diffusion matrix indicate

the amplitude of the stochastic force and the correspondingeigenvector indicates the direction towards which such forceacts. These directions can be regarded as principal axes of theunderlying stochastic dynamics[19].

In particular the vector field yielding at each mesh point theeigenvector associated to the smallest eigenvalue of matrix gdefines the paths in phase space towards which the fluctuationsare minimal. If this eigenvalue is very small compared to allthe other, the corresponding stochastic force can be neglectedand the system can be assumed to have onlyK − 1 independentstochastic forces, reducing the number of stochastic variablesin the system.

Notice however that, whereas in a Cartesian coordinate sys-tem the eigenvalues are strictly related to the amplitude ofdif-fusion in the corresponding eigenvectors, a nonlinear transfor-mation usually changes the metric[19, 28]. In the transformedsystem the direction of the maximal eigenvalue is not neces-sarily the direction with the highest diffusion. This disparityis accounted for by the factorsi above. A much more simplecase occurs when the eigenvectors are parallel (or almost) to afixed direction, meaning that the eigendirections at each pointin phase space are the same but rotated by a constant angle.Next we address such situation.

6. Transform of NO2 concentrations to the stochasticeigendirections

In this section we apply the procedure described previouslyto the two series of NO2 concentration,x1 andx2 in Chelas andAvenida da Liberdade, shown in Fig. 7a. The joint PDF of bothconcentrationsx1 is x2 is plotted in Fig. 7b showing the regionin phase-space most visited by the bivariate series (x1, x2). Aplot of the eigenvectors ofD(2) in Fig. 7c suggests that a contin-uous and smooth description of the corresponding sorted eigen-values exists. Here we place at each grid point of the phasespace one ellipsoid whose major and minor axis are given bythe (non-normalized) eigenvectors associated to the largest andsmallest eigenvalue respectively. Since the two eigenvalues aredifferent, the eigenvector corresponding to the lower eigenvaluedescribes the direction of minimum stochasticity. The eigen-vectors and eigenvalues of the diffusion matrix give locally theprincipal directions of stochastic fluctuations (diffusion).

In general, from a plot as the one in Fig. 7c it is possible to de-rive numerically the variable transformation in Eq. (12): at eachgrid point one determines the angle between the “largest” eigen-vector and the positive horizontal axis. Figure 7d shows thean-gle φ for the bivariate series (x1, x2). Rotating each ellipsoidseparately by the respectiveφ-angle aligns the largest eigenvec-tor along the horizontal direction and the smallest eigenvectoralong the vertical direction, yielding the two new (transformed)variables ˜x1 andx2. This angle can be derived at each grid pointfrom the corresponding diffusiong components namely

tan 2φ =2g12

g11− g22(19)

The angleφ or its absolute value quantifies the relative off-diagonal contribution that describes the coupling of the noiseterms by the diffusion matrix.

In general, what does such a transformation add to our un-derstanding about the system? First, by definition the transfor-mation decouples independent stochastic forces in the system.

7

−10 0 10x1

−20

0

20

x 2

−30−20−10 0 10 20 30x1

−40

−20

0

20

40

x2

7.0×10−4

5.0×10−4

3.0×10−4

1.0×10−4

−15−10 −5 0 5 10 15x1

−20

−15

−10

−5

0

5

10

15

x2

−π/2

0

⟨φ⟩π/2

φ

(a) (b)

(c) (d)

Figure 7: (a) The original time-seriesx1 andx2 and(b) their joint probabilitydensity function. In(c) we plot the two (orthogonal) eigenvectors defining thesemi-axis of an ellipsoid. The major semi-axis is associated with the largesteigenvalue and correspondingly the minor semi-axis with the smallest. Foreach grid point in phase space (x1, x2) we plot (d) the angleφ between theeigenvector associated to the largest eigenvalue and the positive x-semi-axis.As one sees from (c) and (d) the angleφ does not show significant disparity inits values within the considered range, indicating a stronguncoupling betweenthe two stations (see text). Thus, a global rotation of the axis may be considered(see Fig. 8).

The original (detrended) pair of variables as well as the trans-formed pair of variable obey Eq. (3), with one important dif-ference: the transformed pair of variables are such that eachvariable has a stochastic contribution governed by one indepen-dent stochastic force alone. In other wordsg(x1, x2) is diag-onal. For the original pair of variables the stochastic contri-bution mixes both independent stochastic forces. Second, ina reference frame where the two independent stochastic forcesare decoupled, their minimum and maximum magnitude reachthe largest difference between them. In other words, one alignsthe major and minor axis of the “diffusion ellipsoids” shownin Fig. 7c. In the particular case when one of the magnitudesis much smaller than other one, one of the variables can evenbe disregarded as a stochastic variable, reducing the number ofstochastic variables describing the system. A generalization toK variables is straightforward.

The value of〈φ〉 shown in Tab. 1 is the (x1, x2)-averaged an-gle 〈φ〉 ≃ 0.40π ∼ π/2 indicated at the scale of 7d. Further, oneinspection of Fig. 7c and 7d enables the observation that in ourpresent case theφ-angle remains approximately constant at anygrid point. Similar observations were made for the other pairsof stations in Lisbon (not shown). Consequently, we may con-clude that for our set of stations a global rotation is enoughtoalign the “diffusion ellipsoids”. For the stations in Chelas andAvenida da Liberdade, Figure 8a shows the result obtained afterperforming a global rotation by the median median(φ) ≃ 0.43π.

y x x x〈〉1 36 5.82 · 10−6 0.00935 −0.0145〈〉2 64.7 5.22 · 10−6 0.0541 −0.0721σ1 26.8 17.3 26.7 18.3σ2 40.8 25.5 15.1 23.3〈|φ|〉 1.11±0.282 1.31±0.203 0.117±0.114 1.33±0.105〈Q〉 0.703±0.182 0.714±0.103 0.679±0.0946 0.593±0.0445〈R1〉 0.157±0.125 0.247±0.167 0.19±0.122 0.169±0.118〈R2〉 0.183±0.13 0.207±0.134 0.245±0.15 0.243±0.153µ 0.479 0.457 −0.254 −0.401

Table 1: Characterizing different pairs of variables: the original pair of mea-suresy, the detrended pair of variablesx, the transformed pairx, and, for com-parison, the pairx transformed according to the simpler rules in (22). For eachvariable or pair of variables (i = 1, 2) we show the mean〈〉i and standard devia-tion σi of their distribution of observed values together with the rotation angleφ averaged over phase space, as well as the average coefficientsQ andRi forevaluating their stochastic and deterministic contributions. See Eqs. (19), (20)and (21). The correlation coefficient between both variables is also given ineach case (see text).

In Fig. 8b both eigenvaluesλi are plotted, correspondingto the length of the major and minor axis of the diffusion el-lipsoids. While there is a significant difference between botheigenvalues,λmax∼ 2.5λmin as shown in Fig. 8f, they are of thesame order of magnitude. Such observation indicates the pres-ence of two independent stochastic forces driving the bivariatesignal (x1, x2).

The stochastic contribution for each variables of the pair(x1, x2) obeying Eq. (3) can be compared through one param-eterQ defined at each pointx in phase-space as

Q2(x) =g2

11(x) + g212(x)

g221(x) + g2

22(x)(20)

where one orders the rows of matrixg to guaranteeQ < 1,i.e. variablex1 is chosen as the one having lower stochasticcontribution. WhenQ = 1 both stochastic contributions areequal. WhenQ ≪ 1 one stochastic contribution can be ne-glected, reducing by one the number of stochastic contributionsin the system. For an arbitrary number of stochastic variables,the generalization of Eq. (20) is straightforward[19].

Table 1 shows the value of coefficientQ for the set of mea-surementsy, for the detrended variablesx and for the trans-formed detrended variablesx. The coefficient is averaged overthe sample of points in the corresponding phase space. Fory and x the smallest stochastic contribution has a magnitudeof approximately 70% of the largest one, while for the trans-formed variables it decreases more than 2%. This magnitude isnot small enough to permit neglecting one variable. We con-sider this finding the central result of this letter: before trans-formation the pair of detrended variables include already twoindependent stochastic forces of the same order of magnitude.

One note is however important to stress at this point. Themethod applied here to empirical data deals with a transforma-tion that operates on the diffusion matrix alone. No constraintsrelated to the drift functions,h1 andh2 are considered. To eval-uate the predictability of each variablei one needs to comparethe total amplitude of the stochastic term with the deterministic

8

−15 0 15

x1

−10

0

10

x2

x1

−150

15x2

−80

8

λi

40

80

120

x1 −20−1001020

x 2

−10−5

05

10

| φ|

0

0.2

0.4

⟨|φ|

⟩

x1 −20−1001020

x 2

−10−5

05

10

Q

0.4

⟨Q⟩

0.6

0.8

x1 −20−1001020

x 2

−10−5

05

10

R1

0

0.3

0.6

⟨R1

⟩

x1 −20−1001020

x 2

−10−5

05

10

λmax/λmin

0

1.0

⟨⟩3.0

(a) (b) (c)

(d) (e) (f)

Figure 8: Transformed variables through a global rotation of the x1 andx2 axis (check Fig. 7) into new variables, ˜x1 and x2. (a) Eigenvectors of the transformedvariables and(b) its eigenvalues derived for the diffusion matrix of the transformed variables, together with quantities to evaluate some underlying properties of thesystem, namely(c) the rotation angleφ (see Eq. (19)),(d) the asymmetry of the stochastic influence at each variable, given by Q in Eq. (20),(e) the deterministiccoefficientR in Eq. (21) and(f) the quotient between the largestλmax and the smallestλmin for each grid point in the transformed phase position. For each propertythe average value is showed with horizontal surfaces and an explicit indication at the vertical axis.

term, namely

R2i (x) =

h2i (x)

g2i1(x) + g2

i2(x). (21)

Such expression is also straightforwardly extended toK vari-ables. The largerRi the more predictable the variablei may be,i.e. the smaller the stochastic overall contribution is comparedto the deterministic part governing the evolution of the variable.In our present case, as given in Tab. 1, while the detrendingy → x of our measurements increases the predictability of thenon-periodic modes in time, the global rotation has no majoref-fect: both coefficientsRi maintain the same order of magnitudeafter transform.

The correlation coefficient µ between both stations is alsogiven in Tab. 1. While detrending has no significant effect onthe correlation, the transformxi → xi indeed decreases its ab-solute value.

Figure 8c, 8d and 8e illustrate the numerical result of eachproperty,φ, Q andR for the transformed variables. Similar tosuch variables is the quotient between the maximum and mini-mum eigenvalues, shown in Fig. 8f. Similar plots are obtainedfor the other possible pairs of stations.

7. Comparison with standard methodologies

In this Section we first address the question of how good thecoordinate transform derived above is compared to other, pos-sibly simpler transforms.

For example, we may consider a transform to coordinateswhich describe the mean value and difference between the twomeasured time series, e.g.

(

x1

x2

)

= 12

(

x1 + x2

x1 − x2

)

. (22)

This choice is the simplest one for two variables, one describ-ing the total amountx1 + x2 and another describing the relativeamountx1 − x2. For such choice of variables we obtain a valueof Q = 0.59, which is essentially the same as for our “opti-mized” variables (see Tab. 1), The absolute value of the an-gle 〈|φ|〉 is however considerably larger than for our optimizedtransform, as is the correlation coefficient between the time se-ries, meaning that this simple transform fails to decouple thenoise sources. The drift-diffusion quotients yieldR1 = 0.17 andR2 = 0.24, showing again no better predictability in comparisonwith the original variables.

In our case we saw that the eigendirections do not dependmuch on the detrended variablesx1 andx2, which implies that

9

they are functionally decoupled. However, sometimes it is nec-essary to consider a proper scaling of the variables[19]. Insuchcases, we find it advisable to use a more general transform togeneralized polar coordinates given by

(

x1

x2

)

=

(

rg

θg

)

=

√

(αx1 + β)2 + (γx2 + δ)2)arctan

(

γx2+δ

αx1+β

)

+ ǫ

(23)

where in general the radial and angle variables,rg andθg, arefunctions of the detrended variablesx1 andx2. This approachhas the advantage that the inverse transformF−1 is given by thesimple form

x1

x2

=

1α[rg cos(θg − ǫ) − β]

1γ[rg sin(θg − ǫ) − δ]

. (24)

In addition, the metric factors introduced by the polar transformcan result in a more pronounced separation of the eigenvaluesin the transformed coordinates.

The other question addressed within this section is the com-parison between methods applied to choose the most appro-priate inputs. Theoretically, any set of input data can be fedinto a model for training and evaluation. However, the num-ber of possible variables to be used and the number of waysthey can be presented is too diverse to test all possible com-binations. A number of statistical methods can be applied inorder to choose the most appropriate set of predictors or inputs.Examples are, among other, stepwise regression, PCA, clusteranalysis and ARIMA. For details, see Ref. [22] and referencestherein. Such pre-processing procedures, reduce the numberof input variables into the models, thus eliminating the redun-dant information. In these standard procedures, the selectionof variables is usually made independently for each monitoringstation.

Another possible way to tackle redundancy is the pre-processing of data consisting of the computation of backwardstepwise regressions (BSR) conducted between a target variableand all the other data sets. Based on the available common pe-riod data sets, one constructs a collection of records, composingthe input vector, which includes the meteorological variables,air pollutant concentrations, etc, and together with it assumesthe corresponding target, which in our case is the atmosphericconcentration of a certain pollutant. Subsequently, one retainsthe smallest subset of statistically significant variablesto pre-dict a certain pollutant concentration automatically at a givenmonitoring station. In addition, BSR allows the determinationof the best time lags for each input variable, typically daily andweekly cycles.

The referred techniques also allow the comparison betweenthe original data sets and surrogate data sets including only thestochastic component. The stochastic component may be deter-mined through a rough approximation of a mathematical func-tion (e.g., sin x), or, for example, by the presented framework.After the selection of variables and the determination of cyclic

and stochastic behaviors on each time series, linear and non-linear models can be applied in order to model air pollution ineach monitoring station. The forecasting capabilities of the dif-ferent approaches can then be compared. Such models are alsoapplied to each decoupled time-series in order to predict nextdays air quality at each monitoring station. The applicationsof this framework, however, allows to determine the stochasticcomponent on a efficient manner, enhancing air quality predic-tions.

8. Discussion and Conclusion

In this paper, we investigated the stochastic properties ofaset of two simultaneous series, obtained by introducing a properdetrending of NO2 measurements, which is able to remove pe-riodic modes in the series. We focused in the measurements attwo different stations out from a set of 22 stations in Lisbon.

Based on validity tests we assumed, that the time series af-ter detrending were properly modeled by a system of Langevinequations. The validity of this assumption is discussed in sec-tion 3, showing that the data sets obey the Markov property toasufficient extent. The stochastic fluctuations show good resem-blance withδ-correlated Gaussian noise.

Calculating the eigenvalues of the diffusion matrices, wefound a transform that leads to a description in which the dif-fusion matrices are diagonal. Since the transformed variablesare derived directly from the transformation that diagonalizesthe diffusion matrices, they correspond to the orthogonal direc-tions in phase space in which fluctuations are stronger (largereigenvalue) and weaker (smaller eigenvalue) respectively.

Comparison between original and transformed variablesshowed that the two detrended variables are driven by stochas-tic forces almost decoupled from each other, showing an almostconstant rotation angle of the “diffusion ellipsoid” at each pointof phase space. Further, both stochastic sources have ampli-tudes of the order of the deterministic terms, indicating a shorthorizon of predictability. This procedure worked out well forthe NO2 data, since the transformation of variables resulted indecoupling the diffusion components in the new coordinates.Other transformation could be considered. For instance, wedis-cussed how this approach could be applied for other data setsin which the diffusion ellipsoids do not align in phase space,but instead depend in non trivial functional of the variables. Inthis case, the transform maps the detrended variables into twopolar-like coordinates.

One question that should be addressed in a forthcoming studyis to present a systematic overview on all pairs of stations stud-ied by us in this scope but not shown thoroughly, since it wasout of our main purposes. Doing that one would be able to com-pare in detail the results obtained through the method appliedin this paper with standard methods used for forecasting NO2

concentration at a specific spot in the city of Lisbon.

Acknowledgments

The authors thank DAAD and FCT for financial supportthrough the bilateral cooperation DREBM/DAAD /03/2009.

10

FR (SFRH/BPD/65427/2009) and PGL (Ciencia 2007) thankFundacao para a Ciencia e a Tecnologia for financial support,also with the support Ref. PEst-OE/FIS/UI0618/2011.

References

[1] EEA European Environment Agency,The European environment. Stateand outlook 2010 : synthesis, DOI: 10.2800/45773 (2010).

[2] N. de Nevers,Air Pollution Control Engineering(McGraw-Hill Compa-nies, 2000).

[3] ReportHealth aspects of air pollution with particulate matter, ozone andnitrogen dioxide, (World Health Organization, Denmark, 2004).

[4] M. Demuzere, R. Trigo, V. Arellano, N. van Lipzig, Atmospheric Chem-istry and Physics9, 26952714 (2009).

[5] M. Gardner, S. Dorling, Atmospheric Environment33, 709719 (1999).[6] J. Hooyberghs, C. Mensink, G. Dumont, F. Fierens, O. Brasseur, Atmo-

spheric Environment39, 32793289 (2005).[7] J. Kukkonen, L. Partanen, A. Karppinen, J. Ruuskanen, H.Junninen,

M. Kolehmainen, H. Niska, S. Dorling, T. Chatterton, R. Foxall, G. Caw-ley, Atmospheric Environment37, 45394550 (2003).

[8] M. Kolehmainen, H. Martikainen, J. Ruuskanen, Atmospheric Environ-ment35, 815-825 (2001).

[9] R. Friedrich and J. Peinke, Phys. Rev. Lett.78, 863 (1997).[10] R. Friedrich, J. Peinke, M. Sahimi, and M.R.R. Tabar, Phys. Rep.50687

(2011).[11] P.G. Lind, A. Mora, J.A.C. Gallas and M. Haase, Phys. Rev. E 72, 056706

(2005).[12] R. Friedrich, J. Peinke and Ch. Renner, Phys. Rev. Lett.84, 5224 (2000).[13] F. Ghasemi, M. Sahimi, J. Peinke, R. Friedrich, G.R. Jafari and

M.M.R. Tabar, Phys. Rev. E75, 060102 (2007).[14] D. Kleinhans, R. Friedrich, A. Nawroth, J. Peinke, Phys. Lett. A34642-

46 (2005).[15] S.J. Lade, Phys. Lett. A3733705-3709 (2009).[16] F. Boettcher, J. Peinke, D. Kleinhans, R. Friedrich, P.G. Lind, M. Haase,

Phys. Rev. Lett.97 090603 (2006).[17] P.G. Lind, M. Haase, F. Boettcher, J. Peinke, D. Kleinhans and

R. Friedrich, Phys. Rev. E81 041125 (2010).[18] J. Carvalho, F. Raischel, M. Haase and P.G. Lind, J. Physics 285012007

(2011).[19] V. Vasconcelos, F. Raischel, M. Haase, J. Peinke, M. Wachter, P.G. Lind

and D. Kleinhans, Phys. Rev. E84 031103 (2011).[20] R. Friedrich, S. Siegert, J. Peinke, St. Luck, M. Siefert, M. Lindemann,

J. Raethjen, G. Deuschl, G. Pfister, Physics Letters A271, 217 (2000).[21] K. Pearson, Philosophical Magazine2, 559 (1901).[22] D. Wilks, Statistical Methods in the Atmospheric Sciences(2nd Ed.)

No. 59 in International Geophysics (Academic Press, 2006).[23] H. Risken,The Fokker-Planck Equation, (Springer, Heidelberg, 1984).[24] C. W. Gardiner,Handbook of stochastic Methods, (Springer, Germany,

1997).[25] C. Micheletti, G. Bussi, and A. Laio, J. Chem. Phys.129, 074105 (2008).[26] J. Gradisek, R. Friedrich, E. Govekar and I. Grabec, Meccanica,38, 33

(2003).[27] A. M. van Mourik, A. Daffertshofer, and P. J. Beek, Biological cybernetics

94, 233 (2006).[28] S.J. Lade, Phys. Rev. E80 031137 (2009).[29] P.G. Lind, A. Mora, M. Haase and J.A.C. Gallas, Int. J. Bif. Chaos17(10)

3461-3466 (2007).

11

Documents

Optimal variables for describing the evolution of NO2 concentration