
ESTIMATION OF PERIODICITY IN NON-UNIFORMLY SAMPLED ASTRONOMICAL DATA - AN APPROACH USING SPATIO-TEMPORAL KERNEL BASED

CORRENTROPY

By

BIBHU PRASAD MISHRA

A THESIS PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE

UNIVERSITY OF FLORIDA

2011


© 2011 Bibhu Prasad Mishra


Dedicated to my parents and my younger sister


ACKNOWLEDGMENTS

First, my sincere gratitude goes to my advisor Dr. Jose C. Príncipe for his wonderful guidance and remarkable patience throughout my research, and to my committee members Dr. John Harris and Dr. John M. Shea for their guidance and help throughout my graduate studies. I would like to thank our collaborators Dr. Pavlos Protopapas and Dr. Pablo A. Estevez for their valuable insight. I would also like to thank Alex and Abhishek for their help during the initial part of the project, Rakesh for his suggestions, Austin for his discussions on various topics, and all members of CNEL for their knowledge on a variety of topics. Last but not least, I would like to thank my parents, my sister and my friends for their constant support and encouragement throughout my life.


TABLE OF CONTENTS

ACKNOWLEDGMENTS

LIST OF TABLES

LIST OF FIGURES

ABSTRACT

CHAPTER

1 AN INTRODUCTION TO PERIODICITY ESTIMATION IN ASTRONOMICAL DATA ANALYSIS
    1.1 Overview of the Astronomical Data
    1.2 Introduction to Estimation Techniques

2 PERIODICITY ESTIMATION TECHNIQUES: A REVIEW
    2.1 Theory
        2.1.1 Spline Interpolation
        2.1.2 Lomb Periodogram
        2.1.3 Dirichlet Transform
    2.2 Results

3 PERIODICITY ESTIMATION USING KERNEL BASED METHODS
    3.1 Correntropy
    3.2 Spatio-temporal Kernel based Proposed Method
    3.3 Kernel Size
    3.4 Results

4 PERIODICITY ESTIMATION USING SPATIO-TEMPORAL KERNEL BASED CORRENTROPY ON FOLDED TIME SERIES DATA
    4.1 Period Estimation
    4.2 Kernel Size
    4.3 Results

5 CONCLUSION

APPENDIX: VARIABLE STEP SIZE

REFERENCES

BIOGRAPHICAL SKETCH


LIST OF TABLES

2-1 Comparative performance using interpolation based techniques, Lomb periodogram and Dirichlet transform along with results published by Harvard University, Time Series Center.

3-1 Comparative performance using the proposed 2D kernel based technique and correntropy on interpolated light curves due to [4] along with results published by Harvard University, Time Series Center. Correctly identified values are marked in bold.

4-1 Comparative performance using the proposed correntropy based technique along with results published by Harvard University, Time Series Center. Correctly identified values are marked in bold.

5-1 Performance evaluation of the existing techniques and the proposed techniques. The results published by Harvard University, Time Series Center have been used as the gold standard for the evaluation.


LIST OF FIGURES

2-1 The sampled values at non-uniformly spaced time intervals are in blue, and interpolated and uniformly re-sampled values in red, for different values of p.

2-2 Frame selection from the light curve 1.3804.164. Note that the y-axis represents the brightness magnitude of the star system. However, the brighter the object appears, the lower the value of its magnitude, as it is customary in astronomy to plot the magnitude scale reversed.

3-1 Contour and surface plot of CIM(X,Y) with Y=0 in 2D space with a Gaussian kernel and a kernel size equal to 1.

3-2 Illustration of the reason why simple correntropy cannot be directly used in the case of non-uniformly sampled data.

3-3 Plot of 2D kernel based measure with varying standard deviation values for the time kernel. In this case light curve 1.3810.19 has a time period of 88.9406 days and light curve 1.3449.27 has a time period of 4.0349 days.

3-4 Plot of 2D kernel based measure with varying standard deviation values for the magnitude kernel. In this case light curve 1.3810.19 is the data set used and has a true time period equal to 88.9406 days.

4-1 Reconstruction of a single period of the signal by breaking the original signal into frames of length equal to the true time period of the signal.

4-2 Folding performed on a non-uniformly sampled signal with true period equal to 1 unit. Folding has been performed with trial periods equal to 1 unit and 1.3 units.

4-3 Plot of correntropy of transformed space with varying standard deviation values for the time kernel.

4-4 Plot of correntropy of transformed space with varying standard deviation values for the magnitude kernel.

5-1 Magnitude plot of light channel 1.3810.19. Note that the y-axis represents the magnitude of the star. Magnitude measures the brightness of a celestial object; however, the brighter the object appears, the lower the value of its magnitude. It is customary in astronomy to plot the magnitude scale reversed.

5-2 Plot of spatio-temporal kernel based correntropy for light curve 1.3448.153.

5-3 Plot of spatio-temporal kernel based correntropy for light curve 1.3810.19.


Abstract of Thesis Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Master of Science

ESTIMATION OF PERIODICITY IN NON-UNIFORMLY SAMPLED ASTRONOMICAL DATA - AN APPROACH USING SPATIO-TEMPORAL KERNEL BASED CORRENTROPY

By

Bibhu Prasad Mishra

May 2011

Chair: Jose C. Príncipe
Major: Electrical and Computer Engineering

Period estimation in non-uniformly sampled time series data is frequently a goal in astronomical data analysis, and several problems arise in the process. First, the data are sampled non-uniformly, which prevents the use of simple techniques such as the Fourier transform for spectral analysis. Second, there are large gaps in the data, which make it difficult to interpolate the signal for re-sampling. Third, in data sets with smaller time periods, the non-uniformity in sampling and the noise in the data pose even greater problems because of the smaller number of samples per period. Finally, one of the biggest problems is that the time period of these light curves cannot be easily identified by periodogram techniques because of the inherent modulations in the light curve within a single instance of the period, which must be accounted for while estimating the period. Periodogram techniques generally give a peak at the fundamental frequency, which may not be the frequency corresponding to the true period but rather to a sub-multiple of the true period. In the present work we first discuss a few of the existing methods for period estimation, such as the Fourier transform, the Lomb periodogram and the Dirichlet transform, before shifting our focus to kernel based methods. A new spatio-temporal kernel based cost function is proposed which works directly on the non-uniformly sampled data. Furthermore, a spatio-temporal kernel for correntropy on a transformed space is proposed to estimate the time period of the data with enhanced accuracy. Finally, the proposed methods are compared to the existing techniques to highlight the improvement provided by the kernel based methods.


CHAPTER 1
AN INTRODUCTION TO PERIODICITY ESTIMATION IN ASTRONOMICAL DATA ANALYSIS

Astronomical observations at visual wavelengths are recorded as light curves (i.e., brightness magnitude over time) and are used to quantify the motion of stars. Of particular interest is the discrimination between periodic and non-periodic relative movements and the quantification of the period of light curves obtained from objects such as eclipsing binaries, RRLs (pulsating variable stars named after RR Lyrae), cepheids (intrinsically variable stars with exceptionally regular periods of light pulsation), etc. [11].

1.1 Overview of the Astronomical Data

The time series data analyzed in this work come from photometric astronomical surveys. These are basically time series of the intensity of light collected through various channels, such as different telescopes, spectral bands or instruments. Due to variations in the atmosphere and sky conditions, the collected data are non-uniformly sampled and noisy. The light curve data thus come in three columns: time, flux and error. The error column gives an estimate of the error of measurement in the photometric procedure. The MACHO (MAssive Compact Halo Object) survey [1] is operated with the purpose of searching for the missing dark matter in the galactic halo, such as brown dwarfs or planets. In MACHO the light amplification is caused by the bending of space around a heavy object due to the phenomenon known as microlensing. Existing techniques mostly use the Lomb-Scargle (LS) periodogram [7], [9], which is an extension of classical periodogram techniques that works with non-uniformly sampled data. The estimated period given by the LS periodogram is used to "fold" the time series modulo the estimated value of the period, so that the periodic nature of the data is clearly seen. Then the estimated period is trimmed so that the scatter of the folded plot is reduced. Once this is achieved it is possible to perform calculations to obtain a more precise estimate of the period. This final step, known in astronomy as analysis of variance (AoV), is due to [15]. The values published by the Time Series Center at Harvard University are used as the gold standard for comparing the various algorithms, as the final values have been manually inspected by the Time Series Center team. This process is computationally intensive, and with data being collected from billions of astronomical objects we need a technique which is both more efficient and more accurate. This inherent difficulty of the problem requires computationally intelligent techniques [2], [3], [12], [16], [18]. The present work proposes two algorithms, one using a 2D Gaussian kernel on all pairs of sample points and the other using an information-theoretic approach based on correntropy [6], [10], [13], [17] with a new spatio-temporal kernel. We compare the results of the current work with the algorithm proposed in [4] and also with the existing methods involving interpolation, the Lomb periodogram, etc.

1.2 Introduction to Estimation Techniques

There are several difficulties that need to be addressed in this area. First, the data sets normally consist of samples taken at non-uniformly spaced time instants, which prevents the direct use of the Fourier transform to study the spectral composition of the signal. As the data are non-uniformly sampled, correlation cannot be applied directly either. One possible alternative is to interpolate the data and re-sample it periodically before applying the method of choice. The presence of gaps in the time series creates another problem, as even interpolation will not give accurate results in an acceptable range. This problem is avoided by simply framing the time series data and using those frames which have no gaps or only very few consecutive missing points. Generally, time series data with larger time periods allow more missing points in a frame, and the frame length is larger; for data sets with smaller time periods, smaller frame lengths are used and fewer consecutive missing samples are allowed. There is also the problem of noise, as the sample at each time instant is accurate only to within a certain variance. To reduce the effect of this error on our estimation of the periodicity of the data, we ignore samples whose variance values exceed a certain threshold.
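As a simple illustration, this pre-filtering step might be written as follows (a sketch, assuming the three-column time/flux/error layout described above; the threshold value is hypothetical, not the one used in this work):

```python
import numpy as np

def load_and_filter(path, var_threshold=0.05):
    """Load a light curve (time, flux, error columns) and drop samples
    whose measurement variance exceeds a threshold.  The threshold value
    here is illustrative only."""
    t, x, err = np.loadtxt(path, unpack=True)
    keep = err ** 2 <= var_threshold      # ignore overly noisy samples
    return t[keep], x[keep]
```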

Thus in the present work two categories of techniques for periodicity estimation have been used. The first category deals with methods which involve the use of interpolation; these methods make use of the simple Fourier transform, correlation, and a recently developed kernel based technique known as correntropy. The second category, which does not require interpolation and re-sampling but works directly on the non-uniformly sampled data, comprises the Lomb periodogram [7] and the Dirichlet transform [8]. Although framing and interpolating enable the use of standard techniques such as correlation or the Fourier transform, this approach no longer uses the original data points directly, as the original information is combined with interpolation noise. Interpolation noise, along with the inherent noise in the collected data, further compromises the precision of the period estimate of the light curve. Hence, from an engineering point of view it is better to use the samples directly for periodicity estimation rather than interpolated data. The most popular techniques which work on the data samples directly, without interpolation, are the Lomb periodogram and the Dirichlet transform. As we will see later, even periodogram based techniques have drawbacks owing to the nature of the light curves. The data set used in the current work has been obtained from eclipsing binary star systems. In these systems there are two eclipses per cycle, which gives rise to a modulation effect, and hence periodogram based methods tend to give peaks at frequencies corresponding to sub-multiples of the true time period. These problems are addressed by the proposed kernel based methods.

The rest of the chapters are organized as follows: Chapter 2 deals with methods which involve interpolation and re-sampling of the data, as well as techniques such as the Lomb periodogram and the Dirichlet transform which work directly on the non-uniformly sampled data. Chapter 3 introduces the concept of correntropy and deals with the algorithm proposed in [4] and a newly proposed technique based on a spatio-temporal Gaussian kernel. Chapter 4 deals with the final proposed algorithm, which uses spatio-temporal kernel based correntropy on a transformed space. Chapter 5 concludes the work and discusses potential problems which can be addressed in the future to further improve period estimation techniques.


CHAPTER 2
PERIODICITY ESTIMATION TECHNIQUES: A REVIEW

This chapter deals with some of the existing techniques for estimating the period of astronomical light curves. These methods can be broadly divided into two categories: techniques which need the data to be uniformly sampled in time, and methods which use the non-uniformly sampled data directly. For the first category we need uniformly sampled data, which we do not have; the best way to obtain it from non-uniformly sampled data is to interpolate and then re-sample the interpolated signal at regular intervals. In the first category of techniques we use the re-sampled data for Fourier analysis and autocorrelation to estimate the period. In the second category we use the Lomb periodogram and the Dirichlet transform. Before proceeding to the implementation of these various signal processing tools, we first briefly describe the methods themselves.

2.1 Theory

This section deals with spline interpolation, the Lomb periodogram and the Dirichlet transform, in that order.

2.1.1 Spline Interpolation

As the data is sampled at non-uniformly spaced time instants, to make it uniform for analysis we use interpolation [5] and then re-sample the data at uniformly spaced intervals. There are various kinds of interpolation methods, such as linear interpolation, polynomial interpolation and spline interpolation, of which spline interpolation is the most commonly used. For experimental purposes we have used cubic spline interpolation to interpolate the signal from the given data. Equation 2–1 gives the cost function which is minimized for the interpolation of the data:

I(p) = p \sum_{j=1}^{n} w(j) \, |x(j) - f(t(j))|^2 + (1 - p) \int |D^2 f(t)|^2 \, dt   (2–1)

Figure 2-1. The sampled values at non-uniformly spaced time intervals are in blue, and interpolated and uniformly re-sampled values in red, for different values of p. A: p = 1.0. B: p = 0.5.

The first term, under the summation, controls the error, and the integral term controls the smoothness of the spline, where D denotes the derivative operator. p is the smoothness parameter: as p varies from 0 to 1 the smoothing spline changes from one extreme to the other. In the summation term, w(j) represents the importance given to the error at each sampled instant. The data set used for simulation specifies a variance at each sampled instant, which gives an estimate of the error at that instant. In our case we used w(j) = 1. For interpolation purposes we use a value of 0.5 for p, which strikes a balance between reducing the interpolation error and increasing the smoothness of the interpolated signal. Taking p equal to 1 gave spikes in the interpolated signal where there are gaps, as shown in Figure 2-1, where it is compared to the plot for p = 0.5.
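As an illustration, the smoothing spline of Equation 2–1 can be evaluated with the csaps Python package, whose smooth argument plays the role of p (a minimal sketch on synthetic data, not the exact code used in this work):

```python
import numpy as np
from csaps import csaps  # cubic smoothing spline minimizing Equation 2-1

# t, x: non-uniformly spaced sampling instants and magnitudes of one frame
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 50.0, 150))
x = np.sin(2 * np.pi * t / 10.0) + 0.1 * rng.standard_normal(t.size)

fs = 20                                          # re-sampling rate: 20 samples/day
t_uniform = np.arange(t[0], t[-1], 1.0 / fs)
x_uniform = csaps(t, x, t_uniform, smooth=0.5)   # p = 0.5: error vs. smoothness
```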


Figure 2-2. Frame selection from the light curve 1.3804.164. Note that the y-axis represents the brightness magnitude of the star system. However, the brighter the object appears, the lower the value of its magnitude, as it is customary in astronomy to plot the magnitude scale reversed.

The interpolated data thus obtained have been used with the Fourier transform and the auto-correlation function (ACF) to estimate the period. In both cases, frames were first chosen from the light curve such that there were at least 100 points in the frame and no gap in the frame exceeded 10 days (the unit of time of the data used is days). Figure 2-2 shows the frame selection from the light curve 1.3804.164.
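A sketch of this frame-selection rule might look as follows (the split-at-gaps logic is our assumption about how the frames were delimited):

```python
import numpy as np

def select_frames(t, x, min_points=100, max_gap=10.0):
    """Split a light curve at gaps longer than max_gap days and keep only
    the frames containing at least min_points samples."""
    cuts = np.where(np.diff(t) > max_gap)[0] + 1
    return [(tf, xf)
            for tf, xf in zip(np.split(t, cuts), np.split(x, cuts))
            if tf.size >= min_points]
```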

Interpolation was then performed as described above, and re-sampling was done at a rate of 20 samples per day. In the case of the Fourier transform, the re-sampled data is Hamming windowed and an N-point FFT is performed, where N is the smallest power of 2 greater than or equal to the total number of points in the re-sampled data set. The peak in the FFT plot is used to estimate the time period of the light curve by simply inverting the frequency value at the peak. In the case of the ACF, as the name suggests, auto-correlation is performed on the interpolated data. The largest peak other than the peak at zero lag is then identified, and the lag value at that peak is the estimated value of the time period.
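Putting the two interpolation-based estimators together, a minimal sketch (operating on a uniformly re-sampled frame such as the x_uniform array above) could be:

```python
import numpy as np

def period_fft(x_uniform, fs=20.0):
    """Estimate the period as the inverse of the peak FFT frequency."""
    xw = (x_uniform - x_uniform.mean()) * np.hamming(x_uniform.size)
    n = 2 ** int(np.ceil(np.log2(xw.size)))   # smallest power of 2 >= L
    spectrum = np.abs(np.fft.rfft(xw, n))
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    k = spectrum[1:].argmax() + 1             # skip the DC bin
    return 1.0 / freqs[k]

def period_acf(x_uniform, fs=20.0):
    """Estimate the period as the lag of the largest ACF peak other than
    the peak at zero lag."""
    xc = x_uniform - x_uniform.mean()
    acf = np.correlate(xc, xc, mode='full')[xc.size - 1:]
    # local maxima of the ACF; these exclude the zero-lag peak by construction
    peaks = np.where((acf[1:-1] > acf[:-2]) & (acf[1:-1] > acf[2:]))[0] + 1
    return peaks[acf[peaks].argmax()] / fs
```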

2.1.2 Lomb Periodogram

The traditional methods of spectral analysis need the signal to be uniformly sampled. The Lomb periodogram [7], however, does not need the samples to be evenly spaced and hence can be used directly on the data in our case. It also allows examining frequencies higher than the mean Nyquist frequency, i.e., the Nyquist frequency obtained by evenly spacing the same number of data points at the mean sampling rate. The main reason for using periodogram analysis is that it provides a reasonably good approximation to the spectrum obtained by fitting sine waves to the data by least squares and plotting the reduction in the sum of residuals against frequency. This least squares spectrum provides the best measure of the power contributed by the different frequencies to the variance of the data and can be regarded as a natural extension of Fourier methods to non-uniform data; it reduces to the Fourier power spectrum in the limit of equal spacing. The Lomb periodogram of a zero mean time series x(t_n) is defined as follows:

P(\omega) = \frac{1}{2\sigma^2} \{ C(\omega) + S(\omega) \}   (2–2)

where

\sigma^2 = \mathrm{Var}(x(t_n))   (2–3)

C(\omega) = \frac{\left[ \sum_{n=1}^{N} x(t_n) \cos \omega (t_n - \tau(\omega)) \right]^2}{\sum_{n=1}^{N} \cos^2 \omega (t_n - \tau(\omega))}   (2–4)

S(\omega) = \frac{\left[ \sum_{n=1}^{N} x(t_n) \sin \omega (t_n - \tau(\omega)) \right]^2}{\sum_{n=1}^{N} \sin^2 \omega (t_n - \tau(\omega))}   (2–5)

and

\tau(\omega) = \frac{1}{2\omega} \arctan \left\{ \frac{\sum_{n=1}^{N} \sin 2\omega t_n}{\sum_{n=1}^{N} \cos 2\omega t_n} \right\}   (2–6)

is an offset which makes the periodogram translation invariant. In our case we have applied this periodogram formula to each frame without interpolation or re-sampling. In the Lomb periodogram, frequency values up to f = 2 have been evaluated, and the inverse of the peak frequency is the estimated period of the time series.
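SciPy provides this periodogram directly; a sketch of the procedure (scanning frequencies up to f = 2 with the 1/2048 resolution quoted in Section 2.2 and inverting the peak frequency):

```python
import numpy as np
from scipy.signal import lombscargle

def period_lomb(t, x, f_max=2.0, df=1.0 / 2048):
    """Estimate the period from the peak of the Lomb periodogram of a
    non-uniformly sampled, zero-mean light curve."""
    f = np.arange(df, f_max, df)          # trial frequencies (cycles/day), f > 0
    omega = 2.0 * np.pi * f               # lombscargle expects angular frequencies
    power = lombscargle(t, x - x.mean(), omega)
    return 1.0 / f[power.argmax()]
```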

2.1.3 Dirichlet Transform

This transform generalizes the Z-transform and is better suited for the analysis of non-uniformly sampled data. The Dirichlet transform [8] preserves information about the sampling instants because it does not simply treat x(t_n) as a sequence of samples but as a function of the time instants t_n. The Dirichlet transform is defined as follows:

X(p) = D[x(t_n)] = \sum_{n=0}^{\infty} x(t_n) \, e^{-p t_n}   (2–7)

where p is a complex variable defined as p = σ + jω. For uniform sampling this becomes equivalent to the Z-transform with z = e^{pτ}, where τ is the uniform sampling period. While calculating the Dirichlet transform, values up to f = 2 have been evaluated. To calculate the time period of the signal, the peak in the Dirichlet transform is identified and its inverse is taken.
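Evaluated on the imaginary axis (p = jω, i.e. σ = 0), Equation 2–7 becomes a direct non-uniform Fourier sum; a brute-force sketch:

```python
import numpy as np

def period_dirichlet(t, x, f_max=2.0, df=1.0 / 2048):
    """Estimate the period from the peak magnitude of the Dirichlet
    transform X(p) evaluated at p = j*2*pi*f."""
    f = np.arange(df, f_max, df)
    # |X(j omega)| for every trial frequency: sum_n x(t_n) e^{-j omega t_n}
    X = np.abs(np.exp(-2j * np.pi * np.outer(f, t)) @ (x - x.mean()))
    return 1.0 / f[X.argmax()]
```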

2.2 Results

In this section we present results obtained using four techniques. The first two, the Fourier transform and auto-correlation, are used along with cubic spline interpolation as described in Section 2.1.1. The last two, the Lomb periodogram and the Dirichlet transform, work directly on the non-uniformly sampled data, as mentioned earlier. The data set has been obtained from the MACHO survey, and the unit of time in the data set is one day.

Table 2-1. Comparative performance using interpolation based techniques, Lomb periodogram and Dirichlet transform along with results published by Harvard University, Time Series Center.

Light curve    Harvard   Fourier     Auto-        Lomb          Dirichlet
blue channel   -TSC      transform   correlation  periodogram   transform
1.3810.19      88.9406   44.2811     89.05        44.5217       44.5217
1.4411.612     45.1143   22.5986     473.6        22.5055       22.5055
1.4168.434     43.9301   22.1405     87.5         22.0215       22.0215
1.3809.1058    28.9073   14.4991     28.85        14.4225       14.4225
1.4652.565     27.5718   13.7681     27.65        13.745        13.745
1.4288.975     17.6131   8.8086      35.25        8.7897        8.7897
1.4539.778     16.2502   15.7538     31.2         8.1270        8.1270
1.4173.1409    14.1534   14.1241     113.8        7.0865        7.0865
1.3449.948     14.0064   7.0621      70.05        7.0137        7.0137
1.4174.104     8.4929    8.5333      85.05        4.249         4.249
1.4538.81      5.5343    81.92       50.0         2.7676        2.7676
1.3564.163     4.7155    34.1333     198.2        1.179         1.179
1.3804.164     4.1875    38.0718     62.975       2.0941        2.0941
1.3449.27      4.0349    102.4       10.35        2.0177        2.0177
1.3448.153     3.2765    17.0667     16.1         3.2768        3.2768
1.4539.37      2.9955    68.2667     40.2         1.4982        1.4982
1.3442.172     1.02059   22.7556     29.9333      0.5103        0.5103
1.3325.93      0.95176   19.5048     20.15        0.9517        0.9517
1.3444.880     0.90286   19.0615     29.0250      4.708         4.708
1.3447.783     0.83615   159.9390    76.96        0.7183        0.7183

To apply the interpolation based techniques to a data set, a frame of data is first selected having at least 100 sample points and no gaps greater than 10 days. Note that it is possible to obtain more than one frame of data from each light curve data set; in those cases we simply average the values obtained from a particular technique over the frames.

For the methods involving interpolation, a cubic spline is used to approximate the light curve, and the interpolated curve is then re-sampled uniformly at Fs = 20 samples per day. For Fourier analysis, the uniformly sampled data obtained after spline interpolation is windowed using a Hamming window; the highest peak in the spectrum gives the peak frequency, which when inverted gives the estimated time period. The size of the transform is the smallest power of 2, 2^k, such that 2^(k-1) < L <= 2^k, where L is the total number of samples in the time domain signal. For analysis using auto-correlation, the uniformly sampled data is passed directly to the auto-correlation function, and the plot of its output is analyzed to find the highest peak, disregarding the peak at zero lag; this peak corresponds to the periodicity of the light curve.

In the case of the Lomb periodogram and the Dirichlet transform we directly use the original samples for the estimation of the period. Both the Lomb periodogram and the Dirichlet transform are computed with a frequency resolution of 1/2048. In both cases, again, the peak corresponds to the periodicity of the light curve.

Table 2-1 gives the period values estimated for the various light curves using the four methods described above. The values which are correct estimates of the true period are marked in bold.

In Table 2-1 we observe that in many cases the estimated period is half of the true period. The reason is that eclipsing binary star systems have two eclipses per cycle, which gives rise to a modulation effect visible in the signal, similar to that shown in Figure 5-1. In that case the signal has a cycle length of around 88 days, but due to the modulation we see a trough about every 44 days. Since periodogram techniques aim at fitting sine waves to the signal, they tend to give a peak at the frequency corresponding to twice the cycle rate, which is expected. Even the ACF fails due to this modulation effect. Hence, keeping in mind the drawbacks of periodogram based methods and the challenges posed by the nature of the data, we move towards kernel based methods.


CHAPTER 3
PERIODICITY ESTIMATION USING KERNEL BASED METHODS

The present chapter introduces the recently developed kernel based technique known as correntropy and then presents a spatio-temporal kernel based technique. The algorithm proposed in [4], which utilizes interpolation and correntropy, is used for simulation purposes and compared to the newly proposed spatio-temporal kernel based technique.

3.1 Correntropy

Correntropy is a generalized correlation function introduced in [13]. It is defined through inner products of vectors, which can be computed using a positive definite kernel function κ satisfying Mercer's conditions, as in Equation 3–1:

\kappa(x_i, x_j) = \langle \phi(x_i), \phi(x_j) \rangle,   (3–1)

where φ(x_i) transforms the data x_i non-linearly from the input space to a high-dimensional feature space. There are various types of kernel functions, such as Gaussian, spline or sigmoid, but in this particular case we have used the Gaussian kernel, defined as follows:

\kappa(x_t, x_s) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{ -\frac{(x_t - x_s)^2}{2\sigma^2} \right\}   (3–2)

The value σ is known as the kernel size and is the free parameter in Equation 3–2; it is chosen from the data set itself. For defining correntropy a Gaussian kernel has been used. Given a random process {x_t, t ∈ T}, where t denotes time and T denotes the index set of interest, the correntropy function is defined as:

V(t, s) = E[\kappa(x_t, x_s)]   (3–3)


Applying the Taylor series expansion to the Gaussian kernel, we can express the correntropy function as:

V(t, s) = \frac{1}{\sqrt{2\pi}\,\sigma} \sum_{k=0}^{\infty} \frac{(-1)^k}{(2\sigma^2)^k \, k!} \, E[(x_t - x_s)^{2k}]   (3–4)

To estimate a univariate correntropy function we require that the even moment terms be shift invariant, which is a stronger condition than the wide sense stationarity required by the correlation function. Hence, to write correntropy as a function of the lag τ,

V(\tau) = V(t + \tau, t) = \frac{1}{N} \sum_{t=0}^{N-1} \kappa(x_t, x_{t+\tau}),   (3–5)

strict stationarity of the even moments is a sufficient condition when the Gaussian kernel is used, and V(τ) is estimated using Equation 3–5. The variable σ in Equation 3–2 determines the emphasis given to the higher order moments compared to the second order moment. Thus correntropy is a function of two arguments, similar to correlation, but with the addition of the higher order moments introduced by the kernel function. As σ increases, the higher order moments decay, causing the second order moment to dominate, and hence correntropy approaches correlation. Due to the introduction of higher order moments, correntropy has been found to produce sharper and narrower peaks in similarity estimation compared to the simple correlation function.
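A direct estimator of Equation 3–5 for uniformly sampled data might look as follows (a sketch; the normalizing factor of the Gaussian kernel is dropped, as is done later in Section 3.3):

```python
import numpy as np

def gaussian_kernel(u, sigma):
    """Gaussian kernel without the 1/(sqrt(2*pi)*sigma) factor."""
    return np.exp(-u ** 2 / (2.0 * sigma ** 2))

def correntropy(x, max_lag, sigma):
    """Estimate V(tau) of Equation 3-5 for lags tau = 0 .. max_lag."""
    N = x.size
    return np.array([gaussian_kernel(x[: N - tau] - x[tau:], sigma).mean()
                     for tau in range(max_lag + 1)])
```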

Another important property of correntropy is that it induces a well defined metric in the input space, known as the correntropy induced metric (CIM) [14]. It is defined as

\mathrm{CIM}(X, Y) = \left( \kappa(0) - V(X, Y) \right)^{1/2} = \left( \kappa(0) - \frac{1}{N} \sum_{n=0}^{N-1} \kappa(x_n, y_n) \right)^{1/2}   (3–6)

CIM can also be thought of as the root mean squared error (RMSE) between the two random variables in the transformed high dimensional space. For the Gaussian kernel, it has been observed that CIM behaves like an L2 norm when the two vectors are close, as an L1 norm farther out, and, as the vectors move farther apart still, it becomes insensitive to the distance between them (an L0 norm). The extent of the space over which the CIM


acts as an L2 or L0 norm is directly related to the kernel size σ, as illustrated in Figure 3-1. This unique property of CIM is very useful in rejecting outliers; in this respect it differs from simple correlation, which provides a global measure.

Figure 3-1. Contour and surface plot of CIM(X,Y) with Y=0 in 2D space with a Gaussian kernel and a kernel size equal to 1.

Another important concept associated with correntropy is the correntropy spectral density (CSD), defined as

P[f] = \sum_{\tau=-\infty}^{\infty} \left( V(\tau) - \langle V(\tau) \rangle \right) e^{-j 2\pi \frac{f}{F_s} \tau}   (3–7)

where ⟨V(τ)⟩ is the mean value of correntropy. This is equivalent to the Fourier transform of the centered correntropy function.
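Since Equation 3–7 is just the Fourier transform of the centered correntropy, the CSD can be estimated directly from the V(τ) estimator sketched above:

```python
import numpy as np

def correntropy_spectral_density(V, fs=20.0):
    """Estimate the CSD of Equation 3-7 as the FFT of the centered
    correntropy function V(tau)."""
    Vc = V - V.mean()                     # subtract <V(tau)>
    P = np.abs(np.fft.rfft(Vc))
    freqs = np.fft.rfftfreq(Vc.size, d=1.0 / fs)
    return freqs, P                       # peak frequency -> period estimate
```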

3.2 Spatio-temporal Kernel based Proposed Method

In the previous section we defined correntropy, which has been found to perform better than auto-correlation. To apply correntropy in a fashion similar to auto-correlation we would require uniformly sampled data. In the case of non-uniformly sampled data, we can perform interpolation, re-sample at regular intervals and then apply correntropy, as described in [4]. But in this method, as we shall see, the original data points are never used, and many sampled points are dropped while finding frames of data without large gaps. Hence, if we could use the non-uniformly sampled data directly in a measure similar to correntropy, it would be more useful. Before proceeding to solve the problem, we analyze the difficulty faced when implementing simple correntropy on non-uniformly sampled data. In Figure 3-2 we see that in the case of regularly sampled data, for every lag value, sample point 'A' in the reference signal has a corresponding value in the time shifted signal, which is not true for the non-uniformly sampled data. Hence, in the uniformly sampled case we have pairs of points which are passed into the kernel, and the kernel output is summed over all pairs to give the correntropy value, as in Equation 3–5. Therefore, in the present scenario of non-uniform data, instead of pairing each sample point with another sample at a fixed lag, we pair each sample point with every other sample point but assign a certain weight to each of those pairings, as shown in Figure 3-2. Hence, for a particular lag value τ, if two samples were taken at time instants whose separation differs from τ, that pair is assigned a weight κ_t(t, s + τ); we can clearly see that if the samples are spaced at exactly the interval τ, then the weight assigned is maximum. Equation 3–8 gives the new kernel output for the sample of the reference signal taken at time t, Equation 3–9 shows the summation over all samples in the data set, and Equation 3–10 deals with the normalization, giving the expected value of the kernel output over all samples as in simple correntropy. In Equations 3–8, 3–9 and 3–10 the left hand side is the expression for simple correntropy and the right hand side is the expression for the new two dimensional kernel based technique; κ deals with the sample values and κ_t with the time instants.

\kappa(x_t, x_{t+\tau}) \times 1 \;\rightarrow\; \sum_s \kappa(x_t, x_s) \, \kappa_t(t, s + \tau)   (3–8)


Figure 3-2. Illustration of the reason why simple correntropy cannot be directly used in the case of non-uniformly sampled data.

\sum_t \kappa(x_t, x_{t+\tau}) \times 1 \;\rightarrow\; \sum_t \sum_s \kappa(x_t, x_s) \, \kappa_t(t, s + \tau)   (3–9)

\frac{1}{N} \sum_t \kappa(x_t, x_{t+\tau}) \times 1 \;\rightarrow\; \frac{\sum_t \sum_s \kappa(x_t, x_s) \, \kappa_t(t, s + \tau)}{\sum_t \sum_s \kappa_t(t, s + \tau)}   (3–10)

Equation 3–10 defines the final expression for applying kernels to non-uniform data in a fashion similar to correntropy. We use a Gaussian kernel for the time kernel κ_t.

To implement the idea of applying kernels directly to non-uniformly sampled data, we propose a new spatio-temporal kernel. It is defined on 2D vectors, and the inner product of vectors can be computed using a positive definite kernel function κ as defined in Equation 3–1. Equation 3–2 uses one dimensional values, but in our case we are dealing with two dimensional vectors. More concretely, we define a two dimensional vector h which has the time value in one dimension and the magnitude value in the other, expressed as h_a = [t_a, x_a]^T and h_b = [t_b, x_b]^T. The product kernel κ is defined as

\kappa(h_a, h_b) = \kappa_1(t_a, t_b) \cdot \kappa_2(x_a, x_b)   (3–11)

where κ_1 and κ_2 are both Gaussian kernels as defined in Equation 3–2, defined on the time (t) and magnitude (x) components of the data set respectively. This kernel is still positive definite, being effectively a two dimensional Gaussian kernel with a diagonal covariance matrix whose first diagonal component σ_1 deals with the time component t_k and whose second diagonal component σ_2 deals with the magnitude x_k at that time instant. The new cost function is then defined as

K(\tau) = \frac{\sum_{a,b} \kappa(h_a, h_b + h^0_\tau)}{\sum_{a,b} \kappa_1(t_a, t_b + \tau)}   (3–12)

where h^0_\tau = [\tau, 0]^T.

The periodicity estimation procedure is then as follows (a code sketch of this scan is given after the lists below):

1. Let H = {h_k = [t_k, x_k]^T, 1 < k < N}, where N is the total number of data points obtained by selecting frames of the light curve.

2. For the trial period T = τ, calculate the new cost function K(τ) as defined in Equation 3–12, where (a, b) ranges over all pairs of samples.

3. Vary the value of τ from 0.5 to 200 with a step size of 0.001 and repeat step 2.

4. The value of τ which gives the first significant peak is the desired period.

A significant peak is determined as follows:

1. Denote the minimum value of the plot by Mn and the maximum value by Mx.

2. Dynamic range: d = Mx − Mn.

3. Threshold: Th = Mn + 0.9 d.

4. Any peak which exceeds the threshold Th is a significant peak.
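A minimal sketch of this scan, using the kernel sizes of Section 3.3 (σ_1 = 0.4 and σ_2 equal to 0.1 times the dynamic range of the magnitude); the direct N² pairing shown here is for illustration only and is computationally heavy:

```python
import numpy as np

def k_measure(t, x, taus, sigma1=0.4, sigma2_frac=0.1):
    """Evaluate the cost function K(tau) of Equation 3-12 over an array of
    trial periods, pairing every sample with every other sample."""
    sigma2 = sigma2_frac * (x.max() - x.min())
    dt = t[:, None] - t[None, :]                            # pairwise t_a - t_b
    k2 = np.exp(-(x[:, None] - x[None, :]) ** 2 / (2 * sigma2 ** 2))
    K = np.empty(taus.size)
    for i, tau in enumerate(taus):
        k1 = np.exp(-(dt - tau) ** 2 / (2 * sigma1 ** 2))   # kappa_1(t_a, t_b + tau)
        K[i] = (k1 * k2).sum() / k1.sum()
    return K

# Example scan (a coarser step than the 0.001 used in the text, for speed):
# taus = np.arange(0.5, 200.0, 0.01)
# K = k_measure(t, x, taus)   # the first significant peak of K gives the period
```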


3.3 Kernel Size

In the present section we focus on the selection of the kernel sizes for the technique proposed in Section 3.2. We can observe in Figure 3-3 that the peak becomes more prominent and the plot smoother as σ_1 increases. We also take into consideration the fact that for light curves with a small time period, using a larger value of σ_1 would flatten the peak. This happens because for smaller time periods the rate of change of the magnitude over a fixed interval of time is larger than for a data set with a larger time period. Hence, for data sets with smaller time periods, many pairs which should have been given less weight, due to the difference in their sampling instants, are given more weight. This distorts the plot and suppresses the peak at the true period. A trade-off between these two opposing factors leads to σ_1 = 0.4.

For the magnitude kernel, the value of σ_2 is considered relative to the dynamic range of the amplitude. Choosing a very large kernel size means that any two magnitude values passed through the kernel give a similar output, as the kernel tapers very slowly. Choosing a very small kernel size gives an output of 1 only for equal magnitude values and an output close to zero for any other pair of amplitudes. This is clearly reflected in Figure 3-4, where in the plots of the measure versus trial period for different values of σ_2/(dynamic range of magnitude) we see that a large kernel size gives a flat plot with values close to one irrespective of the assumed period, whereas a small kernel size gives a plot with values close to zero. Therefore, to obtain a sharper peak at the true period, we choose σ_2/(dynamic range of magnitude) = 0.1 as the optimum value.

For simplicity and to have the plot values restricted between 0 and 1 we drop the

normalizing factor for unit integral in the Gaussian kernel.

[Figure 3-3: panels showing the 2D kernel based measure versus trial period for light curve 1.3810.19 with sigma(time) = 0.1 and 0.4, and for light curve 1.3449.27 with sigma(time) = 0.4 and 2.]

Figure 3-3. Plot of 2D kernel based measure with varying standard deviation values for the time kernel. In this case light curve 1.3810.19 has a time period of 88.9406 days and light curve 1.3449.27 has a time period of 4.0349 days.

3.4 Results

In this section we compare the proposed 2D kernel based technique to the correntropy based technique due to [4], which involves interpolation of the data, followed by correntropy, and then the calculation of the CSD. From the CSD the peaks are identified and used to estimate the period of the light curve. The results are compared in Table 3-1.

[Figure 3-4: panels showing the 2D kernel based measure versus trial period for Sigma(Magnitude)/(Dynamic range of magnitude) = 0.01, 0.1, 1 and 10.]

Figure 3-4. Plot of 2D kernel based measure with varying standard deviation values for the magnitude kernel. In this case light curve 1.3810.19 is the data set used and has a true time period equal to 88.9406 days.

Table 3-1. Comparative performance using the proposed 2D kernel based technique and correntropy on interpolated light curves due to [4] along with results published by Harvard University, Time Series Center. Correctly identified values are marked in bold.

Light curve    Harvard   Proposed    Correntropy on
blue channel   -TSC      technique   interpolated data
1.3810.19      88.9406   88.83       44.4685
1.4411.612     45.1143   22.89       22.4889
1.4168.434     43.9301   43.93       21.8746
1.3809.1058    28.9073   14.31       14.3489
1.4652.565     27.5718   27.91       13.7102
1.4288.975     17.6131   17.97       8.6845
1.4539.778     16.2502   65.03       16.2076
1.4173.1409    14.1534   14.06       7.0486
1.3449.948     14.0064   14.03       6.9994
1.4174.104     8.4929    16.99       4.2412
1.4538.81      5.5343    11.01       5.4845
1.3564.163     4.7155    18.89       4.6179
1.3804.164     4.1875    20.97       4.1878
1.3449.27      4.0349    4.023       4.0239
1.3448.153     3.2765    22.88       3.2211
1.4539.37      2.9955    3.0         2.9570
1.3442.172     1.02059   22.98       22.7556
1.3325.93      0.95176   20.02       215.04
1.3444.880     0.90286   19.182      43.8857
1.3447.783     0.83615   17.975      275.5491

In Table 3-1 we see that both correntropy on interpolated data and the proposed method using 2D kernels perform better than the methods described in Chapter 2. First, we observe that the interpolation based auto-correlation and FFT methods get only three hits each, whereas the CSD based method gets seven correct identifications. Although the interpolation was done on the same frames of data, the CSD performs better. Interpolation introduces a significant amount of error when the gaps in the frame are comparable to the true value of the time period. This can be seen from the fact that in Table 2-1, for auto-correlation and FFT, almost all the correctly identified data sets have a larger time period; in particular, all the correct identifications for auto-correlation are light curves with a time period greater than 25 days. In the case of the CSD based method, however, data sets with time periods as low as 3 days have been correctly identified, despite the fact that the maximum allowed gap size while choosing a frame is 10 days. This clearly shows that the correntropy based method is very efficient in rejecting outliers, which in this case are the interpolated data generated in the regions of large gaps. Another interesting observation is that the CSD based method identifies the time period as half of the true value in 9 cases; this can again be attributed to the modulation effect observed in the light curves, as shown in Figure 5-1.

The proposed technique using a 2D kernel again performs better than the existing techniques, obtaining 8 correct identifications, and it too correctly identifies data sets with time periods as low as 3 days. The use of a Gaussian kernel helps in correctly identifying the useful sample pairs while calculating the final measure. For 6 light curves the proposed method gives a period which is a multiple of the true period. The reason is that the method is unable to find enough sample pairs whose time difference is close to the true time period; the 2D kernel based measure then produces peaks not at the true period but at those multiples of the true period for which it can find a sufficient number of sample pairs. In 2 cases the 2D kernel based method also gives a period which is half of the true period, which can be attributed to the modulation effect explained earlier.

Although these kernel based methods perform much better than the existing methods, both techniques still fail to produce any result for data sets with a time period less than or close to 1 day, as can be seen for the last four light curves in Table 3-1. The reason is that the average sampling interval in the data sets is always greater than 1 day. Thus we need an algorithm which can detect the correct time period even when the average sampling interval is larger than the true time period of the data. The data is non-uniformly sampled, which implies that we have information from various phases within the time period of the light curve; we therefore need an algorithm which is able to exploit this information to approximate a single time period of the light curve.


CHAPTER 4
PERIODICITY ESTIMATION USING SPATIO-TEMPORAL KERNEL BASED CORRENTROPY ON FOLDED TIME SERIES DATA

This chapter defines a 2-dimensional kernel based correntropy and then uses it to identify the periodicity of periodic signals and to quantify the likely period. Before describing the steps of the proposed technique, we first discuss the idea behind it. A periodic signal repeats itself after a fixed interval of time. If we compare two samples collected at an interval equal to a multiple of the period of the signal, then these values are expected to be equal in magnitude. In our case this happens rarely because, first of all, the signal is non-uniformly sampled with gaps, and there is a lot of noise and modulation. Still, if we take two samples at an interval close to a multiple of the true time period, then their magnitudes will be comparable. This idea suggests that one should fold the observations to the principal argument of the period. Thus, if we know the period, we can reconstruct one period of data as x(t) = x(t + nT), where T is the period and n is an integer. This idea is illustrated in Figure 4-1, where a signal with a time period of 10 units and an average sampling time of 1 unit is used to reconstruct a single period. If we fold the data using a value of T which is not a multiple of the true period, then the actual signal is not recovered. It is easy to see that the period T will yield the smoothest representation in the principal argument domain, whereas a value which is not an integral multiple of T will yield a noisy representation, as illustrated in Figure 4-2. Therefore one needs a methodology to compare the similarity of the samples both in time and in amplitude, which will be implemented with a two dimensional kernel. We have seen how a single period of the signal can be created by knowing the true period. Unfortunately this method is greedy, and many different trial period values need to be evaluated to obtain the period for which the similarity is highest.

Figure 4-1. Reconstruction of a single period of the signal by breaking the original signal into frames of length equal to the true time period of the signal and combining them into one frame.

More concretely, we define a two dimensional vector h which has the time value in one dimension and the magnitude value in the other, expressed as h_a = [t_a, x_a]^T and h_b = [t_b, x_b]^T. The product kernel κ is defined as

\kappa(h_a, h_b) = \kappa_1(t_a, t_b) \cdot \kappa_2(x_a, x_b)   (4–1)

where κ_1 and κ_2 are both Gaussian kernels as defined in Equation 3–2, defined on the time (t) and magnitude (x) components of the data set respectively. This kernel is still positive definite, being effectively a Gaussian kernel with a diagonal covariance matrix whose first diagonal component σ_1 deals with the time component t_k and whose second diagonal component σ_2 deals with the magnitude x_k at that time instant. Using the newly defined kernel, the correntropy is defined as follows:

V = \frac{1}{N-1} \sum_{i=1}^{N-1} \kappa(h_i, h_{i+1})   (4–2)

where {h_i} is an ordered sequence of vectors.

Figure 4-2. Folding performed on a non-uniformly sampled signal with true period equal to 1 unit. Folding has been performed with trial periods equal to 1 unit (A) and 1.3 units (B).

Section 4.1 presents the proposed technique for estimating the time period of the signal.

4.1 Period Estimation

The algorithm for estimating the period T is as follows (a code sketch of the full scan is given at the end of this section):

1. Let H = {h_k = [t_k, x_k]^T, 1 < k < N}, where N is the total number of data points obtained by selecting frames of the light curve.

2. For the trial period T = p, apply the transformation φ_p to H such that φ_p(H) = Y, where Y = {Ψ_k = [τ_k, x_k]^T, 1 < k < N} with τ_k = (t_k − ⌊t_k/p⌋ p)/p, where ⌊·⌋ is the floor function.

3. Order the transformed vectors such that Ψ_{k_i} precedes Ψ_{k_{i+1}} if τ_{k_i} <= τ_{k_{i+1}}; if τ_{k_i} = τ_{k_{i+1}}, order the amplitudes so that x_{k_i} <= x_{k_{i+1}}.

4. Calculate the correntropy V(p) with the 2D kernel as in Equation 4–2.

5. Calculate the correntropy with the time kernel only as a normalizing factor: U(p) = (1/(N−1)) Σ_{i=1}^{N−1} κ_1(τ_{k_i}, τ_{k_{i+1}}).

6. Vary the value of p over a range and repeat steps 2 to 5.

7. The value of p which gives the first significant peak in the plot of V(p)/U(p) is the desired period.

A significant peak is determined as follows:

1. Denote the minimum value of the plot by Mn and the maximum value by Mx.

2. Dynamic range: d = Mx − Mn.

3. Threshold: Th = Mn + 0.9 d.

4. Any peak which exceeds the threshold Th is a significant peak.

In the above algorithm the scanned range depends on some a priori knowledge of the periods of interest. A range of 0.5 to 200 days is used with a step size of 0.0001; for trial values less than 2, a finer step size of 0.00001 is used. The reason for the different step sizes is that for lower values of p a small deviation in the estimated period can give a noisy period reconstruction, because the number of cycles in the given time interval is larger. This is explained in the Appendix, and the scanning procedure is sketched below.
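The following minimal sketch ties steps 1 through 7 together, reusing unnormalized Gaussian kernels; the variable and function names are illustrative, not the thesis code.

import numpy as np

# A minimal sketch of steps 1-7 above; names are illustrative.

def fold_and_score(t, x, p, sigma1, sigma2):
    """Evaluate V(p)/U(p) for one trial period p, given sample times t
    and magnitudes x as 1D arrays."""
    tau = (t - np.floor(t / p) * p) / p        # step 2: fold into [0, 1)
    order = np.lexsort((x, tau))               # step 3: sort by tau, then x
    tau, xs = tau[order], x[order]
    k1 = np.exp(-np.diff(tau) ** 2 / (2 * sigma1 ** 2))  # time kernel values
    k2 = np.exp(-np.diff(xs) ** 2 / (2 * sigma2 ** 2))   # magnitude kernel values
    V = np.mean(k1 * k2)                       # step 4: Equation 4-2
    U = np.mean(k1)                            # step 5: normalizing factor
    return V / U

def scan_periods(t, x, trial_periods, sigma1, sigma2):
    """Steps 6-7: evaluate V(p)/U(p) over the whole grid of trial periods."""
    return np.array([fold_and_score(t, x, p, sigma1, sigma2)
                     for p in trial_periods])

Combined with the significant-peak rule above and the variable step size of the Appendix, this reproduces the scanning procedure end to end.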

4.2 Kernel Size

In this section we look into the selection of the kernel sizes. The value of σ_1 is chosen relative to the average sampling period, determined by dividing the time interval over which all vectors are spread by the total number of vectors. We can observe in Figure 4-3 that the peak becomes more prominent as σ_1 · (average sampling rate) increases, but we also take into consideration that vectors that are consecutive in time should be given more weight than vectors that are far apart in time. Weighting consecutive vectors is especially significant because, after the transformation of the 2D vectors in the proposed technique, we are measuring similarity between vectors that are consecutive in time. Giving more importance to vectors that are closer in time requires reducing the kernel size. A trade-off between these two opposing factors leads to σ_1 · (average sampling rate) = 1. Note that the average sampling rate is fixed for all trial period values while scanning over a range, because the proposed technique scales all the 2D vectors into the time range [0, 1] after the modulo operation and the total number of vectors is fixed.

Similarly, the value of σ_2 for magnitude is chosen relative to the dynamic range of the amplitude. Choosing a very large kernel size means any two magnitude values passed through the kernel give similar outputs, because the kernel tapers very slowly. Choosing a very small kernel size gives an output of 1 only for equal magnitude values and an output close to zero for any other pair of amplitudes. This is clearly reflected in Figure 4-4, where in the plots of correntropy versus trial period for different values of σ_2/(dynamic range of magnitude) a large kernel size gives a flat plot with values close to one irrespective of the trial period, and a small kernel size gives a plot with values close to zero. Therefore, to obtain a sharp peak at the true period, we choose σ_2/(dynamic range of magnitude) = 0.1 as the optimum value, as sketched below.

For simplicity, and to restrict the plot values between 0 and 1, we drop the normalizing factor for unit integral in the Gaussian kernel.
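A minimal sketch of these kernel-size rules follows, assuming (as stated above) that the folded times are rescaled into [0, 1) so that the average sampling period is 1/N; the function name is illustrative.

import numpy as np

# A minimal, illustrative sketch of the kernel-size choices of Section 4.2.

def choose_kernel_sizes(x):
    """Return (sigma1, sigma2) with sigma1 * (average sampling rate) = 1
    and sigma2 = 0.1 * (dynamic range of magnitude)."""
    N = len(x)
    sigma1 = 1.0 / N                        # average sampling period in [0, 1)
    sigma2 = 0.1 * (np.max(x) - np.min(x))  # 10% of magnitude dynamic range
    return sigma1, sigma2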


[Figure 4-3 contains four panels plotting correntropy value against trial period, one for each of σ_1 × (average sampling rate) = 0.01, 0.1, 1, and 10.]

Figure 4-3. Plot of correntropy of the transformed space with varying standard deviation values for the time kernel.

4.3 Results

In this section results are presented for 20 data sets and compared with the results published by the Harvard University Time Series Center, which are used as the standard for comparison.

[Figure 4-4 contains four panels plotting correntropy value against trial period, one for each of σ_2/(dynamic range of magnitude) = 0.01, 0.1, 1, and 10.]

Figure 4-4. Plot of correntropy of the transformed space with varying standard deviation values for the magnitude kernel.

This algorithm shows a significant improvement over all the previously mentioned methods for estimating the period. The method uses correntropy based on a 2D kernel and hence is able to exploit the outlier rejection provided by kernel based methods. Moreover, the correntropy is computed on the folded time series, which makes it robust to the effects of the average sampling rate compared to the interpolation based methods or the kernel based methods described earlier. In Table 4-1 we see that the proposed algorithm gives 15 correct identifications, and for the remaining 5 cases it gives a value equal to half of the true period.


Table 4-1. Comparative performance of the proposed correntropy based technique alongside results published by the Harvard University Time Series Center. Correctly identified values are marked in bold.

Light curve (blue channel)   Harvard-TSC   2D correntropy based technique
1.3810.19                    88.9406       44.5017
1.4411.612                   45.1143       45.1441
1.4168.434                   43.9301       43.9313
1.3809.1058                  28.9073       14.4546
1.4652.565                   27.5718       27.5748
1.4288.975                   17.6131       17.6116
1.4539.778                   16.2502       16.2508
1.4173.1409                  14.1534       14.1509
1.3449.948                   14.0064       14.0059
1.4174.104                    8.4929        8.4928
1.4538.81                     5.5343        5.5344
1.3564.163                    4.7155        4.7156
1.3804.164                    4.1875        4.1876
1.3449.27                     4.0349        4.0347
1.3448.153                    3.2765        3.2764
1.4539.37                     2.9955        1.4977
1.3442.172                    1.02059       0.5103
1.3325.93                     0.95176       0.95176
1.3444.880                    0.90286       0.90286
1.3447.783                    0.83615       0.41807

The robustness to the average sampling rate can be seen from the fact that, for the last four light curves with periods close to 1 day, the method gives 2 hits, and for the other two cases it gives half of the true period. The 5 cases where the estimate is half of the true period can be attributed to the modulation effect described in earlier chapters. Interestingly, in all five of these cases the peak at the true period was larger than the peak at half its value. Thus, if the threshold could be fine-tuned, we would perhaps be able to reach 100% accuracy.


Although this method performs better than the other techniques described earlier, it requires a larger computation time because of the folding in time performed at each trial value. Thus another important direction is to obtain a rough estimate of the period using a faster and reliable technique, and then use the proposed method to correctly identify the period over a smaller range of values.


CHAPTER 5
CONCLUSION

In Chapters 2, 3 and 4 we see in many cases that the identified peak is at half of the true period. The reason for a peak at a sub-multiple of the true period is the shape of the signal, shown in Figure 5-1. The modulation effect within a period is responsible for the peak at half of the true period. This modulation tends to affect the periodogram techniques, such as the Lomb periodogram and the Dirichlet transform, the most, as these methods try to fit sine waves to the data. They tend to report the fundamental frequency rather than the actual number of periodic cycles per unit time. Out of the 20 data sets, in 15 instances the Lomb periodogram and the Dirichlet transform identify the period as half of the true period, as seen in Table 2-1. The same happens when the FFT is used on interpolated data, where in 7 instances the detected period is half of the true period.

Since we are dealing with eclipsing binary star systems, whose magnitude waveform is shown in Figure 5-1, the fundamental frequency identified by the spectral methods almost always turns out to be twice the rate of the periodic cycle. The interpolation based methods of Chapter 2 have the added disadvantage that for data sets with smaller periods the average number of samples available per period is smaller, so the quality of the interpolation suffers greatly. Indeed, as the period of the test data set decreases, the interpolation based methods produce more erroneous results: with the FFT and autocorrelation methods we get three correct identifications for the first 10 data sets, but none for the final 10 data sets, which have smaller periods.

In Chapter 3 the newly proposed method gives 8 correct estimates, whereas the method proposed in [4] gives 7. Note that for the 4 data sets with periods close to one day, neither method gives a correct estimate. The problem arises because these data sets have, on average, one sample per period, which makes it difficult either to interpolate the samples or to estimate the period.

[Figure 5-1 plots the magnitude of light curve 1.3810.19 over roughly 200 days.]

Figure 5-1. Magnitude plot of light curve 1.3810.19. Note that the y-axis represents the magnitude of the star. Magnitude measures the brightness of a celestial object; however, the brighter the object appears, the lower the value of its magnitude. It is customary in astronomy to plot the magnitude scale reversed.

In Chapter 4 the proposed method uses folding to reconstruct a single period. Thus even data sets with few samples per period, on average, do not degrade the method. Hence this technique estimates with a very high degree of accuracy even the periods of data sets whose periods are close to or less than a day: it gives 15 accurate estimates, and the remaining 5 estimates are half of the true period.


[Figure 5-2 plots the correntropy value against trial periods from 0 to 20 for light curve 1.3448.153.]

Figure 5-2. Plot of spatio-temporal kernel based correntropy for light curve 1.3448.153.

[Figure 5-3 plots the correntropy value against trial periods from 0 to 100 for light curve 1.3810.19.]

Figure 5-3. Plot of spatio-temporal kernel based correntropy for light curve 1.3810.19.


Table 5-1. Performance evaluation of the existing techniques and the proposed techniques. The results published by the Harvard University Time Series Center are used as the gold standard for the evaluation.

Index  Method used                             Correct identifications   Average absolute relative error
                                               (20 light curves used)    for correctly identified periods
1      FFT on interpolated data                3                         0.01246
2      Autocorrelation on interpolated data    3                         0.00202
3      Lomb periodogram                        2                         0.00008
4      Dirichlet transform                     2                         0.00008
5      CSD on interpolated data                7                         0.00581
6      2D kernel based measure (proposed)      8                         0.00927
7      2D kernel based correntropy on folded   15                        0.00009
       time series data (proposed)

Compared with the methods described earlier, the estimates given by the 2D kernel based correntropy technique are also more accurate. Another thing to note is that the peaks obtained are very sharp, especially for data sets with a smaller period; Figure 5-2 shows a much sharper peak than Figure 5-3. Thus the spatio-temporal kernel based correntropy method is superior to the existing methods not only in the number of hits, where it estimates the period correctly, but also in the accuracy of those hits, as summarized in Table 5-1.

Future Directions

We have seen that for methods based on interpolation the average number of samples per period affects performance. In these methods we used a fixed allowable gap size of 10 days when selecting frames, but for light curves with smaller periods, especially those with periods of less than 10 days, this severely impairs the interpolation and hence causes most of the methods to fail. To improve this, perhaps an adaptive allowable gap size could be used, so that light curves with a smaller period use a smaller gap size than those with a larger period.

The periodogram based techniques suffer from the fact that they fit sinusoids to the data sets and hence report the period corresponding to the fundamental frequency. In other words, periodogram based techniques simply use sine waves as the basis functions to capture the information content of the data set. In the future, we could look into developing a new set of basis functions using fuzzy knowledge about the shape of the curves, which would make the method more robust to the modulation effects.

The first proposed method, which uses a spatio-temporal kernel, also fails especially for data sets with smaller periods. This is because the method uses a fixed kernel size for all lag values, but for data sets with smaller periods the kernel size needs to be sensitive to small changes in the time differences of the sample pairs: for light curves with smaller periods, a given shape changes at a higher rate, so a smaller kernel size would be preferable.

The second proposed technique, using spatio-temporal kernel based correntropy, shows a significant improvement over the previously existing methods, although it still fails in a few cases due to the modulation effect. To counter the modulation effect we used the concept of a significant peak in Section 4.1 while describing the procedure. Although the threshold adapts to each data set, the fraction of the dynamic range used is fixed, while it actually depends on the amount of modulation; this also holds for the first proposed method based on a spatio-temporal kernel. We need a way to identify the degree of modulation, or in other words the shape of the curve. Another drawback is that we have to scan over a fixed range of values to identify the true period. The development of an efficient algorithm to detect the range of values, or the order of magnitude of the period, is therefore worth looking into. This would significantly reduce the number of computations: folding in time is performed for each trial period value, which considerably slows down the algorithm, so fewer trial periods would reduce the computation time substantially.

Thus we see that all the interpolation based techniques and the proposed methods need, in some way, a rough estimate of the period to make the procedures more adaptive and hence perform better. In the case of periodogram based methods, we need to develop basis functions specific to the light curve to use instead of pure sinusoids. The proposed methods likewise need some knowledge of the shape of the curve to further improve their performance.


APPENDIX: VARIABLE STEP SIZE

This appendix deals with the step size used while scanning over a range of values to implement the period estimation algorithm described in Chapter 4. A variable step size is necessitated by the fact that a smaller step size increases the computational complexity, while a larger step size might cause the algorithm to miss the peak if the true period of the light curve is small. If the period of the light curve is large, we can afford a larger step size while scanning the range of values without missing the peak corresponding to the period. This effect can be seen in Figures 5-2 and 5-3: for light curve 1.3448.153, with a period of 3.2764 days, the peaks are much sharper, while for light curve 1.3810.19, with a period of 88.94063 days, the peaks are wider. This means we can afford a larger step size for light curve 1.3810.19 and still identify the peaks, whereas we cannot use a larger step size for light curve 1.3448.153 without risking missing the peak. We present below a derivation of how choosing a larger step size can cause the algorithm to fail.

Let the number of days over which the light curve data is collected be T, the true (unknown) period be p, the trial period be q (the variable in our experiment), and the step size used in the neighborhood of p be r. This value r is the resolution used to scan the range of values in the neighborhood of p. An error is introduced because the true period may not be present in the set of trial period values; this error is minimized when the trial period closest to the true period is used while scanning over a fixed range. Let the error be ε; it is easy to see that |ε| ≤ r/2. During the folding process the number of times the period is folded is T/(p + ε), so the accumulated error over all the periods is (T/(p + ε)) · ε. The phase shift from the first period through the last period is

2π · (T/(p + ε)) · (|ε|/p) ≈ 2π|ε|T/q² < 2π(r/2)T/q²    (A–1)


(We have assumed p ≫ ε, which holds for our experiments, and q = p + ε.) This phase shift needs to be minimized to get a proper reconstruction of the period when q happens to be the value closest to the true period, and it can be controlled by varying the resolution r over the range. We choose r such that the phase shift accumulated over all the periods during folding is only a small fraction of a complete cycle, i.e., such that the period reconstruction is not noisy. Hence we set the allowable phase shift to

0.005 · 2π = 0.01π    (A–2)

From Equations A–1 and A–2 we get 2π(r/2)T/q² = 0.01π. Solving for r gives

r = 0.01 q²/T    (A–3)

Thus the step size r around a trial value q is determined by Equation A–3: the algorithm uses a smaller step size while scanning over smaller trial period values and a larger step size for larger trial period values, as in the sketch below.
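As an illustration, the following minimal sketch builds a trial-period grid whose local step size follows Equation A–3; the function name and the survey length in the usage comment are illustrative, not from the thesis.

import numpy as np

# A minimal sketch of a variable-resolution trial-period grid built from
# Equation A-3, r = 0.01 * q^2 / T.

def trial_period_grid(q_min, q_max, T):
    """Build trial periods from q_min to q_max with local step
    r = 0.01 * q^2 / T: fine steps for short periods, coarse for long."""
    grid = [q_min]
    while grid[-1] < q_max:
        q = grid[-1]
        grid.append(q + 0.01 * q * q / T)  # r = 0.01 q^2 / T
    return np.array(grid)

# Example (illustrative survey length in days):
# periods = trial_period_grid(0.5, 200.0, 2800.0)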


REFERENCES

[1] C. Alcock, R. A. Allsman, D. R. Alves, T. S. Axelrod, A. C. Becker, D. P. Bennett, K. H. Cook, N. Dalal, A. J. Drake, K. C. Freeman, M. Geha, K. Griest, M. J. Lehner, S. L. Marshall, D. Minniti, C. A. Nelson, B. A. Peterson, P. Popowski, M. R. Pratt, P. J. Quinn, C. W. Stubbs, W. Sutherland, A. B. Tomaney, T. Vandehei and D. Welch, "The MACHO Project: Microlensing Results from 5.7 Years of LMC Observations," Astrophysical Journal, vol. 542, pp. 281-307, 2000.

[2] B. E. Boser, I. M. Guyon and V. N. Vapnik, "A Training Algorithm for Optimal Margin Classifiers," Proceedings of the 5th Annual ACM Workshop on COLT, pp. 144-152, Pittsburgh, USA, 1992.

[3] J. Debosscher, L. M. Sarro, C. Aerts, J. Cuypers, B. Vandenbussche, R. Garrido and E. Solano, "Automated Supervised Classification of Variable Stars. I. Methodology," Astronomy and Astrophysics, vol. 475, pp. 1159-1183, December 2007.

[4] P. A. Estévez, P. Huijse, P. Zegers, J. C. Príncipe and P. Protopapas, "Period Detection in Light Curves from Astronomical Objects Using Correntropy," Proceedings of the International Joint Conference on Neural Networks (IJCNN), July 18-23, 2010.

[5] P. Babu and P. Stoica, "Spectral Analysis of Non-Uniformly Sampled Data - A Review," Digital Signal Processing, Elsevier, 2009.

[6] J.-W. Xu, P. P. Pokharel, A. R. C. Paiva and J. C. Príncipe, "Non-Linear Component Analysis Based on Correntropy," Proceedings of the International Joint Conference on Neural Networks (IJCNN), July 16-21, 2006.

[7] N. R. Lomb, "Least-Squares Frequency Analysis of Unequally Spaced Data," Astrophysics and Space Science, vol. 39, pp. 447-462, February 1976.

[8] A. Wojtkiewicz and M. Tustytiski, "Application of the Dirichlet Transform in Analysis of Non-Uniformly Sampled Signals," Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. V-25 - V-28, 1992.

[9] J. D. Scargle, "Studies in Astronomical Time Series Analysis. II. Statistical Aspects of Spectral Analysis of Unevenly Spaced Data," The Astrophysical Journal, vol. 263, pp. 835-853, December 1982.

[10] A. Gunduz and J. C. Príncipe, "Correntropy as a Novel Measure for Nonlinearity Tests," Signal Processing, vol. 89, pp. 14-23, 2009.

[11] M. Petit, Variable Stars, New York: Wiley, 1987.

[12] P. Protopapas, J. M. Giammarco, L. Faccioli, M. F. Struble, R. Dave and C. Alcock, "Finding Outlier Light Curves in Catalogues of Periodic Variable Stars," Monthly Notices of the Royal Astronomical Society, vol. 369, pp. 677-696, June 2006.

[13] I. Santamaría, P. P. Pokharel and J. C. Príncipe, "Generalized Correlation Function: Definition, Properties, and Application to Blind Equalization," IEEE Transactions on Signal Processing, vol. 54, no. 6, pp. 2187-2197, June 2006.

[14] W. Liu, P. P. Pokharel and J. C. Príncipe, "Correntropy: Properties and Applications in Non-Gaussian Signal Processing," IEEE Transactions on Signal Processing, vol. 55, no. 11, pp. 5286-5298, 2007.

[15] A. Schwarzenberg-Czerny, "On the Advantage of Using Analysis of Variance for Period Search," Monthly Notices of the Royal Astronomical Society (MNRAS), vol. 241, pp. 153-165, 1989.

[16] G. Wachman, R. Khardon, P. Protopapas and C. Alcock, "Kernels for Periodic Time Series Arising in Astronomy," Proceedings of the European Conference on Machine Learning, Lecture Notes in Computer Science, vol. 5782, pp. 489-505, 2009.

[17] J.-W. Xu and J. C. Príncipe, "A Pitch Detector Based on a Generalized Correlation Function," IEEE Transactions on Audio, Speech, and Language Processing, vol. 16, no. 8, pp. 1420-1432, November 2008.

[18] T.-F. Wu, C.-J. Lin and R. C. Weng, "Probability Estimates for Multi-Class Classification by Pairwise Coupling," Journal of Machine Learning Research, vol. 5, pp. 975-1005, 2004.


BIOGRAPHICAL SKETCH

Bibhu Prasad Mishra was born in India in 1987. He completed his schooling in Rourkela before joining engineering school. He began his engineering studies in August 2005 at IIT Kharagpur, graduating with honors in 2009 with a Bachelor of Technology (B.Tech.) in electronics and electrical communication engineering. Upon graduation he joined the University of Florida to pursue a Master of Science degree in electrical and computer engineering. He has been working with Dr. Príncipe in the Computational NeuroEngineering Laboratory (CNEL) since spring 2010. He received his Master of Science degree from the Department of Electrical and Computer Engineering at the University of Florida in 2011.
