SADC Course in Statistics Appreciating General Forms of Longitudinal Data (Session 06)



Learning Objectives

By the end of this session, you will be able to

• explain ways in which time series data poses different requirements to those of longitudinal research

• discuss basic features of longitudinal approaches and how these compare to the features of cross-sectional (as well as time series) studies

• move on to further unaided study of reports and other literature about longitudinal work


Demands of time series methods

In many studies data collection through time is important. Key information can be gleaned from linking data files from different times, but this may not yield “time series” data.

Time series methods assume we have a long series of regularly repeated measurements of the same quantitative variable, according to precisely the same definitions and measurement protocol.


Longitudinal studies

The word “longitudinal” is used to relate to studies with observations over an extended period of time. They need not be precisely the same observations every time, e.g. school-children’s educational achievement may be assessed several times in their school lifetimes – each time by a different test, as the children progress. Formal (external) testing may be after varied lengths of time say 4, 3, 2 & 2 years.
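The contrast between a regular time series and the school-testing example can be sketched in a few lines. This is only an illustration: the function names `gaps` and `is_regularly_spaced` and the year values are assumptions made here, not part of the course material.

```python
def gaps(times):
    """Intervals between successive measurement times."""
    return [later - earlier for earlier, later in zip(times, times[1:])]

def is_regularly_spaced(times):
    """True when every interval between measurements is the same."""
    return len(set(gaps(times))) <= 1

# Hypothetical annual rainfall totals: one measurement per year.
rainfall_years = list(range(1990, 2000))

# School years at which formal tests fall, using the slide's gaps of 4, 3, 2 & 2 years.
school_test_years = [0, 4, 7, 9, 11]

print("Rainfall regular?", is_regularly_spaced(rainfall_years))
print("School tests regular?", is_regularly_spaced(school_test_years))
print("School test gaps:", gaps(school_test_years))
```

Regular spacing is only one of the time series requirements; the same variable, definitions and protocol are needed as well, and the school tests also differ on those.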


Example: cohort studies - 1

Epidemiological terminology, now used more widely. Classic medical examples:

(i) look at, and arrange to follow up, an initial disease free group (usually large number)

(ii) track, and measure, their exposure to risk factors (e.g. diet, environment, smoking)

(iii) at conclusion of study (often years later) assess disease incidence, “outcome”

(iv) relate disease status statistics to risk factors.
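Step (iv) can be illustrated with a minimal sketch. The counts are invented for illustration and the function name `relative_risk` is an assumption made here. It compares disease incidence in cohort members who were exposed to a risk factor with incidence in those who were not.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Incidence in the exposed group divided by incidence in the unexposed group."""
    incidence_exposed = exposed_cases / exposed_total
    incidence_unexposed = unexposed_cases / unexposed_total
    return incidence_exposed / incidence_unexposed

# Hypothetical cohort, all disease free at the start and followed for several years.
rr = relative_risk(exposed_cases=30, exposed_total=1000,
                   unexposed_cases=10, unexposed_total=1000)
print(f"Relative risk: {rr:.1f}")
```

Here incidence is 3% among the exposed and 1% among the unexposed, so the exposed group has three times the risk. A real analysis would also need to allow for confounding factors.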


Example: cohort studies - 2

Main measures are Before & After. Interest is in a long-term change, rather than month-by-month.

There are thousands being observed in the cohort, rather than one rain gauge in a meteorological time series.

Nature of interim measurements may change as science progresses e.g. new risk factors identified.


Longitudinal vs. cross-sectional

A research-oriented study can often be conceived either as a one-off (cross-sectional) exercise or as a study based on repeated observation (longitudinal, abbreviated to “Long’l” below).

A key benefit of Long’l is follow-up, i.e. revisiting the same subjects to measure change. Before (“B”) and after (“A”) measures on the same subjects show real difference through time. If “B” and “A” are on separate subjects, the difference may be due to the subjects, not to time.
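A small sketch of why same-subject comparison matters. The scores are invented for illustration. With the same subjects, change is measured within each subject; with separate subjects, a change of this size would be hard to distinguish from ordinary variation between subjects.

```python
from statistics import mean, stdev

# Hypothetical scores for five subjects measured Before ("B") and After ("A").
before = [40, 55, 62, 48, 70]
after = [43, 57, 65, 50, 73]

# Same subjects: the change is measured within each subject.
changes = [a - b for a, b in zip(after, before)]
print("Mean change:", mean(changes))
print("Spread (SD) of changes:", stdev(changes))

# Separate subjects: only group means could be compared, and the spread
# between subjects is much larger than the change of interest.
print("Difference of group means:", mean(after) - mean(before))
print("Spread (SD) between subjects at B:", stdev(before))
```

The mean change is the same either way, but the within-subject changes vary far less than the subjects themselves do, which is what makes the follow-up comparison sharper.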


Tracking - 1

Follow-up or “matching” requires putting serious resources into tracking systems, so subjects don’t disappear. If they do, the “B”/“A” comparison is void, and (i) the cost of “B” data collection is wasted, (ii) features of the original sample design are damaged.

Tracking may use name, address, GPS measures, family and friend and neighbour contacts, “club facilities” to promote loyalty, school registrations to follow family’s children.


Tracking - 2

Keeping control of tracking information is itself a data management issue!
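As a minimal illustration of tracking as a data management task, the sketch below matches subject identifiers between two rounds of data collection. The identifiers and variable names are hypothetical.

```python
# Hypothetical subject identifiers recorded at each round of data collection.
baseline_ids = {"H001", "H002", "H003", "H004", "H005"}
followup_ids = {"H001", "H003", "H004", "H006"}

matched = baseline_ids & followup_ids      # usable for the "B"/"A" comparison
lost = baseline_ids - followup_ids         # "B" data collected, but no "A" to compare
unexpected = followup_ids - baseline_ids   # at follow-up only: check for ID errors

attrition_rate = len(lost) / len(baseline_ids)
print("Matched:", sorted(matched))
print("Lost to follow-up:", sorted(lost))
print("Not in baseline:", sorted(unexpected))
print(f"Attrition rate: {attrition_rate:.0%}")
```

In this invented example two of the five baseline subjects are lost, an attrition rate of 40%. An identifier that appears only at follow-up is a prompt to check the tracking records before any analysis.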

Tracking rather static rural populations is much easier than tracking the urban poor, e.g. slum dwellers. Generally rural populations move less, and their social networks are usually easier for “outsider” interviewers to interrogate.

Definitional problems accompany movers even if traced. Say a farm worker becomes a migrant mine worker: there are complex effects on physical and mental health.


Institutional settings - 1

Time series data usually emerge from large institutions with stable systems able, and motivated, to ensure “precisely the same definitions and measurement protocol”.

Thus definition and focus usually inflexible. Time series datasets generally have – or are treated as – ONE key measure of interest + some subsidiary, explanatory, variables.

Definition of the one measure must be agreed by many people, commonly used and understood.


Institutional settings - 2

Long’l studies often represent “research”:

• new or imperfectly-understood theme,

• need to look at wider range of variables,

• study as a whole done as a one-off, not a repeated monitoring exercise

Difficulties arise because (i) compared to cross-sectional studies, long’l work is costly, e.g. tracking costs; (ii) it needs quite long-term commitment from researchers.


Is longitudinal research necessary?

Consider a study of poverty and the livelihoods of the poor. How people became poor is a process taking place through time. Measuring once cannot truly capture that process.

Precarious livelihoods are often a feature of poverty: earning opportunities are unstable, seasonal, short-lived, or few. A one-off observation therefore often misinterprets the longer-term situation with regard to trend, seasonality and the occurrence of shocks (family or crop health, climate etc.). So the answer, as in this example, is “Yes, longitudinal research is necessary!”


Study design: cause & effect - 1

Often a strong focus on cause and effect (look back at final session of Demography & Epidemiology module). Simple example is study of an intervention with classic design involving sets of matched pairs with and without the intervention:-

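[Diagram: the two members of a matched pair. First member: Before, then Intervention, then After. Second member: Before, then NO Intervention, then After.]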


Study design: cause & effect - 2

If the “befores” are matched but a difference shows up in the “afters”, consistently over many matched pairs, there is strong support for the argument that the consistently applied intervention (e.g. a micro-finance initiative brought in at village level) was the cause.

If design lacks the matched non-intervention cases, the causal argument is weaker.

Note this form of impact assessment must be planned at inception, NOT afterwards!
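The matched-pairs argument can be sketched numerically. The figures are invented, and the calculation is a simple difference-in-differences within each pair, chosen here for illustration; the slides do not prescribe a particular analysis.

```python
from statistics import mean

# Hypothetical matched pairs of villages. Each entry holds (before, after)
# for the intervention village, then (before, after) for its matched
# non-intervention village.
pairs = [
    ((100, 130), (101, 110)),
    ((80, 105), (79, 88)),
    ((120, 150), (122, 131)),
]

def change(before_after):
    """After minus before for one village."""
    before, after = before_after
    return after - before

# For each pair: change with the intervention minus change without it.
pair_effects = [change(treated) - change(control) for treated, control in pairs]
print("Per-pair effects:", pair_effects)
print("Mean effect:", mean(pair_effects))
```

The “befores” within each pair are close, and every pair shows a larger change in the intervention village. It is that consistency across pairs which supports the causal argument. Without the non-intervention villages, only the raw changes would be available, and these could reflect general trends.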


Long’l design: sample structure - 1

The need to track, revisit and build a relationship with sampled respondents is critical.

Cost dictates that respondents are grouped, like sample clusters. Often we also want to see effects in the context of the locality where households or businesses are located, so community data collection is part of the study.

Standard statistical theory of cluster sampling (in module H6) is not related to longitudinal problems. In that theory, clusters are not supposed to be interesting in their own right and are randomly selected.


Long’l design: sample structure - 2

Main analytic tool is comparison of results for same community/household/business over time. Less stress on representativeness of the communities than in cluster sampling, & random selection of clusters is implausible:-

• need to generate interesting interim findings, relevant to “clusters” (communities) and to overall policy, so as to maintain funding.

• thus choose series of “cluster settings” carefully as “sentinel sites” (see Basic Module BX), not at random but so that each sentinel site shows something important as study progresses.


Long’l design: sample structure - 3

Within each sentinel site, normally choose a probability-based sample e.g. simple random sample of qualifying households (e.g. qualifying in having joined micro-finance group).

The tracking decision, based on detailed study objectives, is whether to track the same individuals or individuals in the same positions in the sentinel site sample. For example, EITHER follow the progress through time of households who were in the micro-finance group at time zero, OR compare 2-yr-olds there in 2010 with 2-yr-olds there in 2006, because the focus is on the health status of 2-yr-olds.
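The sampling step and the two tracking options can be sketched as follows. Household and child identifiers, the sample size and the helper name `aged_two_in` are all hypothetical, and age is approximated by year of birth only.

```python
import random

rng = random.Random(2006)  # fixed seed so the sketch is reproducible

# Hypothetical list of qualifying households in one sentinel site.
qualifying_households = [f"HH{n:03d}" for n in range(1, 41)]

# Simple random sample of 10 households, drawn without replacement.
panel = rng.sample(qualifying_households, k=10)

# Option 1: track the same individuals, so the same panel is revisited each round.
rounds_same_households = {2006: panel, 2010: panel}

# Option 2: track the same position, so a different group qualifies each round.
children = [("C01", 2004), ("C02", 2004), ("C03", 2005),
            ("C04", 2008), ("C05", 2008)]

def aged_two_in(year):
    """Children who turn two in the given year (approximate, by birth year)."""
    return [child_id for child_id, birth_year in children
            if year - birth_year == 2]

print("Panel households:", sorted(panel))
print("2-yr-olds in 2006:", aged_two_in(2006))
print("2-yr-olds in 2010:", aged_two_in(2010))
```

Under option 1 the panel is fixed at time zero and must be tracked. Under option 2 no individual is followed, but the definition of who qualifies must stay the same at every round.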


Literature

This is a very large area of research. A few books are:-

De Vaus, D. (2001) Research Design in Social Research. Sage, London.

Hakim, C. (2000, 2nd ed.) Research Design: successful designs for social and economic research. Routledge, London.

Menard, S. (1991) Longitudinal Research. Sage University Papers 76.

Rose, D. (editor) (2000) Researching Social and Economic Change: the uses of household panel studies. Routledge, London.

Ruspini, E. (2002) Introduction to Longitudinal Research. Routledge, London.

