The Data we Collect (and how we collect them) Arie Kapteyn

Data collection at CESR

CESR’s Nubis software provides tools for collecting data with traditional interviewing techniques: face-to-face, phone, and self-administered interviews.

We have collected data in some 30 countries on five continents, usually in complicated longitudinal surveys (DLHS in India: 1.5 million respondents).

We can also collect data in the background (e.g., physical activity, GPS, heart rate) and send them to a central server. We can then send short (follow-up) questions to a cellphone or watch asking for details about the observed activity or location.
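The sensed-event follow-up described here can be sketched roughly as follows. This is a minimal illustration only: the `Reading` type, the `follow_up_question` function, the heart-rate threshold, and the minimum gap between prompts are all hypothetical and are not taken from Nubis or any CESR system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reading:
    """One background sensor reading (hypothetical structure)."""
    respondent_id: str
    heart_rate_bpm: int
    minutes_since_last_prompt: float


def follow_up_question(
    reading: Reading,
    threshold_bpm: int = 120,        # assumed trigger level, for illustration
    min_gap_minutes: float = 60.0,   # assumed limit on how often we prompt
) -> Optional[str]:
    """Return a short follow-up question if the reading warrants one, else None."""
    if reading.heart_rate_bpm < threshold_bpm:
        return None  # nothing unusual was observed
    if reading.minutes_since_last_prompt < min_gap_minutes:
        return None  # avoid prompting the same respondent too often
    return "We noticed elevated activity. What were you just doing?"


calm = Reading("r1", heart_rate_bpm=72, minutes_since_last_prompt=300.0)
active = Reading("r2", heart_rate_bpm=140, minutes_since_last_prompt=300.0)
recent = Reading("r3", heart_rate_bpm=140, minutes_since_last_prompt=5.0)
```

Only the `active` reading would produce a question here; the other two return `None`, one because nothing was sensed and one because the respondent was prompted recently.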

Countries where we work (or worked)

UAS: Real-time, Contextualized data collection and intervention technologies

Just-In-Time Adaptive Data Collection and Intervention by smartwatch and smartphone

UAS: ZEMI smartphone app

Data collection using smart phones

Mobile Technologies: Understanding Behavior in Real Time

• Sensors sensing behavior
• GPS sensing place
• Sound/device recog. sensing conversation, other people, mood
• EMA/SMS collects & provides data
• On demand according to times, places, in response to sensed events
• Integrates wireless data from wearable/deployable sensors
• Record of phone, email, Internet use
• Patterns of change over time and place
• Real- or near-time data transfer/feedback
• Electronic records of financial transactions

Source: Lane et al. 2011

We run a population representative Internet Panel of Households

What is an Internet Panel?

• Any device that can connect to the Internet
• Passive measurement, using Bluetooth for instance
• Mobile devices of any sort
• Essentially two types:
  – Convenience (non-probability) panels
  – Probability panels

Probability Panels

• Selection probabilities known.
  – Need sampling frame (denominator)

• Get internet access for those without it.
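To illustrate why known selection probabilities matter, here is a minimal sketch of inverse-probability weighting: each respondent is weighted by one over the probability of being selected. The function name, the sample values, and the probabilities are made up for illustration; this is not a description of the weighting procedure actually used for the panel.

```python
def weighted_mean(values, selection_probs):
    """Estimate a population mean with inverse-probability weights.

    Each respondent stands in for 1/p members of the population, where p is
    that respondent's known probability of selection.
    """
    if len(values) != len(selection_probs):
        raise ValueError("values and selection_probs must have the same length")
    if any(p <= 0 or p > 1 for p in selection_probs):
        raise ValueError("selection probabilities must lie in (0, 1]")
    weights = [1.0 / p for p in selection_probs]
    weighted_sum = sum(w * v for w, v in zip(weights, values))
    return weighted_sum / sum(weights)


# Hypothetical sample: 1 = has the characteristic of interest, 0 = does not.
values = [1, 0, 1, 1]
# Hypothetical selection probabilities; two respondents were under-sampled.
probs = [0.5, 0.25, 0.5, 0.25]

unweighted = sum(values) / len(values)
estimate = weighted_mean(values, probs)
```

In this made-up sample the unweighted mean is 0.75, while the weighted estimate is 2/3, because the under-sampled respondents count for more. Without a sampling frame the probabilities are unknown and no such correction is available.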

Continuous Presidential Election Polling

(Final forecast: 3.32% advantage Obama; final count 3.64%)

Why are probability internet panels with low response rates superior to convenience panels?

• Coverage of non-internet population
• Selectivity of respondents who sign up for convenience panels.
  – 30% of online surveys completed by 0.25% of the U.S. population (Miller, 2006)
  – 15-25% of vendor samples from a common pool of respondents (Craig et al., 2013)
  – Panel participants belong to 7 online panels (Tourangeau, Conrad, and Couper, 2013)
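The first figure above (30% of online surveys completed by 0.25% of the population) implies a very large difference in surveys completed per person. The short calculation below is only a back-of-the-envelope reading of that one statistic; the function name is hypothetical, and it treats the remaining 70% of surveys as spread over the rest of the population.

```python
def per_capita_ratio(share_of_surveys, share_of_population):
    """Surveys per person in the heavy-respondent group, relative to
    surveys per person among everyone else."""
    heavy_rate = share_of_surveys / share_of_population
    other_rate = (1 - share_of_surveys) / (1 - share_of_population)
    return heavy_rate / other_rate


# Shares taken from the Miller (2006) figure quoted above.
ratio = per_capita_ratio(share_of_surveys=0.30, share_of_population=0.0025)
```

Under that reading, a member of the heavy-respondent group completes roughly 171 times as many surveys as an average person outside it.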

Financial Transaction Data

• Financial aggregation firms like Mint.com and Check.me have people share their financial passwords with them so that participants can be provided with summaries of their expenditures and savings.

• We ask UAS respondents to sign up with one of these financial aggregators and allow us access to the data. Potentially that will allow us to produce estimates of national consumption in real time.

Critical Infrastructure

• Our website for respondents has to be up without interruption, so we cannot afford downtime.

• If downtime is inevitable (and planned), we need a warning well in advance so that we can alert our panel members.