Lars Lyberg Stockholm University Frimis, November 11, 2015 What’s Going on in Survey Research?

A Changing Survey Landscape

• Probability and nonprobability sampling
• Total survey error
• New technology
• Big data
• International surveys
• Hard-to-survey populations

Probability Sample

Every object in the target population has a known non-zero probability of being selected

• Very few samples in market, opinion and social research live up to this definition

• Reasons include nonresponse, frame problems, and special research goals
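The definition above is what makes design-based estimation possible: when each selected unit's inclusion probability is known, its value can be weighted by the inverse of that probability. A minimal Python sketch of that idea follows, using the Horvitz-Thompson estimator of a population total. The function name and the toy numbers are illustrative assumptions, not taken from the talk.

```python
def horvitz_thompson_total(values, inclusion_probs):
    """Estimate a population total from a probability sample.

    Each sampled value is weighted by 1 / (its known inclusion probability).
    """
    if len(values) != len(inclusion_probs):
        raise ValueError("values and inclusion_probs must have equal length")
    total = 0.0
    for y, pi in zip(values, inclusion_probs):
        # A probability sample requires a known, non-zero probability per unit.
        if not 0.0 < pi <= 1.0:
            raise ValueError("every inclusion probability must be in (0, 1]")
        total += y / pi
    return total


# Hypothetical sample of three units with unequal selection probabilities.
sample_values = [10.0, 20.0, 30.0]
sample_probs = [0.1, 0.2, 0.5]
estimate = horvitz_thompson_total(sample_values, sample_probs)
print(estimate)
```

The sketch fails loudly when a probability is zero or unknown, which is exactly the situation that nonresponse and frame problems create in practice.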

The Origins of Probability Sampling

• Introduced in 1934
• Basically a financial breakthrough
• Data collection was expensive
• To be able to say something about a population based on a relatively small sample, with a margin of error to go with it, was almost like magic
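To make the "margin of error" concrete, here is a small Python sketch for an estimated proportion. It assumes simple random sampling, full response and a large population, and uses the normal approximation with z = 1.96 for roughly 95% confidence. The function name and sample sizes are illustrative assumptions.

```python
import math


def margin_of_error(p_hat, n, z=1.96):
    """Approximate margin of error for a proportion under simple random sampling."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= p_hat <= 1.0:
        raise ValueError("p_hat must be between 0 and 1")
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n)


# Quadrupling the sample size only halves the margin of error.
moe_1000 = margin_of_error(0.5, 1000)
moe_4000 = margin_of_error(0.5, 4000)
print(f"n=1000: +/-{moe_1000:.3f}   n=4000: +/-{moe_4000:.3f}")
```

This margin covers sampling error only. It says nothing about nonresponse, frame or measurement error.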

A Couple of Giants

Sir Ronald Fisher and Jerzy Neyman

Problems

• It took a while for probability sampling to be accepted

• The sampling theory did not handle other error sources very well

• Basically the only “allowed” error source is sampling

Issues Associated with Sampling

• Ridiculous response rates
• Increased demands for timely data
• Access to large volumes of (inexpensive) data
• Margins of error are understated
• Discussions about nonprobability sampling
• New less expensive ways of collecting data
• The advent of opt-in panels
• Proper inference not always possible

Examples of Statements

• Probability sampling is the only reasonable way to achieve representativity

• Probability samples are not representative due to nonresponse

• There is no theoretical foundation for opt-in panels

• There are theories and methods based on modeling and weighting
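As an illustration of the "weighting" side of that last statement, the Python sketch below post-stratifies an opt-in sample to known population shares. The groups, shares and responses are hypothetical, and the function name is an assumption made for this example.

```python
def poststratified_mean(sample, population_shares):
    """Weight group means by known population shares.

    sample: list of (group, value) pairs.
    population_shares: dict mapping group -> share of the population (sums to 1).
    """
    sums, counts = {}, {}
    for group, value in sample:
        sums[group] = sums.get(group, 0.0) + value
        counts[group] = counts.get(group, 0) + 1
    missing = set(population_shares) - set(counts)
    if missing:
        raise ValueError(f"no respondents in groups: {sorted(missing)}")
    return sum(share * sums[g] / counts[g] for g, share in population_shares.items())


# Hypothetical opt-in panel in which young respondents are overrepresented.
sample = ([("young", 1.0)] * 6 + [("young", 0.0)] * 2
          + [("old", 1.0)] * 1 + [("old", 0.0)] * 1)
shares = {"young": 0.4, "old": 0.6}  # assumed known from a census or register

raw_mean = sum(value for _, value in sample) / len(sample)
weighted_mean = poststratified_mean(sample, shares)
print(raw_mean, weighted_mean)
```

The adjustment only removes imbalance on the weighting variable. It rests on the assumption that panel members resemble non-members within each group, which is where the disagreement in these statements lies.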

More Statements

• Studies show that probability sampling is more accurate than nonprobability sampling

• Some of these comparisons are flawed since weighting of the nonprobability samples has not been sufficiently ambitious

• Even though results from opt-in panels might be biased to some extent, they come much more quickly and at a fraction of the cost of a probability sample

The Current Situation

• Both probability and nonprobability sampling have problems

• Bayesian inference gaining ground
• Lots of experimentation needed
• Quality criteria need to be defined

The Recent British Election

• Whilst the Conservatives won convincingly, 18% of the campaign polls had suggested a dead heat and a further 46% had suggested Labour leads.

• Of the 36% of polls that registered Conservative leads, three out of four showed leads that were less than half the actual outcome.

• Both probability sampling and panels failed.
• The British Polling Council has initiated an investigation into why things went wrong.

Total Survey Error

• Sampling error: due to selecting a sample instead of the entire population
• Nonsampling error: errors due to mistakes or system deficiencies

Risk of Bias and Variance by Error Source

MSE component            Variance   Bias
Sampling error           High       Low
Specification error      Low        High
Nonresponse error        Low        High
Frame error              Low        High
Measurement error        High       High
Data processing error    High       High
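The "MSE component" heading refers to the mean squared error, which splits into variance plus squared bias. The Python sketch below checks that identity on a handful of hypothetical replicate estimates. The numbers and the function name are made up for illustration.

```python
def mse_components(estimates, true_value):
    """Return (variance, bias, mse) of a set of replicate estimates."""
    n = len(estimates)
    if n == 0:
        raise ValueError("need at least one estimate")
    mean_estimate = sum(estimates) / n
    bias = mean_estimate - true_value
    # Spread of the estimates around their own mean.
    variance = sum((e - mean_estimate) ** 2 for e in estimates) / n
    # Average squared distance from the true value.
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    return variance, bias, mse


# Hypothetical: five repetitions of the same survey, true value 50.
estimates = [52.0, 54.0, 53.0, 55.0, 51.0]
variance, bias, mse = mse_components(estimates, true_value=50.0)
print(variance, bias, mse)  # mse equals variance + bias**2
```

A larger sample shrinks only the variance term. A systematic error such as nonresponse bias stays in the MSE regardless of sample size.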

What to do about Total Survey Error

• Minimize variances and biases through QA, QC, QM, and best practices

• Estimate the size of the total error

• Apply risk management

New Technology

• Smartphones as a data collection mode

• Social media as an information source

• GPS

Big data is a term that describes data sets so large and complex that they cannot be processed and analyzed with conventional software systems.

Sources:
• Transaction databases
• Social media
• The Internet of Things

A Black Swan

A black swan is an undirected and unpredicted event. It is rare and has an extreme impact, but in retrospect we feel that we saw it coming

• Internet - yes

• 9/11 - yes

• The Lehman Brothers crash - yes

• The advent of Big Data - ?

The Three V’s

• Volume: tera- to peta- to exabytes of data, stored and processed
• Variety: structured, unstructured, text, images, maps, multimedia; varying sources
• Velocity: streaming data, from seconds to milliseconds
• Veracity: can we trust Big Data? Can we use it? Proxies, indicators

Big Data

Examples of Big Data with use or potential use in statistics production

• Google searches (flu trends)

• Traffic camera data

• Retail scanner data

• Credit card and transaction data

• GPS data

Hype of Big Data

Gartner’s hype curve

Source: Wikipedia

Happiness and Well-being

The common survey question: How satisfied are you with your life?

Big Data alternative

• 10 million tweets that are coded for happiness (rainbow, love, beauty, hope, wonderful, wine…) and non-happiness (damn, boo, ugly, smoke, hate, lied,…)

• Happiest states: Hawaii, Utah, Idaho, Maine, Washington

• Saddest states: Louisiana, Mississippi, Maryland, Michigan, Delaware
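The talk does not describe how the tweets were coded. As a rough illustration only, a word-list approach could look like the Python sketch below. The two word sets come from the examples on the slide, while the scoring rule, the function name and the sample tweets are assumptions.

```python
import re

# Example words from the slide; the full lexicon of the study is not given.
HAPPY_WORDS = {"rainbow", "love", "beauty", "hope", "wonderful", "wine"}
UNHAPPY_WORDS = {"damn", "boo", "ugly", "smoke", "hate", "lied"}


def happiness_score(text):
    """Count happy words minus unhappy words in a piece of text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    happy = sum(token in HAPPY_WORDS for token in tokens)
    unhappy = sum(token in UNHAPPY_WORDS for token in tokens)
    return happy - unhappy


# Made-up tweets for illustration.
tweets = [
    "What a wonderful rainbow, love it",
    "Damn, I hate this ugly weather",
    "Meeting at noon",
]
scores = [happiness_score(tweet) for tweet in tweets]
print(scores)
```

A scheme like this ignores context, negation and irony, and it says nothing about who tweets, which is one reason the veracity of such indicators is questioned.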

Big Data Challenges

• Data quality

• Data analytics

• Confidentiality concerns

Mono Surveys vs 3MC Surveys

• 3MC = multinational, multiregional and multicultural contexts
• One population vs more than one population
• In 3MC, TSE or MSE as planning criteria must be complemented by equivalence or comparability
• 3MC surveys need to be designed with a mixture of standardization and flexibility to achieve operational equivalence
• Implementation and control are much more demanding in 3MC surveys

Examples of 3MC Surveys

• Adult literacy (IALS)
• Adult skills (PIAAC)
• Student assessment (PISA)
• European Social Survey (ESS)
• World values (WVS)
• Health, ageing and retirement (SHARE)
• Electoral systems (CSES)

• Gallup World Poll (GWP)

• European Statistical System

• Marketing surveys on customer satisfaction, brand names, attitudes, finances etc

• Pure entertainment surveys

Some Special Features in a 3MC Survey Setting

• Comparability is the main goal
• Concepts must have a uniform meaning
• Risk management differs
• Financial and methodological resources differ (3MC surveys are expensive)
• National and international interests are in conflict
• Scientific challenge
• Administrative challenge
• National pride is at stake

Response Rates in PIAAC, Cycle I (%)

• Australia 71

• Austria 53

• Belgium 62

• Canada 58

• Cyprus 73

• Czech Republic 66

• Denmark 50

• Estonia 63

• Finland 66

• Germany 55

• Ireland 72

• Italy 56

• Japan 50

• Korea 75

• Netherlands 51

• Norway 62

• Poland 54

• Slovak Republic 66

• Spain 48

• Sweden 45

• UK-England 59

• UK-Northern Ireland 65

• USA 70

Challenges in 3MC Surveys

• Design (what can vary, what is rigid)
• Translation
• Adaptation
• Culturally different error structures
• Data fabrication
• Quality control
• Often too many countries involved

Hard-to-survey Populations (H2S)

• Homeless
• Prostitutes
• Refugees
• Victims
• Persons with disabilities
• Minorities
• Illegal aliens
• Rare (fans, musicians, language groups, extremists)
• Mobile populations (nomads, migrants, students)

Methodological Approaches to H2S

• Innovative sampling methods
• Venue-based (red light districts, voting facilities)
• Indirect sampling
• Snowball and respondent-driven sampling

• Qualitative studies (anthropology etc)

• Formative research
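To illustrate the snowball idea from the list above, the Python sketch below starts from a few seed respondents and follows their referrals for a fixed number of waves. The contact network, the names and the function are hypothetical. Respondent-driven sampling additionally limits the number of referrals per person and weights by network size, which is not shown here.

```python
def snowball_sample(contacts, seeds, waves):
    """Collect respondents by following referrals from seed respondents.

    contacts: dict mapping each person to the people they can refer.
    seeds: initial respondents recruited directly.
    waves: number of referral rounds to follow.
    """
    sampled = list(dict.fromkeys(seeds))  # drop duplicate seeds, keep order
    seen = set(sampled)
    current_wave = list(sampled)
    for _ in range(waves):
        next_wave = []
        for person in current_wave:
            for contact in contacts.get(person, []):
                if contact not in seen:
                    seen.add(contact)
                    next_wave.append(contact)
        sampled.extend(next_wave)
        current_wave = next_wave
    return sampled


# Hypothetical referral network in a hard-to-survey population.
network = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A"],
    "D": ["B", "E"],
    "E": ["D"],
}
respondents = snowball_sample(network, seeds=["A"], waves=2)
print(respondents)
```

Selection probabilities are unknown in such a design, so the usual margins of error do not apply without further modeling.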

The End of Theory

Faced with massive data, this approach to science — hypothesize, model, test — is becoming obsolete. Petabytes allow us to say: ‘Correlation is enough.’ We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot.

Chris Anderson, 2008

The Future of Surveys is Uncertain

• Too many surveys, too many off-the-shelf tools
• Active participation going down
• Passive participation going up
• Many problems are global
• Decision makers need data fast and at low cost
• The design-based approach needs refreshment
• Decision makers need data from different sources
• The big survey institutes are worried

Endnote

• Our industry needs innovations and less fighting
• We need to merge with other research cultures
• We need to know more about combining data sources
• We need to account for all major sources of uncertainty associated with data collection and analysis
• We need to develop new theories for handling error structures, combining data sources, and reaching equivalence

Over and Out
