Data context: new developments for research in the social sciences. 4th Luso-Brazilian Conference on Open Access, University of Sao Paulo, 9th October 2013. Peter Elias


DESCRIPTION

Workshop 6 - Confoa 2013 - Management of Scientific Journals - given by Prof. Dr. Peter Elias


Data context: new developments for research in the social sciences

4th Luso-Brazilian Conference on Open Access,

University of Sao Paulo

9th October 2013

Peter Elias


Structure of the presentation

• Recent reports - what’s going on?

• What constitutes data in the social sciences?

• What problems do we face with the more traditional forms of data?

• New forms of data

• Challenges using new data types

• The report of the Administrative Data Taskforce

• What does this mean for journals?


Recent reports…

• Royal Society 2012

• OECD 2013

• ESRC, MRC, Wellcome Trust 2012

• RCUK 2012


Science as an Open Enterprise (Royal Society 2012)


The main thrust of this report was that transparency and openness should characterise all scientific research. As a major part of this, data sharing should be regarded as the norm and researchers, their funders and research institutions should adopt this stance in all their research activities. An important recommendation relates to situations where data hold personal information. In such cases, appropriate safeguards should be put in place to prevent disclosure of such details whilst facilitating data sharing.
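One possible safeguard of the kind described above is to replace direct identifiers with keyed pseudonyms before data are shared. The sketch below is illustrative only and is not taken from the Royal Society report: the field names, the secret key and the helper functions are hypothetical, and pseudonymisation alone does not guarantee that individuals cannot be re-identified.

```python
import hashlib
import hmac

# Hypothetical secret held by the data custodian and never given to researchers.
SECRET_KEY = b"example-key-held-by-data-custodian"


def pseudonymise(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed hash (HMAC-SHA256, hex-encoded) of a direct identifier."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def prepare_for_sharing(record: dict, id_field: str, drop_fields: set) -> dict:
    """Replace the identifier with a pseudonym and drop other personal fields."""
    shared = {
        k: v for k, v in record.items()
        if k != id_field and k not in drop_fields
    }
    shared["pseudo_id"] = pseudonymise(str(record[id_field]))
    return shared


# Hypothetical record; all field names are assumptions made for illustration.
record = {
    "national_id": "AB123456C",
    "name": "Jane Doe",
    "region": "North",
    "income_band": 3,
}
shared_record = prepare_for_sharing(record, "national_id", {"name"})
print(shared_record)
```

Because the same identifier always maps to the same pseudonym under a given key, records can still be linked after the personal details have been removed.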


New Data for Understanding the Human Condition: international perspectives. (OECD 2013)

The focus of this report was on the need for global collaboration over data sharing. This will require improved incentives for researchers who agree to share data, and the adoption of agreed standards and protocols for data description. Additionally, the report calls for an international approach to the use of ‘Big Data’ for research, covering collaboration over the exploration of the research value of new forms of data, the development of tools for their analysis and improved access to administrative datasets on a cross-national basis.



Report of the Administrative Data Taskforce 2012 (ESRC, MRC, Wellcome 2012)

This cross-departmental Taskforce proposes a major boost to the resources available for linkage and sharing across administrative datasets with the establishment of Administrative Data Research Centres in the countries in the UK. Additionally, all taskforce members are agreed that new legislation is required in order to overcome current legal obstacles to record-level linkage between data held by different administrative bodies.


Investing for Growth: Capital infrastructure for the 21st Century (RCUK 2012)

This report sets out priorities for capital investment for research. A major theme throughout is to improve UK capacity to harness ‘Big Data’, emphasising the key importance of longitudinal data, of linking socioeconomic data sources to other data, including administrative records, private sector, and biomedical data, as well as ensuring these resources are accessible for social scientific research to benefit the economy, health and other sectors.


What constitutes data in the social sciences?

• Research interests focus upon people and organisations, their interaction, their evolution – seeking to understand better the behavioural relationships between them

• Data types of interest relate to people and organisations, variously classified as

Aggregated/disaggregated

Spatially referenced/time-stamped

Longitudinal/cross-sectional

Quantitative/qualitative

Structured/unstructured

• Data structures include ‘rectangular’ datasets, hierarchical data, textual, numerical, audio, video
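As an illustration of the difference between hierarchical and 'rectangular' data structures, the sketch below flattens a small household-and-person hierarchy into one row per person. The data and field names are invented for the example and do not come from the presentation.

```python
# Hypothetical hierarchical data: households that contain persons.
households = [
    {"household_id": 1, "region": "North", "persons": [
        {"person_id": 1, "age": 34},
        {"person_id": 2, "age": 36},
    ]},
    {"household_id": 2, "region": "South", "persons": [
        {"person_id": 1, "age": 71},
    ]},
]


def flatten(households):
    """Return one row per person, repeating household-level fields on each row."""
    rows = []
    for hh in households:
        for person in hh["persons"]:
            rows.append({
                "household_id": hh["household_id"],
                "region": hh["region"],
                "person_id": person["person_id"],
                "age": person["age"],
            })
    return rows


rows = flatten(households)
for row in rows:
    print(row)
```

The rectangular form is convenient for most statistical software, at the cost of repeating household-level values on every person row.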


What problems do we face with the more traditional forms of data?

• Discovery (NESSTAR; CESSDA; Data Management Plans)

• Documentation (DDI; SDMX)

• Access (DWB; IHSN)

• Reuse (CESSDA)

• Preservation (CESSDA)


New forms of data

| Broad category of data | Detailed categories | Examples |
|---|---|---|
| Category A: Government transactions | Individual tax records | Income tax; tax credits |
| | Corporate tax records | Corporation tax; sales tax; value added tax |
| | Property tax records | Tax on sales of property; tax on value of property |
| | Social security payments | State pensions; hardship payments; unemployment benefits; child benefits |
| | Import/export records | Border control records; import/export licensing records |
| Category B: Government and other registration records | Housing and land use registers | Registers of ownership |
| | Educational registers | School inspections; pupil results |
| | Criminal justice registers | Police records; court records |
| | Social security registers | Registers of eligible persons |
| | Electoral registers | Voter registration records |
| | Employment registers | Employer census records; registers of persons joining/leaving employment |
| | Population registers | Births; marriages; civil unions; deaths; immigration/emigration records; census records |
| | Health system registers | Personal medical records; hospital records |
| | Vehicle/driver registers | Driver licence registers; vehicle licence registers |
| | Membership registers | Political parties; charities; clubs |


New forms of data – contd.

| Broad category of data | Detailed categories | Examples |
|---|---|---|
| Category C: Commercial transactions | Store cards | Supermarket loyalty cards |
| | Customer accounts | Utilities; financial institutions; mobile phone usage |
| | Other customer records | Product purchases; service agreements |
| Category D: Internet usage | Search terms | Google; Bing; Yahoo search activity |
| | Website interactions | Visit statistics; user generated content |
| | Downloads | Music; films; TV |
| | Social networks | Facebook; Twitter; LinkedIn |
| | Blogs; news sites | Reddit |
| Category E: Tracking data | CCTV images | Security/safety camera recordings |
| | Traffic sensors | Vehicle tracking records; vehicle movement records |
| | Mobile phone locations; GPS data | |
| Category F: Satellite and aerial imagery | Visible light spectrum | Google Earth© |
| | Night-time visible radiation | Landsat |
| | Infrared; radar mapping | |


Challenges using new data types

• Provenance

• Replicability

• Durability

• Volume

• Ethics

• Confidentiality

• Legal issues

• Access may be strictly controlled


Focus from here on one particular data type:

Administrative data – reuse for research


What are administrative data?

Data which are the product of an administrative system. They are generated by organisations for operational purposes or as a legal requirement. They might identify people and/or organisations, and may contain detailed spatial information and be time-stamped. They are produced by public and private sector organisations. They are not designed for research.


What is the research value of such data?

• They already exist. No additional data collection costs associated with research use.

• They are typically large national datasets, permitting more detailed research to be undertaken than would otherwise be the case.

• They record a process, which can be documented and understood.

• Linkage between data relating to different time periods can create longitudinal resources.

• Linkage to other data sources (e.g. surveys) can enhance these resources.
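The two linkage points above can be sketched as follows: records from different periods are grouped by a shared person identifier to form longitudinal histories, which are then combined with survey responses. This is a minimal sketch under strong assumptions: the extracts, the `pid` identifier and the survey are all hypothetical, and exact matching on a common identifier is assumed, which real administrative datasets often do not allow.

```python
from collections import defaultdict

# Hypothetical annual extracts keyed by a pseudonymised person identifier.
extract_2011 = [{"pid": "a1", "earnings": 21000}, {"pid": "b2", "earnings": 18000}]
extract_2012 = [{"pid": "a1", "earnings": 22500}, {"pid": "c3", "earnings": 30000}]

# Hypothetical survey responses for a subset of the same people.
survey = {"a1": {"job_satisfaction": 4}}


def link_over_time(extracts):
    """Group records from several periods by person to form longitudinal histories."""
    histories = defaultdict(dict)
    for year, records in extracts.items():
        for rec in records:
            histories[rec["pid"]][year] = rec["earnings"]
    return dict(histories)


def attach_survey(histories, survey):
    """Combine each linked history with survey answers where a match exists."""
    return {
        pid: {"earnings_by_year": hist, "survey": survey.get(pid)}
        for pid, hist in histories.items()
    }


histories = link_over_time({2011: extract_2011, 2012: extract_2012})
linked = attach_survey(histories, survey)
for pid, data in linked.items():
    print(pid, data)
```

People who appear in only one period, or who have no survey match, are kept with partial information rather than dropped, so coverage of the linked resource can be assessed afterwards.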


What are the problems associated with their research use?

• Not designed for research. This may pose difficulties for their use in specific research areas.

• They are not subject to statistical standards or statistical quality controls.

• They may be difficult to access, and linkage may be prohibited or may not be feasible.

• As the systems that generate them change, so might the data.

• Their preservation for research is not regarded as a fundamental objective – may lead to problems with metadata.


Some of the problems currently faced by researchers

• Inconsistent access conditions.

• Severe delays in granting or refusing access.

• Lack of information about selection and/or linking of administrative datasets.

• Restricted access to datasets – especially for addressing the counterfactual.

• Data controllers making unilateral decisions about the appropriateness of data for research.

• Research permitted then publication denied.


Terms of reference for the Taskforce

• identification of potential risks and benefits from increased research use of administrative data;

• identification of likely resource implications arising from increased research use of administrative data;

• the development and introduction of common procedures to provide more efficient access to administrative datasets;

• clarification of the legal situation governing the use of routine data;

• clarification of when consent is required and what consent procedures should be used;

• identification of possible need for legislative change to improve access to administrative data for research.


What has the Taskforce recommended?

• Improved access and linkage procedures and arrangements for their governance.

• A clearer legal environment for linkage between data held by different departments.

• A common accreditation process for researchers applying for access to and linkage between administrative datasets.


Where are we now?

• £34 million released by government.

• Four Administrative Data Centres commissioned.

• A new UK Administrative Data Service set up.

• A national governing authority is being established.

• New legislation under preparation.

• Now commissioning centres for local government and private sector data.


What are the implications for libraries and journals?

• Libraries as home for secure remote access facilities.

• More attention to data documentation and discovery tools.

• Building up capacity within the research community to facilitate research using the improved access and data linkage arrangements.

• Subject knowledge of librarians to extend to administrative datasets.

• To be solved – open access and access to administrative data