RELU Conference , 20 January 2005
RELU Data Support Service RELU-DSS
Data Management Workshop
Louise Corti and Isabella Tindall
Workshop overview
• Guidance for creating and sharing high quality data
• Will cover the key practical, technical, legal and ethical issues including:
• An overview of the RELU themes and projects
• Data Management Policy and the RELU Data Support Service
• ESRC’s and NERC’s existing Datasets Policies
• Accessing ESRC and NERC archived data holdings
• Data held by third parties
• QA and data management plans
• Data formats, metadata and standards that allow for longer term sharing and archiving
• Ethical and legal issues
• Questions/Discussion
RELU Programme
• Rural Economy and Land Use Programme
• Harnessing the sciences for sustainable rural development:
Rural areas in the UK are experiencing a period of considerable change. The rural economy and land use programme aims to advance understanding of the challenges caused by this change today and in the future. Interdisciplinary research is being funded between 2004 and 2009 in order to inform policy and practice with choices on how to manage the countryside and rural economies.
The rural economy and land use programme enables researchers to work together to investigate the social, economic, environmental and technological challenges faced by rural areas. The programme will encourage social and economic vitality of rural areas and promote the protection and conservation of the rural environment.
Themes and data
• RELU themes:
– A The Integration of Land and Water Use
– B The Environmental Basis of Rural Development
– C Sustainable Food Chains (Call 1)
– D Economic and Social Interactions with the Rural Environment
• Call 1: 27 projects funded: smaller pilots/scoping/capacity building and 8 major data research projects
• Programme is both using and creating a variety of data sources
• Disparate types of data – social and environmental and biological data
Call 1: Research projects
• Eating Biodiversity: An Investigation of the Links between Quality Food Production and Biodiversity Protection
• Comparative Assessment of Environmental, Community & Nutritional Impacts of Consuming Fruit & Vegetables Produced Locally and Overseas
• Biological Alternatives to Chemical Pesticide Inputs in the Food Chain: An Assessment of Environmental and Regulatory Sustainability
• Warmwater Fish Production as a Niche Production and Market Diversification Strategy for Organic Arable Farmers with Implications for Sustainability and Public Health
• Implications of a Nutrition Driven Food Policy for Land Use and the Rural Environment
• Sustainable and Holistic Food Chains for Recycling Livestock Waste to Land
• Integration of Social and Natural Sciences to Develop Improved Tools for Assessing and Managing Food Chain Risks Affecting the Rural Economy
• Re-Bugging the System: Promoting Adoption of Alternative Pest Management Strategies in Field Crop Systems
RELU Data Management Policy
The data management policy enhances the capabilities for interdisciplinarity and therefore improves the ability of the research community to:
• apply learning from one field to another
• combine different methodological approaches and sources of information
• cross-fertilise ideas and concepts
• understand scientific, technological and environmental problems in their social and economic contexts
Policy principles
• Publicly funded research data are a valuable, long term resource
• To ensure maximum research exploitation, data must be managed effectively from day one
• Researchers must collect data in such a way as to ensure longer term sharing
• and manage their data effectively during the life of a project
• RELU funds will support data management through the life of the project
• Data must be made available by researchers for archiving: ESRC and NERC supported data centres provide long-term, post-project data management
RELU Data Support Service
• Set up to provide a support service for RELU researchers and staff to gain information and guidance on issues surrounding longer-term data sharing and preservation
• Joint support service run by:
– ESRC/JISC supported UK Data Archive at Essex
– The NERC-supported Centre for Ecology & Hydrology
• Funded for one year supporting one FTE and outreach activities: 1 Jan 05 – 31 Dec 05
RELU-DSS
• a data management advisory and support service for Call 1 award holders and Call 2 applicants and successful award holders
• a web-based information portal that will provide
– expert guidance on data management issues
– a searchable meta-data catalogue, detailing the data that RELU award-holders are intending to produce
• a programme of outreach and training aimed at RELU award holders
• the facilitation of access to key external data sources for RELU projects, where required
• guidance to the PMG and data sub-group on data management issues and longer-term costing for supporting RELU projects’ data management
Research Council Data Policies
RELU Data Management Policy builds on:
• NERC data policy, found in the Data Policy Handbook available from the NERC web site: www.nerc.ac.uk/data/documents/datahandbook.pdf
• ESRC Datasets Policy, found at www.esrc.ac.uk/esrccontent/researchfunding/sec17.asp
ESRC Datasets Policy – what is expected of award holders?
• to preserve and share data from ESRC funded research
• funding allowed to prepare data for archiving
• all award-holders must offer data for deposit to the ESDS within 3 months of the end of the award
• any potential problems should be notified to the ESDS at the earliest opportunity
• final payment will be withheld if dataset has not been deposited within 3 months of the end of the award, except where a waiver has been agreed in advance
NERC Data Policy – Thematic Programmes
• all managers of NERC programmes are expected to be familiar with the Policy
• scientists are expected to consider all the scientific data management implications of their projects at the planning stage (and before submitting grant applications), consulting the Designated Data Centres (DDCs) responsible for scientific data in their subject area.
• The appropriate DDC should be consulted as soon as it is clear what datasets will be emerging from the project. At the end of their projects grant holders are required to offer to deposit with NERC a copy of datasets resulting from their research
Longer-term data sharing
• data centres/archives make (selected) data available to other bona fide researchers
• safeguards to protect the interests of the original collector, who may retain Intellectual Property Rights
• preserve data using up-to-date curation systems and keep pace with technology and data trends
RELU Theme C data types
• Social data – people based
– Micro (survey)
• Household or individual level attributes
• Behaviour, attitudes and options
• Business/company
– Farm level data
– Aggregated
• UK Census (e.g. small area statistics)
• Retail statistics
• Health indicators
• GIS/Spatial data: geographically referenced environmental databases
– Ordnance Survey
– Road networks
– Settlement
RELU Theme C data types continued
• Water quality, land fill, air quality, emission levels
• Soil data, eg mineral composition
• Ecological data, animal and bird distributions
• Agricultural census
• Climate and meteorological data
• River flow data
• Biochemical data relating to foods/habitats
Existing 3rd party datasets
• Research Council data centres
– Rothamsted (BBSRC experimental samples of crops and soils)
– Economic and Social Data Service (eg ESRC Health and Lifestyle survey)
– EDINA/UK Borders (boundary data for admin areas)
• Public/Private Research institutes
– Macaulay
• soils and derived; climate; land cover; land capability data
• Department for Environment, Food and Rural Affairs (DEFRA) eg Farm Business Survey
• Scottish Executive Environment and Rural Affairs Department (SEERAD)
• Environment Agency (EA)
• National Soil Research Institute
• Met Office
Use of 3rd party datasets
• 3rd parties likely to require RELU to:
– Identify one point of contact for discussing data issues
• E.g. a NERC Data Centre for EA datasets
– All partners in a project to sign licenses for use of data
– The Data Centre to be responsible for issuing licenses to other projects wishing to use the same data
– The Data Centre to distribute the data, once licenses have been signed
Access to ESRC/NERC data resources
ESRC/JISC Data Centre
• national data archiving and dissemination service, running from 1 Jan. 2003
www.esds.ac.uk
• jointly supported by:
– Economic and Social Research Council
– Joint Information Systems Committee
• partners:
– UK Data Archive (UKDA), Essex
– Manchester Information and Associated Services (MIMAS), Manchester
– Cathie Marsh Centre for Census and Survey Research (CCSR), Manchester
– Institute of Social and Economic Research (ISER), Essex
ESDS overview
• ESDS Management
– central help desk service; coherent and flexible collections development policy; central registration service; universal data portal
• ESDS Access and Preservation
– collections development strategy; ingest activities, including data and documentation processing; metadata creation; data dissemination services; long-term preservation
• Specialist data services
– ESDS Government
– ESDS International
– ESDS Longitudinal
– ESDS Qualidata
– dedicated web sites; data and documentation enhancements; tailored user support; outreach and training
ESDS Holdings
Data for research and teaching purposes and used in all sectors and for many different disciplines
• official agencies - mainly central government
• individual academics - research grants
• market research agencies
• public records/historical sources
• links to UK census data
• qualitative and quantitative
• international statistical time series
• access to international data via links with other data archives worldwide
• history data service in-house (AHDS)
• 4,000+ datasets in the collection
• 200+ new datasets are added each year
• 6,500+ orders for data per year
• 18,000+ datasets distributed worldwide pa
The large-scale government surveys
• General Household Survey
• Labour Force Survey
• Health Survey for England/Wales/Scotland
• Family Expenditure Survey
• British Crime Survey
• Family Resources Survey
• National Food Survey/Expenditure and Food Survey
• ONS Omnibus Survey
• Survey of English Housing
• British Social Attitudes
• National Travel Survey
• Time Use Survey
Benefits of the large-scale government datasets
• good quality data
– produced by experienced research organisations
– usually nationally representative with large samples
– good response rates
– very well documented
• continuous data
– allows comparison over time
– data is largely cross-sectional
• hierarchical data
– individual and household
– intra-household differences
– household effects on individuals
[Chart: Percentage of women aged 18-49 cohabiting, 1979 to 2000 (y-axis 0 to 30). Source: General Household Survey]
Search on ‘Environmental’
200+ datasets found
Types of qualitative data
• diverse data types: in-depth interviews; semi-structured interviews; focus groups; oral histories; mixed methods data; open-ended survey questions; case notes/records of meetings; diaries/research diaries
• multimedia: audio, video, photos and text (most common is interview transcriptions)
• formats: digital, paper, analogue audio-visual
• data structures - differ across different ‘document types’
International data providers
• International Monetary Fund
• OECD
• United Nations
• World Bank
• Eurostat
• International Labour Organisation
• UK Office for National Statistics
• freely available to UK HE/FE – data licensing costs are paid by ESRC
• datasets delivered over the web via Beyond 20/20
Databanks cover:
• economic performance and development
• trade, industry and markets
• employment
• demography, migration and health
• governance
• human development
• social expenditure
• education
• science and technology
• land use and the environment
ESDS: Online access to data and user guides
• web pages
– easy to navigate format
– web catalogue with variable level searching
– subject browsing and major series
– free web access to online doc - pdf user guides and forms
• registration
– one-off registration with userid/password
– online account management and “Shopping Basket” ordering
– data are freely available for the majority of users
– One-stop Athens authentication
• data download and online browsing
– web download in various software formats - SPSS, STATA, tab-delimited, word
– Nesstar – online data analysis and visualisation
– ESDS International online system
– ESDS Qualidata online browsing system
NERC Data Centres
• NERC’s data holdings – core asset
• Network of 7 Designated Data Centres, which are responsible for managing NERC funded data and implementing the NERC Data Policy
• Central directory – the NERC metadata gateway
• E-Science funded NERC Data Grid under development
NERC Designated Data Centres
• Antarctic Environmental Data Centre: Responsible for all NERC's data from the Antarctic, regardless of discipline.
• British Atmospheric Data Centre: Responsible for atmospheric sciences data.
• British Oceanographic Data Centre: Responsible for marine data.
• National Geosciences Information Service: Responsible for geosciences data.
• National Water Archive: Responsible for NERC's hydrological data and for the Government's National River Flow Archive.
• Environmental Information Centre: Responsible for all other NERC terrestrial and freshwater data.
• NERC Earth Observation Data Centre: Responsible for NERC’s non-discipline-related remotely sensed data of the surface of the Earth acquired by satellite and airborne sensors.
NERC Data Centre Holdings
• The NERC MetaData Gateway simultaneously searches the catalogues of data held at several of the NERC designated data centres.
QA and Data Management Plans
Data Management Plan
• proforma to complete (Section 3 of the Project Communication and Data Management Plan)
• highlighting data management and custody issues at an early stage
• providing a basis for quality assurance within the Programme
• providing a basis from which award holders and the Programme Director can report and monitor project and overall RELU Programme progress
Data management
• Award holders will be required to provide full metadata together with a description of the datasets which their project generates
– metadata is the information necessary to interpret, understand and use a given dataset without reference to the original data collector
• Agree the technical arrangements for data management and archiving, including:
– decisions concerning final archiving destination for project data sets
– formats for supply of data
– licence agreements; IPR etc.
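The metadata requirement above can be illustrated with a minimal sketch of a dataset record and a completeness check. The class name, field names and example values are illustrative assumptions only; they are not a RELU, ESDS or NERC schema.

```python
from dataclasses import dataclass, fields


@dataclass
class DatasetMetadata:
    """Hypothetical minimal metadata record; field names are illustrative only."""
    title: str = ""
    abstract: str = ""
    creator: str = ""
    collection_method: str = ""
    geographic_coverage: str = ""
    time_period: str = ""
    file_format: str = ""
    access_conditions: str = ""


def missing_fields(record: DatasetMetadata) -> list[str]:
    """Return the names of fields left empty, in declaration order."""
    return [f.name for f in fields(record) if not getattr(record, f.name).strip()]


# Made-up example: a record that a third party could not yet use unaided.
example = DatasetMetadata(title="Farm survey", creator="A. Researcher", file_format="SPSS")
print(missing_fields(example))
```

A check of this kind could be run before deposit, so that gaps are found while the original data collector is still available to fill them.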
Information required from plan
• requirements for access to existing datasets
• details of new and derived datasets to be produced
• quality assurance of data
• formats and standards
• data description and documentation
• ethical, legal issues and IPR resolution
• data back-up procedures, security
• archiving data (for Research Council data archives)
• data management representative
RELU-DSS helps support these areas
Quality control and data management issues
• Survey data
• Qualitative data
• Environmental data
Characteristics of a “good” archived research collection
• Life cycle approach taken
• accurate data, well organised and labelled files
• appropriate measurement of key concepts
• supporting data/documentation should be deposited to a standard that would enable them to be used by a third party
– major stages of research recorded
– research/measurement instruments documented
• data that can be stored in user-friendly “dissemination” formats, but can also be archived in a future-proof “preservation” format
• consent, confidentiality & copyright resolved
ESDS: Supporting documentation
• To produce catalogue record and user guide
– funding application
– questionnaire/interview schedules
– description of methodology (details of sample design, response rate, etc)
– “codebook” (variable names, variable descriptions, code names and variable formatting information)
– technical report describing the research project
– communication with informants on confidentiality
– coding schemes / themes
– end of award report
– software description/versions used
– bibliographies, resulting publications
– code used to create derived variables or check data (e.g. SPSS, STATA or SAS “command files”)
• Anything that adds insight or aids understanding and secondary usage
Standardised description (metadata) fields taken from DDI specification for social science datasets
Survey data - variables
Labelling of survey data
• all variables should be named. Variable names should not exceed 8 characters where possible, as the most common format for disseminating data is SPSS
• all variables should be labelled. Labels should be brief (preferably < 80 characters), but precise and always make explicit the unit of measurement for continuous (interval) variables. Where possible, all variable labels should reference the question number (and if necessary questionnaire). For example, the variable q11bhexc might have the label “q11b: hours spent taking physical exercise in a typical week”. This gives the unit of measurement and a reference to the question number (q11b), so the user can quickly and easily cross-reference to it
Labelling of survey data II
• for categorical variables, all codes (values) should be given a brief label (preferably < 60 characters). For example, p1sex (gender of person 1) might have these value labels: 1 = male, 2 = female, -8 = don’t know, -9 = not answered
• where possible, all such labelling should be created and supplied to the UKDA as part of the data file itself. This is the expectation with data supplied in one of the three major statistical packages - SPSS, STATA or SAS.
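The labelling guidance on these two slides can be turned into a simple automated check. The sketch below is one possible way to do it, not a UKDA tool: the function name and the wording of the warnings are assumptions, while the length limits and the two passing examples come from the slides.

```python
MAX_NAME = 8          # variable names should not exceed 8 characters
MAX_VAR_LABEL = 80    # variable labels: preferably < 80 characters
MAX_VALUE_LABEL = 60  # value labels: preferably < 60 characters


def check_variable(name, label, value_labels=None):
    """Return a list of warnings for one variable; an empty list means it passes."""
    problems = []
    if len(name) > MAX_NAME:
        problems.append(f"{name}: name longer than {MAX_NAME} characters")
    if not label:
        problems.append(f"{name}: no variable label")
    elif len(label) >= MAX_VAR_LABEL:
        problems.append(f"{name}: variable label is {len(label)} characters")
    for code, text in (value_labels or {}).items():
        if not text:
            problems.append(f"{name}: code {code} has no value label")
        elif len(text) >= MAX_VALUE_LABEL:
            problems.append(f"{name}: label for code {code} is {len(text)} characters")
    return problems


# The two examples from the slides pass; the third (made up) fails twice.
ok_continuous = check_variable(
    "q11bhexc", "q11b: hours spent taking physical exercise in a typical week")
ok_categorical = check_variable(
    "p1sex", "gender of person 1",
    {1: "male", 2: "female", -8: "don't know", -9: "not answered"})
bad = check_variable("hours_of_exercise", "")
print(ok_continuous, ok_categorical, bad)
```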
QA survey data: validation checks
Computer aided surveys (CAPI, CATI or CAWI)
• these are the most accurate way of gathering survey data, but the software (e.g. Blaise) and hardware (e.g. a laptop for every interviewer) may be beyond project resources
• computer aided surveys allow one to build in as many logical checks - on question routing and responses - as is possible at the point of data creation
Non computer aided surveys
• less control over initial responses, but checks can be performed:
– at the point of data entry/transcription if “data entry” software is used. However, there are few cheap data entry packages around
– the only feasible option may be to enter data without checks directly into a spreadsheet style interface (e.g. Excel worksheet, SPSS data view), and perform validation checks afterwards - via command files in statistical packages or Visual Basic code in Excel or Access
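Validation after unchecked data entry can also be scripted outside a statistical package. The sketch below uses Python rather than the SPSS command files or Visual Basic mentioned above; the column names, codes, ranges and rows are all made up for illustration.

```python
import csv
import io

# Made-up rows standing in for data keyed directly into a spreadsheet.
RAW = """id,age,employed,hours_worked
1,34,1,40
2,17,2,
3,230,1,38
4,45,2,20
"""


def validate(row):
    """Return a list of problems found in one record (all rules are illustrative)."""
    problems = []
    age = int(row["age"])
    if not 16 <= age <= 110:                                   # range check
        problems.append("age out of range")
    if row["employed"] not in {"1", "2"}:                      # valid codes: 1 = yes, 2 = no
        problems.append("invalid employed code")
    if row["employed"] == "2" and row["hours_worked"] != "":   # routing check
        problems.append("hours_worked answered but respondent not employed")
    return problems


errors = {}
for row in csv.DictReader(io.StringIO(RAW)):
    found = validate(row)
    if found:
        errors[row["id"]] = found

for record_id, found in errors.items():
    print(record_id, found)
```

Keeping the rules in a script means the same checks can be rerun after every correction, and the script itself can be deposited as supporting documentation.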
An example of data seemingly untouched by the human eye:
Originating error in text variables:
Occupation: ‘sole trader’
Description of Occupation: ‘purveyor of seafood’
Propagated error in derived numeric variables:
• Respondent was coded under the Standard Industrial Classification (SIC) code relating to food retailers: 52.2 Retail sale of food, beverages and tobacco in specialised stores
Identifiers
‘Direct' and 'indirect' identifiers may threaten confidentiality
• Direct identifiers may have been collected as part of the survey administration process and include names, addresses including postcode information, telephone number etc.
• Indirect identifiers are variables which include information that when linked with other publicly available sources, could result in a breach of confidentiality. This could include geographical information, workplace/organisation, education institution or occupation
Quantitative data
• Remove the identifier from the dataset
• Aggregate/reduce the precision of a variable
– record the year of birth rather than the day, month and year; record postcode sectors (first 3 or 4 digits) rather than full postcode
• Bracket a coded (categorical) variable
– aggregate SOC up to 'minor group' codes by removing the terminal digit
• Generalise the meaning of a nominal (string) variable
• Restrict the upper or lower ranges of a continuous variable
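The techniques listed above can be sketched as small helper functions. The function names, example postcodes, example SOC code and income ceiling are hypothetical. The postcode helper assumes a well-formed UK postcode, where the sector is the outward code plus the first character of the three-character inward code.

```python
from datetime import date


def year_of_birth(dob: date) -> int:
    """Reduce precision: keep only the year of a full date of birth."""
    return dob.year


def postcode_sector(postcode: str) -> str:
    """Reduce precision: 'AB12 3CD' -> 'AB12 3' (outward code plus first inward character)."""
    compact = postcode.replace(" ", "").upper()
    outward, inward = compact[:-3], compact[-3:]
    return f"{outward} {inward[0]}"


def soc_minor_group(unit_code: str) -> str:
    """Bracket a coded variable by removing the terminal digit of the code."""
    return unit_code[:-1]


def top_code(value: float, ceiling: float) -> float:
    """Restrict the upper range of a continuous variable."""
    return min(value, ceiling)


print(year_of_birth(date(1970, 5, 17)))
print(postcode_sector("ab12 3cd"), postcode_sector("AB1 2CD"))
print(soc_minor_group("1234"))
print(top_code(250000, 100000))
```

Each function discards detail that cannot be recovered, so it is worth applying them to a copy and keeping the original file under restricted access.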
Online access to data
NESSTAR:
• browse detailed information (metadata) about these data sources, including links to other sources
• do simple data analysis and visualisation on microdata
• bookmark analyses
• download the appropriate subset of data in one of a number of formats (e.g. SPSS, Excel)
• Data must be ‘perfect’ - 100% labelled
Derived and aggregated products
• Permission to share and IPR are the main issues
• Range of potential parties with interest:
– Owners, funders, data gatherers, employers, other stakeholders, etc.
• All original source information must be recorded
Transcribing qualitative data
• integrated into the ongoing research – budget accordingly
• full transcriptions or summaries
• costs and benefits:
– self transcription
– internal team transcription
– external transcription
• full transcriptions:
– consistent layout
– speaker tags
– line breaks
– header with identifier / other details
– checked for errors
Qualitative data: identifiers removed
• Scheme devised – different for each dataset
• Ideally should reflect any pseudonyms used in publications
• Confidentiality respected
• Anonymisation?
• Problems of anonymisation
– Applied too weakly
– Applied too strongly
– Timing
– Potential for distortion
– Examples
• User undertakings
• Appropriate and sympathetic
Qualitative Research
• e.g. set of in-depth interviews
• Data list: list of contents of research collection
• acts as a point of entry for secondary user
• qualitative data: excel template interviewee/case study characteristics
Online access to qualitative data
• new emphasis on providing direct access to collection content
– supports more powerful resource discovery
– greater scope for searching and browsing content of data (supplementary to higher level study-related metadata)
– since users can search and explore content directly… can retrieve data immediately
• providing access to qualitative data via common interface (ESDS Qualidata Online)
• supporting tools for searching, retrieval, and analysis across different datasets
Means that data must be accurate and standardised
Back up and security
• Digital, paper and audio media are fragile. Digital media are even easier to change/copy/delete!
• a good backup procedure will protect against a range of mishaps such as:
– accidental changes to data
– accidental deletion of data
– loss of data due to media or software faults
– virus infections & hackers
– catastrophic events (such as fire or flood)
• Back up frequently, retain off site copies
• Consider storage conditions, fireproofing etc.
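Accidental changes and deletions are easier to catch when a checksum manifest is stored alongside each backup. The sketch below is one possible approach using only the Python standard library; the function names and the example files are assumptions, and it is a complement to, not a replacement for, off-site copies.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Checksum one file, reading it in chunks so large files are not held in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: Path) -> dict[str, str]:
    """Map each file's path (relative to folder) to its checksum."""
    return {str(p.relative_to(folder)): sha256_of(p)
            for p in sorted(folder.rglob("*")) if p.is_file()}


def changed_files(folder: Path, manifest: dict[str, str]) -> list[str]:
    """List files that were added, removed or altered since the manifest was built."""
    current = build_manifest(folder)
    names = manifest.keys() | current.keys()
    return sorted(n for n in names if manifest.get(n) != current.get(n))


# Demonstration in a temporary folder with made-up files.
with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp)
    (data / "survey.csv").write_text("id,age\n1,34\n")
    (data / "notes.txt").write_text("interview notes\n")
    manifest = build_manifest(data)
    (data / "survey.csv").write_text("id,age\n1,43\n")  # simulate an accidental change
    (data / "notes.txt").unlink()                       # simulate an accidental deletion
    result = changed_files(data, manifest)

print(result)
```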
ESDS in-house processing
• in-house data processing
– ‘cleaning up’ research data
– Collating documentation received from depositor
– repairing minor errors
– meeting users’ expectations
– cannot engage in major processing tasks unless destined for publishing into online systems
Environmental Data
Example: LOCAR Programme
• To better understand the hydrological, physical, chemical and biological processes operating in lowland catchments
• To improve modelling to support the integrated management of lowland catchment systems
• To create a database
– £7.75 Million
– Three catchments
– 12 Research projects
– Field Programme
Flow of data through LOCAR
[Diagram. Sources: the LOCAR Field Programme (water level, flow, quality, ecology, biology, rainfall, recharge, evaporation, groundwater); NERC and 3rd parties (e.g. OS, EA, MO). Centre: LOCAR Data Centre (lab, processing & QC). Users: LOCAR PIs and users, CST. Also shown: NERC metadata gateway. Arrow labels: supplying data, finding data, requesting data, receiving data.]
Objectives of the LOCAR Data Centre
• Acquire major datasets
• Provide data to LOCAR Scientists
• Establish standards for data definition and exchange
• Receive data and model output from scientists
• Publish appropriate data at the end of the Programme
• Ensure long term security and availability of LOCAR data
Pang and Lambourn Catchments
Site            Pang/Lambourn   Tern   Frome/Piddle
Recharge        7               4      3
Borehole        5               13     9
Water Quality   6               6      14
Flow            8               6      11
Datasets from NERC
• River Network
• DTM
• Land Cover
• HOST
• Daily Mean Flows
• Rainfall
• Ground Water Level
• Keyworth Borehole Archive Records
• Wellmaster Borehole data
• Geological maps
Raingauges
• Automatic Raingauges
– 0.2 mm tipping bucket - hourly
• Manual Raingauges
– Checking Automatic gauge
• Rainwater collector
– Rainwater chemistry samples
Level and Flow
• Water levels
– Deep boreholes
• Flow
– EA gauging stations
– Ultrasonic doppler flow meter
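As an illustration of how tipping-bucket records become hourly rainfall, the sketch below assumes the logger stores one timestamp per tip and that each tip represents 0.2 mm. The timestamp format and function name are assumptions; the LOCAR loggers may record hourly counts directly.

```python
from collections import Counter
from datetime import datetime

TIP_MM = 0.2  # rainfall depth represented by one bucket tip


def hourly_rainfall(tip_times):
    """Total rainfall (mm) per clock hour from a sequence of tip timestamps."""
    tips_per_hour = Counter(
        t.replace(minute=0, second=0, microsecond=0) for t in tip_times)
    return {hour: round(count * TIP_MM, 1)
            for hour, count in sorted(tips_per_hour.items())}


# Made-up tip times: three tips between 09:00 and 10:00, one after 10:00.
tips = [
    datetime(2005, 1, 20, 9, 5),
    datetime(2005, 1, 20, 9, 40),
    datetime(2005, 1, 20, 9, 59),
    datetime(2005, 1, 20, 10, 10),
]
totals = hourly_rainfall(tips)
for hour, depth in totals.items():
    print(hour.isoformat(), depth)
```

Hourly totals from the automatic gauge can then be summed and compared with the manual gauge reading, which is the check the slide describes.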
Automatic Weather Station
• Solar & net radiation
• Wind speed & direction
• Air temperature
• Relative humidity
• Atmospheric pressure
• Rainfall
• Soil temperature & heat flux
Hydra (Mk 3)
• Carbon Dioxide and Water Vapour Fluxes
Water Quality
• Temperature
• Conductivity
• Dissolved oxygen
• pH
• Turbidity
• River level
• Automatic water sampler
Ecology
• Salmon counts
• Smolt counts
• Redd counts
• Fish surveys
• River Habitat Surveys
• Plant surveys (Mean Trophic Rank)
• Diatom surveys
• Chironomid Exuviae
• Macro invertebrate surveys
Soil Moisture
• Neutron Probe
– Soil water content
– Radioactive source
– Manual
• Profile Probe
– 6 shallow depths
– Dielectric constant
– Automatic
Soil Water Potential
• Tensiometers
– Puncture Tensiometers (Shallow, Manual)
– Purgeable Tensiometers (Shallow, Automatic)
– Equitensiometers (Deeper, Automatic)
– Deep jacking tensiometers (depths up to 60m)
• Soil Water Chemistry
– Suction Samplers
Set up Tasks
• Hardware and Software requirements
• Create dictionaries
• Load site and instrument data
• Format conversion facilities
• Methods
• QC
• Meet with 3rd party suppliers
• Load 3rd party & NERC data
• Liaise with CSTs and PIs
• Website
Operational Tasks:
• Receive and load:
– Field data
– Data from researchers
• Maintenance
• Data dissemination
• Develop software
• Meetings with:
– researchers
– CSTs
– data managers
• Attend workshops, seminars and annual science meeting
• Report to steering committee
What can the Data Centre offer Scientists?
• Data
– Access to the field programme data
– Access to NERC data
– Access to third party data
• Data Management
– Data Centre – acquire, store, disseminate, long term storage, standards
– Web site
What does the Data Centre ask of Scientists?
• Appoint
– Quality and Data Managers
• Write and Maintain
– Quality and Data Management Plans
• Supply
– Data sets and metadata
• Observe the Data Policy
• Meet with the Data Centre
Access to datasets
• Build a metadata database
• Build a thesaurus of terms
• Provide a web based search tool
• Later provide web access to the datasets
Searching for metadata on the web
• Search:
– by keyword
– by project
– detailed search
– by theme
• Description of selected dataset:
– Title
– Abstract
– Contact
– Extent
Ethical and legal issues
Up front
• issues of consent and confidentiality allowing archiving should be included in the project management plan & addressed before data collection starts
• longer-term rights management in place and IPR issues considered
• unless a waiver on deposition has been agreed, researchers should not make commitments to informants which preclude archiving their data
Consent for archiving
• anonymity and privacy of research participants should be respected
• explicit ‘informed’ consent gained
• information for research participants should be clear and coherent and include:
– purpose of research
– what is involved in participation
– benefits and risks
– storage and access to data
– usage of data (current and future uses)
– withdrawal of consent at any time
– Data Protection & Copyright Acts
• N.B. Additional measures are needed when participants are unable to consent through incapacity or age
• reflect needs and views of all
• works in practice
Legal issues in data preparation
• ‘Duty of confidentiality’
• Law of Defamation
• Data Protection Act 1998 and EU Directive
• Copyright Act 1988
• Freedom of Information
Duty of Confidentiality
• disclosure of information may constitute a breach of confidentiality and possibly a breach of contract
• not governed by an Act of Parliament
• not necessarily in writing
• can be a legal contractual
• exemptions are:
– relevant police investigations or proceedings
– disclosure by court order
– ‘public interest’ - defined by the courts
– ethical obligations in cases of disclosure of child abuse
Law of Defamation
• a defamatory statement is one which may injure the reputation of another person, company or business
Data Protection Act 1998
• eight principles:
– Fairly and lawfully processed
– Processed for limited purposes
– Adequate, relevant and not excessive
– Accurate
– Not kept longer than necessary
– Processed in accordance with the data subject's rights
– Secure
– Not transferred to countries without adequate protection
• allows for secondary use of data for research purposes under certain conditions
Options for preserving confidentiality
• anonymisation
• consent to archive at the time of field work
• researcher contacts informants retrospectively
• user undertakings
• in exceptional circumstances - permission to use or closure of material
Copyright Act 1988
• developed for the broadcasting industry not research!
• protection of author’s rights
• multiple copyrights apply:
– automatically assigned to the speaker
– researcher holds the copyright in the sound recording of an interview (obtain written assignment of copyright from interviewee, or oral agreement (license) to use)
– employer holds the copyright in research data (obtain copyright clearance from employer)
• copyright lasts for 70 years after the end of the year in which the author dies
• copying work is an infringement unless it is for the purposes of research, private study, criticism or review or reporting current events, and if the use can be regarded as being in the context of 'fair dealing'
• seek legal advice on problem issues
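The 70-year rule stated above is simple date arithmetic. The sketch below applies only that rule as worded on this slide; the function name is hypothetical, and other terms may apply to other kinds of work, so it is an illustration and not legal advice.

```python
from datetime import date


def copyright_expiry(year_of_death: int, term_years: int = 70) -> date:
    """Last day of protection: 70 years counted from the end of the author's year of death."""
    return date(year_of_death + term_years, 12, 31)


# Made-up example: an author who died at any point during 1950.
expiry = copyright_expiry(1950)
print(expiry.isoformat())
```

Because the term runs from the end of the calendar year, the exact day of death within the year does not affect the result.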
Freedom of Information
• Freedom of Information Act 2000
A statutory right for individuals and organisations to request information held by public authorities. FOI specifically excludes environmental information which is covered by …
• Environmental Information Regulations 2004
• Enables individuals and organisations to obtain environmental information held by public authorities….
Many RELU data sets will fall under the EIRs
What is the legislation?
• Statutory rights of access to information
• Apply to public authorities – BBSRC, ESRC, NERC and the universities are public authorities
• Anyone, anywhere can request a copy of any information you hold – includes data sets
• Not all information has to be released
• Must respond to most requests in 20 days
Exemptions – information protected by law
• Don’t Panic - not all information has to be made available under FoI & EIRs
• FOI & EIRs provide a number of exemptions that can be applied to the release of information
• The presumption is that information will be made available unless for good reason (a public interest test).
• Exemptions protect scientific output, commercial business and personal information (through the Data Protection Act)
• Exemptions can be complex and difficult to apply. If in doubt, ask….
RELU-DSS
• The DSS will provide support to RELU award holders (Call 1 and 2) and round 2 applicants for Call 2, through a telephone and email help desk, a web portal and a series of training events.
• http://www.esds.ac.uk:8080/aandp/create/reludss.asp
• Email: [email protected]
• Tel: 01206 872572 or 01206 872974