Community Capability Model for Data-Intensive Research: An archaeological case-study

Graeme Earl, Archaeological Computing Research Group & Digital Economy University Strategic Research Group

Archaeology: data-intensive research

• Big Data research: capture (e.g. CT) & processing (e.g. HPC)

• Workflows and provenance

• Citation of data

• Interlinking data and publication

• Retention and migration

• Deposit/Self-deposit

• Exposure: Semantic Web

AHRC Portus Project

• 2007-11 AHRC Portus Project

• 2011-2014 AHRC Portus in the Roman Mediterranean Project

• Excavation, Survey, Geophysics, Finds Analysis

• Computer graphic simulation; field data capture and processing; data management

Portus and the Tiber Valley

Institutional Data Management Blueprint

• Produce framework for managing research data for an HEI

• Scope and evaluate a pilot implementation plan for an institution-wide data model

Policy Best Practice

Pilot Projects

Training & Workshops

Key Findings

1. Group research practice is embedded and unified

2. Group data management capabilities vary widely

3. Data management is carried out on an ad-hoc basis in many cases

4. Researchers’ demand for storage is significant

5. Users want more support for backup, particularly for large quantities of data

6. Researchers resort to their own best efforts in many cases, where central support does not meet their needs

Key Findings

• Researchers want to keep their data for a long time

• There is a need from researchers to share data, both locally and globally

• Data curation and preservation support needs to be improved

• Data can be BIG

• Project management is part of the data cycle

Portus data requirements analysis

• Data integration

• Data volume

• Data variety

• Data management (versioning, migration, linking)

• Fit to archaeological practice

– Context

– Finds

– Field practice

– Technological practice
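
The versioning and linking requirements above can be illustrated with a minimal sketch. It is not the Portus project's data model: the `DataRecord` class, its field names and the identifiers `context-0001` and `find-0042` are all hypothetical, chosen only to show one way a file tied to an excavation context might carry a version history and links to related records.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class DataRecord:
    """Hypothetical record for one file tied to an excavation context."""
    context_id: str                                 # illustrative context identifier
    path: str                                       # illustrative file name
    versions: list = field(default_factory=list)    # SHA-256 digests, oldest first
    links: set = field(default_factory=set)         # identifiers of related records

    def add_version(self, content: bytes) -> bool:
        """Store a new version only if the content differs from the latest one."""
        digest = hashlib.sha256(content).hexdigest()
        if self.versions and self.versions[-1] == digest:
            return False
        self.versions.append(digest)
        return True

    def link_to(self, other_id: str) -> None:
        """Link this record to another record, e.g. a find from the same context."""
        self.links.add(other_id)


record = DataRecord(context_id="context-0001", path="plan.csv")
record.add_version(b"x,y\n1,2\n")
record.add_version(b"x,y\n1,2\n")   # unchanged content: no new version is stored
record.add_version(b"x,y\n1,3\n")
record.link_to("find-0042")
print(len(record.versions), sorted(record.links))
```

Hashing the content rather than trusting file names or timestamps is one simple way to detect whether a migration step has altered a file.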

Portus Data Guidance

• Archaeology Data Service – Guides to Good Practice; Deposit

• CIDOC CRM

• The English Heritage STAR project

• The Forum for Information Standards in Heritage (FISH)

• MIDAS XML schema

• JISC INCREMENTAL and SUDAMIH

• Temporal period metadata: English Heritage Timelines Thesaurus; COMMONERAS

• Asset management: JISCDIGITALMEDIA; EXIF, XMP or IPTC data

• Getty Art and Architecture Thesaurus

• Collections Trust’s Archaeological Objects Thesaurus

• Archaeological geospatial data: UK GEMINI (Geo-spatial Metadata Interoperability Initiative)

• Geophysical data: English Heritage Geophysical Survey Database

• Laser scanning data: Laser Scanning Addendum to the Metric Survey Specification (Heritage3d.org); 3D-COFORM

• Reflectance Transformation Imaging: CHI and AHRC RTISAD (Empirical Provenance)

How to address human factors holding research back?

• Skills: Broad range of archaeologically specific courseware, exemplars and best practice guidance

• Legal: Remains problematic but increasing support and expectation of data sharing from administration

• Collaboration: archaeology is collaborative from the ground up

• Closed/Open: Archaeology is at a crossroads; funding means that openness is likely to be the only option but disciplinary cultures need to be addressed; strong open community

• E.g. research by Leif Isaksen on Roman Port Networks

How to address economic factors holding research back?

• Sustainability: Research council funding and mandate: institutional and disciplinary repositories; user-focussed tools

• Operational costs: complex issue, being addressed at Institutional level e.g. policy re: disciplinary curation

• Transactional costs: limited attempts to operate a pay model for data transactions in archaeology: some museum and media related; open model dominating current work

How to address technical factors holding research back?

• Standards: CIDOC CRM/CRM-EH, FISH etc. – embedding in technology and data uptake key

• Access and exposure: Linked Open Data; green or gold open access

• Platforms and tools: broad range, but increasing trend towards standardisation

Portus in SharePoint 2010

• Archaeology SharePoint Pilot (Metadata and Visualisation Component)

Virtual Research Environment (VRE) Toolkit for SharePoint

• SES SharePoint Pilot (Project Management)

DepositMO Project Overview

• Embed deposit culture into everyday work of researchers

• Turn repositories into an extension of the desktop

• Extend Microsoft Office

• Target EPrints and DSpace

• Enable repository deposit as part of everyday workflow

Future for Portus Data

• Planned for deposit from the start of the AHRC Portus Project

• Complete migration to partnership: SharePoint archive and ARK database

• Strategies for big data

• Strategies for integrated workflows

Scientific Workflows (e.g. I2S2; MyExperiment) and Blogs (e.g. Blog3)

e.g. RTI SharePoint 2010:

(Big) Data; Metadata; Project pages, blogs, wikis, finances

www.southampton.ac.uk/muvis

Future for Portus Data

• The latest AHRC funding (2011-2014) will take the project from analysis of data through to linking of publications to the data upon which they are based

– Expose Linked Open Data

– Repository integration is core to our on-going work, including self-deposit (SWORD-ARM & DataPool)

– VRE developments to enable SharePoint -> Repository

– Experimental Data-Publication linking e.g. RIN
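
As an illustration of linking publications to the data they are based on, the sketch below writes the links as Linked Open Data statements in N-Triples form. It is an assumption-laden example, not the project's implementation: the `example.org` URIs and the helper names `triple` and `link_publication` are hypothetical, and the choice of the Dublin Core terms `references` and `isReferencedBy` is one plausible vocabulary among several.

```python
DCTERMS = "http://purl.org/dc/terms/"


def triple(subject: str, predicate: str, obj: str) -> str:
    """Format one N-Triples statement whose three terms are all IRIs."""
    return f"<{subject}> <{predicate}> <{obj}> ."


def link_publication(publication: str, datasets: list) -> list:
    """Return statements linking a publication and its datasets in both directions."""
    lines = []
    for dataset in datasets:
        lines.append(triple(publication, DCTERMS + "references", dataset))
        lines.append(triple(dataset, DCTERMS + "isReferencedBy", publication))
    return lines


# Hypothetical identifiers, used only for illustration.
publication = "http://example.org/publication/1"
datasets = [
    "http://example.org/dataset/geophysics",
    "http://example.org/dataset/finds",
]
statements = link_publication(publication, datasets)
print("\n".join(statements))
```

Stating each link in both directions means a reader starting from either the publication or the dataset can find the other without a query engine.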
