Big Data's Long Tail

Carly Strasser, John Kunze, Trisha Cruse

University of California Curation Center, California Digital Library

10 December 2012

From Flickr by rahen z

The Long Tail

[Figure: the long-tail curve, built up over several slides. Axis labels: Size of dataset; # researchers; # datasets; # grants; grant ($). Annotations: "With data managers and fancy tools" and "Do-it-yourself tools".]

From Flickr By puck90

UGLY TRUTH

Many researchers…

• are not taught data management
• don’t know what metadata are
• can’t name data centers or repositories
• don’t share data publicly or store it in an archive
• aren’t convinced they should share data

Intercept researchers where they already work

Facilitate

Archiving

Sharing

Publishing

Data management & organization

Data re-use & reproducibility

DataUp: the vision

Open Source Tool

Add-in & Web Application

Earth, Environmental, & Ecological Researchers

?

Add-in
• Download and install
• Appears as “ribbon” in Excel
• Windows Excel 2007+

Web-based application
• Website that does something with user’s files
• New user interface
• Any platform/spreadsheet software?

Features
• Best practices check
• Generate metadata
• Generate citation
• Post data to repository

Requirements

SENT TO MSR

Released Sept 4, 2012

Best Practices Check

Generate Metadata

Attribute Metadata

Create Data Citation

Create Data Citation: Get Identifier

ask CDL’s Merritt repository for id

... which asks EZID for an id
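
The chain described on this slide (the tool asks the repository, which in turn asks the identifier service) can be sketched roughly as below. This is only an illustration of the delegation pattern, not the actual Merritt or EZID interface: the classes `FakeEzid` and `FakeRepository`, the function `get_identifier`, and the `fake:/00000` identifier prefix are all hypothetical stand-ins invented for the example.

```python
import itertools


class FakeEzid:
    """Hypothetical stand-in for an identifier service such as EZID."""

    def __init__(self, prefix="fake:/00000"):
        self._prefix = prefix
        self._counter = itertools.count(1)

    def mint(self):
        # Return a new, unique placeholder identifier on every call.
        return f"{self._prefix}/{next(self._counter):04d}"


class FakeRepository:
    """Hypothetical stand-in for a repository such as Merritt."""

    def __init__(self, id_service):
        self._id_service = id_service
        self.reserved = []

    def request_id(self):
        # The repository does not mint ids itself; it delegates to the id service
        # and remembers which identifiers it has handed out.
        identifier = self._id_service.mint()
        self.reserved.append(identifier)
        return identifier


def get_identifier(repository):
    """What the client (here, the spreadsheet tool) does: ask the repository."""
    return repository.request_id()


ezid = FakeEzid()
repo = FakeRepository(ezid)
first = get_identifier(repo)
second = get_identifier(repo)
print(first, second)
```

The point of the indirection is that the client only ever talks to the repository, so the identifier can be reserved before the citation is generated and the data are uploaded.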

Upload to a Repository

Tip: you can also choose a practice repository

The long tail of the long tail

A data repository for Anyone, Anywhere

Build community

Add repositories

Add metadata schema

From animationresources.org

ONEShare, Merritt, UCSB, Dryad, etc.

NSF DataNet, INTEROP, etc.

dataup.cdlib.org
@DataUpCDL
facebook.com/DataUpCDL

John.Kunze@ucop.edu
carlystrasser@gmail.com
Patricia.Cruse@ucop.edu
