Interesting Problems in Academic Data
William Gunn, Head of Academic Outreach, Mendeley (@mrgunn)

Open Data Bay Area: Interesting Problems in Academic Data

Academic data

• Not data like the Climate Corp.

• Not data like Facebook or Twitter

• Metadata! (not the NSA kind)

• Information about the scholarly outputs of academic researchers

Page, Lawrence and Brin, Sergey and Motwani, Rajeev and Winograd, Terry (1999) The PageRank Citation Ranking: Bringing Order to the Web. Technical Report. Stanford InfoLab.

http://nexus.od.nih.gov/all/2012/02/13/age-distribution-of-nih-principal-investigators-and-medical-school-faculty/

The bog we’re stuck in

• Conservative culture

• Technophobia in humanities

• Policy issues

• Academic incentives
  – beholden to publishing companies for status
  – data & code aren’t citable
  – data & code aren’t easily shareable

• Extra work for no extra credit, reproducibility issues

Time to market

• http://freethedata.org

• Gene patent reform

• Portable Legal Consent

Beholdenness

• Currently, you’re judged almost, if not entirely, on what you publish

• Journals which are cited more (high impact factor, IF) are worth more.

• In China, you get a salary bonus if you publish in a high IF journal -> gaming and predatory behavior

• In the US, it’s where you work

Altmetrics

• Re-building the reputation system for academia

• Collect data about more kinds of outputs

• Make data and code first class objects

• Attribute impact to the object, not the top-level container (journal or institution)

• We don’t know what the data mean, yet.
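To make the container-versus-object point concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the item IDs, "Journal X", the "readers" signal, and all the counts are made up, and it is not how any altmetrics provider computes its numbers.

```python
from statistics import mean

# Hypothetical outputs with made-up counts; "readers" stands in for any
# non-citation signal one might collect.
items = [
    {"id": "article-1", "container": "Journal X", "kind": "article",
     "citations": 120, "readers": 300},
    {"id": "article-2", "container": "Journal X", "kind": "article",
     "citations": 2, "readers": 10},
    {"id": "dataset-1", "container": "Journal X", "kind": "dataset",
     "citations": 4, "readers": 250},
]


def container_score(items, container):
    """Container-level view: one average citation count shared by every item."""
    return mean(i["citations"] for i in items if i["container"] == container)


def item_scores(items):
    """Item-level view: each output, including data, keeps its own signals."""
    return {
        i["id"]: {"kind": i["kind"], "citations": i["citations"],
                  "readers": i["readers"]}
        for i in items
    }


journal_level = container_score(items, "Journal X")
object_level = item_scores(items)
```

The container average hides the fact that one article accounts for nearly all the citations, and that the dataset is heavily read but rarely cited. The sketch deliberately does not combine the signals into a single score, since what they mean is still an open question.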

Increasing age of grant awardees

• No one is really working on this problem

• Grant agencies don’t know how to make data-driven decisions, because they don’t have enough good data.

• This is a hard problem.

Making data and code citable

• DataCite

• CrossRef

• CC4 (the Creative Commons 4.0 licenses) has new additions to make things work better
  – clearer CC-BY attribution guidelines
  – handles sui generis database rights (for EU)
  – makes it easier for publisher and user

Reproducibility Issues

• Code isn’t produced with high quality

• Analyses are hard to re-run

• Code is hard to share

• Data sets are hard to re-use
  – rights issues and provenance and context
  – Does that dataset mean what I think it means?

Author Disambiguation

• If you want item-level credit to accrue to researchers, you need to be able to tell them apart
  – Y. Wang had 3,926 publications in 2011
  – ORCID is working on this, but it’s a hard problem.
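A minimal sketch of why name strings alone can't carry item-level credit, and what a persistent identifier changes. The records, titles, and identifiers below are invented for illustration; the IDs are placeholders, not real ORCID iDs, and the grouping is not ORCID's actual method.

```python
from collections import defaultdict

# Hypothetical records: (name as printed, placeholder persistent ID, title).
records = [
    ("Y. Wang", "id-0001", "Paper A"),
    ("Y. Wang", "id-0002", "Paper B"),
    ("Yan Wang", "id-0001", "Paper C"),
    ("Y. Wang", "id-0002", "Paper D"),
]


def group_titles(records, key_index):
    """Group paper titles by the chosen field (0 = name, 1 = identifier)."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key_index]].append(record[2])
    return dict(groups)


by_name = group_titles(records, 0)  # merges two people, splits one person
by_id = group_titles(records, 1)    # one bucket per researcher
```

Grouping by name puts papers from two different researchers under "Y. Wang" and files one researcher's work under two spellings. Grouping by identifier separates them cleanly, but only because every record here already carries an ID; assigning those IDs to existing literature is the hard part.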

Recommender Systems

• Better data for funding agencies and publishers and researchers
  – Mendeley hosted a Recommender Systems Workshop, active on this problem

• Moving from descriptive to predictive stats about research.

Non-problems

• Social networks for researchers
  – Researchers use Mendeley, Twitter, LinkedIn, FB to some degree

• A place for people to comment on articles
