Developing Infrastructure to Support Closer Collaboration of Aggregators with Open Repositories

Dr. Nancy Pontika & Dr. Petr Knoth

COnnecting Repositories (CORE)

Open University, UK

LIBER 2015, 24 – 26 June, London

Mission of CORE

Aggregate all open access content distributed across different systems worldwide, enrich this content and provide access to it through a set of services …

[Source: http://core.ac.uk/about#mission]

Need for a UK aggregator

Bringing the UK’s open access research outputs together:
• Feasibility study commissioned by Jisc, published June 2014
• Referred to as “Open Mirror”

[Source: https://repository.jisc.ac.uk/5570/1/JISC_REPORT_open_mirror_090514_FINAL_WEB.pdf]

Three levels of support

Programmable Data Access
- Services: CORE API, CORE Data Dumps
- Users: Researchers, Developers, Companies

Transaction Information Access
- Services: CORE Portal, CORE Mobile, CORE Plugin
- Users: Researchers, Students, Life long learners

Analytical Information Access
- Services: CORE Policy, CORE Compliance Analytics, CORE Dashboard
- Users: Funders, Governments, Data Providers

[Source: http://www.dlib.org/dlib/november12/knoth/11knoth.html]

CORE Statistics
• Content: 20M+ records, 600+ repositories, 1.8M+ full-texts
• The UK national aggregator - Jisc
• Full-text aggregator (not just metadata)
• Placed among Top 10 search engines for research that go beyond Google [Jisc, 2013]
• Listed among Top 100 Thesis and Dissertation Resources
• Part of Jisc’s Repositories Shared Services Project (RSSP)

Aggregation process
• Metadata download, extraction and cleaning
• Full-text harvesting
• Text extraction
• Language detection
• Extraction of citation references from text
• Identification of related content
• Detection of duplicate items
• Parsing of author names
• Indexing
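Two of the aggregation steps, parsing of author names and detection of duplicate items, can be illustrated with a minimal Python sketch. This is not CORE's implementation: the function names, the record layout (dicts with "id" and "title") and the title-normalisation heuristic are all assumptions made for illustration.

```python
import re
import unicodedata
from collections import defaultdict


def parse_author(raw):
    """Split a raw author string into (family, given).

    Assumes only two input shapes: "Family, Given" and "Given Family".
    """
    raw = " ".join(raw.split())  # collapse stray whitespace
    if "," in raw:
        family, given = (part.strip() for part in raw.split(",", 1))
    else:
        parts = raw.split(" ")
        family, given = parts[-1], " ".join(parts[:-1])
    return family, given


def title_key(title):
    """Normalise a title so trivially different copies compare equal."""
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def find_duplicates(records):
    """Group record ids whose normalised titles match (hypothetical heuristic)."""
    groups = defaultdict(list)
    for record in records:
        groups[title_key(record["title"])].append(record["id"])
    return [ids for ids in groups.values() if len(ids) > 1]
```

Matching on a normalised title alone is deliberately crude; a production aggregator would likely also compare authors, years or identifiers before merging records.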

CORE Applications
• CORE Portal
– Search engine providing open access content
• CORE Mobile
– Android and iOS apps
• CORE Plugin
– For repositories and journals
• CORE API
– Programmable access to millions of resources
• CORE Dashboard
– Tool for repository managers

CORE Dashboard : purpose

• Harvested Records

• Metadata

• Harvesting Process

• Standards

• Repository Managers

• Funders

• Repositories
• Journals

Data Providers

Collaboration

Quality

Transparency

Institution main page

Edit repository information

Invitations

Content

Manage record visibility status
• Take down
• Take up

Update metadata records

• Asynchronous process
• Item is queued in the CORE system
• Record is updated within 12 hours
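The queue-then-update pattern described here can be sketched in a few lines of Python. This is only an illustration of an asynchronous update queue, not how CORE is built: the function names, the in-memory dict standing in for the record store, and the record id are all hypothetical.

```python
import queue
import threading


def request_update(pending, record_id, new_metadata):
    """Enqueue an update and return immediately; the caller does not wait."""
    pending.put((record_id, new_metadata))


def run_update_worker(pending, store, stop):
    """Apply queued updates to the store until asked to stop and the queue is empty."""
    while not stop.is_set() or not pending.empty():
        try:
            record_id, new_metadata = pending.get(timeout=0.1)
        except queue.Empty:
            continue
        store[record_id] = new_metadata
        pending.task_done()


def demo():
    """Queue one update, let a background worker apply it, return the store."""
    pending = queue.Queue()
    store = {"rec-1": {"title": "Old title"}}  # hypothetical record
    stop = threading.Event()
    worker = threading.Thread(target=run_update_worker, args=(pending, store, stop))
    worker.start()
    request_update(pending, "rec-1", {"title": "New title"})
    pending.join()  # wait until the queued update has been applied
    stop.set()
    worker.join()
    return store
```

The point of the design is that the repository manager's request returns at once, while the actual update happens later in the background, which is why a delay (here, up to 12 hours) is part of the contract.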

Statistics

Issues : 3 types

When harvesting your repository/document we encountered an error that we couldn't resolve. These errors need to be fixed in order to harvest your repository/document.

We encountered an error but we were still able to harvest the repository/document. We strongly recommend that these issues are resolved, as they may lead to incompatibility problems in the future.

This may not be a problem, but it may be a sign of misconfiguration or of future incompatibilities.
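One way to model these three issue types is as severity levels. The sketch below is an assumption-laden illustration: the names ERROR, WARNING and NOTICE, the two boolean inputs and the dict layout of an issue are invented here and are not taken from the Dashboard.

```python
from enum import Enum


class Severity(Enum):
    ERROR = "error"      # harvesting failed; must be fixed
    WARNING = "warning"  # harvested anyway; fixing is strongly recommended
    NOTICE = "notice"    # possibly harmless; may hint at misconfiguration


def classify(harvest_succeeded, is_confirmed_problem):
    """Map a harvesting outcome onto one of the three issue types."""
    if not harvest_succeeded:
        return Severity.ERROR
    if is_confirmed_problem:
        return Severity.WARNING
    return Severity.NOTICE


def blocking_issues(issues):
    """Return only the issues that prevent harvesting (hypothetical dict layout)."""
    return [issue for issue in issues if issue["severity"] is Severity.ERROR]
```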

Issues : good news

Issues : bad news…

Issues: Robots.txt
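A repository manager can check locally whether a robots.txt file would block a harvester. The sketch below uses Python's standard urllib.robotparser; the user agent name "ExampleHarvester" and the example.org URLs are placeholders, not CORE's actual crawler identity.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules that hide full-text files from every crawler.
EXAMPLE_ROBOTS = "User-agent: *\nDisallow: /bitstream/\n"

full_text_ok = is_allowed(
    EXAMPLE_ROBOTS,
    "ExampleHarvester",
    "https://repository.example.org/bitstream/1/paper.pdf",
)
landing_page_ok = is_allowed(
    EXAMPLE_ROBOTS,
    "ExampleHarvester",
    "https://repository.example.org/handle/1",
)
```

With rules like these, metadata pages stay reachable while the full-text files are off limits, which is one way a repository can end up harvested for metadata only.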

Issues: Document Issues

Issues: Malformed PDF url
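What counts as a malformed PDF URL in the Dashboard is not specified here. As a rough sketch, a few basic checks with Python's standard urllib.parse catch common cases; the function name and the three rules below are assumptions chosen for illustration.

```python
from urllib.parse import urlparse


def pdf_url_problems(url):
    """Return a list of basic problems found in a full-text URL (heuristic only)."""
    problems = []
    if any(ch.isspace() for ch in url):
        problems.append("contains whitespace")
    parts = urlparse(url.strip())
    if parts.scheme not in ("http", "https"):
        problems.append("missing or unsupported scheme")
    if not parts.netloc:
        problems.append("missing host")
    return problems
```

An empty list means none of these checks failed; it does not guarantee that the URL resolves or that it serves a PDF.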

Dashboard benefits
- Increased and simplified collaboration between aggregators and content providers
- Improved control of the content provider over the harvested content
- Reduction of scepticism and fear of sharing content with other systems
- Improvement of the harvesting process
- Broadening of the open access content discoverability and thus reuse of the open access content where permitted

Would you like to take a look?

Dashboard still in BETA but we welcome volunteer testers

Email me at nancy.pontika@open.ac.uk

Many thanks to…

CORE developers:
• Matteo Cancellieri
• Samuel Pearce
• Drahomira Herrmannova
• Lucas Anastasiou

Volunteer testers:
• Chris Biggs, Metadata & Repository Specialist, Open University
• Nick Sheppard, Repository Developer, Leeds Beckett University

Thank you

Questions

CORE Contacts:
Nancy Pontika nancy.pontika@open.ac.uk
Petr Knoth petr.knoth@open.ac.uk
Website: http://core.ac.uk
Twitter: @oacore
