CORE: Aggregating and Enriching Content to Support Open Access

Petr Knoth, The Open University

Outline

1. Aggregating Open Access (OA) publications – why, how, what for?

2. The CORE system

3. Supporting research in mining databases of scientific publications (DiggiCORE)

Outline

Growth of items in Open Access repositories

Growth of Open Access repositories

Growth of articles in OA journals

Growth of OA journals

Green Open Access - statistics

Why do we need aggregations?

“Each individual repository is of limited value for research: the real power of Open Access lies in the possibility of connecting and tying together repositories, which is why we need interoperability. In order to create a seamless layer of content through connected repositories from around the world, Open Access relies on interoperability, the ability for systems to communicate with each other and pass information back and forth in a usable format. Interoperability allows us to exploit today's computational power so that we can aggregate, data mine, create new tools and services, and generate new knowledge from repository content.”

[COAR manifesto]

Access to information according to the level of abstraction

[Diagram: repositories feed an aggregation made up of the layers Metadata Transfer Interoperability, Metadata, Content, Semantic Enrichment and Interfaces; the aggregation offers raw data access, transaction information access and analytical information access.]

Who should be supported by aggregations?

The following user groups (divided according to the level of abstraction of information they need):

• Raw data access. Developers, DLs, DL researchers, companies …
• Transaction information access. Researchers, students, life-long learners …
• Analytical information access. Funders, government, business intelligence

Layers of an aggregation system

• Metadata Transfer Interoperability: OAI-PMH, OAI-ORE …
• Metadata: Dublin Core, XML, RDF …
• Content: PDF, Word …
• Enrichment: annotations, catalog records, statistics
• Interfaces (OLTP, OLAP): APIs (REST, SOAP, XML-RPC), UIs, dashboards

Related systems

Aggregation projects – BASE

Aggregation projects – OAIster/WorldCat

Aggregation projects – RepUK

Aggregations need access to content, not just metadata!

• Certain metadata types can be created only at the level of the aggregation
• Certain metadata can change over time
• Ensuring content:
  • accessibility
  • availability
  • validity
  • quality
  • …

Aggregation projects – CiteSeerX

Should an aggregation system support all three user types?

Can be realised by more than one system, provided that the dataset is the same!

Outline

CORE objectives

• CORE aims to provide a comprehensive technical infrastructure for Open Access scholarly publications that will support access and reuse of scholarly materials at different levels of abstraction.

• A nation-wide aggregation system that will improve the discovery of publications stored in British Open Access Repositories (OARs).

What does CORE provide at different aggregation levels?

CORE functionality

Step 1: Metadata and full-text harvesting

Content harvesting, processing
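The metadata side of this harvesting step can be illustrated with a minimal sketch. It assumes a repository exposes Dublin Core records over OAI-PMH (both named on the layers slide); the sample response, the `parse_records` helper and the rule of treating `dc:identifier` values ending in `.pdf` as full-text links are illustrative assumptions, not a description of how CORE itself is implemented.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical ListRecords response. A real harvester would fetch this from a
# repository endpoint using verb=ListRecords and metadataPrefix=oai_dc.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example paper</dc:title>
          <dc:creator>A. Author</dc:creator>
          <dc:identifier>http://example.org/papers/1.pdf</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def parse_records(xml_text):
    """Extract id, titles, creators and candidate full-text links per record."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for record in root.iterfind(".//oai:record", NS):
        identifiers = [e.text for e in record.iterfind(".//dc:identifier", NS)]
        records.append({
            "id": record.findtext("oai:header/oai:identifier", namespaces=NS),
            "titles": [e.text for e in record.iterfind(".//dc:title", NS)],
            "creators": [e.text for e in record.iterfind(".//dc:creator", NS)],
            # Assumption: a dc:identifier ending in .pdf points at the full text.
            "full_text_links": [
                i for i in identifiers if i and i.lower().endswith(".pdf")
            ],
        })
    return records


if __name__ == "__main__":
    for rec in parse_records(SAMPLE_RESPONSE):
        print(rec["id"], rec["titles"], rec["full_text_links"])
```

The links collected in `full_text_links` are what a content harvester would download next, which is the point at which an aggregation moves from metadata only to content.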

Semantic similarity, citation extraction, classification, …

CORE functionality

Step 2: Semantic enrichment

Semantic enrichment
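One enrichment named in the slides is semantic similarity between papers. The slides do not say which method CORE uses, so the following is only a sketch of one common approach, TF-IDF weighting with cosine similarity, written against the standard library; the function names and the toy documents are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (term -> weight) per document."""
    counts = [Counter(tokenize(d)) for d in docs]
    n_docs = len(docs)
    doc_freq = Counter()
    for c in counts:
        doc_freq.update(c.keys())
    # Smoothed IDF, so terms that occur in every document keep a positive weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in c.items()} for c in counts]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    docs = [
        "open access repositories aggregate metadata",
        "aggregating open access repositories and metadata",
        "citation behaviour of chemistry researchers",
    ]
    vecs = tfidf_vectors(docs)
    print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
```

Because the similarity is computed from the text of the papers, this kind of metadata can only be produced where the full text is available, that is, at the level of the aggregation.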

CORE functionality

Step 3: Providing a set of services on top of the aggregation

Providing services

CORE applications

• CORE Portal
• CORE Mobile
• CORE Plugin
• CORE API
• Repository Analytics

CORE Applications

CORE Portal – Allows searching and navigating scientific publications aggregated from Open Access repositories

CORE Applications

CORE Mobile – Allows searching and navigating scientific publications aggregated from Open Access repositories

CORE Applications

CORE Plugin – A plugin for systems that provides recommendations for related items.

CORE Applications

CORE API – Enables external systems and services to interact with the CORE repository.

CORE Applications

Repository Analytics – An analytical tool supporting providers of open access content (in particular repository managers).

Repository Analytics

CORE API

CORE Portal, CORE Mobile, CORE Plugin

CORE statistics

• Content
  • 5.4M records
  • 192 repositories
  • 402k full-texts
• Started: February 2011
• Budget: £140k

Outline

3. Supporting research in mining databases of scientific publications (DiggiCORE)

Partners

Advisory Board

Objective

Software for exploration and analysis of very large and fast-growing amounts of research publications stored across Open Access Repositories (OAR).

DiggiCORE networks

Three networks: (a) semantically related papers, (b) citation network, (c) author citation network
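As a rough illustration of networks (b) and (c), the sketch below builds a paper citation network from (citing, cited) pairs and derives a weighted author citation network from it. The toy data, the function names and the choice to skip self-citations are assumptions made for this example; the slides do not describe how DiggiCORE constructs its networks.

```python
from collections import defaultdict

# Hypothetical toy data: the authors of each paper, and (citing, cited) pairs.
PAPER_AUTHORS = {
    "p1": ["Alice"],
    "p2": ["Bob", "Carol"],
    "p3": ["Alice", "Bob"],
}
CITATIONS = [("p2", "p1"), ("p3", "p1"), ("p3", "p2")]


def build_citation_network(citations):
    """Map each citing paper to the set of papers it cites."""
    network = defaultdict(set)
    for citing, cited in citations:
        network[citing].add(cited)
    return dict(network)


def build_author_citation_network(citations, paper_authors):
    """Count how often each author cites each other author.

    Every author of a citing paper is linked to every author of the cited
    paper. Self-citations are skipped (an assumption of this sketch).
    """
    weights = defaultdict(int)
    for citing, cited in citations:
        for source in paper_authors.get(citing, []):
            for target in paper_authors.get(cited, []):
                if source != target:
                    weights[(source, target)] += 1
    return dict(weights)


if __name__ == "__main__":
    print(build_citation_network(CITATIONS))
    print(build_author_citation_network(CITATIONS, PAPER_AUTHORS))
```

Questions about high-impact authors or cross-disciplinary influence then become graph queries over these structures, for example ranking authors by their total incoming citation weight.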

DiggiCORE objectives

Allow researchers to use this platform to analyse publications. Why?
• To identify patterns in the behaviour of research communities
• To detect trends in research disciplines
• To gain new insights into the citation behaviour of researchers
• To discover features that distinguish papers with high impact

Questions the system can help answer

• What are the attributes of high-impact publications?
• Do these attributes differ in the humanities, social sciences and computer sciences?
• What are the features of research groups within disciplines and how do these features relate to contributions generated by the group?

• What are the attributes of high-impact authors and what is their role within the group?

• What are the dynamics of successful research groups?

Questions the system can help answer

• What is the mechanism of cross-fertilisation within disciplines, especially between the humanities and the sciences?

• Who are the authors whose work is worth monitoring because they contribute to the achievements of their own discipline and also inspire other disciplines?

• How should the novice in the discipline get acquainted with key achievements in the discipline?

• How should he/she search for the most important publications?

Summary

• The rapid growth of OA content provides both an opportunity and a challenge.
• Aggregations should serve the needs of different user groups.
• Aggregations need to aggregate content, not just metadata.
• We can have many services that are part of the infrastructure, but they should all work with the same data.

Thank you!

Yes we can!
