www.cineca.it
Integrate external services in DSpace submission process
How to make self-deposit easy and improve metadata quality and presence of full-text
Andrea Bollini – Susanna Mornati
Topics
⁄ Some context:
  ⁄ CINECA, a brief overview
  ⁄ DSpace as part of a CRIS solution
⁄ Integration of external services:
  ⁄ Bibliographic databases: Scopus, PubMed, CrossRef, arXiv, etc.
  ⁄ Publisher policies: Sherpa/Romeo
⁄ Make the repository an active actor:
  ⁄ Discovering missing content
  ⁄ Improving fulltext presence
www.cineca.it | Integrate external services in DSpace submission process | OR2013 | July 2013
The Company (as of last week!)
⁄ Interuniversity Consortium
⁄ Non-profit
⁄ Founded in 1969
⁄ Headquarters in Bologna
⁄ 57 Members:
  ⁄ 54 Universities
  ⁄ 2 Research institutes
  ⁄ MIUR
⁄ Owned companies: Kion, SCS
⁄ Employees: 400 (+150 Kion)
⁄ Total turnover: 70M€
The Merge
⁄ The “merging process” of the three Italian Consortia started in September 2012
⁄ It was concluded on July 1st, 2013 (last week!)
2.0
⁄ 67 Members
⁄ More than 700 employees (+ 150 Kion)
⁄ The only Italian Interuniversity Consortium
What CINECA does
Higher Education
• Solutions & Services for the University Administration
• Services for the Ministry of Education, University and Research (MIUR)
Scientific Research
• High Performance Computing (FERMI: 2nd in the EU, 7th worldwide)
• Scientific Visualization & Interactive Virtual Environments
Technological Innovation
• Data Center
• Information and Knowledge Management Services
• Health Care Systems
How we work with Universities
• Cineca Board of Directors
• Product Managers Board
• U-GOV & SURplus Restricted Board
• Customer Service Board
• Technical & Delivery Board
• Apps Road Map
• Tech Road Map
• University Customers
• Focus Groups
• Cineca Technical Board
• Requirements
Solutions for HE
Legend: ERP, Best of Breed
AU Authentication
GW Gateway
SURplus: CINECA’s CRIS System
⁄ An interoperable infrastructure made of different components
⁄ Ingestion of data from any legacy system adopted by an institution
⁄ Maintenance of specific functional requirements, data model and preferred technologies at the level of applications
⁄ Data warehouse and Business Intelligence tools to facilitate aggregations of data and the application of measurement parameters and algorithms
SURplus: Dimension
⁄ Beginning of activities: 2004
⁄ 9 institutions
⁄ 22 institutional repositories
⁄ Total modules: 77
DSpace: SURplus’ Open Archive Module
The OA Module, developed on DSpace:
⁄ Manages collection and dissemination of research results
⁄ Simplifies data collection processes
⁄ Service integration
CINECA is a registered service provider at DuraSpace
Long-term collaboration with the DSpace community, since 2003
Upgrades are periodically released to the open source community
DSpace-CRIS: SURplus’ Expertise & Skills
DSpace-CRIS: designed together with the University of Hong Kong & released as open source
“dissemination of entities’ descriptions in the research environment which go beyond publications”
IR as part of a CRIS system: what changes?
⁄ Benefits:
  ⁄ Strong deposit mandate
  ⁄ More funding
⁄ Issues to mitigate:
  ⁄ The IR becomes a critical application
  ⁄ Authors perceive deposit as a “requirement”
Wasting time, late submission
Professional support, HA infrastructure, dedicated team
Advocacy
Make the submission process easy
The information already exists in other databases!
New first submission step
Available providers: each provider is a Spring service
Free search form
Main metadata common to all publication types (article, book, etc.):
⁄ Title of the contribution
⁄ Year
⁄ Authors/Editors
New first submission step
Lookup by unique identifier
Each provider declares which identifiers it is able to manage
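The dispatch implied here, where each provider declares the identifiers it can manage, can be sketched in Java as follows. This is a minimal illustration only: the `IdentifierProvider` interface, its methods, and the identifier types assigned to each provider are assumptions made for the example, not the interface used in the actual DSpace code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: interface, method names and identifier types are
// illustrative assumptions, not the real DSpace submission-lookup API.
public class IdentifierDispatchSketch {

    /** A provider declares the identifier types it is able to resolve. */
    public interface IdentifierProvider {
        String name();
        List<String> supportedIdentifiers();
    }

    /** Convenience factory for a provider with a fixed list of identifier types. */
    public static IdentifierProvider provider(final String name, final String... identifiers) {
        return new IdentifierProvider() {
            @Override
            public String name() { return name; }

            @Override
            public List<String> supportedIdentifiers() { return Arrays.asList(identifiers); }
        };
    }

    /** Returns the names of the providers that declared support for the identifier type. */
    public static List<String> providersFor(String identifierType, List<IdentifierProvider> providers) {
        List<String> matching = new ArrayList<>();
        for (IdentifierProvider p : providers) {
            if (p.supportedIdentifiers().contains(identifierType)) {
                matching.add(p.name());
            }
        }
        return matching;
    }

    public static void main(String[] args) {
        List<IdentifierProvider> providers = Arrays.asList(
                provider("pubmed", "pmid", "doi"),
                provider("arxiv", "arxivid", "doi"),
                provider("crossref", "doi"));
        System.out.println(providersFor("doi", providers));
        System.out.println(providersFor("pmid", providers));
    }
}
```

With this shape, the lookup form only needs to ask the registered providers which identifier fields to display, and a new provider can be added without touching the form.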
New first submission step
For each result, the providers that match the record are shown.
Grouping is done via DOI
Modal box publication details
Records from different providers are merged to get richer metadata
The system guesses a collection for the submission, but the user can change it if required
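Grouping results by DOI and merging the records returned by different providers could look roughly like the Java sketch below. The `ProviderRecord` class, the field names, and the merge rule (the first non-empty value for a field wins) are assumptions chosen for illustration; the slides do not say how conflicts between providers are resolved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: record structure and merge rule are assumptions.
public class DoiMergeSketch {

    /** One record as returned by a single provider: provider name plus field values. */
    public static class ProviderRecord {
        public final String provider;
        public final Map<String, String> fields;

        public ProviderRecord(String provider, Map<String, String> fields) {
            this.provider = provider;
            this.fields = fields;
        }
    }

    /** Builds a field map from alternating key/value arguments. */
    public static Map<String, String> fields(String... keyValues) {
        Map<String, String> m = new LinkedHashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            m.put(keyValues[i], keyValues[i + 1]);
        }
        return m;
    }

    /** Groups records by lower-cased DOI; records without a DOI are skipped in this sketch. */
    public static Map<String, List<ProviderRecord>> groupByDoi(List<ProviderRecord> records) {
        Map<String, List<ProviderRecord>> groups = new LinkedHashMap<>();
        for (ProviderRecord r : records) {
            String doi = r.fields.get("doi");
            if (doi == null || doi.trim().isEmpty()) {
                continue;
            }
            String key = doi.trim().toLowerCase(Locale.ROOT);
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(r);
        }
        return groups;
    }

    /** Merges one group: the first non-empty value seen for each field wins. */
    public static Map<String, String> merge(List<ProviderRecord> group) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (ProviderRecord r : group) {
            for (Map.Entry<String, String> e : r.fields.entrySet()) {
                if (e.getValue() != null && !e.getValue().isEmpty()) {
                    merged.putIfAbsent(e.getKey(), e.getValue());
                }
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<ProviderRecord> records = Arrays.asList(
                new ProviderRecord("pubmed", fields("doi", "10.1000/ABC", "title", "A title")),
                new ProviderRecord("scopus", fields("doi", "10.1000/abc", "volume", "12")));
        for (List<ProviderRecord> group : groupByDoi(records).values()) {
            System.out.println(merge(group));
        }
    }
}
```

The DOI is lower-cased before grouping because DOIs are case-insensitive; the merged record keeps the spelling supplied by the first provider.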
Manual submission
When the lookup fails, the user can always proceed manually
Batch import from external source
Import data (identifiers or structured text) can be entered manually or uploaded as a file
The format/provider must be specified by the user
Batch import from external source
⁄ Requests are processed:
  ⁄ Inline, for specific providers and/or within configured data limits: the submitter can immediately complete the pre-filled submissions
  ⁄ In a background process: the submitter receives a summary email with the import result, and the pre-filled submissions are available as in-progress submissions in MyDSpace
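The choice between inline and background processing can be sketched as a small policy object. This is a hedged Java illustration: the class name, the record-count limit, and the reading of "and/or" as "both conditions must hold" are assumptions; the actual configuration keys and rules are not given in the slides.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: class, field and provider names are illustrative assumptions.
public class ImportModeSketch {

    public enum Mode { INLINE, BACKGROUND }

    private final Set<String> inlineProviders;
    private final int maxInlineRecords;

    public ImportModeSketch(Set<String> inlineProviders, int maxInlineRecords) {
        this.inlineProviders = new HashSet<>(inlineProviders);
        this.maxInlineRecords = maxInlineRecords;
    }

    /** Inline only when the provider allows it AND the request is within the limit (assumed rule). */
    public Mode choose(String provider, int recordCount) {
        boolean allowed = inlineProviders.contains(provider);
        boolean withinLimit = recordCount <= maxInlineRecords;
        return (allowed && withinLimit) ? Mode.INLINE : Mode.BACKGROUND;
    }

    public static void main(String[] args) {
        ImportModeSketch policy =
                new ImportModeSketch(new HashSet<>(Arrays.asList("pubmed", "crossref")), 20);
        System.out.println(policy.choose("pubmed", 5));    // small request, inline-capable provider
        System.out.println(policy.choose("pubmed", 500));  // too many records for inline processing
        System.out.println(policy.choose("scopus", 5));    // provider not enabled for inline processing
    }
}
```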
The legacy batch import feature for JSPUI has already been shared as a pull request on GitHub, see DS-1252
Enhanced Describe step: showing metadata source
Technical details
Translation logic (original → normalized):
⁄ PubMed Lookup Provider: PubMed record → Java bean → mapping file
⁄ arXiv Lookup Provider: arXiv record → Java bean → mapping file
⁄ Scopus Lookup Provider: Scopus record → Java bean → mapping file
⁄ …
Normalized record
⁄ Enhancer plugins: split and aggregate fields, derive data (ISSN → journal title, …)
Translation logic (normalized → repository):
⁄ Mapping file → DSpace Item
<bean name="pubmedService" class="...service.PubmedService"/>
<bean name="pubmedLookupProvider" class="...lookup.PubmedLookupProvider">
    <property name="pubmedService" ref="pubmedService"/>
</bean>
public class PubmedLookupProvider extends ConfigurableLookupProvider
public abstract class ConfigurableLookupProvider implements SubmissionLookupProvider
public class PubmedItem {
    private String pubmedID;
    private String doi;
    private String issn;
    private String eissn;
    private String journalTitle;
    private String title;
    private String pubblicationModel;
    private String year;
    private String volume;
    private String issue;
    private String language;
    private List<String> type;
    private List<String> primaryKeywords;
    private List<String> secondaryKeywords;
    …
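The two translation steps described above, a mapping from provider fields to normalized fields followed by enhancer plugins that derive extra data, can be sketched in Java as below. Everything here is an assumption made for illustration: the mapping is held in a plain `Map` rather than a mapping file, the metadata field names (`dc.title`, `dc.identifier.issn`, `dc.relation.ispartof`) are only plausible targets, and the ISSN-to-title table is invented sample data.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical sketch: the mapping format, field names and enhancer shape are assumptions.
public class NormalizeSketch {

    /** Applies a mapping (provider field -> metadata field); unmapped fields are dropped. */
    public static Map<String, String> normalize(Map<String, String> providerRecord,
                                                Map<String, String> mapping) {
        Map<String, String> normalized = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : providerRecord.entrySet()) {
            String target = mapping.get(e.getKey());
            if (target != null && e.getValue() != null) {
                normalized.put(target, e.getValue());
            }
        }
        return normalized;
    }

    /** Example enhancer: derives the journal title from the ISSN when the title is missing. */
    public static UnaryOperator<Map<String, String>> journalTitleFromIssn(
            final Map<String, String> issnToTitle) {
        return record -> {
            Map<String, String> out = new LinkedHashMap<>(record); // leave the input untouched
            String issn = out.get("dc.identifier.issn");
            if (issn != null && !out.containsKey("dc.relation.ispartof")
                    && issnToTitle.containsKey(issn)) {
                out.put("dc.relation.ispartof", issnToTitle.get(issn));
            }
            return out;
        };
    }

    public static void main(String[] args) {
        Map<String, String> pubmedRecord = new LinkedHashMap<>();
        pubmedRecord.put("title", "A sample article");
        pubmedRecord.put("issn", "1234-5679");
        pubmedRecord.put("pubmedID", "0000000");

        Map<String, String> mapping = new LinkedHashMap<>();
        mapping.put("title", "dc.title");
        mapping.put("issn", "dc.identifier.issn");

        Map<String, String> normalized = normalize(pubmedRecord, mapping);
        Map<String, String> enhanced = journalTitleFromIssn(
                Collections.singletonMap("1234-5679", "Example Journal")).apply(normalized);
        System.out.println(enhanced);
    }
}
```

Keeping the enhancers as independent functions over the normalized record means the same enhancer can serve every provider, which matches the idea of a single normalization layer between the provider beans and the repository item.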
Enhanced upload step
Using the ISSN or EISSN provided in the describe step, the upload form is enhanced to show the publisher policy from the Sherpa/Romeo database on the right side
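Before a policy lookup can be attempted, the ISSN or EISSN from the describe step has to be usable. The Java sketch below normalizes an ISSN and verifies its check digit using the standard ISSN algorithm (weights 8 down to 2 over the first seven digits, modulo 11, with `X` standing for 10). The preference for the print ISSN over the EISSN is an assumption, and the call to Sherpa/Romeo itself is deliberately left out because its API is not described in the slides.

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical sketch: only the ISSN check-digit rule is standard; the
// print-before-electronic preference is an assumption for illustration.
public class IssnSketch {

    /** Normalizes "12345679" or "1234-5679" to "1234-5679"; empty if malformed or invalid. */
    public static Optional<String> normalize(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        String s = raw.trim().toUpperCase(Locale.ROOT).replace("-", "");
        if (!s.matches("\\d{7}[\\dX]") || !hasValidCheckDigit(s)) {
            return Optional.empty();
        }
        return Optional.of(s.substring(0, 4) + "-" + s.substring(4));
    }

    /** ISSN check: weights 8..2 on the first seven digits, modulo 11, 'X' means 10. */
    static boolean hasValidCheckDigit(String eightChars) {
        int sum = 0;
        for (int i = 0; i < 7; i++) {
            sum += (eightChars.charAt(i) - '0') * (8 - i);
        }
        int check = (11 - sum % 11) % 11;
        char expected = (check == 10) ? 'X' : (char) ('0' + check);
        return eightChars.charAt(7) == expected;
    }

    /** Picks the identifier to use for the policy lookup: ISSN first, then EISSN. */
    public static Optional<String> issnForPolicyLookup(String issn, String eissn) {
        Optional<String> print = normalize(issn);
        return print.isPresent() ? print : normalize(eissn);
    }

    public static void main(String[] args) {
        System.out.println(issnForPolicyLookup("03785955", null));
        System.out.println(issnForPolicyLookup("not-an-issn", "2049-3630"));
        System.out.println(issnForPolicyLookup(null, null));
    }
}
```

Rejecting malformed identifiers early avoids a pointless remote call and lets the upload form simply hide the policy panel when no valid ISSN is available.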
Enhanced upload step
Access policy for the bitstream: open access, embargo, intranet, etc.
Deposit of fulltext to the national database for individual CVs
What is the problem?
(Very) late submissions cause issues for the repository at both the technical and the organizational level:
/ The system is subjected to periods of intense input activity. DSpace, like IR software in general, scales well for read operations but less well for write operations
/ IR staff involved in the workflow get a lot of tasks to perform in a short period
Make researchers aware
Remind researchers about the IR's presence
Intercept new content early
How do we plan to mitigate the problem?
Citation databases provide APIs to perform searches (we already use them for the lookup), and in some cases they provide additional APIs or search filters/indexes that allow more refined searches and scanning of the database. The interesting filters/indexes are:
/ Time based (much better if related to insertion in the citation database)
/ Author ID (better if related to a «standard/common» identifier such as ORCID)
/ Affiliation
/ Subject category
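How the filters listed above narrow down a scan can be illustrated with a small Java sketch that applies them to records already fetched from a citation database. The `Candidate` class and its fields are hypothetical; a real integration would push these filters into the provider's own query syntax, which differs per database and is not covered here.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: record fields and filter semantics are illustrative assumptions.
public class ScanFilterSketch {

    /** A record found in a citation database. */
    public static class Candidate {
        public final String title;
        public final LocalDate addedToDatabase;
        public final String authorId;
        public final String subject;

        public Candidate(String title, LocalDate addedToDatabase, String authorId, String subject) {
            this.title = title;
            this.addedToDatabase = addedToDatabase;
            this.authorId = authorId;
            this.subject = subject;
        }
    }

    /** Keeps records added after the last scan for one author, optionally limited to a subject. */
    public static List<Candidate> select(List<Candidate> candidates, LocalDate lastScan,
                                         String authorId, String subjectOrNull) {
        List<Candidate> selected = new ArrayList<>();
        for (Candidate c : candidates) {
            boolean isNew = c.addedToDatabase.isAfter(lastScan);
            boolean sameAuthor = authorId.equals(c.authorId);
            boolean subjectOk = subjectOrNull == null || subjectOrNull.equals(c.subject);
            if (isNew && sameAuthor && subjectOk) {
                selected.add(c);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<Candidate> found = Arrays.asList(
                new Candidate("Old paper", LocalDate.of(2013, 1, 10), "author-1", "physics"),
                new Candidate("New paper", LocalDate.of(2013, 6, 20), "author-1", "physics"),
                new Candidate("Someone else", LocalDate.of(2013, 6, 21), "author-2", "physics"));
        for (Candidate c : select(found, LocalDate.of(2013, 6, 1), "author-1", null)) {
            System.out.println(c.title);
        }
    }
}
```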
Implementation idea
Allow the researcher to store personal preferences about scanning:
/ Enabled providers (e.g. disable arXiv if you are not a physicist)
/ Frequencies
/ Subject category filters
Author IDs will be stored in and retrieved from the researcher profile. Subject categories could be proposed from previous items or from the researcher profile.
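A per-researcher preference object of the kind described above might be sketched in Java as follows. Since the slide presents this as an implementation idea, none of the names below come from existing code: the class, the single shared frequency, and the in-memory record of the last scan are all assumptions.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical sketch of scanning preferences; not taken from DSpace-CRIS.
public class ScanPreferencesSketch {

    private final TreeSet<String> enabledProviders;
    private final Duration frequency;
    private final Map<String, Instant> lastScan = new HashMap<>();

    public ScanPreferencesSketch(Collection<String> enabledProviders, Duration frequency) {
        this.enabledProviders = new TreeSet<>(enabledProviders);
        this.frequency = frequency;
    }

    /** Records that a provider was scanned at the given time. */
    public void markScanned(String provider, Instant when) {
        lastScan.put(provider, when);
    }

    /** Enabled providers never scanned, or last scanned at least one period ago (sorted by name). */
    public List<String> providersDue(Instant now) {
        List<String> due = new ArrayList<>();
        for (String provider : enabledProviders) {
            Instant last = lastScan.get(provider);
            if (last == null || !last.plus(frequency).isAfter(now)) {
                due.add(provider);
            }
        }
        return due;
    }

    public static void main(String[] args) {
        // A researcher who is not a physicist leaves arXiv out of the enabled providers.
        ScanPreferencesSketch prefs =
                new ScanPreferencesSketch(Arrays.asList("scopus", "pubmed"), Duration.ofDays(7));
        Instant now = Instant.parse("2013-07-08T00:00:00Z");
        prefs.markScanned("scopus", now.minus(Duration.ofDays(2)));
        System.out.println(prefs.providersDue(now));
    }
}
```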
DSpace-CRIS: Researcher profile
Who are the potential targets?
⁄ ORCID
⁄ Scopus
⁄ Web of Science
⁄ arXiv
⁄ PubMed Central
⁄ DBLP
⁄ RePEc
The Repository itself!
The repository as source of missing content?
⁄ The submitter has to match the authors of a publication with the University staff to highlight internal authors
  ⁄ Sometimes matches are missing
  ⁄ Other times matches are wrong (homonyms)
⁄ External authors could become «internal» at some point in the future
The repository as source of missing content?
⁄ Send an email to internal «co-authors» when a submission is made: prevents wrong attribution (and reduces duplication)
⁄ Allow researchers to unclaim publications from their profile: last chance to fix a wrong attribution
⁄ Allow researchers to claim publications: fixes missing attribution and/or engages new researchers
The last two features are included in the DSpace-CRIS add-on
Current implementation: claim/unclaim publications in the repository
This is the current status of the publication:
⁄ U: Unlinked
You can claim it:
⁄ A: Active, simple claim
⁄ S: Make it a selected publication
⁄ H: Claim it but hide it from your public profile
Current implementation: claim/unclaim publications in the repository
You can unclaim a publication:
⁄ U: Unlink
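The four status codes shown on these slides (U, A, S, H) map naturally onto an enum. The Java sketch below is an interpretation, not the DSpace-CRIS implementation: the class and method names are hypothetical, and the rule that only active and selected publications appear on the public profile is inferred from the wording of the H option.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical sketch: models the U/A/S/H codes; names and rules are assumptions.
public class ClaimSketch {

    public enum Status {
        UNLINKED('U'), ACTIVE('A'), SELECTED('S'), HIDDEN('H');

        public final char code;

        Status(char code) {
            this.code = code;
        }

        public boolean isClaimed() {
            return this != UNLINKED;
        }

        /** Assumed rule: hidden and unlinked publications are not shown publicly. */
        public boolean shownOnPublicProfile() {
            return this == ACTIVE || this == SELECTED;
        }
    }

    private final Map<String, Status> byPublication = new HashMap<>();

    /** Claims a publication as active, selected or hidden. */
    public void claim(String publicationId, Status as) {
        if (!as.isClaimed()) {
            throw new IllegalArgumentException("Use unclaim() to unlink a publication");
        }
        byPublication.put(publicationId, as);
    }

    /** Unclaims a publication, fixing a wrong attribution. */
    public void unclaim(String publicationId) {
        byPublication.put(publicationId, Status.UNLINKED);
    }

    public Status statusOf(String publicationId) {
        return byPublication.getOrDefault(publicationId, Status.UNLINKED);
    }

    /** Publication ids visible on the researcher's public profile, sorted. */
    public TreeSet<String> publicProfile() {
        TreeSet<String> visible = new TreeSet<>();
        for (Map.Entry<String, Status> e : byPublication.entrySet()) {
            if (e.getValue().shownOnPublicProfile()) {
                visible.add(e.getKey());
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        ClaimSketch profile = new ClaimSketch();
        profile.claim("item-1", Status.ACTIVE);
        profile.claim("item-2", Status.HIDDEN);
        profile.claim("item-3", Status.SELECTED);
        profile.unclaim("item-1");
        System.out.println(profile.publicProfile());
    }
}
```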
Improve fulltext presence
⁄ Use the Sherpa/Romeo policy database to analyze repository content
⁄ Use external database APIs to find an actual fulltext (arXiv, PubMed, ... why not the publisher version via library subscription?)
⁄ Send an email to researchers to validate the PDFs found or to ask for an «author» version
⁄ Use statistics to encourage upload
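The analysis outlined in the list above can be sketched in Java: pick the items whose journal policy allows self-archiving but which still lack a fulltext, and compute the share of items that have one. The `Item` class is hypothetical, and treating only the "green" Sherpa/Romeo colour as a trigger for a reminder is an assumption made to keep the example short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: item fields and the "green only" rule are assumptions.
public class FulltextCandidatesSketch {

    public static class Item {
        public final String id;
        public final boolean hasFulltext;
        public final String romeoColour; // null when the journal is not found in Sherpa/Romeo

        public Item(String id, boolean hasFulltext, String romeoColour) {
            this.id = id;
            this.hasFulltext = hasFulltext;
            this.romeoColour = romeoColour;
        }
    }

    /** Ids of items without fulltext whose journal policy is green: candidates for a reminder email. */
    public static List<String> reminderCandidates(List<Item> items) {
        List<String> ids = new ArrayList<>();
        for (Item item : items) {
            if (!item.hasFulltext && "green".equals(item.romeoColour)) {
                ids.add(item.id);
            }
        }
        return ids;
    }

    /** Percentage of items that have a fulltext; 0 for an empty repository. */
    public static double fulltextPercentage(List<Item> items) {
        if (items.isEmpty()) {
            return 0.0;
        }
        int withFulltext = 0;
        for (Item item : items) {
            if (item.hasFulltext) {
                withFulltext++;
            }
        }
        return 100.0 * withFulltext / items.size();
    }

    public static void main(String[] args) {
        List<Item> items = Arrays.asList(
                new Item("item-1", true, "green"),
                new Item("item-2", false, "green"),
                new Item("item-3", false, "white"),
                new Item("item-4", false, null));
        System.out.println(reminderCandidates(items));
        System.out.println(fulltextPercentage(items) + "%");
    }
}
```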
127,000+ items
65,000+ items
9.4% 17.2%
Sherpa/Romeo Statistics (Example)
⁄ 51% ISSN
⁄ 36% not in Sherpa (24,000 items)
⁄ 7.3% have a fulltext…
⁄ 5.3% open access
⁄ 32% green (21,000 items)
www.cineca.it | Innovative Open Source Technologies for a CRIS: SURplus | euroCRIS | May 2013
SURplus: forecast for 2014
⁄ 50+ institutional repositories (DSpace)
⁄ 10 research portals (DSpace-CRIS)
Thank you!
Andrea Bollini
SURplus - http://www.cineca.it/en/content/surplus
DSpace-CRIS - http://cilea.github.com/dspace-cris