CDSTAR – Overview
Oliver Schmitt, GWDG
Introduction Object Storage
Definition and function principle
§ Object storage is the intelligent evolution of disk storage. It is about creating, storing and distributing variable-sized data objects, and their associated metadata, rather than simply placing blocks of data on tracks and sectors.
§ Each object has its rich metadata inextricably linked to it, enabling long-term preservation while ensuring data remains safe and accessible over time [1].
Image according to [2]
Block Device
Object Storage
Introduction Object Storage
Definition and function principle
Block Device
Operations: read block, write block
Security: weak, full disk
Allocation: external
Object Storage
Operations: create object, read object, update object, delete object
Security: strong, per object
Allocation: local
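The comparison above can be illustrated with a minimal in-memory sketch. The class and method names (`BlockDevice`, `ObjectStore`, `readers`) are hypothetical and serve only to contrast the two models: the block device exposes whole-block reads and writes at caller-chosen addresses with no access check, while the object store allocates identifiers itself and checks access per object.

```python
from dataclasses import dataclass, field


class BlockDevice:
    """Fixed-size blocks addressed by number; the caller decides where data goes."""

    def __init__(self, block_count: int, block_size: int = 512):
        self.block_size = block_size
        self.blocks = [bytes(block_size) for _ in range(block_count)]

    def read_block(self, index: int) -> bytes:
        return self.blocks[index]

    def write_block(self, index: int, data: bytes) -> None:
        if len(data) != self.block_size:
            raise ValueError("a block device only accepts whole blocks")
        self.blocks[index] = data


@dataclass
class StoredObject:
    data: bytes
    metadata: dict
    readers: set = field(default_factory=set)  # hypothetical per-object access list


class ObjectStore:
    """Variable-sized objects with metadata; the store allocates the identifier."""

    def __init__(self):
        self._objects = {}
        self._next_id = 0

    def create(self, data: bytes, metadata: dict, readers: set) -> str:
        object_id = f"obj-{self._next_id}"
        self._next_id += 1
        self._objects[object_id] = StoredObject(data, dict(metadata), set(readers))
        return object_id

    def read(self, object_id: str, user: str):
        obj = self._objects[object_id]
        if user not in obj.readers:
            raise PermissionError(f"{user} may not read {object_id}")
        return obj.data, obj.metadata

    def update(self, object_id: str, data: bytes) -> None:
        self._objects[object_id].data = data

    def delete(self, object_id: str) -> None:
        del self._objects[object_id]
```

In this sketch, "allocation: local" corresponds to `create` choosing the identifier inside the store, whereas the block device relies on an external layer (for example a file system) to decide which block index to use.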
Introduction Object Storage
Object concept in detail
Object
Bitstream n
Bitstream 1
Metadata
Access Rights
File Attributes
…
/objects /bitstreams
/metadata /accesscontrol
REST Interface Calls
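How the four resource names could map to request URLs can be sketched as follows. Only the names `/objects`, `/bitstreams`, `/metadata` and `/accesscontrol` come from the slide; the host name, the path nesting and the function `resource_url` are assumptions made for illustration and are not taken from the CDSTAR API documentation.

```python
from typing import Optional
from urllib.parse import quote

BASE_URL = "https://cdstar.example.org"  # hypothetical host, not a real endpoint

# Resource names as shown on the slide.
RESOURCES = ("objects", "bitstreams", "metadata", "accesscontrol")


def resource_url(resource: str, object_id: str,
                 bitstream_id: Optional[str] = None) -> str:
    """Build a URL for one aspect of an object (assumed layout: /<resource>/<object-id>)."""
    if resource not in RESOURCES:
        raise ValueError(f"unknown resource: {resource}")
    url = f"{BASE_URL}/{resource}/{quote(object_id, safe='')}"
    if bitstream_id is not None:
        if resource != "bitstreams":
            raise ValueError("only bitstreams take a bitstream id")
        url += f"/{quote(bitstream_id, safe='')}"
    return url
```

For example, `resource_url("metadata", "abc")` addresses the metadata of object `abc`, and `resource_url("bitstreams", "abc", "1")` addresses one of its bitstreams under the assumed layout.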
Object Storage Usage Scenario
Definition and function principle
GWDG-CDSTAR
Web application Desktop Applications Mobile Applications
Metadata Research Data
Portal-Environments Middleware
Object Storage Amazon – Simple Storage Service (S3)
Definition and function principle
§ Amazon S3 is a web service for online file storage
§ Technical details not publicly disclosed
§ Access via REST / SOAP
§ S3 REST-API used by other projects
§ Service properties:
§ File size up to 5 TB
§ Metadata up to 2 KB
§ Partial coverage of long-term archiving
§ Costs (First TB):
§ Storage ~ 30 Cent / GB / Year
§ Archive ~ 12 Cent / GB / Year
public cloud
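The quoted prices translate into yearly costs as in the short calculation below. It assumes 1 TB = 1000 GB and that the rates apply only within the first terabyte, as the slide states; the currency behind "Cent" is not given on the slide, so results are in unspecified currency units (100 cent each).

```python
STORAGE_CENT_PER_GB_YEAR = 30   # approximate rate from the slide
ARCHIVE_CENT_PER_GB_YEAR = 12   # approximate rate from the slide
FIRST_TIER_GB = 1000            # assumption: 1 TB counted as 1000 GB


def yearly_cost(gigabytes: float, cent_per_gb_year: float) -> float:
    """Yearly cost in currency units for a volume inside the first-TB tier."""
    if not 0 <= gigabytes <= FIRST_TIER_GB:
        raise ValueError("the quoted rates only cover the first TB")
    return gigabytes * cent_per_gb_year / 100


print(yearly_cost(1000, STORAGE_CENT_PER_GB_YEAR))  # 300.0
print(yearly_cost(1000, ARCHIVE_CENT_PER_GB_YEAR))  # 120.0
```

Under these assumptions, a full terabyte costs roughly 300 units per year in storage and roughly 120 units per year in the archive tier.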
Object Storage OpenStack Object Storage - Swift
Definition and function principle
§ OpenStack Swift is an object storage system for building (private) cloud installations
§ Storing files and images for the cloud
§ Access via REST, S3-Interface
§ Files stored in virtual rings
§ Service properties:
§ Software-based solutions
§ Spreading files for redundancy in the data center
§ Replication of node content
§ Horizontal scale-out
§ Usage of commodity hardware
private cloud
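The idea behind ring-based placement with replication can be sketched as follows. This is a simplified illustration, not Swift's implementation: Swift derives a partition from an MD5 hash of the object path and looks up the responsible devices in a precomputed table, whereas this sketch assigns replicas to consecutive nodes. The node names, `PARTITION_POWER` and `REPLICAS` values are assumptions chosen for readability.

```python
import hashlib

PARTITION_POWER = 4          # 2**4 = 16 partitions; real deployments use far more
REPLICAS = 3                 # number of copies kept per object
NODES = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical commodity servers


def partition_for(path: str) -> int:
    """Map an object path to a partition using the top bits of its MD5 hash."""
    digest = hashlib.md5(path.encode("utf-8")).digest()
    top32 = int.from_bytes(digest[:4], "big")
    return top32 >> (32 - PARTITION_POWER)


def nodes_for(path: str) -> list:
    """Pick REPLICAS distinct nodes for a path (simplified: consecutive nodes)."""
    part = partition_for(path)
    return [NODES[(part + r) % len(NODES)] for r in range(REPLICAS)]
```

Because placement depends only on the hash of the path, any client or proxy can compute where an object lives without a central lookup, and adding nodes to the list spreads partitions over more hardware.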
Object Storage and Data Curation Methodology
GWDG Common Data Storage Architecture
REST-Interface
Map-Reduce-Engine Search-Engine PID-Service
Storage-Abstraction Event-Handling (Rules-Engine)
worker cloud
map reduce
CouchDB-Plugin
repository portal
SIP builder
search- and access portal
search engine
ES-Plugin
Storage-Instance
object storage
REST interface
GWDG CDSTAR Engine
PID service
task broker
descriptive information
access
search results
search queries
administration
preservation planning
producer
consumer
aggregations
upload
Building our own meta object storage: GWDG object storage for R&D and production
REST-API
REST-Sub-System-API
Java-Storage/Plugin-API
AAI-Filter-API
Java API (interface / abstract classes)
§ Support for additional authentication schemes
§ Support for additional authorization sources
Java API (interface / abstract classes)
§ Support for additional storage back-ends
§ Support for event-handling
Subsystem-specific REST-API
§ Elastic Search REST-API for search queries
§ CouchDB-REST API for Map-Reduce on metadata
Full API-Documentation (GWDG Report No. 78 – 66 pages)
§ Storage & Metadata
§ Search
§ Access Control
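The plugin idea behind the storage and event-handling APIs can be sketched as an abstract interface with exchangeable implementations. The actual CDSTAR plugin API is a Java API described in GWDG Report No. 78; the Python names below (`StorageBackend`, `InMemoryBackend`, `EventingStore`) are hypothetical and only illustrate the pattern of interfaces and abstract classes.

```python
from abc import ABC, abstractmethod


class StorageBackend(ABC):
    """Hypothetical contract that every storage back-end plugin would fulfil."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...

    @abstractmethod
    def delete(self, key: str) -> None: ...


class InMemoryBackend(StorageBackend):
    """Simplest possible plugin: keeps all data in a dictionary."""

    def __init__(self):
        self._data = {}

    def put(self, key: str, data: bytes) -> None:
        self._data[key] = data

    def get(self, key: str) -> bytes:
        return self._data[key]

    def delete(self, key: str) -> None:
        del self._data[key]


class EventingStore:
    """Wraps any back-end and notifies registered listeners after each change."""

    def __init__(self, backend: StorageBackend):
        self.backend = backend
        self.listeners = []

    def subscribe(self, listener) -> None:
        self.listeners.append(listener)

    def store(self, key: str, data: bytes) -> None:
        self.backend.put(key, data)
        for listener in self.listeners:
            listener("stored", key)

    def remove(self, key: str) -> None:
        self.backend.delete(key)
        for listener in self.listeners:
            listener("removed", key)
```

A new back-end would be added by implementing the abstract methods, and an event handler (for example one that triggers indexing) would be a callable registered via `subscribe`; the engine code in between stays unchanged.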
Virtual Research Environment for soeb
Requirement Analysis Scenario A - soeb
Object Storage
Virtual LDAP
REST-interface
LDAP-Connection
Web browser
CRC 1002 Portal for long-term archiving
Requirement Analysis Scenario B – CRC 1002
Object Storage
Virtual LDAP
Web browser REST-interface
LDAP-Connection Middleware
REST-interface
Intermediate Results Usage of GWDG CDSTAR
Current usage of GWDG CDSTAR
§ soeb VFU (in production since December 2013)
§ DARIAH-Portal (test instance online since July 2013)
§ Cloud4e (in production since December 2013)
§ UMG SFB 1002 (implementation started)
Full API-Documentation (GWDG Report No. 78 – 66 pages)