Experiences with a distributed NoSQL DB
Luke Kreczko - UoB Particle Physics
21.06.2017 Luke Kreczko – JGI DB Workshop 1
Overview
➢ Introduction
➢ CouchDB
➢ Geography: Landslide modelling
➢ Particle Physics: Live Analysis
➢ Children of the 90’s
➢ Briefly: CouchDB @ CMS
➢ Changes: Then & Now
➢ Summary & Outlook
2
Introduction - DICE
Data Intensive Computing Environment (DICE)
➢ ~1300 cores, 1.1 PB of distributed storage (HDFS)
➢ Shared between physics groups, projects with medicine & others
➢ Idle resources backfilled with grid jobs (GridPP, WLCG, ++)
➢ Previously projects with geography, CSE and computer science
3
We use
4
Global (grid) storage management
Cluster Management
Monitoring (Prometheus)
Memcached
Distributed Databases
5
CouchDB
Short-lived on our cluster
Used in a few projects
Cloud Spanner
(Big)CouchDB on DICE
➢ Project started by Simon Metson and Mike Wallace (now IBM)
➢ Before CouchDB 2.0 the only way to run distributed CouchDB was BigCouch (by Cloudant)
➢ Based on CouchDB 1.1 plus custom clustering
➢ Today: the database service IBM Cloudant
➢ Setup on DICE:
➢ 3 nodes with 12 cores and 3 x 3 TB disks = 27 TB
➢ Used for landslide modeling
➢ Trialed for particle physics
➢ Evaluated for Children of the 90’s
6
(Big)CouchDB
➢ A schema-less JSON document database management system written in Erlang
➢ Uses a RESTful API, making it very easy to access via the web
➢ Scaling through BigCouch
➢ No SPOF when using BigCouch, as any node can handle any request
➢ Views created via a Map/Reduce set-up
➢ Easy replication and maintenance through the “Futon” web interface
7
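The view mechanism above can be sketched as the JSON design document one would PUT to CouchDB’s REST API. The database name (“events”), design document name (“analysis”) and view name (“by_type”) are illustrative, not from the original setup; CouchDB view functions themselves are JavaScript strings inside the JSON.

```python
import json

# Minimal sketch of a CouchDB design document defining a map/reduce
# view. All names here are hypothetical.
design_doc = {
    "_id": "_design/analysis",
    "views": {
        "by_type": {
            # map step: emit one row per document, keyed by its "type" field
            "map": "function (doc) { emit(doc.type, 1); }",
            # reduce step: CouchDB's built-in _sum counts rows per key
            "reduce": "_sum",
        }
    },
}

payload = json.dumps(design_doc)
# One would PUT this payload to http://host:5984/events/_design/analysis
# and query .../_view/by_type?group=true for per-key counts.
```

Because any node in a BigCouch cluster can serve this PUT or the subsequent view queries, no single node is a point of failure.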
Landslide modelling:
➢ Management of Slope Stability in Communities (MoSSaiC)
➢ Funding from governments, UNDP, USAID & The World Bank
➢ Task: risk assessment of slopes
➢ What happens if a once-in-50-years storm hits?
➢ Can a slope be made safe for less than the cost of slope failure?
8
9
The problem
➢ Simulation takes ~1 hour on a single machine
➢ Multiple scenarios & multiple runs add complexity
10
Need to parallelise workflow
11
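Since each (scenario, run) pair is independent, the workflow parallelises naturally over a pool of workers. A minimal sketch, with a stand-in for the real slope-stability model (the actual simulation and its inputs are not part of this example):

```python
from concurrent.futures import ThreadPoolExecutor

def simulate(scenario):
    # Stand-in for the ~1 hour slope-stability model run; returns a
    # fake "factor of safety" per scenario.
    return scenario, 1.0 + 0.1 * scenario

# Independent runs map trivially onto a worker pool; on a cluster
# like DICE the same pattern maps onto batch jobs instead of threads.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = dict(pool.map(simulate, range(8)))
```

With results written back as CouchDB documents, workers need no shared filesystem, only HTTP access to the database.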
12
Landslides: Workflow and data management
13
Particle Physics use case: Live analysis
➢ Views are an interesting way to set up an entire analysis
➢ Also allow for live tuning of parameters (e.g. sliders for phase space cuts)
➢ In theory could be used to update calibration constants on-the-fly
➢ Idea: set up views and wait for the data to trickle in
➢ Analysis will build up “by itself”
➢ Everyone involved would have immediate access to the results
➢ The problem: Incompatible data formats
14
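The “analysis builds up by itself” idea can be simulated in a few lines: a map step bins each event document, a reduce step sums per bin, and re-querying the view as documents trickle in yields an ever-fuller histogram. The variable name (“met”) and the 10 GeV binning are illustrative assumptions, not from the original analysis.

```python
from collections import defaultdict

def map_event(doc):
    # Hypothetical map function: bin events by a kinematic variable
    # ("met") in 10-unit bins, emitting (key, value) pairs as a
    # CouchDB view's map step would.
    yield int(doc["met"] // 10) * 10, 1

def reduce_sum(values):
    # Reduce step: CouchDB's built-in _sum, written out in Python.
    return sum(values)

def view(docs):
    rows = defaultdict(list)
    for doc in docs:
        for key, value in map_event(doc):
            rows[key].append(value)
    return {k: reduce_sum(v) for k, v in rows.items()}

# As new event documents arrive, re-querying the view gives the
# updated histogram; CouchDB maintains views incrementally.
events = [{"met": 12.0}, {"met": 17.5}, {"met": 41.0}]
histogram = view(events)
```

In real CouchDB the cut “sliders” would translate into startkey/endkey parameters on the view query rather than re-running the map.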
High Energy Particle Physics (HEP) file format
➢ The HEP community predominantly uses the ROOT file format
➢ Binary file format that can store primitives, strings, custom C++ objects and collections of them
➢ Can contain directory structure (like a file-based file system)
➢ Objects can be grouped by using “Trees” - one entry == one event
➢ At user level (compared to experiment) we usually use “flat” trees
➢ Flat: only contains primitives (& strings) and collections of them
➢ Code-independent at this stage
➢ Best comparison to a DB:
➢ Column-wise storage with collections in some columns
15
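The column-wise comparison can be made concrete with a toy flat tree as a dict of arrays: one array per branch, some branches holding a collection per event. The branch names (“met”, “jet_pt”) are invented for illustration.

```python
# A flat ROOT tree resembles column-wise storage: one array per
# branch (variable), with some branches holding collections.
tree = {
    "met":    [12.0, 17.5, 41.0],          # one float per event
    "jet_pt": [[30.1, 25.2], [55.0], []],  # a collection per event
}

def event(tree, i):
    # Re-assemble one "row" (event) from the columnar layout,
    # like reading entry i of every branch.
    return {branch: values[i] for branch, values in tree.items()}

second = event(tree, 1)  # the row view of entry 1
```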
HEP -> CouchDB
➢ The easiest conceivable test was to convert each event into a JSON document
➢ Each stored variable would be a field
➢ A test file of 1GB (compressed) - 140k events
➢ Expanded to ~10 GB in CouchDB (~ factor 10)
➢ The understanding back then was
➢ Documents are independent – keys (variable names) are repeated in every document
➢ Compression was not yet available (came in CouchDB 1.2)
➢ Whole analysis data sets are (after huge reduction) O(TB)
➢ Would be curious to reinvestigate with CouchDB 2.0
16
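The conversion described above amounts to emitting one JSON document per event, with each stored variable as a field. A sketch (the `_id` scheme and field names are hypothetical) that also shows where much of the ~10x expansion came from, since every document repeats the variable names as literal strings and pre-1.2 CouchDB stored them uncompressed:

```python
import json

# One JSON document per event; every stored variable becomes a field.
events = [
    {"met": 12.0, "n_jets": 2},
    {"met": 17.5, "n_jets": 1},
]

docs = [
    {"_id": f"event-{i}", **ev}   # hypothetical _id scheme
    for i, ev in enumerate(events)
]

payload = "\n".join(json.dumps(d) for d in docs)
# Each serialised document carries "met" and "n_jets" as strings,
# so key names are duplicated 140k times over a whole test file.
```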
Children of the 90’s
➢ Long term health research project of more than 14,000 mothers
enrolled during pregnancy between 1991 and 1992
➢ Both mothers and children have been periodically measured and questioned throughout
➢ 8340 mothers and 8365 children (after the observed data were cleaned)
➢ 2,543,887 genotypes imputed for each individual
➢ These data are stored in comma-separated files, one per chromosome
17
Children of the 90’s: The problem
➢ Data retrieval is very slow:
➢ Example extraction takes 30 minutes per variant
➢ O(100) variants per subset of individuals
➢ CouchDB seen as possible solution
➢ reduce data retrieval to a few seconds
➢ request variants via web interface
➢ get immediate results in desired data format
➢ Challenges: data security & data quality checking
➢ Shared DB instance was deemed inadequate at the time
➢ Project did not take off due to these challenges
18
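The speed-up CouchDB promised here is essentially scan versus index: extracting one variant from the per-chromosome CSV files means a linear pass over every row, while an index keyed on variant id (which is what a CouchDB view provides) answers in a single lookup. A sketch with entirely invented data:

```python
# Invented genotype table: 100k variants, one genotype value each.
rows = [{"variant": f"rs{i}", "genotype": i % 3} for i in range(100_000)]

def scan(rows, variant):
    # CSV-style extraction: a linear pass per requested variant,
    # the pattern behind the ~30 minutes per variant above.
    return [r for r in rows if r["variant"] == variant]

index = {r["variant"]: r for r in rows}  # built once, queried often

slow = scan(rows, "rs42")[0]   # O(n) per variant
fast = index["rs42"]           # O(1) per variant
```

With O(100) variants per request, replacing the per-variant scan with a keyed lookup is what turns half-hour extractions into seconds.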
Other projects: The Compact Muon Solenoid (CMS)
➢ Simon also introduced CouchDB to other projects
➢ The CMS experiment (at the Large Hadron Collider) was one of them
➢ CMS migrated from Oracle to CouchDB for its Data Management and Workflow
Management (DMWM)
➢ https://developer.ibm.com/dwblog/2013/oracle-couchdb-data-management/
➢ Successful & ongoing, provided CMS an easy way to share data between services
➢ Provides management for O(10PB) of data per year and >100k compute jobs per day
➢ Extended last year for the Data Popularity Service (replicates/removes data sets
based on access)
19
Changes: Then & Now
➢ Saw 10x increase in data volume when importing data (ROOT -> CouchDB)
➢ CouchDB 1.2 brought compression
➢ No native support for clustering - BigCouch always a version behind
➢ Native clustering in open source version since CouchDB 2.0
➢ Security & lack of encryption inadequate for some projects
➢ Security keeps improving & commercial variants offer encryption
➢ For some projects CouchDB might be worth re-visiting
20
Summary and Outlook
➢ Clustered CouchDB (BigCouch) successfully used for Landslide modelling and
CMS Data & Workflow management
➢ The BigCouch deployment on DICE is now complete and has been shut down
➢ The old version (1.1) did not meet the requirements of the other two projects
➢ CouchDB continues to improve: newer version might be viable
➢ For particle physics we are now targeting Pandas Dataframes & Dask
➢ Since the project started (2012) many more distributed databases have become available (see slide 5)
➢ One of them might be right for your project
21