Experiences with a distributed NoSQL DB
Luke Kreczko - UoB Particle Physics
21.06.2017 Luke Kreczko – JGI DB Workshop 1
Overview
➢ Introduction
➢ CouchDB
➢ Geography: Landslide modelling
➢ Particle Physics: Live Analysis
➢ Children of the 90’s
➢ Briefly: CouchDB @ CMS
➢ Changes: Then & Now
➢ Summary & Outlook
2
Introduction - DICE
Data Intensive Computing Environment (DICE)
➢ ~1300 cores, 1.1 PB of distributed storage (HDFS)
➢ Shared between physics groups, projects with medicine & others
➢ Idle resources backfilled with grid jobs (GridPP, WLCG, ++)
➢ Previously projects with geography, CSE and computer science
3
We use
4
Global (grid) storage management
Cluster Management
Monitoring (Prometheus)
Memcached
Distributed Databases
5
CouchDB
Short-lived on our cluster
Used in a few projects
Cloud Spanner
(Big)CouchDB on DICE
➢ Project started by Simon Metson and Mike Wallace (now IBM)
➢ Before CouchDB 2.0 the only way to run distributed CouchDB was BigCouch (by Cloudant)
➢ Based on CouchDB 1.1 plus custom clustering
➢ Today: the database service IBM Cloudant
➢ Setup on DICE:
➢ 3 nodes with 12 cores and 3 x 3 TB disks = 27 TB
➢ Used for landslide modeling
➢ Trialed for particle physics
➢ Evaluated for Children of the 90’s
6
(Big)CouchDB
➢ A schema-less JSON document database management system written in Erlang
➢ Uses a RESTful API, making it very easy to access via the web
➢ Scaling through BigCouch
➢ No SPOF when using BigCouch, as any node can handle any request
➢ Views created via a Map/Reduce set-up
➢ Easy replication and maintenance through the “Futon” web interface
7
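The view mechanism above can be sketched as the JSON design document one would PUT to CouchDB’s REST API. The database name (“events”), design document name (“analysis”) and view name (“by_type”) are illustrative, not from the original setup; CouchDB view functions themselves are JavaScript strings inside the JSON.

```python
import json

# Minimal sketch of a CouchDB design document defining a map/reduce
# view. All names here are hypothetical.
design_doc = {
    "_id": "_design/analysis",
    "views": {
        "by_type": {
            # map step: emit one row per document, keyed by its "type" field
            "map": "function (doc) { emit(doc.type, 1); }",
            # reduce step: CouchDB's built-in _sum counts rows per key
            "reduce": "_sum",
        }
    },
}

payload = json.dumps(design_doc)
# One would PUT this payload to http://host:5984/events/_design/analysis
# and query .../_view/by_type?group=true for per-key counts.
```

Because any node in a BigCouch cluster can serve this PUT or the subsequent view queries, no single node is a point of failure.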
Landslide modelling:
➢ Management of Slope Stability in Communities (MoSSaiC)
➢ Funding from governments, UNDP, USAID & The World Bank
➢ Task: risk assessment of slopes
➢ What happens if a once-in-50-years storm hits?
➢ Can a slope be made safe for less than the cost of slope failure?
8
9
The problem
➢ Simulation takes ~1 hour on a single machine
➢ Multiple scenarios & multiple runs add complexity
10
Need to parallelise workflow
11
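Since each (scenario, run) pair is independent, the workflow parallelises naturally over a pool of workers. A minimal sketch, with a stand-in for the real slope-stability model (the actual simulation and its inputs are not part of this example):

```python
from concurrent.futures import ThreadPoolExecutor

def simulate(scenario):
    # Stand-in for the ~1 hour slope-stability model run; returns a
    # fake "factor of safety" per scenario.
    return scenario, 1.0 + 0.1 * scenario

# Independent runs map trivially onto a worker pool; on a cluster
# like DICE the same pattern maps onto batch jobs instead of threads.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = dict(pool.map(simulate, range(8)))
```

With results written back as CouchDB documents, workers need no shared filesystem, only HTTP access to the database.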
12
Landslides: Workflow and data management
13
Particle Physics use case: Live analysis
➢ Views are an interesting way to set up an entire analysis
➢ Also allow for live tuning of parameters (e.g. sliders for phase space cuts)
➢ In theory could be used to update calibration constants on-the-fly
➢ Idea: set up views and wait for the data to trickle in
➢ Analysis will build up “by itself”
➢ Everyone involved would have immediate access to the results
➢ The problem: Incompatible data formats
14
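The “analysis builds up by itself” idea can be simulated in a few lines: a map step bins each event document, a reduce step sums per bin, and re-querying the view as documents trickle in yields an ever-fuller histogram. The variable name (“met”) and the 10 GeV binning are illustrative assumptions, not from the original analysis.

```python
from collections import defaultdict

def map_event(doc):
    # Hypothetical map function: bin events by a kinematic variable
    # ("met") in 10-unit bins, emitting (key, value) pairs as a
    # CouchDB view's map step would.
    yield int(doc["met"] // 10) * 10, 1

def reduce_sum(values):
    # Reduce step: CouchDB's built-in _sum, written out in Python.
    return sum(values)

def view(docs):
    rows = defaultdict(list)
    for doc in docs:
        for key, value in map_event(doc):
            rows[key].append(value)
    return {k: reduce_sum(v) for k, v in rows.items()}

# As new event documents arrive, re-querying the view gives the
# updated histogram; CouchDB maintains views incrementally.
events = [{"met": 12.0}, {"met": 17.5}, {"met": 41.0}]
histogram = view(events)
```

In real CouchDB the cut “sliders” would translate into startkey/endkey parameters on the view query rather than re-running the map.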
High Energy Particle Physics (HEP) file format
➢ The HEP community predominantly uses the ROOT file format
➢ Binary file format that can store primitives, strings, custom C++ objects and collections of them
➢ Can contain directory structure (like a file-based file system)
➢ Objects can be grouped by using “Trees” - one entry == one event
➢ At user level (compared to experiment) we usually use “flat” trees
➢ Flat: only contains primitives (& strings) and collections of them
➢ Code-independent at this stage
➢ Best comparison to a DB:
➢ Column-wise storage with collections in some columns
15
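The column-wise comparison can be made concrete with a toy flat tree as a dict of arrays: one array per branch, some branches holding a collection per event. The branch names (“met”, “jet_pt”) are invented for illustration.

```python
# A flat ROOT tree resembles column-wise storage: one array per
# branch (variable), with some branches holding collections.
tree = {
    "met":    [12.0, 17.5, 41.0],          # one float per event
    "jet_pt": [[30.1, 25.2], [55.0], []],  # a collection per event
}

def event(tree, i):
    # Re-assemble one "row" (event) from the columnar layout,
    # like reading entry i of every branch.
    return {branch: values[i] for branch, values in tree.items()}

second = event(tree, 1)  # the row view of entry 1
```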
HEP -> CouchDB
➢ The easiest conceivable test was to convert each event into a JSON document
➢ Each stored variable would be a field
➢ A test file of 1GB (compressed) - 140k events
➢ Expanded to ~10 GB in CouchDB (~ factor 10)
➢ The understanding back then was
➢ Documents are independent – keys (variable names) are repeated in every document
➢ Compression was not yet available (came in CouchDB 1.2)
➢ Whole analysis data sets are (after huge reduction) O(TB)
➢ Would be curious to reinvestigate with CouchDB 2.0
16
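The conversion described above amounts to emitting one JSON document per event, with each stored variable as a field. A sketch (the `_id` scheme and field names are hypothetical) that also shows where much of the ~10x expansion came from, since every document repeats the variable names as literal strings and pre-1.2 CouchDB stored them uncompressed:

```python
import json

# One JSON document per event; every stored variable becomes a field.
events = [
    {"met": 12.0, "n_jets": 2},
    {"met": 17.5, "n_jets": 1},
]

docs = [
    {"_id": f"event-{i}", **ev}   # hypothetical _id scheme
    for i, ev in enumerate(events)
]

payload = "\n".join(json.dumps(d) for d in docs)
# Each serialised document carries "met" and "n_jets" as strings,
# so key names are duplicated 140k times over a whole test file.
```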
Children of the 90’s
➢ Long term health research project of more than 14,000 mothers
enrolled during pregnancy between 1991 and 1992
➢ Both mothers and children have been periodically measured and questioned throughout
➢ 8340 mothers and 8365 children (after the observed data were cleaned)
➢ 2,543,887 genotypes imputed for each individual
➢ These data are stored in comma-separated files, one per chromosome
17
Children of the 90’s: The problem
➢ Data retrieval is very slow:
➢ Example extraction takes 30 minutes per variant
➢ O(100) variants per subset of individuals
➢ CouchDB seen as possible solution
➢ reduce data retrieval to a few seconds
➢ request variants via web interface
➢ get immediate results in desired data format
➢ Challenges: data security & data quality checking
➢ Shared DB instance was deemed inadequate at the time
➢ Project did not take off due to these challenges
18
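The speed-up CouchDB promised here is essentially scan versus index: extracting one variant from the per-chromosome CSV files means a linear pass over every row, while an index keyed on variant id (which is what a CouchDB view provides) answers in a single lookup. A sketch with entirely invented data:

```python
# Invented genotype table: 100k variants, one genotype value each.
rows = [{"variant": f"rs{i}", "genotype": i % 3} for i in range(100_000)]

def scan(rows, variant):
    # CSV-style extraction: a linear pass per requested variant,
    # the pattern behind the ~30 minutes per variant above.
    return [r for r in rows if r["variant"] == variant]

index = {r["variant"]: r for r in rows}  # built once, queried often

slow = scan(rows, "rs42")[0]   # O(n) per variant
fast = index["rs42"]           # O(1) per variant
```

With O(100) variants per request, replacing the per-variant scan with a keyed lookup is what turns half-hour extractions into seconds.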
Other projects: The Compact Muon Solenoid (CMS)
➢ Simon also introduced CouchDB to other projects
➢ The CMS experiment (at the Large Hadron Collider) was one of them
➢ CMS migrated from Oracle to CouchDB for its Data Management and Workflow
Management (DMWM)
➢ https://developer.ibm.com/dwblog/2013/oracle-couchdb-data-management/
➢ Successful & ongoing, provided CMS an easy way to share data between services
➢ Provides management for O(10PB) of data per year and >100k compute jobs per day
➢ Extended last year for the Data Popularity Service (replicates/removes data sets
based on access)
19
Changes: Then & Now
➢ Saw 10x increase in data volume when importing data (ROOT -> CouchDB)
➢ CouchDB 1.2 brought compression
➢ No native support for clustering - BigCouch always a version behind
➢ Native clustering in open source version since CouchDB 2.0
➢ Security & lack of encryption inadequate for some projects
➢ Security keeps improving & commercial variants offer encryption
➢ For some projects CouchDB might be worth re-visiting
20
Summary and Outlook
➢ Clustered CouchDB (BigCouch) successfully used for Landslide modelling and
CMS Data & Workflow management
➢ The BigCouch deployment on DICE is now complete and has been shut down
➢ The old version (1.1) did not meet the requirements of the other two projects
➢ CouchDB continues to improve: newer version might be viable
➢ For particle physics we are now targeting Pandas Dataframes & Dask
➢ Since the project started (2012) many more distributed databases have become available (see slide 5)
➢ One of them might be right for your project
21