© 2016 MapR Technologies
Crystal Valentine, PhD VP of Technology Strategy
The Human Face of Big Data
Human Welfare
India
• 1.3 billion residents
• 640,000 villages
• 22 official languages
• $40 billion per year on subsidies
• 40% leakage
Subsidies as % of Government Expenditure (chart: years 2006 to 2015 on the x-axis, 0 to 20% on the y-axis)
Largest Biometric Database in the World
PEOPLE
20 BILLION BIOMETRICS
Massive NoSQL Database Application
• 1M registrations (trillions of matches) / day
• 5 TB incremental data / day
• 100M+ authentications / day
• 200 ms response
• 0.0001 false positive rate
• Billion+ audit records per day
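To put the daily figures above on a per-second footing, the following is a back-of-the-envelope sketch, not a description of the actual system. It assumes load is spread evenly over 24 hours (real traffic would peak higher), that 1 TB means 10^12 bytes, and that every authentication takes the full 200 ms; the function and variable names are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(per_day: float) -> float:
    """Average per-second rate, assuming perfectly even load over a day."""
    return per_day / SECONDS_PER_DAY


# Figures taken from the slide; the "+" on 100M+ is ignored here.
auths_per_sec = per_second(100_000_000)          # about 1,157 authentications/s
registrations_per_sec = per_second(1_000_000)    # about 11.6 registrations/s
ingest_mb_per_sec = per_second(5 * 10**12) / 10**6  # about 57.9 MB/s of new data

# Little's law: average requests in flight = arrival rate x time in system.
# Assumes each authentication occupies the full 200 ms response time.
auths_in_flight = auths_per_sec * 0.200          # about 231 concurrent requests

print(f"{auths_per_sec:,.0f} auth/s, {registrations_per_sec:.1f} reg/s, "
      f"{ingest_mb_per_sec:.1f} MB/s, ~{auths_in_flight:.0f} in flight")
```

These are averages only; capacity planning for a system like this would size for peak load rather than the daily mean.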
Always-on high availability
● Mirroring and backup across DCs
● Strong consistency for update and insert
DC #1 Gurgaon
DC #3 Bangalore
Regional Backup #2
Regional Backup #4
Broad Application Support through Open APIs
• Enrollment Services (40k enrollment stations)
• Authentication Services
• Fraud detection frameworks
• Administrative Services
• Analytics and Reporting Services
• Logistics Services
• Contact Center
Impact
• >1 billion Aadhaars issued
• 95% of adults in India
• 500K-700K new enrollees every day
• 200 milliseconds to verify an ID
• 240 million bank accounts
• 122 million LPG consumers
• 113 million ration cards
Cancer Genomics
Human Genome Sequencing
• 3 billion base pairs
• 25,000 genes
• Massively parallel WGS pipeline
• Sequence over 10k Gbp/week
Cancer: a complex disease
2012:
• 14.1M new cases
• 8.2M deaths
1 tumor ~ 100 billion cells
Characterized by pathway disruption
Interaction Network
• Vertices = genes
• Edges = interactions
• “Pathway” = subnetwork
Finding Mutated Pathways
Network G = (V,E) + binary mutation matrix (patients × genes; each entry is mutated or not mutated) = connected subnetworks mutated in a significant number of patients
*TCGA (Nature, 2011 and NEJM, 2013)
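The idea on this slide, combining an interaction network with a binary mutation matrix to find connected subnetworks mutated in many patients, can be illustrated with a naive brute-force sketch. This is not the method used in the cited TCGA studies: the gene names, patients, edges, and the `min_patients` cutoff are all hypothetical, and a raw patient count stands in for a proper statistical significance test. Enumerating every gene subset like this is only feasible on toy inputs.

```python
from itertools import combinations

# Hypothetical toy interaction network G = (V, E): undirected gene-gene edges.
EDGES = [("A", "B"), ("B", "C"), ("C", "D"), ("E", "F")]

# Hypothetical binary mutation matrix, stored sparsely:
# patient -> set of genes that are mutated in that patient.
MUTATIONS = {
    "p1": {"A"},
    "p2": {"B"},
    "p3": {"C"},
    "p4": {"E"},
    "p5": set(),
}


def build_adjacency(edges):
    """Turn an edge list into a gene -> set-of-neighbours mapping."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def is_connected(genes, adj):
    """Depth-first search restricted to `genes`: do they form one component?"""
    genes = set(genes)
    start = next(iter(genes))
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in adj[node] & genes:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen == genes


def patients_covered(genes, mutations):
    """Patients with at least one mutated gene inside the subnetwork."""
    genes = set(genes)
    return {p for p, mutated in mutations.items() if mutated & genes}


def mutated_subnetworks(edges, mutations, size, min_patients):
    """Brute force: every connected gene set of `size` covering enough patients."""
    adj = build_adjacency(edges)
    hits = []
    for genes in combinations(sorted(adj), size):
        if is_connected(genes, adj):
            covered = patients_covered(genes, mutations)
            if len(covered) >= min_patients:
                hits.append((genes, sorted(covered)))
    return hits


result = mutated_subnetworks(EDGES, MUTATIONS, size=3, min_patients=3)
print(result)  # [(('A', 'B', 'C'), ['p1', 'p2', 'p3'])]
```

In the toy data no single gene is mutated in more than one patient, yet the connected subnetwork A-B-C is hit in three of five patients, which is the kind of pathway-level signal the slide describes.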
Individualized Treatment
• Clinical / Lab
• Genomic / Network Analysis
• Lifestyle / Phenotypic / Health sensors
• Electronic Medical Records
• Literature / Research
Collaboration + Innovation
Susan G. Komen Big Data for Breast Cancer (BD4BC) Initiative
• Pledged to reduce mortality rate 50% in next decade
• Partnerships in 50+ countries
• Extensive network of researchers, health care providers, patient advocates, clinics
• Leverage shared/public data sets + big data technologies
Our challenge