Data Governance in a Big Data Era
Pieter De Leenheer, PhD
Stanford University
Nov 3, 2016
Misconceptions of Data Governance that impede Data Valuation
• Data governance is a published repository of common definitions.
• Data governance is a concern of, and hence managed by, IT.
• Data governance is just data quality (DQ) and master data management (MDM).
• Data governance is siloed by business function.
• Data governance provides no value or participation for the data-consuming community.
Admin
• http://www.slideshare.net/pdeleenh/data-governance-in-the-big-data-era
Hierarchical Data Management
• Formal
• Operational and analytical data
• Inward focus:
  • Improve internal/external coordination
  • Understand customer
  • Predict next transaction
• Controlled by central provider
  • MDM, DWH, DM, dashboards
  • Tedious waterfall
  • Compromised by obsolete cost assumptions
• Consumer
  • Small elite, C-level
Hierarchical Data Governance
• Wikipedia: “a set of processes that ensures that important data assets are formally managed throughout the enterprise. Data governance ensures that data can be trusted and that people can be made accountable for any adverse event that happens because of low data quality”.
  • Biased by Total (Data) Quality Management practice
  • Suggests ‘policing’ rather than ‘empowerment’
• How to evolve to a democratic networked approach?
  • Involves ICs and middle management
  • With less middleman slack
  • Dealing with Big Data
Data Big Bang
• Phenomenon: connectivity between
  • Social
  • Knowledge
  • Technology
• Draws curiosity
  • Web Science (Pentland, etc.)
  • Big Data native market entrants (23andMe, Uber, Inventure)
• Disruption
  • Bottom up
  • Starting from data
  • Low end
• +80% unstructured data or ‘dark matter’
Three Forces Shaping the Digital Economy (1)
1. Digitalization of the Physical
  • Entertainment, Wealth, Biology, Chemistry
  • MPx, PayPal, Bitcoin, 3D printing, IoT, VR
2. Sustained and accelerated growth of digital power (despite the slowdown of Moore’s Law)
  • Mass parallelization (Hadoop and Hive)
  • Move function and reliability to software
  • Miniaturization
Three Forces Shaping the Digital Economy (2)
3. Modular and Generative Programmability
“By carefully excluding features that are not universally useful Internet technologies became easily adopted on a massive scale and gave the Web a generative [i.e. self-reproductive] character” (Zittrain, 2009).
• This opens new business models unimaginable before:
  • Apps extend the function of a smartphone
  • Aggregations of components in complex machines
  • Anything, once digitized, opens new ways of manipulation and transport
The “Dark Matter” of Big Data Universe
• Observed consequence of these forces:
  1. Consumerization of Digital Technologies pivoting around 2000
  2. Grassroot Participation / Peer-based
  3. Digitalization of Trust
• All contribute to Big Data
• (2) and (3) contribute to Social Capital: Dark Matter (aka unstructured data)?
  • Human communication, text heavy
  • Context: emphasis, emotion, location at moment of capturing changes meaning:
    • “I did not say Peter’s talk stinks”
Data-driven Hierarchies, Networks & Hybrids
Hierarchical vs. Networked
• Network peers provide ideas and feedback, but also service (analogy: the Uber driver is the data scientist)
• Product ownership vs. service (hence data) access
  • Example: Uber doesn’t own; it only dispatches information about rolling material to riders and focuses on lifetime value and retention.
  • Data analogy: access to data is more important than owning it, as the cost of IS is marginal and is replaced by the appreciation of data value by the using community
• Passive resources (material, goods) vs. active resources (data, consumer)
• Value-in-exchange vs. value-in-use
• Acquisition vs. retention
  • Example: SaaS, Netflix, Costco, etc.
  • Data analogy: from formal roles and responsibilities that support internal processes to trust based on social capital
• Process vs. relations
• Provider pushes vs. consumer pulls
  • Example: feedback, mods on games, user participation, A/B testing, etc.
  • Data analogy: data helpdesk
Consumerization of tech, grassroot participation, digitalization of trust
Shift in Data Governance Approaches
• Consequences of digital forces pose a gigantic risk to organizations, even with hierarchical governance
• Hierarchical data governance
  • Few consumers served by a central opaque provider
  • Inward
  • Compromises on old, obsolete cost assumptions of digital power
  • Use of digital optimizes to some extent
  • Not scalable for big data by larger ‘data scientist’ populations
• Combine with Networked Approach
  • Democratization (production)
    • Breadlines
    • Consumerization of BI and cheap digital power
    • Many serve many
    • Supports customer
  • Amazonification (consumption)
    • Access, SLA, Trust, etc.
  • Outward
Big Data Analytics Challenges
• When everybody has data scientists, ‘predict next transaction’ is not competitive anymore
• From ‘predict next transaction’ to life-long relation building and value creation
• Reduce search and navigation for customer with better apps
• Crowdsourcing to cross-compare with and learn from other customers (Opower, INRIX, Zillow)
• Get trust from customer through branded non-intrusive apps: personal health monitoring, Nest
• Retention analysis example
Big Data Governance Challenges
• Scalable balance between (hierarchical) control and (networked) empowerment
• Minimize search for data sets
  • Advanced descriptors such as business glossary
• Manage attention drift in case of proliferation
  • Usage (page ranking): data sets that are reused more are more relevant
• Digitalization of Trust
  • Authenticity: lineage and provenance
  • Data sets owned by people in your social capital
  • Price: prices may be a mechanism, but it is difficult to identify a fair price and establish a currency-based market for data assets: see Infonomics
  • Service level agreements
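The usage point on this slide (data sets that are reused more are more relevant) can be illustrated with a minimal Python sketch. It is not a full page-ranking algorithm, only a plain reuse count, and the usage log, the data set names, and the function `rank_by_reuse` are all hypothetical names introduced for illustration.

```python
from collections import Counter

# Hypothetical usage log: one (consumer, data_set) pair per reuse event.
usage_log = [
    ("report_a", "customers"),
    ("report_b", "customers"),
    ("model_x", "customers"),
    ("report_a", "orders"),
    ("model_x", "orders"),
    ("model_x", "web_clicks"),
]


def rank_by_reuse(log):
    """Return data set names ordered from most to least reused."""
    # Count distinct consumers per data set, so repeated reads by the
    # same consumer do not inflate the score.
    distinct_pairs = {(consumer, data_set) for consumer, data_set in log}
    counts = Counter(data_set for _, data_set in distinct_pairs)
    # Sort by descending count, then by name to keep the order stable.
    return sorted(counts, key=lambda name: (-counts[name], name))


ranking = rank_by_reuse(usage_log)
print(ranking)
```

A catalog could surface the top of such a ranking first to limit attention drift; a real page-ranking approach would also weight each reuse by the relevance of the consumer.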
Digitalization of Trust Challenges
• In hierarchical data governance, trust is
  • Established by a centrally sanctioned competence center
  • Or by externally appointed trustees with formal roles: stewards, owners, architects
• In a networked peer-driven approach, trust is more complicated:
  • Authenticity: is the data factual or opinionated?
  • Intention: does this data have good intentions? Can I use it without peril? Are there hidden privacy concerns I should be aware of?
  • Assess expertise or quality: are the people involved skilled or certified stewards?
  • Is it accurately representing our business reality, i.e. customer base?
  • Is it complete and up to date?
  • Has it been certified through a standard process?
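One way to make the peer-driven trust questions above operational is to record the answers per data set. The following Python sketch assumes a simple yes/no answer to each question; the class `TrustAssessment` and its field names are hypothetical and only mirror the questions on this slide.

```python
from dataclasses import dataclass


@dataclass
class TrustAssessment:
    """Hypothetical yes/no answers to the trust questions for one data set."""
    factual: bool               # authenticity: factual rather than opinionated
    safe_to_use: bool           # intention: no known peril or privacy concern
    skilled_stewards: bool      # expertise: skilled or certified people involved
    represents_business: bool   # accurately reflects the business reality
    complete_and_current: bool  # complete and up to date
    certified: bool             # certified through a standard process

    def open_questions(self):
        """Return the names of the checks that failed, in declaration order."""
        return [name for name, ok in vars(self).items() if not ok]

    def trusted(self):
        """A data set is trusted only when no question remains open."""
        return not self.open_questions()


assessment = TrustAssessment(
    factual=True,
    safe_to_use=True,
    skilled_stewards=False,
    represents_business=True,
    complete_and_current=True,
    certified=False,
)
print(assessment.open_questions())
```

Requiring every answer to be positive is a deliberately strict choice for this sketch; a weighted score would be an equally plausible design.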
Danger of the old paradigm models
• Weapons of Math Destruction (WMD) are models that
  • Threaten to destabilize
    • Equality
    • Democracy
• Traits of WMDs
  • Opaque
  • Unregulated
  • Uncontestable
  • …hence: ungoverned
The Rise of the Chief Data Officer (CDO) [6]
Data governance & stewardship provide the right level of control and trust in data
Data Consumers (Business)
• Leadership: CEO, CFO, VP, Marketing
• Roles: Data Scientist, Business Analyst
• Technology: Visualization, Self-service BI
Data Infrastructure (IT)
• Leadership: CIO
• Roles: Information Manager, Data Architect, Data Modeler
• Technology: Hadoop, Databases, Data Integration
Need: Data Authority
• Leadership: Chief Data Officer
• Roles: Data Governance Manager, Data Steward
• Technology: Data Stewardship Platform
Recommendations for the Chief Data Officer
• Collaboration: inwards / outwards
• Data Space: traditional data / big data
• Value Impact: service / strategy
• Join our MIT Sloan CDO Research
  • http://www.iscdo.org/
Conclusion
• Digital forces have digitally empowered individuals in the organization
• Hybrid data governance approach should combine
  • Hierarchical control of critical data assets to enhance internal coordination
  • Networked peer-driven empowerment to drive ‘serendipity’
  • On a shared platform
• Key challenges are:
  • Digitalization of trust with focus on social capital
  • Big data analytics that drives lifetime value for the customer
  • Get rid of old models that are opaque, unregulated and incontestable
  • Recognize CDO leadership and role transition
Recommended Reading
• O’Neil, C.: Weapons of Math Destruction
• Franks, B.: Taming the Big Data Tidal Wave
• Sundararajan, A.: The Sharing Economy
• Pentland, S.: Social Physics: How Good Ideas Spread
• Madnick, R. et al.: A Cubic Framework for the Chief Data Officer
• Zittrain, J.: The Future of the Internet
• https://www.collibra.com/blog/unleash-the-data-democracy-5-misconceptions-of-data-governance/
• https://www.collibra.com/blog/the-rise-of-the-chief-data-officer-cdo/