Data Governance in a Big Data Era Pieter De Leenheer, PhD Stanford University Nov 3, 2016


Page 1: Data Governance in the Big Data Era

Data Governance in a Big Data Era

Pieter De Leenheer, PhD, Stanford University

Nov 3, 2016

Page 2: Data Governance in the Big Data Era

Misconceptions of Data Governance that impede Data Valuation
• Data governance is a published repository of common definitions.
• Data governance is a concern of, and hence managed by, IT.
• Data governance is just data quality (DQ) and master data management (MDM).
• Data governance is siloed by business function.
• Data governance provides no value or participation for the data-consuming community.

Page 3: Data Governance in the Big Data Era

Admin
• http://www.slideshare.net/pdeleenh/data-governance-in-the-big-data-era

Page 4: Data Governance in the Big Data Era

Hierarchical Data Management
• Formal
  • Operational and analytical data
• Inward focus:
  • Improve internal/external coordination
  • Understand customer
  • Predict next transaction
• Controlled by central provider
  • MDM, DWH, DM, dashboards
  • Tedious waterfall
  • Compromised by obsolete cost assumptions
• Consumer
  • Small elite, C-level

Page 5: Data Governance in the Big Data Era

Hierarchical Data Governance
• Wikipedia: “a set of processes that ensures that important data assets are formally managed throughout the enterprise. Data governance ensures that data can be trusted and that people can be made accountable for any adverse event that happens because of low data quality”.
  • Biased by Total (Data) Quality Management practice
  • Suggests ‘policing’ rather than ‘empowerment’
• How to evolve to a democratic networked approach?
  • Involves ICs and middle management
  • With less middleman slack
  • Dealing with Big Data

Page 6: Data Governance in the Big Data Era

Data Big Bang
• Phenomenon: connectivity between
  • Social
  • Knowledge
  • Technology
• Draws curiosity
  • Web Science (Pentland, etc.)
  • Big Data Native Market Entrants (23andMe, Uber, Inventure)
• Disruption
  • Bottom up
  • Starting from data
  • Low end
• +80% unstructured data or ‘dark matter’

Page 7: Data Governance in the Big Data Era

Three Forces Shaping the Digital Economy (1)
1. Digitalization of the Physical
  • Entertainment, Wealth, Biology, Chemistry
  • MPx, PayPal, Bitcoin, 3D printing, IoT, VR
2. Sustained and accelerated growth of digital power (despite the slowdown of Moore’s Law)
  • Mass parallelization (Hadoop and Hive)
  • Move function and reliability to software
  • Miniaturization

Page 8: Data Governance in the Big Data Era

Three Forces Shaping the Digital Economy (2)
3. Modular and Generative Programmability

“By carefully excluding features that are not universally useful Internet technologies became easily adopted on a massive scale and gave the Web a generative [i.e. self-reproductive] character” (Zittrain, 2009).

• This opens new business models unimaginable before:
  • apps extend function of a smartphone
  • aggregations of components in complex machines

• once digitized opens new ways of manipulation and transport

Page 9: Data Governance in the Big Data Era

The “Dark Matter” of Big Data Universe
• Observed consequence of these forces:
  1. Consumerization of Digital Technologies pivoting around 2000
  2. Grassroot Participation / Peer-based
  3. Digitalization of Trust
• All contribute to Big Data
• (2) and (3) contribute to Social Capital: Dark Matter (aka unstructured data)?
  • Human communication, text heavy
  • Context: emphasis, emotion, location at moment of capturing changes meaning:
    • “I did not say Peter’s talk stinks”

Page 10: Data Governance in the Big Data Era

Data-driven Hierarchies, Networks & Hybrids
Hierarchical → Networked (network peers provide ideas and feedback, but also service; the Uber driver is the analogy for the data scientist)
• Product ownership → Service (hence data) access
  • Example: Uber doesn’t own. It only dispatches information about rolling material to riders and focuses on lifetime value retention.
  • Data analogy: access to data is more important than owning it, as the cost of IS is marginal and is replaced by the appreciation of data value by the using community
• Passive resources (material, goods) → Active resources (data, consumer)
• Value-in-exchange → Value-in-use
• Acquisition → Retention
  • Example: SaaS, Netflix, Costco, etc.
  • Data analogy: from formal roles and responsibilities that support internal processes to trust based on social capital
• Process → Relations
• Provider pushes → Consumer pulls
  • Example: feedback, mods on games, user participation, A/B testing, etc.
  • Data analogy: data helpdesk
Consumerization of tech, grassroot participation, digitalization of trust

Page 11: Data Governance in the Big Data Era

Shift in Data Governance Approaches
• Consequences of digital forces pose a gigantic risk to organizations, even with hierarchical governance
• Hierarchical data governance
  • Few consumers served by a central opaque provider
  • Inward
  • Compromises on old obsolete cost assumptions of digital power
  • Use of digital optimizes to some extent
  • Not scalable for big data by larger ‘data scientist’ populations
• Combine with Networked Approach
  • Democratization (production)
    • Breadlines
    • Consumerization of BI and cheap digital power
    • Many serve many
    • Supports customer
  • Amazonification (consumption)
    • Access, SLA, Trust, etc.
  • Outward

Page 12: Data Governance in the Big Data Era

Big Data Analytics Challenges
• When everybody has data scientists, predicting the next transaction is not competitive anymore
• From 'predict next transaction' to life-long relation building and value creation
  • reduce search and navigation for customer with better apps
  • crowd sourcing to cross compare with and learn from other customers (Opower, INRIX, Zillow)
  • get trust from customer through branded non-intrusive apps: personal health monitoring, Nest
• Retention analysis example

Page 13: Data Governance in the Big Data Era

Big Data Governance Challenges
• Scalable balance between (hierarchical) control and (networked) empowerment
• Minimize search for data sets
  • Advanced descriptors such as business glossary
• Manage attention drift in case of proliferation
  • Usage (page ranking): data sets that are reused more are more relevant
• Digitalization of Trust
  • Authenticity: lineage and provenance
  • data sets owned by people in your social capital
  • Price: prices may be a mechanism, but it is difficult to identify a fair price and establish a currency-based market for data assets: see Infonomics
  • Service level agreements
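
The usage-based relevance idea on this slide (data sets that are reused more are more relevant) can be sketched as a small PageRank-style computation. This is only an illustration under stated assumptions: the function name `rank_by_reuse`, the `(consumer, source)` edge format, the damping value, and the data set names are all hypothetical and do not come from the talk.

```python
from collections import defaultdict


def rank_by_reuse(reuse_edges, damping=0.85, iterations=50):
    """Rank data sets by how much they are reused (PageRank-style sketch).

    reuse_edges: list of (consumer, source) pairs, meaning the data set
    `consumer` is derived from, or reuses, the data set `source`.
    Returns a list of (data_set, score) sorted by descending score.
    """
    nodes = sorted({name for edge in reuse_edges for name in edge})
    n = len(nodes)

    # Each consumer "votes" for the sources it reuses.
    out_links = defaultdict(list)
    for consumer, source in reuse_edges:
        out_links[consumer].append(source)

    score = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_score = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = out_links[node]
            if targets:
                share = damping * score[node] / len(targets)
                for target in targets:
                    new_score[target] += share
            else:
                # A data set that reuses nothing spreads its weight evenly,
                # so the scores keep summing to 1.
                share = damping * score[node] / n
                for target in nodes:
                    new_score[target] += share
        score = new_score

    return sorted(score.items(), key=lambda item: item[1], reverse=True)


# Hypothetical catalog: which data sets reuse which other data sets.
example_edges = [
    ("sales_report", "customers"),
    ("churn_model", "customers"),
    ("churn_model", "transactions"),
    ("dashboard", "sales_report"),
]
ranking = rank_by_reuse(example_edges)
for name, value in ranking:
    print(f"{name}: {value:.3f}")
```

In this hypothetical catalog `customers` ranks first, because it is reused directly by two data sets and indirectly through `sales_report`. A real catalog would probably derive the edges from lineage metadata rather than from a hand-written list.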

Page 14: Data Governance in the Big Data Era

Digitalization of Trust Challenges
• In hierarchical data governance, trust is
  • established by a centrally sanctioned competence center
  • or by externally appointed trustees with formal roles: stewards, owners, architects
• In a networked peer-driven approach, trust is more complicated:
  • Authenticity: is the data factual or opinionated?
  • Intention: does this data have good intentions? Can I use it without peril? Are there hidden privacy concerns I should be aware of?
  • Assess expertise or quality: are the people involved skilled or certified stewards?
  • Is it accurately representing our business reality, i.e. customer base?
  • Is it complete and up to date?
  • Has it been certified through a standard process?

Page 15: Data Governance in the Big Data Era

Danger of the old paradigm models
• Weapons of Math Destruction (WMD) are models
  • Threaten to destabilize
    • Equality
    • Democracy
• Traits of WMDs
  • Opaque
  • Unregulated
  • Uncontestable
  • …hence: ungoverned

Page 16: Data Governance in the Big Data Era

The Rise of the Chief Data Officer (CDO) [6]

Data governance & stewardship provide the right level of control and trust in data

Data Consumers (Business)
• Leadership: CEO, CFO, VP, Marketing
• Roles: Data Scientist, Business Analyst
• Technology: Visualization, Self-service BI
• Need: Data Authority

Data Infrastructure (IT)
• Leadership: CIO
• Roles: Information Manager, Data Architect, Data Modeler
• Technology: Hadoop, Databases, Data Integration
• Need: Data Authority

Data Authority
• Leadership: Chief Data Officer
• Roles: Data Governance Manager, Data Steward
• Technology: Data Stewardship Platform

Page 17: Data Governance in the Big Data Era

Recommendations for the Chief Data Officer
• Collaboration: inwards / outwards
• Data Space: traditional data / big data
• Value Impact: service / strategy
• Join our MIT Sloan CDO Research
  • http://www.iscdo.org/

Page 18: Data Governance in the Big Data Era

Conclusion
• Digital forces have digitally empowered individuals in the organization
• Hybrid data governance approach should combine
  • Hierarchical control of critical data assets to enhance internal coordination
  • Networked peer-driven empowerment to drive ‘serendipity’
  • On a shared platform
• Key challenges are:
  • Digitalization of trust with focus on social capital
  • Big data analytics that drives lifetime value for the customer
  • Get rid of old models that are opaque, unregulated and incontestable
  • Recognize CDO leadership and role transition

Page 19: Data Governance in the Big Data Era

Recommended Reading
• O’Neil, C.: Weapons of Math Destruction
• Franks, B.: Taming the Big Data Tidal Wave
• Sundararajan, A.: The Sharing Economy
• Pentland, S.: Social Physics: How Good Ideas Spread
• Madnick, R. et al.: A Cubic Framework for the Chief Data Officer
• Zittrain, J.: The Future of the Internet
• https://www.collibra.com/blog/unleash-the-data-democracy-5-misconceptions-of-data-governance/
• https://www.collibra.com/blog/the-rise-of-the-chief-data-officer-cdo/