Data Protection and Information Governance Across Data Silos

Patrick McGrath – Director Solutions Marketing – Archive, Search, Analytics

11/16/2017

© 2017 COMMVAULT SYSTEMS, INC. ALL RIGHTS RESERVED.

What are these?



Creating new value from old content


UC Berkeley Prosopography Services

Common info themes with many organizations?

• Information expensive to manage, conserve/preserve and use

• Most information is dark or inaccessible, constrained by medium, volume or steward

• Understanding of information constrained by issues such as language, context and timeliness

• Information managed, governed and stored centrally within the organization

• Decisions influenced by information from within the organization

However, times have changed

Goldmine or Minefield?


Everyone approaches information from a different angle

Records, Archivists, Librarians: “I won’t accept this without my 193 fields of metadata!”

Legal: “What do you mean you can’t find it?”

Information Technology: “I don’t care. Stop using so much storage!”

Content Creators, Researchers: “Get out of my way – I have work to do!”

Data Sprawl


Digital Transformation and the Internet of Things

Forces fueling the move to the cloud

Explosion of Digital Data
• 90% of all digital data ever created was created in the last 2 years. (Source: SINTEF ICT)

Connected Devices Skyrocketing
• 20.8 billion things could be connected to the Internet by 2020, about 5.5 million devices added every day. (Source: Gartner)

Cloud Usage on the Rise
• 70% of organizations are managing some of their data in cloud infrastructure. (Source: IDC)

Data sprawl

Data subject areas
• Customers
• Vendor/Partners
• Employees
• Products
• Intellectual Property
• Contracts

Spread across…
• Providers, SLAs
• Applications: ERP, CRM, ECM, EDW, Archive, Backups
• Devices (e.g. Laptops, IoT)
• Locations/Jurisdictions
• Different parts of the Organization

Containing sensitive information

How to manage visibility and consistency of data handling?


Heart of the problem


“… while the average large UK business now uses 24 systems to manage and store personal data, 1 in 5 use over 40 systems to do so.”

- Nick Ismail, Information Age (citing 2017 OnePoll Survey)

“There is one application for every 5-10 employees generating copies of the same files leading to massive amounts of duplicate, idle data…”

- Michael Vizard, ITBusinessEdge.com

Data copies and silos

[Figure: data copies and silos. An end user's email (mail server, mailbox archive, mailbox backup, compliance archive mailbox, Outlook PSTs), files (datacentre, departmental and remote file servers, file archive, file analytics, personal cloud and devices) and Salesforce data are copied and replicated into archives, compliance copies and replicas, archive backups, endpoint backups, server backups and multiple further backups.]


Complexity hinders compliance and increases risk

LEGACY SYSTEMS | DATA CENTERS | CLOUD DATA | SaaS

PAIN: LACK OF CONTROL AND ANALYSIS
• Archive and Search systems create silos
• Lack common search and collate
• Multiple access controls to manage
• Gaps in coverage present risk
• Drives demand for more ‘data lakes’ projects

PAIN: VISIBILITY OF EXTERNAL DATA
• Data held externally is difficult to track
• Protection managed by 3rd party
• Limited ability to archive or manage retention
• Risk of data on unsanctioned Clouds
• Mobile and Shadow IT

PAIN: BACKUP AND RECOVERY RISKS
• Too many siloed solutions & repositories
• Impossible to set common policies
• Reporting is a challenge
• Variable controls for access & audit
• Complexity leads to gaps in coverage

Change drivers
• Customer demands
• External competition
• Workforce competition
• Compliance and security

Ransomware
• Hack leads to data encryption, loss or copying
• Unless price paid, could lead to
  • Halt of business operations for critical data
  • Publication of sensitive data
• Could also lead to notifiable loss incident

Data Privacy / GDPR
• EU personal data privacy
• Serious consequences (€£₽$)
• Focus on EU resident personal data
• Global companies also liable
• Process and technology change for many
• Consent, requests, breach notification, etc.

Key Takeaways: Know your critical and sensitive data. Get rid of it if you don’t need it!


Where is this highly controlled data?

Source: AIIM Report – Understanding GDPR in 2017

Control over the controlled data impacted by…

Source: AIIM Report – Understanding GDPR in 2017


UC Berkeley data breach incidents (timeline, 2003 to 2016)

• Jul 2003: California SB1386 enacted
• Mar 2005: UCB Grad Division incident
• May 2006: UCB RSSP incident
• May 2009: UCB UHS incident
• Aug 2009: UCB J-School incident
• Aug 2014: UCB Cap Proj / Real Estate incident
• May 2015: UCB E&I incident
• Feb 2016: UCB BFS incident

Data Breach Response Phases

Breach Detected → Declare Incident → Assessment → Planning → Execution → Remediation

GDPR
• 72 hours to notify authorities
• “Without undue delay” to notify victims
• YOU are responsible for the data handling of your providers
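The 72-hour window above is simple date arithmetic, and a response plan can compute it as soon as an incident is declared. The sketch below is illustrative only: the function names are hypothetical, and it assumes the clock starts at the moment the breach is detected, which an organization's legal team would need to confirm for a real incident.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the 72-hour clock starts when the breach is detected.
NOTIFY_AUTHORITY_WINDOW = timedelta(hours=72)


def authority_deadline(detected_at: datetime) -> datetime:
    """Latest time to notify the authorities for a breach detected at detected_at."""
    return detected_at + NOTIFY_AUTHORITY_WINDOW


def hours_remaining(detected_at: datetime, now: datetime) -> float:
    """Hours left before the deadline (negative once it has passed)."""
    return (authority_deadline(detected_at) - now).total_seconds() / 3600


# Example: breach detected on a Monday morning (UTC).
detected = datetime(2017, 11, 13, 9, 30, tzinfo=timezone.utc)
print(authority_deadline(detected).isoformat())  # 2017-11-16T09:30:00+00:00
print(hours_remaining(detected, detected + timedelta(hours=24)))  # 48.0
```

Using timezone-aware timestamps avoids ambiguity when the detection and the notification happen in different jurisdictions.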


So, how to deal with landmines and goldmines?

Data Platform

Primary Data Sources: Mailboxes, Data Center, Cloud, Endpoints, Apps & Databases, IoT & External

1. Ingest
• Unstructured (Files, Social)
• Structured Data (DB)
• Metadata & Usage Info
• Dedupe
• IoT
• Big Data
• Backup/Archive (Store) – OR
• In Place Indexing (No-Store)

2. Understand
• Contents
• Usage
• Meaning and Context
• Data Profiling/Entity Extraction
• Recommendations

3. Govern
• Access Rules
• Wipe/Erasure
• Encryption
• Movement
• Synchronization
• Retention/Disposition
• Monitor
• Alerts/Notifications
• Process Initiation/Automation

4. Recover
• Operational Recovery
• DR/Hotsite
• Dev/Test

5. Use
• Big Data, Analytics, 360 Dashboards & Reporting, Business Intelligence
• Collaboration
• eDiscovery, Search, Research, Investigations, Case

6. Extend
• Applications & Ecosystem


File system data source example – Storage optimization

Duplicates

Orphaned files

Sensitive data
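As a rough illustration of how duplicate files might be surfaced on a file system source, the sketch below groups files by the SHA-256 hash of their contents. The function names `file_digest` and `find_duplicates` are hypothetical, and this is a generic technique rather than a description of how any Commvault product identifies duplicates.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory use."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root: Path) -> list[list[Path]]:
    """Group files under root by content hash; keep only groups with 2+ files."""
    by_digest = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

Calling `find_duplicates(Path("/some/share"))` returns one list per set of byte-identical files. On large shares it is usually cheaper to compare file sizes first and hash only files whose sizes collide.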

Sensitive Data Detection
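One common way to detect sensitive data is pattern matching combined with a checksum to cut false positives. The sketch below is a minimal example under stated assumptions: the two patterns (email addresses and payment card numbers) and the names `detect_sensitive` and `luhn_valid` are illustrative choices, not the detection rules of any specific product.

```python
import re

# Illustrative patterns only; real deployments need broader, locale-aware rules.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_valid(number: str) -> bool:
    """Luhn checksum over the digits in number (separators are ignored)."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def detect_sensitive(text: str) -> dict[str, list[str]]:
    """Return email addresses and Luhn-valid card-like numbers found in text."""
    emails = EMAIL_RE.findall(text)
    cards = [
        m.group().strip()
        for m in CARD_RE.finditer(text)
        if luhn_valid(m.group())
    ]
    return {"email": emails, "card_number": cards}


sample = (
    "Contact jane.doe@example.com, card 4111 1111 1111 1111 on file; "
    "ref 1234 5678 9012 3456."
)
print(detect_sensitive(sample))
```

In the sample, the reference number looks like a card number but fails the Luhn check, so only the well-known test number 4111 1111 1111 1111 is reported.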


Data Analytics Applications

• Architecture to enable content-aware applications

• User profile based applications

• Fine tuned for the specific knowledge and use case for a desired outcome

• Core capabilities

• Data indexing

• Data detection

• Visualizations / Reports

• Workflow

• Data policy automation

• API access

• Audit trails

Data collection, indexing, analytics visualization and action!

[Figure: architecture diagram. Labels: Profile-Based Applications; Data Services; Data Index (Content Index, Federation, Virtualize, Enrich); Virtual Repository; Ingest from Live Data (Files, AD, Apps, SaaS, SAN) and from Stored Data (Files, Edge, Email); Infrastructure (Traditional, Mixed & Converged, Software Defined, Cloud).]

Search & Analytics Unlocks

• Map Enterprise Information
• Data Center Cleanup
• Automate Classification & Retention
• Optimize Accessibility
• Accelerate Breach Notification Planning
• Automate Storage Tiering & Disposition
• Encrypt & Protect End User Computer Data
• Optimize Business Continuity
• Demonstrate Compliance
• Detect Ransomware
• Identify Anomalous Access
• Monitor for Personal Data in Unauthorized Locations
• Simplify Response to Access, Rectification and Erasure Requests

Sensitive Data Management

DISCOVER: Discover risk data across file, endpoint, email and structured data, and present it for risk evaluation and action: removal or retention by defined policy.

PROTECT: Minimize use of risk data and protect from loss, breach or damage.

MANAGE: Ensure risk data is always managed to standards with ongoing risk assessments.


Getting to the right information quickly
• Search and machine learning

Compliance search – Commvault and Lucidworks

Beta Coming Soon!

AI intelligence with ease of use


Prioritize and reduce costs with Machine Learning

• Do it at scale: >5M documents/hour
• Increase relevancy: Find and review what matters
• Lower costs: 80-90% reduction
• Integrated AI: Powerful but easy to use

Leveraging the AI ecosystem

Stage labels: Initiate Investigation; Archive; Process; Build; Analyze; Analytics and Sync

• Define Review Set to Analyze: Investigation starts by defining the search parameters of a Review Set.
• Brainspace Pulls Information from the Review Set: The plugin streams data from the Review Set into Brainspace using the templated field map.
• Overlay Full Report: Full report data is synced into Commvault when build is complete.
• Execute Visual Analytics: Brainspace receives streamed text and metadata to create a Cluster Wheel, perform Comm Analysis, and display a dashboard.
• Create a Collection in Brainspace: User creates a collection in Brainspace using Visual Analytics.
• Collection Sync to Review Set: User can perform actions on the Review Set based on the synchronized Brainspace tags.


Integration Steps

First step: Define Review Set to Analyze

Second Step: Transfer Review Set to Brainspace

Third Step: Perform Brainspace Analytics

Fourth Step: Create a Collection from your Analytics Result and Sync back to Commvault

Fifth Step: Take Action in Review Set from Brainspace Tag

Second Step: Transfer Review Set to Brainspace


Pick Review Set

Build Process

Ingestion Fields Preconfigured


Third Step: Perform Brainspace Analytics

• Analytics Dashboard

• Transparent Concept Search

• Cluster Wheel

• Conversation Analysis

• Communication Analysis

• Advanced Document Classification ‐ Predictive Coding

• Advanced Document Classification ‐ Continuous Multi‐Modal Learning (CMML)

Analytics Dashboard

The overview dashboard is completely interactive and provides insight at a glance for the entire dataset, including:

o Duplicates & Near‐Duplicates
o Timeline
o Faceted Lists
o Concept Search
o Document Results


Transparent Concept Search

Brainspace’s next‐generation Transparent Concept Search provides the advantages of concept searching without the traditional drawbacks.

Transparent Concept Search significantly reduces the time and expense resulting from over‐inclusive document retrieval by allowing users to interact with the concept expansion to boost or eliminate concepts. No black box.

Cluster Wheel

The Brainspace Cluster Wheel showcases our dynamic learning by organizing all documents into conceptually similar clusters.

The wheel is animated and interactive, showing neighborly populations of documents, making early assessment intuitive even for extremely large datasets.


Conversation Analysis

Using Conversations allows you to visualize email activities within a dataset. 

Users can track the flow of information throughout an organization by exploring which emails have been sent to whom and determining which email domains have been most accessed.

Communication Analysis

Who said what to whom?

Brainspace’s communication analysis view adapts to any active query and provides interactive exploration of email conversations, including:

o Interactive Social Graph
o To, CC and BCC filtering
o Sender/Recipient Volume
o Top relationships
o Top Terms
o Alias Consolidation
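To make measures such as sender/recipient volume and top relationships concrete, the sketch below counts them from a list of email headers. It is a generic illustration, not Brainspace's implementation: the message dictionaries and the `communication_stats` function are hypothetical, and lowercasing addresses is only a crude stand-in for real alias consolidation.

```python
from collections import Counter


def communication_stats(messages):
    """Count messages sent, messages received, and sender-to-recipient pairs."""
    sent, received, pairs = Counter(), Counter(), Counter()
    for msg in messages:
        sender = msg["from"].lower()
        # Merge To, CC and BCC; a set avoids double-counting a repeated address.
        recipients = {
            addr.lower()
            for field in ("to", "cc", "bcc")
            for addr in msg.get(field, [])
        }
        sent[sender] += 1
        for addr in recipients:
            received[addr] += 1
            pairs[(sender, addr)] += 1
    return sent, received, pairs


messages = [
    {"from": "Alice@example.com", "to": ["bob@example.com"], "cc": ["carol@example.com"]},
    {"from": "alice@example.com", "to": ["bob@example.com"]},
    {"from": "bob@example.com", "to": ["alice@example.com"], "bcc": ["carol@example.com"]},
]
sent, received, pairs = communication_stats(messages)
print(pairs.most_common(1))  # the single strongest sender-to-recipient relationship
```

The `pairs` counter is effectively a weighted edge list, which is the input a social graph visualization would need.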


Advanced Document Classification ‐ Predictive Coding

Brainspace’s Predictive Coding uses our patented machine learning technology together with Logistic Regression and Active Learning to help you review less and decrease your associated costs.
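For readers unfamiliar with the two named techniques, the sketch below pairs a tiny logistic regression with uncertainty sampling, one common form of active learning. It is a textbook-style toy in pure Python, not Brainspace's patented method: the two-feature documents, the function names and the hyperparameters are all assumptions made for illustration.

```python
import math


def predict(w, x):
    """Probability that document x is relevant; w[0] is the bias term."""
    z = w[0] + sum(wj * v for wj, v in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, epochs=500, lr=0.5):
    """Fit logistic regression weights by batch gradient descent on log loss."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w


def most_uncertain(w, pool):
    """Uncertainty sampling: index of the pool document scored closest to 0.5."""
    return min(range(len(pool)), key=lambda i: abs(predict(w, pool[i]) - 0.5))


# Toy features: [count of term A, count of term B]; label 1 means relevant.
labeled_X = [[2, 0], [3, 0], [0, 2], [0, 3]]
labeled_y = [1, 1, 0, 0]
pool = [[4, 0], [1, 1], [0, 4]]  # unreviewed documents

weights = train_logistic(labeled_X, labeled_y)
next_doc = most_uncertain(weights, pool)
print(next_doc)  # index of the document a reviewer would be asked to code next
```

The loop in a real review alternates these steps: train on the coded documents, ask a reviewer to code the most uncertain one, add it to the training set and retrain. Reviewing the documents the model is least sure about is what lets the model improve with fewer human decisions.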

Brainspace gives you more control by letting you set your target recall at the beginning and adjust it throughout the process, using feedback on the review depth required for a given recall performance.
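The relationship between target recall and review depth can be shown with a few lines of arithmetic. In the sketch below, recall is the fraction of all relevant documents found so far, and depth is how far down a model-ranked list a reviewer must go to reach the target. The function name and the sample labels are hypothetical, and the relevance labels are assumed to be known (for example from a coded validation sample).

```python
def depth_for_target_recall(ranked_labels, target_recall):
    """Smallest review depth at which recall reaches target_recall.

    ranked_labels: 1 (relevant) or 0 (not relevant), ordered from the
    highest-scored document to the lowest.
    """
    total_relevant = sum(ranked_labels)
    if total_relevant == 0:
        raise ValueError("no relevant documents in the sample")
    found = 0
    for depth, label in enumerate(ranked_labels, start=1):
        found += label
        if found / total_relevant >= target_recall:
            return depth
    return len(ranked_labels)


ranked = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]  # 4 relevant documents out of 10
print(depth_for_target_recall(ranked, 0.75))  # 4
print(depth_for_target_recall(ranked, 1.0))   # 7
```

Here 75% recall costs a review of 4 documents while 100% recall costs 7, which illustrates why adjusting the target during a review can change the cost substantially.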

Advanced Document Classification ‐ Continuous Multi‐Modal Learning (CMML)

The Continuous Multi‐Modal Learning, or CMML, workflow can be carried out entirely in Brainspace, and integrates supervised learning with Brainspace’s tagging system.

CMML focuses on finding target documents during training, rather than on producing a predictive model to identify documents for later review. 


Fourth Step: Create a Notebook from the Analytics Result and Sync Back to Commvault

Create Notebook

Select tag

Sync

Fifth Step: Take Action in Review Set from Brainspace Tag


Conclusion

Given explosive growth of
• Data volumes
• Number of silos
• Security threats
• Compliance requirements

Considerations
• Develop Information Governance as a core capability
• Align data protection to the needs of the data and the business
• Increase data intelligence
• Drive data visibility across silos
• Automate data policy

Questions? Discussion


PROTECT. ACCESS. COMPLY. SHARE.

COMMVAULT.COM | 888.746.3849

Thank you.

Patrick McGrath
Director, Solutions Marketing, Content

@patrickiest