November 6-9, Seattle, WA NoSQL for the DBA @LynnLangit

NoSQL for the SQL Server Professional


Page 1: NoSQL for the SQL Server Professional

November 6-9, Seattle, WA

NoSQL for the DBA

@LynnLangit

Page 2: NoSQL for the SQL Server Professional
Page 3: NoSQL for the SQL Server Professional

BigData = Exponentially More Data

Retail Example -> ‘Feedback Economy’
• Number of transactions
• Number of behaviors (collected every minute)

[Chart: purchases, locations, and phone data over time; horizontal axis 12:00 to 2:30, vertical axis 0 to 2500]

Page 4: NoSQL for the SQL Server Professional

BigData = ‘Next State’ Questions

• What could happen?
• Why didn’t this happen?
• When will the next new thing happen?
• What will the next new thing be?
• What happens?

Collecting Behavioral data

Page 5: NoSQL for the SQL Server Professional

So Why Change?

Page 6: NoSQL for the SQL Server Professional

Hitting (Relational) Walls

CA
• Highly available consistency

CP
• Enforced consistency

AP
• Eventual consistency

Page 7: NoSQL for the SQL Server Professional

Is NoSQL just Hadoop?

HUGE Hype factor in 2011 / 2012

Apache Hadoop is a software framework that supports data-intensive distributed applications under a free license
• enables applications to work with thousands of nodes and petabytes of data
• was inspired by Google's MapReduce and Google File System (GFS) papers

Page 8: NoSQL for the SQL Server Professional

Oracle Loader for Hadoop

SQL Server Connector for Hadoop

Page 9: NoSQL for the SQL Server Professional

Working with Hadoop

Tools / Languages
• Java (JDK) / Eclipse
• MapReduce
  • Map (query/format)
  • Reduce (aggregate)
  • plug-in for Eclipse (Java)
• Pig (ETL -- Java)
• Hive (HQL Query)
  • HBase tables
• Others
  • Mahout (analyze)
  • Karmasphere (analyze)
  • R (analyze)
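To make the Map (query/format) and Reduce (aggregate) steps concrete, here is a minimal single-process sketch of a word count. It is plain Python, not Hadoop code; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are assumptions made for illustration only.

```python
from collections import defaultdict


def map_phase(line):
    """Map: turn one input record into (key, value) pairs."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key, standing in for what a framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: aggregate all values seen for one key."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on an in-memory list of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


# Hypothetical sample input.
counts = word_count(["big data big questions", "big answers"])
```

On a real cluster the map and reduce functions run on many nodes in parallel and the grouping step moves data across the network; the shape of the two functions stays the same.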

Page 10: NoSQL for the SQL Server Professional


Demo - Hadoop

Page 11: NoSQL for the SQL Server Professional

The reality…two pivots

Storage Methods
• SQL (RDBMS)
• NoSQL

Storage Locations
• On premises
• Cloud-hosted

Page 12: NoSQL for the SQL Server Professional

So many NoSQL options

More than just the Elephant in the room
120+ types of NoSQL databases

Page 13: NoSQL for the SQL Server Professional

Flavors of NoSQL

Page 14: NoSQL for the SQL Server Professional

Graph Database

Use for data with
• a lot of many-to-many relationships
• recursive self-joins
• when your primary objective is quickly finding connections, patterns and relationships between the objects within lots of data
• Examples: Neo4j, Freebase (Google)
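As a rough illustration of why "finding connections" suits a graph model, the sketch below stores many-to-many relationships as an adjacency map and walks them with a breadth-first search, which is the kind of query that needs recursive self-joins in an RDBMS. This is plain Python rather than any graph database API; the people and edges are made up, and `build_adjacency` and `shortest_path` are hypothetical helper names.

```python
from collections import deque

# Hypothetical "knows" relationships: many-to-many and self-referencing.
edges = [("ann", "bob"), ("bob", "cat"), ("cat", "dan"), ("ann", "eve"), ("eve", "dan")]


def build_adjacency(edges):
    """Store each relationship in both directions so it can be walked either way."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def shortest_path(adjacency, start, goal):
    """Breadth-first search; returns the list of nodes from start to goal, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in sorted(adjacency.get(node, ())):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


graph = build_adjacency(edges)
path = shortest_path(graph, "ann", "dan")
```

A graph database keeps these links as first-class data, so a traversal like this does not require joining a table to itself once per hop.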

Page 15: NoSQL for the SQL Server Professional

Column Database

Wide, sparse column sets
Schema-light
Examples:
• Cassandra
• HBase
• BigTable
• GAE HR DS

Page 16: NoSQL for the SQL Server Professional

More about Column Databases

Type A
• Column-families
• Non-relational
• Sparse
• Examples: HBase, Cassandra, xVelocity (SQL 2012 BISM)

Type B
• Column-stores
• Relational
• Dense
• Example: SQL Server 2012 Columnstore index
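The sparse versus dense distinction can be pictured with two in-memory layouts. This is only a sketch in plain Python, not how HBase or a Columnstore index stores data on disk; the sample rows, column names, and helper functions are assumptions for illustration.

```python
# Type A (column-family style): each row key holds only the columns it actually has.
sparse_rows = {
    "user:1": {"name": "Ann", "city": "Seattle"},
    "user:2": {"name": "Bob", "last_login": "2012-11-06"},
}

# Type B (column-store style): every column is a dense array with one slot per row.
dense_columns = {
    "order_id": [1, 2, 3],
    "amount": [20.0, 35.5, 12.25],
}


def columns_in_use(rows):
    """List every column name that appears in at least one sparse row."""
    return sorted({column for row in rows.values() for column in row})


def column_total(columns, name):
    """Aggregate one dense column without touching the other columns."""
    return sum(columns[name])


all_columns = columns_in_use(sparse_rows)
total = column_total(dense_columns, "amount")
```

The sparse layout tolerates rows with different columns (schema-light); the dense layout makes scanning and aggregating a single column cheap.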

Page 17: NoSQL for the SQL Server Professional

Document Database (Mongo DB)

Use for data that is
• document-oriented (collection of JSON documents) w/semi structured data
• Encodings include XML, YAML, JSON & BSON
• binary forms (PDF, Microsoft Office documents -- Word, Excel…)
• Examples: MongoDB, CouchDB
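A small sketch of what "a collection of JSON documents with semi-structured data" looks like in practice. It uses only Python's standard `json` module, not the MongoDB or CouchDB client APIs; the documents and the `find` helper are hypothetical and exist only to show that documents in one collection need not share the same fields.

```python
import json

# Hypothetical collection: each document has its own shape.
raw = """[
  {"_id": 1, "name": "Ann", "tags": ["dba", "sql"]},
  {"_id": 2, "name": "Bob", "address": {"city": "Seattle"}},
  {"_id": 3, "name": "Cat", "tags": ["nosql"], "address": {"city": "Portland"}}
]"""
collection = json.loads(raw)

_MISSING = object()  # sentinel for "this document has no such field"


def find(collection, path, value):
    """Return documents whose dotted field path equals value; documents lacking the field are skipped."""
    matches = []
    for doc in collection:
        current = doc
        for part in path.split("."):
            if isinstance(current, dict) and part in current:
                current = current[part]
            else:
                current = _MISSING
                break
        if current is not _MISSING and current == value:
            matches.append(doc)
    return matches


seattle = find(collection, "address.city", "Seattle")
```

Document 1 has no `address` at all, and the query simply skips it; no schema change or NULL column is involved.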

Page 18: NoSQL for the SQL Server Professional


Demo - MongoDB

Page 19: NoSQL for the SQL Server Professional

Key / Value Database

• Schema-less
• State (Persistent or Volatile)
• Examples
  • AWS DynamoDB
  • Project Voldemort
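The "Persistent or Volatile" distinction can be sketched with a toy store that either lives only in memory or writes itself to a file. This is an illustrative stand-in written with the Python standard library, not the DynamoDB or Voldemort API; `KeyValueStore`, the keys, and the JSON file format are all assumptions.

```python
import json
import os
import tempfile


class KeyValueStore:
    """Toy schema-less store: volatile when path is None, persistent otherwise."""

    def __init__(self, path=None):
        self._path = path
        self._data = {}
        # A persistent store reloads whatever an earlier instance saved.
        if path and os.path.exists(path):
            with open(path) as handle:
                self._data = json.load(handle)

    def put(self, key, value):
        self._data[key] = value
        if self._path:
            with open(self._path, "w") as handle:
                json.dump(self._data, handle)

    def get(self, key, default=None):
        return self._data.get(key, default)


# Volatile: state disappears with the object.
volatile = KeyValueStore()
volatile.put("session:42", {"cart": ["book"]})

# Persistent: a second instance pointed at the same file sees the saved value.
path = os.path.join(tempfile.mkdtemp(), "store.json")
KeyValueStore(path).put("user:1", "Ann")
reloaded = KeyValueStore(path)
```

The only operations are put and get by key; there are no columns, joins, or secondary lookups unless the application builds them.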

Page 20: NoSQL for the SQL Server Professional

So which type of NoSQL? Back to CAP…

CP = NoSQL/column
• Hadoop
• BigTable
• HBase
• MemCacheDB

AP = NoSQL/document or key/value
• DynamoDB
• CouchDB
• Cassandra
• Voldemort

CA = SQL/RDBMS
• SQL Server / SQL Azure
• Oracle
• MySQL

Page 21: NoSQL for the SQL Server Professional

Faster! – Move it to memory

• Microsoft
  • xVelocity / PowerPivot / SQL Server 2012 BISM
  • Hekaton (SQL Server future)
• Cloudera
  • Hadoop with Impala
• Google
  • BigQuery / Dremel
• Others
  • MapR and…
  • Dremel –> Drill
  • Redis - NoSQL

Page 22: NoSQL for the SQL Server Professional

What about the cloud?

Page 23: NoSQL for the SQL Server Professional

Cloud-hosted NoSQL up to 50x CHEAPER

Page 24: NoSQL for the SQL Server Professional

Consumer Storage Buckets

• Dropbox
• Box
• Windows SkyDrive
• Google Drive
• Amazon Cloud Drive
• Apple iCloud

Page 25: NoSQL for the SQL Server Professional

Developer BLOB Storage Buckets

• Amazon – S3 or Glacier
• Google – Cloud Storage
• Microsoft Azure BLOBS
• Others

Page 26: NoSQL for the SQL Server Professional


AWS S3 & Glacier

Demo

Page 27: NoSQL for the SQL Server Professional

Cloud-hosted RDBMS

AWS RDS – SQL Server, MySQL, Oracle
• Medium cost
• Solid feature set, e.g. backup, snapshot
• Use existing tooling

Google – MySQL
• Lowest cost
• Most limited RDBMS functionality

Microsoft – Windows Azure SQL Database
• Highest cost
• Azure VMs w/MySQL

Page 28: NoSQL for the SQL Server Professional

Other types of cloud data services

Hosting public datasets
• Pay to read
• Earn revenue by offering for read

Cleaning / matching (your) data
• ETL – Microsoft Data Explorer, Google Refine
• Data Quality – Windows Azure Data Market, InfoChimps, DataMarket.com

Page 29: NoSQL for the SQL Server Professional

Cloud – RDBMS, NoSQL & Hadoop

Category | AWS | Google | Microsoft
Cloud RDBMS | SQL Server, Oracle / MySQL | MySQL | SQL Azure
NoSQL buckets | S3 or Glacier | Cloud Storage | Azure Storage
NoSQL databases | DynamoDB | H/R Datastore on GAE | Azure Tables
Streaming / Machine Learning | Custom EC2 | Prospective Search & Prediction API | StreamInsight & Mahout with Hadoop
Document or Graph | MongoDB on EC2 | Freebase (g) | MongoDB on Windows Azure
Hadoop | Elastic MapReduce using S3 & EC2 | MapR & GCE | Windows Azure HDInsight
Data sets & other | Karmasphere | Translation API, Full-text search | Azure Marketplace

Page 30: NoSQL for the SQL Server Professional

Pick your mix and then…

NoSQL
• Host locally
• Host in the Cloud

RDBMS
• Host locally
• Host in the Cloud

Other Services
• Use Cloud Data Markets
• Use Cloud ETL

Page 31: NoSQL for the SQL Server Professional

What about me?

Page 32: NoSQL for the SQL Server Professional

Common DBA Tasks in NoSQL

RDBMS | NoSQL
Import Data | Import Data
Setup Security | Setup Security
Perform a Backup | Make a copy of the data
Restore a Database | Move a copy to a location
Create an Index | Create an Index
Join Tables Together | Run MapReduce
Schedule a Job | Schedule a (Cron) Job
Run Database Maintenance | Monitor space and resources used
Send an Email from SQL Server | Set up resource threshold alerts
Search BOL | Interpret Documentation
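The "Join Tables Together" versus "Run MapReduce" pairing can be sketched as a reduce-side join: each mapper tags its records with the join key, and the reducer combines the tagged records for one key. This is a single-process Python illustration, not Hadoop code; the `customers` and `orders` data and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical inputs that an RDBMS would hold as two tables.
customers = [(1, "Ann"), (2, "Bob")]                   # (customer_id, name)
orders = [(101, 1, 20.0), (102, 1, 5.0), (103, 2, 12.5)]  # (order_id, customer_id, amount)


def map_customers(rows):
    """Emit each customer under the join key, tagged with its source."""
    for customer_id, name in rows:
        yield customer_id, ("customer", name)


def map_orders(rows):
    """Emit each order under the same join key, tagged with its source."""
    for order_id, customer_id, amount in rows:
        yield customer_id, ("order", (order_id, amount))


def reduce_join(key, tagged_values):
    """For one key, pair every customer record with every order record (inner join)."""
    names = [value for tag, value in tagged_values if tag == "customer"]
    order_rows = [value for tag, value in tagged_values if tag == "order"]
    for name in names:
        for order_id, amount in order_rows:
            yield key, name, order_id, amount


def run_join():
    """Group mapper output by key, then reduce each group in key order."""
    grouped = defaultdict(list)
    for key, value in list(map_customers(customers)) + list(map_orders(orders)):
        grouped[key].append(value)
    joined = []
    for key in sorted(grouped):
        joined.extend(reduce_join(key, grouped[key]))
    return joined


joined = run_join()
```

The result matches what an inner join on customer_id would return; the difference is that the join logic lives in application code rather than in the query optimizer.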

Page 33: NoSQL for the SQL Server Professional


Demo - Hadoop II

Page 34: NoSQL for the SQL Server Professional

Making Sense – Asking Questions

Page 35: NoSQL for the SQL Server Professional

Data Scientists…

Page 36: NoSQL for the SQL Server Professional

Comparing…

Page 37: NoSQL for the SQL Server Professional

Karmasphere Studio for AWS

Page 38: NoSQL for the SQL Server Professional


Demo - Hadoop III

Page 39: NoSQL for the SQL Server Professional

Hadoop Connector to Excel

Page 40: NoSQL for the SQL Server Professional


Demo - Google BigQuery

Page 41: NoSQL for the SQL Server Professional

Google BigQuery

Query as a service
• Uses Dremel – not Hadoop
• Pay to play (i.e. storage / query)

Hive (HQL-style) syntax
• Web interface
• Connector to Excel

Programmable
• Command-line tools
• APIs

Page 42: NoSQL for the SQL Server Professional

NoSQL To-Do List

Understand CAP & types of NoSQL databases
• Use NoSQL when business needs dictate
• Use the right type of NoSQL for your business problem

Try out NoSQL on the cloud
• Quick and cheap for behavioral data
• Mashup cloud datasets
• Good for specialized use cases, e.g. dev, test, training environments

Learn NoSQL access technologies
• New query languages, e.g. MapReduce, R, Infer.NET
• New query tools (vendor-specific) – Google Refine, Amazon Karmasphere, Microsoft Excel connectors, etc…

Page 43: NoSQL for the SQL Server Professional

The Changing Data Landscape

NoSQL
RDBMS
Other Services

Page 44: NoSQL for the SQL Server Professional

www.TeachingKidsProgramming.org
• Free Courseware
• Do a Recipe Teach a Kid (Ages 10 ++)
• Java or Microsoft SmallBasic
• recipes)


Page 46: NoSQL for the SQL Server Professional

Toward Data Craftsmanship…

Follow me @LynnLangit

RSS my blog www.LynnLangit.com

Hire me
• To help build your BI/Big Data solution
• To teach your team next gen BI
• To learn more about using NoSQL solutions

Page 48: NoSQL for the SQL Server Professional


Thank you for attending this session and the 2012 PASS Summit in Seattle