Session 1: Introduction to Big Data

  • Published on
    19-Feb-2017


  • 09-08-2016

    1

    Big Data Analytics

    Dr. Umesh R. Hodeghatta, Ph.D

    http://www.mytechnospeak.com

    Email: umesh@mytechnospeak.com

    8/9/2016 1

    Outline

    Introduction to Big Data Processing

    What is Big Data

    Big Data Challenges

    How to handle Big Data

    Big Data Applications

    Big Data Architecture

    Big Data Tools


    What is Big Data

    Every day, we create billions of bytes of data

    sensors used to gather climate information

    Social media posts

    digital pictures and videos

    Transaction records

    Web logs

    Cell phones, WhatsApp messages

    1. Structured data

    2. Unstructured data

    3. Semi-structured data
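The three kinds of data named above can be illustrated with a small sketch. The sample records, field names, and the `total_amount` helper are hypothetical and chosen only for illustration; they do not come from the slides.

```python
import csv
import io
import json

# Structured: fixed columns, every row follows the same schema (e.g. transaction records).
structured_source = io.StringIO("txn_id,amount\n1,19.99\n2,5.00\n")
rows = list(csv.DictReader(structured_source))

# Semi-structured: self-describing keys, but the fields can differ from record to record.
semi_structured = json.loads('{"user": "a", "tags": ["x", "y"], "geo": {"lat": 12.9}}')

# Unstructured: free text (or images, video) with no predefined fields.
unstructured = "Loved the new phone, battery lasts all day!"


def total_amount(records):
    """Sum the 'amount' column; possible only because the schema is known in advance."""
    return sum(float(r["amount"]) for r in records)


if __name__ == "__main__":
    print(total_amount(rows))            # structured data can be aggregated directly
    print(sorted(semi_structured))       # keys must be discovered per record
    print(len(unstructured.split()))     # free text needs parsing before analysis
```

Structured data can be queried directly by column, whereas semi-structured and unstructured data usually need a parsing or extraction step first.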


    Examples of Big Data

    Facebook handles 40-50 billion photos from its site members every month.

    Walmart handles more than 2 million customer transactions every hour; its databases are estimated to contain more than 2.5 petabytes of data.

    FICO Falcon Credit Card Fraud Detection System protects 2.1 billion active accounts world-wide

    235 Terabytes of data collected by the US Library of Congress in 2011
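To get a feel for these magnitudes, the figures quoted above can be converted to common units. This is a rough sketch that assumes decimal (SI) units, where 1 terabyte is 10^12 bytes and 1 petabyte is 10^15 bytes; the slides do not say which convention they use.

```python
# Figures quoted on the slide.
TXNS_PER_HOUR = 2_000_000        # Walmart customer transactions per hour
WALMART_PETABYTES = 2.5          # estimated size of Walmart's databases
LOC_TERABYTES = 235              # US Library of Congress collection, 2011

# Assumed decimal (SI) unit sizes in bytes.
TERABYTE = 10**12
PETABYTE = 10**15

txns_per_second = TXNS_PER_HOUR / 3600
walmart_bytes = WALMART_PETABYTES * PETABYTE
loc_bytes = LOC_TERABYTES * TERABYTE
ratio = walmart_bytes / loc_bytes

if __name__ == "__main__":
    print(f"Transactions per second: {txns_per_second:.0f}")
    print(f"Walmart data vs. Library of Congress 2011: {ratio:.1f}x")
```

Under these assumptions the transaction rate works out to roughly 556 per second, and the Walmart estimate is on the order of ten times the 2011 Library of Congress figure.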


    What do you do with DATA?

    What can you do with DATA?

    Why do you need BIG DATA?


    What is BIG-DATA

    DATA is BIG in volume

    Challenges are different from SMALL data

    How to handle BIG DATA processing?

    CPU?

    Memory?

    Storage?

    New tools and techniques required


    Characteristics of BIG DATA

    Structured & Unstructured

    Volume: Terabytes to Zettabytes

    Batch processing to Streaming Data
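The difference between batch processing and streaming data can be sketched with a simple average. In this minimal illustration the sensor readings and both function names are hypothetical: the batch version needs the whole data set before it starts, while the streaming version updates its result as each value arrives.

```python
from statistics import mean

# Hypothetical sensor readings, used only for illustration.
readings = [21.0, 22.5, 19.5, 23.0]


def batch_mean(values):
    """Batch: the full data set is available before processing starts."""
    return mean(values)


def streaming_means(values):
    """Streaming: yield an updated running mean as each value arrives."""
    count = 0
    total = 0.0
    for value in values:
        count += 1
        total += value
        yield total / count


if __name__ == "__main__":
    print(batch_mean(readings))
    for running in streaming_means(readings):
        print(running)
```

The streaming version keeps only a count and a running total, so its memory use stays constant regardless of how many values arrive.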


    Characteristics of Big Data

    Volume

    The size of the data

    Variety

    Documents, messages, Facebook posts, YouTube videos, etc.

    Velocity

    Generation and processing of data

    Complexity

    Multiple sources, large volumes, less processing time


    Enabler of Big Data

    Challenges

    Capturing data

    Curation

    Storage

    Searching

    Sharing

    Transfer

    Analysis

    Presentation

    Increase in storage capacities

    Increase in processing power

    Abundance of data


    Big Data and Analytics

    "Big Data" refers not only to the storage of data but also to the practice of analysing that data and using the analysis to:

    Derive strategic decisions, such as introducing a new category of products, restructuring the organization, and improving customer care services by analysing customer care recordings/logs

    Learn the outcome of marketing campaigns

    Plan pricing strategy


    Big Data Technologies

    Processing

    Infrastructure to process and store huge volume of structured and unstructured data in real time

    Amazon, IBM, Microsoft, etc.

    Analytics

    Capabilities to analyze the data, build models, and perform complex analysis

    MapReduce, MongoDB, NoSQL, Cassandra
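MapReduce, listed above, is a programming model in which a map step emits key/value pairs and a reduce step combines all values that share a key. The following is a minimal single-process sketch of that idea using word counting; the function names are illustrative and this is not the Hadoop API, which distributes the same phases across many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big tools"]))
```

Because each document is mapped independently and each key is reduced independently, both phases can in principle run in parallel, which is what makes the model suitable for large data sets.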


    Magic Quadrant for Business Intelligence and Analytics Platforms


    Hype Cycle for Emerging Technologies, 2015


    Big Data Tools

    NoSQL Database: MongoDB, Cassandra, HBase, ZooKeeper, Redis

    MapReduce: Hadoop, Hive, Pig, Kafka, Flume, MapR, Oozie, Greenplum

    Processing: R, Lucene, Solr, Elasticsearch, Google

    Storage: HDFS, S3

    Servers: Elastic, Beanstalk, Google App Engine


    BIG DATA Players


    BIG DATA Applications


    Recommendation Systems


    Recommendation Engine


    Online Advertisement

    Online Users

    Users Profile

    News Articles - Reuters - NYTimes - LATimes - Associated Press

    Trend System

    Trends and Segments

    Stock Market Jobs City Events

    ONLINE ADVERTISEMENT


    Telecommunication

    Telecom Network (~25,000 devices)

    Alarms

    Provisioning

    Fault Analysis - Fault Detection

    Prediction - New Customers - New Services


    Social Network Analysis


    Useful Websites

    http://www.kdnuggets.com/

    http://www.kaggle.com


    Assignment

    Two Applications of Big Data

    Two different Areas/Domains

    Total Size of Data

    Tools and Technologies Used


    End Of Session

