Public, Private, and Hybrid Cloud Computing With Low Latency



  • Computing Techniques for Big Data

    IEEE Presentation, September 24, 2012

    Gayn B. Winters, Ph.D. (IEEE email username: gaynwinters)

    Public, Private, and Hybrid Cloud Computing With Low Latency

  • Multiples of bytes

    kilobyte (KB) 10^3

    megabyte (MB) 10^6

    gigabyte (GB) 10^9

    terabyte (TB) 10^12

    petabyte (PB) 10^15

    exabyte (EB) 10^18

    zettabyte (ZB) 10^21

    yottabyte (YB) 10^24

    Library of Congress ~ 10 PB

    (Binary multiples actually use powers of 1024: KB = 1024, GB = 1024^3, etc., up to YB = 1024^8.)
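The decimal (powers of 10) and binary (powers of 1024) interpretations drift apart as the prefixes grow, which a short loop makes concrete (a minimal sketch, not from the slides):

```python
# Decimal (SI) vs. binary (1024-based) interpretations of the byte prefixes.
PREFIXES = ["KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]

for i, p in enumerate(PREFIXES, start=1):
    decimal = 10 ** (3 * i)   # e.g. GB = 10^9
    binary = 1024 ** i        # e.g. GB = 1024^3
    drift = binary / decimal
    print(f"{p}: 10^{3*i} vs 1024^{i}: binary is {100*(drift-1):.1f}% larger")
```

At the kilobyte the gap is only 2.4%, but by the yottabyte the binary value is over 20% larger.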


  • Big Data = data that exceeds conventional systems in quantity (volume), transfer

    speed (velocity), and/or data type variety (the 3 V's)

    The Big Data market is worth ~$100B and growing ~10% per year

    As of June 2012, ~2.5 Exabytes (10^18) of business data are created DAILY

    Twitter receives 12TB of data a day

    A modern airline engine generates ~20 TB/hr of performance data

    Square Kilometre Array (radio telescope): 100s of TB/sec = 10s of exabytes/day

    Machines dominate data generation!!!

  • Moving Data to the Cloud will make this transmission ramp even steeper!

  • What do you do...? (Tipping Points)

    When your data size exceeds what can be put on or handled by one machine?

    When your daily calculations take longer than a day?

    When your fancy compute systems have a terrible return on assets (ROA), e.g. 10% utilization? (IT's dirty secret!)

  • Virtualization

    VMM (virtual machine monitor, or hypervisor): run multiple VMs on a single physical machine


    In the 1960s, IBM partitioned mainframes into virtual machines as an R&D tool on System/360. VM/370 was released in 1972, with acceptable performance.

    17 sensitive, non-privileged instructions on x86 had to be trapped and replaced for virtualization; pioneered by VMware (now EMC) and Connectix (now Microsoft).

    2005 VM support in Intel VT-x and AMD-V.

    Virtual IO, storage, networks

    Live VM Migration

  • Clouds

  • Cloud Computing

    Elastically and Independently provision:

    Compute resources


    Network capacity: access and internal

    Resource Pooling and Load Balancing

    Only pay for what you use; includes transmission, maintenance, etc.

    Failover, backup, restore, disaster recovery

    Security and availability are "guaranteed," but...

    See my blog for caveats

  • Cloud Computing: NIST Definition

    Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. This cloud model is composed of five essential characteristics, three service models, and four deployment models.

  • NIST Service Models

    Software as a Service (SaaS)

    rent an app+

    Platform as a Service (PaaS)

    rent an OS+

    Infrastructure as a Service (IaaS)

    rent a datacenter

  • NIST Deployment Models

    Private Cloud

    Own everything, provide cloud services internally

    Local or at a remote data center

    Community Cloud

    Private cloud with shared ownership and use

    Public Cloud

    Vendor owned, rent everything

    Hybrid Cloud

    Mix of the above

    The big trend now!!!

  • Public Cloud Vendors

    Amazon Web Services (AWS)






    Dozens of others....

  • Building Clouds

    Need VMM virtualization technology

    Some Linux, VMware, Citrix Xen, many startups

    Need infrastructure technology

    Eucalyptus implements the AWS APIs, enabling AWS hybrids; it has the most installations

    OpenStack started by NASA and Rackspace; Apache-licensed, with 100s of supporting IaaS vendors

    CloudStack started by Cloud.com; purchased by Citrix, now part of Apache. Many installations, but fewer supporters than OpenStack.

    Other technology: monitoring, CASE, etc.

  • PCworld's Virtualization Table

  • Why Public Clouds?

    Usually cheaper than private clouds

    Public cloud vendors benefit from economies of scale

    Equipment: volume discounts

    Energy: can negotiate discounts and can locate their data centers where energy costs are low and air conditioning isn't much needed.

    Can recruit more (and better?) expertise

    Better for high usage elasticity

    Seasonal, tax, audits, random projects, fast growth, negative growth, schools, ...

  • Why Private Clouds?

    Availability (public clouds often deliver < 99.9% uptime; outages are well publicized)

    Security and Regulatory Conformance

    Performance: configure your own hardware for

    Base CPU, memory, speed to storage

    High IO/s

    Low latency

    Transition step to hybrid and public clouds

  • Google: Failures/Year/Cluster

    ~0.5 heat problems (power down most equipment < 5 min, ~2 day MTTR)

    ~1 PDU failure (500-1000 machines gone, ~6 hrs MTTR)

    ~1 rack move (~500-1000 machines down, ~6 hrs MTTR)

    ~1 network rewiring (rolling ~5% down over a 2-day span)

    ~20 rack failures (40-80 machines disappear, 1-6 hrs MTTR)

    ~6 racks see 50% packet loss (?? MTTR)

    ~8 network maintenance failures (~30 min connectivity loss)

    ~12 router reloads (2-3 min MTTR, web and clusters)

    ~3 router failures (web access, ~1 hr MTTR)

    ~1000 machine failures; many thousands of disk failures

    Many minor hw and sw problems; many Internet problems

  • Hadoop Distributed File System(Modeled after Google's GFS)

    Divide (big) file into large chunks 64-128MB.

    Replicate each chunk at least 3 times, storing each chunk as a Linux file on a data server. Usually put 2 replicas in the same rack.

    Optimize for reads and for record appends.

    Before every read, checksum the data.

    Maintain a master name server to keep track of metadata, incl. chunk locations, replicas,...

    Clients cache metadata and read/write directly to the data servers.
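The chunking and rack-aware replica placement above can be sketched in a few lines (a toy illustration; function and server names are made up, not HDFS APIs):

```python
CHUNK_SIZE = 64 * 2**20  # 64 MB, the low end of the 64-128 MB range above

def split_into_chunks(file_size, chunk_size=CHUNK_SIZE):
    """Return (chunk_index, offset, length) tuples covering the file."""
    chunks, offset, idx = [], 0, 0
    while offset < file_size:
        length = min(chunk_size, file_size - offset)
        chunks.append((idx, offset, length))
        offset += length
        idx += 1
    return chunks

def place_replicas(racks, replication=3):
    """Rack-aware placement matching the policy above:
    2 replicas in one rack, the third in a different rack."""
    rack_names = list(racks)           # racks: dict rack_name -> data servers
    first, second = rack_names[0], rack_names[1]
    return [racks[first][0], racks[first][1], racks[second][0]]

racks = {"rack1": ["ds1", "ds2"], "rack2": ["ds3", "ds4"]}
print(split_into_chunks(200 * 2**20))  # a 200 MB file becomes 4 chunks
print(place_replicas(racks))           # ['ds1', 'ds2', 'ds3']
```

A real name server would also track which data server holds each chunk replica; here only the placement decision is shown.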

  • [Diagram: master name server coordinating multiple chunk/data servers]

  • HDFS/GFS Notes

    A client interacts with the master name server only to get file metadata, which it caches. Then it can interact directly with the data servers.

    Data servers update their metadata with the master every minute or so. (Heartbeat)

    Large chunk size reduces size of metadata, master-client interaction, and network overhead.

    Master name server is a single point of failure that is addressed with shadow servers and logs.

    Applications must optimize to the file system.

    No Network Attached Storage or RAID (too expensive)

    No Mainframes (or big databases)

  • MapReduce: System to Process Big Files

    Prepare a Big File for input as (key1, value1) pairs into multiple Mapper jobs

    Map(key1, value1) generates intermediate (key2, value2) pairs and partitions this output into R files for input into R parallel Reducers.

    When all mappers are done, the R reducers read and sort their input files.

    Reduce(key2, value2) does a data reduction processing stage. Output = R Big Files of (key3, value3) pairs (neither sorted nor combined).
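The map, partition, sort, and reduce stages above can be compressed into a toy in-memory version (a sketch for intuition only; a real system streams R files across many machines):

```python
from collections import defaultdict

def map_reduce(records, mapper, reducer, num_reducers=2):
    """Toy in-memory MapReduce: map -> partition into R buckets -> sort -> reduce."""
    # Map phase: each record's (key2, value2) output lands in one of R
    # buckets (the "R files"), chosen by hashing key2.
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for key1, value1 in records:
        for key2, value2 in mapper(key1, value1):
            buckets[hash(key2) % num_reducers][key2].append(value2)
    # Reduce phase: each reducer sorts its keys, then reduces each group.
    output = []
    for bucket in buckets:
        for key2 in sorted(bucket):
            output.extend(reducer(key2, bucket[key2]))
    return output

# Word count, the classic example
def wc_map(doc, text):
    return [(w, 1) for w in text.split()]

def wc_reduce(word, counts):
    return [(word, sum(counts))]

print(map_reduce([("d1", "a b a"), ("d2", "b c")], wc_map, wc_reduce))
```

Note the output is sorted only within each reducer's bucket, exactly as the slide says: the R result files are neither globally sorted nor combined.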

  • MapReduce System

    [Diagram: Job Master coordinating mapper and reducer workers]

  • Interesting Google Applications of MR

    Inverting {outgoing links} to {incoming links}

    Efficient support of Google Queries

    Stitching together overlapping satellite images for Google Earth

    Rendering map-tile images of road segments for Google Maps

    15,000+ MR applications; 100,000+ MR jobs/day

  • Simple MapReduce Examples


  • Sort on Key

    Map(key, value) = (key, value), the identity function

    Reduce(key, value) = (key, value), the identity function

    With multiple reducers, Map needs a partitioner class or an order-preserving hash function (k1 < k2 implies hash(k1) <= hash(k2)), so that concatenating the reducers' sorted outputs yields a globally sorted result
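An order-preserving partition function is easy to build from key-range boundaries (a minimal sketch; the boundary values are assumptions, and a production partitioner would sample the data to pick them):

```python
def range_partition(key, boundaries):
    """Order-preserving partitioner: keys below boundaries[i] go to reducer i.
    k1 < k2 implies range_partition(k1) <= range_partition(k2), so
    concatenating the reducers' sorted outputs is globally sorted."""
    for i, b in enumerate(boundaries):
        if key < b:
            return i
    return len(boundaries)

# 3 reducers splitting string keys at "i" and "q"
keys = ["pear", "apple", "zebra", "fig", "quince", "mango"]
parts = {i: [] for i in range(3)}
for k in keys:
    parts[range_partition(k, ["i", "q"])].append(k)

# Each reducer sorts its own partition; concatenation is fully sorted.
result = [k for i in range(3) for k in sorted(parts[i])]
print(result)  # ['apple', 'fig', 'mango', 'pear', 'quince', 'zebra']
```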

  • String Search with Sorted Output

    Map(docName, contents): for each line_number in contents,

    if string in line then emit(docName, line_number)

    Use an order-preserving hash to assign output to the reducers

    Reduce(docName, line_number) = (docName, contents[line_number-1 : line_number+1]), i.e. each match with a line of context on either side

  • Index a Document

    map(pageName, pageText):

    for each word w in pageText:

    emit(w, pageName) to file(hash(w))

    Use an order-preserving hash to route output to the reducers

    reduce(word, values):

    for each pageName in values, until word changes: AddToPageNameList(pageName)

    emit(word, ":", PageNameList)

  • Example Word Index

    Input documents: A = "Dog Cat Play Dig", B = "Girl Dog Play Fun", C = "Boy Cat Girl Play Fun"

    Mapper output: (Dog,A) (Cat,A) (Play,A) (Dig,A) (Girl,B) (Dog,B) (Play,B) (Fun,B) (Boy,C) (Cat,C) (Girl,C) (Play,C) (Fun,C)

    After partitioning and sorting at the reducers:

    Reducer 1: Boy C; Cat A, C; Dog A, B; Dig A

    Reducer 2: Fun B, C; Girl B, C; Play A, B, C

    Final index: Boy: C | Cat: A, C | Dog: A, B | Dig: A | Fun: B, C | Girl: B, C | Play: A, B, C
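The worked example above can be reproduced end to end (a sketch with the documents reconstructed from the slide's mapper output; the shuffle is modeled by a single sort):

```python
from collections import defaultdict

docs = {"A": "Dog Cat Play Dig",
        "B": "Girl Dog Play Fun",
        "C": "Boy Cat Girl Play Fun"}

# Map: emit (word, docName) for every word in every document
pairs = [(w, name) for name, text in docs.items() for w in text.split()]

# Shuffle/sort + Reduce: group document names under each word
index = defaultdict(list)
for word, name in sorted(pairs):
    if name not in index[word]:
        index[word].append(name)

for word in sorted(index):
    print(f"{word}: {', '.join(index[word])}")
# Cat: A, C / Dog: A, B / Play: A, B, C / ... matching the final index above
```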



  • Latency

    One way network latency = time from packet send start at the source to packet arrival start at the destination.

    Round trip network latency = time from packet send start at the source to packet arrival start back at the source, when the target just returns the packet with no processing. (Ping)

    Transaction time = latency + processing time

  • Low Latency Applications

    Trading, esp. arbitrage

    Info Week: A 1 millisecond advantage in trading can be worth $100 Million per year.

    Broadcasting market data, publishing volatilities, reconfiguring connections and performing other time-critical tasks.

    Connecting news to trades: a weather forecast in Brazil affects the futures price of coffee.

    Telecom voice and video quality

    Beware of "droop": a slow handshake speed dragging down throughput on fiber channels.

  • Latency in Networking

    4G and LTE mobile connections are driving the removal of latency problems across the entire infrastructure.

    Voice and video (and games of course)

    Trading using mobile devices.

    Light travels 1 km in ~3.33 microseconds in a vacuum, and ~5.0 microseconds per km through fiber (and the route is almost never a straight line nor unaided).

    Distance from NYC to LA is 2462 miles (3961 km), but 2777 miles (4469 km) driving. Fiber distance is close to driving distance.

    One-way latency is therefore at least ~22.3 milliseconds (4469 km x 5 us/km)

    Add amplifier and regeneration droop time, longer distances.
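The 22.3 ms floor follows directly from the figures above, and it is worth computing the theoretical vacuum floor alongside it:

```python
# Back-of-the-envelope one-way latency NYC -> LA, using the slide's figures.
fiber_us_per_km = 5.0    # light through fiber
vacuum_us_per_km = 3.33  # light in a vacuum

straight_km = 3961  # great-circle distance
driving_km = 4469   # fiber routes track driving distance more closely

one_way_ms = driving_km * fiber_us_per_km / 1000
floor_ms = straight_km * vacuum_us_per_km / 1000
print(f"One-way fiber latency: {one_way_ms:.1f} ms")        # ~22.3 ms
print(f"Theoretical vacuum floor: {floor_ms:.1f} ms")       # ~13.2 ms
```

Amplifier and regeneration delays only push the real number higher, which is why trading firms pay premiums for shorter routes.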



  • Latency in the Cloud

    Assume your packet to the cloud (private or public) contains the command to start a job.

    Latency is measured from the time the packet starts to arrive to the time the job starts.

    Includes sending and receiving the entire transaction packet

    Low latency applications want (and will pay for) 10 GigE connections (and even 40 GigE sometimes) from clients to servers. These are in or near the datacenter!

  • Low Latency Requirements


    High MTTF (highly reliable equipment)

    Low MTTR (nanosecond failover)

    Bandwidth: 10-40 GigE

    Latency of all components in 100s of nanoseconds or less

    Timing, Timestamping, Synchronization

    Accurate clocks with nanosecond precision

  • What's Happening at the Bleeding Edge?

  • International Securities Exchange

    Orders acknowledged in 310 us, from entry to ISE network, through the matching engine, and back via 10 GigE

    Round trip mass quote to acknowledgement


  • BATS: licensed 2008, 11-12% of securities trading

    All electronic trading

    Roundtrip from firewall to FIX handler to matching engine and back

    Sustained under heavy load, including open and close of day trading

    10 GigE

    SAN storage (EMC and Hitachi)

    Avg 191 us (99.9% within 2 milliseconds)



  • NASDAQ

    59 us round trip for an order leaving the customer port, through the colo network and firewalls to the matching engine and back.

    Measured across the day, including open and close of trading.

    10 GigE (NASDAQ also sells access to its 40 GigE subnet)


  • CheckPoint Firewalls

    Data Security Checks

    Security clearance in 5 us

    Moves 110 billion bits/sec thru firewall

    Uses 108 core processors

    Supports 300,000 connections/second

    Forwards 60,000,000 packets/second


  • Switch Latency

    Switch hardware

    Zeptonics, a Sydney-based financial technology firm, said clients had begun testing its trading switch, which routes messages inside high-speed data centers in 130 nanoseconds.

    The Cisco Nexus 3016 switch can get the last character of a 1,024-character message onto the wire in 1.17 microseconds; the last character of a 256-character message makes it onto the wire in 950 nanoseconds. It has nanosecond accuracy/consistency for PTP (Precision Time Protocol).

  • Cabling Latency

    Latency in nanoseconds for fiber, and milliseconds for copper. Obvious choice!

    Fiber: lighter, lower power, lower heat.

    BUT, fiber to the desk is impractical due to cost and the (current) lack of video, voice, and Power over Ethernet (PoE) services. Consider combo cables, e.g. from Siemon.

    All cabling and installation must be of the highest quality. Do not go cheap here!

  • How to get TCP Latency < 1 ms

    Often TCP buffering (Nagle's algorithm) gets in the way and adds ~40 ms of latency. The flush() function on the socket won't help.

    Set TCP_NODELAY on the socket (in Java, use Socket.setNoDelay(true) )

    Be careful of unbalanced networks. Too much TCP_NODELAY causes congestion.

    Another approach: use UDP as many real-time applications do. Sometimes, however, TCP performs better!
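In Python the same switch as Java's Socket.setNoDelay(true) is a setsockopt call (the endpoint below is a placeholder, not a real service):

```python
import socket

# Disable Nagle's algorithm (TCP_NODELAY) so small writes go out immediately
# instead of being buffered for coalescing.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

# Confirm the option took effect before connecting
assert sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0

# sock.connect(("gateway.example.com", 9000))  # hypothetical low-latency endpoint
sock.close()
```

As the slide warns, apply this selectively: flooding a congested network with many tiny un-coalesced packets can make overall latency worse.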

  • Clocks


    SyncPoint PCIe-1000 PTP clock for Linux

    170 nanosecond accuracy to UTC (USNO)

    Synchronizes all servers on LAN within 10 nanoseconds of each other


    Other: Korusys, Meinberg, Endrun, Optisynx, TimeSync,

    Absolute time: GPS (~100 ns), DCF77 (long wave from Germany at 77.5 kHz, accuracy 2.5 us), CDMA/GSM mobile networks (accuracy 10 us)

  • Saved the Best for Last - Servers

  • Google's Server

  • Google Server Backside

    Photo by Stephen Shankland of CNET

  • Google Server Details

    2 CPUs

    2 Hard Drives

    Custom 12v Power Supply

    Low cost, peak capacity, high efficiency

    Gigabyte MB converts 12v to lower

    12v Battery Backup (99.9% efficiency vs 95%)

    8 sticks DRAM

    Ethernet, USB, and other ports

    2U = 3.5" high

  • Facebook Rack of Servers

    1.5U, no big UPS, no AC, tall heat sinks, four huge fans, no tools, 6 lbs lighter, cheaper; motherboard from Quanta Computer (Taiwan)

  • Facebook's Open Compute Server

  • Open Compute(From Facebook's Specifications)

    Open Rack (OU = 1.5U), 21" wide

    60mm fans more efficient than 40mm

    Tall cooling towers

    Custom power supplies use 277-volt AC to avoid step-downs from outside power (Delta Electronics of Taiwan and Power-One of California)

    One UPS per row: 20 48v batteries. Power supply has separate connection for UPS power.

    Datacenter has separate surge and harmonics suppression.

  • Facebook Open Compute v2 Server

  • Intel Open Compute mb v2

  • AMD Roadrunner Open Compute v2: Designed for Financial Apps

  • What is your ideal server?

    CPU etc



    PCIe vs SATA

    SSD vs HDD

    Network Connections

    UPS and Power Supply


  • Consulting and Careers

    Data Scientist, Hadoop Administrator: new job titles

    IBM's Big Data University (2,400 at their bootcamps, 14,000 registered for on-line)

    Hadoop certification (Cloudera, IBM, Hortonworks, MapR, Informatica)

    3rd Hadoop World (Oct 23-25, 2012 NYC)

    A deluge of Hadoop activities in Silicon Valley

    Consult on using Google Apps, AWS, Hyper-V, etc.

    Performance experts needed!!! (esp. Low Latency)

    Security experts needed!!!

    Analysis/algorithms experts needed !!!

  • Thank You!


    Please contact me at 366-4296 (area code 714), or FON-GAYN if you remember letters better, or by email with username gaynwinters.



