
Poster: jindal-web.appspot.com/posters/cloudviews-poster-sigmod18.pdf

Computation Reuse in Analytics Job Service at Microsoft

Alekh Jindal, Shi Qiao, Hiren Patel, Zhicheng Yin, Jieming Di, Malay Bag, Marc Friedman, Yifung Lin, Konstantinos Karanasos, Sriram Rao

Microsoft

Motivation

• Job Service / Serverless Analytics / Analytics-as-a-Service:
  • Users only provide SQL queries over stored data
  • The cloud provider manages hardware, software, and execution
  • Users only pay for the processing cost

• SCOPE Job Service
  • Hyper-scale data processing system for internal data analytics at Microsoft

• Computation Reuse Challenge
  • Recurring workloads with new inputs/parameters
  • Always online with SLA requirements
  • Cost estimation is very challenging

CloudViews Architecture

✓ Materialized views over recurring workloads

✓ CloudViews Analyzer
  ✓ Feedback Loop
  ✓ View Selection
  ✓ Physical Design
  ✓ View Expiry

✓ CloudViews Runtime
  ✓ Metadata Service
  ✓ Online Materialization
  ✓ Query Rewriting
  ✓ Synchronization
  ✓ Job Coordination

Key Ingredients

• Reuse over Recurring Workloads
• Query Optimization
• Query Rewriting using Views
• Online View Materialization

We Are Hiring!

Performance Impact

Improved latency and processing time.

• Periodic jobs
• New inputs/parameters
• Different data formats
• Lots of user code
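
The poster names view selection, a metadata service, online materialization, and query rewriting without showing how they fit together. The following is a minimal Python sketch of one way those pieces could interact. It is not the SCOPE or CloudViews implementation. The names `Op`, `MetadataService`, `select_views`, and `rewrite`, the toy plans, and the idea of identifying subexpressions by hashing their subtree are all assumptions made for illustration. The sketch also registers a view as soon as a plan is rewritten, whereas a real system would have to wait until the producing job has written it.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Op:
    """Node of a toy logical plan (hypothetical; not a SCOPE operator)."""
    name: str
    args: tuple = ()
    children: tuple = ()


def signature(op: Op) -> str:
    """Hash the whole subtree so identical subexpressions share a signature."""
    child_sigs = ",".join(signature(c) for c in op.children)
    text = f"{op.name}|{op.args}|{child_sigs}"
    return hashlib.sha256(text.encode()).hexdigest()[:12]


@dataclass
class MetadataService:
    """Tracks which subexpressions were selected and which are materialized."""
    selected: set = field(default_factory=set)
    materialized: dict = field(default_factory=dict)  # signature -> view name


def select_views(workload, min_occurrences=2):
    """Analyzer step (assumed policy): pick subexpressions seen in several jobs."""
    counts = {}

    def visit(op):
        if op.children:  # ignore bare scans, there is nothing to reuse
            sig = signature(op)
            counts[sig] = counts.get(sig, 0) + 1
        for child in op.children:
            visit(child)

    for plan in workload:
        visit(plan)
    return {sig for sig, n in counts.items() if n >= min_occurrences}


def rewrite(op: Op, meta: MetadataService) -> Op:
    """Runtime step: reuse an existing view, or materialize a selected one."""
    sig = signature(op)
    if sig in meta.materialized:
        # Query rewriting: read the view instead of recomputing the subtree.
        return Op("ViewScan", (meta.materialized[sig],))
    new_op = Op(op.name, op.args, tuple(rewrite(c, meta) for c in op.children))
    if sig in meta.selected:
        # Online materialization: this job writes the view as a side effect.
        # Simplification: the view is registered immediately.
        view_name = f"view_{sig}"
        meta.materialized[sig] = view_name
        return Op("Materialize", (view_name,), (new_op,))
    return new_op


# Two hypothetical jobs that share one filtered scan.
shared = Op("Filter", ("country = 'US'",), (Op("Scan", ("clicks",)),))
job_a = Op("Aggregate", ("count by page",), (shared,))
job_b = Op("Aggregate", ("count by user",), (shared,))

meta = MetadataService(selected=select_views([job_a, job_b]))
plan_a = rewrite(job_a, meta)  # first job computes and materializes the filter
plan_b = rewrite(job_b, meta)  # second job reads the materialized view
```

In this sketch the first job pays for computing the shared filter and leaves a view behind, and the second job's plan is rewritten to scan that view. Synchronization and job coordination, which the poster lists as runtime components, are left out.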