Presto: Past, Present, and Future
Dain Sundstrom

Presto Meetup @ Facebook (2014-05-14)

In this talk we discuss the progress since Presto was open sourced, what the Presto team is working on now, and what we will be working on over the next year.

Page 1: Presto Meetup @ Facebook (2014-05-14)

Presto: Past, Present, and Future

Dain Sundstrom

Page 2

SELECT now() - INTERVAL '6' MONTH

Page 3

By The Numbers
▪ 6 months
▪ 15 releases
▪ 30 contributors
▪ 662 commits
▪ 1406 files changed
▪ 130,305 insertions(+), 43,699 deletions(-)

Page 4

New SQL Features
▪ Create table
▪ Distinct aggregations
▪ Cross joins
▪ Custom functions

Page 5

Optimizations
▪ Range predicate push down
▪ Distributed aggregations
▪ Distributed window functions
▪ Distinct-limit optimization
▪ Approximate queries
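
The distinct-limit item can be illustrated with a small sketch. The slide does not describe how Presto implements it; the version below only shows the general idea that a query of the form SELECT DISTINCT ... LIMIT n can stop reading input once n distinct values have been seen. The names `distinct_limit` and `counting_source` are hypothetical helpers, not Presto APIs.

```python
from typing import Dict, Hashable, Iterable, Iterator, List


def distinct_limit(rows: Iterable[Hashable], limit: int) -> List[Hashable]:
    """Return up to `limit` distinct values, stopping the scan early."""
    result: List[Hashable] = []
    if limit <= 0:
        return result
    seen = set()
    for row in rows:
        if row in seen:
            continue  # duplicate, does not count toward the limit
        seen.add(row)
        result.append(row)
        if len(result) == limit:
            break  # enough distinct values, skip the rest of the input
    return result


def counting_source(values: Iterable[Hashable], counter: Dict[str, int]) -> Iterator[Hashable]:
    """Yield values while recording how many were actually read (illustration only)."""
    for value in values:
        counter["read"] += 1
        yield value


counter = {"read": 0}
data = ["a", "b", "a", "c", "d", "e"]
print(distinct_limit(counting_source(data, counter), 3))  # ['a', 'b', 'c']
print(counter["read"])  # 4
```

In this toy input only four of the six rows are read before the limit is satisfied.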

Page 6

Type System
▪ Plugins can add new scalar types
▪ Extensible operators
▪ DATE, TIME, TIMESTAMP and INTERVAL
▪ Time zones with DST rules
▪ Localized parse and format
▪ HyperLogLog type

Page 7

New Connectors
▪ Hadoop 1.x
▪ Hadoop 2.x
▪ CDH 5
▪ Custom S3 integration for Hadoop
▪ Cassandra
▪ TPC-H

Page 8

SELECT now()

Page 9

Hive 0.13 Support
▪ New file formats
  ▪ ORC
  ▪ Parquet
  ▪ DWRF
▪ Vectorized ORC (2-3x more efficient)
▪ ORC stripe skipping
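
Stripe skipping can be sketched as follows, assuming each stripe carries min/max statistics for the filtered column. This is a simplified model rather than the ORC reader itself: `Stripe`, `make_stripe`, and `scan_range` are hypothetical names, and real ORC statistics are richer than a single integer range.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Stripe:
    """A block of rows with precomputed min/max statistics (assumed layout)."""
    min_value: int
    max_value: int
    rows: List[int]


def make_stripe(rows: Sequence[int]) -> Stripe:
    """Build a stripe and record its statistics at write time."""
    return Stripe(min(rows), max(rows), list(rows))


def scan_range(stripes: Sequence[Stripe], low: int, high: int) -> Tuple[List[int], int]:
    """Return (matching rows, number of stripes read) for low <= value <= high."""
    matches: List[int] = []
    stripes_read = 0
    for stripe in stripes:
        # The statistics prove no row can match, so the stripe is never read.
        if stripe.max_value < low or stripe.min_value > high:
            continue
        stripes_read += 1
        matches.extend(v for v in stripe.rows if low <= v <= high)
    return matches, stripes_read


stripes = [make_stripe([1, 5, 9]), make_stripe([10, 12, 19]), make_stripe([20, 25, 30])]
print(scan_range(stripes, 11, 19))  # ([12, 19], 1)
```

Only one of the three stripes is read for this predicate; the other two are ruled out by their statistics alone.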

Page 10

Index Joins
▪ Targeting low cardinality joins
▪ Lazy hash build
▪ Predicate push down
▪ Aggregation push down
▪ Initial version is already checked in
▪ Currently supported in HBase and MySQL
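
A rough sketch of the index-join idea, under the assumption that the indexed side can answer point lookups by key: instead of scanning the whole indexed table, the join looks up only the keys that occur on the probe side and caches the answers in a hash table built lazily. `IndexedSource` and `index_join` are hypothetical names; the slides do not show Presto's actual interfaces.

```python
from typing import Callable, Dict, Hashable, List, Sequence, Tuple


class IndexedSource:
    """Stand-in for a keyed store such as an HBase or MySQL table (assumption)."""

    def __init__(self, rows_by_key: Dict[Hashable, List[str]]) -> None:
        self._rows_by_key = rows_by_key
        self.lookups = 0  # counts remote lookups, for illustration

    def lookup(self, key: Hashable) -> List[str]:
        self.lookups += 1
        return self._rows_by_key.get(key, [])


def index_join(
    probe_rows: Sequence[tuple],
    source: IndexedSource,
    key_of: Callable[[tuple], Hashable],
) -> List[Tuple[tuple, str]]:
    """Join probe rows to the source, fetching each distinct key at most once."""
    cache: Dict[Hashable, List[str]] = {}  # hash table filled on demand
    joined: List[Tuple[tuple, str]] = []
    for row in probe_rows:
        key = key_of(row)
        if key not in cache:
            cache[key] = source.lookup(key)
        for match in cache[key]:
            joined.append((row, match))
    return joined


source = IndexedSource({1: ["alice"], 2: ["bob"], 3: ["carol"]})
probe = [("click", 2), ("view", 2), ("click", 9)]
print(index_join(probe, source, lambda r: r[1]))
# [(('click', 2), 'bob'), (('view', 2), 'bob')]
print(source.lookups)  # 2
```

With few distinct probe keys, the number of lookups stays small regardless of how large the indexed table is.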

Page 11

Connectors
▪ HBase
  ▪ Requires features in Facebook HBase
  ▪ Index joins
▪ JDBC (MySQL)
  ▪ Sharding
  ▪ Index joins

Page 12

Views
▪ Create/drop views
▪ View definition stored in connector
▪ Fully optimized by Presto
▪ Views stored in Presto syntax
▪ Not compatible with existing Hive views

Page 13

Machine Learning
▪ Supports classification and regression
▪ Multiple algorithms (currently only SVM)
▪ Feature extraction and normalization
▪ New functions and types
▪ Possibly extend SQL grammar
▪ Highly experimental

Page 14

Continuous Integration
▪ Continuous correctness testing
▪ Run queries against prod and trunk
▪ Continuous benchmark
▪ Run full test suite with every connector
▪ Faster release cycle
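
Running queries against prod and trunk amounts to a differential test. A minimal sketch, assuming each build is exposed as a function that takes SQL text and returns rows as tuples; `results_match`, `find_mismatches`, and the two runner callables are hypothetical, not part of Presto's tooling.

```python
from collections import Counter
from typing import Callable, Iterable, List, Sequence, Tuple

Row = Tuple
Runner = Callable[[str], Sequence[Row]]


def results_match(prod_rows: Iterable[Row], trunk_rows: Iterable[Row]) -> bool:
    """Compare results as multisets, since row order is not guaranteed without ORDER BY."""
    return Counter(prod_rows) == Counter(trunk_rows)


def find_mismatches(queries: Iterable[str], run_prod: Runner, run_trunk: Runner) -> List[str]:
    """Return the queries whose results differ between the two builds."""
    return [q for q in queries if not results_match(run_prod(q), run_trunk(q))]


# Fake runners standing in for a production cluster and a trunk build.
prod = {"q1": [(1, "a"), (2, "b")], "q2": [(3,)]}
trunk = {"q1": [(2, "b"), (1, "a")], "q2": [(4,)]}
print(find_mismatches(["q1", "q2"], prod.__getitem__, trunk.__getitem__))  # ['q2']
```

Here `q1` passes despite the different row order, while `q2` is flagged because the values differ.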

Page 15

SELECT now() + INTERVAL '1' YEAR
APPROXIMATE AT 95.0 CONFIDENCE

Page 16

SQL Features
▪ Structs, Maps and Lists
▪ Table generating functions
▪ Scalar subqueries
▪ Features required to run all TPC-DS
▪ Create table with partitioning
▪ Possibly: Insert, delete, drop partition

Page 17

Execution Engine
▪ Huge joins and aggregations
▪ Hash distributed
▪ Co-distributed and co-partitioned
▪ Spill to disk (flash)
▪ Work stealing
▪ Basic task recovery
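
The hash-distributed item can be pictured with a small sketch: both join inputs are partitioned by a hash of the join key, so matching rows land on the same worker and each worker joins only its own partitions. This simulates workers as lists in one process; `partition`, `local_hash_join`, and `distributed_join` are hypothetical names and say nothing about how Presto schedules the work.

```python
from collections import defaultdict
from typing import Callable, Hashable, List, Sequence, Tuple

Row = Tuple
KeyFn = Callable[[Row], Hashable]


def partition(rows: Sequence[Row], key_of: KeyFn, workers: int) -> List[List[Row]]:
    """Assign each row to a worker by hashing its join key."""
    parts: List[List[Row]] = [[] for _ in range(workers)]
    for row in rows:
        parts[hash(key_of(row)) % workers].append(row)
    return parts


def local_hash_join(left: Sequence[Row], right: Sequence[Row],
                    left_key: KeyFn, right_key: KeyFn) -> List[Tuple[Row, Row]]:
    """Ordinary in-memory hash join run by a single worker."""
    table = defaultdict(list)
    for r in right:
        table[right_key(r)].append(r)
    return [(l, r) for l in left for r in table.get(left_key(l), [])]


def distributed_join(left: Sequence[Row], right: Sequence[Row],
                     left_key: KeyFn, right_key: KeyFn,
                     workers: int = 4) -> List[Tuple[Row, Row]]:
    """Partition both sides the same way, then join partition by partition."""
    left_parts = partition(left, left_key, workers)
    right_parts = partition(right, right_key, workers)
    joined: List[Tuple[Row, Row]] = []
    for w in range(workers):
        joined.extend(local_hash_join(left_parts[w], right_parts[w], left_key, right_key))
    return joined


left = [(1, "a"), (2, "b"), (3, "c")]
right = [(1, "x"), (3, "y"), (3, "z"), (4, "w")]
for pair in sorted(distributed_join(left, right, lambda r: r[0], lambda r: r[0])):
    print(pair)
```

Because each worker holds only a fraction of the build side, the join can grow beyond what a single node's memory would allow.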

Page 18

Native Store
▪ Stores data directly on worker nodes
▪ Uses custom data format
▪ Initial use cases
▪ Store for 'hot' data
▪ Store for 'live' data
▪ Support co-distributed data

Page 19

Security
▪ Authentication
  ▪ Username/password, Kerberos, SSL cert
▪ Authorization
  ▪ Integration with plugins
  ▪ Grant permissions from SQL

Page 20

New REST API
▪ Prepared statements
▪ Bound parameters
▪ Server managed sessions
▪ Explicit support for non-query (DML/DDL)
▪ Split query submission, stats, and data fetching

Page 21

ODBC Driver
▪ Targeting major BI tools
  ▪ Tableau, MicroStrategy and Excel
▪ Support for Windows, Mac and Linux
▪ Will require new REST API
▪ Written in D
▪ Entirely open source (ASL2)

Page 22

Plugins
▪ Plugin repository
▪ Manage plugins from CLI
▪ Function catalogs
▪ Push down joins and aggregations
▪ Custom optimizers

Page 23

SELECT question
FROM audience
WHERE isAwesome(question)

Page 24

(c) 2007 Facebook, Inc. or its licensors. "Facebook" is a registered trademark of Facebook, Inc. All rights reserved.