Managing the Proliferation Problem
Oracle Labs perspective on language-integrated querying & data analytics

Hassan Chafi
Sr. Director, Research and Advanced Development, Oracle Labs

Matthias Brantner, Laurent Daynes, Sungpack Hong

Copyright © 2017, Oracle and/or its affiliates. All rights reserved.
Confidential – Oracle Internal/Restricted/Highly Restricted

Safe Harbor Statement

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.

The Tradeoff Triangle Leads to Proliferation of Solutions

[Figure: tradeoff triangle with three corners labeled "Productive / Ease of use", "General / Expressive Power", and "Execution Performance".]

Proliferation of Solutions Everywhere You Look

• Programming Languages (and associated libraries)
  – Java
  – Python
  – R
  – JavaScript
  – Ruby
• Databases
  – Row and OLTP
  – Columnar and OLAP
  – Domain Specific (Spatial, Multimedia)
  – Scale Out (NoSQL)
• Analytical Runtimes
  – MapReduce, PIG, Flume
  – Delite, Spark, Flink
  – Spark Processing, Storm
  – TensorFlow, Caffe

Zoom in on One Domain: Graph Analysis

• Two major approaches
  – Computational graph analytics
    • Iterate over the graph multiple times and compute its mathematical properties
  – Graph pattern matching
    • Query the graph to find sub-graphs that match a given relationship pattern
• Many workloads require applying both approaches at once

[Figure: example graph with edges labeled "spouse" and "friend".]
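The two approaches can be contrasted on a toy graph. The sketch below is illustrative only: the vertex names, the function names `pageRank` and `matchFriendsOfSpouse`, and the choice of a basic PageRank as the "computational" example are assumptions, not taken from the slides.

```javascript
// Toy graph with labeled edges, loosely modeled on the slide's figure.
// All vertex names are hypothetical.
const edges = [
  { src: 'Ann', dst: 'Bob', label: 'spouse' },
  { src: 'Bob', dst: 'Ann', label: 'spouse' },
  { src: 'Ann', dst: 'Cid', label: 'friend' },
  { src: 'Bob', dst: 'Dee', label: 'friend' },
  { src: 'Cid', dst: 'Dee', label: 'friend' },
  { src: 'Dee', dst: 'Ann', label: 'friend' },
];
const vertices = [...new Set(edges.flatMap(e => [e.src, e.dst]))];

// Computational analytics: sweep over the whole graph several times.
// This is a basic PageRank with damping factor d; every vertex in the
// toy graph has at least one outgoing edge, so no dangling-node handling.
function pageRank(iterations = 20, d = 0.85) {
  const n = vertices.length;
  let rank = Object.fromEntries(vertices.map(v => [v, 1 / n]));
  const outDegree = Object.fromEntries(vertices.map(v => [v, 0]));
  for (const e of edges) outDegree[e.src] += 1;
  for (let i = 0; i < iterations; i++) {
    const next = Object.fromEntries(vertices.map(v => [v, (1 - d) / n]));
    for (const e of edges) {
      next[e.dst] += d * rank[e.src] / outDegree[e.src];
    }
    rank = next;
  }
  return rank;
}

// Pattern matching: find (a)-[spouse]->(b)-[friend]->(c).
function matchFriendsOfSpouse() {
  const matches = [];
  for (const s of edges) {
    if (s.label !== 'spouse') continue;
    for (const f of edges) {
      if (f.label === 'friend' && f.src === s.dst) {
        matches.push({ a: s.src, b: s.dst, c: f.dst });
      }
    }
  }
  return matches;
}

// Both at once: score each matched sub-graph by the rank of vertex c.
const rank = pageRank();
const scored = matchFriendsOfSpouse().map(m => ({ ...m, score: rank[m.c] }));
```

The last two lines show why a workload may need both styles: the pattern query selects sub-graphs, and the iterative computation supplies a score for ordering them.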

Graph System Landscape

• Graph Database. Focus: data modeling, management, and querying
• Graph Framework. Focus: graph traversal and computation
• Two different data models: RDF and Property Graph
  – RDF example (edges as triples): John bornIn New York; John hasCar Hyundai; John friendOf Mary; Mary liveIn New York; Mary favor poetry
  – Property Graph example: vertex (name: John, hasCar: Hyundai, bornIn: New York); vertex (name: Mary, liveIn: New York, favor: poetry); edge (label: friendOf)
• Execution environments: Single Machine, Cluster Environment
  – igraph, NetworkX
  – PGX, PGX (dist)
• New in 12.2c
• Integration to provide best of both worlds
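The difference between the two data models can be made concrete with the slide's John/Mary example. This is a minimal sketch under stated assumptions: the vertex ids `v1`/`v2` and the helper `toTriples` are hypothetical, and plain strings stand in for the IRIs a real RDF store would use.

```javascript
// Property graph: vertices carry key/value properties, edges carry a label.
// Property values come from the slide; the ids are made up.
const propertyGraph = {
  vertices: [
    { id: 'v1', props: { name: 'John', hasCar: 'Hyundai', bornIn: 'New York' } },
    { id: 'v2', props: { name: 'Mary', liveIn: 'New York', favor: 'poetry' } },
  ],
  edges: [{ src: 'v1', dst: 'v2', label: 'friendOf' }],
};

// RDF-style view: every fact becomes a (subject, predicate, object) triple,
// so vertex properties and edges end up in the same flat representation.
function toTriples(graph) {
  const nameOf = Object.fromEntries(graph.vertices.map(v => [v.id, v.props.name]));
  const triples = [];
  for (const v of graph.vertices) {
    for (const [key, value] of Object.entries(v.props)) {
      if (key !== 'name') triples.push([v.props.name, key, value]);
    }
  }
  for (const e of graph.edges) {
    triples.push([nameOf[e.src], e.label, nameOf[e.dst]]);
  }
  return triples;
}

const triples = toTriples(propertyGraph);
```

In the property graph, `hasCar` is an attribute of the John vertex; in the triple view it is an edge to a separate `Hyundai` node, which is the modeling difference the slide's figure illustrates.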

PGQL: a Property Graph Query Language (GRADES 2016)

• About PGQL
  – No standard query language for Property Graphs exists
  – Cypher from Neo4j (proprietary) has several issues both in syntax and semantics
  – PGQL is our own proposal, implemented in PGX
• Publication Activities
  – Published the specification (pgql-lang.org)
  – We provide a parser for it on GitHub for wider adoption
  – Working with ISO/SQL to extend SQL with Graph

Proliferation of Languages

• Languages have become a fashion for developers
  – Different languages for different tasks
    • Web: JavaScript, Ruby, PHP; Statistical computing: R, Python; Systems computing: C, C++
    • All tiers are becoming polyglot
  – Many large companies allow developers to choose
• Long Tail Problem (TIOBE PL Index from August 2016)

[Figure: bar chart "Programming Language Market Share", y-axis from 0% to 20%. Languages shown, in order: Java, C, C++, C#, Python, PHP, JavaScript, Visual Basic…, Perl, Assembly…, Delphi/Object…, Ruby, Visual Basic, Swift, Objective-C, Groovy, R, MATLAB, PL/SQL, Go.]

GraalVM

[Figure: GraalVM architecture diagram. Languages: JavaScript, R, Ruby, Python, Go, LLVM (C, Fortran). Framework layer: Truffle/Graal. Hosts: HotSpot JVM, Substrate VM.]

Polyglot, Embeddable, Performance

GraalVM for Data Processing

• UDF and Stored Procedure Language in Database Systems (Oracle Database, MySQL, Oracle NoSQL)
• Just-in-time compilation for SQL operators and expressions

JavaScript User-Defined Function

    // Use off-the-shelf NPM module
    var validator = require('validator');

    // Export entry point to the Database (CommonJS)
    module.exports.isEmail = function(email) {
      return validator.isEmail(email);
    };

    // TypeScript declaration
    export function isEmail(email : string) : boolean;

    -- In SQL
    SELECT isEmail(email) FROM ADDRESSBOOK;

Expression in WHERE clause: Impact on Performance

[Figure: two bar charts, y-axis in seconds (0 to 35).]

SQL Expression Only
  sql      30.01 s
  wse       5.63 s

    select count(*) from wmbint2 where (n1 * ((((n2 * 2) - n3) + n4) - (n5 + n6))) > 0;

SQL Expression + JavaScript Function
  sql+js   31.93 s
  wse+js    5.22 s

    select count(*) from wmbint2 where (n1 * ((((n2 * 2) - n3) + n4) - JS_ADD(n5, n6))) > 0;
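The two benchmark queries differ only in how `n5 + n6` is computed. The sketch below restates both predicates in JavaScript to show that they select the same rows. It is an assumption that `JS_ADD` simply adds its two arguments (the slides do not show its body); the function names and the sample rows standing in for table `wmbint2` are made up.

```javascript
// Hypothetical stand-in for the JS_ADD user-defined function.
function jsAdd(a, b) {
  return a + b;
}

// WHERE predicate of the plain SQL query, using the built-in "+".
function predicateSql(r) {
  return r.n1 * ((((r.n2 * 2) - r.n3) + r.n4) - (r.n5 + r.n6)) > 0;
}

// Same predicate with the addition routed through the JavaScript function.
function predicateJs(r) {
  return r.n1 * ((((r.n2 * 2) - r.n3) + r.n4) - jsAdd(r.n5, r.n6)) > 0;
}

// Made-up rows; the comments give the value of the left-hand side.
const rows = [
  { n1: 1, n2: 5, n3: 2, n4: 1, n5: 3, n6: 4 },   //  1 * (9 - 7)  =  2
  { n1: 2, n2: 1, n3: 4, n4: 0, n5: 1, n6: 1 },   //  2 * (-2 - 2) = -8
  { n1: -1, n2: 0, n3: 3, n4: 1, n5: 2, n6: 2 },  // -1 * (-2 - 4) =  6
];

// Equivalent of "select count(*) ... where <predicate>".
const countSql = rows.filter(predicateSql).length;
const countJs = rows.filter(predicateJs).length;
```

Since the results are identical, any timing difference between the two queries comes from the cost of crossing into the JavaScript function on each row, which is what the chart measures.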

TPC-H Q1 & Q6

[Figure: stacked bar chart, average of several runs in the same session, comparing Oracle and Walnut on Q1 and Q6 (y-axis 0 to 12). Components: Fetch, LNX, Walnut, Other, JIT.]

[Figure: time (s) per run for runs 1 to 5 (y-axis 0 to 2.5). Series: JIT, Walnut, LNX, Fetch.]

Some Other Key Challenges Resulting from Solution Proliferation

• Data movement is a big issue for customers
  – Savvy customers are aware of this, and it is one of the first questions they ask when being pitched a new computational engine
• Customers want to reduce time to insight from real-time data
  – They are dubious of anything that requires hours of pre-processing with no incremental story
• Typical workflows span various data sources that typically are queryable and encompass some type of security model, so copying the data is not always feasible
  – An unstable geopolitical environment adds further relevance to this issue

Mashups of Solutions made even Simpler with Notebooks

• Top PGX request was/is visualization and higher level interfaces (targeting the Excel user)
• Led to adoption of notebook interfaces
  – First introduced by Mathematica 1.0 followed by IPython (Jupyter), RStudio, then Zeppelin
  – Every Big Data vendor is introducing a notebook product
• General requirement for big data era
  – SW Engineers require IDEs
  – Data Scientists require Data Studios
• We set out to only develop a graph visualization plugin in Jet to be embedded in Zeppelin
  – Hit incompatibilities between Jet and Zeppelin
  – Decided to write a Jet front-end to Zeppelin
• Ambitions grew as we encountered success

Enough Problems, What are the Opportunities?

• Can we achieve more interoperability between runtimes?
  – Not only in hiding data movement detail
  – But also in breaking up the overall workflow computation into a high level IR that can allow for optimal placement of computation and reduction of data movement
• How do we get vendors to adopt a proposed solution?
  – Network effect at play here
• Standards are typically the main way to improve the situation
• Web Standards seem to be a good model in terms of balancing common standards and innovations

What Should This Look Like?

• Core IR/Algebra for data analytics
  – Other systems can extend this IR with vendor-specific extensions, but those should always be expressible in terms of operators that are part of the core standard IR
• Security Model should be baked in, with a cell-level security label model
  – Return Null if the user doesn’t have the privilege to access the data
  – Ensure Null semantics allow reasonable processing in the presence of nulls
• Data Provenance might also be important to bake in
  – Government regulation, including emerging regulation of cross-border data movement, is both a challenge and an opportunity
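The cell-level label idea can be sketched as a scan that masks cells instead of rejecting the query. Everything below is hypothetical: the label names, their ordering, the sample table, and the functions `secureScan` and `sum` are illustrations of the two sub-bullets, not a proposed design from the slides.

```javascript
// Hypothetical label ordering: a higher number means more restricted.
const LEVELS = { public: 0, internal: 1, restricted: 2 };

// Each cell is stored as { value, label }. The rows are made up.
const table = [
  { name: { value: 'John', label: 'public' }, salary: { value: 100, label: 'restricted' } },
  { name: { value: 'Mary', label: 'public' }, salary: { value: 120, label: 'internal' } },
];

// Scan that returns null for every cell the caller's clearance does not cover.
function secureScan(rows, clearance) {
  return rows.map(row =>
    Object.fromEntries(
      Object.entries(row).map(([col, cell]) =>
        [col, LEVELS[cell.label] <= LEVELS[clearance] ? cell.value : null])
    )
  );
}

// Null-tolerant aggregate: nulls are skipped, and the result is null only
// when no visible value exists (the behavior of SQL's SUM).
function sum(rows, col) {
  let total = null;
  for (const r of rows) {
    if (r[col] !== null) total = (total ?? 0) + r[col];
  }
  return total;
}

const visibleToInternal = secureScan(table, 'internal');
const internalSalarySum = sum(visibleToInternal, 'salary');
```

With `internal` clearance, John's salary is masked to null and the sum still evaluates over the remaining visible value. This also shows the limit of the approach: the caller cannot distinguish "missing" from "hidden", which is why the null semantics need care.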

Pluggable Runtimes and Optimizers

• Runtimes that are compatible with this IR have to implement at least a subset to get the “certified” seal, with multiple levels of compatibility supported
  – Scan + Filter at a minimum
• Runtimes should report to a global optimizer the set of IR operators they support
• Data movement between Runtimes (including data layout transformation) should be explicitly surfaced in the cost model, so the optimizer can decide when data should not be moved even when significant performance gains are to be had
• Data caching and incremental updates to the data should also be part of the design of the system
  – Data is rarely static
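A rough sketch of how a global optimizer might use the operator sets reported by runtimes, with data movement as an explicit cost term. All runtime names, operator names, and cost numbers are invented for illustration. The greedy per-operator choice is a simplification rather than an optimal planner, and the row count is held constant across operators.

```javascript
// Hypothetical runtimes: each reports the IR operators it supports,
// with a per-row compute cost for each, and where its data lives.
const runtimes = {
  database: { location: 'db', ops: { scan: 1, filter: 1, join: 4 } },
  graphEngine: { location: 'cluster', ops: { scan: 2, filter: 1, pagerank: 3 } },
};

// Cost of moving one row between locations, layout transformation included.
const MOVE_COST_PER_ROW = 5;

// Greedy placement for a linear plan: for each operator, pick the runtime
// with the lowest compute cost plus the cost of moving the data to it.
function place(plan, rows, startLocation) {
  let location = startLocation;
  let totalCost = 0;
  const placement = [];
  for (const op of plan) {
    let best = null;
    for (const [name, rt] of Object.entries(runtimes)) {
      if (!(op in rt.ops)) continue; // this runtime did not report the operator
      const move = rt.location === location ? 0 : MOVE_COST_PER_ROW * rows;
      const cost = rt.ops[op] * rows + move;
      if (best === null || cost < best.cost) {
        best = { name, cost, move, location: rt.location };
      }
    }
    if (best === null) throw new Error(`no runtime supports ${op}`);
    placement.push({ op, runtime: best.name, moved: best.move > 0 });
    totalCost += best.cost;
    location = best.location;
  }
  return { placement, totalCost };
}

const result = place(['scan', 'filter', 'pagerank'], 100, 'db');
```

In this example the scan and filter stay in the database because moving 100 rows would cost more than any compute saving, and data moves only when `pagerank` forces it, since only one runtime reports that operator.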

Looking forward to the Great Discussions
