Copyright © 2017, Oracle and/or its affiliates. All rights reserved.
Managing the Proliferation Problem
Oracle Labs perspective on language-integrated querying & data analytics
Confidential – Oracle Internal/Restricted/Highly Restricted
Hassan Chafi, Sr. Director, Research and Advanced Development, Oracle Labs
Matthias Brantner, Laurent Daynes, Sungpack Hong
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.
The Tradeoff Triangle Leads to Proliferation of Solutions
[Figure: tradeoff triangle with three vertices: Productive / Ease of Use, General / Expressive Power, Execution Performance]
Proliferation of Solutions Everywhere You Look
• Programming Languages (and associated libraries)
  – Java
  – Python
  – R
  – JavaScript
  – Ruby
• Databases
  – Row and OLTP
  – Columnar and OLAP
  – Domain Specific (Spatial, Multimedia)
  – Scale Out (NoSQL)
• Analytical Runtimes
  – MapReduce, PIG, Flume
  – Delite, Spark, Flink
  – Spark Processing, Storm
  – TensorFlow, Caffe
Zoom in on One Domain: Graph Analysis
• Two major approaches
  – Computational graph analytics
    • Iterate over the graph multiple times and compute its mathematical properties
  – Graph pattern matching
    • Query the graph to find sub-graphs that match a given relationship pattern
• Many workloads require applying both approaches at once
[Figure: example graph with "friend" and "spouse" edges]
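The two approaches can be contrasted on a toy graph. The following is a minimal sketch, not PGX code: the vertex names, the edge list, and the choice of PageRank as the "mathematical property" are all assumptions made for illustration.

```javascript
// Hypothetical toy graph: each edge carries a relationship label.
const edges = [
  { src: "ann", dst: "bob", label: "friend" },
  { src: "bob", dst: "cat", label: "friend" },
  { src: "cat", dst: "ann", label: "friend" },
  { src: "bob", dst: "dan", label: "spouse" },
  { src: "dan", dst: "bob", label: "spouse" },
];
const vertices = [...new Set(edges.flatMap((e) => [e.src, e.dst]))];

// Computational graph analytics: iterate over the whole graph several
// times to compute a numeric property (here, a simple PageRank).
// Assumes every vertex has at least one outgoing edge.
function pageRank(vertices, edges, iterations = 20, damping = 0.85) {
  const outDegree = Object.fromEntries(vertices.map((v) => [v, 0]));
  for (const e of edges) outDegree[e.src] += 1;
  let rank = Object.fromEntries(vertices.map((v) => [v, 1 / vertices.length]));
  for (let i = 0; i < iterations; i++) {
    const next = Object.fromEntries(
      vertices.map((v) => [v, (1 - damping) / vertices.length])
    );
    for (const e of edges) {
      next[e.dst] += (damping * rank[e.src]) / outDegree[e.src];
    }
    rank = next;
  }
  return rank;
}

// Graph pattern matching: find every (a)-[friend]->(b)-[spouse]->(c).
function matchFriendThenSpouse(edges) {
  const matches = [];
  for (const e1 of edges) {
    if (e1.label !== "friend") continue;
    for (const e2 of edges) {
      if (e2.label === "spouse" && e2.src === e1.dst) {
        matches.push({ a: e1.src, b: e1.dst, c: e2.dst });
      }
    }
  }
  return matches;
}

const ranks = pageRank(vertices, edges);
const matches = matchFriendThenSpouse(edges);
```

A combined workload would use both results, for example ordering the matched sub-graphs by the rank of one of their vertices.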
Graph System Landscape
[Figure: graph system landscape]
• Graph Database – focus: data modeling, management, and querying (RDF, Property Graph)
• Graph Framework – focus: graph traversal and computation (RDF, Property Graph)
• Two different data models
  – RDF: John bornIn New York; John hasCar Hyundai; John friendOf Mary; Mary liveIn New York; Mary favor poetry
  – Property Graph: vertex (name: John, hasCar: Hyundai, bornIn: New York); vertex (name: Mary, liveIn: New York, favor: poetry); edge (label: friendOf)
• Execution environments: single machine, cluster environment
• Systems shown: igraph, NetworkX, PGX, PGX (dist)
• New in 12.2c: integration to provide best of both worlds
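The difference between the two data models can be made concrete with the John/Mary example. This is a sketch under a stated assumption: the conversion rule below (an object that also occurs as a subject becomes a vertex, anything else becomes a property value) is not from the slides. It is simply one rule that reproduces the property graph shown in the figure. The function name `toPropertyGraph` is hypothetical.

```javascript
// RDF-style view: every fact is a (subject, predicate, object) triple.
const triples = [
  ["John", "bornIn", "New York"],
  ["John", "hasCar", "Hyundai"],
  ["John", "friendOf", "Mary"],
  ["Mary", "liveIn", "New York"],
  ["Mary", "favor", "poetry"],
];

// Assumed rule (not from the slides): an object that also occurs as a
// subject is treated as a vertex, so the triple becomes a labeled edge;
// any other object becomes a property value on the subject's vertex.
function toPropertyGraph(triples) {
  const subjects = new Set(triples.map(([s]) => s));
  const vertices = new Map();
  for (const s of subjects) vertices.set(s, { name: s });
  const edges = [];
  for (const [s, p, o] of triples) {
    if (subjects.has(o)) {
      edges.push({ src: s, dst: o, label: p });
    } else {
      vertices.get(s)[p] = o;
    }
  }
  return { vertices: [...vertices.values()], edges };
}

const graph = toPropertyGraph(triples);
```

Five triples collapse into two vertices with properties and a single `friendOf` edge, which is why the property graph model tends to suit traversal-heavy workloads while RDF keeps every fact uniformly addressable.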
PGQL: a Property Graph Query Language (GRADES 2016)
• About PGQL
  – No standard query language for Property Graphs exists
  – Cypher from Neo4j (proprietary) has several issues both in syntax and semantics
  – PGQL is our own proposal, implemented in PGX
• Publication Activities
  – Published the specification (pgql-lang.org)
  – We provide a parser for it on GitHub for wider adoption
  – Working with ISO/SQL to extend SQL with Graph
Proliferation of Languages
• Languages have become a fashion for developers
  – Different languages for different tasks
    • Web: JavaScript, Ruby, PHP; Statistical computing: R, Python; Systems computing: C, C++
    • All tiers are becoming polyglot
  – Many large companies allow developers to choose
• Long Tail Problem (TIOBE PL Index from August 2016)
[Figure: bar chart "Programming Language Market Share" (y-axis 0% to 20%) for Java, C, C++, C#, Python, PHP, JavaScript, Visual Basic…, Perl, Assembly…, Delphi/Object…, Ruby, Visual Basic, Swift, Objective-C, Groovy, R, MATLAB, PL/SQL, Go]
GraalVM
[Figure: GraalVM stack. Layers shown: languages (JavaScript, R, Ruby, Python, Go, LLVM (C, Fortran)); Truffle/Graal; HotSpot JVM; Substrate VM]
• Polyglot
• Embeddable
• Performance
GraalVM for Data Processing
UDF and Stored Procedure Language in Database Systems (Oracle Database, MySQL, Oracle NoSQL)
Just-in-time compilation for SQL operators and expressions
JavaScript User-Defined Function
// Use off-the-shelf NPM module
var validator = require('validator');
// Export entry point to the Database (CommonJS)
module.exports.isEmail = function (email) {
  return validator.isEmail(email);
};
// TypeScript declaration
export function isEmail(email: string): boolean;
-- In SQL
SELECT isEmail(email) FROM ADDRESSBOOK;
Expression in WHERE Clause: Impact on Performance
SQL Expression Only
select count(*) from wmbint2 where (n1 * ((((n2 * 2) - n3) + n4) - (n5 + n6))) > 0;
• sql: 30.01 seconds
• wse: 5.63 seconds
SQL Expression + JavaScript Function
select count(*) from wmbint2 where (n1 * ((((n2 * 2) - n3) + n4) - JS_ADD(n5, n6))) > 0;
• sql+js: 31.93 seconds
• wse+js: 5.22 seconds
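To make the second query concrete, the sketch below evaluates the same predicate over an in-memory stand-in for the table. It is illustrative only: `jsAdd` is a hypothetical stand-in for the `JS_ADD` function (whose definition the slides do not show), the row values are made up, and SQL NULL handling is not modeled.

```javascript
// Hypothetical stand-in for the JS_ADD user-defined function.
function jsAdd(a, b) {
  return a + b;
}

// Predicate equivalent to the WHERE clause:
// (n1 * ((((n2 * 2) - n3) + n4) - JS_ADD(n5, n6))) > 0
function wherePredicate(row) {
  return row.n1 * (row.n2 * 2 - row.n3 + row.n4 - jsAdd(row.n5, row.n6)) > 0;
}

// count(*) over the rows that satisfy the predicate.
function countMatching(rows) {
  return rows.filter(wherePredicate).length;
}

// Made-up rows standing in for the wmbint2 table.
const rows = [
  { n1: 1, n2: 5, n3: 2, n4: 1, n5: 3, n6: 4 }, // 1 * (10 - 2 + 1 - 7) = 2
  { n1: 2, n2: 1, n3: 1, n4: 1, n5: 1, n6: 1 }, // 2 * (2 - 1 + 1 - 2) = 0
  { n1: -1, n2: 0, n3: 3, n4: 0, n5: 1, n6: 1 }, // -1 * (0 - 3 + 0 - 2) = 5
];

const matchingCount = countMatching(rows); // 2 rows have a product > 0
```

When the function call and the surrounding arithmetic are compiled together, a just-in-time compiler can inline a function like `jsAdd` into the predicate, which is one plausible reading of why the measured cost of adding the JavaScript call is small.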
TPC-H Q1 & Q6
Average of several runs in the same session
[Figure: stacked bars for Q1 and Q6, Oracle vs. Walnut, broken down into Fetch, LNX, Walnut, Other, and JIT]
[Figure: time (s) per run # (1 to 5), broken down into JIT, Walnut, LNX, and Fetch]
Some Other Key Challenges Resulting from Solution Proliferation
• Data movement is a big issue for customers
  – Savvy customers are aware of this, and it is one of the first questions they ask when being pitched a new computational engine
• Customers want to reduce time to insight from real-time data
  – Dubious of anything that requires hours of pre-processing with no incremental story
• Typical workflows span various data sources that are typically queryable and encompass some type of security model, so copying the data is not always feasible
  – Unstable geopolitical environment adds further relevance to this issue
Mashups of Solutions Made Even Simpler with Notebooks
• Top PGX request was/is visualization and higher-level interfaces (targeting the Excel user)
• Led to adoption of notebook interfaces
  – First introduced by Mathematica 1.0, followed by IPython (Jupyter), RStudio, then Zeppelin
  – Every Big Data vendor is introducing a notebook product
• General requirement for big data era
  – SW engineers require IDEs
  – Data scientists require Data Studios
• We set out to only develop a graph visualization plugin in Jet to be embedded in Zeppelin
  – Hit incompatibilities between Jet and Zeppelin
  – Decided to write a Jet front-end to Zeppelin
• Ambitions grew as we encountered success
Enough Problems, What Are the Opportunities?
• Can we achieve more interoperability between runtimes?
  – Not only in hiding data movement detail
  – But also in breaking up the overall workflow computation into a high-level IR that can allow for optimal placement of computation and reduction of data movement
• How do we get vendors to adopt a proposed solution?
  – Network effect at play here
• Standards are typically the main way to improve the situation
• Web Standards seem to be a good model in terms of balancing common standards and innovations
What Should This Look Like?
• Core IR/Algebra for data analytics
  – Other systems can extend this IR with vendor-specific extensions, but those should always be expressible in terms of the operators that are part of the core standard IR
• Security Model should be baked in, with a cell-level security label model
  – Return Null if the user doesn't have the privilege to access the data
  – Ensure Null semantics allow reasonable processing in the presence of nulls
• Data Provenance might also be important to bake in
  – Government regulation and emerging regulation of cross-border data movement are both a challenge and an opportunity
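The cell-level security idea can be sketched in a few lines. Everything here is hypothetical: the label names, the table layout, and the helper names `readRow` and `sumColumn` are assumptions for illustration, not a proposed API. The aggregate follows the familiar SQL convention that SUM skips nulls and yields null when no non-null value remains.

```javascript
// Hypothetical table: each cell carries a value and a security label.
const table = [
  { name: { value: "John", label: "public" }, salary: { value: 100, label: "hr" } },
  { name: { value: "Mary", label: "public" }, salary: { value: 200, label: "hr" } },
  { name: { value: "Ravi", label: "public" }, salary: { value: 300, label: "exec" } },
];

// Return null for any cell whose label the user is not privileged to read.
function readRow(row, privileges) {
  const out = {};
  for (const [column, cell] of Object.entries(row)) {
    out[column] = privileges.has(cell.label) ? cell.value : null;
  }
  return out;
}

// Null-tolerant aggregate: nulls are skipped, and the sum over no
// visible values is itself null.
function sumColumn(rows, column) {
  const visible = rows.map((r) => r[column]).filter((v) => v !== null);
  return visible.length === 0 ? null : visible.reduce((a, b) => a + b, 0);
}

const hrView = table.map((row) => readRow(row, new Set(["public", "hr"])));
const publicView = table.map((row) => readRow(row, new Set(["public"])));

const hrTotal = sumColumn(hrView, "salary"); // sums only the cells labeled "hr"
const publicTotal = sumColumn(publicView, "salary"); // no visible salary cells
```

The same query runs for both users without failing; what differs is how many cells each one is allowed to see, which is the "reasonable processing in the presence of nulls" requirement.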
Pluggable Runtimes and Optimizers
• Runtimes that are compatible with this IR have to implement at least a subset to get the "certified" seal, with multiple levels of compatibility supported
  – Scan + Filter at a minimum
• Runtimes should report to a global optimizer the set of IR operators they support
• Data movement between runtimes (including data layout transformation) should be explicitly surfaced in the cost model, so that the optimizer can decide when data should not be moved even when significant performance gains are to be had
• Data caching and incremental updates to the data should also be part of the design of the system
  – Data is rarely static
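One way to picture the interaction between runtimes and a global optimizer is the sketch below. It is an assumption-laden illustration, not a design from the slides: the runtime descriptors, the per-operator costs (arbitrary units), and the function names are all hypothetical, and a real cost model would be far richer.

```javascript
// Hypothetical runtime descriptors: each runtime reports the IR operators
// it supports, with an assumed per-operator cost in arbitrary units.
const runtimes = [
  { name: "database", holdsData: true, operators: { scan: 1, filter: 2, aggregate: 10 } },
  { name: "analytics", holdsData: false, operators: { scan: 1, filter: 1, aggregate: 2 } },
];

// Minimum compatibility level from the slide: Scan + Filter.
const REQUIRED = ["scan", "filter"];
function isCertified(runtime) {
  return REQUIRED.every((op) => op in runtime.operators);
}

// Cost of running a plan on a runtime. Data movement is an explicit term:
// it is charged whenever the runtime does not already hold the data.
function planCost(runtime, plan, movementCost) {
  if (!plan.every((op) => op in runtime.operators)) return Infinity;
  const compute = plan.reduce((sum, op) => sum + runtime.operators[op], 0);
  return compute + (runtime.holdsData ? 0 : movementCost);
}

// Pick the cheapest certified runtime that supports the whole plan.
function choosePlacement(runtimes, plan, movementCost) {
  let best = null;
  for (const rt of runtimes.filter(isCertified)) {
    const cost = planCost(rt, plan, movementCost);
    if (cost < Infinity && (best === null || cost < best.cost)) {
      best = { runtime: rt.name, cost };
    }
  }
  return best;
}

const plan = ["scan", "filter", "aggregate"];
const cheapMove = choosePlacement(runtimes, plan, 5); // moving the data pays off
const costlyMove = choosePlacement(runtimes, plan, 20); // computing in place wins
const forbiddenMove = choosePlacement(runtimes, plan, Infinity); // data must stay
```

Passing `Infinity` as the movement cost models the case where data must not be moved at all, for example for security reasons, regardless of how much faster the other runtime would be.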
Looking forward to the Great Discussions
Safe Harbor Statement
The preceding is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.