Machine Learning Diagnostics Using Oracle AHF

  • Copyright © 2017, Oracle and/or its affiliates. All rights reserved.

    Machine Learning Diagnostics Using Oracle Autonomous Health Framework

    Mark V. Scardina, Director of Product Management
    Ankita Khandelwal, Product Manager
    Oracle Autonomous Health Framework
    October 4, 2017

  • Safe Harbor Statement

    The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.

  • Program Agenda

    1. Introducing Applied Machine Learning for Operations
    2. Applied Machine Learning for Real-time Prevention
    3. Applied Machine Learning for Rapid Recovery
    4. ODA Management Appliance Profile
    5. For Further Information / Q&A

  • Why Applied Machine Learning?

    • Brings an application’s perspective versus a platform toolkit viewpoint
    • Brings data science, algorithms, and domain expertise together
    • Packages machine learning into usable, real-world operational algorithms and models that are applied at run time
    • Produces results and recommendations easily understood and trusted by non-data-scientist/analyst end users

  • Applied Machine Learning for Diagnostics

    • Generic ML-extracted data clusters are insufficient for diagnostics
    • Operational data correlation does not determine root cause
    • Trusted root cause determination is critical to swift corrective actions
    • Algorithms selected and models built require domain expertise
    • Models refined via field feedback

    [Diagram labels: Subject Matter Expert, ASH, Scrub Data, ML Knowledge Extraction, Model Generation, Human Supervision, Application Optimized Models, Feedback]

  • Oracle 12c Autonomous Health Framework

    Powered by Applied Machine Learning

    • Cluster Verification Utility
    • ORAchk
    • Cluster Health Monitor
    • Cluster Health Advisor
    • Trace File Analyzer
    • Hang Manager
    • Memory Guard
    • Quality of Service Management

    Managed Centrally by ODA DSC

  • Applied ML in Oracle Autonomous Health Framework

    [Diagram labels: Oracle Support Services; Oracle Enterprise Manager; Cluster Health Advisor, Hang Manager, QoSM Policies; Preventative Actions (Manual, Auto); Trace File Analyzer; GI Mgmt Repository; SRs; Corrective Actions (Manual, Auto); Real-time Prevention; Rapid Recovery]

    [Two input lists appear on the slide: (Bugs/SRs, Best Practices, Metrics, Logs, Diagnostics) and (Prognostics, Alerts, Metrics, Logs, Bugs/SRs)]

  • Applied Machine Learning – Cluster Health Advisor (CHA)

    • Monitors Oracle database* systems and their hosts in real time
    • Detects impending as well as ongoing system faults early
    • Diagnoses and identifies the most likely root causes
    • Provides targeted actions for prevention or escalation of DB/server problems
    • Generates relevant alerts and notifications for rapid response
    • Released in 12.2 and currently under test by major RAC customers for production

    * Currently RAC/R1N databases only

  • Cluster Health Advisor - Scope of Problem Detection

    Best Effort Immediate Guided Diagnosis

    • Over 30 node and database problems have been modeled
    • Over 150 OS and DB metric predictors identified
    • Problem network model created based upon its signature
    • Problem detection in 12.2.0.1 includes
      – Interconnect, Global Cache and Cluster problems
      – Host CPU and Memory, PGA Memory stress
      – IO and Storage Performance issues
      – Reconfiguration and Recovery issues
      – Workload and Session abnormal variations

  • Cluster Health Advisor (CHA) Architecture Overview

    [Diagram labels: OS Data, CHM, DB Data, CHADDriver, OS Model, DB Model, Node Health Prognostics Engine, Database Health Prognostics Engine, GIMR, EMCC Alert]

    • cha – cluster node resource
    • Single Java oracle.cha.server.CHADDriver daemon per node
    • Reads Cluster Health Monitor data directly from memory
    • Reads DB ASH data from SMR w/o DB connection
    • Uses OS and DB models and data to perform prognostics
    • Stores analysis and evidence in the GI Management Repository
    • Sends alerts to EMCC Incident Manager per target

  • Applied Machine Learning – Cluster Health Advisor

    Discovers Potential Cluster & DB Problems

    • Actual internal and external customer data drives model development
    • Applied purpose-built Applied ML for knowledge extraction
    • Expert Dev team scrubs data
    • Generates Bayesian Network-based diagnostic root-cause models
    • Uses BN-based run-time models to perform real-time prognostics

    [Diagram labels: CHA Dev Team, ASH, Scrub Data, ML Knowledge Extraction, BN Models, Expert Supervision, CHA Runtime Model, CHA, Feedback]

  • Data Sources and Data Points

    Cluster Health Advisor

    Example Data Point:

    Signal                        Value
    Time                          15:16:00
    CPU                           0.90
    ASM IOPS                      4100
    Network % util                13%
    Network_Packets Dropped       0
    Log file sync                 2 ms
    Log file parallel write       600 us
    GC CR request                 0
    GC current request            0
    GC current block 2-way        300 us
    GC current block busy         1.5 ms
    Enq: CF - contention          0

    A CHA Data Point contains more than 150 signals (statistics and events) from multiple sources: OS, ASM, Network, and DB (ASH, AWR session, system and PDB statistics).

    Statistics are collected at a 1-second internal sampling rate, synchronized, smoothed, and aggregated to a Data Point every 5 seconds.
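The aggregation step described above can be sketched in a few lines. This is a minimal illustration, not CHA's implementation: it assumes samples arrive as one dictionary per second, uses a plain mean in place of whatever smoothing CHA applies, and stamps each Data Point with the time of its last sample. The function name `aggregate_data_points` and the signal names are hypothetical.

```python
from statistics import mean


def aggregate_data_points(samples, window=5):
    """Average consecutive 1-second samples into one data point per window.

    A plain mean stands in for the smoothing step (an assumption).
    """
    points = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        # Assumption: a data point carries the timestamp of its last sample.
        point = {"time": chunk[-1]["time"]}
        for signal in chunk[0]:
            if signal != "time":
                point[signal] = mean(s[signal] for s in chunk)
        points.append(point)
    return points


# Ten hypothetical 1-second samples with two signals each.
samples = [
    {"time": t, "cpu": 0.90, "asm_iops": 4000 + 50 * (t % 5)}
    for t in range(10)
]
data_points = aggregate_data_points(samples)
```

With ten 1-second samples and a 5-second window, `data_points` holds two entries, each averaging five samples per signal.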

  • Models Capture the Dynamic Behavior of all Normal Operation

    Models Capture all Normal Operating Modes

    [Chart: IOPS, user commits (/sec), log file parallel write (usec), and log file sync (usec) plotted over time (10:00, 2:00, 6:00); y-axis from 0 to 40000]

    A model captures the normal load phases and their statistics over time, and thus the characteristics for all load intensities and profiles. During monitoring, any data point similar to one of the vectors is NORMAL. One could say that the model REMEMBERS the normal operational dynamics over time.

    In-Memory Reference Matrix (Part of “Normality” Model)

    IOPS                       ####    2500    4900     800   ####
    User Commits               ####   10000   21000    4400   ####
    Log File Parallel Write    ####    2350    4100   22050   ####
    Log File Sync              ####    5100    9025    4024   ####
    …                          …       …       …       …      …

  • CHA Model: Find Similarity with Normal Values

    Cluster Health Advisor

    In-Memory Reference Matrix (Part of “Normality” Model)

    IOPS                       ####    2500    4900     800   ####
    User Commits               ####   10000   21000    4400   ####
    Log File Parallel Write    ####    2350    4100   22050   ####
    Log File Sync              ####    5100    9025    4024   ####
    …                          …       …       …       …      …

    Observed - Predicted = Residual (observed and residual values are each part of a Data Point)

    Signal                     Observed   Residual
    IOPS                          10500       5600
    User Commits                  20000      -1000
    Log File Parallel Write        4050        -50
    Log File Sync                 10250        325

    CHA estimator/predictor (ESEE): “Based on my normality model, the value of IOPS should be in the vicinity of ~4900, but it is reported as 10500; this is causing a residual of ~5600 in magnitude.”

    CHA fault detector: “Such a high magnitude of residuals should be tracked carefully! I’ll keep an eye on the incoming sequence of this signal, IOPS, and if it remains deviant I’ll generate a fault on it.”
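The estimator and fault-detector roles on this slide can be illustrated with a small sketch. It is not the ESEE algorithm: it assumes the prediction is simply the reference vector nearest to the observation (unscaled Euclidean distance), and the `threshold` and `min_consecutive` parameters of the detector are hypothetical. With the slide's numbers the nearest vector is (4900, 21000, 4100, 9025), which reproduces the printed residuals for IOPS, user commits, and log file parallel write. For log file sync the arithmetic gives 10250 - 9025 = 1225 rather than the 325 printed on the slide.

```python
import math

SIGNALS = ["IOPS", "User Commits", "Log File Parallel Write", "Log File Sync"]

# The three visible columns of the reference matrix, one vector per column.
REFERENCE_VECTORS = [
    [2500, 10000, 2350, 5100],
    [4900, 21000, 4100, 9025],
    [800, 4400, 22050, 4024],
]


def predict(observed):
    """Return the reference vector closest to the observation (assumption:
    nearest neighbour by Euclidean distance stands in for the estimator)."""
    return min(REFERENCE_VECTORS, key=lambda ref: math.dist(ref, observed))


def residuals(observed):
    """Residual = observed - predicted, per signal."""
    predicted = predict(observed)
    return {name: o - p for name, o, p in zip(SIGNALS, observed, predicted)}


def persistent_fault(residual_series, threshold, min_consecutive):
    """True once |residual| exceeds the threshold for min_consecutive
    data points in a row (hypothetical persistence rule)."""
    run = 0
    for r in residual_series:
        run = run + 1 if abs(r) > threshold else 0
        if run >= min_consecutive:
            return True
    return False


observed = [10500, 20000, 4050, 10250]
res = residuals(observed)

# A residual that stays high raises a fault; a single spike does not.
sustained = persistent_fault([5600, 5400, 5900], threshold=3000, min_consecutive=3)
spike = persistent_fault([5600, 200, 5900], threshold=3000, min_consecutive=3)
```

Requiring several deviant data points in a row mirrors the detector's "keep an eye on the incoming sequence" behaviour and avoids raising a fault on a one-off outlier.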

  • Inline and Immediate Fault Detection and Diagnostic Inference

    Cluster Health Advisor

    Machine Learning, Pattern Recognition, & BN Engines

    Input: Data Point at Time t, with Fault Detection and Classification

    Signal                        Value       Classification
    Time                          15:16:00    15:16:00
    CPU                           0.90        OK
    ASM IOPS                      4100        OK
    Network % util                88%         HIGH 1
    Network_Packets Dropped       105         HIGH 2
    Log file sync                 2 ms        OK
    Log file parallel write       600 us      OK
    GC CR request                 504 ms      HIGH 3
    GC current request            513 ms      HIGH 3
    GC current block 2-way        2 ms        HIGH 4
    GC current block busy         5.9 ms      HIGH 4
    Enq: CF - contention          0           OK

    Diagnostic Inference (Diagnostic Inference Engine), 15:16:00

    Symptoms
    1. Network Bandwidth Utilization
    2. Network Packet Loss
    3. Global Cache Requests Incomplete
    4. Global Cache Message Latency

    Root Cause (Target of Corrective Action)

    Network Bandwidth Utilization
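To make the step from symptoms to root cause concrete, here is a minimal sketch of probabilistic inference over the four symptoms listed above. It uses a naive Bayes model, which is the simplest form of Bayesian network (one root-cause node with independent symptom children), and is not CHA's actual model. The competing cause "Storage Performance" and every probability below are hypothetical values chosen only for illustration.

```python
SYMPTOMS = [
    "Network Bandwidth Utilization",
    "Network Packet Loss",
    "Global Cache Requests Incomplete",
    "Global Cache Message Latency",
]

# Hypothetical numbers for illustration only; not taken from CHA's models.
PRIORS = {
    "Network Bandwidth Utilization": 0.5,
    "Storage Performance": 0.5,
}
# P(symptom present | root cause), in the same order as SYMPTOMS.
LIKELIHOODS = {
    "Network Bandwidth Utilization": [0.9, 0.8, 0.7, 0.7],
    "Storage Performance": [0.1, 0.1, 0.3, 0.2],
}


def posterior(present):
    """Posterior probability of each root cause given the set of
    symptoms that were detected (naive Bayes assumption)."""
    scores = {}
    for cause, prior in PRIORS.items():
        p = prior
        for symptom, likelihood in zip(SYMPTOMS, LIKELIHOODS[cause]):
            p *= likelihood if symptom in present else 1.0 - likelihood
        scores[cause] = p
    total = sum(scores.values())
    return {cause: p / total for cause, p in scores.items()}


# All four symptoms from the slide are present at 15:16:00.
result = posterior(set(SYMPTOMS))
root_cause = max(result, key=result.get)
```

The posterior also provides a confidence figure to report alongside the diagnosis.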

  • Cluster Health Advisor

    [Screenshot: diagnosis shown for database instances proddb_1 and proddb_2]

    Problem: The degradation is caused by a higher than expected utilization of shared storage devices for this database. No evidence of significant increase in I/O demand on the local node.

    Confidence: 95.17%

    Action: Validate whether there is an increase in I/O demand on nodes other than the local node and find I/O-intensive SQL. Add more disks to the disk group or move the database to faster disks.

  • Oracle Cluster Health Advisor (CHA) Standalone Data Exploration Tool

    DEMO

    • Standalone Java GUI client
    • Must be run on a local cluster node
    • Can be run against a live GIMR or an MDB (dump) file:
      chactl export repository -format mdb -start '2017-05-01 00:00:00' -end '2017-05-10 00:00:00'
    • Used internally for development
    • Will be available and maintained on Oracle Technology Network

  • [Demo screenshot: CHA (on SVR01) V0.57.6, Data V0.17, OTN Version. Cluster ‘mycluster’, Sep-16 21:20:25. Hosts: svr01, svr02, svr03. Databases: devdb, prod, testdb, webdb. Cluster mycluster, Hosts: 3, DBs: 4]

  • [Demo screenshot: CHA (on SVR01) V0.57.6, Data V0.17, OTN Version. DB prod in mycluster, Sep-16 22:16:35. Hosts: svr01, svr02, svr03]

  • [Demo screenshot: CHA (on SVR01) V0.57.6, Data V0.17, OTN Version. Cluster mycluster, Hosts: 3, DBs: 4 (devdb, prod, testdb, webdb), with views for Host ‘svr01’, Host ‘svr02’, and Host ‘svr03’]

  • [Demo screenshot: CHA (on SVR01) V0.57.6, Data V0.17, OTN Version. Host svr02 in mycluster]

  • Oracle Cluster Health Advisor Coming Features

    • Cross Cluster Problem Support
      – Inter-Instance Problem Detection
      – Inter-Database Problem Detection
    • Portable HTML Report
      – Consolidated diagnosis output
      – Easy to send and review

    Sample report output:

    2017-02-06 09:40:55.0 Database oltpacdb DB Multi Block Read I/O Performance (oltpacdb_1) [detected]
    Top Instances/PDBs by : IOs per sec
    Database oltpacdb Host slcac455 Instance total 2228.80
    Database oltpacdb Host slcac455 PDB OLTPA 308.40
    Database oltpacdb Host slcac455 PDB OLTPA1 12.80
    Database oltpacdb Host slcac455 PDB OLTPA5 11.60
    Database oltpacdb Host slcac455 PDB OLTPA4 7.60
    Database oltpacdb Host slcac455 PDB OLTPA2 4.00
    Database oltpacdb Host slcac454 Instance total 1136.20
    Database oltpacdb Host slcac454 PDB OLTPA 784.20
    Database oltpacdb Host slcac454 PDB OLTPA4 428.00
    Database oltpacdb Host slcac454 PDB OLTPA2 21.80
    Database oltpbcdb Host slcac455 Instance total 0.20
    Database oltpccdb Host slcac455 Instance total 0.00
    Database oltpbcdb Host slcac454 Instance total 0.00

  • Oracle 12c Database Hang Manager

    Autonomously Preserves Database Availability and Performance

  • Oracle 12c Hang Manager

    Autonomously Preserves Database Availability and Performance

    • Always on: enabled by default
    • Reliably detects database hangs and deadlocks
    • Autonomously resolves them
    • Supports QoS Performance Classes, Ranks and Policies to maintain SLAs
    • Logs all detections and resolutions
    • New SQL interface to configure sensitivity (Normal/High) and trace file sizes

    [Diagram labels: Session, DIA0, EVALUATE, DETECT, ANALYZE, Hung?, VERIFY, Victim, QoS Policy]
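The detect, analyze, and victim-selection stages can be pictured with a toy wait-chain walk. This is an illustration only, not Hang Manager's algorithm: it assumes each waiting session records the single session it is blocked by, treats a cycle as a deadlock, and picks as victim the final blocker that holds up the most sessions. QoS policies and the verification stage are left out. All function names and session IDs are hypothetical.

```python
def find_root_blocker(waits_on, session):
    """Follow the blocker chain from a session.

    Returns (session_id, is_deadlock). A session seen twice on the
    chain means the chain is a cycle, i.e. a deadlock.
    """
    seen = []
    current = session
    while current in waits_on:
        if current in seen:
            return current, True
        seen.append(current)
        current = waits_on[current]
    return current, False


def pick_victim(waits_on):
    """Count how many waiting sessions each final blocker holds up and
    return the blocker with the largest count, plus all counts."""
    counts = {}
    for session in waits_on:
        blocker, _ = find_root_blocker(waits_on, session)
        counts[blocker] = counts.get(blocker, 0) + 1
    return max(counts, key=counts.get), counts


# Hypothetical wait graph: waiting session -> session it is blocked by.
waits_on = {41: 40, 42: 40, 43: 41, 50: 51}
victim, counts = pick_victim(waits_on)

# A two-session cycle is reported as a deadlock.
deadlock = find_root_blocker({1: 2, 2: 1}, 1)
```

Here session 40 blocks three sessions directly or indirectly, so it is the candidate whose termination frees the most waiters.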

  • Oracle Database Hang Manager – Applied Machine Learning

    Discovers and Resolves Runtime Database Hangs

    • Actual internal and external customer data drives model development
    • Purpose-built diagnostic technology used for knowledge extraction
    • Expert Dev team scrubs data
    • Hang Heuristic Engine created and deployed @ Customer
    • HM uses run-time engine to perform real-time DB hang detection and resolution

    [Diagram labels: HM Dev Team, ASH, Scrub Data, Knowledge Extraction, Heuristic Engine, Expert Supervision, HM Runtime Engine, HM, Feedback]

  • Full Resolution Dump Trace File and DB Alert Log Audit Reports

    Oracle 12c Hang Manager

    Dump file …/diag/rdbms/hm6/hm62/incident/incdir_5753/hm62_dia0_12656_i5753.trc
    Oracle Database 12c Enterprise Edition Release 12.2.0.0.0 - 64bit Beta
    With the Partitioning, Real Application Clusters, OLAP, Advanced Analytics
    and Real Application Testing options
    Build label: RDBMS_MAIN_LINUX.X64_151013
    ORACLE_HOME: …/3775268204/oracle
    System name: Linux
    Node name: slc05kyr
    Release: 2.6.39-400.211.1.el6uek.x86_64
    Version: #1 SMP Fri Nov 15 13:39:16 PST 2013
    Machine: x86_64
    VM name: Xen Version: 3.4 (PVM)
    Instance name: hm62
    Redo thread mounted by this instance: 2
    Oracle process number: 19
    Unix process pid: 12656, image: oracle@slc05kyr (DIA0)

    *** 2015-10-13T16:47:59.541509+17:00
    *** SESSION ID:(96.41299) 2015-10-13T16:47:59.541519+17:00
    *** CLIENT ID:() 2015-10-13T16:47:59.541529+17:00
    *** SERVICE NAME:(SYS$BACKGROUND) 2015-10-13T16:47:59.541538+17:00
    *** MODULE NAME:() 2015-10-13T16:47:59.541547+17:00
    *** ACTION NAME:() 2015-10-13T16:47:59.541556+17:00
    *** CLIENT DRIVER:() 2015-10-13T16:47:59.541565+17:00

    2015-10-13T16:47:59.435039+17:00
    Errors in file /oracle/log/diag/rdbms/hm6/hm6/trace/hm6_dia0_12433.trc (incident=7353):
    ORA-32701: Possible hangs up to hang ID=1 detected
    Incident details in: …/diag/rdbms/hm6/hm6/incident/incdir_7353/hm6_dia0_12433_i7353.trc
    2015-10-13T16:47:59.506775+17:00
    DIA0 requesting termination of session sid:40 with serial # 43179 (ospid:13031) on instance 2
        due to a GLOBAL, HIGH confidence hang with ID=1.
        Hang Resolution Reason: Automatic hang resolution was performed to free a
        significant number of affected sessions.
    DIA0: Examine the alert log on instance 2 for session termination status of hang with ID=1.

    In the alert log on the instance local to the session (instance 2 in this case), we see the following:

    2015-10-13T16:47:59.538673+17:00
    Errors in file …/diag/rdbms/hm6/hm62/trace/hm62_dia0_12656.trc (incident=5753):
    ORA-32701: Possible hangs up to hang ID=1 detected
    Incident details in: …/diag/rdbms/hm6/hm62/incident/incdir_5753/hm62_dia0_12656_i5753.trc
    2015-10-13T16:48:04.222661+17:00
    DIA0 terminating blocker (ospid: 13031 sid: 40 ser#: 43179) of hang with ID = 1
        requested by master DIA0 process on instance 1
        Hang Resolution Reason: Automatic hang resolution was performed to free a
        significant number of affected sessions.
        by terminating session sid:40 with serial # 43179 (ospid:13031)

    Slide callouts:
    • Hang detected by hang manager
    • Session victim identified & terminated
    • Identified blocker session
    • Blocker session terminated
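Because every resolution is logged, the alert log can be audited with a small script. The sketch below extracts blocker terminations with a regular expression written against the one message shown in the excerpt above. The exact wording and spacing of the message may differ between releases, so the pattern is an assumption. The function name `find_terminations` is hypothetical.

```python
import re

# Pattern derived from the single sample line in the excerpt (assumption:
# other releases may format this message differently).
TERMINATION = re.compile(
    r"DIA0 terminating blocker \(ospid: (?P<ospid>\d+) sid: (?P<sid>\d+) "
    r"ser#: (?P<serial>\d+)\) of hang with ID = (?P<hang_id>\d+)"
)


def find_terminations(alert_log_text):
    """Return one dict per blocker termination found in the log text."""
    return [
        {key: int(value) for key, value in match.groupdict().items()}
        for match in TERMINATION.finditer(alert_log_text)
    ]


sample = """2015-10-13T16:48:04.222661+17:00
DIA0 terminating blocker (ospid: 13031 sid: 40 ser#: 43179) of hang with ID = 1
"""
terminations = find_terminations(sample)
```

For the sample text, `terminations` contains a single record with the OS process ID, session ID, serial number, and hang ID from the log line.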

  • Oracle 12c Trace File Analyzer

    Speeds Issue Diagnosis, Triage and Resolution

  • Challenges in Failure Recovery

    • GBs of logs generated every day
    • Distributed across cluster nodes
    • Diagnosing an issue can be a “needle in the haystack” problem
    • Manual issue diagnosis can be tedious and time-consuming
    • Any delay in issue diagnosis can adversely impact the business

  • Rapid Recovery with Trace File Analyzer (TFA)

    • Autonomously collects data intelligently (Smart Collection)
      – Autonomously and intelligently collects only relevant logs
      – Reduces log files to a small set of potential candidates
    • Autonomously finds relevant information for the issue at hand
      – Anomaly Timeline Generation
      – Identifies errors associated with the issue
      – Generates a list of potential problems across the system ordered by time
    • Speeds issue diagnosis with Oracle Support Services (OSS) for unknown issues
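The idea of a problem list "ordered by time" across the system can be shown with a short merge of per-node event lists. This is a generic sketch, not TFA's implementation: it assumes each node's events are already sorted by timestamp, and the node names, timestamps, and messages are hypothetical.

```python
import heapq
from datetime import datetime


def build_timeline(per_node_events):
    """Merge per-node event lists (each already sorted by time) into a
    single cluster-wide timeline of (timestamp, node, message) tuples."""
    streams = [
        [(timestamp, node, message) for timestamp, message in events]
        for node, events in per_node_events.items()
    ]
    return list(heapq.merge(*streams))


# Hypothetical events from two nodes.
events = {
    "node1": [
        (datetime(2017, 5, 1, 10, 0, 5), "error A"),
        (datetime(2017, 5, 1, 10, 0, 9), "error C"),
    ],
    "node2": [
        (datetime(2017, 5, 1, 10, 0, 7), "error B"),
    ],
}
timeline = build_timeline(events)
```

`heapq.merge` keeps the merge linear in the number of events, which matters when the inputs are gigabytes of logs spread over many nodes.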

  • Rapid Recovery with TFA

    Smart Collection with TFA Collector

    • Always on
    • Collects comprehensive first failure diagnostics on each node
    • Filters and packages relevant diagnostic data using Applied ML model
    • Automatically notifies DBAs and SysAdmins of errors
    • Optionally allows quick issue resolution with Oracle Support
    • Transfers data to centralized storage for detailed analysis with TFA Receiver

    [Slide callout: COMING]

  • Trace File Analyzer – Applied Machine Learning

    Speeds Issue Diagnosis, Triage and Resolution

    • ML-based Knowledge Extraction of Logs, SRs and Bugs
    • Expert training refines the data training set
    • Knowledge is embedded into the run-time model
    • Model is shipped in TFA Collector to work with the live logs on the cluster
    • Log anomaly detection is performed with TFA Receiver
    • No model training required by user
    • Model is updated regularly

    [Diagram labels: TFA Dev Team, Bugs, SR, Scrub Data, ML Knowledge Extraction, Model Generation, Expert Supervision, TFA Runtime Model, TFA Web, TFA Receiver, TFA Collector]

  • Oracle TFA in Cluster Domain: Design Overview

    [Diagram labels: Node 1 through Node N, each with TFA, File Metadata, and Collection Repository; TFA metadata; Domain Services Cluster; Oracle Cluster Domain. Legend: User/Daemon initiated, TFA initiated]

    1. Daemon initiates diagnostic collection
    2. TFA signals collections on other nodes
    3. Collections written to local TFA repositories
    4. Collections consolidated on a single node
    ---------- Coming ----------
    5. Collections copied to TFA service in DSC

  • Rapid Recovery with TFA: Detailed Issue Analysis using TFA Receiver

    DEMO

    • Centralized aggregator in the Cluster Domain
    • Mines logs and errors from all nodes registered with it
    • Browser-based UI
      – Supports browsing errors
      – Viewing associated logs
      – Easily construct timelines

    [Slide callout: COMING]

  • Oracle 12c Domain Services Cluster

    Deploy with Minimum Footprint and Maximum Manageability

    • Hosts Framework as Services
    • Reduces local resource footprint
    • Centralizes management
    • Speeds deployment and patching
    • Optional Shared Storage
    • Supports multiple versions and platforms going forward

    [Diagram: ORACLE CLUSTER DOMAIN containing an Oracle Domain Services Cluster, Database Member Clusters, and Application Member Clusters]

    Services shown for the Domain Services Cluster:
    • Management Repository Service
    • Trace File Analyzer Service
    • Grid Names Service
    • Storage Services
    • QoS Management Service
    • Rapid Home Provisioning Service

  • Oracle Database Appliance – Management Appliance Profile

    Domain Services Management Engineered Solution

    • Ideal Management Solution for Oracle Engineered Systems
    • Reduces diagnostic footprint
    • Centralizes management function
    • Does not interfere with provisioning and patching
    • Pay only for ODA hardware (S/M/L)
    • No additional software license fees

    [Diagram: ORACLE CLUSTER DOMAIN containing an Oracle Domain Services Cluster and Database Member Clusters]

    Services shown for the Domain Services Cluster:
    • Management Repository Service
    • Trace File Analyzer Service
    • Grid Names Service
    • QoS Management Service
    • Enterprise Manager Cloud Control

    [Slide callout: COMING]

  • For Further Information

    • Oracle 12c Autonomous Health Framework User’s Guide
    • Oracle 12c Clusterware Administration and Deployment Guide
    • Oracle Autonomous Health Framework on OTN
    • Oracle QoS Management 12c User’s Guide
    • Oracle QoS Management on OTN
    • Oracle 12c ORAchk
    • Oracle 12c Trace File Analyzer
