Correlated Monitoring of an Enterprise ALM Environment at Bosch


Copyright © 2016 Splunk Inc.

Raffael Eiler, Senior Engineer, BOSCH


Juergen Magiera, ITSI Lead Architect EMEA, Splunk

Disclaimer


During the course of this presentation, we may make forward-looking statements regarding future events or the expected performance of the company. We caution you that such statements reflect our current expectations and estimates based on factors currently known to us and that actual events or results could differ materially. For important factors that may cause actual results to differ from those contained in our forward-looking statements, please review our filings with the SEC. The forward-looking statements made in this presentation are being made as of the time and date of its live presentation. If reviewed after its live presentation, this presentation may not contain current or accurate information. We do not assume any obligation to update any forward-looking statements we may make. In addition, any information about our roadmap outlines our general product direction and is subject to change at any time without notice. It is for informational purposes only and shall not be incorporated into any contract or other commitment. Splunk undertakes no obligation either to develop the features or functionality described or to include any such feature or functionality in a future release.

Who We Are, What We Do

Aboutus

Raffael Eiler (raffael.eiler@de.bosch.com)
• Robert Bosch GmbH, Stuttgart, Germany
• ClearCase and Rational Team Concert Deployment Expert

Juergen Magiera (jmagiera@splunk.com)
• Splunk, Munich, Germany
• ITSI Architect and Lead EMEA


Overview of the Bosch Group


Bosch: technology to enhance quality of life


• Some 56,000 researchers and developers work at Bosch, at 118 locations worldwide, in a single network.

• Bosch is one of the world’s leading international providers of technology and services.

• Over the past five years, Bosch has invested more than 24 billion euros in research and development.

• Our objective: to develop innovative, useful, and exciting products and solutions to enhance quality of life, technology that is “Invented for life.”

Electronics & Software Development Platforms: Products & Services

• Software development platforms: ClearCase, Doors, ClearQuest
• Electronics development platforms
• WTS/VDI


Bosch CLM infrastructure


Facts and figures

• IBM CLM is the preferred tool for ALM*1 within the Bosch Group.
• IBM CLM is a set of web apps hosted in WebSphere 8.5, running on virtual Windows servers. The database is Oracle 11 (RAC).
• CLM is in a ramp-up phase in most product lines.
• Many software developers use the system (about 3000 concurrent sessions).
• The CLM system is essential for steps in the software development process.
• Unplanned system outages have to be minimized.

*1: ALM = Application Lifecycle Management

CLM System Types at Bosch


Type                 DB        Count  Availability  Version          Purpose
P-System             Ora       18     Highest       5.0.2            Productive servers.
Q-Systems            Ora       9      High          5.0.2            Staging servers, for acceptance testing.
Test-Servers         DB2       ~10    Low           5.0.2            Used by PL tools teams for process developments, playgrounds.
Development-Servers  DB2, Ora  6      Lowest        5.0.2 and 6.0.1  For PL plugin development. Customers have JazzAdmin-Role.
Beta-Servers         DB2       2      Low           6.0.1Mx          To host and show the upcoming pre-release versions (Mx/RCx).
Demo-Servers         DB2       2      Low           5.0.2 and 6.0.1  General playground and product showcase for anyone interested. Stable version.
Training-Server      DB2       1      Low           5.0.2            For user training.
Proxy-Server         Squid     13     Highest       3.1.10           For remote access at each location, based on customer request.

Splunk Topology in 2015


Diagram components: forwarders installed on many servers, End-2-End performance clients, indexer, search head

• We started with one indexer and one search head running on Windows.

• All Splunk servers were managed by the team itself.

• Over time, Splunk usage rose very fast.

• Performance issues (slow response, concurrent searches, …) came up during daily usage.

Splunk Topology today


Diagram components: universal forwarder, heavy forwarder, End-2-End performance clients, all-in-one Splunk server (indexer, search head, deployment server)

• Every product (e.g. Subversion or ALM) has its own Splunk server.

• All Splunk servers (basic operation) are managed by a Bosch-internal service provider.

• Splunk configuration (inputs.conf, scripts, alerts, dashboards, …) is under our control.

• Splunk servers are running on Linux.

• If we recognize performance issues, we will split indexer and search head.

Various other inputs


Our use cases

• Windows: Registry, Event logs, File system, WMI, PerfMon, Logfiles, …
• Linux/Unix: Configurations, Syslogs, File system, ps, iostat, netstat, top, …, Logfiles, …
• Virtualization & Cloud: Hypervisor, Guest OS, Apps, Cloud, …
• Applications: Web logs, Log4J, JMX, Scripts, …
• Databases: Configurations, Audit/query logs, Tables, Size, …
• Networking: Network Interface, Configurations, SNMP, …

implemented


Example: Heap Increase

• Before heap increase
  • Heap usage constantly above 80%
  • Very frequent garbage collection cycles
  • Lots of hung threads
• After heap increase
  • GC less stressed
  • Fewer hung threads
• -> Less impact for the user
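
The before/after comparison above can be expressed as a simple number: the share of heap samples above the threshold. The following is a minimal Python sketch, assuming heap usage has already been exported as percentage samples. Only the 80% threshold comes from the slide; the helper name `share_above` and the sample values are hypothetical and serve as illustration only.

```python
from typing import Sequence

HEAP_THRESHOLD_PCT = 80.0  # threshold mentioned on the slide


def share_above(samples: Sequence[float], threshold: float = HEAP_THRESHOLD_PCT) -> float:
    """Return the fraction of samples strictly above the threshold (0.0 for no samples)."""
    if not samples:
        return 0.0
    return sum(1 for s in samples if s > threshold) / len(samples)


# Hypothetical heap-usage samples in percent, for illustration only.
before = [85.0, 91.0, 88.0, 79.0]
after = [55.0, 62.0, 81.0, 48.0]

print(f"before: {share_above(before):.0%} of samples above threshold")
print(f"after:  {share_above(after):.0%} of samples above threshold")
```

A ratio like this is easier to trend over weeks than a raw heap chart, which may help when judging whether a heap increase actually relieved the garbage collector.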


Example: System Details - RAM

• Optimise usage of RM index in RAM
• Calculate real “RAM left”

Highlights From End-users' Perspective


• “All-in-one” solution

• Scales well

• No need to consult monitoring solutions from other teams

• All necessary log files accessible from one place

• Find root causes “on-the-fly”

• “Management-friendly” reporting/dashboards

Feedback From My Colleagues


Splunk provides early warnings if certain parameters of the system start to leave the safe boundaries, e.g. free disk space, heap usage, CPU usage.

Stefan O.

Splunk is my first stop in case of problems. I can quickly check what errors have been logged, and where. It is also really useful to track the system load and resource consumption. We have graphs with matching timelines that allow you to easily detect patterns across different data sources, or even different servers.

Volker G.

Splunk informs me when heap usage is high so I can consider increasing the heap long before users complain about performance issues.

Danny M.

Current Splunk activities (NetIQ phase-out)


In the past we used NetIQ as a monitoring system, provided by another department within Bosch.

With Splunk we now have a system that is/has …
• … the possibility of implementing any kind of change very quickly
• … a stable and well-performing solution
• … easy to learn and very useful in daily work as a sysadmin

Current Splunk activities (ITSI)


ClearQuest Glass Table Details

ITSI glass table for the Rational ClearQuest (RQ1) service:

KPIs:
• Selenium end-to-end transaction time
• Overall health of the IHS application
• Detailed performance metrics on WAS
• Response time for the DB

Value:
• Get notified early about poor response times for customers.
• At-a-glance view of current and historic performance metrics along the whole service chain.

Rational ClearQuest (RQ1)
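
The first KPI above, an end-to-end transaction time, amounts to timing a scripted user transaction and writing the result in a form that can be indexed. The following is a minimal Python sketch under stated assumptions: the slides do not show the Selenium steps, so `run_transaction` is a hypothetical stand-in for them, and the field names `transaction`, `status`, and `duration_ms` are likewise assumptions rather than the names used at Bosch.

```python
import time
from typing import Callable


def timed_event(name: str, run_transaction: Callable[[], None]) -> str:
    """Run the transaction once and return a key=value log line with its duration."""
    start = time.perf_counter()
    try:
        run_transaction()
        status = "ok"
    except Exception:
        # A failed transaction is still reported so that outages show up in the KPI.
        status = "failed"
    duration_ms = (time.perf_counter() - start) * 1000.0
    return f'transaction="{name}" status={status} duration_ms={duration_ms:.1f}'


def fake_login() -> None:
    """Hypothetical stand-in for a scripted browser transaction."""
    time.sleep(0.05)


print(timed_event("rq1_login", fake_login))
```

The key=value layout is chosen because Splunk can typically extract such pairs as fields at search time without extra configuration, which keeps the client script independent of the dashboard definition.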

Current Splunk Activities (ITSI)


Planned Implementation (additional info)


Planned:
• Reporting
• Management view (e.g. dashboard with traffic lights)
• Long-term monitoring (trend analysis)
• Historical, cumulated data
• Different dashboards for different interests (managers, technical staff, problem analysis, quick overview, …)
• E2E test results (Selenium)
• Number of HTTP requests

Implemented:
• License statistics
• Logfiles
• Monitoring WAS with JMX
• Most system resources (PerfMon)

Conclusion – next steps

To be evaluated:
• SSL (certificate expiration)
• Monitoring caching proxies, for example:
  – How much data is provided through the cache?
• CSM (CLM Server Monitoring) integration
  – Get application data to correlate with system resources, e.g. heap size:
    • How many users are working?
    • How many work items were created today?
• ESX monitoring
• Network monitoring (whole route, not just the network interface)
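
The proxy question above ("How much data is provided through the cache?") can be approximated as a byte hit ratio computed from proxy access logs. The following is a minimal Python sketch, assuming the Squid proxies write the native access.log format (result code in the fourth field, byte count in the fifth). Whether the Bosch proxies use that format is not stated in the slides, and the sample lines and host names are hypothetical.

```python
from typing import Iterable


def byte_hit_ratio(lines: Iterable[str]) -> float:
    """Return bytes served from cache divided by all bytes delivered (0.0 if none)."""
    hit_bytes = 0
    total_bytes = 0
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        result_code = fields[3].split("/")[0]  # e.g. "TCP_HIT" from "TCP_HIT/200"
        try:
            size = int(fields[4])
        except ValueError:
            continue  # byte count is not numeric, ignore the line
        total_bytes += size
        # Simplification: treat every result code containing "HIT" as a cache hit.
        if "HIT" in result_code:
            hit_bytes += size
    return hit_bytes / total_bytes if total_bytes else 0.0


# Hypothetical log lines for illustration only.
sample = [
    "1462800000.123 12 10.0.0.1 TCP_HIT/200 3000 GET http://clm.example/a.js - NONE/- application/javascript",
    "1462800001.456 250 10.0.0.2 TCP_MISS/200 1000 GET http://clm.example/b - DIRECT/192.0.2.10 text/html",
]
print(f"byte hit ratio: {byte_hit_ratio(sample):.0%}")
```

Counting bytes rather than requests matches the wording of the question, since a few large cached artifacts can matter more for remote locations than many small misses.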


THANK YOU
