Upload
others
View
3
Download
0
Embed Size (px)
Citation preview
Copyright©2016Splunk Inc.
JacobLeverichSoftwareEngineer,Splunk
ExtendingSPLwithCustomSearchCommands
Disclaimer
2
Duringthecourseofthispresentation,wemaymakeforwardlookingstatementsregardingfutureeventsortheexpectedperformanceofthecompany.Wecautionyouthatsuchstatementsreflectourcurrentexpectationsandestimatesbasedonfactorscurrentlyknowntousandthatactualeventsorresultscoulddiffermaterially.Forimportantfactorsthatmaycauseactualresultstodifferfromthose
containedinourforward-lookingstatements,pleasereviewourfilingswiththeSEC.Theforward-lookingstatementsmadeinthethispresentationarebeingmadeasofthetimeanddateofitslivepresentation.Ifreviewedafteritslivepresentation,thispresentationmaynotcontaincurrentoraccurateinformation.Wedonotassumeanyobligationtoupdateanyforwardlookingstatementswemaymake.Inaddition,anyinformationaboutourroadmapoutlinesourgeneralproductdirectionandissubjecttochangeatanytimewithoutnotice.Itisforinformationalpurposesonlyandshallnot,beincorporatedintoanycontractorothercommitment.Splunkundertakesnoobligationeithertodevelopthefeaturesor
functionalitydescribedortoincludeanysuchfeatureorfunctionalityinafuturerelease.
WhoamI?Splunker for2years,basedinSanFrancisco
Engineeringleadfor…– MachineLearningToolkit– ITSIAnomalyDetectionandAdaptiveThresholdingfeatures– Splunk customsearchcommandinterface
ImplementedSearchCommandProtocolVersion2
Die-hardLonghornsfan
3
Agenda
4
IntroductiontoCustomSearchCommandsHowdoCustomSearchCommandswork?– High-levelconcepts– Low-leveldetails
TypesofSearchCommandsHowtocreatenewCustomSearchCommandsWrap-up
IntroductiontoCustomSearchCommands
WhatisaCustomSearchCommand?Auser-definedSPLcommand.
7
|search
8
WhatisaCustomSearchCommand?Auser-definedSPLcommand.
CanbeusedtoextendtheSPLlanguage!
WhousesCustomSearchCommands?Partners– Concanon,etc.
Customers– Use-casespecificanalytics
Splunk!– predict command– DBConnect– MachineLearningToolkit
AnyonewhowantstoextendtheSplunk platform– Integrationwith3rd partyservices– Implementationofcustomlogic
HowdoCustomSearchCommandswork?
HowdoCustomSearchCommandswork?
12
1. WhenparsingSPL,splunkd interrogateseachcommand.“AreyouaCustomSearchCommand?”
2. Ifso,spawnexternalprocessandallowittoparsearguments.
3. Duringsearch,pipesearchresultsthroughexternalprocess.
Parsing#1:Splitsearchintocommands
13
| inputlookup geo_attr_us_states.csv | GOCRAZY | head 5
inputlookup geo_attr_us_states.csv
GOCRAZY head 5
Parsing#2:Lookforcustomsearchcommands
14
| inputlookup geo_attr_us_states.csv | GOCRAZY | head 5
inputlookup geo_attr_us_states.csv
GOCRAZY head 5
[gocrazy]…
commands.conf
Parsing#3:Spawnexternalprocess
15
| inputlookup geo_attr_us_states.csv | GOCRAZY | head 5
inputlookup geo_attr_us_states.csv
GOCRAZY head 5
$SPLUNK_HOME/bin/pythongocrazy.py
Parsing#4:Letexternalprocessparsearguments
16
| inputlookup geo_attr_us_states.csv | GOCRAZY | head 5
inputlookup geo_attr_us_states.csv
GOCRAZY head 5
$SPLUNK_HOME/bin/pythongocrazy.py
Search:Piperesultsthrough externalprocess
17
| inputlookup geo_attr_us_states.csv | GOCRAZY | head 5
inputlookup geo_attr_us_states.csv
head 5GOCRAZY
$SPLUNK_HOME/bin/pythongocrazy.py
Recap:high-levelconcepts
18
EnableyoutoregisternewSPLcommands,extendthelanguage.
Allowyoutointerceptandmodifysearchresultsduringasearch.– CSVin➞ CSVout
Implementedasaexternalprocess(i.e.aprogramyouwrite).– TypicallywritteninPython.
CustomCommands:low-leveldetails
19
Howresultsareexchangedbetweensplunkd andexternalprocess“Types”ofsearchcommands
splunkd⬌ customcommand
20
Therearetwo“protocols”forcustomcommands:– Version1,legacyprotocolusedbyIntersplunk.py (availablesinceSplunk 3.0)– Version2,newprotocolusedbyPythonSDK(availablesince6.3)– Inbothprotocols,allcommunicationoverstdin/stdout
Version2protocol– Spawnsexternalprocessonce,streamsresultsthroughchunkbychunk– Simplecommands.conf configuration
ê “chunked=true”– Supportforplatform-specificprograms
Version1protocol– Spawnsexternalprocessforeachchunkofsearchresults(!)– “Transforming”commandslimitedto50,000events
SearchCommandprotocolcomparison
21
Protocol APIs Performance Scalability Simpleconfiguration
Platform-specificprograms
Programminglanguages
Version1(legacy)
Intersplunk.py,Python SDK ✘ ✘ ✘ ✘
Python
Version2 PythonSDK✔ ✔ ✔ ✔
Python,Javascript,bash,Shell,arbitrarybinaries
SearchCommandProtocolVersion2
22
• Transaction-oriented• splunkd sendsacommand,externalprocessrespondswithreply
• Simplebi-directionaltransportprotocol:• ASCIItransportheader• JSONmetadatapayload• CSVsearchresultspayload
• Everysearchstartswitha“getinfo”command(capabilityexchange)• Subsequently,issues“execute”commandswithsearchresults
Transport“chunk”
chunked 1.0, 22, 54{“action”: “execute”}_raw,a,b,chello,0,1,2everyone,3,4,5howareyou,6,7,8
TransportheaderMetadata(JSON)
Datapayload(CSV)
Metadatalength Datalength
Example:GOCRAZY
24
| inputlookup geo_attr_us_states.csv | head 5 | GOCRAZY
chunked 1.0,22,106{“action”: “execute”}state_code,state_fips,state_nameAL,01,AlabamaAK,02,AlaskaAZ,04,ArizonaAR,05,ArkansasCA,06,California
$SPLUNK_HOME/bin/pythongocrazy.py
chunked 1.0,18,106{“finished”: true}dste_aecot,pste_asfit,mste_aenatLA,10,aaalbmAKA,20,laaskAZA,40,iaorznARA,50,AkaasnsrAC,60,iCifolarna
ProtocolVersion2:Transactiontimeline
25
time
splunkd externalprocess
✘…
“Whatkindofcommandareyou?”
“Hey!I’mastreamingcommand!”
“getinfo”command
26
Metadatainthegetinfo commandsentbysplunkd:– Commandarguments– FullSPLquerystring– Executioncontext(app,user)– Searchsid– splunkd URIandauth token(formakingRESTrequests)
Metadatainthecustomcommand’sreply:– Typeofsearchcommand(streaming/stateful/reporting/etc.)– Whichfieldssplunkd shouldextract(requiredfields)– Whetherornotitgeneratesresults(e.g.mustbefirstsearchcommand)
Sample“getinfo”metadata{
"action": "getinfo","streaming_command_will_restart": false,"searchinfo": {
"earliest_time": "0","raw_args": [
"LinearRegression", "petal_length", "from", "petal_width”],"session_key": "...","maxresultrows": 50000,"args": [
"LinearRegression", "petal_length", "from", "petal_width”],"dispatch_dir": "/Users/jleverich/builds/conf_mlapp_demo/var/run/splunk/dispatch/1475007525.265","command": "fit","latest_time": "0","sid": "1475007525.265","splunk_version": "6.5.0","username": "admin","search": "%7C%20inputlookup%20iris.csv%20%7C%20fit%20LinearRegression%20petal_length%20from%20petal_width","splunkd_uri": "https://127.0.0.1:8090","owner": "admin","app": "Splunk_ML_Toolkit”
},"preview": false
}
“execute”command
28
Metadatainexecutecommandsentbysplunkd– Whetherornotprecedingcommandsare“finished”
Metadatainthecustomcommand’sreply:– Whetherornotthiscommandis“finished”
splunkd andsearchcommandsnegotiatecompletionofsearch– Bothmustindicate“finished”=True
TypesofSearchCommands
TypesofSearchCommands“Streaming”commands
“Stateful Streaming”commands
“Transforming”commands– “Events”commands– “Reporting”commands
“Streaming”commandsProcesssearchresultsone-by-one– Can’tmaintainglobalstate– Mustnotre-ordersearchresults
EligibletorunatIndexers– CanruninparallelonIndexers
Examples:– eval– where– rex
“Streaming”commandexample
32
... | eval foo=“bar” | ...
field_A field_B field_C foo
the jumps dog bar
quick over oops bar
brown the too bar
fox lazy many bar
field_A field_B field_C foo
the jumps dog bar
field_A field_B field_C foo
quick over oops bar
field_A field_B field_C foo
brown the too bar
field_A field_B field_C foo
fox lazy many bar
Remoteresults
Finalsearchresults
Indexers
Searchhead
“Stateful Streaming”commandsProcesssearchresultsone-by-one– Canmaintainglobalstate– Mustnotre-ordersearchresults
OnlyrunatSearchHead
Examples:– accum– streamstats– dedup
“Stateful Streaming”commandexample
34
... | accum foo | ...
field_A field_B field_C foo
the jumps dog 1
quick over oops 2
brown the too 3
fox lazy many 4
field_A field_B field_C foo
the jumps dog 1
quick over oops 1
brown the too 1
fox lazy many 1
“Events”commandsProcesssearchresultsasawhole– Mayre-ordersearchresults– Typicallymaintainallfieldsineachevent,especially:
ê _raw,_time,index,sourcetype,source,host
OnlyrunatSearchHeadMayrunseveraltimesfor“preview”
Examples:– sort– eventstats
“Events”commandexample
36
... | sort field_A | ...
field_A field_B field_C foo
brown the too 3
fox lazy many 4
quick over oops 2
the jumps dog 1
field_A field_B field_C foo
the jumps dog 1
quick over oops 2
brown the too 3
fox lazy many 4
“Reporting”commandsProcesssearchresultsasawhole– Typicallytransformtheresults(e.g.aggregate,project,summarize,etc.)
OnlyrunatSearchHeadMayrunseveraltimesfor“preview”Resultsshowupinthe“Statistics”tab
Examples:– stats– timechart– transpose
“Reporting”commandexample
38
... | stats count | ...
count
4
field_A field_B field_C foo
the jumps dog 1
quick over oops 2
brown the too 3
fox lazy many 4
Bewareoflargeresultsets!
39
“Events”and“Reporting”commandsprocessresultsasawhole.– Maycontain1,000,000sofsearchresults!– WriteStreamingorStateful commandsinsteadwhenpossible.
Build-incapacitylimits,orspillresultstodiskwhennecessary.
Streaming“pre-op”
40
Commandsmayspecifya“pre-op”toprependinSPL
Communicatedtosplunkd ingetinfo metadata(streaming_preop)Usefultoparallelizecomputation,reducevolumeofdatatransferMustbe“Streaming”(i.e.,mayrunatIndexers)
... | stats count | ... ... | prestats count | stats count | ...
ImplementingCustomSearchCommandswiththeSplunk SDKforPython
41
Basicstepstocreateasearchcommand1. Createan“App”2. DeploythePythonSDKforSplunk inthebin directory3. WriteascriptforyourCustomSearchCommand4. Registeryourcommandincommands.conf5. RestartSplunk Enterprise6. (optional) Exportthecommandtootherapps
Createan“App”
43
DeploythePythonSDKinthebin directory
44
cd $SPLUNK_HOME/etc/apps/MyNewApp/bin
pip install -t . splunk-sdk
WriteascriptforyourCustomSearchCommand
45
import sysfrom splunklib.searchcommands import dispatch, StreamingCommand, Configuration
@Configuration()class FoobarCommand(StreamingCommand):
def stream(self, records):for record in records:
record['foo'] = 'bar'yield record
if __name__ == "__main__":dispatch(FoobarCommand, sys.argv, sys.stdin, sys.stdout, __name__)
$SPLUNK_HOME/etc/apps/MyNewApp/bin/foobar.py
Registeryourcommandincommands.conf
46
[foobar]chunked=true# filename=foobar.py ## <--- optional
$SPLUNK_HOME/etc/apps/MyNewApp/default/commands.conf
RestartSplunk Enterprise
47
$SPLUNK_HOME/bin/splunk restart
Exporttootherapps(optional)
48
Exporttootherapps(optional)
49
Exporttootherapps(optional)
50
ExampleStreamingCommand
51
import sysfrom splunklib.searchcommands import dispatch, StreamingCommand, Configuration
@Configuration()class ExStreamCommand(StreamingCommand):
def stream(self, records):for record in records:
record['foo'] = 'bar'yield record
if __name__ == "__main__":dispatch(ExStreamCommand, sys.argv, sys.stdin, sys.stdout, __name__)
$SPLUNK_HOME/etc/apps/MyNewApp/bin/exstream.py
ExampleStateful StreamingCommand
52
import sysfrom splunklib.searchcommands import dispatch, StreamingCommand, Configuration
@Configuration(local=True)class ExStatefulCommand(StreamingCommand):
def stream(self, records):for record in records:
record['foo'] = 'bar'yield record
if __name__ == "__main__":dispatch(ExStatefulCommand, sys.argv, sys.stdin, sys.stdout, __name__)
$SPLUNK_HOME/etc/apps/MyNewApp/bin/exstateful.py
ExampleEventsCommand
53
import sysfrom splunklib.searchcommands import dispatch, EventingCommand, Configuration
@Configuration()class ExEventsCommand(EventingCommand):
def transform(self, records):l = list(records)l.sort(key=lambda r: r['_raw'])return l
if __name__ == "__main__":dispatch(ExEventsCommand, sys.argv, sys.stdin, sys.stdout, __name__)
$SPLUNK_HOME/etc/apps/MyNewApp/bin/exevents.py
ExampleReportingCommand
54
import sysfrom splunklib.searchcommands import dispatch, ReportingCommand, Configuration
@Configuration()class ExReportCommand(ReportingCommand):
@Configuration()def map(self, records):
return records
def reduce(self, records):count = 0for r in records:
count += 1return [{'count': count}]
if __name__ == "__main__":dispatch(ExReportCommand, sys.argv, sys.stdin, sys.stdout, __name__)
$SPLUNK_HOME/etc/apps/MyNewApp/bin/exreport.py
Alittleadvice
55
Customcommandsareprograms thatrunonSplunk instances–BEWAREUNVALIDATEDINPUT!– SanitizeuserargumentsANDsearchresults
Userole-basedaccesscontroltorestrictaccess
Bepreparedtohandle1,000,000sofevents
Beexcellenttoeachother.
WhatNow?
56
https://github.com/splunk/splunk-sdk-python– https://github.com/splunk/splunk-sdk-
python/tree/master/examples/searchcommands_app
DevPortalDocumentation– http://dev.splunk.com/view/python-sdk/SP-CAAAEU2
DetailedspecificationforProtocolVersion2availablebyrequestPMContact:MarkGroves<[email protected]>
THANKYOU
StreamingCommandsonlyserializerequiredfields
Internalresultset_raw,_time,_cd,_indextime,...,fieldXa,1400000000,x:y,1400000010,...,BOBa,1400000001,x:y,1400000011,...,JIMa,1400000002,x:y,1400000012,...,BOBa,1400000003,x:y,1400000013,...,JIMa,1400000004,x:y,1400000014,...,JIMa,1400000005,x:y,1400000015,...,BOBa,1400000006,x:y,1400000016,...,JIMa,1400000007,x:y,1400000017,...,BOBa,1400000008,x:y,1400000018,...,BOBa,1400000009,x:y,1400000019,...,JIM
Externalresultset_chunked_idx,fieldX0,BOB1,JIM2,BOB3,JIM4,JIM5,BOB6,JIM7,BOB8,BOB9,JIM
{“required_fields”: [“fieldX”], …}
“Rightouter-join”onrequiredfields
Resultset
Resultset idx
Slice
New
field
ResultsetNew
field
+
Toexternalprocess
idx
Insplunkd• Supports
– Removingevents– Addingevents– Editingfields– Addingfields
• Can’tre-orderevents
Performancecomparison
020406080
100120140160180
Runtim
e(secon
ds)
Splunk
Protocolv1
Protocolv2
2.5millionevents
“Streaming”commandexample
61
... | eval foo=“bar” | ...
field_A field_B field_C
the jumps dog
quick over oops
brown the too
fox lazy many
field_A field_B field_C foo
the jumps dog bar
quick over oops bar
brown the too bar
fox lazy many bar