© 2016 IBM Corporation
Highlights of the Telecommunications Event Data Analytics toolkit
IBM Streams Version 4.2
Paul Zollna
Senior Software Developer and TEDA Architect
For questions about this presentation contact Paul Zollna
Important Disclaimer
THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY.
WHILE EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED.
IN ADDITION, THIS INFORMATION IS BASED ON IBM’S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM WITHOUT NOTICE.
IBM SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION.
NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, OR SHALL HAVE THE EFFECT OF:
• CREATING ANY WARRANTY OR REPRESENTATION FROM IBM (OR ITS AFFILIATES OR ITS OR THEIR SUPPLIERS AND/OR LICENSORS); OR
• ALTERING THE TERMS AND CONDITIONS OF THE APPLICABLE LICENSE AGREEMENT GOVERNING THE USE OF IBM SOFTWARE.
IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice at IBM’s sole discretion. Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision. The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract. The development, release, and timing of any future features or functionality described for our products remains at our sole discretion.
Agenda
Highlights of the com.ibm.streams.teda application framework (TEDA)
What’s new about the operators and functions
Tutorial & references
DEMO: TEDA & the Secure Application Configuration
DEMO: TEDA & Plug-In for External Applications
Highlights of the application framework
New Context Container composite operator
– The new <namespace>.context.custom::ContextContainer operator is introduced. It allows you to implement multi-level context logic or contexts with different algorithms.
Improved configuration of the Lookup Manager application
– Simplified XML description format
– More flexible referencing of the database configuration
– The database source can now delete lookup data
– The database settings are configurable with Secure Application Configuration
Integration of the partitioned BloomFilter feature
– The configuration, functions and output statements of the partitioned BloomFilter are fully integrated in the ITE application framework
Enhanced handling of CSV files with enrichment data
– Configurable handling of header lines and empty lines
– Configurable handling of quoted attribute values
– Configurable separator, delimiter and end-of-line marker
New shared memory segment naming
– The unique segment naming simplifies host resource sharing
New tuple data export plug-in interface
– External applications can import the tuple data from the ITE application without side effects on the performance of the ITE application
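The configurable CSV handling listed above (header lines, empty lines, quoted values, separator, end-of-line marker) can be illustrated with a minimal Python sketch. This is not the toolkit's parser: the function name read_enrichment and all of its parameter names are hypothetical and only mirror the options named on the slide. The sketch also assumes that quoted values never contain the end-of-line marker.

```python
import csv


def read_enrichment(text, separator=",", quote='"', eol="\n",
                    header_lines=1, skip_empty=True):
    """Parse enrichment data from a CSV string (hypothetical helper).

    separator    -- character between attribute values
    quote        -- character that encloses quoted values
    eol          -- end-of-line marker used to split records
    header_lines -- number of leading lines to discard
    skip_empty   -- drop lines that contain only whitespace
    """
    lines = text.split(eol)
    body = lines[header_lines:]           # discard the header lines
    if skip_empty:
        body = [line for line in body if line.strip()]
    # csv.reader accepts any iterable of strings, one record per item
    return list(csv.reader(body, delimiter=separator, quotechar=quote))


if __name__ == "__main__":
    sample = 'imsi;name\r\n\r\n"262;01";"Alice"\r\n262;Bob\r\n'
    for row in read_enrichment(sample, separator=";", eol="\r\n"):
        print(row)
```

The quoted value "262;01" keeps its embedded separator, while the header line and the empty line are skipped.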
What’s new about the operators and functions
New DirectoryWatch operator
– The new DirectoryWatch operator uses the system's inotify functionality to watch directories and report file changes, using less CPU than the standard spl.adapter::DirectoryScan operator
Enhancement in the error reporting of the CSVParse operator
– The CSVParse operator provides new custom output functions to get error descriptions when parsing fails
– Detailed fault information about the position in the record is provided in case of a failure
New functions in the com.ibm.streams.teda.file and com.ibm.streams.teda.file.path namespaces
– symlink - creates symbolic links in the file system.
– space - determines the total, free, and available disk space capacity for a mounted file system.
– dirname - extracts the string from the provided path, which specifies the parent directory.
– filename - extracts the string from the provided path, which specifies the file name.
– stem - extracts the string from the provided path, which specifies the file name without the extension.
– extension - extracts the string from the provided path, which specifies the extension of the file name.
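To make the behavior of the new file and path functions concrete, here is a rough Python analogue built on the standard library. It is only an illustration of what each function is described as doing; the actual SPL signatures and return types are not given on the slide and are not reproduced here. In particular, whether the toolkit's extension function includes the leading dot is an assumption taken from Python's behavior.

```python
import os
import tempfile
from pathlib import PurePosixPath


def dirname(path):
    """Parent directory part of the path."""
    return str(PurePosixPath(path).parent)


def filename(path):
    """File name part of the path."""
    return PurePosixPath(path).name


def stem(path):
    """File name without its (last) extension."""
    return PurePosixPath(path).stem


def extension(path):
    """Last extension of the file name, including the dot (Python convention)."""
    return PurePosixPath(path).suffix


def space(path):
    """Total, free, and available bytes of the file system holding path (POSIX only)."""
    st = os.statvfs(path)
    return (st.f_blocks * st.f_frsize,   # total capacity
            st.f_bfree * st.f_frsize,    # free, including reserved blocks
            st.f_bavail * st.f_frsize)   # available to unprivileged users


def symlink(target, link):
    """Create a symbolic link named link that points to target."""
    os.symlink(target, link)


if __name__ == "__main__":
    p = "/data/in/cdr_0001.csv"
    print(dirname(p), filename(p), stem(p), extension(p))
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "a.txt")
        open(target, "w").close()
        symlink(target, os.path.join(d, "b.txt"))
        print(os.path.islink(os.path.join(d, "b.txt")))
```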
Additional Resources
Reference in IBM Knowledge Center - IBM Streams 4.2 com.ibm.streams.teda:
– http://www.ibm.com/support/knowledgecenter/SSCRJU_4.2.0/com.ibm.streams.toolkits.doc/spldoc/dita/tk$com.ibm.streams.teda/tk$com.ibm.streams.teda.html
TEDA Tutorial for versions 1.0.2 & 2.0.0:
– http://ibmstreams.github.io/streamsx.tutorial.teda
A TEDA demoapp sample on https://demo.ibmcloud.com
– Available for demos on request
An Introduction to Streaming Telecommunications Event Data Analytics:
– https://developer.ibm.com/streamsdev/docs/introduction-streaming-telecommunications-event-data-analytics-teda
Getting Started with Streaming Telecommunications Event Data Analytics:
– https://developer.ibm.com/streamsdev/docs/getting-started-streaming-telecommunications-event-data-analytics-teda
TEDA & Application Configuration
The Secure Application Configuration feature
The database as the source that provides the enrichment data
Demo: Lookup Manager application using Streams Console to configure database credentials
Secure Application Configuration
Application-specific set of properties in secure storage
API implemented in SPL, C++ and Java
Based on JMX communication
Implemented in Streams Console
JMX API to manage the Secure Application Configuration store on the instance or domain level
TEDA framework with database configuration
Sensitive database credentials stored in secure storage
– The configuration files do not include credentials in plain text
• Lookup Manager configuration file for default settings: config.cfg
• database configuration file is not involved: connections.xml
Changeable configuration at runtime
– Update the password without cancelling and resubmitting the application job
Demo
Specify the database properties in the Secure Application Configuration
Specify the Application Configuration in the TEDA framework
Load the enrichment data from the database with the Lookup Manager application and process files in the ITE application
Details
Properties for Secure Application Configuration
– lm.db.name: DEMOAPP
– lm.db.user: db2inst1
– lm.db.password: <password>
Configuration of the Lookup Manager application
– lm.applicationConfiguration=MyApplConfig
– lm.db.connectionName=DEMOAPP
– lm.db=on
– lm.file=off (optional)
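The idea behind these settings, resolving the database credentials from a named application configuration at connect time instead of reading them from a file, can be sketched in a few lines of Python. This is a conceptual model only: a plain dictionary stands in for the secure storage, and the function resolve_db_settings is hypothetical. The real store is accessed through the SPL, C++ or Java API, which is not modeled here.

```python
def resolve_db_settings(lm_config, secure_store):
    """Look up database properties by application configuration name.

    lm_config    -- Lookup Manager settings (keys as shown on the slide)
    secure_store -- stand-in for the secure storage: name -> properties
    """
    app_config_name = lm_config["lm.applicationConfiguration"]
    properties = secure_store[app_config_name]
    return {
        "name": properties["lm.db.name"],
        "user": properties["lm.db.user"],
        "password": properties["lm.db.password"],
    }


# Placeholder values; the password strings are invented for the example.
secure_store = {
    "MyApplConfig": {
        "lm.db.name": "DEMOAPP",
        "lm.db.user": "db2inst1",
        "lm.db.password": "old-secret",
    }
}
lm_config = {
    "lm.applicationConfiguration": "MyApplConfig",
    "lm.db.connectionName": "DEMOAPP",
    "lm.db": "on",
    "lm.file": "off",
}

if __name__ == "__main__":
    print(resolve_db_settings(lm_config, secure_store)["user"])
    # Changing the stored password takes effect on the next lookup,
    # without touching lm_config or restarting anything.
    secure_store["MyApplConfig"]["lm.db.password"] = "new-secret"
    print(resolve_db_settings(lm_config, secure_store)["password"])
```

Because the lookup happens on each call, a password changed in the store is picked up on the next connection attempt, which corresponds to the runtime update without job cancellation described earlier.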
Additional Resources
IBM Knowledge Center reference:
– http://www.ibm.com/support/knowledgecenter/SSCRJU_4.2.0/com.ibm.streams.toolkits.doc/spldoc/dita/tk$com.ibm.streams.teda/tk$com.ibm.streams.teda$101.html
TEDA Tutorial in Module 11:
– http://ibmstreams.github.io/streamsx.tutorial.teda/docs/2.0.0/Module-11
Plug-In for TEDA & External Applications
Plug-in interfaces in the ITE application
Demo: Export of deduplication data to an external application using the plug-in interface in the ITE application
TEDA toolkit Plug-in
The TEDA application framework provides plug-in interfaces to export tuple data to an external application
One or more external applications can connect to 4 different plug-in points:
– Reader
– Transformer
– Writer
– Dedup
Poor performance of an external application does not affect the performance of the ITE file processing
No back pressure from external applications
The connections are monitored by metrics
Plug-in interfaces in the ITE application
[Figure: block diagram of the ITE application (Control, IngestFiles, ChainProcessorReader, ChainProcessorTransformer, Context, ChainSink, ChainControl) with taps labeled reader, transformer, writer, and dedup. The legend distinguishes common, custom, and optional custom components.]
Demo
Specify the ITE configuration to export deduplication data
Specify the parameter of the Export operator to import the data from the ITE application
Connect a "fast" and a "slow" importer to the ITE application and compare the performance of both jobs
Plug-in interface in ITE application framework
The ITE application framework provides 4 plug-in configurations
– The ITE application provides 4 unique export properties
– The <namespace>.streams::TypesCommon composite provides the exported stream schema specification for each plug-in configuration
New congestionPolicy parameter in the spl.adapter::Export operator
– Specifies the congestion policy of the stream that is exported
– Applicable values:
• dropConnection: The connection is dropped when a downstream importer is not keeping up. The nBrokenConnections metric indicates the connection drop count at the output port
• wait: The output port causes back pressure when congested
Value | Export property | Exported SPL schema
reader | ite="<namespace>.chainprocessor.reader_output_RecordValidator" | TypesCommon.ReaderOutStreamType
transformer | ite="<namespace>.chainprocessor.transformer_output_DataProcessor" | TypesCommon.TransformerOutType
writer | ite="<namespace>.chainsink_input_Writer" | TypesCommon.ChainSinkStreamType
dedup | ite="<namespace>.context_output_Dedup" | TypesCommon.TransformerOutType
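The difference between the two congestionPolicy values can be illustrated with a small, deterministic Python simulation. It is a toy model and not Streams code: the function simulate, the bounded buffer, and the step-based slow importer are all assumptions made for illustration, and the way a dropped connection discards buffered tuples and then resumes is a simplification.

```python
from collections import deque


def simulate(policy, n_tuples, capacity=5, consume_every=4):
    """Send n_tuples to one slow importer through a bounded buffer.

    The importer removes one tuple on every consume_every-th step.
    Returns (exporter_steps, delivered, n_broken_connections).
    """
    if policy not in ("dropConnection", "wait"):
        raise ValueError("unknown policy: " + policy)
    buffer = deque()
    steps = sent = delivered = broken = 0
    while sent < n_tuples:
        steps += 1
        if steps % consume_every == 0 and buffer:
            buffer.popleft()          # slow importer takes one tuple
            delivered += 1
        if len(buffer) < capacity:
            buffer.append(sent)       # room available: tuple is queued
            sent += 1
        elif policy == "dropConnection":
            broken += 1               # congested: connection is dropped
            buffer.clear()            # queued tuples are lost (simplification)
            sent += 1                 # exporter moves on without blocking
        else:
            continue                  # "wait": exporter is blocked this step
    return steps, delivered, broken


if __name__ == "__main__":
    for policy in ("dropConnection", "wait"):
        steps, delivered, broken = simulate(policy, 100)
        print(policy, steps, delivered, broken)
```

With dropConnection the exporter needs exactly one step per tuple and the broken-connection count rises, whereas with wait the exporter needs more steps than tuples because it is held back by the slow importer.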
Details
Importer settings
– Import of ITE stream schema types
• use demoapp.streams::*;
– Output stream type of the Importer
• stream<TypesCommon.TransformerOutType> In = Import()
– Set subscription
• param subscription : ite=="demoapp.context_output_Dedup";
Configuration of the ITE application
– Specify the list of exporters, here Dedup only
• ite.export.streams=dedup
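The relationship between a plug-in value, its export property, and the importer's subscription expression can be written down as a small Python helper. The property suffixes are copied from the table of export properties in this presentation; the helper names export_property and subscription are hypothetical and exist only for this sketch.

```python
# Suffix of the "ite" export property for each plug-in value,
# as listed in the table of export properties.
EXPORT_SUFFIX = {
    "reader": "chainprocessor.reader_output_RecordValidator",
    "transformer": "chainprocessor.transformer_output_DataProcessor",
    "writer": "chainsink_input_Writer",
    "dedup": "context_output_Dedup",
}


def export_property(namespace, value):
    """Value of the 'ite' export property for one plug-in configuration."""
    return namespace + "." + EXPORT_SUFFIX[value]


def subscription(namespace, value):
    """Subscription expression an importer would use for that export."""
    return 'ite=="' + export_property(namespace, value) + '"'


if __name__ == "__main__":
    print(subscription("demoapp", "dedup"))
```

For the demoapp namespace and the dedup value, this yields the subscription shown in the importer settings above.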
Additional Resources
IBM Knowledge Center reference
– Description of the ite.export.streams configuration parameter:
• http://www.ibm.com/support/knowledgecenter/SSCRJU_4.2.0/com.ibm.streams.toolkits.doc/spldoc/dita/tk$com.ibm.streams.teda/tk$com.ibm.streams.teda$184.html
– Description of the congestionPolicy parameter in the spl.adapter::Export operator:
• http://www.ibm.com/support/knowledgecenter/en/SSCRJU_4.2.0/com.ibm.streams.toolkits.doc/spldoc/dita/tk$spl/op$spl.adapter$Export.html#spldoc_operator__parameter__congestionPolicy
TEDA Tutorial in Module 12:
– http://ibmstreams.github.io/streamsx.tutorial.teda/docs/2.0.0/Module-12
Thank YOU!!!