DESCRIPTION
Software log file analysis helps immensely in software testing and troubleshooting. The first step in automated log file analysis is extracting log data, which requires decoding the log file syntax and interpreting the data semantics. The expected output of this phase is the extracted data, organized for further processing. Log data extractors can be developed in popular programming languages, each targeting one or a few log file formats. Rather than repeating this process for each log file format, it is desirable to have a generic scheme for interpreting the elements of a log file and filling a data structure suitable for further processing. The new log data extraction scheme introduced in this paper is an attempt to provide the advanced features demanded by modern log file analysis procedures. It is a generic scheme capable of handling both text and binary log files with complex structures and difficult syntax. Its output is a tree filled with the information of interest for the particular case. My speech in ICSCA 2011 - http://dileepaj.blogspot.com/2011/07/speech-in-icsca-2011.html
Dileepa Jayathilake
A Novel Mind Map Based Approach for Log Data Extraction
Department of Electrical Engineering, University of Moratuwa, Sri Lanka
ICIIS 2011
AGENDA
Background
Problem Identification
Solution Overview
Solution Design
Implementation
Conclusion
BACKGROUND
LOG FILE ANALYSIS
Functional Conformance
Quality Verification
Troubleshooting
System Administrators
Domain Experts
Developers
Testers
Application Logs
Monitoring Tool Logs
PITFALLS IN MANUAL APPROACH
Require Expertise
Labor Intensive
Error-prone
Advantage of Recurrence not used
PROBLEM IDENTIFICATION
Challenges
Result
Automation abandoned
Proprietary Implementation
Costly
Rules not human readable
Difficult to add new rules
Less resilient to format changes
CHALLENGES
Reports not customizable
Different log formats & structure
Lack of a common platform
Making rules human & machine readable
XML
Universal format
Ubiquitous use
Many tools available
Costly meta data
Less human readable
Associated languages are complex
Not every log is XML
Log File Grammars
Formal definitions
Regular expression based
Assume line logs
Fail with complex log file structures
Unable to handle difficult syntax
Distant from XML
EXISTING SUPPORT
Handle arbitrary formats and structures of log files
In line with XML
Friendly for non-developers
Ability to generate custom reports
A GENERIC LOG ANALYSIS FRAMEWORK
+
Resilient to log file format and structure changes
A knowledge representation which is both human and machine readable
EXPECTATIONS
SOLUTION OVERVIEW
Interpretation: Unified mechanism for extracting information of interest from both text and binary log files with arbitrary structure and format
Processing: Easy mechanism to build and maintain a rule base for inferences
Presentation: Flexible means for generating custom reports from inferences
Log Files
Knowledge Representation Schema
Resembles human knowledge organization better
MIND MAPS
Easy to add content
Easy to visualize
Easy access to computers
Tree
Can utilize existing tree algorithms
Easily convertible to XML
Can utilize existing tools
Easy to combine
MIND MAP AS KNOWLEDGE UNIT
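The points above (a mind map is a tree, and a tree is easily convertible to XML) can be illustrated with a minimal sketch. The class name MindMapNode, the function to_xml, and the XML element and attribute names ("node", "TEXT") are assumptions made for illustration; the slides do not specify the actual schema used by the framework.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List


@dataclass
class MindMapNode:
    """One node of a mind map: a text label plus child nodes (hypothetical type)."""
    text: str
    children: List["MindMapNode"] = field(default_factory=list)

    def add(self, text: str) -> "MindMapNode":
        """Attach a new child with the given label and return it."""
        child = MindMapNode(text)
        self.children.append(child)
        return child


def to_xml(node: MindMapNode) -> ET.Element:
    """Recursively convert a mind map node into an XML element.

    The element name "node" and attribute "TEXT" are assumed names.
    """
    element = ET.Element("node", {"TEXT": node.text})
    for child in node.children:
        element.append(to_xml(child))
    return element


# Build a tiny mind map for one extracted log entry and serialize it.
root = MindMapNode("log file")
entry = root.add("entry")
entry.add("val")
entry.add("2.3")
xml_text = ET.tostring(to_xml(root), encoding="unicode")
print(xml_text)
```

Because the structure is an ordinary tree, standard traversal and merge algorithms apply to it unchanged, and any XML tool can consume the serialized form.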
SOLUTION DESIGN
GENERIC INTERPRETATION
Interpretation: Unified mechanism for extracting information of interest from both text and binary log files with arbitrary structure and format
Log Files
SOLUTION IMPLEMENTATION
LOG FILE GRAMMAR
Assumes knowledge of file structure and syntax
Able to handle a spectrum of log file types
Based on hierarchical log entries
Log entries identified by attribute combination
PARSER
Translates a log file into a mind map
Resilient to malformed log files
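As a rough illustration of a parser that translates a log file into a tree and tolerates malformed input, the sketch below skips lines it cannot interpret and keeps going. It is not the author's parser: the "name = value" line format, the Node class, and the functions parse_line and log_to_mind_map are all hypothetical, and a simple string split stands in for grammar-driven matching.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """A mind map node: a label plus children (hypothetical type)."""
    text: str
    children: List["Node"] = field(default_factory=list)


def parse_line(line: str) -> Optional[Tuple[str, str]]:
    """Split an assumed 'name = value' line; return None if it is malformed."""
    name, sep, value = line.partition("=")
    name, value = name.strip(), value.strip()
    if not sep or not name or not value:
        return None
    return name, value


def log_to_mind_map(lines: List[str]) -> Tuple[Node, List[int]]:
    """Build a tree from the readable lines and report the skipped line numbers."""
    root = Node("log")
    skipped: List[int] = []
    for number, line in enumerate(lines, start=1):
        parsed = parse_line(line)
        if parsed is None:
            skipped.append(number)  # malformed line: record it and continue
            continue
        name, value = parsed
        root.children.append(Node(name, [Node(value)]))
    return root, skipped


root, skipped = log_to_mind_map(["val = 2.3", "### corrupted ###", "count=7"])
print([child.text for child in root.children], skipped)
```

The design point is only that a single unreadable entry does not abort extraction; how the actual parser recovers from corruption is not described in the slides.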
EXAMPLE
LE ≡ ([A, S, E, S, B], NO)
A ≡ ([A1, A2, A3], NO)
A1 ≡ (‘v’); A2 ≡ (‘a’); A3 ≡ (‘l’)
S ≡ ({SPACE, TAB}, -1, 0, NO)
SPACE ≡ (‘ ’); TAB ≡ (‘\t’)
E ≡ (‘=’)
B ≡ ({ZERO, ONE, …, NINE, DECIMAL_POINT}, -1, 1)
ZERO ≡ (‘0’); ONE ≡ (‘1’); …; NINE ≡ (‘9’); DECIMAL_POINT ≡ (‘.’)
val = 2.3
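The grammar above can be read as a small matcher, sketched below under stated assumptions. The slide does not define the notation, so the sketch assumes that square brackets denote a sequence, that braces denote a repeated choice whose two numbers are (maximum, minimum) occurrence counts with -1 meaning unbounded, and it ignores the NO flag entirely. The class names Literal, Sequence, Repeat and the functions match and parse_entry are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Literal:
    text: str  # matches exactly this text


@dataclass
class Sequence:
    parts: List[object]  # every part must match, in order


@dataclass
class Repeat:
    choices: List[object]  # any one choice may match on each repetition
    min_count: int
    max_count: int  # -1 is assumed to mean "unbounded"


def match(rule, text: str, pos: int) -> Optional[int]:
    """Return the position after matching rule at pos, or None on failure."""
    if isinstance(rule, Literal):
        return pos + len(rule.text) if text.startswith(rule.text, pos) else None
    if isinstance(rule, Sequence):
        for part in rule.parts:
            pos = match(part, text, pos)
            if pos is None:
                return None
        return pos
    if isinstance(rule, Repeat):
        count = 0
        while rule.max_count < 0 or count < rule.max_count:
            for choice in rule.choices:
                end = match(choice, text, pos)
                if end is not None and end > pos:
                    pos, count = end, count + 1
                    break
            else:
                break  # no choice matched: stop repeating
        return pos if count >= rule.min_count else None
    raise TypeError(f"unknown rule type: {rule!r}")


# The grammar from the slide, under the assumptions stated above.
A = Sequence([Literal("v"), Literal("a"), Literal("l")])
S = Repeat([Literal(" "), Literal("\t")], min_count=0, max_count=-1)
E = Literal("=")
B = Repeat([Literal(ch) for ch in "0123456789."], min_count=1, max_count=-1)
LE = Sequence([A, S, E, S, B])


def parse_entry(line: str) -> Optional[dict]:
    """Match LE part by part and return the attribute name and value text."""
    pos, spans = 0, []
    for part in LE.parts:
        end = match(part, line, pos)
        if end is None:
            return None
        spans.append(line[pos:end])
        pos = end
    if pos != len(line):
        return None  # trailing text that the grammar does not cover
    return {"attribute": spans[0], "value": spans[4]}


print(parse_entry("val = 2.3"))
```

The matcher is greedy and does not backtrack, which is enough for this example but would not cover every grammar expressible in the notation.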
MICROSOFT SHAREPOINT LOG FILE
Difficult syntax
MICROSOFT APPLICATION VERIFIER LOG
XML
TRADING SYSTEM LOG
Corrupted Log
CONCLUSION
The new scheme is capable of expressing both text and binary log files with different structures and formats, ranging from flat messages to complex hierarchies.
QUESTIONS