huguk
13
Benefits of unit testing
• Build confidence
• Enable change
• Describe behaviour
• Accelerate development
15
Hadoop testing challenges
• Framework modularization issues
• Heavyweight execution engine
• Availability of testing utilities
26
Local execution
Framework    Local engine
Hive         Local mode
Cascading    LocalFlowConnector
Pig          Local mode
Crunch       MemPipeline
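The idea behind a local engine can be sketched without any Hadoop dependency: keep the job logic in plain functions and run them against an in-memory collection inside the test. The Python sketch below is illustrative only. The names `map_words`, `reduce_counts` and `run_in_memory` are hypothetical and do not belong to any of the frameworks listed above, and the word-count job is an assumed example.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


def map_words(line: str) -> List[Tuple[str, int]]:
    """Map step: emit (word, 1) for each whitespace-separated token."""
    return [(word.lower(), 1) for word in line.split()]


def reduce_counts(word: str, counts: Iterable[int]) -> Tuple[str, int]:
    """Reduce step: sum the counts collected for one key."""
    return word, sum(counts)


def run_in_memory(
    lines: Iterable[str],
    mapper: Callable[[str], List[Tuple[str, int]]],
    reducer: Callable[[str, Iterable[int]], Tuple[str, int]],
) -> Dict[str, int]:
    """Hypothetical stand-in for a local engine: map, group by key, reduce."""
    grouped = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            grouped[key].append(value)
    return dict(reducer(key, values) for key, values in grouped.items())


# The whole "job" runs in-process, so a unit test needs no cluster.
result = run_in_memory(["to be or not", "to be"], map_words, reduce_counts)
```

Because the map and reduce steps are ordinary functions, they can also be tested one at a time, which is the kind of fast feedback a heavyweight execution engine makes hard to get.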
27
Hadoop testing challenges
• Framework modularization issues
• Heavyweight execution engine
• Availability of testing utilities
28
Helper libraries
Framework    Library
Hive         HiveRunner
Cascading    cascading-test, Plunger
Pig          PigUnit
Crunch       MemPipeline
31
Hive + HiveRunner: pros
• Write/test Hive apps in the same environment
• Seamless UDF development
32
Hive + HiveRunner: cons
• Slow execution
• CSV data – hard to maintain
• Assertions on CSV strings are brittle
• Hadoop compatibility issues
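The brittleness of asserting on CSV strings can be shown with a small, framework-independent Python sketch. It does not use the HiveRunner API. The sample rows and the helper `parse_rows` are assumptions made up for illustration. Two outputs that carry the same logical rows differ in row order and spacing, so a raw string comparison fails while a comparison of parsed, typed rows passes.

```python
import csv
import io
from typing import List, Tuple


def parse_rows(csv_text: str) -> List[Tuple[str, int]]:
    """Parse two-column CSV text into a sorted list of (name, count) tuples."""
    reader = csv.reader(io.StringIO(csv_text))
    return sorted((name.strip(), int(count)) for name, count in reader)


# Hypothetical query output: same logical content, different order and spacing.
expected_csv = "alice,2\nbob,1\n"
actual_csv = "bob, 1\nalice,2\n"

# Raw string comparison is sensitive to ordering and whitespace.
strings_equal = expected_csv == actual_csv

# Comparing parsed rows checks only the data itself.
rows_equal = parse_rows(expected_csv) == parse_rows(actual_csv)
```

Sorting inside `parse_rows` assumes row order is not part of the contract under test. A query with an explicit ORDER BY would need an order-preserving comparison instead.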
37
Conclusions
• Testing possible with most frameworks
• Efficacy largely influenced by framework
• Tooling is immature