accumulo-summit
DESCRIPTION
Speaker: Bill Havanki

Accumulo includes a remarkable breadth of testing frameworks, which helps to ensure its correctness, performance, robustness, and protection of your vital data. This presentation takes you on a tour from Accumulo's basic unit testing up through performance and scalability testing exercised on running clusters. Learn the extent to which Accumulo is put through its paces before it is released, and get ideas for how you can similarly enhance testing of your own code.
A Tour of Internal Apache Accumulo Testing
Bill Havanki
Solutions Architect, Cloudera Government Solutions
©2014 Cloudera, Inc. All rights reserved.
I’ll be your skipper for as far as we get
CC BY-ND 2.0 Loren Javier
Itinerary
• unit, functional, and integration
• static analysis
• continuous and randomwalk
• memory stress and failures
• upgrade
• scalability and performance
CC BY-ND 2.0 Loren Javier
Basic Tests
Don’t worry if it’s crowded now ... there’ll be lots of room after these first few slides
Unit testing
• just over 1000 unit tests
• JUnit with EasyMock
• can run under Jenkins
• less focus here than long-running tests
CC BY-ND 2.0 Loren Javier
Functional testing (through 1.5.x)
• 72 or so
• implemented in Java with Python wrappers
• need HDFS / ZK running
• either Accumulo installed or built
• can run under MR
CC BY-ND 2.0 Loren Javier again
Integration testing (1.6.0+)
• just over 100
• pure Java
• run under Maven
• can run under Jenkins
• uses MiniAccumuloCluster
CC BY-ND 2.0 guess who? Loren Javier
Static analysis
• FindBugs
• PMD and CPD
• Eclipse
• Coverity
CC BY-ND 2.0 still Loren Javier
Small Cluster Tests
The python is one of the jungle’s most fascinating creatures. Look at how many animals get wrapped up in the subject
Continuous ingest
• two phases
  • ingest
  • verify
• run for 24 or 72 hours
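
The ingest/verify split can be illustrated with a toy, in-memory sketch. It is not Accumulo code: it assumes, purely for illustration, that each ingested entry records the row of an earlier entry, so the verify phase can check that every referenced row still exists. The dict-backed `table` and the `ingest` and `verify` functions are hypothetical names.

```python
import random


def ingest(table, count, rng):
    """Ingest phase: write `count` entries, each pointing at an earlier row."""
    rows = list(table)
    for _ in range(count):
        row = "%016x" % rng.getrandbits(64)
        # The very first entry has nothing to point at, so it references None.
        prev = rng.choice(rows) if rows else None
        table[row] = prev
        rows.append(row)


def verify(table):
    """Verify phase: return every row whose referenced row is missing."""
    return sorted(row for row, prev in table.items()
                  if prev is not None and prev not in table)


rng = random.Random(42)
table = {}            # stand-in for a real table (assumption)
ingest(table, 1000, rng)
print(verify(table))  # nothing was lost, so no dangling references

# Simulate data loss by dropping a row that some other entry points at.
lost = next(prev for prev in table.values() if prev is not None)
del table[lost]
print(len(verify(table)) >= 1)
```

Under this assumption, lost data shows up as a dangling reference, so the verify phase never needs to know what was written, only that the entries are consistent with each other.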
CC BY-ND 2.0 oh hello there, Loren Javier
Continuous ingest
other processes during ingest
• walkers
• scanners
• agitators (≈ Chaos Monkey)
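
As a rough picture of what an agitator does, the sketch below randomly stops and restarts simulated servers while always leaving at least one running. Everything here is an assumption made for illustration: the `Agitator` class, the server names, and the 50/50 choice between killing and restarting are hypothetical and do not come from Accumulo's own scripts.

```python
import random


class Agitator:
    """Toy agitator: randomly stops and restarts simulated servers."""

    def __init__(self, servers, rng, min_running=1):
        self.running = set(servers)
        self.stopped = set()
        self.rng = rng
        self.min_running = min_running

    def step(self):
        """Kill one running server, or restart one stopped server."""
        can_kill = len(self.running) > self.min_running
        if self.stopped and (not can_kill or self.rng.random() < 0.5):
            server = self.rng.choice(sorted(self.stopped))
            self.stopped.remove(server)
            self.running.add(server)
            return ("restart", server)
        if can_kill:
            server = self.rng.choice(sorted(self.running))
            self.running.remove(server)
            self.stopped.add(server)
            return ("kill", server)
        return ("noop", None)


agitator = Agitator(["tserver1", "tserver2", "tserver3"], random.Random(7))
for _ in range(10):
    action, server = agitator.step()
    print(action, server, "running:", sorted(agitator.running))
```

In a real run the kill and restart steps would act on actual processes while ingest continues, and the later verify phase would show whether any data was lost along the way.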
CC BY-ND 2.0 that’s about enough, Loren Javier
Joke time
Knock knock.
I said, KNOCK KNOCK.
Safari.
Safari, so good. You haven’t fallen asleep yet.
Randomwalk
• traversal of graph of test actions
  • 8 fundamental graphs
  • several more combining them
• can be limited by hops or run time
• usually run for 24 hours
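
The idea of walking a graph of test actions under a hop or time limit can be sketched as follows. This is a minimal illustration, not the randomwalk framework itself: the `random_walk` function, its parameters, and the node names in `graph` are all hypothetical.

```python
import random
import time


def random_walk(graph, start, rng, max_hops=None, max_seconds=None, visit=None):
    """Walk `graph` from `start` until a hop limit, time limit, or dead end."""
    if max_hops is None and max_seconds is None:
        raise ValueError("set max_hops, max_seconds, or both")
    deadline = None if max_seconds is None else time.monotonic() + max_seconds
    node, path = start, [start]
    while True:
        if visit is not None:
            visit(node)  # a real test would perform the node's action here
        if max_hops is not None and len(path) - 1 >= max_hops:
            break
        if deadline is not None and time.monotonic() >= deadline:
            break
        neighbors = graph.get(node, [])
        if not neighbors:
            break  # dead end: no outgoing edges
        node = rng.choice(neighbors)
        path.append(node)
    return path


# Hypothetical graph: each key is a test action, each value its successors.
graph = {
    "create_table": ["write", "scan"],
    "write": ["scan", "write", "delete_table"],
    "scan": ["write", "delete_table"],
    "delete_table": ["create_table"],
}
path = random_walk(graph, "create_table", random.Random(3), max_hops=8)
print(" -> ".join(path))
```

Seeding the random generator makes a given walk repeatable, which helps when a long run fails and the same sequence of actions needs to be replayed.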
CC BY-ND 2.0 look at me, I’m Loren Javier, I take all the best pictures
Memory stress
• two running parts
  • reader
  • writer
• configure for large keys or values (e.g., > 1 MB)
• run for about an hour
CC BY 2.0 Justin Ennis
Failures
• namenode failover
• datanode failure
• tablet server failure
• ZooKeeper failure
• loss of tablet server node
• loss of master node
CC BY-ND 2.0 Loren Javier, back in the saddle
Upgrades
less formal, run manually
Cloudera has a couple of tests here
• data compatibility test
• ACL compatibility test
CC BY-ND 2.0 Loren “likes the ride a bit too much” Javier
Large Cluster Tests
Look at all the elephants! Go ahead and take pictures, they all have their trunks on
Scalability
• ingest millions of entries
• run over tens of nodes (up to 80)
• watch performance, esp. across versions
CC BY-ND 2.0 Loren Javier, shutterbug extraordinaire
Joke time
Why did the elephant quit her testing job?
• She was tired of working for peanuts.
• She had had it up to ear.
• She herd about a better job.
• There were just too many wrinkles.
• Her desire to do it was stamped out.
Performance - Benchmarks
8 benchmark implementations, e.g.:
• lots of small records
• create, delete many tables
• insert one terabyte of data
CC BY-ND 2.0 Loren ... Javier
Performance - Yahoo! Cloud Serving Benchmark
• Accumulo binding
• workloads with 2 phases
  • load data
  • run test
• customizable record count, field length, etc.
• multiple clients
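
The two-phase shape of such a workload, with its tunable record count and field length, might look like the toy sketch below. It runs against a plain dict rather than a database and does not use YCSB itself; `load`, `run`, and their parameters are hypothetical names chosen for the illustration.

```python
import random
import string


def load(store, record_count, field_length, rng):
    """Load phase: insert `record_count` records with random field values."""
    for i in range(record_count):
        store["user%d" % i] = "".join(
            rng.choice(string.ascii_lowercase) for _ in range(field_length))


def run(store, operation_count, read_proportion, field_length, rng):
    """Run phase: mix reads and updates over the loaded keys; count each."""
    keys = sorted(store)
    counts = {"read": 0, "update": 0}
    for _ in range(operation_count):
        key = rng.choice(keys)
        if rng.random() < read_proportion:
            _ = store[key]                    # read
            counts["read"] += 1
        else:
            store[key] = "x" * field_length   # update
            counts["update"] += 1
    return counts


rng = random.Random(0)
store = {}  # stand-in for the database under test (assumption)
load(store, record_count=100, field_length=10, rng=rng)
print(run(store, operation_count=1000, read_proportion=0.95,
          field_length=10, rng=rng))
```

Keeping the load phase separate means the run phase can be repeated with different operation mixes against the same data, and several clients could each execute the run phase at once.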
CC BY-ND 2.0 Nancy Nally
Returning to the Summit
Make sure you have all your personal belongings with you ... smartphones, small children ... children left behind will be forced to write unit tests
Still more to do
• more unit and integration test coverage (≈30%)
• attention to static analysis
• chasing performance regressions
• easier to run
CC BY-ND 2.0 nice work, Loren Javier
Thanks
• Sean Busbey & Mike Drob
• Accumulo committers and community
• fellow Clouderans
• Loren Javier
• my audience
CC BY 2.0 Justin Ennis
Visit our booth!
CC BY-ND 2.0 Loren Javier, head photographer
The most dangerous part of our journey
CC BY-ND 2.0 Loren Javier
Please exit the boat
Watch your step and mind your head. If you hit your head, watch your language