Andrew Rendell (Principal Consultant at Valtech) presented "Steer and/or sink the supertanker" at the SPA conference on 14 June 2011. A case study of the pros and cons of over three years' experience of continuous source code analysis, followed by an interactive session using the real tool on real source code. Andrew discusses what Continuous Inspection is and why directing software development can feel like trying to steer a supertanker.
Continuous source code analysis to help steer the super tanker and/or hit the iceberg ANDREW RENDELL, PRINCIPAL CONSULTANT
Image found on: http://www.flickr.com/photos/winkyintheuk/
INTRODUCTION
What is Continuous Inspection?
Directing software development can feel like trying to steer a super tanker.
–This practice might help you succeed, or...
–You might still hit the iceberg.
–You might even hit the iceberg because of this practice!
Agile techniques focus on retrospection and reflection
Gathering metrics on source code is nothing new
Continuous Inspection has in theory been possible for many years but has gained momentum recently, possibly because of new, easy-to-use tools such as Maven and Sonar
New focus on the temporal aspect of the data (the 4th dimension) and web-based graphical analysis
CONTINUOUS INSPECTION OF EVOLVING SOFTWARE CAN...
Give developers and architects an incredible insight into real software quality as it changes
Get the feedback required to make a development team adaptive to real issues
Allow the technical architect to apply lighter direction empowering the team and helping it become self-governing
Allow clients, sponsors and managers to explore and investigate their codebase through visualisation tools
Facilitate an understanding of underlying trends of development quality and the consequences of actions on that code base
Allow architects and developers to assimilate large swathes of code then drill down to specific areas in seconds rather than spending hours wading through generated data and code reviews
It might be possible to
STEER THE SUPER TANKER*
that is non-trivial software development
* I know it’s not a super tanker but it is very cool
Image found on: http://www.flickr.com/photos/mikebaird/
CONTINUOUS INSPECTION OF EVOLVING SOFTWARE CAN...
Supply developers and architects with a surprising amount of misinformation
Provide metrics which can be abused by developers and managers alike to prove pet theories or disprove unpopular ideas
The act of looking at a metric often results in conscious or subconscious gaming of that metric for short term gain with no real benefit other than the temporarily improved remuneration or status
Cause well-meaning, intelligent individuals to become incredibly excited by the presentation of an attractive infographic whilst having zero understanding of what it is they are seeing
Cause architects and developers to become obsessed by a gradient on a graph or the exact hue of a pulsating ball and forget about working software.
The development team will
STILL HIT THE ICEBERG
of software entropy
Image found http://www.flickr.com/photos/alanvernon/
SESSION STRUCTURE
Case studies – a number of experiences from various projects which highlight the positives and negatives of Continuous Inspection
HANDS-ON TUTORIAL – using an interactive tool (Sonar) to investigate a code base
OUR EXPERIENCES OF CONTINUOUS INSPECTION
Image found http://www.flickr.com/photos/83508181@N00/
FINDING THE TECHNICAL DEBT
With a large legacy application, can these tools help us locate the areas of high debt?
A large and relatively opaque code base.
With high volatility (i.e. a high number of pushes to the repository).
Geographically disparate team, knowledge spread across several time zones.
Attempt to increase velocity and reduce development, test and deployment costs by reducing technical debt and increasing test coverage.
With almost 100k lines of code, where do you start?
STEER THE SUPER TANKER
SEVERAL SOURCES OF DATA
– Human knowledge, which component is important and troublesome?
– Source control statistics, which files are amended the most?
– Issue tracking, can we identify areas with most defects or changes?
– Static source code analysis metrics augmented by test coverage
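The source-control statistic above can be gathered very cheaply. A rough sketch (not the project's actual script; the top-20 cut-off is arbitrary) that ranks files by the number of commits touching them, as a proxy for volatility:

```shell
# Rank files by the number of commits that touched them -- a
# cheap proxy for volatility. Run inside a Git work tree; the
# guard makes the snippet a no-op elsewhere.
if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
  git log --pretty=format: --name-only \
    | grep -v '^$' \
    | sort | uniq -c | sort -rn | head -20
fi
```

Cross-referencing the most-touched files against defect counts and complexity scores is what narrows down where the debt really lives.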
[Screenshots, annotated: "Only real candidate" and "This is where we start"]
Not low-hanging fruit; those are genuinely rotten
The Sonar report from the previous screen is created automatically every morning by CI
Very little effort to check
Means that technical leads receive visual feedback that corrective action is being taken
GAMING THE METRIC
As soon as you focus on a metric, it's likely to be gamed, with unexpected results.
A large legacy code base with a team spread around the world.
We know we have too much technical debt because:
– Simple changes take too long
– The teams spend almost all their time fire fighting in production
– Complex changes never make it out of development
HIT THE ICEBERG
Some of the code has a ‘good’ level of complexity
Some of the code is ‘too complex’
Some modules are larger than others
This is a zero-test system!
This system obviously would be better with tests
Can say with some confidence that not having tests is one of the reasons for this system’s perceived poor quality
This is a very easy metric to measure
SOLUTION:
– Let's increase test coverage
– Let's motivate development teams across the world by making their end-of-year bonus dependent on achieving a certain level of unit test coverage
BLIP ON THE RADAR
[Chart, annotated: "Day before bonus day" – coverage spikes]
What happened here? Spikes in LOC and coverage.
When you measure continually, you have to be ready to react to erroneous data now and again (in this case, the inclusion of generated code).
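One common mitigation for the generated-code problem is to exclude those sources from the analysis. A minimal sketch of a Sonar properties entry – the path pattern is illustrative, and the exact property name should be checked against your Sonar version's documentation:

```properties
# Exclude generated sources so they do not distort
# LOC, complexity and coverage figures (pattern illustrative).
sonar.exclusions=**/generated/**/*.java
```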
EMPIRICAL EVIDENCE
Metrics, even without a tool like Sonar, can provide valuable empirical evidence on the state of the system
A multi-team project with between eight and fifteen developers at any one time
Geographically co-located with strong cross team communications and management
Before using Sonar, metrics were collected via shell scripts and Excel
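That pre-Sonar collection can be as simple as the following sketch (the `src` path and CSV layout are illustrative, not the team's actual script):

```shell
# Append one CSV row per run -- date, total lines of Java
# source -- for later charting in a spreadsheet. The 'src'
# path and CSV layout are illustrative.
if [ -d src ]; then
  loc=$(find src -name '*.java' -exec cat {} + | wc -l | tr -d ' ')
  echo "$(date +%F),$loc" >> metrics.csv
fi
```

Run daily (e.g. from CI or cron), the accumulating rows give exactly the kind of trend line the graphs below are built from.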
STEER THE SUPER TANKER
Anecdotally, team felt complexity was rising and velocity slowing.
Culture of refactoring, rationalising and retiring code.
– Was strategy working?
The following graph showed an unexpected level of the problem.
Code has a consistently average complexity per method.
There is simply more code being written every release.
Analysis of code found duplication at the functional rather than code level
Team’s ‘gut feeling’ was right in that there was a problem.
Gathering the metrics provided empirical evidence of the issue.
Complexity growth rate had been underestimated.
Continually collecting these metrics might not have stopped the team creating the situation.
This evidence was enough to justify a significant (12 man weeks) investment in refactoring.
A LIGHTER TOUCH
Architects and technical leads often have a broad remit across several projects.
Tools like Sonar can support the architect’s ability to pragmatically monitor large code bases without intrusive working practices.
STEER THE SUPER TANKER
Small (2-man) team tasked with a phased approach to correction:
– Capture existing behaviour through automated tests (current coverage high but not high enough).
– Implement a replacement system for several duplicated modules in an existing system.
No budget for manual testing – automated tests must be fit for purpose.
Very limited budget for technical governance and project management.
Key metrics monitored several times a week through Sonar.
Augmented by code review and design walk-through.
Start of exercise: 75.8k LOC in main system
Objective: reduce duplication (and therefore LOC), improve quality
Progress continually monitored
End of exercise: main system 65k LOC; average complexity / method reduced
Functionality extracted into a new module (4.3k LOC) – 5k lines of redundant code removed
Metrics as we saw them evolve
Rules compliance increases
Total complexity increases slightly – an artefact of the way of working
Something worrying happening to complexity / method
FALSE POSITIVE
Tools and tool users are fallible. They can supply false positives that create noise for the project and waste resources on investigation.
Metrics collected continuously using Sonar.
Technical Architect:
– Inspected metrics several times a week.
– Observed standup and burndown.
– Reviewed automated acceptance tests.
– Collaborated in design exercises on whiteboard
– Delegated technical implementation to team
HIT THE ICEBERG
Development suspended for a week when the complexity-per-method metric went red and no correction was executed.
Detailed code review initiated.
Original module, complexity / method okay, coverage good
New module, complexity / method worse, coverage low
FALSE POSITIVE
Drilled down using Sonar to identify where the higher-than-expected complexity was originating from.
Code then inspected.
False alarm:
– Sonar's complexity algorithm rated JAX-RS method signatures as highly complex; nothing in the dev team's control.
– Other methods were part of automated test controls which were verbose in order to demonstrate purpose.
[Treemap, annotated: a cluster of complex packages – two test utilities and the JAX-RS resources]
CASTING A WIDE NET
Powerful visualisation tools coupled with an array of integrated metrics can allow large code bases to be monitored.
A large, highly volatile, code base.
Geographically disparate team, spread across several timezones.
How can technical authorities police such a code base without velocity crushing (and probably ineffective) prescriptive processes or a code review team almost as big as the development team?
STEER THE SUPER TANKER
Something bad happens to coverage in one module
Duplication starts to rise
A warning to investigate further
One package in module is being worked on
Complexity / method rising rapidly
LOC and duplication increasing
Flagged up whilst development is ongoing, not later in process
New module appears, unusually high avg complexity / method
Complexity / method average drops (as code size increases, the bad code is still there)
Duplication rises
HOW WE DO CONTINUOUS INSPECTION
Build system established in Sprint Zero
– Jenkins CI
–Sonar
Rules (PMD, Checkstyle) configured to match those defined for use on this site
STEER THE SUPER TANKER
HOW WE USE CONTINUOUS INSPECTION
Daily Sonar build (4am) triggered by Jenkins
Runs unit and integration tests
Dashboard printed out and reviewed by team at end of stand-up
Anything ‘unusual’ discussed and actions taken to investigate / correct
Includes violations, coverage, complexity, any unexpected change
Continuous inspection, continuous improvement
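In practice the 4am trigger lives in the Jenkins job configuration; as a crontab-style sketch (the project path and Maven goals are illustrative, assuming a Maven build publishing to Sonar):

```
# Illustrative crontab equivalent of the Jenkins trigger:
# run unit + integration tests, then publish analysis to Sonar.
0 4 * * * cd /path/to/project && mvn clean verify sonar:sonar
```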
CONCLUSIONS
NEGATIVE EXPERIENCES CAN BE CATEGORISED AS:
– Being distracted then satisfied by the superficial
– Using the metrics in isolation to decide whether something is good or bad, high or low quality, true or false
Everybody loves an infographic, be aware they are often misunderstood and even knowingly abused
Continuous Inspection is a great tool for anybody involved with the project willing to invest a little time understanding and questioning what they see
STEER THE SUPER TANKER
Can Continuous Inspection enable developers to steer the super tanker?
– Or will they still hit the iceberg,
– Or even hit the iceberg because of the feedback from inspection?
Continuous Inspection is a valuable tool that, if used with care, can make a difference
Must recognise that it rarely delivers the full story, just an indication
Be wary of wider dissemination of attractive data
YOUR TURN!
Image found http://www.flickr.com/photos/ell-r-brown/
STRUCTURE OF NEXT SECTION
Point everybody at useful resources (metric definitions etc)
Get everybody accessing the Sonar server
Five minute tour of the relevant features
In small groups or as individuals use the tool to draw some positive and negative conclusions about the code base
COLLATE OUR CONCLUSIONS AND DISCUSS:
– Do we feel the conclusions have merit?
– Are they superficial or of real impact on quality?
– How could we correct, control or even prevent in future?
– What negative implications (e.g. gaming) could this have?
USEFUL RESOURCES
SONAR INSTANCE: – http://192.168.x.x:9000
SONAR METRIC DEFINITIONS: – http://docs.codehaus.org/display/SONAR/Metric+definitions
SOURCE BASE WE ARE ANALYSING: – https://jira.springsource.org/browse/AMQP
– http://github.com/SpringSource/spring-amqp
OR (MORE COMPLEX):
– http://nemo.sonarsource.org/dashboard/index/50544
COLLATE
RULES
– Three minutes maximum each per point (we can come back to you)
– Please let the speaker describe their point in full before we discuss and analyse
– Presenter will try and keep analysis of any one point to a sensible length, shout if he forgets!
http://www.valtech.co.uk
http://blog.valtech.co.uk
http://twitter.com/valtech
http://www.slideshare.net/valtechuk