
The digital magazine for enterprise developers

Issue February 2015 | presented by www.jaxenter.com #42

Modding your JS
Build a modular application with AngularJS

Elasticsearch and Kafka
Combining Kafka’s speed with Elasticsearch’s intelligence

Thinking hard
Why software devs need to work like hardware devs

Putting out your Scrum fires

©iStockphoto.com/Raycat


Editorial


A bumpy beginning to 2015

Let’s be honest about it, this wasn’t exactly how we expected the year to start: with a major open-source project being cut down in its prime. The news of Pivotal cutting its support for Groovy and Grails took us all by surprise. But most of all, it got us asking questions.

Did people really know that (or why) Pivotal was supporting this open-source project until the money was gone? Is the only purpose of open source to make money? If not, then what is it all for? The community? The company? The software? The greater good?

While you dwell on those questions, we have a bumper-sized issue packed with useful nuggets of knowledge to keep your mind nourished: Moritz Schulze gives us a tour of AngularJS modular applications and the setup required to get started. Mariam Hakobyan shows us her combination of the Elasticsearch database and Kafka for speed and intelligence.

Dr. James Stanier from Brandwatch explains why software developers still need to think like hardware engineers to deliver increased speed, reliability and potential when it comes to processing data. Oracle’s Wolfgang Weigend has taken the time to show us his experience with automated testing of JavaFX GUI components. And finally, our very own Natali Vlatko is introducing herself to the JAX community with two words of advice: “Agile Schmagile”. Agile alone does not an effective development team make, says Natali, who shows us five concrete ways in which Scrum can fail.

Coman Hamilton, Editor

Index

Building modular applications with AngularJS
Modularizing your JavaScript to structure large applications, by Moritz Schulze

Automated testing of JavaFX GUI components
Testing JavaFX 8 UI application functionality, by Wolfgang Weigend

A blend of the best – Elasticsearch and Kafka
Matching Kafka’s speed with Elasticsearch’s intelligence using the river plugin, by Mariam Hakobyan

Scaling: Thinking in hardware and software
Speed, reliability and increased data processing potential, by Dr. James Stanier

The ways that Scrum can fail
Is your Scrum implementation failing? By Natali Vlatko


Hot or Not


Microsoft’s (lack of) diversity

The most recent census data released by Microsoft basically confirms that it’s run and managed by a bunch of white guys, with a guest appearance by Asians now and then. After quietly revealing their diversity statistics late last year, Microsoft cements its position as one of the least diverse companies in tech by boasting a 76 percent male, 61 percent white workforce population. To further question Microsoft’s commitment to diversity, CEO Satya Nadella proved that the issue stemmed from above when he made comments regarding women and their right to a raise, suggesting that faith in “the system” should be enough. Pffft.

Eclipse’s new dark theme

Our quest for all black everything has hit another high point with the launch of Eclipse Luna 4.4 – finally, our witchy coding incantations are feeling right at home. With dark themes becoming incredibly popular, Luna has been accepted with somewhat open arms, and to make sure the darkness is all encompassing, MacBook users can turn to the dark side with the latest version of OS X Yosemite. Incremental changes to the icon design and overall aesthetic complete the broody transformation. Other things that are better in the dark: dancing, coffee, sexy time.

Cameron’s proposed encryption ban

U.K. Prime Minister David Cameron wants to get those terrorists bad, with unfortunate consequences potentially hitting digital Britain in the process. Cameron argues that there should be no means of communication “that we cannot read”, suggesting a ban on encryption and back-door access that can only come from someone who is completely uninformed about the technology. His proposal would likely be revisited if he was re-elected to office, in order to target suspicious or potential terrorist activity across digital communications. Yo Cameron, back-doors are bad news – creating such a vulnerability will only have those extremists chomping at the bit.

The YDD Manifesto

New thinking that changes the way you program? Well, be prepared to want to punch this new developer methodology in the face: YOLO Driven Development gives us seventeen ways to piss off the entire programming world and it’s more widespread than you think: Don’t indent. YOLO. Don’t use naming conventions. YOLO. Don’t waste time with gists. YOLO. Don’t be a complete asshole. YOLO (Okay, that last one isn’t really part of it, but it may as well be). Craving further chaos? You may also enjoy other methodologies such as Cover Your Ass Engineering (CYAE) or Asshole Driven Development (ADD), where the biggest jerk makes all the big decisions.

Groovy and Grails: Dumped

Pivotal’s decision to withdraw support from Groovy and the accompanying framework Grails is sad news, and has come as a surprise to a lot of fans of the JVM language. New sponsorship is now being sought, with Lukas Eder calling it by declaring that Groovy is no longer a viable business for Pivotal. While the timeframe given for new sponsors to emerge also allows for new version releases, it’s all a bit sucky and the founders are crossing their fingers and toes that their potential for greatness as open source projects can be recognised by other interested parties (who have a shitload of cash).


So you improved transactional throughput from 400,000 TPS to 20,000,000? It must be LOW LATENCY.

The Tech in Finance Conference 2015
April 28–29, LONDON, www.jax-finance.com
JOIN NOW!


Building modular applications with AngularJS

Modularizing your JavaScript to structure large applications

If you’ve ever wanted to mod a large application to give it more structure, here’s your chance. Software consultant Moritz Schulze takes us through the steps for setting up a modular AngularJS application.

©iStockphoto.com/gmutlu

by Moritz Schulze

In a previous article on JAXenter.com, I showed how to build a secured REST API with Spring. It followed how we at techdev built our time tracking application, trackr.

In this article I will describe a way to build a modular AngularJS application suited as a frontend for a large application like the one in the previous article. The main focus is the modularization of the JavaScript code in a way that helps structure large applications. I chose not to use any JavaScript superset language like TypeScript or the modern ECMAScript 6. This makes the code more readable for people used to plain old JavaScript, but some boilerplate might be avoided with those tools. As in the last example, the code for this article is provided on our company GitHub account. The commits represent the chapters.

Modularization

I will use two concepts of modules. The first one is about file dependencies and the correct order of loading the JavaScript files, the other one is about application dependencies.

RequireJS

RequireJS is a JavaScript library for Asynchronous Module Definitions (AMD). You can define modules with dependencies on other modules and RequireJS will load and provide them. It also allows for different versions of the same library to be used in the same project, which can be quite useful:



// src/loadModule.js
define(['jQuery'], function($) {
    function loadSomething() {
        $.get('/load/something');
    }
    return loadSomething;
});

Here we define a module (without a special name) in the file loadModule.js that depends on the module jQuery and exports a function.

Other modules now might load this one and use the function. If jQuery was already required and loaded by another module it won’t be loaded again – RequireJS keeps track of all loaded modules and provides them.

You might wonder whether a special version of jQuery has to be used since jQuery is not a RequireJS module. Luckily not. There is a way to configure RequireJS so it knows the jQuery file will export a $ function and will wrap it into a module. You can also define dependencies between non-RequireJS modules. This config is also my preferred way to start a JavaScript application with RequireJS. With it, you only need one <script> tag in your index.html (Listing 1).

RequireJS will load the bootstrap.js file, read the config and execute the last require call at the end of the file – which will load the application and in turn all its dependencies and call app.init() after everything is loaded. Now we can define file dependencies and move on to AngularJS modules.

AngularJS Modules

Most developers using AngularJS have already written a module – because every app is one. Most likely they also have set up some dependencies to other modules like ngRoute. In this example we will make heavy use of modules for our application.

AngularJS modules aren’t very powerful in terms of encapsulation but at least provide facilities to separate code and package reusable modules. It may seem they provide proper namespacing but that’s not the case. If you define a UserService in some module it is injectable in all other modules – and would overwrite or be overwritten by another UserService. I will show how I solved this namespacing problem.

Listing 1

// bootstrap.js
require.config({
    baseUrl: '/src',
    paths: {
        'jQuery': 'lib/jQuery/jquery-2.11',
        'bootstrap': 'lib/bootstrap/bootstrap'
    },
    shim: {
        'bootstrap': ['jQuery'],
        'jQuery': {
            exports: '$'
        }
    }
});

require(['app'], function(app) {
    // bootstrap the app
    app.init();
});

// app.js
define(['jQuery'], function($) {
    function init() {
        $.get('init');
    }

    return {
        init: init
    };
});

// index.html
...
<script src="require.js" data-main="bootstrap.js"></script>

Listing 2

// bootstrap.js
require.config({ /* ... */ });
require(['angular', 'app'], function(angular) {
    angular.bootstrap(document, ['jax']);
});

// src/app.js
define(['angular', 'modules/user/userModule', 'modules/admin/adminModule'], function(angular) {
    var jax = angular.module('jax', ['jax.user', 'jax.admin']);
    return jax;
});

// src/modules/user/userModule.js
define(['angular', './addressBookController', './addressBookService'],
    function(angular, addressBookController, addressBookService) {
        var user = angular.module('jax.user', []);
        user.controller('user.addressBookController', addressBookController);
        user.service('user.addressBookService', addressBookService);
        return user;
});

// src/modules/user/addressBookService.js
define([], function() {
    return ['$http', function($http) {
        function loadAddressBook() {
            return $http.get(...);
        }
        return {
            loadAddressBook: loadAddressBook
        };
    }];
});

// src/modules/user/addressBookController.js
define([], function() {
    return ['$scope', 'user.addressBookService', function($scope, addressBookService) {
        addressBookService.loadAddressBook().then(function(addressBook) {
            $scope.addressBook = addressBook;
        });
    }];
});

// src/modules/admin/adminModule.js
define(['angular'], function(angular) {
    var admin = angular.module('jax.admin', []);
    return admin;
});


I like to structure my whole application according to the business functionality. That goes for files (JS and HTML) as well as modules. In my experience this really helps when navigating the code.

I will provide you with a basic structure for an AngularJS app with RequireJS. As an example, let’s create an address book app with two roles, users and admins. Every user can see a list of addresses and an admin can edit and delete individual entries. This already cries for a user and an admin module (Listing 2).

We don’t have any routes yet – since I will be using the ui-router later on I left them out. You can already see how I do my own namespacing – both the service and the controller have a prefix in their name. Since the prefix is separated with dots I always have to use the array notation of Angular injection. This together with RequireJS leads to a lot of boilerplate strings in your files. I see it this way – Java requires me to write imports. IDEs have been doing this for a long time, so you never have to bother with it. Unfortunately there’s no IDE that can generate RequireJS imports so we still have to do it ourselves.

But when your application grows to a certain size you will come to appreciate the namespacing of AngularJS services and controllers. If you do it right you can easily figure out which module a service is coming from.

The ui-router

AngularJS has a module to allow simulation of different pages, called ngRoute. With it you can configure your app to react to changes in the hash part of the current location by replacing a marked div with a template and instantiating a controller in it. So index.html#/addressbook could load a template called addressbook.html and our user.addressBookController.
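
For readers who have not used ngRoute before, the configuration for such a route would look roughly like the following sketch. The template name and the controller from Listing 2 are reused here, but the snippet is illustrative and not part of the sample application:

// Hypothetical ngRoute configuration for the address book route described above.
angular.module('jax', ['ngRoute', 'jax.user'])
    .config(['$routeProvider', function($routeProvider) {
        $routeProvider
            .when('/addressbook', {
                templateUrl: 'addressbook.html',
                controller: 'user.addressBookController'
            })
            .otherwise({ redirectTo: '/addressbook' });
    }]);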

This system is not very flexible. Let’s say you have a navigation bar and want to replace some of its content when you’re visiting a specific URL:

<nav>
  <a href="#/admin">Admin</a>
  We want to insert a view here, too!
</nav>
<div ng-view id="content"></div>

But you already used your ng-view to display the content. Multiple views are not supported by ngRoute. This is where the ui-router comes into play. Rather than just telling AngularJS what template to load into one view and what controller to use, you build deeply nested views and hierarchical so-called states that can fill each view in the tree with a template and instantiate a controller on it. A child state can easily replace a view that a parent state has defined. So a URL now corresponds to one state that might have several parent states and several views.

It might take a moment to get into the notation the ui-router uses, so I encourage you to read the documentation on their GitHub wiki.

The ui-router goes very well with my modularization approach. Each module can define its own states that can be child states of more general states that set up the base UI. Let’s add the ui-router to the address book example (Listing 3).

As you can see, this allows a very detailed separation of concerns. It also gives each module very fine-grained control over what to display. For example, the admin module does not want to display additional navigation information and just leaves the template empty. The ui-router has many options for a state definition and is very flexible.

JSHint

I think it helps a larger project if there are guidelines for code style that everyone adheres to. JavaScript is a very generous language in terms of syntax, which can lead to very inconsistent style. A linting tool can help detect such issues, with IDE support on top. I use JSHint, which is understood by IntelliJ and can be executed with Grunt (see next section) to automatically fail a build if something’s wrong. Since it’s very easy to integrate, I highly recommend it. Here are some rules I enforce with JSHint:

• Four space indentation (tabs are also possible if you prefer that)
• Only camelcase identifiers
• No trailing whitespace
• Enforcement of curly braces around blocks
• No unused variables in functions
• Variables must be defined before used
• Single quotation marks

Build Process with Grunt

Grunt is a great tool for AngularJS applications. It offers functionality to run certain tasks on the source code, CSS files and HTML files. It’s great to package certain tasks together into goals that can be executed, like linting or testing.
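
To make the idea of packaging tasks into goals a little more tangible, a minimal Gruntfile along these lines would wire the JSHint rules from the previous section into a lint goal. It uses the grunt-contrib-jshint plugin; this is a sketch rather than the sample application’s actual build file, and the option names may need adjusting for your JSHint version:

// A minimal Gruntfile sketch (not the sample application's actual build file).
module.exports = function(grunt) {
    grunt.initConfig({
        jshint: {
            all: ['src/**/*.js'],
            options: {
                indent: 4,          // four space indentation
                camelcase: true,    // only camelcase identifiers
                trailing: true,     // no trailing whitespace
                curly: true,        // curly braces around blocks
                unused: true,       // no unused variables in functions
                latedef: true,      // variables must be defined before used
                quotmark: 'single'  // single quotation marks
            }
        }
    });

    grunt.loadNpmTasks('grunt-contrib-jshint');
    grunt.registerTask('lint', ['jshint']);
};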

Building a Single File

While having a lot of small files containing the AngularJS code (modules, controllers, ...) is great during development, you certainly don’t want that in production. You could quickly accumulate over 50 requests just to load all the parts of your application, and the user will notice the increased loading time.

Luckily there is a Grunt plugin to package RequireJS modules into single files and even minify the JavaScript along the way. You can even do this for single modules, so we could build app.js, jaxUser.js and jaxAdmin.js files from our previous example.

Another great use of this build phase is to replace non-minified vendor libraries with minified ones. While developing it is very helpful to have a non-minified AngularJS included since it provides better error messages. But when deploying to production you most certainly want to use a minified version. I have two additional requirements:

1. The vendor libraries are delivered in separate files/requests because they’re probably easier to cache
2. I want to use the provided minified versions and not minify AngularJS every time I build the project myself

Since listing the config for that here would be quite boring I refer you to my sample application. There are some tricks involved and the implementation of my requirements has some drawbacks.

First, in the Grunt task for RequireJS you can set a path to empty, so it won’t be included in the final file. I do this for all libraries; they should be loaded separately. Second, I keep an extra index.html and bootstrap.js just for production use. This helps referencing the minified libraries. The large drawback is that if you add a new library to your project you have to configure three places: the Gruntfile.js and both bootstrap.js files. I guess a more elaborate setup can be found to make this process easier.
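
As an illustration of that idea, and not the sample application’s actual configuration, a grunt-contrib-requirejs task with empty vendor paths could look something like the following inside grunt.initConfig (the file names are placeholders):

requirejs: {
    production: {
        options: {
            baseUrl: 'src',
            mainConfigFile: 'src/bootstrap.js',
            name: 'app',
            out: 'dist/app.js',
            // vendor libraries are excluded from the combined file and stay
            // separate, cacheable requests
            paths: {
                'jQuery': 'empty:',
                'angular': 'empty:',
                'angular-ui-router': 'empty:'
            },
            optimize: 'uglify2'
        }
    }
}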

I know this is probably confusing, so please take a look at the sample application to get a better understanding. Run grunt and look at the output – you get what I think is a nice deliverable.

Listing 3

// index.html
<head>
  <script data-main="bootstrap.js" src="require.js"></script>
</head>
<body>
  <div ui-view="root"></div>
</body>

// src/app.html
<nav class="navbar navbar-default navbar-fixed-top">
  <div class="container">
    <div ui-view="navbar"></div>
  </div>
</nav>

<div class="container">
  <div ui-view="content"></div>
</div>

// src/moduleSelection.html
<h1>Modules</h1>
<a ui-sref="jax.user" href>User</a><br/>
<a ui-sref="jax.admin" href>Admin</a>

// src/app.js
define(['angular', 'angular-ui-router', 'modules/user/userModule', 'modules/admin/adminModule'], function(angular) {
    var jax = angular.module('jax', ['ui.router', 'jax.user', 'jax.admin']);
    jax.config(['$stateProvider', function($stateProvider) {
        $stateProvider
            .state('jax', {
                url: '',
                abstract: true,
                views: {
                    root: { templateUrl: 'src/app.html' }
                }
            })
            .state('jax.index', {
                url: '',
                views: {
                    content: { templateUrl: 'src/moduleSelection.html' }
                }
            });
    }]);
    return jax;
});

// src/modules/user/userModule.js
define(['angular', './addressBookController', './addressBookService', 'angular-ui-router'],
    function(angular, addressBookController, addressBookService) {
        var user = angular.module('jax.user', ['ui.router']);
        user.controller('user.addressBookController', addressBookController);
        user.service('user.addressBookService', addressBookService);

        user.config(['$stateProvider', function($stateProvider) {
            $stateProvider
                .state('jax.user', {
                    url: '/user',
                    views: {
                        content: {
                            templateUrl: 'src/modules/user/addressBook.html',
                            controller: 'user.addressBookController'
                        },
                        navbar: {
                            templateUrl: 'src/modules/user/userNavbar.html'
                        }
                    }
                });
        }]);
        return user;
});

// src/modules/user/addressBook.html
<h1>Addresses</h1>

<table class="table table-bordered">
  <thead>
    <tr>
      <td>Name</td>
      <td>First name</td>
      <td>City</td>
    </tr>
  </thead>
  <tbody>
    <tr ng-repeat="address in addresses">
      <td>{{address.name}}</td>
      <td>{{address.firstName}}</td>
      <td>{{address.city}}</td>
    </tr>
  </tbody>
</table>

// src/modules/user/userNavbar.html
<ul class="nav navbar-nav">
  <li><a>Custom User navbar</a></li>
</ul>

// src/modules/admin/adminModule.js
define(['angular', 'angular-ui-router'], function(angular) {
    var admin = angular.module('jax.admin', ['ui.router']);
    admin.config(['$stateProvider', function($stateProvider) {
        $stateProvider
            .state('jax.admin', {
                url: '/admin',
                views: {
                    content: { templateUrl: 'src/modules/admin/admin.html' }
                }
            });
    }]);
    return admin;
});

// src/modules/admin/admin.html
<h1>Welcome to the admin module!</h1>

Moritz Schulze is a Software Engineering consultant with techdev solutions.


Automated testing of JavaFX GUI components

Testing JavaFX 8 UI application functionality

JavaFX is now celebrating one year in the Java 8 club. As application development grows more and more complex, Oracle’s Wolfgang Weigend shows us why GUI testing is so important.

by Wolfgang Weigend

As of March 2014, JavaFX has been a permanent part of JDK 8. In future, JavaFX 8 will be used to build mission critical business applications. On the one hand, this requires a degree of maturity of the new JavaFX UI technology to be acceptable. On the other hand, it is necessary that the user interfaces of business applications can be tested extensively for their correct application functionality. The focus is on the high data volume that is fed by automated tests into the graphical user interface. Please see below for an illustration of how such a test scenario could be set up with the QF-Test tool.

During application development, testing plays an important role. According to the current World Quality Report, a quarter of the IT budget is already allotted for testing. The increasing complexity of application development also causes the testing requirements to increase. High test coverage can be achieved by test automation. Suitable tools help facilitate the setting up of automated tests and ensure a high integration with the developed application. The benefit of automated UI tests is that they can cover a whole process with one test, in contrast to unit tests which cover only an isolated unit.

By using the correct test environment it is possible to have the bulk of tests set up by professional testers, thus freeing resources for development work. For this, the test environment should allow the tester to work with familiar terms and objects. Especially in GUI tests it is important that testers work with objects they know and recognize on the GUI of the application, even though the actual structure of the GUI is much more complex. Even in simple JavaFX applications the GUI consists of many single elements in a highly complex tree structure. Figure 1 shows a demo application for the configuration of new cars. The GUI structure of this application becomes visible in a 3D view (Figure 2), three-dimensionally highlighting the complexity of the nesting. However, only a small part of the elements is relevant for testing; all others are integrated on a technical level.

Figure 1: Demo application “configuration of new cars” (two-dimensional)



The test tool allows the reduction of these complex GUIs to the essential.

Figure 3 shows the simplified structure as it is made available to the tester by the test tool. In specific cases it is possible to work with the full hierarchy at any time, and programmable tests are possible via scripting. Through simplification and generalization of the UI components, the support of a modular structure allows parts of the test to be reused. Thus the effort required for test creation is reduced and at the same time maintainability is increased. It provides the opportunity to create modules and libraries professional testers can work with. Due to the additional integration of software drivers, data-driven testing is also possible on the GUI, and mass testing can be created.

Figure 2: Demo application “configuration of new cars” (three-dimensional)

Figure 3: Three-dimensional structure simplified by the test tool QF-Test

Conclusion

GUI testing of an application is essential to the corresponding development. By directly integrating a test tool into the development process, a continuous integration scenario can be set up. Test tools facilitate the testing of JavaFX 8 applications by simplifying the component hierarchy and through their modular structure. This also makes them a valid option for testers who do not have any programming knowledge.

References

[1] http://docs.oracle.com/javase/8/javase-clienttechnologies.htm

[2] http://docs.oracle.com/javase/8/javafx/api/toc.htm

[3] http://www.capgemini.com/thought-leadership/world-quality-report-2014-15

[4] http://www.qfs.de/de/qftest/index.html

Wolfgang Weigend is a senior leading systems consultant at Oracle Germany B.V. & Co. KG. He is chiefly involved in Java technology and architecture development for company-wide applications.


A blend of the best – Elasticsearch and Kafka

Matching Kafka’s speed with Elasticsearch’s intelligence using the river plugin

While real-time search engine Elasticsearch is known for its scalability, LinkedIn’s Kafka is a reliably fast messaging system. Mariam Hakobyan shows us how the two work together as a fast and performance-optimised duo.

by Mariam Hakobyan

Elasticsearch

Elasticsearch is a highly scalable, distributed, real time search engine with a REST API that is hard not to love. The core of Elasticsearch’s intelligent search engine is largely another software project: Lucene. It is perhaps easiest to understand Elasticsearch as a piece of infrastructure built around Lucene’s Java libraries. Elasticsearch itself provides a more useable and concise REST API, scalability, and operational tools on top of Lucene’s search implementation. It also allows you to start with one machine and scale to hundreds, and supports distributed search deployed over Amazon EC2’s cloud hosting.

Plugins in Elasticsearch are a way to enhance the basic Elasticsearch functionality in a custom manner. They range from adding custom mapping types, custom analyzers, native scripts, custom discovery and more. There are multiple types of plugins; in this article we will explore the river plugin, which is used to index a stream of data from one source into Elasticsearch. In particular, we will see how to index a stream of data from Kafka into Elasticsearch.

Kafka

Apache Kafka is a publish-subscribe messaging system rethought as a distributed commit log. It was originally developed at LinkedIn and later on became a part of the Apache project. Kafka is fast – a single Kafka broker can handle hundreds of megabytes of reads and writes per second from thousands of clients. The main reason why it’s so fast is that it uses zero copy and a partitioning mechanism. Applications that use zero copy request that the kernel copy the data directly from the disk file to the socket, without going through the application.

Zero copy greatly improves application performance and reduces the number of context switches between kernel and user mode. Another advantage is that consumers keep the index of read data, not Kafka itself. It is scalable – it can be elastically and transparently expanded without downtime, durable – messages are persisted on disk and replicated within the cluster to prevent data loss, and distributed by design – it has a modern cluster-centric design based on multiple brokers and partitions.

Initiative

Assume you have an event-driven asynchronous application which produces thousands of events per second. The application uses the Kafka distributed messaging system and puts all application related events into Kafka. So the use case is to have all this data available in Elasticsearch, to be able to perform analytics and search on it later on. Elasticsearch allows clients to build custom plugins to add any additional functionality using the provided Java API, so the perfect solution was to have our own plugin that will do this for us. Using plugins, it’s possible to add new functionality to Elasticsearch without having to create a fork of Elasticsearch itself. You then run the plugin in Elasticsearch itself, without the need to start up a separate application/process.

In this article, we will go through the steps and show how to create the Elasticsearch plugin to put all data from Kafka into Elasticsearch. The plugin is an open source project and is available on GitHub in the following repository. You can look through the codebase while following the steps discussed in this article.

Building the “elasticsearch-river-kafka” plugin

The requirements for building the Elasticsearch river plugin for Kafka are the following:

1. Elasticsearch installation
2. Kafka installation
3. Maven
4. JDK

The plugin is simply a Zip file that contains one or more Java jar files with compiled code and resources. After setting up the Maven project, including the Elasticsearch and Kafka dependencies, the next step is to use the Maven Assembly Plugin in pom.xml to create the zip file (for more details take a look here). After setting up all the configuration related stuff, here are the actual steps of building the plugin.

Step 1

We need to write a class, KafkaRiverPlugin, that extends the AbstractPlugin class, and implement the following methods:

@Override
public String name() {
    return "river-kafka";
}

@Override
public String description() {
    return "River Kafka Plugin";
}

The name is used in Elasticsearch to identify the plugin, for example when printing the list of loaded plugins.

Step 2

Now we need to tell Elasticsearch about our plugin, which is done by adding the fully qualified class name of the plugin to a special es-plugin.properties file on the classpath, usually stored under src/main/resources/es-plugin.properties:

plugin=org.elasticsearch.plugin.river.kafka.KafkaRiverPlugin

When Elasticsearch starts up, the Elasticsearch org.elasticsearch.plugins.PluginManager will scan the current classpath looking for plugin configuration files and instantiate the referenced plugins.

Step 3

To add our custom functionality to the plugin, we need to create a module. Elasticsearch uses Guice to wire together all the components of the server (more details about the Guice framework here). While loading all the plugins, Elasticsearch invokes (via reflection) a method called onModule() with a parameter that extends the AbstractModule class:

public void onModule(RiversModule module) {
    module.registerRiver("kafka", KafkaRiverModule.class);
}

The above line of code tells Guice that for this module the River class implementation will be the KafkaRiver class. We also tell KafkaRiver to be created during Guice initialization only once:

public class KafkaRiverModule extends AbstractModule {

    @Override
    protected void configure() {
        bind(River.class).to(KafkaRiver.class).asEagerSingleton();
    }
}

Now that we have the plugin set up, it’s time to dig into the actual implementation of our custom logic.

Step 4

A river should implement the River interface and extend the AbstractRiverComponent. The River interface contains only two methods: start, called when the river is started, and close, called when the river is closed. The AbstractRiverComponent is just a helper that initializes the logger for the river and stores the river name and the river settings on two instance members. The river constructor is annotated with the Guice @Inject annotation, so that all the needed dependencies can be injected into the river. The following dependencies are injected: RiverName, RiverSettings and Client, where Client is a client pointing to the node where the river is allocated (Listing 1).

Listing 1

public class KafkaRiver extends AbstractRiverComponent implements River {

    private KafkaConsumer kafkaConsumer;
    private ElasticSearchProducer elasticsearchProducer;
    private RiverConfig riverConfig;

    private Thread thread;

    @Inject
    protected KafkaRiver(final RiverName riverName, final RiverSettings riverSettings, final Client client) {
        super(riverName, riverSettings);

        riverConfig = new RiverConfig(riverName, riverSettings);
        kafkaConsumer = new KafkaConsumer(riverConfig);
        elasticsearchProducer = new IndexDocumentProducer(client, riverConfig, kafkaConsumer);
    }

    .......
}


The nice thing here is that you have control over the injected properties, so you can retrieve the settings which were passed by the user while creating the river, and do any operation with the client.

River-specific properties, which were passed by the user, need to be extracted from the RiverSettings, and this is done in the RiverConfig class. In this plugin the settings specified by the user will usually contain Elasticsearch and Kafka specific properties (Listing 2).

Step 5

As mentioned earlier, we need to override the start and close methods as well (Listing 3).

Basically the river starts a new thread, and inside this thread it reads messages from the Kafka stream and puts those messages into Elasticsearch. The custom logic of reading messages from Kafka is implemented in the KafkaConsumer class, and the part of indexing the data into Elasticsearch in the ElasticsearchProducer, which we will explore later.

There are generally two ways to implement a Kafka consumer: either using the High Level Consumer API, which keeps track of offsets automatically, or the Simple Consumer API, where you manually need to handle the offsets, leader brokers etc. The Simple API is generally more complicated, and you should only use it if there is a need for it. For the Elasticsearch Kafka River we use the High Level API, because we do not need to care about the offsets; we simply need to stream all the data from Kafka to Elasticsearch. On top of that, this API automatically enables the river to read Kafka messages from multiple brokers and multiple partitions.
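
To give an impression of what the High Level Consumer API looks like, here is a minimal, self-contained sketch in the spirit of the plugin’s KafkaConsumer class; the property values, the class name and the topic are illustrative, and the actual class in the repository is more elaborate:

import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class HighLevelConsumerSketch {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "localhost:2181");
        props.put("group.id", "elasticsearch-river-kafka");
        // The river commits offsets itself after a successful bulk (see Listing 5),
        // so automatic offset commits are switched off here.
        props.put("auto.commit.enable", "false");

        ConsumerConnector connector =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

        // Ask for a single stream for the "test" topic; the high-level API hides
        // brokers, partitions and offsets from us.
        Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
        topicCountMap.put("test", 1);
        KafkaStream<byte[], byte[]> stream =
                connector.createMessageStreams(topicCountMap).get("test").get(0);

        ConsumerIterator<byte[], byte[]> iterator = stream.iterator();
        while (iterator.hasNext()) {
            byte[] message = iterator.next().message();
            // hand the raw message over to the Elasticsearch producer here
        }
    }
}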

Installing Kafka and sending messages

Once the plugin is written and packaged, it can easily be added to any Elasticsearch installation with a single command. But before that, we need to start the Kafka server and produce some messages, so we see them being consumed by the river plugin. Here are the steps necessary to produce messages into Kafka:

1. Install Kafka (see the Apache Kafka Quick Start Guide for instructions on how to download and build it). We will execute all the steps locally.

2. First, start a local instance of the Zookeeper server:

bin/zookeeper-server-start.sh config/zookeeper.properties

3. Now start the Kafka server:

bin/kafka-server-start.sh config/server.properties

4. Then we need to create a topic called test:

bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test

5. Kafka comes with a command line producer client that will take input from the command line and send it out as messages to the Kafka cluster. By default each line will be sent as a separate message. Let’s produce some messages:

bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
This is a message
This is another message

Listing 2

public RiverConfig(RiverName riverName, RiverSettings riverSettings) {

    // Extract kafka related configuration
    if (riverSettings.settings().containsKey("kafka")) {
        Map<String, Object> kafkaSettings = (Map<String, Object>) riverSettings.settings().get("kafka");

        topic = (String) kafkaSettings.get("topic");
        zookeeperConnect = XContentMapValues.nodeStringValue(kafkaSettings.get("zookeeper.connect"), "localhost");
    } else {
        topic = "test-topic";
        zookeeperConnect = "localhost";
    }

    // Extract ElasticSearch related configuration
    if (riverSettings.settings().containsKey("index")) {
        Map<String, Object> indexSettings = (Map<String, Object>) riverSettings.settings().get("index");
        indexName = XContentMapValues.nodeStringValue(indexSettings.get("index"), riverName.name());
    } else {
        indexName = riverName.name();
    }
}

Listing 3

@Override
public void start() {
    try {
        final KafkaWorker kafkaWorker = new KafkaWorker(kafkaConsumer, elasticsearchProducer, riverConfig);

        thread = EsExecutors.daemonThreadFactory(settings.globalSettings(), "Kafka River Worker").newThread(kafkaWorker);
        thread.start();
    } catch (Exception ex) {
        throw new RuntimeException(ex);
    }
}

@Override
public void close() {
    elasticsearchProducer.closeBulkProcessor();
    kafkaConsumer.shutdown();

    thread.interrupt();
}


If you wanted to make sure that the messages are really being sent to the Kafka server, you could also run a command line consumer client, which would receive the messages, but we will not go into details here, because our plugin will consume those messages using the Java API.

Installing the plugin in Elasticsearch

Now that we have messages being produced, we can install the plugin into our Elasticsearch instance. Elasticsearch can be downloaded from the Elasticsearch Download page:

1. Install the plugin with a single command:

cd $ELASTICSEARCH_HOME
./bin/plugin --install <plugin-name> --url <plugin-url>

where <plugin-name> should be the name of the plugin and <plugin-url> points to the locally built zip file. Note that we use the --url option for the plugin command to get the file locally instead of trying to download it from an online repository, which is another option.

2. We can now start Elasticsearch and see that our plugin gets loaded:

~/elasticsearch-1.4.0/bin $ ./elasticsearch
[2014-12-27 18:12:47,862][INFO ][node    ] [Mahkizmo] version[1.4.0], pid[9233], build[bc94bd8/2014-11-05T14:26:12Z]
[2014-12-27 18:12:47,863][INFO ][node    ] [Mahkizmo] initializing ...
[2014-12-27 18:12:47,886][INFO ][plugins ] [Mahkizmo] loaded [river-kafka], sites []

Configuring the plugin

Basically, at this point the plugin is installed and running, but we still need to deploy the river itself. To create a river, we execute the following curl request (Listing 4).

The first section is about Kafka properties, e.g. the Zookeeper server host, topic name etc. The second section contains the properties defined by the user for the Elasticsearch index itself, e.g. index name, bulk.size etc. In order to index the data from Kafka into Elasticsearch, the river plugin uses the Elasticsearch Bulk API, which makes it possible to perform many index/delete operations in a single API call. This can greatly increase the indexing speed. You should note that the type kafka must match the string previously provided when registering the KafkaRiverModule with the RiversModule. Here is the detailed description of each parameter:

• zookeeper.connect (optional): Zookeeper server host. Default is: localhost

• zookeeper.connection.timeout.ms (optional): Zookeeper server connection timeout in milliseconds. Default is: 10000

• topic (optional): The name of the topic where you want to send Kafka messages. Default is: elasticsearch-river-kafka

• message.type (optional): The Kafka message type, which then will be inserted into ES keeping the same type. Default is: json. The following options are available:
  json: Inserts a JSON message into ES, separating each JSON property into an ES document property. Example:
  "_source": { "name": "John", "age": 28 }
  string: Inserts a string message into ES as a document, where the key name is value, and the value is the received message. Example:
  "_source": { "value": "received text message" }

• index (optional): The name of the Elasticsearch index. Default is: kafka-index

• type (optional): The mapping type of the Elasticsearch index. Default is: status

• bulk.size (optional): The number of messages to be bulk indexed into Elasticsearch. Default is: 100

• concurrent.requests (optional): The number of concurrent indexing requests that will be allowed. A value of 0 means that only a single request will be allowed to be executed. A value of 1 means 1 concurrent request is allowed to be executed while accumulating new bulk requests. Default is: 1

• action.type (optional): The action type determining how the messages should be processed. Default is: index. The following options are available:
  index: Creates documents in ES with the value field set to the received message.
  delete: Deletes documents from ES based on the id field set in the received message.
  raw.execute: Executes incoming messages as a raw query.

In conclusion, we see that this plugin allows us to index different types of messages coming from Kafka (JSON and string), as well as to perform several types of operations, such as indexing, deleting the indexed data or executing the raw data that comes as a message. These, together with the possibility to control the bulk size and the number of concurrent requests, are powerful features provided by the plugin.

Listing 4

curl -XPUT 'localhost:9200/_river/kafka-river/_meta' -d '
{
    "type" : "kafka",
    "kafka" : {
        "zookeeper.connect" : "localhost",
        "zookeeper.connection.timeout.ms" : 10000,
        "topic" : "test",
        "message.type" : "json"
    },
    "index" : {
        "index" : "kafka-index",
        "type" : "status",
        "bulk.size" : 5,
        "concurrent.requests" : 1,
        "action.type" : "index"
    }
}'


Bulk API usage in the plugin

Here is how the Bulk API is used in the ElasticsearchProducer class to index the messages (Listing 5).

The code in Listing 5 creates the bulkProcessor and configures it to execute the bulk when the specified number of documents are ready to be indexed, or when twelve hours have passed since the last bulk execution, so any remaining messages get flushed to Elasticsearch even if the number of messages has not been reached. The following code adds an index request to the previously created bulkProcessor:

bulkProcessor.add(Requests.indexRequest(riverConfig.getIndexName())
    .type(riverConfig.getTypeName())
    .id(UUID.randomUUID().toString())
    .source(jsonMessageMap));

Updating the plugin

Additionally, if you are in the development phase and you change the logic and want to update the plugin, you can do this by removing and installing the plugin again in Elasticsearch. To remove the plugin from Elasticsearch, execute:

cd $ELASTICSEARCH_HOME
./bin/plugin -remove <plugin-name>

To see the indexed data in Elasticsearch you can execute:

curl -XGET 'localhost:9200/kafka-index/_search?pretty=1'

Summary

This article showed how to work with Elasticsearch and how to create plugins, particularly a Kafka plugin which reads a stream of messages and indexes them into Elasticsearch. As you saw, it’s relatively easy to add functionality to an existing running Elasticsearch instance in a custom manner. The most important learnings were that you should use the Bulk API as much as possible, because it makes indexing much faster. Also, if you only need to stream data from Kafka brokers to Elasticsearch, you can simply use the High Level API from Kafka.

The Elasticsearch Kafka River plugin that we walked through in this article is an open source project and is listed on the official Elasticsearch website as a plugin. It is available on GitHub in the following repository. Readers are welcome to explore it, try it out and give feedback on it.

Listing 5

private void createBulkProcessor(final KafkaConsumer kafkaConsumer) {
    bulkProcessor = BulkProcessor.builder(client, new BulkProcessor.Listener() {
        @Override
        public void beforeBulk(long executionId, BulkRequest request) {
            logger.info("Index: {}: Going to execute bulk request composed of {} actions.",
                riverConfig.getIndexName(), request.numberOfActions());
        }

        @Override
        public void afterBulk(long executionId, BulkRequest request, BulkResponse response) {
            logger.info("Index: {}: Executed bulk composed of {} actions.",
                riverConfig.getIndexName(), request.numberOfActions());

            // Commit the Kafka messages offset, only when messages have been
            // successfully inserted into Elasticsearch
            kafkaConsumer.getConsumerConnector().commitOffsets();
        }

        @Override
        public void afterBulk(long executionId, BulkRequest request, Throwable failure) {
            logger.warn("Index: {}: Error executing bulk.", failure, riverConfig.getIndexName());
        }
    })
    .setBulkActions(riverConfig.getBulkSize())
    .setFlushInterval(TimeValue.timeValueHours(12))
    .setConcurrentRequests(riverConfig.getConcurrentRequests())
    .build();
}

Mariam Hakobyan is Head of Engineering at Zanox, and a technology enthusiast with over 8 years of experience in the field of Enterprise Java Development, event-driven, scalable, reactive applications and environments using Java, Spring, Elasticsearch, Kafka etc.

“Use bulk API as much as possible because it makes indexing much faster.”




Scaling: Thinking in hardware and software

Speed, reliability and increased data processing potential

Torrents of data, high throughput and no downtime – it’s understandable that younger programmers will struggle to deal with the requirements of today’s stack. The best backend developers are the ones that think like hardware engineers, says analyst Dr. James Stanier.

by Dr. James Stanier

I remember learning about hardware when I was a Computer Science undergraduate. It was taught mostly separately from software, with both subjects residing in different modules on the curriculum. To students without much industry experience, it can be easy to think that hardware and software are extremely different subject areas; the former populated with stuffy electronic engineers and silicon, and the latter being where the real computer science happens: data structures, algorithms and code.

I find that the best backend developers are the ones that think in hardware as much as software. Working with large amounts of data, in both streaming and batch modes, is becoming a big area of computer science in industry. Thinking in terms of high throughput, remote services and no downtime can be challenging for developers who haven’t experienced writing this kind of code before. And if that’s not hard enough, I can unscientifically say that the data that we need to consume as an industry only gets bigger over time. Continuing to scale is a challenge, and we have to think of the best ways to utilise our software and hardware so that we can increase data processing capabilities whilst maintaining speed and reliability.

More frequently we find that hardware planning is a bigger part of creating new features than just writing new software. Making good choices up front can not only make your feature extremely reliable and fast, but it can save a large amount of money on your IT budget too.

Data processing

Let’s just jump right in with an example. Where I work at Brandwatch we provide software that allows people to analyse and explore what is happening online. Customers write search queries that provide them with a personal data stream to explore, in order to understand what people are saying about that topic online.

A lot of our architecture processes data. We analyse matches from the web, classify them in a multitude of ways, and store them. More recently we’ve been analysing the last 24 hours of data with more scrutiny to try and predict whether something interesting is happening: for example, whether a hashtag is beginning to trend, or whether a particular story is being shared. This problem is made more complicated by the fact that we’re not doing global data analysis like Twitter and Facebook do; instead we’re inspecting tens of thousands of individual data streams, since every customer’s query in the system is different.

The number of customers’ queries, and therefore data streams, is increasing very rapidly. We currently have 120,000+ live queries, compared to about 70,000 this time last year. The year before that, it was 30,000. You can probably see the issue with scale here.

But let’s not complicate things yet. Let’s start simpler. Writing software to do analysis of one data query is (fairly) straightforward. Let’s do this now with an example of counting the top countries that matches are coming from. Perhaps you could have an application that receives web pages, tweets or Facebook updates, and then for each one you store the country it has been posted from (Listing 1).

Great! We can then write whatever algorithm we like to find the top country. We could even use a SortedMultiset for this, although enforcing the order of the collection could be expensive with a high rate of inserts. Even though the code is simple, there are a number of considerations when working with extremely large amounts of data.



Ideally we don’t want our application to stop running, since the stream of data is never going to stop. How will we keep the Multiset under control? In this case, aside from a potential future catastrophic political event, the number of countries in the world will never grow explosively. But what if we were tracking the top hashtags, or the top authors? These facets of data often have a very long tail. What impact does that have on RAM over time? Can we store the top items in RAM, but palm off the tail entries to a slower but cheaper storage? How would we cope if we had multiple instances contributing to this global counter? These are all exercises for the reader.

Let’s get back to our specific example, which is processing data for tens of thousands of queries. In this case, we’d want pretty much the same code, but we’d want a Multiset for each search query in our system. They each have a unique ID, so that would be a sensible key to use:

public class QueryCountryCounters {

    private Map<Long, CountryCounter> queryCountryCounters = Maps.newHashMap();

    public void logCountry(long queryId, Match match) {
        CountryCounter countryCounter = queryCountryCounters.get(queryId);
        countryCounter.logCountry(match.getCountry());
    }
}

That’ll do – we can assume there will always be a CountryCounter initialised for each query. But what if we want to count thirty other metrics for each query in the system? Thirty Multisets for each query is beginning to look more difficult to fit in a sensibly-sized JVM heap, especially as garbage collection becomes slower when there’s more to collect. One solution would be to buy the most powerful machine on the market to run this application, but that’s expensive. Imagine the look on your CFO’s face. It also won’t scale very well if the number of queries continues to grow at an exponential rate.

A better solution is to calculate the RAM that we require, and divide it by a sensible JVM heap size. That’ll be the number of instances of this application we’ll need to run to solve the problem, as we can have each instance process a unique subset of the total set of queries. This also means you could run instances on cheaper commodity hardware if you wanted to. However, we’ve now created a different problem: if we were running tens or even hundreds of instances, how will they know which queries they should be working on?
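
To put some purely illustrative numbers on that: if the thirty Multisets for a single query occupied roughly 5 MB of heap, then 120,000 live queries would amount to around 600 GB of counter state, and with a comfortable 20 GB heap per JVM that works out at roughly 30 instances sharing the work.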

Coordinating tasks with Apache Zookeeper and Curator

Leader election is a method of deciding which process is the designated assignee of a task in a distributed system. You can harness the power of leader election by using Apache Zookeeper and Curator. We’ll assume from now on that you’ve got a Zookeeper quorum running, but if you need some assistance then do look at the documentation on the Zookeeper website.

Zookeeper looks a bit like a file system, and you interact with data at specific paths, e.g. /foo/bar. It’s up to you to decide how you name and store data. Each item in the “filesystem” is a node, most commonly called a znode to make it clear that we’re talking about Zookeeper. These znodes can also be ephemeral; that is, when a connection is lost with the client that created them, the ephemeral nodes are deleted too.

First, we’ll write some code to tell Zookeeper that we have a query that needs processing by our instances. We might want to write a new application that manages the creation and deletion of queries, since it only needs to happen in one place. Creating an instance of Curator in your code is simple:

CuratorFramework client = CuratorFrameworkFactory
    .builder()
    .connectString("localhost:19121")
    .namespace("data_counters")
    // the Curator builder requires a retry policy
    .retryPolicy(new ExponentialBackoffRetry(1000, 3))
    .build();
client.start();

Then here’s a method to create a new node at a specified path:

public void createZNode(String queryId) {
    try {
        client.create().forPath("/queries/" + queryId);
    } catch (NodeExistsException e) {
        log.warn("Node {} was already created.", queryId);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}

If a query gets removed from the system, then the code looks pretty similar:

public void removeZNode(String queryId) {
    try {
        client.delete().forPath("/queries/" + queryId);
    } catch (NoNodeException e) {
        log.warn("Node {} was already deleted.", queryId);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}

Electing a leader

Next, in the code for our counters, which will be running in multiple instances, we will listen for changes at the path. Firstly, let’s use the LeaderLatchListener interface to define what happens when we win or lose leadership for a query.

Listing 1

public class CountryCounter {

  private Multiset<String> countries = HashMultiset.create();

  public void logCountry(String country) {
    countries.add(country);
  }

  public int occurencesOf(String country) {
    return countries.count(country);
  }
}


We’ll assume that the createDataStructure() method initialises the Multisets we defined earlier and inserts the HashMap entry for that query. We’ll also assume that removeDataStructure() removes it, allowing it to be garbage collected (Listing 2).

Next we’ll write a class that uses PathChildrenCacheListener to watch the parent node whose child nodes are being added and removed by our other application; it will receive callbacks via the childEvent method whenever this happens. Depending on the event, it will create or remove the LeaderLatchListener implementation that we’ve just written (Listing 3).

And that’s about it! The instances will now distribute work amongst themselves. If one instance dies, the queries that it was processing will now fail over to another instance. But what about the historical data that was being held in RAM in the other instance? Well, that’s a problem for another time, I guess.

We’re increasingly using distributed election of tasks in our infrastructure to allow us to horizontally scale applications. For example, using the above code, you can spin up five more instances and they would automatically join and participate in the leader election process. Magic.

Dr. James Stanier is Head of Analytics at Brandwatch, a global social media monitoring company. His team work on finding actionable insights in data for the Brandwatch platform.

Listing 2

public class WorkerLeaderLatchListener implements LeaderLatchListener {

  @Override
  public void isLeader() {
    createDataStructure(queryId);
  }

  @Override
  public void notLeader() {
    removeDataStructure(queryId);
  }
}

Listing 3

public class WorkerManager implements PathChildrenCacheListener {

  private Map<Long, LeaderLatch> leaderLatches = newHashMap();

  @PostConstruct
  public void initialise() throws Exception {
    List<ChildData> currentData = newArrayList(initialisePathChildrenCache());
    for (ChildData childData : currentData) {
      long queryId = parseQueryIdFromPath(childData.getPath());
      startLeaderElection(queryId);
    }
  }

  private List<ChildData> initialisePathChildrenCache() throws Exception {
    pathChildrenCache.start(StartMode.BUILD_INITIAL_CACHE);
    pathChildrenCache.getListenable().addListener(this);
    return pathChildrenCache.getCurrentData();
  }

  @Override
  public void childEvent(CuratorFramework client, PathChildrenCacheEvent event) {
    ChildData childData = event.getData();
    switch (event.getType()) {
      case CHILD_ADDED: {
        long queryId = parseQueryIdFromPath(childData.getPath());
        startLeaderElection(queryId);
        break;
      }
      case CHILD_REMOVED: {
        long queryId = parseQueryIdFromPath(childData.getPath());
        removeFromLeaderElection(queryId);
        break;
      }
      default:
        break; // There are other cases to deal with, such as RECONNECT.
    }
  }

  private boolean haveLeaderLatchForQuery(long queryId) {
    return leaderLatches.containsKey(queryId);
  }

  private void startLeaderElection(final long queryId) {
    String leaderPath = "/queries/" + Long.toString(queryId);
    LeaderLatch leaderLatch = createLeaderLatch(queryId, leaderPath);
    attemptToGetLeadership(queryId, leaderLatch);
  }

  void attemptToGetLeadership(final long queryId, LeaderLatch leaderLatch) {
    LeadershipAttempt leadershipAttempt = new LeadershipAttempt(leaderLatch);
    executorService.submit(leadershipAttempt);
  }

  LeaderLatch createLeaderLatch(final long queryId, String leaderPath) {
    LeaderLatch leaderLatch = new LeaderLatch(client, leaderPath);
    LeaderLatchListener leaderLatchListener = new WorkerLeaderLatchListener(queryId,
        workerContainer, workerRefusals, maximumWorkersAllowed);
    leaderLatch.addListener(leaderLatchListener);
    try {
      leaderLatch.start();
    } catch (Exception e) {
      log.error("Error when starting leadership election", e);
    }
    leaderLatches.put(queryId, leaderLatch);
    return leaderLatch;
  }

  void shutdown() throws IOException {
    for (LeaderLatch leaderLatch : leaderLatches.values()) {
      leaderLatch.close();
    }
    pathChildrenCache.close();
  }
}

The ways that Scrum can fail

Ever gotten the feeling your Scrum implementation isn’t achieving what it should? Natali Vlatko explores the most common ways that this agile software development framework can fail.

by Natali Vlatko

The advantages of Scrum are definitely the main contributing factors to its popularity: the quick delivery of products, the promotion of transparency and better workforce management, to name a few. Of course, these advantages are all dependent on the successful implementation of Scrum and how committed the team is to the process.

For many companies, Scrum is beneficial for customers, the organisation and management alike. But what if it fails? What if the framework can’t support the business? What if there are too many chickens and not enough pigs? Other than the fact that these labels have since been removed from the official Scrum guide, the problem with roles won’t be the only reason why your Scrum implementation is doomed. Simply sending a few project managers to Scrum Master training is insufficient to foster the degree of change required to succeed.

Rather than applaud the achievements of Scrum, learning from its failures can be a good way to improve your own understanding and implementation of the process. With failure, we are presented with an opportunity to learn and improve. Assuming as a reader that you already know the basics, let’s get into five ways that Scrum can break down and flounder.

Resistance to reorganisation

First of all, agile is not something that we can implement, but is a characteristic that we try to achieve. This means that we do not “do agile”, but we practice techniques that help us become agile, that give us the ability to become adaptable and resourceful.

Seeing as this methodology requires a change in the way people think, approach and manage teams and projects, your implementation may fall short too soon due to unwilling parties at the middle-management and executive levels. Even members of the Scrum team itself can be antagonistic towards the process. For established organizations, Scrum can be incredibly hard to deploy. Scrum works when an organization is open to change, and while disruption can be good, it’s also extremely risky. This is where the resistance lies.

Jimmy Bogard believes that Scrum implementation can work if the organisation finds the transition easy. If it’s easy, then it will work for you. However, if it’s seen as difficult to adopt, then Bogard claims that “you’re using a process to force organizational change, and that’s a rather lousy, failure-prone way to do it.”

Lack of leadership

Sure, with Scrum, your team is self-managed. But at the end of the day, the project still requires strong leadership in order to get its development game on point. A Scrum team needs sufficient support and guidance to work, so even though there’s a self-organisation element in the equation, support from the executive level is paramount in achieving desired results.

The role of Scrum Master doesn’t translate to “Overlord of the Project”; a novice Scrum Master may try to manage the team instead of leading, coaching, and guiding them. A command-and-control management style is antithetical to Scrum and other agile methods, where a more collaborative approach is best. However, some kind of leadership, or coaching, is still required to make sure the team are well coordinated and that impediments to the project are taken care of. The better the level of engagement you get from all stakeholders in the project, the better off your project will be, and it’s the Scrum Master’s job to adequately guide and educate the team rather than administer and control.

Blaming the process

Scrum exists merely as a framework – it’s not the magic wand that is destined to solve your production problems. The thing about Scrum is its tendency to highlight issues fast. The issue mightn’t be with the process, but with the project itself, which could mean discovering that what you’re trying to build is unclear, overambitious and unrealistic.


These scenarios often result in the users blaming Scrum for the project failure, when in fact, it’s done its job. Henrik Kniberg over on Crisp’s Blog has commented on how easy it can be to confuse project success/failure with process success/failure. As Kniberg says, Scrum fails when it’s misapplied. You don’t need to get it right from the start, but you do need to continuously inspect and adapt the process. Organisations are often too quick to dismiss the process as flawed due to a project’s inability to gain traction. However, one significant advantage of Scrum is that it reveals problems at quite an early stage in the development process, which can unfortunately be misinterpreted as a breakdown in the framework.

The definition of “done”

The rules of Scrum are often treated as gospel by those who implement it. Mike Cottmeyer suggested that once Scrum started to go mainstream, all people seemed to remember were the rules. They forgot the meaning behind the rules. The purpose of the methodology is to focus on maximizing the team’s ability to deliver quickly and respond to changes better; however, there are a lot of teams caught up in the act of “doing” Scrum.

While Cottmeyer believes the rules are important, the problem for him is cultural, or societal, in nature: “I say this is a societal problem because I think people do this in lots of different areas … not just work ... not just agile. Think about all the areas of your life where folks are just going through the motions without any real understanding of what they are trying to accomplish.”

This idea of “going through the motions” incorrectly glorifies the Scrum process over the end result of its implementation, which is essentially to deliver the product. Knowing the short-term goals of the team isn’t enough – as Jeff Sutherland states, the secret to working software is to complete testing inside the Sprint. If your testing practices aren’t tip-top, your teams are probably struggling, your clients are probably frustrated, and you most likely don’t know when the product will ship. Exit criteria should be beefed up by testing to save the team from going through another three Sprints in order to deliver.

The client’s acceptance

The final hurdle that your Scrum implementation needs to overcome is the matter of client acceptance or misunderstanding. It’s important that the client is not only aware of what you are doing but that they embrace it. A client who expects to see detailed documentation but is instead given a prototype will be disappointed, even if that disappointment feels unjustified to your team, who have worked hard to produce said product.

Your client needs to accept the Scrum process and become agile as well. Typically, most clients want to hear the following assessment at the beginning of the project:

• How much it will cost
• What the product will look like
• When it will be done

For the client, these approximations sound completely reasonable, but this approach is incompatible with agile frameworks. Essentially, it’s all about having a shared understanding around what the team are going to build, with a little education about how this will take place. Agile stresses agility – responsiveness to change and uncertainty – and because of that the expectation is faster production. However, it requires a different mindset and can make high-level planning more difficult, which is traditionally what a majority of clients and companies expect.

Scrum isn’t designed to be the king of methodologies, but its implementation can help to improve the production and cost of working software, if done right. Organisations that wish to implement Scrum need to remember that taking the time to reprioritise is important. The failures outlined above are meant to represent the general areas of contention when Scrum breaks down. Of course, each area has its own niche issues and we didn’t even approach the question of the backlog, but Cottmeyer’s assessment is a great place to start for backlog-related questions. Just resist the urge to implement the half-arsed agile manifesto.

Imprint

Publisher: Software & Support Media GmbH

Editorial Office Address: Software & Support Media, Saarbrücker Straße 36, 10405 Berlin, Germany, www.jaxenter.com

Editor in Chief: Sebastian Meyen

Editor: Coman Hamilton, Natali Vlatko

Authors: Mariam Hakobyan, Moritz Schulze, Dr. James Stanier, Wolfgang Weigend

Copy Editor: Jennifer Diener

Creative Director: Jens Mainz

Layout: Flora Feher

Sales Clerk: Anika Stock, +49 (0) 69, [email protected]

Entire contents copyright © 2015 Software & Support Media GmbH. All rights reserved. No part of this publication may be reproduced, redistributed, posted online, or reused by any means in any form, including print, electronic, photocopy, internal network, Web or any other method, without prior written permission of Software & Support Media GmbH.

The views expressed are solely those of the authors and do not reflect the views or position of their firm, any of their clients, or Publisher. Regarding the information, Publisher disclaims all warranties as to the accuracy, completeness, or adequacy of any information, and is not responsible for any errors, omissions, inadequacies, misuse, or the consequences of using any information provided by Publisher. Rights of disposal of rewarded articles belong to Publisher. All mentioned trademarks and service marks are copyrighted by their respective owners.
