A Microservice Architecture for Big Data Pipelines BigData.be Meetup June 2016

Page 1: A Microservice Architecture for Big Data Pipelines

A Microservice Architecture for Big Data Pipelines

BigData.be Meetup, June 2016

Page 2: A Microservice Architecture for Big Data Pipelines

Let’s face it: Big Data is no longer a Big Deal

Image © User:Kleiner / Wikimedia Commons / CC BY-SA 3.0

Page 3: A Microservice Architecture for Big Data Pipelines

www.realimpactanalytics.com

Yardsticks of Software Development:

1. Create Modularity

2. Ensure Quality

3. Scale Development

4. Painless Deployment

Outline:

1. Challenge: Big Data in Production

2. Zen of Micro Services

3. Data Modules

4. Conclusion

Image © User:Guma89 / Wikimedia Commons / CC BY-SA 3.0

Page 4: A Microservice Architecture for Big Data Pipelines

Modularity is Imperative for On-Premise Deployments:

[Diagram: RealImpact, with three Product boxes, each paired with a Client]

Page 5: A Microservice Architecture for Big Data Pipelines

The Promised Land

Image: http://hanciong.deviantart.com/art/old-world-map-253195357

Page 6: A Microservice Architecture for Big Data Pipelines

Micro Services: Maximal Modularity

1. No shared state

2. Minimal coupling

3. Separation of concerns

4. Mix & match

Page 7: A Microservice Architecture for Big Data Pipelines

Micro Services: Scalable Development

1. Team responsibility

2. Less code = faster ramp up

3. Technology independence

Page 8: A Microservice Architecture for Big Data Pipelines

Micro Services: Painless Deployment

1. Reproducible environments

2. Versioned APIs

3. Installation = docker-compose up

[Diagram: Prod and Dev environments]

Page 9: A Microservice Architecture for Big Data Pipelines

Micro Services: QA Friendly

1. Three levels of testing

• Class / function level

• Service level

• Integration level

2. Staging is no big deal

Page 10: A Microservice Architecture for Big Data Pipelines

Translation to Big Data Pipelines…

[Diagram: a pipeline with the boxes Twitter Data, TrendingAnalysis, TopTweeters, and Recommend]
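The boxes on this slide can be read as a dependency graph: each module consumes some datasets and produces others, and the run order follows from that. The sketch below is one possible reading, not the speaker's implementation. The wiring (which module reads which dataset) and the dataset names "top-tweeters" and "recommendations" are assumptions, since the slide only names the boxes. It relies on graphlib from the Python standard library (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical wiring: the slide names the boxes but not the edges between them.
MODULES = {
    "TrendingAnalysis": {"inputs": ["twitter"], "outputs": ["daily-trends"]},
    "TopTweeters": {"inputs": ["twitter"], "outputs": ["top-tweeters"]},
    "Recommend": {
        "inputs": ["daily-trends", "top-tweeters"],
        "outputs": ["recommendations"],
    },
}


def run_order(modules):
    """Return module names ordered so that producers run before consumers."""
    # Map each dataset id to the module that produces it.
    producer = {
        out: name for name, spec in modules.items() for out in spec["outputs"]
    }
    # Inputs with no producing module (such as "twitter") are external
    # datasources and add no dependency edge.
    graph = {
        name: {producer[i] for i in spec["inputs"] if i in producer}
        for name, spec in modules.items()
    }
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(run_order(MODULES))
```

Because the modules share no state beyond their declared datasets, independent modules (here TrendingAnalysis and TopTweeters) may run in either order.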

Page 11: A Microservice Architecture for Big Data Pipelines

Translation to Big Data Pipelines…

TrendingAnalysis (container):

• manifest.yaml

• run.sh

• jar

• runtime

Page 12: A Microservice Architecture for Big Data Pipelines

Translation to Big Data Pipelines…

TrendingAnalysis (container):

• manifest.yaml

datasources:
  - twitter

outputs:
  - id: daily-trends
    fields:
      - name: keyword
        type: string
      - name: relevance
        type: integer

parameters: …

• run.sh

• jar

• runtime
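The outputs section of manifest.yaml reads like a contract: a module promises records with named, typed fields. A minimal sketch of how such a contract could be checked is shown below. The manifest is written as a plain Python dict to avoid a YAML dependency, and validate_record and TYPE_MAP are hypothetical names, not part of the tooling presented in the talk.

```python
from typing import Dict, List

# The manifest from the slide, transcribed as a dict (parameters omitted,
# since the slide elides them).
MANIFEST = {
    "datasources": ["twitter"],
    "outputs": [
        {
            "id": "daily-trends",
            "fields": [
                {"name": "keyword", "type": "string"},
                {"name": "relevance", "type": "integer"},
            ],
        }
    ],
}

# Assumed mapping from manifest type names to Python types.
TYPE_MAP = {"string": str, "integer": int}


def validate_record(manifest: Dict, output_id: str, record: Dict) -> List[str]:
    """Return a list of problems; an empty list means the record conforms."""
    outputs = {o["id"]: o for o in manifest["outputs"]}
    if output_id not in outputs:
        return [f"unknown output: {output_id}"]
    declared = {f["name"]: f["type"] for f in outputs[output_id]["fields"]}
    problems = []
    for name, type_name in declared.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        # Exact type check, so that a bool is not accepted as an integer.
        elif type(record[name]) is not TYPE_MAP[type_name]:
            problems.append(f"wrong type for {name}: expected {type_name}")
    for name in record:
        if name not in declared:
            problems.append(f"undeclared field: {name}")
    return problems


if __name__ == "__main__":
    print(validate_record(MANIFEST, "daily-trends",
                          {"keyword": "bigdata", "relevance": 7}))
```

A check of this kind could run at the module level of testing, so that a downstream module never has to guess the shape of its input.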

Page 13: A Microservice Architecture for Big Data Pipelines

Translation to Big Data Pipelines…

[Diagram: TrendingAnalysis with HDFS, Input Data, Parameters, and Result]
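In this picture a module is a function of its input data and parameters that writes a result to shared storage. The sketch below illustrates that contract under loud assumptions: the local filesystem stands in for HDFS, hashtag counting stands in for the real trending analysis (which the slides do not describe), and the parameter top_n and the function name run_trending_analysis are hypothetical. The output rows follow the keyword/relevance fields declared in the manifest.

```python
import json
from collections import Counter
from pathlib import Path


def run_trending_analysis(input_path: Path, output_path: Path, params: dict) -> None:
    """Read tweets (one per line), count hashtags, write the top ones as JSON."""
    top_n = int(params.get("top_n", 10))  # hypothetical parameter
    counts = Counter()
    with input_path.open(encoding="utf-8") as lines:
        for line in lines:
            # Treat every whitespace-separated token starting with '#' as a keyword.
            counts.update(w.lower() for w in line.split() if w.startswith("#"))
    rows = [
        {"keyword": keyword, "relevance": count}
        for keyword, count in counts.most_common(top_n)
    ]
    output_path.write_text(json.dumps(rows), encoding="utf-8")


if __name__ == "__main__":
    import tempfile

    workdir = Path(tempfile.mkdtemp())
    (workdir / "tweets.txt").write_text(
        "#spark rocks\n#spark and #hdfs\n", encoding="utf-8"
    )
    run_trending_analysis(
        workdir / "tweets.txt", workdir / "daily-trends.json", {"top_n": 2}
    )
    print((workdir / "daily-trends.json").read_text(encoding="utf-8"))
```

Since the module touches nothing but its input path, its output path, and its parameters, it can be exercised in a test with a temporary directory, with no cluster involved.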

Page 14: A Microservice Architecture for Big Data Pipelines

Demo

Page 15: A Microservice Architecture for Big Data Pipelines

Data Modules: QA Friendly?

1. Three levels of testing ✔

• Class / function level

• Module level

• Integration level

2. Staging is no big deal ✔

Page 16: A Microservice Architecture for Big Data Pipelines

Data Modules: Painless Deployment?

1. Reproducible environments (✔)

2. Versioned APIs ✔

3. Installation = docker-compose up (✔)

[Diagram: Prod and Dev environments]

Page 17: A Microservice Architecture for Big Data Pipelines

Data Modules: Scalable Development?

1. Team responsibility ✔

2. Less code = faster ramp up ✔

3. Technology independence ✔

Page 18: A Microservice Architecture for Big Data Pipelines

Data Modules: Modularity?

1. No shared state (well…)

2. Minimal coupling ✔

3. Separation of concerns ✔

4. Mix & match ✔

Page 19: A Microservice Architecture for Big Data Pipelines

Conclusion

Page 20: A Microservice Architecture for Big Data Pipelines

Brussels Office: 5, Place du Champ de Mars, 1050 Brussels, Belgium

Cape Town Office: Sovereign Quay, 34 Somerset Road, 8005 Green Point, Cape Town, South Africa

São Paulo Office: 93, Rua Doutor Andrade Pertence, Vila Olímpia, São Paulo, Brazil

Luxembourg Office: 691, rue de Neudorf, 2220 Luxembourg, Grand Duché du Luxembourg

Kuala Lumpur Office: 28-01, Integra Tower, 348 Jalan Tun Razak, 50400 Kuala Lumpur, Malaysia

www.realimpactanalytics.com