Awesomeness of Container Orchestras: Data Crunching in a Company Builder
Martin Held
What is FinLeap doing?
Tech Department’s Mission
Our journey
How we did it
[Diagram: Monolithic Application (typically RoR) containing WebUI and Backend (Service A, Service B), with SQL data storage]
Some Learnings
• finding developers is hard; finding developers experienced in a specific tech stack is even harder
• integrating data-science-like functionality is not straightforward
• scaling the monolithic application can be challenging
How we do it these days
[Diagram: WebUI, API Gateway, Message Broker, Backend (Service A, Service B), SQL]
we plan and implement containerised microservice architectures
This allows us:
• multilingual applications
  - better utilisation of existing dev resources
  - broader talent pool for recruitment
• plug-in data science solutions as a service
• scaling
  - tech-wise: 'scale bottleneck services'
  - business-wise: dedicated teams for different tasks
• reuse of services (authentication etc.)
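To illustrate how services behind a message broker stay decoupled from each other, here is a minimal in-process sketch. The `Broker` class, the topic names and both handlers are hypothetical stand-ins invented for this example. A real setup would run each service in its own container and talk to an actual broker.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class Broker:
    """Toy in-memory publish/subscribe broker (hypothetical, illustration only)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        # Deliver synchronously to every handler registered for this topic.
        for handler in list(self._subscribers[topic]):
            handler(message)


def run_demo() -> List[str]:
    broker = Broker()
    log: List[str] = []

    # "Service A" turns incoming requests into jobs; it knows only topic names.
    def service_a(msg: dict) -> None:
        broker.publish("jobs", {"job": msg["request"].upper()})

    # "Service B" consumes jobs; it could be written in a different language.
    def service_b(msg: dict) -> None:
        log.append(f"processed {msg['job']}")

    broker.subscribe("requests", service_a)
    broker.subscribe("jobs", service_b)
    broker.publish("requests", {"request": "resize-image"})
    return log


print(run_demo())
```

Neither service references the other directly, so one of them can be replaced or scaled without touching the other.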
How is life in a container?
A Docker Container
[Diagram: Container 1, Container 2, Container 3]
- packages an application with its dependencies
- more lightweight than virtual machines (containers share the host OS)
- runs on any computer, any infrastructure, any cloud
Life in a (DataScientist) Container
FROM ubuntu:14.04

ENV PYTHONPATH /opt/caffe/python

# Add caffe binaries to path
ENV PATH $PATH:/opt/caffe/.build_release/tools

# Get dependencies
RUN apt-get update && apt-get install -y bc cmake curl gcc-4.6 g++-4.6 wget

# Use gcc 4.6
RUN update-alternatives --install /usr/bin/cc cc /usr/bin/gcc-4.6 30 && \

# Clone the Caffe repo
RUN cd /opt && git clone https://github.com/BVLC/caffe.git

# Build Caffe core
RUN cd /opt/caffe && cp Makefile.config.example Makefile.config && \

# Add ld-so.conf so it can find libcaffe.so
ADD caffe-ld-so.conf /etc/ld.so.conf.d/

# Run ldconfig again (not sure if needed)
RUN ldconfig

# Install python deps
RUN cd /opt/caffe && \
    cat python/requirements.txt | xargs -L 1 sudo pip install
bvlc_reference_caffenet.caffemodel
RUN wget -O models/deploy.prototxt https://raw.githubusercontent.com/BVLC/caffe/master/models/bvlc_reference_caffenet/deploy.prototxt
Life in a (DataScientist) Container
Use Ubuntu base image
Install OS dependencies (g++, Python, git, Fortran, curl, etc.)
Clone Caffe repo (open source deep learning library)
Build Caffe core
Install Python dependencies
Build Caffe Python bindings
Add model and source code
Specify execution command
Life in a (DataScientist) Container
Image Recognition AI
fish, aquarium, child
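The labels above are the kind of output an image classifier returns for a picture. As a rough sketch of the last step, the snippet below picks the best-scoring labels from a score table. The function `top_labels` and the scores are made up for illustration and are not part of Caffe's API. In the real container, the scores would come from the CaffeNet model.

```python
from typing import Dict, List


def top_labels(scores: Dict[str, float], k: int = 3) -> List[str]:
    """Return the k labels with the highest scores, best first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [label for label, _ in ranked[:k]]


# Made-up scores, chosen only to reproduce the labels shown on the slide.
example_scores = {
    "fish": 0.61,
    "aquarium": 0.22,
    "child": 0.09,
    "car": 0.05,
    "tree": 0.03,
}

print(", ".join(top_labels(example_scores)))  # prints "fish, aquarium, child"
```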
Orchestration
The Orchestra
Scheduling: place and start containers on host(s) offering the required resources
Service Discovery + Registration: allow containers to communicate with each other and the rest of the world
Implement Resilience: e.g. automatically restart containers in case of failure
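The three orchestration tasks can be sketched in a few lines. This is a toy model under stated assumptions, not how any particular orchestrator works: `Host`, `ContainerSpec`, `schedule` and `reconcile` are hypothetical names, placement is simple first-fit, and the `running` dictionary stands in for a service registry.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Host:
    name: str
    free_cpu: float
    free_mem_mb: int
    containers: List[str] = field(default_factory=list)


@dataclass
class ContainerSpec:
    name: str
    cpu: float
    mem_mb: int


def schedule(spec: ContainerSpec, hosts: List[Host]) -> Optional[Host]:
    """Scheduling: first-fit placement on a host with enough free CPU and memory."""
    for host in hosts:
        if host.free_cpu >= spec.cpu and host.free_mem_mb >= spec.mem_mb:
            host.free_cpu -= spec.cpu
            host.free_mem_mb -= spec.mem_mb
            host.containers.append(spec.name)
            return host
    return None  # no host can currently run this container


def reconcile(desired: List[ContainerSpec], running: Dict[str, str],
              hosts: List[Host]) -> List[str]:
    """Resilience: start every desired container that is not running.

    `running` maps container name -> host name (a toy registry, updated in place).
    Returns the names of the containers started in this pass.
    """
    started = []
    for spec in desired:
        if spec.name not in running:
            host = schedule(spec, hosts)
            if host is not None:
                running[spec.name] = host.name
                started.append(spec.name)
    return started


hosts = [Host("host-1", 2.0, 2048), Host("host-2", 4.0, 8192)]
desired = [ContainerSpec("web", 1.0, 512), ContainerSpec("extractor", 2.0, 4096)]
running: Dict[str, str] = {}

first_pass = reconcile(desired, running, hosts)

# Simulate the failure of host-1: it disappears together with its containers.
hosts = [h for h in hosts if h.name != "host-1"]
running = {c: h for c, h in running.items() if h != "host-1"}

second_pass = reconcile(desired, running, hosts)
print(first_pass, second_pass, running)
```

Calling `reconcile` repeatedly moves the actual state towards the desired state. That is the idea behind automatic restarts.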
An example
Reverse Image Search
Query Image
[Diagram, built up over several slides: components connected through a Message Broker]
• Crawler: crawls product pictures and metadata; emits Product Picture and Raw Metadata
• Feature Extractor: Product Picture → Picture Features
• Metadata Parser: Raw Metadata → Structured Metadata
• StorageSink: writes Picture Features and Structured Data to the Store
• NearestNeighbor Search API: Query Picture → Query Features → Similar Pictures
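The final lookup step, from query features to similar pictures, can be sketched as a brute-force nearest-neighbour search. The slides do not say which distance measure or index is used, so Euclidean distance, the function names and the three-dimensional feature vectors below are all assumptions for illustration. Real picture features would have far more dimensions.

```python
import math
from typing import Dict, List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbors(query: Sequence[float],
                      store: Dict[str, Sequence[float]],
                      k: int = 2) -> List[Tuple[str, float]]:
    """Brute-force search: compare the query with every stored vector."""
    distances = [(pid, euclidean(query, feats)) for pid, feats in store.items()]
    distances.sort(key=lambda pair: pair[1])
    return distances[:k]


# Hypothetical store content, as the StorageSink might have written it.
store = {
    "shoe-red": [0.9, 0.1, 0.0],
    "shoe-blue": [0.8, 0.2, 0.1],
    "lamp": [0.0, 0.1, 0.9],
}

query_features = [0.85, 0.15, 0.0]  # would come from the Feature Extractor
similar = nearest_neighbors(query_features, store, k=2)
print([picture_id for picture_id, _ in similar])
```

Brute force is linear in the number of stored pictures. A larger catalogue would likely call for an approximate index.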
To take home
• containers are a great tool to package code with all its dependencies and make it usable by others
• they allow us to develop scalable, plug-in-ready data science solutions
• complex, scalable and resilient architectures for the masses
Thank you for your attention!
www.linkedin.com/in/martin-held