Designing a Real Time Data Ingestion Pipeline

Designing Real-Time Data Ingestion Pipeline

Badar Ahmed

About Us

DataScience Inc.
▪ Data Science as a service
▪ Customers from Sonos to Belkin
▪ Ranked #1 among "Best Places to Work in Los Angeles for 2015"
▪ Visit datascience.com!

Badar Ahmed
▪ Software Engineer
▪ Background in high performance computing & cloud computing
▪ Work across the stack on Big Data problems

http://datascience.com

Importance of Data Ingestion

▪ Data ingestion is a precursor to any analysis
▪ Characteristics:
  ✦ Reliable
  ✦ Correct
  ✦ Fast
  ✦ Scalable

Types of Data Ingestion

▪ Broad topic with many different architectural patterns

▪ Real Time
▪ Batch

▪ Structured Data
▪ Unstructured Data

Ingestion Evolution @ DataScience

▪ Legacy API existed
▪ But ...
  ✦ Expensive
  ✦ Ops Heavy
  ✦ Hard to scale
  ✦ No batch interface

What was needed

▪ Scalable ingestion system
▪ Batch Ingest
▪ Lower Ops and $$$ Cost

Idea #1

▪ Asynchronous API
▪ Queue requests and process them later

Pros:
▪ Fast
▪ Scalable
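The queue-and-process-later idea can be sketched as follows. This is only an illustration, not the system described in the talk: the in-memory `requests` queue, the `datastore` and `status` dictionaries, and the `submit`, `worker`, and `run_demo` names are all hypothetical stand-ins for a durable queue and a real datastore.

```python
import queue
import threading
import uuid

# Hypothetical in-memory stand-ins: a real system would use a durable
# queue and a real datastore.
requests = queue.Queue()
datastore = {}
status = {}


def submit(record):
    """Accept a record, enqueue it, and return a request id right away."""
    request_id = str(uuid.uuid4())
    status[request_id] = "queued"
    requests.put((request_id, record))
    return request_id


def worker():
    """Drain the queue in the background and write each record."""
    while True:
        item = requests.get()
        if item is None:  # sentinel: stop the worker
            requests.task_done()
            break
        request_id, record = item
        try:
            datastore[request_id] = record
            status[request_id] = "done"
        except Exception:
            status[request_id] = "failed"
        finally:
            requests.task_done()


def run_demo():
    """Start one worker, submit three records, and wait for them."""
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    ids = [submit({"event": n}) for n in range(3)]
    requests.join()     # block until everything queued so far is processed
    requests.put(None)  # ask the worker to stop
    thread.join()
    return ids
```

The caller gets a request id back before any write happens, which is what makes the API fast, and also why the caller later has to look up `status` to learn whether the write succeeded.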

Issues with Idea #1

▪ Failure introduces complexity
▪ Decoupled systems can be more difficult to debug
▪ Poorer UX if users need to keep track of async requests
▪ Large deviation from the simpler API model of ConnectHQ

Idea #2

▪ Synchronous Batch, so ...
  ✦ UX remains the same
▪ Use concurrency to do parallel writes to the datastore
  ✦ Caveat: Concurrent code is difficult to write & debug

First Step: Prototype

Integration Testing

Unit Testing with Mocks
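As a hedged illustration of this step, the sketch below uses `unittest.mock.Mock` from the Python standard library to stand in for a datastore client. The `ingest` function and the client's `put` method are hypothetical and are not taken from the talk.

```python
from unittest.mock import Mock


def ingest(client, records):
    """Write each record through the client; return how many succeeded."""
    written = 0
    for record in records:
        try:
            client.put(record)
            written += 1
        except ConnectionError:
            pass  # count as a failure and continue with the next record
    return written


def test_ingest_counts_only_successful_writes():
    client = Mock()
    # The second call raises; the first and third return None.
    client.put.side_effect = [None, ConnectionError("down"), None]
    assert ingest(client, [{"id": 1}, {"id": 2}, {"id": 3}]) == 2
    assert client.put.call_count == 3
```

The mock lets the failure path be exercised without a real datastore, which is hard to trigger reliably in an integration test.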

More Testing

Test & Refactor Cycle

Questions?

Thank you.

Development

Operations & Monitoring

Batch Data Loading
