© Nube Technologies Real Time Fuzzy Matching With Spark and ElasticSearch

Real Time Fuzzy Matching with Spark and Elastic Search (Sonal Goyal, Nube)

About Us

The only way to do great work is to love what you do.

- Steve Jobs

The problem - lake or swamp?

Duplicates

Challenges

● Quadratic problem
● No standard notion of similarity
● Omissions, typos and other issues
● Different languages
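
To make the first three challenges concrete, here is a minimal Python sketch. It is not from the talk: the helper names `pair_count` and `similarity` are hypothetical, and `difflib.SequenceMatcher` is only one of many possible similarity measures, picked because it ships with the standard library.

```python
from difflib import SequenceMatcher


def pair_count(n: int) -> int:
    """Number of unordered record pairs a naive all-vs-all comparison must score."""
    return n * (n - 1) // 2


def similarity(a: str, b: str) -> float:
    """Illustrative similarity in [0, 1]; case and surrounding whitespace are ignored.

    Assumption: SequenceMatcher stands in for whatever measure a real system uses.
    """
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


# Quadratic growth: 10x more records means roughly 100x more comparisons.
for n in (1_000, 10_000, 100_000):
    print(f"{n:>7} records -> {pair_count(n):>13,} pairs")

# Typos and omissions lower the score without making the records unrelated.
print(similarity("Jonathan Smith", "Jonathon Smith"))
print(similarity("Jonathan Smith", "J. Smith"))
```

Where to draw the line between "same" and "different" is exactly the "no standard notion of similarity" problem: any threshold on such a score is a judgment call that depends on the data.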

Use Case - Customer Record Dedup

Use Case - Shopping Site Comparison

Other Use Cases

● Cross selling
● Financial Credit Ratings
● Fraud Analytics
● Catalog and inventory management
● Household and individual level analytics

Let's start wishing...

● Data variety
● Scalable
● No manual configuration of rules or algorithms
● Multi language
● Real time

Reifier - learn

Real Time

Spark + ElasticSearch
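
The slides do not say how the real-time lookup is wired up, so the following is only a sketch of one plausible piece: building an Elasticsearch search body that tolerates small typos. It relies on the `fuzziness` option of the Elasticsearch `match` query; the helper `fuzzy_match_query` and the field name `name` are hypothetical and chosen for illustration.

```python
import json


def fuzzy_match_query(field: str, text: str, size: int = 5) -> dict:
    """Build a search body that tolerates small typos in `text`.

    Assumption: `field` is an analyzed text field in the target index.
    """
    return {
        "size": size,  # return only the top candidate matches
        "query": {
            "match": {
                field: {
                    "query": text,
                    # "AUTO" scales the allowed edit distance with term length
                    "fuzziness": "AUTO",
                }
            }
        },
    }


# Hypothetical incoming record to match against already indexed records.
body = fuzzy_match_query("name", "Jonathon Smith")
print(json.dumps(body, indent=2))
```

A body like this would be sent to an index's `_search` endpoint. The candidates it returns would still need scoring by whatever matching model the batch side has learned.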

Spark Benefits

● Distributed
● Scalable
● Fast
● Machine Learning
● Sampling
● No need to orchestrate multiple jobs

Thank You!

www.nubetech.co [email protected]