Automation & Machine Learning Dr. Alvaro Feito Boirac 22 Aug 2016, Bermuda

Automation and machine learning in the enterprise

Where should we use automation?

- Repetitive Tasks

- Grunt work time/cost > development time/cost

- Task can be described as clear* steps

* the meaning of “clear” is subtle and will require clarification

Effortless for Humans ≠ Easy for Machines

3 Criteria for automation:

- Is it repetitive enough?

- Does it require enough man-hours?

- Can we describe it as an algorithm?
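The three criteria can be written down as a quick break-even check. This is only a minimal sketch: the `Task` fields, the `min_runs_per_year` threshold and the one-year horizon are assumptions chosen for illustration, not figures from the talk.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    runs_per_year: int       # how often the task is repeated
    minutes_per_run: float   # manual effort each time
    dev_hours: float         # estimated effort to automate it
    has_clear_steps: bool    # can it be described as an algorithm?


def manual_hours(task, years=1.0):
    """Hours of grunt work over the chosen horizon."""
    return task.runs_per_year * task.minutes_per_run / 60.0 * years


def worth_automating(task, years=1.0, min_runs_per_year=12):
    """Apply the three criteria; the threshold is an arbitrary assumption."""
    repetitive = task.runs_per_year >= min_runs_per_year
    saves_time = manual_hours(task, years) > task.dev_hours
    return repetitive and saves_time and task.has_clear_steps


monthly_report = Task("monthly report", 12, 120, 16, True)
print(worth_automating(monthly_report))
```

A longer horizon (`years=3`) makes more tasks pass the second criterion, which is one reason the estimate is worth revisiting.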

4 Tools (for different processes)

Show by Clicking

Show with Scripting

Show with Code

Machine Learning

Click & Tell

1. Click on

2. Type “Excel”

3. Click on

4. Paste text from previous task ...

Use software that reads and executes instructions:

PROS

- $ Cheap

- Easy to learn

- Quick-Start

CONS

- Not resilient (image change)

- Somewhat limited: click, open, close, save, write, copy, paste, move, etc.

- Does not scale with complexity
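The idea of software that reads and executes a list of steps can be sketched as a tiny interpreter. Everything here is hypothetical: the script format, the image file names and the `DryRunDriver` class are invented for illustration, and the driver only records actions instead of controlling a real mouse or keyboard.

```python
SUPPORTED = {"click", "type", "paste"}  # deliberately small vocabulary

# Hypothetical script; the image file names are made up.
SCRIPT = '''
1. click start_menu.png
2. type "Excel"
3. click excel_icon.png
4. paste
'''


class DryRunDriver:
    """Records actions instead of performing them, so the sketch runs anywhere."""

    def __init__(self):
        self.log = []

    def click(self, target):
        self.log.append(("click", target))

    def type(self, text):
        self.log.append(("type", text))

    def paste(self, _unused=""):
        self.log.append(("paste", ""))


def parse_step(line):
    """Turn '2. type "Excel"' into ('type', 'Excel')."""
    head, sep, rest = line.strip().partition(". ")
    text = rest if sep and head.isdigit() else line.strip()
    verb, _, arg = text.partition(" ")
    return verb.lower(), arg.strip().strip('"')


def run_script(script, driver):
    """Execute each non-empty line; unknown verbs are rejected."""
    for line in script.splitlines():
        if not line.strip():
            continue
        verb, arg = parse_step(line)
        if verb not in SUPPORTED:
            raise ValueError(f"unsupported instruction: {verb!r}")
        getattr(driver, verb)(arg)


driver = DryRunDriver()
run_script(SCRIPT, driver)
print(driver.log)
```

The fixed `SUPPORTED` set mirrors the "somewhat limited" con: any step outside the vocabulary stops the run.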

Click & Tell

Examples:

- Sikuli (or SikuliX)

- Automa

- PyWinAuto

- Etc ...

Click & Tell

Ref: Automate the Boring Stuff with Python

Open all documents in folder X, compare the 3rd item, and save the result to an Excel sheet. At the end of every month, open a website, take a screenshot of the stock, paste it into Excel, save that Excel sheet in the VP’s drive, and delete all the documents.

Click & Tell

Ref: Automa

Script & tell

Combine scripts from:

Windows, VBA, your API

PROS

- Uses your current software

- Gentle learning curve

- Affordable

- Can automate progressively

CONS

- Not resilient (program change)

- Often incompatible: Program A can’t talk to program B, which uses a different format.

- Not one single project: too many moving parts

Script & tell

Examples:

- VBA (Excel)

- Windows Automation API

- AutoIT

- Etc ...

Open all documents in folder X, compare the 3rd item, and save the result to an Excel sheet. Calculate the rolling average of the price and the contribution of each department. At the end of the month, open a website, take a screenshot of the stock, and create a .doc in the VP’s drive after deleting all the documents.
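The two calculations in this task (rolling average of the price, contribution of each department) are small enough to sketch. The slide has VBA-style scripting in mind; Python is used here only as a stand-in, and the function names and sample numbers are assumptions.

```python
from collections import deque


def rolling_average(prices, window):
    """Mean of the last `window` prices, emitted once the window is full."""
    buffer = deque(maxlen=window)  # old values fall out automatically
    averages = []
    for price in prices:
        buffer.append(price)
        if len(buffer) == window:
            averages.append(sum(buffer) / window)
    return averages


def contributions(sales_by_department):
    """Share of the total for each department, as fractions that sum to 1."""
    total = sum(sales_by_department.values())
    if total == 0:
        raise ValueError("total is zero, shares are undefined")
    return {dept: value / total for dept, value in sales_by_department.items()}


# Made-up sample data for illustration.
print(rolling_average([10, 20, 30, 40], window=2))
print(contributions({"Claims": 25, "Underwriting": 75}))
```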

Script & tell

Code & Tell

Use a programming language to manipulate your documents and interact with your software.

Code & Tell

[Diagram labels: COMPILER, Click, Macros, Code]

Code & Tell

PROS

- More versatile & powerful

- All in one platform

- Can automate progressively

- Many building blocks already exist

- Data analysis is easy to add

CONS

- Steeper learning curve

- Investment in staff? Time?

- Requires some maintenance

Code & Tell

Examples:

- Python (PyWinAuto + Pandas + Numpy + … )

- .NET (White + RogueWave, …)

- Java, Perl, Bash, …

Pull the raw data, make the usual statistics, create a PDF report from it with graphs. Make backups of all the documents and copy the first line of each in an email that will go to the SVP of XYZ.
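Part of this task fits in a few lines using only the Python standard library. The sketch below is an assumption-laden outline: it treats the documents as `.txt` files, builds the email without sending it, and leaves out the PDF report, which would need a third-party library. All function names and addresses are hypothetical.

```python
import shutil
import statistics
from email.message import EmailMessage
from pathlib import Path


def usual_statistics(values):
    """A guess at 'the usual statistics': mean, median, standard deviation."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # needs at least two values
    }


def backup_documents(folder, backup_folder):
    """Copy every .txt document into backup_folder; return the originals."""
    backup_folder = Path(backup_folder)
    backup_folder.mkdir(parents=True, exist_ok=True)
    documents = sorted(Path(folder).glob("*.txt"))
    for doc in documents:
        shutil.copy2(doc, backup_folder / doc.name)
    return documents


def first_lines_email(documents, sender, recipient):
    """Build (but do not send) an email with the first line of each document."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = "First line of each document"
    lines = []
    for doc in documents:
        with open(doc, encoding="utf-8") as handle:
            lines.append(f"{doc.name}: {handle.readline().strip()}")
    message.set_content("\n".join(lines))
    return message
```

A typical call chain would be `docs = backup_documents("reports", "reports_backup")` followed by `first_lines_email(docs, "me@example.com", "svp@example.com")`; sending the message is a separate step.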

Machine Learning

Use software and programming tools inspired by the brain.

For more fuzzy tasks:

- Recognize objects in an image

- Transcribe handwriting / solve Captchas

- Transcribe speech

- Make decisions based on data

- Identify trends or patterns

- Does this scan contain a seal and a signature?

Machine Learning (3 main schools*)

Biology & Physics inspired networks

Statistical learning (Bayesian)

Learning by Analogy (SVM)
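To give a feel for "learning by analogy", here is a minimal sketch that labels a new example by looking at the most similar known examples. Note the substitution: the slide names SVM, while this sketch uses k-nearest neighbours, a simpler analogy-based method, to stay short and dependency-free. The two features and all the numbers are invented.

```python
import math
from collections import Counter


def knn_predict(examples, query, k=3):
    """Label `query` by majority vote among its k closest labelled examples."""
    nearest = sorted(examples, key=lambda ex: math.dist(ex[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up features for regions of a scan: (roundness, stroke_thinness).
EXAMPLES = [
    ((0.90, 0.10), "seal"),
    ((0.80, 0.20), "seal"),
    ((0.85, 0.15), "seal"),
    ((0.10, 0.90), "signature"),
    ((0.20, 0.80), "signature"),
    ((0.15, 0.70), "signature"),
]

print(knn_predict(EXAMPLES, (0.75, 0.25)))
```

No rule about seals or signatures is written anywhere; the answer comes entirely from the labelled examples, which is why training data matters so much.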

Machine Learning

PROS

- Great for intuitive tasks (image, patterns, trends, voice)

- Many ready-to-use tools

- Can run in parallel with (or on top of) other tasks.

- Mostly free & open source

CONS

- Longer/Steeper learning curve

- Harder to hire experts

- May need large training data sets

Machine Learning

Examples:

- OpenCV (image)

- Pyocr, tesseract, FreeOCR (OCR)

- Theano, Sci-Kit, TensorFlow

Find an object in an image, check a signature, check a stamp, count bullet points on a scan, find deep correlations in data, track a point in a video, transcribe handwriting or voice, etc.
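"Find an object in an image" can be illustrated with a brute-force template search. This is a toy sketch in plain Python on a tiny grayscale grid, using the sum of absolute differences as the score; it is an assumption about how one might start, and libraries such as OpenCV offer far faster and more robust matching.

```python
def find_template(image, template):
    """Return (row, col, score) of the best match, or None if it cannot fit.

    Both arguments are lists of rows of grayscale values. The score is the
    sum of absolute pixel differences, so 0 means a perfect match.
    """
    image_h, image_w = len(image), len(image[0])
    templ_h, templ_w = len(template), len(template[0])
    best = None
    for row in range(image_h - templ_h + 1):
        for col in range(image_w - templ_w + 1):
            score = sum(
                abs(image[row + i][col + j] - template[i][j])
                for i in range(templ_h)
                for j in range(templ_w)
            )
            if best is None or score < best[2]:
                best = (row, col, score)
    return best


# Made-up 4x4 "scan" with a bright 2x2 stamp in it.
IMAGE = [
    [0, 0, 0, 0],
    [0, 0, 9, 8],
    [0, 0, 7, 9],
    [0, 0, 0, 0],
]
STAMP = [
    [9, 8],
    [7, 9],
]

print(find_template(IMAGE, STAMP))
```

An exact pixel comparison like this breaks as soon as the stamp is rotated, scaled or faded, which is where the learned approaches above earn their keep.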