Wolfram Alpha An introduction to the underlying technology SIGC 2010/2011 Pedro Gaspar




Outline

Introduction
History
Technology – The “Four Pillars”
Technology – Interesting Facts
Conclusions
Reference

Wolfram Alpha - Pedro Gaspar 2


Introduction

Real-time computational answering system
Not a search engine like Google
Not as static as Wikipedia or an encyclopedia


Introduction

Goal:
“Wolfram|Alpha's long-term goal is to make all systematic knowledge immediately computable and accessible to everyone.”

Systematic knowledge:
◦ Objective Data
◦ Models
◦ Methods
◦ Algorithms
◦ Formulae


Introduction

Some of the explored areas:

Mathematics
Statistics & Data Analysis
Physics
Chemistry
Materials
Engineering
Astronomy
Earth Sciences
Life Sciences
Computational Sciences
Units & Measures
Dates & Times
Weather
Places & Geography
People & History
Culture & Media
Music
Words & Linguistics
Sports & Games
Colors
Money & Finance
Socioeconomic Data
Health & Medicine
Food & Nutrition
Education
Organizations
Transportation
Technological World
Web & Computer Systems


HISTORY


How did the project start?


History – Wolfram Alpha

Project led by Stephen Wolfram

It is the culmination of 5 years of work, building on 25 years of previous development

Stephen Wolfram founded Wolfram Research in 1987, focusing mainly on the Mathematica software


History – Wolfram Alpha

In 2002, Stephen Wolfram published “A New Kind of Science”

In 2004, the company set out to apply the concepts from the book to a real-world product and began developing Wolfram Alpha

On May 18th, 2009, Wolfram Alpha was officially launched to the public


History – Computable Knowledge

The history of systematic data and the development of computable knowledge goes back to 20,000 BC, with the invention of arithmetic

Scientific books, encyclopedias, censuses, maps and other sources of information have been collecting data since ancient Mesopotamia


TECHNOLOGY


How does it work?


Technology – the “Four Pillars”

Curation
Formalization
NLP
Visualization


Pillar 1 - Curation

Field Experts help the team find the best content sources and validate the data

Community input is also accepted, but all the data has to go through a rigorous validation process before being used

Almost none of their data comes from the Internet now

It turned out that curation and data gathering were only 5% of the work


Pillar 2 - Formalization

Organizing the curated data so that it can be computed

Figuring out its conventions, units, definitions and how it connects to other data

All these are encoded algorithmically in Wolfram Alpha so that they’re available when needed

All the algorithms, models and equations are encoded into functions in Mathematica, the programming language behind Wolfram Alpha
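
As a rough illustration of what formalization involves, the sketch below stores a curated value together with its unit, normalizes it to one convention, and encodes a formula as a function. It is written in Python rather than Mathematica and is not taken from Wolfram Alpha; the names `Quantity`, `TO_METERS` and `free_fall_time` are hypothetical.

```python
import math
from dataclasses import dataclass

# Hypothetical conversion table: factors from each unit to meters.
TO_METERS = {"m": 1.0, "km": 1000.0, "ft": 0.3048}


@dataclass(frozen=True)
class Quantity:
    """A curated value that carries its unit explicitly."""
    value: float
    unit: str

    def in_meters(self) -> float:
        # Normalize to a single convention so values from different sources compare.
        return self.value * TO_METERS[self.unit]


def free_fall_time(height: Quantity, g: float = 9.81) -> float:
    """Seconds to fall from `height` without air resistance: t = sqrt(2h / g)."""
    return math.sqrt(2.0 * height.in_meters() / g)


# Two sources reporting the same height in different units give the same answer.
t_m = free_fall_time(Quantity(100.0, "m"))
t_km = free_fall_time(Quantity(0.1, "km"))
```

Because the unit travels with the value, the formula never has to guess which convention a given source used.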


Pillar 2 - Formalization

Mathematica’s language is able to represent data of all kinds using arbitrarily structured symbolic expressions

As a result, the code is much more compact than in a lower-level language like Java or Python

Mathematica already includes a very large set of algorithms and functions, making it easier to implement new (usually more complex) algorithms
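
To give a feel for "arbitrarily structured symbolic expressions", here is a minimal sketch in Python (not Mathematica, and not how Wolfram Alpha is implemented). An expression is a head applied to any number of arguments, and evaluation reduces what it can while leaving the rest symbolic. The class `Expr` and the function `evaluate` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Expr:
    """A symbolic expression: a head applied to any number of arguments."""
    head: str
    args: Tuple["Node", ...] = ()


Node = Union[Expr, int, float, str]


def evaluate(node: Node) -> Node:
    """Evaluate the arithmetic heads we know; leave everything else symbolic."""
    if not isinstance(node, Expr):
        return node
    args = tuple(evaluate(a) for a in node.args)
    all_numbers = all(isinstance(a, (int, float)) for a in args)
    if node.head == "Plus" and all_numbers:
        return sum(args)
    if node.head == "Times" and all_numbers:
        result = 1
        for a in args:
            result *= a
        return result
    return Expr(node.head, args)


# Plus[1, Times[2, 3]] reduces to a number ...
numeric = evaluate(Expr("Plus", (1, Expr("Times", (2, 3)))))
# ... while Plus[x, Times[2, 3]] stays symbolic, with the inner product reduced.
symbolic = evaluate(Expr("Plus", ("x", Expr("Times", (2, 3)))))
```

The same nested structure can hold numbers, symbols, or other expressions, which is the property the slide attributes to Mathematica's language.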


Pillar 2 - Formalization

This creates a recursive process that makes implementing new algorithms progressively easier through software reuse


Pillar 3 – Natural Language Processing

How could users interact with the system and use its computing power? Human language is the most natural answer

The problem is not the one we are used to – instead of trying to make sense of a big set of words, the system has to map small pieces of human input (queries) into its large set of symbolic representations

The implemented solutions generally achieve good results
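
The idea of mapping a short query onto a known set of symbolic representations can be sketched as follows. This is only an illustration under strong assumptions: the two small lexicons and the function `parse_query` are hypothetical, and the real system's linguistic processing is not publicly documented.

```python
import re
from typing import Optional, Tuple

# Hypothetical lexicons: surface words mapped to internal symbols.
PROPERTIES = {"population": "Population", "capital": "CapitalCity", "area": "Area"}
ENTITIES = {"portugal": "Country[Portugal]", "france": "Country[France]"}


def parse_query(query: str) -> Optional[Tuple[str, str]]:
    """Map a short query to a (property, entity) pair, or None if nothing matches."""
    tokens = re.findall(r"[a-z]+", query.lower())
    prop = next((PROPERTIES[t] for t in tokens if t in PROPERTIES), None)
    entity = next((ENTITIES[t] for t in tokens if t in ENTITIES), None)
    if prop is None or entity is None:
        return None
    return (prop, entity)


a = parse_query("population of Portugal")
b = parse_query("Portugal population?")
c = parse_query("best pizza near me")
```

Word order does not matter here: both phrasings of the population query resolve to the same symbolic pair, while a query outside the lexicons is rejected instead of being guessed at.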


Pillar 4 – Visualization

Wolfram Alpha’s ability to present results in formats other than text is one of its most visually appealing features

Mathematica includes some functionality to deal with this challenge, through what they call “computational aesthetics”

This automates, for a specific symbolic representation, what to present and how to present it
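
One way to picture this automation is a function that inspects the shape of a result and picks the output formats. The rules below are invented for illustration and say nothing about how Mathematica's "computational aesthetics" works internally; `choose_presentations` is a hypothetical name.

```python
from typing import Any, List


def choose_presentations(result: Any) -> List[str]:
    """Pick output formats from the shape of a result (illustrative rules only)."""
    if isinstance(result, bool):
        return ["text"]
    if isinstance(result, (int, float)):
        return ["text", "number line"]
    if isinstance(result, dict):
        return ["table"]
    if isinstance(result, list) and result and all(
        isinstance(p, tuple) and len(p) == 2 for p in result
    ):
        # A non-empty list of (x, y) pairs can be tabulated or plotted.
        return ["table", "line plot"]
    return ["text"]


single = choose_presentations(42)
series = choose_presentations([(2009, 1.0), (2010, 1.5)])
lookup = choose_presentations({"capital": "Lisbon"})
```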


Technology – Interesting Facts


More than 10 trillion pieces of data
More than 50,000 types of algorithms and models
Linguistic capacity for more than 1000 domains
More than 8 million lines of symbolic Mathematica code
Runs on clusters of supercomputers, including the 44th largest supercomputer in the world, R Smarr
Hundreds of terabytes of storage


Conclusions

It is all a matter of representing data and mapping queries to the set of things the system can compute

Uses an internal, pre-structured database to find the answers to the queries

Computation adds a lot of value compared with search engines like Google

Little to no information available about how the system works internally


QUESTIONS?


[email protected]

Pedro Gaspar