Rapid Advances in Computer Science and Opportunities for Society European CS Presentation, October 2010 Alfred Spector VP, Research and Special Initiatives


Abstract

Information and Communication Technologies have had a rapid impact on society and, amazingly, the pace of innovation continues to accelerate. This innovation is catalyzed by ever-increasing hardware and networking capabilities, the growth in internet usage, as well as important advances in basic and applied computer science. In this talk, I will describe some of the research that Google is undertaking (for example, in machine translation, semantic processing, and information management) and discuss some of the likely beneficial impacts on our society, for example in science, the humanities, education, philanthropic activities, and more. I’ll conclude my presentation with some interesting challenges from both a technology and policy point of view.

Outline

- Google
- Prodigiousness
- Advances in the Field (examples)
  - Translation
  - Speech
  - Vision
  - Cloud-based collaboration around structured data
  - Operations Research
  - Semantic Processing
- Beneficial Societal Impacts (examples)
  - Earth Engine
  - Google Health
  - Other Health Efforts
  - Crisis Response
  - Digital Humanities
  - Education
- Technical Themes
- Challenges

Mission

Organizing the world’s information and making it universally accessible and useful

Google and Commerce

- Over 1 million AdWords advertisers worldwide
- Over 1 million AdSense publishers worldwide
- Via the Google Ad Network, AdSense publishers reach over 80% of global internet users in 100 countries and 20 languages
- YouTube is monetizing over a billion video views per week globally
- In 2009, Google generated $54 billion of economic activity for American businesses, website publishers, and non-profits

Prodigiousness

Giga 10^9, Tera 10^12, Peta 10^15, Exa 10^18, Zetta 10^21

- Publicized Bigtable of 70 petabytes, 10M ops/sec
- Warehouse computing possibilities: 100 x 10 x 20 x 20 x 40 = 16,000,000 nodes…

Some representative numbers:

- Storage: 10^18 -> 10^20-21
- Users: 10^9 -> 10^10
- Devices: 10 -> 10^12
- Network: 10^20 now -> 10^21/yr (32 KB/sec for 1B people)
- Apps: 10^5 -> 10^6-7 or more
- E.g., embedded car systems: 30-50 ECUs, 100M lines of code
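The two derived figures on this slide can be checked with a few lines of arithmetic. The sketch below is only a back-of-envelope verification; the variable names are illustrative, and it assumes 1 KB = 1000 bytes and a 365-day year, neither of which the slide states.

```python
from math import prod

# Warehouse fan-out factors exactly as listed on the slide.
factors = [100, 10, 20, 20, 40]
nodes = prod(factors)

# Network estimate: 32 KB/sec sustained for 1 billion people over one year.
bytes_per_sec_per_person = 32_000      # assumption: 1 KB = 1000 bytes
people = 1_000_000_000
seconds_per_year = 365 * 24 * 3600     # assumption: 365-day year
bytes_per_year = bytes_per_sec_per_person * people * seconds_per_year

print(nodes)                    # 16000000
print(f"{bytes_per_year:.2e}")  # 1.01e+21
```

Under these assumptions the product of the factors is 16 million nodes, and the per-person rate works out to roughly 10^21 bytes per year, matching the order of magnitude quoted for the network.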

A variety of science and engineering challenges

Focus on Innovation that Benefits our Users
Focus on Research and Engineering

- Commitment to advancing technology
- Rich domain of work due to our mission
- Grand challenge problems
- Internal consensus that production issues are often as challenging/fun as pure invention
- Technical leverage:
  1. Google Common Distributed System
  2. A Focus on Services
  3. Empiricism and a Holistic Approach to Design

Our Innovation Culture

- Focus on talent
- Distributed across the organization
  - Impacting Google necessitates broad, diverse involvement in science and engineering
  - Research is done both in our research team and in our engineering organization, organized opportunistically
- Teams benefit greatly
  - From mutual talent
  - From Google’s comparative advantages to our scale and broad use
  - From service-based architecture (“ease” of working in vivo)

Ideal Distributed Computing

Devices

Research Challenges in Ideal Distributed Computing

- Alternative designs that would give better energy efficiency at lower utilization
- Server OS design aimed at many highly-connected machines in one building
- Unifying abstractions for exploiting parallelism beyond inter-transaction parallelism and map-reduce
- Latency reduction
- A general model of replication, including consistency choices, explained and codified
- Machine learning techniques applied to monitoring/controlling such systems
- Automatic, dynamic world-wide placement of data & computation to minimize latency and/or cost, given constraints on
- Building retrieval systems that efficiently and usably deal with ACLs
- Holistic models of privacy
- The user interface to the user’s diverse processing and state

Totally Transparent Processing

- D: The set of all end-user access devices (Personal Computers, Phone, Media Players, Readers, Telematics, Set-top Boxes, Appliances, Health devices, …)
- L: The set of all human languages (Current languages, Historical languages, Other forms of human notation, Possible language specialization, Formal languages, …)
- M: The set of all modalities (Text, Image, Audio, Video, Graphics, Other sensor-based data, …)
- C: The set of all corpora (The normal web, The deep web, Periodicals, Books, Catalogs, Blogs, Geodata, Scientific datasets, Health data, …)

For all d in D, all l in L, all m in M, and all c in C
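The quantifier "for all d, l, m, c" ranges over the Cartesian product D x L x M x C, so the number of cases to support grows multiplicatively with each set. The following is a minimal sketch with tiny, made-up sample sets (three members each, chosen from the examples on the slide) to make that growth concrete.

```python
from itertools import product

# Tiny illustrative samples; the real sets are far larger and open-ended.
D = ["phone", "personal computer", "set-top box"]   # devices
L = ["English", "Urdu", "Latin"]                    # languages
M = ["text", "audio", "video"]                      # modalities
C = ["normal web", "books", "geodata"]              # corpora

# Every (device, language, modality, corpus) combination.
combinations = list(product(D, L, M, C))

print(len(combinations))   # 3 * 3 * 3 * 3 = 81
print(combinations[0])     # ('phone', 'English', 'text', 'normal web')
```

Even with three members per set there are 81 combinations, which is one way to read why the slide frames transparency across all four dimensions as a research goal rather than a feature.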


“Hybrid” Intelligence

- To extend the capability of people, not in isolation
- Aggregation of empirical signal is exceedingly valuable
- Ex.:
  - Feedback in Information Retrieval, e.g., in ranking or spelling correction
  - Machine learning, e.g., image content analysis, speech recognition with semi-supervised learning

Research Challenges in Transparent Computing & Hybrid Intelligence

- Endless applications with very new user interface implications
- Addressing limits to data
- Techniques to integrate user-feedback in acceptable fashions
- Approaches to new signal
- Explanation, scale, and variance minimization in machine learning
- Information fusion/learning across diverse signals - The Combination Hypothesis more generally
- Usability, devices, and subpopulations
- Privacy

Domains of Application

- Search engines
- Translation
- Speech recognition
- Vision
- Remedial Education
- Personal health
- Epidemiology
- Economic prediction
- Societal/environmental optimization
- Social Networking in ever more clever/useful ways
- Humanities and Social Sciences
- Multi-player gaming

Translation

Machine Translation at Google

Statistical Machine Translation

- Model translation process with a statistical model
- Learning from data: monolingual & bilingual
  - More data, better translation quality
- Computationally expensive approach
  - Models have many hundreds of gigabytes of data (Moore's law helps here)
- Applying syntax information as a signal

Results

- Much better translation quality
- Ongoing progress
  - More research groups
- 58 languages (so far)
  - recently: Haitian Creole, Urdu, Georgian, Latin

Grand Challenges

- Morphology
  - translating into morphologically rich languages, e.g., Russian, Hungarian
  - need morphology-aware translation models
- Reliability
  - some translation mistakes are more severe than others: hotel - Montreal, Heath Ledger - Tom Cruise
  - Research: How to detect crazy translations?
- Long-distance reordering
  - simple case: SVO to SOV
  - (one) approach: parse source & reorder
  - issue: parsing accuracy for out-of-domain texts
- Finding all Training Data

How about Poetry?

Paper at EMNLP 2010 conference: “Poetic” Statistical Machine Translation: Rhyme and Meter. D. Genzel, J. Uszkoreit, F. Och. EMNLP 2010.

Approach

- Enforce meter and rhyme as extra constraints (similar to language model)
- E.g., iambic pentameter: stress pattern 0101010101
- Produce most probable translation that obeys constraints (Function follows form)

Example output (couplet in amphibrachic tetrameter):

An officer stated that three were arrested
and that the equipment is currently tested
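The idea of "most probable translation that obeys constraints" can be illustrated with a toy filter. This sketch is not the method of the cited paper, which applies the constraints during decoding; here the constraint is applied afterwards to a hand-made candidate list, and the candidate names, scores, and stress patterns are all invented for illustration.

```python
# Hypothetical candidates: (translation, model score, syllable stress pattern).
candidates = [
    ("candidate A", 0.50, "0110010101"),  # highest score, but breaks the meter
    ("candidate B", 0.30, "0101010101"),
    ("candidate C", 0.20, "0101010101"),
]

# Iambic pentameter as given on the slide: five unstressed/stressed pairs.
IAMBIC_PENTAMETER = "01" * 5

def best_constrained(cands, meter):
    """Return the highest-scoring candidate whose stress pattern equals the meter."""
    allowed = [c for c in cands if c[2] == meter]
    return max(allowed, key=lambda c: c[1]) if allowed else None

print(best_constrained(candidates, IAMBIC_PENTAMETER))
# ('candidate B', 0.3, '0101010101')
```

The unconstrained best (candidate A) is rejected because its pattern violates the meter, so the form decides which of the remaining candidates is produced.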

Speech

Goals for Speech Technology at Google

- Much of the world’s information is spoken - we need to recognize it before we can organize it
  - YouTube transcription and translation (breaking the language barrier for YouTube access)
  - Voicemail transcription
- Mobile is the fastest growing and most widespread platform for communication and services that has ever existed
  - Spoken input and output is key to usability
- Our goal is completely ubiquitous availability of speech I/O (every application/service, every usage scenario, every language)

How do we get there?

- Delivery from the cloud - support constant iteration and refinement
- Operating at large scale - train huge statistical models on huge amounts of data
- Learning from use - without human transcription

Challenges

- How do we grow the model to take advantage of the data (richer models of accent, speaker, noise, etc.)?
- Huge computational demands
- Infrastructure demands - parallelization - leverage Google software environment

Training Acoustic Models w/ Unsupervised Learning

[Figure: supervised vs. unsupervised training, hours of data vs. error rate]

Vision

Computer Vision

Advance the state of the art in 3 key areas of image/audio/video analysis and apply results to our multimedia products

- Semantic Interpretation: Generate human-understandable description of content (e.g., auto-tagging videos on YouTube, image annotation, porn classification, etc.)
- Matching: Find similar entities from a large corpus (e.g., find similar on image search, video fingerprinting for YouTube, etc.)
- Synthesis: Generate better images/video by understanding the statistics of a large corpus of images (e.g., better facades in 3D buildings on Google Earth, automatic shadow removal from aerial images, etc.)

Semantic Interpretation sample problem - Video Annotation

- Video metadata has a cognitive cost on the user because they have to type it in, be careful about what keywords they use, and in general try to make their video searchable
- Many uploaders don’t have the motivation or energy to provide proper metadata
- Noisy metadata hurts everyone - spam, misspellings, 1337, acronyms, etc.

Cloud-based Computing: Structured Data

Structured Data on the Web

- Discovery and search for structured data
  - The deep Web -- significant gap in coverage
  - Structured tables on the Web -- not leveraged in search
- Enable easy creation, management, sharing, and publishing of structured data
  - Fusion Tables: www.google.com/fusiontables

Google Fusion Tables: host, manage, collaborate on, visualize, and publish data tables online

What can I do with Fusion Tables?

- Host data online - and stay in control
  - control can be at the level of columns or rows
- Re-use data without making copies
- Collaborate on the details
  - Merge data from multiple tables
  - Comment on individual rows, columns, or cells
- Make a map (or chart or timeline) in minutes
- Manage data via our site or an API

Fusion Table Example Gallery

Easy Data Upload: Attribution recorded

Easily Create Informative Maps

baby steps towards the dream platform

DEMO: Circle of Blue

Cloud-based Computing: Prediction

1. Upload: Upload your training data to Google Storage
2. Train: Build a model from your data
3. Predict: Make new predictions

Machine learning as a web service
Smart Apps for every developer

- RESTful HTTP service
- Simple integration

Prediction API

- Under the hood: many classifiers/regressors
- Recent research: efficient and theoretically principled methods for distributed learning (NIPS-09, HLT-10)
  - Network costs can be reduced by an order of magnitude with minimal loss in classifier accuracy

Under the API

Operations Research and Optimization

Operations Research Challenges

- Size: Optimization is often NP-complete
  - Increasing the size by 1 doubles the search space
  - The tools are barely keeping up with the problems
- Uncertainty: Data is often fuzzy
  - How do you route cars when there are roadblocks, new one-ways, traffic jams?
  - Can you use optimization in an on-line algorithm connected to users?
  - How well can you optimize against forecasted data? How do you react if the forecast is bad?
- User expectation and requirements
  - The definition of problems is also unclear: What is the objective? What is a good solution? Can I violate this requirement? By how much?
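The remark that adding one to the problem size doubles the search space can be seen directly by enumerating all assignments of n yes/no decision variables. This is a minimal sketch assuming binary decisions, which the slide does not spell out; `search_space` is an illustrative helper, not part of any Google tool.

```python
from itertools import product

def search_space(n):
    """All assignments of n binary (yes/no) decision variables."""
    return list(product([0, 1], repeat=n))

# Each extra variable doubles the number of candidate solutions: 2, 4, 8, 16, 32.
for n in range(1, 6):
    print(n, len(search_space(n)))
```

Exhaustive enumeration of this kind becomes impractical very quickly, which is why the slide treats reducing or guiding the search as the central difficulty.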

Operations Research Opportunities

- Machine Learning can help us in two ways
  - By providing guidance towards good solutions
  - By qualifying valid solutions
  - By reducing the search space
- Large computing resources means we can try a bit harder
- Crowd-Sourcing means better data, better feedback, better evaluations of algorithms and solutions
- Having all our code open-source means we can collaborate on building the best set of tools

See http://code.google.com/p/or-tools

Semantic Processing

Web Inference and Learning

- Goal: better understanding of Web content and user intent
- Method: algorithms that draw reliable semantic inferences from the wealth of evidence implicit in massive Web data
  - How to interpret this term in this context?
  - Does this sentence answer that question?
  - Will this user click on that ad?
- Learning: create concise representations to support good inferences

Meaning from the Web

Elementary semantic inference: what are the possible classes for each instance?

Combination wins

Combined graph: 14M nodes, 75M edges

Applications to Society Follow

Earth Engine

Motivation - Carbon Forest Tracking

UNEP Atlas of our Changing Environment

1975 1989 2001

Rondonia Brazil

A Sampling of Other Use Cases

- Disease Early Warning: Remote surveillance of disease and prediction of epidemics
- Population Census: Supplements traditional census and mapping in developing regions
- Humanitarian Crisis Mapping: Can detect and monitor a growing range of crisis types
- Water Resources: Monitor water quality and availability and alleviate water shortage problems
- Food Security: Famine early warning, rainfall and water requirements estimations, agricultural production estimates, and irrigation and fertilizer supply & demand
- Global Education Programs

Parallel Geo-Processing “in the cloud”

(a brief illustration)

1. Original image is divided into 256px sub-units
2. Sub-units are distributed to separate machines, where they can be processed in parallel
3. Thousands can be processed simultaneously
4. Result is reassembled into a finished image
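The split, process, and reassemble steps above can be sketched on a single machine. This is only an illustration of the pattern, not Earth Engine code: a thread pool stands in for the separate machines, the image is a plain 2-D list of 8-bit values, and `process` (a pixel inversion) is a hypothetical placeholder for a real per-pixel algorithm.

```python
from concurrent.futures import ThreadPoolExecutor

TILE = 256  # sub-unit edge length mentioned on the slide

def split(image, tile=TILE):
    """Yield (row, col, block) for each tile-sized block of a 2-D pixel list."""
    height, width = len(image), len(image[0])
    for r in range(0, height, tile):
        for c in range(0, width, tile):
            yield r, c, [row[c:c + tile] for row in image[r:r + tile]]

def process(block):
    """Hypothetical per-pixel operation: invert 8-bit values."""
    return [[255 - p for p in row] for row in block]

def run(image, tile=TILE, workers=4):
    """Process every tile in parallel, then write results back in place."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    tiles = list(split(image, tile))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(process, (block for _, _, block in tiles))
        for (r, c, _), block in zip(tiles, results):
            for i, row in enumerate(block):
                out[r + i][c:c + len(row)] = row
    return out

# A 500 x 600 synthetic image splits into 2 x 3 = 6 tiles (edge tiles are smaller).
image = [[(x + y) % 256 for x in range(600)] for y in range(500)]
result = run(image)
```

Because each tile is processed independently of its neighbours, the same structure scales from a few threads to thousands of machines, provided the operation is purely per-pixel as it is here.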

- Global-scale earth observation and informatics platform
  - For public benefit and to support emerging green economy
  - Help science come out of research lab and into operational use at scale
  - Unprecedented catalog of earth observation data for mining and analysis
  - Promote transparency, reproducibility, collaboration, “open science”
- Very fast computation of scientific map products
  - Intrinsically-parallel pixel processing system
  - Built-in Google algorithms as well as user-supplied
  - Earth Engine API for 3rd party algorithm development
  - Access control, versioning, provenance
  - Online and desktop versions (open source desktop version)
- On a lot of useful data
  - Every available Landsat and MODIS scene (more satellites coming)
  - Commercial datasets (very high resolution satellite imagery)
  - Environmental data (atmospheric, ocean, terrestrial)
  - User-supplied (ex.: in-situ data collected via Android phones)

Overview

Scale of Data

- US: Satellite imagery dataset: Landsat
  - Peru: 60 Landsat scenes (3 Gpix, 20 GB) per covering
  - World: 8000 Landsat scenes (2 TB) per covering
  - Complete global coverage every 16 days
  - Operating since 1972; historical archive holds ~4 PB
- US: NASA EOS approaching ~10 PB of Earth images
- Europe: European Space Agency (ESA)
  - ESA satellite missions: MERIS, Envisat, others
  - SpotImage (France): 20M SPOT images since 1986, 10,000 new images collected daily, 5+ PB archive
  - ESA: Launching Sentinel-1 in 2011

Representative Examples

- Envisat: Gulf Oil Spill, June 2010 (ESA)
- MERIS: Hurricane Isabel, Sept 2003 (ESA)
- Spot Image: Xingu, Brazilian Amazon

View on YouTube

Health (US)

Google Health: personal health dashboard

- Launched in May 2008
- Major update Sept 2010
- User controls and owns his/her data

A platform that…

- Provides a dashboard for wellness information & medical records
- Allows user to connect and interact with a broad group of “add on” services
- Includes a non-tethered PHR

Crisis Response

Person Finder

Pre-Earthquake - Aug 26, 2009

1 Day After Earthquake - Jan 13, 2010

13 Days After Earthquake - Jan 25, 2010

Digital Humanities

Illuminating the Humanities

Q: What can you do with 12 million books in over 400 languages, comprised of 5 billion pages and 2 trillion words, all digitized?

A: Look to the humanities for new questions

- How would you (re)define Victorian literature?
- What are the differences between the English and Latin editions of Hobbes’ Leviathan?
- How have places changed over the course of history?

Digital Humanities Awards

Research program supporting university research taking a computational approach to traditional humanist questions

- US program: Summer 2010
  - 12 projects, 23 researchers, 15 universities
- European program: Winter 2010
  - 10 projects planned

$1M total funding

Education

Curriculum Development

Seeding and supporting computing curriculum development

- Exploring computational thinking in K12 (launch late Oct)
- CS4HS: High school computer science (cs4hs.com)
- Undergraduate open source CS curriculum: Google Code University (code.google.com/edu)
- Lantern platform: Wiki for open source curriculum development (in collaboration with Khan Academy)

Talent Development: Google Summer of Code(TM)

- Program Genesis
  - “Flip bits, not burgers” during summer holidays
  - Exposure to real-world software development
- Students paired with mentor from open source community
  - Execute to milestones laid out in accepted application
  - Stipend allows students to concentrate on open source development
- 2010: 1,026 students, 150 organizations, 69 countries

Technology Leadership: App Inventor for Android

- Visual programming environment for Android mobile devices
- Helping people become creators (rather than consumers) of technology
- Launched in Google Labs July 12, 2010

http://appinventor.googlelabs.com/about

CS4HS European Workshops

- École Polytechnique Fédérale de Lausanne (EPFL), Switzerland: Building and Programming Robots
- ETH Zurich, Switzerland: ABZ Ausbildungs- und Beratungszentrum für Informatikunterricht
- Makerere University, Uganda: Grassroot approach to improve the quality of applicants of computing programs at Makerere University
- Manchester University, United Kingdom: Animation11
- Oslo University, Norway: TENK
- Queen Mary University, United Kingdom: cs4fn magazine
- RWTH Aachen, Germany: Bright Brains in Computer Science
- Sapienza University of Rome, Italy: Challenge and Fun with the CS Olympiads
- University of Stuttgart, Germany: UniS2010
- Technion, Israel: High School Computer Science Female Students' Visits in Google: Impressions, Conceptions, and Influences
- Trinity College Dublin, Ireland: Computer Programming Outreach B2C
- University of Cape Town, South Africa: Project Umonya
- University College Dublin, Ireland: CS Summer School
- University of Warsaw, Poland: Mastering Programming Skills Workshops for Teachers

Technology Leadership: Google Code University

- Course content on current computing technologies and paradigms
- CS Curriculum Search
- Tech Talks on CS Topics
- Tutorials, lecture slides, problem sets for a variety of topic areas
  - AJAX Programming
  - Algorithms
  - Distributed Systems
  - Web Security
  - Languages
  - Practical Skills (MySQL, Linux)

http://code.google.com/edu

Supporting our Academic Institutions

- Research Awards Programs - 230+ projects funded in the last year
  - Next due dates: August 15 (CS Awards), October 15 (Marketing Awards)
  - research-awards.google.com
  - Focused Grant Program
- Visiting Faculty Program - 20 faculty (ongoing)
  - university-relations.google.com
- PhD Fellowship Program
  - 2009: 13 students supported in North America
  - 2010: 15 in North America, 16 outside North America
  - Over 150 other scholarships
- ~1000 interns worldwide
- CS4HS: 1500+ teachers (~100,000 students), US & EMEA

Final Thoughts

- Scale of Communication and Computing is profound
- Endless opportunity for technical growth
  - Some large themes
  - Major new application domains
- Google rapidly to innovate in science/technology and value to consumers
- We are providing increased support for academic institutions in computer science and related areas

It's a most exciting area in which to innovate

Page 2: Rapid Advances in Computer Science and Opportunities for ... · Machine learning techniques applied to monitoring/controlling such systems Automatic, dynamic world-wide placement

Rapid Advances in Computer Science amp Opportunities for Society

Information and Communication Technologies have had a rapid impact on society and ndashamazinglymdashthe pace of innovation continues to accelerate This innovation is catalyzed by ever-increasing hardware and networking capabilities the growth in internet usage as well as important advances in basic and applied computer science In this talk I will describe some of the research that Google is undertaking (for example in machine translation semantic processing and information management) and discuss some of the likely beneficial impacts on our society ndash for example in science the humanities education philanthropic activities and more Irsquoll conclude my presentation with some interesting challenges from both a technology and policy point of view

Abstract

OutlineGoogleProdigiousnessAdvances in the Field examples

TranslationSpeechVisionCloud-based collaboration around structured-dataOperations ResearchSemantic Processing

Beneficial Societal Impacts examplesEarth EngineGoogle HealthOther Health EffortsCrisis ResponseDigital HumanitiesEducation

A Technical ThemesChallenges

Mission

Organizing the worldrsquos information andMaking it universally accessible and useful

Google and Commerce

Over 1 million AdWords advertisers worldwideOver 1 million AdSense publishers worldwideVia the Google Ad Network AdSense publishers reach over 80 of global internet users in 100 countries and 20 languagesYouTube is monetizing over a billion video views per week globallyIn 2009 Google generated $54 billion of economic activity for American businesses website publishers and non-profits

Prodigiousness

Giga 109 Tera 1012 Peta 1015 Exa 1018 Zetta1021

Publicized Bigtable of 70 petabytes 10M opssecWarehouse computing possibilities 100 x 10 x 20 x 20 x 40 = 16000000 nodeshellipSome representative numbers

Storage 1018 -gt 1020-21

Users 109 -gt 1010

Devices 10 -gt 1012

Network 1020 now -gt1021yr 32 KBsec for 1B peopleApps 105 -gt 106-7 or more

Eg embedded car systems 30-50 ECUs 100M lines of code

A variety of science engineering challenges

Focus on Innovation that Benefits our UsersFocus on Research and Engineering

Commitment to advancing technologyRich domain of work due to our missionGrand challenge problemsInternal consensus that production issues are often as challengingfun as pure inventionTechnical leverage1 Google Common Distributed System 2 A Focus on Services3 Empiricism and a Holistic Approach to Design

Our Innovation Culture

Focus on talentDistributed across the organization

Impacting Google necessitates broad diverse involvement in science and engineeringResearch is done both in our research team and in our engineering organization organized opportunistically

Teams benefit greatlyFrom mutual talentFrom Googlersquos comparative advantages to our scale and broad useFrom service-based architecture (ldquoeaserdquo of working in vivo)

Ideal Distributed Computing

Devices

Research Challenges in Ideal Distributed Computing

Alternative designs that would give better energy efficiency at lower utilizationServer OS design aimed at many highly-connected machines in one buildingUnifying abstractions for exploiting parallelism beyond inter-transaction parallelism and map-reduceLatency reductionA general model of replication including consistency choices explained and codifiedMachine learning techniques applied to monitoringcontrolling such systemsAutomatic dynamic world-wide placement of data amp computation to minimize latency andor cost given constraints onBuilding retrieval systems that efficiently and usably deal with ACLsHolistic models of privacyThe user interface to the userrsquos diverse processing and state

Totally Transparent Processing

D The set of all end-user access

devices

L The set of all human languages

M The set of all modalities

C The set of all corpora

Personal ComputersPhoneMedia PlayersReadersTelematicsSet-top BoxesAppliancesHealth deviceshellip

Current languagesHistorical languagesOther forms of human notationPossible language specializationFormal languageshellip

TextImageAudioVideoGraphicsOther sensor-based datahellip

The normal webThe deep webPeriodicalsBooksCatalogsBlogsGeodataScientific datasetsHealth datahellip

For all d in D all l in L all m in M and all c in C

Totally Transparent Processing

ldquoHybridrdquo Intelligence

To extend the capability of people not in isolationAggregation of empirical signal is exceedingly valuableEx

Feedback in Information Retrieval eg in ranking or spelling correctionMachine learning eg image content analysis speech recognition with semi-supervised learning

Research Challenges in Transparent Computing amp Hybrid Intelligence

Endless applications with very new user interface implicationsAddressing limits to dataTechniques to integrate user-feedback in acceptable fashionsApproaches to new signalExplanation scale and variance minimization in machine learningInformation fusionlearning across diverse signals ndash The Combination Hypothesis more generallyUsability devices and subpopulationsPrivacy

Domains of Application

Search enginesTranslationSpeech recognitionVision

Remedial EducationPersonal healthEpidemiologyEconomic predictionSocietalenvironmental optimizationSocial Networking in ever more cleveruseful ways Humanities and Social SciencesMulti-player gaming

Translation

Machine Translation Google

Statistical Machine TranslationModel translation process with a statistical modelLearning from data monolingual amp bilingual

More data better translation qualityComputationally expensive approach

Models have many hundreds of Gigabyte of data(Moores law helps here)

Applying syntax information as a signal

ResultsMuch better translation qualityOngoing progress

More research groups 58 languages (so far)

recently Haitian Creole Urdu Georgian Latin

Grand Challenges

Morphologytranslating into morphologically rich languageseg Russian Hungarianneed morphology-aware translation models

Reliabilitysome translation mistakes more severe than others

hotel - MontrealHeath Ledger - Tom Cruise

Research How to detect crazy translations

Long-distance reorderingsimple case SVO SOV(one) approach parse source amp reorder

issue parsing accuracy for out-of-domain texts

Finding all Training Data

How about Poetry

Paper at EMNLP 2010 conferenceldquoPoeticrdquo Statistical Machine Translation Rhyme and Meter DGenzel JUszkoreit FOch EMNLP 2010

ApproachEnforce meter and rhyme as extra constraints(similar to language model)Eg iambic pentameter stress pattern 0101010101Produce most probable translation that obeys constraints(Function follows form)

Example output (couplet in amphibrachic tetrameterAn officer stated that three were arrestedand that the equipment is currently tested

Speech

Goals for Speech Technology at Google

Much of the worldrsquos information is spoken ndash we need to recognize it before we can organize it

YouTube transcription and translation (breaking the language barrier for YouTube access)

Voicemail transcription Mobile is the fastest growing and most widespread platform for communication and services that has ever existed

Spoken input and output is key to usability

Our goal is completely ubiquitous availability of speech io (every applicationservice every usage scenario every language)

How do we get thereDelivery from the cloud ndash support constant iteration and refinement

Operating at large scale ndash train huge statistical models on huge amounts of data

Learning from use - without human transcription

ChallengesHow do we grow the model to take advantage of the data (richer models of accent speaker noise etc)Huge computational demandsInfrastructure demands ndash parallelization ndash leverage Google software environment

Training Acoustic Models wUnsupervised Learning

Supervised vs unsupervised training - hours of

data vs error rate

Vision

Computer Vision

Advance state-of-the art in 3 key areas of imageaudiovideo analysis and apply results to our multimedia products

Semantic Interpretation Generate human understandable description of content (eg auto-tagging videos on YouTube Image annotation porn classification etc)Matching Find similar entities from a large corpus (eg find similar on image search video fingerprinting for YouTube etc )Synthesis Generate better imagesvideo by understanding the statistics of a large corpus of images (eg better facades in 3D building on Google Earth automatic shadow removal from areal images etc)

Semantic Interpretation sample problem - Video Annotation

Video metadata has a cognitive cost on the user because they have to type it in be careful about what keywords they use and in general try to make their video searchableMany uploaders donrsquot have the motivation or energy to provide proper metadataNoisy metadata hurts everyone ndash spam misspellings 1337 acronyms etc

Cloud-based ComputingStructured Data

Structured Data on the Web

Discovery and search for structured dataThe deep Web -- significant gap in coverageStructured tables on the Web -- not leveraged in search

Enable easy creation management sharing and publishing of structured data

Fusion Tables wwwgooglecomfusiontables

Google Fusion Tables host manage collaborate on visualize and publish data tables online

What can I do with Fusion Tables

Host data online - and stay in controlcontrol can be at the level of columns or rows

Re-use data without making copies

Collaborate on the detailsMerge data from multiple tablesComment on individual rows columns or cells

Make a map (or chart or timeline) in minutes

Manage data via our site or an API

Fusion Table Example Gallery

Easy Data Upload Attribution recorded

Easily Create Informative Maps

Easily Create Informative Maps

baby steps towards the dream platform

DEMOcircle of blue

Cloud-based ComputingPrediction

1 Upload

2 Train

Upload your training data toGoogle Storage

Build a model from your data

Make new predictions3 Predict

Machine learning as a web serviceSmart Apps for every developer

- RESTful HTTP service- Simple integration

Prediction API

Under the hood many classifiersregressorsRecent research efficient and theoretically principled methods for distributed learning (NIPS-09 HLT-10)

Network costs can be reduced by an order of magnitude with minimal loss in classifier accuracy

Under the API

Operations Reseach and Optimization

Operations Research ChallengesSize Optimization is often NP Complete

Increasing the size by 1 doubles the search spaceThe tools are barely keeping up with the problems

Uncertainty Data is often fuzzy How do you route cars when there are roadblocks new one-ways traffic jamsCan you use optimization in on-line algorithm connected to usersHow well can you optimize against forecasted data how do you react if the forecast is bad

User expectation and requirementsThe definition of problems is also unclear What is the objective What is a good solution Can I violate this requirement By how much

Operations Research Opportunities

Machine Learning can help us in two waysBy providing guidance towards good solutionsBy qualifying valid solutionsBy reducing the search space

Large computing resources means we can try a bit harderCrowd-Sourcing means better data better feedback better evaluations of algorithms and solutionsHaving all our code open-source means we can collaborate on building the best set of tools

See httpcodegooglecompor-tools

Semantic Processing

Web Inference and Learning

Goal: better understanding of Web content and user intent
Method: algorithms that draw reliable semantic inferences from the wealth of evidence implicit in massive Web data
- How to interpret this term in this context?
- Does this sentence answer that question?
- Will this user click on that ad?

Learning: create concise representations to support good inferences

Meaning from the Web

Elementary semantic inference: what are the possible classes for each instance?

Combination wins

Combined graph: 14M nodes, 75M edges

Applications to Society Follow

Earth Engine

Motivation - Carbon Forest Tracking

[Figure: UNEP Atlas of our Changing Environment; Rondonia, Brazil, in 1975, 1989, and 2001]

A Sampling of Other Use Cases

Disease Early Warning: Remote surveillance of disease and prediction of epidemics

Population Census: Supplements traditional census and mapping in developing regions

Humanitarian Crisis Mapping: Can detect and monitor a growing range of crisis types

Water Resources: Monitor water quality and availability, and alleviate water shortage problems

Food Security: Famine early warning, rainfall and water requirements estimations, agricultural production estimates, and irrigation and fertilizer supply & demand

Global Education Programs

Parallel Geo-Processing "in the cloud"

(a brief illustration)

1. Original image is divided into 256px sub-units
2. Sub-units are distributed to separate machines, where they can be processed in parallel
3. Thousands can be processed simultaneously
4. Result is reassembled into a finished image
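The divide, distribute, and reassemble flow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Earth Engine's implementation: the image is a plain list of pixel rows, a local thread pool stands in for the separate machines, and `process_tile` (an 8-bit inversion) is a made-up per-pixel operation. All function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

Image = List[List[int]]
TILE = 256  # sub-unit edge length in pixels, as on the slide


def split_into_tiles(image: Image, tile: int = TILE) -> Dict[Tuple[int, int], Image]:
    """Cut the image into tile x tile sub-units keyed by their top-left corner."""
    tiles = {}
    for top in range(0, len(image), tile):
        for left in range(0, len(image[0]), tile):
            tiles[(top, left)] = [row[left:left + tile] for row in image[top:top + tile]]
    return tiles


def process_tile(tile: Image) -> Image:
    """Stand-in per-pixel operation: invert 8-bit values."""
    return [[255 - pixel for pixel in row] for row in tile]


def reassemble(tiles: Dict[Tuple[int, int], Image], height: int, width: int) -> Image:
    """Paste processed sub-units back at their original positions."""
    out = [[0] * width for _ in range(height)]
    for (top, left), tile in tiles.items():
        for dy, row in enumerate(tile):
            out[top + dy][left:left + len(row)] = row
    return out


def process_image(image: Image, worker: Callable[[Image], Image] = process_tile) -> Image:
    tiles = split_into_tiles(image)
    # Threads stand in for separate machines; map() returns results in input order.
    with ThreadPoolExecutor() as pool:
        processed = dict(zip(tiles.keys(), pool.map(worker, tiles.values())))
    return reassemble(processed, len(image), len(image[0]))


# Illustrative 300 x 520 test image whose size is not a multiple of 256.
image = [[(x + y) % 256 for x in range(520)] for y in range(300)]
result = process_image(image)
```

Because each sub-unit is processed independently of its neighbours, the same structure works whether the workers are threads, processes, or remote machines. Operations that need neighbouring pixels would require overlapping tiles, which this sketch does not handle.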

Global-scale earth observation and informatics platform
- For public benefit and to support emerging green economy
- Help science come out of research lab and into operational use at scale
- Unprecedented catalog of earth observation data for mining and analysis
- Promote transparency, reproducibility, collaboration, "open science"

Very fast computation of scientific map products
- Intrinsically-parallel pixel processing system
- Built-in Google algorithms as well as user-supplied
- Earth Engine API for 3rd party algorithm development
- Access control, versioning, provenance
- Online and desktop versions (open source desktop version)

On a lot of useful data
- Every available Landsat and MODIS scene (more satellites coming)
- Commercial datasets (very high resolution satellite imagery)
- Environmental data (atmospheric, ocean, terrestrial)
- User-supplied (e.g., in-situ data collected via Android phones)

Overview

Scale of Data

US: Satellite imagery dataset: Landsat
- Peru: 60 Landsat scenes (3 Gpix, 20 GB) per covering
- World: 8000 Landsat scenes (2 TB) per covering
- Complete global coverage every 16 days
- Operating since 1972; historical archive holds ~4 PB
US: NASA EOS approaching ~10 PB of Earth images

Europe: European Space Agency (ESA)
- ESA satellite missions: MERIS, Envisat, others
- SpotImage (France): 20M SPOT images since 1986; 10000 new images collected daily; 5+ PB archive
- ESA: Launching Sentinel-1 in 2011

Representative Examples

- Envisat: Gulf Oil Spill, June 2010 (ESA)
- MERIS: Hurricane Isabel, Sept 2003 (ESA)
- Spot Image: Xingu, Brazilian Amazon

View on YouTube

Health (US)

Google Health: personal health dashboard

Launched in May 2008

Major update Sept 2010

User controls and owns his/her data

A platform that...
- Provides a dashboard for wellness information & medical records
- Allows user to connect and interact with a broad group of "add on" services
- Includes a non-tethered PHR

Crisis Response

Person Finder

Pre-Earthquake - Aug 26 2009

1 Day After Earthquake - Jan 13 2010

13 Days After Earthquake - Jan 25 2010

Digital Humanities

Illuminating the Humanities

Q: What can you do with 12 million books in over 400 languages, comprising 5 billion pages and 2 trillion words, all digitized?

A: Look to the humanities for new questions
- How would you (re)define Victorian literature?
- What are the differences between the English and Latin editions of Hobbes' Leviathan?
- How have places changed over the course of history?

Digital Humanities Awards

Research program supporting university research taking a computational approach to traditional humanist questions

US program: Summer 2010
- 12 projects, 23 researchers, 15 universities

European program: Winter 2010
- 10 projects planned, $1M total funding

Education

Curriculum Development

Seeding and supporting computing curriculum development
- Exploring computational thinking in K12 (launch late Oct)
- CS4HS: High school computer science (cs4hs.com)
- Undergraduate open source CS curriculum: Google Code University (code.google.com/edu)
- Lantern platform: Wiki for open source curriculum development (in collaboration with Khan Academy)

Talent Development: Google Summer of Code(TM)

Program Genesis
- "Flip bits, not burgers" during summer holidays
- Exposure to real-world software development

Students paired with mentor from OS community
- Execute to milestones laid out in accepted application
- Stipend allows students to concentrate on OS development

2010: 1026 students, 150 organizations, 69 countries

Technology Leadership: App Inventor for Android

- Visual programming environment for Android mobile devices
- Helping people become creators (rather than consumers) of technology
- Launched in Google Labs July 12, 2010

http://appinventor.googlelabs.com/about

CS4HS European Workshops

- École Polytechnique Fédérale de Lausanne (EPFL), Switzerland: Building and Programming Robots
- ETH Zurich, Switzerland: ABZ, Ausbildungs- und Beratungszentrum für Informatikunterricht
- Makerere University, Uganda: Grassroot approach to improve the quality of applicants of computing programs at Makerere University
- Manchester University, United Kingdom: Animation11
- Oslo University, Norway: TENK
- Queen Mary University, United Kingdom: cs4fn magazine
- RWTH Aachen, Germany: Bright Brains in Computer Science
- Sapienza University of Rome, Italy: Challenge and Fun with the CS Olympiads
- University of Stuttgart, Germany: UniS2010
- Technion, Israel: High School Computer Science Female Students Visits in Google: Impressions, Conceptions and Influences
- Trinity College Dublin, Ireland: Computer Programming Outreach, B2C
- University of Cape Town, South Africa: Project Umonya
- University College Dublin, Ireland: CS Summer School
- University of Warsaw, Poland: Mastering Programming Skills, Workshops for Teachers

Technology Leadership: Google Code University

Course content on current computing technologies and paradigms
- CS Curriculum Search
- Tech Talks on CS Topics
- Tutorials, lecture slides, problem sets for a variety of topic areas

Topic areas
- AJAX Programming
- Algorithms
- Distributed Systems
- Web Security
- Languages
- Practical Skills (MySQL, Linux)

http://code.google.com/edu

Supporting our Academic Institutions

Research Awards Programs - 230+ projects funded in the last year
- Next due dates: August 15 (CS Awards), October 15 (Marketing Awards)
- Research-awardsgooglecom
- Focused Grant Program

Visiting Faculty Program - 20 faculty (ongoing)
- University-relationsgooglecom

PhD Fellowship Program
- 2009: 13 students supported in North America
- 2010: 15 in North America, 16 outside North America
- Over 150 other scholarships

~1000 interns worldwide
CS4HS: 1500+ teachers (~100000 students), US & EMEA

Final Thoughts

Scale of Communication and Computing is profound
- Endless opportunity for technical growth

Some large themes
- Major new application domains

Google rapidly to innovate in science/technology and value to consumers
We are providing increased support for academic institutions in computer science and related areas

It's a most exciting area in which to innovate


Google and Commerce

- Over 1 million AdWords advertisers worldwide
- Over 1 million AdSense publishers worldwide
- Via the Google Ad Network, AdSense publishers reach over 80% of global internet users in 100 countries and 20 languages
- YouTube is monetizing over a billion video views per week globally
- In 2009, Google generated $54 billion of economic activity for American businesses, website publishers, and non-profits

Prodigiousness

Giga 10^9, Tera 10^12, Peta 10^15, Exa 10^18, Zetta 10^21

Publicized Bigtable of 70 petabytes, 10M ops/sec
Warehouse computing possibilities: 100 x 10 x 20 x 20 x 40 = 16,000,000 nodes...

Some representative numbers
- Storage: 10^18 -> 10^20 to 10^21
- Users: 10^9 -> 10^10
- Devices: 10 -> 10^12
- Network: 10^20 now -> 10^21/yr (32 KB/sec for 1B people)
- Apps: 10^5 -> 10^6 to 10^7 or more
- E.g., embedded car systems: 30-50 ECUs, 100M lines of code
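Two of the figures above can be checked with simple arithmetic. The sketch below assumes 1 KB = 1000 bytes and a 365-day year, and reads the network figure as 32 KB/sec sustained for each of one billion people. The slides do not say what the five warehouse factors stand for, so the code only multiplies them. The function and constant names are hypothetical.

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000; assumes a 365-day year


def yearly_network_bytes(bytes_per_second_per_person: int, people: int) -> int:
    """Total bytes moved in a year if every person sustains the given rate."""
    return bytes_per_second_per_person * people * SECONDS_PER_YEAR


# 32 KB/sec (taken as 32,000 bytes/sec) for 1B people.
traffic = yearly_network_bytes(32_000, 1_000_000_000)

# Product of the five factors quoted for warehouse computing.
nodes = math.prod([100, 10, 20, 20, 40])

print(f"{traffic:.2e} bytes/yr, {nodes:,} nodes")
```

Under these assumptions the traffic comes out at roughly 10^21 bytes per year, about one zettabyte, which matches the order of magnitude on the slide.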

A variety of science engineering challenges

Focus on Innovation that Benefits our Users
Focus on Research and Engineering

- Commitment to advancing technology
- Rich domain of work due to our mission
- Grand challenge problems
- Internal consensus that production issues are often as challenging/fun as pure invention

Technical leverage
1. Google Common Distributed System
2. A Focus on Services
3. Empiricism and a Holistic Approach to Design

Our Innovation Culture

Focus on talent
Distributed across the organization
- Impacting Google necessitates broad, diverse involvement in science and engineering
- Research is done both in our research team and in our engineering organization, organized opportunistically

Teams benefit greatly
- From mutual talent
- From Google's comparative advantages due to our scale and broad use
- From service-based architecture ("ease" of working in vivo)

Ideal Distributed Computing

Devices

Research Challenges in Ideal Distributed Computing

- Alternative designs that would give better energy efficiency at lower utilization
- Server OS design aimed at many highly-connected machines in one building
- Unifying abstractions for exploiting parallelism beyond inter-transaction parallelism and map-reduce
- Latency reduction
- A general model of replication, including consistency choices, explained and codified
- Machine learning techniques applied to monitoring/controlling such systems
- Automatic, dynamic world-wide placement of data & computation to minimize latency and/or cost, given constraints on
- Building retrieval systems that efficiently and usably deal with ACLs
- Holistic models of privacy
- The user interface to the user's diverse processing and state

Totally Transparent Processing

D: The set of all end-user access devices
- Personal Computers, Phone, Media Players, Readers, Telematics, Set-top Boxes, Appliances, Health devices, ...

L: The set of all human languages
- Current languages, Historical languages, Other forms of human notation, Possible language specialization, Formal languages, ...

M: The set of all modalities
- Text, Image, Audio, Video, Graphics, Other sensor-based data, ...

C: The set of all corpora
- The normal web, The deep web, Periodicals, Books, Catalogs, Blogs, Geodata, Scientific datasets, Health data, ...

For all d in D, all l in L, all m in M, and all c in C

"Hybrid" Intelligence

To extend the capability of people, not in isolation
Aggregation of empirical signal is exceedingly valuable
Ex:
- Feedback in Information Retrieval, e.g., in ranking or spelling correction
- Machine learning, e.g., image content analysis, speech recognition with semi-supervised learning

Research Challenges in Transparent Computing & Hybrid Intelligence

- Endless applications with very new user interface implications
- Addressing limits to data
- Techniques to integrate user-feedback in acceptable fashions
- Approaches to new signal
- Explanation, scale, and variance minimization in machine learning
- Information fusion/learning across diverse signals - The Combination Hypothesis, more generally
- Usability devices and subpopulations
- Privacy

Domains of Application

- Search engines
- Translation
- Speech recognition
- Vision
- Remedial Education
- Personal health
- Epidemiology
- Economic prediction
- Societal/environmental optimization
- Social Networking in ever more clever/useful ways
- Humanities and Social Sciences
- Multi-player gaming

Translation

Machine Translation at Google

Statistical Machine Translation
- Model translation process with a statistical model
- Learning from data: monolingual & bilingual

More data, better translation quality
Computationally expensive approach
- Models have many hundreds of gigabytes of data
- (Moore's law helps here)

Applying syntax information as a signal

Results
- Much better translation quality
- Ongoing progress
- More research groups

58 languages (so far)
- recently: Haitian Creole, Urdu, Georgian, Latin

Grand Challenges

Morphology
- translating into morphologically rich languages, e.g., Russian, Hungarian
- need morphology-aware translation models

Reliability
- some translation mistakes more severe than others
- hotel - Montreal
- Heath Ledger - Tom Cruise
- Research: How to detect crazy translations?

Long-distance reordering
- simple case: SVO, SOV
- (one) approach: parse source & reorder
- issue: parsing accuracy for out-of-domain texts

Finding all Training Data

How about Poetry?

Paper at EMNLP 2010 conference: "Poetic" Statistical Machine Translation: Rhyme and Meter. D. Genzel, J. Uszkoreit, F. Och. EMNLP 2010

Approach
- Enforce meter and rhyme as extra constraints (similar to language model)
- E.g., iambic pentameter: stress pattern 0101010101
- Produce most probable translation that obeys constraints
- (Function follows form)

Example output (couplet in amphibrachic tetrameter):
An officer stated that three were arrested
and that the equipment is currently tested
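The idea of producing the most probable translation that obeys a meter constraint can be illustrated with a toy filter. This is a deliberate simplification and not the method of the cited paper, which may integrate the constraints into translation differently. Here a fixed list of candidates is filtered after the fact. The first candidate, both scores, and all stress annotations (0 = unstressed, 1 = stressed) are hand-written assumptions, and `best_metrical_translation` is a hypothetical name.

```python
from typing import List, Optional, Tuple

IAMBIC_PENTAMETER = "0101010101"      # stress pattern quoted on the slide
AMPHIBRACHIC_TETRAMETER = "010" * 4   # four unstressed-stressed-unstressed feet

# (text, log-probability, stress pattern per syllable)
Candidate = Tuple[str, float, str]


def best_metrical_translation(candidates: List[Candidate], meter: str) -> Optional[str]:
    """Return the most probable candidate whose stress pattern equals the meter."""
    obeying = [c for c in candidates if c[2] == meter]
    if not obeying:
        return None  # no candidate satisfies the constraint
    return max(obeying, key=lambda c: c[1])[0]


# Hypothetical candidates: the first scores higher but does not fit the meter.
candidates = [
    ("An officer said three people were arrested", -4.1, "01001110010"),
    ("An officer stated that three were arrested", -5.3, "010010010010"),
]

chosen = best_metrical_translation(candidates, AMPHIBRACHIC_TETRAMETER)
print(chosen)
```

The filter prefers a less probable candidate when the more probable one breaks the meter, which is the "function follows form" trade-off named on the slide.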

Speech

Goals for Speech Technology at Google

Much of the world's information is spoken - we need to recognize it before we can organize it
- YouTube transcription and translation (breaking the language barrier for YouTube access)
- Voicemail transcription

Mobile is the fastest growing and most widespread platform for communication and services that has ever existed
- Spoken input and output is key to usability

Our goal is completely ubiquitous availability of speech I/O (every application/service, every usage scenario, every language)

How do we get there?
- Delivery from the cloud - support constant iteration and refinement
- Operating at large scale - train huge statistical models on huge amounts of data
- Learning from use - without human transcription

Challenges
- How do we grow the model to take advantage of the data (richer models of accent, speaker, noise, etc.)?
- Huge computational demands
- Infrastructure demands - parallelization - leverage Google software environment

Training Acoustic Models w/ Unsupervised Learning

[Figure: supervised vs. unsupervised training; hours of data vs. error rate]

Vision

Computer Vision

Advance state-of-the-art in 3 key areas of image/audio/video analysis and apply results to our multimedia products

- Semantic Interpretation: Generate human-understandable description of content (e.g., auto-tagging videos on YouTube, image annotation, porn classification, etc.)
- Matching: Find similar entities from a large corpus (e.g., find similar on image search, video fingerprinting for YouTube, etc.)
- Synthesis: Generate better images/video by understanding the statistics of a large corpus of images (e.g., better facades in 3D building on Google Earth, automatic shadow removal from aerial images, etc.)

Semantic Interpretation: sample problem - Video Annotation

- Video metadata has a cognitive cost on the user because they have to type it in, be careful about what keywords they use, and in general try to make their video searchable
- Many uploaders don't have the motivation or energy to provide proper metadata
- Noisy metadata hurts everyone - spam, misspellings, 1337, acronyms, etc.

Cloud-based Computing: Structured Data

Structured Data on the Web

Discovery and search for structured data
- The deep Web -- significant gap in coverage
- Structured tables on the Web -- not leveraged in search

Enable easy creation, management, sharing, and publishing of structured data

Fusion Tables: www.google.com/fusiontables

Google Fusion Tables: host, manage, collaborate on, visualize, and publish data tables online

What can I do with Fusion Tables?

Host data online - and stay in control
- control can be at the level of columns or rows

Re-use data without making copies

Collaborate on the details
- Merge data from multiple tables
- Comment on individual rows, columns, or cells

Make a map (or chart or timeline) in minutes

Manage data via our site or an API

Fusion Table Example Gallery

Easy Data Upload: Attribution recorded

Easily Create Informative Maps

baby steps towards the dream platform

DEMO: circle of blue

Cloud-based Computing: Prediction

1. Upload: Upload your training data to Google Storage
2. Train: Build a model from your data
3. Predict: Make new predictions


Page 4: Rapid Advances in Computer Science and Opportunities for ... · Machine learning techniques applied to monitoring/controlling such systems Automatic, dynamic world-wide placement

Mission

Organizing the worldrsquos information andMaking it universally accessible and useful

Google and Commerce

Over 1 million AdWords advertisers worldwideOver 1 million AdSense publishers worldwideVia the Google Ad Network AdSense publishers reach over 80 of global internet users in 100 countries and 20 languagesYouTube is monetizing over a billion video views per week globallyIn 2009 Google generated $54 billion of economic activity for American businesses website publishers and non-profits

Prodigiousness

Giga 109 Tera 1012 Peta 1015 Exa 1018 Zetta1021

Publicized Bigtable of 70 petabytes 10M opssecWarehouse computing possibilities 100 x 10 x 20 x 20 x 40 = 16000000 nodeshellipSome representative numbers

Storage 1018 -gt 1020-21

Users 109 -gt 1010

Devices 10 -gt 1012

Network 1020 now -gt1021yr 32 KBsec for 1B peopleApps 105 -gt 106-7 or more

Eg embedded car systems 30-50 ECUs 100M lines of code

A variety of science engineering challenges

Focus on Innovation that Benefits our UsersFocus on Research and Engineering

Commitment to advancing technologyRich domain of work due to our missionGrand challenge problemsInternal consensus that production issues are often as challengingfun as pure inventionTechnical leverage1 Google Common Distributed System 2 A Focus on Services3 Empiricism and a Holistic Approach to Design

Our Innovation Culture

Focus on talentDistributed across the organization

Impacting Google necessitates broad diverse involvement in science and engineeringResearch is done both in our research team and in our engineering organization organized opportunistically

Teams benefit greatlyFrom mutual talentFrom Googlersquos comparative advantages to our scale and broad useFrom service-based architecture (ldquoeaserdquo of working in vivo)

Ideal Distributed Computing

Devices

Research Challenges in Ideal Distributed Computing

Alternative designs that would give better energy efficiency at lower utilizationServer OS design aimed at many highly-connected machines in one buildingUnifying abstractions for exploiting parallelism beyond inter-transaction parallelism and map-reduceLatency reductionA general model of replication including consistency choices explained and codifiedMachine learning techniques applied to monitoringcontrolling such systemsAutomatic dynamic world-wide placement of data amp computation to minimize latency andor cost given constraints onBuilding retrieval systems that efficiently and usably deal with ACLsHolistic models of privacyThe user interface to the userrsquos diverse processing and state

Totally Transparent Processing

D The set of all end-user access

devices

L The set of all human languages

M The set of all modalities

C The set of all corpora

Personal ComputersPhoneMedia PlayersReadersTelematicsSet-top BoxesAppliancesHealth deviceshellip

Current languagesHistorical languagesOther forms of human notationPossible language specializationFormal languageshellip

TextImageAudioVideoGraphicsOther sensor-based datahellip

The normal webThe deep webPeriodicalsBooksCatalogsBlogsGeodataScientific datasetsHealth datahellip

For all d in D all l in L all m in M and all c in C

Totally Transparent Processing

ldquoHybridrdquo Intelligence

To extend the capability of people not in isolationAggregation of empirical signal is exceedingly valuableEx

Feedback in Information Retrieval eg in ranking or spelling correctionMachine learning eg image content analysis speech recognition with semi-supervised learning

Research Challenges in Transparent Computing amp Hybrid Intelligence

Endless applications with very new user interface implicationsAddressing limits to dataTechniques to integrate user-feedback in acceptable fashionsApproaches to new signalExplanation scale and variance minimization in machine learningInformation fusionlearning across diverse signals ndash The Combination Hypothesis more generallyUsability devices and subpopulationsPrivacy

Domains of Application

Search enginesTranslationSpeech recognitionVision

Remedial EducationPersonal healthEpidemiologyEconomic predictionSocietalenvironmental optimizationSocial Networking in ever more cleveruseful ways Humanities and Social SciencesMulti-player gaming

Translation

Machine Translation Google

Statistical Machine TranslationModel translation process with a statistical modelLearning from data monolingual amp bilingual

More data better translation qualityComputationally expensive approach

Models have many hundreds of Gigabyte of data(Moores law helps here)

Applying syntax information as a signal

ResultsMuch better translation qualityOngoing progress

More research groups 58 languages (so far)

recently Haitian Creole Urdu Georgian Latin

Grand Challenges

Morphologytranslating into morphologically rich languageseg Russian Hungarianneed morphology-aware translation models

Reliabilitysome translation mistakes more severe than others

hotel - MontrealHeath Ledger - Tom Cruise

Research How to detect crazy translations

Long-distance reorderingsimple case SVO SOV(one) approach parse source amp reorder

issue parsing accuracy for out-of-domain texts

Finding all Training Data

How about Poetry

Paper at EMNLP 2010 conferenceldquoPoeticrdquo Statistical Machine Translation Rhyme and Meter DGenzel JUszkoreit FOch EMNLP 2010

ApproachEnforce meter and rhyme as extra constraints(similar to language model)Eg iambic pentameter stress pattern 0101010101Produce most probable translation that obeys constraints(Function follows form)

Example output (couplet in amphibrachic tetrameterAn officer stated that three were arrestedand that the equipment is currently tested

Speech

Goals for Speech Technology at Google

Much of the world's information is spoken - we need to recognize it before we can organize it

YouTube transcription and translation (breaking the language barrier for YouTube access)

Voicemail transcription

Mobile is the fastest growing and most widespread platform for communication and services that has ever existed

Spoken input and output is key to usability

Our goal is completely ubiquitous availability of speech I/O (every application/service, every usage scenario, every language)

How do we get there?
- Delivery from the cloud - support constant iteration and refinement
- Operating at large scale - train huge statistical models on huge amounts of data
- Learning from use - without human transcription

Challenges
- How do we grow the model to take advantage of the data (richer models of accent, speaker, noise, etc.)?
- Huge computational demands
- Infrastructure demands - parallelization - leverage Google software environment

Training Acoustic Models w/ Unsupervised Learning

[Figure: supervised vs. unsupervised training, hours of data vs. error rate]

Vision

Computer Vision

Advance the state of the art in 3 key areas of image/audio/video analysis and apply results to our multimedia products
- Semantic Interpretation: Generate human-understandable description of content (e.g. auto-tagging videos on YouTube, image annotation, porn classification, etc.)
- Matching: Find similar entities from a large corpus (e.g. find similar on image search, video fingerprinting for YouTube, etc.)
- Synthesis: Generate better images/video by understanding the statistics of a large corpus of images (e.g. better facades in 3D buildings on Google Earth, automatic shadow removal from aerial images, etc.)

Semantic Interpretation: sample problem - Video Annotation
- Video metadata has a cognitive cost on the user because they have to type it in, be careful about what keywords they use, and in general try to make their video searchable
- Many uploaders don't have the motivation or energy to provide proper metadata
- Noisy metadata hurts everyone - spam, misspellings, 1337, acronyms, etc.

Cloud-based Computing: Structured Data

Structured Data on the Web

Discovery and search for structured data
- The deep Web -- significant gap in coverage
- Structured tables on the Web -- not leveraged in search

Enable easy creation, management, sharing and publishing of structured data

Fusion Tables: www.google.com/fusiontables

Google Fusion Tables: host, manage, collaborate on, visualize, and publish data tables online

What can I do with Fusion Tables?

Host data online - and stay in control
- Control can be at the level of columns or rows

Re-use data without making copies

Collaborate on the details
- Merge data from multiple tables
- Comment on individual rows, columns or cells

Make a map (or chart or timeline) in minutes

Manage data via our site or an API

Fusion Table Example Gallery

Easy Data Upload, Attribution recorded

Easily Create Informative Maps

Easily Create Informative Maps

baby steps towards the dream platform

DEMO: circle of blue

Cloud-based Computing: Prediction

1. Upload: Upload your training data to Google Storage
2. Train: Build a model from your data
3. Predict: Make new predictions

Machine learning as a web service
Smart Apps for every developer

- RESTful HTTP service
- Simple integration
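
The upload / train / predict workflow can be sketched locally. The snippet below does not call the actual web service and does not reflect its request format; `ToyTextModel` and its methods are hypothetical names, and the word-overlap scoring is a deliberately simple stand-in for whatever classifiers the service runs.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


class ToyTextModel:
    """Local stand-in for a hosted model: word counts per label."""

    def __init__(self) -> None:
        self.word_counts: Dict[str, Counter] = defaultdict(Counter)

    def train(self, rows: List[Tuple[str, str]]) -> None:
        """Step 2: build a model from (label, text) training rows."""
        for label, text in rows:
            self.word_counts[label].update(text.lower().split())

    def predict(self, text: str) -> str:
        """Step 3: pick the label whose training words overlap most."""
        words = text.lower().split()
        scores = {label: sum(counts[w] for w in words)
                  for label, counts in self.word_counts.items()}
        return max(scores, key=scores.get)


# Step 1 would upload rows like these; here they simply stay in memory.
training_rows = [
    ("english", "the quick brown fox"),
    ("english", "the cat sat on the mat"),
    ("spanish", "el gato come pescado"),
    ("spanish", "el perro come carne"),
]
model = ToyTextModel()
model.train(training_rows)
print(model.predict("the dog sat"))
print(model.predict("el gato"))
```

In the hosted setting each of the three steps would be an HTTP request, which keeps the client code about this small.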

Prediction API

- Under the hood: many classifiers/regressors
- Recent research: efficient and theoretically principled methods for distributed learning (NIPS-09, HLT-10)

Network costs can be reduced by an order of magnitude with minimal loss in classifier accuracy

Under the API

Operations Research and Optimization

Operations Research Challenges

Size: Optimization is often NP-complete
- Increasing the size by 1 doubles the search space
- The tools are barely keeping up with the problems

Uncertainty: Data is often fuzzy
- How do you route cars when there are roadblocks, new one-ways, traffic jams?
- Can you use optimization in an on-line algorithm connected to users?
- How well can you optimize against forecasted data? How do you react if the forecast is bad?

User expectation and requirements: The definition of problems is also unclear
- What is the objective?
- What is a good solution?
- Can I violate this requirement? By how much?
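
The remark that increasing the size by 1 doubles the search space can be checked by brute force. This sketch assumes the simplest possible problem shape, one include/exclude decision per item; real problems have richer structure, but the growth is at least this fast.

```python
from itertools import product


def search_space_size(n_items: int) -> int:
    """Count every include/exclude assignment for n_items by enumeration."""
    return sum(1 for _ in product((0, 1), repeat=n_items))


sizes = [search_space_size(n) for n in range(1, 11)]
for n, size in enumerate(sizes, start=1):
    print(n, size)
```

Each added item doubles the count, so 10 items already give 1024 assignments, and exhaustive enumeration stops being an option long before problems reach practical size.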

Operations Research Opportunities

Machine Learning can help us in three ways
- By providing guidance towards good solutions
- By qualifying valid solutions
- By reducing the search space

- Large computing resources means we can try a bit harder
- Crowd-Sourcing means better data, better feedback, better evaluations of algorithms and solutions
- Having all our code open-source means we can collaborate on building the best set of tools

See http://code.google.com/p/or-tools

Semantic Processing

Web Inference and Learning
- Goal: better understanding of Web content and user intent
- Method: algorithms that draw reliable semantic inferences from the wealth of evidence implicit in massive Web data
- How to interpret this term in this context?
- Does this sentence answer that question?
- Will this user click on that ad?
- Learning: create concise representations to support good inferences

Meaning from the Web

Elementary semantic inference: what are the possible classes for each instance?

Combination wins

Combined graph: 14M nodes, 75M edges

Applications to Society Follow

Earth Engine

Motivation - Carbon Forest Tracking

UNEP Atlas of our Changing Environment

1975 1989 2001

Rondonia, Brazil

A Sampling of Other Use Cases

- Disease Early Warning: Remote surveillance of disease and prediction of epidemics
- Population Census: Supplements traditional census and mapping in developing regions
- Humanitarian Crisis Mapping: Can detect and monitor a growing range of crisis types
- Water Resources: Monitor water quality and availability and alleviate water shortage problems
- Food Security: Famine early warning, rainfall and water requirements estimations, agricultural production estimates, and irrigation and fertilizer supply & demand
- Global Education Programs

Parallel Geo-Processing "in the cloud"

(a brief illustration)

- Original image is divided into 256px sub-units
- Sub-units are distributed to separate machines where they can be processed in parallel
- Thousands can be processed simultaneously
- Result is reassembled into a finished image
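
The divide / distribute / reassemble steps can be sketched on a single machine. This is an illustration only: threads stand in for separate machines, the image is a plain list of pixel rows, and `process` is a placeholder per-pixel operation (inverting 8-bit values) rather than any real map product.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

Image = List[List[int]]  # rows of 8-bit grayscale pixel values
TILE = 256               # sub-unit edge length from the slide


def split(image: Image, tile: int = TILE) -> List[Tuple[int, int, Image]]:
    """Cut the image into tile x tile sub-units, remembering each origin."""
    tiles = []
    for top in range(0, len(image), tile):
        for left in range(0, len(image[0]), tile):
            block = [row[left:left + tile] for row in image[top:top + tile]]
            tiles.append((top, left, block))
    return tiles


def process(block: Image) -> Image:
    """Placeholder per-pixel operation: invert 8-bit values."""
    return [[255 - px for px in row] for row in block]


def run(image: Image, tile: int = TILE) -> Image:
    """Process all sub-units in parallel and reassemble the result."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    tiles = split(image, tile)
    with ThreadPoolExecutor() as pool:
        results = pool.map(process, (block for _, _, block in tiles))
        for (top, left, _), block in zip(tiles, results):
            for dy, row in enumerate(block):
                out[top + dy][left:left + len(row)] = row
    return out


image = [[(x + y) % 256 for x in range(600)] for y in range(500)]
result = run(image)
print(len(split(image)), "sub-units")
```

Because each sub-unit depends only on its own pixels, the work scales with the number of workers; edge tiles are simply smaller than 256px.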

Global-scale earth observation and informatics platform
- For public benefit and to support emerging green economy
- Help science come out of research lab and into operational use at scale
- Unprecedented catalog of earth observation data for mining and analysis
- Promote transparency, reproducibility, collaboration, "open science"

Very fast computation of scientific map products
- Intrinsically-parallel pixel processing system
- Built-in Google algorithms as well as user-supplied
- Earth Engine API for 3rd party algorithm development
- Access control, versioning, provenance
- Online and desktop versions (open source desktop version)

On a lot of useful data
- Every available Landsat and MODIS scene (more satellites coming)
- Commercial datasets (very high resolution satellite imagery)
- Environmental data (atmospheric, ocean, terrestrial)
- User-supplied (ex. in-situ data collected via Android phones)

Overview

Scale of Data

US: Satellite imagery dataset: Landsat
- Peru: 60 Landsat scenes (3 Gpix, 20 GB) per covering
- World: 8000 Landsat scenes (2 TB) per covering
- Complete global coverage every 16 days
- Operating since 1972, historical archive holds ~4 PB

US: NASA EOS approaching ~10 PB of Earth images

Europe: European Space Agency (ESA)
- ESA satellite missions: MERIS, Envisat, others
- SpotImage (France): 20M SPOT images since 1986, 10000 new images collected daily, 5+ PB archive
- ESA: Launching Sentinel-1 in 2011

Representative Examples
- Envisat: Gulf Oil Spill, June 2010 (ESA)
- MERIS: Hurricane Isabel, Sept 2003 (ESA)
- Spot Image: Xingu, Brazilian Amazon

View on YouTube

Health (US)

Google Health: personal health dashboard

Launched in May 2008

Major update Sept 2010

User controls and owns his/her data

A platform that...
- Provides a dashboard for wellness information & medical records
- Allows user to connect and interact with a broad group of "add on" services
- Includes a non-tethered PHR

Crisis Response

Person Finder

Pre-Earthquake - Aug 26, 2009

1 Day After Earthquake - Jan 13, 2010

13 Days After Earthquake - Jan 25, 2010

Digital Humanities

Illuminating the Humanities

Q: What can you do with 12 million books in over 400 languages, comprising 5 billion pages and 2 trillion words, all digitized?

A: Look to the humanities for new questions
- How would you (re)define Victorian literature?
- What are the differences between the English and Latin editions of Hobbes' Leviathan?
- How have places changed over the course of history?

Digital Humanities Awards

Research program supporting university research taking a computational approach to traditional humanist questions

US program: Summer 2010
- 12 projects, 23 researchers, 15 universities

European program: Winter 2010
- 10 projects planned

$1M total funding

Education

Curriculum Development

Seeding and supporting computing curriculum development
- Exploring computational thinking in K12 (launch late Oct)
- CS4HS: High school computer science (cs4hs.com)
- Undergraduate open source CS curriculum: Google Code University (code.google.com/edu)
- Lantern platform: Wiki for open source curriculum development (in collaboration with Khan Academy)

Talent Development: Google Summer of Code(TM)

Program Genesis
- "Flip bits, not burgers" during summer holidays
- Exposure to real-world software development

- Students paired with mentor from OS community
- Execute to milestones laid out in accepted application
- Stipend allows students to concentrate on OS development

2010: 1,026 students, 150 organizations, 69 countries

Technology Leadership: App Inventor for Android
- Visual programming environment for Android mobile devices
- Helping people become creators (rather than consumers) of technology
- Launched in Google Labs July 12, 2010

http://appinventor.googlelabs.com/about

CS4HS European Workshops
- École Polytechnique Fédérale de Lausanne (EPFL), Switzerland: Building and Programming Robots
- ETH Zurich, Switzerland: ABZ, Ausbildungs- und Beratungszentrum für Informatikunterricht
- Makerere University, Uganda: Grassroot approach to improve the quality of applicants of computing programs at Makerere University
- Manchester University, United Kingdom: Animation11
- Oslo University, Norway: TENK
- Queen Mary University, United Kingdom: cs4fn magazine
- RWTH Aachen, Germany: Bright Brains in Computer Science
- Sapienza University of Rome, Italy: Challenge and Fun with the CS Olympiads
- University of Stuttgart, Germany: UniS2010
- Technion, Israel: High School Computer Science Female Students Visits in Google: Impressions, Conceptions and Influences
- Trinity College Dublin, Ireland: Computer Programming Outreach B2C
- University of Cape Town, South Africa: Project Umonya
- University College Dublin, Ireland: CS Summer School
- University of Warsaw, Poland: Mastering Programming Skills, Workshops for Teachers

Technology Leadership: Google Code University

Course content on current computing technologies and paradigms
- CS Curriculum Search
- Tech Talks on CS Topics
- Tutorials, lecture slides, problem sets for a variety of topic areas

- AJAX Programming
- Algorithms
- Distributed Systems
- Web Security
- Languages
- Practical Skills (MySQL, Linux)

http://code.google.com/edu

Supporting our Academic Institutions

Research Awards Programs - 230+ projects funded in the last year

Next due dates: August 15 (CS Awards), October 15 (Marketing Awards)
Research-awardsgooglecom

Focused Grant Program

Visiting Faculty Program - 20 faculty (ongoing)
University-relationsgooglecom

PhD Fellowship Program
- 2009: 13 students supported in North America
- 2010: 15 in North America, 16 outside North America
- Over 150 other scholarships

~1000 interns worldwide

CS4HS: 1500+ teachers (~100000 students), US & EMEA

Final Thoughts

- Scale of Communication and Computing is profound
- Endless opportunity for technical growth
- Some large themes
- Major new application domains
- Google continues to innovate rapidly in science/technology and value to consumers
- We are providing increased support for academic institutions in computer science and related areas
- It's a most exciting area in which to innovate


Google and Commerce

- Over 1 million AdWords advertisers worldwide
- Over 1 million AdSense publishers worldwide
- Via the Google Ad Network, AdSense publishers reach over 80% of global internet users in 100 countries and 20 languages
- YouTube is monetizing over a billion video views per week globally
- In 2009, Google generated $54 billion of economic activity for American businesses, website publishers and non-profits

Prodigiousness

Giga 10^9, Tera 10^12, Peta 10^15, Exa 10^18, Zetta 10^21

- Publicized Bigtable of 70 petabytes, 10M ops/sec
- Warehouse computing possibilities: 100 x 10 x 20 x 20 x 40 = 16,000,000 nodes...

Some representative numbers
- Storage: 10^18 -> 10^20-21
- Users: 10^9 -> 10^10
- Devices: 10 -> 10^12
- Network: 10^20 now -> 10^21/yr (32 KB/sec for 1B people)
- Apps: 10^5 -> 10^6-7 or more
- E.g. embedded car systems: 30-50 ECUs, 100M lines of code
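
Two of the figures above can be sanity-checked with a few lines of arithmetic. The sketch assumes the network figure is in bytes, that 1 KB means 1000 bytes, and that the 32 KB/sec rate is sustained all year; none of these assumptions is stated on the slide.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

# Warehouse computing possibilities: product of the factors on the slide.
factors = [100, 10, 20, 20, 40]
nodes = 1
for factor in factors:
    nodes *= factor

# Network: 32 KB/sec sustained for 1B people, in bytes per year
# (assumes 1 KB = 1000 bytes).
bytes_per_year = 32_000 * 1_000_000_000 * SECONDS_PER_YEAR

print(f"{nodes:,} nodes")
print(f"{bytes_per_year:.2e} bytes per year")
```

Under these assumptions the product is 16,000,000 nodes and the traffic comes to roughly 10^21 bytes per year, which matches the order of magnitude quoted for the network.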

A variety of science and engineering challenges

Focus on Innovation that Benefits our Users
Focus on Research and Engineering

- Commitment to advancing technology
- Rich domain of work due to our mission
- Grand challenge problems
- Internal consensus that production issues are often as challenging/fun as pure invention
- Technical leverage
  1. Google Common Distributed System
  2. A Focus on Services
  3. Empiricism and a Holistic Approach to Design

Our Innovation Culture

- Focus on talent
- Distributed across the organization
- Impacting Google necessitates broad, diverse involvement in science and engineering
- Research is done both in our research team and in our engineering organization, organized opportunistically

Teams benefit greatly
- From mutual talent
- From Google's comparative advantages due to our scale and broad use
- From service-based architecture ("ease" of working in vivo)

Ideal Distributed Computing

Devices

Research Challenges in Ideal Distributed Computing

- Alternative designs that would give better energy efficiency at lower utilization
- Server OS design aimed at many highly-connected machines in one building
- Unifying abstractions for exploiting parallelism beyond inter-transaction parallelism and map-reduce
- Latency reduction
- A general model of replication, including consistency choices, explained and codified
- Machine learning techniques applied to monitoring/controlling such systems
- Automatic, dynamic world-wide placement of data & computation to minimize latency and/or cost, given constraints on
- Building retrieval systems that efficiently and usably deal with ACLs
- Holistic models of privacy
- The user interface to the user's diverse processing and state

Totally Transparent Processing

D: The set of all end-user access devices
- Personal Computers, Phone, Media Players, Readers, Telematics, Set-top Boxes, Appliances, Health devices...

L: The set of all human languages
- Current languages, Historical languages, Other forms of human notation, Possible language specialization, Formal languages...

M: The set of all modalities
- Text, Image, Audio, Video, Graphics, Other sensor-based data...

C: The set of all corpora
- The normal web, The deep web, Periodicals, Books, Catalogs, Blogs, Geodata, Scientific datasets, Health data...

For all d in D, all l in L, all m in M, and all c in C
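
The "for all d, l, m, c" statement ranges over the Cartesian product of the four sets. A small sketch with tiny illustrative samples (the real sets are open-ended) shows how quickly the number of combinations to support grows.

```python
from itertools import product

# Tiny illustrative samples drawn from the lists above.
D = ["personal computer", "phone", "reader"]
L = ["English", "Hungarian"]
M = ["text", "audio", "video"]
C = ["the normal web", "books"]

combinations = list(product(D, L, M, C))
print(len(combinations))  # 3 * 2 * 3 * 2 = 36
print(combinations[0])
```

Even these samples give 36 (device, language, modality, corpus) combinations; the count is the product of the set sizes, which is why uniform handling across all four dimensions is framed as a research challenge rather than a feature list.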

Totally Transparent Processing

"Hybrid" Intelligence

- To extend the capability of people, not in isolation
- Aggregation of empirical signal is exceedingly valuable
- Ex.:
  - Feedback in Information Retrieval, e.g. in ranking or spelling correction
  - Machine learning, e.g. image content analysis, speech recognition with semi-supervised learning

Research Challenges in Transparent Computing & Hybrid Intelligence

- Endless applications with very new user interface implications
- Addressing limits to data
- Techniques to integrate user-feedback in acceptable fashions
- Approaches to new signal
- Explanation, scale, and variance minimization in machine learning
- Information fusion/learning across diverse signals - The Combination Hypothesis, more generally
- Usability, devices and subpopulations
- Privacy

Domains of Application

- Search engines
- Translation
- Speech recognition
- Vision
- Remedial Education
- Personal health
- Epidemiology
- Economic prediction
- Societal/environmental optimization
- Social Networking in ever more clever/useful ways
- Humanities and Social Sciences
- Multi-player gaming

Translation

Machine Translation Google


Page 6: Rapid Advances in Computer Science and Opportunities for ... · Machine learning techniques applied to monitoring/controlling such systems Automatic, dynamic world-wide placement

Prodigiousness

Giga 109 Tera 1012 Peta 1015 Exa 1018 Zetta1021

Publicized Bigtable of 70 petabytes 10M opssecWarehouse computing possibilities 100 x 10 x 20 x 20 x 40 = 16000000 nodeshellipSome representative numbers

Storage 1018 -gt 1020-21

Users 109 -gt 1010

Devices 10 -gt 1012

Network 1020 now -gt1021yr 32 KBsec for 1B peopleApps 105 -gt 106-7 or more

Eg embedded car systems 30-50 ECUs 100M lines of code

A variety of science engineering challenges

Focus on Innovation that Benefits our UsersFocus on Research and Engineering

Commitment to advancing technologyRich domain of work due to our missionGrand challenge problemsInternal consensus that production issues are often as challengingfun as pure inventionTechnical leverage1 Google Common Distributed System 2 A Focus on Services3 Empiricism and a Holistic Approach to Design

Our Innovation Culture

Focus on talentDistributed across the organization

Impacting Google necessitates broad diverse involvement in science and engineeringResearch is done both in our research team and in our engineering organization organized opportunistically

Teams benefit greatlyFrom mutual talentFrom Googlersquos comparative advantages to our scale and broad useFrom service-based architecture (ldquoeaserdquo of working in vivo)

Ideal Distributed Computing

Devices

Research Challenges in Ideal Distributed Computing

Alternative designs that would give better energy efficiency at lower utilizationServer OS design aimed at many highly-connected machines in one buildingUnifying abstractions for exploiting parallelism beyond inter-transaction parallelism and map-reduceLatency reductionA general model of replication including consistency choices explained and codifiedMachine learning techniques applied to monitoringcontrolling such systemsAutomatic dynamic world-wide placement of data amp computation to minimize latency andor cost given constraints onBuilding retrieval systems that efficiently and usably deal with ACLsHolistic models of privacyThe user interface to the userrsquos diverse processing and state

Totally Transparent Processing

D: The set of all end-user access devices
- Personal computers, phone, media players, readers, telematics, set-top boxes, appliances, health devices, …

L: The set of all human languages
- Current languages, historical languages, other forms of human notation, possible language specialization, formal languages, …

M: The set of all modalities
- Text, image, audio, video, graphics, other sensor-based data, …

C: The set of all corpora
- The normal web, the deep web, periodicals, books, catalogs, blogs, geodata, scientific datasets, health data, …

For all d in D, all l in L, all m in M, and all c in C
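The quantification over the four sets amounts to covering their Cartesian product. A minimal sketch, using tiny hand-picked samples of each set (the sample members are taken from the lists above; the real sets are open-ended):

```python
from itertools import product

# Tiny illustrative samples of each set; the real sets are open-ended.
D = ["phone", "personal computer"]   # end-user access devices
L = ["English", "Urdu", "Latin"]     # human languages
M = ["text", "audio"]                # modalities
C = ["the normal web", "books"]      # corpora

# Every (device, language, modality, corpus) combination to be supported.
combinations = list(product(D, L, M, C))
print(len(combinations))   # 2 * 3 * 2 * 2 = 24
print(combinations[0])
```

Even these small samples give 24 combinations, which illustrates why the full goal is stated as a research challenge rather than a feature list.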

Totally Transparent Processing

“Hybrid” Intelligence

- To extend the capability of people, not in isolation
- Aggregation of empirical signal is exceedingly valuable
- Ex.:
  - Feedback in Information Retrieval, e.g., in ranking or spelling correction
  - Machine learning, e.g., image content analysis, speech recognition with semi-supervised learning

Research Challenges in Transparent Computing & Hybrid Intelligence

- Endless applications with very new user interface implications
- Addressing limits to data
- Techniques to integrate user-feedback in acceptable fashions
- Approaches to new signal
- Explanation, scale, and variance minimization in machine learning
- Information fusion/learning across diverse signals - The Combination Hypothesis, more generally
- Usability: devices and subpopulations
- Privacy

Domains of Application

- Search engines
- Translation
- Speech recognition
- Vision

- Remedial education
- Personal health
- Epidemiology
- Economic prediction
- Societal/environmental optimization
- Social networking in ever more clever/useful ways
- Humanities and social sciences
- Multi-player gaming

Translation

Machine Translation at Google

Statistical Machine Translation
- Model translation process with a statistical model
- Learning from data: monolingual & bilingual
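One textbook way to write such a model is the noisy-channel formulation below. It is given here only as an illustration of how monolingual and bilingual data can play different roles; the slides do not say which model the system uses.

```latex
\hat{e} \;=\; \arg\max_{e} P(e \mid f) \;=\; \arg\max_{e} \; P(f \mid e)\, P(e)
```

Here $f$ is the source sentence and $e$ ranges over candidate translations. In this formulation the translation model $P(f \mid e)$ is estimated from bilingual data, and the language model $P(e)$ from monolingual data in the target language.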

- More data, better translation quality
- Computationally expensive approach

- Models have many hundreds of gigabytes of data (Moore’s law helps here)

- Applying syntax information as a signal

Results
- Much better translation quality
- Ongoing progress

- More research groups
- 58 languages (so far)

- Recently: Haitian Creole, Urdu, Georgian, Latin

Grand Challenges

Morphology
- Translating into morphologically rich languages, e.g., Russian, Hungarian
- Need morphology-aware translation models

Reliability
- Some translation mistakes are more severe than others

hotel - Montreal
Heath Ledger - Tom Cruise

Research: How to detect crazy translations?

Long-distance reordering
- Simple case: SVO -> SOV
- (One) approach: parse source & reorder
- Issue: parsing accuracy for out-of-domain texts
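The "parse source & reorder" idea can be sketched as follows. This is a toy under stated assumptions: the parse is supplied by hand as (word, role) pairs, and `reorder_svo_to_sov` is a hypothetical helper, not part of any described system. A real pipeline would obtain the roles from a parser, which is exactly where the out-of-domain accuracy issue arises.

```python
# Toy pre-ordering for an SVO -> SOV language pair. The "parse" is supplied by
# hand as (word, role) pairs; a real system would obtain roles from a parser.
def reorder_svo_to_sov(parsed):
    """Move the verb after the object, keeping word order within each role."""
    subject = [word for word, role in parsed if role == "S"]
    verb = [word for word, role in parsed if role == "V"]
    obj = [word for word, role in parsed if role == "O"]
    return subject + obj + verb


parsed = [("the", "S"), ("officer", "S"), ("arrested", "V"),
          ("three", "O"), ("people", "O")]
reordered = reorder_svo_to_sov(parsed)
print(" ".join(reordered))  # the officer three people arrested
```

If the parser mislabels a role, the reordering moves the wrong words, so errors in parsing propagate directly into the translation input.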

Finding all Training Data

How about Poetry?

Paper at EMNLP 2010 conference: “Poetic” Statistical Machine Translation: Rhyme and Meter. D. Genzel, J. Uszkoreit, F. Och. EMNLP 2010

Approach
- Enforce meter and rhyme as extra constraints (similar to language model)
- E.g., iambic pentameter: stress pattern 0101010101
- Produce most probable translation that obeys constraints
- (Function follows form)

Example output (couplet in amphibrachic tetrameter):
An officer stated that three were arrested
and that the equipment is currently tested
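The constraint idea can be illustrated with a small filter over candidate translations. This is a sketch, not the paper's method: it filters a finished candidate list rather than constraining the decoder, and the stress lexicon, candidates, and probabilities are all made up for illustration.

```python
# Hypothetical stress lexicon: 1 = stressed syllable, 0 = unstressed.
STRESS = {
    "an": "0", "officer": "100", "stated": "10", "said": "1",
    "that": "0", "three": "1", "were": "0", "arrested": "010",
}

AMPHIBRACHIC_TETRAMETER = "010" * 4  # four unstressed-stressed-unstressed feet


def stress_pattern(line):
    """Concatenate the per-word stress strings of a line."""
    return "".join(STRESS[word] for word in line.lower().split())


def best_metrical(candidates, meter):
    """Return the most probable candidate whose stress pattern equals the meter."""
    valid = [(prob, line) for line, prob in candidates
             if stress_pattern(line) == meter]
    return max(valid)[1] if valid else None


# Made-up candidate translations with made-up model probabilities.
candidates = [
    ("An officer said that three were arrested", 0.6),
    ("An officer stated that three were arrested", 0.4),
]
chosen = best_metrical(candidates, AMPHIBRACHIC_TETRAMETER)
print(chosen)
```

The more probable candidate is rejected because its stress pattern has eleven syllables and breaks the meter, so the less probable but metrical line is chosen.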

Speech

Goals for Speech Technology at Google

Much of the world’s information is spoken - we need to recognize it before we can organize it

- YouTube transcription and translation (breaking the language barrier for YouTube access)
- Voicemail transcription

Mobile is the fastest growing and most widespread platform for communication and services that has ever existed

- Spoken input and output is key to usability

Our goal is completely ubiquitous availability of speech I/O (every application/service, every usage scenario, every language)

How do we get there?
- Delivery from the cloud - support constant iteration and refinement
- Operating at large scale - train huge statistical models on huge amounts of data
- Learning from use - without human transcription

Challenges
- How do we grow the model to take advantage of the data (richer models of accent, speaker, noise, etc.)?
- Huge computational demands
- Infrastructure demands - parallelization - leverage Google software environment

Training Acoustic Models w/ Unsupervised Learning

Supervised vs. unsupervised training - hours of data vs. error rate

Vision

Computer Vision

Advance the state of the art in 3 key areas of image/audio/video analysis and apply results to our multimedia products

- Semantic Interpretation: Generate human-understandable description of content (e.g., auto-tagging videos on YouTube, image annotation, porn classification, etc.)
- Matching: Find similar entities from a large corpus (e.g., find similar on image search, video fingerprinting for YouTube, etc.)
- Synthesis: Generate better images/video by understanding the statistics of a large corpus of images (e.g., better facades in 3D building on Google Earth, automatic shadow removal from aerial images, etc.)

Semantic Interpretation: sample problem - Video Annotation

- Video metadata has a cognitive cost on the user because they have to type it in, be careful about what keywords they use, and in general try to make their video searchable
- Many uploaders don’t have the motivation or energy to provide proper metadata
- Noisy metadata hurts everyone - spam, misspellings, 1337, acronyms, etc.

Cloud-based Computing: Structured Data

Structured Data on the Web

Discovery and search for structured data
- The deep Web -- significant gap in coverage
- Structured tables on the Web -- not leveraged in search

Enable easy creation, management, sharing, and publishing of structured data

Fusion Tables: www.google.com/fusiontables

Google Fusion Tables: host, manage, collaborate on, visualize, and publish data tables online

What can I do with Fusion Tables?

- Host data online - and stay in control
  - Control can be at the level of columns or rows
- Re-use data without making copies
- Collaborate on the details
  - Merge data from multiple tables
  - Comment on individual rows, columns, or cells

Make a map (or chart or timeline) in minutes

Manage data via our site or an API

Fusion Table Example Gallery

Easy Data Upload, Attribution recorded

Easily Create Informative Maps

baby steps towards the dream platform

DEMO: circle of blue

Cloud-based Computing: Prediction

1. Upload: Upload your training data to Google Storage
2. Train: Build a model from your data
3. Predict: Make new predictions

Machine learning as a web service
Smart Apps for every developer

- RESTful HTTP service
- Simple integration

Prediction API

- Under the hood: many classifiers/regressors
- Recent research: efficient and theoretically principled methods for distributed learning (NIPS-09, HLT-10)

Network costs can be reduced by an order of magnitude with minimal loss in classifier accuracy

Under the API

Operations Research and Optimization

Operations Research Challenges
Size: Optimization is often NP-complete
- Increasing the size by 1 doubles the search space
- The tools are barely keeping up with the problems
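The doubling claim can be seen by brute-force enumeration. The sketch below assumes each unit of problem size is one binary (yes/no) decision variable, which the slide does not spell out; `search_space_size` is a hypothetical helper.

```python
from itertools import product


def search_space_size(n):
    """Count all yes/no assignments to n binary decision variables by enumeration."""
    return sum(1 for _ in product((0, 1), repeat=n))


sizes = [search_space_size(n) for n in range(1, 11)]
print(sizes)

# Each extra variable doubles the number of candidate assignments.
ratios = [sizes[i + 1] // sizes[i] for i in range(len(sizes) - 1)]
print(ratios)
```

With n binary variables there are 2^n assignments, so exhaustive search stops being feasible long before problems reach practical sizes.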

Uncertainty: Data is often fuzzy
- How do you route cars when there are roadblocks, new one-ways, traffic jams?
- Can you use optimization in an on-line algorithm connected to users?
- How well can you optimize against forecasted data? How do you react if the forecast is bad?

User expectation and requirements
- The definition of problems is also unclear: What is the objective? What is a good solution? Can I violate this requirement? By how much?

Operations Research Opportunities

Machine Learning can help us in several ways
- By providing guidance towards good solutions
- By qualifying valid solutions
- By reducing the search space

- Large computing resources mean we can try a bit harder
- Crowd-sourcing means better data, better feedback, better evaluations of algorithms and solutions
- Having all our code open-source means we can collaborate on building the best set of tools

See http://code.google.com/p/or-tools

Semantic Processing

Web Inference and Learning
Goal: better understanding of Web content and user intent
Method: algorithms that draw reliable semantic inferences from the wealth of evidence implicit in massive Web data

- How to interpret this term in this context?
- Does this sentence answer that question?
- Will this user click on that ad?

Learning: create concise representations to support good inferences

Meaning from the Web

Elementary semantic inference: what are the possible classes for each instance?

Combination wins

Combined graph: 14M nodes, 75M edges

Applications to Society Follow

Earth Engine

Motivation - Carbon Forest Tracking

UNEP Atlas of our Changing Environment: Rondonia, Brazil (1975, 1989, 2001)

A Sampling of Other Use Cases

- Disease Early Warning: Remote surveillance of disease and prediction of epidemics
- Population Census: Supplements traditional census and mapping in developing regions
- Humanitarian Crisis Mapping: Can detect and monitor a growing range of crisis types
- Water Resources: Monitor water quality and availability and alleviate water shortage problems
- Food Security: Famine early warning, rainfall and water requirements estimations, agr. production estimates, and irrigation and fertilizer supply & demand
- Global Education Programs

Parallel Geo-Processing “in the cloud”

(a brief illustration)

1. Original image
2. Original image is divided into 256px sub-units
3. Sub-units are distributed to separate machines, where they can be processed in parallel
4. Thousands can be processed simultaneously
5. Result is reassembled into a finished image
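The split, process, and reassemble steps above can be sketched on a single machine. This is only an illustration under stated assumptions: a thread pool stands in for the separate machines, the image is a plain list of pixel rows, and the per-pixel operation (inverting 8-bit values) is a hypothetical placeholder for a real map product.

```python
from concurrent.futures import ThreadPoolExecutor

TILE = 256  # sub-unit edge length in pixels, as in the illustration


def split_into_tiles(image, tile=TILE):
    """Yield (row, col, block) for each tile-sized block of a 2-D list image."""
    height, width = len(image), len(image[0])
    for r in range(0, height, tile):
        for c in range(0, width, tile):
            yield r, c, [row[c:c + tile] for row in image[r:r + tile]]


def process_tile(job):
    """Hypothetical per-pixel operation: invert 8-bit values."""
    r, c, block = job
    return r, c, [[255 - px for px in row] for row in block]


def process_image(image, tile=TILE, workers=4):
    """Split the image, process the tiles concurrently, and reassemble the result."""
    height, width = len(image), len(image[0])
    result = [[0] * width for _ in range(height)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for r, c, block in pool.map(process_tile, split_into_tiles(image, tile)):
            for dr, row in enumerate(block):
                result[r + dr][c:c + len(row)] = row
    return result


# A synthetic 600 x 500 test image (width x height).
image = [[(x + y) % 256 for x in range(600)] for y in range(500)]
output = process_image(image)
print(len(list(split_into_tiles(image))), "tiles")
```

Because each tile depends only on its own pixels, the tiles can be processed in any order and on any worker, which is what makes this kind of workload easy to spread across many machines.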

Global-scale earth observation and informatics platform
- For public benefit and to support emerging green economy
- Help science come out of research lab and into operational use at scale
- Unprecedented catalog of earth observation data for mining and analysis
- Promote transparency, reproducibility, collaboration, “open science”

Very fast computation of scientific map products
- Intrinsically-parallel pixel processing system
- Built-in Google algorithms as well as user-supplied
- Earth Engine API for 3rd party algorithm development
- Access control, versioning, provenance
- Online and desktop versions (open source desktop version)

On a lot of useful data
- Every available Landsat and MODIS scene (more satellites coming)
- Commercial datasets (very high resolution satellite imagery)
- Environmental data (atmospheric, ocean, terrestrial)
- User-supplied (ex. in-situ data collected via Android phones)

Overview

Scale of Data

US: Satellite imagery dataset: Landsat
- Peru: 60 Landsat scenes (3 Gpix, 20 GB) per covering
- World: 8000 Landsat scenes (2 TB) per covering
- Complete global coverage every 16 days
- Operating since 1972; historical archive holds ~4 PB
US: NASA EOS approaching ~10 PB of Earth images

Europe: European Space Agency (ESA)
- ESA satellite missions: MERIS, Envisat, others
- SpotImage (France): 20M SPOT images since 1986, 10,000 new images collected daily, 5+ PB archive
- ESA: Launching Sentinel-1 in 2011

Representative Examples

- Envisat: Gulf Oil Spill, June 2010 (ESA)
- MERIS: Hurricane Isabel, Sept 2003 (ESA)
- Spot Image: Xingu, Brazilian Amazon

View on YouTube

Health (US)

Google Health: personal health dashboard

Launched in May 2008

Major update Sept 2010

User controls and owns his/her data

A platform that…
- Provides a dashboard for wellness information & medical records
- Allows user to connect and interact with a broad group of “add on” services
- Includes a non-tethered PHR

Crisis Response

Person Finder

Pre-Earthquake - Aug 26, 2009

1 Day After Earthquake - Jan 13, 2010

13 Days After Earthquake - Jan 25, 2010

Digital Humanities

Illuminating the Humanities

Q: What can you do with 12 million books in over 400 languages, comprising 5 billion pages and 2 trillion words, all digitized?

A: Look to the humanities for new questions
- How would you (re)define Victorian literature?
- What are the differences between the English and Latin editions of Hobbes’ Leviathan?
- How have places changed over the course of history?

Digital Humanities Awards

Research program supporting university research taking a computational approach to traditional humanist questions

US program: Summer 2010
- 12 projects
- 23 researchers
- 15 universities

European program: Winter 2010
- 10 projects planned

$1M total funding

Education

Curriculum Development

Seeding and supporting computing curriculum development
- Exploring computational thinking in K12 (launch late Oct)
- CS4HS: High school computer science (cs4hs.com)
- Undergraduate open source CS curriculum: Google Code University (code.google.com/edu)
- Lantern platform: Wiki for open source curriculum development (in collaboration with Khan Academy)

Talent Development: Google Summer of Code™

Program Genesis
- “Flip bits, not burgers” during summer holidays
- Exposure to real-world software development

- Students paired with mentor from OS community
- Execute to milestones laid out in accepted application
- Stipend allows students to concentrate on OS development

2010: 1,026 students, 150 organizations, 69 countries

Technology Leadership: App Inventor for Android

- Visual programming environment for Android mobile devices
- Helping people become creators (rather than consumers) of technology
- Launched in Google Labs July 12, 2010

http://appinventor.googlelabs.com/about

CS4HS European Workshops
- École Polytechnique Fédérale de Lausanne (EPFL), Switzerland: Building and Programming Robots
- ETH Zurich, Switzerland: ABZ (Ausbildungs- und Beratungszentrum für Informatikunterricht)
- Makerere University, Uganda: Grassroot approach to improve the quality of applicants of computing programs at Makerere University
- Manchester University, United Kingdom: Animation11
- Oslo University, Norway: TENK
- Queen Mary University, United Kingdom: cs4fn magazine
- RWTH Aachen, Germany: Bright Brains in Computer Science
- Sapienza University of Rome, Italy: Challenge and Fun with the CS Olympiads
- University of Stuttgart, Germany: UniS2010
- Technion, Israel: High School Computer Science Female Students Visits in Google: Impressions, Conceptions and Influences
- Trinity College Dublin, Ireland: Computer Programming Outreach B2C
- University of Cape Town, South Africa: Project Umonya
- University College Dublin, Ireland: CS Summer School
- University of Warsaw, Poland: Mastering Programming Skills Workshops for Teachers

Technology Leadership: Google Code University

- Course content on current computing technologies and paradigms
- CS Curriculum Search
- Tech Talks on CS Topics
- Tutorials, lecture slides, problem sets for a variety of topic areas

- AJAX Programming
- Algorithms
- Distributed Systems
- Web Security
- Languages
- Practical Skills (MySQL, Linux)

http://code.google.com/edu

Supporting our Academic Institutions

Research Awards Programs - 230+ projects funded in the last year

Next due dates: August 15 (CS Awards), October 15 (Marketing Awards)
Research-awardsgooglecom
Focused Grant Program

Visiting Faculty Program - 20 faculty (ongoing)
University-relationsgooglecom

PhD Fellowship Program
- 2009: 13 students supported in North America
- 2010: 15 in North America, 16 outside North America
Over 150 other scholarships

~1000 interns worldwide
CS4HS: 1500+ teachers (~100,000 students), US & EMEA

Final Thoughts

- Scale of Communication and Computing is profound
- Endless opportunity for technical growth

- Some large themes
- Major new application domains

- Google continues to innovate rapidly in science/technology and value to consumers
- We are providing increased support for academic institutions in computer science and related areas

It’s a most exciting area in which to innovate


Focus on Innovation that Benefits our Users
Focus on Research and Engineering

Commitment to advancing technology
Rich domain of work due to our mission
Grand challenge problems
Internal consensus that production issues are often as challenging/fun as pure invention
Technical leverage:
1. Google Common Distributed System
2. A Focus on Services
3. Empiricism and a Holistic Approach to Design

Our Innovation Culture

Focus on talent
Distributed across the organization

Impacting Google necessitates broad, diverse involvement in science and engineering
Research is done both in our research team and in our engineering organization, organized opportunistically

Teams benefit greatly
From mutual talent
From Google's comparative advantages due to our scale and broad use
From service-based architecture ("ease" of working in vivo)

Ideal Distributed Computing

Devices

Research Challenges in Ideal Distributed Computing

Alternative designs that would give better energy efficiency at lower utilization
Server OS design aimed at many highly-connected machines in one building
Unifying abstractions for exploiting parallelism beyond inter-transaction parallelism and map-reduce
Latency reduction
A general model of replication including consistency choices, explained and codified
Machine learning techniques applied to monitoring/controlling such systems
Automatic, dynamic world-wide placement of data & computation to minimize latency and/or cost, given constraints on
Building retrieval systems that efficiently and usably deal with ACLs
Holistic models of privacy
The user interface to the user's diverse processing and state

Totally Transparent Processing

D: The set of all end-user access devices
Personal computers, phones, media players, readers, telematics, set-top boxes, appliances, health devices, ...

L: The set of all human languages
Current languages, historical languages, other forms of human notation, possible language specialization, formal languages, ...

M: The set of all modalities
Text, image, audio, video, graphics, other sensor-based data, ...

C: The set of all corpora
The normal web, the deep web, periodicals, books, catalogs, blogs, geodata, scientific datasets, health data, ...

For all d in D, all l in L, all m in M, and all c in C:

Totally Transparent Processing
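
The "for all d, l, m, c" statement above ranges over the Cartesian product of the four sets, so the number of combinations to support grows multiplicatively. A minimal sketch, assuming tiny made-up samples of each set (the element names are illustrative only, not a list from the talk):

```python
from itertools import product

# Tiny illustrative samples of the four sets; the real sets are far larger.
D = ["phone", "personal computer"]   # end-user access devices
L = ["English", "Urdu"]              # human languages
M = ["text", "audio"]                # modalities
C = ["books", "the normal web"]      # corpora

# Every (device, language, modality, corpus) combination.
combinations = list(product(D, L, M, C))

# The count is |D| * |L| * |M| * |C|.
print(len(combinations))
for d, l, m, c in combinations[:3]:
    print(f"{c} as {m} in {l} on a {d}")
```

Even two elements per set give sixteen combinations, which is why the goal is stated for all combinations rather than device by device or language by language.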

"Hybrid" Intelligence

To extend the capability of people, not in isolation
Aggregation of empirical signal is exceedingly valuable
Examples:

Feedback in Information Retrieval, e.g. in ranking or spelling correction
Machine learning, e.g. image content analysis, speech recognition with semi-supervised learning

Research Challenges in Transparent Computing & Hybrid Intelligence

Endless applications with very new user interface implications
Addressing limits to data
Techniques to integrate user-feedback in acceptable fashions
Approaches to new signal
Explanation, scale, and variance minimization in machine learning
Information fusion/learning across diverse signals - The Combination Hypothesis, more generally
Usability, devices, and subpopulations
Privacy

Domains of Application

Search engines
Translation
Speech recognition
Vision

Remedial Education
Personal health
Epidemiology
Economic prediction
Societal/environmental optimization
Social Networking in ever more clever/useful ways
Humanities and Social Sciences
Multi-player gaming

Translation

Machine Translation at Google

Statistical Machine Translation
Model translation process with a statistical model
Learning from data: monolingual & bilingual

More data, better translation quality
Computationally expensive approach

Models have many hundreds of gigabytes of data (Moore's law helps here)

Applying syntax information as a signal

Results
Much better translation quality
Ongoing progress

More research groups
58 languages (so far)

Recently: Haitian Creole, Urdu, Georgian, Latin

Grand Challenges

Morphology
Translating into morphologically rich languages, e.g. Russian, Hungarian
Need morphology-aware translation models

Reliability
Some translation mistakes are more severe than others

hotel - Montreal
Heath Ledger - Tom Cruise

Research: How to detect crazy translations?

Long-distance reordering
Simple case: SVO to SOV
(One) approach: parse source & reorder

Issue: parsing accuracy for out-of-domain texts

Finding all Training Data
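
The "parse source & reorder" approach to the SVO to SOV case can be illustrated with a toy sketch. It assumes the parse is already available as role-tagged constituents (hand-written here, since the parser itself is the hard part); the function and type names are hypothetical, not from any Google system.

```python
from typing import List, Tuple

# A constituent is (role, phrase); roles are "S" (subject), "V" (verb), "O" (object).
Constituent = Tuple[str, str]

def reorder(clause: List[Constituent], target_order: str = "SOV") -> List[Constituent]:
    """Reorder the constituents of one parsed clause into the target word order."""
    rank = {role: position for position, role in enumerate(target_order)}
    # sorted() is stable, so constituents sharing a role keep their relative order.
    return sorted(clause, key=lambda constituent: rank[constituent[0]])

# Hand-written "parse" of an English (SVO) clause; a real system gets this from a parser.
parsed = [("S", "the officer"), ("V", "arrested"), ("O", "three people")]

sov = reorder(parsed)
print(" ".join(phrase for _, phrase in sov))
```

If the parser mislabels a constituent, which is more likely on out-of-domain text, the reordering moves the wrong phrase, so parsing accuracy directly limits this approach.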

How about Poetry?

Paper at EMNLP 2010 conference:
"Poetic" Statistical Machine Translation: Rhyme and Meter. D. Genzel, J. Uszkoreit, F. Och. EMNLP 2010

Approach
Enforce meter and rhyme as extra constraints (similar to language model)
E.g. iambic pentameter: stress pattern 0101010101
Produce most probable translation that obeys constraints
(Function follows form)

Example output (couplet in amphibrachic tetrameter):
An officer stated that three were arrested
and that the equipment is currently tested
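
The idea of choosing the most probable translation that obeys a meter constraint can be sketched as a filter over scored candidates. This is only an illustration, not the method of the paper: the candidate scores and the per-syllable stress strings are hand-assigned assumptions, and all names are hypothetical.

```python
from typing import List, NamedTuple, Optional

class Candidate(NamedTuple):
    text: str
    log_prob: float  # made-up model score, for illustration only
    stress: str      # one character per syllable: "1" stressed, "0" unstressed (hand-annotated)

def meter_pattern(foot: str, feet: int) -> str:
    """Build a stress pattern by repeating one metrical foot."""
    return foot * feet

def best_metrical(candidates: List[Candidate], pattern: str) -> Optional[Candidate]:
    """Return the highest-scoring candidate whose stress pattern matches exactly."""
    obeying = [c for c in candidates if c.stress == pattern]
    return max(obeying, key=lambda c: c.log_prob) if obeying else None

IAMBIC_PENTAMETER = meter_pattern("01", 5)         # five unstressed-stressed feet
AMPHIBRACHIC_TETRAMETER = meter_pattern("010", 4)  # four unstressed-stressed-unstressed feet

candidates = [
    # Higher score, but 13 syllables that do not fit the meter.
    Candidate("An officer said that three people were arrested", -4.1, "0100101100010"),
    # Lower score, but scans as amphibrachic tetrameter.
    Candidate("An officer stated that three were arrested", -5.3, "010010010010"),
]

chosen = best_metrical(candidates, AMPHIBRACHIC_TETRAMETER)
print(chosen.text if chosen else "no candidate obeys the meter")
```

Here the constraint overrides the raw score: the second candidate wins even though the first is more probable. This sketch treats meter as a hard filter applied after scoring, which is simpler than enforcing it during search.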

Speech

Goals for Speech Technology at Google

Much of the world's information is spoken - we need to recognize it before we can organize it

YouTube transcription and translation (breaking the language barrier for YouTube access)

Voicemail transcription

Mobile is the fastest growing and most widespread platform for communication and services that has ever existed

Spoken input and output is key to usability

Our goal is completely ubiquitous availability of speech I/O (every application/service, every usage scenario, every language)

How do we get there?
Delivery from the cloud - support constant iteration and refinement

Operating at large scale - train huge statistical models on huge amounts of data

Learning from use - without human transcription

Challenges
How do we grow the model to take advantage of the data (richer models of accent, speaker, noise, etc.)?
Huge computational demands
Infrastructure demands - parallelization - leverage Google software environment

Training Acoustic Models w/ Unsupervised Learning

Supervised vs unsupervised training - hours of data vs error rate

Vision

Computer Vision

Advance the state of the art in 3 key areas of image/audio/video analysis and apply results to our multimedia products

Semantic Interpretation: Generate human-understandable description of content (e.g. auto-tagging videos on YouTube, image annotation, porn classification, etc.)
Matching: Find similar entities from a large corpus (e.g. find similar on image search, video fingerprinting for YouTube, etc.)
Synthesis: Generate better images/video by understanding the statistics of a large corpus of images (e.g. better facades in 3D building on Google Earth, automatic shadow removal from aerial images, etc.)

Semantic Interpretation sample problem - Video Annotation

Video metadata has a cognitive cost on the user because they have to type it in, be careful about what keywords they use, and in general try to make their video searchable
Many uploaders don't have the motivation or energy to provide proper metadata
Noisy metadata hurts everyone - spam, misspellings, 1337, acronyms, etc.

Cloud-based Computing: Structured Data

Structured Data on the Web

Discovery and search for structured data
The deep Web -- significant gap in coverage
Structured tables on the Web -- not leveraged in search

Enable easy creation, management, sharing, and publishing of structured data

Fusion Tables: www.google.com/fusiontables

Google Fusion Tables: host, manage, collaborate on, visualize, and publish data tables online

What can I do with Fusion Tables?

Host data online - and stay in control
Control can be at the level of columns or rows

Re-use data without making copies

Collaborate on the details
Merge data from multiple tables
Comment on individual rows, columns, or cells

Make a map (or chart or timeline) in minutes

Manage data via our site or an API

Fusion Table Example Gallery

Easy Data Upload, Attribution recorded

Easily Create Informative Maps

baby steps towards the dream platform

DEMO: Circle of Blue

Cloud-based Computing: Prediction

1. Upload: Upload your training data to Google Storage
2. Train: Build a model from your data
3. Predict: Make new predictions

Machine learning as a web service
Smart Apps for every developer

- RESTful HTTP service
- Simple integration

Prediction API

Under the hood: many classifiers/regressors
Recent research: efficient and theoretically principled methods for distributed learning (NIPS-09, HLT-10)

Network costs can be reduced by an order of magnitude with minimal loss in classifier accuracy
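
One common way to keep network costs low in distributed learning is to train a model on each data shard separately and then average the resulting parameters, so only one weight vector per shard crosses the network. Whether this is the exact method the slide refers to is an assumption; the sketch below uses a plain perceptron on made-up toy data and is not the Prediction API's implementation.

```python
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], int]  # (feature vector, label in {-1, +1})

def train_perceptron(data: List[Example], dim: int, epochs: int = 10) -> List[float]:
    """Train a perceptron on one shard; only local data is touched."""
    w = [0.0] * dim
    for _ in range(epochs):
        for x, y in data:
            score = sum(wi * xi for wi, xi in zip(w, x))
            if y * score <= 0:  # misclassified (or on the boundary): update
                w = [wi + y * xi for wi, xi in zip(w, x)]
    return w

def mix(weight_vectors: List[List[float]]) -> List[float]:
    """Average the per-shard weight vectors (the only data sent over the network)."""
    n = len(weight_vectors)
    return [sum(column) / n for column in zip(*weight_vectors)]

def predict(w: Sequence[float], x: Sequence[float]) -> int:
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else -1

# Toy shards (made up): label is +1 when the first feature exceeds the second.
# The third feature is a constant bias term.
shards = [
    [((2.0, 0.0, 1.0), 1), ((0.0, 2.0, 1.0), -1)],
    [((3.0, 1.0, 1.0), 1), ((1.0, 3.0, 1.0), -1)],
]

local_models = [train_perceptron(shard, dim=3) for shard in shards]
mixed = mix(local_models)
print(mixed, predict(mixed, (5.0, 1.0, 1.0)), predict(mixed, (1.0, 5.0, 1.0)))
```

Each shard sends `dim` numbers instead of its training examples, so the communication cost depends on model size rather than data size.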

Under the API

Operations Research and Optimization

Operations Research Challenges
Size: Optimization is often NP-complete

Increasing the size by 1 doubles the search space
The tools are barely keeping up with the problems

Uncertainty: Data is often fuzzy
How do you route cars when there are roadblocks, new one-ways, traffic jams?
Can you use optimization in an on-line algorithm connected to users?
How well can you optimize against forecasted data? How do you react if the forecast is bad?

User expectation and requirements
The definition of problems is also unclear
What is the objective? What is a good solution? Can I violate this requirement? By how much?
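
The remark that increasing the size by 1 doubles the search space can be made concrete. The sketch assumes each unit of problem size is one yes/no decision, which is a simplification chosen for illustration; the function names are hypothetical.

```python
from itertools import product

def search_space_size(n_decisions: int) -> int:
    """Number of complete assignments for n independent yes/no decisions."""
    return 2 ** n_decisions

def enumerate_assignments(n_decisions: int):
    """Brute-force enumeration; only feasible for small n."""
    return list(product((0, 1), repeat=n_decisions))

# Adding one decision doubles the space.
for n in (10, 20, 30, 40):
    print(n, search_space_size(n), search_space_size(n + 1) // search_space_size(n))

# Sanity check of the formula against explicit enumeration on a small case.
assert len(enumerate_assignments(4)) == search_space_size(4)
```

At 40 decisions the space already exceeds a trillion assignments, which is why exhaustive enumeration stops being an option long before real problem sizes are reached.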

Operations Research Opportunities

Machine Learning can help us in several ways
By providing guidance towards good solutions
By qualifying valid solutions
By reducing the search space

Large computing resources means we can try a bit harder
Crowd-sourcing means better data, better feedback, better evaluations of algorithms and solutions
Having all our code open-source means we can collaborate on building the best set of tools

See http://code.google.com/p/or-tools

Semantic Processing

Web Inference and Learning
Goal: better understanding of Web content and user intent
Method: algorithms that draw reliable semantic inferences from the wealth of evidence implicit in massive Web data

How to interpret this term in this context?

Does this sentence answer that question?

Will this user click on that ad?

Learning: create concise representations to support good inferences

Meaning from the Web

Elementary semantic inference: what are the possible classes for each instance?

Combination wins

Combined graph
14M nodes
75M edges

Applications to Society Follow

Earth Engine

Motivation - Carbon Forest Tracking

UNEP Atlas of our Changing Environment

Rondonia, Brazil: 1975, 1989, 2001

A Sampling of Other Use Cases

Disease Early Warning: Remote surveillance of disease and prediction of epidemics

Population Census: Supplements traditional census and mapping in developing regions

Humanitarian Crisis Mapping: Can detect and monitor a growing range of crisis types
Water Resources: Monitor water quality and availability and alleviate water shortage problems
Food Security: Famine early warning, rainfall and water requirements estimations, agr. production estimates, and irrigation and fertilizer supply & demand
Global Education Programs

Parallel Geo-Processing "in the cloud"

(a brief illustration)

Original image

Original image is divided into 256px sub-units

Sub-units are distributed to separate machines where they can be processed in parallel

Thousands can be processed simultaneously

Result is reassembled into a finished image
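
The divide, distribute, and reassemble steps above can be sketched on a single machine. This is an illustration under stated assumptions, not Earth Engine code: a thread pool stands in for the separate machines, the image is a plain list of rows, and the per-pixel operation (`invert`) and all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

Image = List[List[int]]  # rows of 8-bit pixel values
TILE = 256               # sub-unit edge length in pixels, as in the slides

def split(image: Image, tile: int = TILE) -> Dict[Tuple[int, int], Image]:
    """Divide the image into tile x tile sub-units keyed by their top-left corner."""
    tiles = {}
    for r in range(0, len(image), tile):
        for c in range(0, len(image[0]), tile):
            tiles[(r, c)] = [row[c:c + tile] for row in image[r:r + tile]]
    return tiles

def invert(pixels: Image) -> Image:
    """Example per-pixel operation: invert 8-bit values."""
    return [[255 - p for p in row] for row in pixels]

def reassemble(tiles: Dict[Tuple[int, int], Image], height: int, width: int) -> Image:
    """Paste processed sub-units back at their original positions."""
    out = [[0] * width for _ in range(height)]
    for (r, c), pixels in tiles.items():
        for dr, row in enumerate(pixels):
            out[r + dr][c:c + len(row)] = row
    return out

def process_in_parallel(image: Image, op: Callable[[Image], Image] = invert,
                        tile: int = TILE, workers: int = 4) -> Image:
    tiles = split(image, tile)
    # Worker threads stand in for separate machines; map() preserves input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        processed = dict(zip(tiles.keys(), pool.map(op, tiles.values())))
    return reassemble(processed, len(image), len(image[0]))

# Synthetic 500 x 600 test image; edge tiles are smaller than 256 px.
image = [[(r * 7 + c) % 256 for c in range(600)] for r in range(500)]
result = process_in_parallel(image)
```

This works because the operation is purely per-pixel, so no sub-unit needs data from its neighbours. Operations that look at neighbouring pixels would need overlapping tiles, which this sketch does not handle.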

Global-scale earth observation and informatics platform
For public benefit and to support emerging green economy
Help science come out of research lab and into operational use at scale
Unprecedented catalog of earth observation data for mining and analysis
Promote transparency, reproducibility, collaboration, "open science"

Very fast computation of scientific map products
Intrinsically-parallel pixel processing system
Built-in Google algorithms as well as user-supplied
Earth Engine API for 3rd party algorithm development
Access control, versioning, provenance
Online and desktop versions (open source desktop version)

On a lot of useful data
Every available Landsat and MODIS scene (more satellites coming)
Commercial datasets (very high resolution satellite imagery)
Environmental data (atmospheric, ocean, terrestrial)
User-supplied (e.g. in-situ data collected via Android phones)

Overview

Scale of Data

US: Satellite imagery dataset Landsat
Peru: 60 Landsat scenes (3 Gpix, 20 GB) per covering
World: 8000 Landsat scenes (2 TB) per covering
Complete global coverage every 16 days
Operating since 1972; historical archive holds ~4 PB
US: NASA EOS approaching ~10 PB of Earth images

Europe: European Space Agency (ESA)
ESA satellite missions: MERIS, Envisat, others
SpotImage (France): 20M SPOT images since 1986, 10,000 new images collected daily, 5+ PB archive
ESA launching Sentinel-1 in 2011

Representative Examples

Envisat: Gulf Oil Spill, June 2010 (ESA)
MERIS: Hurricane Isabel, Sept 2003 (ESA)
Spot Image: Xingu, Brazilian Amazon

View on YouTube

Health (US)

Google Health: personal health dashboard

Launched in May 2008

Major update Sept 2010

User controls and owns his/her data

A platform that...
Provides a dashboard for wellness information & medical records
Allows user to connect and interact with a broad group of "add on" services
Includes a non-tethered PHR

Crisis Response

Person Finder

Pre-Earthquake - Aug 26 2009

1 Day After Earthquake - Jan 13 2010

13 Days After Earthquake - Jan 25 2010

Digital Humanities

Illuminating the Humanities

Q: What can you do with 12 million books in over 400 languages, comprising 5 billion pages and 2 trillion words, all digitized?

A: Look to the humanities for new questions
How would you (re)define Victorian literature?

What are the differences between the English and Latin editions of Hobbes' Leviathan?

How have places changed over the course of history?

Digital Humanities Awards

Research program supporting university research taking a computational approach to traditional humanist questions

US program: Summer 2010

12 projects
23 researchers
15 universities

European program: Winter 2010

10 projects planned, $1M total funding

Education

Curriculum Development

Seeding and supporting computing curriculum development
Exploring computational thinking in K12 (launch late Oct)
CS4HS: High school computer science (cs4hs.com)
Undergraduate open source CS curriculum: Google Code University (code.google.com/edu)
Lantern platform: Wiki for open source curriculum development (in collaboration with Khan Academy)

Talent Development: Google Summer of Code™

Program Genesis
"Flip bits, not burgers" during summer holidays
Exposure to real-world software development

Students paired with mentor from OS community
Execute to milestones laid out in accepted application
Stipend allows students to concentrate on OS development

2010: 1026 students, 150 organizations, 69 countries

Technology Leadership: App Inventor for Android

Visual programming environment for Android mobile devices
Helping people become creators (rather than consumers) of technology
Launched in Google Labs July 12, 2010

http://appinventor.googlelabs.com/about


Page 9: Rapid Advances in Computer Science and Opportunities for ... · Machine learning techniques applied to monitoring/controlling such systems Automatic, dynamic world-wide placement

Our Innovation Culture

Focus on talentDistributed across the organization

Impacting Google necessitates broad diverse involvement in science and engineeringResearch is done both in our research team and in our engineering organization organized opportunistically

Teams benefit greatlyFrom mutual talentFrom Googlersquos comparative advantages to our scale and broad useFrom service-based architecture (ldquoeaserdquo of working in vivo)

Ideal Distributed Computing

Devices

Research Challenges in Ideal Distributed Computing

Alternative designs that would give better energy efficiency at lower utilizationServer OS design aimed at many highly-connected machines in one buildingUnifying abstractions for exploiting parallelism beyond inter-transaction parallelism and map-reduceLatency reductionA general model of replication including consistency choices explained and codifiedMachine learning techniques applied to monitoringcontrolling such systemsAutomatic dynamic world-wide placement of data amp computation to minimize latency andor cost given constraints onBuilding retrieval systems that efficiently and usably deal with ACLsHolistic models of privacyThe user interface to the userrsquos diverse processing and state

Totally Transparent Processing

D The set of all end-user access

devices

L The set of all human languages

M The set of all modalities

C The set of all corpora

Personal ComputersPhoneMedia PlayersReadersTelematicsSet-top BoxesAppliancesHealth deviceshellip

Current languagesHistorical languagesOther forms of human notationPossible language specializationFormal languageshellip

TextImageAudioVideoGraphicsOther sensor-based datahellip

The normal webThe deep webPeriodicalsBooksCatalogsBlogsGeodataScientific datasetsHealth datahellip

For all d in D all l in L all m in M and all c in C

Totally Transparent Processing

ldquoHybridrdquo Intelligence

To extend the capability of people not in isolationAggregation of empirical signal is exceedingly valuableEx

Feedback in Information Retrieval eg in ranking or spelling correctionMachine learning eg image content analysis speech recognition with semi-supervised learning

Research Challenges in Transparent Computing amp Hybrid Intelligence

Endless applications with very new user interface implicationsAddressing limits to dataTechniques to integrate user-feedback in acceptable fashionsApproaches to new signalExplanation scale and variance minimization in machine learningInformation fusionlearning across diverse signals ndash The Combination Hypothesis more generallyUsability devices and subpopulationsPrivacy

Domains of Application

Search enginesTranslationSpeech recognitionVision

Remedial EducationPersonal healthEpidemiologyEconomic predictionSocietalenvironmental optimizationSocial Networking in ever more cleveruseful ways Humanities and Social SciencesMulti-player gaming

Translation

Machine Translation Google

Statistical Machine TranslationModel translation process with a statistical modelLearning from data monolingual amp bilingual

More data better translation qualityComputationally expensive approach

Models have many hundreds of Gigabyte of data(Moores law helps here)

Applying syntax information as a signal

ResultsMuch better translation qualityOngoing progress

More research groups 58 languages (so far)

recently Haitian Creole Urdu Georgian Latin

Grand Challenges

Morphologytranslating into morphologically rich languageseg Russian Hungarianneed morphology-aware translation models

Reliabilitysome translation mistakes more severe than others

hotel - MontrealHeath Ledger - Tom Cruise

Research How to detect crazy translations

Long-distance reorderingsimple case SVO SOV(one) approach parse source amp reorder

issue parsing accuracy for out-of-domain texts

Finding all Training Data

How about Poetry

Paper at EMNLP 2010 conferenceldquoPoeticrdquo Statistical Machine Translation Rhyme and Meter DGenzel JUszkoreit FOch EMNLP 2010

ApproachEnforce meter and rhyme as extra constraints(similar to language model)Eg iambic pentameter stress pattern 0101010101Produce most probable translation that obeys constraints(Function follows form)

Example output (couplet in amphibrachic tetrameterAn officer stated that three were arrestedand that the equipment is currently tested

Speech

Goals for Speech Technology at Google

Much of the worldrsquos information is spoken ndash we need to recognize it before we can organize it

YouTube transcription and translation (breaking the language barrier for YouTube access)

Voicemail transcription Mobile is the fastest growing and most widespread platform for communication and services that has ever existed

Spoken input and output is key to usability

Our goal is completely ubiquitous availability of speech io (every applicationservice every usage scenario every language)

How do we get thereDelivery from the cloud ndash support constant iteration and refinement

Operating at large scale ndash train huge statistical models on huge amounts of data

Learning from use - without human transcription

ChallengesHow do we grow the model to take advantage of the data (richer models of accent speaker noise etc)Huge computational demandsInfrastructure demands ndash parallelization ndash leverage Google software environment

Training Acoustic Models wUnsupervised Learning

Supervised vs unsupervised training - hours of

data vs error rate

Vision

Computer Vision

Advance state-of-the art in 3 key areas of imageaudiovideo analysis and apply results to our multimedia products

Semantic Interpretation Generate human understandable description of content (eg auto-tagging videos on YouTube Image annotation porn classification etc)Matching Find similar entities from a large corpus (eg find similar on image search video fingerprinting for YouTube etc )Synthesis Generate better imagesvideo by understanding the statistics of a large corpus of images (eg better facades in 3D building on Google Earth automatic shadow removal from areal images etc)

Semantic Interpretation sample problem - Video Annotation

Video metadata has a cognitive cost on the user because they have to type it in be careful about what keywords they use and in general try to make their video searchableMany uploaders donrsquot have the motivation or energy to provide proper metadataNoisy metadata hurts everyone ndash spam misspellings 1337 acronyms etc

Cloud-based ComputingStructured Data

Structured Data on the Web

Discovery and search for structured dataThe deep Web -- significant gap in coverageStructured tables on the Web -- not leveraged in search

Enable easy creation management sharing and publishing of structured data

Fusion Tables wwwgooglecomfusiontables

Google Fusion Tables host manage collaborate on visualize and publish data tables online

What can I do with Fusion Tables

Host data online - and stay in controlcontrol can be at the level of columns or rows

Re-use data without making copies

Collaborate on the detailsMerge data from multiple tablesComment on individual rows columns or cells

Make a map (or chart or timeline) in minutes

Manage data via our site or an API

Fusion Table Example Gallery

Easy Data Upload Attribution recorded

Easily Create Informative Maps

Easily Create Informative Maps

baby steps towards the dream platform

DEMOcircle of blue

Cloud-based ComputingPrediction

1 Upload

2 Train

Upload your training data toGoogle Storage

Build a model from your data

Make new predictions3 Predict

Machine learning as a web serviceSmart Apps for every developer

- RESTful HTTP service- Simple integration

Prediction API

Under the hood many classifiersregressorsRecent research efficient and theoretically principled methods for distributed learning (NIPS-09 HLT-10)

Network costs can be reduced by an order of magnitude with minimal loss in classifier accuracy

Under the API

Operations Reseach and Optimization

Operations Research ChallengesSize Optimization is often NP Complete

Increasing the size by 1 doubles the search spaceThe tools are barely keeping up with the problems

Uncertainty Data is often fuzzy How do you route cars when there are roadblocks new one-ways traffic jamsCan you use optimization in on-line algorithm connected to usersHow well can you optimize against forecasted data how do you react if the forecast is bad

User expectation and requirementsThe definition of problems is also unclear What is the objective What is a good solution Can I violate this requirement By how much

Operations Research Opportunities

Machine Learning can help us in two waysBy providing guidance towards good solutionsBy qualifying valid solutionsBy reducing the search space

Large computing resources means we can try a bit harderCrowd-Sourcing means better data better feedback better evaluations of algorithms and solutionsHaving all our code open-source means we can collaborate on building the best set of tools

See httpcodegooglecompor-tools

Semantic Processing

Web Inference and LearningGoal better understanding of Web content and user intentMethod algorithms that draw reliable semantic inferences from the wealth of evidence implicit in massive Web data

How to interpret this term in this context

Does this sentence answer that question

Will this user click on that ad

Learning create concise representations to support good inferences

Meaning from the Web

Elementary semantic inference what are the possible classes for each instance

Combination wins

Combined graph14M nodes75M edges

Applications to Society Follow

Earth Engine

Motivation - Carbon Forest Tracking

UNEP Atlas of our Changing Environment

1975 1989 2001

Rondonia Brazil

A Sampling of Other Use Cases

Disease Early Warning Remote surveillance of disease and prediction of epidemics

Population Census Supplements traditional census and mapping in developing regions

Humanitarian Crisis Mapping Can detect and monitor a growing range of crisis typesWater Resources Monitor water quality and availability and alleviate water shortage problemsFood Security Famine early warning rainfall and water requirements estimations agr production estimates and irrigation and fertilizer supply amp demandGlobal Education Programs

Parallel Geo-Processing ldquoin the cloudrdquo

(a brief illustration)

Original image

Original image is divided into 256px sub-units

Sub-units are distributed

Sub-units are distributed to separate machines

Sub-units are distributed to separate machines where they can be processed in parallel

Thousands can be processed simultaneously

Result is reassembled

Result is reassembled into a finished image

Global-scale earth observation and informatics platformFor public benefit and to support emerging green economyHelp science come out of research lab and into operational use at scaleUnprecedented catalog of earth observation data for mining and analysisPromote transparency reproducibility collaboration ldquoopen sciencerdquo

Very fast computation of scientific map productsIntrinsically-parallel pixel processing systemBuilt-in Google algorithms as well as user-suppliedEarth Engine API for 3rd party algorithm developmentAccess control versioning provenanceOnline and desktop versions (open source desktop version)

On a lot of useful dataEvery available Landsat and MODIS scene (more satellites coming)Commercial datasets (very high resolution satellite imagery)Environmental data (atmospheric ocean terrestrial)User-supplied (ex in-situ data collected via Android phones)

Overview

Scale of Data

US Satellite imagery dataset LandsatPeru 60 Landsat scenes (3Gpix 20GB) per coveringWorld 8000 Landsat scenes (2TB) per coveringComplete global coverage every 16 daysOperating since 1972 historical archive holds ~4PBUS NASA EOS approaching ~10PB of Earth images

Europe European Space Agency (ESA)ESA satellite missions MERIS Envisat othersSpotImage (France) 20M SPOT images since 1986 10000 new images collected daily 5+ PB archiveESA Launching Sentinel-1 in 2011

Representative Examples

Envisat Gulf Oil Spill June 2010 (ESA))MERIS Hurricane Isabel Sept 2003 (ESA) Spot Image Xingu Brazilian Amazon

View on YouTube

Health (US)

Google Health personal health dashboard

Launched in May 2008

Major update Sept 2010

User controls and owns hisher data

A platform thathellipProvides a dashboard for wellness information amp medical recordsAllows user to connect and interact with a broad group of ldquoadd onrdquo servicesIncludes a non-tethered PHR

Crisis Response

Person Finder

Pre-Earthquake - Aug 26 2009

1 Day After Earthquake - Jan 13 2010

13 Days After Earthquake - Jan 25 2010

Digital Humanities

Illuminating the Humanities

Q What can you do with12 million books inover 400 languagescomprised of 5 billion pages and 2 trillion wordsall digitized

A Look to the humanities for new questionsHow would you (re)define Victorian literature

What are the differences between the English and Latin editions of Hobbesrsquo Leviathan

How have places changed over the course of history

Digital Humanities Awards

Research program supporting university research taking a computational approach to traditional humanist questions US program Summer 2010

12 projects23 researchers15 universities

European program Winter 2010

10 projects planned $1M total funding

Education

Curriculum Development

Seeding and supporting computing curriculum developmentExploring computational thinking in K12 (launch late Oct)CS4HS High school computer science (cs4hscom)Undergraduate open source CS curriculum Google Code University (codegooglecomedu)Lantern platform Wiki for open source curriculum development (in collaboration with Khan Academy)

Talent Development Google Summer of CodeTM

Program GenesisrdquoFlip bits not burgersrdquo during summer holidaysExposure to real-world software development

Students paired with mentor from OS communityExecute to milestones laid out in accepted applicationStipend allows students to concentrate on OS development

20101026 students150 organizations 69 countries

Technology Leadership App Inventor for Android

Visual programming environment for Android mobile devicesHelping people become creators (rather than consumers) of technologyLaunched in Google Labs July 12 2010

httpappinventorgooglelabscomabout

CS4HS European Workshops

- École Polytechnique Fédérale de Lausanne (EPFL), Switzerland: Building and Programming Robots
- ETH Zurich, Switzerland: ABZ, Ausbildungs- und Beratungszentrum für Informatikunterricht
- Makerere University, Uganda: Grassroot approach to improve the quality of applicants of computing programs at Makerere University
- Manchester University, United Kingdom: Animation11
- Oslo University, Norway: TENK
- Queen Mary University, United Kingdom: cs4fn magazine
- RWTH Aachen, Germany: Bright Brains in Computer Science
- Sapienza University of Rome, Italy: Challenge and Fun with the CS Olympiads
- University of Stuttgart, Germany: UniS2010
- Technion, Israel: High School Computer Science Female Students' Visits in Google: Impressions, Conceptions and Influences
- Trinity College Dublin, Ireland: Computer Programming Outreach B2C
- University of Cape Town, South Africa: Project Umonya
- University College Dublin, Ireland: CS Summer School
- University of Warsaw, Poland: Mastering Programming Skills, Workshops for Teachers

Technology Leadership: Google Code University

Course content on current computing technologies and paradigms:
- CS Curriculum Search
- Tech Talks on CS topics
- Tutorials, lecture slides, and problem sets for a variety of topic areas

Topic areas:
- AJAX Programming
- Algorithms
- Distributed Systems
- Web Security
- Languages
- Practical Skills (MySQL, Linux)

http://code.google.com/edu

Supporting our Academic Institutions

Research Awards Programs - 230+ projects funded in the last year
- Next due dates: August 15 (CS Awards), October 15 (Marketing Awards)
- Research-awardsgooglecom
- Focused Grant Program

Visiting Faculty Program - 20 faculty (ongoing)
- University-relationsgooglecom

PhD Fellowship Program
- 2009: 13 students supported in North America
- 2010: 15 in North America, 16 outside North America
- Over 150 other scholarships

~1000 interns worldwide

CS4HS: 1500+ teachers (~100000 students), US & EMEA

Final Thoughts

Scale of communication and computing is profound
- Endless opportunity for technical growth

Some large themes
- Major new application domains

Google is working to innovate rapidly in science, technology, and value to consumers

We are providing increased support for academic institutions in computer science and related areas

It's a most exciting area in which to innovate


Ideal Distributed Computing

Devices

Research Challenges in Ideal Distributed Computing

- Alternative designs that would give better energy efficiency at lower utilization
- Server OS design aimed at many highly-connected machines in one building
- Unifying abstractions for exploiting parallelism beyond inter-transaction parallelism and map-reduce
- Latency reduction
- A general model of replication, including consistency choices, explained and codified
- Machine learning techniques applied to monitoring/controlling such systems
- Automatic, dynamic world-wide placement of data & computation to minimize latency and/or cost given constraints on
- Building retrieval systems that efficiently and usably deal with ACLs
- Holistic models of privacy
- The user interface to the user's diverse processing and state

Totally Transparent Processing

D: The set of all end-user access devices
Personal computers, phone, media players, readers, telematics, set-top boxes, appliances, health devices, ...

L: The set of all human languages
Current languages, historical languages, other forms of human notation, possible language specialization, formal languages, ...

M: The set of all modalities
Text, image, audio, video, graphics, other sensor-based data, ...

C: The set of all corpora
The normal web, the deep web, periodicals, books, catalogs, blogs, geodata, scientific datasets, health data, ...

For all d in D, all l in L, all m in M, and all c in C

Totally Transparent Processing

"Hybrid" Intelligence

- To extend the capability of people, not in isolation
- Aggregation of empirical signal is exceedingly valuable

Ex.:
- Feedback in information retrieval, e.g., in ranking or spelling correction
- Machine learning, e.g., image content analysis, speech recognition with semi-supervised learning

Research Challenges in Transparent Computing & Hybrid Intelligence

- Endless applications with very new user interface implications
- Addressing limits to data
- Techniques to integrate user feedback in acceptable fashions
- Approaches to new signal
- Explanation, scale, and variance minimization in machine learning
- Information fusion/learning across diverse signals - The Combination Hypothesis, more generally
- Usability, devices, and subpopulations
- Privacy

Domains of Application

- Search engines
- Translation
- Speech recognition
- Vision

- Remedial education
- Personal health
- Epidemiology
- Economic prediction
- Societal/environmental optimization
- Social networking in ever more clever/useful ways
- Humanities and social sciences
- Multi-player gaming

Translation

Machine Translation at Google

Statistical Machine Translation
- Model the translation process with a statistical model
- Learning from data: monolingual & bilingual

More data, better translation quality

Computationally expensive approach
- Models have many hundreds of gigabytes of data
- (Moore's law helps here)

Applying syntax information as a signal

Results
- Much better translation quality
- Ongoing progress
- More research groups
- 58 languages (so far); recently: Haitian Creole, Urdu, Georgian, Latin
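One common textbook way to write down a statistical translation model is the noisy-channel formulation, shown here only as an illustration. The slides do not say which exact model is used, so the symbols are assumptions: f is the source sentence, e a candidate translation, P(e) a language model that could be learned from monolingual data, and P(f | e) a translation model that could be learned from bilingual data.

```latex
\hat{e} \;=\; \arg\max_{e} P(e \mid f) \;=\; \arg\max_{e} P(e)\, P(f \mid e)
```

The second equality follows from Bayes' rule, since P(f) does not depend on e.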

Grand Challenges

Morphology
- Translating into morphologically rich languages, e.g., Russian, Hungarian
- Need morphology-aware translation models

Reliability
- Some translation mistakes are more severe than others
- hotel -> Montreal; Heath Ledger -> Tom Cruise
- Research: how to detect crazy translations?

Long-distance reordering
- Simple case: SVO -> SOV
- (One) approach: parse source & reorder
- Issue: parsing accuracy for out-of-domain texts
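The "parse source & reorder" idea for the simple SVO to SOV case can be sketched as follows. This is a toy illustration, not the production approach: it assumes a parser has already labeled each constituent with a hypothetical role tag ("S", "V", "O"), and the function name is made up for this example.

```python
def reorder_svo_to_sov(constituents):
    """Move verb constituents after the object, keeping everything else in order.

    `constituents` is a list of (role, phrase) pairs, assumed to come from a parser.
    """
    verbs = [c for c in constituents if c[0] == "V"]
    others = [c for c in constituents if c[0] != "V"]
    return others + verbs


# Hypothetical parsed source sentence in subject-verb-object order.
parsed = [("S", "the officer"), ("V", "arrested"), ("O", "three people")]
print(" ".join(phrase for _, phrase in reorder_svo_to_sov(parsed)))
```

Because the reordering relies entirely on the role labels, a parsing error on out-of-domain text propagates directly into the word order, which is the issue noted above.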

Finding all Training Data

How about Poetry?

Paper at EMNLP 2010 conference: "Poetic" Statistical Machine Translation: Rhyme and Meter. D. Genzel, J. Uszkoreit, F. Och. EMNLP 2010.

Approach
- Enforce meter and rhyme as extra constraints (similar to language model)
- E.g., iambic pentameter: stress pattern 0101010101
- Produce most probable translation that obeys constraints
- (Function follows form)

Example output (couplet in amphibrachic tetrameter):
An officer stated that three were arrested
and that the equipment is currently tested
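The idea of producing the most probable translation that obeys a meter constraint can be sketched in a few lines. This is a simplified illustration, not the method of the paper: the stress lexicon, the candidate list, and the log-probabilities are all hand-made assumptions, and a real decoder would apply the constraint during search rather than filtering finished candidates.

```python
# Hypothetical per-word stress marks (1 = stressed syllable, 0 = unstressed).
STRESS = {
    "an": "0", "officer": "100", "stated": "10", "said": "1",
    "that": "0", "three": "1", "were": "0", "arrested": "010",
}

# Amphibrachic tetrameter: four feet of unstressed-stressed-unstressed.
AMPHIBRACHIC_TETRAMETER = "010" * 4


def line_stress(line, lexicon):
    """Concatenate the stress marks of every word in the line."""
    return "".join(lexicon[word] for word in line.lower().split())


def best_metrical(candidates, pattern, lexicon):
    """Return the highest-scoring candidate whose stress matches the pattern."""
    valid = [(score, text) for text, score in candidates
             if line_stress(text, lexicon) == pattern]
    return max(valid)[1] if valid else None


# Made-up candidate translations with made-up model log-probabilities.
CANDIDATES = [
    ("an officer said three were arrested", -2.0),
    ("an officer stated that three were arrested", -3.5),
]

print(best_metrical(CANDIDATES, AMPHIBRACHIC_TETRAMETER, STRESS))
```

The first candidate has the higher score but the wrong stress pattern, so the constrained choice falls to the second one.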

Speech

Goals for Speech Technology at Google

- Much of the world's information is spoken - we need to recognize it before we can organize it
- YouTube transcription and translation (breaking the language barrier for YouTube access)
- Voicemail transcription
- Mobile is the fastest growing and most widespread platform for communication and services that has ever existed
- Spoken input and output is key to usability
- Our goal is completely ubiquitous availability of speech i/o (every application/service, every usage scenario, every language)

How do we get there?
- Delivery from the cloud - support constant iteration and refinement
- Operating at large scale - train huge statistical models on huge amounts of data
- Learning from use - without human transcription

Challenges
- How do we grow the model to take advantage of the data (richer models of accent, speaker, noise, etc.)?
- Huge computational demands
- Infrastructure demands - parallelization - leverage Google software environment

Training Acoustic Models w/ Unsupervised Learning

[Figure: supervised vs. unsupervised training, hours of data vs. error rate]

Vision

Computer Vision

Advance the state of the art in 3 key areas of image/audio/video analysis and apply results to our multimedia products:
- Semantic Interpretation: generate human-understandable descriptions of content (e.g., auto-tagging videos on YouTube, image annotation, porn classification, etc.)
- Matching: find similar entities from a large corpus (e.g., find similar on image search, video fingerprinting for YouTube, etc.)
- Synthesis: generate better images/video by understanding the statistics of a large corpus of images (e.g., better facades in 3D buildings on Google Earth, automatic shadow removal from aerial images, etc.)

Semantic Interpretation sample problem - Video Annotation

- Video metadata has a cognitive cost for users because they have to type it in, be careful about which keywords they use, and in general try to make their video searchable
- Many uploaders don't have the motivation or energy to provide proper metadata
- Noisy metadata hurts everyone - spam, misspellings, 1337, acronyms, etc.

Cloud-based Computing: Structured Data

Structured Data on the Web

Discovery and search for structured data
- The deep Web -- significant gap in coverage
- Structured tables on the Web -- not leveraged in search

Enable easy creation, management, sharing, and publishing of structured data

Fusion Tables: www.google.com/fusiontables

Google Fusion Tables: host, manage, collaborate on, visualize, and publish data tables online

What can I do with Fusion Tables?

- Host data online - and stay in control; control can be at the level of columns or rows
- Re-use data without making copies
- Collaborate on the details: merge data from multiple tables; comment on individual rows, columns, or cells
- Make a map (or chart or timeline) in minutes
- Manage data via our site or an API

Fusion Table Example Gallery

Easy Data Upload; Attribution Recorded

Easily Create Informative Maps

Baby steps towards the dream platform

DEMO: Circle of Blue

Cloud-based Computing: Prediction

Prediction API

Machine learning as a web service: smart apps for every developer
- RESTful HTTP service
- Simple integration

1. Upload: upload your training data to Google Storage
2. Train: build a model from your data
3. Predict: make new predictions
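The upload, train, predict workflow can be mimicked locally to show the shape of the interaction. Everything in this sketch is hypothetical: `ToyPredictionService`, its method names, and the word-count scoring are stand-ins invented for illustration and are not the Prediction API or its request format.

```python
from collections import Counter, defaultdict


class ToyPredictionService:
    """In-memory stand-in for a hosted upload/train/predict service."""

    def __init__(self):
        self._storage = {}  # plays the role of uploaded training data
        self._models = {}   # trained models, keyed by dataset name

    def upload(self, name, rows):
        """Step 1: store labeled rows of the form (label, text)."""
        self._storage[name] = list(rows)

    def train(self, name):
        """Step 2: build a per-label word-count model from the stored rows."""
        counts = defaultdict(Counter)
        for label, text in self._storage[name]:
            counts[label].update(text.lower().split())
        self._models[name] = counts

    def predict(self, name, text):
        """Step 3: return the label whose training words overlap most with the text."""
        words = text.lower().split()
        model = self._models[name]
        return max(model, key=lambda label: sum(model[label][w] for w in words))


service = ToyPredictionService()
service.upload("languages", [
    ("english", "the cat sat on the mat"),
    ("french", "le chat est sur le tapis"),
    ("english", "where is the station"),
    ("french", "où est la gare"),
])
service.train("languages")
print(service.predict("languages", "the station is on the left"))
```

A hosted service would replace the in-memory dictionaries with remote storage and HTTP calls, but the three-step contract seen by the developer stays the same.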

Under the API

- Under the hood: many classifiers/regressors
- Recent research: efficient and theoretically principled methods for distributed learning (NIPS-09, HLT-10)
- Network costs can be reduced by an order of magnitude with minimal loss in classifier accuracy

Operations Research and Optimization

Operations Research Challenges

Size: optimization is often NP-complete
- Increasing the size by 1 doubles the search space
- The tools are barely keeping up with the problems

Uncertainty: data is often fuzzy
- How do you route cars when there are roadblocks, new one-ways, traffic jams?
- Can you use optimization in an on-line algorithm connected to users?
- How well can you optimize against forecasted data? How do you react if the forecast is bad?

User expectation and requirements: the definition of problems is also unclear
- What is the objective? What is a good solution? Can I violate this requirement? By how much?
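To make the size challenge concrete, the following sketch counts the candidate solutions of a problem with n yes/no decisions by brute-force enumeration. It assumes each unit of problem size adds one binary decision, which is a simplification chosen for illustration; the function name is made up.

```python
from itertools import product


def count_assignments(n):
    """Count every yes/no assignment for n binary decisions by enumerating them."""
    return sum(1 for _ in product((0, 1), repeat=n))


# Each extra decision doubles the number of assignments to examine.
for n in range(1, 6):
    print(n, count_assignments(n))
```

At this rate, 40 binary decisions already give about a trillion assignments, which is why exhaustive search stops being an option so quickly.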

Operations Research Opportunities

Machine learning can help us in several ways:
- By providing guidance towards good solutions
- By qualifying valid solutions
- By reducing the search space

Large computing resources mean we can try a bit harder

Crowd-sourcing means better data, better feedback, and better evaluations of algorithms and solutions

Having all our code open-source means we can collaborate on building the best set of tools

See http://code.google.com/p/or-tools

Semantic Processing

Web Inference and Learning

Goal: better understanding of Web content and user intent

Method: algorithms that draw reliable semantic inferences from the wealth of evidence implicit in massive Web data
- How to interpret this term in this context?
- Does this sentence answer that question?
- Will this user click on that ad?

Learning: create concise representations to support good inferences

Meaning from the Web

Elementary semantic inference: what are the possible classes for each instance?

Combination wins

Combined graph: 14M nodes, 75M edges

Applications to Society Follow

Earth Engine

Motivation - Carbon Forest Tracking

[Images: Rondonia, Brazil, in 1975, 1989, and 2001. Source: UNEP Atlas of Our Changing Environment]

A Sampling of Other Use Cases

- Disease Early Warning: remote surveillance of disease and prediction of epidemics
- Population Census: supplements traditional census and mapping in developing regions
- Humanitarian Crisis Mapping: can detect and monitor a growing range of crisis types
- Water Resources: monitor water quality and availability, and alleviate water shortage problems
- Food Security: famine early warning, rainfall and water requirements estimations, agricultural production estimates, and irrigation and fertilizer supply & demand
- Global Education Programs

Parallel Geo-Processing "in the cloud"

(a brief illustration)

1. Original image
2. Original image is divided into 256px sub-units
3. Sub-units are distributed to separate machines, where they can be processed in parallel
4. Thousands can be processed simultaneously
5. Result is reassembled into a finished image
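The divide, distribute, and reassemble steps above can be sketched on a single machine. This is only an illustration under stated assumptions: a thread pool stands in for the separate machines, the image is a plain list of pixel rows, and the per-tile operation (inverting pixel values) is a placeholder for a real analysis; all function names are made up.

```python
from concurrent.futures import ThreadPoolExecutor

TILE = 256  # sub-unit size in pixels, as on the slide


def split_into_tiles(image, tile=TILE):
    """Yield (row, col, sub_image) for each tile-sized block of the image."""
    height, width = len(image), len(image[0])
    for r in range(0, height, tile):
        for c in range(0, width, tile):
            yield r, c, [row[c:c + tile] for row in image[r:r + tile]]


def process_tile(job):
    """Placeholder per-tile computation: invert every pixel value."""
    r, c, tile = job
    return r, c, [[255 - px for px in row] for row in tile]


def reassemble(results, height, width):
    """Write each processed tile back at its original position."""
    out = [[0] * width for _ in range(height)]
    for r, c, tile in results:
        for dr, row in enumerate(tile):
            out[r + dr][c:c + len(row)] = row
    return out


def process_in_parallel(image, workers=4):
    """Split the image, process the tiles concurrently, and rebuild the result."""
    jobs = list(split_into_tiles(image))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(process_tile, jobs))
    return reassemble(results, len(image), len(image[0]))
```

Because each tile is processed without reference to its neighbors, the work scales by adding workers; operations that need neighboring pixels would require overlapping tile borders, which this sketch does not handle.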

Overview

Global-scale earth observation and informatics platform
- For public benefit and to support the emerging green economy
- Help science come out of the research lab and into operational use at scale
- Unprecedented catalog of earth observation data for mining and analysis
- Promote transparency, reproducibility, collaboration, "open science"

Very fast computation of scientific map products
- Intrinsically-parallel pixel processing system
- Built-in Google algorithms as well as user-supplied
- Earth Engine API for 3rd party algorithm development
- Access control, versioning, provenance
- Online and desktop versions (open source desktop version)

On a lot of useful data
- Every available Landsat and MODIS scene (more satellites coming)
- Commercial datasets (very high resolution satellite imagery)
- Environmental data (atmospheric, ocean, terrestrial)
- User-supplied (ex. in-situ data collected via Android phones)

Scale of Data

US: satellite imagery dataset, Landsat
- Peru: 60 Landsat scenes (3Gpix, 20GB) per covering
- World: 8000 Landsat scenes (2TB) per covering
- Complete global coverage every 16 days
- Operating since 1972; historical archive holds ~4PB

US: NASA EOS approaching ~10PB of Earth images

Europe: European Space Agency (ESA)
- ESA satellite missions: MERIS, Envisat, others
- SpotImage (France): 20M SPOT images since 1986; 10000 new images collected daily; 5+ PB archive
- ESA: launching Sentinel-1 in 2011

Representative Examples

- Envisat: Gulf Oil Spill, June 2010 (ESA)
- MERIS: Hurricane Isabel, Sept 2003 (ESA)
- Spot Image: Xingu, Brazilian Amazon

View on YouTube

Health (US)

Google Health: personal health dashboard

- Launched in May 2008
- Major update Sept 2010
- User controls and owns his/her data

A platform that...
- Provides a dashboard for wellness information & medical records
- Allows the user to connect and interact with a broad group of "add on" services
- Includes a non-tethered PHR

Crisis Response



~1000 interns worldwideCS4HS 1500+ teachers (~100000 students) US amp EMEA

Final Thoughts

Scale of Communication and Computing is profoundEndless opportunity for technical growth

Some large themesMajor new application domains

Google rapidly to innovate in sciencetechnology and value to consumersWe are providing increased support for academic institutions in computer science and related areas

Its a most exciting area in which to innovate

Page 12: Rapid Advances in Computer Science and Opportunities for ... · Machine learning techniques applied to monitoring/controlling such systems Automatic, dynamic world-wide placement

Totally Transparent Processing

D: The set of all end-user access devices
- Personal computers
- Phone
- Media players
- Readers
- Telematics
- Set-top boxes
- Appliances
- Health devices
- …

L: The set of all human languages
- Current languages
- Historical languages
- Other forms of human notation
- Possible language specialization
- Formal languages
- …

M: The set of all modalities
- Text
- Image
- Audio
- Video
- Graphics
- Other sensor-based data
- …

C: The set of all corpora
- The normal web
- The deep web
- Periodicals
- Books
- Catalogs
- Blogs
- Geodata
- Scientific datasets
- Health data
- …

For all d in D, all l in L, all m in M, and all c in C:

Totally Transparent Processing
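
The "for all" statement above can be read as a coverage requirement over the Cartesian product D × L × M × C. A minimal sketch of that reading, assuming tiny sample sets (the members chosen here are illustrative only; the sets on the slide are open-ended):

```python
from itertools import product

# Illustrative subsets only; the sets on the slide are open-ended.
D = ["personal computer", "phone", "reader"]   # end-user access devices
L = ["English", "Latin"]                       # human languages
M = ["text", "audio"]                          # modalities
C = ["normal web", "books"]                    # corpora

def combinations(devices, languages, modalities, corpora):
    """Enumerate every (d, l, m, c) tuple that processing should cover."""
    return list(product(devices, languages, modalities, corpora))

all_cases = combinations(D, L, M, C)
print(len(all_cases))  # 3 * 2 * 2 * 2 = 24
```

Even with these small samples the number of cases is the product of the set sizes, which suggests why covering every combination is a demanding goal.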

“Hybrid” Intelligence

To extend the capability of people, not in isolation
Aggregation of empirical signal is exceedingly valuable
Examples:
- Feedback in Information Retrieval, e.g., in ranking or spelling correction
- Machine learning, e.g., image content analysis, speech recognition with semi-supervised learning

Research Challenges in Transparent Computing & Hybrid Intelligence

- Endless applications with very new user interface implications
- Addressing limits to data
- Techniques to integrate user feedback in acceptable fashions
- Approaches to new signal
- Explanation, scale, and variance minimization in machine learning
- Information fusion/learning across diverse signals - the Combination Hypothesis, more generally
- Usability, devices, and subpopulations
- Privacy

Domains of Application

- Search engines
- Translation
- Speech recognition
- Vision

- Remedial education
- Personal health
- Epidemiology
- Economic prediction
- Societal/environmental optimization
- Social networking in ever more clever/useful ways
- Humanities and Social Sciences
- Multi-player gaming

Translation

Machine Translation at Google

Statistical Machine Translation
- Model translation process with a statistical model
- Learning from data: monolingual & bilingual
- More data, better translation quality

Computationally expensive approach
- Models have many hundreds of gigabytes of data (Moore's law helps here)
- Applying syntax information as a signal

Results
- Much better translation quality
- Ongoing progress
- More research groups
- 58 languages (so far); recently: Haitian Creole, Urdu, Georgian, Latin

Grand Challenges

Morphology
- translating into morphologically rich languages, e.g., Russian, Hungarian
- need morphology-aware translation models

Reliability
- some translation mistakes are more severe than others
- hotel - Montreal
- Heath Ledger - Tom Cruise
- Research: how to detect crazy translations

Long-distance reordering
- simple case: SVO to SOV
- (one) approach: parse source & reorder
- issue: parsing accuracy for out-of-domain texts

Finding all Training Data
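
The simple SVO-to-SOV case under long-distance reordering can be sketched as follows. This is a toy illustration, not the production approach: the role labels (S, V, O) are assumed to come from a source-side parser that is not implemented here, and the function name is hypothetical.

```python
from typing import List, Tuple

# Each token is (word, role). Roles are assumed to be supplied by a
# source-side parser, which this sketch does not implement.
Token = Tuple[str, str]

def reorder_svo_to_sov(clause: List[Token]) -> List[str]:
    """Move verb tokens after the object tokens, keeping order within groups.

    Tokens with any role other than S, V, or O are ignored in this toy version.
    """
    subject = [word for word, role in clause if role == "S"]
    verb = [word for word, role in clause if role == "V"]
    obj = [word for word, role in clause if role == "O"]
    return subject + obj + verb

clause = [("the", "S"), ("dog", "S"), ("chased", "V"), ("the", "O"), ("cat", "O")]
print(" ".join(reorder_svo_to_sov(clause)))  # the dog the cat chased
```

The sketch depends entirely on the role labels being right, which is the parsing-accuracy issue the slide raises for out-of-domain texts.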

How about Poetry?

Paper at EMNLP 2010 conference: “Poetic” Statistical Machine Translation: Rhyme and Meter. D. Genzel, J. Uszkoreit, F. Och. EMNLP 2010.

Approach
- Enforce meter and rhyme as extra constraints (similar to language model)
- E.g., iambic pentameter: stress pattern 0101010101
- Produce most probable translation that obeys constraints (Function follows form)

Example output (couplet in amphibrachic tetrameter):
An officer stated that three were arrested
and that the equipment is currently tested
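
The meter constraint can be illustrated with stress strings, where 0 is an unstressed syllable and 1 a stressed one. The sketch below is a simplification: it filters finished candidates after the fact, whereas the slide describes the constraint as working alongside the language model. The stress annotations and probabilities are hand-written assumptions, and the second candidate is invented for the example.

```python
def meter_pattern(foot: str, feet: int) -> str:
    """Repeat a foot's stress pattern (0 = unstressed, 1 = stressed)."""
    return foot * feet

IAMBIC_PENTAMETER = meter_pattern("01", 5)          # 0101010101
AMPHIBRACHIC_TETRAMETER = meter_pattern("010", 4)   # 010010010010

def obeys_meter(stresses: str, meter: str) -> bool:
    """True if a line's syllable stresses match the meter exactly."""
    return stresses == meter

def best_candidate(candidates, meter):
    """Pick the most probable candidate whose stresses obey the meter.

    candidates: list of (text, stresses, probability) tuples. The stress
    strings and probabilities are assumed to come from other components.
    """
    allowed = [c for c in candidates if obeys_meter(c[1], meter)]
    return max(allowed, key=lambda c: c[2])[0] if allowed else None

# Hand-annotated stresses; the second candidate is hypothetical.
candidates = [
    ("An officer stated that three were arrested", "010010010010", 0.2),
    ("An officer said that three people were arrested", "0100101100010", 0.5),
]
print(best_candidate(candidates, AMPHIBRACHIC_TETRAMETER))
```

The more probable candidate is rejected because it breaks the meter, so the less probable but metrical line is returned.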

Speech

Goals for Speech Technology at Google

Much of the world’s information is spoken - we need to recognize it before we can organize it
- YouTube transcription and translation (breaking the language barrier for YouTube access)
- Voicemail transcription

Mobile is the fastest growing and most widespread platform for communication and services that has ever existed
- Spoken input and output is key to usability

Our goal is completely ubiquitous availability of speech I/O (every application/service, every usage scenario, every language)

How do we get there?
- Delivery from the cloud - support constant iteration and refinement
- Operating at large scale - train huge statistical models on huge amounts of data
- Learning from use - without human transcription

Challenges
- How do we grow the model to take advantage of the data (richer models of accent, speaker, noise, etc.)?
- Huge computational demands
- Infrastructure demands - parallelization - leverage Google software environment

Training Acoustic Models w/ Unsupervised Learning

[Figure: supervised vs. unsupervised training - hours of data vs. error rate]

Vision

Computer Vision

Advance the state of the art in 3 key areas of image/audio/video analysis and apply results to our multimedia products

- Semantic Interpretation: Generate human-understandable descriptions of content (e.g., auto-tagging videos on YouTube, image annotation, porn classification, etc.)
- Matching: Find similar entities from a large corpus (e.g., find similar on image search, video fingerprinting for YouTube, etc.)
- Synthesis: Generate better images/video by understanding the statistics of a large corpus of images (e.g., better facades in 3D buildings on Google Earth, automatic shadow removal from aerial images, etc.)

Semantic Interpretation sample problem - Video Annotation

- Video metadata has a cognitive cost on the user because they have to type it in, be careful about what keywords they use, and in general try to make their video searchable
- Many uploaders don’t have the motivation or energy to provide proper metadata
- Noisy metadata hurts everyone - spam, misspellings, 1337, acronyms, etc.

Cloud-based Computing: Structured Data

Structured Data on the Web

Discovery and search for structured data
- The deep Web -- significant gap in coverage
- Structured tables on the Web -- not leveraged in search

Enable easy creation, management, sharing, and publishing of structured data

Fusion Tables: www.google.com/fusiontables

Google Fusion Tables: host, manage, collaborate on, visualize, and publish data tables online

What can I do with Fusion Tables?

- Host data online - and stay in control; control can be at the level of columns or rows
- Re-use data without making copies
- Collaborate on the details: merge data from multiple tables; comment on individual rows, columns, or cells
- Make a map (or chart or timeline) in minutes
- Manage data via our site or an API

Fusion Table Example Gallery

Easy Data Upload Attribution recorded

Easily Create Informative Maps

baby steps towards the dream platform

DEMO: Circle of Blue

Cloud-based Computing: Prediction

1. Upload: Upload your training data to Google Storage

2. Train: Build a model from your data

3. Predict: Make new predictions
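
The three steps above can be mimicked locally to show the shape of the workflow. This is a hypothetical in-memory stand-in and not the Prediction API: the class name, method names, and the nearest-centroid classifier are all assumptions made for illustration, and nothing here talks to Google Storage or any web service.

```python
from collections import defaultdict
from math import dist

class ToyPredictionService:
    """Hypothetical in-memory stand-in; not the Prediction API."""

    def __init__(self):
        self._storage = {}   # dataset name -> list of (label, features)
        self._models = {}    # dataset name -> {label: centroid}

    def upload(self, name, rows):
        """Step 1: store labeled training rows under a name."""
        self._storage[name] = list(rows)

    def train(self, name):
        """Step 2: build a nearest-centroid model from the stored rows."""
        groups = defaultdict(list)
        for label, features in self._storage[name]:
            groups[label].append(features)
        self._models[name] = {
            label: tuple(sum(col) / len(col) for col in zip(*vectors))
            for label, vectors in groups.items()
        }

    def predict(self, name, features):
        """Step 3: return the label whose centroid is closest."""
        centroids = self._models[name]
        return min(centroids, key=lambda label: dist(centroids[label], features))

service = ToyPredictionService()
service.upload("demo", [("small", (1.0, 1.0)), ("small", (2.0, 1.0)),
                        ("large", (9.0, 8.0)), ("large", (10.0, 9.0))])
service.train("demo")
print(service.predict("demo", (1.5, 0.5)))  # small
```

The caller only ever sees upload, train, and predict; the choice of learning method stays behind the interface.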

Machine learning as a web service: Smart Apps for every developer

- RESTful HTTP service
- Simple integration

Prediction API

Under the hood: many classifiers/regressors
Recent research: efficient and theoretically principled methods for distributed learning (NIPS-09, HLT-10)

Network costs can be reduced by an order of magnitude with minimal loss in classifier accuracy

Under the API

Operations Research and Optimization

Operations Research Challenges

Size: Optimization is often NP-complete
- Increasing the size by 1 doubles the search space
- The tools are barely keeping up with the problems

Uncertainty: Data is often fuzzy
- How do you route cars when there are roadblocks, new one-ways, traffic jams?
- Can you use optimization in an on-line algorithm connected to users?
- How well can you optimize against forecasted data? How do you react if the forecast is bad?

User expectation and requirements: The definition of problems is also unclear
- What is the objective?
- What is a good solution?
- Can I violate this requirement? By how much?
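
The size challenge can be made concrete with a problem built from n yes/no decisions: each added decision doubles the number of candidate assignments. A minimal sketch, using a brute-force 0/1 knapsack chosen here purely as an example (the slides do not name a specific problem):

```python
from itertools import product

def search_space_size(n: int) -> int:
    """Number of assignments for n binary (yes/no) decisions."""
    return 2 ** n

def best_subset(values, weights, capacity):
    """Brute-force 0/1 knapsack: try all 2**n selections."""
    best_value, best_choice = 0, ()
    for choice in product((0, 1), repeat=len(values)):
        weight = sum(w for w, c in zip(weights, choice) if c)
        value = sum(v for v, c in zip(values, choice) if c)
        if weight <= capacity and value > best_value:
            best_value, best_choice = value, choice
    return best_value, best_choice

for n in (10, 11, 20, 40):
    print(n, search_space_size(n))

print(best_subset([60, 100, 120], [10, 20, 30], 50))  # (220, (0, 1, 1))
```

Exhaustive search is fine for three items and hopeless for a few dozen, which is why guidance toward good solutions and search-space reduction matter.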

Operations Research Opportunities

Machine Learning can help us in three ways
- By providing guidance towards good solutions
- By qualifying valid solutions
- By reducing the search space

Large computing resources mean we can try a bit harder
Crowd-sourcing means better data, better feedback, better evaluations of algorithms and solutions
Having all our code open-source means we can collaborate on building the best set of tools

See http://code.google.com/p/or-tools

Semantic Processing

Web Inference and Learning

Goal: better understanding of Web content and user intent
Method: algorithms that draw reliable semantic inferences from the wealth of evidence implicit in massive Web data
- How to interpret this term in this context?
- Does this sentence answer that question?
- Will this user click on that ad?

Learning: create concise representations to support good inferences

Meaning from the Web

Elementary semantic inference: what are the possible classes for each instance?

Combination wins

Combined graph: 14M nodes, 75M edges

Applications to Society Follow

Earth Engine

Motivation - Carbon Forest Tracking

[Figure: UNEP Atlas of our Changing Environment - Rondonia, Brazil, in 1975, 1989, and 2001]

A Sampling of Other Use Cases

- Disease Early Warning: Remote surveillance of disease and prediction of epidemics
- Population Census: Supplements traditional census and mapping in developing regions
- Humanitarian Crisis Mapping: Can detect and monitor a growing range of crisis types
- Water Resources: Monitor water quality and availability, and alleviate water shortage problems
- Food Security: Famine early warning, rainfall and water requirements estimations, agricultural production estimates, and irrigation and fertilizer supply & demand
- Global Education Programs

Parallel Geo-Processing “in the cloud” (a brief illustration)

- Original image
- Original image is divided into 256px sub-units
- Sub-units are distributed to separate machines, where they can be processed in parallel
- Thousands can be processed simultaneously
- Result is reassembled into a finished image
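
The divide, distribute, and reassemble steps can be sketched on a single machine. This is an illustration under stated assumptions, not Earth Engine code: worker threads stand in for the separate machines, the image is a plain 2-D list of pixel values, pixel inversion is a placeholder for real per-pixel work, and all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

TILE = 256  # sub-unit edge length in pixels, as on the slide

def split(image, tile=TILE):
    """Cut a 2-D list of pixels into (row, col, block) sub-units."""
    h, w = len(image), len(image[0])
    return [(r, c, [row[c:c + tile] for row in image[r:r + tile]])
            for r in range(0, h, tile) for c in range(0, w, tile)]

def process(unit):
    """Per-pixel work on one sub-unit; inversion is a placeholder operation."""
    r, c, block = unit
    return r, c, [[255 - p for p in row] for row in block]

def reassemble(units, h, w):
    """Paste processed sub-units back at their original offsets."""
    out = [[0] * w for _ in range(h)]
    for r, c, block in units:
        for i, row in enumerate(block):
            out[r + i][c:c + len(row)] = row
    return out

def process_image(image, workers=4):
    """Split, process sub-units concurrently, and reassemble the result."""
    h, w = len(image), len(image[0])
    with ThreadPoolExecutor(max_workers=workers) as pool:
        done = list(pool.map(process, split(image)))
    return reassemble(done, h, w)

image = [[(x + y) % 256 for x in range(600)] for y in range(300)]
result = process_image(image)
print(len(split(image)), "sub-units processed")
```

Because each sub-unit carries its own offset and needs no data from its neighbours, the work parallelizes cleanly. Operations that look at neighbouring pixels would need overlapping tiles, which this sketch does not handle.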

Global-scale earth observation and informatics platform
- For public benefit and to support emerging green economy
- Help science come out of research lab and into operational use at scale
- Unprecedented catalog of earth observation data for mining and analysis
- Promote transparency, reproducibility, collaboration, “open science”

Very fast computation of scientific map products
- Intrinsically-parallel pixel processing system
- Built-in Google algorithms as well as user-supplied
- Earth Engine API for 3rd party algorithm development
- Access control, versioning, provenance
- Online and desktop versions (open source desktop version)

On a lot of useful data
- Every available Landsat and MODIS scene (more satellites coming)
- Commercial datasets (very high resolution satellite imagery)
- Environmental data (atmospheric, ocean, terrestrial)
- User-supplied (e.g., in-situ data collected via Android phones)

Overview

Scale of Data

US: Satellite imagery dataset Landsat
- Peru: 60 Landsat scenes (3 Gpix, 20 GB) per covering
- World: 8000 Landsat scenes (2 TB) per covering
- Complete global coverage every 16 days
- Operating since 1972; historical archive holds ~4 PB
US: NASA EOS approaching ~10 PB of Earth images

Europe: European Space Agency (ESA)
- ESA satellite missions: MERIS, Envisat, others
- SpotImage (France): 20M SPOT images since 1986, 10,000 new images collected daily, 5+ PB archive
- ESA launching Sentinel-1 in 2011

Representative Examples

- Envisat: Gulf Oil Spill, June 2010 (ESA)
- MERIS: Hurricane Isabel, Sept 2003 (ESA)
- Spot Image: Xingu, Brazilian Amazon

View on YouTube

Health (US)

Google Health personal health dashboard

Launched in May 2008

Major update Sept 2010

User controls and owns his/her data

A platform that…
- Provides a dashboard for wellness information & medical records
- Allows user to connect and interact with a broad group of “add on” services
- Includes a non-tethered PHR (personal health record)

Crisis Response

Person Finder

Pre-Earthquake - Aug 26, 2009

1 Day After Earthquake - Jan 13, 2010

13 Days After Earthquake - Jan 25, 2010

Digital Humanities

Illuminating the Humanities

Q: What can you do with 12 million books in over 400 languages, comprising 5 billion pages and 2 trillion words, all digitized?

A: Look to the humanities for new questions
- How would you (re)define Victorian literature?
- What are the differences between the English and Latin editions of Hobbes’ Leviathan?
- How have places changed over the course of history?

Digital Humanities Awards

Research program supporting university research taking a computational approach to traditional humanist questions

US program: Summer 2010
- 12 projects, 23 researchers, 15 universities

European program: Winter 2010
- 10 projects planned

$1M total funding

Education

Curriculum Development

Seeding and supporting computing curriculum development
- Exploring computational thinking in K12 (launch late Oct)
- CS4HS: High school computer science (cs4hs.com)
- Undergraduate open source CS curriculum: Google Code University (code.google.com/edu)
- Lantern platform: Wiki for open source curriculum development (in collaboration with Khan Academy)

Talent Development: Google Summer of Code™

Program Genesis
- “Flip bits, not burgers” during summer holidays
- Exposure to real-world software development

- Students paired with mentor from open source (OS) community
- Execute to milestones laid out in accepted application
- Stipend allows students to concentrate on OS development

2010: 1,026 students, 150 organizations, 69 countries

Technology Leadership: App Inventor for Android

- Visual programming environment for Android mobile devices
- Helping people become creators (rather than consumers) of technology
- Launched in Google Labs, July 12, 2010

http://appinventor.googlelabs.com/about

CS4HS European Workshops

- École Polytechnique Fédérale de Lausanne (EPFL), Switzerland: Building and Programming Robots
- ETH Zurich, Switzerland: ABZ (Ausbildungs- und Beratungszentrum für Informatikunterricht)
- Makerere University, Uganda: Grassroot approach to improve the quality of applicants of computing programs at Makerere University
- Manchester University, United Kingdom: Animation11
- Oslo University, Norway: TENK
- Queen Mary University, United Kingdom: cs4fn magazine
- RWTH Aachen, Germany: Bright Brains in Computer Science
- Sapienza University of Rome, Italy: Challenge and Fun with the CS Olympiads
- University of Stuttgart, Germany: UniS2010
- Technion, Israel: High School Computer Science Female Students' Visits in Google: Impressions, Conceptions and Influences
- Trinity College Dublin, Ireland: Computer Programming Outreach B2C
- University of Cape Town, South Africa: Project Umonya
- University College Dublin, Ireland: CS Summer School
- University of Warsaw, Poland: Mastering Programming Skills Workshops for Teachers

Technology Leadership Google Code University

Course content on current computing technologies and paradigms
- CS Curriculum Search
- Tech Talks on CS Topics
- Tutorials, lecture slides, problem sets for a variety of topic areas

Topic areas
- AJAX Programming
- Algorithms
- Distributed Systems
- Web Security
- Languages
- Practical Skills (MySQL, Linux)

http://code.google.com/edu

Supporting our Academic Institutions

Research Awards Programs - 230+ projects funded in the last year
- Next due dates: August 15 (CS Awards), October 15 (Marketing Awards)
- Research-awardsgooglecom
- Focused Grant Program

Visiting Faculty Program - 20 faculty (ongoing)
- University-relationsgooglecom

PhD Fellowship Program
- 2009: 13 students supported in North America
- 2010: 15 in North America, 16 outside North America
- Over 150 other scholarships

~1,000 interns worldwide
CS4HS: 1,500+ teachers (~100,000 students), US & EMEA

Final Thoughts

Scale of Communication and Computing is profound
- Endless opportunity for technical growth

Some large themes
- Major new application domains

Google is working rapidly to innovate in science/technology and value to consumers
We are providing increased support for academic institutions in computer science and related areas

It's a most exciting area in which to innovate

Page 13: Rapid Advances in Computer Science and Opportunities for ... · Machine learning techniques applied to monitoring/controlling such systems Automatic, dynamic world-wide placement

Totally Transparent Processing

ldquoHybridrdquo Intelligence

To extend the capability of people not in isolationAggregation of empirical signal is exceedingly valuableEx

Feedback in Information Retrieval eg in ranking or spelling correctionMachine learning eg image content analysis speech recognition with semi-supervised learning

Research Challenges in Transparent Computing amp Hybrid Intelligence

Endless applications with very new user interface implicationsAddressing limits to dataTechniques to integrate user-feedback in acceptable fashionsApproaches to new signalExplanation scale and variance minimization in machine learningInformation fusionlearning across diverse signals ndash The Combination Hypothesis more generallyUsability devices and subpopulationsPrivacy

Domains of Application

Search enginesTranslationSpeech recognitionVision

Remedial EducationPersonal healthEpidemiologyEconomic predictionSocietalenvironmental optimizationSocial Networking in ever more cleveruseful ways Humanities and Social SciencesMulti-player gaming

Translation

Machine Translation Google

Statistical Machine TranslationModel translation process with a statistical modelLearning from data monolingual amp bilingual

More data better translation qualityComputationally expensive approach

Models have many hundreds of Gigabyte of data(Moores law helps here)

Applying syntax information as a signal

ResultsMuch better translation qualityOngoing progress

More research groups 58 languages (so far)

recently Haitian Creole Urdu Georgian Latin

Grand Challenges

Morphologytranslating into morphologically rich languageseg Russian Hungarianneed morphology-aware translation models

Reliabilitysome translation mistakes more severe than others

hotel - MontrealHeath Ledger - Tom Cruise

Research How to detect crazy translations

Long-distance reorderingsimple case SVO SOV(one) approach parse source amp reorder

issue parsing accuracy for out-of-domain texts

Finding all Training Data

How about Poetry

Paper at EMNLP 2010 conferenceldquoPoeticrdquo Statistical Machine Translation Rhyme and Meter DGenzel JUszkoreit FOch EMNLP 2010

ApproachEnforce meter and rhyme as extra constraints(similar to language model)Eg iambic pentameter stress pattern 0101010101Produce most probable translation that obeys constraints(Function follows form)

Example output (couplet in amphibrachic tetrameterAn officer stated that three were arrestedand that the equipment is currently tested

Speech

Goals for Speech Technology at Google

Much of the worldrsquos information is spoken ndash we need to recognize it before we can organize it

YouTube transcription and translation (breaking the language barrier for YouTube access)

Voicemail transcription Mobile is the fastest growing and most widespread platform for communication and services that has ever existed

Spoken input and output is key to usability

Our goal is completely ubiquitous availability of speech io (every applicationservice every usage scenario every language)

How do we get thereDelivery from the cloud ndash support constant iteration and refinement

Operating at large scale ndash train huge statistical models on huge amounts of data

Learning from use - without human transcription

ChallengesHow do we grow the model to take advantage of the data (richer models of accent speaker noise etc)Huge computational demandsInfrastructure demands ndash parallelization ndash leverage Google software environment

Training Acoustic Models wUnsupervised Learning

Supervised vs unsupervised training - hours of

data vs error rate

Vision

Computer Vision

Advance state-of-the art in 3 key areas of imageaudiovideo analysis and apply results to our multimedia products

Semantic Interpretation Generate human understandable description of content (eg auto-tagging videos on YouTube Image annotation porn classification etc)Matching Find similar entities from a large corpus (eg find similar on image search video fingerprinting for YouTube etc )Synthesis Generate better imagesvideo by understanding the statistics of a large corpus of images (eg better facades in 3D building on Google Earth automatic shadow removal from areal images etc)

Semantic Interpretation sample problem - Video Annotation

Video metadata has a cognitive cost on the user because they have to type it in be careful about what keywords they use and in general try to make their video searchableMany uploaders donrsquot have the motivation or energy to provide proper metadataNoisy metadata hurts everyone ndash spam misspellings 1337 acronyms etc

Cloud-based ComputingStructured Data

Structured Data on the Web

Discovery and search for structured dataThe deep Web -- significant gap in coverageStructured tables on the Web -- not leveraged in search

Enable easy creation management sharing and publishing of structured data

Fusion Tables wwwgooglecomfusiontables

Google Fusion Tables host manage collaborate on visualize and publish data tables online

What can I do with Fusion Tables

Host data online - and stay in controlcontrol can be at the level of columns or rows

Re-use data without making copies

Collaborate on the detailsMerge data from multiple tablesComment on individual rows columns or cells

Make a map (or chart or timeline) in minutes

Manage data via our site or an API

Fusion Table Example Gallery

Easy Data Upload Attribution recorded

Easily Create Informative Maps

Easily Create Informative Maps

baby steps towards the dream platform

DEMOcircle of blue

Cloud-based ComputingPrediction

1 Upload

2 Train

Upload your training data toGoogle Storage

Build a model from your data

Make new predictions3 Predict

Machine learning as a web serviceSmart Apps for every developer

- RESTful HTTP service- Simple integration

Prediction API

Under the hood many classifiersregressorsRecent research efficient and theoretically principled methods for distributed learning (NIPS-09 HLT-10)

Network costs can be reduced by an order of magnitude with minimal loss in classifier accuracy

Under the API

Operations Reseach and Optimization

Operations Research ChallengesSize Optimization is often NP Complete

Increasing the size by 1 doubles the search spaceThe tools are barely keeping up with the problems

Uncertainty Data is often fuzzy How do you route cars when there are roadblocks new one-ways traffic jamsCan you use optimization in on-line algorithm connected to usersHow well can you optimize against forecasted data how do you react if the forecast is bad

User expectation and requirementsThe definition of problems is also unclear What is the objective What is a good solution Can I violate this requirement By how much

Operations Research Opportunities

Machine Learning can help us in two waysBy providing guidance towards good solutionsBy qualifying valid solutionsBy reducing the search space

Large computing resources means we can try a bit harderCrowd-Sourcing means better data better feedback better evaluations of algorithms and solutionsHaving all our code open-source means we can collaborate on building the best set of tools

See httpcodegooglecompor-tools

Semantic Processing

Web Inference and LearningGoal better understanding of Web content and user intentMethod algorithms that draw reliable semantic inferences from the wealth of evidence implicit in massive Web data

How to interpret this term in this context

Does this sentence answer that question

Will this user click on that ad

Learning create concise representations to support good inferences

Meaning from the Web

Elementary semantic inference what are the possible classes for each instance

Combination wins

Combined graph14M nodes75M edges

Applications to Society Follow

Earth Engine

Motivation - Carbon Forest Tracking

UNEP Atlas of our Changing Environment

1975 1989 2001

Rondonia Brazil

A Sampling of Other Use Cases

Disease Early Warning Remote surveillance of disease and prediction of epidemics

Population Census Supplements traditional census and mapping in developing regions

Humanitarian Crisis Mapping Can detect and monitor a growing range of crisis typesWater Resources Monitor water quality and availability and alleviate water shortage problemsFood Security Famine early warning rainfall and water requirements estimations agr production estimates and irrigation and fertilizer supply amp demandGlobal Education Programs

Parallel Geo-Processing ldquoin the cloudrdquo

(a brief illustration)

Original image

Original image is divided into 256px sub-units

Sub-units are distributed

Sub-units are distributed to separate machines

Sub-units are distributed to separate machines where they can be processed in parallel

Thousands can be processed simultaneously

Result is reassembled

Result is reassembled into a finished image

Global-scale earth observation and informatics platformFor public benefit and to support emerging green economyHelp science come out of research lab and into operational use at scaleUnprecedented catalog of earth observation data for mining and analysisPromote transparency reproducibility collaboration ldquoopen sciencerdquo

Very fast computation of scientific map productsIntrinsically-parallel pixel processing systemBuilt-in Google algorithms as well as user-suppliedEarth Engine API for 3rd party algorithm developmentAccess control versioning provenanceOnline and desktop versions (open source desktop version)

On a lot of useful dataEvery available Landsat and MODIS scene (more satellites coming)Commercial datasets (very high resolution satellite imagery)Environmental data (atmospheric ocean terrestrial)User-supplied (ex in-situ data collected via Android phones)

Overview

Scale of Data

US Satellite imagery dataset LandsatPeru 60 Landsat scenes (3Gpix 20GB) per coveringWorld 8000 Landsat scenes (2TB) per coveringComplete global coverage every 16 daysOperating since 1972 historical archive holds ~4PBUS NASA EOS approaching ~10PB of Earth images

Europe European Space Agency (ESA)ESA satellite missions MERIS Envisat othersSpotImage (France) 20M SPOT images since 1986 10000 new images collected daily 5+ PB archiveESA Launching Sentinel-1 in 2011

Representative Examples

Envisat Gulf Oil Spill June 2010 (ESA))MERIS Hurricane Isabel Sept 2003 (ESA) Spot Image Xingu Brazilian Amazon

View on YouTube

Health (US)

Google Health personal health dashboard

Launched in May 2008

Major update Sept 2010

User controls and owns hisher data

A platform thathellipProvides a dashboard for wellness information amp medical recordsAllows user to connect and interact with a broad group of ldquoadd onrdquo servicesIncludes a non-tethered PHR

Crisis Response

Person Finder

Pre-Earthquake - Aug 26 2009

1 Day After Earthquake - Jan 13 2010

13 Days After Earthquake - Jan 25 2010

Digital Humanities

Illuminating the Humanities

Q What can you do with12 million books inover 400 languagescomprised of 5 billion pages and 2 trillion wordsall digitized

A Look to the humanities for new questionsHow would you (re)define Victorian literature

What are the differences between the English and Latin editions of Hobbesrsquo Leviathan

How have places changed over the course of history

Digital Humanities Awards

Research program supporting university research taking a computational approach to traditional humanist questions US program Summer 2010

12 projects23 researchers15 universities

European program Winter 2010

10 projects planned $1M total funding

Education

Curriculum Development

Seeding and supporting computing curriculum developmentExploring computational thinking in K12 (launch late Oct)CS4HS High school computer science (cs4hscom)Undergraduate open source CS curriculum Google Code University (codegooglecomedu)Lantern platform Wiki for open source curriculum development (in collaboration with Khan Academy)

Talent Development Google Summer of CodeTM

Program GenesisrdquoFlip bits not burgersrdquo during summer holidaysExposure to real-world software development

Students paired with mentor from OS communityExecute to milestones laid out in accepted applicationStipend allows students to concentrate on OS development

20101026 students150 organizations 69 countries

Technology Leadership App Inventor for Android

Visual programming environment for Android mobile devicesHelping people become creators (rather than consumers) of technologyLaunched in Google Labs July 12 2010

httpappinventorgooglelabscomabout

CS4HS European WorkshopsEacutecole Polytechnique Feacutedeacuterale de Lausanne (EPFL) Switzerland Building and Programming Robots

ETH Zurich Switzerland ABZ Ausbildung - und Beratungszentrum fuer Informatikunterricht

Makerere University Uganda Grassroot approach to improve the quality of applicants of computing programs at Makerere University

Manchester University United Kingdom Animation11

Oslo University Norway TENK

Queen Mary University United Kingdom cs4fn magazine

RWTH Aachen Germany Bright Brains in Computer Science

Sapienza University of Rome Italy Challenge and Fun with the CS Olympiads

University of Stuttgart Germany UniS2010

Technion Israel High School Computer Science Female Students Visits in Google Impressions Conceptions and Influences

Trinity College Dublin Ireland Computer Programming Outreach B2C

University of Cape Town South Africa Project Umonya

University College Dublin Ireland CS Summer School

University of Warsaw Poland Mastering Programing Skills Workshops for Teachers

Technology Leadership: Google Code University
Course content on current computing technologies and paradigms
CS Curriculum Search
Tech Talks on CS Topics
Tutorials, lecture slides, problem sets for a variety of topic areas:
AJAX Programming
Algorithms
Distributed Systems
Web Security
Languages
Practical Skills (MySQL, Linux)
http://code.google.com/edu

Supporting our Academic Institutions

Research Awards Programs - 230+ projects funded in the last year
Next due dates: August 15 (CS Awards), October 15 (Marketing Awards)
Research-awardsgooglecom
Focused Grant Program
Visiting Faculty Program - 20 faculty (ongoing)
University-relationsgooglecom
PhD Fellowship Program
2009: 13 students supported in North America
2010: 15 in North America, 16 outside North America
Over 150 other scholarships
~1000 interns worldwide
CS4HS: 1500+ teachers (~100,000 students), US & EMEA

Final Thoughts

Scale of Communication and Computing is profound
Endless opportunity for technical growth
Some large themes
Major new application domains
Google is working to innovate rapidly in science/technology and in value to consumers
We are providing increased support for academic institutions in computer science and related areas
It's a most exciting area in which to innovate


"Hybrid" Intelligence
To extend the capability of people, not in isolation
Aggregation of empirical signal is exceedingly valuable
Ex:
Feedback in Information Retrieval, e.g. in ranking or spelling correction
Machine learning, e.g. image content analysis, speech recognition with semi-supervised learning

Research Challenges in Transparent Computing & Hybrid Intelligence
Endless applications with very new user interface implications
Addressing limits to data
Techniques to integrate user-feedback in acceptable fashions
Approaches to new signal
Explanation, scale, and variance minimization in machine learning
Information fusion/learning across diverse signals - The Combination Hypothesis, more generally
Usability: devices and subpopulations
Privacy

Domains of Application

Search engines
Translation
Speech recognition
Vision
Remedial Education
Personal health
Epidemiology
Economic prediction
Societal/environmental optimization
Social Networking in ever more clever/useful ways
Humanities and Social Sciences
Multi-player gaming

Translation

Machine Translation at Google
Statistical Machine Translation
Model translation process with a statistical model
Learning from data: monolingual & bilingual
More data, better translation quality
Computationally expensive approach
Models have many hundreds of gigabytes of data (Moore's law helps here)
Applying syntax information as a signal
Results
Much better translation quality
Ongoing progress
More research groups
58 languages (so far)
recently: Haitian Creole, Urdu, Georgian, Latin
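
The split between monolingual and bilingual training data can be made concrete with the textbook noisy-channel formulation of statistical machine translation. The slides do not say which model form is used in production, so this is an illustrative assumption, not a description of Google's system. Here $f$ is the source sentence and $e$ a candidate translation; $P(e)$ is a language model that can be estimated from monolingual text, and $P(f \mid e)$ is a translation model estimated from bilingual text.

```latex
\hat{e} \;=\; \arg\max_{e} P(e \mid f) \;=\; \arg\max_{e} P(e)\, P(f \mid e)
```

Under this reading, more monolingual data sharpens $P(e)$ and more bilingual data sharpens $P(f \mid e)$, which is one way to understand the "more data, better translation quality" point.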

Grand Challenges
Morphology
translating into morphologically rich languages, e.g. Russian, Hungarian
need morphology-aware translation models
Reliability
some translation mistakes more severe than others
hotel - Montreal
Heath Ledger - Tom Cruise
Research: How to detect crazy translations?
Long-distance reordering
simple case: SVO vs. SOV
(one) approach: parse source & reorder
issue: parsing accuracy for out-of-domain texts
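
The "parse source & reorder" idea can be sketched on a toy clause. This is a minimal illustration that assumes the parse is already available as three labelled constituents; the `Clause` type and `reorder_svo_to_sov` function are hypothetical names, and a real system would work on full parse trees, where parsing errors on out-of-domain text become the issue noted above.

```python
from typing import NamedTuple


class Clause(NamedTuple):
    """A toy 'parse' of a simple clause into three constituents (assumed given)."""
    subject: str
    verb: str
    obj: str


def reorder_svo_to_sov(clause: Clause) -> list[str]:
    """Emit the constituents in subject-object-verb order before translation."""
    return [clause.subject, clause.obj, clause.verb]


parsed = Clause(subject="the officer", verb="arrested", obj="three people")
print(" ".join(reorder_svo_to_sov(parsed)))
```

Reordering the source first means the translation step itself only has to handle local word-order differences.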

Finding all Training Data

How about Poetry?
Paper at EMNLP 2010 conference: "Poetic" Statistical Machine Translation: Rhyme and Meter, D. Genzel, J. Uszkoreit, F. Och, EMNLP 2010
Approach
Enforce meter and rhyme as extra constraints (similar to language model)
E.g. iambic pentameter: stress pattern 0101010101
Produce most probable translation that obeys constraints
(Function follows form)
Example output (couplet in amphibrachic tetrameter):
An officer stated that three were arrested
and that the equipment is currently tested
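
The constraint step above can be sketched as a filter over candidate translations. This is a simplified illustration, not the method from the paper: the function names are hypothetical, and the stress strings and probabilities are hand-assigned for the example rather than produced by a pronunciation dictionary or a translation model.

```python
def stress_pattern(foot: str, feet: int) -> str:
    """Repeat one metrical foot (e.g. '01' for an iamb) `feet` times."""
    return foot * feet


IAMBIC_PENTAMETER = stress_pattern("01", 5)         # the pattern quoted on the slide
AMPHIBRACHIC_TETRAMETER = stress_pattern("010", 4)  # unstressed-stressed-unstressed x 4


def obeys_meter(syllable_stresses: str, meter: str) -> bool:
    """True if a candidate line's stresses match the required pattern exactly."""
    return syllable_stresses == meter


def best_metrical(candidates, meter):
    """Return the most probable (text, stresses, prob) tuple that obeys the meter, or None."""
    valid = [c for c in candidates if obeys_meter(c[1], meter)]
    return max(valid, key=lambda c: c[2], default=None)


# Hand-assigned stresses and probabilities, for illustration only.
candidates = [
    ("An officer said three people were arrested", "010011100010", 0.6),
    ("An officer stated that three were arrested", "010010010010", 0.3),
]
best = best_metrical(candidates, AMPHIBRACHIC_TETRAMETER)
print(best[0])
```

The less probable candidate wins here because it is the only one that satisfies the meter, which is the sense of "function follows form".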

Speech

Goals for Speech Technology at Google

Much of the world's information is spoken - we need to recognize it before we can organize it
YouTube transcription and translation (breaking the language barrier for YouTube access)
Voicemail transcription
Mobile is the fastest growing and most widespread platform for communication and services that has ever existed
Spoken input and output is key to usability
Our goal is completely ubiquitous availability of speech I/O (every application/service, every usage scenario, every language)

How do we get there?
Delivery from the cloud - support constant iteration and refinement
Operating at large scale - train huge statistical models on huge amounts of data
Learning from use - without human transcription
Challenges
How do we grow the model to take advantage of the data (richer models of accent, speaker, noise, etc.)?
Huge computational demands
Infrastructure demands - parallelization - leverage Google software environment

Training Acoustic Models w/ Unsupervised Learning
[Figure: supervised vs. unsupervised training, hours of data vs. error rate]

Vision

Computer Vision

Advance the state of the art in 3 key areas of image/audio/video analysis and apply results to our multimedia products
Semantic Interpretation: Generate human understandable description of content (e.g. auto-tagging videos on YouTube, image annotation, porn classification, etc.)
Matching: Find similar entities from a large corpus (e.g. find similar on image search, video fingerprinting for YouTube, etc.)
Synthesis: Generate better images/video by understanding the statistics of a large corpus of images (e.g. better facades in 3D building on Google Earth, automatic shadow removal from aerial images, etc.)

Semantic Interpretation: sample problem - Video Annotation
Video metadata has a cognitive cost on the user because they have to type it in, be careful about what keywords they use, and in general try to make their video searchable
Many uploaders don't have the motivation or energy to provide proper metadata
Noisy metadata hurts everyone - spam, misspellings, 1337, acronyms, etc.

Cloud-based Computing: Structured Data

Structured Data on the Web
Discovery and search for structured data
The deep Web -- significant gap in coverage
Structured tables on the Web -- not leveraged in search
Enable easy creation, management, sharing, and publishing of structured data

Fusion Tables: www.google.com/fusiontables
Google Fusion Tables: host, manage, collaborate on, visualize, and publish data tables online

What can I do with Fusion Tables?
Host data online - and stay in control
control can be at the level of columns or rows
Re-use data without making copies
Collaborate on the details
Merge data from multiple tables
Comment on individual rows, columns, or cells
Make a map (or chart or timeline) in minutes
Manage data via our site or an API
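
To make "merge data from multiple tables" concrete, here is a minimal sketch of a key-based merge over in-memory rows. It is plain Python, not the Fusion Tables API; the function name `merge_tables`, the column names, and the sample values are all made up for illustration.

```python
def merge_tables(left, right, key):
    """Join two lists of row dicts on `key`, keeping only rows present in both."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


# Two small tables that share a "site" column (illustrative values only).
wells = [{"site": "A", "depth_m": 12}, {"site": "B", "depth_m": 30}]
quality = [{"site": "A", "ph": 7.1}, {"site": "C", "ph": 6.4}]

merged = merge_tables(wells, quality, "site")
print(merged)
```

A hosted service can expose such a merge as a view over the two source tables rather than a copy, which is what makes re-use without copying possible.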

Fusion Table Example Gallery

Easy Data Upload, Attribution recorded

Easily Create Informative Maps


baby steps towards the dream platform

DEMO: Circle of Blue

Cloud-based Computing: Prediction

1. Upload: Upload your training data to Google Storage
2. Train: Build a model from your data
3. Predict: Make new predictions

Machine learning as a web service
Smart Apps for every developer
- RESTful HTTP service
- Simple integration

Prediction API
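
The upload, train, predict workflow can be mimicked locally to show the shape of the interaction. The class below is a hypothetical in-memory stand-in with a deliberately naive word-overlap model; it is not the Prediction API, makes no HTTP calls, and none of its method names come from the real service.

```python
from collections import Counter, defaultdict


class ToyPredictionService:
    """Hypothetical in-memory stand-in for an upload/train/predict service."""

    def __init__(self):
        self.storage = {}  # name -> list of (label, text) rows
        self.models = {}   # name -> per-label word counts

    def upload(self, name, rows):
        """Step 1: store (label, text) training rows under a name."""
        self.storage[name] = list(rows)

    def train(self, name):
        """Step 2: build per-label word counts from the stored rows."""
        counts = defaultdict(Counter)
        for label, text in self.storage[name]:
            counts[label].update(text.lower().split())
        self.models[name] = counts

    def predict(self, name, text):
        """Step 3: return the label whose training words overlap the input most."""
        words = text.lower().split()
        model = self.models[name]
        return max(model, key=lambda label: sum(model[label][w] for w in words))


service = ToyPredictionService()
service.upload("language", [("english", "the cat sat"), ("spanish", "el gato se sento")])
service.train("language")
print(service.predict("language", "the dog"))
```

In a hosted version each of the three methods would correspond to a request against the service, with the model kept server-side.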

Under the hood: many classifiers/regressors
Recent research: efficient and theoretically principled methods for distributed learning (NIPS-09, HLT-10)

Network costs can be reduced by an order of magnitude with minimal loss in classifier accuracy

Under the API

Operations Research and Optimization

Operations Research Challenges
Size: Optimization is often NP-complete
Increasing the size by 1 doubles the search space
The tools are barely keeping up with the problems
Uncertainty: Data is often fuzzy
How do you route cars when there are roadblocks, new one-ways, traffic jams?
Can you use optimization in an on-line algorithm connected to users?
How well can you optimize against forecasted data? How do you react if the forecast is bad?
User expectation and requirements
The definition of problems is also unclear: What is the objective? What is a good solution? Can I violate this requirement? By how much?
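
The "increasing the size by 1 doubles the search space" point can be checked by brute force for a problem made of n yes/no decisions. The binary-decision framing is an assumption for this sketch; problems with other variable domains grow at different rates.

```python
from itertools import product


def count_assignments(n: int) -> int:
    """Count all yes/no assignments of n binary decisions by enumerating them."""
    return sum(1 for _ in product((0, 1), repeat=n))


for n in range(1, 6):
    # Each extra decision doubles the count: 2, 4, 8, 16, 32.
    print(n, count_assignments(n))
```

Exhaustive enumeration like this stops being practical after a few dozen variables, which is why guidance toward good solutions and search-space reduction matter.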

Operations Research Opportunities

Machine Learning can help us in two ways
By providing guidance towards good solutions
By qualifying valid solutions
By reducing the search space
Large computing resources means we can try a bit harder
Crowd-Sourcing means better data, better feedback, better evaluations of algorithms and solutions
Having all our code open-source means we can collaborate on building the best set of tools
See http://code.google.com/p/or-tools

Semantic Processing

Web Inference and Learning
Goal: better understanding of Web content and user intent
Method: algorithms that draw reliable semantic inferences from the wealth of evidence implicit in massive Web data
How to interpret this term in this context?
Does this sentence answer that question?
Will this user click on that ad?
Learning: create concise representations to support good inferences

Meaning from the Web
Elementary semantic inference: what are the possible classes for each instance?

Combination wins
Combined graph: 14M nodes, 75M edges

Applications to Society Follow

Earth Engine

Motivation - Carbon Forest Tracking

[Images: Rondonia, Brazil in 1975, 1989, and 2001; source: UNEP Atlas of our Changing Environment]

A Sampling of Other Use Cases

Disease Early Warning: Remote surveillance of disease and prediction of epidemics
Population Census: Supplements traditional census and mapping in developing regions
Humanitarian Crisis Mapping: Can detect and monitor a growing range of crisis types
Water Resources: Monitor water quality and availability, and alleviate water shortage problems
Food Security: Famine early warning, rainfall and water requirements estimations, agricultural production estimates, and irrigation and fertilizer supply & demand
Global Education Programs

Parallel Geo-Processing "in the cloud" (a brief illustration)
Original image is divided into 256px sub-units
Sub-units are distributed to separate machines, where they can be processed in parallel
Thousands can be processed simultaneously
Result is reassembled into a finished image
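
The split, distribute, reassemble steps above can be sketched on a single machine. This is a minimal illustration under stated assumptions: worker threads stand in for separate machines, the image is a plain 2-D list of 0-255 pixel values, and `invert` is a hypothetical per-pixel operation standing in for a real map algorithm. It is not Earth Engine code.

```python
from concurrent.futures import ThreadPoolExecutor

TILE = 256  # sub-unit edge length in pixels, as on the slide


def split(image, tile=TILE):
    """Yield (row, col, block) for each tile-sized block of a 2-D pixel grid."""
    for r in range(0, len(image), tile):
        for c in range(0, len(image[0]), tile):
            yield r, c, [row[c:c + tile] for row in image[r:r + tile]]


def invert(block):
    """Hypothetical per-pixel operation applied independently to each tile."""
    return [[255 - p for p in row] for row in block]


def process(image, fn=invert, tile=TILE, workers=4):
    """Split the image, process tiles concurrently, and reassemble the result."""
    out = [[0] * len(image[0]) for _ in image]
    tiles = list(split(image, tile))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda t: fn(t[2]), tiles)  # order matches `tiles`
        for (r, c, _), block in zip(tiles, results):
            for i, row in enumerate(block):
                out[r + i][c:c + len(row)] = row
    return out


image = [[(x + y) % 256 for x in range(600)] for y in range(300)]
finished = process(image)
print(len(finished), len(finished[0]))
```

Because each tile depends only on its own pixels, the work divides cleanly, and edge tiles smaller than 256px are handled by the slicing.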

Overview

Global-scale earth observation and informatics platform
For public benefit and to support emerging green economy
Help science come out of research lab and into operational use at scale
Unprecedented catalog of earth observation data for mining and analysis
Promote transparency, reproducibility, collaboration, "open science"

Very fast computation of scientific map products
Intrinsically-parallel pixel processing system
Built-in Google algorithms as well as user-supplied
Earth Engine API for 3rd party algorithm development
Access control, versioning, provenance
Online and desktop versions (open source desktop version)

On a lot of useful data
Every available Landsat and MODIS scene (more satellites coming)
Commercial datasets (very high resolution satellite imagery)
Environmental data (atmospheric, ocean, terrestrial)
User-supplied (ex: in-situ data collected via Android phones)

Scale of Data

US: Satellite imagery dataset: Landsat
Peru: 60 Landsat scenes (3 Gpix, 20 GB) per covering
World: 8000 Landsat scenes (2 TB) per covering
Complete global coverage every 16 days
Operating since 1972; historical archive holds ~4 PB
US: NASA EOS approaching ~10 PB of Earth images
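
A quick back-of-the-envelope calculation shows what the Landsat figures above imply. It assumes decimal units (1 TB = 10^12 bytes) and that every 16-day covering is captured and stored in full at the quoted size; the variable names are illustrative.

```python
SCENES_PER_WORLD_COVERING = 8000
BYTES_PER_WORLD_COVERING = 2e12  # 2 TB per covering, decimal units assumed
REVISIT_DAYS = 16                # complete global coverage every 16 days

# Average scene size implied by the world-covering figures.
mb_per_scene = BYTES_PER_WORLD_COVERING / SCENES_PER_WORLD_COVERING / 1e6

# Number of full coverings per year and the data volume they would add up to.
coverings_per_year = 365 / REVISIT_DAYS
tb_per_year = coverings_per_year * BYTES_PER_WORLD_COVERING / 1e12

print(f"{mb_per_scene:.0f} MB per scene")
print(f"{coverings_per_year:.1f} coverings per year")
print(f"{tb_per_year:.1f} TB per year")
```

So one satellite at this rate would produce on the order of tens of terabytes per year.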

Europe: European Space Agency (ESA)
ESA satellite missions: MERIS, Envisat, others
SpotImage (France): 20M SPOT images since 1986, 10,000 new images collected daily, 5+ PB archive
ESA: Launching Sentinel-1 in 2011

Representative Examples
Envisat: Gulf Oil Spill, June 2010 (ESA)
MERIS: Hurricane Isabel, Sept 2003 (ESA)
Spot Image: Xingu, Brazilian Amazon

View on YouTube

Health (US)

Google Health: personal health dashboard
Launched in May 2008
Major update Sept 2010
User controls and owns his/her data
A platform that...
Provides a dashboard for wellness information & medical records
Allows user to connect and interact with a broad group of "add on" services
Includes a non-tethered PHR

Crisis Response

Person Finder

Pre-Earthquake - Aug 26, 2009
1 Day After Earthquake - Jan 13, 2010
13 Days After Earthquake - Jan 25, 2010

Digital Humanities

Illuminating the Humanities

Q: What can you do with 12 million books in over 400 languages, comprised of 5 billion pages and 2 trillion words, all digitized?
A: Look to the humanities for new questions
How would you (re)define Victorian literature?
What are the differences between the English and Latin editions of Hobbes' Leviathan?
How have places changed over the course of history?

Digital Humanities Awards

Research program supporting university research taking a computational approach to traditional humanist questions
US program: Summer 2010
12 projects, 23 researchers, 15 universities
European program: Winter 2010
10 projects planned
$1M total funding



Page 16: Rapid Advances in Computer Science and Opportunities for ... · Machine learning techniques applied to monitoring/controlling such systems Automatic, dynamic world-wide placement

Domains of Application

Search enginesTranslationSpeech recognitionVision

Remedial EducationPersonal healthEpidemiologyEconomic predictionSocietalenvironmental optimizationSocial Networking in ever more cleveruseful ways Humanities and Social SciencesMulti-player gaming

Translation

Machine Translation Google

Statistical Machine TranslationModel translation process with a statistical modelLearning from data monolingual amp bilingual

More data better translation qualityComputationally expensive approach

Models have many hundreds of Gigabyte of data(Moores law helps here)

Applying syntax information as a signal

ResultsMuch better translation qualityOngoing progress

More research groups 58 languages (so far)

recently Haitian Creole Urdu Georgian Latin

Grand Challenges

Morphologytranslating into morphologically rich languageseg Russian Hungarianneed morphology-aware translation models

Reliabilitysome translation mistakes more severe than others

hotel - MontrealHeath Ledger - Tom Cruise

Research How to detect crazy translations

Long-distance reorderingsimple case SVO SOV(one) approach parse source amp reorder

issue parsing accuracy for out-of-domain texts

Finding all Training Data

How about Poetry?

Paper at EMNLP 2010 conference: "Poetic" Statistical Machine Translation: Rhyme and Meter. D. Genzel, J. Uszkoreit, F. Och. EMNLP 2010.

Approach
- Enforce meter and rhyme as extra constraints (similar to language model)
- E.g. iambic pentameter: stress pattern 0101010101
- Produce most probable translation that obeys constraints ("Function follows form")

Example output (couplet in amphibrachic tetrameter):
An officer stated that three were arrested
and that the equipment is currently tested
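
The idea of producing the most probable translation that obeys a meter constraint can be sketched as below. This is a simplification under stated assumptions: the slide describes the constraint acting during search, like a language model, whereas this sketch only filters a finished candidate list. The candidate texts, scores, and stress strings are invented placeholders.

```python
from typing import List, NamedTuple, Optional


class Candidate(NamedTuple):
    text: str
    log_prob: float  # model score (made-up values for illustration)
    stress: str      # one character per syllable: "1" stressed, "0" unstressed


IAMBIC_PENTAMETER = "01" * 5  # the 0101010101 pattern from the slide


def best_metrical(candidates: List[Candidate], meter: str) -> Optional[Candidate]:
    """Return the highest-scoring candidate whose stress pattern matches the meter."""
    allowed = [c for c in candidates if c.stress == meter]
    return max(allowed, key=lambda c: c.log_prob, default=None)


candidates = [
    Candidate("hypothesis A", -1.2, "0110010101"),  # most probable, breaks the meter
    Candidate("hypothesis B", -2.5, "0101010101"),
    Candidate("hypothesis C", -3.1, "0101010101"),
]
best = best_metrical(candidates, IAMBIC_PENTAMETER)
print(best.text if best else "no candidate fits the meter")
```

The most probable hypothesis overall is rejected because it violates the form, which is the sense of "function follows form".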

Speech

Goals for Speech Technology at Google

- Much of the world's information is spoken - we need to recognize it before we can organize it
  - YouTube transcription and translation (breaking the language barrier for YouTube access)
  - Voicemail transcription
- Mobile is the fastest growing and most widespread platform for communication and services that has ever existed
  - Spoken input and output is key to usability
- Our goal is completely ubiquitous availability of speech I/O (every application/service, every usage scenario, every language)

How do we get there?
- Delivery from the cloud - support constant iteration and refinement
- Operating at large scale - train huge statistical models on huge amounts of data
- Learning from use - without human transcription

Challenges
- How do we grow the model to take advantage of the data (richer models of accent, speaker, noise, etc.)?
- Huge computational demands
- Infrastructure demands - parallelization - leverage Google software environment

Training Acoustic Models w/ Unsupervised Learning

[Figure: supervised vs. unsupervised training - hours of data vs. error rate]

Vision

Computer Vision

Advance the state of the art in 3 key areas of image/audio/video analysis and apply results to our multimedia products:
- Semantic Interpretation: Generate human-understandable descriptions of content (e.g. auto-tagging videos on YouTube, image annotation, porn classification, etc.)
- Matching: Find similar entities from a large corpus (e.g. "find similar" on image search, video fingerprinting for YouTube, etc.)
- Synthesis: Generate better images/video by understanding the statistics of a large corpus of images (e.g. better facades in 3D buildings on Google Earth, automatic shadow removal from aerial images, etc.)

Semantic Interpretation sample problem - Video Annotation

- Video metadata has a cognitive cost for the user, because they have to type it in, be careful about what keywords they use, and in general try to make their video searchable
- Many uploaders don't have the motivation or energy to provide proper metadata
- Noisy metadata hurts everyone - spam, misspellings, 1337, acronyms, etc.

Cloud-based Computing: Structured Data

Structured Data on the Web

- Discovery and search for structured data
  - The deep Web -- significant gap in coverage
  - Structured tables on the Web -- not leveraged in search
- Enable easy creation, management, sharing, and publishing of structured data

Fusion Tables: www.google.com/fusiontables

Google Fusion Tables: host, manage, collaborate on, visualize, and publish data tables online

What can I do with Fusion Tables?

- Host data online - and stay in control
  - Control can be at the level of columns or rows
- Re-use data without making copies
- Collaborate on the details
  - Merge data from multiple tables
  - Comment on individual rows, columns, or cells
- Make a map (or chart or timeline) in minutes
- Manage data via our site or an API
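
To make "merge data from multiple tables" concrete, here is a minimal sketch of joining two tables on a shared key column. It is plain Python and does not use or imitate the Fusion Tables API; the function name, tables, and column names are hypothetical. Note also that it materializes merged rows, whereas the slide stresses re-use without making copies.

```python
from typing import Dict, List

Row = Dict[str, str]


def merge_tables(left: List[Row], right: List[Row], key: str) -> List[Row]:
    """Inner-join two tables on a shared key column."""
    # Index the right-hand table by key so each lookup is a dictionary access.
    index: Dict[str, List[Row]] = {}
    for row in right:
        index.setdefault(row[key], []).append(row)

    merged: List[Row] = []
    for row in left:
        for match in index.get(row[key], []):
            # On a column-name clash, the left table's value wins.
            merged.append({**match, **row})
    return merged


rivers = [
    {"country": "Brazil", "river": "Amazon"},
    {"country": "Egypt", "river": "Nile"},
]
capitals = [{"country": "Brazil", "capital": "Brasilia"}]

merged = merge_tables(rivers, capitals, "country")
print(merged)
```

Rows without a partner in the other table (Egypt here) are dropped, which is the usual inner-join behaviour.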

Fusion Table Example Gallery

Easy Data Upload, Attribution Recorded

Easily Create Informative Maps

baby steps towards the dream platform

DEMO: Circle of Blue

Cloud-based Computing: Prediction

1. Upload: Upload your training data to Google Storage
2. Train: Build a model from your data
3. Predict: Make new predictions

Machine learning as a web service: smart apps for every developer

- RESTful HTTP service
- Simple integration

Prediction API

Under the API

- Under the hood: many classifiers/regressors
- Recent research: efficient and theoretically principled methods for distributed learning (NIPS-09, HLT-10)
  - Network costs can be reduced by an order of magnitude with minimal loss in classifier accuracy

Operations Research and Optimization

Operations Research Challenges

- Size: Optimization is often NP-complete
  - Increasing the size by 1 doubles the search space
  - The tools are barely keeping up with the problems
- Uncertainty: Data is often fuzzy
  - How do you route cars when there are roadblocks, new one-ways, traffic jams?
  - Can you use optimization in an on-line algorithm connected to users?
  - How well can you optimize against forecasted data? How do you react if the forecast is bad?
- User expectations and requirements: The definition of problems is also unclear
  - What is the objective? What is a good solution? Can I violate this requirement? By how much?
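
The claim that adding one unit of size doubles the search space can be checked directly for the simplest case. The sketch assumes each decision is binary (for example, "include this item or not"), so n decisions give 2^n candidate assignments; real problems have richer structure, and the function name is made up for this example.

```python
from itertools import product


def search_space_size(n: int) -> int:
    """Count all assignments of n yes/no decisions by brute-force enumeration."""
    return sum(1 for _ in product((0, 1), repeat=n))


for n in range(1, 6):
    print(n, search_space_size(n))
```

Enumerating every assignment is only feasible for small n, which is why exhaustive search quickly stops keeping up with problem size.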

Operations Research Opportunities

- Machine Learning can help us in several ways
  - By providing guidance towards good solutions
  - By qualifying valid solutions
  - By reducing the search space
- Large computing resources mean we can try a bit harder
- Crowd-sourcing means better data, better feedback, better evaluations of algorithms and solutions
- Having all our code open-source means we can collaborate on building the best set of tools

See http://code.google.com/p/or-tools

Semantic Processing

Web Inference and Learning

- Goal: better understanding of Web content and user intent
- Method: algorithms that draw reliable semantic inferences from the wealth of evidence implicit in massive Web data
  - How to interpret this term in this context?
  - Does this sentence answer that question?
  - Will this user click on that ad?
- Learning: create concise representations to support good inferences

Meaning from the Web

Elementary semantic inference: what are the possible classes for each instance?

Combination wins

Combined graph: 14M nodes, 75M edges

Applications to Society Follow

Earth Engine

Motivation - Carbon Forest Tracking

[Figure: UNEP Atlas of our Changing Environment - Rondonia, Brazil, in 1975, 1989, and 2001]

A Sampling of Other Use Cases

- Disease Early Warning: Remote surveillance of disease and prediction of epidemics
- Population Census: Supplements traditional census and mapping in developing regions
- Humanitarian Crisis Mapping: Can detect and monitor a growing range of crisis types
- Water Resources: Monitor water quality and availability and alleviate water shortage problems
- Food Security: Famine early warning, rainfall and water requirements estimations, agricultural production estimates, and irrigation and fertilizer supply & demand
- Global Education Programs

Parallel Geo-Processing "in the cloud"

(a brief illustration)

1. Original image is divided into 256px sub-units
2. Sub-units are distributed to separate machines, where they can be processed in parallel
3. Thousands can be processed simultaneously
4. Result is reassembled into a finished image
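
The split, distribute, and reassemble steps above can be sketched in a few lines. This is a toy illustration, not Earth Engine code: worker threads on one machine stand in for separate machines, the image is a nested list of pixel values, and the per-tile operation (inverting each pixel) and all function names are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

Image = List[List[int]]
Tile = Tuple[int, int, Image]  # (top, left, pixel block)
TILE = 256  # sub-unit edge length, as on the slide


def split(image: Image, tile: int = TILE) -> List[Tile]:
    """Cut the image into tile x tile blocks (edge blocks may be smaller)."""
    tiles = []
    for top in range(0, len(image), tile):
        for left in range(0, len(image[0]), tile):
            block = [row[left:left + tile] for row in image[top:top + tile]]
            tiles.append((top, left, block))
    return tiles


def process(job: Tile) -> Tile:
    """Per-tile work; here simply invert every pixel."""
    top, left, block = job
    return top, left, [[255 - px for px in row] for row in block]


def reassemble(tiles: List[Tile], height: int, width: int) -> Image:
    """Paste processed blocks back at their original offsets."""
    out = [[0] * width for _ in range(height)]
    for top, left, block in tiles:
        for dy, row in enumerate(block):
            out[top + dy][left:left + len(row)] = row
    return out


def run(image: Image, tile: int = TILE) -> Image:
    # Worker threads stand in for the separate machines.
    with ThreadPoolExecutor() as pool:
        done = list(pool.map(process, split(image, tile)))
    return reassemble(done, len(image), len(image[0]))


image = [[(x + y) % 256 for x in range(600)] for y in range(300)]
result = run(image)
print(len(split(image)), "tiles processed")
```

This works because each tile can be processed without looking at its neighbours; operations that need neighbouring pixels would require overlapping tiles.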

Overview

Global-scale earth observation and informatics platform
- For public benefit and to support emerging green economy
- Help science come out of research lab and into operational use at scale
- Unprecedented catalog of earth observation data for mining and analysis
- Promote transparency, reproducibility, collaboration, "open science"

Very fast computation of scientific map products
- Intrinsically-parallel pixel processing system
- Built-in Google algorithms as well as user-supplied
- Earth Engine API for 3rd party algorithm development
- Access control, versioning, provenance
- Online and desktop versions (open source desktop version)

On a lot of useful data
- Every available Landsat and MODIS scene (more satellites coming)
- Commercial datasets (very high resolution satellite imagery)
- Environmental data (atmospheric, ocean, terrestrial)
- User-supplied (e.g. in-situ data collected via Android phones)

Scale of Data

US satellite imagery dataset: Landsat
- Peru: 60 Landsat scenes (3 Gpix, 20 GB) per covering
- World: 8000 Landsat scenes (2 TB) per covering
- Complete global coverage every 16 days
- Operating since 1972; historical archive holds ~4 PB

US: NASA EOS approaching ~10 PB of Earth images

Europe: European Space Agency (ESA)
- ESA satellite missions: MERIS, Envisat, others
- SpotImage (France): 20M SPOT images since 1986; 10000 new images collected daily; 5+ PB archive
- ESA: launching Sentinel-1 in 2011

Representative Examples

- Envisat: Gulf Oil Spill, June 2010 (ESA)
- MERIS: Hurricane Isabel, Sept 2003 (ESA)
- Spot Image: Xingu, Brazilian Amazon

View on YouTube

Health (US)

Google Health: personal health dashboard

- Launched in May 2008
- Major update Sept 2010
- User controls and owns his/her data

A platform that...
- Provides a dashboard for wellness information & medical records
- Allows user to connect and interact with a broad group of "add on" services
- Includes a non-tethered PHR

Crisis Response

Person Finder

Pre-Earthquake - Aug 26, 2009

1 Day After Earthquake - Jan 13, 2010

13 Days After Earthquake - Jan 25, 2010

Digital Humanities

Illuminating the Humanities

Q: What can you do with 12 million books in over 400 languages, comprising 5 billion pages and 2 trillion words, all digitized?

A: Look to the humanities for new questions
- How would you (re)define Victorian literature?
- What are the differences between the English and Latin editions of Hobbes' Leviathan?
- How have places changed over the course of history?

Digital Humanities Awards

Research program supporting university research taking a computational approach to traditional humanist questions

US program: Summer 2010
- 12 projects, 23 researchers, 15 universities

European program: Winter 2010
- 10 projects planned

$1M total funding

Education

Curriculum Development

Seeding and supporting computing curriculum development
- Exploring computational thinking in K12 (launch late Oct)
- CS4HS: High school computer science (cs4hs.com)
- Undergraduate open source CS curriculum: Google Code University (code.google.com/edu)
- Lantern platform: Wiki for open source curriculum development (in collaboration with Khan Academy)

Talent Development: Google Summer of Code(TM)

Program Genesis
- "Flip bits, not burgers" during summer holidays
- Exposure to real-world software development
- Students paired with mentor from OS community
- Execute to milestones laid out in accepted application
- Stipend allows students to concentrate on OS development

2010: 1026 students, 150 organizations, 69 countries

Technology Leadership: App Inventor for Android

- Visual programming environment for Android mobile devices
- Helping people become creators (rather than consumers) of technology
- Launched in Google Labs July 12, 2010

http://appinventor.googlelabs.com/about

CS4HS European Workshops

- École Polytechnique Fédérale de Lausanne (EPFL), Switzerland: Building and Programming Robots
- ETH Zurich, Switzerland: ABZ Ausbildungs- und Beratungszentrum fuer Informatikunterricht
- Makerere University, Uganda: Grassroot approach to improve the quality of applicants of computing programs at Makerere University
- Manchester University, United Kingdom: Animation11
- Oslo University, Norway: TENK
- Queen Mary University, United Kingdom: cs4fn magazine
- RWTH Aachen, Germany: Bright Brains in Computer Science
- Sapienza University of Rome, Italy: Challenge and Fun with the CS Olympiads
- University of Stuttgart, Germany: UniS2010
- Technion, Israel: High School Computer Science Female Students Visits in Google: Impressions, Conceptions and Influences
- Trinity College Dublin, Ireland: Computer Programming Outreach B2C
- University of Cape Town, South Africa: Project Umonya
- University College Dublin, Ireland: CS Summer School
- University of Warsaw, Poland: Mastering Programming Skills Workshops for Teachers

Technology Leadership: Google Code University

Course content on current computing technologies and paradigms
- CS Curriculum Search
- Tech Talks on CS Topics
- Tutorials, lecture slides, problem sets for a variety of topic areas
  - AJAX Programming
  - Algorithms
  - Distributed Systems
  - Web Security
  - Languages
  - Practical Skills (MySQL, Linux)

http://code.google.com/edu

Supporting our Academic Institutions

- Research Awards Programs - 230+ projects funded in the last year
  - Next due dates: August 15 (CS Awards), October 15 (Marketing Awards)
  - Research-awardsgooglecom
  - Focused Grant Program
- Visiting Faculty Program - 20 faculty (ongoing)
  - University-relationsgooglecom
- PhD Fellowship Program
  - 2009: 13 students supported in North America
  - 2010: 15 in North America, 16 outside North America
  - Over 150 other scholarships
- ~1000 interns worldwide
- CS4HS: 1500+ teachers (~100000 students), US & EMEA

Final Thoughts

- Scale of Communication and Computing is profound
- Endless opportunity for technical growth
  - Some large themes
  - Major new application domains
- Google is innovating rapidly in science/technology and in value to consumers
- We are providing increased support for academic institutions in computer science and related areas
- It's a most exciting area in which to innovate

Page 18: Rapid Advances in Computer Science and Opportunities for ... · Machine learning techniques applied to monitoring/controlling such systems Automatic, dynamic world-wide placement

Machine Translation Google

Statistical Machine TranslationModel translation process with a statistical modelLearning from data monolingual amp bilingual

More data better translation qualityComputationally expensive approach

Models have many hundreds of Gigabyte of data(Moores law helps here)

Applying syntax information as a signal

ResultsMuch better translation qualityOngoing progress

More research groups 58 languages (so far)

recently Haitian Creole Urdu Georgian Latin

Grand Challenges

Morphologytranslating into morphologically rich languageseg Russian Hungarianneed morphology-aware translation models

Reliabilitysome translation mistakes more severe than others

hotel - MontrealHeath Ledger - Tom Cruise

Research How to detect crazy translations

Long-distance reorderingsimple case SVO SOV(one) approach parse source amp reorder

issue parsing accuracy for out-of-domain texts

Finding all Training Data

How about Poetry

Paper at EMNLP 2010 conferenceldquoPoeticrdquo Statistical Machine Translation Rhyme and Meter DGenzel JUszkoreit FOch EMNLP 2010

ApproachEnforce meter and rhyme as extra constraints(similar to language model)Eg iambic pentameter stress pattern 0101010101Produce most probable translation that obeys constraints(Function follows form)

Example output (couplet in amphibrachic tetrameterAn officer stated that three were arrestedand that the equipment is currently tested

Speech

Goals for Speech Technology at Google

Much of the worldrsquos information is spoken ndash we need to recognize it before we can organize it

YouTube transcription and translation (breaking the language barrier for YouTube access)

Voicemail transcription Mobile is the fastest growing and most widespread platform for communication and services that has ever existed

Spoken input and output is key to usability

Our goal is completely ubiquitous availability of speech io (every applicationservice every usage scenario every language)

How do we get thereDelivery from the cloud ndash support constant iteration and refinement

Operating at large scale ndash train huge statistical models on huge amounts of data

Learning from use - without human transcription

ChallengesHow do we grow the model to take advantage of the data (richer models of accent speaker noise etc)Huge computational demandsInfrastructure demands ndash parallelization ndash leverage Google software environment

Training Acoustic Models wUnsupervised Learning

Supervised vs unsupervised training - hours of

data vs error rate

Vision

Computer Vision

Advance state-of-the art in 3 key areas of imageaudiovideo analysis and apply results to our multimedia products

Semantic Interpretation Generate human understandable description of content (eg auto-tagging videos on YouTube Image annotation porn classification etc)Matching Find similar entities from a large corpus (eg find similar on image search video fingerprinting for YouTube etc )Synthesis Generate better imagesvideo by understanding the statistics of a large corpus of images (eg better facades in 3D building on Google Earth automatic shadow removal from areal images etc)

Semantic Interpretation sample problem - Video Annotation

Video metadata has a cognitive cost on the user because they have to type it in be careful about what keywords they use and in general try to make their video searchableMany uploaders donrsquot have the motivation or energy to provide proper metadataNoisy metadata hurts everyone ndash spam misspellings 1337 acronyms etc

Cloud-based ComputingStructured Data

Structured Data on the Web

Discovery and search for structured dataThe deep Web -- significant gap in coverageStructured tables on the Web -- not leveraged in search

Enable easy creation management sharing and publishing of structured data

Fusion Tables wwwgooglecomfusiontables

Google Fusion Tables host manage collaborate on visualize and publish data tables online

What can I do with Fusion Tables

Host data online - and stay in controlcontrol can be at the level of columns or rows

Re-use data without making copies

Collaborate on the detailsMerge data from multiple tablesComment on individual rows columns or cells

Make a map (or chart or timeline) in minutes

Manage data via our site or an API

Fusion Table Example Gallery

Easy Data Upload Attribution recorded

Easily Create Informative Maps

Easily Create Informative Maps

baby steps towards the dream platform

DEMOcircle of blue

Cloud-based ComputingPrediction

1 Upload

2 Train

Upload your training data toGoogle Storage

Build a model from your data

Make new predictions3 Predict

Machine learning as a web serviceSmart Apps for every developer

- RESTful HTTP service- Simple integration

Prediction API

Under the hood many classifiersregressorsRecent research efficient and theoretically principled methods for distributed learning (NIPS-09 HLT-10)

Network costs can be reduced by an order of magnitude with minimal loss in classifier accuracy

Under the API

Operations Reseach and Optimization

Operations Research ChallengesSize Optimization is often NP Complete

Increasing the size by 1 doubles the search spaceThe tools are barely keeping up with the problems

Uncertainty Data is often fuzzy How do you route cars when there are roadblocks new one-ways traffic jamsCan you use optimization in on-line algorithm connected to usersHow well can you optimize against forecasted data how do you react if the forecast is bad

User expectation and requirementsThe definition of problems is also unclear What is the objective What is a good solution Can I violate this requirement By how much

Operations Research Opportunities

Machine Learning can help us in three ways
- By providing guidance towards good solutions
- By qualifying valid solutions
- By reducing the search space

Large computing resources mean we can try a bit harder
Crowd-sourcing means better data, better feedback, better evaluations of algorithms and solutions
Having all our code open-source means we can collaborate on building the best set of tools

See http://code.google.com/p/or-tools

Semantic Processing

Web Inference and Learning
Goal: better understanding of Web content and user intent
Method: algorithms that draw reliable semantic inferences from the wealth of evidence implicit in massive Web data

How to interpret this term in this context?

Does this sentence answer that question?

Will this user click on that ad?

Learning: create concise representations to support good inferences

Meaning from the Web

Elementary semantic inference: what are the possible classes for each instance?

Combination wins

Combined graph: 14M nodes, 75M edges

Applications to Society Follow

Earth Engine

Motivation - Carbon Forest Tracking

UNEP Atlas of our Changing Environment

1975 1989 2001

Rondonia, Brazil

A Sampling of Other Use Cases

Disease Early Warning: Remote surveillance of disease and prediction of epidemics

Population Census: Supplements traditional census and mapping in developing regions

Humanitarian Crisis Mapping: Can detect and monitor a growing range of crisis types
Water Resources: Monitor water quality and availability, and alleviate water shortage problems
Food Security: Famine early warning, rainfall and water requirements estimations, agricultural production estimates, and irrigation and fertilizer supply & demand
Global Education Programs

Parallel Geo-Processing "in the cloud"

(a brief illustration)

Original image is divided into 256px sub-units

Sub-units are distributed to separate machines, where they can be processed in parallel

Thousands can be processed simultaneously

Result is reassembled into a finished image
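
The divide / distribute / reassemble steps can be sketched on a single machine. This is an illustration under stated assumptions, not Earth Engine code: the image is a plain 2-D list of 8-bit values, a thread pool stands in for separate machines, and the per-pixel operation (`process_tile`, which inverts values) is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

TILE = 256  # sub-unit edge length in pixels, as on the slide


def split_into_tiles(image, tile=TILE):
    """Yield (row, col, pixels) for each tile of a 2-D list of pixel values."""
    height, width = len(image), len(image[0])
    for r in range(0, height, tile):
        for c in range(0, width, tile):
            yield r, c, [row[c:c + tile] for row in image[r:r + tile]]


def process_tile(job):
    """Hypothetical per-pixel operation: invert 8-bit values in one tile."""
    r, c, pixels = job
    return r, c, [[255 - p for p in row] for row in pixels]


def process_image(image, tile=TILE, workers=4):
    """Split the image, process tiles in parallel, reassemble the result."""
    height, width = len(image), len(image[0])
    result = [[0] * width for _ in range(height)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        jobs = split_into_tiles(image, tile)
        for r, c, pixels in pool.map(process_tile, jobs):
            for dr, row in enumerate(pixels):
                result[r + dr][c:c + len(row)] = row
    return result


image = [[(x + y) % 256 for x in range(600)] for y in range(500)]
finished = process_image(image)
print(len(finished), len(finished[0]))
```

Because each tile is processed independently of its neighbours, the work scales out by adding workers; operations that need neighbouring pixels would require overlapping tile borders, which this sketch omits.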

Global-scale earth observation and informatics platform
- For public benefit and to support emerging green economy
- Help science come out of research lab and into operational use at scale
- Unprecedented catalog of earth observation data for mining and analysis
- Promote transparency, reproducibility, collaboration: "open science"

Very fast computation of scientific map products
- Intrinsically-parallel pixel processing system
- Built-in Google algorithms as well as user-supplied
- Earth Engine API for 3rd party algorithm development
- Access control, versioning, provenance
- Online and desktop versions (open source desktop version)

On a lot of useful data
- Every available Landsat and MODIS scene (more satellites coming)
- Commercial datasets (very high resolution satellite imagery)
- Environmental data (atmospheric, ocean, terrestrial)
- User-supplied (e.g. in-situ data collected via Android phones)

Overview

Scale of Data

US Satellite imagery dataset: Landsat
- Peru: 60 Landsat scenes (3 Gpix, 20 GB) per covering
- World: 8000 Landsat scenes (2 TB) per covering
- Complete global coverage every 16 days
- Operating since 1972; historical archive holds ~4 PB
US NASA EOS: approaching ~10 PB of Earth images

Europe: European Space Agency (ESA)
- ESA satellite missions: MERIS, Envisat, others
- SpotImage (France): 20M SPOT images since 1986; 10000 new images collected daily; 5+ PB archive
- ESA launching Sentinel-1 in 2011

Representative Examples

Envisat: Gulf Oil Spill, June 2010 (ESA)
MERIS: Hurricane Isabel, Sept 2003 (ESA)
Spot Image: Xingu, Brazilian Amazon

View on YouTube

Health (US)

Google Health: personal health dashboard

Launched in May 2008

Major update: Sept 2010

User controls and owns his/her data

A platform that...
- Provides a dashboard for wellness information & medical records
- Allows user to connect and interact with a broad group of "add on" services
- Includes a non-tethered PHR

Crisis Response

Person Finder

Pre-Earthquake - Aug 26, 2009

1 Day After Earthquake - Jan 13, 2010

13 Days After Earthquake - Jan 25, 2010

Digital Humanities

Illuminating the Humanities

Q: What can you do with 12 million books in over 400 languages, comprised of 5 billion pages and 2 trillion words, all digitized?

A: Look to the humanities for new questions
How would you (re)define Victorian literature?

What are the differences between the English and Latin editions of Hobbes' Leviathan?

How have places changed over the course of history?

Digital Humanities Awards

Research program supporting university research taking a computational approach to traditional humanist questions

US program: Summer 2010

12 projects, 23 researchers, 15 universities

European program: Winter 2010

10 projects planned
$1M total funding

Education

Curriculum Development

Seeding and supporting computing curriculum development
- Exploring computational thinking in K12 (launch late Oct)
- CS4HS: High school computer science (cs4hs.com)
- Undergraduate open source CS curriculum: Google Code University (code.google.com/edu)
- Lantern platform: Wiki for open source curriculum development (in collaboration with Khan Academy)

Talent Development: Google Summer of Code(TM)

Program Genesis
- "Flip bits, not burgers" during summer holidays
- Exposure to real-world software development

Students paired with mentor from OS community
Execute to milestones laid out in accepted application
Stipend allows students to concentrate on OS development

2010: 1026 students, 150 organizations, 69 countries

Technology Leadership: App Inventor for Android

Visual programming environment for Android mobile devices
Helping people become creators (rather than consumers) of technology
Launched in Google Labs July 12, 2010

http://appinventor.googlelabs.com/about

CS4HS European Workshops

École Polytechnique Fédérale de Lausanne (EPFL), Switzerland: Building and Programming Robots

ETH Zurich, Switzerland: ABZ Ausbildungs- und Beratungszentrum fuer Informatikunterricht

Makerere University, Uganda: Grassroot approach to improve the quality of applicants of computing programs at Makerere University

Manchester University, United Kingdom: Animation11

Oslo University, Norway: TENK

Queen Mary University, United Kingdom: cs4fn magazine

RWTH Aachen, Germany: Bright Brains in Computer Science

Sapienza University of Rome, Italy: Challenge and Fun with the CS Olympiads

University of Stuttgart, Germany: UniS2010

Technion, Israel: High School Computer Science Female Students Visits in Google: Impressions, Conceptions and Influences

Trinity College Dublin, Ireland: Computer Programming Outreach B2C

University of Cape Town, South Africa: Project Umonya

University College Dublin, Ireland: CS Summer School

University of Warsaw, Poland: Mastering Programming Skills Workshops for Teachers

Technology Leadership: Google Code University

Course content on current computing technologies and paradigms
- CS Curriculum Search
- Tech Talks on CS Topics
- Tutorials, lecture slides, problem sets for a variety of topic areas

AJAX Programming, Algorithms, Distributed Systems, Web Security, Languages, Practical Skills (MySQL, Linux)

http://code.google.com/edu

Supporting our Academic Institutions

Research Awards Programs - 230+ projects funded in the last year

Next due dates: August 15 (CS Awards), October 15 (Marketing Awards)
research-awards@google.com
Focused Grant Program

Visiting Faculty Program - 20 faculty (ongoing)
university-relations@google.com

PhD Fellowship Program
- 2009: 13 students supported in North America
- 2010: 15 in North America, 16 outside North America
Over 150 other scholarships

~1000 interns worldwide
CS4HS: 1500+ teachers (~100000 students), US & EMEA

Final Thoughts

Scale of Communication and Computing is profound
Endless opportunity for technical growth

Some large themes
Major new application domains

Google rapidly to innovate in science/technology and value to consumers
We are providing increased support for academic institutions in computer science and related areas

It's a most exciting area in which to innovate

Grand Challenges

Morphology
- Translating into morphologically rich languages, e.g. Russian, Hungarian
- Need morphology-aware translation models

Reliability
- Some translation mistakes more severe than others

hotel - Montreal
Heath Ledger - Tom Cruise

Research: How to detect crazy translations?

Long-distance reordering
- Simple case: SVO -> SOV
- (One) approach: parse source & reorder

Issue: parsing accuracy for out-of-domain texts

Finding all Training Data
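
The "parse source & reorder" approach to long-distance reordering listed above can be sketched for the simple SVO to SOV case. The `Clause` structure is a hypothetical flat parse introduced only for this illustration; a real system would obtain the constituents from a syntactic parser, which is exactly where the out-of-domain accuracy issue arises.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Clause:
    """Hypothetical flat parse of a simple clause into three constituents."""
    subject: List[str]
    verb: List[str]
    obj: List[str]


def reorder_svo_to_sov(clause: Clause) -> List[str]:
    """Return the source tokens rearranged into subject-object-verb order."""
    return clause.subject + clause.obj + clause.verb


parsed = Clause(
    subject=["the", "officer"],
    verb=["arrested"],
    obj=["three", "suspects"],
)
print(" ".join(reorder_svo_to_sov(parsed)))
```

If the parser assigns a word to the wrong constituent, the reordered sentence is wrong before translation even starts, so reordering quality is bounded by parse quality.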

How about Poetry?

Paper at EMNLP 2010 conference
"Poetic" Statistical Machine Translation: Rhyme and Meter. D. Genzel, J. Uszkoreit, F. Och. EMNLP 2010

Approach
- Enforce meter and rhyme as extra constraints (similar to language model)
- E.g. iambic pentameter: stress pattern 0101010101
- Produce most probable translation that obeys constraints
- (Function follows form)

Example output (couplet in amphibrachic tetrameter):
An officer stated that three were arrested
and that the equipment is currently tested
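
The meter constraint can be sketched as a filter over candidate translations. This is a simplification under stated assumptions: the cited paper applies the constraints during decoding rather than as a post-filter, and the candidate texts, probabilities, and hand-annotated stress sequences below are made up for illustration (a real system would derive stresses from a pronunciation dictionary).

```python
IAMBIC_PENTAMETER = "01" * 5          # 0101010101, as on the slide
AMPHIBRACHIC_TETRAMETER = "010" * 4   # four unstressed-stressed-unstressed feet


def obeys_meter(stresses, pattern):
    """True if the per-syllable stress string matches the meter exactly."""
    return stresses == pattern


def best_candidate(candidates, pattern):
    """Return the most probable candidate that obeys the meter, or None."""
    feasible = [c for c in candidates if obeys_meter(c["stresses"], pattern)]
    return max(feasible, key=lambda c: c["prob"], default=None)


# Hypothetical candidates; stresses are hand-annotated, one digit per syllable.
candidates = [
    {"text": "An officer said that three people were arrested",
     "stresses": "0100101100010", "prob": 0.40},
    {"text": "An officer stated that three were arrested",
     "stresses": "010010010010", "prob": 0.25},
]

choice = best_candidate(candidates, AMPHIBRACHIC_TETRAMETER)
print(choice["text"])
```

The more probable candidate is rejected because its stress sequence does not fit the meter, which is the sense in which function follows form.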

Speech

Goals for Speech Technology at Google

Much of the world's information is spoken - we need to recognize it before we can organize it

YouTube transcription and translation (breaking the language barrier for YouTube access)

Voicemail transcription
Mobile is the fastest growing and most widespread platform for communication and services that has ever existed

Spoken input and output is key to usability

Our goal is completely ubiquitous availability of speech i/o (every application/service, every usage scenario, every language)

How do we get there?
Delivery from the cloud - support constant iteration and refinement

Operating at large scale - train huge statistical models on huge amounts of data

Learning from use - without human transcription

Challenges
- How do we grow the model to take advantage of the data? (richer models of accent, speaker, noise, etc.)
- Huge computational demands
- Infrastructure demands - parallelization - leverage Google software environment

Training Acoustic Models w/ Unsupervised Learning

Supervised vs. unsupervised training - hours of data vs. error rate

Vision

Computer Vision

Advance state-of-the-art in 3 key areas of image/audio/video analysis and apply results to our multimedia products

Semantic Interpretation: Generate human-understandable description of content (e.g. auto-tagging videos on YouTube, image annotation, porn classification, etc.)
Matching: Find similar entities from a large corpus (e.g. find similar on image search, video fingerprinting for YouTube, etc.)
Synthesis: Generate better images/video by understanding the statistics of a large corpus of images (e.g. better facades in 3D buildings on Google Earth, automatic shadow removal from aerial images, etc.)

Semantic Interpretation: sample problem - Video Annotation

Video metadata has a cognitive cost on the user, because they have to type it in, be careful about what keywords they use, and in general try to make their video searchable
Many uploaders don't have the motivation or energy to provide proper metadata
Noisy metadata hurts everyone - spam, misspellings, 1337, acronyms, etc.


Page 20: Rapid Advances in Computer Science and Opportunities for ... · Machine learning techniques applied to monitoring/controlling such systems Automatic, dynamic world-wide placement

How about Poetry

Paper at EMNLP 2010 conferenceldquoPoeticrdquo Statistical Machine Translation Rhyme and Meter DGenzel JUszkoreit FOch EMNLP 2010

ApproachEnforce meter and rhyme as extra constraints(similar to language model)Eg iambic pentameter stress pattern 0101010101Produce most probable translation that obeys constraints(Function follows form)

Example output (couplet in amphibrachic tetrameterAn officer stated that three were arrestedand that the equipment is currently tested

Speech

Goals for Speech Technology at Google

Much of the worldrsquos information is spoken ndash we need to recognize it before we can organize it

YouTube transcription and translation (breaking the language barrier for YouTube access)

Voicemail transcription Mobile is the fastest growing and most widespread platform for communication and services that has ever existed

Spoken input and output is key to usability

Our goal is completely ubiquitous availability of speech io (every applicationservice every usage scenario every language)

How do we get thereDelivery from the cloud ndash support constant iteration and refinement

Operating at large scale ndash train huge statistical models on huge amounts of data

Learning from use - without human transcription

ChallengesHow do we grow the model to take advantage of the data (richer models of accent speaker noise etc)Huge computational demandsInfrastructure demands ndash parallelization ndash leverage Google software environment

Training Acoustic Models wUnsupervised Learning

Supervised vs unsupervised training - hours of

data vs error rate

Vision

Computer Vision

Advance state-of-the art in 3 key areas of imageaudiovideo analysis and apply results to our multimedia products

Semantic Interpretation Generate human understandable description of content (eg auto-tagging videos on YouTube Image annotation porn classification etc)Matching Find similar entities from a large corpus (eg find similar on image search video fingerprinting for YouTube etc )Synthesis Generate better imagesvideo by understanding the statistics of a large corpus of images (eg better facades in 3D building on Google Earth automatic shadow removal from areal images etc)

Semantic Interpretation sample problem - Video Annotation

Video metadata has a cognitive cost on the user because they have to type it in be careful about what keywords they use and in general try to make their video searchableMany uploaders donrsquot have the motivation or energy to provide proper metadataNoisy metadata hurts everyone ndash spam misspellings 1337 acronyms etc

Cloud-based ComputingStructured Data

Structured Data on the Web

Discovery and search for structured dataThe deep Web -- significant gap in coverageStructured tables on the Web -- not leveraged in search

Enable easy creation management sharing and publishing of structured data

Fusion Tables wwwgooglecomfusiontables

Google Fusion Tables host manage collaborate on visualize and publish data tables online

What can I do with Fusion Tables

Host data online - and stay in controlcontrol can be at the level of columns or rows

Re-use data without making copies

Collaborate on the detailsMerge data from multiple tablesComment on individual rows columns or cells

Make a map (or chart or timeline) in minutes

Manage data via our site or an API

Fusion Table Example Gallery

Easy Data Upload Attribution recorded

Easily Create Informative Maps

Easily Create Informative Maps

baby steps towards the dream platform

DEMOcircle of blue

Cloud-based ComputingPrediction

1 Upload

2 Train

Upload your training data toGoogle Storage

Build a model from your data

Make new predictions3 Predict

Machine learning as a web serviceSmart Apps for every developer

- RESTful HTTP service- Simple integration

Prediction API

Under the hood many classifiersregressorsRecent research efficient and theoretically principled methods for distributed learning (NIPS-09 HLT-10)

Network costs can be reduced by an order of magnitude with minimal loss in classifier accuracy

Under the API

Operations Reseach and Optimization

Operations Research ChallengesSize Optimization is often NP Complete

Increasing the size by 1 doubles the search spaceThe tools are barely keeping up with the problems

Uncertainty Data is often fuzzy How do you route cars when there are roadblocks new one-ways traffic jamsCan you use optimization in on-line algorithm connected to usersHow well can you optimize against forecasted data how do you react if the forecast is bad

User expectation and requirementsThe definition of problems is also unclear What is the objective What is a good solution Can I violate this requirement By how much

Operations Research Opportunities

Machine Learning can help us in two waysBy providing guidance towards good solutionsBy qualifying valid solutionsBy reducing the search space

Large computing resources means we can try a bit harderCrowd-Sourcing means better data better feedback better evaluations of algorithms and solutionsHaving all our code open-source means we can collaborate on building the best set of tools

See httpcodegooglecompor-tools

Semantic Processing

Web Inference and LearningGoal better understanding of Web content and user intentMethod algorithms that draw reliable semantic inferences from the wealth of evidence implicit in massive Web data

How to interpret this term in this context

Does this sentence answer that question

Will this user click on that ad

Learning create concise representations to support good inferences

Meaning from the Web

Elementary semantic inference what are the possible classes for each instance

Combination wins

Combined graph14M nodes75M edges

Applications to Society Follow

Earth Engine

Motivation - Carbon Forest Tracking

UNEP Atlas of our Changing Environment

1975 1989 2001

Rondonia Brazil

A Sampling of Other Use Cases

Disease Early Warning Remote surveillance of disease and prediction of epidemics

Population Census Supplements traditional census and mapping in developing regions

Humanitarian Crisis Mapping Can detect and monitor a growing range of crisis typesWater Resources Monitor water quality and availability and alleviate water shortage problemsFood Security Famine early warning rainfall and water requirements estimations agr production estimates and irrigation and fertilizer supply amp demandGlobal Education Programs

Parallel Geo-Processing ldquoin the cloudrdquo

(a brief illustration)

Original image

Original image is divided into 256px sub-units

Sub-units are distributed

Sub-units are distributed to separate machines

Sub-units are distributed to separate machines where they can be processed in parallel

Thousands can be processed simultaneously

Result is reassembled

Result is reassembled into a finished image

Global-scale earth observation and informatics platformFor public benefit and to support emerging green economyHelp science come out of research lab and into operational use at scaleUnprecedented catalog of earth observation data for mining and analysisPromote transparency reproducibility collaboration ldquoopen sciencerdquo

Very fast computation of scientific map productsIntrinsically-parallel pixel processing systemBuilt-in Google algorithms as well as user-suppliedEarth Engine API for 3rd party algorithm developmentAccess control versioning provenanceOnline and desktop versions (open source desktop version)

On a lot of useful dataEvery available Landsat and MODIS scene (more satellites coming)Commercial datasets (very high resolution satellite imagery)Environmental data (atmospheric ocean terrestrial)User-supplied (ex in-situ data collected via Android phones)

Overview

Scale of Data

US Satellite imagery dataset LandsatPeru 60 Landsat scenes (3Gpix 20GB) per coveringWorld 8000 Landsat scenes (2TB) per coveringComplete global coverage every 16 daysOperating since 1972 historical archive holds ~4PBUS NASA EOS approaching ~10PB of Earth images

Europe European Space Agency (ESA)ESA satellite missions MERIS Envisat othersSpotImage (France) 20M SPOT images since 1986 10000 new images collected daily 5+ PB archiveESA Launching Sentinel-1 in 2011

Representative Examples

Envisat Gulf Oil Spill June 2010 (ESA))MERIS Hurricane Isabel Sept 2003 (ESA) Spot Image Xingu Brazilian Amazon

View on YouTube

Health (US)

Google Health personal health dashboard

Launched in May 2008

Major update Sept 2010

User controls and owns hisher data

A platform thathellipProvides a dashboard for wellness information amp medical recordsAllows user to connect and interact with a broad group of ldquoadd onrdquo servicesIncludes a non-tethered PHR

Crisis Response

Person Finder

Pre-Earthquake - Aug 26 2009

1 Day After Earthquake - Jan 13 2010

13 Days After Earthquake - Jan 25 2010

Digital Humanities

Illuminating the Humanities

Q What can you do with12 million books inover 400 languagescomprised of 5 billion pages and 2 trillion wordsall digitized

A Look to the humanities for new questionsHow would you (re)define Victorian literature

What are the differences between the English and Latin editions of Hobbesrsquo Leviathan

How have places changed over the course of history

Digital Humanities Awards

Research program supporting university research taking a computational approach to traditional humanist questions US program Summer 2010

12 projects23 researchers15 universities

European program Winter 2010

10 projects planned $1M total funding

Education

Curriculum Development

Seeding and supporting computing curriculum development
Exploring computational thinking in K12 (launch late Oct)
CS4HS: High school computer science (cs4hs.com)
Undergraduate open source CS curriculum: Google Code University (code.google.com/edu)
Lantern platform: Wiki for open source curriculum development (in collaboration with Khan Academy)

Talent Development: Google Summer of Code(TM)

Program Genesis
"Flip bits, not burgers" during summer holidays
Exposure to real-world software development

Students paired with mentor from OS community
Execute to milestones laid out in accepted application
Stipend allows students to concentrate on OS development

2010: 1,026 students, 150 organizations, 69 countries

Technology Leadership: App Inventor for Android

Visual programming environment for Android mobile devices
Helping people become creators (rather than consumers) of technology
Launched in Google Labs July 12, 2010

http://appinventor.googlelabs.com/about

CS4HS European Workshops

École Polytechnique Fédérale de Lausanne (EPFL), Switzerland: Building and Programming Robots
ETH Zurich, Switzerland: ABZ (Ausbildungs- und Beratungszentrum für Informatikunterricht)
Makerere University, Uganda: Grassroot approach to improve the quality of applicants of computing programs at Makerere University
Manchester University, United Kingdom: Animation11
Oslo University, Norway: TENK
Queen Mary University, United Kingdom: cs4fn magazine
RWTH Aachen, Germany: Bright Brains in Computer Science
Sapienza University of Rome, Italy: Challenge and Fun with the CS Olympiads
University of Stuttgart, Germany: UniS2010
Technion, Israel: High School Computer Science Female Students' Visits in Google: Impressions, Conceptions and Influences
Trinity College Dublin, Ireland: Computer Programming Outreach B2C
University of Cape Town, South Africa: Project Umonya
University College Dublin, Ireland: CS Summer School
University of Warsaw, Poland: Mastering Programming Skills, Workshops for Teachers

Technology Leadership: Google Code University

Course content on current computing technologies and paradigms
CS Curriculum Search
Tech Talks on CS Topics
Tutorials, lecture slides, problem sets for a variety of topic areas

AJAX Programming
Algorithms
Distributed Systems
Web Security
Languages
Practical Skills (MySQL, Linux)

http://code.google.com/edu

Supporting our Academic Institutions

Research Awards Programs - 230+ projects funded in the last year

Next due dates: August 15 (CS Awards), October 15 (Marketing Awards)
Research-awardsgooglecom
Focused Grant Program

Visiting Faculty Program - 20 faculty (ongoing)
University-relationsgooglecom

PhD Fellowship Program
2009: 13 students supported in North America
2010: 15 in North America, 16 outside North America
Over 150 other scholarships

~1000 interns worldwide
CS4HS: 1500+ teachers (~100,000 students), US & EMEA

Final Thoughts

Scale of Communication and Computing is profound
Endless opportunity for technical growth

Some large themes
Major new application domains

Google: innovating rapidly in science/technology and in value to consumers
We are providing increased support for academic institutions in computer science and related areas

It's a most exciting area in which to innovate


Speech

Goals for Speech Technology at Google

Much of the world's information is spoken - we need to recognize it before we can organize it

YouTube transcription and translation (breaking the language barrier for YouTube access)

Voicemail transcription

Mobile is the fastest growing and most widespread platform for communication and services that has ever existed

Spoken input and output is key to usability

Our goal is completely ubiquitous availability of speech I/O (every application/service, every usage scenario, every language)

How do we get there?
Delivery from the cloud - support constant iteration and refinement

Operating at large scale - train huge statistical models on huge amounts of data

Learning from use - without human transcription

Challenges
How do we grow the model to take advantage of the data (richer models of accent, speaker, noise, etc.)?
Huge computational demands
Infrastructure demands - parallelization - leverage Google software environment

Training Acoustic Models w/ Unsupervised Learning

[Figure: supervised vs. unsupervised training, hours of data vs. error rate]

Vision

Computer Vision

Advance the state of the art in 3 key areas of image/audio/video analysis and apply results to our multimedia products

Semantic Interpretation: Generate human-understandable description of content (e.g. auto-tagging videos on YouTube, image annotation, porn classification, etc.)
Matching: Find similar entities from a large corpus (e.g. find similar on image search, video fingerprinting for YouTube, etc.)
Synthesis: Generate better images/video by understanding the statistics of a large corpus of images (e.g. better facades in 3D buildings on Google Earth, automatic shadow removal from aerial images, etc.)

Semantic Interpretation sample problem - Video Annotation

Video metadata has a cognitive cost on the user because they have to type it in, be careful about what keywords they use and, in general, try to make their video searchable
Many uploaders don't have the motivation or energy to provide proper metadata
Noisy metadata hurts everyone - spam, misspellings, 1337, acronyms, etc.

Cloud-based Computing: Structured Data

Structured Data on the Web

Discovery and search for structured data
The deep Web -- significant gap in coverage
Structured tables on the Web -- not leveraged in search

Enable easy creation, management, sharing and publishing of structured data

Fusion Tables: www.google.com/fusiontables

Google Fusion Tables: host, manage, collaborate on, visualize, and publish data tables online

What can I do with Fusion Tables?

Host data online - and stay in control
Control can be at the level of columns or rows

Re-use data without making copies

Collaborate on the details
Merge data from multiple tables
Comment on individual rows, columns or cells

Make a map (or chart or timeline) in minutes

Manage data via our site or an API
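
The merge operation listed above can be pictured as a key-based join that produces a view over the source tables rather than a copy of them. The following is a minimal sketch in plain Python, not the Fusion Tables API; the table contents, the column names and the `merge_tables` helper are all hypothetical.

```python
from typing import Dict, Iterator, List

Row = Dict[str, object]


def merge_tables(left: List[Row], right: List[Row], key: str) -> Iterator[Row]:
    """Lazily join two tables on a shared key column (inner join)."""
    # Index the right-hand table once so each left row is matched quickly.
    index: Dict[object, List[Row]] = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    for lrow in left:
        for rrow in index.get(lrow[key], []):
            # Build the merged row on demand; the source rows stay untouched.
            yield {**lrow, **rrow}


# Hypothetical example data.
wells = [{"site": "A", "depth_m": 40}, {"site": "B", "depth_m": 65}]
quality = [{"site": "A", "ph": 7.1}, {"site": "C", "ph": 6.4}]

merged = list(merge_tables(wells, quality, key="site"))
print(merged)
```

Because the join is computed when it is read, an update to either source table shows up in the merged result the next time it is evaluated, which is one way to read "re-use data without making copies".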

Fusion Table Example Gallery

Easy Data Upload; Attribution recorded

Easily Create Informative Maps

baby steps towards the dream platform

DEMO: Circle of Blue

Cloud-based Computing: Prediction

1. Upload: Upload your training data to Google Storage
2. Train: Build a model from your data
3. Predict: Make new predictions

Machine learning as a web service
Smart Apps for every developer

- RESTful HTTP service
- Simple integration

Prediction API

Under the hood: many classifiers/regressors
Recent research: efficient and theoretically principled methods for distributed learning (NIPS-09, HLT-10)

Network costs can be reduced by an order of magnitude with minimal loss in classifier accuracy
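
One family of distributed learning methods trains a model on each data shard independently and then averages the parameters, so that only weight vectors cross the network instead of training examples. The sketch below illustrates that idea with a simple perceptron. It is an assumption about the kind of method meant here, not the Prediction API's implementation, and every name and data point in it is hypothetical.

```python
from typing import List, Tuple

Example = Tuple[List[float], int]  # (features, label in {-1, +1})


def train_perceptron(shard: List[Example], dim: int, epochs: int = 10) -> List[float]:
    """Train a plain perceptron on one shard (would run on one worker)."""
    w = [0.0] * dim
    for _ in range(epochs):
        for x, y in shard:
            score = sum(wi * xi for wi, xi in zip(w, x))
            if y * score <= 0:  # misclassified: move w towards y * x
                w = [wi + y * xi for wi, xi in zip(w, x)]
    return w


def mix_parameters(weights: List[List[float]]) -> List[float]:
    """Average the per-shard weight vectors (uniform parameter mixing)."""
    n = len(weights)
    return [sum(col) / n for col in zip(*weights)]


def predict(w: List[float], x: List[float]) -> int:
    """Classify x by the sign of the linear score."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else -1


# Hypothetical toy data: the label is the sign of the first feature;
# the second feature is a constant bias input.
shards = [
    [([2.0, 1.0], 1), ([-1.5, 1.0], -1)],
    [([1.0, 1.0], 1), ([-3.0, 1.0], -1)],
]
local = [train_perceptron(s, dim=2) for s in shards]  # no data leaves a shard
w = mix_parameters(local)  # only len(shards) weight vectors are communicated
print(w, predict(w, [4.0, 1.0]), predict(w, [-4.0, 1.0]))
```

The communication cost here grows with the number of shards and the model dimension rather than with the amount of training data, which is the intuition behind the network saving claimed above.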

Under the API

Operations Research and Optimization

Operations Research Challenges
Size: Optimization is often NP-complete

Increasing the size by 1 doubles the search space
The tools are barely keeping up with the problems

Uncertainty: Data is often fuzzy
How do you route cars when there are roadblocks, new one-ways, traffic jams?
Can you use optimization in an on-line algorithm connected to users?
How well can you optimize against forecasted data? How do you react if the forecast is bad?

User expectation and requirements
The definition of problems is also unclear
What is the objective? What is a good solution? Can I violate this requirement? By how much?
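
The remark about size can be made concrete with a 0/1 selection problem: each extra item is either taken or left out, so n items give 2^n candidate subsets. The brute-force knapsack sketch below counts the candidates it examines. The item data is hypothetical, and practical solvers prune the search rather than enumerate it.

```python
from itertools import product
from typing import List, Tuple


def brute_force_knapsack(items: List[Tuple[int, int]], capacity: int) -> Tuple[int, int]:
    """Return (best value, number of candidates examined).

    items is a list of (weight, value) pairs.
    """
    best, examined = 0, 0
    # Every item is either left out (0) or taken (1): 2**len(items) candidates.
    for choice in product((0, 1), repeat=len(items)):
        examined += 1
        weight = sum(w for (w, _), c in zip(items, choice) if c)
        value = sum(v for (_, v), c in zip(items, choice) if c)
        if weight <= capacity:
            best = max(best, value)
    return best, examined


items = [(3, 4), (4, 5), (2, 3), (5, 8)]  # hypothetical (weight, value) pairs
for n in range(1, len(items) + 1):
    best, examined = brute_force_knapsack(items[:n], capacity=7)
    print(n, examined, best)
```

Each added item doubles the `examined` count, so an instance with 60 items would already need about 10^18 evaluations by this method.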

Operations Research Opportunities

Machine Learning can help us in three ways
By providing guidance towards good solutions
By qualifying valid solutions
By reducing the search space

Large computing resources means we can try a bit harder
Crowd-Sourcing means better data, better feedback, better evaluations of algorithms and solutions
Having all our code open-source means we can collaborate on building the best set of tools

See http://code.google.com/p/or-tools

Semantic Processing

Web Inference and Learning
Goal: better understanding of Web content and user intent
Method: algorithms that draw reliable semantic inferences from the wealth of evidence implicit in massive Web data

How to interpret this term in this context?

Does this sentence answer that question?

Will this user click on that ad?

Learning: create concise representations to support good inferences

Meaning from the Web

Elementary semantic inference: what are the possible classes for each instance?

Combination wins

Combined graph: 14M nodes, 75M edges

Applications to Society Follow

Earth Engine

Motivation - Carbon Forest Tracking

UNEP Atlas of our Changing Environment

Rondônia, Brazil: 1975, 1989, 2001

A Sampling of Other Use Cases

Disease Early Warning: Remote surveillance of disease and prediction of epidemics

Population Census: Supplements traditional census and mapping in developing regions

Humanitarian Crisis Mapping: Can detect and monitor a growing range of crisis types
Water Resources: Monitor water quality and availability and alleviate water shortage problems
Food Security: Famine early warning, rainfall and water requirements estimations, agricultural production estimates, and irrigation and fertilizer supply & demand
Global Education Programs

Parallel Geo-Processing "in the cloud"

(a brief illustration)

Original image is divided into 256px sub-units

Sub-units are distributed to separate machines where they can be processed in parallel

Thousands can be processed simultaneously

Result is reassembled into a finished image
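
The split, process and reassemble steps above can be sketched in a few lines. This is a toy illustration and not Earth Engine code: a thread pool stands in for the separate machines, the per-pixel operation is a placeholder, and all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

Image = List[List[int]]  # rows of pixel values
TILE = 256  # sub-unit edge length in pixels, as in the illustration


def split(image: Image, tile: int = TILE) -> List[Tuple[int, int, Image]]:
    """Cut the image into tile x tile sub-units tagged with their origin."""
    tiles = []
    for top in range(0, len(image), tile):
        for left in range(0, len(image[0]), tile):
            block = [row[left:left + tile] for row in image[top:top + tile]]
            tiles.append((top, left, block))
    return tiles


def process(job: Tuple[int, int, Image]) -> Tuple[int, int, Image]:
    """Per-pixel operation on one sub-unit (placeholder: invert 8-bit values)."""
    top, left, block = job
    return top, left, [[255 - p for p in row] for row in block]


def reassemble(tiles: List[Tuple[int, int, Image]], height: int, width: int) -> Image:
    """Paste processed sub-units back at their original positions."""
    out = [[0] * width for _ in range(height)]
    for top, left, block in tiles:
        for dy, row in enumerate(block):
            out[top + dy][left:left + len(row)] = row
    return out


# Hypothetical 300 x 520 test image; edge tiles are smaller than 256 px.
image = [[(x + y) % 256 for x in range(520)] for y in range(300)]
with ThreadPoolExecutor(max_workers=4) as pool:  # stand-in for separate machines
    done = list(pool.map(process, split(image)))
result = reassemble(done, len(image), len(image[0]))
print(len(done), result[0][:3])
```

The approach works because the operation on each pixel depends only on that pixel, so sub-units need no communication with one another. Operations that look at neighbouring pixels would need overlapping tile borders.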

Global-scale earth observation and informatics platform
For public benefit and to support emerging green economy
Help science come out of research lab and into operational use at scale
Unprecedented catalog of earth observation data for mining and analysis
Promote transparency, reproducibility, collaboration, "open science"

Very fast computation of scientific map products
Intrinsically-parallel pixel processing system
Built-in Google algorithms as well as user-supplied
Earth Engine API for 3rd party algorithm development
Access control, versioning, provenance
Online and desktop versions (open source desktop version)

On a lot of useful data
Every available Landsat and MODIS scene (more satellites coming)
Commercial datasets (very high resolution satellite imagery)
Environmental data (atmospheric, ocean, terrestrial)
User-supplied (ex. in-situ data collected via Android phones)

Overview

Scale of Data

US: Satellite imagery dataset: Landsat
Peru: 60 Landsat scenes (3 Gpix, 20 GB) per covering
World: 8000 Landsat scenes (2 TB) per covering
Complete global coverage every 16 days
Operating since 1972; historical archive holds ~4 PB
US NASA EOS: approaching ~10 PB of Earth images
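
A rough sanity check of the Landsat figures, using decimal units (1 TB = 10^6 MB) and assuming every global covering has the quoted size. This is back-of-the-envelope arithmetic on the numbers in the slide, not a statement about how the actual archive is composed.

```python
# Figures quoted on the slide.
WORLD_COVERING_TB = 2.0      # one global Landsat covering
WORLD_SCENES = 8000          # scenes per global covering
REVISIT_DAYS = 16            # complete global coverage interval
PERU_SCENES, PERU_GB = 60, 20.0

coverings_per_year = 365 / REVISIT_DAYS
tb_per_year = WORLD_COVERING_TB * coverings_per_year
mb_per_scene_world = WORLD_COVERING_TB * 1e6 / WORLD_SCENES
mb_per_scene_peru = PERU_GB * 1e3 / PERU_SCENES

print(f"{coverings_per_year:.1f} coverings/year, about {tb_per_year:.0f} TB/year")
print(f"{mb_per_scene_world:.0f} MB/scene (world), {mb_per_scene_peru:.0f} MB/scene (Peru)")
```

At roughly 46 TB per year for a single covering stream, a multi-petabyte archive is plausible once several satellites, sensors and derived products are counted. The per-scene sizes implied by the two rows differ (about 250 MB versus about 333 MB), so the quoted totals should be read as rounded.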

Europe: European Space Agency (ESA)
ESA satellite missions: MERIS, Envisat, others
SpotImage (France): 20M SPOT images since 1986; 10,000 new images collected daily; 5+ PB archive
ESA: Launching Sentinel-1 in 2011

Representative Examples

Envisat: Gulf Oil Spill, June 2010 (ESA)
MERIS: Hurricane Isabel, Sept 2003 (ESA)
Spot Image: Xingu, Brazilian Amazon

View on YouTube

Health (US)

Google Health personal health dashboard

Launched in May 2008

Major update Sept 2010

Page 22: Rapid Advances in Computer Science and Opportunities for ... · Machine learning techniques applied to monitoring/controlling such systems Automatic, dynamic world-wide placement

Goals for Speech Technology at Google

Much of the worldrsquos information is spoken ndash we need to recognize it before we can organize it

YouTube transcription and translation (breaking the language barrier for YouTube access)

Voicemail transcription Mobile is the fastest growing and most widespread platform for communication and services that has ever existed

Spoken input and output is key to usability

Our goal is completely ubiquitous availability of speech io (every applicationservice every usage scenario every language)

How do we get thereDelivery from the cloud ndash support constant iteration and refinement

Operating at large scale ndash train huge statistical models on huge amounts of data

Learning from use - without human transcription

ChallengesHow do we grow the model to take advantage of the data (richer models of accent speaker noise etc)Huge computational demandsInfrastructure demands ndash parallelization ndash leverage Google software environment

Training Acoustic Models wUnsupervised Learning

Supervised vs unsupervised training - hours of

data vs error rate

Vision

Computer Vision

Advance state-of-the art in 3 key areas of imageaudiovideo analysis and apply results to our multimedia products

Semantic Interpretation Generate human understandable description of content (eg auto-tagging videos on YouTube Image annotation porn classification etc)Matching Find similar entities from a large corpus (eg find similar on image search video fingerprinting for YouTube etc )Synthesis Generate better imagesvideo by understanding the statistics of a large corpus of images (eg better facades in 3D building on Google Earth automatic shadow removal from areal images etc)

Semantic Interpretation sample problem - Video Annotation

Video metadata has a cognitive cost on the user because they have to type it in be careful about what keywords they use and in general try to make their video searchableMany uploaders donrsquot have the motivation or energy to provide proper metadataNoisy metadata hurts everyone ndash spam misspellings 1337 acronyms etc

Cloud-based ComputingStructured Data

Structured Data on the Web

Discovery and search for structured dataThe deep Web -- significant gap in coverageStructured tables on the Web -- not leveraged in search

Enable easy creation management sharing and publishing of structured data

Fusion Tables wwwgooglecomfusiontables

Google Fusion Tables host manage collaborate on visualize and publish data tables online

What can I do with Fusion Tables

Host data online - and stay in controlcontrol can be at the level of columns or rows

Re-use data without making copies

Collaborate on the detailsMerge data from multiple tablesComment on individual rows columns or cells

Make a map (or chart or timeline) in minutes

Manage data via our site or an API

Fusion Table Example Gallery

Easy Data Upload Attribution recorded

Easily Create Informative Maps

Easily Create Informative Maps

baby steps towards the dream platform

DEMOcircle of blue

Cloud-based ComputingPrediction

1 Upload

2 Train

Upload your training data toGoogle Storage

Build a model from your data

Make new predictions3 Predict

Machine learning as a web serviceSmart Apps for every developer

- RESTful HTTP service- Simple integration

Prediction API

Under the hood many classifiersregressorsRecent research efficient and theoretically principled methods for distributed learning (NIPS-09 HLT-10)

Network costs can be reduced by an order of magnitude with minimal loss in classifier accuracy

Under the API

Operations Reseach and Optimization

Operations Research ChallengesSize Optimization is often NP Complete

Increasing the size by 1 doubles the search spaceThe tools are barely keeping up with the problems

Uncertainty Data is often fuzzy How do you route cars when there are roadblocks new one-ways traffic jamsCan you use optimization in on-line algorithm connected to usersHow well can you optimize against forecasted data how do you react if the forecast is bad

User expectation and requirementsThe definition of problems is also unclear What is the objective What is a good solution Can I violate this requirement By how much

Operations Research Opportunities

Machine Learning can help us in two waysBy providing guidance towards good solutionsBy qualifying valid solutionsBy reducing the search space

Large computing resources means we can try a bit harderCrowd-Sourcing means better data better feedback better evaluations of algorithms and solutionsHaving all our code open-source means we can collaborate on building the best set of tools

See httpcodegooglecompor-tools

Semantic Processing

Web Inference and LearningGoal better understanding of Web content and user intentMethod algorithms that draw reliable semantic inferences from the wealth of evidence implicit in massive Web data

How to interpret this term in this context

Does this sentence answer that question

Will this user click on that ad

Learning create concise representations to support good inferences

Meaning from the Web

Elementary semantic inference what are the possible classes for each instance

Combination wins

Combined graph14M nodes75M edges

Applications to Society Follow

Earth Engine

Motivation - Carbon Forest Tracking

UNEP Atlas of our Changing Environment

1975 1989 2001

Rondonia Brazil

A Sampling of Other Use Cases

Disease Early Warning Remote surveillance of disease and prediction of epidemics

Population Census Supplements traditional census and mapping in developing regions

Humanitarian Crisis Mapping Can detect and monitor a growing range of crisis typesWater Resources Monitor water quality and availability and alleviate water shortage problemsFood Security Famine early warning rainfall and water requirements estimations agr production estimates and irrigation and fertilizer supply amp demandGlobal Education Programs

Parallel Geo-Processing ldquoin the cloudrdquo

(a brief illustration)

Original image

Original image is divided into 256px sub-units

Sub-units are distributed

Sub-units are distributed to separate machines

Sub-units are distributed to separate machines where they can be processed in parallel

Thousands can be processed simultaneously

Result is reassembled

Result is reassembled into a finished image

Global-scale earth observation and informatics platformFor public benefit and to support emerging green economyHelp science come out of research lab and into operational use at scaleUnprecedented catalog of earth observation data for mining and analysisPromote transparency reproducibility collaboration ldquoopen sciencerdquo

Very fast computation of scientific map productsIntrinsically-parallel pixel processing systemBuilt-in Google algorithms as well as user-suppliedEarth Engine API for 3rd party algorithm developmentAccess control versioning provenanceOnline and desktop versions (open source desktop version)

On a lot of useful dataEvery available Landsat and MODIS scene (more satellites coming)Commercial datasets (very high resolution satellite imagery)Environmental data (atmospheric ocean terrestrial)User-supplied (ex in-situ data collected via Android phones)

Overview

Scale of Data

US Satellite imagery dataset LandsatPeru 60 Landsat scenes (3Gpix 20GB) per coveringWorld 8000 Landsat scenes (2TB) per coveringComplete global coverage every 16 daysOperating since 1972 historical archive holds ~4PBUS NASA EOS approaching ~10PB of Earth images

Europe European Space Agency (ESA)ESA satellite missions MERIS Envisat othersSpotImage (France) 20M SPOT images since 1986 10000 new images collected daily 5+ PB archiveESA Launching Sentinel-1 in 2011

Representative Examples

Envisat Gulf Oil Spill June 2010 (ESA))MERIS Hurricane Isabel Sept 2003 (ESA) Spot Image Xingu Brazilian Amazon

View on YouTube

Health (US)

Google Health personal health dashboard

Launched in May 2008

Major update Sept 2010

User controls and owns hisher data

A platform thathellipProvides a dashboard for wellness information amp medical recordsAllows user to connect and interact with a broad group of ldquoadd onrdquo servicesIncludes a non-tethered PHR

Crisis Response

Person Finder

Pre-Earthquake - Aug 26 2009

1 Day After Earthquake - Jan 13 2010

13 Days After Earthquake - Jan 25 2010

Digital Humanities

Illuminating the Humanities

Q What can you do with12 million books inover 400 languagescomprised of 5 billion pages and 2 trillion wordsall digitized

A Look to the humanities for new questionsHow would you (re)define Victorian literature

What are the differences between the English and Latin editions of Hobbesrsquo Leviathan

How have places changed over the course of history

Digital Humanities Awards

Research program supporting university research taking a computational approach to traditional humanist questions US program Summer 2010

12 projects23 researchers15 universities

European program Winter 2010

10 projects planned $1M total funding

Education

Curriculum Development

Seeding and supporting computing curriculum developmentExploring computational thinking in K12 (launch late Oct)CS4HS High school computer science (cs4hscom)Undergraduate open source CS curriculum Google Code University (codegooglecomedu)Lantern platform Wiki for open source curriculum development (in collaboration with Khan Academy)

Talent Development Google Summer of CodeTM

Program GenesisrdquoFlip bits not burgersrdquo during summer holidaysExposure to real-world software development

Students paired with mentor from OS communityExecute to milestones laid out in accepted applicationStipend allows students to concentrate on OS development

20101026 students150 organizations 69 countries

Technology Leadership App Inventor for Android

Visual programming environment for Android mobile devicesHelping people become creators (rather than consumers) of technologyLaunched in Google Labs July 12 2010

httpappinventorgooglelabscomabout

CS4HS European WorkshopsEacutecole Polytechnique Feacutedeacuterale de Lausanne (EPFL) Switzerland Building and Programming Robots

ETH Zurich Switzerland ABZ Ausbildung - und Beratungszentrum fuer Informatikunterricht

Makerere University Uganda Grassroot approach to improve the quality of applicants of computing programs at Makerere University

Manchester University United Kingdom Animation11

Oslo University Norway TENK

Queen Mary University United Kingdom cs4fn magazine

RWTH Aachen Germany Bright Brains in Computer Science

Sapienza University of Rome Italy Challenge and Fun with the CS Olympiads

University of Stuttgart Germany UniS2010

Technion Israel High School Computer Science Female Students Visits in Google Impressions Conceptions and Influences

Trinity College Dublin Ireland Computer Programming Outreach B2C

University of Cape Town South Africa Project Umonya

University College Dublin Ireland CS Summer School

University of Warsaw Poland Mastering Programing Skills Workshops for Teachers

Technology Leadership Google Code University

Course content on current computing technologies and paradigmsCS Curriculum SearchTech Talks on CS TopicsTutorials lecture slides problem sets for a variety of topic areas

AJAX ProgrammingAlgorithmsDistributed SystemsWeb SecurityLanguagesPractical Skills (MySQL Linux)

httpcodegooglecomedu

Supporting our Academic Institutions

Research Awards Programs - 230+ projects funded in the last year

Next Due dates August 15 (CS Awards) October 15 (Marketing Awards)Research-awardsgooglecomFocused Grant Program

Visiting Faculty Program - 20 faculty (ongoing)University-relationsgooglecom

PhD Fellowship Program2009 13 students supported in North America2010 15 in North America 16 outside North AmericaOver 150 other scholarships

~1000 interns worldwideCS4HS 1500+ teachers (~100000 students) US amp EMEA

Final Thoughts

Scale of Communication and Computing is profoundEndless opportunity for technical growth

Some large themesMajor new application domains

Google rapidly to innovate in sciencetechnology and value to consumersWe are providing increased support for academic institutions in computer science and related areas

Its a most exciting area in which to innovate

Page 23: Rapid Advances in Computer Science and Opportunities for ... · Machine learning techniques applied to monitoring/controlling such systems Automatic, dynamic world-wide placement

Learning from use - without human transcription

ChallengesHow do we grow the model to take advantage of the data (richer models of accent speaker noise etc)Huge computational demandsInfrastructure demands ndash parallelization ndash leverage Google software environment

Training Acoustic Models wUnsupervised Learning

Supervised vs unsupervised training - hours of

data vs error rate

Vision

Computer Vision

Advance state-of-the art in 3 key areas of imageaudiovideo analysis and apply results to our multimedia products

Semantic Interpretation Generate human understandable description of content (eg auto-tagging videos on YouTube Image annotation porn classification etc)Matching Find similar entities from a large corpus (eg find similar on image search video fingerprinting for YouTube etc )Synthesis Generate better imagesvideo by understanding the statistics of a large corpus of images (eg better facades in 3D building on Google Earth automatic shadow removal from areal images etc)

Semantic Interpretation sample problem - Video Annotation

Video metadata has a cognitive cost on the user because they have to type it in be careful about what keywords they use and in general try to make their video searchableMany uploaders donrsquot have the motivation or energy to provide proper metadataNoisy metadata hurts everyone ndash spam misspellings 1337 acronyms etc

Cloud-based ComputingStructured Data

Structured Data on the Web

Discovery and search for structured dataThe deep Web -- significant gap in coverageStructured tables on the Web -- not leveraged in search

Enable easy creation management sharing and publishing of structured data

Fusion Tables wwwgooglecomfusiontables

Google Fusion Tables host manage collaborate on visualize and publish data tables online

What can I do with Fusion Tables

Host data online - and stay in controlcontrol can be at the level of columns or rows

Re-use data without making copies

Collaborate on the detailsMerge data from multiple tablesComment on individual rows columns or cells

Make a map (or chart or timeline) in minutes

Manage data via our site or an API

Fusion Table Example Gallery

Easy Data Upload Attribution recorded

Easily Create Informative Maps

Easily Create Informative Maps

baby steps towards the dream platform

DEMOcircle of blue

Cloud-based ComputingPrediction

1 Upload

2 Train

Upload your training data toGoogle Storage

Build a model from your data

Make new predictions3 Predict

Machine learning as a web serviceSmart Apps for every developer

- RESTful HTTP service- Simple integration

Prediction API

Under the hood many classifiersregressorsRecent research efficient and theoretically principled methods for distributed learning (NIPS-09 HLT-10)

Network costs can be reduced by an order of magnitude with minimal loss in classifier accuracy

Under the API

Operations Reseach and Optimization

Operations Research ChallengesSize Optimization is often NP Complete

Increasing the size by 1 doubles the search spaceThe tools are barely keeping up with the problems

Uncertainty Data is often fuzzy How do you route cars when there are roadblocks new one-ways traffic jamsCan you use optimization in on-line algorithm connected to usersHow well can you optimize against forecasted data how do you react if the forecast is bad

User expectation and requirementsThe definition of problems is also unclear What is the objective What is a good solution Can I violate this requirement By how much

Operations Research Opportunities

Machine Learning can help us in two waysBy providing guidance towards good solutionsBy qualifying valid solutionsBy reducing the search space

Large computing resources means we can try a bit harderCrowd-Sourcing means better data better feedback better evaluations of algorithms and solutionsHaving all our code open-source means we can collaborate on building the best set of tools

See httpcodegooglecompor-tools

Semantic Processing

Web Inference and LearningGoal better understanding of Web content and user intentMethod algorithms that draw reliable semantic inferences from the wealth of evidence implicit in massive Web data

How to interpret this term in this context

Does this sentence answer that question

Will this user click on that ad

Learning create concise representations to support good inferences

Meaning from the Web

Elementary semantic inference what are the possible classes for each instance

Combination wins

Combined graph14M nodes75M edges

Applications to Society Follow

Earth Engine

Motivation - Carbon Forest Tracking

UNEP Atlas of our Changing Environment

1975 1989 2001

Rondonia Brazil

A Sampling of Other Use Cases

Disease Early Warning: Remote surveillance of disease and prediction of epidemics

Population Census: Supplements traditional census and mapping in developing regions

Humanitarian Crisis Mapping: Can detect and monitor a growing range of crisis types

Water Resources: Monitor water quality and availability and alleviate water shortage problems

Food Security: Famine early warning, rainfall and water requirements estimations, agricultural production estimates, and irrigation and fertilizer supply & demand

Global Education Programs

Parallel Geo-Processing "in the cloud"

(a brief illustration)

Original image is divided into 256px sub-units

Sub-units are distributed to separate machines where they can be processed in parallel

Thousands can be processed simultaneously

Result is reassembled into a finished image
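
The split, process, and reassemble steps in this illustration can be sketched as follows. This is a minimal sketch under stated assumptions, not Earth Engine code: threads stand in for the separate machines, the per-pixel operation (inverting an 8-bit value) is a placeholder, and all function names are hypothetical. Only the 256px tile size comes from the slide.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

Image = List[List[int]]  # rows of pixel values
TILE = 256  # sub-unit edge length in pixels, as on the slide


def split_into_tiles(image: Image, tile: int = TILE) -> List[Tuple[int, int, Image]]:
    """Cut the image into tile x tile blocks, remembering each block's origin."""
    height, width = len(image), len(image[0])
    tiles = []
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            block = [row[left:left + tile] for row in image[top:top + tile]]
            tiles.append((top, left, block))
    return tiles


def process_tile(job: Tuple[int, int, Image]) -> Tuple[int, int, Image]:
    """Placeholder per-pixel operation: invert an 8-bit value."""
    top, left, block = job
    return top, left, [[255 - px for px in row] for row in block]


def reassemble(tiles: List[Tuple[int, int, Image]], height: int, width: int) -> Image:
    """Paste processed blocks back at their original positions."""
    out = [[0] * width for _ in range(height)]
    for top, left, block in tiles:
        for dy, row in enumerate(block):
            out[top + dy][left:left + len(row)] = row
    return out


def process_image(image: Image, workers: int = 4) -> Image:
    """Split, process tiles concurrently, and rebuild the full image."""
    tiles = split_into_tiles(image)
    # Threads stand in for the separate machines in the illustration.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        done = list(pool.map(process_tile, tiles))
    return reassemble(done, len(image), len(image[0]))


image = [[(x + y) % 256 for x in range(600)] for y in range(300)]
result = process_image(image)
```

Because each tile carries its own origin, tiles can finish in any order and still be placed correctly, which is what makes per-pixel work of this kind easy to spread over many workers.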

Global-scale earth observation and informatics platform
- For public benefit and to support emerging green economy
- Help science come out of research lab and into operational use at scale
- Unprecedented catalog of earth observation data for mining and analysis
- Promote transparency, reproducibility, collaboration, "open science"

Very fast computation of scientific map products
- Intrinsically-parallel pixel processing system
- Built-in Google algorithms as well as user-supplied
- Earth Engine API for 3rd party algorithm development
- Access control, versioning, provenance
- Online and desktop versions (open source desktop version)

On a lot of useful data
- Every available Landsat and MODIS scene (more satellites coming)
- Commercial datasets (very high resolution satellite imagery)
- Environmental data (atmospheric, ocean, terrestrial)
- User-supplied (ex. in-situ data collected via Android phones)

Overview

Scale of Data

US: Satellite imagery dataset Landsat
- Peru: 60 Landsat scenes (3Gpix, 20GB) per covering
- World: 8000 Landsat scenes (2TB) per covering
- Complete global coverage every 16 days
- Operating since 1972; historical archive holds ~4PB

US: NASA EOS approaching ~10PB of Earth images

Europe: European Space Agency (ESA)
- ESA satellite missions: MERIS, Envisat, others
- SpotImage (France): 20M SPOT images since 1986, 10000 new images collected daily, 5+ PB archive
- ESA: Launching Sentinel-1 in 2011

Representative Examples

Envisat: Gulf Oil Spill, June 2010 (ESA)
MERIS: Hurricane Isabel, Sept 2003 (ESA)
Spot Image: Xingu, Brazilian Amazon

View on YouTube

Health (US)

Google Health: personal health dashboard

Launched in May 2008

Major update Sept 2010

User controls and owns his/her data

A platform that...
- Provides a dashboard for wellness information & medical records
- Allows user to connect and interact with a broad group of "add on" services
- Includes a non-tethered PHR

Crisis Response

Person Finder

Pre-Earthquake - Aug 26, 2009

1 Day After Earthquake - Jan 13, 2010

13 Days After Earthquake - Jan 25, 2010

Digital Humanities

Illuminating the Humanities

Q: What can you do with 12 million books in over 400 languages, comprising 5 billion pages and 2 trillion words, all digitized?

A: Look to the humanities for new questions

How would you (re)define Victorian literature?

What are the differences between the English and Latin editions of Hobbes' Leviathan?

How have places changed over the course of history?

Digital Humanities Awards

Research program supporting university research taking a computational approach to traditional humanist questions

US program: Summer 2010

12 projects, 23 researchers, 15 universities

European program: Winter 2010

10 projects planned
$1M total funding

Education

Curriculum Development

Seeding and supporting computing curriculum development
- Exploring computational thinking in K12 (launch late Oct)
- CS4HS: High school computer science (cs4hs.com)
- Undergraduate open source CS curriculum: Google Code University (code.google.com/edu)
- Lantern platform: Wiki for open source curriculum development (in collaboration with Khan Academy)

Talent Development: Google Summer of Code™

Program Genesis
- "Flip bits, not burgers" during summer holidays
- Exposure to real-world software development

Students paired with mentor from OS community
Execute to milestones laid out in accepted application
Stipend allows students to concentrate on OS development

2010: 1026 students, 150 organizations, 69 countries

Technology Leadership: App Inventor for Android

Visual programming environment for Android mobile devices
Helping people become creators (rather than consumers) of technology
Launched in Google Labs July 12, 2010

http://appinventor.googlelabs.com/about

CS4HS European Workshops

École Polytechnique Fédérale de Lausanne (EPFL), Switzerland: Building and Programming Robots

ETH Zurich, Switzerland: ABZ Ausbildungs- und Beratungszentrum für Informatikunterricht

Makerere University, Uganda: Grassroot approach to improve the quality of applicants of computing programs at Makerere University

Manchester University, United Kingdom: Animation11

Oslo University, Norway: TENK

Queen Mary University, United Kingdom: cs4fn magazine

RWTH Aachen, Germany: Bright Brains in Computer Science

Sapienza University of Rome, Italy: Challenge and Fun with the CS Olympiads

University of Stuttgart, Germany: UniS2010

Technion, Israel: High School Computer Science Female Students' Visits in Google: Impressions, Conceptions and Influences

Trinity College Dublin, Ireland: Computer Programming Outreach B2C

University of Cape Town, South Africa: Project Umonya

University College Dublin, Ireland: CS Summer School

University of Warsaw, Poland: Mastering Programming Skills Workshops for Teachers

Technology Leadership: Google Code University

Course content on current computing technologies and paradigms
- CS Curriculum Search
- Tech Talks on CS Topics
- Tutorials, lecture slides, problem sets for a variety of topic areas

AJAX Programming, Algorithms, Distributed Systems, Web Security, Languages, Practical Skills (MySQL, Linux)

http://code.google.com/edu

Supporting our Academic Institutions

Research Awards Programs - 230+ projects funded in the last year

Next due dates: August 15 (CS Awards), October 15 (Marketing Awards)
Research-awardsgooglecom
Focused Grant Program

Visiting Faculty Program - 20 faculty (ongoing)
University-relationsgooglecom

PhD Fellowship Program
- 2009: 13 students supported in North America
- 2010: 15 in North America, 16 outside North America
Over 150 other scholarships

~1000 interns worldwide
CS4HS: 1500+ teachers (~100000 students), US & EMEA

Final Thoughts

Scale of Communication and Computing is profound
Endless opportunity for technical growth

Some large themes
Major new application domains

Google is moving rapidly to innovate in science/technology and value to consumers
We are providing increased support for academic institutions in computer science and related areas

It's a most exciting area in which to innovate

Vision

Computer Vision

Advance the state of the art in 3 key areas of image/audio/video analysis and apply results to our multimedia products

Semantic Interpretation: Generate human-understandable description of content (e.g. auto-tagging videos on YouTube, image annotation, porn classification, etc.)

Matching: Find similar entities from a large corpus (e.g. find similar on image search, video fingerprinting for YouTube, etc.)

Synthesis: Generate better images/video by understanding the statistics of a large corpus of images (e.g. better facades in 3D buildings on Google Earth, automatic shadow removal from aerial images, etc.)

Semantic Interpretation sample problem - Video Annotation

Video metadata has a cognitive cost on the user because they have to type it in, be careful about what keywords they use, and in general try to make their video searchable
Many uploaders don't have the motivation or energy to provide proper metadata
Noisy metadata hurts everyone - spam, misspellings, 1337, acronyms, etc.

Cloud-based Computing: Structured Data

Structured Data on the Web

Discovery and search for structured data
- The deep Web -- significant gap in coverage
- Structured tables on the Web -- not leveraged in search

Enable easy creation, management, sharing and publishing of structured data

Fusion Tables: www.google.com/fusiontables

Google Fusion Tables: host, manage, collaborate on, visualize, and publish data tables online

What can I do with Fusion Tables?

Host data online - and stay in control
- control can be at the level of columns or rows

Re-use data without making copies

Collaborate on the details
- Merge data from multiple tables
- Comment on individual rows, columns or cells

Make a map (or chart or timeline) in minutes

Manage data via our site or an API
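
Merging data from multiple tables amounts to a join on a shared column. The sketch below shows that idea in plain Python. It does not use the Fusion Tables API, and the `merge_tables` helper, the column names, and the table contents are all invented for illustration.

```python
from typing import Dict, List

Row = Dict[str, object]


def merge_tables(left: List[Row], right: List[Row], key: str) -> List[Row]:
    """Inner-join two tables (lists of dict rows) on a shared key column."""
    index: Dict[object, List[Row]] = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    merged = []
    for row in left:
        for match in index.get(row[key], []):
            merged.append({**row, **match})
    return merged


# Hypothetical tables owned by two different publishers.
rainfall = [
    {"region": "North", "rain_mm": 820},
    {"region": "South", "rain_mm": 310},
]
population = [
    {"region": "North", "people": 120000},
    {"region": "South", "people": 450000},
    {"region": "East", "people": 90000},
]

combined = merge_tables(rainfall, population, key="region")
```

The merged view is built from the two inputs without modifying either of them; rows with no partner in the other table (here "East") are left out of an inner join.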

Fusion Table Example Gallery

Easy Data Upload; Attribution recorded

Easily Create Informative Maps

baby steps towards the dream platform

DEMO: Circle of Blue

Cloud-based Computing: Prediction

1. Upload: Upload your training data to Google Storage

2. Train: Build a model from your data

3. Predict: Make new predictions

Machine learning as a web service
Smart Apps for every developer

- RESTful HTTP service
- Simple integration

Prediction API
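
The upload, train, predict sequence can be mimicked locally to show the shape of the workflow. The class below is a toy in-memory stand-in: `ToyPredictionService`, its method names, and the nearest-centroid model are assumptions made for this sketch and do not reflect the interface or the algorithms of the actual service, which is described only as a RESTful HTTP service.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


class ToyPredictionService:
    """In-memory stand-in for an upload / train / predict workflow."""

    def __init__(self) -> None:
        self._data: List[Tuple[str, Sequence[float]]] = []
        self._centroids: Dict[str, List[float]] = {}

    def upload(self, rows: List[Tuple[str, Sequence[float]]]) -> None:
        """Step 1: store labelled rows of the form (label, features)."""
        self._data.extend(rows)

    def train(self) -> None:
        """Step 2: build a model, here one mean feature vector per label."""
        groups: Dict[str, List[Sequence[float]]] = defaultdict(list)
        for label, features in self._data:
            groups[label].append(features)
        self._centroids = {
            label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in groups.items()
        }

    def predict(self, features: Sequence[float]) -> str:
        """Step 3: return the label whose centroid is closest."""
        if not self._centroids:
            raise RuntimeError("call train() before predict()")

        def dist2(centroid: Sequence[float]) -> float:
            return sum((a - b) ** 2 for a, b in zip(features, centroid))

        return min(self._centroids, key=lambda lbl: dist2(self._centroids[lbl]))


service = ToyPredictionService()
service.upload([("short", (1.0, 2.0)), ("short", (2.0, 1.0)),
                ("long", (9.0, 8.0)), ("long", (8.0, 9.0))])
service.train()
label = service.predict((7.5, 7.0))
```

The point of the three-call shape is that the caller never touches the model itself: it supplies labelled data, asks for training, and then sends new inputs for prediction.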

Page 26: Rapid Advances in Computer Science and Opportunities for ... · Machine learning techniques applied to monitoring/controlling such systems Automatic, dynamic world-wide placement

Semantic Interpretation sample problem - Video Annotation

Video metadata has a cognitive cost on the user because they have to type it in be careful about what keywords they use and in general try to make their video searchableMany uploaders donrsquot have the motivation or energy to provide proper metadataNoisy metadata hurts everyone ndash spam misspellings 1337 acronyms etc

Cloud-based ComputingStructured Data

Structured Data on the Web

Discovery and search for structured dataThe deep Web -- significant gap in coverageStructured tables on the Web -- not leveraged in search

Enable easy creation management sharing and publishing of structured data

Fusion Tables wwwgooglecomfusiontables

Google Fusion Tables host manage collaborate on visualize and publish data tables online

What can I do with Fusion Tables

Host data online - and stay in controlcontrol can be at the level of columns or rows

Re-use data without making copies

Collaborate on the detailsMerge data from multiple tablesComment on individual rows columns or cells

Make a map (or chart or timeline) in minutes

Manage data via our site or an API

Fusion Table Example Gallery

Easy Data Upload Attribution recorded

Easily Create Informative Maps

Easily Create Informative Maps

baby steps towards the dream platform

DEMOcircle of blue

Cloud-based ComputingPrediction

1 Upload

2 Train

Upload your training data toGoogle Storage

Build a model from your data

Make new predictions3 Predict

Machine learning as a web serviceSmart Apps for every developer

- RESTful HTTP service- Simple integration

Prediction API

Under the hood many classifiersregressorsRecent research efficient and theoretically principled methods for distributed learning (NIPS-09 HLT-10)

Network costs can be reduced by an order of magnitude with minimal loss in classifier accuracy

Under the API

Operations Reseach and Optimization

Operations Research Challenges

Size: Optimization is often NP-complete
- Increasing the size by 1 doubles the search space
- The tools are barely keeping up with the problems

Uncertainty: Data is often fuzzy
- How do you route cars when there are roadblocks, new one-ways, traffic jams?
- Can you use optimization in an on-line algorithm connected to users?
- How well can you optimize against forecasted data? How do you react if the forecast is bad?

User expectations and requirements: The definition of problems is also unclear
- What is the objective?
- What is a good solution?
- Can I violate this requirement? By how much?
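A small sketch of the "size" point, assuming a problem made of n yes/no decisions: every extra decision doubles the number of candidate assignments, so exhaustive search only works for tiny instances. The 0/1 knapsack instance below is a standard textbook example, used here purely for illustration.

```python
from itertools import product

def search_space_size(n):
    """Number of candidate assignments for n yes/no decisions."""
    return 2 ** n

def best_subset(values, weights, capacity):
    """Solve a tiny 0/1 knapsack by trying every assignment."""
    best_value, best_pick = 0, ()
    for pick in product((0, 1), repeat=len(values)):
        weight = sum(w for w, p in zip(weights, pick) if p)
        value = sum(v for v, p in zip(values, pick) if p)
        if weight <= capacity and value > best_value:
            best_value, best_pick = value, pick
    return best_value, best_pick

for n in (10, 11, 12):
    print(n, search_space_size(n))

print(best_subset([60, 100, 120], [10, 20, 30], 50))
```

Three items mean 8 assignments to check; 60 items would mean more than 10^18, which is why practical solvers rely on pruning and heuristics rather than enumeration.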

Operations Research Opportunities

Machine Learning can help us in several ways
- By providing guidance towards good solutions
- By qualifying valid solutions
- By reducing the search space

Large computing resources mean we can try a bit harder
Crowd-sourcing means better data, better feedback, better evaluations of algorithms and solutions
Having all our code open-source means we can collaborate on building the best set of tools

See http://code.google.com/p/or-tools

Semantic Processing

Web Inference and Learning
Goal: better understanding of Web content and user intent
Method: algorithms that draw reliable semantic inferences from the wealth of evidence implicit in massive Web data

How to interpret this term in this context?

Does this sentence answer that question?

Will this user click on that ad?

Learning: create concise representations to support good inferences

Meaning from the Web

Elementary semantic inference: what are the possible classes for each instance?

Combination wins

Combined graph: 14M nodes, 75M edges
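One way to picture a combined instance-class graph is a sketch under stated assumptions: two hypothetical evidence sources each propose (instance, class, count) edges, and combining them means summing the counts per edge. The source names, the counts, and the summing rule are illustrative and not taken from the talk.

```python
from collections import defaultdict

def combine(*sources):
    """Merge (instance, class, count) edges from several sources into one graph."""
    graph = defaultdict(lambda: defaultdict(int))
    for edges in sources:
        for instance, cls, count in edges:
            graph[instance][cls] += count
    return graph

def ranked_classes(graph, instance):
    """Candidate classes for an instance, strongest combined evidence first."""
    return sorted(graph[instance].items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical evidence counts from two extraction sources.
table_edges = [("jaguar", "animal", 3), ("jaguar", "car", 1)]
text_edges = [("jaguar", "car", 4), ("jaguar", "animal", 1)]

graph = combine(table_edges, text_edges)
print(ranked_classes(graph, "jaguar"))
```

Neither source alone gives this ranking: the first favors "animal" and the second favors "car", and only the combined counts settle the order.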

Applications to Society Follow

Earth Engine

Motivation - Carbon Forest Tracking

UNEP Atlas of our Changing Environment: Rondonia, Brazil (1975, 1989, 2001)

A Sampling of Other Use Cases

Disease Early Warning: Remote surveillance of disease and prediction of epidemics

Population Census: Supplements traditional census and mapping in developing regions

Humanitarian Crisis Mapping: Can detect and monitor a growing range of crisis types
Water Resources: Monitor water quality and availability, and alleviate water shortage problems
Food Security: Famine early warning, rainfall and water requirements estimations, agricultural production estimates, and irrigation and fertilizer supply and demand
Global Education Programs

Parallel Geo-Processing "in the cloud"

(a brief illustration)

Original image is divided into 256px sub-units

Sub-units are distributed to separate machines where they can be processed in parallel

Thousands can be processed simultaneously

Result is reassembled into a finished image
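The split, distribute, and reassemble steps above can be sketched as follows. This is a single-machine illustration and not Earth Engine code: a thread pool stands in for the separate machines, the image is a plain list of pixel rows, and the per-pixel "invert" operation is a hypothetical placeholder for a real map product.

```python
from concurrent.futures import ThreadPoolExecutor

TILE = 256  # sub-unit edge length in pixels, as on the slide

def split(image, tile=TILE):
    """Yield (row, col, sub_image) tiles covering a 2-D list of pixels."""
    height, width = len(image), len(image[0])
    for r in range(0, height, tile):
        for c in range(0, width, tile):
            yield r, c, [row[c:c + tile] for row in image[r:r + tile]]

def process(job):
    """Per-pixel work on one tile; here a placeholder 'invert' operation."""
    r, c, sub = job
    return r, c, [[255 - p for p in row] for row in sub]

def reassemble(results, height, width):
    """Paste processed tiles back at their original offsets."""
    out = [[0] * width for _ in range(height)]
    for r, c, sub in results:
        for i, row in enumerate(sub):
            out[r + i][c:c + len(row)] = row
    return out

def run(image, tile=TILE, workers=4):
    height, width = len(image), len(image[0])
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(process, split(image, tile)))
    return reassemble(results, height, width)

# Tiny demo image with a small tile size so several tiles are produced.
image = [[(x + y) % 256 for x in range(5)] for y in range(3)]
print(run(image, tile=2))
```

Because each tile is processed independently, the work scales out by adding workers; only the split and reassembly steps need to see the whole image.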

Global-scale earth observation and informatics platform
- For public benefit and to support emerging green economy
- Help science come out of research lab and into operational use at scale
- Unprecedented catalog of earth observation data for mining and analysis
- Promote transparency, reproducibility, collaboration: "open science"

Very fast computation of scientific map products
- Intrinsically-parallel pixel processing system
- Built-in Google algorithms as well as user-supplied
- Earth Engine API for 3rd party algorithm development
- Access control, versioning, provenance
- Online and desktop versions (open source desktop version)

On a lot of useful data
- Every available Landsat and MODIS scene (more satellites coming)
- Commercial datasets (very high resolution satellite imagery)
- Environmental data (atmospheric, ocean, terrestrial)
- User-supplied (e.g., in-situ data collected via Android phones)

Overview

Scale of Data

US: Satellite imagery dataset: Landsat
- Peru: 60 Landsat scenes (3 Gpix, 20 GB) per covering
- World: 8000 Landsat scenes (2 TB) per covering
- Complete global coverage every 16 days
- Operating since 1972; historical archive holds ~4 PB
US: NASA EOS approaching ~10 PB of Earth images

Europe: European Space Agency (ESA)
- ESA satellite missions: MERIS, Envisat, others
- SpotImage (France): 20M SPOT images since 1986; 10000 new images collected daily; 5+ PB archive
- ESA: Launching Sentinel-1 in 2011
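A quick back-of-the-envelope check on the Landsat figures above. It assumes decimal units (1 TB = 1000 GB), a 365-day year, and a single satellite on the 16-day cycle; the variable names are illustrative.

```python
# Figures quoted on the slide.
DAYS_PER_COVERING = 16
WORLD_SCENES = 8000
WORLD_TB_PER_COVERING = 2
PERU_SCENES, PERU_GB = 60, 20

# Derived quantities (assumptions: decimal units, 365-day year).
gb_per_scene = PERU_GB / PERU_SCENES
coverings_per_year = 365 / DAYS_PER_COVERING
tb_per_year = WORLD_TB_PER_COVERING * coverings_per_year
world_tb_from_peru_rate = WORLD_SCENES * gb_per_scene / 1000

print(f"{gb_per_scene:.2f} GB per scene")
print(f"{coverings_per_year:.1f} global coverings per year")
print(f"{tb_per_year:.1f} TB of new imagery per year")
print(f"{world_tb_from_peru_rate:.2f} TB per covering at the Peru rate")
```

The per-scene size implied by the Peru numbers gives a world covering of the same order as the quoted 2 TB, so the slide's figures are mutually consistent as rough estimates.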

Representative Examples

Envisat: Gulf Oil Spill, June 2010 (ESA)
MERIS: Hurricane Isabel, Sept 2003 (ESA)
Spot Image: Xingu, Brazilian Amazon

View on YouTube

Health (US)

Google Health: personal health dashboard

Launched in May 2008

Major update Sept 2010

User controls and owns his/her data

A platform that...
- Provides a dashboard for wellness information & medical records
- Allows user to connect and interact with a broad group of "add on" services
- Includes a non-tethered PHR

Crisis Response

Person Finder

Pre-Earthquake - Aug 26, 2009

1 Day After Earthquake - Jan 13, 2010

13 Days After Earthquake - Jan 25, 2010

Digital Humanities

Illuminating the Humanities

Q: What can you do with 12 million books in over 400 languages, comprising 5 billion pages and 2 trillion words, all digitized?

A: Look to the humanities for new questions

How would you (re)define Victorian literature?

What are the differences between the English and Latin editions of Hobbes' Leviathan?

How have places changed over the course of history?

Digital Humanities Awards

Research program supporting university research that takes a computational approach to traditional humanist questions

US program: Summer 2010
12 projects, 23 researchers, 15 universities

European program: Winter 2010
10 projects planned

$1M total funding

Education

Curriculum Development

Seeding and supporting computing curriculum development
- Exploring computational thinking in K12 (launch late Oct)
- CS4HS: High school computer science (cs4hs.com)
- Undergraduate open source CS curriculum: Google Code University (code.google.com/edu)
- Lantern platform: Wiki for open source curriculum development (in collaboration with Khan Academy)

Talent Development: Google Summer of Code™

Program Genesis
- "Flip bits, not burgers" during summer holidays
- Exposure to real-world software development

Students paired with mentor from OS community
Execute to milestones laid out in accepted application
Stipend allows students to concentrate on OS development

2010: 1026 students, 150 organizations, 69 countries

Technology Leadership: App Inventor for Android

Visual programming environment for Android mobile devices
Helping people become creators (rather than consumers) of technology
Launched in Google Labs July 12, 2010

http://appinventor.googlelabs.com/about

CS4HS European Workshops

École Polytechnique Fédérale de Lausanne (EPFL), Switzerland: Building and Programming Robots
ETH Zurich, Switzerland: ABZ Ausbildungs- und Beratungszentrum für Informatikunterricht
Makerere University, Uganda: Grassroot approach to improve the quality of applicants of computing programs at Makerere University
Manchester University, United Kingdom: Animation11
Oslo University, Norway: TENK
Queen Mary University, United Kingdom: cs4fn magazine
RWTH Aachen, Germany: Bright Brains in Computer Science
Sapienza University of Rome, Italy: Challenge and Fun with the CS Olympiads
University of Stuttgart, Germany: UniS2010
Technion, Israel: High School Computer Science Female Students' Visits in Google: Impressions, Conceptions and Influences
Trinity College Dublin, Ireland: Computer Programming Outreach B2C
University of Cape Town, South Africa: Project Umonya
University College Dublin, Ireland: CS Summer School
University of Warsaw, Poland: Mastering Programming Skills Workshops for Teachers

Technology Leadership: Google Code University

Course content on current computing technologies and paradigms
- CS Curriculum Search
- Tech Talks on CS Topics
- Tutorials, lecture slides, problem sets for a variety of topic areas

AJAX Programming, Algorithms, Distributed Systems, Web Security, Languages, Practical Skills (MySQL, Linux)

http://code.google.com/edu

Supporting our Academic Institutions

Research Awards Programs - 230+ projects funded in the last year

Next due dates: August 15 (CS Awards), October 15 (Marketing Awards)
Research-awardsgooglecom
Focused Grant Program

Visiting Faculty Program - 20 faculty (ongoing)
University-relationsgooglecom

PhD Fellowship Program
- 2009: 13 students supported in North America
- 2010: 15 in North America, 16 outside North America
Over 150 other scholarships

~1000 interns worldwide
CS4HS: 1500+ teachers (~100000 students), US & EMEA

Final Thoughts

Scale of Communication and Computing is profound
Endless opportunity for technical growth

Some large themes
Major new application domains

Google continues to innovate rapidly in science/technology and value to consumers
We are providing increased support for academic institutions in computer science and related areas

It's a most exciting area in which to innovate

Page 29: Rapid Advances in Computer Science and Opportunities for ... · Machine learning techniques applied to monitoring/controlling such systems Automatic, dynamic world-wide placement

Google Fusion Tables host manage collaborate on visualize and publish data tables online

What can I do with Fusion Tables

Host data online - and stay in controlcontrol can be at the level of columns or rows

Re-use data without making copies

Collaborate on the detailsMerge data from multiple tablesComment on individual rows columns or cells

Make a map (or chart or timeline) in minutes

Manage data via our site or an API

Fusion Table Example Gallery

Easy Data Upload Attribution recorded

Easily Create Informative Maps

Easily Create Informative Maps

baby steps towards the dream platform

DEMOcircle of blue

Cloud-based ComputingPrediction

1 Upload

2 Train

Upload your training data toGoogle Storage

Build a model from your data

Make new predictions3 Predict

Machine learning as a web serviceSmart Apps for every developer

- RESTful HTTP service- Simple integration

Prediction API

Under the hood many classifiersregressorsRecent research efficient and theoretically principled methods for distributed learning (NIPS-09 HLT-10)

Network costs can be reduced by an order of magnitude with minimal loss in classifier accuracy

Under the API

Operations Reseach and Optimization

Operations Research ChallengesSize Optimization is often NP Complete

Increasing the size by 1 doubles the search spaceThe tools are barely keeping up with the problems

Uncertainty Data is often fuzzy How do you route cars when there are roadblocks new one-ways traffic jamsCan you use optimization in on-line algorithm connected to usersHow well can you optimize against forecasted data how do you react if the forecast is bad

User expectation and requirementsThe definition of problems is also unclear What is the objective What is a good solution Can I violate this requirement By how much

Operations Research Opportunities

Machine Learning can help us in two waysBy providing guidance towards good solutionsBy qualifying valid solutionsBy reducing the search space

Large computing resources means we can try a bit harderCrowd-Sourcing means better data better feedback better evaluations of algorithms and solutionsHaving all our code open-source means we can collaborate on building the best set of tools

See httpcodegooglecompor-tools

Semantic Processing

Web Inference and LearningGoal better understanding of Web content and user intentMethod algorithms that draw reliable semantic inferences from the wealth of evidence implicit in massive Web data

How to interpret this term in this context

Does this sentence answer that question

Will this user click on that ad

Learning create concise representations to support good inferences

Meaning from the Web

Elementary semantic inference what are the possible classes for each instance

Combination wins

Combined graph14M nodes75M edges

Applications to Society Follow

Earth Engine

Motivation - Carbon Forest Tracking

UNEP Atlas of our Changing Environment

1975 1989 2001

Rondonia Brazil

A Sampling of Other Use Cases

Disease Early Warning: Remote surveillance of disease and prediction of epidemics

Population Census: Supplements traditional census and mapping in developing regions

Humanitarian Crisis Mapping: Can detect and monitor a growing range of crisis types

Water Resources: Monitor water quality and availability and alleviate water shortage problems

Food Security: Famine early warning, rainfall and water requirements estimations, agricultural production estimates, and irrigation and fertilizer supply & demand

Global Education Programs

Parallel Geo-Processing "in the cloud"

(a brief illustration)

Original image is divided into 256px sub-units

Sub-units are distributed to separate machines where they can be processed in parallel

Thousands can be processed simultaneously

Result is reassembled into a finished image
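The split, process and reassemble steps above can be sketched as follows. This is a minimal illustration and not Earth Engine code. It assumes the image is a 2D list of 8-bit pixel values, uses local threads to stand in for separate machines, and applies a made-up per-pixel operation (inversion). All names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

TILE = 256  # sub-unit edge length in pixels, as on the slide


def split_into_tiles(image, tile=TILE):
    """Yield (row, col, block) for each tile-sized block of a 2D list."""
    height, width = len(image), len(image[0])
    for r in range(0, height, tile):
        for c in range(0, width, tile):
            block = [row[c:c + tile] for row in image[r:r + tile]]
            yield r, c, block


def process_tile(job):
    """Hypothetical per-pixel operation: invert 8-bit values in one tile."""
    r, c, block = job
    return r, c, [[255 - p for p in row] for row in block]


def process_image(image, tile=TILE, workers=4):
    """Process all tiles in parallel and reassemble the finished image."""
    height, width = len(image), len(image[0])
    result = [[0] * width for _ in range(height)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        jobs = split_into_tiles(image, tile)
        for r, c, block in pool.map(process_tile, jobs):
            # Write each processed tile back at its original offset.
            for dr, row in enumerate(block):
                result[r + dr][c:c + len(row)] = row
    return result
```

Because each pixel is handled independently, the tiles need no coordination with one another, which is what lets thousands of them be processed at once.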

Global-scale earth observation and informatics platform
- For public benefit and to support emerging green economy
- Help science come out of research lab and into operational use at scale
- Unprecedented catalog of earth observation data for mining and analysis
- Promote transparency, reproducibility, collaboration, "open science"

Very fast computation of scientific map products
- Intrinsically-parallel pixel processing system
- Built-in Google algorithms as well as user-supplied
- Earth Engine API for 3rd party algorithm development
- Access control, versioning, provenance
- Online and desktop versions (open source desktop version)

On a lot of useful data
- Every available Landsat and MODIS scene (more satellites coming)
- Commercial datasets (very high resolution satellite imagery)
- Environmental data (atmospheric, ocean, terrestrial)
- User-supplied (ex: in-situ data collected via Android phones)

Overview

Scale of Data

US: Satellite imagery dataset: Landsat
- Peru: 60 Landsat scenes (3Gpix, 20GB) per covering
- World: 8000 Landsat scenes (2TB) per covering
- Complete global coverage every 16 days
- Operating since 1972; historical archive holds ~4PB

US: NASA EOS approaching ~10PB of Earth images

Europe: European Space Agency (ESA)
- ESA satellite missions: MERIS, Envisat, others
- SpotImage (France): 20M SPOT images since 1986; 10000 new images collected daily; 5+ PB archive
- ESA: Launching Sentinel-1 in 2011
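The Landsat figures above lend themselves to a quick back-of-envelope check. The sketch below only rearranges the numbers quoted on the slide; it assumes decimal units (1 GB = 10^9 bytes) and treats the slide's rounded values as exact, so the results are rough estimates.

```python
GB = 10**9
TB = 10**12

# Figures quoted on the slide.
PERU_SCENES, PERU_BYTES, PERU_PIXELS = 60, 20 * GB, 3 * 10**9
WORLD_SCENES, WORLD_BYTES = 8000, 2 * TB
REVISIT_DAYS = 16

# Derived estimates.
bytes_per_pixel = PERU_BYTES / PERU_PIXELS               # about 6.7
peru_mb_per_scene = PERU_BYTES / PERU_SCENES / 10**6     # about 333 MB
world_mb_per_scene = WORLD_BYTES / WORLD_SCENES / 10**6  # 250 MB
coverings_per_year = 365 / REVISIT_DAYS                  # about 22.8
tb_per_year = coverings_per_year * WORLD_BYTES / TB      # about 45.6 TB
```

The two per-scene estimates agree to within rounding, which suggests the Peru and World figures are consistent with each other.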

Representative Examples

Envisat: Gulf Oil Spill, June 2010 (ESA)
MERIS: Hurricane Isabel, Sept 2003 (ESA)
Spot Image: Xingu, Brazilian Amazon


Health (US)

Google Health: personal health dashboard

Launched in May 2008

Major update: Sept 2010

User controls and owns his/her data

A platform that...
- Provides a dashboard for wellness information & medical records
- Allows user to connect and interact with a broad group of "add on" services
- Includes a non-tethered PHR

Crisis Response

Person Finder

Pre-Earthquake - Aug 26, 2009

1 Day After Earthquake - Jan 13, 2010

13 Days After Earthquake - Jan 25, 2010

Digital Humanities

Illuminating the Humanities

Q: What can you do with 12 million books in over 400 languages, comprising 5 billion pages and 2 trillion words, all digitized?

A: Look to the humanities for new questions

How would you (re)define Victorian literature?

What are the differences between the English and Latin editions of Hobbes' Leviathan?

How have places changed over the course of history?

Digital Humanities Awards

Research program supporting university research taking a computational approach to traditional humanist questions

US program: Summer 2010

12 projects, 23 researchers, 15 universities

European program: Winter 2010

10 projects planned; $1M total funding

Education

Curriculum Development

Seeding and supporting computing curriculum development
- Exploring computational thinking in K12 (launch late Oct)
- CS4HS: High school computer science (cs4hs.com)
- Undergraduate open source CS curriculum: Google Code University (code.google.com/edu)
- Lantern platform: Wiki for open source curriculum development (in collaboration with Khan Academy)

Talent Development: Google Summer of Code™

Program Genesis
- "Flip bits, not burgers" during summer holidays
- Exposure to real-world software development

Students paired with mentor from OS community
Execute to milestones laid out in accepted application
Stipend allows students to concentrate on OS development

2010: 1026 students, 150 organizations, 69 countries

Technology Leadership: App Inventor for Android

Visual programming environment for Android mobile devices
Helping people become creators (rather than consumers) of technology
Launched in Google Labs July 12, 2010

http://appinventor.googlelabs.com/about

CS4HS European Workshops
- École Polytechnique Fédérale de Lausanne (EPFL), Switzerland: Building and Programming Robots
- ETH Zurich, Switzerland: ABZ Ausbildungs- und Beratungszentrum für Informatikunterricht
- Makerere University, Uganda: Grassroot approach to improve the quality of applicants of computing programs at Makerere University
- Manchester University, United Kingdom: Animation11
- Oslo University, Norway: TENK
- Queen Mary University, United Kingdom: cs4fn magazine
- RWTH Aachen, Germany: Bright Brains in Computer Science
- Sapienza University of Rome, Italy: Challenge and Fun with the CS Olympiads
- University of Stuttgart, Germany: UniS2010
- Technion, Israel: High School Computer Science Female Students Visits in Google: Impressions, Conceptions and Influences
- Trinity College Dublin, Ireland: Computer Programming Outreach B2C
- University of Cape Town, South Africa: Project Umonya
- University College Dublin, Ireland: CS Summer School
- University of Warsaw, Poland: Mastering Programming Skills Workshops for Teachers

Technology Leadership: Google Code University

Course content on current computing technologies and paradigms
- CS Curriculum Search
- Tech Talks on CS Topics
- Tutorials, lecture slides, problem sets for a variety of topic areas

AJAX Programming, Algorithms, Distributed Systems, Web Security, Languages, Practical Skills (MySQL, Linux)

http://code.google.com/edu

Supporting our Academic Institutions

Research Awards Programs - 230+ projects funded in the last year

Next due dates: August 15 (CS Awards), October 15 (Marketing Awards)
Research-awardsgooglecom
Focused Grant Program

Visiting Faculty Program - 20 faculty (ongoing)
University-relationsgooglecom

PhD Fellowship Program
- 2009: 13 students supported in North America
- 2010: 15 in North America, 16 outside North America
Over 150 other scholarships

~1000 interns worldwide
CS4HS: 1500+ teachers (~100000 students), US & EMEA

Final Thoughts

Scale of Communication and Computing is profound
Endless opportunity for technical growth

Some large themes
Major new application domains

Google is working rapidly to innovate in science/technology and value to consumers
We are providing increased support for academic institutions in computer science and related areas

It's a most exciting area in which to innovate

Page 32: Rapid Advances in Computer Science and Opportunities for ... · Machine learning techniques applied to monitoring/controlling such systems Automatic, dynamic world-wide placement

Easily Create Informative Maps

baby steps towards the dream platform

DEMOcircle of blue

Cloud-based ComputingPrediction

1 Upload

2 Train

Upload your training data toGoogle Storage

Build a model from your data

Make new predictions3 Predict

Machine learning as a web serviceSmart Apps for every developer

- RESTful HTTP service- Simple integration

Prediction API

Under the hood many classifiersregressorsRecent research efficient and theoretically principled methods for distributed learning (NIPS-09 HLT-10)

Network costs can be reduced by an order of magnitude with minimal loss in classifier accuracy

Under the API

Operations Reseach and Optimization

Operations Research ChallengesSize Optimization is often NP Complete

Increasing the size by 1 doubles the search spaceThe tools are barely keeping up with the problems

Uncertainty Data is often fuzzy How do you route cars when there are roadblocks new one-ways traffic jamsCan you use optimization in on-line algorithm connected to usersHow well can you optimize against forecasted data how do you react if the forecast is bad

User expectation and requirementsThe definition of problems is also unclear What is the objective What is a good solution Can I violate this requirement By how much

Operations Research Opportunities

Machine Learning can help us in two waysBy providing guidance towards good solutionsBy qualifying valid solutionsBy reducing the search space

Large computing resources means we can try a bit harderCrowd-Sourcing means better data better feedback better evaluations of algorithms and solutionsHaving all our code open-source means we can collaborate on building the best set of tools

See httpcodegooglecompor-tools

Semantic Processing

Web Inference and LearningGoal better understanding of Web content and user intentMethod algorithms that draw reliable semantic inferences from the wealth of evidence implicit in massive Web data

How to interpret this term in this context

Does this sentence answer that question

Will this user click on that ad

Learning create concise representations to support good inferences

Meaning from the Web

Elementary semantic inference what are the possible classes for each instance

Combination wins

Combined graph14M nodes75M edges

Applications to Society Follow

Earth Engine

Motivation - Carbon Forest Tracking

UNEP Atlas of our Changing Environment

1975 1989 2001

Rondonia Brazil

A Sampling of Other Use Cases

Disease Early Warning Remote surveillance of disease and prediction of epidemics

Population Census Supplements traditional census and mapping in developing regions

Humanitarian Crisis Mapping Can detect and monitor a growing range of crisis typesWater Resources Monitor water quality and availability and alleviate water shortage problemsFood Security Famine early warning rainfall and water requirements estimations agr production estimates and irrigation and fertilizer supply amp demandGlobal Education Programs

Parallel Geo-Processing ldquoin the cloudrdquo

(a brief illustration)

Original image

Original image is divided into 256px sub-units

Sub-units are distributed

Sub-units are distributed to separate machines

Sub-units are distributed to separate machines where they can be processed in parallel

Thousands can be processed simultaneously

Result is reassembled

Result is reassembled into a finished image

Global-scale earth observation and informatics platformFor public benefit and to support emerging green economyHelp science come out of research lab and into operational use at scaleUnprecedented catalog of earth observation data for mining and analysisPromote transparency reproducibility collaboration ldquoopen sciencerdquo

Very fast computation of scientific map productsIntrinsically-parallel pixel processing systemBuilt-in Google algorithms as well as user-suppliedEarth Engine API for 3rd party algorithm developmentAccess control versioning provenanceOnline and desktop versions (open source desktop version)

On a lot of useful dataEvery available Landsat and MODIS scene (more satellites coming)Commercial datasets (very high resolution satellite imagery)Environmental data (atmospheric ocean terrestrial)User-supplied (ex in-situ data collected via Android phones)

Overview

Scale of Data

US Satellite imagery dataset LandsatPeru 60 Landsat scenes (3Gpix 20GB) per coveringWorld 8000 Landsat scenes (2TB) per coveringComplete global coverage every 16 daysOperating since 1972 historical archive holds ~4PBUS NASA EOS approaching ~10PB of Earth images

Europe European Space Agency (ESA)ESA satellite missions MERIS Envisat othersSpotImage (France) 20M SPOT images since 1986 10000 new images collected daily 5+ PB archiveESA Launching Sentinel-1 in 2011

Representative Examples

Envisat Gulf Oil Spill June 2010 (ESA))MERIS Hurricane Isabel Sept 2003 (ESA) Spot Image Xingu Brazilian Amazon

View on YouTube

Health (US)

Google Health personal health dashboard

Launched in May 2008

Major update Sept 2010

User controls and owns hisher data

A platform thathellipProvides a dashboard for wellness information amp medical recordsAllows user to connect and interact with a broad group of ldquoadd onrdquo servicesIncludes a non-tethered PHR

Crisis Response

Person Finder

Pre-Earthquake - Aug 26 2009

1 Day After Earthquake - Jan 13 2010

13 Days After Earthquake - Jan 25 2010

Digital Humanities

Illuminating the Humanities

Q What can you do with12 million books inover 400 languagescomprised of 5 billion pages and 2 trillion wordsall digitized

A Look to the humanities for new questionsHow would you (re)define Victorian literature

What are the differences between the English and Latin editions of Hobbesrsquo Leviathan

How have places changed over the course of history

Digital Humanities Awards

Research program supporting university research that takes a computational approach to traditional humanist questions

US program: Summer 2010

12 projects, 23 researchers, 15 universities

European program: Winter 2010

- 10 projects planned
- $1M total funding

Education

Curriculum Development

Seeding and supporting computing curriculum development
- Exploring computational thinking in K12 (launch late Oct)
- CS4HS: High school computer science (cs4hs.com)
- Undergraduate open source CS curriculum: Google Code University (code.google.com/edu)
- Lantern platform: Wiki for open source curriculum development (in collaboration with Khan Academy)

Talent Development: Google Summer of Code™

Program genesis
- "Flip bits, not burgers" during summer holidays
- Exposure to real-world software development

- Students paired with mentor from open source community
- Execute to milestones laid out in accepted application
- Stipend allows students to concentrate on open source development

2010: 1026 students, 150 organizations, 69 countries

Technology Leadership: App Inventor for Android

- Visual programming environment for Android mobile devices
- Helping people become creators (rather than consumers) of technology
- Launched in Google Labs, July 12, 2010

http://appinventor.googlelabs.com/about

CS4HS European Workshops

École Polytechnique Fédérale de Lausanne (EPFL), Switzerland: Building and Programming Robots

ETH Zurich, Switzerland: ABZ, Ausbildungs- und Beratungszentrum für Informatikunterricht

Makerere University, Uganda: Grassroot approach to improve the quality of applicants of computing programs at Makerere University

Manchester University, United Kingdom: Animation11

Oslo University, Norway: TENK

Queen Mary University, United Kingdom: cs4fn magazine

RWTH Aachen, Germany: Bright Brains in Computer Science

Sapienza University of Rome, Italy: Challenge and Fun with the CS Olympiads

University of Stuttgart, Germany: UniS2010

Technion, Israel: High School Computer Science Female Students Visits in Google: Impressions, Conceptions and Influences

Trinity College Dublin, Ireland: Computer Programming Outreach B2C

University of Cape Town, South Africa: Project Umonya

University College Dublin, Ireland: CS Summer School

University of Warsaw, Poland: Mastering Programming Skills, Workshops for Teachers

Technology Leadership: Google Code University

Course content on current computing technologies and paradigms
- CS Curriculum Search
- Tech Talks on CS topics
- Tutorials, lecture slides, problem sets for a variety of topic areas

- AJAX Programming
- Algorithms
- Distributed Systems
- Web Security
- Languages
- Practical Skills (MySQL, Linux)

http://code.google.com/edu

Supporting our Academic Institutions

Research Awards Programs - 230+ projects funded in the last year
- Next due dates: August 15 (CS Awards), October 15 (Marketing Awards)
- Research-awardsgooglecom
- Focused Grant Program

Visiting Faculty Program - 20 faculty (ongoing)
- University-relationsgooglecom

PhD Fellowship Program
- 2009: 13 students supported in North America
- 2010: 15 in North America, 16 outside North America
- Over 150 other scholarships

~1000 interns worldwide
CS4HS: 1500+ teachers (~100,000 students), US & EMEA

Final Thoughts

Scale of Communication and Computing is profound
- Endless opportunity for technical growth

Some large themes
- Major new application domains

Google rapidly to innovate in science/technology and value to consumers
We are providing increased support for academic institutions in computer science and related areas

It's a most exciting area in which to innovate

Cloud-based Computing: Prediction

1. Upload: Upload your training data to Google Storage
2. Train: Build a model from your data
3. Predict: Make new predictions

Machine learning as a web service
Smart Apps for every developer

- RESTful HTTP service
- Simple integration

Prediction API

Under the hood: many classifiers/regressors
Recent research: efficient and theoretically principled methods for distributed learning (NIPS-09, HLT-10)

Network costs can be reduced by an order of magnitude with minimal loss in classifier accuracy

Under the API

Operations Research and Optimization

Operations Research Challenges

Size: Optimization is often NP-complete
- Increasing the size by 1 doubles the search space
- The tools are barely keeping up with the problems

Uncertainty: Data is often fuzzy
- How do you route cars when there are roadblocks, new one-ways, traffic jams?
- Can you use optimization in an on-line algorithm connected to users?
- How well can you optimize against forecasted data? How do you react if the forecast is bad?

User expectations and requirements: The definition of problems is also unclear
- What is the objective? What is a good solution? Can I violate this requirement? By how much?
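The remark that increasing the size by 1 doubles the search space can be made concrete with a brute-force sketch. The example below assumes a problem made of n independent yes/no decisions and uses a tiny 0/1 knapsack as a stand-in. The item values, weights, and function names are hypothetical and are not taken from the talk or from any particular solver.

```python
from itertools import product


def count_assignments(n):
    """Count every yes/no assignment over n decisions by enumerating them."""
    return sum(1 for _ in product((0, 1), repeat=n))


def best_subset(values, weights, capacity):
    """Exhaustive 0/1 knapsack: try every subset and keep the best feasible one."""
    best_value, best_choice = 0, ()
    for choice in product((0, 1), repeat=len(values)):
        weight = sum(w for w, c in zip(weights, choice) if c)
        if weight <= capacity:
            value = sum(v for v, c in zip(values, choice) if c)
            if value > best_value:
                best_value, best_choice = value, choice
    return best_value, best_choice


# Each extra decision doubles the number of candidates to examine.
for n in range(1, 6):
    print(n, count_assignments(n))

# Hypothetical three-item instance.
print(best_subset([60, 100, 120], [10, 20, 30], 50))
```

Exhaustive search of this kind is only workable for very small n, which is why practical solvers depend on pruning and heuristics rather than full enumeration.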

Operations Research Opportunities

Machine Learning can help us in three ways
- By providing guidance towards good solutions
- By qualifying valid solutions
- By reducing the search space

Large computing resources mean we can try a bit harder
Crowd-sourcing means better data, better feedback, better evaluations of algorithms and solutions
Having all our code open-source means we can collaborate on building the best set of tools

See http://code.google.com/p/or-tools

Semantic Processing

Web Inference and Learning
Goal: better understanding of Web content and user intent
Method: algorithms that draw reliable semantic inferences from the wealth of evidence implicit in massive Web data

How to interpret this term in this context?

Does this sentence answer that question?

Will this user click on that ad?

Learning: create concise representations to support good inferences

Meaning from the Web

Elementary semantic inference: what are the possible classes for each instance?

Combination wins

Combined graph: 14M nodes, 75M edges

Applications to Society Follow

Earth Engine

Motivation - Carbon Forest Tracking

UNEP Atlas of our Changing Environment

1975 1989 2001

Rondonia Brazil

A Sampling of Other Use Cases

Disease Early Warning: Remote surveillance of disease and prediction of epidemics

Population Census: Supplements traditional census and mapping in developing regions

Humanitarian Crisis Mapping: Can detect and monitor a growing range of crisis types

Water Resources: Monitor water quality and availability and alleviate water shortage problems

Food Security: Famine early warning; rainfall and water requirements estimations; agricultural production estimates; irrigation and fertilizer supply & demand

Global Education Programs

Parallel Geo-Processing "in the cloud"

(a brief illustration)

Original image is divided into 256px sub-units

Sub-units are distributed to separate machines, where they can be processed in parallel

Thousands can be processed simultaneously

Result is reassembled into a finished image
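The divide, distribute, and reassemble steps above can be sketched in a few lines. This is a minimal illustration and not Earth Engine's implementation. A local thread pool stands in for the separate machines, the image is a plain list of rows of 8-bit values, and the per-pixel operation (`invert`) is a hypothetical placeholder.

```python
from concurrent.futures import ThreadPoolExecutor

TILE = 256  # sub-unit edge length in pixels, as on the slide


def split_into_tiles(image, tile=TILE):
    """Yield (row, col, block) for each sub-unit; edge tiles may be smaller."""
    height, width = len(image), len(image[0])
    for r in range(0, height, tile):
        for c in range(0, width, tile):
            yield r, c, [row[c:c + tile] for row in image[r:r + tile]]


def invert(block):
    """Hypothetical per-pixel operation: invert 8-bit values."""
    return [[255 - p for p in row] for row in block]


def process_in_parallel(image, op=invert, tile=TILE, workers=4):
    """Split the image, process tiles concurrently, and reassemble the result."""
    height, width = len(image), len(image[0])
    tiles = list(split_into_tiles(image, tile))
    # Worker threads stand in for the separate machines of a real deployment.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda t: op(t[2]), tiles))
    out = [[0] * width for _ in range(height)]
    for (r, c, _), block in zip(tiles, results):
        for i, row in enumerate(block):
            out[r + i][c:c + len(row)] = row
    return out


# Synthetic 600 x 300 test image; tiling must not change the result.
image = [[(x * y) % 256 for x in range(600)] for y in range(300)]
result = process_in_parallel(image)
print(result == invert(image))
```

The approach works because each pixel is processed independently of its neighbours. An operation that reads neighbouring pixels would need overlapping tiles, which this sketch does not handle.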

Global-scale earth observation and informatics platform
- For public benefit and to support emerging green economy
- Help science come out of research lab and into operational use at scale
- Unprecedented catalog of earth observation data for mining and analysis
- Promote transparency, reproducibility, collaboration, "open science"

Page 36: Rapid Advances in Computer Science and Opportunities for ... · Machine learning techniques applied to monitoring/controlling such systems Automatic, dynamic world-wide placement

Operations Reseach and Optimization

Operations Research ChallengesSize Optimization is often NP Complete

Increasing the size by 1 doubles the search spaceThe tools are barely keeping up with the problems

Uncertainty Data is often fuzzy How do you route cars when there are roadblocks new one-ways traffic jamsCan you use optimization in on-line algorithm connected to usersHow well can you optimize against forecasted data how do you react if the forecast is bad

User expectation and requirementsThe definition of problems is also unclear What is the objective What is a good solution Can I violate this requirement By how much

Operations Research Opportunities

Machine Learning can help us in two ways
By providing guidance towards good solutions
By qualifying valid solutions
By reducing the search space

Large computing resources mean we can try a bit harder
Crowd-sourcing means better data, better feedback, better evaluations of algorithms and solutions
Having all our code open-source means we can collaborate on building the best set of tools

See http://code.google.com/p/or-tools

Semantic Processing

Web Inference and Learning
Goal: better understanding of Web content and user intent
Method: algorithms that draw reliable semantic inferences from the wealth of evidence implicit in massive Web data

How to interpret this term in this context?

Does this sentence answer that question?

Will this user click on that ad?

Learning: create concise representations to support good inferences

Meaning from the Web

Elementary semantic inference: what are the possible classes for each instance?

Combination wins

Combined graph: 14M nodes, 75M edges
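One way to picture the "possible classes for each instance" question and the combined graph is the toy sketch below. It is only an illustration under stated assumptions: the two evidence sources, their weights, and the rule of summing weights per edge are hypothetical, and the slides do not describe how the actual graph was built.

```python
from collections import defaultdict

# Hypothetical evidence: (instance, class, weight) triples from two made-up extractors.
pattern_evidence = [
    ("jaguar", "animal", 0.6),
    ("jaguar", "car brand", 0.5),
    ("python", "snake", 0.4),
]
table_evidence = [
    ("jaguar", "car brand", 0.3),
    ("python", "programming language", 0.7),
    ("python", "snake", 0.2),
]

def combine(*sources):
    """Merge edge lists into one weighted instance -> class graph by summing weights."""
    graph = defaultdict(lambda: defaultdict(float))
    for source in sources:
        for instance, cls, weight in source:
            graph[instance][cls] += weight
    return graph

def possible_classes(graph, instance):
    """Return the candidate classes for an instance, strongest evidence first."""
    return sorted(graph[instance].items(), key=lambda kv: kv[1], reverse=True)

combined = combine(pattern_evidence, table_evidence)
for instance in ("jaguar", "python"):
    print(instance, possible_classes(combined, instance))
```

In this toy data the ranking for "jaguar" differs between the first source alone and the combined graph, which is the kind of effect "combination wins" suggests.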

Applications to Society Follow

Earth Engine

Motivation - Carbon Forest Tracking

Figure: UNEP Atlas of our Changing Environment, Rondonia, Brazil (1975, 1989, 2001)

A Sampling of Other Use Cases

Disease Early Warning: Remote surveillance of disease and prediction of epidemics

Population Census: Supplements traditional census and mapping in developing regions

Humanitarian Crisis Mapping: Can detect and monitor a growing range of crisis types

Water Resources: Monitor water quality and availability and alleviate water shortage problems

Food Security: Famine early warning, rainfall and water requirements estimations, agricultural production estimates, and irrigation and fertilizer supply & demand

Global Education Programs

Parallel Geo-Processing "in the cloud"

(a brief illustration)

Original image is divided into 256px sub-units

Sub-units are distributed to separate machines, where they can be processed in parallel

Thousands can be processed simultaneously

Result is reassembled into a finished image
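The divide, distribute, and reassemble steps can be sketched on a single machine as follows. This is a minimal illustration, not Earth Engine code: a thread pool stands in for the separate machines, the image is a plain nested list, and the per-pixel operation (inverting 8-bit values) is a placeholder chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor

TILE = 256  # sub-unit edge length in pixels, as on the slide

def split(image, tile=TILE):
    """Yield (row, col, sub-image) for each tile of a 2D list of pixel values."""
    height, width = len(image), len(image[0])
    for r in range(0, height, tile):
        for c in range(0, width, tile):
            yield r, c, [row[c:c + tile] for row in image[r:r + tile]]

def process(job):
    """Per-pixel operation on one tile; inverting 8-bit values is a placeholder."""
    r, c, pixels = job
    return r, c, [[255 - p for p in row] for row in pixels]

def reassemble(results, height, width):
    """Paste processed tiles back at their original offsets."""
    out = [[0] * width for _ in range(height)]
    for r, c, pixels in results:
        for dr, row in enumerate(pixels):
            out[r + dr][c:c + len(row)] = row
    return out

def run(image, workers=4):
    # Threads stand in for the separate machines of the real system.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(process, split(image)))
    return reassemble(results, len(image), len(image[0]))

# Synthetic 600x500 test image; edge tiles are smaller than 256px.
image = [[(x + y) % 256 for x in range(600)] for y in range(500)]
result = run(image)
```

Because each tile carries its own offset, tiles can finish in any order and on any worker, which is what makes this kind of pixel processing easy to spread across many machines.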

Global-scale earth observation and informatics platform
For public benefit and to support emerging green economy
Help science come out of research lab and into operational use at scale
Unprecedented catalog of earth observation data for mining and analysis
Promote transparency, reproducibility, collaboration: "open science"

Very fast computation of scientific map products
Intrinsically-parallel pixel processing system
Built-in Google algorithms as well as user-supplied
Earth Engine API for 3rd party algorithm development
Access control, versioning, provenance
Online and desktop versions (open source desktop version)

On a lot of useful data
Every available Landsat and MODIS scene (more satellites coming)
Commercial datasets (very high resolution satellite imagery)
Environmental data (atmospheric, ocean, terrestrial)
User-supplied (e.g. in-situ data collected via Android phones)

Overview

Scale of Data

US: Satellite imagery dataset Landsat
Peru: 60 Landsat scenes (3Gpix, 20GB) per covering
World: 8000 Landsat scenes (2TB) per covering
Complete global coverage every 16 days
Operating since 1972; historical archive holds ~4PB
US: NASA EOS approaching ~10PB of Earth images

Europe: European Space Agency (ESA)
ESA satellite missions: MERIS, Envisat, others
SpotImage (France): 20M SPOT images since 1986, 10000 new images collected daily, 5+ PB archive
ESA: Launching Sentinel-1 in 2011
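As a rough sanity check on the Landsat figures quoted above, the following back-of-the-envelope arithmetic derives the size per scene and the yearly volume implied by one 2TB covering every 16 days. The decimal unit conversion (1 TB = 1000 GB) and the 365-day year are assumptions made for the sketch; the results are approximate and cover only a single covering stream.

```python
# Back-of-the-envelope arithmetic on the Landsat figures quoted above.
# Assumption (not from the slides): decimal units, 1 TB = 1000 GB.
GB_PER_TB = 1000

peru_scenes, peru_gb = 60, 20        # "60 Landsat scenes (... 20GB) per covering"
world_scenes, world_tb = 8000, 2     # "8000 Landsat scenes (2TB) per covering"
revisit_days = 16                    # "Complete global coverage every 16 days"

peru_gb_per_scene = peru_gb / peru_scenes
world_gb_per_scene = world_tb * GB_PER_TB / world_scenes

coverings_per_year = 365 / revisit_days
tb_per_year = coverings_per_year * world_tb

print(f"Peru:  {peru_gb_per_scene:.2f} GB per scene")
print(f"World: {world_gb_per_scene:.2f} GB per scene")
print(f"Global coverings per year: {coverings_per_year:.1f}")
print(f"Data per year at 2TB per covering: {tb_per_year:.1f} TB")
```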

Representative Examples

Envisat: Gulf Oil Spill, June 2010 (ESA)
MERIS: Hurricane Isabel, Sept 2003 (ESA)
Spot Image: Xingu, Brazilian Amazon

Health (US)

Google Health: personal health dashboard

Launched in May 2008

Major update: Sept 2010

User controls and owns his/her data

A platform that...
Provides a dashboard for wellness information & medical records
Allows user to connect and interact with a broad group of "add on" services
Includes a non-tethered PHR

Crisis Response

Person Finder

Pre-Earthquake - Aug 26, 2009

1 Day After Earthquake - Jan 13, 2010

13 Days After Earthquake - Jan 25, 2010

Digital Humanities

Illuminating the Humanities

Q: What can you do with 12 million books in over 400 languages, comprising 5 billion pages and 2 trillion words, all digitized?

A: Look to the humanities for new questions

How would you (re)define Victorian literature?

What are the differences between the English and Latin editions of Hobbes' Leviathan?

How have places changed over the course of history?

Digital Humanities Awards

Research program supporting university research taking a computational approach to traditional humanist questions

US program: Summer 2010
12 projects, 23 researchers, 15 universities

European program: Winter 2010
10 projects planned, $1M total funding

Education

Curriculum Development

Seeding and supporting computing curriculum development
Exploring computational thinking in K12 (launch late Oct)
CS4HS: High school computer science (cs4hs.com)
Undergraduate open source CS curriculum: Google Code University (code.google.com/edu)
Lantern platform: Wiki for open source curriculum development (in collaboration with Khan Academy)

Talent Development: Google Summer of Code™

Program Genesis
"Flip bits, not burgers" during summer holidays
Exposure to real-world software development

Students paired with mentor from open source community
Execute to milestones laid out in accepted application
Stipend allows students to concentrate on open source development

2010: 1026 students, 150 organizations, 69 countries

Technology Leadership: App Inventor for Android

Visual programming environment for Android mobile devices
Helping people become creators (rather than consumers) of technology
Launched in Google Labs July 12, 2010

http://appinventor.googlelabs.com/about

CS4HS European Workshops

École Polytechnique Fédérale de Lausanne (EPFL), Switzerland: Building and Programming Robots
ETH Zurich, Switzerland: ABZ Ausbildungs- und Beratungszentrum für Informatikunterricht
Makerere University, Uganda: Grassroot approach to improve the quality of applicants of computing programs at Makerere University
Manchester University, United Kingdom: Animation11
Oslo University, Norway: TENK
Queen Mary University, United Kingdom: cs4fn magazine
RWTH Aachen, Germany: Bright Brains in Computer Science
Sapienza University of Rome, Italy: Challenge and Fun with the CS Olympiads
University of Stuttgart, Germany: UniS2010
Technion, Israel: High School Computer Science Female Students Visits in Google: Impressions, Conceptions and Influences
Trinity College Dublin, Ireland: Computer Programming Outreach B2C
University of Cape Town, South Africa: Project Umonya
University College Dublin, Ireland: CS Summer School
University of Warsaw, Poland: Mastering Programming Skills Workshops for Teachers

Technology Leadership: Google Code University

Course content on current computing technologies and paradigms
CS Curriculum Search
Tech Talks on CS Topics
Tutorials, lecture slides, problem sets for a variety of topic areas

AJAX Programming, Algorithms, Distributed Systems, Web Security, Languages, Practical Skills (MySQL, Linux)

http://code.google.com/edu

Supporting our Academic Institutions

Research Awards Programs - 230+ projects funded in the last year

Next due dates: August 15 (CS Awards), October 15 (Marketing Awards)
Research-awardsgooglecom
Focused Grant Program

Visiting Faculty Program - 20 faculty (ongoing)
University-relationsgooglecom

PhD Fellowship Program
2009: 13 students supported in North America
2010: 15 in North America, 16 outside North America
Over 150 other scholarships

~1000 interns worldwide
CS4HS: 1500+ teachers (~100000 students), US & EMEA

Final Thoughts

Scale of Communication and Computing is profound
Endless opportunity for technical growth

Some large themes
Major new application domains

Google continues to innovate rapidly in science, technology, and value to consumers
We are providing increased support for academic institutions in computer science and related areas

It's a most exciting area in which to innovate

Web Inference and Learning

Goal: better understanding of Web content and user intent
Method: algorithms that draw reliable semantic inferences from the wealth of evidence implicit in massive Web data

How to interpret this term in this context?

Does this sentence answer that question?

Will this user click on that ad?

Learning: create concise representations to support good inferences

Meaning from the Web

Elementary semantic inference: what are the possible classes for each instance?

Combination wins

Combined graph: 14M nodes, 75M edges

Applications to Society Follow

Earth Engine

Motivation - Carbon Forest Tracking

UNEP Atlas of our Changing Environment

Rondonia, Brazil: 1975, 1989, 2001

A Sampling of Other Use Cases

Disease Early Warning: Remote surveillance of disease and prediction of epidemics

Population Census: Supplements traditional census and mapping in developing regions

Humanitarian Crisis Mapping: Can detect and monitor a growing range of crisis types

Water Resources: Monitor water quality and availability, and alleviate water shortage problems

Food Security: Famine early warning, rainfall and water requirements estimations, agricultural production estimates, and irrigation and fertilizer supply & demand

Global Education Programs

Parallel Geo-Processing "in the cloud"

(a brief illustration)

Original image is divided into 256px sub-units
Sub-units are distributed to separate machines, where they can be processed in parallel
Thousands can be processed simultaneously
Result is reassembled into a finished image
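The split, distribute, process and reassemble steps illustrated above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not Earth Engine's implementation: a thread pool stands in for the separate machines, the image is a plain list of pixel rows, and the per-pixel operation (inverting 8-bit values) and all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

TILE = 256  # sub-unit edge length in pixels, as quoted on the slide


def split_into_tiles(image, tile=TILE):
    """Yield (row_offset, col_offset, block) for each tile of a 2-D pixel grid."""
    height, width = len(image), len(image[0])
    for r in range(0, height, tile):
        for c in range(0, width, tile):
            # Edge tiles are simply smaller; slicing handles the remainder.
            block = [row[c:c + tile] for row in image[r:r + tile]]
            yield r, c, block


def process_tile(task):
    """Hypothetical per-pixel operation: invert 8-bit pixel values."""
    r, c, block = task
    return r, c, [[255 - v for v in row] for row in block]


def reassemble(results, height, width):
    """Paste each processed tile back at its original offset."""
    out = [[0] * width for _ in range(height)]
    for r, c, block in results:
        for dr, row in enumerate(block):
            out[r + dr][c:c + len(row)] = row
    return out


def process_image(image, tile=TILE, workers=4):
    """Split the image, process tiles concurrently, and reassemble the result."""
    height, width = len(image), len(image[0])
    # Assumption: worker threads stand in for the separate machines.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(process_tile, split_into_tiles(image, tile)))
    return reassemble(results, height, width)
```

Because each tile is processed independently of its neighbours, the same structure works whether the workers are threads, processes or machines; only operations that need neighbouring pixels would require overlapping tiles.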

Overview

Global-scale earth observation and informatics platform
  For public benefit and to support emerging green economy
  Help science come out of research lab and into operational use at scale
  Unprecedented catalog of earth observation data for mining and analysis
  Promote transparency, reproducibility, collaboration: "open science"

Very fast computation of scientific map products
  Intrinsically-parallel pixel processing system
  Built-in Google algorithms as well as user-supplied
  Earth Engine API for 3rd party algorithm development
  Access control, versioning, provenance
  Online and desktop versions (open source desktop version)

On a lot of useful data
  Every available Landsat and MODIS scene (more satellites coming)
  Commercial datasets (very high resolution satellite imagery)
  Environmental data (atmospheric, ocean, terrestrial)
  User-supplied (e.g., in-situ data collected via Android phones)

Scale of Data

US: Satellite imagery dataset Landsat
  Peru: 60 Landsat scenes (3 Gpix, 20 GB) per covering
  World: 8000 Landsat scenes (2 TB) per covering
  Complete global coverage every 16 days
  Operating since 1972; historical archive holds ~4 PB
US: NASA EOS approaching ~10 PB of Earth images

Europe: European Space Agency (ESA)
  ESA satellite missions: MERIS, Envisat, others
  SpotImage (France): 20M SPOT images since 1986, 10000 new images collected daily, 5+ PB archive
  ESA: launching Sentinel-1 in 2011
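The Landsat figures above invite some back-of-the-envelope arithmetic. The sketch below uses only the numbers quoted on the slide and assumes decimal units (1 TB = 1000 GB), a single sensor, and that every global covering has the quoted 2 TB size. It is an illustration of scale, not a statement about the actual archive, whose quoted total also spans several satellites and data products.

```python
# Figures quoted on the slide (decimal units assumed: 1 TB = 1000 GB).
PERU_SCENES, PERU_GB = 60, 20
WORLD_SCENES, WORLD_TB = 8000, 2
REVISIT_DAYS = 16


def gb_per_scene(scenes, total_gb):
    """Average scene size in GB for one covering."""
    return total_gb / scenes


def coverings_per_year(revisit_days=REVISIT_DAYS, days_per_year=365):
    """Number of complete global coverings in one year."""
    return days_per_year / revisit_days


def tb_per_year(tb_per_covering=WORLD_TB):
    """Yearly data volume if every covering has the quoted size."""
    return tb_per_covering * coverings_per_year()


print(f"Peru:  {gb_per_scene(PERU_SCENES, PERU_GB):.2f} GB per scene")
print(f"World: {gb_per_scene(WORLD_SCENES, WORLD_TB * 1000):.2f} GB per scene")
print(f"Coverings per year: {coverings_per_year():.1f}")
print(f"Volume per year:    {tb_per_year():.1f} TB")
```

Under these assumptions a single sensor yields roughly 23 global coverings and about 46 TB per year, with an average scene of a few hundred megabytes.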

Representative Examples

Envisat: Gulf Oil Spill, June 2010 (ESA)
MERIS: Hurricane Isabel, Sept 2003 (ESA)
Spot Image: Xingu, Brazilian Amazon

Health (US)

Google Health: personal health dashboard

Launched in May 2008

Major update Sept 2010

User controls and owns his/her data

A platform that...
Provides a dashboard for wellness information & medical records
Allows user to connect and interact with a broad group of "add on" services
Includes a non-tethered PHR

Crisis Response

Person Finder

Pre-Earthquake - Aug 26, 2009

1 Day After Earthquake - Jan 13, 2010

13 Days After Earthquake - Jan 25, 2010

Digital Humanities

Illuminating the Humanities

Q: What can you do with 12 million books in over 400 languages, comprising 5 billion pages and 2 trillion words, all digitized?

A: Look to the humanities for new questions

How would you (re)define Victorian literature?

What are the differences between the English and Latin editions of Hobbes' Leviathan?

How have places changed over the course of history?

Digital Humanities Awards

Research program supporting university research taking a computational approach to traditional humanist questions

US program: Summer 2010

12 projects, 23 researchers, 15 universities

European program: Winter 2010

10 projects planned, $1M total funding

Education

Curriculum Development

Seeding and supporting computing curriculum development
Exploring computational thinking in K12 (launch late Oct)
CS4HS: High school computer science (cs4hs.com)
Undergraduate open source CS curriculum: Google Code University (code.google.com/edu)
Lantern platform: Wiki for open source curriculum development (in collaboration with Khan Academy)

Talent Development: Google Summer of Code™

Program Genesis
"Flip bits, not burgers" during summer holidays
Exposure to real-world software development

Students paired with mentor from open source community
Execute to milestones laid out in accepted application
Stipend allows students to concentrate on open source development

2010: 1,026 students, 150 organizations, 69 countries

Technology Leadership: App Inventor for Android

Visual programming environment for Android mobile devices
Helping people become creators (rather than consumers) of technology
Launched in Google Labs July 12, 2010

http://appinventor.googlelabs.com/about

CS4HS European Workshops

École Polytechnique Fédérale de Lausanne (EPFL), Switzerland: Building and Programming Robots
ETH Zurich, Switzerland: ABZ (Ausbildungs- und Beratungszentrum für Informatikunterricht)
Makerere University, Uganda: Grassroot approach to improve the quality of applicants of computing programs at Makerere University
Manchester University, United Kingdom: Animation11
Oslo University, Norway: TENK
Queen Mary University, United Kingdom: cs4fn magazine
RWTH Aachen, Germany: Bright Brains in Computer Science
Sapienza University of Rome, Italy: Challenge and Fun with the CS Olympiads
University of Stuttgart, Germany: UniS2010
Technion, Israel: High School Computer Science Female Students' Visits in Google: Impressions, Conceptions and Influences
Trinity College Dublin, Ireland: Computer Programming Outreach B2C
University of Cape Town, South Africa: Project Umonya
University College Dublin, Ireland: CS Summer School
University of Warsaw, Poland: Mastering Programming Skills: Workshops for Teachers

Technology Leadership: Google Code University

Course content on current computing technologies and paradigms
CS Curriculum Search
Tech Talks on CS Topics
Tutorials, lecture slides, problem sets for a variety of topic areas

AJAX Programming
Algorithms
Distributed Systems
Web Security
Languages
Practical Skills (MySQL, Linux)

http://code.google.com/edu

Supporting our Academic Institutions

Research Awards Programs - 230+ projects funded in the last year
Next due dates: August 15 (CS Awards), October 15 (Marketing Awards)
research-awards@google.com
Focused Grant Program

Visiting Faculty Program - 20 faculty (ongoing)
university-relations@google.com

PhD Fellowship Program
2009: 13 students supported in North America
2010: 15 in North America, 16 outside North America
Over 150 other scholarships

~1000 interns worldwide
CS4HS: 1500+ teachers (~100000 students), US & EMEA

Final Thoughts

Scale of Communication and Computing is profound
Endless opportunity for technical growth

Some large themes
Major new application domains

Google continues to innovate rapidly in science/technology and in value to consumers
We are providing increased support for academic institutions in computer science and related areas

It's a most exciting area in which to innovate

Page 44: Rapid Advances in Computer Science and Opportunities for ... · Machine learning techniques applied to monitoring/controlling such systems Automatic, dynamic world-wide placement

Earth Engine

Motivation - Carbon Forest Tracking

UNEP Atlas of our Changing Environment

1975 1989 2001

Rondonia Brazil

A Sampling of Other Use Cases

Disease Early Warning Remote surveillance of disease and prediction of epidemics

Population Census Supplements traditional census and mapping in developing regions

Humanitarian Crisis Mapping Can detect and monitor a growing range of crisis typesWater Resources Monitor water quality and availability and alleviate water shortage problemsFood Security Famine early warning rainfall and water requirements estimations agr production estimates and irrigation and fertilizer supply amp demandGlobal Education Programs

Parallel Geo-Processing ldquoin the cloudrdquo

(a brief illustration)

Original image

Original image is divided into 256px sub-units

Sub-units are distributed

Sub-units are distributed to separate machines

Sub-units are distributed to separate machines where they can be processed in parallel

Thousands can be processed simultaneously

Result is reassembled

Result is reassembled into a finished image

Overview

Global-scale earth observation and informatics platform
- For public benefit and to support emerging green economy
- Help science come out of research lab and into operational use at scale
- Unprecedented catalog of earth observation data for mining and analysis
- Promote transparency, reproducibility, collaboration, "open science"

Very fast computation of scientific map products
- Intrinsically-parallel pixel processing system
- Built-in Google algorithms as well as user-supplied
- Earth Engine API for 3rd party algorithm development
- Access control, versioning, provenance
- Online and desktop versions (open source desktop version)

On a lot of useful data
- Every available Landsat and MODIS scene (more satellites coming)
- Commercial datasets (very high resolution satellite imagery)
- Environmental data (atmospheric, ocean, terrestrial)
- User-supplied (e.g., in-situ data collected via Android phones)

Scale of Data

US: Satellite imagery dataset Landsat
- Peru: 60 Landsat scenes (3Gpix, 20GB) per covering
- World: 8000 Landsat scenes (2TB) per covering
- Complete global coverage every 16 days
- Operating since 1972; historical archive holds ~4PB

US: NASA EOS approaching ~10PB of Earth images

Europe: European Space Agency (ESA)
- ESA satellite missions: MERIS, Envisat, others
- SpotImage (France): 20M SPOT images since 1986, 10000 new images collected daily, 5+ PB archive
- ESA launching Sentinel-1 in 2011
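To get a feel for the Landsat figures above, a short back-of-the-envelope calculation helps. The sketch below uses only the numbers quoted on the slide plus the 256px sub-unit size from the geo-processing illustration. The variable and function names are hypothetical, the slide figures are rounded, and real scenes vary in size, so the results are rough estimates only.

```python
import math

TILE_EDGE_PX = 256   # sub-unit size from the geo-processing illustration
PERU_SCENES = 60     # Landsat scenes per covering of Peru
PERU_PIXELS = 3e9    # "3Gpix"
PERU_GB = 20         # "20GB"
WORLD_SCENES = 8000  # Landsat scenes per covering of the world
REVISIT_DAYS = 16    # complete global coverage every 16 days


def tiles_for(pixels, edge=TILE_EDGE_PX):
    """Approximate number of edge x edge sub-units needed for `pixels` pixels."""
    return math.ceil(pixels / (edge * edge))


# Average scene size implied by the Peru figures.
gb_per_scene = PERU_GB / PERU_SCENES

# Extrapolating that average to a world covering (decimal TB).
world_tb_estimate = WORLD_SCENES * gb_per_scene / 1000

# Number of complete global coverings per year.
coverings_per_year = 365 / REVISIT_DAYS

if __name__ == "__main__":
    print(f"Sub-units for one Peru covering: ~{tiles_for(PERU_PIXELS):,}")
    print(f"Average scene size: ~{gb_per_scene * 1000:.0f} MB")
    print(f"World covering estimate: ~{world_tb_estimate:.1f} TB")
    print(f"Global coverings per year: ~{coverings_per_year:.0f}")
```

Extrapolating from the Peru average gives a world covering of a few terabytes, the same order of magnitude as the slide's 2TB figure. A single covering of Peru already breaks into tens of thousands of independent sub-units, which is why the "thousands processed simultaneously" model fits this data.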

Representative Examples

- Envisat: Gulf Oil Spill, June 2010 (ESA)
- MERIS: Hurricane Isabel, Sept 2003 (ESA)
- Spot Image: Xingu, Brazilian Amazon

View on YouTube

Health (US)

Google Health: personal health dashboard

Launched in May 2008

Major update, Sept 2010

User controls and owns his/her data

A platform that...
- Provides a dashboard for wellness information & medical records
- Allows user to connect and interact with a broad group of "add on" services
- Includes a non-tethered PHR

Crisis Response

Person Finder

Pre-Earthquake - Aug 26, 2009

1 Day After Earthquake - Jan 13, 2010

13 Days After Earthquake - Jan 25, 2010

Digital Humanities

Illuminating the Humanities

Q: What can you do with 12 million books in over 400 languages, comprised of 5 billion pages and 2 trillion words, all digitized?

A: Look to the humanities for new questions
- How would you (re)define Victorian literature?
- What are the differences between the English and Latin editions of Hobbes' Leviathan?
- How have places changed over the course of history?

Digital Humanities Awards

Research program supporting university research taking a computational approach to traditional humanist questions

US program, Summer 2010
- 12 projects, 23 researchers, 15 universities

European program, Winter 2010
- 10 projects planned; $1M total funding

Education

Curriculum Development

Seeding and supporting computing curriculum development
- Exploring computational thinking in K-12 (launch late Oct)
- CS4HS: High school computer science (cs4hs.com)
- Undergraduate open source CS curriculum: Google Code University (code.google.com/edu)
- Lantern platform: Wiki for open source curriculum development (in collaboration with Khan Academy)

Talent Development: Google Summer of Code™

Program Genesis
- "Flip bits, not burgers" during summer holidays
- Exposure to real-world software development

- Students paired with mentor from the open source (OS) community
- Execute to milestones laid out in accepted application
- Stipend allows students to concentrate on OS development

2010: 1026 students, 150 organizations, 69 countries

Technology Leadership: App Inventor for Android

- Visual programming environment for Android mobile devices
- Helping people become creators (rather than consumers) of technology
- Launched in Google Labs, July 12, 2010

http://appinventor.googlelabs.com/about

CS4HS European Workshops

- École Polytechnique Fédérale de Lausanne (EPFL), Switzerland: Building and Programming Robots
- ETH Zurich, Switzerland: ABZ, Ausbildungs- und Beratungszentrum für Informatikunterricht
- Makerere University, Uganda: Grassroot approach to improve the quality of applicants of computing programs at Makerere University
- Manchester University, United Kingdom: Animation11
- Oslo University, Norway: TENK
- Queen Mary University, United Kingdom: cs4fn magazine
- RWTH Aachen, Germany: Bright Brains in Computer Science
- Sapienza University of Rome, Italy: Challenge and Fun with the CS Olympiads
- University of Stuttgart, Germany: UniS2010
- Technion, Israel: High School Computer Science Female Students Visits in Google: Impressions, Conceptions and Influences
- Trinity College Dublin, Ireland: Computer Programming Outreach B2C
- University of Cape Town, South Africa: Project Umonya
- University College Dublin, Ireland: CS Summer School
- University of Warsaw, Poland: Mastering Programming Skills Workshops for Teachers

Technology Leadership: Google Code University

- Course content on current computing technologies and paradigms
- CS Curriculum Search
- Tech Talks on CS Topics
- Tutorials, lecture slides, problem sets for a variety of topic areas

Topic areas
- AJAX Programming
- Algorithms
- Distributed Systems
- Web Security
- Languages
- Practical Skills (MySQL, Linux)

http://code.google.com/edu

Supporting our Academic Institutions

Research Awards Programs - 230+ projects funded in the last year
- Next due dates: August 15 (CS Awards), October 15 (Marketing Awards)
- Research-awardsgooglecom
- Focused Grant Program

Visiting Faculty Program - 20 faculty (ongoing)
- University-relationsgooglecom

PhD Fellowship Program
- 2009: 13 students supported in North America
- 2010: 15 in North America, 16 outside North America

Over 150 other scholarships

~1000 interns worldwide

CS4HS: 1500+ teachers (~100000 students), US & EMEA

Final Thoughts

Scale of Communication and Computing is profound

Endless opportunity for technical growth

Some large themes

Major new application domains

Google rapidly to innovate in science/technology and value to consumers

We are providing increased support for academic institutions in computer science and related areas

It's a most exciting area in which to innovate

Page 49: Rapid Advances in Computer Science and Opportunities for ... · Machine learning techniques applied to monitoring/controlling such systems Automatic, dynamic world-wide placement

Original image is divided into 256px sub-units

Sub-units are distributed to separate machines where they can be processed in parallel

Thousands can be processed simultaneously

Result is reassembled into a finished image
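
The steps above (divide, distribute, process in parallel, reassemble) can be sketched in a few lines of Python. This is only an illustration, not the system behind the slides: a local thread pool stands in for the separate machines, the per-pixel operation in `process_tile` (inverting 8-bit values) is a hypothetical placeholder, and all function names are assumptions. Only the 256px sub-unit size comes from the slide.

```python
from concurrent.futures import ThreadPoolExecutor

TILE = 256  # sub-unit edge length in pixels, as stated on the slide


def split_into_tiles(image, tile=TILE):
    """Yield (row, col, block) for each tile-sized block of a 2-D list image."""
    height, width = len(image), len(image[0])
    for r in range(0, height, tile):
        for c in range(0, width, tile):
            # Edge tiles are simply smaller than tile x tile.
            block = [row[c:c + tile] for row in image[r:r + tile]]
            yield r, c, block


def process_tile(block):
    """Hypothetical per-pixel operation: invert 8-bit pixel values."""
    return [[255 - p for p in row] for row in block]


def process_image(image, tile=TILE, workers=4):
    """Split the image, process tiles concurrently, and reassemble the result."""
    height, width = len(image), len(image[0])
    result = [[0] * width for _ in range(height)]
    tiles = list(split_into_tiles(image, tile))
    # Assumption: worker threads stand in for the separate machines.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        processed = pool.map(process_tile, (block for _, _, block in tiles))
        # pool.map preserves input order, so each result pairs with its origin.
        for (r, c, _), block in zip(tiles, processed):
            for dr, row in enumerate(block):
                result[r + dr][c:c + len(row)] = row
    return result
```

Because each tile depends only on its own pixels, the tiles can be processed in any order and on any worker, which is what lets thousands of them run at once.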

Global-scale earth observation and informatics platform

For public benefit and to support emerging green economy

Help science come out of research lab and into operational use at scale

Unprecedented catalog of earth observation data for mining and analysis

Promote transparency, reproducibility, collaboration, "open science"

Very fast computation of scientific map products

Intrinsically parallel pixel processing system

Built-in Google algorithms as well as user-supplied

Earth Engine API for 3rd-party algorithm development

Access control, versioning, provenance

Online and desktop versions (open source desktop version)

On a lot of useful data

Every available Landsat and MODIS scene (more satellites coming)

Commercial datasets (very high resolution satellite imagery)

Environmental data (atmospheric, ocean, terrestrial)

User-supplied (e.g., in-situ data collected via Android phones)

Overview

Scale of Data

US: Satellite imagery dataset: Landsat

Peru: 60 Landsat scenes (3 Gpix, 20 GB) per covering

World: 8000 Landsat scenes (2 TB) per covering

Complete global coverage every 16 days

Operating since 1972; historical archive holds ~4 PB

US: NASA EOS approaching ~10 PB of Earth images

Europe: European Space Agency (ESA)

ESA satellite missions: MERIS, Envisat, others

SpotImage (France): 20M SPOT images since 1986, 10,000 new images collected daily, 5+ PB archive

ESA: Launching Sentinel-1 in 2011
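
As a rough cross-check of the Landsat figures above, the arithmetic below derives per-scene sizes, coverings per year, and yearly data volume. It assumes decimal units (1 TB = 10^12 bytes) and treats the slide's rounded numbers as exact, so the results are order-of-magnitude estimates only. The variable and function names are illustrative.

```python
GB = 10**9
TB = 10**12
PB = 10**15

# Figures quoted on the slide (rounded).
peru_scenes, peru_pixels, peru_bytes = 60, 3 * 10**9, 20 * GB
world_scenes, world_bytes = 8000, 2 * TB
revisit_days = 16
archive_bytes = 4 * PB


def summarize():
    """Derive rough per-scene and per-year quantities from the slide figures."""
    coverings_per_year = 365 / revisit_days
    return {
        "peru_mb_per_scene": peru_bytes / peru_scenes / 10**6,
        "peru_bytes_per_pixel": peru_bytes / peru_pixels,
        "world_mb_per_scene": world_bytes / world_scenes / 10**6,
        "coverings_per_year": coverings_per_year,
        "tb_per_year": world_bytes * coverings_per_year / TB,
        "archive_in_world_coverings": archive_bytes / world_bytes,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.1f}")
```

The Peru and world figures give somewhat different sizes per scene (roughly 333 MB versus 250 MB), which is consistent with rounded slide numbers rather than exact measurements.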

Representative Examples

Envisat: Gulf Oil Spill, June 2010 (ESA)

MERIS: Hurricane Isabel, Sept 2003 (ESA)

Spot Image: Xingu, Brazilian Amazon

Health (US)

Google Health: personal health dashboard

Launched in May 2008

Major update Sept 2010

User controls and owns his/her data

A platform that...

Provides a dashboard for wellness information & medical records

Allows user to connect and interact with a broad group of "add-on" services

Includes a non-tethered PHR

Crisis Response

Person Finder

Pre-Earthquake - Aug 26 2009

1 Day After Earthquake - Jan 13 2010

13 Days After Earthquake - Jan 25 2010

Digital Humanities

Illuminating the Humanities

Q: What can you do with 12 million books in over 400 languages, comprising 5 billion pages and 2 trillion words, all digitized?

A: Look to the humanities for new questions

How would you (re)define Victorian literature?

What are the differences between the English and Latin editions of Hobbes' Leviathan?

How have places changed over the course of history?

Digital Humanities Awards

Research program supporting university research taking a computational approach to traditional humanist questions

US program: Summer 2010

12 projects, 23 researchers, 15 universities

European program: Winter 2010

10 projects planned

$1M total funding

Education

Curriculum Development

Seeding and supporting computing curriculum development

Exploring computational thinking in K-12 (launch late Oct)

CS4HS: High school computer science (cs4hs.com)

Undergraduate open source CS curriculum: Google Code University (code.google.com/edu)

Lantern platform: Wiki for open source curriculum development (in collaboration with Khan Academy)

Talent Development: Google Summer of Code™

Program Genesis

"Flip bits, not burgers" during summer holidays

Exposure to real-world software development

Students paired with mentor from open source (OS) community

Execute to milestones laid out in accepted application

Stipend allows students to concentrate on OS development

2010: 1,026 students, 150 organizations, 69 countries

Technology Leadership: App Inventor for Android

Visual programming environment for Android mobile devices

Helping people become creators (rather than consumers) of technology

Launched in Google Labs July 12, 2010

http://appinventor.googlelabs.com/about

CS4HS European Workshops

École Polytechnique Fédérale de Lausanne (EPFL), Switzerland: Building and Programming Robots

ETH Zurich, Switzerland: ABZ, Ausbildungs- und Beratungszentrum für Informatikunterricht

Makerere University, Uganda: Grassroot approach to improve the quality of applicants of computing programs at Makerere University

Manchester University, United Kingdom: Animation11

Oslo University, Norway: TENK

Queen Mary University, United Kingdom: cs4fn magazine

RWTH Aachen, Germany: Bright Brains in Computer Science

Sapienza University of Rome, Italy: Challenge and Fun with the CS Olympiads

University of Stuttgart, Germany: UniS2010

Technion, Israel: High School Computer Science Female Students Visits in Google: Impressions, Conceptions and Influences

Trinity College Dublin, Ireland: Computer Programming Outreach B2C

University of Cape Town, South Africa: Project Umonya

University College Dublin, Ireland: CS Summer School

University of Warsaw, Poland: Mastering Programming Skills Workshops for Teachers

Technology Leadership: Google Code University

Course content on current computing technologies and paradigms

CS Curriculum Search

Tech Talks on CS Topics

Tutorials, lecture slides, problem sets for a variety of topic areas

AJAX Programming

Algorithms

Distributed Systems

Web Security

Languages

Practical Skills (MySQL, Linux)

http://code.google.com/edu

Supporting our Academic Institutions

Research Awards Programs - 230+ projects funded in the last year

Next due dates: August 15 (CS Awards), October 15 (Marketing Awards)

Research-awards.google.com

Focused Grant Program

Visiting Faculty Program - 20 faculty (ongoing)

University-relations.google.com

PhD Fellowship Program

2009: 13 students supported in North America

2010: 15 in North America, 16 outside North America

Over 150 other scholarships

~1000 interns worldwide

CS4HS: 1500+ teachers (~100,000 students), US & EMEA

Final Thoughts

Scale of Communication and Computing is profound

Endless opportunity for technical growth

Some large themes

Major new application domains

Google: rapidly innovating in science/technology and value to consumers

We are providing increased support for academic institutions in computer science and related areas

It's a most exciting area in which to innovate

Result is reassembled into a finished image

Global-scale earth observation and informatics platform
For public benefit and to support emerging green economy
Help science come out of research lab and into operational use at scale
Unprecedented catalog of earth observation data for mining and analysis
Promote transparency, reproducibility, collaboration, "open science"

Very fast computation of scientific map products
Intrinsically-parallel pixel processing system
Built-in Google algorithms as well as user-supplied
Earth Engine API for 3rd party algorithm development
Access control, versioning, provenance
Online and desktop versions (open source desktop version)

On a lot of useful data
Every available Landsat and MODIS scene (more satellites coming)
Commercial datasets (very high resolution satellite imagery)
Environmental data (atmospheric, ocean, terrestrial)
User-supplied (e.g., in-situ data collected via Android phones)
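The "intrinsically-parallel pixel processing" idea, in which work is divided up and the result is reassembled into a finished image, can be illustrated with a minimal Python sketch. This is not Earth Engine code and does not use the Earth Engine API: the function names, the tile size, and the thresholding operation are all hypothetical, chosen only to show the split, process, and reassemble pattern on a small in-memory image.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_tiles(image, tile_size):
    """Yield (row, col, tile) for tiles covering a 2-D list of pixel values."""
    height, width = len(image), len(image[0])
    for r in range(0, height, tile_size):
        for c in range(0, width, tile_size):
            tile = [row[c:c + tile_size] for row in image[r:r + tile_size]]
            yield r, c, tile


def process_tile(tile):
    """Hypothetical per-pixel operation: threshold each value at 128."""
    return [[1 if px >= 128 else 0 for px in row] for row in tile]


def process_image(image, tile_size=2, workers=4):
    """Process tiles independently, then reassemble them into one image."""
    height, width = len(image), len(image[0])
    result = [[None] * width for _ in range(height)]
    tiles = list(split_into_tiles(image, tile_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each tile depends only on its own pixels, so tiles can run in any order.
        processed = pool.map(process_tile, (tile for _, _, tile in tiles))
        for (r, c, _), out in zip(tiles, processed):
            # Write the processed tile back at its original offset.
            for dr, row in enumerate(out):
                result[r + dr][c:c + len(row)] = row
    return result


if __name__ == "__main__":
    demo = [[0, 200, 130, 10, 255],
            [128, 127, 90, 250, 3],
            [50, 60, 220, 129, 128]]
    for row in process_image(demo):
        print(row)
```

Because each output pixel depends only on the matching input pixel, the tiles share no state; a real system would presumably distribute them across machines rather than threads.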

Overview

Scale of Data

US: Satellite imagery dataset Landsat
Peru: 60 Landsat scenes (3 Gpix, 20 GB) per covering
World: 8,000 Landsat scenes (2 TB) per covering
Complete global coverage every 16 days
Operating since 1972; historical archive holds ~4 PB
US: NASA EOS approaching ~10 PB of Earth images

Europe: European Space Agency (ESA)
ESA satellite missions: MERIS, Envisat, others
SpotImage (France): 20M SPOT images since 1986; 10,000 new images collected daily; 5+ PB archive
ESA: Launching Sentinel-1 in 2011
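The Landsat figures above imply some rough per-scene and per-year sizes. The following back-of-envelope Python sketch works them out from the slide's numbers only; it assumes decimal units (1 GB = 10^9 bytes, 1 TB = 10^12 bytes) and a 365-day year, and the function and key names are hypothetical.

```python
GB = 10**9
TB = 10**12


def landsat_estimates():
    """Derive rough sizes from the figures quoted on the slide."""
    peru_scenes, peru_bytes, peru_pixels = 60, 20 * GB, 3 * 10**9
    world_scenes, world_bytes = 8000, 2 * TB
    revisit_days = 16

    coverings_per_year = 365 / revisit_days
    return {
        # Average size of one scene in the Peru covering (about 333 MB).
        "bytes_per_scene_peru": peru_bytes / peru_scenes,
        # Average size of one scene in the world covering (250 MB).
        "bytes_per_scene_world": world_bytes / world_scenes,
        # Storage per pixel in the Peru covering (about 6.7 bytes).
        "bytes_per_pixel": peru_bytes / peru_pixels,
        # Number of complete global coverings per year (about 22.8).
        "coverings_per_year": coverings_per_year,
        # Data volume if every covering were stored at 2 TB (about 45.6 TB).
        "bytes_per_year": coverings_per_year * world_bytes,
    }


if __name__ == "__main__":
    for name, value in landsat_estimates().items():
        print(f"{name}: {value:,.2f}")
```

These are averages only; the slide does not say how the ~4 PB archive figure breaks down, so the sketch makes no attempt to reproduce it.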

Representative Examples

Envisat: Gulf Oil Spill, June 2010 (ESA)
MERIS: Hurricane Isabel, Sept 2003 (ESA)
Spot Image: Xingu, Brazilian Amazon

Health (US)

Google Health personal health dashboard

Launched in May 2008

Major update Sept 2010

User controls and owns his/her data

A platform that…
Provides a dashboard for wellness information & medical records
Allows user to connect and interact with a broad group of "add on" services
Includes a non-tethered PHR

Crisis Response

Person Finder

Pre-Earthquake - Aug 26 2009

1 Day After Earthquake - Jan 13 2010

13 Days After Earthquake - Jan 25 2010

Digital Humanities

Illuminating the Humanities

Q: What can you do with 12 million books in over 400 languages, comprised of 5 billion pages and 2 trillion words, all digitized?

A: Look to the humanities for new questions

How would you (re)define Victorian literature?

What are the differences between the English and Latin editions of Hobbes' Leviathan?

How have places changed over the course of history?

Digital Humanities Awards

Research program supporting university research taking a computational approach to traditional humanist questions

US program: Summer 2010
12 projects, 23 researchers, 15 universities

European program: Winter 2010
10 projects planned, $1M total funding

Education

Curriculum Development

Seeding and supporting computing curriculum development
Exploring computational thinking in K12 (launch late Oct)
CS4HS: High school computer science (cs4hs.com)
Undergraduate open source CS curriculum: Google Code University (code.google.com/edu)
Lantern platform: Wiki for open source curriculum development (in collaboration with Khan Academy)

Talent Development: Google Summer of Code™

Program Genesis
"Flip bits, not burgers" during summer holidays
Exposure to real-world software development

Students paired with mentor from OS community
Execute to milestones laid out in accepted application
Stipend allows students to concentrate on OS development

2010: 1,026 students, 150 organizations, 69 countries

Technology Leadership: App Inventor for Android

Visual programming environment for Android mobile devices
Helping people become creators (rather than consumers) of technology
Launched in Google Labs July 12, 2010

http://appinventor.googlelabs.com/about

CS4HS European Workshops

École Polytechnique Fédérale de Lausanne (EPFL), Switzerland: Building and Programming Robots
ETH Zurich, Switzerland: ABZ Ausbildungs- und Beratungszentrum für Informatikunterricht
Makerere University, Uganda: Grassroot approach to improve the quality of applicants of computing programs at Makerere University
Manchester University, United Kingdom: Animation11
Oslo University, Norway: TENK
Queen Mary University, United Kingdom: cs4fn magazine
RWTH Aachen, Germany: Bright Brains in Computer Science
Sapienza University of Rome, Italy: Challenge and Fun with the CS Olympiads
University of Stuttgart, Germany: UniS2010
Technion, Israel: High School Computer Science Female Students Visits in Google: Impressions, Conceptions and Influences
Trinity College Dublin, Ireland: Computer Programming Outreach B2C
University of Cape Town, South Africa: Project Umonya
University College Dublin, Ireland: CS Summer School
University of Warsaw, Poland: Mastering Programming Skills Workshops for Teachers

Technology Leadership: Google Code University

Course content on current computing technologies and paradigms
CS Curriculum Search
Tech Talks on CS Topics
Tutorials, lecture slides, problem sets for a variety of topic areas

AJAX Programming
Algorithms
Distributed Systems
Web Security
Languages
Practical Skills (MySQL, Linux)

http://code.google.com/edu

Supporting our Academic Institutions

Research Awards Programs - 230+ projects funded in the last year

Next due dates: August 15 (CS Awards), October 15 (Marketing Awards)
research-awards@google.com
Focused Grant Program

Visiting Faculty Program - 20 faculty (ongoing)
university-relations@google.com

PhD Fellowship Program
2009: 13 students supported in North America
2010: 15 in North America, 16 outside North America
Over 150 other scholarships

~1,000 interns worldwide
CS4HS: 1,500+ teachers (~100,000 students), US & EMEA

Final Thoughts

Scale of Communication and Computing is profound
Endless opportunity for technical growth

Some large themes
Major new application domains

Google rapidly to innovate in science/technology and value to consumers
We are providing increased support for academic institutions in computer science and related areas

It's a most exciting area in which to innovate

Page 59: Rapid Advances in Computer Science and Opportunities for ... · Machine learning techniques applied to monitoring/controlling such systems Automatic, dynamic world-wide placement

Health (US)

Google Health personal health dashboard

Launched in May 2008

Major update Sept 2010

User controls and owns hisher data

A platform thathellipProvides a dashboard for wellness information amp medical recordsAllows user to connect and interact with a broad group of ldquoadd onrdquo servicesIncludes a non-tethered PHR

Crisis Response

Person Finder

Pre-Earthquake - Aug 26 2009

1 Day After Earthquake - Jan 13 2010

13 Days After Earthquake - Jan 25 2010

Digital Humanities

Illuminating the Humanities

Q What can you do with12 million books inover 400 languagescomprised of 5 billion pages and 2 trillion wordsall digitized

A Look to the humanities for new questionsHow would you (re)define Victorian literature

What are the differences between the English and Latin editions of Hobbesrsquo Leviathan

How have places changed over the course of history

Digital Humanities Awards

Research program supporting university research taking a computational approach to traditional humanist questions US program Summer 2010

12 projects23 researchers15 universities

European program Winter 2010

10 projects planned $1M total funding

Education

Curriculum Development

Seeding and supporting computing curriculum developmentExploring computational thinking in K12 (launch late Oct)CS4HS High school computer science (cs4hscom)Undergraduate open source CS curriculum Google Code University (codegooglecomedu)Lantern platform Wiki for open source curriculum development (in collaboration with Khan Academy)

Talent Development Google Summer of CodeTM

Program GenesisrdquoFlip bits not burgersrdquo during summer holidaysExposure to real-world software development

Students paired with mentor from OS communityExecute to milestones laid out in accepted applicationStipend allows students to concentrate on OS development

20101026 students150 organizations 69 countries

Technology Leadership App Inventor for Android

Visual programming environment for Android mobile devicesHelping people become creators (rather than consumers) of technologyLaunched in Google Labs July 12 2010

httpappinventorgooglelabscomabout

CS4HS European WorkshopsEacutecole Polytechnique Feacutedeacuterale de Lausanne (EPFL) Switzerland Building and Programming Robots

ETH Zurich Switzerland ABZ Ausbildung - und Beratungszentrum fuer Informatikunterricht

Makerere University Uganda Grassroot approach to improve the quality of applicants of computing programs at Makerere University

Manchester University United Kingdom Animation11

Oslo University Norway TENK

Queen Mary University United Kingdom cs4fn magazine

RWTH Aachen Germany Bright Brains in Computer Science

Sapienza University of Rome Italy Challenge and Fun with the CS Olympiads

University of Stuttgart Germany UniS2010

Technion Israel High School Computer Science Female Students Visits in Google Impressions Conceptions and Influences

Trinity College Dublin Ireland Computer Programming Outreach B2C

University of Cape Town South Africa Project Umonya

University College Dublin Ireland CS Summer School

University of Warsaw Poland Mastering Programing Skills Workshops for Teachers

Technology Leadership Google Code University

Course content on current computing technologies and paradigmsCS Curriculum SearchTech Talks on CS TopicsTutorials lecture slides problem sets for a variety of topic areas

AJAX ProgrammingAlgorithmsDistributed SystemsWeb SecurityLanguagesPractical Skills (MySQL Linux)

httpcodegooglecomedu

Supporting our Academic Institutions

Research Awards Programs - 230+ projects funded in the last year

Next due dates: August 15 (CS Awards), October 15 (Marketing Awards)
Research-awardsgooglecom
Focused Grant Program

Visiting Faculty Program - 20 faculty (ongoing)
University-relationsgooglecom

PhD Fellowship Program
2009: 13 students supported in North America
2010: 15 in North America, 16 outside North America
Over 150 other scholarships

~1000 interns worldwide
CS4HS: 1500+ teachers (~100,000 students), US & EMEA

Final Thoughts

Scale of Communication and Computing is profound
Endless opportunity for technical growth

Some large themes
Major new application domains

Google rapidly to innovate in science/technology and value to consumers
We are providing increased support for academic institutions in computer science and related areas

It's a most exciting area in which to innovate

Google Health: personal health dashboard

Launched in May 2008

Major update Sept 2010

User controls and owns his/her data

A platform that...
Provides a dashboard for wellness information & medical records
Allows user to connect and interact with a broad group of "add on" services
Includes a non-tethered PHR

Crisis Response

Person Finder

Pre-Earthquake - Aug 26, 2009

1 Day After Earthquake - Jan 13, 2010

13 Days After Earthquake - Jan 25, 2010

Digital Humanities

Illuminating the Humanities

Q: What can you do with 12 million books in over 400 languages, comprised of 5 billion pages and 2 trillion words, all digitized?

A: Look to the humanities for new questions

How would you (re)define Victorian literature?

What are the differences between the English and Latin editions of Hobbes' Leviathan?

How have places changed over the course of history?

Digital Humanities Awards

Research program supporting university research taking a computational approach to traditional humanist questions

US program: Summer 2010
12 projects, 23 researchers, 15 universities

European program: Winter 2010
10 projects planned

$1M total funding

Education

Curriculum Development

Seeding and supporting computing curriculum development
Exploring computational thinking in K12 (launch late Oct)
CS4HS: High school computer science (cs4hs.com)
Undergraduate open source CS curriculum: Google Code University (code.google.com/edu)
Lantern platform: Wiki for open source curriculum development (in collaboration with Khan Academy)

Talent Development: Google Summer of Code™

Program Genesis
"Flip bits, not burgers" during summer holidays
Exposure to real-world software development

Students paired with mentor from OS community
Execute to milestones laid out in accepted application
Stipend allows students to concentrate on OS development

2010: 1,026 students, 150 organizations, 69 countries

Page 67: Rapid Advances in Computer Science and Opportunities for ... · Machine learning techniques applied to monitoring/controlling such systems Automatic, dynamic world-wide placement

Illuminating the Humanities

Q What can you do with12 million books inover 400 languagescomprised of 5 billion pages and 2 trillion wordsall digitized

A Look to the humanities for new questionsHow would you (re)define Victorian literature

What are the differences between the English and Latin editions of Hobbesrsquo Leviathan

How have places changed over the course of history

Digital Humanities Awards

Research program supporting university research taking a computational approach to traditional humanist questions US program Summer 2010

12 projects23 researchers15 universities

European program Winter 2010

10 projects planned $1M total funding

Education

Curriculum Development

Seeding and supporting computing curriculum developmentExploring computational thinking in K12 (launch late Oct)CS4HS High school computer science (cs4hscom)Undergraduate open source CS curriculum Google Code University (codegooglecomedu)Lantern platform Wiki for open source curriculum development (in collaboration with Khan Academy)

Talent Development Google Summer of CodeTM

Program GenesisrdquoFlip bits not burgersrdquo during summer holidaysExposure to real-world software development

Students paired with mentor from OS communityExecute to milestones laid out in accepted applicationStipend allows students to concentrate on OS development

20101026 students150 organizations 69 countries

Technology Leadership App Inventor for Android

Visual programming environment for Android mobile devicesHelping people become creators (rather than consumers) of technologyLaunched in Google Labs July 12 2010

httpappinventorgooglelabscomabout

CS4HS European Workshops

École Polytechnique Fédérale de Lausanne (EPFL), Switzerland: Building and Programming Robots
ETH Zurich, Switzerland: ABZ Ausbildungs- und Beratungszentrum für Informatikunterricht
Makerere University, Uganda: Grassroot approach to improve the quality of applicants of computing programs at Makerere University
Manchester University, United Kingdom: Animation11
Oslo University, Norway: TENK
Queen Mary University, United Kingdom: cs4fn magazine
RWTH Aachen, Germany: Bright Brains in Computer Science
Sapienza University of Rome, Italy: Challenge and Fun with the CS Olympiads
University of Stuttgart, Germany: UniS2010
Technion, Israel: High School Computer Science Female Students' Visits in Google: Impressions, Conceptions and Influences
Trinity College Dublin, Ireland: Computer Programming Outreach B2C
University of Cape Town, South Africa: Project Umonya
University College Dublin, Ireland: CS Summer School
University of Warsaw, Poland: Mastering Programming Skills Workshops for Teachers

Technology Leadership: Google Code University

Course content on current computing technologies and paradigms
CS Curriculum Search
Tech Talks on CS Topics
Tutorials, lecture slides, problem sets for a variety of topic areas

AJAX Programming, Algorithms, Distributed Systems, Web Security, Languages, Practical Skills (MySQL, Linux)

http://code.google.com/edu

Supporting our Academic Institutions

Research Awards Programs - 230+ projects funded in the last year
Next due dates: August 15 (CS Awards), October 15 (Marketing Awards)
Research-awardsgooglecom
Focused Grant Program

Visiting Faculty Program - 20 faculty (ongoing)
University-relationsgooglecom

PhD Fellowship Program
2009: 13 students supported in North America
2010: 15 in North America, 16 outside North America
Over 150 other scholarships

~1000 interns worldwide
CS4HS: 1500+ teachers (~100,000 students), US & EMEA

Final Thoughts

Scale of Communication and Computing is profound
Endless opportunity for technical growth

Some large themes
Major new application domains

Google rapidly to innovate in science/technology and value to consumers
We are providing increased support for academic institutions in computer science and related areas

It's a most exciting area in which to innovate
