Post Page Rank: Search, AI, and the Limits of Probability Joe Buzzanga [email protected] October, 2015

PPRank 2015


Page 1: PPRank 2015

Post Page Rank: Search, AI, and the Limits of Probability

Joe Buzzanga [email protected]

October, 2015

Page 2: PPRank 2015

Topics

• Key Technology Transitions
• Mobile
• Apps vs Web
• Intelligence
• Rethinking search
• Machine learning
• Knowledge Bases
• Intelligent Personal Assistants

• Philosophical Implications

Page 3: PPRank 2015

Search—Key Transitions

• Desktop to Mobile
• Web Pages to Apps
• Keyword to Concept
• Links to Answers
• Passive to Active
• Insentient to Sentient

Page 4: PPRank 2015

A Billion Dollar Algorithm


• Intuition: Link Structure of the web analogous to citation graph

• Almost instantly leapfrogged alternative systems
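The citation-graph intuition can be sketched as a power iteration: a page's score is the sum of the scores of the pages linking to it, each divided by that page's number of outbound links. The following is a minimal sketch only. The toy graph, the function name, and the iteration count are illustrative assumptions; the damping factor of 0.85 is the conventional value, not a detail taken from this slide.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict mapping page -> list of outbound links."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without outbound links is spread evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            for target in targets:
                # Each page passes an equal share of its rank to every page it links to.
                new_rank[target] += damping * rank[page] / len(targets)
        rank = new_rank
    return rank


# Hypothetical four-page web: C is "cited" by A, B, and D.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_web)
best = max(ranks, key=ranks.get)
```

In this toy graph the most-linked page, C, ends up with the highest score, and D, which nothing links to, ends up with the lowest.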

Page 5: PPRank 2015

Early View of Advertising

“Currently, the predominant business model for commercial search engines is advertising. The goals of the advertising business model do not always correspond to providing quality search to users….

For this type of reason and historical experience with other media [Bagdikian 83], we expect that advertising funded search engines will be inherently biased towards the advertisers and away from the needs of the consumers.”

Source: Brin & Page: The Anatomy of a Large Scale Hypertextual Web Search Engine, Appendix A, Advertising and Mixed Motives, Stanford University, 1998

Figure 1: High Level Google Architecture


Page 6: PPRank 2015

Page Rank in Context

1980s
• 1981 IBM PC Announced
• 1982 TCP/IP DoD standard
• 1983 IEEE Ethernet Standard
• 1989 Berners-Lee Invents the Web

1990s
• 1994 Mosaic Browser Launches
• 1998 Page Rank
• 1999 IEEE Standardizes WiFi 802.11b
• 1999 3GPP Release 99

2000s
• 2004 Google IPO
• 2006 Hinton's Breakthrough
• 2007 Apple Launches the iPhone
• 2007 Android Launches

2010s
• 2011 Google "Brain" Project
• 2011 Siri on iPhone 4s
• 2012 Google Knowledge Graph
• 2012 Deep Neural Net for Google Voice Search
• 2013 Google App Indexing
• 2014 Google Acquires Deep Mind

Eras
• PC / Wired / Narrowband
• PC / Wired / Broadband
• Smartphone / Wireless / Broadband

Page 7: PPRank 2015

Web Search-->Mobile Web Search

"In fact, more Google searches take place on mobile devices than on computers in 10 countries including the US and Japan."

May 5, 2015 http://adwords.blogspot.com/2015/05/building-for-next-moment.html

Page 8: PPRank 2015

Apps vs Web

"As web use migrates increasingly to smartphones — many publishers say 50 to 60 percent of their digital readers now come from mobile — people have spent less time on the web. This year, according to the research firm eMarketer, smartphone users in the United States are projected to spend 81 percent of their time using mobile apps instead."

NY Times, 10/7/2015 http://www.nytimes.com/2015/10/08/business/media/google-announces-service-to-speed-loading-of-news-articles.html?ref=business&_r=0

Page 13: PPRank 2015

Dead Web?

Page 14: PPRank 2015

Dead Web?

Page 15: PPRank 2015

Apple vs Google

“We did not enter the search business,” Jobs said. “They entered the phone business. Make no mistake, they want to kill the iPhone. We won’t let them,” he says. … This “don’t be evil” mantra: “It’s bullshit.”

Source: Steve Jobs, Wired, Jan. 30, 2010

http://www.wired.com/epicenter/2010/01/googles-dont-be-evil-mantra-is-bullshit-adobe-is-lazy-apples-steve-jobs/

Page 16: PPRank 2015

Does Apple Want to Kill the Open Web Google?

Page 17: PPRank 2015

App Indexing

• Apple announced deep linking to apps at WWDC in June as part of iOS9
• And…Apple's iOS 9 operating system will allow content blocking extensions to be added to Safari.
• Google has indexed 50 billion deep links to apps so far (source: Google Q2 2015 Earnings Call)
• Android and iOS9

Page 18: PPRank 2015

Apple iOS9 Search

Page 19: PPRank 2015

Apple iOS9 Search

Page 20: PPRank 2015

The Disappearing Search Box

http://insidesearch.blogspot.com/2015/05/now-on-tap.html

Page 21: PPRank 2015

What is Search?

Page 22: PPRank 2015

Search: Basic Model

Page 23: PPRank 2015

Is Search A Solved Problem?

Page 24: PPRank 2015

The Dream of Search

Yet, in many ways, we’re a million miles away from creating the search engine of my dreams, one that gets you just the right information at the exact moment you need it with almost no effort.  

Larry Page, 2013 Google Founders Letter, https://investor.google.com/corporate/2013/founders-letter.html

Page 25: PPRank 2015

Rethinking Search

• AI
• Machine Learning
• Especially Deep Learning

• Knowledge Bases

• Intelligent Personal Assistants

Page 26: PPRank 2015

Birth of AI

https://www.aaai.org/ojs/index.php/aimagazine/article/viewFile/1911/1809

Page 27: PPRank 2015

AI: Major Branches

Artificial Intelligence
• Machine Learning
• Symbol/Logic/Rules

Machine Learning
• Supervised
• Unsupervised
• Reinforcement
• Deep Learning

Page 28: PPRank 2015

Machine Learning Algorithms

https://s3.amazonaws.com/MLMastery/MachineLearningAlgorithms.png?__s=6ac4tvech5skhbnirvex

Page 29: PPRank 2015

Machine Learning: Definitions

Kulkarni, Parag. "Introduction to reinforcement and systemic machine learning." Reinforcement and Systemic Machine Learning for Decision Making (2012)

"The only thing you need to know is that machine learning applies statistical models to the data you have in order to make smart predictions about data you don't have." Harvard Business Review, vol. 93 number 11, Nov. 2015 p.38

Page 30: PPRank 2015

Machine Learning: Definitions

Machine learning is a scientific discipline that explores the construction and study of algorithms that can learn from data. Such algorithms operate by building a model based on inputs and using that to make predictions or decisions, rather than following only explicitly programmed instructions.

https://en.wikipedia.org/wiki/Portal:Machine_learning
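As one very small instance of "building a model based on inputs and using that to make predictions", the sketch below fits a straight line to a few observed points by ordinary least squares and then predicts a value for an input it has not seen. The data points and the function name are made up for illustration; real systems use far richer models.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical "data you have".
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 4.9, 7.2, 8.8]
slope, intercept = fit_line(xs, ys)


def predict(x):
    """Prediction for "data you don't have", using the fitted line."""
    return slope * x + intercept


prediction = predict(5.0)
```

Nothing in the code states the rule relating x to y; the slope and intercept are estimated from the examples, which is the distinction the definitions draw against explicitly programmed instructions.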

Page 31: PPRank 2015

Machine Learning is Eating the World

• Baidu’s Artificial-Intelligence Supercomputer Beats Google at Image Recognition, May 13, 2015 http://www.technologyreview.com/
• Scientists See Promise in Deep-Learning Programs, Nov. 23, 2012 http://www.nytimes.com/
• A Google Computer Can Teach Itself Games, Feb. 25, 2015 http://bits.blogs.nytimes.com/
• Amazon's Voice Controlled Echo Is Now Available to Anyone That Wants It, June 23, 2015 http://www.theverge.com
• Facebook Launches Advanced AI Effort to Find Meaning in Your Posts, Sept. 2013 http://www.technologyreview.com
• Google Acquires British Artificial Intelligence Developer, Jan. 27, 2014 http://dealbook.nytimes.com
• Amazon Launches Machine Learning-As-A-Service, April 10, 2015 http://www.informationweek.com
• Google Turning Its Lucrative Web Search Over to AI Machines, Oct. 26, 2015 http://www.bloomberg.com/
• Wolfram's Image Recognition Reflects A Big Shift in AI, May 15, 2015 http://www.wired.com

Page 32: PPRank 2015

Apple

Page 33: PPRank 2015

ML is Everywhere

"Google using ML in 47 products" Jeff Dean, 2015 NVIDIA GPU Technology Conference

Page 34: PPRank 2015

Google Voice Search

Posted: Thursday, September 24, 2015. Posted by Hasim Sak, Andrew Senior, Kanishka Rao, Françoise Beaufays and Johan Schalkwyk – Google Speech Team

Back in 2012, we announced that Google voice search had taken a new turn by adopting Deep Neural Networks (DNNs) as the core technology used to model the sounds of a language. These replaced the 30-year old standard in the industry: the Gaussian Mixture Model (GMM). DNNs were better able to assess which sound a user is producing at every instant in time, and with this they delivered greatly increased speech recognition accuracy.

Today, we’re happy to announce we built even better neural network acoustic models using Connectionist Temporal Classification (CTC) and sequence discriminative training techniques. These models are a special extension of recurrent neural networks (RNNs) that are more accurate, especially in noisy environments, and they are blazingly fast!

We are happy to announce that our new acoustic models are now used for voice searches and commands in the Google app (on Android and iOS), and for dictation on Android devices. In addition to requiring much lower computational resources, the new models are more accurate, robust to noise, and faster to respond to voice search queries - so give it a try, and happy (voice) searching!

Source: http://googleresearch.blogspot.com/2015/09/google-voice-search-faster-and-more.html

Page 35: PPRank 2015

RankBrain: From Words to Numbers

Semantic Equation: Rome - Italy + China = Beijing

"In the few months it has been deployed, RankBrain has become the third-most important signal contributing to the result of a search query….handling up to 15% of queries a day".
http://www.bloomberg.com/news/articles/2015-10-26/google-turning-its-lucrative-web-search-over-to-ai-machines

if you only knew that Rome was the capital of Italy, and were wondering about the capital of China, then the equation Rome - Italy + China would return Beijing.
http://deeplearning4j.org/word2vec.html

Google Turning Its Lucrative Web Search Over to AI MachinesOct. 26, 2015 http://www.bloomberg.com/
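The Rome - Italy + China = Beijing equation can be illustrated with a small sketch of vector arithmetic followed by a nearest-neighbor lookup. The four-dimensional vectors below are hand-made so that the arithmetic works out exactly; they are not trained embeddings, and nothing here reflects how RankBrain itself is implemented. The function names are illustrative.

```python
import math

# Hand-made toy vectors (an assumption for illustration, NOT learned from text).
# The first three dimensions encode the country, the last encodes "is a capital".
vectors = {
    "Italy":   [1.0, 0.0, 0.0, 0.0],
    "China":   [0.0, 1.0, 0.0, 0.0],
    "France":  [0.0, 0.0, 1.0, 0.0],
    "Rome":    [1.0, 0.0, 0.0, 1.0],
    "Beijing": [0.0, 1.0, 0.0, 1.0],
    "Paris":   [0.0, 0.0, 1.0, 1.0],
}


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c):
    """Return the word whose vector is closest to vec(a) - vec(b) + vec(c)."""
    query = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    # The three input words are excluded from the candidates, as is common practice.
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(query, vectors[w]))


answer = analogy("Rome", "Italy", "China")
```

With learned embeddings the result of the arithmetic is only approximately equal to the target vector, which is why the lookup uses a similarity measure rather than an exact match.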

Page 36: PPRank 2015

Hype Cycle

Source: Gartner http://www.gartner.com/newsroom/id/3114217

Page 37: PPRank 2015

Deep Learning Catalysts

“It has been obvious since the 1980s that backpropagation through deep autoencoders would be very effective for nonlinear dimensionality reduction, provided that computers were fast enough, data sets were big enough, and the initial weights were close enough to a good solution. All three conditions are now satisfied.”

Hinton, Geoffrey E., and Ruslan R. Salakhutdinov. "Reducing the dimensionality of data with neural networks." Science 313.5786 (2006): 504-507.

Page 38: PPRank 2015

What is Deep Learning?

Deep learning is a technology for endowing a machine with a learning capability by loosely mimicking the neural structure of the neocortex. Deep networks are an evolution of earlier "shallow" artificial neural networks.

“If you want to understand how the mind works, ignoring the brain is probably a bad idea."--Source: Geoff Hinton, quoted in Chronicle of Higher Education, Feb. 23, 2015

Page 39: PPRank 2015

Why Deep Learning?

Page 40: PPRank 2015

Deep Learning Neural Nets Thrive on Data

The other side of the information explosion

Source: IDC http://www.emc.com/leadership/digital-universe/2014iview/executive-summary.htm

Page 41: PPRank 2015

The Other Side of Info Explosion

The Unreasonable Effectiveness of Data
Halevy, A.; Norvig, P.; Pereira, F. Intelligent Systems, IEEE. Year: 2009, Volume: 24, Issue: 2, Pages: 8-12, DOI: 10.1109/MIS.2009.36

Page 42: PPRank 2015

Rock Stars of Deep Learning

Page 43: PPRank 2015

Rock Stars of Deep Learning

Page 44: PPRank 2015

Rock Stars of Deep Learning

Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.

Deep learning Yann LeCun, Yoshua Bengio & Geoffrey Hinton, Nature 521, 436–444 (28 May 2015) doi:10.1038/nature14539
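The role of backpropagation described in the abstract, using the output error to decide how each layer's parameters should change, can be sketched with a tiny network with one hidden layer trained on XOR. This is a minimal illustration under stated assumptions: the layer size, learning rate, epoch count, squared-error loss, and function names are all arbitrary choices, and a network this small is not "deep" in the sense the paper means.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_xor(hidden=3, epochs=2000, lr=0.5, seed=0):
    """Train a 2-input, one-hidden-layer network on XOR; return the loss per epoch."""
    rng = random.Random(seed)
    data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
    # w1[j] = two input weights plus a bias for hidden unit j.
    w1 = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(hidden)]
    # w2 = one weight per hidden unit plus a bias for the output unit.
    w2 = [rng.uniform(-1, 1) for _ in range(hidden + 1)]
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, target in data:
            # Forward pass: each layer computes its representation from the previous one.
            h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w1]
            y = sigmoid(sum(w2[j] * h[j] for j in range(hidden)) + w2[hidden])
            total += 0.5 * (y - target) ** 2
            # Backward pass: propagate the output error back through the layers.
            delta_out = (y - target) * y * (1.0 - y)
            delta_hidden = [delta_out * w2[j] * h[j] * (1.0 - h[j]) for j in range(hidden)]
            # Gradient-descent update of every weight.
            for j in range(hidden):
                w2[j] -= lr * delta_out * h[j]
                w1[j][0] -= lr * delta_hidden[j] * x[0]
                w1[j][1] -= lr * delta_hidden[j] * x[1]
                w1[j][2] -= lr * delta_hidden[j]
            w2[hidden] -= lr * delta_out
        losses.append(total)
    return losses


losses = train_xor()
```

Comparing `losses[0]` with `losses[-1]` shows whether the summed error over the four XOR cases fell during training; how far it falls depends on the random initial weights.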

Page 46: PPRank 2015

ML (Neural Nets) for Text

Text Understanding from Scratch

This article demonstrates that we can apply deep learning to text understanding from character level inputs all the way up to abstract text concepts, using temporal convolutional networks (LeCun et al., 1998) (ConvNets). We apply ConvNets to various large-scale datasets, including ontology classification, sentiment analysis, and text categorization. We show that temporal ConvNets can achieve astonishing performance without the knowledge of words, phrases, sentences and any other syntactic or semantic structures with regards to a human language. Evidence shows that our models can work for both English and Chinese.

arXiv:1502.01710v2 [cs.LG] 7 Apr 2015

Page 47: PPRank 2015

ML (Neural Nets) for Text

Natural Language Processing (Almost) from Scratch

We propose a unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks including part-of-speech tagging, chunking, named entity recognition, and semantic role labeling…. our system learns internal representations on the basis of vast amounts of mostly unlabeled training data.

Journal of Machine Learning Research 12 (2011) 2461-2505 Submitted 1/10; Revised 11/10; Published 8/11

Page 48: PPRank 2015

Knowledge Bases Emerge

Page 49: PPRank 2015

Knowledge Base: Application

• Understand query
• Understand content (web pages, documents, images…)
• Return answers
• Inference?

"We’ve gone up a level from just talking about the words to talking about what the thing actually is. In crawling and indexing documents we can now have an understanding of what the document is about. If the document is about famous tennis players we actually know it’s about sport and tennis. Every piece of information that we crawl, index, or search is analyzed in the context of Knowledge Graph. That’s not the same as completely understanding the text as you and I might do but it’s a step towards it."
--John Giannandrea, Google VP, Jan. 2014 http://www.technologyreview.com

Page 50: PPRank 2015

Fact Extraction

Automatic Extraction of Facts, Relations, and Entities for Web-Scale Knowledge Base Population
Ndapandula T. Nakashole
A dissertation presented for the degree of Doctor of Engineering in the Faculty of Natural Sciences and Technology, University of Saarland, Max Planck Institute for Informatics, Saarbrücken, Germany, 2012
http://people.mpi-inf.mpg.de/~nnakasho/thesis/nakashole-phd-thesis.pdf

Page 51: PPRank 2015

Knowledge Base Comparison

Name | Entities | Facts | Source
DBpedia | 4.6M | 3B triples | Wikipedia
YAGO | 10M | 120M | Wikipedia, WordNet, GeoNames
NELL (Carnegie Mellon) | 5.2M | 50M Candidates, 2.6M Confident | Machine learning, crawls open web
Knowledge Graph (Google) | 500M | 3.5B | Wikipedia, Freebase, and other sources
Knowledge Vault (Google) | 45M | 271M | Machine learning, crawls open web
Satori (Microsoft) | N/A | N/A | N/A
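The "facts" counted in the comparison are typically stored as subject-predicate-object triples. The sketch below shows, under simplifying assumptions, how a handful of triples can answer a direct question and support a simple two-step inference. The class, the predicate names, and the facts loaded are illustrative only and do not reflect the schema of any knowledge base listed here.

```python
from collections import defaultdict


class TripleStore:
    """A toy in-memory store of (subject, predicate, object) facts."""

    def __init__(self):
        self._by_subject = defaultdict(list)

    def add(self, subject, predicate, obj):
        self._by_subject[subject].append((predicate, obj))

    def objects(self, subject, predicate):
        """All objects o such that (subject, predicate, o) is a stored fact."""
        return [o for p, o in self._by_subject.get(subject, []) if p == predicate]


kb = TripleStore()
kb.add("Rome", "capital_of", "Italy")
kb.add("Beijing", "capital_of", "China")
kb.add("Italy", "located_in", "Europe")
kb.add("China", "located_in", "Asia")


def continent_of_capital(store, city):
    """Two-hop inference: city -> country it is the capital of -> continent."""
    return [
        continent
        for country in store.objects(city, "capital_of")
        for continent in store.objects(country, "located_in")
    ]


direct_answer = kb.objects("Rome", "capital_of")
inferred_answer = continent_of_capital(kb, "Beijing")
```

The second query returns a fact that was never stored directly, which is the kind of step the "Inference?" bullet on the application slide points at.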

Page 52: PPRank 2015

Knowledge Base: Examples

Page 53: PPRank 2015

Knowledge Base: Examples

Page 54: PPRank 2015

Knowledge Base: Q/A Example

Page 55: PPRank 2015

Intelligent Personal Assistants

Page 56: PPRank 2015

Intelligent Personal Assistants

Vendor | Product | Initial Release | Scope
Apple | Siri | October, 2011 | General
Google | Google Now | July, 2012 | General
Microsoft | Cortana | April, 2014 | General
Samsung | S Voice | May, 2012 | General
Baidu | Duer | Sept. 2015 | General
Amazon | Alexa | November, 2014 (Limited); June, 2015 (Wide) | General; embedded in Echo, but available for 3rd party hardware integration
Facebook | M | August, 2015 | General, human assisted
Nuance | Ask Nina | August, 2012 | Task specific – Customer care
x.ai | x.ai | Restricted beta | Task specific – Scheduling assistant
Viv | Viv | In development | General
Genee | Genee | In Beta | Task specific – Scheduling assistant
Soundhound | Hound | In Beta on Android | General
Clarity Lab, University of Michigan | Sirius | March, 2015 | General, Open Source IPA

Page 57: PPRank 2015

Intelligent Personal Assistants: Siri Patent

https://www.google.com/patents US Patent Application, Publication number: US20120016678 A1, Jan. 19, 2012, Original Assignee: Apple

Page 58: PPRank 2015

Philosophical Implications: The Limits of Probability?

Page 59: PPRank 2015

Limits of Probability?

Page 60: PPRank 2015

Limits of Probability?

"It must be recognized that the notion ‘probability of a sentence’ is an entirely useless one, under any known interpretation of this term."
--Noam Chomsky, cited in Manning, Christopher D. "Probabilistic syntax." Probabilistic linguistics (2003): 289-341.

"Thus one might assume that knowledge of a "universal grammar", in the widest sense, is an innate property of the mind, and that this given system of rules and principles determines the form and meaning of infinitely many sentences (and the infinite scope of our knowledge and belief) from the minute experiential base that is actually available to us."--Chomsky, Noam. Quine’s empirical assumptions. Springer Netherlands, 1969.

Page 61: PPRank 2015

Limits of Probability?

Page 62: PPRank 2015

Limits of Probability?

"Einstein said to make everything as simple as possible, but no simpler. Many phenomena in science are stochastic, and the simplest model of them is a probabilistic model; I believe language is such a phenomenon and therefore that probabilistic models are our best tool for representing facts about language, for algorithmically processing language, and for understanding how humans process language."

--Peter Norvig, On Chomsky and the Two Cultures of Statistical Learning, undated, http://norvig.com/chomsky.html.
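To make the disputed notion of the "probability of a sentence" concrete, here is a minimal sketch of one simple way such a number can be computed: a bigram model with add-one smoothing, estimated from a three-sentence toy corpus. The corpus, the function names, and the choice of smoothing are assumptions for illustration; the sketch is not taken from either author and does not settle the argument.

```python
from collections import Counter


def train_bigram(sentences):
    """Count word contexts and word pairs, with <s> and </s> as sentence boundaries."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        vocab.update(tokens)
        contexts.update(tokens[:-1])
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams, len(vocab)


def sentence_probability(sentence, model):
    """Product of smoothed conditional probabilities P(word | previous word)."""
    contexts, bigrams, vocab_size = model
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    prob = 1.0
    for prev, cur in zip(tokens, tokens[1:]):
        # Add-one (Laplace) smoothing keeps unseen word pairs from getting probability 0.
        prob *= (bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size)
    return prob


# Hypothetical toy corpus.
corpus = ["the dog barks", "the cat sleeps", "the dog sleeps"]
model = train_bigram(corpus)
p_seen = sentence_probability("the dog sleeps", model)
p_scrambled = sentence_probability("sleeps dog the", model)
```

Under this model the word order seen in the corpus scores higher than the scrambled order, while the scrambled sentence still receives a small nonzero probability because of the smoothing.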

Page 63: PPRank 2015

THANKS