22
INF 141 COURSE SUMMARY Crista Lopes

INF 141 Course summary

  • Upload
    mendel

  • View
    36

  • Download
    0

Embed Size (px)

DESCRIPTION

INF 141 Course summary. Crista Lopes. Lecture Objective. Know what you know. Problem Space of this course. “Big Data” How to collect it index it search it for relevant information. Industry segment of this course. Search engines Google MS Bing nameless others - PowerPoint PPT Presentation

Citation preview

Page 1: INF 141 Course summary

INF 141COURSE SUMMARYCrista Lopes

Page 2: INF 141 Course summary

Lecture Objective

Know what you know

Page 3: INF 141 Course summary

Problem Space of this course “Big Data” How to

collect it index it search it for relevant information

Page 4: INF 141 Course summary

Industry segment of this course Search engines

Google MS Bing nameless others

Web information retrieval is big $$$

Page 5: INF 141 Course summary

Technical content of this course

Engineering Math

Page 6: INF 141 Course summary

Lecture 2 Search engines history Search & advertising on the Web Web corpus

Page 7: INF 141 Course summary

Lecture 3 Characteristics of the Web

duplication, linkage, spam how big rate of change evolution

Characteristics of Web search Users reasons for searching characteristics of queries used behavior towards results the need behind the query

Page 8: INF 141 Course summary

Lecture 4 Search Engine Optimization

Page 9: INF 141 Course summary

Lectures 5 and 6 Web crawling

architecture of a crawling infrastructure algorithms constraints

Page 10: INF 141 Course summary

Lectures 7, 8 and 9 Index construction

what index is efficient data structures efficient algorithms for constructing it

Page 11: INF 141 Course summary

Lecture 10 Map Reduce Index compression

Page 12: INF 141 Course summary

Lecture 11 Retrieval

boolean zones TF metrics

Page 13: INF 141 Course summary

Lecture 12 Ranked Retrieval

weighting fields

Page 14: INF 141 Course summary

Lecture 13 Better scoring

TF-IDF Corpus-wide statistics

Page 15: INF 141 Course summary

Lecture 14 Vector Space model Score by magnitude (euclidian distance) Score by angle (cosine distance)

Page 16: INF 141 Course summary

Lecture 15 Language statistics Language processing

tokenizing stemming stopping

Link analysis PageRank

Page 17: INF 141 Course summary

Lecture 16 Hadoop

Page 18: INF 141 Course summary

Lecture 17 Retrieval performance: precision and

recall

Latent Semantic Analysis Singular Matrix Decomposition

Page 19: INF 141 Course summary

Lecture 18 Retrieval on LSI

Use of Latent Dirichlet Allocation (LSA)

Page 20: INF 141 Course summary

All together Search engines history Search & advertising on the Web Web corpus Characteristics of the Web Characteristics of Web search Users Search Engine Optimization Web crawling Index construction Index compression Map Reduce Boolean retrieval Parametric retrieval Scored retrieval TF-IDF and corpus-wide statistics Language statistics Language processing Link analysis (PageRank) Hadoop Precision and Recall LSI (and LDA)

Page 21: INF 141 Course summary

Big Data jobs plenty… not just traditional search

making sense of data

search on google

Page 22: INF 141 Course summary

Where to go from here Data mining Machine learning