Discovering Knowledge in and Extracting Information from Multimedia Patterns

Chin-Hui Lee
School of ECE, Georgia Institute of Technology
Atlanta, GA 30332, USA
chl@ece.gatech.edu

Center of Signal and Image Processing, Georgia Institute of Technology

(Most of the work was completed at Bell Labs; some was done while visiting NUS in 2001-2002)

ISIMP2004, HKPolyU, Oct. 21, 2004


Outline
• Rich content of heterogeneous media patterns
– Text, audio, video, speech, image, object, graphics, sketch, etc.
– The Web is becoming the largest multimedia database & playground
• 4M in human information processing technology
– Multimedia, multi-modal, multi-lingual, multi-disciplinary
• Technology dimensions (more language engineering)
– Parametrization, feature extraction, modeling, segmentation, etc.
– Coding, synthesis, recognition, verification, understanding, etc.
• Knowledge discovery and information extraction
– From spotting cues and events to understanding media patterns
• Summary and emerging opportunities

Evolution of Language and Media

[Figure: Historic Flow of Knowledge & Civilization. Labeled stages: Spoken Language; Written Language (3000 BC); Paper; Print (1450 AD); Recording; Telegraph & Telephone; Radio; Electronic Media (1900 AD); TV; Computer & Digital Processing; Internet & WWW; Hyper & Virtual Media ? (21st century)]

Growth in Network Traffic

[Figure: network traffic by year, 1996-2001. Voice traffic: 576 TB/day. Data traffic: 1178 TB/day at YE'00, 2136 TB/day at YE'01]

The Internet Explosion

[Figure labels, in the order they appear: Internet Hosts; CAGR since 1998: 100%; traffic; 2,000,000,000 Web pages; 75,000,000; 275,000,000 worldwide users]

Heterogeneous Multimedia Pages

Rich Content:
• Audio
• Video
• Image
• Speech
• Graphics
• Objects
• Comic Strips
• Files.xxx
• Links
• Multilingual

[Figure: Picasso's "Parade" (1917), on display at IFC, HK]

A picture is worth more than a thousand words?

Ubiquitous Wireless Access
(Mobile Info Access and Transactions)

[Figure labels: Devices; Air Access Interface; Network; Services; Internet; Corporate Networks]

Any wireless device, any air interface, any desired network, any service

Multimodal Access of Multimedia DBs
(Research & Business Opportunities for Info Intelligence)

[Figure: system block diagram. Blocks: User Model; User Input (Keyboard, Speech, MM-pad); Speech Recognizer; Text Processing; User Intent Understanding; Q&A Dialogue; Info Fusion & Retrieval; Multimedia Presentation; Audio/Video Recognizer; Audio/Video Rendering; A/V Browser; Indexed A/V Database; Raw A/V Database; Multimedia Processing; Multimedia Indexing; Video; Audio; Text; Info Fusion; Information Appliance; User Feedback; Network]

Human Information Technologies & 4M
• Multimedia Documents
– Audio, video, speech, image, text, chart, map, etc.
– Indexing, retrieval, presentation, rendering, etc.
• Multi-Modal Human-Machine Interface (HCI)
– Speech, gesture, point 'n' click, pen, MM sketch pad, etc.
– Multiple sensory inputs and feedbacks
• Multilingual Information Sources
– Multilingual human language understanding
– Multilingual presentation, cross-language referencing
• Multidisciplinary Collaborative Research
– Engineers, scientists, artists, psychologists, etc.
– Human factors, behavior science, wide range of soft topics

Human Language Engineering Abstraction
• Modeling of Input-Output Relationship
– Shannon's Channel Modeling and Decoding Paradigm
• Signal Processing of Linguistic Features
– e.g. latent semantic analysis and vector space representation
• Similarity Measures between Documents
– Clustering and modeling of linguistic events
• Machine Learning Techniques for Classifier Design
• Document Classification, Verification, Understanding
– Many research and business opportunities

Vector Space Representation of Queries & Documents (Latent Semantic Indexing)

[Figure: vector space plot with the labels: Credit Card Services; Deposit Services; Consumer Lending; Home Equity Service; Loan Servicing]

Query Vector Feature Extraction
• Text pre-processing (SMART, Salton, 1971)
– extract the root form of a word, e.g. check for checking
– remove ignore words, e.g. um, uh
– remove stop words, e.g. I would like to
– count occurrences of the remaining key terms

[Figure: processing pipeline. Speech → ASR → Text → Morphological Filtering → Query-Vector Extraction → Query Vector, using a Stop/Ignore List and a Key Term List]
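The pre-processing steps above can be sketched roughly as follows. This is only an illustration: the word lists, the crude suffix stripping in `root_form`, and the function names are assumptions made for the example, not the SMART system's actual rules.

```python
from collections import Counter

# Hypothetical lists, chosen only for this example.
IGNORE_WORDS = {"um", "uh"}
STOP_WORDS = {"i", "would", "like", "to", "my", "the", "a"}
SUFFIXES = ("ing", "ed", "s")


def root_form(word):
    """Very crude root extraction: strip one common suffix if enough remains."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def query_vector(text, key_terms):
    """Count occurrences of each key term after filtering and root extraction."""
    tokens = [t.strip(".,?!").lower() for t in text.split()]
    kept = [
        root_form(t)
        for t in tokens
        if t and t not in IGNORE_WORDS and t not in STOP_WORDS
    ]
    counts = Counter(kept)
    return [counts[term] for term in key_terms]


if __name__ == "__main__":
    terms = ["check", "account", "balance", "loan"]
    print(query_vector("Um, I would like to check my checking account balance", terms))
```

Here ignore and stop words are filtered before root extraction so that the lists can hold surface forms; a real system would use a proper morphological analyzer.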

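LSA Based Feature Extraction
• LSA matrix (also known as routing matrix) $C$, with entries

$$c_{ij} = (1 - \varepsilon_i)\, \frac{n_{ij}}{n_{\cdot j}}$$

– number of times word $w_i$ occurs in $a_j$: $n_{ij}$
– total number of words present in $a_j$: $n_{\cdot j}$ (column sum)
– total number of times $w_i$ occurs in $A$: $n_{i\cdot}$ (row sum)
– "indexing" power of $w_i$ in corpus $A$: $\eta_i = 1 - \varepsilon_i$
– normalized entropy:

$$\varepsilon_i = -\frac{1}{\log N} \sum_{j=1}^{N} \frac{n_{ij}}{n_{i\cdot}} \log \frac{n_{ij}}{n_{i\cdot}}, \qquad 0 \le \varepsilon_i \le 1$$

$$\varepsilon_i = \begin{cases} 0 & \text{if } n_{ij} = n_{i\cdot} \text{ (maximum indexing power)} \\ 1 & \text{if } n_{ij} = n_{i\cdot}/N \text{ (equally probable, no power)} \end{cases}$$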
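As a rough numerical illustration of the entropy-weighted matrix on this slide, the sketch below assumes a small term-by-document count matrix (rows are words, columns are documents) and at least two documents. The function name `lsa_matrix` and the toy counts are made up for the example.

```python
import numpy as np


def lsa_matrix(counts):
    """Return (C, entropy): entropy-weighted, column-normalized counts."""
    counts = np.asarray(counts, dtype=float)
    n_docs = counts.shape[1]                      # N, assumed to be >= 2
    row_sum = counts.sum(axis=1, keepdims=True)   # total occurrences of each word
    col_sum = counts.sum(axis=0, keepdims=True)   # total words in each document

    # Distribution of each word over documents (zero rows stay zero).
    p = np.divide(counts, row_sum, out=np.zeros_like(counts), where=row_sum > 0)

    # p * log(p), with the convention 0 * log(0) = 0.
    plogp = np.zeros_like(p)
    mask = p > 0
    plogp[mask] = p[mask] * np.log(p[mask])

    entropy = -plogp.sum(axis=1) / np.log(n_docs)  # normalized to [0, 1]
    weights = 1.0 - entropy                        # "indexing power" of each word

    rel_freq = np.divide(counts, col_sum, out=np.zeros_like(counts), where=col_sum > 0)
    return weights[:, None] * rel_freq, entropy


if __name__ == "__main__":
    toy = [[4, 0],   # word concentrated in one document: maximum indexing power
           [1, 1]]   # word spread evenly: no indexing power
    C, eps = lsa_matrix(toy)
    print(eps)
    print(C)
```

A word that occurs in only one document keeps its full relative frequency, while a word spread evenly over all documents is weighted down to zero.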

LSA Feature Space
• Mapping into the latent semantic space $S$ via SVD

$$C \approx T S D^{t}$$

with sizes on the order of 10,000, 100,000 and 150-200 (presumably $M$, $N$ and $R$, respectively)

– each document vector $a_j$ (one of the $N$ column vectors of matrix $C$) is mapped to a $(1 \times R)$ vector $v_j S$, where $v_j$ is the $j$-th row of $D$
– each term vector $b_i$ (one of the $M$ row vectors of matrix $C$) is mapped to a $(1 \times R)$ vector $u_i S$, where $u_i$ is the $i$-th row of $T$
– each query vector (a new $M \times 1$ vector) is mapped to a $(1 \times R)$ vector through the pseudo-document vector
– closeness is much easier to measure in the space $S$, for both document-document and term-term comparisons
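A minimal NumPy sketch of this mapping is given below. It assumes the standard LSA "folding-in" convention for queries, in which the pseudo-document is the query projected onto the retained left singular vectors and rescaled by the inverse singular values. The function names and the toy matrix are illustrative only.

```python
import numpy as np


def lsa_space(C, R):
    """Truncated SVD C ~= T @ S @ D.T keeping the R largest singular values."""
    U, s, Vt = np.linalg.svd(np.asarray(C, dtype=float), full_matrices=False)
    T, S, D = U[:, :R], np.diag(s[:R]), Vt[:R].T
    term_vecs = T @ S   # one (1 x R) row per term
    doc_vecs = D @ S    # one (1 x R) row per document
    return T, S, D, term_vecs, doc_vecs


def fold_in_query(q, T):
    """Map a new length-M query into the same space as the scaled document vectors.

    The pseudo-document is q @ T @ inv(S); scaling it by S gives q @ T.
    """
    return np.asarray(q, dtype=float) @ T


if __name__ == "__main__":
    C = np.array([[2, 0, 1, 0],
                  [1, 1, 0, 0],
                  [0, 3, 0, 1],
                  [0, 0, 2, 2],
                  [1, 0, 0, 1]], dtype=float)
    T, S, D, term_vecs, doc_vecs = lsa_space(C, R=2)
    # Folding in an existing column reproduces that document's vector.
    print(np.allclose(fold_in_query(C[:, 1], T), doc_vecs[1]))
```

Because the columns of `T` are orthonormal, folding in a column of `C` lands exactly on the corresponding document vector, which is a convenient sanity check.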

Confidence Scoring
• Inner product: $s(x, y) = x \cdot y^{t}$
• Cosine: $s(x, y) = \dfrac{x \cdot y^{t}}{\|x\|\,\|y\|}$, or the angle $\cos^{-1}[s(x, y)]$
• Confidence scoring by sigmoid function fitting: $\mathrm{Conf}(s; \alpha, \beta) = \left[1 + e^{-(\alpha s + \beta)}\right]^{-1}$
• Other scores
– Euclidean, Manhattan, etc.
• Generalized scores
– between any two vectors: $s(x, y) = f(x, y; \Gamma)$
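The cosine score and the sigmoid confidence mapping can be sketched as below. The values of `alpha` and `beta` in the usage example are arbitrary placeholders; in practice they would be fitted on held-out scores, which this sketch does not attempt.

```python
import math

import numpy as np


def cosine_score(x, y):
    """Cosine of the angle between two vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))


def confidence(score, alpha, beta):
    """Sigmoid mapping of a raw score to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-(alpha * score + beta)))


if __name__ == "__main__":
    s = cosine_score([1.0, 2.0, 0.0], [2.0, 4.0, 0.1])
    # alpha and beta are illustrative values, not fitted parameters.
    print(s, confidence(s, alpha=8.0, beta=-4.0))
```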

Term Clustering
• $MM^{T}$ characterizes all co-occurrences between terms; the $(i, j)$ cell of $MM^{T}$ indicates the similarity between $w_i$ and $w_j$
• Define a distance measure

$$K(w_i, w_j) = \cos(u_i S, u_j S) = \frac{u_i S^{2} u_j^{T}}{\|u_i S\|\,\|u_j S\|}$$

$$D(w_i, w_j) = \cos^{-1} K(w_i, w_j)$$

• Given $D$, one can perform word clustering using any clustering algorithm, e.g. K-means
• For document clustering, use $M^{T}M$ instead
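One possible way to cluster terms with an angle-based distance is sketched below. Note that this is a spherical K-means variant (vectors are normalized and assigned by largest cosine, which is the same as smallest angle), not plain Euclidean K-means; the function names, the fixed iteration count, and the toy vectors are assumptions made for the example.

```python
import numpy as np


def angular_distance(a, b):
    """Angle between two vectors: the arccosine of their cosine similarity."""
    cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))


def spherical_kmeans(vectors, k, iters=20, seed=0):
    """Cluster row vectors by direction; returns one label per row."""
    rng = np.random.default_rng(seed)
    vectors = np.asarray(vectors, dtype=float)
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    centers = unit[rng.choice(len(unit), size=k, replace=False)]
    labels = np.zeros(len(unit), dtype=int)
    for _ in range(iters):
        # Largest cosine to a center is the same as smallest angular distance.
        labels = np.argmax(unit @ centers.T, axis=1)
        for c in range(k):
            members = unit[labels == c]
            if len(members):
                mean = members.sum(axis=0)
                centers[c] = mean / np.linalg.norm(mean)
    return labels


if __name__ == "__main__":
    # Toy stand-ins for scaled term vectors: two obvious directions.
    terms = np.array([[1.0, 0.1], [1.0, 0.2], [0.1, 1.0], [0.2, 1.0]])
    print(spherical_kmeans(terms, k=2))
    print(angular_distance(terms[0], terms[2]))
```

For document clustering, the same routine can be applied to the scaled document vectors instead of the term vectors.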

K-Means Term Clustering Example
• 9492 words into 100 clusters (one example)

oub bank Singapore cent uob db account share singtel trade Bangkok manage save entity annual ocbc tangible debt sti keppel custom transact currency deposit card sixth citibank integer subscribe handset creation loan auditor merger autom merge sharehold attract uncondi asx optu sembawang ibra restructur singland landlord uic yaw sgx

Document Clustering Example
• 2000 documents into 100 clusters (one example)

– N Korea Proposes Resumed Talks with S Korea-Yonhap
– North Korea Proposes Resuming Talks with Seoul
– South Korea Set for Key Vote on Approach to North
– Korea to Replace Four to Eight Ministers on Friday
– S.Korea to Push North Policy Despite Kim Setback
– ……

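Conventional View on PR

[Figure: an unknown pattern $d_j$ is passed to classifiers $T_1, \ldots, T_i, \ldots, T_m$, which output the labels $L_1(d_j), \ldots, L_i(d_j), \ldots, L_m(d_j)$; $L_m(d_j)$ is the label assigned by the $m$-th classifier]

Modeling and recognition units are the same!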

Shannon's Channel Modeling Paradigm – An Information Theoretic Perspective

[Figure: input $I$ → Channel $P(O|I)$ → output $O$ → Channel Decoder → $\hat{I}$]

$$\hat{I} = \arg\max_{I \in \Gamma} P(I|O) = \arg\max_{I \in \Gamma} \frac{P(O|I)\, P(I)}{P(O)}$$

• The channel input is hidden (unobserved), while the output is observed and used to infer the input (which is often approximated by a structural Markov model in many problems in speech, language and MM processing)
• Channel modeling with $(I, O)$ pairs in training
• Modeling units are usually smaller than recognition units
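The decoding rule above can be illustrated with a toy example. Because $P(O)$ does not depend on the input, the decoder only needs to maximize $P(O|I)\,P(I)$. The words and probabilities below are invented for illustration, loosely in the spirit of an OCR confusion.

```python
def decode(observation, prior, channel):
    """Return the input I maximizing P(O|I) * P(I) for the given observation.

    prior:   dict mapping each candidate input to P(I)
    channel: dict mapping (observation, input) pairs to P(O|I)
    """
    return max(prior, key=lambda i: channel.get((observation, i), 0.0) * prior[i])


if __name__ == "__main__":
    # Hypothetical numbers: the scanned word reads "dear".
    channel = {("dear", "dear"): 0.9, ("dear", "clear"): 0.2}
    print(decode("dear", {"dear": 0.3, "clear": 0.7}, channel))   # channel evidence wins
    print(decode("dear", {"dear": 0.1, "clear": 0.9}, channel))   # strong prior wins
```

The second call shows how a sufficiently strong prior $P(I)$ can override the channel evidence.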

Other Applications in Pattern Recognition

Application                 Input                 Output               P(I)                    P(O|I)
Optical Char. Recognition   Actual Letters        Noisy Letters        Character (Letter) LM   OCR Error Model
Part-of-Speech Tagging      POS Tag Sequence      Word Sequence        POS Tag LM              Tagging Model
Parsing                     Parse Tree            Word Sequence        LM of Derivations       Parsing Model
Text Understanding          Semantic Concept      Word Sequence        Concept LM              Semantic Model
Machine Translation         Source Sentence       Target Sentence      Source LM               Translation Model
Bioinformatics              Actual DNA Sequence   Noisy DNA Sequence   LM of Nucleotides       Bio-genetic Model

Modeling Input-Output Associations
• Artificial Neural Network (ANN)
– MLP functional approximation and input-output mapping
• Classification and Regression Tree (CART)
– Multi-layer tree approximation
• Support Vector Machine (SVM) and LVQ
• Kernel-based, mixture of experts, Bayesian network
• Other Machine Learning Techniques
• Many New Applications
– Rule induction, statistical parsing, machine translation, etc.
– Pronunciation modeling and multilingual transliteration
– Information retrieval, text categorization, and call routing

Hidden Markov Model (HMM) - Dynamic Time or Space Warping

$$X = (x_1, x_2, x_3, \ldots, x_T)$$

$$P_\Lambda(X|C) = \sum_{q} P_\Lambda(X, q|C)$$

$$P_\Lambda(X, q|C) = a_0 \prod_{t} a_{q_{t-1} q_t}\, b_{q_t}(x_t)$$

• Each state represents a process of measurable observations.
• Inter-process transition is governed by a finite-state Markov chain.
• Processes are stochastic, and individual observations do not immediately identify the hidden state.

HMM models spectral and temporal variations simultaneously!
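The sum over all state sequences $q$ is usually computed with the forward recursion rather than by enumeration. The sketch below assumes a discrete-observation HMM and treats the leading factor as an initial-state distribution `pi`; both are assumptions for the example, as are the toy parameters. A brute-force enumeration is included only to check the recursion.

```python
import itertools

import numpy as np


def forward_likelihood(pi, A, B, obs):
    """P(X | model) via the forward recursion, O(T * n^2)."""
    alpha = pi * B[:, obs[0]]
    for x in obs[1:]:
        alpha = (alpha @ A) * B[:, x]
    return float(alpha.sum())


def brute_force_likelihood(pi, A, B, obs):
    """Same quantity by summing P(X, q) over every state sequence q."""
    total = 0.0
    for q in itertools.product(range(len(pi)), repeat=len(obs)):
        p = pi[q[0]] * B[q[0], obs[0]]
        for t in range(1, len(obs)):
            p *= A[q[t - 1], q[t]] * B[q[t], obs[t]]
        total += p
    return total


if __name__ == "__main__":
    pi = np.array([0.6, 0.4])                 # initial-state probabilities
    A = np.array([[0.7, 0.3], [0.2, 0.8]])    # A[i, j] = P(next state j | state i)
    B = np.array([[0.9, 0.1], [0.3, 0.7]])    # B[i, k] = P(symbol k | state i)
    obs = [0, 1, 1, 0]
    print(forward_likelihood(pi, A, B, obs), brute_force_likelihood(pi, A, B, obs))
```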

Text Categorization: Training Classifiers

[Figure: training pipeline. Training set for each category $C_i$, $i = 1, \ldots, m$ (positive + negative) → (1) Feature Extraction & Reduction → document in new feature space → (2) Classifier Learning → classifier $T_i$ for category $C_i$]

Related Work on Classifier Design
• Decision tree: simple, popular, and powerful classifier. Many available tools: C4.5, CART, ID3
• Linear discriminant function: $f(X; W) = \sum_{i \ge 1} w_i x_i - w_0$
• Support Vector Machine (SVM)
• Naïve Bayes: simple distributions for each class
• K-Nearest Neighbor (kNN)
• Semantic Perceptron Net (SPN)
• Hidden Markov Model (HMM)
• Discriminative training
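As a small illustration of the linear discriminant function listed on this slide, the sketch below scores a feature vector as a weighted sum minus a threshold and accepts the category when the score is positive. The weights, threshold, and feature values are hypothetical and would normally come from training.

```python
def linear_discriminant(x, w, w0):
    """Weighted sum of the features minus the threshold w0."""
    return sum(wi * xi for wi, xi in zip(w, x)) - w0


def accepts(x, w, w0):
    """Accept the category when the discriminant is positive."""
    return linear_discriminant(x, w, w0) > 0


if __name__ == "__main__":
    # Hypothetical weights for the key terms (check, account, loan).
    w, w0 = [0.5, 0.25, -1.0], 0.5
    print(accepts([2, 1, 0], w, w0))   # check/account query
    print(accepts([0, 0, 1], w, w0))   # loan query
```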

Reading Tables in Documents (TTS)

               TODAY'S                 YESTERDAY'S
COMPANY        OPEN        CHANGE      OPEN        CHANGE
BLUE INC       75 1/2      + 1 1/8     74 9/16     - 4 1/4
GREEN.COM      89 1/4      + 2         88 5/8      - 2 13/16
RED INC        22 1/4      + 5/16      21 13/16    - 3/8
YELLOW LTD     103 3/8     - 1 13/16   101         - 4
PURPLE INC     27 11/16    - 2 5/8     27 5/8      - 1 1/8
BROWN.COM      68          + 11/16     66 11/16    - 1 5/8
PINK LTD       130 7/16    + 1 1/16    130         - 2 3/8

Document understanding is needed before rendering!

Web Information Access & Presentation

[Figure: a news page (HTML) with the headline "Sampras volunteers for Davis Cup doubles duty" is processed into news content (text), a summary, and links]

• Web data mining
• Web content extraction
• Topic detection and automatic summarization
• Information rendering and presentation
• Q&A construction for natural interface

Image Segmentation & Annotation

• Concept definition needed?
• What is image understanding?

[Figure: example image annotated with "Building, sky, lake, tree, landscape"]

Concept vs. Content Based Search

[Figure labels: Google; ATR ConceptSearch; Query Taxonomy]

Multilingual IA (IIS/Taiwan)

[Figure: six images, each shown with its top 4 keywords]

– 彩虹 (Rainbow), 天氣 (Weather), 花 (Flower), 自然 (Nature)
– 向日葵 (Sunflower), 花 (Flower), 植物 (Plant), 沙漠 (Desert)
– 海豹 (Seal), 哺乳類 (Mammal), 海岸 (Coast), 動物 (Animal)
– 太陽系 (Solar System), 慧星 (Comet), 熱帶魚 (Tropical Fish), 太空 (Universe)
– 瀑布 (Waterfall), 地形 (Landform), 自然 (Nature), 蟑螂 (Cockroach)
– 狗 (Dog), 哺乳類 (Mammal), 穿山甲 (Pangolin), 羊 (Sheep)

Cross-Language Web Search (IIS/Taiwan)
• A Web search service that allows users to query in one language and search documents that are written or indexed in another language.

Audio Segmentation & Annotation
(DP-based; often involves segmental models such as HMMs)

[Figure labels: Audio; Speech]
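One common form of DP-based segmentation with an HMM is Viterbi decoding followed by collapsing the best state path into runs. The sketch below is a generic illustration under that assumption, with two states and quantized frame observations; it is not the specific system shown on the slide, and all parameter values are invented.

```python
import numpy as np


def viterbi(log_pi, log_A, log_B, obs):
    """Most likely state path for a discrete-observation HMM (log domain)."""
    n = len(log_pi)
    delta = log_pi + log_B[:, obs[0]]
    back = np.zeros((len(obs), n), dtype=int)
    for t in range(1, len(obs)):
        scores = delta[:, None] + log_A          # scores[i, j]: come from i, go to j
        back[t] = np.argmax(scores, axis=0)      # best predecessor of each state j
        delta = scores[back[t], np.arange(n)] + log_B[:, obs[t]]
    path = [int(np.argmax(delta))]
    for t in range(len(obs) - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]


def segments(path):
    """Collapse a state path into (state, first_frame, last_frame) runs."""
    runs, start = [], 0
    for t in range(1, len(path) + 1):
        if t == len(path) or path[t] != path[start]:
            runs.append((path[start], start, t - 1))
            start = t
    return runs


if __name__ == "__main__":
    # Hypothetical two-state model: state 0 = non-speech, state 1 = speech.
    log_pi = np.log([0.5, 0.5])
    log_A = np.log([[0.9, 0.1], [0.1, 0.9]])     # states tend to persist
    log_B = np.log([[0.9, 0.1], [0.1, 0.9]])     # frames mostly match their state
    frames = [0, 0, 0, 1, 1, 1, 0, 0]
    print(segments(viterbi(log_pi, log_A, log_B, frames)))
```

The self-transition probabilities control how reluctant the decoder is to open a new segment, which is what keeps isolated noisy frames from fragmenting the output.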

SpeechFind: Speech & Speaker Annotation
Fully searchable online database of spoken word collections spanning the 20th century

http://svoice.colorado.edu (Bowen Zhou)

Video & Audio Segmentation
(Story Segmentation of Audiovisual Documents)

Video Clip Browsing over IP on 3G

From Web Search to Web Mining
• Exploring the development of advanced IR techniques through Web mining

[Figure labels: Search Engine; Knowledge Discovery & Info Extraction; Weblogs, texts, images, …]

• Cross-language IR
• Concept search
• Personalized search
• Multimedia search

• Anchor texts
• Query term logs
• Query session logs
• Audio/image banks

• Language info
• Speaker profile
• Image semantics
• Background info
• Term extraction
• Face/object ID
• Etc.

Personal Media: A New Scenario

[Figure: workflow with the stages Specification, Media Space, Composition, Presentation; other labels: media mining, content analysis, semantic analysis, media server, navigation, authored story]

Summary
• Rich content of heterogeneous media patterns
– Text, audio, video, speech, image, object, graphics, sketch, etc.
– The Web is becoming the largest multimedia database & playground
• 4M in human information processing technology
– Multimedia, multi-modal, multi-lingual, multi-disciplinary
• Technology dimensions
– Parametrization, feature extraction, modeling, segmentation, etc.
– Coding, synthesis, recognition, verification, understanding, etc.
• Knowledge discovery and information extraction
– Spotting cues/events embedded in unconstrained media patterns
• Many emerging research opportunities
