DrillDown: Interactive Retrieval of Complex Scenes Using Natural Language Queries

When we’d like to retrieve an image of a complex scene

Difficult to describe the whole scene in one sentence

Image Search Engine

Single sentence as query

No refinement (no interaction)

Find a specific image in our gallery album

or online image collection

Image Retrieval with Multiple Rounds of Queries

Drill-down: Interactive Retrieval of Complex Scenes using Natural Language Queries. Fuwen Tan, Paola Cascante-Bonilla, Xiaoxiao Guo, Hui Wu, Song Feng, Vicente Ordonez. Conf. on Neural Information Processing Systems. NeurIPS 2019. Vancouver, Canada. December 2019.

Previous efforts on Image-Text Matching

Two women sitting on the sofa

Woman in white shirt holding a dog

Woman in yellow shirt holding a cat

CNN RNN

1D Feature Space

[1] DeViSE: A Deep Visual-Semantic Embedding Model. Andrea Frome, Greg S. Corrado, Jonathon Shlens, Samy Bengio, Jeffrey Dean, Marc’Aurelio Ranzato, Tomas Mikolov. NIPS 2013.

[2] Deep Fragment Embeddings for Bidirectional Image Sentence Mapping. Andrej Karpathy, Armand Joulin, Li Fei-Fei. NIPS 2014.
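The single-vector matching summarized on this slide can be illustrated with a minimal sketch. It assumes each image and each sentence has already been encoded into one vector in a shared space, and that candidates are ranked by cosine similarity. The embeddings and image names below are made up for illustration; a real system would take them from the CNN and RNN encoders.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_images(sentence_vec, image_vecs):
    """Return image ids sorted from most to least similar to the sentence."""
    scores = {img_id: cosine(sentence_vec, vec) for img_id, vec in image_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical 3-d embeddings (assumption: real ones come from trained encoders).
image_vecs = {
    "sofa_two_women": [0.9, 0.1, 0.0],
    "dog_in_park": [0.1, 0.9, 0.1],
    "cat_on_bed": [0.0, 0.2, 0.9],
}
sentence_vec = [0.8, 0.2, 0.1]  # stands in for "Two women sitting on the sofa"

ranking = rank_images(sentence_vec, image_vecs)
print(ranking)
```

Because the whole sentence collapses into one vector, two different women described in the same query would share that vector, which is the limitation the following slides address.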

Previous efforts on Image-Text Matching

[3] Unified Visual-Semantic Embeddings: Bridging Vision and Language with Structured Meaning Representations. Hao Wu, Jiayuan Mao, Yufeng Zhang, Yuning Jiang, Lei Li, Weiwei Sun, Wei-Ying Ma. CVPR 2019.

Observations

Feature channels

2D image representation can help distinguish instances sharing the same feature subspace

Observations

Feature channels

1D sentence representation can NOT distinguish instances sharing the same feature subspace

Observations

Feature channels

2D sentence representation

“person” subspace

“dog” subspace

“cat” subspace

Instance1

Instance2

Instance3

We still want compact representations

Especially, if it is for retrieval applications

Feature vector 1

Sentence 1

Text input

Pre-allocated state vectors

Text feature

Action: which state vector to update

Update the state vector
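The loop described on these slides (encode the text, pick a state vector, update it) can be sketched as follows. This is only an illustration under stated assumptions: the slides do not specify how the action is chosen or how the update is computed, so `choose_slot` uses a hypothetical similarity-threshold rule and `update_slot` uses a simple blend. Both are stand-ins, not the authors' learned components.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def choose_slot(states, text_feat, threshold=0.8):
    """Stand-in action rule (assumption): reuse the most similar filled slot
    if it is similar enough, otherwise take the first empty slot."""
    best_idx, best_sim, empty_idx = None, -1.0, None
    for i, state in enumerate(states):
        if not any(state):  # an all-zero vector marks an unused slot
            if empty_idx is None:
                empty_idx = i
            continue
        sim = cosine(state, text_feat)
        if sim > best_sim:
            best_idx, best_sim = i, sim
    if best_idx is not None and best_sim >= threshold:
        return best_idx
    return empty_idx if empty_idx is not None else best_idx

def update_slot(state, text_feat, gate=0.5):
    """Stand-in update (assumption): blend the old state with the new text feature."""
    if not any(state):
        return list(text_feat)
    return [(1.0 - gate) * s + gate * t for s, t in zip(state, text_feat)]

# Three pre-allocated state vectors and three hypothetical text features.
states = [[0.0, 0.0, 0.0] for _ in range(3)]
queries = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]

chosen = []
for text_feat in queries:
    idx = choose_slot(states, text_feat)
    states[idx] = update_slot(states[idx], text_feat)
    chosen.append(idx)

print(chosen)
```

In this toy run the third query resembles the first, so it refines the same state vector instead of occupying a new one, which keeps the representation compact.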

Pairwise alignment between state vectors and image regions
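One plausible way to turn this pairwise alignment into a single query-image score is sketched below. The aggregation (best-matching region per state vector, averaged over the filled state vectors) is an assumption made for illustration; the slides only state that state vectors and image regions are aligned pairwise. All vectors are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def alignment_score(states, regions):
    """Average, over filled state vectors, of the best cosine match among regions."""
    active = [s for s in states if any(s)]  # skip unused (all-zero) slots
    if not active or not regions:
        return 0.0
    return sum(max(cosine(s, r) for r in regions) for s in active) / len(active)

# Two filled state vectors and one unused slot (hypothetical values).
states = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]

# Region features for two candidate images (hypothetical values).
image_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # contains both described instances
image_b = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]  # contains only one of them

score_a = alignment_score(states, image_a)
score_b = alignment_score(states, image_b)
print(score_a, score_b)
```

Image A scores higher because every filled state vector finds a matching region, while image B matches only one of the two.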

Simulated queries through region-phrase annotations at training time
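A rough sketch of how such queries might be simulated: given the region-phrase annotations of a target image, sample a few phrases and feed them in as successive rounds. The helper name `simulate_queries` and the uniform sampling without replacement are assumptions for illustration, not the authors' procedure.

```python
import random

def simulate_queries(region_phrases, num_rounds, seed=None):
    """Sample up to num_rounds distinct region phrases as a query sequence."""
    rng = random.Random(seed)
    k = min(num_rounds, len(region_phrases))
    return rng.sample(region_phrases, k)

# Phrases in the style of the earlier examples, standing in for annotations.
region_phrases = [
    "Two women sitting on the sofa",
    "Woman in white shirt holding a dog",
    "Woman in yellow shirt holding a cat",
]

queries = simulate_queries(region_phrases, num_rounds=2, seed=0)
for round_idx, query in enumerate(queries, start=1):
    print(f"round {round_idx}: {query}")
```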

Human queries

Quantitative evaluation on a test set of 10000 images
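The slides do not name the metric used here. Retrieval systems are commonly scored with recall@K (the fraction of queries whose target image appears among the top K results), so the sketch below assumes that metric purely for illustration. The rankings and targets are toy values, not results from the evaluation.

```python
def recall_at_k(rankings, targets, k):
    """Fraction of queries whose target appears in the top-k ranked results."""
    if not targets:
        return 0.0
    hits = sum(1 for ranked, target in zip(rankings, targets) if target in ranked[:k])
    return hits / len(targets)

# Toy data: three queries, each with a ranked list of image ids and one target.
rankings = [
    ["a", "b", "c"],
    ["b", "a", "c"],
    ["c", "b", "a"],
]
targets = ["a", "a", "a"]

r1 = recall_at_k(rankings, targets, 1)
r2 = recall_at_k(rankings, targets, 2)
r3 = recall_at_k(rankings, targets, 3)
print(r1, r2, r3)
```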

Although the more state vectors, the better, we could have an even more compact representation

Quantitative evaluation on a test set of 10000 images

Target

Future work: instance aware text encoder for dialog based applications?

Potential challenges:

● Named entity detection

● Coreference resolution

● Negation

● ...
