
Mobile AR Lecture 10 - Research Directions


Page 1: Mobile AR Lecture 10 - Research Directions

LECTURE 10: RESEARCH DIRECTIONS IN MOBILE AR

Mark Billinghurst [email protected]

Zi Siang See [email protected]

November 29th-30th 2015

Mobile-Based Augmented Reality Development

Page 2: Mobile AR Lecture 10 - Research Directions

Looking to the Future

Page 3: Mobile AR Lecture 10 - Research Directions

The Future is with us
It takes at least 20 years for new technologies to go from the lab to the lounge.

“The technologies that will significantly affect our lives over the next 10 years have been around for a decade.

The future is with us. The trick is learning how to spot it. The commercialization of research, in other words, is far more about prospecting than alchemy.”

Bill Buxton

Oct 11th 2004

Page 4: Mobile AR Lecture 10 - Research Directions

Research Directions
• Tracking
  • Markerless tracking, hybrid tracking
• Interaction
  • Displays, input devices, gesture, social
• Applications
  • Collaboration
• Scaling up
  • User evaluation, novel AR/MR experiences

Page 5: Mobile AR Lecture 10 - Research Directions

TRACKING

Page 6: Mobile AR Lecture 10 - Research Directions

Sensor Tracking
• Used by many “AR browsers”
• GPS, compass, accelerometer, (gyroscope)
• Not sufficient alone (drift, interference)

Page 7: Mobile AR Lecture 10 - Research Directions

Combining Sensors and Vision
• Sensors
  • Produce noisy output (= jittering augmentations)
  • Are not sufficiently accurate (= wrongly placed augmentations)
  • Give us a first estimate of where we are in the world and what we are looking at
• Vision
  • Is more accurate (= stable and correct augmentations)
  • Requires choosing the correct keypoint database to track from
  • Requires registering our local coordinate frame (online-generated model) to the global one (world)
(A minimal sensor + vision fusion sketch follows.)
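To make the division of labour concrete, here is a minimal fusion sketch in Python (not from the lecture): a one-degree-of-freedom complementary filter that trusts the fast but drifting gyro prediction frame to frame and pulls it back toward the slower, more accurate vision estimate whenever one is available. The update rate, filter weight and simulated inputs are all illustrative assumptions.

```python
# A minimal sketch of sensor/vision fusion, assuming a single yaw angle for clarity.
# The gyro path is fast but drifts; the vision path is accurate but only updates
# when tracking succeeds. Inputs here are simulated, not a real device API.

ALPHA, DT = 0.98, 1.0 / 30.0   # gyro weight and 30 Hz update interval (assumed)

def fuse_yaw(prev_yaw, gyro_rate, vision_yaw=None):
    predicted = prev_yaw + gyro_rate * DT                 # integrate gyro (drifts over time)
    if vision_yaw is None:                                 # vision lost -> keep the prediction
        return predicted
    return ALPHA * predicted + (1 - ALPHA) * vision_yaw    # pull back toward the vision estimate

yaw = 0.0
for frame in range(90):                                    # 3 seconds of simulated frames
    gyro_rate = 0.1 + 0.02                                 # true rate plus a constant bias (drift source)
    vision_yaw = 0.1 * (frame + 1) * DT if frame % 10 == 0 else None  # vision only every 10th frame
    yaw = fuse_yaw(yaw, gyro_rate, vision_yaw)
print(f"fused yaw after 3 s: {yaw:.3f} rad (true: {0.1 * 90 * DT:.3f} rad)")
```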

Page 8: Mobile AR Lecture 10 - Research Directions

Outdoor AR Tracking System

You, Neumann, Azuma outdoor AR system (1999)

Page 9: Mobile AR Lecture 10 - Research Directions

Wide Area Tracking

• Process
  • Combine panoramas into a point cloud model (offline)
  • Initialize camera tracking from the point cloud
  • Update pose by aligning the camera image to the point cloud (a PnP sketch follows the citation below)
  • Accurate to 25 cm and 0.5 degrees over a wide area

Ventura, J., & Hollerer, T. (2012). Wide-area scene mapping for mobile visual tracking. In Mixed and Augmented Reality (ISMAR), 2012 IEEE International Symposium on (pp. 3-12). IEEE.
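As a rough illustration of the pose-update step (an assumed sketch, not Ventura and Hollerer's code), the snippet below estimates the camera pose from 2D-3D matches against the point-cloud model using OpenCV's PnP + RANSAC solver; the synthetic correspondences stand in for real descriptor matches against the offline map.

```python
# Assumed sketch: "update pose by aligning the camera image to the point cloud"
# reduced to PnP over matched 2D image keypoints and 3D map points.
import numpy as np
import cv2

def update_pose(pts_3d, pts_2d, K, dist=None):
    """pts_3d: Nx3 map points, pts_2d: Nx2 matched image keypoints, K: 3x3 intrinsics."""
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        np.asarray(pts_3d, np.float32),
        np.asarray(pts_2d, np.float32),
        K, dist, reprojectionError=3.0)
    return (rvec, tvec) if ok else None

# Synthetic correspondences (a real system would obtain these by matching image
# descriptors against the point-cloud model built from the panoramas).
K = np.array([[800, 0, 320], [0, 800, 240], [0, 0, 1]], np.float64)
pts_3d = np.random.uniform(-1, 1, (50, 3)) + [0, 0, 5]
rvec_true, tvec_true = np.zeros(3), np.array([0.1, 0.0, 0.0])
pts_2d, _ = cv2.projectPoints(pts_3d.astype(np.float32), rvec_true, tvec_true, K, None)
pose = update_pose(pts_3d, pts_2d.reshape(-1, 2), K)
print("estimated translation:", pose[1].ravel() if pose else "tracking lost")
```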

Page 10: Mobile AR Lecture 10 - Research Directions

Project Tango

• Smartphone + depth sensing
• Sensors
  • Gyroscope/accelerometer/compass
  • 180º field-of-view fisheye camera
  • Infrared projector
  • 4 MP RGB/IR camera

Page 11: Mobile AR Lecture 10 - Research Directions
Page 12: Mobile AR Lecture 10 - Research Directions

How it Works

• Sensors
  • 4 MP RGB/IR camera: captures full-colour images and detects IR reflections
  • IR depth sensor: measures depth using IR pulses
  • Tracking camera: tracks objects
• Three basic operations
  • Map the depth of the environment in real time
  • Measure depth accurately using IR pulses
  • Create a 3D model of the environment in real time (a back-projection sketch follows)
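The "create a 3D model of the environment" operation boils down to back-projecting each depth pixel through the pinhole camera model. The sketch below is an assumed illustration of that step (not Tango's actual API), using placeholder intrinsics and a synthetic depth frame.

```python
# Assumed sketch: back-project a depth image into a 3D point cloud.
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """depth: HxW array of metric depths; fx, fy, cx, cy: camera intrinsics."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx          # pinhole back-projection
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]      # drop pixels with no depth reading

# Synthetic 4x4 depth image standing in for a streamed depth frame.
depth = np.full((4, 4), 1.5)       # everything 1.5 m away
cloud = depth_to_points(depth, fx=500, fy=500, cx=2, cy=2)
print(cloud.shape, cloud[0])
```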

Page 13: Mobile AR Lecture 10 - Research Directions

Applications

• Indoor tracking, games, disability support, etc.

Page 14: Mobile AR Lecture 10 - Research Directions

Qualcomm Smart Terrain

• Reconstructs the environment
  • Builds a mesh
• Recognizes props
  • Separate objects in the environment

Page 15: Mobile AR Lecture 10 - Research Directions

Applications

• Gaming, advertising, training

Page 16: Mobile AR Lecture 10 - Research Directions

DISPLAYS

Page 17: Mobile AR Lecture 10 - Research Directions

Occlusion with See-through HMD
• The problem
  • Occluding real objects with virtual objects
  • Occluding virtual objects with real objects

(Figure: a real scene vs. the same scene through a current see-through HMD.)

Page 18: Mobile AR Lecture 10 - Research Directions

ELMO (Kiyokawa 2001)

• Occlusive see-through HMD
• Masking LCD
• Real-time range finding

Page 19: Mobile AR Lecture 10 - Research Directions

ELMO Demo

Page 20: Mobile AR Lecture 10 - Research Directions

ELMO Design

• Use an LCD mask to block the real world
• Depth sensing for occluding virtual images (a masking sketch follows the diagram below)

(Diagram: real-world light passes through an LCD mask and an optical combiner; virtual images come from an LCD panel; depth sensing drives the mask.)
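The core masking decision can be stated in a few lines: darken an LCD mask pixel wherever the virtual content is closer than the real surface measured by the range finder. The following sketch is an assumed illustration of that comparison, not the ELMO implementation.

```python
# Assumed sketch: decide which LCD mask pixels to darken so that virtual content
# correctly occludes the real world behind it.
import numpy as np

def occlusion_mask(real_depth, virtual_depth):
    """Both arrays are HxW metric depth; np.inf where no virtual content is drawn."""
    return virtual_depth < real_depth      # True -> darken this LCD mask pixel

real = np.full((3, 3), 2.0)                # real surface 2 m away (from the range finder)
virtual = np.full((3, 3), np.inf)
virtual[1, 1] = 1.0                        # one virtual pixel at 1 m -> should occlude
print(occlusion_mask(real, virtual))
```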

Page 21: Mobile AR Lecture 10 - Research Directions

ELMO Results

Page 22: Mobile AR Lecture 10 - Research Directions

Contact Lens Display
• Babak Parviz, University of Washington
• MEMS components
  • Transparent elements
  • Micro-sensors
• Challenges
  • Miniaturization
  • Assembly
  • Eye safety
  • Providing power and data

Page 23: Mobile AR Lecture 10 - Research Directions

Contact Lens Prototype

Page 24: Mobile AR Lecture 10 - Research Directions

Wide FOV Displays

• Wide-FOV see-through display for AR
  • LCD panel + edge-lit point light sources
  • 110 degree FOV

Maimone, A., Lanman, D., Rathinavel, K., Keller, K., Luebke, D., & Fuchs, H. (2014). Pinlight displays: wide field of view augmented reality eyeglasses using defocused point light sources. In ACM SIGGRAPH 2014 Emerging Technologies (p. 20). ACM.

Page 25: Mobile AR Lecture 10 - Research Directions

INTERACTION

Page 26: Mobile AR Lecture 10 - Research Directions

The Vision of AR

Page 27: Mobile AR Lecture 10 - Research Directions

To Make the Vision Real..

• Hardware/software requirements
  • Contact lens displays
  • Free-space hand/body tracking
  • Environment recognition
  • Speech/gesture recognition
  • Etc.

Page 28: Mobile AR Lecture 10 - Research Directions

Natural Interaction

• Automatically detecting the real environment
  • Environmental awareness
  • Physically-based interaction
• Gesture input
  • Free-hand interaction
• Multimodal input
  • Speech and gesture interaction
  • Implicit rather than explicit interaction

Page 29: Mobile AR Lecture 10 - Research Directions

AR MicroMachines

• AR experience with environment awareness and physically-based interaction
  • Based on the MS Kinect RGB-D sensor
• The augmented environment supports
  • Occlusion and shadows
  • Physically-based interaction between real and virtual objects

Page 30: Mobile AR Lecture 10 - Research Directions

Operating Environment

Page 31: Mobile AR Lecture 10 - Research Directions

Architecture
• Our framework uses five libraries:
  • OpenNI
  • OpenCV
  • OPIRA
  • Bullet Physics
  • OpenSceneGraph

Page 32: Mobile AR Lecture 10 - Research Directions

System Flow
• The system flow consists of three sections:
  • Image processing and marker tracking
  • Physics simulation
  • Rendering

Page 33: Mobile AR Lecture 10 - Research Directions

Physics Simulation

• Create a virtual mesh over the real world (a mesh-building sketch follows this list)
• Update at 10 fps – real objects can be moved
• Used by the physics engine for collision detection (virtual/real)
• Used by OpenSceneGraph for occlusion and shadows
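As a rough illustration of the mesh step (assumed, not the AR MicroMachines source), the snippet below converts a back-projected depth grid into vertices and triangles; a physics engine such as Bullet could then use the result as a static collision mesh, and OpenSceneGraph could render it for occlusion and shadows.

```python
# Assumed sketch: turn a depth grid (HxWx3 back-projected points) into a triangle mesh.
import numpy as np

def depth_grid_to_mesh(points_3d):
    """points_3d: HxWx3 back-projected depth grid -> (vertices, triangle indices)."""
    h, w, _ = points_3d.shape
    vertices = points_3d.reshape(-1, 3)
    tris = []
    for v in range(h - 1):
        for u in range(w - 1):
            i = v * w + u                     # split each grid cell into two triangles
            tris.append([i, i + 1, i + w])
            tris.append([i + 1, i + w + 1, i + w])
    return vertices, np.array(tris, np.int32)

grid = np.zeros((3, 3, 3)); grid[..., 2] = 1.0     # flat synthetic surface at z = 1 m
verts, tris = depth_grid_to_mesh(grid)
print(len(verts), "vertices,", len(tris), "triangles")
```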

Page 34: Mobile AR Lecture 10 - Research Directions

Rendering

(Figure panels: occlusion; shadows.)

Page 35: Mobile AR Lecture 10 - Research Directions

Motivation: AR MicroMachines and PhobiAR
• Treated the environment as static – no tracking
• Tracked objects in 2D
More realistic interaction requires 3D gesture tracking

Page 36: Mobile AR Lecture 10 - Research Directions

Architecture
5. Gesture
  • Static gestures
  • Dynamic gestures
  • Context-based gestures
4. Modeling
  • Hand recognition/modeling
  • Rigid-body modeling
3. Classification/Tracking
2. Segmentation
1. Hardware Interface

Page 37: Mobile AR Lecture 10 - Research Directions

Architecture: 1. Hardware Interface
o Supports PCL, OpenNI, OpenCV, and the Kinect SDK
o Provides access to depth, RGB and XYZRGB data
o Usage: capturing colour images, depth images and concatenated point clouds from a single camera or multiple cameras
o For example: Kinect for Xbox 360, Kinect for Windows, Asus Xtion Pro Live

Page 38: Mobile AR Lecture 10 - Research Directions

Architecture: 2. Segmentation
o Segment images and point clouds based on colour, depth and space
o Usage: segmenting images or point clouds using colour models, depth, or spatial properties such as location, shape and size
o For example: skin-colour segmentation, depth thresholding (a segmentation sketch follows)
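Both example segmenters fit in a few lines of OpenCV. The sketch below is an assumed illustration (not the lecture's framework) that combines a rough skin-colour mask in HSV space with a depth threshold; the colour bounds and depth range are placeholder values that would need tuning for real input.

```python
# Assumed sketch: skin-colour segmentation plus a depth threshold.
import numpy as np
import cv2

def skin_mask(bgr_image):
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    return cv2.inRange(hsv, (0, 40, 60), (25, 180, 255))   # rough, illustrative skin-tone range

def depth_mask(depth_mm, near=300, far=800):
    return ((depth_mm > near) & (depth_mm < far)).astype(np.uint8) * 255

# Synthetic inputs stand in for a Kinect colour frame and depth frame.
color = np.zeros((4, 4, 3), np.uint8); color[:] = (60, 110, 180)   # skin-ish BGR colour
depth = np.full((4, 4), 500, np.uint16)                            # 0.5 m everywhere
hand_region = cv2.bitwise_and(skin_mask(color), depth_mask(depth))
print(hand_region.max())   # 255 where both cues agree
```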

Page 39: Mobile AR Lecture 10 - Research Directions

Architecture: 3. Classification/Tracking
o Identify and track objects between frames based on XYZRGB
o Usage: identifying the current position/orientation of the tracked object in space
o For example: a training set of hand poses where colours represent unique regions of the hand, and raw (uncleaned) output classified on real hand input (depth image)

Page 40: Mobile AR Lecture 10 - Research Directions

Architecture: 4. Modeling
o Hand recognition/modeling
  • Skeleton based (for a low-resolution approximation)
  • Model based (for a more accurate representation)
o Object modeling (identification and tracking of rigid-body objects)
o Physical modeling (physical interaction)
  • Sphere proxy
  • Model based
  • Mesh based
o Usage: general spatial interaction in AR/VR environments

Page 41: Mobile AR Lecture 10 - Research Directions

Results

Page 42: Mobile AR Lecture 10 - Research Directions

Architecture: 5. Gesture
o Static (hand pose recognition)
o Dynamic (meaningful movement recognition)
o Context-based gesture recognition (gestures with context, e.g. pointing)
o Usage: issuing commands, anticipating user intention, and high-level interaction

Page 43: Mobile AR Lecture 10 - Research Directions

Gesture Based Interaction

• Use free-hand gestures to interact
  • Depth camera, scene capture
• Multimodal input
  • Combining speech and gesture
(Examples: HIT Lab NZ, Microsoft HoloLens, Meta SpaceGlasses)

Page 44: Mobile AR Lecture 10 - Research Directions

Natural Gesture Interaction on Mobile

• Use the mobile camera for hand tracking
  • Fingertip detection (a sketch follows)
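A simple way to obtain fingertip candidates from a segmented hand mask is to take points on the contour's convex hull, since fingertips are the parts that stick out. The sketch below is an assumed illustration (not the lecture's mobile implementation; written against the OpenCV 4 API) on a synthetic mask; a real system would add convexity-defect analysis and temporal filtering.

```python
# Assumed sketch: fingertip candidates from a binary hand mask via the convex hull.
import numpy as np
import cv2

def fingertip_candidates(hand_mask):
    contours, _ = cv2.findContours(hand_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return []
    hand = max(contours, key=cv2.contourArea)            # largest blob = hand
    hull = cv2.convexHull(hand)                          # hull points stick out at fingertips
    return [tuple(p[0]) for p in hull]

# Synthetic mask standing in for a segmented hand region.
mask = np.zeros((100, 100), np.uint8)
cv2.rectangle(mask, (40, 40), (60, 90), 255, -1)         # "palm + finger" blob
print(fingertip_candidates(mask)[:4])
```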

Page 45: Mobile AR Lecture 10 - Research Directions

Capturing Behaviours

▪ 3Gear Systems
▪ Kinect/PrimeSense sensor
▪ Two-hand tracking
▪ http://www.threegear.com

Page 46: Mobile AR Lecture 10 - Research Directions

Performance

▪ Full 3D hand model input
▪ 10–15 fps tracking, 1 cm fingertip resolution

Page 47: Mobile AR Lecture 10 - Research Directions

Multimodal Interaction

• Combined speech input
  • Gesture and speech are complementary
• Speech
  • Modal commands, quantities
• Gesture
  • Selection, motion, qualities
• Previous work found multimodal interfaces intuitive for 2D/3D graphics interaction (a fusion sketch follows)
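A common way to realise such complementary input is late fusion: recognise speech and gesture separately, then pair events that occur close together in time. The sketch below is an assumed illustration of that idea, not the system evaluated later in this lecture; the event types, field names and the 1.5 s pairing window are placeholders.

```python
# Assumed sketch of late multimodal fusion: pair a speech command with the gesture
# event closest to it in time, so "make that red" + pointing resolves to an object.
from dataclasses import dataclass

@dataclass
class SpeechEvent:
    t: float          # timestamp in seconds
    command: str      # e.g. "colour red"

@dataclass
class GestureEvent:
    t: float
    target: str       # id of the object the user pointed at / picked

def fuse(speech, gestures, max_gap=1.5):
    """Return (command, target) pairs when events are close enough in time."""
    fused = []
    for s in speech:
        nearest = min(gestures, key=lambda g: abs(g.t - s.t), default=None)
        if nearest and abs(nearest.t - s.t) <= max_gap:
            fused.append((s.command, nearest.target))
    return fused

speech = [SpeechEvent(2.1, "colour red"), SpeechEvent(7.0, "make it a cube")]
gestures = [GestureEvent(2.4, "object_3"), GestureEvent(6.6, "object_1")]
print(fuse(speech, gestures))   # [('colour red', 'object_3'), ('make it a cube', 'object_1')]
```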

Page 48: Mobile AR Lecture 10 - Research Directions

Free Hand Multimodal Input

• Use the free hand to interact with AR content
• Recognize simple gestures
• No marker tracking
(Gestures: Point, Move, Pick/Drop)

Page 49: Mobile AR Lecture 10 - Research Directions

Multimodal Architecture

Page 50: Mobile AR Lecture 10 - Research Directions

Multimodal Fusion

Page 51: Mobile AR Lecture 10 - Research Directions

Hand Occlusion

Page 52: Mobile AR Lecture 10 - Research Directions

User Evaluation

• Change object shape, colour and position
• Conditions
  • Speech only, gesture only, multimodal
• Measures
  • Performance time, errors, subjective survey

Page 53: Mobile AR Lecture 10 - Research Directions

Experimental Setup

Change object shape and colour

Page 54: Mobile AR Lecture 10 - Research Directions

Results
• Average performance time (multimodal and speech fastest)
  • Gesture: 15.44 s
  • Speech: 12.38 s
  • Multimodal: 11.78 s
• No difference in user errors
• User subjective survey
  • Q1: How natural was it to manipulate the object? – MMI and speech rated significantly better
  • 70% preferred MMI, 25% speech only, 5% gesture only

Page 55: Mobile AR Lecture 10 - Research Directions

COLLABORATION

Page 56: Mobile AR Lecture 10 - Research Directions

Resolution Tube

• http://www.resolutiontube.com/
• Shared video calls with annotations

Page 57: Mobile AR Lecture 10 - Research Directions

Vipaar Lime - https://www.vipaar.com/

• Remote collaboration on a handheld device
• Remote users' hands appear in the live camera view

Page 58: Mobile AR Lecture 10 - Research Directions

SOCIAL IMPLICATIONS

Page 59: Mobile AR Lecture 10 - Research Directions

Consider the Whole User

Page 60: Mobile AR Lecture 10 - Research Directions

How is the User Perceived?

Page 61: Mobile AR Lecture 10 - Research Directions

TAT Augmented ID

Page 62: Mobile AR Lecture 10 - Research Directions
Page 63: Mobile AR Lecture 10 - Research Directions
Page 64: Mobile AR Lecture 10 - Research Directions

Social Acceptance

• People don’t want to look silly
  • Only 12% of 4,600 adults surveyed would be willing to wear AR glasses
  • 20% of mobile AR browser users experience social issues
• Acceptance is more a social than a technical issue
  • Needs further study (ethnographic, field tests, longitudinal)

Page 65: Mobile AR Lecture 10 - Research Directions

CROSSING BOUNDARIES

Page 66: Mobile AR Lecture 10 - Research Directions

Crossing Boundaries

Jun Rekimoto, Sony CSL

Page 67: Mobile AR Lecture 10 - Research Directions

Invisible Interfaces

Jun Rekimoto, Sony CSL

Page 68: Mobile AR Lecture 10 - Research Directions

Milgram’s Reality-Virtuality Continuum
(Diagram: the Reality-Virtuality (RV) Continuum runs from Real Environment through Augmented Reality (AR) and Augmented Virtuality (AV) to Virtual Environment, with the middle span labelled Mixed Reality.)

Page 69: Mobile AR Lecture 10 - Research Directions

The MagicBook

(Diagram: the MagicBook spans the continuum from Reality through Augmented Reality (AR) and Augmented Virtuality (AV) to Virtuality.)

Page 70: Mobile AR Lecture 10 - Research Directions

Invisible Interfaces

Jun Rekimoto, Sony CSL

Page 71: Mobile AR Lecture 10 - Research Directions

Example: Visualizing Sensor Networks

• Rauhala et al. 2007 (Linkoping)
• Network of humidity sensors
  • ZigBee wireless communication
• Use mobile AR to visualize humidity

Page 72: Mobile AR Lecture 10 - Research Directions
Page 73: Mobile AR Lecture 10 - Research Directions
Page 74: Mobile AR Lecture 10 - Research Directions

Invisible Interfaces

Jun Rekimoto, Sony CSL

Page 75: Mobile AR Lecture 10 - Research Directions

Ubiquitous AR (GIST, Korea)

• How does your AR device work with other devices?
• How is content delivered?

Page 76: Mobile AR Lecture 10 - Research Directions

CAMAR - GIST

(CAMAR: Context-Aware Mobile Augmented Reality)

Page 77: Mobile AR Lecture 10 - Research Directions

Requirements for Ubiquitous AR
• Hardware is available (mobile phones)
• Software standards are required:
  • APIs for a common framework, independent of hardware
  • ARML as a descriptor language for the AR environment, scenario, etc.
• Further required:
  • Authoring tools for creating AR applications
  • AR-enabled infrastructure (buildings, etc.)

Page 78: Mobile AR Lecture 10 - Research Directions

(Diagram: a 2x2 space plotting Milgram's Reality–Virtual Reality axis against Weiser's Terminal–Ubiquitous axis, placing Desktop, AR, VR, UbiComp, Mobile AR, Ubi AR and Ubi VR.)

Page 79: Mobile AR Lecture 10 - Research Directions

SCALING UP

Page 80: Mobile AR Lecture 10 - Research Directions

(Diagram: the same Reality–VR vs. Terminal–Ubiquitous space, extended from single-user toward massive multi-user systems.)

Page 81: Mobile AR Lecture 10 - Research Directions

Massive Multiuser

• Handheld AR for the first time allows extremely high numbers of AR users

• Requires
  • New types of applications/games
  • New infrastructure (server/client/peer-to-peer)
  • Content distribution…

Page 82: Mobile AR Lecture 10 - Research Directions

Social Network Systems
• 2D applications
  • MSN – 29 million
  • Skype – 10 million
  • Facebook – up to 70 million
• Desktop VR
  • Second Life – >50 K
  • Stereo projection – <500
• Immersive VR
  • HMD/CAVE based – <100
• Augmented Reality
  • Shared Space (1999) – 4
  • Invisible Train (2004) – 8

Page 83: Mobile AR Lecture 10 - Research Directions

PERSONAL VIEW

Page 84: Mobile AR Lecture 10 - Research Directions

Augmented Reality 2.0 Infrastructure

Page 85: Mobile AR Lecture 10 - Research Directions

Leveraging Web 2.0
• Content retrieval using HTTP
• XML-encoded meta-information
  • KML placemarks + extensions
• Queries
  • Based on location (from GPS, image recognition)
  • Based on situation (barcode markers)
• Syndication
  • Community servers for end-user content
  • Tagging
• AR client subscribes to data feeds (a KML-parsing sketch follows)
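To make the KML-based content retrieval concrete, here is an assumed sketch (not any particular AR browser's code) of a client fetching a KML feed over HTTP and extracting placemark names and coordinates; the feed URL is a placeholder.

```python
# Assumed sketch: fetch a KML feed and read placemark names and coordinates.
import urllib.request
import xml.etree.ElementTree as ET

KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}

def fetch_placemarks(url):
    with urllib.request.urlopen(url) as resp:
        root = ET.fromstring(resp.read())
    marks = []
    for pm in root.iter("{http://www.opengis.net/kml/2.2}Placemark"):
        name = pm.findtext("kml:name", default="(unnamed)", namespaces=KML_NS)
        coords = pm.findtext(".//kml:coordinates", default="", namespaces=KML_NS).strip()
        if coords:
            lon, lat, *alt = map(float, coords.split(","))
            marks.append((name, lat, lon))
    return marks

# Example usage with a hypothetical feed URL:
# for name, lat, lon in fetch_placemarks("http://example.org/ar-content.kml"):
#     print(name, lat, lon)
```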

Page 86: Mobile AR Lecture 10 - Research Directions

Scaling Up

• AR on a city scale
• Using the mobile phone as a ubiquitous sensor
• MIT Senseable City Lab
  • http://senseable.mit.edu/

Page 87: Mobile AR Lecture 10 - Research Directions
Page 88: Mobile AR Lecture 10 - Research Directions

WikiCity Rome (Senseable City Lab MIT)