
Vishnoo - An Open-Source Software for Vision Research

Enkelejda Tafaj, Thomas C. Kübler, Jörg Peter, Wolfgang Rosenstiel
Wilhelm-Schickard Institute for Computer Science
University of Tübingen, Germany
[email protected]

Martin Bogdan
Computer Engineering
University of Leipzig, Germany

Ulrich Schiefer
Institute for Ophthalmic Research
University of Tübingen, Germany

Abstract

The visual input is perhaps the most important sensory information. Understanding its mechanisms, as well as the way visual attention arises, could be highly beneficial for many tasks involving the analysis of users' interaction with their environment.

We present Vishnoo (Visual Search Examination Tool), an integrated framework that combines configurable search tasks with gaze tracking capabilities, thus enabling the analysis of both the visual field and visual attention. Our user studies underpin the viability of such a platform. Vishnoo is open-source software and is available for download at http://www.vishnoo.de/

1. Introduction

Understanding the visual sensory and attentive mechanisms has been the focus of much research in medicine, psychology and engineering. Research on the visual field, visual function and attention has benefited especially from the development of eye tracking algorithms and devices. The quantification of eye movements has led not only to new diagnostic methods but also to a better understanding of the visual function. Scientific work on vision and visual search is based on the design and configuration of a specific psychophysical task and the analysis of the user's behavior. A specific task could be, for example, the localization of a randomly positioned target among distractors, representing and simplifying typical visual search activities in everyday life, e.g. finding a specific item on a supermarket shelf. Such tasks are often designed only once, for a specific scientific study, without being reused within the research community. This is mainly due to the lack of a platform that

integrates visual search tasks with eye-tracking facilities. Although both commercial eye trackers, e.g. Dikablis [7], the Interactive Minds Eye-Tracker [9], EyeGaze [11], SMI [21] or Tobii [23], and open-source solutions, e.g. the ITU Gaze Tracker [19], provide powerful algorithms for gaze tracking, synchronization of scan paths with stimulus events, and analysis and visualization of visual scan paths, they still do not provide user-friendly interfaces for stimulus generation or task programming. On the other hand, there are several tools for stimulus presentation in experimental studies in vision research, neuroscience and psychology, e.g. the Psychtoolbox for Matlab (The MathWorks Inc., Massachusetts, USA) [1], PsychoPy [17], SuperLab [4], Presentation [14] or E-Prime [18]. These products offer stimulus delivery and experimental control, but they lack integration of eye tracking and scan path analysis.

To combine freely configurable search tasks with gaze tracking and visual scan path analysis, we developed Vishnoo (Visual Search Examination Tool). Vishnoo provides mobile campimetry to assess the visual field, four built-in visual search tasks, and robust algorithms for eye tracking and visual scan path analysis. The built-in tasks include a comparative search task, MAFOV as a pop-out task to examine bottom-up visual processing, a conjunctive search task, and a video plugin for the analysis of the visual processing of (natural scene) images and videos. Recorded scan paths can be efficiently analyzed.

The architecture of the software platform is the focus of the next section. Section 3 presents how the visual field can be measured using Vishnoo. The search tasks provided by our framework are discussed in Section 4. Section 5 deals with aspects of eye tracking and visual scan path analysis implemented in Vishnoo. Section 6 concludes this paper.

2. Vishnoo architecture

Vishnoo is designed with respect to modularity, scalability, flexibility and adaptability to new stimuli or search tasks. The underlying platform architecture is therefore plugin-based and highly modular, as depicted schematically in Figure 1. Each module represents a part of the examination workflow.

Figure 1. Schematic view of Vishnoo's software architecture

Patient information module: This module manages the subject's information, e.g. name, ID and date of birth (in case of anonymous handling, a subject is characterized only by an ID), information about the examiner, the examination date and other settings. This data is integrated into the examination result.

Examination module: An examination is either the assessment of the visual field, in the case of mobile campimetry, or one of the four search tasks. It consists of a task model and a set of corresponding configuration values. The task model contains queued individual tasks, which are processed sequentially or in parallel. A task may either be a single simple action, like displaying a stimulus, or a concatenation of subtasks. For example, the task of presenting a stimulus may consist of displaying the stimulus, waiting some time, removing the stimulus and waiting for user feedback (e.g. a button press). When implementing new examination types, tasks from existing examinations can be reused, which accelerates the design and development of new tasks, avoids redundant code and ensures good overall software maintainability.
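As an illustration of this composite task model, the following minimal C++ sketch shows how single actions and concatenations of subtasks could be expressed; the class names are ours for illustration and do not reflect Vishnoo's actual API.

#include <memory>
#include <vector>

// A task is a single simple action, e.g. displaying a stimulus or waiting.
class Task {
public:
    virtual ~Task() = default;
    virtual void run() = 0;
};

// A concatenation of subtasks, processed sequentially.
class CompositeTask : public Task {
public:
    void add(std::unique_ptr<Task> subtask) {
        subtasks_.push_back(std::move(subtask));
    }
    void run() override {
        for (auto& t : subtasks_) t->run();  // sequential processing
    }
private:
    std::vector<std::unique_ptr<Task>> subtasks_;
};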

Configuration: Vishnoo is designed not only for use in scientific studies but also for clinical use. We therefore provide several configuration options. For example, in research studies it might be useful to modify the psychophysical stimuli (e.g. their presentation time), while in clinical use it might be more important to have a standardized, examination-like task (e.g. in campimetry, Section 3). Stimulus representations as well as input or output devices, e.g. touchscreens or eye-tracking devices, can be loaded as plugins at runtime. This is possible due to encapsulated and well-documented interfaces.

Evaluator module: The evaluator module is task-specific and provides a variety of algorithms and functions for the visualization of the results of a search task. The standard export format is XML; however, a plugin interface allows conversion to any other file format.
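As a sketch of what such an XML export could look like using Qt's stream writer (the element names here are ours, not Vishnoo's actual schema):

#include <QFile>
#include <QXmlStreamWriter>

// Write a minimal examination result to an XML file.
bool exportResult(const QString& path, const QString& subjectId, double score) {
    QFile file(path);
    if (!file.open(QIODevice::WriteOnly))
        return false;
    QXmlStreamWriter xml(&file);
    xml.setAutoFormatting(true);
    xml.writeStartDocument();
    xml.writeStartElement("examination");
    xml.writeTextElement("subject", subjectId);
    xml.writeTextElement("score", QString::number(score));
    xml.writeEndElement();  // </examination>
    xml.writeEndDocument();
    return true;
}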

Vishnoo Plugins: Due to well-defined interfaces, new features (e.g. other devices) and existing or third-party modules for input, visualization and export can be integrated easily. Plugins are linked dynamically at runtime. Custom input devices or data sources can be used by Vishnoo via the input plugin interface; one example of this usage is the eye-tracking plugin, as sketched below.
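The following sketch illustrates how such a runtime-loaded input plugin might be declared and loaded with Qt's QPluginLoader; the InputPlugin interface shown here is hypothetical and only indicates the general pattern.

#include <QObject>
#include <QPluginLoader>
#include <QPointF>

// Hypothetical input-plugin interface, e.g. for an eye-tracking device.
class InputPlugin {
public:
    virtual ~InputPlugin() = default;
    virtual bool start() = 0;                 // begin delivering samples
    virtual QPointF currentGaze() const = 0;  // latest gaze position
};
Q_DECLARE_INTERFACE(InputPlugin, "org.vishnoo.InputPlugin/1.0")

// Load a plugin library (e.g. a .dll) and resolve the interface at runtime.
InputPlugin* loadInputPlugin(const QString& path) {
    QPluginLoader loader(path);
    return qobject_cast<InputPlugin*>(loader.instance());
}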

A stimuli plugin delivered with Vishnoo includes various stimuli of simple geometric shapes such as squares, circles, triangles, Landolt rings and annuli with variable sizes and colors. The presentation of images or videos is managed by a video plugin. Due to OpenGL acceleration, these resources are displayed smoothly even when a large number of stimuli is presented at the same time. Further visualizations can be added either using Nokia's high-level Qt library or directly via OpenGL.

Exporter module: An export plugin enables the customization of export data formats to meet the individual requirements of Vishnoo users, e.g. when linking Vishnoo results with other existing frameworks.

3. Mobile campimetry

Campimetry is the examination of the visual field on a flat surface (screen). The visual field represents the area that can be perceived when the eye is directed forward. As diseases affecting the visual system result in visual field defects, the systematic measurement and documentation of the visual field is an important diagnostic test. The test measures the sensitivity of visual perception, mostly in terms of differential luminance sensitivity, as a function of location within the visual field [20]. In visual field examinations, test objects, most commonly light stimuli, are projected onto a uniform background, and subjects respond by pressing a response button to indicate that they detected the stimulus. The size and location of a grid of stimuli are kept constant while their luminance varies until the dimmest stimulus that can be perceived by the subject at each stimulus location is identified. The location and pattern of missed stimuli define the type of visual field defect.
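A simple up-down staircase is one common way to bracket such a threshold; the sketch below (with hypothetical names and arbitrary luminance units) only illustrates the principle and is not Vishnoo's actual procedure.

#include <functional>

// Estimate the detection threshold at one stimulus location.
// 'perceived' presents a stimulus at the given luminance and returns
// whether the subject responded.
double estimateThreshold(const std::function<bool(double)>& perceived,
                         double luminance, double step, int maxReversals) {
    bool lastSeen = perceived(luminance);
    int reversals = 0;
    while (reversals < maxReversals) {
        luminance += lastSeen ? -step : +step;  // dim if seen, brighten if missed
        bool seen = perceived(luminance);
        if (seen != lastSeen) {                 // response reversal
            ++reversals;
            step /= 2.0;                        // refine the step size
        }
        lastSeen = seen;
    }
    return luminance;  // approximate dimmest perceivable luminance
}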

For mobile campimetry, Vishnoo integrates the PC-based Tübingen Mobile Campimeter (TMC) that we developed and evaluated in earlier work [22]. Figure 2 presents a visual field screening result of a healthy right eye obtained with Vishnoo. Dots represent detected stimuli, while dark rectangles represent missed stimuli, i.e. locations where no light was perceived. The area that results from the cluster of missed stimuli corresponds to the blind spot. Since the TMC is suitable for fast screening of the visual field, Vishnoo can be used for diagnosis in ophthalmology and neurology.

Figure 2. Result of the visual field screening with Vishnoo's TMC task

4. Search Tasks

Vishnoo provides four built-in visual search tasks: MAFOV (Figure 3(a)), a comparative search task (Figure 3(b)), the Machner test as a conjunctive search task (Figure 3(c)) and a video-based search task (Figure 3(d)).

Comparative search tasks are usually used to measure the visual span. The user has to detect a local match or mismatch between two displays that are presented side by side. When searching for a mismatch, as in the task presented in Figure 3(b), users can perform very differently depending on the visual scanning strategy they use. Vishnoo enables easy configuration of such tasks. User feedback and eye-tracking information are integrated automatically.

The video plugin enables the investigation of scanning strategies when natural scene images are presented. This plugin is useful in research studies aiming at the evaluation or further development of established computational models of visual attention, e.g. that of Itti and Koch [10].

4.1. MAFOV Test

The MAFOV (Modified Attended Field Of View) task is a feature search (pop-out) task that can be used to investigate the preattentive mechanisms of visual perception. A Landolt ring (stimulus) is presented among annuli (distractors) within thirty degrees of eccentricity. Stimulus and distractors are arranged in a grid (Figure 3(a)) that corresponds to a subset of locations defined in the standardized visual field examination grid implemented in the Octopus perimeters (Haag-Streit Inc., Koeniz, Switzerland).

During the search task, the stimuli are presented randomly and sequentially. When a stimulus is presented, all other grid positions are used as distractors and represented by an annular shape of the same size (Figure 3(a)). After the presentation time, the stimulus is replaced by a distractor. The user then marks the grid position where he believes to have perceived the presented stimulus (Figure 4(b)). The information about the presented stimulus, the user's response behavior, the subject's visual search, and the user's response time is aggregated to compute a final performance score, the MAFOV score.

The MAFOV score is a value in the range of one to ten, where ten represents the best performance (the grid position marked in the user response matches the position of the presented stimulus perfectly) and one stands for missing the stimuli by a wide margin. For the computation of this score, the sum of the distances between each stimulus and the corresponding position in the user response is calculated. One dimension we consider in the computation of the MAFOV score is the radial precision. This measure accounts for irregular or sparse grids, where naive approaches using absolute Euclidean distances would result in unreliable scores. To calculate the radial precision, we consider the grid density in the neighborhood of a stimulus. The resulting distance measure expresses the distances relative to the nearest neighbours in the grid:

$$ d_{\text{rad}}(\text{Stimulus},\text{Feedback}) = \frac{d_{\text{euc}}(\text{Stimulus},\text{Feedback})}{\frac{1}{\#\text{Neighbours}} \cdot \sum_{\text{Neighbours}} d_{\text{euc}}(\text{Stimulus},\text{Neighbour})} $$

By default, the three nearest neighbor stimuli are used to calculate the mean distance. This approach provides good results for both dense and sparse grids. An example result of the MAFOV task is depicted in Figure 5.
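A direct translation of this measure into code could look as follows; this is a sketch with our own names and assumes the grid contains the stimulus position itself plus at least k neighbours, and Vishnoo's implementation may differ.

#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

struct Point { double x, y; };

double dist(const Point& a, const Point& b) {
    return std::hypot(a.x - b.x, a.y - b.y);
}

// Distance between stimulus and feedback, normalized by the mean
// distance to the k nearest grid neighbours (k = 3 by default).
double radialPrecision(const Point& stimulus, const Point& feedback,
                       std::vector<Point> grid, std::size_t k = 3) {
    // Sort grid positions by distance to the stimulus; the stimulus
    // itself ends up at index 0, so the neighbours start at index 1.
    std::sort(grid.begin(), grid.end(), [&](const Point& a, const Point& b) {
        return dist(stimulus, a) < dist(stimulus, b);
    });
    const std::size_t n = std::min(k, grid.size() - 1);
    double meanNeighbourDist = 0.0;
    for (std::size_t i = 1; i <= n; ++i)
        meanNeighbourDist += dist(stimulus, grid[i]) / n;
    return dist(stimulus, feedback) / meanNeighbourDist;
}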

Figure 3. Visual search tasks in Vishnoo: (a) MAFOV, (b) comparative search task, (c) conjunction search task, (d) free search task

Currently we are using this task in the field of ophthalmology in different research studies to examine the visual search behavior and exploration capability of subjects with visual field defects. As the shape, size, color and arrangement of stimuli and distractors in Vishnoo are freely configurable, users can easily adapt this task to meet their requirements.

Figure 4. MAFOV (panels a and b)

4.2. Machner Test

The Machner test is a conjunction search task developed by Machner et al. [13]. Conjunction search represents the process of searching for a target that is defined by a combination of two or more properties, e.g. shape and color. A typical task when using the Machner test could be to find all red rectangles in the presented image (Figure 3(c)). To find these targets, the subject has to search the image systematically. In contrast to the MAFOV task, where the subject's performance depends mainly on pre-attentive visual perception, in conjunction search the user's performance depends above all on higher cognitive functions such as the visual search strategy and spatial working memory.

Figure 5. Visualization of a MAFOV result

During this task, eye movements are recorded and the visual scan path is analyzed. The analysis comprises the number of saccades, the time spent examining each region, the number of correct answers and the subject's search strategy.

The Machner test as presented here is one specific configuration of the conjunction search tasks that can be designed with Vishnoo. Again, as the shape, color and position of targets are freely configurable, users can build on the templates of the Machner test to design similar tasks.

Besides vision research, the Machner test can also be used for rehabilitation in medicine, e.g. for training new search strategies in subjects with an impaired visual field to help them regain a better perception of their environment.

5. Eye-Tracking and Scan Path Analysis

Although there are some well-performing commercial eye trackers available, e.g. [7], [11], [21], [23], they suffer from two big disadvantages compared with open-source solutions: they are either available only at very high cost, and thus unaffordable for much academic research and many clinical studies, or delivered as black-box solutions that do not allow access to the signal and video processing routines. Nevertheless, for both categories the methods consist of mainly two parts: the extraction of the pupil and its coordinates using image processing methods, and the mapping of pupil position coordinates into coordinates in the scene image.

The detection of the pupil is performed using either feature-based or model-based approaches. Feature-based approaches aim at localizing image features related to the position of the eye, e.g. threshold-based techniques where the pupil is detected based on a threshold obtained from image characteristics or manually specified by the user, e.g. [2], [16]. In model-based approaches, the pupil contour is detected by iteratively fitting a model, e.g. finding the best-fitting circle or ellipse for the pupil contour [5], [15]. Model-based techniques provide a more precise estimation of the pupil position at the cost of computational speed, which is crucial especially at high sampling rates (frames per second) and image resolutions.
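As a rough illustration of the model-based approach, the following OpenCV sketch thresholds the dark pupil and fits an ellipse to the largest contour; the threshold value and names are ours, and this is not Vishnoo's implementation.

#include <algorithm>
#include <vector>
#include <opencv2/imgproc.hpp>

// Fit an ellipse to the pupil in a grayscale (infrared) eye image.
cv::RotatedRect fitPupil(const cv::Mat& eyeGray, double thresh = 40.0) {
    cv::Mat binary;
    // The pupil appears dark in infrared images; invert the threshold.
    cv::threshold(eyeGray, binary, thresh, 255, cv::THRESH_BINARY_INV);
    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(binary, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
    if (contours.empty())
        return cv::RotatedRect();
    // Assume the largest dark blob is the pupil.
    auto largest = std::max_element(contours.begin(), contours.end(),
        [](const std::vector<cv::Point>& a, const std::vector<cv::Point>& b) {
            return cv::contourArea(a) < cv::contourArea(b);
        });
    return cv::fitEllipse(*largest);  // requires at least 5 contour points
}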

In Vishnoo we have integrated Starburst, a hybrid eye-tracking method introduced by [12] that combines feature-based and model-based algorithms for infrared images. After locating and removing the corneal reflection from the image, Starburst locates pupil edge points using an iterative feature-based technique. In the next step, the algorithm fits an ellipse to a subset of the detected edge points using the Random Sample Consensus (RANSAC) paradigm [12], Figure 6. To map the pupil position into scene coordinates, a second-order polynomial mapping based on a 9-point calibration grid is used.
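Such a second-order polynomial mapping can be fitted by least squares over the nine calibration points; the sketch below uses Eigen for illustration and is not Vishnoo's actual code.

#include <Eigen/Dense>

// One design-matrix row for a pupil position (x, y):
// the six second-order terms [1, x, y, xy, x^2, y^2].
static Eigen::RowVectorXd polyRow(double x, double y) {
    Eigen::RowVectorXd r(6);
    r << 1.0, x, y, x * y, x * x, y * y;
    return r;
}

// Fit the six coefficients mapping pupil coordinates to one screen axis
// (call once for screen x and once for screen y).
Eigen::VectorXd fitMapping(const Eigen::MatrixXd& pupil,    // N x 2 pupil positions
                           const Eigen::VectorXd& screen) { // N screen coordinates
    Eigen::MatrixXd A(pupil.rows(), 6);
    for (int i = 0; i < pupil.rows(); ++i)
        A.row(i) = polyRow(pupil(i, 0), pupil(i, 1));
    // Least-squares solution of A * coeffs = screen.
    return A.colPivHouseholderQr().solve(screen);
}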

In our setup we use a monochrome USB camera with an infrared filter at a sampling rate of 60 frames per second. As eye tracking is a plugin in Vishnoo, integrated as a .dll, the Starburst algorithm can easily be replaced by other eye-tracking solutions.

Figure 6. Detection of the pupil by the eye-tracking module

Visual scan path analysis: Scan path modelling and analysis is done by a modified version of iComp [8]. Fixations and saccades are identified by calculating movement variances, the moving speed between adjacent saccades and fixation durations [3]. Single measurements are clustered into fixations using a Gaussian kernel. Letters are then automatically assigned to fixation clusters, and scan paths are compared by string-editing and alignment algorithms [6]. The final scan path is visualized as ellipses around the fixation centers, where the ellipse dimensions correspond to the variance of the eye movements mapped to the fixation, obtained by a principal component analysis.
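In its simplest form, the string-editing comparison reduces to a Levenshtein distance over the fixation-cluster letters; the sketch below shows this baseline, while the alignment actually used by iComp may be more elaborate.

#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Levenshtein distance between two scan paths encoded as letter strings.
int editDistance(const std::string& a, const std::string& b) {
    std::vector<std::vector<int>> d(a.size() + 1, std::vector<int>(b.size() + 1));
    for (std::size_t i = 0; i <= a.size(); ++i) d[i][0] = static_cast<int>(i);
    for (std::size_t j = 0; j <= b.size(); ++j) d[0][j] = static_cast<int>(j);
    for (std::size_t i = 1; i <= a.size(); ++i) {
        for (std::size_t j = 1; j <= b.size(); ++j) {
            int substitution = d[i - 1][j - 1] + (a[i - 1] != b[j - 1] ? 1 : 0);
            d[i][j] = std::min({ d[i - 1][j] + 1,   // deletion
                                 d[i][j - 1] + 1,   // insertion
                                 substitution });
        }
    }
    return d[a.size()][b.size()];
}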

6. Conclusion

Vishnoo is a new platform providing a wide range of visual search tasks for easy and fast examination of the visual field and visual attention. Vishnoo offers easily adaptable stimulus presentation, eye tracking and evaluation of the visual scan path combined in a single platform. Vishnoo currently ships with four highly configurable built-in tasks. The underlying software architecture is modular and plugin-based, so new features like user-specific hardware, database connections, new stimulus types or even new search tasks can be added very easily.

Different layers of control and configuration make Vishnoo an attractive choice for both scientific research studies and clinical practice. Practical usage was evaluated in cooperation with ophthalmologists and demonstrated the advantages of easy adaptation to changing requirements in research studies, without the need for long-term development or a fixed examination flow.

Currently, Vishnoo is being used in research studies to examine the visual search, attention and exploration capabilities of subjects with visual field impairments such as glaucoma and hemianopsia.

The Vishnoo platform is available for download at http://www.vishnoo.de/.

7. Acknowledgements

This work was partially supported by the MFG Stiftung Baden-Württemberg and the Wilhelm-Schuler-Stiftung.

References

[1] D. H. Brainard. The Psychophysics Toolbox. Spatial Vision, 10(4):433–436, 1997.

[2] X. Brolly and J. Mulligan. Implicit calibration of a remote gaze tracker. In IEEE Conference on CVPR, Workshop on Object Tracking Beyond the Visible Spectrum, 2004.

[3] R. G. Brown and P. Y. C. Hwang. Introduction to Random Signals and Applied Kalman Filtering. John Wiley and Sons, 2nd edition, 1997.

[4] Cedrus Corporation. http://www.superlab.com, 2011.

[5] J. Daugman. High confidence visual recognition of persons by a test of statistical independence. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(11):1148–1161, 1993.

[6] A. T. Duchowski, J. Driver, S. Jolaoso, W. Tan, B. N. Ramey, and A. Robbins. Scanpath comparison revisited. In Proceedings of the 2010 Symposium on Eye-Tracking Research & Applications, ETRA '10, pages 219–226, New York, NY, USA, 2010. ACM.

[7] Ergoneers GmbH. http://www.ergoneers.com/de/products/dlab-dikablis/overview.html, 2011.

[8] J. Heminghous and A. T. Duchowski. iComp: a tool for scanpath visualization and comparison. In Proceedings of the 3rd Symposium on Applied Perception in Graphics and Visualization, APGV '06, pages 152–152, New York, NY, USA, 2006. ACM.

[9] Interactive Minds GmbH. http://www.interactive-minds.com/, 2011.

[10] L. Itti and C. Koch. Computational modelling of visual attention. Nature Reviews Neuroscience, 2(3):194–203, March 2001.

[11] LC Technologies, Inc. http://www.eyegaze.com/content/eyetracking-research-tools, 2011.

[12] D. Li, D. Winfield, and D. J. Parkhurst. Starburst: A hybrid algorithm for video-based eye tracking combining feature-based and model-based approaches. In 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, volume 3, page 79, 2005.

[13] B. Machner, A. Sprenger, D. Kömpf, T. Sander, W. Heide, H. Kimmig, and C. Helmchen. Visual search disorders beyond pure sensory failure in patients with acute homonymous visual field defects. Neuropsychologia, June 2009.

[14] Neurobehavioral Systems, Inc. http://www.neurobs.com, 2011.

[15] K. Nishino and S. Nayar. Eyes for relighting. In ACM SIGGRAPH 2004, volume 23, pages 704–711, 2004.

[16] T. Ohno, N. Mukawa, and A. Yoshikawa. FreeGaze: a gaze tracking system for everyday gaze interaction. In Eye Tracking Research & Applications Symposium, pages 15–22, 2002.

[17] J. W. Peirce. PsychoPy–Psychophysics software in Python. Journal of Neuroscience Methods, 162(1-2):8–13, 2007.

[18] Psychology Software Tools, Inc. http://www.pstnet.com/eprime.cfm, 2011.

[19] J. San Agustin, H. Skovsgaard, E. Møllenbach, M. Barret, M. Tall, D. W. Hansen, and J. P. Hansen. Evaluation of a low-cost open-source gaze tracker. In Proceedings of the 2010 Symposium on Eye-Tracking Research & Applications, ETRA '10, pages 77–80, New York, NY, USA, 2010. ACM. http://www.gazegroup.org/downloads/23-gazetracker/.

[20] U. Schiefer, H. Wilhelm, and W. Hart. Clinical Neuro-Ophthalmology: A Practical Guide. Springer Verlag, Berlin, 1st edition, 2008.

[21] SensoMotoric Instruments GmbH. http://www.smivision.com, 2011.

[22] E. Tafaj, C. Uebber, J. Dietzsch, U. Schiefer, M. Bogdan, and W. Rosenstiel. Introduction of a portable campimeter based on a laptop/tablet PC. In Proceedings of the 19th Imaging and Perimetry Society (IPS), Spain, 2010.

[23] Tobii Technology AB. http://www.tobii.com, 2011.