Picture Context Capturing for Mobile Databases

Stavros Christodoulakis and Michalis Foukarakis, Technical University of Crete
Lemonia Ragia, Advanced Systems Group
Hiroaki Uchiyama and Takuya Imai, Ricoh
Mobile device manufacturers today are embedding sensors for GPS measurements, compass data, and time information in their cameras, mobile phones, and PDAs, opening up a wide range of opportunities for better management of and interaction with pictures in databases. In this article, we describe a software environment that uses the information from sensors to provide rich picture-management functionality. The software environment offers several services, including semantic map personalization, spatial picture registration, and identification of semantic objects.
The system we describe takes advantage of position and direction sensors that associate picture contents with the captured environment. The semantic maps associate and visualize geometric representations of semantic objects (such as medieval forts) with regions on a map. These maps can be personalized by representing concepts or items of interest. The semantic maps also include geographic semantic objects, such as mountains and oceans. The system identifies properties of the geographic semantic objects, such as color and shape, to find such objects in the picture, guided by knowledge of the location and direction of the picture and the spatial context provided by the semantic maps. The system can register the objects of the semantic maps on top of the picture, which allows us to develop advanced database functionality for semantic content retrieval, interaction with the semantic objects in pictures, and visualization of the database contents on top of maps. The system also supports user event modeling and capturing, and automatically associates the events with the pictures using contextual data.
Approach and applications

In comparison to previous work (see the "Related Work" sidebar), our emphasis is on capturing and exploiting the contextual parameters at the time of picture taking and using camera-integrated sensors and algorithms for precise picture registration in the captured spatial context. Our objective is to exploit picture taking through deep geospatial semantics related to the picture content, providing a complete value chain that offers rich functionality to end users.
We have integrated this functionality in the SPatial Image Management (SPIM) software. A particular application of this software environment is for tourists, who can choose to view semantic maps of the places they are going to visit and get detailed semantic information about objects of interest (such as parks, temples, and villages). The system can process the pictures taken during the trip and provide a living memory of the trip. The semantic objects depicted in them can be shown on top of the pictures themselves, allowing interactive exploration of the picture contents and linking with other information sources. The combination of GPS and compass sensors with the camera and the semantic maps is also useful in many other applications, such as mobile learning, damage registration in disaster areas, and archaeological site or outdoor zoo touring.
The software environment enables the creation and use of a knowledge base containing objects that might be of interest to the user. The knowledge base might include a number of domain ontologies, such as Greek archeological monuments, medieval churches, and modern cultural buildings. The domain ontologies consist of hierarchies of semantic concepts and types with attributes; each
Mobile and Ubiquitous Multimedia
A sensor-based camera system associates picture contents with the captured environment to enable semantic content retrieval, interaction, and visualization.
1070-986X/10/$26.00 © 2010 IEEE. Published by the IEEE Computer Society.
Related Work

Much research in the past has focused on the automatic classification of pictures using low-level features. Scene-classification approaches exploit domain semantics and global image features to give general descriptions of pictures and their content (streets, buildings, and so on), or to classify them as indoor or outdoor [1, 2]. Image metadata such as exposure time and aperture have also been used for classification [3]. Our work focuses on the detailed annotation of the parts of the images that contain significant objects, not classification of the image as a whole. Significant parts of our work focus on landscape pictures. Important research has been done on identifying parts of a picture, for example, a blue sky [4]. We exploit GPS and compass information as well as additional geographic context to improve the existing algorithms for picture registration.
Several authors have discussed the use of ontologies as a means of image and video annotation [5-7]. An advantage of these approaches is that they systematically manage the knowledge in a domain, including concept type hierarchies, concept properties, and individuals, unlike tags found in social networks. The plethora of images found in folksonomies and photo-sharing sites such as Flickr has been exploited to extract activity, event, and place semantics from user-tagged photos for annotating picture contents [8, 9]. The quality of the tags, however, is often questionable.
A problem with the ontology-based approaches is that they often rely on extensive manual annotation by users during database insertion, a task that's unlikely to occur because of the time required. Capturing context at the time a picture is taken can provide the means for automatic semantic annotation and powerful semantic retrieval functionality [10, 11].
Another research project aimed to assist in organizing collections of georeferenced pictures, combining location and time parameters with minimal user annotation to derive some of the pictures' semantic content [12]. In addition, the World Wide Media eXchange (WWMX) project provides another important option for organizing georeferenced images [13]. Pictures are indexed by the WWMX database according to time and location. The WWMX browser visualizes them using a map interface and provides retrieval functionality. This work presents different approaches to acquiring location tags, browsing images, and visualizing them on a map. There are many important early applications in this area, notably in culture and tourism, that have used functionalities similar to what we describe in this article [14].
References

1. M. Vailaya et al., "Image Classification for Content-Based Indexing," IEEE Trans. Image Processing, vol. 10, no. 1, 2001, pp. 117-129.
2. A. Yavlinsky, E. Schofield, and S. Ruger, "Automated Image Annotation Using Global Features and Robust Nonparametric Density Estimation," Image and Video Retrieval, W.K. Leow et al., eds., LNCS 3568, Springer, 2005, pp. 507-517.
3. M. Boutell and J. Luo, "Beyond Pixels: Exploiting Camera Metadata for Photo Classification," Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), vol. 38, no. 7, Elsevier, 2004, pp. 935-946.
4. A.C. Gallagher, J. Luo, and W. Hao, "Improved Blue Sky Detection Using Polynomial Model Fit," Proc. IEEE Int'l Conf. Image Processing, IEEE Press, 2004, pp. 2367-2370.
5. L. Hollink, "Adding Spatial Semantics to Image Annotations," Proc. 4th Int'l Workshop on Knowledge Markup and Semantic Annotation, 2004, pp. 31-40; http://www.few.vu.nl/~guus/papers/Hollink04c.pdf.
6. C. Tsinaraki and S. Christodoulakis, "An MPEG-7 Query Language and a User Preference Model that Allow Semantic Retrieval and Filtering of Multimedia Content," ACM/Springer Multimedia Systems J., special issue on semantic multimedia adaptation and personalization, vol. 13, no. 2, 2007, pp. 131-153.
7. C. Tsinaraki, P. Polydoros, and S. Christodoulakis, "Interoperability Support between MPEG-7/21 and OWL in DS-MIRF," IEEE Trans. Knowledge and Data Engineering, special issue on the Semantic Web era, vol. 19, no. 2, 2007, pp. 219-232.
8. D. Joshi and J. Luo, "Inferring Generic Activities and Events from Image Content and Bags of Geo-Tags," Proc. Conf. Image and Video Retrieval, ACM Press, 2008, pp. 37-46.
9. T. Rattenbury, N. Good, and M. Naaman, "Towards Automatic Extraction of Event and Place Semantics from Flickr Tags," Proc. Ann. ACM Conf. Research and Development in Information Retrieval, ACM Press, 2007, pp. 103-110.
10. S. Christodoulakis et al., "Semantic Maps and Mobile Context Capturing for Picture Content Visualization and Management of Picture Databases," Proc. 7th Int'l Conf. Mobile and Ubiquitous Multimedia (MUM), ACM Press, 2008, pp. 130-136.
11. J. Li et al., "New Challenges in Multimedia Research for the Increasingly Connected and Fast Growing Digital Society," Proc. ACM Int'l Conf. Multimedia Information Retrieval (MIR), ACM Press, 2007, pp. 3-10.
12. M. Naaman, Leveraging Geo-Referenced Digital Photographs, doctoral dissertation, Stanford Univ., 2005.
13. K. Toyama, R. Logan, and A. Roseway, "Geographic Location Tags on Digital Images," Proc. 11th Int'l Conf. Multimedia, ACM Press, 2003, pp. 156-166.
14. S. Christodoulakis et al., "A Distributed Multimedia Tourism Information System," Proc. Int'l Conf. Information and Communication Technologies in Tourism (Enter), 1997, pp. 295-306; http://195.130.87.21:8080/dspace/bitstream/123456789/604/1/Minotaurus%20a%20distributed%20multimedia%20tourism%20information%20system.pdf.
semantic object (individual) belongs to one of these concepts. A special case of the supported ontologies is a semantic geographic ontology that contains concepts such as lakes, oceans, mountains, islands, and villages.

The software environment helps create semantic maps that associate polygon representations (also called footprints) with semantic individuals. The software associates semantic individuals with a set of GPS positions that describe their enclosing polygon on the land. The footprint can then be visualized on top of any calibrated map, making the knowledge base independent of map information and allowing reuse of the same semantic objects in different map environments.
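To illustrate this map-independence, the following sketch (our own simplified code, not part of SPIM) projects a lat/lon footprint onto a calibrated map. Here "calibrated" is assumed to mean that the geographic coordinates of the map's corners are known, so pixel positions follow by linear interpolation:

```python
# Hypothetical helper: cal = (lat_top, lon_left, lat_bottom, lon_right,
# width_px, height_px) describes the calibrated map.
def latlon_to_pixel(lat, lon, cal):
    """Map a (lat, lon) point to pixel coordinates on the calibrated map."""
    lat_t, lon_l, lat_b, lon_r, w, h = cal
    x = (lon - lon_l) / (lon_r - lon_l) * w
    y = (lat_t - lat) / (lat_t - lat_b) * h  # pixel y grows downward
    return x, y

def footprint_to_pixels(footprint, cal):
    """Convert a polygon of (lat, lon) vertices to map pixel coordinates."""
    return [latlon_to_pixel(lat, lon, cal) for lat, lon in footprint]
```

Because the footprint is stored as GPS positions, the same object can be drawn on any map for which such a calibration exists.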
Because the number of domain ontologies and semantic individuals contained in semantic maps can be large, users are provided with services to personalize the content of each map to suit their interests. They can specify conditions on the ontologies they want represented, the types from each ontology, as well as specific semantic individuals. The services help construct a personalized semantic map, which contains fewer objects, and only objects that are of interest to the user. This reduces the chance of information overload and improves the visualization of such maps on screens.
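A minimal sketch of such a personalization filter (field names and the function are ours, not SPIM's schema): any combination of ontology, type, and individual-name constraints narrows the map content.

```python
def personalize(individuals, ontologies=None, types=None, names=None):
    """Filter knowledge-base individuals down to a personalized map.

    individuals: list of dicts with 'name', 'type', and 'ontology' keys.
    Each constraint, when given, must be satisfied for an individual to
    appear in the personalized semantic map.
    """
    result = []
    for ind in individuals:
        if ontologies and ind["ontology"] not in ontologies:
            continue
        if types and ind["type"] not in types:
            continue
        if names and ind["name"] not in names:
            continue
        result.append(ind)
    return result
```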
Managing pictures and their semantic content

This section describes the integration of the camera with the GPS and compass sensors, the capturing of the Exif metadata [1], and the use of the Exif metadata for associating the picture with the spatial context that it captures. We have used a Ricoh Caplio 500 SE digital camera that communicates over a Bluetooth interface with a GPS receiver that has an integrated digital compass. Recent camera models already integrate these sensors. The additional position and direction parameters captured by the sensors are automatically stored in the Exif header of the produced image, along with image-capturing parameters (such as focal length and aperture) and other metadata.
We use the information captured in the Exif data at the time of picture taking to calculate contextual parameters that let us associate the digital picture's segments with the semantic spatial objects that the picture captures. Our objective is to describe as accurately as possible the spatial content of the digital image. To do that, we use standard camera parameters such as the sensor size, picture-taking parameters such as the focal length, GPS parameters such as location and altitude, and compass parameters such as the angle with respect to magnetic north. These parameters allow us to calculate the location and direction of the picture with respect to geographic north and the camera's angle of view.
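The geometry implied here can be sketched with two standard formulas (our own code, not SPIM's): the horizontal angle of view follows from the sensor width and focal length, and the compass reading is corrected to true north using the local magnetic declination, a value the system would have to look up for the capture location.

```python
import math

def horizontal_angle_of_view(sensor_width_mm, focal_length_mm):
    """Horizontal angle of view in degrees from sensor width and focal length."""
    return 2 * math.degrees(math.atan(sensor_width_mm / (2 * focal_length_mm)))

def true_heading(magnetic_heading_deg, declination_deg):
    """Correct a compass (magnetic) heading to a true-north heading."""
    return (magnetic_heading_deg + declination_deg) % 360
```

For example, a full-frame 36-mm sensor with an 18-mm lens yields a 90-degree horizontal angle of view.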
Taking into account the camera location and direction, and associating them with the spatial information and the semantic geographical individuals contained in semantic maps, we can automatically predict the geographic objects that appear in the direction of the picture. Associating the contextual metadata about a picture with the semantic maps similarly lets us predict which semantic objects described by the semantic map ontologies are within the picture. When more than one semantic object is predicted to be within a picture, the objects are ranked according to their distance and their relative location within the picture's angle of view and focusing area, if available. The association of a picture with the semantic objects can be used for metadata generation related to the picture's contents, or for understanding and visualizing the content as a way to more effectively understand the real world.
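The prediction and ranking step can be illustrated as follows (a simplified sketch under our own assumptions: objects are reduced to single points in a local flat frame, whereas the real system works on polygon footprints):

```python
import math

def bearing_deg(cam, obj):
    """Bearing from camera to object in a local flat frame (x east, y north)."""
    dx, dy = obj[0] - cam[0], obj[1] - cam[1]
    return math.degrees(math.atan2(dx, dy)) % 360

def objects_in_view(cam_xy, heading_deg, aov_deg, objects):
    """Return names of objects whose bearing falls inside the view cone,
    ranked nearest first. objects: dict name -> (x, y) in metres."""
    hits = []
    for name, xy in objects.items():
        # Signed angular difference in (-180, 180]
        diff = (bearing_deg(cam_xy, xy) - heading_deg + 180) % 360 - 180
        if abs(diff) <= aov_deg / 2:
            hits.append((math.dist(cam_xy, xy), name))
    return [name for _, name in sorted(hits)]
```

Objects outside the cone are dropped; the remainder are ordered by distance, matching the ranking criterion described above.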
For the purpose of picture annotation, we calculate the 2D model of the picture contents by taking into account the land formations and semantic objects in the direction of the picture (see Figure 1). There is no exact correspondence between the picture contents and the 2D model of the picture contents (or between the visible horizon in the picture and the visible horizon in the 2D representation) because of the additional degrees of freedom of the camera (tilt and rotation). This might not be crucial for the purpose of picture annotation (some false drops might result if the camera's tilt crops some of the predicted objects). However, exact correspondence becomes important when the precise location of semantic objects within the pictures is used in applications that allow user interaction with the semantic objects as a kind of virtual window to the world. This functionality requires more precise registration of the picture with the 2D representation of the spatial contents, so that when the user points the mouse cursor at a specific picture location, the system will be able to infer which spatial real-world objects are at this position. This kind of accuracy can't be obtained directly from just the GPS and compass data.
Spatial context registration

To obtain the additional accuracy needed for user interaction and visualization with pictures, we developed algorithms that match picture contents with the 2D view of the spatial environment, obtained from the camera location and the picture direction, both recorded by GPS and compass in the Exif data. To calculate this 2D view, which includes land formations and the semantic objects, an algorithm traces the rays that start from the camera and move along the camera direction within the angle of view until they reach geographic formations that stop them, forming a picture cone [2]. The algorithm can detect discontinuities that come from ground formations (for example, a hill followed by a valley followed by a mountain creates a discontinuity in the visible boundaries of the hill). The semantic objects themselves (including geographic objects such as islands) or the visible horizon might create other discontinuities.
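The visibility idea along a single ray can be sketched in a few lines (our own simplification of the cited visibility algorithms): a terrain sample is visible only if its elevation angle from the camera exceeds that of every closer sample.

```python
def visible_segments(elevations, step, cam_height):
    """Which samples along one ray are visible from the camera?

    elevations: terrain height at successive distances (step metres apart)
    from the camera. A sample is visible when its slope from the camera
    is at least the maximum slope seen so far. Returns a list of booleans.
    """
    visible = []
    max_slope = float("-inf")
    for i, h in enumerate(elevations, start=1):
        slope = (h - cam_height) / (i * step)
        visible.append(slope >= max_slope)
        max_slope = max(max_slope, slope)
    return visible
```

With elevations [50, 20, 200] at 100-m steps and a camera at height 0, the hill is visible, the valley behind it is hidden, and the mountain reappears, exactly the hill-valley-mountain discontinuity described above.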
To match the objects of the picture with the objects in the 2D view, we segment the picture using a modified statistical region-merging method [3] to obtain important segment boundaries and other characteristics, such as mountain peaks, that can be matched with the corresponding 2D shapes. Because the semantic maps include geographic objects and their footprints, we can exploit characteristics of any type of visible geographic object (such as sky and ocean color) to find the location of those objects within the picture. For the current system, we have concentrated on extracting the boundaries of the skyline, high mountains, and the ocean area, but we plan to investigate additional possibilities in the future. The boundaries of those objects are calculated from both the segmented picture and the 2D representation, to be used later by the registration algorithm.
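Once segmentation has labeled the sky region, extracting the skyline boundary reduces to a per-column scan. A sketch under our own assumptions (a boolean sky mask rather than the actual segmenter output):

```python
def skyline(sky_mask):
    """Extract the visible-horizon boundary from a sky segmentation.

    sky_mask: rows of booleans (True = sky pixel), row 0 at the top.
    Returns, per column, the row index of the first non-sky pixel,
    or the image height if the column is entirely sky.
    """
    h, w = len(sky_mask), len(sky_mask[0])
    line = []
    for x in range(w):
        y = 0
        while y < h and sky_mask[y][x]:
            y += 1
        line.append(y)
    return line
```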
Because the picture objects might not match the 2D objects (due to errors in direction, tilt, rotation, and so on), we want to transform the 2D representation so that it can be superimposed correctly on top of the picture. We use an error metric to quantify the quality of matching. The basic algorithm for matching is a variation of line-matching algorithms [4]. A successful match enables the interactive exploration of the contents of a picture in real time, and the association of the picture locations and cones of view with semantic maps for visualizing and browsing the database contents.
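A one-parameter stand-in for this registration step (the real algorithm searches a richer transformation space): slide the model skyline horizontally and keep the shift that minimizes a mean-absolute-error metric against the picture skyline.

```python
def best_shift(picture_line, model_line, max_shift):
    """Horizontal shift of the model skyline minimizing mean absolute error.

    picture_line, model_line: per-column boundary heights of equal length.
    Returns the shift (in columns) that best aligns model to picture.
    """
    best = (float("inf"), 0)
    n = len(picture_line)
    for s in range(-max_shift, max_shift + 1):
        pairs = [(picture_line[i], model_line[i - s])
                 for i in range(n) if 0 <= i - s < n]
        err = sum(abs(a - b) for a, b in pairs) / len(pairs)
        best = min(best, (err, s))
    return best[1]
```

A direction error in the compass shows up as exactly such a horizontal offset between the predicted 2D view and the picture, which is why the registration can compensate for modest deviations.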
Experimentation

We have performed experiments to understand the sensitivity of the algorithms that we use with respect to errors in the model parameters and the lack of sensor tilt and rotation. In addition, we wanted to determine the relative value of the types of semantic geographic objects in the precise registration of the pictures.

Figure 1. Construction of the 2D representation of the spatial view. (a) A picture showing mountains, land, and ocean. (b) Part of the semantic map that contains the polygon-shaped semantic objects, the picture's direction and angle of view, and the visible land formations along that direction (dark areas inside the cone). A rectangular semantic object on the right side of the angle of view isn't visible to the user due to the hill on the right; hence it doesn't appear in the 2D representation. (c) The 2D representation of the image containing visible land formations and semantic objects present on the semantic map, calculated from the camera's position and direction parameters.
Because we had observed that errors in the GPS determination have little impact on accuracy, we experimented with errors produced by inaccuracies in the direction determination. We examined and compared the results produced when the direction captured by the camera system deviated from the true direction. Although we have developed an error metric to evaluate the quality of matching, the error metric sometimes produces results that don't match those expected by a human. Thus, we used human evaluations of the matching quality. A user visually categorized the results of registering the picture to the 2D representation of the spatial view in the actual direction of the picture. We categorized the results on a scale from 1 to 7 and considered results satisfactory when they achieved a grade higher than 3.
Table 1 shows the results of the experimentation. With no errors in the captured direction of the picture, the algorithms achieve satisfactory results in 91.3 percent of the cases. For errors in the direction of about 4 degrees, the performance of the algorithms is solid, with 81.2 percent satisfactory results. When the error in the compass measurements is 7 degrees or more, the algorithms often don't have enough information from the 2D picture representation to produce an accurate match, resulting in a lower percentage and quality of successful matches. Although the sample is relatively small, it demonstrates that the use of a compass and semantic maps greatly improves the picture registration results, that the picture registration quality achieved is good, and that deviations in the direction determination might result in significant deterioration of picture registration quality.
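As a quick sanity check on Table 1, the pass percentages follow directly from the pass counts over the 69 test pictures (a trivial helper of ours, not part of SPIM):

```python
def pass_rate(passed, total):
    """Percentage of satisfactory registrations, rounded to one decimal."""
    return round(100 * passed / total, 1)
```

For example, 63 of 69 pictures passing yields the 91.3 percent figure reported for the true-heading case.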
In the experiments, we observed that the boundaries of the blue sky and the mountains are useful for accurate picture registration. We expected this because the location of the experimentation (Crete) had clear blue skies and mountains. However, it's conceivable that other geographic features could also be useful in enhancing the performance results, or in achieving results where a clear separation between mountains and blue sky doesn't exist. We also tested the ocean as a geographic object and examined its capability to improve the registration results achieved by the blue-sky separation. The results showed that the combined use of the two geographic features for picture registration was better than the sole use of blue-sky separation in about 30 percent of the cases. Our experiments indicate that this area of research is promising; we intend to pursue more research in this area in the future.
Time, user location, and event metadata

Events are meaningful ways of modeling the content of pictures. For example, the MPEG-7 Semantic Model is based on event modeling [5, 6]. In SPIM, we use a part of MPEG-7 for event modeling and capturing. We characterize events by name, location, and time, and might use actors that participate in the events. Events are also organized in semantic event hierarchies. For example, a wedding event might be composed of smaller events like the ceremony and the wedding dinner. A summer vacation in Crete in 2009 could be an event that is subdivided into smaller events of visiting various places within Crete. Summer vacations in Crete in 2009 are of the same type as other summer vacations.
Our system allows specification and browsing of event hierarchies in a simple manner. Additional retrieval and browsing functionalities might allow users to specify events at various levels of the hierarchy. In the current
Table 1. Results from the experimentation on 69 pictures with three error categories in the compass measurements: no error, 4-degree deviation, and 7-degree deviation.

Distinction        True heading   4-degree deviation   7-degree deviation
Perfect (7)              7               10                     3
Good (6)                26               16                    13
Acceptable (5)          19               18                    12
Average (4)             11               12                    14
Bad (3)                  3                7                    11
Awful (2)                1                4                    10
Failed (1)               2                2                     6
Number passed           63               56                    42
Number failed            6               13                    27
Pass percentage       91.3%            81.2%                 60.9%
Fail percentage        8.7%            18.8%                 39.1%
system, the elementary event associated with the picture is automatically determined by the time the picture was taken. That time uniquely determines a leaf in the event hierarchy, which the system uses to associate the picture with all the event-instance-related information. A sophisticated retrieval system would be able to exploit the event hierarchies or the event instance data.
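The timestamp-to-leaf lookup can be sketched as a recursive walk of the event hierarchy (our own structure, not SPIM's actual schema): each event carries a time interval, and the picture's timestamp selects the deepest event that contains it.

```python
def find_leaf(event, t):
    """Return the name of the deepest event whose interval contains time t.

    event: {'name': ..., 'start': ..., 'end': ..., 'children': [...]}
    Times just need to be comparable (numbers, datetimes, ...).
    """
    if not (event["start"] <= t <= event["end"]):
        return None
    for child in event.get("children", []):
        leaf = find_leaf(child, t)
        if leaf is not None:
            return leaf
    return event["name"]
```

A picture timestamped during the ceremony is thus attached to the "ceremony" sub-event, and through it to the enclosing "wedding" event.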
We explicitly model and associate with the events the user's location at the time of picture taking (as opposed to the location of the objects that appear within the picture). The user location is captured by the GPS parameters of the Exif file and is automatically converted to a location name using the organization of information in the semantic maps (geographic hierarchies). In addition, the automatic assignment of location names can be exploited in the retrieval interfaces.
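The GPS-to-name conversion can be sketched as a point-in-polygon walk down the geographic hierarchy (again our own simplified structures): each region whose footprint contains the camera position contributes its name, from the coarsest region down.

```python
def point_in_polygon(pt, poly):
    """Ray-casting point-in-polygon test; pt and vertices are (x, y) pairs."""
    x, y = pt
    inside = False
    for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]):
        if (y1 > y) != (y2 > y) and x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
            inside = not inside
    return inside

def location_names(pt, hierarchy):
    """Collect names of hierarchy regions whose footprint contains pt.

    hierarchy: list of {'name': ..., 'footprint': [...], 'children': [...]}.
    """
    names = []
    for region in hierarchy:
        if point_in_polygon(pt, region["footprint"]):
            names.append(region["name"])
            names += location_names(pt, region.get("children", []))
    return names
```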
SPIM software environment and services

The SPIM software offers client-server services to create personalized maps. The server includes a map database and a database containing domain ontologies and individuals, as well as services that can create personalized maps according to the user's interests (domain ontologies, types of concepts, specific individuals). The picture-management software acts as a client program to the semantic map server; it accesses and stores the delivered personalized semantic maps. The software manages the pictures for the user and includes the services for picture capturing, registration, storage, indexing, and object annotation. The software also provides the retrieval functionality as well as the user interfaces for visualization and object interaction.
Figure 2 shows an example of SPIM functionality. The user can specify ontologies, types from each ontology, or even individuals, and the system will decide which individuals satisfy the constraints. SPIM then emphasizes the location of the semantic objects that satisfy the constraints and lists the pictures associated with those semantic objects. The user can ask to see the picture footprint (the geometric representation of its location and cone of view) on top of the map, or select a semantic object from the map and ask to see all the picture footprints from a particular database. The user can see the pictures themselves by selecting a picture footprint or by clicking on a picture thumbnail, and can list the semantic individuals that appear in the picture (villages, churches, and so on).
Figure 3 shows the user interface that allows interactive exploration of the spatial information associated with a picture. The user can point the mouse at a location in the picture. If the mouse is above certain semantic objects, they are highlighted, and the name of the semantic object and relevant information are displayed when clicking on them. The user can choose to hide the semantic individuals and their boundaries to view the original picture.
Conclusions

The research in this article has emphasized the importance of detailed registration of remote scenes on pictures, so that the user can point to rather small objects visible in the picture and interact with them. A wide range of visualization and interactive functionality services for personalized information-management systems can be supported using this capability. Because the accuracy of capturing the location of remote objects is critical for such interactions, we are currently performing more research on integrating additional
Figure 2. An example of the SPIM user interface. Semantic objects are selected and shown as polygons on a map, and the footprints of selected semantic objects are displayed. The user can select footprints and see the corresponding pictures and their information. Locations of pictures are shown as small circles. The user is able to see what semantic entities are on top of the map and view information about them. In the figure, a semantic individual describing a mountainous area has been selected.
contextual information that is readily available, such as the time of day and year with respect to the current location, the camera azimuth, and so on, in the algorithms that perform picture registration. We are also exploring more alternative functionalities for personalized information-management systems that are enabled by the detailed picture registration to the 3D scenes. MM
References

1. Exif Version 2.2 Digital Still Camera Image File Format Standard, Japan Electronics and Information Technology Industries Assoc., 2002; http://www.exif.org/Exif2-2.pdf.
2. R. Franklin and C.K. Ray, "Higher Isn't Necessarily Better: Visibility Algorithms and Experiments," Advances in GIS Research: Proc. 6th Int'l Symp. Spatial Data Handling, T.C. Waugh and R.G. Healey, eds., Taylor & Francis, 1994, pp. 751-770.
3. R. Nock and F. Nielsen, "Statistical Region Merging," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 26, no. 11, 2004, pp. 1452-1458.
4. J.R. Beveridge and E.M. Riseman, "How Easy Is Matching 2D Line Models Using Local Search?" IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 6, 1997, pp. 564-579.
5. C. Tsinaraki and S. Christodoulakis, "An MPEG-7 Query Language and a User Preference Model that Allow Semantic Retrieval and Filtering of Multimedia Content," ACM/Springer Multimedia Systems J., special issue on semantic multimedia adaptation and personalization, vol. 13, no. 2, 2007, pp. 131-153.
6. C. Tsinaraki, P. Polydoros, and S. Christodoulakis, "Interoperability Support between MPEG-7/21 and OWL in DS-MIRF," IEEE Trans. Knowledge and Data Engineering, special issue on the Semantic Web era, vol. 19, no. 2, 2007, pp. 219-232.
Stavros Christodoulakis is a professor and the director of the MUSIC/TUC laboratory at the Department of Electronic and Computer Engineering, Technical University of Crete. His research interests include information systems, multimedia, semantics, and interoperability. Christodoulakis has a PhD in computer science from the University of Toronto, Canada. Contact him at stavros@ced.tuc.gr.

Michalis Foukarakis is a graduate student in electronic and computer engineering at the Technical University of Crete, where he works as a research assistant at the Laboratory of Distributed Multimedia Information Systems and Applications. His research interests include semantic spatial image management and ontologies. Foukarakis has an MS in electronic and computer engineering from the Technical University of Crete. Contact him at foukas@ced.tuc.gr.

Lemonia Ragia is a research assistant at the University of Geneva. Her research interests include data mining, spatial databases and data, model management and schema matching, and high-performance visualization of spatial information. Ragia has a PhD in photogrammetry from the Institute of Photogrammetry, Bonn, Germany. Contact her at lemonia.ragia@cui.unige.ch.

Hiroaki Uchiyama is a software engineer at Ricoh. His research interests include Bluetooth technology and developing business-oriented digital cameras using Bluetooth, WiFi, GPS, and barcode functionalities. Uchiyama has an MS in electrical and electronics engineering from Sophia University, Tokyo. Contact him at ucchon@nts.ricoh.co.jp.

Takuya Imai is a software engineer at Ricoh. His research interests include implementing Bluetooth technology for digital cameras and innovative application research for Bluetooth-equipped devices. Imai has a BS in mechanical engineering from Meiji University, Tokyo. Contact him at takuya.imai@nts.ricoh.co.jp.
Figure 3. Interactive exploration of image contents. The user is able to select semantic objects depicted in the picture and obtain relevant information about them.