
Populating Semantic Virtual Environments

Abstract

The capacity for knowledge representation within simulated environments is a growing field of research geared towards the inclusion of detailed descriptors within the environment that provide a virtual agent with the knowledge necessary to afford decision-making and interaction. One such layer of representation concerns the natural-language, or semantic, depiction of the environment and the objects within it. However, even as technology grows more capable of representing such information-rich environments, the process of authoring and injecting this knowledge has largely remained a manual effort. Not only is this effort arduous for the author in the time and energy it requires, but the process also tends to limit the ontology of the simulation, further constraining the extent to which such knowledge may be used.

Here we offer an affordable method for semi-automating the generation and injection of semantic properties into a virtual environment for the purpose of producing more natural agent-object interaction.

Keywords: Smart Objects, Autonomous Actors, Virtual Humans, Semantics and Ontologies for Virtual Environments

1 Introduction

The utilization of semantic representation as a key component in the construction of virtual environments is a growing practice in both academia and industry, particularly as a means of affording virtual agents a greater degree of autonomy [1]. Nevertheless, the current process by which semantics and their respective ontologies are constructed is unable to adequately keep up with what is required of them.

Figure 1: A ShrimpPlatter virtual object with embedded semantic properties.


We present in this paper a method and associated set of tools to facilitate the construction of semantic ontologies for virtual environments, as well as the generation of the semantics themselves. Section 2 discusses related work in the field of semantic ontologies and their application. Section 3 further explores the issues surrounding current efforts to incorporate increasingly large semantic ontologies into virtual environments. Section 4 presents the tools we authored in response to these issues and details their function and capability. The paper concludes with a discussion of potential applications of this approach, as well as future work.

2 Related Work

The concept of object affordance offers us a framework for characterizing an agent's ability to perceive and interact with the environment in which it exists. While the notion of an affordance varies in its use within the field of human-computer interaction [2], the definition as originally put forth by psychologist James J. Gibson [3] forms the basis on which this research is motivated.

Gibson's affordances are highlighted through three defining characteristics [2]:

• Affordances present to an agent potential interactions it may perform with respect to that object's current environment.

• The existence of an affordance is independent of the agent's ability to perceive it.

• The existence of an affordance is binary.

With particular respect to the first point, affordances encapsulate not only the specific actions themselves but also descriptors which may in turn be used to facilitate the performance of those actions.

Smart objects, as introduced by Kallmann and Thalmann [4], apply the concept of affordances to objects within a virtual environment in a more generic manner. A smart object describes that object's functionality, its possible interactions, and low-level manipulation descriptors in an effort to enable real-time interactions between the agent and the object.

While the notion of smart objects is more concerned with providing abstracted behavioral representations and the functional aspects of an object in application [5, 6, 1], this method of representing affordances directly to the agent provides a base through which most high-level descriptions may be supplied.

The development of natural-language ontologies for the representation of objects and their affordances has been well explored within the field of AI. Bindiganavale et al. [7] pioneered the conceptualization of actions into an ontology to support the execution of actions from natural-language instructions. Pellens et al. [8] introduced the use of semantics for the construction of virtual environments at a conceptual level. Kalogerakis et al. [9] proposed the use of ontologies for structuring the content of objects in a virtual environment using OWL graphs, while Pittarello et al. [10] similarly explored the use of X3D for annotating interactive environments with textual information in a hierarchical form. Balint et al. [11] likewise studied the construction of hierarchies of ontologies using semantic information to facilitate agent behavior.

Our implementation does not attempt to create its own ontology but rather combines existing ontologies into a single framework upon which virtual environments may be enriched and maintained.

3 Incorporating Semantics

When handling the incorporation of semantics into the virtual environment there are two primary issues that need to be addressed:

• The quality of the data being included.

• The inherent trade-off between the quantity of information within a smart object and that object's extensibility.

To ensure the quality of the semantic data within an object we utilize pre-fabricated, open-source lexical corpora. To allow for the inclusion of increasing amounts of object-specific data we explore the further abstraction of smart objects through modularization and runtime-specific attributes.

3.1 Lexical Databases

As stated above, the hand-crafting of object hierarchies and semantic ontologies in the production of virtual environments suffers from numerous pitfalls inherent in the manual approach itself. Such systems are often built around specific simulation applications, resulting in inflexibility with regard to scalability and increased overhead in the time and effort required to maintain the model [1]. Additionally, leaving the simulation author to construct their own ontology not only detracts from time better spent on the application of that ontology, but may also result in models that are, while functional, fundamentally incorrect in their construction, causing problems later on.

Figure 2: A sample object hierarchy as generated through WordNet, extending from the root entity synset through intermediate categories (artifact, container, implement, tool) down to scene objects such as Spoon, Coffee Mug, KetchupBottle, and Hammer.

Many research endeavours have attempted to assist in the production of relational hierarchies through means such as crowdsourcing [12, 13]. Crowdsourcing by nature operates under an open-world model, under the formal-logic assumption that the truth of a statement is independent of whether or not it is known by any one individual to be true [14]. Irrespective of preventative steps taken to ensure the integrity of collected input, gathering information from such a diverse population yields data that may be misleading, irrelevant, or otherwise incorrect. This approach is fundamentally in conflict with the closed-world model under which we construct ontologies for games and simulation, wherein what is known is fact and what is not is assumed false. While crowdsourced data certainly has its applications, its drawbacks make it highly impractical for use within the context of a closed-world system.

To provide a more consistent means of reliable data collection we instead turn to established lexical corpora such as WordNet [15], VerbNet [16], and PropBank [17]. The primary appeal of these corpora is their free availability and ease of access. Additionally, for each corpus there exists a variety of development tools, facilitating not only the manipulation of corpus data but also the incorporation of this data into a development cycle. Finally, the corpora are set up such that the information contained within any one corpus may be used in conjunction with the other two for the purposes of creating more complex, complete environments.

3.2 Modularized Smart Objects

As originally shown by Kallmann and Thalmann [4], there exists a trade-off between the specificity of information that may be contained within a smart object and the extensibility of that object with respect to its applicability beyond the given scenario. While the overloading of smart objects ostensibly results in the production of entities that sit in direct contrast to the core principles of the theory, strict adherence to the prescribed level of abstraction effectively limits the extent to which an implementation may otherwise allow for the enhancement of a scene.

Therefore, in order to allow for the inclusion of increasingly detailed, domain-specific information in the construction of smart objects, while maintaining the concept's fundamental notions of generality and extensibility, it is necessary to alter the approach by which smart objects are traditionally constructed by introducing further abstraction in the form of smart object modularization. By further separating an object's knowledge representation from its graphical representation, this approach enables us to treat smart objects as more capable, general entities than previously permitted.

4 Semantic Generation

The system is built in two parts. The pre-runtime component encompasses the two tools for semi-automating the generation of both an object hierarchy and the requisite semantic modules for the produced hierarchy. The runtime component is composed of the Semantic Module Handler (SMH), which acts as an abstracted knowledge layer between simulation agent and object, managing the loading, dissemination, and interaction with all semantics in an environment.

4.1 Hierarchy Generation

All pre-runtime components have been built using Python, an interpreted object-oriented language [18]. Python scripts are easily testable and capable of cross-platform operation. The process by which object hierarchies are generated makes heavy use of the WordNet corpus, accessed through the Natural Language Toolkit (NLTK) [19].

The WordNet database itself is organized into a hierarchy of synsets, or synonym sets, of the form word.partOfSpeech.synsetNumber (e.g. rabbit.n.01). Each synset represents a single meaning of a given word, of which there may be several. The WordNet hierarchy itself is a directed acyclic graph with all nodes extending from the most general form, entity.n.01. The challenge in constructing hierarchies from this data therefore comes from first associating each object in a scene with its proper synset definition, then trimming the resulting hierarchy into a tree structure more suitable for use in simulation.
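To make the synset structure concrete, the following is a minimal sketch of querying WordNet through NLTK; it assumes only that the WordNet corpus has been downloaded (e.g. via nltk.download('wordnet')) and is not part of the tool itself.

from nltk.corpus import wordnet as wn

# Each synset is named word.partOfSpeech.synsetNumber, e.g. rabbit.n.01.
rabbit = wn.synset('rabbit.n.01')
print(rabbit.definition())   # the gloss for this particular sense of "rabbit"

# Every noun synset can be traced back to the root entity.n.01; one such
# hypernym path forms a branch of the generated object hierarchy.
path = rabbit.hypernym_paths()[0]
print(' -> '.join(s.name() for s in path))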

To assist the simulation author in the construction of an object hierarchy, all that is required is a list of all the unique objects to be represented within a scene, labeled in the form

object[ : keywords ]

The keyword arguments constitute assistive text used in determining the proper synset to be associated with the given object. While optional, if none are provided the author may be prompted to select a specific definition upon hierarchy generation.

For example, if an environment is to include a number of common household pets, the author may include the following:

rabbit : fur ears lagomorph
cat
dog : canine mammal friend

An example hierarchy generated using this method is presented in Figure 2.
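The exact disambiguation heuristic is not spelled out here, but a plausible sketch is to score each candidate synset for an object by the overlap between the author-supplied keywords and the synset's gloss, lemmas, and hypernyms. The function below (pick_synset is an illustrative name, not part of the tool) demonstrates that idea.

from nltk.corpus import wordnet as wn

def pick_synset(line):
    """Map a line of the form 'object[ : keywords ]' to a WordNet noun synset."""
    obj, _, keys = (part.strip() for part in line.partition(':'))
    keywords = set(keys.lower().split())
    candidates = wn.synsets(obj, pos=wn.NOUN)
    if not candidates:
        return None
    if not keywords:
        # No hints given: the tool would prompt the author to choose a sense.
        return candidates[0]
    def score(synset):
        words = set(synset.definition().lower().split())
        words.update(l.lower() for l in synset.lemma_names())
        for hyper in synset.hypernyms():
            words.update(l.lower() for l in hyper.lemma_names())
        return len(words & keywords)
    return max(candidates, key=score)

print(pick_synset('rabbit : fur ears lagomorph').name())   # e.g. rabbit.n.01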

Once a hierarchy is generated, a MySQL table is injected into the simulation database, ready for immediate use by the author. Additionally, at this stage all appropriate semantic modules are generated and the object-module associations stored within the database.
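The database schema itself is not specified in the text; the sketch below shows one hypothetical layout for storing hierarchy edges and object-module associations, using Python's built-in sqlite3 purely to keep the example self-contained (the actual tool targets MySQL).

import sqlite3

def store_hierarchy(db_path, edges, object_modules):
    """edges: (parent_synset, child_synset) pairs forming the trimmed tree;
    object_modules: (object_name, module_file) associations."""
    con = sqlite3.connect(db_path)
    cur = con.cursor()
    cur.execute("CREATE TABLE IF NOT EXISTS hierarchy (parent TEXT, child TEXT)")
    cur.execute("CREATE TABLE IF NOT EXISTS object_module (object TEXT, module TEXT)")
    cur.executemany("INSERT INTO hierarchy VALUES (?, ?)", edges)
    cur.executemany("INSERT INTO object_module VALUES (?, ?)", object_modules)
    con.commit()
    con.close()

store_hierarchy("scene.db",
                [("entity.n.01", "physical_entity.n.01"),
                 ("physical_entity.n.01", "object.n.01")],
                [("Bowl", "bowl_01.mod")])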

Objects   Found   Mismatch   Accuracy (%)   Time (s)
50        49      0          98.00          12.97
100       100     5          95.00          18.70
150       150     5          96.67          24.83
200       198     4          97.00          33.02
250       249     3          98.40          37.06
300       299     8          97.00          42.07
350       349     5          98.29          55.16
400       400     11         97.25          62.19
450       447     11         96.89          67.98
500       498     6          98.40          69.96

Figure 4: Testing of the hierarchy generation process was conducted over an increasing number of randomly-generated objects, measuring the degree to which synsets were correctly mapped back.

Tests were performed by creating object groupings of various sizes, consisting of objects extracted from randomly-selected WordNet synsets. For each hierarchy, the items were organized into a list in the previously described object : keywords format and the resulting list was used to generate a hierarchy using our tool. The output list of paired synsets was then compared with the original list to determine overall accuracy.
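As a rough sketch of this accuracy measure (the function below is illustrative, not the actual test harness), each generated (object, synset) pair is checked against the synset the object was originally drawn from:

def hierarchy_accuracy(generated, ground_truth):
    """Both arguments map object name -> synset name (e.g. 'bowl' -> 'bowl.n.01')."""
    found = [obj for obj in ground_truth if obj in generated]
    mismatches = [obj for obj in found if generated[obj] != ground_truth[obj]]
    accuracy = 100.0 * (len(found) - len(mismatches)) / len(ground_truth)
    return len(found), len(mismatches), accuracy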

This method of generation has been shown to be exceedingly capable in its ability to accurately construct object hierarchies, with an average success rate of 97.29%, as demonstrated in Figure 4. The time shown covers not only the construction of the hierarchy, but also the generation of all related semantic modules, as well as the writing of all necessary information to the database. For every 50 objects there is an average increase in overhead of 6.33 seconds (Figure 5).

The process itself is a one-time overhead; once a hierarchy is produced there is no need to re-run the tool except when modifications to the hierarchy itself are made.

4.2 Semantic Modularization

The underlying aim of modularizing smart object components to facilitate the construction of better, more accessible virtual environments relies not only on the ability of the system to manage this information during runtime, but also on the capacity to organize and manage this information during the design process. For these reasons, an XML-like schema was utilized, as demonstrated in Figure 6.

Figure 3: The agent interacts with the virtual environment with the assistance of semantic affordances.

Figure 5: Increasing the number of unique objects in a scene has minimal effect on the time taken to compute a new hierarchy and generate the appropriate semantic modules.

The structure of each file is designed to be easy to maintain and extend by the simulation author. The name of the file corresponds to the specific WordNet synset from which it was produced, avoiding conflicts between similarly-named objects. Semantic affordances are stored as qualitative properties of the form <type value>. This allows us to further delineate our semantics into textual descriptors relating to higher-level illustrative properties and those relating more directly to specific actions afforded by the object.

Modules themselves are not limited to establishing attributes of specific objects, as the example demonstrates, but may instead be defined as collections of related affordances of any general type. As a result, it is possible to have modules such as Dry, Mammal, or John.

bowl_01.mod

<name>bowl_01</name>
<modType>object</modType>
<class>PhysicalObject</class>
<state>entity</state>
<qualitative>
  <prop val="round"/>
  <prop val="vessel"/>
  <prop val="holding"/>
  <prop val="top"/>
  <prop val="food"/>
  <prop val="container"/>
  <act val="hold"/>
  <act val="contain"/>
</qualitative>

Figure 6: A sample semantic module containing relational semantics for the object bowl.
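The text does not detail exactly how the qualitative properties are derived, but the example above is consistent with harvesting descriptive words from the synset's WordNet gloss. The sketch below illustrates that idea only; the stop-word list and generation rules are assumptions, and only the file format mirrors Figure 6.

from nltk.corpus import wordnet as wn

# Illustrative stop-word list for filtering gloss words; not the tool's own.
STOPWORDS = {"a", "an", "the", "that", "is", "at", "of", "or", "for", "and", "used", "chiefly"}

def module_text(synset_name, mod_class="PhysicalObject", state="entity"):
    """Emit an XML-like module body in the style of Figure 6."""
    synset = wn.synset(synset_name)
    words = [w.strip(";,.") for w in synset.definition().lower().split()]
    props = [w for w in words if w and w not in STOPWORDS]
    lines = [f"<name>{synset_name.replace('.n.', '_')}</name>",
             "<modType>object</modType>",
             f"<class>{mod_class}</class>",
             f"<state>{state}</state>",
             "<qualitative>"]
    lines += [f'  <prop val="{p}"/>' for p in props]
    lines.append("</qualitative>")
    return "\n".join(lines)

print(module_text("bowl.n.01"))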

To specify the relation of modules to their respective virtual objects at runtime, the existence of these modules is stored in an indexed list within the provided runtime database. Correspondences between object and module are then stored in a separate table, which is referred to when loading objects into an environment.

4.3 Runtime Performance

The management of semantic modules at runtime is controlled by a separate handler, the SMH, in an implementation similar to the actionary developed by Bindiganavale et al. [7]. A hierarchical approach was necessary to allow for a greater level of precision and control over the dissemination of semantics across an increasingly complex virtual environment.

Figure 7: An overview of the semantic injection system hierarchy: the Semantic Module Handler connects the module and affordance libraries to the site regions of each smart object in the environment.

Modules are loaded into the simulation once, on a per-need basis, using the open-source tool RapidXML [20] to limit the overhead cost to fractions of a second per file. As each file is loaded, a corresponding container, an SMod, is created into which the appropriate semantics are stored, to be distributed to further instances of the object without necessitating subsequent re-readings. Instances of smart objects themselves are appended with site containers which act as intermediaries between the object and the handler. The site containers themselves store semantic regions, which are capable of representing individual segments or the whole of an object. It is into these regions that semantics from the handler are transferred and subsequently interacted with. The complete system hierarchy is illustrated in Figure 7.
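The following structural sketch (written in Python for brevity, although the runtime component itself reads module files with RapidXML) mirrors the containers described above. Apart from SMod and the Semantic Module Handler, which the text names, the class and attribute names are illustrative assumptions.

class SMod:
    """Semantics parsed from one module file, shared by all instances of an object."""
    def __init__(self, name, semantics):
        self.name = name
        self.semantics = semantics            # e.g. ["round", "vessel", "hold", ...]

class Region:
    """A semantic region covering a segment (or the whole) of a smart object."""
    def __init__(self):
        self.semantics = []

class Site:
    """Intermediary between a smart-object instance and the handler."""
    def __init__(self, region_count=1):
        self.regions = [Region() for _ in range(region_count)]

class SemanticModuleHandler:
    def __init__(self):
        self.module_library = {}              # module name -> SMod, loaded once

    def attach(self, site, smod):
        """Distribute a module's semantics into an object's site regions."""
        self.module_library.setdefault(smod.name, smod)
        for region in site.regions:
            region.semantics.extend(smod.semantics)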

Given a hierarchy constructed from 50 objects chosen at random, with a resulting 300 generated semantic values, the scalability of the system was tested over a set of randomly-generated environments. The two primary areas of concern regarding the applicability of the given implementation were the initial overhead of injecting semantics into the environment (Figure 8) and the processing time required for semantic-based agent-object interaction (Figure 10).

The processes of loading an environment and injecting it with the necessary semantics occur in parallel. Therefore, in order to measure the speed at which semantic affordances were both loaded into the system and disseminated across all appropriate virtual objects, it was necessary to first calculate the average time required to load each scene without the given affordances. From this we were able to approximate the injection speed of semantics within each environment, as shown in Figure 8. Overall, the speed at which semantics were injected increased at a rate of roughly two seconds per 100 objects, or 600 semantic inclusions (Figure 9).

Objects   Approx. Semantic Count   Injection Speed (s)   Total Load Time (s)
100       600                      13                    36
200       1200                     14                    37
300       1800                     17                    40
400       2400                     18                    41
500       3000                     21                    44

Figure 8: The measured performance of the distribution of semantic affordances across a suite of increasingly large randomly-generated environments.

Figure 9: The time necessary to inject a virtual environment with semantic affordances, compared to the total load time of the scene.

While the preprocessing overhead required for injection is demonstrably linear with respect to the size of the environment, it must also be computationally feasible to utilize these semantics during runtime, particularly when carrying out agent tasks. To test this, the agent was placed within the generated environments and directed to walk from one end to the other, such that all objects entered its field of perception. Figure 10 presents the results of these tests. The complexity of an agent's search through its environment is roughly O(log(n)) with respect to the number of searchable semantics.

Objects   Approx. Semantic Count   Average Search Time (ms)
100       600                      1
200       1200                     2
300       1800                     2
400       2400                     3
500       3000                     3

Figure 10: The process of searching through the semantics of objects within an agent's field of perception has a negligible impact on the overall performance of the simulation.
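One way to obtain the logarithmic search behaviour reported above would be to keep the semantics within an agent's perception in a sorted index and locate matches by binary search. The sketch below is illustrative only, under that assumption; it does not claim to reproduce the actual search implementation.

import bisect

class PerceivedSemantics:
    """Sorted index of (semantic label, object id) pairs within the agent's perception."""
    def __init__(self):
        self._labels = []
        self._objects = []

    def add(self, label, obj_id):
        i = bisect.bisect_left(self._labels, label)
        self._labels.insert(i, label)
        self._objects.insert(i, obj_id)

    def find(self, label):
        """Return all object ids carrying `label`, located via O(log n) binary search."""
        i = bisect.bisect_left(self._labels, label)
        matches = []
        while i < len(self._labels) and self._labels[i] == label:
            matches.append(self._objects[i])
            i += 1
        return matches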

5 Conclusion

In this paper we have proposed a method geared towards facilitating the construction of semantic ontologies for virtual environments, as well as an approach for the generation of semantic affordances and their inclusion within these environments. The method has been shown to be computationally efficient in its implementation and scalable across diverse environments of varying size and semantic inclusion.

A future extension of this work would be to more fully integrate the framework with high-level agent planning systems, using the embedded semantics in a more direct manner to assist behaviour and decision-making processes.

References

[1] Tim Tutenel, Rafael Bidarra, Ruben M. Smelik, and Klaas Jan De Kraker. The role of semantics in games and simulations. Computers in Entertainment (CIE), 6(4):57, 2008.

[2] Joanna McGrenere and Wayne Ho. Affordances: Clarifying and evolving a concept. In Graphics Interface, volume 2000, pages 179–186, 2000.

[3] James Jerome Gibson. The ecological approach to visual perception. Routledge, 1986.

[4] Marcelo Kallmann and Daniel Thalmann. Modeling behaviors of interactive objects for real-time virtual environments. Journal of Visual Languages & Computing, 13(2):177–195, 2002.

[5] Jean-Luc Lugrin and Marc Cavazza. Making sense of virtual environments: action representation, grounding and common sense. In Proceedings of the 12th International Conference on Intelligent User Interfaces, pages 225–234. ACM, 2007.

[6] Marc Cavazza, Simon Hartley, Jean-Luc Lugrin, and Mikael Le Bras. Qualitative physics in virtual environments. In Proceedings of the 9th International Conference on Intelligent User Interfaces, pages 54–61. ACM, 2004.

[7] Rama Bindiganavale, William Schuler, Jan M. Allbeck, Norman I. Badler, Aravind K. Joshi, and Martha Palmer. Dynamically altering agent behaviors using natural language instructions. In Proceedings of the Fourth International Conference on Autonomous Agents, AGENTS '00, pages 293–300, New York, NY, USA, 2000. ACM.

[8] Bram Pellens, Olga De Troyer, Wesley Bille, Frederic Kleinermann, and Raul Romero. An ontology-driven approach for modeling behavior in virtual environments. In Proceedings of the 2005 OTM Confederated International Conference on On the Move to Meaningful Internet Systems, OTM'05, pages 1215–1224, Berlin, Heidelberg, 2005. Springer-Verlag.

[9] Evangelos Kalogerakis, Stavros Christodoulakis, and Nektarios Moumoutzis. Coupling ontologies with graphics content for knowledge driven visualization. In Virtual Reality Conference, 2006, pages 43–50. IEEE, 2006.

[10] Fabio Pittarello and Alessandro De Faveri. Semantic description of 3D environments: a proposal based on web standards. In Proceedings of the Eleventh International Conference on 3D Web Technology, pages 85–95. ACM, 2006.

[11] John T. Balint and Jan M. Allbeck. MacGyver virtual agents: using ontologies and hierarchies for resourceful virtual human decision-making. In Proceedings of the 2013 International Conference on Autonomous Agents and Multi-agent Systems, pages 1153–1154. International Foundation for Autonomous Agents and Multiagent Systems, 2013.

[12] Massachusetts Institute of Technology. ConceptNet 5, 2013. http://conceptnet5.media.mit.edu.

[13] Kai Eckert, Mathias Niepert, Christof Niemann, Cameron Buckner, Colin Allen, and Heiner Stuckenschmidt. Crowdsourcing the assembly of concept hierarchies. In Proceedings of the 10th Annual Joint Conference on Digital Libraries, pages 139–148. ACM, 2010.

[14] Stuart Jonathan Russell, Peter Norvig, John F. Canny, Jitendra M. Malik, and Douglas D. Edwards. Artificial Intelligence: A Modern Approach, volume 2. Prentice Hall, Englewood Cliffs, 1995.

[15] Princeton University. About WordNet, 2010. http://wordnet.princeton.edu.

[16] Martha Palmer and Karin Kipper. VerbNet: A Class-Based Verb Lexicon, 2013. http://verbs.colorado.edu.

[17] Martha Palmer and Mitch Marcus. Proposition Bank, 2013. http://verbs.colorado.edu.

[18] Guido Van Rossum et al. Python programming language. In USENIX Annual Technical Conference, 2007.

[19] Steven Bird, Ewan Klein, and Edward Loper. Natural Language Processing with Python. O'Reilly Media, 2009.

[20] Marcin Kalicinski. RapidXml, 2009. http://rapidxml.sourceforge.net/.

[21] Marcelo Kallmann and Daniel Thalmann. Modeling objects for interaction tasks. In Computer Animation and Simulation '98, pages 73–86. Springer, 1999.

[22] Tolga Abaci, Jan Ciger, and Daniel Thalmann. Planning with smart objects. In WSCG (Short Papers), pages 25–28. Citeseer, 2005.

[23] Isaac Dart and Mark J. Nelson. Smart terrain causality chains for adventure-game puzzle generation. In Computational Intelligence and Games (CIG), 2012 IEEE Conference on, pages 328–334. IEEE, 2012.

[24] Tolga Abaci and Daniel Thalmann. Planning with smart objects. In The 13th International Conference in Central Europe on Computer Graphics, Visualization and Computer Vision: WSCG, pages 25–28, 2005.

[25] Henry Lieberman, Hugo Liu, Push Singh, and Barbara Barry. Beating common sense into interactive applications. AI Magazine, 25(4):63, 2004.