
PFG (Photogrammetrie • Fernerkundung • Geoinformation) 2013 / 3, 0221–0237. Report. Stuttgart, June 2013

© 2013 E. Schweizerbart'sche Verlagsbuchhandlung, Stuttgart, Germany. www.schweizerbart.de. DOI: 10.1127/1432-8364/2013/0172. 1432-8364/13/0172 $ 0.00

Automatic Detection and Classification of Objects in Point Clouds using multi-stage Semantics

HUNG QUOC TRUONG, HELMI BEN HMIDA, FRANK BOOCHS, Mainz, ADLANE HABED, CHRISTOPHE CRUZ, YVON VOISIN & CHRISTOPHE NICOLLE, Dijon, France

Keywords: 3D processing, point clouds, knowledge modelling, ontology, scene object classification

Summary: Due to the increasing availability of large unstructured point clouds obtained from laser scanning and/or photogrammetric data, there is a growing demand for automatic processing methods. Given the complexity of the underlying problems, several new methods try to use semantic knowledge, in particular for supporting object detection and classification. In this paper, we present a novel approach which makes use of advanced algorithms to benefit from intelligent knowledge management strategies for the processing of 3D point clouds and for object classification in scanned scenes. In particular, our method extends the use of semantic knowledge to all stages of the processing, including the guidance of the 3D processing algorithms. The complete solution consists of a multi-stage iterative concept based on three factors: the modelled knowledge, the package of algorithms, and the classification engine. Two case studies illustrating our approach are presented in this paper. The studies were carried out on scans of the waiting area of an airport and along the tracks of a railway. In both cases the goal was to detect and identify objects within a defined area. With our results we demonstrate the applicability of our approach.

Zusammenfassung (translated): Automatic detection and classification of objects in point clouds using multi-layered semantics. As a result of the increasing availability of large unstructured point clouds from laser scanning and photogrammetry, there is a growing demand for automated evaluation methods. Given the often high complexity of the objects contained in the point clouds, purely data-driven approaches reach their limits. Concepts are increasingly emerging that make use of semantics in various ways. In these, semantics and algorithmics are often tightly interwoven, which leads to limitations in the kind and extent of usable semantics. In the solution presented here, algorithmics and semantics are clearly separated and treated with tools tailored exactly to these domains. Their procedural linkage then leads to a new processing concept which, to the best of our knowledge, possesses a previously unattained flexibility and versatility in the use of the most diverse semantics, and which also integrates the control of the algorithms. The iterative overall solution rests on three pillars, namely the modelled knowledge, the pool of algorithms and the identification process. Achievable results are documented with two examples. One example deals with the analysis of point clouds from clearance measurement along railway tracks, the second with rooms in an airport. In both cases, certain object types must be found and classified.

1 Introduction

Object detection, recognition and reconstruction from digitized data, typically images and point clouds, are important tasks that find applications in many fields. Because such processing tasks are extremely laborious and difficult when carried out manually, it is of the utmost importance that they benefit from the support of – or even be entirely performed through – numerical algorithms. Most existing algorithms are data-driven and rely both on extracting discriminating features from the dataset and on numerical models characterizing either geometric, e.g. flatness and roughness, or physical, e.g. colour and texture, properties of the sought objects. The numerical model and the extracted features are combined to form a decision. These methods are generally affected by the nature of the dataset and the behaviour of the algorithms. In practice, it is up to the user to decide, often subjectively but generally based on one's experience, which algorithms are better suited for any particular kind of objects and/or datasets. It goes without saying that the success of these approaches is significantly compromised by the increasing complexity of the objects and the decreasing quality of the data. Furthermore, relying on only a restricted set of features and individual algorithms to process the data might lead to unreliable results. One way to overcome the drawbacks of the data-driven approaches is to resort to the use of additional knowledge. For instance, knowledge characterizing the objects to be detected with respect to the data at hand, or their relationships to other objects, may generally be derived beforehand. Such knowledge not only allows for a systematic characterization and parameterization of the objects but also supports the quantification of the effectiveness of the algorithms to be used.

The work presented in this paper precisely aims at efficiently exploiting additional knowledge in the processing of point clouds. In particular, our work bridges semantic modelling and numerical processing strategies in order to benefit from knowledge in any or all parts of an automatic processing chain. Our approach is based on structuring various knowledge components into an ontology containing a variety of elements taken from multiple sources such as digital maps and geographical information systems. However, we do not only rely on information about objects potentially present in the scene, i.e. their characteristics, a hierarchical description of their sub-components, and spatial relationships, but also on the characteristics of the processing algorithms at hand. During processing, the modelled knowledge guides the algorithms and supports both the analysis of the results and the object classification. Knowledge is also used to support the choice among different algorithms, the combination of these, and the adopted strategies. Our main contribution is a comprehensive set-up to model and use knowledge from various domains and to let it interact and contribute to all steps of an object detection process. This starts with inferring steps controlling algorithms based on object and scene related knowledge in order to select adapted algorithmic strategies, and ends with a knowledge-based object classification and simultaneous extension and updating of the knowledge base (KB).

Our paper is structured as follows. An overview of the relevant literature on the topic is presented in section 2. Our proposed solution is outlined in section 3. Knowledge building and knowledge management are discussed in section 4. Section 5 is dedicated to our knowledge-based strategy for object detection and classification. This is followed by two case studies involving real-world examples in section 6. Our conclusion and future work are given in section 7.

2 State of the Art

Early 3D processing techniques were either data-driven or model-driven and often based on statistical approaches. Many such methods, generally based on fitting techniques employing local or global optimization and statistical regression, often in conjunction with the random sample consensus (RANSAC) algorithm for robustness, have attracted and continue to attract significant attention (NURUNNABI et al. 2012). However, many data-driven methods, in particular those relying on the segmentation of data into primitive shapes, are known to be highly sensitive to noise as well as to local deformations (TARSHA-KURDI et al. 2007). Model-driven approaches, while less sensitive to local irregularities, require reliable geometrical models which are often difficult to obtain, especially when dealing with complex scenes (HUANG et al. 2011). However, despite the robustness and efficiency of many such processing algorithms, they cannot resolve ambiguities when assigning semantic labels to objects in a scene. Such ambiguities can be efficiently dealt with when integrating semantic knowledge with data processing (see for instance BUSCH et al. 2005, HELMHOLZ et al. 2012).

As far as feature-based object recognition is concerned, some of the approaches have been used both in 2D images and in 3D data. For instance, VOSSELMAN & DIJKMAN (2001) made use of higher-level 3D features such as simple roof shapes, i.e. flat roofs, gable roofs and hip roofs, which are generally present in building structures. The authors relied on the use of the 3D Hough transform to detect planar roof faces in point clouds, and hence reconstructed the scene at a higher level of abstraction. Their segmentation strategy was based on detecting intersecting lines and "height jump edges" between planar faces. PU & VOSSELMAN (2006) used segmentation and feature extraction algorithms to recognize building components such as doors, walls and windows from point clouds. Based on constraints on these components, they were able to determine the categories to which each extracted feature belonged. However, the results were not satisfactory if the data did not clearly describe an object due to either the presence of noise or occlusions.

An important processing approach, which partly solves some limitations of data-driven methods, makes use of artificial intelligence techniques to enforce the robustness of the processing and to allow for the recognition of more complex objects. A typical work in this category is the one presented by ANGUELOV et al. (2005), in which object segmentation and classification are obtained by a learning procedure employing Markov random fields and quadratic programming. Such methods generally require a large number of training datasets in order to obtain good results.

Building on the above results, significant improvements have been achieved in 3D data processing by additionally incorporating semantic aspects. The method proposed by CANTZLER et al. (2002) relies on a semantic network defining the relationships between objects in a scene, such as walls being perpendicular to the floor, and rules which the extracted features must obey. However, problems arise when dealing with complex indoor scenes possibly including many types of objects. HEDAU et al. (2010) located objects of a specific geometry in an indoor scene. The detector computes the 3D location of an object along with its orientation using the geometry and the mutual arrangement of the object and the scene as well as a single image. Although quite useful for scene understanding, such an approach is limited to dealing with the case of a single object in the scene.

Localizing multiple objects in a scene has proved to be a difficult and challenging problem that often requires considering spatial and/or semantic relationships between objects. One way to address such a problem is to resort to the use of semantic knowledge. The ability to exploit semantic knowledge is limited when the number of objects becomes large, as it requires an adequate way of structuring properties of objects and relationships between them. In some approaches, this is carried out through a hierarchical description of the attributes of each object and those of the scene. For instance, TEBOUL et al. (2010) segmented building facades using a derivation tree representing the procedural geometry, and connected knowledge representation by grammars with machine learning. Furthermore, this approach proposed a dynamic way of performing a search through a perturbation model. RIPPERDA & BRENNER (2006) also extracted building facades using a structural description and used reversible jump Markov chain Monte Carlo (GREEN 1995) to guide the application of derivation steps during the building of the tree. Another application of using knowledge is to infer missing parts from detected parts. For example, PU & VOSSELMAN (2009) reconstructed building facades from terrestrial laser scanning data. Knowledge about size, position, orientation and topology is used to recognize features, e.g. walls, doors and windows, and also to hypothesize the occluded parts. In a similar work (SCHOLZE et al. 2002), a model-based reconstruction method was proposed. In this method, semantic knowledge is also used to infer missing parts of the roof and to adjust the overall roof topology. These approaches use knowledge to evaluate results of numerical processes, but do not integrate it into the processing as such.

Since the use of knowledge is also useful within the processing chain, other works have focused on knowledge management within computation. For example, MAILLOT & THONNAT (2008) used a visual concept ontology composed of visible features such as spatial relationships, colour and texture to recognize objects by matching numerical features and visual concepts. DURAND et al. (2007) proposed a recognition method based on an ontology which has been developed by experts of the domain; the authors also developed a matching process between objects and the concepts of the ontology to provide objects with a semantic meaning. Interest is also growing in developing knowledge-based systems for various data processing tasks such as data segmentation and registration, but also for scene understanding and interpretation. For instance, TRINDER et al. (1998) proposed a knowledge-based method which automatically extracts roads from aerial images. The description of roads includes radiometric and geometric properties and spatial relationships between road segments, all formulated as rules in PROLOG. The knowledge base stores structures of roads and relationships between them extracted from images. By using topological information of road networks, the method is able to predict missing road segments. However, the semantic model used is limited to one type of objects (roads). GROWE & TONJES (1997) presented a knowledge-based approach for the automatic registration of remotely sensed images. Knowledge is explicitly represented using semantic nets and rules. Prior knowledge about scene objects and a geographic information system (GIS) are used to select and match the best set of features. MATSUYAMA (1987) proposed a method for automatic interpretation of remotely sensed images. The approach emphasises the use of knowledge management and control structures in aerial image understanding systems: a blackboard model for integrating diverse object detection modules, a symbolic model representation for 3D object recognition, and the integration of bottom-up and top-down analyses. Two kinds of knowledge are considered in their expert system: knowledge about objects and knowledge about analysis tools, e.g. image processing techniques. ROST & MÜNKEL (1998) proposed a knowledge-based system that is able to automatically adapt image processing algorithms to changes in the environment. The method uses expert knowledge that is explicitly formulated by rules. Depending on a given task, the system selects a sequence of relevant image processing tools and adjusts their parameters to obtain results with some predefined quality goals. Results on object contour detection, carried out in various conditions, show the benefit of taking into account expert knowledge for adjusting the parameters of various image processing operators. However, knowledge in these approaches has not been fully exploited: other capabilities, such as processing guidance, have not been explored.

Knowledge-based methods have the ability not only to manage and exploit geometric and/or topological relations between objects, but also to embed scene structures into semantic frameworks. Such knowledge is often translated into geometric constraints that can be used to improve object detection. Various kinds of knowledge-based methods have appeared for applications in object detection, demonstrating a clear and increasing interest in such approaches. This expresses a certain expectation about the role of semantics in future processing solutions. A step forward towards benefiting from the use of knowledge in such solutions would be a comprehensive approach that exploits knowledge in all processes, i.e. in guiding the numerical processing, evaluating, and classifying detected objects. Such an approach is proposed in this paper.

3 System Overview

When attempting to build an integrated approach with knowledge directing all parts of the process, several aspects have to be considered. At first, the whole process needs to be incorporated into a knowledge management tool. Therefore, it is necessary to have a process guiding all individual steps, leading from an initial situation to the final result. Inside this overall process, one part has to cover the numerical processing and another part has to handle the processing results. This latter part has to evaluate the results, draw conclusions about what has been found, and also what this means for further processing. This includes the need to update the content of the database with the objects that have been found. This database has to be managed in a way that every detected object is transferred from some initial state to a final one within the framework of a rule-based system.

The main components of our system are illustrated in Fig. 1. The adopted strategy is applied to the analysis of 3D point clouds, but can also be extended to other data sources. It is based on explicitly formulating prior knowledge of the scene, of spatial relations of objects and of processing algorithms. It is a multi-stage concept based on three components: the modelled knowledge (Fig. 1 left), the package of algorithms (Fig. 1 top-right) and the classification engine (Fig. 1 bottom-right). In the initial stage, the available knowledge is transferred into a KB. Starting from this initial stage, an update process, which invokes the algorithms and the classification engine, is launched. Here, the algorithm selection module (ASM) guides the processing by selecting a set of processing algorithms based on the nature of the target objects, and produces new elements which can be identified. These elements are passed on to the classification engine, which, based on the existing knowledge expressed in the ontology, attempts to apply Semantic Web Rule Language (SWRL) (HORROCKS et al. 2004) rules and description logic (DL) constraints in order to identify the nature or object category of the elements. This classification handles the output obtained from the algorithms. The result of the classification step updates the KB by inserting newly classified or updating already existing elements before running the next stage of processing. The process ends either when all objects are detected and classified or in the absence of any change in the annotation process for a predetermined number of iterations (whose value remains at the discretion of the user).

Objects are represented by a point cloud or possibly by data from other sources. Such data depend on many factors such as the type of the sensing system and the measuring/capturing conditions. This representation has to be handled by algorithms which also depend on many additional factors, e.g. noise, other data characteristics, and already existing objects. Strong interrelationships among these factors have a direct influence on the efficiency of the detection and classification processes. The more flexibly these factors and interactions are controlled, the better the results that can be expected. For these reasons, knowledge from different domains is required, and the quality of these various knowledge sets has a significant impact on the results (BEN HMIDA et al. 2011). Our solution relies on four main knowledge categories to construct the core of the KB: the scene knowledge, the spatial knowledge, the data knowledge and the algorithm knowledge. Each field of knowledge is represented by a circle in Fig. 1, and relationships between these concepts are represented by directed edges. The scene knowledge contains information related to the content of the scene to be processed, important characteristics of objects, e.g. geometric features, appearance and texture, and the geometry that composes their structure. Such knowledge is not only important for identification and classification processes but also supports the selection and guidance of the algorithms. The spatial knowledge models the relationships between objects in the scene. It is an important factor for the classification process because it supports an object's state disambiguation based on its relationship with the common environment. The data knowledge expresses important characteristics of the data itself. Finally, the algorithm knowledge characterizes the behaviour of algorithms and determines which purpose they fulfil, which input is expected, which output is generated, and which geometries they are designed for. Based on this knowledge, a dynamic algorithm selection is possible, allowing for a dynamic adaptation to processing situations given from other domains (Fig. 1).

Fig. 1: System architecture.

4 Building Knowledge

The concept requires efficient methods for knowledge representation, management and interaction with algorithms. Efficient knowledge representation tools are available from the semantic web framework, which expresses knowledge through the Web Ontology Language (OWL) (BECHHOFER et al. 2004). The encapsulation of semantics within OWL through description logic (DL) axioms has made it an ideal technology for representing knowledge from almost any discipline. We use OWL to represent expert knowledge about the scene of interest and about the algorithmic processing. With an OWL ontology, we are able to describe complex semantics of a scene. For instance, the statement "A railway track is a linear feature with two linear structures running parallel to each other within a certain distance" can be expressed through logical statements. Likewise, we define the semantics of algorithmic processing within OWL. For example, the CheckParallel algorithm is designed for detecting a Signal, which contains parallel linear structures:

CheckParallel ⊑ ∃ isDesignedFor.Signal
Signal ⊑ ∃ hasParallel.{true} (1)

As an additional technology, SWRL is available. It is a rule language whose rules allow conclusions to be inferred from the KB based on observations and hypotheses. For instance, the following rule (2) asserts that a detected element of class Geometry which lies at a distance of more than 1000 m from a DistanceSignal, has a height equal to or greater than 4 m, and which has a linear structure, will be inferred to be a MainSignal.

Geometry(?x) ^ hasLine(?x, ?l) ^ line(?l) ^ DistanceSignal(?y) ^ DistanceFrom(?x, ?y, ?dis) ^ swrlb:greaterThan(?dis, 1000) ^ hasHeight(?x, ?h) ^ swrlb:greaterThan(?h, 4) → MainSignal(?x) (2)

Variables are indicated by the standard convention in which they are prefixed by a question mark (e.g. ?x). An important SWRL feature is its built-in mechanism: predicates such as swrlb:equal and swrlb:lessThan can be used in rules, and user-defined built-ins help in the interoperation of SWRL with other formalisms and provide an extensible infrastructure for knowledge-based applications.

The techniques mentioned above serve as tools to formalize the identified and acquired knowledge. As explained, the actual solution handles four separate domains: the scene knowledge, the spatial knowledge, the data knowledge and finally the algorithm knowledge. All these knowledge domains have their representations in the domain ontology and participate in the whole processing cycle. The graphical structure of the top-level concepts of the ontology is given in Fig. 2, where we find four main concepts, called classes in the next paragraphs. In order to proceed, these classes have to describe the different actors used during the detection and the classification process in a structured hierarchical way. The main factors that have to be modelled are: processing algorithms, point cloud data or image resources, and target objects with their geometry and characteristics. The class DomainConcept represents the different objects found in the target scene and can be considered the main class in this ontology. This class is further specialized into classes representing the different


a height above ground between 4 m and 6 m; itcomprises a vertical structure that connects toa cube on the ground; at the top, there are twoparallel linear structures; and along the tracks,the distance from an electric pole (type 2) to asignal column is 1000 m within the bounds ofa predefined tolerance, for example ± 0.5 m,depending on the quality of data, measure-ment uncertainty and noise.Knowledge about 3D spatial relationships is

used to enhance the classification process. In-formation about how objects are scattered in a3D scene makes the detection and classifica-tion easier. For instance, given the detectionof a wall, the probability of detecting doors orwindows is higher. 3D spatial knowledge in-cludes standards like the 3D topologic knowl-edge, 3D metric knowledge and 3D process-ing knowledge. Spatial knowledge containsrelationships such as disjoint, contain, inside,cover, equal, overlap. The terms represent thegeometric relations between components of anobject or between objects. Each of the men-tioned types of spatial knowledge contains avariety of relations modelled in the ontologystructure. The top level ontology is designedto include the topological relationships. Thisis used to enrich an existing KB to make itpossible to define topological relationships be-

detected objects. The other classes are used toeither describe the object geometry throughthe Geometry class by defining its geometriccomponent or to describe its characteristicsthrough the Characteristics class. Ultimately,the system selects algorithms for the process-ing chain based on their compatibility with theobject geometry and characteristics read fromthe Algorithm class.Knowledge of different domains is acquired

from the relevant sources. Domain experts arethe most reliable knowledge sources. How-ever, information sources such as CAD, GISdata, or other available documents in the caseof detailed input can also be used to extractknowledge. In our case, the algorithm knowl-edge is acquired by experts in numerical pro-cessing and the scene knowledge is acquiredfrom existing digital documents as a CADdrawing or GIS dataset.The scene knowledge is described in the

schema of the ontology and includes semantics of the objects such as properties, restrictions, relationships between objects, and geometries. The more information about an object is created and used, the more accurate the detection and classification process is. An example of defining a semantic object is the following: an electric pole (type 2) along a railway track has

Fig. 2: General ontology schema overview.

Fig. 3: Metric rules.


228 Photogrammetrie • Fernerkundung • Geoinformation 3/2013

the same quality in other settings, it is important to assess the risk-benefit factors of every algorithm with various possible settings. The class RiskBenefits includes all identified risks and benefits. The class contains the four instances Distinct, Illusive, Noise, and DetectionError. The instances are the risks or the benefits with some influence on the algorithms, or at least on the parameters of the algorithms. Note that the classes above form an ontology which can also be used for other domains, such as the creation of semantically annotated maps by a mobile robot, mobile mapping of street furniture or forests, and semantic place labelling from airborne laser data. Knowledge modelling and human interac-

tion: The process of modelling knowledge requires the user to collect “information” from related domains. This process is currently carried out manually. “Collecting information” can imply extracting knowledge from various sources or filling the ontology with objects corresponding to specific classes, object properties, algorithms, algorithmic properties, etc. Some of these tasks, such as data extraction from technical documents, have the potential to be done automatically using specialized processing tools borrowed from the document analysis community (TANG et al. 1996). Depending on the available tools and the target application, including its related domains, the knowledge modelling process may take a single person from one to several days of work (data extraction and ontology modelling), including interaction with domain experts and modelling all relationships. Examples for the length of this process and the amount of human interaction are given in section 6. However, although such figures may seem significant, one has to keep in mind that knowledge modelling for a given application is done only once and used for processing numerous point clouds with very little or no changes to the ontology. Other approaches, such as those based on machine learning, would also require a significant amount of preparation to extract training data and carry out annotations, generally from large amounts of scans, which may require at least as much time as modelling an ontology. This is especially true when dealing with special environments such as railways or industrial plants, which are often

tween objects in a specific case. Metric knowledge presents important information, because the different elements fulfil very strict metric rules, which can also be used for the detection and classification process. For the example of scenes specific to railways, Fig. 3 shows an ontological structure supported by SWRL rules which can automatically specify that an object with certain characteristics that has a distance of 1000 ± 0.5 m from a Distance signal can be a Main signal. Regarding the numerical processing algo-

rithms, their effectiveness depends on the quality of the data (resolution, noise), the characteristics of the object that needs to be detected, or other factors depending on the specific case. Algorithms are modelled under specialized classes of algorithms, sharing certain taxonomical and relational behaviour. The hierarchical representation of the algorithms is addressed by dividing the algorithms according to the context in which they are executed. Classes including GeometryDetection, AppearanceDetection, ImageProcessing and NoiseReduction follow such a hierarchical structure. Likewise, relational semantics are represented by properties. In broader terms, there are two types of relationships: one which applies to the geometry that an object in Domain Concept possesses, and one which relates distinct objects. The first category of relationships is used for detecting geometries. The object property isDesignedFor maps algorithms to the respective geometries, for example: LineDetection1 isDesignedFor lines. The second set of algorithm properties, hasInput/hasOutput, are inter-relational properties that connect algorithms based on the compatibility of the output of one algorithm with the inputs of others. It is necessary to adapt the processing pa-

rameters depending on the data, scene and characteristics of objects to enable a well focussed detection and classification. The concept allows for these interactions, as it is able to automatically change the strategy based on a compromise between quality and risks. A part of the KB is dedicated to risk-benefit factors that have an influence on the algorithms. This was derived from “trial and error” simulations with every individual algorithm. Since an algorithm may perform best with some given parameters in one setting, and fail to deliver


Hung Quoc Truong et al., Automatic Detection and Classification 229

es. At the beginning of each iteration, the content of the KB is used to detect new features, be it a new object or a component of one. These new geometric features are passed on to the KB in order to extend it for the following classification. This classification is guided by the content and the structure of the KB, which has reasoning capabilities based on property restrictions or rule languages (such as SWRL), and refines the actual content. This refined content is used in the next iteration. The process is repeated until all entities have been completely annotated and one of the following convergence conditions is met: (1) All objects defined in the KB are detected and annotated (simple change detection). (2) A predefined number of iterations without refinement for any entity has been reached.
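The iteration just described can be sketched as a simple loop over the two convergence conditions. This is a hypothetical stand-in for the actual system: `detect_new_features`, `classify` and the dictionary-based KB are illustrative placeholders, not the project's implementation.

```python
# Sketch of the iterative detect-classify loop with the two convergence
# conditions described in the text. All components are hypothetical
# stand-ins for the actual KB, detection and classification machinery.

def run_iterations(kb, detect_new_features, classify, max_idle_iterations=3):
    """Iterate detection and classification until every entity is
    annotated or a fixed number of iterations brings no refinement."""
    idle = 0
    while idle < max_idle_iterations:
        features = detect_new_features(kb)   # new objects or components
        kb["geometries"].extend(features)    # extend the KB with them
        refined = classify(kb)               # KB-guided classification
        if all(e["label"] != "unknown" for e in kb["entities"]):
            break                            # condition (1): all annotated
        idle = 0 if refined else idle + 1    # condition (2): no refinement
    return kb
```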

5.2 Usage of Algorithms Guided by Knowledge

Object-related knowledge does not only influence classification, but also algorithmic processing. Different algorithms are designed for different contexts. The differences can be addressed and properly modelled. The KB holds the algorithm knowledge in the class Algorithm. This class is related to other classes inside the KB, such as objects. This allows for the modification of the role of algorithms, e.g. parameters and sequences, corresponding to the KB details. The interrelationships among different algorithms are mapped through the compatibility of their input and output characteristics (Fig. 4). Fig. 4 illustrates that more than one path from an initial algorithm to a desired one exists. We use the well-known Dijkstra's algorithm (DIJKSTRA 1959) for finding the shortest path in the graph leading to the desired algorithm. This approach has the advan-

subject to regulations, which require a certain level of expertise.

5 Knowledge Guidance for the Object Detection and Classification Process

5.1 Knowledge-driven Strategy

The knowledge formalization is based on an understanding of the underlying semantics, which is expressed using technologies such as OWL. The top-level ontology presents the main knowledge framework and holds generic semantics for all addressed domains. Regarding the case studies, this framework contains the scene, object geometries, spatial relations and algorithms, and originates from existing knowledge sources, such as information systems or guidelines of the Deutsche Bahn (DB, German Railways), and an extensive study of the sample scenarios. Obviously, the quality and completeness of such formalized knowledge strongly influence the quality of the results and have to be adapted to the individual application. In the general case, such a framework only contains the abstract and general knowledge of object categories, the structure of a scene, geometric relations between objects, the structure of data, the nature of algorithms and the potential relationships between these components. In a simpler scenario with specific information about potentially existing objects, for example known through CAD or Industry Foundation Classes (IFC) files, the detection strategy can be guided more easily and may be reduced to a change detection problem. Starting from the initial situation, the pro-

cess iteratively updates the KB at certain stag-

Fig. 4: Algorithm sequences extracted from the graph.
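The shortest-path search over the algorithm graph can be sketched as follows. The edge list is a hypothetical fragment built from algorithm names mentioned in the text; edges stand for hasInput/hasOutput compatibility and carry unit weights, since intrinsic algorithm costs are not modelled at this stage.

```python
import heapq

# Sketch: shortest path through the algorithm graph (Dijkstra 1959).
# Edges connect algorithms whose output is compatible with another's
# input; this edge list is a hypothetical fragment for illustration.

EDGES = {
    "PositionDetection": ["Segmentation"],
    "Segmentation": ["NoiseReduction", "LineDetection1"],
    "NoiseReduction": ["LineDetection1"],
    "LineDetection1": ["CheckParallel"],
    "CheckParallel": [],
}

def shortest_sequence(start, goal, edges=EDGES):
    """Return the cheapest algorithm sequence from start to goal,
    or None if the goal is unreachable."""
    queue = [(0, [start])]
    seen = set()
    while queue:
        cost, path = heapq.heappop(queue)
        node = path[-1]
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nxt in edges.get(node, []):
            heapq.heappush(queue, (cost + 1, path + [nxt]))
    return None
```

Because the graph is directed, a sequence can never loop back on itself, which mirrors the endless-loop prevention noted in the text.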



The use of spatial relations (Metric, Topologic, and Directional) between the detected entities is one possible extension of such simple geometry (BEN HMIDA et al. 2012). It only requires the appropriate algorithms and then provides the result for the topological operation. ZLATANOVA et al. (2002) give a survey of different 3D models and relations. The spatial operators available for a spatial query language consist of 3D topological operators (BORRMANN & RANK 2008), 3D metric operators (BORRMANN et al. 2009), 3D directional operators (BORRMANN & RANK 2009) and finally 3D Boolean operators (BORRMANN et al. 2006). In a simplified example, the following rule specifies that a “Building” that overlaps a “Railway” (both defined in the ontology) is a “RailwayStation”.

Building(?b) ^ Railway(?r) ^ topo:overlaps(?b, ?r) → RailwayStation(?b) (4)

Fig. 5 shows our process guided by various knowledge domains in object detection and classification. In this figure, object classes are referred to as A, B, C, D, and E. We recall here that the process iterates until convergence, i.e. all objects are labelled, or until the stopping condition, i.e. a maximum number of iterations without refinement, is met.

6 Case Study

Two case studies illustrating our approach are presented in this section: Deutsche Bahn (DB) and Frankfurt Airport (Fraport). The goal in both cases was to detect and check relevant objects inside a defined work area.

tage of preventing the sequence of algorithms from forming an endless loop and allows for finding an appropriate sequence. At any given iteration, each entity in the KB

may be assigned a new label: identified, unknown or ambiguous. This label may change in the course of an iteration. Based on this information, the ASM chooses the best algorithm for generating new characteristics, which will help in the next classification step. This selection also integrates the choice of an optimal sequence out of several possible ones (routes) of algorithms (or nodes). Various knowledge components can have an impact here, e.g. data (noise, point density, point of view), object (size, shape, orientation), and scene (possible objects, neighbourhood).
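Once a sequence has been selected, executing it amounts to threading the data through the chain, each algorithm's output feeding the next one's input, as the hasInput/hasOutput compatibility guarantees. A minimal sketch, with hypothetical placeholder algorithms rather than the project's implementations:

```python
# Sketch: executing a selected algorithm sequence by piping each
# algorithm's output into the next. The algorithms are hypothetical
# callables; in the real system they are the ASM-selected nodes.

def execute_sequence(sequence, data):
    """Run the algorithms in order, threading the data through."""
    for algorithm in sequence:
        data = algorithm(data)
    return data

# Toy usage with placeholder steps that tag the data they produce:
steps = [lambda d: d + ["segmented"], lambda d: d + ["denoised"]]
```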

5.3 Classification Step

As discussed in section 4, the ontology schema holds the semantics of the objects, such as their geometries and other spatial characteristics. This information supports identifying detected entities and is used in the inference process. The complexity of the required rules directly depends upon the complexity of the processed situation. In simple cases, even very simple rules are sufficient to produce a correct result. However, this concept also allows handling more complex situations. A simple classification of an entity (Geometry) based on a SWRL rule annotates an electric pole (type 2), as found along railway tracks:

Geometry(?x) ^ hasHeight(?x, ?ht) ^ swrlb:greaterThan(?ht, 4) ^ swrlb:lessThan(?ht, 6) → ElectricPole2(?x) (3)
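Rule (3) amounts to an interval test on a geometry's height attribute. A minimal sketch of what the reasoner computes, with plain dictionaries as hypothetical stand-ins for KB individuals:

```python
# Sketch of what rule (3) computes: any geometry whose height lies
# strictly between 4 m and 6 m is annotated as an electric pole
# (type 2). Dicts are hypothetical stand-ins for KB individuals.

def apply_rule_3(geometries):
    """Annotate geometries with height in the open interval (4, 6)."""
    for g in geometries:
        if 4 < g.get("hasHeight", 0) < 6:
            g["class"] = "ElectricPole2"
    return geometries
```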

Fig. 5: Knowledge-driven method for object detection and classification process.



these will require more iterations and additional rules in order to achieve a stable classification. The aspect of quality can also be incorpo-

rated into the concept. This may either be realized by thresholds modelling data noise or by changing the strategy of selecting a path through the graph. The latter case handles situations in which features are sensitive to noise and the corresponding algorithms might fail. For instance, an electric pole (type 2) is represented by parallel vertical supports. The ASM searches and selects the relevant algorithm, CheckParallel, from the algorithmic library. This library is described by a graph (see Fig. 6) representing all allowed connections between algorithms, based on input and output. Based on some data quality thresholds, the sequence may or may not include pre-processing algorithms (e.g. NoiseReduction). On the path from the starting algorithm (in this case, PositionDetection) to the desired algorithm (CheckParallel), the ASM infers and invokes all concerned algorithms based on the hasInput/hasOutput properties. Segmentation, NoiseReduction and LineDetection1 are the selected ones. Afterwards, the ASM links them together to create a proper sequence, which then looks as follows (result illustrated in Fig. 7c): PositionDetection → Segmentation →

NoiseReduction → LineDetection1 → CheckParallel. The execution of this sequence provides a

list of recognized object entities, which then

6.1 Object Classification in the Railway System (DB)

In the DB example, we used scans in the vicinity of the tracks. Data were captured from LIMEZ III, a surveying train equipped with a laser scanner mounted at its front end. Two non-domain experts worked for approximately 20 days to build the DB example ontology. They were supported by experts of the German railway (DB). The available knowledge is used to classify the entities as:
• Identified: as soon as a feature value is in the range of a class. This annotation has to be supported by subsequent classifications and remains valid as long as no conflict is detected.

• Ambiguous: as soon as a feature value satisfies more than one class. Both annotations are stored and have to be separated by subsequent classifications; they remain doubtful as long as no separation is possible.

• Unknown: indicates that a feature value does not match any existing class. Further processing requires the ASM to select other properties in order to continue the process.
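The three-way labelling above can be sketched as matching a feature value against per-class intervals. The class names and ranges below are hypothetical illustrations, not values from the DB ontology:

```python
# Sketch of the three-way labelling: a feature value is matched
# against per-class (low, high) intervals, yielding "identified",
# "ambiguous" or "unknown". Ranges here are hypothetical.

def label_entity(value, class_ranges):
    """class_ranges maps class name -> (low, high) feature interval."""
    matches = [name for name, (low, high) in class_ranges.items()
               if low <= value <= high]
    if len(matches) == 1:
        return "identified", matches
    if len(matches) > 1:
        return "ambiguous", matches   # to be separated later
    return "unknown", []              # ASM must select other properties
```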

Although a simple example, this nevertheless shows the general logic, which can then be further extended with other considerations among entities. Success is directly related to the ability to detect entities and the significance of the feature values chosen. Less characteristic features can also be used. However,

Fig. 6: Graph of possible algorithmic paths generated by ASM and used for detecting objects in both DB and Fraport cases.



tialization step shown in Fig. 7b. All entities include possible objects in the scene, but also noise and objects of no interest. The true number of railway objects was 13 (Tab. 2). With the second iteration, the process tries to refine the results and classify the objects. At the end, 10 out of 13 real railway objects were correctly classified, 50 entities which represent non-railway objects were classified as unknown, and 3 railway objects could not be unambiguously classified with the rules implemented. The results in Fig. 7d were obtained by our software system. Computation took about 10 minutes on an Intel Xeon 2.4 GHz with 12 GB RAM. Note that our software is a prototype and has not been optimized for performance. In our experiments, we used the “shortest path” criterion from the starting algorithm to the desired algorithm in order to find the optimal algorithm sequence. Our system assumes equal weights for all edges in the algorithm graph, i.e. factors that are intrinsic to algorithms, such as time and memory requirements, are not taken into account at this stage. Results can be improved by applying more complex rules, possibly using additional geometric constraints such as line or plane orientation, the angle between lines or the number of lines, as expressed in rule (5):

Geometry(?x) ^ hasLine(?x, ?l) ^ line(?l) ^ DistanceSignal(?y) ^ DistanceFrom(?x, ?y, ?dis) ^ swrlb:greaterThan(?dis, 1000) ^ hasHeight(?x, ?h) ^ swrlb:greaterThan(?h, 4) ^ hasVerticalLineNumber(?x, ?vn) ^ swrlb:lessThanOrEqual(?vn, 2) ^ hasObliqueLineNumber(?x, ?on) ^ swrlb:equal(?on, 0) → MainSignal(?x) (5)

In order to relate the classification to human interpretation, the point cloud was presented to test persons. They identified 8 of 13 railway objects based on a visual inspection of the cloud and without taking into account topological or descriptive knowledge. This shows the limited representation of objects inside such types of point clouds. One major reason for the poor quality of the point cloud is the fact that only the side of the object facing the tracks is captured, due to the scanner being mounted on the train. However, this also shows the usefulness of additional knowledge.

are classified. Further sequences are used to improve the quality and to reduce the ambiguity within the results (Fig. 7d). Iterations are repeated until a complete annotation of all entities is performed. The convergence conditions are applied to terminate the detection process for the entities. We have processed a 500 m section along

the railway. Out of the 12 algorithms modelled in the KB (Fig. 6), the following ones were used by the system to classify objects (Tab. 1): PositionDetection, Segmentation (cropping points surrounding a given position), DimensionApproximation, NoiseReduction, LineDetection1 (using RANSAC) and AngleCalculation. Knowledge was collected carefully in order to provide a reliable KB related to objects, the scene, the nature of the data, algorithms and the relationships between them. The base was progressively extended with new knowledge gained either from the analysis of the detected geometries or from classification results. Initially, 17 classes were defined as subclasses of the 5 classes in Tab. 1. These classes represent different types of signals and electric poles that can be found along the tracks and are of interest to our study. A total of approximately 500 geometries such as 3D line segments, angles and points of interest were recognized, 10 SWRL rules were used and 63 entities (possible object positions) were identified after the ini-

Tab. 1: Classes and properties used in the DB scenario.

Class                      Object properties
Electric pole (type 1)     Vertical structure, height, perpendicular lines
Electric pole (type 2)     Vertical structure, height, parallel lines
Electric pole (type 3)     Vertical structure, height, oblique line
Main signal (mechanical)   Vertical structure, height, perpendicular lines, parallel line, number of lines
Main signal (light)        Vertical structure, height, perpendicular lines, parallel line, oblique line, number of lines



6.2 Object Detection inside the Airport Building (Fraport's Waiting Area)

In the second case, we used scans from an environment inside the airport buildings, typically a waiting area. Changes in the technical infrastructure were of main interest. Data were obtained from classical terrestrial laser scanning. The Fraport scenario differs from the DB test example because a database of expected objects in the scene exists and can be used as a-priori knowledge. Two persons worked for about 10 days to fill the ontology with knowledge such as properties of

The results obtained after processing along the tracks are shown in Tab. 2. It is remarkable that the only failures using knowledge were Main signals (light) that could also not be recognized by visual inspection. This is mainly caused by the poor quality of the data, especially in terms of point density, which made such structures hardly visible and indistinguishable. Some objects, the type 2 electric poles, were successfully identified using the automated detection and classification, whereas visual inspection failed.

Fig. 7: (a) Point cloud representation of a section of a railway; (b) results after executing the initialization step, projecting the point cloud to the ground plane, rectangles denote possible object positions; (c) results from detecting 3D lines of a signal and an electric pole (type 3) along the railway; (d) positions of objects and annotation results after the first iteration.

Tab. 2: Experiment in a section of the DB railway, comparing the results of two approaches: visual inspection using the standard software tool of DB, and knowledge-based data processing.

Object                     Visual inspection    Knowledge-based data processing
Electric pole (type 1)     1/1*                 1/1
Electric pole (type 2)     2/4                  4/4
Electric pole (type 3)     1/1                  1/1
Main signal (mechanical)   2/4                  2/4
Main signal (light)        2/3                  2/3
Total                      8/13 (61.54 %)       10/13 (76.92 %)

(*) Number of detected objects over number of ground-truth objects.



tical planes (Fig. 8a, b). This was possible by a vertical projection of the point cloud followed by a Hough line detection to locate the static objects' positions on the ground plane. VerticalProjection and HoughLineDetection are included in the PositionDetection algorithm. Points with a vertical projection in the vicinity of these lines were used to define segments corresponding to vertical planes. The following step was used to verify walls, separation panels or advertising panels defined in the database based on their particular length and height (Fig. 8e). However, there are also many moveable objects like chairs, tables, counters, or trash bins, which also need to be detected to update the KB. All objects already available from the first validation phase gave a geometric and semantic frame helping to support the detection of unknown moveable objects. For example, chairs were searched for in a specific area defined within a certain distance from the wall (5 m in our experiments) and 0.7 m above the floor. Note that the reference frame of our point cloud is attached to the floor such that the latter is simply determined by fitting a horizontal plane (initialized at height Z = 0) using the PlaneDetection algorithm. We focussed on detecting walls in the border region of the check-in area. The static structures obtained

objects, the scene, the nature of the data and the characteristics of the buildings. The data sources were CAD plans, related documents from the experts and observations from the real scene. The process first attempted to validate the presence of static objects such as walls and separation or advertising panels in the point cloud that were supposed to exist according to the database (Tab. 3). After that, moveable objects like chairs and trash bins were detected and also fed into the KB. The initialization was different from the DB case because of the more complex objects and the prominent role of many vertical planes. Therefore, we first detected ver-

Tab. 3: Classes and properties used in the Fraport scenario.

Class               Object properties
Wall                Vertical plane, length, height
Separation panel    Vertical plane, length, height
Advertising panel   Vertical plane, length, height, number of planes
Chair               Horizontal plane, leaning plane, angle between planes, length of chair

Fig. 8: Fraport scenario: (a) 3D scan of a check-in area, (b) detected walls, (c) point cloud exhibiting chairs, (d) detection results of a chair set, (e) annotated static objects, (f) identification results obtained on 12 chair sets in a waiting area (failures 1–2, partial detection 3–7, successful identification 8–12).
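The vertical-projection step described for locating static objects can be sketched as collapsing the 3D points onto the ground plane and binning them into a 2D occupancy grid, on which a line detector such as a Hough transform could then operate. This is purely illustrative; the project's PositionDetection implementation is not public, and `project_to_ground` with its cell size is a hypothetical helper.

```python
# Sketch of the vertical-projection step: 3D points are collapsed onto
# the ground plane (z dropped) and binned into a 2D occupancy grid.
# Hypothetical helper; cell size of 0.1 m is an assumed value.

def project_to_ground(points, cell=0.1):
    """Return a dict mapping (i, j) grid cells to point counts."""
    grid = {}
    for x, y, _z in points:
        key = (int(x // cell), int(y // cell))
        grid[key] = grid.get(key, 0) + 1
    return grid
```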



fully identified, the five chair sets 3–7 were only partly detected, and the two chair sets 1 and 2 could not be identified due to missing points. In the next stage of processing, objects were verified using topological constraints, such as a distance-based identification relative to the already identified objects. Finally, 10 out of 12 chair sets could be correctly classified even in an insufficient dataset. The results reported here were obtained with an ontology that had been filled with approximately 350 detected geometries (planes, line segments, …) and used 4 SWRL rules. The process took about 7 minutes on an Intel Xeon 2.4 GHz with 12 GB RAM when using our prototype software. The full process of detecting chair sets including wall identification is depicted in Fig. 9.

7 Conclusion

This paper presents a knowledge-driven approach to detect objects in point clouds. It is based on the semantics of different associated domains which assist in detecting and classifying objects. Knowledge supports all processing steps, including the arrangement of the data processing. This allows inter-relating the characteristics of algorithms with those of the objects in the domain of the application. Our system also provides the flexibility to infer the strategy from existing knowledge, and to adapt the processing to the application-specific requirements. The permanent interaction between the algorithms and the KB allows for a smooth and gradual construction of the KB, which at the end of the process contains all entities that could be detected and identified. Admittedly, it takes time to collect the knowledge at the beginning. However, it only has to be collected once and is afterwards always available

from the point cloud are shown in Fig. 8e. Only two walls exist in the scene, and the remaining larger static structures are either separation or advertising panels, which are easily distinguishable from walls by their height. Both walls were successfully identified. After the walls were detected, the ASM gen-

erated, based on the properties of a chair (i.e. the chair's length, horizontal plane, leaning plane, and the angle between the two planes), an appropriate sequence of algorithms to invoke: PositionDetection → Segmentation →

PlaneDetection → DimensionApproximation → AngleCalculation → FitChair. FitChair is used to combine the detected

geometries of a chair as depicted in Fig. 8d, f. A rule is also applied to classify chairs:

Geometry(?x) ^ hasCorrespondingGeo(?x, ?l) ^ LeaningPlane(?l) ^ hasCorrespondingGeo(?x, ?s) ^ HorizontalPlane(?s) ^ hasAngle(?x, 120) ^ hasLength(?x, ?len) ^ swrlb:greaterThan(?len, 370) ^ swrlb:lessThan(?len, 380) → Chair(?x) (6)
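The angle between the seat (horizontal) plane and the leaning plane, which rule (6) tests against 120°, can be computed from the plane normals. A minimal sketch; note that the angle between normals matches the angle between planes only up to orientation of the normals, and the inputs below are illustrative values, not measured data:

```python
import math

# Sketch of an AngleCalculation step: the angle between two planes is
# derived from their normals via the dot product. Illustrative only;
# the project's actual AngleCalculation algorithm is not public.

def angle_between_planes(n1, n2):
    """Angle in degrees between two planes given their normals."""
    dot = sum(a * b for a, b in zip(n1, n2))
    norm = math.sqrt(sum(a * a for a in n1)) * math.sqrt(sum(b * b for b in n2))
    # Clamp to [-1, 1] to guard against floating-point drift.
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))
```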

Chair sets are arranged parallel to the walls and represented by very sparse point clouds (Fig. 8c). Nevertheless, it is possible to detect, model and identify chair sets based on a sequence of algorithms making use of topological and geometrical constraints arising from previously detected elements. Six algorithms were used (out of the 12 in Fig. 6): PositionDetection, Segmentation, DimensionApproximation, PlaneDetection (based on RANSAC), AngleCalculation and FitChair (which verifies a chair by two connected planes at an angle of 120°). The results obtained are shown in Fig. 8f, in

which the five chair sets 8–12 were success-

Fig. 9: Chair set detection process.



Qualitative Spatial Relation Using Constructive Solid Geometry, Logic Rules and Optimized 9-IM Model. – International Conference on Computer Sciences and Automation Engineering 2: 453–458, Zhangjiajie, China.

BORRMANN, A., VAN TREECK, C. & RANK, E., 2006: Towards a 3D spatial query language for building information models. – Computing and Decision Making in Civil and Building Engineering.

BORRMANN, A. & RANK, E., 2008: Topological operators in a 3D spatial query language for building information models. – Conference on Computing in Civil and Building Engineering.

BORRMANN, A. & RANK, E., 2009: Specification and implementation of directional operators in a 3D spatial query language for building information models. – Advanced Engineering Informatics 23: 32–44.

BORRMANN, A., SCHRAUFSTETTER, S. & RANK, E., 2009: Implementing metric operators of a spatial query language for 3D building models: octree and B-Rep approaches. – Journal of Computing in Civil Engineering 23 (1): 34–46.

BUSCH, A., GERKE, M., GRÜNREICH, D., HEIPKE, C., LIEDTKE, C. & MÜLLER, S., 2005: Automatisierte Verifikation topographischer Geoinformation unter Nutzung optischer Fernerkundungsdaten. – PFG – Photogrammetrie, Fernerkundung, Geoinformation 2: 111–122.

CANTZLER, H., FISHER, R. & DEVY, M., 2002: Quality enhancement of reconstructed 3D models using coplanarity and constraints. – Pattern Recognition: 34–41.

DIJKSTRA, E.W., 1959: A Note on Two Problems in Connexion with Graphs. – Numerische Mathematik 1: 269–271.

DURAND, N., DERIVAUX, S., FORESTIER, G., WEMMERT, C., GANCARSKI, P., BOUSSAID, O. & PUISSANT, A., 2007: Ontology-based Object Recognition for Remote Sensing Image Interpretation. – Tools with Artificial Intelligence (ICTAI 2007): 472–479.

GREEN, P.J., 1995: Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. – Biometrika 82 (4): 711–732.

GROWE, S. & TONJES, R., 1997: A knowledge based approach to automatic image registration. – International Conference on Image Processing 3: 228–231.

HEDAU, V., HOIEM, D. & FORSYTH, D., 2010: Thinking Inside the Box: Using Appearance Models and Context Based on Room Geometry. – European Conference on Computer Vision (ECCV): 224–237.

HELMHOLZ, P., BECKER, C., BREITKOPF, U., BÜSCHENFELD, T., BUSCH, A., BRAUN, C., GRÜNREICH, D., HEIPKE, C., MÜLLER, S., OSTERMANN, J., PAHL, M.,

when needed. In addition, the KB can be iteratively extended by the operator at many practical waypoints. The quality of the results depends on the robustness of the implemented algorithms, the selected strategy and the amount of knowledge integrated. In practice, the solution is oriented towards the requirements of a specific application. Further development is needed to make the algorithms more robust to quality variations in the data, and to segment more complex objects. Furthermore, the knowledge sources (data features, object properties and scene characteristics) have to be extended in order to enhance the classification processing, especially regarding ambiguous cases. Lastly, both an expansion of the ontology and further implementation and testing of rules are currently considered and subject to investigation.

Acknowledgements

The work presented in this paper is part of the research project WiDOP – Wissensbasierte Detektion von Objekten in Punktwolken für Anwendungen im Ingenieurbereich, funded by the German Federal Ministry of Education and Research (grant no. 1758X09). Project partners are Metronom Automation GmbH, DB Netz AG (railway) and Fraport AG (facility management).

References

ANGUELOV, D., TASKARF, B., CHATALBASHEV, V.,KOLLER, D., GUPTA, D., HEITZ, G. &NG, A., 2005:Discriminative learning of markov randomfields for segmentation of 3d scan data. – Com-puter Vision and Pattern Recognition (CVPR) 2:169–176.

Bechhofer, S., Van Harmelen, F., Hendler, J., Hor-rocks, I., McGuinness, D.L., Patel-Schneider,P.F. & Stein, L.A., 2004: OWL web ontologylanguage reference. – W3C recommendation 10.

BEN HMIDA, H., CRUZ, C., NICOLLE, C.& BOOCHS, F.,2011: From 3D Point Clouds To Semantic Ob-jects An Ontology-Based Detection Approach.– International Conference on Knowledge Engi-neering and Ontology Development, Paris,France.

BEN HMIDA, H., CRUZ, C., NICOLLE, C. & BOOCHS, F.,2012: From Quantitative Spatial Operator to

Page 17: Automatic Detection and Classification of Objects in … · 222 Photogrammetrie • Fernerkundung• Geoinformation 3/2013 theanalysisoftheresultsandtheobjectclassi-fication.Knowledgeisalsousedtosupportthe

Hung Quoc Truong et al., Automatic Detection and Classification 237

TARSHA-KURDI, F., LANDES, T., GRUSSENMEYER, P. &KOEHL, M., 2007: Model-driven and data-drivenapproaches using LIDAR data: analysis andcomparison. – International Archives of Photo-grammetry, Remote Sensing and Spatial Infor-mation Systems: 87–92.

TEBOUL, O., SIMON, L., KOUTSOURAKIS, P. & PARAGI-OS, N., 2010: Segmentation of Building FacadesUsing Procedural Shape Priors. – Computer Vi-sion and Pattern Recognition: 3105–3112.

TRINDER, J.C. & WANG, Y., 1998: Knowledge-basedroad interpretation in aerial images. – Interna-tional Archives of Photogrammetry and RemoteSensing 32: 635–640.

VOSSELMAN, G. & DIJKMAN, S., 2001: 3D buildingmodel reconstruction from point clouds andground plans. – International Archives of Photo-grammetry Remote Sensing and Spatial Infor-mation Sciences, 34 (3/W4): 37–44, Athens,Georgia, USA.

ZLATANOVA, S., RAHMAN, A.& SHI, W., 2002: Topol-ogy for 3D spatial objects. – International Sym-posium and Exhibition on Geoinformation: 22–24.

Addresses of the Authors:

M.Sc. HUNG TRUONG & M.Sc. HELMI BEN HMIDA,University of Applied Sciences Mainz und Univer-sity of Burgundy, e-mail: {truong}{helmi.benhmi-da}@geoinform.fh-mainz.de und {quoc-hung.truong}{helmi.benhmida}@u-bourgogne.fr

Prof. Dr. FRANK BOOCHS, i3mainz, University ofApplied Sciences Mainz, Lucy-Hillebrand-Str. 2,55128 Mainz, Tel.: +49-6131-628-1489, Fax: +49-6131-628-91489, e-mail: [email protected]

Prof. Dr. ADLANE HABED, Prof. Dr. CHRISTOPHECRUZ, Prof. Dr. YVON VOISIN& Prof. Dr. CHRISTOPHENICOLLE, University of Burgundy, Le2i, Route desPlaines de l’Yonne, BP 16, 89010 Auxerre cedex,France, Tel.: +33-3-86-49-28-51, Fax: +33-3-86-49-28-50, e-mail: {adlane.habed}{christophe.cruz}{yvon.voisin}{christophe.nicolle}@u-bourgogne.fr

Manuskript eingereicht: März 2012Angenommen: März 2013

ROTTENSTEINER, F., VOGT, K. & ZIEMS, M., 2012:Semi-Automatic Quality Control of Topograph-ic Datasets. – Photogrammetric Engineering &Remote Sensing 78 (9): 959–972.

HUANG, H., BRENNER, C. & SESTER, M., 2011: 3Dbuilding roof reconstruction from point cloudsvia generative models. – International Confer-ence on Advances in Geographic InformationSystems: 16–24.

HORROCKS, I., PATEL-SCHNEIDER, P.F., BOLEY, H., TA-BET, S., GROSOF, B. & DEAN, M., 2004: SWRL: Asemantic web rule language combining OWLand RuleML. –W3CMember submission 21: 79.

MAILLOT, N. & THONNAT, M., 2008: Ontology BasedComplex Object Recognition. – Image and Vi-sion Computing (IVC) 26 (1): 102–113.

MATSUYAMA, T., 1987: Knowledge-based aerial im-age understanding systems and expert systemsfor image processing. – Geoscience and RemoteSensing 3: 305–316.

NURUNNABI, A., BELTON, D.&WEST, G., 2012: Diag-nostic-robust statistical analysis for local surfacefitting in 3D point cloud data. – ISPRS Annals ofthe Photogrammetry, Remote Sensing and Spa-tial Information Sciences I-3, Melbourne, Aus-tralia.

PU, S. & VOSSELMAN, G., 2006: Automatic extrac-tion of building features from terrestrial laserscanning. – International Archives of Photo-grammetry, Remote Sensing and Spatial Infor-mation Sciences 36: 25–27.

PU, S. & VOSSELMAN, G., 2009: Knowledge basedreconstruction of building models from terres-trial laser scanning data. – Journal of Photo-grammetry and Remote Sensing 64 (6): 575–584.

RIPPERDA, N. & BRENNER, C., 2006: Reconstructionof Facade Structures Using a Formal Grammarand RjMCMC. – The 28th conference on PatternRecognition: 750–759.

ROST, U. & MÜNKEL, H., 1998: Knowledge basedconfiguration of image processing algorithms. –International Conference on Computational In-telligence & Multimedia Applications.

SCHOLZE, S., MOONS, T. & VAN GOOL, L., 2002: Aprobabilistic approach to building roof recon-struction using semantic labelling. – PatternRecognition: 257–264.

TANG, Y., LEE, S.W. & SUEN, C., 1996: Automaticdocument processing: A survey. – Pattern Rec-ognition 29 (12): 1931–1952.