Spatiotemporal and Semantic Information Extraction From Web News Reports About Natural Hazards

In the field of geographic information science, modeling geographic dynamics based on spatiotemporal information extracted from the Web, especially unstructured data such as online news reports, is a growing area of research. Extracting spatiotemporal and semantic information from a set of Web documents enables us to build a rich representation of geographic knowledge described in text, capturing where, when, or what events have occurred. This work investigates the role ontologies play as a key component in the process of semantic information extraction. We show how ontologies can be used in conjunction with natural language gazetteers in order to process semantic information about hazard events and augment spatiotemporal extraction with semantics. We are interested in capturing the spatiotemporal patterns of hazard-related events from online news reports to track the occurrences and evolution of natural hazards, such as severe storms. A hazard ontology has been created to assist the spatiotemporal information extraction process, especially with the automatic detection of different kinds of events at multiple granularities from unstructured texts revealing relationships between the events over space–time. The extraction and retrieval of semantic information about event dynamics provides information about the progression of events using both natural and human perspectives.

    Wei Wang, Kathleen Stewart
    Department of Geographical and Sustainability Sciences, The University of Iowa, Iowa City, IA, United States

    Computers, Environment and Urban Systems 50 (2015) 30–40
    Journal homepage: www.elsevier.com/locate/compenvurbsys
    http://dx.doi.org/10.1016/j.compenvurbsys.2014.11.001
    0198-9715/© 2014 Elsevier Ltd. All rights reserved.

    Corresponding author. E-mail addresses: [email protected] (W. Wang), kathleen-stewart@uiowa.edu (K. Stewart).

    Article history: Received 12 March 2014; Received in revised form 10 October 2014; Accepted 2 November 2014; Available online 26 November 2014

    Keywords: Spatiotemporal semantic information retrieval; Natural language processing; Hazard ontology; GIS; Gazetteers

    1. Introduction

    Geographic information retrieval (GIR) provides valuable opportunities for users to automatically obtain information from Internet search engines that involve not only traditional information retrieval (IR) methods, such as indexing, searching, browsing, crawling and querying Web documents, but also the geographic scope of documents (Teitler et al., 2008; Purves & Jones, 2010; Karimzadeh et al., 2013). Natural language processing (NLP) is used to analyze the content and context of documents, and manipulate texts to perform useful tasks with a set of computational algorithms and statistical approaches (Chowdhury, 2003). Bridging GIR and NLP techniques, salient geographic information can be automatically extracted from large volumes of unstructured text. For example, it is possible to extract ZIP codes, addresses, well-known landmarks, people's names, or telephone numbers from Web documents. While the fields of NLP and GIR, as rapidly expanding fields of scientific research, have contributed solutions for helping users to find information based on their interests, the possibility of automatically tracking spatiotemporal and semantic information about events in Web news articles is still a challenge.

    Extracting spatiotemporal and semantic information from a set of Web documents makes it possible to build a rich representation of geographic knowledge that is described in text, capturing where, when, or what events have occurred. The contribution of this research is to show how semantic information about events, important for understanding the different kinds of events that occur during natural hazards such as hurricanes, blizzards or other disasters, can be extracted and used in combination with spatial and temporal information that is also extracted from documents to reveal more about the nature of the disaster. In this research, semantic information from online news reports about severe storms includes domain-related events associated with hazards (e.g., an airport closed event or power outage event), as well as higher-level events (e.g., more general hazard impact or response events). Making spatiotemporal information more meaningful by adding semantics (e.g., hazard impact, response, or recovery) improves an understanding about the dynamics that underlie hazard outbreaks. Users can directly visualize spatial and temporal trends of different hazard events and map these events and their semantics at multiple levels of detail, capturing aspects that are not explicitly described in text documents. This work can also be applied to dynamics more generally, for example, extracting semantics about disease outbreak events or political campaigns.

    This paper investigates the role ontologies play as a key component in the process of geographic information retrieval. Ontologies deal with the actual nature of the phenomena, focusing on the nature and the organization of reality (Kalfoglou & Schorlemmer, 2003; Welty & Murdock, 2003). In information science, ontologies have been applied as a tool for knowledge management. They provide a concept-based framework using formal methods that represent entities, attributes, relationships, and values for a specific domain (Noy, 2004; Sawsaa & Lu, 2012; Wiegand & Garcia, 2007). Well-developed ontologies can serve as a standard for conceptualizing and understanding domains of interest, and ontologies enable semantic interoperability (Cruz, Ganesh, & Mirrezaei, 2013; Cruz & Xiao, 2005; Fernadez et al., 2011; Schorlemmer & Kalfoglou, 2008). Ontologies also play an important role in GIR techniques, by providing a knowledge base that supports semantic understanding of text and improves search results as well as extracted information (e.g., geographic information). Associations between different semantics can be achieved by applying ontologies with classification schemes and hierarchies (Jones & Purves, 2008; Kemp, Tan, & Whalley, 2007; Kontopoulos, Berberidis, Dergiades, & Bassiliades, 2013). Ontologies can also be applied for the disambiguation of geographic names and improving gazetteer interaction (Janowicz & Keßler, 2008; Machado, Alencar, Campos, & Clodoveu, 2011; Volz, Kleb, & Mueller, 2007). Events can also be represented in ontologies (Allen & Ferguson, 1994; Imran, Elbassuomi, Castillo, Diaz, & Meier, 2013; Lee, Wu, Yang, Wen, & Chiang, 2013; Worboys & Stewart Hornsby, 2004). In this research, we show how using ontologies in GIR applications is helpful for relating multiple perspectives of hazards (in this case, environmental and human perspectives), for example, weather-related and human impact aspects using terms and relations from the ontology. Two key research questions are addressed through this work:

    How can gazetteers and ontologies be integrated in order to be used for semantic information extraction of hazard event information at multiple granularities?

    How does representing semantic information associated with hazard events using ontological relations on maps contribute to an understanding of event dynamics?

    For this work, we are interested in capturing the spatiotemporal patterns of hazard-related events from texts and in tracking the different kinds of events relating to both environmental and human perspectives over space–time. A hazard-based ontology has been built to assist the spatiotemporal and semantic information extraction and retrieval process. It is especially useful for the automatic detection of semantics in unstructured texts and for representing the relationships between events. This approach to automatic information extraction is particularly useful if a large set of documents describing many different events needs to be analyzed for content. When harnessed to a GIS for mapping, the semantic information retrieval provides richness to the resulting maps, offering multiple perspectives of the hazard as well as multiple levels of detail. The approach used in this research, shown in Fig. 1, takes inputs (i.e., online news reports) and integrates text linguistic processing with spatial, temporal, and semantic gazetteers and a hazard ontology in order to perform information extraction processing. The outputs of this processing are stored in a geodatabase and geocoded for mapping in ArcGIS.

    The rest of this paper is organized as follows. Section 2 discusses related work on applying ontologies for geographic information retrieval. Section 3 describes the development and use of a hazard ontology, and then demonstrates how the ontology is integrated with semantic and spatial and temporal gazetteers in an NLP environment for information extraction. The following sections explain how event dynamics and semantic information are extracted from online news reports and represented using GIS. The proposed approach is applied to a document set of news reports on Hurricane Sandy from October–November 2012. The methods are evaluated in Section 7. Conclusions and future research are discussed in the final section.

    2. Related work

    Digital text information is widely available in the form of Web articles, news reports, blogs, Twitter feeds, and other formats. Spatial and temporal information is commonly referenced in these Web documents, and extracting such information provides details about dynamic change patterns and trends of world events over space–time (Grover et al., 2010; Janowicz, Scheider, Pehle, & Hart, 2012; Sankaranarayanan, Teitler, Lieberman, & Sperling, 2009). Extracted results can be visualized in a GIS environment, transferring a textual surrogate of text documents to a visual surrogate (Gregory & Hardie, 2011). Ontologies have been included in the GIR process to facilitate the retrieval of heterogeneous geographic information from texts, and knowledge representation and reasoning (Buscaldi, Bessagnet, Royer, & Sallaberry, 2014; Kontopoulos et al., 2013; Machado et al., 2011). Ontologies reflect different relations that hold among entities, for example, entities are arranged into superclasses and subclasses related by is-a relations in order to classify entities into subgroups, or to capture part-whole relationships through part-of relations. Ontologies have also been applied to capture events, such as biomedical events (Hu et al., 2011) and business events (Arendarenko & Kakkonen, 2012; Saggion, Funk, Maynard, & Bontcheva, 2007).

    Ontologies can refer to upper-level concepts or more specialized domain-based concepts. An upper-level ontology is a model that captures common (high-level) abstraction of entities in the world such as Region, Event, Physical Object and Feature (e.g., DOLCE and BFO) (Guarino, 1998; Smith & Ceusters, 2010). Upper ontologies may benefit GIR by formalizing and integrating extracted information relating to high-level semantics (Gangemi, Guarino, Masolo, Oltramari, & Schneider, 2002). Domain ontologies, on the other hand, capture classes that are more specialized and relate to a particular domain (e.g., medicine, transportation, hazards) (Guarino, 1998; Stewart, Fan, & White, 2013). Domain ontologies are especially useful for processing and reasoning over text content, and enhance semantic information construction in information extraction systems.

    The SPIRIT project (SPatially-aware Information Retrieval on the InTernet) incorporates a domain ontology, a geographic ontology, footprints, and spatial indexing to create a spatial search engine in which geospatial semantic differences between places are distinguished (Jones & Purves, 2008; Purves et al., 2007). The developed geographic ontology contains actual and alternative place names, place types, spatial footprints (i.e., geometric extent), and spatial relationships (i.e., part-of), and is applied to recognize place names and support disambiguation of place name expression in user queries. For a project on marine environmental management (Kemp et al., 2007), an ontology framework is developed to assist GIR by mapping between alternative spatiotemporal classifications. Three ontologies are integrated to capture the semantics of spatial (Marine surface), temporal (four seasons) and thematic (fishery-related) dimensions drawn from two heterogeneous fishery databases (structured data). The thematic or domain ontology provides a hierarchical structure of concept terms and also links between concepts in one domain (i.e., the fishery) with other domains (e.g., biology). In this paper, we are also interested in spatial and temporal aspects as well as hazard semantics, as we wish to automatically extract the spatiotemporal patterns of hazard-related events from texts in order to represent and track the evolution of different kinds of hazards over space and time.

    In the next section, we discuss the development of an ontology that supports information extraction relating to hazard events, especially severe storms (hurricanes, tornadoes, blizzards, etc.). The ontology is used for distinguishing different kinds of hazard-related events and semantic information extraction. It includes hazard-related terms from not only a natural perspective (e.g., geological hazard, hydrological hazard, etc.), but also from a human perspective, where the impacts of hazards (e.g., airport closings, flight cancellations, closed roads, power outages, etc.) and the human responses to hazards (e.g., evacuations, power restored) are modeled.

    Fig. 1. The process for incorporating ontologies into spatiotemporal and semantic information extraction for hazard events. Diagram labels: Web text documents; linguistic processing (tokenizer, sentence splitter, POS tagger, rules, etc.); spatial, temporal, and semantic gazetteers; hazard ontology; GATE 8.0; spatial, temporal and semantic information set in a geodatabase; geocoding; geovisualization. Stage labels: Input, Process, Integrate, Output.

    3. Constructing a hazard ontology

    To support semantic information extraction tasks involving events that occur during natural hazards such as severe storms, an ontology was created that includes key hazard terms. For this work, an open source toolkit, NeOn (http://neon-toolkit.org/), is used to construct the hazard ontology (Fig. 2). This platform was chosen for its flexible environment that supports easy importing, editing, visualizing, and exporting of ontologies. The ontology is created from authoritative sources on hazards, e.g., the US Federal Emergency Management Agency (http://www.fema.gov/), which provides a source for common hazards terminology, and the US National Weather Service (http://www.weather.gov/) for different kinds of hazardous weather entities. Existing ontologies (e.g., OpenCyc (http://sw.opencyc.org/) and GeoNames (http://www.geonames.org/)) are also used as sources for terms relating to natural hazards, as well as terms extracted from news document sets during training (discussed in detail in Section 4). Classes are linked by either is-a or part-of relations, and for classes linked by is-a relations, there is a single superclass for a set of subclasses (i.e., this ontology does not use multiple inheritance). In this work, part-of relations model relationships where a part has a function with respect to the whole (e.g., HazardManagement part-of NaturalHazard). Both of these relations support reasoning over multiple granularities of events and features in the news reports. Although the ontology was designed to be comprehensive to support the extraction process and currently consists of approximately 250 classes, we anticipate that this ontology could certainly be extended for future uses (see Wang, 2014 for a version of the ontology).

    Upper-level classes in the ontology describe the most abstract entities and include two classes, Object and Happening. Class Object describes high-level categories of entities that can be specialized as one of three subclasses, Agent, Place, or Time. Happening refers to the dynamic processes that occur during all hazards. Event is a subclass of Happening, and is associated with subclasses NaturalHazard and MeteorologicalEvent. Four subclasses are subsumed by class NaturalHazard: GeologicHazard, ClimateHazard, HydrologicHazard, and WildfireHazard. MeteorologicalEvent subsumes weather classes including Precipitation, Storm and Wind.

    HazardManagement captures the human aspects associated with a natural hazard. This class shares a part-of relation with NaturalHazard, and has four subclasses, Prediction, HazardImpact, HazardResponse, and HazardRecovery. HazardImpact has subclasses CommercialImpact, FacilityImpact, IndividualImpact, ResidentialImpact, and PoliticalImpact. For example, Hurricane Sandy occurred during the 2012 US Presidential election, and numerous electioneering events were affected (typically cancelled) as a result of the severe weather. These events are modeled as PoliticalImpact events. Subclasses of HazardResponse include CommunityResponse, EmergencyResponse, and TransportationResponse. HazardRecovery subsumes CommercialRecovery, FacilityRecovery, ResidentialRecovery, TransportationRecovery, and UtilityRecovery.
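The class structure described in this section can be illustrated with a minimal Python sketch. It is not the NeOn or OWL representation used in this work. It simply stores a handful of the classes named above in two dictionaries, one for is-a and one for part-of, and walks them to generalize a class. Storing is-a as a child-to-parent mapping mirrors the single-superclass constraint. The names IS_A, PART_OF, superclasses and wholes are assumptions introduced for illustration, and only a small subset of the roughly 250 classes is included.

```python
from typing import Dict, List

# is-a links, stored child -> parent so each class has at most one superclass
# (the text states that the ontology does not use multiple inheritance).
# Only a few of the classes named in the text are included here.
IS_A: Dict[str, str] = {
    "Event": "Happening",
    "NaturalHazard": "Event",
    "MeteorologicalEvent": "Event",
    "Storm": "MeteorologicalEvent",
    "Precipitation": "MeteorologicalEvent",
    "HazardImpact": "HazardManagement",
    "HazardResponse": "HazardManagement",
    "HazardRecovery": "HazardManagement",
    "PoliticalImpact": "HazardImpact",
    "EmergencyResponse": "HazardResponse",
    "UtilityRecovery": "HazardRecovery",
}

# part-of links, stored part -> whole.
PART_OF: Dict[str, str] = {"HazardManagement": "NaturalHazard"}


def superclasses(cls: str) -> List[str]:
    """Follow is-a links upward, nearest superclass first."""
    chain: List[str] = []
    while cls in IS_A:
        cls = IS_A[cls]
        chain.append(cls)
    return chain


def wholes(cls: str) -> List[str]:
    """Classes that cls, or any of its superclasses, is part-of."""
    return [PART_OF[c] for c in [cls] + superclasses(cls) if c in PART_OF]


if __name__ == "__main__":
    print(superclasses("PoliticalImpact"))  # ['HazardImpact', 'HazardManagement']
    print(wholes("PoliticalImpact"))        # ['NaturalHazard']
```

Under these assumptions, a cancelled election event annotated as PoliticalImpact can be rolled up to HazardImpact or HazardManagement, and related to NaturalHazard through the part-of link, which is the kind of multi-granularity reasoning the two relations are meant to support.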

    4. Integrating the ontology with spatial, temporal and semantic gazetteers

    In order to prepare Web news reports for the information extraction process, the content of the news reports is annotated with respect to spatial, temporal and hazard-related information. In this work, information extraction is implemented using GATE 8.0, the General Architecture for Text Engineering (http://gate.ac.uk/). The principal information extraction engine uses three different gazetteers. Traditionally, a gazetteer is regarded as a dictionary that contains lists of geographic references (Goodchild & Hill, 2008), and is used for extracting place names in information retrieval systems. The term gazetteer in GIR is applied more broadly.

    Here, gazetteer refers to a dictionary with lists of specific terms or phrases that are used to match the corresponding information from text documents. In this work, different gazetteers store the different kinds of vocabulary found in news reports on hazards.

    Fig. 2. A view of the NeOn interface for the hazard ontology.

    GATE's default gazetteer supports extracting locations and dates from text documents. Most of GATE's regional references are related to general world geographical references, especially geographical locations in the UK. US geographical information is not widely covered in GATE's default gazetteer. The spatial gazetteer developed for this research extends GATE's original gazetteer by importing U.S. state abbreviations, county names for all states, and 25,150 regional places (e.g., cities, towns, villages, census designated places, airports, schools, etc.). The geographic data is obtained from StreetMap Premium for ArcGIS (http://www.esri.com/data/streetmap/), Geonames (http://www.geonames.org) and the U.S. Gazetteer Files for the 2010 Census (https://www.census.gov/geo/mapsdata/data/gazetteer-2010.html). The spatial data were stored in a set of .lst files, each describing spatial terms belonging to the same type, e.g., city.lst, county.lst, state.lst, airport.lst, etc. Currently, this gazetteer contains locations in the US including the Caribbean, which supports the extraction of places for the news reports describing the hazard events used in this research, but the gazetteer could be expanded as necessary. In a test we found that the developed spatial gazetteer could detect 801 locations from 11 test documents, as compared to only 380 locations extracted using GATE's default gazetteer running on the same documents. A temporal gazetteer has also been developed to complement GATE's built-in support. It contains common temporal references, such as day, week, month, and year, together with additional references, such as early morning or late Monday evening, to expand further the temporal annotation capabilities.

    Finally, a semantic gazetteer has also been developed. For the semantic gazetteer, a set of 180 news reports on hazards served as training documents for collecting samples of hazard-related terms or phrases based on different hazard topics. In general, there is a range to the number of training articles used; however, it is not uncommon to see 70% of a data set used for training purposes (Resnik & Lin, 2010). In this work, 60% of a set of CNN news documents on blizzards, hurricanes, flooding, tornadoes, and wildfires were processed to build the semantic gazetteer. Hazard-related references were manually annotated and stored in a set of .lst files in the semantic gazetteer to correspond with different classes in the hazard ontology, e.g., storm.lst, precipitation.lst, hydrologicalhazard.lst, communityresponse.lst, hazardimpact.lst, etc. Each list file contains a set of related hazard events as identified in the documents during training. Rules have been implemented in the JAPE transducer (a Java Annotation Pattern Engine) in GATE to support spatial, temporal, and semantic reference matching from the gazetteers and extraction from the texts. As a result of training, the ontology contains terms that are representative of the contents of the news reports as well as the contents of the expert hazard sources also used to create the ontology, and in this way, the ontology is able to map to the contents of the news articles.
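The list-based matching described above can be sketched in a few lines of Python. This is not GATE's gazetteer or a JAPE rule. It is a simplified, assumed stand-in that matches phrases case-insensitively on word boundaries and tags each match with the ontology class its list corresponds to. The list file names follow the text, but the terms placed in each list, the class paired with each list, and the names GAZETTEER and annotate are hypothetical.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical miniature semantic gazetteer: list file -> (ontology class, terms).
# The real lists were built from manually annotated training documents.
GAZETTEER: Dict[str, Tuple[str, List[str]]] = {
    "storm.lst": ("Storm", ["hurricane", "tropical storm", "blizzard"]),
    "precipitation.lst": ("Precipitation", ["heavy rain", "snow"]),
    "hazardimpact.lst": ("HazardImpact", ["without power", "flights cancelled"]),
}


def annotate(text: str) -> List[Tuple[int, int, str, str]]:
    """Return (start, end, matched phrase, ontology class), ordered by position."""
    found = []
    for class_name, terms in GAZETTEER.values():
        for term in terms:
            # Whole-word, case-insensitive match of the gazetteer phrase.
            pattern = r"\b" + re.escape(term) + r"\b"
            for m in re.finditer(pattern, text, flags=re.IGNORECASE):
                found.append((m.start(), m.end(), m.group(0), class_name))
    return sorted(found)


if __name__ == "__main__":
    sentence = "Heavy rain from the hurricane left thousands without power."
    for start, end, phrase, class_name in annotate(sentence):
        print(start, end, phrase, class_name)
```

Keeping the class name next to each list is what lets a matched phrase be handed on to ontology-based reasoning. A production gazetteer would also need to handle overlapping matches and morphological variants, which this sketch ignores.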

    To link the terms from the semantic gazetteer with classes inthe hazard ontology, the ontology is imported into GATE as a lan-guage resource, an OWLIM ontology. An ontology API in GATE sup-ports several plugins, such as OntoGazetteer, and this is used tolink the terms from the semantic gazetteer with classes in theontology. In this way, gazetteer terms at different granularitiescan be related using the ontology. To build the connection betweenthe gazetteer and ontology classes in OntoGazetteer, a mapping leis created. The mapping le is used to connect lists in the semanticgazetteer and classes in the hazard ontology. For example, precipi-tation.lst:http://gate.ac.uk/hazardontology.owl:Precipitation links thelist of different precipitation-related terms in the semantic gazet-teer to the class Precipitation in the ontology. Terms in the semanticgazetteer are associated with different granularities in the ontol-ogy using the relations specied in the ontology (e.g., is-a andpart-of). This allows for reasoning over the ontological relations,and allows us to model these event types at different levels ofgranularity in the hazard ontology. Is-a relations hold betweensuperclasses and subclasses and supports reasoning over multiple

  • according to the details given in a sentence (Wang & Stewart,2014). These rules assist in correctly assigning locations and tem-

    mapping of extracted spatiotemporal and semantic informationis implemented using ESRIs ArcMap 10.1 software. Further discus-

    nmeporal information to these events, as well as retrieving a temporalordering of hazard events. For example, for the sentenceMore than2000 Westar Energy Customers in Riley County were without powerSaturday evening, Saturday evening and without power areannotated as temporal and event information respectively andare assigned to Riley County after processing. The event withoutpower was matched with terms in the PowerShortage.lst le inthe semantic gazetteer. In the ontology, PowerShortage is a subclassof UtilityImpact, and UtilityImpact is a subclass of HazardImpact. Inthis case, without power is associated with the higher-level classHazardImpact through linking the semantic gazetteer and the haz-ard ontology. Saturday evening, Riley County, without power andHazardImpact are processed as a set (along with all other extractedgranularities but does not hold between all classes, for example,cases such as county and states where a part-of relation is used.Is-a is not used to describe the relations between certain spatialfeatures. Together, is-a and part-of relations are used to reasonacross various levels of granularity in the hazard ontology. Forexample, assume the term supercell is annotated in one of the textdocuments, and that this term also exists in the ontology as a sub-class of Thunderstorm. Given that Thunderstorm, is a Storm, it can beinferred that Supercell is-a Storm after the processing in GATE.Combining semantic retrieval with a spatiotemporal perspectiveaffords important opportunities for humanenvironment applica-tions offering new insights about the dynamics of reported eventsin document collections.

    5. Spatiotemporal and semantic information retrieval fornatural hazards

    Incorporating hazard-related semantics as part of spatiotempo-ral information extraction provides a story-telling approach tomapping that involves both environmental and human perspec-tives. The process of automated spatiotemporal and semanticinformation extraction from text documents uses the hazard ontol-ogy and the spatial, temporal and semantic gazetteers to map theresults of spatiotemporal information extraction, i.e., the differentkinds of events (e.g., hazard events, response events and impactevents) that unfold over space and time. These events, i.e., theactual terms and phrases from the news reports as well ashigher-level semantics derived from the ontology, can be visual-ized and represented at different spatial granularities.

    The extraction process involves automated parsing of spatial,temporal and semantic information from text documents, process-ing these terms with rules developed and implemented in GATE forlinking the events with the appropriate spatial and temporal infor-mation (Wang, 2014; Wang & Stewart, 2014). The extracted resultsare then saved to a geodatabase. For many documents, events aredescribed in connection to one or more geographical referencesand also possibly associated with temporal expressions. However,not every sentence will have both spatial and temporal informa-tion available. Five possible cases can arise with respect to spatio-temporal information present in a sentence: only spatialinformation is present; one spatial term and one temporal refer-ence is present; one spatial term and multiple temporal references;multiple spatial terms and a single temporal reference; and multi-ple spatial and multiple temporal terms are present (StewartHornsby & Wang, 2010; Wang, 2014; Wang & Stewart, 2014).The rules are implemented in Java with GATE API for assigningproper temporal and location information with event terms

    34 W. Wang, K. Stewart / Computers, Enviroterms) to a geodatabase. In additiona, all temporal informationneeds to be converted to a standard time format (i.e., YYYY-MM-DD hh:mm:ss) in the geodatabase. In this work, temporalsion relating to the data quality of extracted information is inSection 7.

    Using these methods, spatial, temporal and semantic informa-tion about events are automatically extracted fromWeb text docu-ments. Ontologies bolster the semantic processing by linkingannotated terms to the different kinds of semantics they representandmodeling the events at different granularities. In thisway, eventsemantics (e.g., hazard impact events, response events, or recovery-related events) can be represented on amap. This affords importantopportunities for humanenvironment applications as the spatio-temporal evolutionof a set of events canbe tracked andnew insightsrevealed about the dynamics and different kinds of reported eventsin document collections that might otherwise be unknown.

    6. A case study-Hurricane Sandy, OctoberNovember, 2012

    To illustrate this work, a case study is presented using a set of50 CNN news reports collected from October 24November 4,2012 about Hurricane Sandy, a very severe weather event thathit the east coast of the US. The evolution of reported events canbe tracked, visualized and analyzed on maps using this approachfor spatiotemporal and semantic information extraction.

    expressions undergo additional processing to standardize temporal information and order the events over time. For example, the duration of events is assigned to a discrete time period, e.g., early Thursday is assigned a period between 5:00 am and 11:00 am on Thursday. The published date of the news reports also serves as a criterion to compare temporal expressions (e.g., Thursday, tomorrow, the next day) in the documents so that these expressions may be ordered temporally and assigned with appropriate dates. The data is then ready for geocoding and mapping.

    Using Yahoo's geocoding API, geographic coordinates are assigned to each extracted location (http://www.gpsvisualizer.com). In this research, for generalized (i.e., high-level) geographic references including states, counties, and regional geographic entities (e.g., southwest Florida, Riley county), geocoding selects a central point of reference. Researchers are investigating techniques for improved handling of vague places (Chasin, Woodward, Witmer, & Kalita, 2013; Delboni & Borges, 2007; Jones, Purves, Clough, & Joho, 2008; Vasardani, Winter, & Richter, 2013), and this aspect remains a challenge for geographic information retrieval. For spatial references that are lower-level, such as a neighborhood or street reference (e.g., City Island, New York or West Street, Manhattan), the gazetteers can be parsed and locations geocoded on maps to show where events are occurring.

    As a result of text processing, event-related terms extracted from all 50 news reports are represented in Fig. 3, with the temporal pattern of events represented using graduated colors to emphasize temporal order, light pink–dark red–dark blue–light blue, where light pink refers to the earliest dates of extracted event information, October 24, 2012, and light blue is associated with the latest date extracted, November 5, 2012. Most of the earliest events reported in the tracked period (Oct 24 to Oct 27) were located as far south as the western Caribbean Sea, Jamaica, Cuba, Bahamas, and Florida. By October 27th, events were being reported further north along the coast to locations including Charleston, SC, Cape Hatteras, NC, and Mount Airy, NC. After processing all 50 of the Web news reports, mapped events generally shifted from south to north as far up as Maine in the northeast. Inland events were also reported, for example, events in West Virginia, Ohio, and Pennsylvania. This map illustrates the focus of reporting on the highly populated regions of New York and New Jersey around October 30th, 2012. The text extraction process shows other details too, for example, the transition of the storm from tropical storm to hurricane in the afternoon of Oct 24th.
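    The temporal normalization described above (mapping an expression such as early Thursday to a dated period, with the publication date as the reference) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, only the 5:00 am to 11:00 am window for "early" comes from the text, and it is assumed that a bare weekday means the nearest such day on or after the publication date and that "the next day" is resolved relative to the publication date.

```python
from datetime import date, datetime, time, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]

# Only the "early" window (5:00 am to 11:00 am) is taken from the text.
PART_OF_DAY = {"early": (time(5, 0), time(11, 0))}


def resolve_day(expression: str, published: date) -> date:
    """Resolve a relative day expression against the publication date."""
    expr = expression.strip().lower()
    if expr == "today":
        return published
    if expr in ("tomorrow", "the next day"):
        # Assumption: both are read relative to the publication date.
        return published + timedelta(days=1)
    if expr in WEEKDAYS:
        # Assumption: nearest matching weekday on or after publication.
        offset = (WEEKDAYS.index(expr) - published.weekday()) % 7
        return published + timedelta(days=offset)
    raise ValueError(f"unsupported temporal expression: {expression!r}")


def resolve_period(expression: str, published: date):
    """Return (start, end) datetimes for expressions like 'early Thursday'."""
    words = expression.strip().lower().split()
    if len(words) > 1 and words[0] in PART_OF_DAY:
        start_t, end_t = PART_OF_DAY[words[0]]
        day = resolve_day(" ".join(words[1:]), published)
    else:
        # No part-of-day qualifier: cover the whole day.
        start_t, end_t = time(0, 0), time(23, 59)
        day = resolve_day(expression, published)
    return datetime.combine(day, start_t), datetime.combine(day, end_t)


if __name__ == "__main__":
    # A report published on Wednesday, October 24, 2012.
    print(resolve_period("early Thursday", date(2012, 10, 24)))
```

    Once every expression is reduced to a dated interval in this way, events from different reports can be sorted on a common timeline before geocoding.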

    Fig. 3. Extracted events relating to Hurricane Sandy from 50 CNN news reports for the period Oct 24–Nov 04, 2012.

    With the addition of ontology-based semantic processing, it is possible to learn more about the kinds of events occurring during the hurricane and where and when they occurred. Applying the ontology makes it possible to represent the extracted texts according to the different classes of events. An analysis based on a subset of 12 news articles published on October 30th, a peak day during Hurricane Sandy, shows events classified as HazardImpact, HazardResponse and HazardRecovery (Fig. 4). For example, power failed, flights cancelled, and lost homes are specializations of the class HazardImpact while emergency evacuation and firefighters battled blaze activities signify aspects of HazardResponse. The results of the analysis capture the fact that power outages were a key event for facilities impacted by Hurricane Sandy as reported in Manhattan, Queens, and Staten Island in New York, and Newark in New Jersey. Response-related events were triggered in Breezy Point, Brooklyn, and Staten Island in New York, with emergency evacuation in Moonachie, Queens, and Roosevelt in New York. Events associated with Recovery were reported in Manhattan and Syosset.

    Fig. 4. Semantic information retrieval for New York and New Jersey over space and time. The news reports are from Oct 30th, 2012, although events from Oct 29–Oct 31 were also reported in the texts.

    To understand the spatiotemporal pattern of classified events with respect to the population density of the U.S. in 2012 obtained from ESRI's 2012 Updated Demographics (http://www.arcgis.com/home/item.html?id=a18f489521ba4a589762628893be0c13), the results are mapped over space and time against high population areas. The extracted hazard-related events are clustered in major cities with high population density, such as Manhattan, Brooklyn, and Queens in New York. Based on the results, more spatiotemporal and semantic queries can be conducted to detect inherent facts that cannot be directly retrieved from the texts. For example, which locations in New York City were reported to be impacted by Sandy from noon on Oct 29th until the morning of the 30th? The spatial restriction for this query is New York City, while the semantic restriction is HazardImpact. The temporal restriction, Oct 29th afternoon to Oct 30th morning, is between 10/29/2012 12:00 pm and 10/30/2012 8:00 am. The results returned that satisfy these three conditions include the boroughs of Queens and Manhattan in New York City.
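    A query of this kind combines a spatial, a semantic, and a temporal restriction. The sketch below shows one way such a filter could be expressed, assuming each extracted event is stored as a record with a term, a place, a containing region, and a start and end time. The record layout, the names Event, ONTOLOGY_CLASS, and query, and the sample records are all hypothetical and were invented for illustration; only the term-to-class assignments and the query window are taken from the text.

```python
from dataclasses import dataclass
from datetime import datetime

# Term-to-class assignments mentioned in the text; a real ontology
# would hold many more terms and a class hierarchy.
ONTOLOGY_CLASS = {
    "power failed": "HazardImpact",
    "flights cancelled": "HazardImpact",
    "lost homes": "HazardImpact",
    "emergency evacuation": "HazardResponse",
    "firefighters battled blaze": "HazardResponse",
}


@dataclass
class Event:
    term: str
    place: str
    region: str        # hypothetical field: containing region of the place
    start: datetime
    end: datetime

    @property
    def event_class(self) -> str:
        return ONTOLOGY_CLASS.get(self.term, "Unclassified")


def query(events, region, event_class, window_start, window_end):
    """Places whose events match the region and class and overlap the window."""
    return sorted({
        e.place for e in events
        if e.region == region
        and e.event_class == event_class
        and e.start <= window_end and e.end >= window_start
    })


# Invented sample records, not the paper's data set.
EVENTS = [
    Event("power failed", "Manhattan", "New York City",
          datetime(2012, 10, 29, 20), datetime(2012, 10, 29, 23)),
    Event("power failed", "Queens", "New York City",
          datetime(2012, 10, 30, 1), datetime(2012, 10, 30, 4)),
    Event("firefighters battled blaze", "Breezy Point", "New York City",
          datetime(2012, 10, 29, 23), datetime(2012, 10, 30, 5)),
    Event("lost homes", "Staten Island", "New York City",
          datetime(2012, 10, 30, 14), datetime(2012, 10, 30, 18)),
    Event("power failed", "Newark", "New Jersey",
          datetime(2012, 10, 29, 21), datetime(2012, 10, 29, 23)),
]

if __name__ == "__main__":
    print(query(EVENTS, "New York City", "HazardImpact",
                datetime(2012, 10, 29, 12, 0), datetime(2012, 10, 30, 8, 0)))
```

    The temporal test keeps any event whose interval overlaps the query window, which is one reasonable reading of "from noon until the morning"; requiring full containment would be a stricter alternative.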

    Fig. 5. (a) Kernel density analysis for hazard impact-related events extracted from 50 CNN news reports; (b) Hurricane Sandy impact analysis (11/8/2012–4/18/2013) from FEMA Modeling Task Force.

    Hazard impact events extracted from the 50 news reports are also mapped with a kernel density analysis in order to show the spatial distribution pattern of hazard impact events (Fig. 5a). Kernel density analysis estimates the intensity of events by generating a smooth surface using a quadratic kernel function (Li, Goodchild, & Xu, 2013; Silverman, 1986). Two parameters are used in the kernel density analysis: kernel search radius (bandwidth) to calculate density and cell size for the output raster data. A kernel search radius of 55 km has been used to avoid creating a map that is too smooth or too ambiguous to interpret. A cell size of 6 km was used to show sufficient detail. The areas with high density of hazard impact events are represented with the darkest hue of red, such as the east coast of New York, New Jersey, Philadelphia, Virginia, Maryland, Baltimore, and Washington. As shown in Fig. 5b, Hurricane Sandy impact data obtained from a FEMA Modeling Task Force (http://fema.maps.arcgis.com/home/webmap/viewer.html?webmap=307dd522499d4a44a33d7296a5da5ea0) is mapped for the areas impacted by Hurricane Sandy with four different levels (i.e., very high, high, moderate, and low). Fig. 5a and b show a similar spatial pattern for the extracted HazardImpact events and the impact analysis from FEMA, suggesting that in certain cases automated extraction may be useful for a first approximation of event occurrences.

    However, there are also certain locations, for example, Washington DC, that have different impact levels in the two figures. One of the possible reasons is that a large number of hazard impact events, such as school closures, airline cancellations, power outages, and road closures associated with these areas might not be measured by FEMA. Another possible reason is that the FEMA data were collected starting from 11/8/2012, which was slightly later than the news reports. Fig. 5b uses a composite of building destroyed, surge, wind, precipitation and snow data from FEMA to assess the impact. The event information retrieved from text documents can be used in combination with other sources of data (e.g., FEMA impact data) to enhance the understanding of hazard events, for example, to estimate the number of different kinds of events reported. In addition, integrating data from different sources also allows us to discover additional information, such as find all recovery events in FEMA very high impact areas from the early morning of Oct 30th to late afternoon on November 2nd. While GIR is vulnerable to the quality of the text that is being processed (e.g., if erroneous information about events is reported, then that information will be mapped), we are optimistic about the utility of GIR for revealing the pattern of events and their semantics. The evaluation discussed in the next section underscores further the degree to which the results are meaningful and not just due to chance.
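    The kernel density step used for Fig. 5a can be illustrated with a small sketch. It assumes event locations have already been projected to planar coordinates in kilometres, and it uses the kernel K(u) = (3/π)(1 − u²)² for u < 1 given by Silverman (1986), which is commonly the form meant by a quadratic kernel in GIS software. The exact computation performed by the software used in the study is not stated in the text, so the function names and grid layout here are assumptions.

```python
import math


def quadratic_kernel(u: float) -> float:
    """Silverman's (1986) kernel; zero outside the unit disc."""
    return (3.0 / math.pi) * (1.0 - u * u) ** 2 if u < 1.0 else 0.0


def kernel_density(points_km, radius_km=55.0, cell_km=6.0):
    """Return (x0, y0, grid): grid[row][col] is events per square km.

    points_km: list of (x, y) event coordinates in km (assumed projected).
    radius_km: kernel search radius (bandwidth); 55 km in the text.
    cell_km:   output cell size; 6 km in the text.
    """
    xs = [p[0] for p in points_km]
    ys = [p[1] for p in points_km]
    # Pad the extent by one radius so every kernel lies inside the grid.
    x0, y0 = min(xs) - radius_km, min(ys) - radius_km
    ncols = math.ceil((max(xs) + radius_km - x0) / cell_km)
    nrows = math.ceil((max(ys) + radius_km - y0) / cell_km)
    grid = []
    for r in range(nrows):
        cy = y0 + (r + 0.5) * cell_km          # cell-center y
        row = []
        for c in range(ncols):
            cx = x0 + (c + 0.5) * cell_km      # cell-center x
            total = sum(
                quadratic_kernel(math.hypot(cx - px, cy - py) / radius_km)
                for px, py in points_km
            )
            row.append(total / radius_km ** 2)
        grid.append(row)
    return x0, y0, grid


if __name__ == "__main__":
    # Three invented event locations: two close together, one far away.
    x0, y0, grid = kernel_density([(0.0, 0.0), (10.0, 0.0), (200.0, 0.0)])
    print(len(grid), "rows x", len(grid[0]), "columns")
```

    A larger search radius spreads each event over more cells and yields a smoother surface, while a smaller cell size only refines the sampling of that surface, which is why the two parameters are chosen separately.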

    7. Evaluation of approach

    The set of steps presented in Section 5 shows how extracted information from documents can be integrated to represent the dynamics of events described in news reports on hazard events. In addition to comparing the results with FEMA maps, an evaluation has been undertaken for testing the automatic association of extracted events with the appropriate location, time and semantic information. The results suggest that these methods are promising for generating appropriate spatiotemporal and semantic assignments. The evaluation was conducted using 10% of the Hurricane Sandy data set (10 news reports). In IR and NLP, the most commonly used evaluation approach is to compute precision and recall statistics over a set of evaluation data (Manning, Raghavan, & Schütze, 2008). Precision refers to how many of the extracted results are correct. The higher the precision, the fewer errors are contained in the extracted results. Recall indicates how many of the items that should have been detected are actually extracted. The highest recall value (i.e., 1) indicates that all the results that need to be extracted are actually extracted (Sitter, Calders, & Daelemans, 2004). In this evaluation, we have slightly revised the traditional precision and recall metrics to measure the performance of our approach.

    STSprecision = the number of correctly resolved spatiotemporal semantic references / the number of spatiotemporal semantic references that the system or users attempt to resolve;

    STSrecall = the number of correctly resolved spatiotemporal semantic references / the number of all such references.
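    As a worked check of these two definitions, the short sketch below plugs in the counts reported later in this section (94 correct, 19 incorrect, and 21 missed references). The function names are illustrative only. The recall denominator is taken here as correct plus missed references (115), an assumption that reproduces the reported 0.82; dividing by the 134 quoted for the gold-standard set would instead give about 0.70.

```python
def sts_precision(correct: int, incorrect: int) -> float:
    """Correctly resolved references over all references attempted."""
    return correct / (correct + incorrect)


def sts_recall(correct: int, missed: int) -> float:
    """Correctly resolved references over all references that exist.

    Assumption: 'all such references' = correct + missed.
    """
    return correct / (correct + missed)


if __name__ == "__main__":
    correct, incorrect, missed = 94, 19, 21
    print(round(sts_precision(correct, incorrect), 2))  # 94 / 113
    print(round(sts_recall(correct, missed), 2))        # 94 / 115
```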

    To compute the STSprecision and STSrecall values, automatically processed results by the system are compared with a gold standard in order to acquire the numbers of correctly resolved spatiotemporal references, incorrectly resolved spatiotemporal references, and missing spatiotemporal references. We recruited human subjects to provide a gold standard for this evaluation (Clark, Fox, & Lappin, 2010). Five volunteers served as human evaluators and manually processed the evaluation data. The volunteers were trained before they conducted the evaluation tasks using two sample news reports along with instructions on how to recognize spatial, temporal and event information. In the two training documents, spatial, temporal, and event information had been annotated by a human expert. The instructions provide predetermined criteria and examples for volunteers so they understand the definition of spatial, temporal, and semantic information and the assignment tasks. After training, volunteers were assigned 10 CNN news reports to process. Each volunteer manually annotated spatial, temporal, and semantic terms from the 10 news reports, and assigned the annotated terms to events based on the context in the text documents. The results are expressed as a set of {spatial, temporal, semantic} vectors stored in a .csv database file.

    It is recognized that human subjects might not be consistent in their judgments of spatial, temporal and semantic information contained in the news reports, as well as how to detect spatial and temporal information and assign it to an event. Therefore, Kappa statistics are applied to the results of the human evaluators to achieve the gold standard (Viera & Garrett, 2005; Manning et al., 2008). This involves comparing the results from each volunteer. In this evaluation, the gold standard is based on results with at least 80% agreement (four out of five volunteers). Results that are lower than 80% agreement are required to be rechecked and discussed with the tester, to decide whether they should be included or excluded from the gold standard. Results with 0% agreement are excluded.

    The results are based on 113 processed spatiotemporal and semantic references in the text documents. For this evaluation, there were 94 correct references, 19 incorrect references, and 21 missed spatiotemporal and semantic references compared with the gold standard (134 spatiotemporal and semantic references set). Based on this performance, STSprecision and STSrecall are calculated as 0.83 and 0.82 respectively. The STSprecision and STSrecall values achieved by our method compare well with other precision and recall values in the field of information extraction, for example, the precision (0.756) and recall (0.4135) values from GATE using ANNIE on spatial information retrieval (Clough, 2005), and average evaluation results (precision 77.6 and recall 86.1) using benchmark data to process spatial and temporal information (Strötgen, Gertz, & Popov, 2010).

    In this evaluation, the quality of extracted spatial results for the 50 CNN Web news reports was also checked. The geographic locations can be grouped into two different classes: explicit locations (i.e., places that are equal to or finer in spatial granularity than cities) and generalized locations (e.g., Southern New Jersey or East Coast). The results contain 262 explicit locations and 386 generalized locations. It is worth noting that for the case study, the analysis was undertaken with CNN articles. It is probably the case that detailed geographic locations (e.g., explicit street names or addresses) are less likely to be reported in these news reports than generalized locations. Moving to a different, regional or local news source may increase the relative amount of more detailed location information.

    8. Conclusions and future work

    In this research, an approach to automatically extract spatiotemporal and semantic information from Web news reports about hazards using ontologies with NLP and GIR techniques is presented. The extracted results are mapped using ArcGIS to represent the spatiotemporal patterns of hazard events. Events can be categorized into different classes according to their semantic properties and formalized in an ontology. Incorporating semantics in spatiotemporal information retrieval provides insight into facts that cannot be directly retrieved from texts, for example, what kinds of events were reported over a certain time window or in a certain region. The approach has been applied to a case study of 50 CNN Web news reports on Hurricane Sandy from October to early November, 2012. The spatial and temporal characteristics associated with the hazard-related events can be automatically extracted and visualized without going through each individual document. The results provide new details about the kinds of events occurring during a hazard (impact, response or recovery events), patterns of change (e.g., tropical storm to hurricane), and spatiotemporal trends for hazard events as well as an understanding of events at multiple spatial and temporal granularities. This work addresses two research questions: one on the integration of gazetteers and ontologies for semantic information retrieval, and one on the contribution of semantic information for mapping hazard dynamics. We describe how gazetteers can be linked with ontologies to relate event information at different granularities. Mapping semantic information associated with hazard events, as shown in this work, also contributes to an understanding of event dynamics from different perspectives (human-related activities vs. environmental phenomena).

    While the results obtained for spatiotemporal and semantic information retrieval are exciting, developing the approach is not totally straightforward and many steps are needed to ensure that integration occurs between the various stages. Future efforts will be directed toward improving and extending the gazetteers, for example, including spatial and temporal relations and dealing with more complex phrases, e.g., in and around Iowa City. Also, since multiple news reports can refer to the same event, having a strategy to filter the documents to avoid multiple representations would be useful (Wang & Stewart, 2013). Additional open challenges remain since generalized or vague locations or temporal references, for example, are commonly included in text, and how to translate these places into relevant event locations is still an open research question. News articles have been useful to develop and test this research and this approach may be especially promising for obtaining information about local dynamics where text-based descriptions may be one of the few available sources of information about events that are happening in a local area, although the choice of news outlet could impact the granularity of spatial information. In addition, Twitter messages are another source for spatiotemporal event information that could be examined as tweets are designed to work as micro versions of blogs or news reports. One of the most important advantages of Twitter is the rapid transmission of information. Future work could involve developing an extension of the research applied to Twitter feeds that adds real-time spatiotemporal information to complement the results of this research, for example, for hazard damage assessment and hazard response planning.

    References

    Allen, J. F., & Ferguson, G. (1994). Actions and events in interval temporal logic. Journal of Logic and Computation, Special Issue on Action and Processing, 205–245.

    Arendarenko, E., & Kakkonen, T. (2012). Ontology-based information and event extraction for business intelligence. Artificial Intelligence: Methodology, Systems, and Applications. Lecture Notes in Computer Science, 7557, 89–102.

    Buscaldi, D., Bessagnet, M. N., Royer, A., & Sallaberry, C. (2014). Using the semantics of texts for information retrieval: A concept- and domain relation-based approach. In Proceedings of the 17th East European conference on advances in databases and information systems: New trends in databases and information systems (Vol. 241, pp. 257–266), Genoa, Italy.

    Chasin, R., Woodward, D., Witmer, J., & Kalita, J. (2013). Extracting and displaying temporal and geographical entities from articles and historical events. The Computer Journal, 57(2). http://dx.doi.org/10.1093/comjnl/bxt112.

    Chowdhury, G. (2003). Natural language processing. Annual Review of Information Science and Technology, 37, 51–89.

    Clark, A., Fox, C., & Lappin, S. (2010). The handbook of computational linguistics and natural language processing. 978-1-4051-5581-6. Wiley-Blackwell.

    Clough, P. (2005). Extracting metadata for spatially aware information retrieval on the internet. In Proceedings of the 2005 workshop on geographic information retrieval (pp. 25–30), Bremen, Germany.

    Cruz, I. F., Ganesh, V. R., & Mirrezaei, S. I. (2013). Semantic extraction of geographic data from web tables for big data integration. In Proceedings of the 7th workshop on geographic information retrieval (pp. 19–26), Orlando, Florida.

    Cruz, I. F., & Xiao, H. (2005). The role of ontologies in data integration. Engineering Intelligent Systems for Electrical Engineering and Communications, 13(4), 245.

    Delboni, T. M., & Borges, K. A. (2007). Semantic expansion of geographic web queries based on natural language positioning expressions. Transactions in GIS, 3, 377–397.

    Fernández, M., Cantador, I., Lopez, V., Vallet, D., Castells, P., & Motta, E. (2011). Semantically enhanced information retrieval: An ontology-based approach. Web Semantics: Science, Services and Agents on the World Wide Web, 9(4), 434–452.

    Gangemi, A., Guarino, N., Masolo, C., Oltramari, A., & Schneider, L. (2002). Sweetening ontologies with DOLCE. In Proceedings of the 13th international conference on knowledge engineering and knowledge management, London, UK.

    Goodchild, M. F., & Hill, L. L. (2008). Introduction to digital gazetteer research. International Journal of Geographical Information Science, 22(10), 1039–1044.

    Gregory, I. N., & Hardie, A. (2011). Visual GISting: Bringing together corpus linguistics and geographical information systems. Literary and Linguistic Computing, 26(3), 297–314.

    Grover, C., Tobin, R., Byrne, K., Woollard, M., Reid, J., Dunn, S., et al. (2010). Use of the Edinburgh geoparser for dereferencing digitized historical collections. Literary and Linguistic Computing, 368(1925), 3875–3889.

    Guarino, N. (1998). Formal ontology and information systems. In Formal ontology in information systems. Proceedings of the 1st international conference on formal ontology in information systems (pp. 3–15), Trento, Italy.

    Hu, Z., Li, J. S., Zhou, T. Sh., Yu, H. Y., Suzuki, M., & Araki, K. (2011). Ontology-based clinical pathways with semantic rules. Journal of Medical Systems, 36, 2203–2212.

    Imran, M., Elbassuomi, S., Castillo, C., Diaz, F., & Meier, P. (2013). Extracting information nuggets from disaster-related messages in social media. In Proceedings of the 10th international ISCRAM conference, Baden-Baden, Germany.

    Janowicz, K., & Keßler, C. (2008). The role of ontology in improving gazetteer interaction. International Journal of Geographical Information Science, 22(10), 1129–1157.

    Janowicz, K., Scheider, S., Pehle, T., & Hart, G. (2012). Geospatial semantics and linked spatiotemporal data-past, present, and future. Semantic Web, 3, 321–332.

    Jones, C. B., & Purves, R. (2008). Geographical information retrieval. International Journal of Geographical Information Science, 22(3), 219–228.

    Jones, C. B., Purves, R. S., Clough, P. D., & Joho, H. (2008). Modelling vague places with knowledge from the web. International Journal of Geographical Information Science, 22(10), 1045–1065.

    Kalfoglou, Y., & Schorlemmer, M. (2003). Ontology mapping: The state of the art. The Knowledge Engineering Review, 18(1), 1–31.

    Karimzadeh, M., Huang, W., Wallgrun, J. O., Hardisty, F., Pezanowski, S., Mitra, P., et al. (2013). GeoTxt: A web API to leverage place references in text. In Proceeding of the 7th workshop on geographic information retrieval (pp. 72–73), Orlando, Florida.

    Kemp, Z., Tan, L., & Whalley, J. (2007). Interoperability for geospatial analysis: A semantics and ontology-based approach. In Proceedings of the eighteenth Australian database conference (Vol. 63, pp. 83–92), Ballarat, Victoria, Australia.

    Kontopoulos, E., Berberidis, C., Dergiades, T., & Bassiliades, N. (2013). Ontology-based sentiment analysis of twitter posts. Expert Systems with Applications, 40(10), 4065–4074.

    Lee, C. H., Wu, Ch. H., Yang, H. Ch., Wen, W. Sh., & Chiang, Ch. Y. (2013). Exploiting online social data in ontology learning for event tracking and emergency response. In Proceedings of the 2013 IEEE/ACM international conference on advances in social networks analysis and mining (pp. 1167–1174), New York, NY.

    Li, L., Goodchild, M., & Xu, B. (2013). Spatial, temporal and socioeconomic patterns in the use of Twitter and Flickr. Cartography and Geographic Information Science, 40(2), 61–77.

    Machado, I. M. R., Alencar, R. O. D., Campos, R. D. O., & Clodoveu, A. D. (2011). An ontological gazetteer and its application for place name disambiguation in text. Journal of the Brazilian Computer Society, 17(4), 267–279.

    Manning, C. D., Raghavan, P., & Schütze, H. (2008). Introduction to Information Retrieval. Cambridge: Cambridge University Press.

    Noy, N. F. (2004). Semantic integration: A survey of ontology-based approaches. SIGMOD Record, 33(4), 65–70.

    Purves, R. S., Clough, P., Jones, C. B., Arampatzis, A., Bucher, B., Finch, D., et al. (2007). The design and implementation of SPIRIT: A spatially aware search engine for information retrieval on the internet. International Journal of Geographical Information Science, 21(7), 717–745.

    Purves, R., & Jones, C. (2010). Geographic information retrieval. SIGSPATIAL Special, 3(2), 2–4.

    Resnik, P., & Lin, J. (2010). Evaluation of NLP systems. The Handbook of Computational Linguistics and Natural Language Processing, 57, 271.

    Saggion, H., Funk, A., Maynard, D., & Bontcheva, K. (2007). Ontology-based information extraction for business intelligence. The Semantic Web. Lecture Notes in Computer Science, 4825, 843–856.

    Sankaranarayanan, J., Teitler, B. E., Lieberman, M. D., & Sperling, J. (2009). TwitterStand: News in tweets. In Proceedings of 17th ACM SIGSPATIAL international conference on advances in geographic information systems (pp. 42–51), Seattle, WA, USA.

    Sawsaa, A. F., & Lu, J. (2012). Developing a domain ontology of information science (OIS). International Conference on Information Society, 447–452.

    Schorlemmer, W. M., & Kalfoglou, Y. K. (2008). Institutionalising ontology-based semantic integration. Applied Ontology, 3(3), 131–150.

    Silverman, B. W. (1986). Density estimation for statistics and data analysis. London: Chapman and Hall.

    Sitter, A. D., Calders, T., & Daelemans, W. (2004). A formal framework for evaluation of information extraction.

    Smith, B., & Ceusters, W. (2010). Ontological realism: A methodology for coordinated evolution of scientific ontologies. Applied Ontology, 5(3–4), 139–188.

    Stewart, K., Fan, J. C., & White, E. (2013). Thinking about space–time connections: Spatiotemporal scheduling of individual activities. Transactions in GIS, 17(6), 791–807.

    Stewart Hornsby, K., & Wang, W. (2010). Representing dynamic phenomena based on spatiotemporal information extracted from web documents. Extended abstracts. In Proceedings of the sixth international conference on geographic information science, Zurich, Switzerland.

    Strötgen, J., Gertz, M., & Popov, P. (2010). Extraction and exploration of spatiotemporal information in documents. In Proceedings of the 6th workshop on geographic information retrieval, GIR 2010 (pp. 1–8), Zurich, Switzerland.

    Teitler, B. E., Lieberman, M. D., Panozzo, D., Sankaranarayanan, J., Samet, H., & Sperling, J. (2008). NewsStand: A new view on news. In Proceedings of the 16th ACM SIGSPATIAL international conference on advances in geographic information systems (Vol. 18, pp. 1–10), Irvine, California, USA.

    Vasardani, M., Winter, S., & Richter, K. F. (2013). Locating place names from place descriptions. International Journal of Geographical Information Science, 27(12), 2509–2532.

    Viera, A. J., & Garrett, J. M. (2005). Understanding interobserver agreement: The Kappa statistic. Family Medicine, 37(5), 360–363.

    Volz, R., Kleb, J., & Mueller, W. (2007). Towards ontology-based disambiguation of geographical identifiers. In Proceedings of WWW2007 Workshop I3: Identity, identifiers, identification, entity-centric approaches to information and knowledge management on the web, Banff, Canada.

    Wang, W. (2014). Automated spatiotemporal and semantic information extraction for hazards. Ph.D. dissertation. University of Iowa Library.

    Wang, W., & Stewart, K. (2013). Extracting spatiotemporal and semantic information from documents. In Proceeding of the 7th workshop on geographic information retrieval (pp. 17–18), Orlando, Florida.

    Wang, W., & Stewart, K. (2014). Creating spatiotemporal semantic maps from web text documents. In M.-P. Kwan, D. Richardson, D. Wang, & C. Zhou (Eds.), Space–time integration in geography and GIScience: Research Frontiers in the US and China. Dordrecht: Springer.

    Welty, C. A., & Murdock, J. W. (2003). IBM research report. The semantics of multiple annotations.

    Wiegand, N., & Garcia, C. (2007). A task-based ontology approach to automate geospatial data retrieval. Transactions in GIS, 11(3), 355–376.

    Worboys, M. F., & Stewart Hornsby, K. (2004). From objects to events: GEM, the geospatial event model. In Proceeding of the 3rd international conference on geographic information sciences (Vol. 3234, pp. 327–344), Adelphi, MD.

    W. Wang, K. Stewart / Computers, Environment and Urban Systems 50 (2015) 30–40
