ARCGIS GEODATABASE DATA MODEL FOR CAVE SCIENCE
A THESIS PRESENTED TO THE DEPARTMENT OF GEOLOGY AND GEOGRAPHY
IN CANDIDACY FOR THE DEGREE OF MASTER OF SCIENCE
By AARON ADDISON
NORTHWEST MISSOURI STATE UNIVERSITY MARYVILLE, MISSOURI
JULY, 2006
ArcGIS Geodatabase Data Model
for Cave Science
Aaron Addison
Northwest Missouri State University
THESIS APPROVED
Thesis Advisor Date
Dean of Graduate School Date
Abstract
The purpose of this research is to determine whether a usable ArcGIS geodatabase data
model could be developed for use in cave science. Traditionally, cave scientists, or
speleologists, have collected various data in multiple formats. In many cases, researchers
are collecting the same data using different methodologies. This is undesirable not only
because of the repetition of work, but perhaps more importantly because many of the more
scientifically interesting caves are such fragile environments that they cannot tolerate
additional and especially redundant data collection. Additionally, the geodatabase
provides a common data format for researchers to exchange data when working with
colleagues or personnel from agencies that manage caves.
An ArcGIS geodatabase data model was developed utilizing ArcCatalog and
standard cave feature classifications. The data model was then tested against an existing
traditional cave map to determine whether or not the geodatabase model was functional.
The results were encouraging as the data model was able to handle the majority of data
types and their accompanying representation. Problem areas discovered during
development included the inability of the geodatabase to facilitate multiple types of
geometry for a single feature class and cartographic map finishing.
TABLE OF CONTENTS
ABSTRACT
TABLE OF CONTENTS
LIST OF FIGURES
ACKNOWLEDGEMENTS
LIST OF ABBREVIATIONS
CHAPTER
I: INTRODUCTION
    The Geodatabase Model
    Research Objectives
    Study Area
    Rationale for Cave Science Data Model
II: LITERATURE REVIEW
    Cave Survey Software
    Application of GIS Related to Speleology
    Literature Review Summary
III: CONCEPTUAL FRAMEWORK AND METHODOLOGY
    Study Area: Great Onyx Cave
    Data Source Descriptions
    Research Methodology
IV: RESULTS AND TESTING
    Conceptual Design
    Logical Design
    Physical Design
        Survey Feature Dataset
        Passages Feature Dataset
        Hydrology Feature Dataset
        CrossSections Feature Dataset
        Profile Feature Dataset
        Object Classes
        Raster Data Classes
        Page Layout Theme
        Supporting Attribute Domains
    Testing the Data Model
V: CONCLUSIONS
    Future Research
APPENDICES
    Appendix A: ArcCatalog View of Cave Science Data Model
    Appendix B: National Speleological Society Map Symbols
    Appendix C: Missouri Speleological Survey Map Symbols
    Appendix D: Test Data Sample: Great Onyx Cave
REFERENCES
LIST OF FIGURES
Figure
1. Study area
2. WALLS shapefile export dialog
3. COMPASS shapefile export dialog
4. Study area detail: Great Onyx Cave
5. Data model thematic layers
6. Geodatabase design structure
7. SurveyStations feature class
8. SurveyVectors feature class
9. SurveyAnnotation feature class
10. SurveyStations featured linked relationship class
11. PassageWalls feature class with subtypes
12. PassageFeatures feature class with subtypes
13. Ceiling channel and Joint attribute domains
14. FloorMaterial feature class with subtypes
15. Speleothems feature class with subtypes
16. PassageAnnotation feature class
17. FeaturePoint feature class with subtypes
18. Streams feature class with subtypes
19. Pools feature class with subtypes
20. Sump Type and Water Quality attribute domains
21. Cross section of plan view
22. CrossSections feature class
23. CrossSectionsNearSurveyStation relationship class
24. ProfileLines feature class
25. ResearchProjects object class
26. Researchers object class
27. ResearchProjecthasResearcher relationship class
28. Internal Annotation attribute domains
29. Import from CAD geodatabase tool
ACKNOWLEDGEMENTS
I wish to thank my friends and fellow cavers for their assistance and support of
this research project. In particular, the collective understanding of Bernie Szukalski,
Alan Glennon, and David McKenzie, on the topics of GIS and cave science is greatly
appreciated. Without their input and suggestions, this project would not have been
possible. I also need to thank my family for missed time together as I completed my
research.
The Cave Research Foundation (CRF) and the National Park Service (NPS) were
kind enough to provide the test data and research for the data model. The members of
CRF and the NPS staff are truly dedicated to the exploration and management of the
world class cave resources found within Mammoth Cave National Park. I would
especially like to thank Mark DePoy and Rick Olson with the NPS for their support, and
Dave West and Bob Osburn for their suggestions for the data model.
I would also like to thank my thesis committee members Dr. Patty Drews, Dr.
Ming Hung, and Dr. Rickard Toomey III for their support and guidance along the way.
In addition I would like to thank all of my instructors both past and present for taking the
time to teach.
I dedicate this work to my wife Anica, and my daughters Aslee and Anora. They
have supported me unwaveringly throughout my studies.
LIST OF ABBREVIATIONS
CAD Computer Aided Drafting
COTS Commercial Off-The-Shelf Software
CRF Cave Research Foundation
CRWR Center for Research in Water Resources
DOQQ Digital Orthophoto Quarter Quadrangle
ESRI Environmental Systems Research Institute
GIS Geographic Information System
GRASS Geographical Resources Analysis Support System
GOC Great Onyx Cave
MrSID Multi Resolution Seamless Image Database
MSS Missouri Speleological Survey
MXD ESRI map document
MXT ESRI map template
OGC Open Geospatial Consortium
NAD North American Datum
NPS National Park Service
NSS National Speleological Society
SMAPS Survey Manipulation, Analysis and Plotting Software
TIF Tagged Image File
USGS United States Geological Survey
CHAPTER I
INTRODUCTION
Caves have been used by humans for various activities since prehistoric times.
People have used caves as shelter, burial sites, storehouses, natural laboratories, and even
as places to worship various gods. The scientific study of caves and the cave
environment is called speleology (Moore and Sullivan 1978). Speleology crosses many
different areas of scientific specialization. Obvious areas include geology and hydrology,
but other areas of science are equally important, such as biology, microbiology,
paleontology, anthropology, and meteorology. Speleology may be compared with the
domain of geography in the sense that it is not bound by a single area of study.
Caves provide a valuable laboratory for science. They have unique environments
that provide the setting for unique science. Scientists use caves to study undisturbed
environments (White 1988). Biologists are using caves to learn what micro-organisms
might be found on Mars (Cole 1999). Still other scientists are studying caves to better
understand groundwater contamination (Pfaff and Glennon 2004). Geologists use caves
to get a firsthand view of the processes that continue to form the Earth.
The decentralized characteristics of speleology have given rise to a multitude of
data gathering and data storage techniques. In many cases, researchers are collecting the
same data using different methodologies. This is undesirable not only because of the repetition
of work, but perhaps more importantly because many of the more scientifically
interesting caves are such fragile environments that they cannot tolerate additional,
sometimes redundant, data collecting.
Unfortunately, there is no common ground for the various scientific domains to
conduct cross analysis with other domains. Geographic Information Systems (GIS) have
shown great promise in addressing this need for the scientific community. Geographic
patterns and relationships that were previously unknown or overlooked can be discovered
with simple mapping tools within GIS. Analysis of more complex relationships and even
forecasting behavior or presence/absence of conditions can be conducted with the
modeling tools in GIS.
The cave science projects that have integrated GIS philosophies and techniques
have discovered the usefulness of GIS. Unfortunately, many of these projects are also
challenged by the generic nature of the data storage in GIS, the steep learning curves for
commercially available GIS software, and file management.
All of the scientists studying caves and the cave environment are collecting large
amounts of data. Much of this data is spatial in nature. Scientists are recording the
location of specific species occurrences, archeological remains, and geological features.
The common thread of all data collection is that location is important. Other data being
collected includes inventory information on environmental conditions and areas that are
at risk for contamination. These data are used by scientists looking for certain
environmental conditions and the spatial relationships between the features found in the
cave environment. GIS provides many valuable tools in working with this field data.
First, it allows the storage of data gathered in a logical format which allows for easy data
maintenance and retrieval. Second, GIS provides an environment where the cave
scientists can visualize their spatial data. Visualization provides a powerful method of
data analysis that can reveal data relationships not easily seen in tabular format. Finally,
GIS tools allow the researchers to leverage other scientists’ data and data collected by
governmental agencies to conduct domain specific research without needing to incur the
significant effort of base data collection such as cave mapping.
Many researchers and governmental agencies are already using GIS in their cave
science work. These implementations are not following any established guidelines for
data storage or a data model. In some cases, such as the National Park Service (NPS),
individual GIS data users are often left to their own data design without strong intra-
agency coordination, let alone coordination with other agencies. This can create
confusion when individuals move to other park units or data must be exchanged between
parks for research or management projects within a single park.
Data modeling can be as much an art as a structured exercise. Shaw (2005)
notes that data modeling has become “more art than science”. However, he follows the
statement by comparing data modeling to disciplines such as architecture or engineering
where art and creativity are at the forefront, but they are only useful when built on a
foundation of “rules and knowledge”. Creating a usable data model for cave science or
any other domain is a balancing act of leveraging the structure of the data model against the
nuances of the data being stored and analyzed.
The Geodatabase Model
In 1999, a leading GIS software vendor, Environmental Systems Research
Institute (ESRI), introduced a new concept for GIS data storage. The new data structure
was called the geodatabase and held great promise for not only data storage, but for data
topology and analysis as well. Over the last five years, ESRI and users have developed
several different data models for various industries (ESRI 2006a). These data models
provide a framework and a starting point for users to begin using the geodatabase for data
storage. Data models may also be designed to support more specialized models that
extend the usability of the data. One example of this type of extension would be the
creation of a geometric network based on a street centerline feature class. Lastly, the data
model structure provides a consistent format for researchers and organizations that must
exchange data.
Cave science and GIS have both made great advances. As information
management and storage become more and more important for speleology, the ArcGIS
family of products and the geodatabase model may provide much needed tools to
advance science. If a geodatabase model for speleology can be developed, it would
provide an important tool for all of the researchers working on cave science.
Research Objectives
The objective for this research is to investigate whether or not a usable ArcGIS
geodatabase data model can be created and implemented for speleological science. The
term “usable” is an important one. It is possible to develop a model that encompasses the
needs of the domain, yet is not practical to implement in real-world research.
This research does not endeavor to include every imaginable scenario that may be
needed in a cave science data model. Often data modelers will pride themselves on the
completeness of their model design, even though the model becomes unwieldy for the
community of users that it was designed to help. Esoteric data models also carry
excess baggage in the form of unused feature classes, attributes, and relationships.
The design goal of a “usable” data model is centered on the premise of including the
features and functionality currently being used by the cave research community.
Study Area
The study area for this research is the front section of Great Onyx Cave (GOC).
Great Onyx is a large cave situated under Flint Ridge in the northern area of Mammoth
Cave National Park, Kentucky (Figure 1). Mammoth Cave National Park is situated in
the south-central region of Kentucky. Mammoth Cave was established as a National Park
in 1941. Although Mammoth Cave and several other caves in the park were used by
prehistoric man, Mammoth Cave received Federal protection because of its vast network
of interconnected tunnels and passageways. These passages comprise the longest known
cave system in the world at just over 346 miles of mapped passage (Osburn 2005). The
entire park area was designated as a World Biosphere Reserve in 1996 by the United
Nations.
Great Onyx operated as a commercial cave long before the region was set aside as
a national park. Edmund Turner discovered it in 1916. It was operated as an independent
commercial cave until 1961, when the property was sold to the National Park Service. A
road to the cave and a hotel at the entrance were constructed. Fortunately, the owners of
GOC understood the need to protect this significant cave. Today, GOC is publicly
accessible only on ranger-led trips operated by the National Park Service. The visitation
and notable cave resources lead to significant management concerns. The Cave
Research Foundation (CRF) continues to explore and document the cave today under a
cooperative agreement with the NPS. Great Onyx was chosen as a test for the cave
science data model as it has a complete map of the entrance area, in addition to a number
of completed and ongoing research projects.
Figure 1 – Study area (map labels: Mammoth Cave National Park; Great Onyx Cave)
Rationale for Cave Science Data Model
Work on geodatabase data models is widespread compared to work on cave
science. ESRI created the geodatabase model and understandably has published the
major works discussing the design and usability of the geodatabase data model. As
discussed previously, most of this work has been targeted towards large market segments
that could benefit from the geodatabase model. Specifically the information presented by
MacDonald (2001) and Arctur and Zeiler (2004) provides guidelines for designing and
implementing a new geodatabase model. These steps are outlined and detailed in the
Research Methodology section of this thesis.
The ESRI geodatabase was chosen as a foundation for this research in large part
because ESRI is the market leader in GIS software (ESRI 2002). There are several other
GIS software packages that provide more flexibility for data storage. For example, CAD
based GIS software will allow mixed geometry types within a single feature class. While
such functionality is attractive, especially in the context of cartographic production,
development of a data model for these software packages would almost certainly limit the
already niche user base of the data model.
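The single-geometry constraint described above can be illustrated with a minimal sketch in plain Python. It does not use ArcGIS or any ESRI API; the FeatureClass type and the split_by_geometry helper are hypothetical names introduced only for illustration. The sketch assumes that a feature class accepts one geometry type, so mixed records of the kind a CAD layer might hold have to be separated by geometry type before loading.

```python
from dataclasses import dataclass, field


# Hypothetical stand-in for a geodatabase feature class; not an ArcGIS API.
@dataclass
class FeatureClass:
    name: str
    geometry_type: str  # e.g. "Point", "Polyline", or "Polygon"
    features: list = field(default_factory=list)

    def add(self, geometry_type: str, attributes: dict) -> None:
        # A feature class accepts only the geometry type it was created with.
        if geometry_type != self.geometry_type:
            raise ValueError(
                f"{self.name} stores {self.geometry_type} features, "
                f"not {geometry_type}"
            )
        self.features.append((geometry_type, attributes))


def split_by_geometry(records):
    """Group mixed-geometry records (e.g. from a CAD layer) by geometry type."""
    groups = {}
    for geometry_type, attributes in records:
        groups.setdefault(geometry_type, []).append((geometry_type, attributes))
    return groups
```

Under these assumptions, a mixed CAD-style layer would be passed through split_by_geometry first, and each resulting group loaded into its own feature class.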
It is difficult to imagine research in the cave science environment that does not
rest squarely on spatial data. The research community appears to be slowly embracing
GIS philosophy and toolsets. This idea is supported by the fact that the various cave
survey programs have started to incorporate basic GIS functionality in their software.
Although the geodatabase is subject to the rapidly changing domain of GIS, adopting this
file format and data structure is gaining momentum as a replacement for legacy formats such
as ESRI’s shapefile and coverage. The geodatabase simplifies file management
and data management, and extends the capabilities of the data for the end user.
CHAPTER II
LITERATURE REVIEW
Cave Survey Software
Significant research has been done in cave science and in geodatabase models.
However, a review of the literature suggests that almost no research activity has taken
place in connecting these two areas. GIS is still an emerging concept to many
cave science researchers. Cave science is considered a niche science to the greater GIS
industry and has gone largely unnoticed by the industry leaders. Geodatabase models
have concentrated on more widely used models such as parcel mapping, utility networks,
and transportation systems. Many of these models may have applications to a speleology
data model.
Various types of cave survey software have been developed over the last 30 years.
In most cases the programs have been developed through the effort of an individual caver
with the unique combination of computer programming and cave survey skills. The
following is a summary, organized by author, of significant milestones and differing
points of view in the context of data modeling of cave data.
The Survey Manipulation, Analysis and Plotting Software (SMAPS) was
developed by Doug Dotson in the early 1990s (Dotson 1992). SMAPS was an MS-DOS-based
software package that was designed to address all of the aspects of the cave
management process. The software introduced a hierarchical, or tree based, file storage
system and the ability to add a geographical reference to the data, both firsts for cave
researchers. In later versions of the software, Dotson created a GIS option that could be
added to the base module of SMAPS by including an early version of the open source
Geographical Resources Analysis Support System (GRASS) GIS system. The GIS
functions included the ability to add attribute data to the base survey data, display surface
contours, and run basic queries. In addition to the on-screen functionality of SMAPS, it
also provided printer and plotter drivers so that researchers could generate hard copy
records of their maps.
One of the first software packages to store spatial data for cave researchers was a
FORTRAN-based program named Ellipse, developed by David McKenzie in the
mid-1970s (McKenzie 2006). Ellipse ran on a mainframe computer and was capable of
generating not only line plots of collected survey data, but also the associated walls of the
passages. The software gained widespread use by cave projects in the southern USA and
throughout Mexico. McKenzie has continued development on his software for over
thirty years. The software was ported to micro-computers in the 1980s and ultimately to
the C++ language running on the Windows platform in 1994 (McKenzie 2006).
The Windows version of the software is called WALLS. WALLS utilizes a tree
structure for storage of data files and a custom binary database for storing processed data.
This combination of features allows the software to efficiently handle hundreds of
thousands of data points. Storing the data files separately has the added benefit of
allowing multiple users to work on the data files prior to processing. In addition to
storing traditional survey stations, vectors and geographical reference, WALLS also
allows the user to store attribute information in the form of “flags” and “notes”. Flags are
typically used to indicate specific areas of interest, while notes are used to add notation
text to a feature. In both cases, the attribute data is stored as ancillary text in the data file.
McKenzie has also added the ability to export the four data layers described above, along
with the passage walls to ESRI’s shapefile format (Figure 2). These four data layers are
separated into survey stations, survey lines, flags and notes. Survey stations are defined
as point features and represent the traverse points in the cave. Survey lines are line
features and connect survey stations. Flags are point features and provide a way to
isolate specific stations for the purpose of alternate symbology. The note data layer is
also a point feature that provides a way for WALLS to replace survey station names with
longer, more descriptive names. Both the flag and note data layers would be replaced with
simple attribute data in the context of ArcGIS. The export functionality of WALLS has
allowed the cave science community to leverage the analysis tools of ArcGIS in a wide
range of applications. Later versions of the shapefile export utility allow the creation of
3D shapefiles that can be used in conjunction with ArcGIS extensions such as 3D Analyst
and Spatial Analyst.
Figure 2 – WALLS shapefile export dialog (McKenzie 2006)
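The statement that the flag and note layers would become simple attribute data can be sketched as follows. This is a minimal illustration in plain Python, not the WALLS export format or an ArcGIS workflow; the function name, the tuple layouts, and the output field names are assumptions made for the example.

```python
from collections import defaultdict


def merge_station_attributes(stations, flags, notes):
    """Fold separate flag and note layers into per-station attribute rows.

    stations: iterable of (name, x, y) tuples
    flags:    iterable of (station_name, flag_text) pairs
    notes:    iterable of (station_name, note_text) pairs
    """
    # A station may carry several flags, so collect them in a list.
    flags_by_station = defaultdict(list)
    for name, flag in flags:
        flags_by_station[name].append(flag)

    # One note per station is assumed; a later note overwrites an earlier one.
    notes_by_station = dict(notes)

    rows = []
    for name, x, y in stations:
        rows.append({
            "station": name,
            "x": x,
            "y": y,
            "flags": ";".join(flags_by_station[name]),  # empty string if none
            "note": notes_by_station.get(name, ""),
        })
    return rows
```

Each returned row corresponds to one survey station point with its flags and note carried as ordinary attribute fields, so no separate flag or note point layer is needed.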
COMPASS, written by Larry Fish, is another widely used cave survey program. Like WALLS,
it was originally developed in the 1970s for use on a mainframe computer (Fish
2006). The software was ported to the Apple II in the late 1980s, and finally to Windows
in 1994. Fish states that his design goals for the software are to “visualize and analyze”
(Fish 2006). These qualities are quite evident in his work.
COMPASS is an assembly of modules held together by a workbench interface.
The user can input their information in the Editor module, pass the information to the
Compile module, and finally realize Fish’s visualization and analysis goals in the Viewer
module. The Viewer module provides the interface that most researchers would use
while interacting with the data. Functions such as attribute symbolization and attribute
queries are supported.
Another important module of COMPASS is the CaveBase database module. This
module provides attribute data storage in a custom database format created by Fish. The
implementation allows the user to define database fields, import data from industry
standard database formats, and leverage a custom developed query builder tool to analyze
data. Many of these functions are very similar to those found in commercial GIS
software such as ArcGIS. COMPASS also has a built-in export utility for the ESRI
shapefile format (Figure 3). The utility allows for control of layers to be exported and
associated settings. Similar to WALLS, COMPASS provides the ability to export 2D or
3D shapefiles for use with various ESRI extensions. The 2D option is provided to allow
compatibility with older versions of ArcGIS that did not support 3D data types.
COMPASS does provide the added functionality of allowing the user to export point
layers named “feature location” and “feature lines”.
The feature layers exported by COMPASS provide a link to data stored in the
CaveBase database. This functionality closely resembles the relationship between the
geometry of shapefiles and the tabular information typically stored in attribute tables.
Figure 3 – COMPASS shapefile export dialog (Fish 2006)
Applications of GIS Related to Speleology
Cave passages range from very simple single-passage systems to extremely
complex spongework systems. Some work has been done in flow modeling and
networks. The Center for Research in Water Resources (CRWR) team led by Dr. David
Maidment has developed a watershed model for surface streams called Arc Hydro
(Maidment 2002). The CRWR team has created a data model for defining hydrology
networks and streams. The model also supports the inclusion of time series
measurements such as staff gauges. All of these features may have valuable application
towards a cave science data model in terms of network analysis and spatiotemporal data
storage.
Glennon (2001) investigated the use of GIS tools for bookkeeping large complex
spatial data sets found in some cave systems. His research outlines the variety of data
sources utilized in documenting caves and their environments. Glennon leverages the
power of GIS, specifically ArcGIS, for his research on the morphometric relationships to
active flow networks within the Mammoth Cave watershed. Also investigated were the
GIS analysis and quantitative modeling of karst processes.
Pfaff and Glennon (2004) described work on groundwater protection. Their work
on the ModelBuilder functionality in ArcGIS 9 is useful as an example of the information
products that will be required from a cave science data model. Their work utilized a
geodatabase for storage and analysis; however, it does not appear that they used a
structured data model in their work. The data model should be able to support tools such
as ModelBuilder. This research also provides good examples of how karst data is used
with publicly available data such as land use information and topographic data.
Moyes and Awe (2000) provided another great example of an information product
from a cave science data model. In their report on using GIS for spatial analysis of an
ancient cave site, they discussed the importance of the spatial component of data
inventories. They described how data can not only be stored, but reclassified if needed.
The report finished with an analysis produced with ArcGIS tools.
Some work has been done on developing cave science tools within the ArcGIS
environment. In 1998 Bernard Szukalski, an ESRI employee and longtime caver,
identified the need for a utility to assist the cave research community in translating their
cave survey data to an ArcGIS compatible format. ESRI’s shapefile format was chosen
as a widely implemented file type, and Szukalski coupled the functionality with basic
georeferencing tools to create the CaveTools extension for ArcGIS (Szukalski 2004).
This work predated the export routines developed within cave software products such
as WALLS and COMPASS, which eventually removed the need for the CaveTools
utility. Still, Szukalski’s work demonstrated that there was demand for GIS tools and
capabilities within the cave science community.
There are several other specialized cave survey software programs available and
used by cave researchers. These include programs such as Winkarst, WinCAPS,
CaveRender, and Survex. All of these programs perform similar functions to the
software systems detailed above, but do not appear to be as widely used as programs such
as WALLS or COMPASS.
Literature Review Summary
COMPASS and WALLS are the two most popular cave survey software packages
available to researchers. These software packages have been developed over many years
by members of the cave community (Szukalski 2004). These packages provide basic data
entry and processing functions in support of cartographic efforts. Both software packages
have made some data storage efforts similar to those found in commercial GIS software,
but neither provides the data analysis functions found in even the most basic GIS
software. A structured data model, such as the one outlined by this research, is needed to
address this notable lack of functionality. In addition, the reviewed literature indicates
that the only GIS file format available for export from the cave survey programs is the
shapefile. The shapefile format is adequate for simple data storage, but lacks needed
information to participate in more advanced GIS analysis such as network tracing and
annotation feature classes. The shapefile format also cannot support the use of subtypes
and attribute domains commonly used in current versions of ArcGIS.
Glennon’s research (2001) points out the similarities between subsurface streams
and surface streams. This finding would seem to support the idea that parts of the CRWR
work may be usable in a cave science geodatabase model. In both the CRWR Arc Hydro
data model and Glennon’s research, work focused on the flow modeling aspects of data.
However, neither of these studies specifically addresses the data modeling challenges
presented when attempting to store spatial data about the cave itself.
The data model should be able to support tools such as ModelBuilder. The
research presented by Phaff and Glennon (2004) also provides good examples of how
karst data is used with publicly available data such as land use information and
topographic data. This data usage is a good example of an information product a cave
science data model must support. In addition, the data model should be supportive of
data types available in the geodatabase structure. These include incorporating subtypes
and attribute domains, relationship classes, network feature types, and topology rules.
The work of these researchers provides a solid foundation for extending the
development of GIS concepts and tools in the context of cave science. This research is a
step towards the goal of a spatial data model for speleologists.
CHAPTER III
CONCEPTUAL FRAMEWORK AND METHODOLOGY
The most significant challenge of developing a data model for cave science is to
build consensus among the potential users for the content and framework of the model. It
is particularly difficult when most of the user community, in this case cave researchers,
have only a cursory knowledge of GIS and data modeling in general. Humans are
creatures of habit and often are slow to embrace change. When applying technology to a
new area of study, it is helpful to bring users along by duplicating traditional data
presentation within the new technology. For instance, the casual observer may puzzle as
to why a standard CAD line weight of “1” is 0.32 inches, but closer examination reveals
that a size “1” pen for the traditional board draftsman matches that width exactly. As the
various CAD vendors transitioned users to their digital drafting software, they needed a
common point of reference for the end users.
Along these lines this research attempts to duplicate the cartographic output that
both data creators and data users have established as useful. This information product is
often the nexus of the data needed by the cave scientist. To facilitate this effort, the
current map symbols from the Missouri Speleological Survey (MSS), and the National
Speleological Society (NSS) were implemented in the data model as feature classes,
subtypes and attribute domains. Great Onyx Cave was surveyed by CRF, which has
adopted the MSS map symbol set as its standard. It is hoped that incorporating the
classifications of the MSS and NSS map symbols will establish a common point of
reference for researchers familiar with traditional methods.
A second challenge is incorporating geometric network tracing in the data model.
Existing cave survey software does not support network creation or analysis.
Many cave researchers and agencies responsible for cave management have a need to
generate cost of travel analysis based on factors such as time, hazards, and fragility (Hale
2005). The three dimensional nature of many caves does not lend itself well to the
geometric networking tools found in the base ArcGIS software package. It is unclear
whether ArcGIS Network Analyst will support the needs of 3D cave network modeling, so
custom networking tools may be required. The development
of such custom networking tools is beyond the scope of this research, but it is an
objective of this study to design the data model to support such functionality in the future.
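As a rough illustration of the cost-of-travel analysis described above, the following Python sketch runs a least-cost search over a small graph of survey stations. The station names, ratings, and weighting factors are hypothetical and are not part of the data model; the sketch only suggests why survey vectors would need attributes such as travel time, hazard, and fragility if tracing is to be supported later.

```python
import heapq

# Hypothetical survey vectors: (from_station, to_station, minutes, hazard, fragility).
# Hazard and fragility are illustrative 0-5 ratings, not values from the thesis.
VECTORS = [
    ("A1", "A2", 10, 0, 1),
    ("A2", "A3", 15, 4, 0),
    ("A2", "B1", 20, 0, 0),
    ("B1", "A3", 5, 0, 2),
]

def edge_cost(minutes, hazard, fragility, hazard_weight=5.0, fragility_weight=3.0):
    """Combine travel time with penalty terms into one traversal cost."""
    return minutes + hazard_weight * hazard + fragility_weight * fragility

def build_graph(vectors):
    """Build an undirected adjacency list keyed by station name."""
    graph = {}
    for start, end, minutes, hazard, fragility in vectors:
        cost = edge_cost(minutes, hazard, fragility)
        graph.setdefault(start, []).append((end, cost))
        graph.setdefault(end, []).append((start, cost))
    return graph

def least_cost_route(graph, origin, destination):
    """Dijkstra's algorithm; returns (total_cost, [stations]) or (inf, [])."""
    queue = [(0.0, origin, [origin])]
    visited = set()
    while queue:
        cost, station, path = heapq.heappop(queue)
        if station == destination:
            return cost, path
        if station in visited:
            continue
        visited.add(station)
        for neighbor, step in graph.get(station, []):
            if neighbor not in visited:
                heapq.heappush(queue, (cost + step, neighbor, path + [neighbor]))
    return float("inf"), []

graph = build_graph(VECTORS)
total, route = least_cost_route(graph, "A1", "A3")
print(route, total)
```

With these illustrative weights the direct but hazardous vector from A2 to A3 is avoided in favor of the longer route through B1. A true 3D trace would also need elevation-aware costs, which this sketch does not attempt.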
Lastly the data model must be documented using accepted methods. These
methods include illustrations showing the thematic organization of the data, the
schematic diagrams of data structures, and a written description of the data model.
Failure to properly document the model may significantly impact the ability to
communicate the purpose and usage of the data model.
Study Area: Great Onyx Cave
In addition to the main cave system, there are over 300 “lesser” or small caves
within Mammoth Cave National Park (House 2005). These caves range widely in length
and complexity. The smallest of caves may only be tens of feet long. The largest caves in
this classification are around three miles in known length. Many of these caves were
used by Indians inhabiting the area before Europeans arrived. Early settlers mined the
onyx and saltpeter from the caves leaving a record of modern man’s usage. Great Onyx
Cave is one of these “lesser” caves in the park. The overall size of the GOC map
prohibits inclusion in this thesis, but Figure 4 illustrates the level of detailed data
collection for the entire cave.
Great Onyx Cave has a multitude of uses and a robust data set (House 2005).
Implementation of the model on the main cave system would be problematic due to the
size of the system and unique challenges it presents as the longest known cave system in
the world. Many sections of the system are still being surveyed and processed (Osburn
2005). The wide variation in cave characteristics of Great Onyx provides a good testing
environment for a cave science geodatabase data model.
Figure 4 – Study area detail: Great Onyx Cave (Gulden 2006)
© Cave Research Foundation
Data Source Descriptions
All data needed for this research already existed. The only new data created was in the
context of representing the various features within the geodatabase. The core data used
in the data model was provided by CRF member Bob Gulden, cartographer of the Great
Onyx Cave map. Great Onyx was surveyed to CRF data standards by Gulden and others
over several years. Their efforts represent hundreds of hours of underground field data
collection. Gulden then drafted the original map in AutoCAD software (Gulden 2006).
Utilizing different layers in CAD, the various graphics were organized into common
themes. Still, the features were not attributed with any non-graphical information that
would be expected in a GIS environment. The Great Onyx map represents the most
complex data to be loaded into the cave science data model.
The cave survey line plot data was in the WALLS format. This data consists of
all survey stations and vectors. The standard survey practices at Mammoth Cave do not
utilize the flag and note fields discussed earlier in Chapter II. All survey data was
converted to shapefiles using the shapefile exporter within the WALLS software (Figure
2). The shapefiles were converted to feature classes with the ArcToolbox functions
provided with ArcGIS.
Research data used to test the data model developed for this thesis is based on a
census of several existing CRF and NPS research projects in Great Onyx Cave.
Although based on CRF and NPS data, the actual research data was not available for
testing during the timeframe for this research. While most of the CRF and NPS research
has spatial characteristics, none of the data for these research projects currently is
associated with a map. The data was incorporated into the geodatabase data model so
that it is available for analysis and visualization. These data are currently in either
FileMaker Pro or Microsoft Access databases and text file formats.
Raster datasets were also tested with the data model. These data are in various
file formats. Topographic data was obtained from the United States Geological Survey
(USGS) in tagged image file (TIF) format. These files are georeferenced to the
Kentucky South State Plane coordinate system in North American Datum (NAD) 83 feet.
Digital Orthophoto Quarter Quadrangle (DOQQ) aerial photos were also
obtained from USGS in the same projection as the topographic data. The geologic maps
for the study area were obtained from the State of Kentucky Geologic Survey in
multi-resolution seamless image database (MrSID) format. The raster files depicting geology
have no explicit georeference metadata, but were determined to have the same spatial
characteristics as the USGS data by inspection.
Research Methodology
The ArcGIS database modeling process follows database abstractions developed
in the 1980s (Nyerges 2006). These abstractions are classification, generalization,
association, and aggregation. ArcGIS modifies the terminology as follows: classification,
subtypes, relationships, and topology, respectively. Zeiler (1999) suggests a geodatabase
development methodology that closely follows a broader GIS implementation philosophy
presented by Tomlinson (2003). This process includes determining information products,
defining objects and graphical representations, matching objects to corresponding
geodatabase elements and lastly organizing and implementing the system. Perhaps the
most comprehensive guidelines for geodatabase design are described by Arctur and
Zeiler (2004). Their research isolates geodatabase design into three distinct phases. The
conceptual design begins the design process by identifying not only the information
products the geodatabase will support, but also points to the need for development of
thematic layers and scale considerations of the data. The second design phase builds the
logical data structure. Attribute fields and spatial characteristics of the design are
outlined and assembled into a proposed geodatabase design. Arctur and Zeiler conclude
the process with the physical design of the data model. This stage of the process includes
reviewing and refining the geodatabase, developing workflows for the thematic layers
and documentation of the modeling process.
The development of the cave science geodatabase data model follows the outline
described by Arctur and Zeiler (2004) and recommended by ESRI. The process can be
more specifically broken down into ten steps adapted from Arctur and Zeiler (2004). The
following information describes how this methodology was used in the context of this
research.
1. Identify the information products to be produced from the geodatabase.
Domain experts were solicited for information using a wide variety of methods.
Internet newsgroups focused on cave exploration and speleology were queried for
input on the conceptual design of the data model. The speleology discussion group
on the NSS website was polled for input from the caver community. Newsgroups
dedicated to GIS as applied to caves and karst on the ESRI and Yahoo websites were
queried. Several responses were received and suggestions were incorporated into the
overall design. Interestingly, many more responses were received offering data for
testing purposes or expressing interest in the results produced by this research.
Many individual stakeholders were also contacted. These conversations took
place by email and phone. Significant insight regarding the philosophy and design of
existing cave survey software and data collection methodologies was obtained.
Several cave resource managers were also contacted to better understand the types of
information products and analysis that are needed by their organizations. It is
anticipated that these requirements will greatly expand once the managers realize the
full potential of the geodatabase. Lastly, I was able to speak with ESRI
representatives to better identify market trends and future possibilities within the
geodatabase data structure. One significant trend identified was enhanced
cartographic support in future releases of ArcGIS. As discussed earlier, it is difficult
to get complete cooperation when designing data models. There appears to be a
“build it and they will come” approach to GIS within the cave research population.
2. Identify the base layers needed to support the information products.
The thematic base data layers for cartography were identified from MSS and NSS
map symbols. Appendices B and C show the NSS and MSS map symbols,
respectively. These symbols provide graphic representation for commonly found
features in the cave environment. Additionally, these map symbols are widely
circulated and generally accepted among US speleologists. The top level
cartographic map themes are passages, hydrology, profile and cross sections. The
survey theme contains only the survey stations and vectors and is based on
information exported from the COMPASS or WALLS cave surveying software. All
thematic layers were defined as a graphic data type supported by the geodatabase.
The spatial characteristics of the geodatabase are also important at this point of
the conceptual design. The spatial extent of each feature dataset within the model
was adjusted to accommodate a geographic dataset the size of the entire continental
USA at a precision of one centimeter. The actual geographic extent of this XY
domain is determined by setting the geographical reference of the geodatabase.
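The trade-off between precision and extent can be checked with simple arithmetic. The sketch below assumes that coordinates are stored as scaled positive integers with an upper bound of 2^31 - 1 storage units; that limit is an assumption about the geodatabase release in use rather than something stated in this thesis, and the 4,500 km width used for the continental USA is only an approximation.

```python
MAX_STORAGE_UNITS = 2**31 - 1  # assumed integer limit for stored coordinates

def xy_domain_extent(precision_units_per_meter):
    """Return the side length, in meters, of the square XY domain
    that can be stored at the given precision."""
    return MAX_STORAGE_UNITS / precision_units_per_meter

# One-centimeter precision means 100 storage units per meter.
extent_m = xy_domain_extent(100)
extent_km = extent_m / 1000.0

CONTINENTAL_USA_WIDTH_KM = 4500  # rough east-west width, for comparison only
fits = extent_km > CONTINENTAL_USA_WIDTH_KM
print(f"Domain side: {extent_km:,.0f} km; continental USA fits: {fits}")
```

Under these assumptions a one-centimeter precision leaves ample room for a dataset of continental size, which is consistent with the setting described in this step.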
3. Specify the scale ranges needed and data types (point, line, polygon, raster).
The scale ranges for all feature classes were designed for large scale
representation. Most caves are not visible at conventional smaller scales. Even at the
common topographic map scale of 1:24,000, caves may be little more than a thin line
on the map. The larger scale representation also eliminated the need for multiple data
types for a single feature class. Features that may have been represented as a point
feature when using smaller map scales could easily be represented as polygons for the
larger map scales.
Line features appear to be minimally impacted by the scale range. Annotation
feature classes were optimized for a 1:600 scale. The resolutions of the raster
datasets utilized in this research were far lower than other data in the geodatabase. As
a result, the raster data was stored at maximum available resolution. The geodatabase
stores raster data with multi-resolution pyramids so that large datasets render more
quickly.
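The effect of scale on cave features discussed in this step can be made concrete with a short calculation. The passage width used here is a hypothetical value chosen only for illustration; the two scales are the ones named in the text.

```python
def map_size_mm(ground_size_m, scale_denominator):
    """Size on the printed map, in millimeters, of a ground distance in meters."""
    return ground_size_m * 1000.0 / scale_denominator

PASSAGE_WIDTH_M = 3.0  # hypothetical passage width

at_topo_scale = map_size_mm(PASSAGE_WIDTH_M, 24000)  # common topographic map scale
at_cave_scale = map_size_mm(PASSAGE_WIDTH_M, 600)    # annotation reference scale

print(f"1:24,000 -> {at_topo_scale} mm, 1:600 -> {at_cave_scale} mm")
```

A passage this wide is an eighth of a millimeter across at 1:24,000, effectively a thin line, while at 1:600 it is wide enough to draw as a polygon with interior detail.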
4. Describe datasets.
The final step in the conceptual design process was to group the feature classes
into datasets. As discussed in step 2, the top level feature datasets were identified as
passages, hydrology, profile, cross sections, and survey. These datasets are based on
the MSS and NSS map symbols and feature classifications. Each of these feature
classes is described in detail in the Results section of Chapter IV.
Raster datasets were defined as topographic, geologic and aerial photos. These
datasets are seamless and are well suited to the nature of the raster dataset. A raster
catalog was defined to store scanned survey notes, should they be needed.
5. Define the tabular database structure and any behavior for attributes.
Tabular datasets were defined for several feature classes. Effort was made in the
creation of feature classes to organize features so that common attribute data could be
collected for each feature. It is anticipated that a real world implementation of the
data model would not necessarily utilize all available attributes.
Subtypes were created for many feature classes to further classify features within
various feature classes. The subtypes also assist the user in controlling the behavior
and appearance of features. A limited number of attribute domains were defined to
limit the possible values for certain tabular attributes. It is expected that as the data
model matures, there will be additional attribute domains suggested for incorporation
into the model structure.
A relationship class was defined between the Researchers object class and the
ResearchProject object class. This relationship allows various types of research to be
incorporated in the cartographic features of the GIS. A second relationship class was
created between the CrossSections feature class and the SurveyStations feature class.
This relationship associates each cross section with a survey station.
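The cross section relationship described in this step can be pictured as a key lookup between two tables. The following sketch is plain Python rather than geodatabase code; the field names are hypothetical, and it assumes that one station may have several cross sections, which the text does not state explicitly.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class SurveyStation:
    name: str  # key on the origin side of the relationship
    x: float
    y: float

@dataclass(frozen=True)
class CrossSection:
    section_id: int
    station_name: str  # key pointing back to SurveyStation.name

def build_relationship(stations, sections):
    """Group cross sections by the station they are tied to.
    Raises ValueError for a cross section whose station does not exist."""
    known = {s.name for s in stations}
    related = defaultdict(list)
    for section in sections:
        if section.station_name not in known:
            raise ValueError(f"orphan cross section {section.section_id}")
        related[section.station_name].append(section)
    return dict(related)

stations = [SurveyStation("A1", 0.0, 0.0), SurveyStation("A2", 12.5, 3.0)]
sections = [CrossSection(1, "A1"), CrossSection(2, "A2"), CrossSection(3, "A2")]
related = build_relationship(stations, sections)
print({name: [s.section_id for s in items] for name, items in related.items()})
```

Rejecting orphaned cross sections mirrors the integrity benefit of a relationship class: a cross section cannot silently lose its tie to the survey.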
6. Define spatial properties for all datasets.
All feature datasets share common spatial properties. These properties are then
subject to a geographical reference. The geographical reference should not be
confused with the spatial properties of the geodatabase. It may be useful to think of
the spatial properties as a piece of paper, and the geographical reference as a position
on a desk. The combination of these two parameters determines where the paper will
be located on the desk.
These settings are particularly important when topology rules are defined.
Feature classes must share the same spatial properties or these topology rules cannot
be enforced.
7. Create prototype geodatabase design.
This study has produced a prototype geodatabase design. The prototype is based
on the information products collected in step 1, the data types and structure outlined
in steps 2-5, and the spatial properties defined in step 6. The prototype geodatabase
was created with ArcCatalog.
Background information for the data model was collected during the literature
review process to better understand how other data models have been designed and
implemented in other fields of study.
8. Implement, review, and revise geodatabase design.
The cave science geodatabase data model was implemented against a real world
set of cave data for Great Onyx Cave. Great Onyx Cave is considered representative of
many limestone caves.
The data model will be implemented by cave researchers on their own data over
time. This will begin a more earnest peer review process. Through the review
process revisions will be made to the model and new functionality may be added to
extend the model.
9. Develop workflows for data creation and data maintenance.
Workflows were designed for importing cave survey data from WALLS and
COMPASS. These workflows leverage the ModelBuilder functionality of ArcGIS.
Shapefiles are loaded into their corresponding feature classes and attributes are
mapped accordingly.
Creating cartographic features is a straightforward process that leverages existing
editing and data creation tools found in ArcGIS. Importing cartographic features
from drafting programs such as CAD is a more complex and problematic process.
CAD data was tested as a part of this research, but the overall cartographic effort in
the cave research community sorely lacks any type of data standards beyond the map
symbol sets. This lack of structure in drawing files must be addressed before
effective workflows can be developed for importing cartographic features.
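Returning to the survey import workflow in this step, the attribute mapping can be sketched as a simple rename table. The source and target field names are hypothetical; actual WALLS or COMPASS shapefile exports may use different names, and the real workflow in this research was built with ModelBuilder rather than a script.

```python
# Hypothetical source-to-target field names; real exports may differ.
FIELD_MAP = {
    "NAME": "StationName",
    "ELEV": "Elevation",
    "PREFIX": "SurveyPrefix",
}

def map_attributes(source_row, field_map=FIELD_MAP):
    """Rename source attributes to target fields.
    Returns (target_row, unmapped_field_names) so nothing is dropped silently."""
    target_row = {}
    unmapped = []
    for field, value in source_row.items():
        if field in field_map:
            target_row[field_map[field]] = value
        else:
            unmapped.append(field)
    return target_row, sorted(unmapped)

row = {"NAME": "A12", "ELEV": 215.4, "COMMENT": "tie-in"}
target, unmapped = map_attributes(row)
print(target, unmapped)
```

Reporting unmapped fields, rather than discarding them, gives the person running the import a chance to extend the mapping when a survey program adds or renames an attribute.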
10. Document geodatabase design using established methods.
The cave science geodatabase data model was documented by several established
methods. This thesis serves as written documentation of the model. Future efforts to
extend this model or refine the existing model should be able to use this
documentation to better understand the philosophy and methods used to create the
existing data model.
A poster size document illustrating the thematic layers, schematic diagram and
geodatabase structure was created to better illustrate the purpose and relationships of
the feature classes. This document will allow the users of the data model to easily
visualize the organization and components of the model. The poster also functions
as a “face” for the data model during discussions with colleagues wishing to
implement the data model.
Finally, an extensible markup language (XML) schema of the data model is
provided so that cave researchers may leverage the data model as a geodatabase
template for their own research. This template is compatible with ArcGIS and allows
the user to create an empty copy of the geodatabase ready to populate with their own
GIS data. The schema preserves all data structure, subtypes and relationship classes.
CHAPTER IV
RESULTS AND TESTING
The main result of this research was the development of a core data model for
cave science. As discussed elsewhere in this paper, the model is based on information a
researcher would expect to see on a traditional cave map. The feature sets in the data
model include those published by the NSS and MSS. The overall structure of the data is
best presented in a large poster format that illustrates the many roles and relationships of
the model. Here the various components of the data model are detailed individually
utilizing color schemes established by ESRI.
Conceptual Design
The early steps of the methodology developed in Chapter III state that the
information products for the data model should be identified. This process can be
expressed in the context of the various thematic data the model needs to support. The
themes are described not only in terms of the feature or layer names, but also outline
additional metadata. The information product description includes short explanations for
how the data will be used within the map, the source of the data, how the data is
represented in GIS, spatial relationships to other data, map accuracy and the scale the
information product is designed to support.
The cave science data model contains seven different layers or themes. These
layers are survey, passages, hydrology, cross sections, profile, raster image base, and
page layout (Figure 5). Documentation of the thematic layers is the final product of the
first three steps in the research methodology.
Figure 5 – Data model thematic layers
Logical Design
The themes were then expanded to identify how discrete features should be
modeled. All vector features were represented by creating feature classes. An
ArcCatalog tree view of the data model is illustrated in Appendix A. The geometry
feature type is defined as point, line or polygon. Each feature class can have only a
single geometry type (MacDonald 2001), so similar features were organized into each
feature class. Five feature datasets were used to group similar feature classes. Tabular
data for researchers and research projects was represented by object classes. Object
classes provide a mechanism for storing data in the model that does not have a spatial
component. The image base thematic layer was divided into four information products.
The image themes were identified as aerial photos, geology, topographic and survey data.
The aerial photo raster dataset was created so that a seamless mosaic of images
covering a given study area may be stored in the geodatabase. Similar raster datasets
were developed to store scanned geological maps and USGS topographic map data. A
raster catalog was created to store scanned field data or other research findings that may
be useful to store in the geodatabase. A raster catalog differs from a raster dataset in that
the personal geodatabase only stores a pointer to the raster file and does not store the file
itself in the geodatabase (Wayne 2005). This is an important distinction because it
directly impacts the file size of the geodatabase and the number of files that must be
delivered when colleagues share data.
The cave science geodatabase data model is comprised of sixteen feature classes
organized into five feature datasets (Figure 6). The four raster classes and two object
classes are also shown in the figure. Three relationship classes and nine attribute
domains were developed to support the feature classes and are discussed later in this
chapter.
Figure 6 – Geodatabase design structure
Physical Design
Survey Feature Dataset
As shown in Figure 6, each of the feature datasets is comprised of several feature
classes. It is useful to expand each of these objects to understand how the data model is
designed to function. The core feature dataset is named Survey. The Survey dataset
encompasses two main feature classes and a supporting annotation feature class. The
SurveyStations feature class stores a geometry type of point and represents each station
surveyed in the cave (Figure 7). This information is critical as it often represents the only
precisely known locations in the cave which are retrievable. The SurveyVectors feature
class stores line type geometry and connects all survey stations (Figure 8). Appendix D
illustrates how the various survey features appear graphically.
Consideration was given to creating a topology rule to force all survey vectors to
be covered by survey stations, but the rule was not implemented for two reasons. First,
topology rules must be uniformly applied across datasets and reduce data flexibility.
Secondly, all of the
survey data is imported from either the WALLS or COMPASS cave surveying programs
and generally is not manipulated once in ArcGIS. Manipulation of data outside of
WALLS or COMPASS presents an opportunity for introduction of errors and should be
avoided. This is especially important in larger cave systems that may have ongoing
exploration. The flexibility of the data model is maintained because new survey data sets
can be imported as they become available. If data is complete and no longer needs to be
maintained in a cave survey program, the data may be manually modified in ArcGIS.
Those implementing the model in this way should be aware that there are no tools for
managing precision survey data in the basic ArcGIS platform, and may want to consider the
addition of the ArcGIS Survey Analyst extension.
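Although the topology rule was not implemented, the same condition can be checked outside the geodatabase after each import. The sketch below is one possible check, not the ArcGIS topology engine: it reports vector endpoints that have no survey station within a tolerance. The coordinates and the tolerance value are hypothetical.

```python
import math

def uncovered_endpoints(stations, vectors, tolerance=0.01):
    """Return the vector endpoints that have no station within `tolerance`.
    stations: iterable of (x, y); vectors: iterable of ((x1, y1), (x2, y2))."""
    station_list = list(stations)
    missing = []
    for start, end in vectors:
        for point in (start, end):
            covered = any(math.dist(point, s) <= tolerance for s in station_list)
            if not covered:
                missing.append(point)
    return missing

stations = [(0.0, 0.0), (10.0, 0.0), (10.0, 5.0)]
vectors = [((0.0, 0.0), (10.0, 0.0)), ((10.0, 0.0), (10.0, 5.5))]
errors = uncovered_endpoints(stations, vectors)
print(errors)
```

A check of this kind leaves the imported data untouched, which fits the recommendation above that corrections be made in the cave survey program and re-exported rather than edited in ArcGIS.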
The SurveyAnnotation feature class stores annotation linked to the SurveyStations
feature class (Figure 9). The name attribute of each survey station is linked to the class
for display. Storing annotation in a feature class provides added flexibility for
symbology when creating maps. Creating the feature linked annotation results in the
establishment of a relationship class in the geodatabase. This relationship class has
cardinality of one-to-many from the SurveyStations point feature class to the
SurveyAnnotation feature class (Figure 10). A second relationship class was created
between the SurveyStations feature class and the CrossSections feature class. This
relationship class is discussed in detail later in this chapter. It should be noted that users
not wishing to implement the SurveyAnnotation feature class could still use the basic
tools in ArcGIS for labeling.
Figure 7 – SurveyStations feature class
Figure 8 – SurveyVectors feature class
Figure 9 – SurveyAnnotation feature class
Passages Feature Dataset
The Passages feature dataset is comprised of five feature classes and contains all
of the information for the plan view of the map. The feature classes are PassageWalls,
PassageFeatures, FloorMaterial, Speleothems, and PassageAnnotation. The
PassageWalls feature class stores line geometry and supports a subtype for the attribute
of passage type (Figure 11). Illustrations of these features may be found in Appendices B
and C. The subtype implementation helps to enforce data integrity by establishing
acceptable attribute values. This is especially useful when several researchers or
organizations are working with a common data set or cave map.
While the PassageWalls feature class establishes the limits of the cave, it does not
represent the features surveyed in the cave. The PassageFeatures, FloorMaterial,
and Speleothems feature classes organize and store this type of data. One important
aspect of the development of the data model that must be considered is the spatial
representation of features. Many features found in caves are represented in different
ways. For example a single stalactite may be represented as a point feature while a part
of the same cave where a large area is covered with stalactites may be represented with a
polygon. This variation in cartographic representation is problematic when trying to
store data in a way that it can be meaningfully retrieved. For the purposes of this data
model all passage features are represented as polygons. This is based on the fact that all
features represent a spatial extent of some size.
Figure 10 – SurveyStations feature-linked relationship class
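One way to honor the polygon-only convention for features that were traditionally drawn as points is to give each one a small footprint. The sketch below approximates a point as a regular polygon; the radius, the number of sides, and the coordinates are hypothetical, and the thesis does not prescribe any particular conversion method.

```python
import math

def point_to_polygon(x, y, radius, sides=8):
    """Approximate a point feature as a closed regular polygon ring."""
    ring = [
        (x + radius * math.cos(2 * math.pi * i / sides),
         y + radius * math.sin(2 * math.pi * i / sides))
        for i in range(sides)
    ]
    ring.append(ring[0])  # close the ring by repeating the first vertex
    return ring

def ring_area(ring):
    """Shoelace formula for the area enclosed by a closed ring."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

# A single stalactite with an assumed 5 cm radius footprint.
stalactite = point_to_polygon(100.0, 250.0, radius=0.05)
area = ring_area(stalactite)
print(len(stalactite), round(area, 6))
```

Because the result is an ordinary polygon, an individual formation and a broad area of formations can share one feature class and one set of attributes.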
Figure 11 – PassageWalls feature class with subtypes
The PassageFeatures feature class stores features with a polygon geometry type.
These features represent phenomena in the cave that have been created by primary
processes such as flowing streams in the cave, faulting, and breakdown (Figure 12). This
would include all of the features commonly grouped as speleogens. The type of passage
feature is controlled by a subtype class. The attributes of the feature class include support
for joint control direction, ceiling channel type, size, survey station tie-in and a short
description. The joint control and ceiling channel data are supported by attribute domains
(Figure 13). These domains standardize the values for data that the user can enter for a
given attribute. In this implementation of the cave science data model, only features
common to limestone caves are supported.
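The way subtypes and attribute domains constrain data entry can be sketched as a validation routine. The subtype codes, domain names, and permitted values below are hypothetical placeholders and do not reproduce the domains defined in the data model; the geodatabase enforces such rules itself, so this is only a conceptual illustration.

```python
# Hypothetical codes and values for illustration; the data model defines its own.
PASSAGE_FEATURE_SUBTYPES = {1: "Breakdown", 2: "Ceiling channel", 3: "Joint"}
DOMAINS = {
    "JointDirection": {"N-S", "E-W", "NE-SW", "NW-SE"},
    "CeilingChannelType": {"Half tube", "Anastomosis"},
}
# Which domain constrains which attribute field.
FIELD_DOMAINS = {"JointControl": "JointDirection", "CeilingChannel": "CeilingChannelType"}

def validate_feature(attributes):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    if attributes.get("Subtype") not in PASSAGE_FEATURE_SUBTYPES:
        problems.append(f"unknown subtype code: {attributes.get('Subtype')!r}")
    for field, domain_name in FIELD_DOMAINS.items():
        value = attributes.get(field)
        if value is not None and value not in DOMAINS[domain_name]:
            problems.append(f"{field}={value!r} is not in domain {domain_name}")
    return problems

good = validate_feature({"Subtype": 3, "JointControl": "NE-SW"})
bad = validate_feature({"Subtype": 9, "CeilingChannel": "Keyhole"})
print(good, bad)
```

Attributes left empty are accepted here, reflecting the expectation that a real implementation would not necessarily populate every available field.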
The FloorMaterial feature class is similar to the PassageFeatures feature class
with the exception that it handles floor material that may or may not have been a result of
primary cave formation. Often, materials such as cobbles, clay, and sand are transported
and deposited in various places by cave streams. This same process can also bring debris
into the cave. In undisturbed areas it is not uncommon to find vertebrate remains. Water
processes may deposit flowstone or reduce larger rocks to gravel. Similar to previously
described feature classes, the FloorMaterial feature class implements a subtype to
standardize the options to those features represented in either the NSS or MSS map
symbols. The type of floor material is the only required attribute, with added support for
size, survey station tie-in, and a description field (Figure 14). The design of the survey
station field is limited to eight characters to match the format of the station name attribute
of the SurveyStations feature class. Matching the design parameters of these two fields
allows for data joining when performing data analysis.
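The join enabled by the matching station fields might look like the following sketch. The field names are hypothetical, and the whitespace trimming and upper-casing are assumptions added for illustration; only the eight-character limit comes from the text.

```python
STATION_FIELD_LENGTH = 8  # matches the eight-character limit described above

def normalize_station(name):
    """Trim whitespace, upper-case, and enforce the field length."""
    cleaned = name.strip().upper()
    if len(cleaned) > STATION_FIELD_LENGTH:
        raise ValueError(f"station name too long: {cleaned!r}")
    return cleaned

def join_to_stations(features, stations):
    """Attach station coordinates to each feature that has a tie-in.
    features: list of dicts with a 'Station' key; stations: {name: (x, y)}."""
    lookup = {normalize_station(k): v for k, v in stations.items()}
    joined = []
    for feature in features:
        key = normalize_station(feature["Station"])
        joined.append({**feature, "StationXY": lookup.get(key)})
    return joined

stations = {"A12": (100.0, 250.0), "B3": (140.5, 262.0)}
floor = [{"Type": "Clay", "Station": " a12 "}, {"Type": "Sand", "Station": "C9"}]
result = join_to_stations(floor, stations)
print(result)
```

Features whose tie-in does not match any station keep a null coordinate rather than being dropped, which makes mismatched station names easy to find.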
Figure 13 – Ceiling channel and Joint attribute domains
Figure 12 – PassageFeatures feature class with subtypes
Figure 14 – FloorMaterial feature class with subtypes
The last major feature class of the Passages feature dataset is Speleothems. The
Speleothems feature class stores a polygon geometry type and is supported by a subtype
class for valid types of speleothems (Figure 15). Speleothems are often referred to
collectively as cave formations. Features such as stalactites and stalagmites fall into this
feature class. Other common features found in the Speleothems feature class include
shields, rimstone dams and cave coral. The data model supports representation of
discrete individual formations as well as larger spatial areas representing several
individuals. The available attributes for each speleothem include all of the same field
names as FloorMaterial with the addition of a field to record the facing direction of the
formation. The latter may be of special interest to researchers investigating speleothems
influenced by air flow direction.
Figure 15 – Speleothems feature class with subtypes
The final feature class of the Passages feature dataset, PassageAnnotation, supports the
annotation of features (Figure 16). Its parameters are identical to those of the
SurveyAnnotation feature class, except that feature linked annotation is not implemented.
Feature linking was omitted because this annotation class supports multiple feature
classes concurrently, which feature linking does not allow.
Figure 16 – PassageAnnotation feature class
Hydrology Feature Dataset
The last geographic features traditionally depicted on the plan view of a cave map
are the hydrological features. In the case of limestone caves, water and its related
hydrological systems are the primary mechanism for cave formation. In the context of
the cave science data model, hydrology is classified as a separate feature dataset
consisting of three main feature classes and an annotation feature class. This feature
dataset is unique because it is the only dataset where a single feature, water, can be
represented as a point, a line, or a polygon. Examples of the symbology used to depict
these features may be found in Appendices B and C. Supporting all three geometry types is
necessary in the data model because water is integrally tied to cave formation. It also plays a major role in many cave
ecosystems as the vehicle for energy and food entering the cave environment.
Hydrological point features are stored in the FeaturePoint feature class (Figure
17). The class supports three core attributes. The feature type is controlled by a subtype
and classifies rapids or riffles, waterfalls, and well casings. The other attributes
supported are the size and height of the feature. The size attribute is a text field allowing
the user to enter information such as the length of the riffle or width of the stream at the
riffle point. The height attribute stores the vertical change of the overall feature and is a
numeric field.
Streams are the feature class within the hydrology feature dataset that supports
line type geometry (Figure 18). These features are used to represent flowing water in the
cave. Two attributes are supported by the Streams feature class. The stream type attribute is
supported by a subtype class and the water quality attribute is supported by an attribute
domain. The water quality attribute is entered as “Safe” or “Unsafe”. The intended
context for this attribute is in relation to human consumption, but this attribute could also
be used to indicate whether a particular stream was safe for researchers to conduct
investigations.
Figure 17 – FeaturePoint feature class with subtypes
Figure 18 – Streams feature class with subtypes
The polygon geometry type was used to represent pools, lakes and sumps within
the cave. A sump is defined as a cave passage that continues underwater, but cannot be
traversed without specialized diving equipment. These features are grouped into the
Pools feature class (Figure 19). Each pool feature stores four attributes. The pool type is
set by a subtype class. Water depth is stored as a numeric value. The water quality
attribute is linked to the same attribute domain as the Streams feature class. The possible
values for water quality are “Safe” or “Unsafe”. Sump type is also linked to an attribute
domain to indicate whether the sump is “Diveable” or “Not Diveable” (Figure 20).
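The way attribute domains constrain the Pools attributes can be illustrated with a small validation sketch. The domain values are those given in the text; the dictionary layout and the field names (`water_quality`, `sump_type`, `depth`) are assumptions made for illustration and are not the geodatabase's internal representation.

```python
# Coded values named in the text; the dictionary layout is an assumption.
DOMAINS = {
    "WaterQuality": {"Safe", "Unsafe"},
    "SumpType": {"Diveable", "Not Diveable"},
}

# Hypothetical mapping of Pools attribute fields to the domain each one uses.
POOL_FIELD_DOMAINS = {"water_quality": "WaterQuality", "sump_type": "SumpType"}


def validate_pool(record):
    """Return a list of problems; an empty list means the record is valid."""
    problems = []
    for field, domain_name in POOL_FIELD_DOMAINS.items():
        value = record.get(field)
        # Empty (None) values are allowed; anything else must be in the domain.
        if value is not None and value not in DOMAINS[domain_name]:
            problems.append(f"{field}={value!r} is not in domain {domain_name}")
    depth = record.get("depth")
    if depth is not None and not isinstance(depth, (int, float)):
        problems.append("depth must be numeric")
    return problems
```

Sharing one WaterQuality domain between streams and pools means a change to the permitted values only has to be made in one place.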
Figure 19 – Pools feature class with subtypes
All of the hydrology feature classes are supported by the HydrologyAnnotation
feature class. This feature class is not linked to any feature classes and simply provides a
way for the data model to support various annotations that may be needed for cartography.
The attributes of the feature class are identical to the PassageAnnotation class shown in
Figure 16.
CrossSections Feature Dataset
While the plan view is perhaps the most used map information in cave science, cross
sections and a profile view can be useful in visualizing complex aspects of the cave
environment. One case where cross sections can be especially useful is in illustrating
ledges and undercuts that are present, but not obvious in the plan view (Figure 21). Cross
sections are normally oriented at right angles to the cave passage, but may be drawn at
other angles if needed.
Figure 20 – Sump Type and Water Quality attribute domains
The ability of cross sections to provide context for plan view features suggests that they
would be a useful addition to the cave science geodatabase data model. Cross sections were
implemented by creating the CrossSections feature dataset. The dataset supports two
feature classes. The primary feature class is CrossSections, and is spatially represented
by polygons. The feature class has four attributes of interest (Figure 22). The overall
height and width of the passage may be stored, but are not required fields. All cross
sections should have the near station and facing direction fields populated. This
information is necessary to correctly position and orient the cross section. All of these
attributes are routinely collected as a part of cave surveying and represent no added work
for field teams. The CrossSections feature class has a relationship class with the
SurveyStations feature class. The relationship cardinality is one-to-many from
SurveyStations to CrossSections and links the survey station name attribute (Figure 23).
An annotation feature class named CrossSectionAnnotation was developed to
support the CrossSections feature class. This feature class is not linked to any other
feature class and functions identically to the PassageAnnotation feature class shown in
Figure 16. Any ancillary annotation supporting cartographic representation of cross
sectional data should be placed in this feature class.
Figure 21 – Cross section of plan view (panels: Plan view, Cross Section)
Figure 22 – CrossSections feature class
Figure 23 – CrossSectionsNearSurveyStation relationship class
Profile Feature Dataset
Profiles are the final component of a cave map. The principle of showing a
profile in addition to a plan view is similar to that of cross sections. The profile view
runs longitudinally along the survey line. This view is often used to help visualize caves
with significant vertical extent or maze-like passages. Beyond cartographic
representation, this research was not able to clearly identify information products that are
unique to the profile. Accordingly, the data model was designed to support only the
graphic representation of the profile.
The Profile feature dataset contains two feature classes. The ProfileLines feature
class stores line type geometry and has no user attributes (Figure 24). A relationship
class between ProfileLines and SurveyStations was not suitable because profiles cover
multiple survey stations and may extend along entire cave passages. The
ProfileAnnotation feature class stores ancillary annotation related to the cartographic
display of profile data. The design and functionality of the ProfileAnnotation feature
class is identical to the PassageAnnotation feature class shown in Figure 16.
Figure 24 – ProfileLines feature class
Object Classes
The final component to the cave science data model was the creation of two
object classes to store basic information about research and researchers. As stated earlier
in this research, the cave science data model is designed to be a core data model. It is not
practical to encompass every domain that speleology envelops. The data model should,
however, provide a way to “hook” into other data models supporting cave research.
The two primary areas for attaching to other data models are the SurveyStations
feature class discussed earlier in this chapter and the Research object classes. The two
object classes are used to store non-graphical tabular data. The ResearchProjects object
class stores three attributes about each research project in the cave in addition to a unique
identification number (Figure 25). The attributes record the nearest survey station, a
short description, and the type of research. The type attribute is supported by a subtype
class containing different areas of scientific study. The ResearchProjects table is linked
to the Researchers object class (Figure 26) through the ResearchProjecthasResearcher
relationship class (Figure 27). This relationship enforces that every research project has
at least one researcher. The relationship runs from ResearchProjects to Researchers, and
its cardinality has been set as “many to many” to support projects
with more than one primary investigator.
The Researchers table stores basic information about scientists conducting projects
in the cave (Figure 26). The table stores the name, email and phone contact information
for each researcher. The table is simple by design and may be linked to other databases
of information by the unique identifier for each entry.
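The many-to-many link between the two object classes can be pictured as a table of identifier pairs. The sketch below is a plain Python illustration rather than the geodatabase relationship class itself; the identifiers, names and table layouts are hypothetical.

```python
# Hypothetical tables; identifiers and names are illustrative only.
researchers = {1: "Researcher A", 2: "Researcher B"}
research_projects = {
    10: "Hydrology study",
    11: "Biology inventory",
    12: "Project with no researcher yet",
}

# Many-to-many link table: one (project_id, researcher_id) row per pairing.
project_has_researcher = [(10, 1), (10, 2), (11, 2)]


def researchers_for(project_id):
    """Return the sorted researcher ids linked to one project."""
    return sorted(r for p, r in project_has_researcher if p == project_id)


def projects_without_researcher():
    """Return project ids that break the 'at least one researcher' rule."""
    linked = {p for p, _ in project_has_researcher}
    return sorted(set(research_projects) - linked)
```

Here project 10 has two investigators and researcher 2 works on two projects, which is the situation a one-to-many relationship could not represent.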
Figure 25 – ResearchProjects object class
Figure 26 – Researchers object class
Figure 27 – ResearchProjecthasResearcher relationship class
Raster Data Classes
Three raster datasets and a single raster catalog were developed for the data model.
The raster datasets provide a container to create seamless mosaics of continuous raster
data. These data are classified into one of three themes: Aerial Photos, Geology, and
Topographic data. These data are well suited to the raster dataset structure because they
are continuous and most often are produced by a single agency. The latter is significant
since it provides uniformity when creating the seamless mosaic. The raster datasets have
no user definable attributes.
The Survey Data raster catalog provides a structure for referencing cave survey,
inventory or other field data within the geodatabase. The raster catalog does not directly
store the data, but rather provides a pointer to a location outside of the geodatabase. The
raster catalog is useful when two or more researchers are collaborating on a project and
need to have similar structures for organizing data. It should be noted, however, that
implementing the raster catalog feature of the geodatabase data model requires that all
raster documents outside of the geodatabase are also delivered when exchanging data.
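Because the raster catalog only points to files, a simple check before exchanging data can confirm that every referenced file is actually present. The following is a minimal sketch that assumes the referenced locations have already been collected into a list of file paths; how those paths are read from the catalog is not shown, and the function name is hypothetical.

```python
from pathlib import Path


def missing_references(paths):
    """Return the referenced files that do not exist on disk."""
    return [str(p) for p in map(Path, paths) if not p.is_file()]
```

An empty result suggests the package is complete; any path returned would need to be supplied or corrected before delivery.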
Page Layout Theme
The final theme implemented by the data model (Figure 5) is Page Layout. The
layout theme exists outside of the geodatabase structure. It is contained in the map
document also known as the MXD file. MXD files may be generated for a given project
and saved as a template or MXT file. This functionality allows the reuse of map
elements that are common to all output. These elements may include items such as a
north arrow, bar scale, title annotation and neat line. Other annotation can be included on a
per-project basis.
Supporting Attribute Domains
There are four additional attribute domains that are internal to the geodatabase
data model (Figure 28). These domains support the feature linked annotation established
between the SurveyStations and the SurveyAnnotation feature classes. There are no user
definable attributes in these domains and they should not be manually adjusted.
The AnnotationStatus domain indicates whether a particular feature has
annotation placed or not. The BooleanSymbolValue domain toggles between yes and no
to indicate if user defined symbols are utilized. The HorizontalAlignment and
VerticalAlignment attribute domains store the horizontal and vertical position for each instance of
feature linked annotation.
Figure 28 – Internal Annotation attribute domains
Testing the Data Model
The cave science data model was tested on a digital cave map of Great Onyx
Cave (GOC) provided by CRF member Bob Gulden. The Gulden map was originally
drafted in a Computer Aided Drafting (CAD) software package called AutoCAD. CAD
software supports a layer based organization of data. Although there is no community
standard for cave map drawing layers, most cartographers follow some scheme simply
for map organization. The GOC map loosely organizes objects into several layers. The
main layers are more generalized than the data model. Layers such as “detail” and
“symbols” may have numerous features crossing multiple feature classes. For example,
features that would be found in the Hydrology feature dataset are found on the same layer
as features that would be in the Passages feature dataset. This is expected given that the
GOC map was not drawn with GIS in mind.
Two different methods were tested for importing the GOC map to the data model.
The first method simply adds the CAD data to an ArcGIS project. This method is basic
and relies on the generic CAD import tool found in ArcGIS. The results of this process
were unacceptable as the tool does not appear to support the advanced element types such
as Bezier curves commonly found in CAD data such as the GOC map.
The second methodology processed the GOC map with the “Import from CAD”
tool in ArcToolBox (Figure 29). This method simplifies the complex element types to
geometry that can be stored in the geodatabase. The obvious advantage to this method is
that there is no data lost as a result of the import process. The import tool also supports
attribute data from CAD, but processing the GOC map revealed that no attributes had
been created. Twelve layers were imported and stored in a temporary geodatabase by the
process. As expected, these layers did not easily support the cave science data model.
Since CAD does not enforce the geometry types that ArcGIS supports, many elements
that would ideally map to a single feature class in the data model are represented by both
lines and polygons. In these cases, the geometry would need to be recreated as features
with the appropriate geometry.
Once the geometry was correctly represented in the temporary geodatabase, the
various features were copied to the cave science geodatabase using the editing tools
found in ArcGIS. This process was straightforward and without complication. The
survey data was exported from WALLS and imported directly to the cave science
geodatabase without incident. Appendix D illustrates portions of the GOC map after
import to the cave science data model.
Figure 29 – Import from CAD geodatabase tool
One issue encountered was the geometry types used in drawing the map. CAD
software leverages several element types not easily supported by ArcGIS. Complex
element types, such as parametric B-spline geometry, are often used by cartographers to
represent cave walls and other features. These element types had to be converted to
simple lines before importing the data to GIS. The conversion produces a less aesthetic
line, which is a problem for cartographic representation.
By far, the most significant problem with testing the prototype cave science data
model was the lack of standards for imported drawings. If a cave map were drawn from the
beginning using the data model, the data would be created and attributed correctly. The
ArcGIS editing environment provides all of the necessary functionality for drawing and
data entry. These tools could further be developed by programming to customize them
for the cave research community. The imported drawings are more problematic, though,
because they have all been drawn in different software packages and by different
individuals. Even within the same software the drawing organization and layer scheme is
left up to the individual. This fact makes it a necessity to adjust layer mappings for each
individual cartographer depending on what layer and geometry type features need to be
retained when importing a map. Standards for maps not native to the data
model should be established for a given project.
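One way to picture the per-cartographer layer mapping is as a lookup table from CAD layer and geometry type to a feature class in the data model. The sketch below is illustrative only: the layer names “detail” and “symbols” come from the GOC example, while the “water” layer and every assignment in the table are hypothetical.

```python
# Hypothetical per-cartographer mapping from (CAD layer, geometry type) to a
# feature class in the data model. The assignments are illustrative only.
LAYER_MAP = {
    ("detail", "polygon"): "FloorMaterial",
    ("detail", "line"): "PassageFeatures",
    ("symbols", "polygon"): "Speleothems",
    ("water", "line"): "Streams",
}


def route_features(cad_features, layer_map):
    """Group CAD features by target feature class; collect the unmapped."""
    routed, unmapped = {}, []
    for feature in cad_features:
        target = layer_map.get((feature["layer"], feature["geometry"]))
        if target is None:
            unmapped.append(feature)  # needs a manual decision
        else:
            routed.setdefault(target, []).append(feature)
    return routed, unmapped
```

Features that fall through to the unmapped list are the ones whose layer or geometry type would have to be resolved by hand, which is where a project-level standard would save effort.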
Limitations on the allowable geometry types for features also proved to be
cumbersome. Many of the features that were to be stored in the FloorMaterial feature
class had been created using line tools. The FloorMaterial feature class only supports
polygonal geometry. This limitation of only storing a single geometry type would
suggest that several feature classes with identical user attributes may need to be
supported by the data model. This architecture may create a burdensome data model that
is unwieldy for any meaningful analysis using GIS tools.
Several relationship classes were implemented in the data model. This type of
structure appears to work well within the model. The cross sectional data was imported
from the GOC temporary geodatabase and stored in the CrossSections feature class. The
feature class was attributed with the survey station attribute to test the
CrossSectionsNearSurveyStation relationship class. The relationship class appeared to
work as designed and enforced that each cross section had a survey station reference. The
CrossSectionsNearSurveyStation is a good example of how one feature class can have a
mandatory attribute linked to a second feature class. If the map has cross sections that
cannot be related to the plan view, those cross sections are not useful in analysis. The
relationship class verifies that all cross section features have a station record in the
attribute table. Based on this result, it may be beneficial to establish additional
relationship classes for those features that employ the “near station” record.
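The check that the relationship class performs can be approximated outside the geodatabase as follows. This is a sketch under stated assumptions, not the ArcGIS mechanism: the function name and the field names `id` and `near_station` are hypothetical.

```python
from collections import defaultdict


def check_relationship(station_names, cross_sections):
    """Group cross sections by station and list those with no valid station."""
    known = set(station_names)
    related = defaultdict(list)  # one station to many cross sections
    orphans = []
    for section in cross_sections:
        near = section.get("near_station")
        if near in known:
            related[near].append(section["id"])
        else:
            orphans.append(section["id"])
    return dict(related), orphans
```

Cross sections returned as orphans cannot be positioned on the plan view and would need a station reference before they are useful in analysis.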
The object classes were relatively easy to fit into the data model. This
functionality was tested by creating simulated research projects and researchers. It was
hoped that actual research data could be used in the testing of the model, but ultimately
data was not available in the timeframe of this research. The addition of non-graphical
information is possible within the data model, but should generally be kept to a minimum
to keep file sizes small and more efficient. The flexibility of GIS for linking to digital
data outside of the geodatabase provides a powerful method for extending project data to
other domains and research collaborators.
The data model was only tested with the ESRI personal geodatabase format. This
format is based on the Microsoft Access database format. For most cave projects this
format is acceptable since the limit on the Access file size is two gigabytes. Support for
the geodatabase at the enterprise level is only possible by implementing ArcSDE. This
middleware software brokers data transactions between the geodatabase and the user(s).
ArcSDE requires specialized knowledge and hardware to operate. These requirements
put this type of enterprise implementation out of reach of many end users, especially those
working independently on cave science.
CHAPTER V
CONCLUSIONS
The research objective for this thesis was to develop a usable ArcGIS geodatabase
data model for cave science. Because speleology is a wide field of study, the data model
focused on a core set of features that would be useful to all researchers. These features
were defined as the cave map symbols published by the NSS and MSS. Comments were
solicited from several cave researchers to help verify the data model usability.
The process started with the conceptual design of the data model. There was
some difficulty in getting feedback from other stakeholders in the cave research
community. This may be a result of GIS just beginning to reach a larger user base within
the community. Once the information products were finalized, the map symbols were
organized into a logical design of feature datasets and feature classes. There were some
compromises that had to be made with regard to geometry types and spatial
representation of features. This appears to be a somewhat problematic solution because
some feature classes would ideally support multiple geometry types.
The physical design and creation of the data model was straightforward with
ArcCatalog. Ultimately this proved to be the easiest part of the process. Finally the data
model was tested with the CAD map of Great Onyx Cave. The testing verified a problem
identified during the logical design of the data model. The geometry types used to create
features in CAD did not match the geometry types created for the cave science model.
This issue must be addressed before legacy data can be easily supported by the data
model.
An important limitation on the utility of the data model developed here
is potential resistance in the targeted community of users. Many cave researchers are
slow to change, and many more have not yet discovered GIS as a powerful tool.
Acceptance of the data model will take continued communication and ideas by a wide
cross-section of actual and potential users.
Future Research
This research is only a first step towards creating a viable data model for
speleologists. Most widely used data models take years to mature and have undergone
several iterations. Data models should be considered a work in progress, especially in
regard to spatial data. The explosive growth and development that the GIS industry has
been experiencing over the last several years have yielded new functionality on an almost
continual basis. The framework developed and described here can be extended in many
directions. Some of these extensions will require custom programming, while others will
be contingent on new features of the latest GIS software.
The release of ESRI’s ArcGIS 9.2 software slated for late 2006 promises more
robust cartographic functionality in the geodatabase (ESRI 2006b). New data structures
will be needed to support such operations. These features will likely provide benefits
valuable to the data model designed as a part of this research.
Support for automated import of data is needed to support casual GIS users. The
export routines from WALLS and COMPASS create files in the shapefile format. These
data files can be manually imported into the geodatabase provided that the user has a
working knowledge of ArcToolBox. Creation of geoprocessing scripts is possible using
the ModelBuilder module of ArcGIS. Establishing scripts would also allow for easy
attribute field mapping for WALLS, COMPASS or any other cave software program that
may be used on a given project. Similar geoprocessing scripts and ArcObjects code
could be developed for common tasks such as searching for survey stations or drawing
geometry. Tools such as these would allow non-technical GIS users to leverage the data
model and software for their research.
Network modeling is another useful area to consider developing. The
SurveyStations and SurveyVectors feature classes support the use of the built-in
geometric modeling tools in ArcGIS. However, these tools are very basic and only
support two dimensional networks. With the release of the 9.1 version of ArcGIS
Network Analyst, it is possible to create non-planar networks. This type of functionality
would allow cave researchers and managers to visualize attribute data related to the cave
GIS data in ways not previously possible.
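The idea of treating survey stations and survey vectors as a three-dimensional network can be sketched without any GIS software. The example below is plain Python and does not use ArcGIS Network Analyst or the geometric network tools; the station names and coordinates are hypothetical, and shortest path length is shown only as one example of a network question.

```python
import heapq
import math

# Hypothetical stations as (x, y, z) coordinates and vectors as station pairs.
stations = {"A1": (0, 0, 0), "A2": (10, 0, -2), "A3": (10, 10, -8), "A4": (0, 10, -3)}
vectors = [("A1", "A2"), ("A2", "A3"), ("A1", "A4"), ("A4", "A3")]


def build_graph(stations, vectors):
    """Build an undirected graph weighted by 3D straight-line shot length."""
    graph = {name: [] for name in stations}
    for a, b in vectors:
        length = math.dist(stations[a], stations[b])
        graph[a].append((b, length))
        graph[b].append((a, length))
    return graph


def shortest_distance(graph, start, goal):
    """Dijkstra's algorithm; returns math.inf if the goal is unreachable."""
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == goal:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbor, length in graph[node]:
            candidate = dist + length
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return math.inf


cave_graph = build_graph(stations, vectors)
```

Because edge lengths use all three coordinates, passages that cross over one another in plan view remain distinct, which is the property a planar two-dimensional network cannot provide.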
Another area for future research centers on added cartographic tools in ArcGIS.
Of particular interest is the ability to support different symbology for features at different
map scales. This type of functionality is critical to supporting different feature geometry
types within a single feature class. Support for more complex geometry types such as
B-spline curves would benefit the usability of the data model. Unfortunately, inclusion
of such functionality is at the discretion of the software vendor.
Community-wide standardization of data collection and
cartographic methods is badly needed. As researchers collect more data on more
domains this becomes of particular importance. The ability to collaborate is greatly
diminished if every project is stored in its own data “silo” and unable to easily interact
with other data.
The cave science data model is designed to be a core data model for speleology.
This data model may be extended into various other domains. Some users may want to
extend the model to their particular area of interest such as paleontology, biology, or
geology, while cave managers may want to extend the model to areas such as
management concerns, interpretation, or maintenance.
Finally, compliance with Open Geospatial Consortium (OGC) standards should be a
long term goal for any GIS data model. The OGC supports non-vendor specific data
standards. Many of the leading GIS software vendors support OGC in some manner. It
is not realistic to expect all speleologists to use ESRI software or any single software
package for GIS. The support for OGC standards not only expands the options for which
GIS software can be used with the data model, it also expands the opportunities for the
cave research community to collaborate.
This research provides a starting point for future development and refinement of
the cave science data model. The literature review for this thesis suggests that this is the
first attempt to apply the geodatabase structure to traditional cave maps. Work has been
done in the areas of groundwater modeling and karst systems, but it did not address the
research needs for science in the cave itself. GIS is the next step for cave map creation
and analysis. Moving the science beyond simple geometry is critical to exploiting the
spatial data as researchers continue to expand their knowledge of caves. I am hopeful
that this data model is a step in that direction.
Appendix A
ArcCatalog View of Cave Science Data Model
Appendix B
National Speleological Society Map Symbols
NSS Map symbols reprinted with permission of the
National Speleological Society (Dasher 1994)
Appendix C
Missouri Speleological Survey Map Symbols
MSS Map symbols reprinted with permission of the
Missouri Speleological Survey (Thomson and Taylor 1991)
Appendix D
Test Data Sample: Great Onyx Cave
Sample of the Survey Feature Dataset testing (map legend: Survey Stations, Survey Vector, Survey Annotation)
Sample of the Passages Feature Dataset testing (map legend: Breakdown, Passage Wall, Ledge, Flowstone Area, Slope in floor)
REFERENCES
Arctur, D. and Zeiler, M., 2004, Designing Geodatabases: Case Studies in GIS Data Modeling, (Redlands: ESRI).
Cole, J., 1999, News Notes: Treasures in a Pristine Cave. Available online at: http://www.geotimes.org/oct99/newsnotes.html (accessed 9 July 2005).
Dasher, G., 1994, ON STATION A Complete Handbook for Surveying and Mapping Caves. (Huntsville: National Speleological Society, Inc.).
Dotson, D., 1992, The SMAPS Cave Management System, (Frostburg, MD: Speleotechnologies).
ESRI, 2002, COTS GIS: The Value of a Commercial Geographic Information System. Available online at: http://www.esri.com/library/whitepapers/pdfs/cots-gis.pdf (accessed 28 May 2006).
ESRI, 2006a, Data Models. Available online at: http://support.esri.com/index.cfm?fa=downloads.dataModels.gateway (accessed 4 February 2006).
ESRI, 2006b, What’s Coming in ArcGIS 9.2. Available online at: http://www.esri.com/software/arcgis/about/whats-coming.html (accessed 28 May 2006).
Fish, L., 2006, COMPASS History, Goal and Philosophy. Available online at: http://fountainware.com/compass/miscitem.htm#GOALS (accessed 28 May 2006).
Glennon, J., 2001, Application of Morphometric Relationships to Active Flow Networks within the Mammoth Cave Watershed. Masters thesis. Available online at: http://www.uweb.ucsb.edu/~glennon/GlennonThesis.pdf (accessed 8 July 2005).
Gulden, B., 2006, Great Onyx Cave map. Unpublished.
Hale, E., 2005, Geometric Network for Cave Survey Lines. Available online at: http://forums.esri.com/Thread.asp?c=139&f=771&t=172648&mc=0#msgid508264 (accessed 8 June 2006).
House, S., 2005, Cave Research Foundation small caves database. Unpublished.
Klimchouk, A., Ford, D., Palmer, A., and Dreybrodt, W., (eds), 2000, Speleogenesis: Evolution of Karst Aquifers, (Huntsville: National Speleological Society, Inc.).
MacDonald, A., 2001, Building a Geodatabase, (Redlands: ESRI).
Maidment, D., 2002, Arc Hydro: GIS for Water Resources. (Redlands: ESRI).
McKenzie, D., 2006, WALLS Project Editor – Tools for Cave Survey Data Management. Available online at: http://www.utexas.edu/tmm/sponsored_sites/tss/Walls/tsswalls.htm (accessed 28 May 2006).
Moore, W., and Sullivan G., 1978, Speleology: The Study of Caves. (St. Louis: Cave Books).
Moyes, H., and Awe, J., 2000, Spatial Analysis of an Ancient Cave Site. Available online at: http://www.esri.com/news/arcuser/1000/cave.html (accessed 8 July 2005).
Nyerges, T., 2006, Developing a Geodatabase. Available online at: http://courses.washington.edu/geog461/final_project_06/geodatabase_development.doc (accessed 3 February 2006).
Osburn, B., 2005, Exploration/Survey/Cartography Program Activities Report. Available online at: http://www.cave-research.org/eocrf/eocart.html (accessed 3 February 2006).
Pfaff, R. and Glennon, J., 2004, Working with ArcGIS 9: Building a Groundwater Protection Model. Available online at: http://www.esri.com/news/arcuser/0704/files/modelbuilder.pdf (accessed 8 July 2005).
Shaw, T., 2005, There is a Lot of New Stuff to Say About Data Modeling. Available online at: Http://www.dmreview.com/editorial/newsletter_article.cfm?nl=dmdirect&articleId=1022729&issue=20157 (accessed 28 May 2006).
Szukalski, B., 2004, ESRI Cave and Karst News #9, July 2004. Available online at: http://www.esri.com/industries/cavekarst/news_community/cavekarst_enews_0704.html (accessed 3 February 2006).
Thomson, K. and Taylor, R., 1991, The Art of Cave Mapping. Missouri Speleology 31(1-4).
Tomlinson, R., 2003, Thinking About GIS: Geographic Information System Planning for Managers. (Redlands: ESRI).
Wayne, C., 2005, Managing Rasters in a Personal Geodatabase. Available online at: http://www.esri.com/news/arcuser/0705/files/managerasters.pdf (accessed 23 June 2006).
White, W., 1988, Geomorphology and Hydrology of Karst Terrains, (New York: Oxford University Press).
Zeiler, M., 1999, Modeling Our World: The ESRI Guide to Geodatabase Design. (Redlands: ESRI).