Preliminary Description of the Environmental Data Challenge for DoD M&S Briefing by: Virginia T. Dobey SAIC/SETA Support to DMSO Environmental Representation Domain Lead (703) 824-3411 or (703) 963-8512 [email protected]

Page 1: Preliminary Description of the Environmental Data Challenge for DoD M&S

Preliminary Description of the Environmental Data Challenge for DoD M&S

Briefing by: Virginia T. Dobey

SAIC/SETA Support to DMSO

Environmental Representation Domain Lead

(703) 824-3411 or (703) 963-8512

[email protected]

Page 2: Preliminary Description of the Environmental Data Challenge for DoD M&S

DMSO Task: Environmental Representations

Provide consistent, comprehensive environmental representations that include the natural environment, as well as representations of anthropogenic impacts, flora, and fauna, to DoD M&S users before FY 2014 when and where needed.

• Provide, before FY 2008, environmental data sets, algorithms, models, tools, and documentation to environmental resource repositories.

• Establish, before FY 2009, a capability to provide authoritative and dynamic representations of the natural environment.

• Establish and publish, before FY 2010, authoritative data sources, data dictionaries, data structure, attribution scheme, symbology, and metadata for each natural environment domain; and provide a common interchange mechanism for both static and dynamic environmental representations.

• Provide, before FY 2012, tools to ensure that natural environmental representations dynamically interact with other representations.

Explanation

Technologies provide the means to represent environmental data (terrain, ocean, air and space), and promote the unambiguous, loss-less and non-proprietary interchange of environmental data.

Level I Level II Level III Level IV Level V

Page 3: Preliminary Description of the Environmental Data Challenge for DoD M&S

Impact of platforms, weapons, sensors, and their actions on space, atmosphere, terrain, and ocean conditions

Space conditions; Atmospheric conditions; Terrain conditions; Ocean conditions

Effects of space, atmosphere, terrain, and ocean conditions on platforms, weapons, and sensors

In M&S, a complete and accurate environmental representation must include not only the environmental conditions but also their effects on system C&P, as well as feedback of system activity on the environment. This, in turn, requires environmental data that can be FUSED with other data sources.

Page 4: Preliminary Description of the Environmental Data Challenge for DoD M&S

The Emerging GIG Data Environment (Task, Post, Process and Use - TPPU)

Diagram elements: Ubiquitous Global Network; Metadata Catalogs; Enterprise & Community Web Sites; Application Services (e.g., Web); Shared Data Space; Metadata Registries; Security Services (e.g., PKI, SAML)

Developer: Posts to and uses metadata registries to structure data and document formats for reuse and interoperability

Consumer: Searches metadata catalogs to find data; analyzes metadata to determine context of data found; pulls selected data based on understanding of metadata

Producer: Describes content using metadata; posts metadata in catalogs and data in shared space

• Data Standards posted in Metadata registries

• Producer tags and posts data

• Location of Data Posted in Metadata Catalogs

• Actual Data posted to shared data spaces

• Consumer can find and pull data based on metadata tags
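The producer and consumer roles on this slide can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the class names `SharedDataSpace` and `MetadataCatalog`, the tag keys, and the location strings are hypothetical and do not correspond to any actual GIG service or interface.

```python
from dataclasses import dataclass, field


@dataclass
class SharedDataSpace:
    """Hypothetical stand-in for a shared data space: data keyed by location."""
    store: dict = field(default_factory=dict)

    def post(self, location: str, payload: bytes) -> None:
        self.store[location] = payload

    def pull(self, location: str) -> bytes:
        return self.store[location]


@dataclass
class MetadataCatalog:
    """Hypothetical stand-in for a metadata catalog: tags plus data location."""
    entries: list = field(default_factory=list)

    def post(self, location: str, tags: dict) -> None:
        self.entries.append({"location": location, "tags": dict(tags)})

    def search(self, **wanted) -> list:
        # Return the locations whose tags match every requested key/value.
        return [e["location"] for e in self.entries
                if all(e["tags"].get(k) == v for k, v in wanted.items())]


space, catalog = SharedDataSpace(), MetadataCatalog()

# Producer: tag and post (metadata to the catalog, data to the shared space).
space.post("space/wx/0001", b"gridded wind field")
catalog.post("space/wx/0001", {"domain": "atmosphere", "format": "grid"})
space.post("space/tr/0002", b"elevation posts")
catalog.post("space/tr/0002", {"domain": "terrain", "format": "grid"})

# Consumer: search the catalog, then pull only what the metadata selects.
hits = catalog.search(domain="terrain")
pulled = [space.pull(loc) for loc in hits]
print(hits, pulled)
```

The sketch keeps the catalog and the data store separate on purpose: the consumer never scans the data itself, only the metadata that describes it.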

Page 5: Preliminary Description of the Environmental Data Challenge for DoD M&S

GIG Policy: The TPPU Paradigm

(diagram obtained from: http://ges.dod.mil/about/tppu.htm)

Page 6: Preliminary Description of the Environmental Data Challenge for DoD M&S

What are Warfighter Issues? Shifting Paradigms

• The adoption of a Net-Centric Data Enterprise

– It’s not just a producer / user world anymore… (now EVERYONE’s a producer!)

– Consumers want access to data / information / knowledge immediately

– Consumers want to input how the data is manipulated/filtered

• Moving from a … Collector / Product focus: Task, Process, Exploit and Disseminate

– Reliance on “Factory”

– Resource-intensive data download

– One (producer) to many (consumers)

– Bandwidth utilization / availability - not a consideration

• To a ... Analyst / Data focus: Task, Post, Process and Use (share)

– Moving to “many-to-many” topology

– Smart “data ordering” agents

– Sharing of information

– Immediate access to Through-the-Sensor data

– Bandwidth - critical to warfighters

Page 7: Preliminary Description of the Environmental Data Challenge for DoD M&S

GIG: Increasing the Interoperability Challenge

• Everyone is a potential producer

• Multiple legacy environmental data sources and user systems exist

– Significant investment in existing production and user hardware and software

– Data in multiple (often system-specific) formats need updating

• Few data resources are reliably compatible, even those produced by the Government

– Example: OAML (product-specific formats)

• “Power to the Edge” concept empowers user to identify other sources of required data

– No requirement for common data syntax/semantics

– Increases the challenge of data fusion

Page 8: Preliminary Description of the Environmental Data Challenge for DoD M&S

GIG: Assumptions in Assessing Environmental Data Interoperability

• Traditional data producers will continue to provide data in producer-specific and product-specific formats following existing production guidelines, since those products and formats meet the general needs of most customers (users). Formats will continue to leverage producer standards such as the Joint METOC Conceptual Data Model and the Feature and Attribute Coding Catalog. Tailoring data to user requirements will remain a user responsibility.

• Users will need a data mediation capability that can access not only these traditional data sources but also non-traditional and often unknown data sources that can be identified and obtained over the GIG, such as commercial products (sometimes having proprietary formats) and streaming data from in-situ sensors (anticipated development using future technology).

Page 9: Preliminary Description of the Environmental Data Challenge for DoD M&S

Barriers to Data Interoperability

• Data sources, models, and operational systems developed independently of each other

• Simulations not traditionally designed to interface with operational systems (and sometimes with each other!)

• Tailored (both in format and in content) datasets that are optimized for a specific system support only specific uses

Result: syntactically and semantically different forms of data representation are in use

Page 10: Preliminary Description of the Environmental Data Challenge for DoD M&S

Developing Interoperable Data

“A data model is an abstract, self-contained, logical definition of the objects, operators, and so forth, that together constitute the abstract machine with which users interact. The objects allow us to model the structure of data…An implementation of a given data model is a physical realization on a real machine of the components of the abstract machine that together constitute that model…the familiar distinction between logical and physical…” [emphasis in the original] C.J. Date¹

“Logical Data Model: A model of data that represents the inherent structure of that data and is independent of the individual applications of the data and also of the software or hardware mechanisms which are employed in representing and using the data.” DoD 8320.1-M²

“Normalization leads to an exact definition of entities and data attributes, with a clear identification of homonyms (the same name used to represent different data) and synonyms (different names used to represent the same data). It promotes a clearer understanding and knowledge of what each data entity and data attribute means.” C. Finkelstein³

¹ Colleague of E.F. Codd, originator and developer of relational database theory

² DoD authority on information engineering

³ “Originator and main architect of the Information Engineering methodology”

Page 11: Preliminary Description of the Environmental Data Challenge for DoD M&S

Normalization Challenges

• Users are familiar with non-normalized physical data elements. Tendency is to call these “logical” and stop there.

• In any large data model, normalization is difficult. It is often ignored (benign neglect).

• Complete data models incorporate business rules (how the entities relate to each other).

• May not be needed for an implementation-independent model used to develop a data dictionary (of interoperable concepts), but…

Page 12: Preliminary Description of the Environmental Data Challenge for DoD M&S

Achieving Data Interoperability: The Three-Schema Architecture

(Diagram: User 1, User 2, … User N application views form the external schema; keyed tables form the conceptual schema; the internal schema lies beneath.)

• Converting user-specific data requirements into conceptual “building blocks” for data integration

• Logical data model building blocks are the basis for application data structures

• Normalized logical data model serves as conceptual design “bridge” from the external schema to and from the internal schema

• Also facilitates ingest of other source data

Page 13: Preliminary Description of the Environmental Data Challenge for DoD M&S

The Three-Schema Architecture Applied to Environmental Data

(Diagram: User 1, User 2, … User N above keyed tables; Prod 1, Prod 2, … Prod N below.)

• User (production) applications: CBRN, weather effects, terrain trafficability, …

• Normalized logical data model serves as conceptual “bridge”

• Allows for ingest of other source data

• Producer product formats: METOC producer-specific formats, NGA product formats, JMCDM, FACC, …

• Fusion of normalized data internal to the system

• Implementation-independent “middle layer” can be placed at the producer interface, user interface, or somewhere in between
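As a rough illustration of the “middle layer” idea, the Python sketch below translates two producer layouts into one normalized record and then derives a user view from it. The field names (`TEMP_F`, `air_temp_k`, `air_temperature_kelvin`, and so on) and both producer layouts are invented for this example; they are not taken from JMCDM, FACC, or any real product format.

```python
def from_producer_a(rec: dict) -> dict:
    """Hypothetical layout A: temperature in degrees Fahrenheit."""
    return {
        "air_temperature_kelvin": (rec["TEMP_F"] - 32.0) * 5.0 / 9.0 + 273.15,
        "latitude_deg": rec["LAT"],
        "longitude_deg": rec["LON"],
    }


def from_producer_b(rec: dict) -> dict:
    """Hypothetical layout B: temperature in kelvin, position as a pair."""
    lat, lon = rec["position"]
    return {
        "air_temperature_kelvin": rec["air_temp_k"],
        "latitude_deg": lat,
        "longitude_deg": lon,
    }


def weather_effects_view(norm: dict) -> dict:
    """Hypothetical user view: this application wants degrees Celsius."""
    return {
        "temp_c": round(norm["air_temperature_kelvin"] - 273.15, 2),
        "where": (norm["latitude_deg"], norm["longitude_deg"]),
    }


# Producer side: each layout is translated once into the normalized form.
normalized = [
    from_producer_a({"TEMP_F": 68.0, "LAT": 38.8, "LON": -77.1}),
    from_producer_b({"air_temp_k": 293.15, "position": (38.8, -77.1)}),
]

# User side: the view is written against the normalized form only.
views = [weather_effects_view(n) for n in normalized]
print(views)
```

Because the user view depends only on the normalized record, adding a third producer layout would require one new translation function and no change to the view.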

Page 14: Preliminary Description of the Environmental Data Challenge for DoD M&S

Creating a Reusable Implementation-Independent Middle Layer

Such an architectural layer must be:

• Independent of source products

• Independent of optimized system implementation

• Able to provide for the FULL SPECIFICATION of all source product data as well as all system data requirements

• Developed as an implementation-independent (LOGICAL) relational data model, as required by DoDAF OV-7 Product view

Page 15: Preliminary Description of the Environmental Data Challenge for DoD M&S

A Reusable Middle Layer for Environmental Data

• Requires standardized terms in all environmental domains – leverage existing International/DoD standards

• Requires a concise, well-organized, non-redundant data structure

– Must extend from a normalized logical data model

• Requires highly granular, independent data elements (‘atomic’ level concepts)

– To support the many formats required by users (to facilitate precise rendering of translations to and from the hub)

Page 16: Preliminary Description of the Environmental Data Challenge for DoD M&S

A Complete Representation: All Environmental Domains

Page 17: Preliminary Description of the Environmental Data Challenge for DoD M&S

A Concise Non-Redundant Data Structure

• Must address format as well as content

– Format

• Must handle the large number of required data representation formats while preserving consistency of data (the “fair fight” across the federation)

– Content

• Must be based on atomic data elements from a normalized logical data model (support for data fusion)

Page 18: Preliminary Description of the Environmental Data Challenge for DoD M&S

Challenge: The Many Formats of M&S Data

(Diagram: the following data formats all feed an Interchange Hub.)

• Controlled Image Base (raster)

• Foundation Feature Data (vector): Lake, Trees

• Vector topology

• Geometry

• DTED (gridded)

• 1, 2, and 3-D point observation data

• Nested, gridded data

• Tabular data, for example:

Surface Backscatter Strength as a Function of Angle of Incidence and EM Band

EM Band      Angle of incidence in degrees
             15     30     45     60     75     90
microwave    300    290    240    207    198    170
L-Band       160    230    180    167    158    130
S-Band       165    152    78     22     8      1.5
X-Band       179    122    45     11     6      1
V-Band       200    90     40     9      4      0.1

Page 19: Preliminary Description of the Environmental Data Challenge for DoD M&S

And More Formats: Algorithmic/Model Support and Output Data

Page 20: Preliminary Description of the Environmental Data Challenge for DoD M&S

And Even More: Five-D Data Visualization

Page 21: Preliminary Description of the Environmental Data Challenge for DoD M&S

The Final Additions to the set of M&S Formats

• Compact Terrain Data Base (proprietary)

• DTED (product)

• E&S GDF (proprietary)

• E&S S1000 (proprietary)

• GeoTIFF

• Gridded raster

• MultiGen (proprietary)

• Shapefile (proprietary)

• Terrex DART, Terra Vista (proprietary)

• Vector Product Format (product)

Page 22: Preliminary Description of the Environmental Data Challenge for DoD M&S

“Atomic” Level Concepts

To facilitate precise rendering of translations to and from the hub

Producers use their own coding systems, each of which captures specific desired information, some of which may be captured by others and some of which may be unique. Each producer almost always carries information not available from other sources. Extracting information “embedded” in definitions through explicit statement of atomic attributes assists in adding attributes without overwriting the object.

Page 23: Preliminary Description of the Environmental Data Challenge for DoD M&S

The Value of Atomic-Level Attributes: An Example

Entity: Bridge over river

Entity: Suspension bridge

Entity: Bridge for two-way traffic

Decomposed:

Bridge + located over water body = river

Bridge + bridge type = suspension

Bridge + traffic carried = vehicular + number of traffic directions = 2

Results in:

Bridge + located over water body = river

+ bridge type = suspension

+ traffic carried = vehicular

+ number of traffic directions = 2

(each of these attributes can be changed/updated as new information is acquired)
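The bridge example can be expressed as a small Python sketch. The attribute names are paraphrased from the slide, and the helper `merge_atomic` is a hypothetical name introduced here; the point is only that atomic attributes from several sources accumulate on one object rather than replacing it.

```python
def merge_atomic(*observations: dict) -> dict:
    """Combine attribute sets that describe the same object.

    Later observations add new attributes or update existing ones;
    they never replace the object as a whole.
    """
    merged: dict = {}
    for obs in observations:
        merged.update(obs)
    return merged


# Three producer entities, each decomposed into atomic attributes.
bridge_over_river = {"entity": "bridge", "located_over_water_body": "river"}
suspension_bridge = {"entity": "bridge", "bridge_type": "suspension"}
two_way_bridge = {
    "entity": "bridge",
    "traffic_carried": "vehicular",
    "number_of_traffic_directions": 2,
}

bridge = merge_atomic(bridge_over_river, suspension_bridge, two_way_bridge)
print(bridge)

# A later report updates one attribute without disturbing the others.
bridge = merge_atomic(bridge, {"number_of_traffic_directions": 1})
```

Had the three sources been kept as opaque entity names ("suspension bridge", "bridge over river"), there would be no way to combine them without choosing one and discarding the others.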

Page 24: Preliminary Description of the Environmental Data Challenge for DoD M&S

“Complete and Accurate”—Does That Mean Data Fusion?

• Is the COP affected by METOC conditions? If so, can those effects be reflected in actual changes to the COP on the user system? This can be handled internally to the system without requiring data fusion capability.

• Does the user need to derive useful or critical information from the interaction of METOC/terrain data and information in the COP and provide it to other systems? The answer to this question determines whether data fusion is required by the user.

• Will the warfighter integrate environmental data into operational problems or will he use them as map or other overlays? The answer to this question determines whether data fusion is required by the user and allowed by the producer.

• Does the user need to have the ability to update METOC conditions and effects as reported by data from other (e.g., intel, foreign forces, etc.) battlefield sources? The answer to this question determines whether data fusion is required by the user.

Page 25: Preliminary Description of the Environmental Data Challenge for DoD M&S

Summary: The Challenge of Data Fusion

(Slide side labels: Business, Technical)

• What is the total set of requirements?

• There are many processes and products involved (some of which, as in ArcInfo/ArcView terrain products, may be proprietary), but the exchange mechanism must be independent of these. While we may know all of the currently available sources, will there ever be new ones available to the warfighter?

• Different views of the environment

– Air, land, sea, space

– Spatial location and orientation (coordinate system and datum)

• Lack of underlying environmental framework

– No integrated reference model available

• Representation (how the concept will be depicted on the user’s system: a visual object? 2D or 3D? A data point? Background data for algorithm use?)

• Naming/semantics

– Existing Data Models are conceptual, future models which are non-integrated and don’t address current data repositories and data interchange requirements

Page 26: Preliminary Description of the Environmental Data Challenge for DoD M&S

THE TRADITIONAL SOLUTION: Direct Mapping

(Diagram: Data Producer 1, Data Producer 2, Data Producer 3, … Data Producer n, each mapped directly to Data Consumer Application A, Application B, Application C, … Application Z.)

RESULT: A BIAS AGAINST TRANSLATION SOFTWARE

Page 27: Preliminary Description of the Environmental Data Challenge for DoD M&S

A GIG-Oriented Solution: The Interoperable “Middle Layer”

(Diagram: Data Producer 1, Data Producer 2, Data Producer 3, … Data Producer n connect to a COMMON INTERCHANGE HUB, which connects to Data Consumer Application A, Application B, Application C, … Application Z.)
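The difference between direct mapping and a common interchange hub can be made concrete with simple counting. The Python sketch below assumes, as a simplification not stated on the slides, that direct mapping needs one translator per producer/consumer pair and that the hub needs one translator per producer and one per consumer. The function names are hypothetical.

```python
def direct_mapping_translators(producers: int, consumers: int) -> int:
    """One translator for every producer/consumer pair (assumed)."""
    return producers * consumers


def hub_translators(producers: int, consumers: int) -> int:
    """One translator into the hub per producer, one out per consumer (assumed)."""
    return producers + consumers


for n, m in [(4, 4), (10, 26)]:
    print(f"{n} producers, {m} consumers: "
          f"direct={direct_mapping_translators(n, m)}, "
          f"hub={hub_translators(n, m)}")
```

Under these assumptions the direct approach grows with the product of the two counts while the hub grows with their sum, which is one way to read the slide's “bias against translation software”.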

Page 28: Preliminary Description of the Environmental Data Challenge for DoD M&S

What works for one system…creates unusual behaviors in another…

The Result of Improper Data Fusion

Page 29: Preliminary Description of the Environmental Data Challenge for DoD M&S

Why Not Let the Producers Handle it All?

(Diagram: SIMNET Database Interchange Specification (SDIS) and Project 2851 Standard Simulator Database Interchange Format (SIF), followed by “?”; side label: ASD(C3I) PDM-85.)

ASD(C3I) PDM-85 directed DMA (NGA’s predecessor) to STOP producing system-specific formats. Without some means of creating interoperable, reusable data, billions of dollars of DoD investment in simulation and other systems would have been lost.

Page 30: Preliminary Description of the Environmental Data Challenge for DoD M&S

SEDRIS: How it works

1. Identify representation structure of original data object (point, vector, raster, etc.—geometry, topology, grid, pixel, etc.) (this is the data format)

2. Separate attribution of the object (what it is, characteristics of what it is) from its representation (this is the data content)

3. Determine georeferencing of the object (this is the location of each object in its original spatial reference frame—UTM, MGRS, WGS-84, any local inertial or celestial reference datum, etc.)

4. Overlay representation on SEDRIS Data Representation Model, convert attribution to EDCS codes, and decompose georeferencing using Spatial Reference Model

5. Reassemble objects from multiple sources using the SEDRIS Transmittal Format to integrate/fuse data (more than just the simple overlay that is used in C4I, M&S systems now)
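The five steps can be loosely sketched in Python. This is not the SEDRIS API: the `HubObject` class, the `to_hub` and `fuse` functions, the record layout, and the `PLACEHOLDER_CODES` table are all hypothetical, and the codes shown are invented stand-ins rather than real EDCS codes. Conversion between spatial reference frames (part of step 4) is omitted; the sketch assumes all sources already share one frame.

```python
from dataclasses import dataclass

# Invented stand-ins for an attribute-coding dictionary. NOT real EDCS codes.
PLACEHOLDER_CODES = {"lake": "CODE_LAKE", "trees": "CODE_TREES"}


@dataclass(frozen=True)
class HubObject:
    representation: str   # step 1: point, vector, raster, grid, ...
    classification: str   # steps 2 and 4: attribution, converted to a code
    reference_frame: str  # step 3: original spatial reference frame
    coordinates: tuple    # step 3: location in that frame


def to_hub(record: dict) -> HubObject:
    """Separate representation, attribution, and georeferencing."""
    return HubObject(
        representation=record["shape"],
        classification=PLACEHOLDER_CODES[record["what"]],
        reference_frame=record["frame"],
        coordinates=tuple(record["coords"]),
    )


def fuse(objects) -> dict:
    """Step 5, simplified: group objects from all sources by location."""
    fused: dict = {}
    for obj in objects:
        fused.setdefault((obj.reference_frame, obj.coordinates), []).append(obj)
    return fused


records = [
    {"what": "lake", "shape": "vector", "frame": "WGS-84", "coords": (38.8, -77.1)},
    {"what": "trees", "shape": "raster", "frame": "WGS-84", "coords": (38.8, -77.1)},
]
fused = fuse(to_hub(r) for r in records)
print(fused)
```

Keeping format, content, and location in separate fields is what lets objects from a vector source and a raster source end up in the same group.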

Page 31: Preliminary Description of the Environmental Data Challenge for DoD M&S

So Why Keep SEDRIS?

• SEDRIS is user-oriented. It opens up and reconciles data from multiple producers for multiple users.

• SEDRIS is like any other standard for interoperability

– It “costs” resources to implement in any single system. It is not useful for a standalone system

– It saves significant resources when used in more than one system

• Assessment: “It is not in industry’s best interest to use SEDRIS. It is absolutely essential that the Government keep SEDRIS alive.”

Page 32: Preliminary Description of the Environmental Data Challenge for DoD M&S

BACKUP SLIDES

Page 33: Preliminary Description of the Environmental Data Challenge for DoD M&S

Formal Definitions of the Normal Forms (1 of 2)

• 1st Normal Form (1NF)

– Def: A table (relation) is in 1NF if

1. There are no duplicated rows in the table.

2. Each cell is single-valued (i.e., there are no repeating groups or arrays).

3. Entries in a column (attribute, field) are of the same kind.

• Note: The order of the rows is immaterial; the order of the columns is immaterial.

• Note: The requirement that there be no duplicated rows in the table means that the table has a key (although the key might be made up of more than one column, or even of all the columns).

• 2nd Normal Form (2NF)

– Def: A table is in 2NF if it is in 1NF and if all non-key attributes are dependent on all of the key.

• Note: Since a partial dependency occurs when a non-key attribute is dependent on only a part of the (composite) key, the definition of 2NF is sometimes phrased as, "A table is in 2NF if it is in 1NF and if it has no partial dependencies."

• 3rd Normal Form (3NF)

– Def: A table is in 3NF if it is in 2NF and if it has no transitive dependencies.
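The 2NF and 3NF definitions can be illustrated with a short Python sketch that scans a list of declared functional dependencies. It is a simplification under stated assumptions: the table has a single candidate key, only the dependencies listed are examined (no inference), and the table, attribute names, and function names are hypothetical.

```python
def partial_dependencies(key, fds):
    """Declared FDs where a non-key attribute depends on part of the key (2NF)."""
    key = frozenset(key)
    return [(lhs, rhs) for lhs, rhs in fds
            if frozenset(lhs) < key and rhs not in key]


def transitive_dependencies(key, fds):
    """Declared FDs where a non-key attribute depends on non-key attributes (3NF)."""
    key = frozenset(key)
    return [(lhs, rhs) for lhs, rhs in fds
            if frozenset(lhs).isdisjoint(key) and rhs not in key]


# Hypothetical observation table with a composite key.
KEY = ("station_id", "obs_time")
FDS = [
    (("station_id", "obs_time"), "temperature"),
    (("station_id", "obs_time"), "sensor_model"),
    (("station_id",), "station_name"),     # depends on part of the key
    (("sensor_model",), "sensor_vendor"),  # depends on a non-key attribute
]

print("2NF violations:", partial_dependencies(KEY, FDS))
print("3NF violations:", transitive_dependencies(KEY, FDS))
```

In this example, moving `station_name` to a station table and `sensor_vendor` to a sensor table would remove both violations.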

Page 34: Preliminary Description of the Environmental Data Challenge for DoD M&S

Formal Definitions of the Normal Forms (2 of 2)

• Boyce-Codd Normal Form (BCNF)

– Def: A table is in BCNF if it is in 3NF and if every determinant is a candidate key.

• 4th Normal Form (4NF)

– Def: A table is in 4NF if it is in BCNF and if it has no multi-valued dependencies.

• 5th Normal Form (5NF)

– Def: A table is in 5NF, also called "Projection-Join Normal Form" (PJNF), if it is in 4NF and if every join dependency in the table is a consequence of the candidate keys of the table.

• Domain-Key Normal Form (DKNF)

– Def: A table is in DKNF if every constraint on the table is a logical consequence of the definition of keys and domains.
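The BCNF definition can be checked mechanically with an attribute-closure computation, sketched below in Python. The sketch tests whether each determinant is a superkey (its closure covers every attribute), which is the usual textbook formulation; it assumes the declared dependencies are already minimal, so that a superkey determinant is also a candidate key. The table and attribute names are hypothetical.

```python
def closure(attrs, fds):
    """All attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """Non-trivial FDs whose determinant does not determine the whole table."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not set(rhs) <= set(lhs)
            and closure(lhs, fds) != set(all_attrs)]


# Hypothetical table: each technician services exactly one sensor type,
# and each (site, sensor type) pair has exactly one technician.
ATTRS = ("site", "sensor_type", "technician")
FDS = [
    (("site", "sensor_type"), ("technician",)),
    (("technician",), ("sensor_type",)),
]

print(bcnf_violations(ATTRS, FDS))
```

The table in this example has no partial or transitive dependencies on non-key attributes, yet `technician` is a determinant that is not a key, so it falls short of BCNF.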

Source: DATABASE-MANAGEMENT PRINCIPLES AND APPLICATIONS, Dr. Ronald E. Wyllys, The University of Texas at Austin, Austin, Texas, 78712-1276, http://www.gslis.utexas.edu/~l384k11w/normover.html