Central Data Exchange Environmental Information Exchange Network: Exchange Network Enhancements. By David Fladung, April 19, 2006

Page 1: Central Data Exchange Environmental Information Exchange Network

Central Data Exchange
Environmental Information Exchange Network

Exchange Network Enhancements

By David Fladung

April 19, 2006

Page 2: Central Data Exchange Environmental Information Exchange Network

Agenda

• CDX Overview
• Open Source Utilization
• Data Transformation (Mapper)
• Business Process Execution Language (BPEL)
• Rich User Interface (RUI) client
• Geographic Data Interaction

Page 3: Central Data Exchange Environmental Information Exchange Network

CDX Overview

Page 4: Central Data Exchange Environmental Information Exchange Network

CDX Overview

Page 5: Central Data Exchange Environmental Information Exchange Network

Open Source Utilization

• CDX utilizes about 50 open source products/frameworks
  • JBoss (Wind River Node application server)
  • PostgreSQL (Wind River Node database)
  • Struts (Model View Controller [MVC])
  • Hibernate (Object Relational Mapping [ORM])
  • Axis (WS engine and libraries)
  • Maven (build and release management)
  • AspectJ (quality of service)
  • StAX (streaming parsing of large XML)
  • Velocity (templating/mapping)
  • Quartz (job scheduling)
  • ActiveBPEL (business process management)

Page 6: Central Data Exchange Environmental Information Exchange Network

Open Source Utilization

Yellow – current open source implementation
Grey – potential for open source implementation
White – not applicable

Page 7: Central Data Exchange Environmental Information Exchange Network

Open Source Utilization

• Advantages
  • Low Total Cost of Ownership (TCO)
  • Rich user community
  • Adequate documentation
  • Proven performance
  • Promotes rapid development
  • Easy to integrate

• Disadvantages
  • Potential that product may no longer be supported
  • Advanced support may require cost

Page 8: Central Data Exchange Environmental Information Exchange Network

Data Transformation

• Convert from one data format to another
  • XML
  • Flat file (e.g., delimited)
  • Database

• Handle large file sizes
  • Use a streaming approach rather than loading the data into memory

• Provide a robust and reusable interface
  • Standard configuration files
  • Standard APIs
  • Reusable across multiple tiers
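The streaming goal above can be illustrated with the StAX API that ships with the JDK. The following is a minimal sketch, not the Mapper's own code: the class name `StreamingCount` and the sample element names are assumptions made for illustration. It reads one parse event at a time, so memory use does not grow with file size.

```java
import java.io.Reader;
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper for illustration; not part of the CDX Mapper.
final class StreamingCount {

    // Counts elements with the given local name, reading one parse event at a time.
    static int countElements(Reader xml, String localName) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(xml);
            int count = 0;
            while (reader.hasNext()) {
                // next() advances to the following event; no document tree is built.
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && localName.equals(reader.getLocalName())) {
                    count++;
                }
            }
            reader.close();
            return count;
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Malformed XML input", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<sites><site id=\"1\"/><site id=\"2\"/><other/></sites>";
        System.out.println(countElements(new StringReader(xml), "site"));
    }
}
```

In practice the `Reader` would wrap a file or network stream rather than an in-memory string.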

Page 9: Central Data Exchange Environmental Information Exchange Network

Data Transformation

• TRI OUT – flat file to XML
• NC Node – database to XML for Beaches and NEI data
• Puerto Rico Node – flat file to XML for AQS data
• Wind River Node – database to XML for AQS
• Geo Toolkit for Region 5 – XML to XML for Geo data
• EnviroFlash – flat file to unstructured email (text)
• TRIME – XML to database
• Water Sentinel – database to XML, XML to database
• GLNPO – database to Excel, database to XML

Page 10: Central Data Exchange Environmental Information Exchange Network

Data Transformation

Yellow – current use of mapper implementation
White – not applicable

Page 11: Central Data Exchange Environmental Information Exchange Network

Data Transformation

• Architecture
  • Mapping engine
    • Runs the transformation process
    • Built on the Velocity open source project
  • Configuration files
    • Mapping instructions
    • Location of the data sources and data targets
    • Conditional logic, custom methods
  • Custom Java methods: provide custom transformations such as data formatting
  • Pluggable readers
  • Pluggable writers
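The decoupling of readers and writers in this architecture can be sketched in plain Java. Everything named below (`MiniMapper`, `RecordReader`, `RecordWriter`, `run`) is hypothetical and chosen for illustration; it is not the Mapper's actual API, and a `Function` stands in for the Velocity-based mapping instructions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of a reader/mapping/writer pipeline; not the CDX Mapper API.
final class MiniMapper {

    // A reader exposes records one at a time, whatever the underlying source is.
    interface RecordReader extends Iterable<Map<String, String>> {}

    // A writer consumes one transformed record at a time.
    interface RecordWriter {
        void write(String record);
    }

    // Pulls each record from the reader, applies the mapping, and hands the result to the writer.
    static void run(RecordReader reader,
                    Function<Map<String, String>, String> mapping,
                    RecordWriter writer) {
        for (Map<String, String> row : reader) {
            writer.write(mapping.apply(row));
        }
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("STATE_CODE", "72");
        List<Map<String, String>> rows = new ArrayList<>();
        rows.add(row);

        // The in-memory list plays the role of a database or flat-file reader.
        run(rows::iterator,
            r -> "<StateCode>" + r.get("STATE_CODE") + "</StateCode>",
            System.out::println);
    }
}
```

Because `run` depends only on the two interfaces, a database reader can be swapped for a flat-file reader, or an XML writer for an e-mail writer, without touching the mapping.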

Page 12: Central Data Exchange Environmental Information Exchange Network

Data Transformation

• Mapping steps
  • Logical mapping
    • Analyze the data source and the data target, and create the document that specifies the relations between the source and target fields.
    • If the data source is a relational database, this step includes developing the query that extracts the data from the database.
  • Physical mapping: create the configuration files that implement the logical mapping specification.
  • Custom methods (if needed)

Page 13: Central Data Exchange Environmental Information Exchange Network

Data Transformation

• Database to XML (Puerto Rico Node)

## Database Query
#set ($sqlQuery = "select distinct TRANSACTION_TYPE, ACTION_CODE, STATE_CODE, COUNTY_CODE, SITE_ID from ${tableName}RA where ACTION_CODE = 'D' and TRANSACTION_TYPE = 'RA'")
## Set Reader properties
#set ($tmp = $MapperEngine.setMapReaderProperty('SQL_COMMAND', $sqlQuery ) )
#set ($tmp = $MapperEngine.setMapReaderProperty('ENCODING', 'XML_ENCODING') )
## Loop for each record in result set
#foreach($row in $MapperEngine.getIterator())
## Write XML
<aqs:ActionRawDataDelete>
  <aqs:SiteIdentifierDetails>
    ## Use value from record as a variable
    <aqs:StateCode>$!row.STATE_CODE</aqs:StateCode>
    <aqs:CountyCode>$PRFunctions.getNumberDigitStr($!row.COUNTY_CODE , 3)</aqs:CountyCode>
    <aqs:SiteNumber>$PRFunctions.getNumberDigitStr($!row.SITE_ID , 4)</aqs:SiteNumber>
  </aqs:SiteIdentifierDetails>
  ## Call subsequent execution
  #set( $config = $MapperEngine.createMapperConfiguration() )
  #set ($tmp = $!config.ContextConfig.put( 'SITE_ID', $!row.SITE_ID ))
  #set ($tmp = $!config.ContextConfig.put( 'tableName', $tableName ))
  #set ($tmp = $!config.ContextConfig.put( 'subs', 'PRMonitorDeleteRAMap' ))
  $MapperEngine.subExecute('MapperServices/PR/PRDBReadConfig.vm', 'MapperServices/PR/PRMonitorDeleteRAMap.vm', $config)
</aqs:ActionRawDataDelete>
#end
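The template above calls a custom Java method, `$PRFunctions.getNumberDigitStr`, to format the county code and site number. The slides do not show its implementation. The sketch below assumes it left-pads a code with zeros to a fixed width (3 digits for the county code, 4 for the site number); both that behavior and the class name `PadSketch` are assumptions.

```java
// Hypothetical stand-in for a custom formatting method; the real PRFunctions class is not shown in the slides.
final class PadSketch {

    // Assumed behavior: left-pad the value with zeros up to the requested width.
    // Values already at or beyond the width are returned unchanged; missing values yield "".
    static String getNumberDigitStr(String value, int width) {
        if (value == null || value.trim().isEmpty()) {
            return "";
        }
        String digits = value.trim();
        StringBuilder padded = new StringBuilder();
        for (int i = digits.length(); i < width; i++) {
            padded.append('0');
        }
        return padded.append(digits).toString();
    }

    public static void main(String[] args) {
        System.out.println(getNumberDigitStr("7", 3));
        System.out.println(getNumberDigitStr("12", 4));
    }
}
```

Keeping such formatting in a small Java method lets the template stay declarative while the method remains easy to unit test.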

Page 14: Central Data Exchange Environmental Information Exchange Network

Data Transformation

• Flat file to unstructured text through custom Java (EnviroFlash)

## Column names for delimited text file
$MapperEngine.setMapReaderProperty('COL_NAMES_LIST',['CITY','COUNTY','STATE','UV_INDEX','UV_ALERT'])
## Delimiter
$MapperEngine.setMapReaderProperty('DELIMITER','\|')
## Loop for all records in text file
#foreach($row in $MapperEngine.getIterator())
  #if($templateCallback.isCitySubscribedTo($row.STATE, $row.CITY, $row.COUNTY))
    ## Use values from record as variable
    #set( $config = $MapperEngine.createMapperConfiguration() )
    #set ($tmp = $!config.ContextConfig.put( 'CITY', $row.CITY ) )
    #set ($tmp = $!config.ContextConfig.put( 'COUNTY', $row.COUNTY ) )
    #set ($tmp = $!config.ContextConfig.put( 'STATE', $row.STATE ) )
    #set ($tmp = $!config.ContextConfig.put( 'UV_INDEX', $row.UV_INDEX ) )
    #set ($tmp = $!config.ContextConfig.put( 'UV_ALERT', $row.UV_ALERT ) )
    #set ($tmp = $!config.ContextConfig.put( 'subscriberURL', $subscriberURL ) )
    #set ($tmp = $!config.ContextConfig.put( 'environmentName', $environmentName ) )
    #set ($tmp = $MapperEngine.subExecute('gov/epa/cdx/enviroflash/uv/templates/writeUVMailConfig.vm', 'gov/epa/cdx/enviroflash/uv/templates/writeUVMailMap.vm', $config) )
    #set ($outMail = $!MapperEngine.getObjectCacheMap().get('OUT_MAIL') )
    #set ($tmp = $templateCallback.sendEmail($outMail, $row.STATE, $row.CITY, $row.COUNTY, $row.UV_ALERT) )
  #end
#end
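The reader configuration in the template above (a column-name list plus a `'\|'` delimiter) suggests how a delimited line becomes a row whose fields are addressed by name, as in `$row.CITY`. The sketch below shows one way that could work. It assumes the delimiter is treated as a regular expression, which would explain the escaped pipe; the class `DelimitedRow` and the sample data are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical flat-file row parser; the Mapper's actual reader is not shown in the slides.
final class DelimitedRow {

    // Splits a line on the delimiter regex and pairs each value with its column name.
    static Map<String, String> parse(String line, String[] columns, String delimiterRegex) {
        // A limit of -1 keeps trailing empty fields instead of silently dropping them.
        String[] values = line.split(delimiterRegex, -1);
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i < columns.length; i++) {
            // Columns missing from a short line are filled with an empty string.
            row.put(columns[i], i < values.length ? values[i].trim() : "");
        }
        return row;
    }

    public static void main(String[] args) {
        String[] columns = {"CITY", "COUNTY", "STATE", "UV_INDEX", "UV_ALERT"};
        // Sample record invented for illustration.
        Map<String, String> row = parse("Durham|Durham|NC|8|Y", columns, "\\|");
        System.out.println(row.get("CITY") + " UV index: " + row.get("UV_INDEX"));
    }
}
```

A pipe is a regex metacharacter, so an unescaped `|` would split between every character; the escape in the template's `DELIMITER` property avoids that.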

Page 15: Central Data Exchange Environmental Information Exchange Network

Data Transformation

• Advantages
  • Concentrates mapping logic within the configuration file and custom methods.
  • Handles several data source types.
  • Decouples readers and writers.
  • Streams data to handle large files with low memory overhead (tested against 680 MB).
  • Supports custom Java methods.
  • Does not require a license fee.
  • Requires minimal coding.
  • Superior performance compared to commercial tools (XAware, BEA Liquid Data): 30 times faster on large data sets.

Page 16: Central Data Exchange Environmental Information Exchange Network

BPEL

• BPEL is a standard for orchestrating Web Services.
  • XML-based description of a business process
  • Contains references to supporting WSDL files
  • Portable between BPEL engines

• BPEL allows for a formal specification of business processes.
• BPEL meshes well with Service Oriented Architectures (SOA).
• BPEL provides several useful constructs
  • Transaction context management
  • Synchronous and asynchronous web service invocation and response
  • Conditional branching
  • Parallel flow activities
  • Fault handling and exception invocation
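Two of the constructs listed above, parallel flow and fault handling, have rough analogues in plain Java. The sketch below is an analogy only: it is not BPEL and involves no BPEL engine. The class `FlowSketch` and the two service steps are hypothetical stand-ins for web service invocations.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

// Analogy in plain Java for a BPEL parallel flow with a fault handler; names are hypothetical.
final class FlowSketch {

    // Runs two service calls in parallel, joins their results, and maps any failure to a fault message.
    static String process(Supplier<String> validate, Supplier<String> fetchMetadata) {
        CompletableFuture<String> validation = CompletableFuture.supplyAsync(validate);
        CompletableFuture<String> metadata = CompletableFuture.supplyAsync(fetchMetadata);
        return validation
                .thenCombine(metadata, (v, m) -> v + "; " + m)     // join point of the parallel branches
                .exceptionally(t -> "FAULT: " + rootMessage(t))    // fault handler for either branch
                .join();
    }

    // Unwraps the wrapper exceptions added by CompletableFuture to reach the original cause.
    private static String rootMessage(Throwable t) {
        while (t.getCause() != null) {
            t = t.getCause();
        }
        return t.getMessage();
    }

    public static void main(String[] args) {
        System.out.println(process(() -> "valid", () -> "3 documents"));
        System.out.println(process(() -> { throw new IllegalStateException("invalid XML"); },
                                   () -> "3 documents"));
    }
}
```

In BPEL the same structure is declared in XML and executed by the engine, which also handles persistence and long-running asynchronous exchanges that this in-process sketch does not cover.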

Page 17: Central Data Exchange Environmental Information Exchange Network

BPEL

Page 18: Central Data Exchange Environmental Information Exchange Network

BPEL

• BPEL within CDX
  • Motivations
    • Can it simplify the design of existing dataflows?
    • Can it reduce the cost of dataflow development?
    • Can it speed up the process of integrating CDX Web and Node applications?
    • Can it provide better visibility into existing flows?
  • Goals
    • Identify a target platform.
    • Demonstrate feasibility of deployment/integration.
    • Demonstrate ability to reuse existing CDX services.
    • Determine if BPEL allows for quick development of dataflow components.

Page 19: Central Data Exchange Environmental Information Exchange Network

BPEL

• Prototype specifics
  • Exposed generic CDX services (Java) as Web Services
    • XML validation
    • Retrieval of transaction/document metadata
    • Created a CDX Services project to host the web services
  • Model existing National Emissions Inventory (NEI) dataflow.
  • Enhance CDX infrastructure to support use of BPEL orchestration.
  • Configure a production-like environment to host the services.
    • Deploy ActiveBPEL engine (deployed within Tomcat)
    • Set up persistence of processes (Oracle DBMS)
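One of the generic services exposed in the prototype is XML validation. The slides do not show how CDX implements it. The sketch below shows the kind of Java logic such a service might wrap, using the JDK's `javax.xml.validation` API; the class `XmlValidationSketch` and the one-element schema are assumptions for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical validation helper; the actual CDX service is not shown in the slides.
final class XmlValidationSketch {

    // Returns true when the document conforms to the schema, false when validation reports an error.
    static boolean isValid(String xsd, String xml) {
        Schema schema;
        try {
            schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(xsd)));
        } catch (SAXException e) {
            // A broken schema is a caller error, not an invalid document.
            throw new IllegalArgumentException("Schema could not be parsed", e);
        }
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String xsd = "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
                   + "<xs:element name=\"StateCode\" type=\"xs:string\"/></xs:schema>";
        System.out.println(isValid(xsd, "<StateCode>72</StateCode>"));
        System.out.println(isValid(xsd, "<CountyCode>001</CountyCode>"));
    }
}
```

Wrapping a method like this behind a web service interface is what would let a BPEL process invoke it as one step of a dataflow.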

Page 20: Central Data Exchange Environmental Information Exchange Network

BPEL

Page 21: Central Data Exchange Environmental Information Exchange Network

BPEL

Page 22: Central Data Exchange Environmental Information Exchange Network

BPEL

Page 23: Central Data Exchange Environmental Information Exchange Network

BPEL

Page 24: Central Data Exchange Environmental Information Exchange Network

BPEL

• Findings
  • The BPEL prototype demonstrates feasibility in the EPA environment.
  • Cost savings appear achievable for future flows as the CDX service suite grows, but the size of the savings is not yet clear.
  • The learning curve is not insignificant.
  • Tools have not yet reached full maturity.

Page 25: Central Data Exchange Environmental Information Exchange Network

RUI Client

• Guidelines
  • Provide more features/capabilities than a web application is capable of delivering.
  • Provide flexible configuration for interaction with multiple Nodes.
  • Support all existing Exchange Network Web Services and dataflows.
  • Provide pluggable transformation/visualization for multiple dataflows (Mapper, XML binding).
  • Use NAAS for authentication/authorization.

Page 26: Central Data Exchange Environmental Information Exchange Network

RUI Client

Page 27: Central Data Exchange Environmental Information Exchange Network

RUI Client

Page 28: Central Data Exchange Environmental Information Exchange Network

RUI Client

Page 29: Central Data Exchange Environmental Information Exchange Network

RUI Client

Page 30: Central Data Exchange Environmental Information Exchange Network

RUI Client

• Current capabilities
  • Supports submit, download, and transaction history search
  • Supports configurable data transformation
  • Supports NAAS authentication/authorization

• Future capabilities
  • Support query and data visualization
  • Add ability to sign/encrypt documents (CROMERR)

Page 31: Central Data Exchange Environmental Information Exchange Network

Geographic Data Interaction

• Some dataflows have geographic data (e.g. FRS)
  • Provide the capability to visualize data
  • Provide the capability to update the data

• APIs exist for addressing geographic data
  • Google Maps
  • ESRI products suite

• CDX approach
  • Integrate Google Maps API into CDX web applications
  • Provide end-to-end solution for querying and updating data