A Primer for Data Methodology in the Cloud: Making Data Governance Work in Hybrid Environments Dr. Brand Niemann Director and Senior Data Scientist Semantic Community July 28, 2011 1


A Primer for Data Methodology in the Cloud: Making Data Governance Work in Hybrid Environments

Dr. Brand Niemann
Director and Senior Data Scientist
Semantic Community
July 28, 2011


Webinar Description

• Establishing a foundation for data governance has never been more critical as federal agencies face more data center consolidation pressures. Many agencies are following the IT trend of breaking their problems into smaller pieces to make a complex problem more solvable. Your agency may be planning to send “some data” and “some applications” to the cloud, but do you have a methodology for optimizing your data once it’s spread across a hybrid environment?

• Join us to learn what you need to do to lay the groundwork for a good data governance program to support your agency’s consolidation goals:
  – Create views and models of your architecture
  – Maintain clear definitions of data, involved applications/systems and process flows
  – Leverage metadata for data governance processes
  – And clearly define the integration and interfaces among the various platform tools and between platform tools with other repositories and vendor tools


Speakers

• Moderator: Michael Smoyer, President, Digital Government Institute
  – The moderator will introduce speakers, coordinate logistics and Q&A with the "virtual" attendees.
• David Lyle, VP Product Strategy, Office CTO, Informatica
  – Co-author of “Lean Integration: An Integration Factory Approach to Business Agility”
• Brand Niemann, Director and Senior Data Scientist, Semantic Community
  – Author of over 50 Data Science Products in the Cloud for the US EPA and Data.gov


David Lyle

• He has co-authored two books; the latest, “Lean Integration: An Integration Factory Approach to Business Agility”, was published by Addison-Wesley last year. The book shows how “Lean” and “Agile” thinking can be applied to information management projects: because these projects follow a relatively small number of repeating patterns, an assembly-line approach to those patterns delivers information to the business far faster, with less risk and at lower cost than traditional approaches.

• He spoke at DGI’s EA Conference about how the acceleration in volumes of data, together with the acceleration in technological “options” (cloud, appliances, SOA, etc.), makes this problem (called the “integration hairball” in the book) even worse.

• With Lean principles (focus on the customer, eliminate waste in processes from the customer’s perspective, and use technology to manage this complexity more efficiently), we have a fighting chance, not to make the simple tasks mundane, but to make the seemingly impossible tasks manageable.

• The goal is to create a better IT world where the “customer/citizen” can self-serve (when appropriate), while IT keeps visibility, oversight and governance of what the “customer/citizen” is up to.

http://www.linkedin.com/in/davelyle


Brand Niemann

• Dr. Brand Niemann is the Director and Senior Data Scientist of the Semantic Community. He was formerly Senior Enterprise Architect and Data Scientist at the U.S. Environmental Protection Agency and co-led the Federal CIO Council’s Semantic Interoperability Community of Practice (SICOP) with Mills Davis from 2003 to 2008. He is currently authoring a series of editorials for Federal Computer Week on his work and recently made Spotfire's Twitter list for his cool visualizations of government data, which aim to produce more transparent, open and collaborative business analytics applications.
  – http://semanticommunity.info/A_Gov_2.0_spin_on_archiving_2.0_data
  – http://spotfireblog.tibco.com/?p=5328
• He is working as a data journalist for AOL Government, due to launch July 11th.
  – http://semanticommunity.info/AOL_Government
• He is also helping organize the 12th SOA for eGov Conference, October 11th.
  – http://semanticommunity.info/Federal_SOA


Preface

• Thank you for the opportunity to present.
• Primer (basic), Methodology (real-world example), and Cloud (tools I used).
• Real-world example: EPA Apps for the Environment Challenge – a good place to start and learn, since agency data governance is already in place and can be built on!
• Some metrics: About 50 Data Products, over 100 Spotfire Visualizations, nine Data Stories for Federal Computer Week this year and 15 for AOL Government. Google “AOL Government Brand Niemann” to see the three that have been published since the July 13th launch.


Overview

• Data Center Consolidation Initiative: Send agency data to Data.gov and to the Cloud and close data centers.
  – My solution was and is: Put My EPA Desktop in the Cloud in Support of the Open Government Directive and a Data.gov/Semantic
    • Published a Paper April 19, 2010
• Data Governance Program to Support Your Agency’s Consolidation Goals:
  – My solution was and is:
    • Create views and models of your architecture
    • Maintain clear definitions of data, involved applications/systems and process flows
    • Leverage metadata for data governance processes
    • And clearly define the integration and interfaces among the various platform tools and between platform tools with other repositories and vendor tools
  – Using the EPA Apps for the Environment Challenge


EPA Apps for the Environment Challenge

• Applications for the challenge must use EPA data and be accessible via the Web or a mobile device. EPA experts will select a winner and runner up in each of two categories: Best Overall App and Best Student App. In addition, the public will vote for a “People’s Choice” winner. Apps will be judged based on their usefulness, innovation, and ability to address one or more of EPA Administrator Lisa P. Jackson’s seven priorities for EPA’s future. Winners will receive recognition from EPA on the agency’s website and at an event in Washington, DC in the fall, where they can present their apps to senior EPA officials and other interested parties.

Source: http://www.epa.gov/appsfortheenvironment/


EPA Apps for the Environment Challenge

• EPA challenges you to find new ways to combine and deliver environmental data in a new app. In the Apps for the Environment challenge, you have free rein to make an app that uses EPA data, addresses one of Administrator Lisa Jackson’s Seven Priorities, and is useful to communities or individuals. EPA encourages you to use other environmental and health data too. The winners will be honored at a recognition event in Washington, D.C. this fall and the winning apps will be publicized on EPA’s website.

Source: http://www.epa.gov/appsfortheenvironment/


Create views and models of your architecture

http://semanticommunity.info/AOL_Government/EPA_Announces_Apps_for_the_Environment_Challenge#Apps_for_the_Environment

Unstructured to structured information view and model
Supports Sitemap.org and Schema.org Protocol


Maintain clear definitions of data, involved applications/systems and process flows

http://semanticommunity.info/@api/deki/files/13015/=EPAApps.xlsx

Data set inventory and data element dictionary
Work flow for Phases I (Preparation) and II (Applications)


Leverage metadata for data governance processes

http://semanticommunity.info/EPA/EPA_Toxic_Release_Inventory_2009#Record_Layout

The EPA TRI 2009 has 99 data elements defined in a 30-page PDF file that was exposed here with well-defined URLs (Getting to the Five Stars of Linked Open Data)
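
Giving each data element in a record layout its own addressable URL can be sketched roughly as below. This is only an illustration: the element names and the rule for turning a name into a URL fragment are assumptions, and the wiki's actual anchor scheme may differ. Only the base page URL comes from the slide.

```python
import re

# Base page taken from the slide; everything after "#" is an assumed scheme.
BASE = "http://semanticommunity.info/EPA/EPA_Toxic_Release_Inventory_2009"


def anchor(name):
    """Turn an element name into a URL fragment (assumed rule:
    runs of non-alphanumeric characters become one underscore)."""
    return re.sub(r"[^A-Za-z0-9]+", "_", name.strip()).strip("_")


def element_urls(names):
    """Map each data element name to a stable, linkable URL."""
    return {name: f"{BASE}#{anchor(name)}" for name in names}


# Hypothetical element names, not the actual TRI record layout.
elements = ["Facility Name", "Reporting Year", "Total Releases (lbs)"]
urls = element_urls(elements)
for name, url in urls.items():
    print(f"{name} -> {url}")
```

Once every element has a stable URL like this, a data dictionary or visualization can link straight to the definition instead of pointing at a page number in a PDF.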


And clearly define the integration and interfaces among the various platform tools and between platform tools with other repositories and vendor tools

PC Desktop Spotfire

The Data sets and data dictionaries and links to data sources and metadata are integrated here


And clearly define the integration and interfaces among the various platform tools and between platform tools with other repositories and vendor tools

Spotfire Web Player

Phase I identifies Data Quality Issues: The Guam Brownfields site is obviously mis-located (see outlier to extreme right in the Scatter Plot below). It should be a negative Longitude and have a larger value.
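
The kind of check that a scatter plot makes visible can also be scripted. The sketch below flags sites whose longitude lies far from the rest of the data set, using the median absolute deviation. The site records, the threshold, and the function name are all hypothetical, and the check only marks records for human review; it cannot say what the correct coordinate is.

```python
from statistics import median


def flag_longitude_outliers(sites, threshold=5.0):
    """Return names of sites whose longitude is far from the median.

    sites: list of (name, latitude, longitude) tuples.
    A site is flagged when |lon - median| / MAD exceeds the threshold
    (the threshold value is illustrative, not a standard).
    """
    lons = [lon for _, _, lon in sites]
    med = median(lons)
    mad = median([abs(lon - med) for lon in lons])
    if mad == 0:
        # All typical values are identical; anything different stands out.
        return [name for name, _, lon in sites if lon != med]
    return [name for name, _, lon in sites
            if abs(lon - med) / mad > threshold]


# Hypothetical records: five plausible western-hemisphere longitudes and
# one record whose longitude has the opposite sign.
sites = [
    ("Site A", 38.9, -77.0),
    ("Site B", 41.9, -87.6),
    ("Site C", 34.1, -118.2),
    ("Site D", 47.6, -122.3),
    ("Site E", 29.8, -95.4),
    ("Site F", 35.0, 98.0),
]
flagged = flag_longitude_outliers(sites)
print(flagged)
```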


And clearly define the integration and interfaces among the various platform tools and between platform tools with other repositories and vendor tools

Socrata at Data.gov


And clearly define the integration and interfaces among the various platform tools and between platform tools with other repositories and vendor tools

• Smart Mapping: Automatic Creation of Information Models:
  – Spotfire 3.3 Information Services users can automatically generate 1-to-1 mappings of the existing tables and columns in their Data Sources. Just generate a Data Source in Spotfire, then right click it and select “Create Default Information Model…” This helps a lot when the work has already been done to nicely model and expose tables for business applications such as Spotfire, so the mapping step is more about transparency than transformation. For example, if you use Spotfire Application Data Services, you do the work in ADS to expose Spotfire-ready tables and columns, so a simple transparent mapping of those elements through Spotfire Information Services can now be accomplished in one click. Note that the automated creation will work through nested levels of data objects in the data source you supply.
  – The result is a folder structure that matches the catalogs, schemas etc. that were selected with a column element for each column and an information link for each table containing those column elements. Procedures will get a procedure element and an information link of their own if they return data.
  – See next slide.


http://semanticommunity.info/@api/deki/files/10975/=Whats_New_in_Spotfire_3.3.pdf
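
To make the shape of such a 1-to-1 mapping concrete, here is a plain-Python sketch. It does not use or imitate the Spotfire API; the function name, the nested-dictionary input, and the entry fields are all assumptions chosen for illustration. It walks catalog, schema, and table levels and emits one column element per column plus one information link per table.

```python
def default_information_model(source):
    """Build a 1-to-1 model from {catalog: {schema: {table: [columns]}}}.

    Returns a list of entries: one "column" element per column and one
    "information_link" per table that references that table's columns.
    """
    model = []
    for catalog, schemas in source.items():
        for schema, tables in schemas.items():
            folder = f"{catalog}/{schema}"  # folder mirrors the nesting
            for table, columns in tables.items():
                for column in columns:
                    model.append({"folder": folder, "kind": "column",
                                  "name": f"{table}.{column}"})
                model.append({"folder": folder, "kind": "information_link",
                              "name": table,
                              "columns": [f"{table}.{c}" for c in columns]})
    return model


# Hypothetical data source layout, for illustration only.
source = {"epa": {"tri": {"facility": ["id", "name"],
                          "release": ["facility_id", "pounds"]}}}
model = default_information_model(source)
for entry in model:
    print(entry["folder"], entry["kind"], entry["name"])
```

Because nothing is renamed or transformed, the mapping documents what already exists in the source, which is the "transparency rather than transformation" point made above.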


And clearly define the integration and interfaces among the various platform tools and between platform tools with other repositories and vendor tools



And clearly define the integration and interfaces among the various platform tools and between platform tools with other repositories and vendor tools

• Semantic Community Workflow:
  – Information Architecture of Public Web Pages in Spreadsheets as Linked Open Data.
  – Public Reports (Web and PDF) in Wiki as Linked Open Data.
  – Desktop and Network Databases in Wiki and Spreadsheets in Linked Open Data Format.
  – Spreadsheets in Spotfire as Linked Open Data.
  – Spreadsheets in Semantic Insights Research Assistant for Semantic Search, Report Writing, and Ontology Development.


Questions and Answers

• Now and Later:
  – Brand Niemann
  – Director and Senior Data Scientist
  – Semantic Community
  – http://semanticommunity.info
  – [email protected]


Supplemental Slides

• 7.1 Semantic Technology Training: Building Knowledge-Centric Systems
  – KM 2011
  – SemTech 2011
• 7.2 W3C Government Linked Data Working Group
  – Clinical Quality Linked Data on Health.data.gov
  – Build Clinical Quality Linked Data on Health.data.gov in the Cloud
  – Hospital Compare Downloadable Database Example of "5 Star Government Data"
• 7.3 Library of Congress Project Recollection and Digital Preservation Initiative
• 7.4 Elsevier/Tetherless World Health and Life Sciences Hackathon (27-28 June 2011)
  – Build TWC in the Cloud
  – Build NCI CLASS in the Cloud
  – Build the NYC Data Mine Health in the Cloud
  – Build SciVerse Apps in the Cloud (IN PROCESS)
• 7.5 Be Informed (IN PROCESS)


7.1 Semantic Technology Training: Building Knowledge-Centric Systems

http://semanticommunity.info/FOSE_Institute/Knowledge_Management


7.1 Semantic Technology Training: Building Knowledge-Centric Systems

http://semanticommunity.info/Semantic_Technology_Conferences


7.2 W3C Government Linked Data Working Group

• The mission of the Government Linked Data (GLD) Working Group is to provide standards and other information which help governments around the world publish their data as effective and usable Linked Data using Semantic Web technologies.

• This group will develop standards-track documents and maintain a community website in order to help governments at all levels (from small towns to nations) share their data as high quality ("five-star") linked data.

• The Working Group will construct and maintain an online directory of the government linked data community.

• "Cookbook" Advice Site
• The group will produce Best Practices for Publishing Linked Data.
• The group will develop Standard Vocabularies.
• First Face-to-Face Meeting, June 29-30th, NSF, Arlington, VA.

http://www.w3.org/2011/gld/charter


7.2 Open Public Dataset Catalogs Faceted Browser

http://semanticommunity.info/Data.gov/An_Open_Data_Public_Dataset_Catalogs_Faceted_Browser


7.2 Linked Data Cookbook

• Linked Data is an evolving set of techniques for publishing and consuming data on the Web. Learn how Linked Data can turn the Web into a distributed database and how you can participate. In this session, Bernadette Hyland takes the mystery out of Linked Data by summarizing seven steps to prepare your data sets as Linked Data and announce it so others will use it.
  – Model without context: There is a Process: Identify, Model, Name, Describe, Convert, Publish, and Maintain. I Disagree!
• Participants will understand the actual steps to produce high quality, useful data sets that can be modeled, transformed, documented and available on the Linked Data cloud. We'll discuss a recent government agency that did just this in less than 12 weeks. Best practices for data publishing as well as the "social contract" one makes as a publisher will be discussed.
  – Better to make progress with something rather than do nothing because we cannot be comprehensive and complete. I Disagree!
• Bernadette oversees strategy for Talis' North American clients. She brings a strong background in commercial and government data management strategies, coupled with expertise in leading high-growth software organizations. Prior to joining Talis, Bernadette was CEO of several profitable Internet companies delivering scalable Web-based solutions for the enterprise, including Zepheira LLC and Tucana Technologies Inc., a pioneer in the emerging semantic technology community.

http://semtech2011.semanticweb.com/sessionPop.cfm?confid=62&proposalid=3822


7.2 Linked Data Cookbook

• 1. Leverage what exists.
  – Obtain data extracts (i.e., databases and/or spreadsheets) or create data in a way that can be replicated.
• 2. Model data without context to allow for reuse and easier merging of data sets.
  – With LD, application logic does not drive the data schema, concepts, etc.
• 3. Look for real world objects of interest (e.g., people, places, things, locations, etc.) and model them.
  – Use common sense to decide whether or not to make a link. I Disagree!
• 4. Connect data from different sources and authoritative vocabularies (see list of popular vocabularies below).
  – Put aside immediate needs of any application. I Disagree!
  – Don’t think about how an application will use your data. I Disagree!
• 5. Write a script or process to convert the data set repeatedly.
• 6. Publish to the Web and announce it! (more details shortly).
• 7. Maintenance strategy (more details in the social contract at the end).

http://www.slideshare.net/bhylandwood/bernadette-hyland-semtech-2011-west-linked-data-cookbook
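
Step 5 (a script that converts the data set repeatedly) might look something like the sketch below, which turns CSV rows into N-Triples lines. It is a minimal illustration, not the cookbook's own script: the `http://example.org/` namespace, the `id/` and `def/` path segments, and the sample rows are hypothetical. Because the output depends only on the input, re-running it on a refreshed extract gives a consistent result.

```python
import csv
import io
from urllib.parse import quote

BASE = "http://example.org/"  # hypothetical namespace, not from the slides


def literal(value):
    """Escape a string as an N-Triples literal."""
    escaped = (value.replace("\\", "\\\\").replace('"', '\\"')
               .replace("\n", "\\n").replace("\r", "\\r"))
    return f'"{escaped}"'


def csv_to_ntriples(text, id_column):
    """Convert CSV text to N-Triples lines, one per non-empty cell."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        # The identifier column names the resource; it is not emitted
        # as a property of its own.
        subject = f"<{BASE}id/{quote(row[id_column], safe='')}>"
        for column, value in row.items():
            if column == id_column or not value:
                continue
            predicate = f"<{BASE}def/{quote(column, safe='')}>"
            triples.append(f"{subject} {predicate} {literal(value)} .")
    return triples


# Hypothetical extract; the second row has an empty "state" cell.
sample = "id,name,state\n101,Acme Plant,OH\n102,Bolt Works,\n"
lines = csv_to_ntriples(sample, "id")
print("\n".join(lines))
```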


7.2 Linked Data Cookbook

• Guidelines for merging:
  – URIs name the resources we are describing.
  – Two people using the same URI are describing the same thing.
  – The same URI in two datasets means the same thing.
  – Graphs from several different sources can be merged.
  – Resources with the same URI are considered identical.
  – No limitations on which graphs can be merged.
• For a government agency ... a data policy is “a must”:
  – specify data quality and retention, treatment of data through secondary sources, restrictions for use, frequency of updates, public participation, and applicability of this data policy. I Agree!

http://www.slideshare.net/bhylandwood/bernadette-hyland-semtech-2011-west-linked-data-cookbook
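
The merging guidelines above amount to a set union over triples. The sketch below models a graph as a Python set of (subject, predicate, object) tuples; this is a simplification that assumes no blank nodes, and the URIs and values are hypothetical. Statements about the same URI from different sources end up describing one resource, and identical statements collapse.

```python
def merge_graphs(*graphs):
    """Merge any number of graphs (sets of triples) by set union."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def describe(graph, uri):
    """Return every (predicate, object) pair stated about one URI."""
    return sorted((p, o) for s, p, o in graph if s == uri)


# Hypothetical URIs and values, for illustration only.
HOSPITAL = "http://example.org/id/hospital/1"
NAME = "http://example.org/def/name"
CITY = "http://example.org/def/city"

source_a = {(HOSPITAL, NAME, "General Hospital")}
source_b = {(HOSPITAL, CITY, "Springfield"),
            (HOSPITAL, NAME, "General Hospital")}  # repeats source_a's fact

merged = merge_graphs(source_a, source_b)
print(describe(merged, HOSPITAL))
```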


7.2 Linked Data Cookbook

http://www.slideshare.net/bhylandwood/bernadette-hyland-semtech-2011-west-linked-data-cookbook


7.2 Clinical Quality Linked Data on Health.data.gov

http://www.data.gov/communities/node/81/blogs/4920

See Next Slide


7.2 Clinical Quality Linked Data on Health.data.gov

http://health.data.gov/def/hospital/Hospital


7.2 Clinical Quality Linked Data on Health.data.gov

http://health.data.gov/doc/hospital/393303.csv


7.2 Clinical Quality Linked Data on Health.data.gov

http://www.slideshare.net/george.thomas.name/clinical-quality-linked-data-on-healthdatagov


7.2 Health data innovation 'at a crawl'

• The health care data community should step up its efforts to innovate to help improve the nation’s health outcomes and reduce costs, Health and Human Services Secretary Kathleen Sebelius said at the department’s second Health Data Initiative Forum on June 9.

• “Use tools and use data,” Sebelius said at the forum held at the National Institute of Medicine campus in Bethesda, Md. “Do it more, do it better and do it faster.”

• Sebelius said Americans experience a “triple loss” due to having the highest public health care costs, highest private health care costs, and only mediocre health outcomes.

• The goal of the conference was to present 45 winning health care IT applications developed with HHS’ newly-available data sets within the last several months. HHS CTO Todd Park called the event a “Health Data Palooza” that would showcase innovation in health IT.
  – PearlDiver Technologies Inc. and Semantic Community were among the finalists!

http://fcw.com/articles/2011/06/09/nation-needs-more-health-data-innovation-sebelius-says-at-forum.aspx


PearlDiver Data Engine & Semantic Community Data Visualization

Benjamin Young, PearlDiver Technologies Inc.
Brand Niemann, Semantic Community

Health Data Initiative Forum Submission

Medicare Zombie Hunter


7.2 Build Clinical Quality Linked Data on Health.data.gov in the Cloud

http://semanticommunity.info/Semantic_Technology_Conferences/Clinical_Quality_Linked_Data_on_Health.data.gov


7.2 Build Clinical Quality Linked Data on Health.data.gov in the Cloud

http://semanticommunity.info/Semantic_Technology_Conferences/Clinical_Quality_Linked_Data_on_Health.data.gov/Hospital_Compare_Downloadable_Database_Metadata


7.2 Build Clinical Quality Linked Data on Health.data.gov in the Cloud

PC Desktop Spotfire


7.2 Build Clinical Quality Linked Data on Health.data.gov in the Cloud

Spotfire Web Player


7.3 Library of Congress Project Recollection and Digital Preservation Initiative

The Library of Congress and MIT are developing a Semantic Web browser (Exhibit, and now Exhibit 3) to do essentially what Spotfire already does!


7.3 Library of Congress Project Recollection and Digital Preservation Initiative

PC Desktop Spotfire


7.3 Library of Congress Project Recollection and Digital Preservation Initiative

http://semanticommunity.info/Semantic_Technology_Conferences/Library_of_Congress


7.4 Elsevier/Tetherless World Health and Life Sciences Hackathon (27-28 June 2011)

http://semanticommunity.info/Build_TWC_in_the_Cloud


7.4 NYC Data Web

http://knoodl.com/ui/groups/NYC_Homepage


7.4 NYC Data Web

http://semanticommunity.info/Semantic_Technology_Conferences/NY_Data_Mine/Revelytix

Quote: Ontology architecture is a new aspect of system architecture and development, to our knowledge it has not been employed anywhere else in DOD.


7.4 NYC Data Web

http://semanticommunity.info/Semantic_Technology_Conferences/NY_Data_Mine/Revelytix#Dashboard


7.4 NYC Data Web

PC Desktop Spotfire


7.4 NYC Data Web

PC Desktop Spotfire


7.5 Be Informed

• A recent paper describes the formalism and rationale that Be Informed applies to business process modeling. It explains how and why goal-oriented modeling differs from more conventional business process modeling which is procedural. In the near-term, there is applicability for many government agencies, especially for those exploring semantic approaches.

• For example, Dennis Wisnosky advocates semantic web (RDF & OWL) standards for modeling data integration, and a dialect of BPMN for modeling processes. The metaphor for processes is an electronic circuit specification that uses standard building blocks. "We all know what those primitives mean." Previous, costly attempts at business process modeling were failures in part because there was no standard at the primitive level.

• However, as this paper makes clear, just having unambiguous primitives is only part of what is needed to specify and manage complex and dynamic business processes. Modeling flow in swim lanes is less agile than modeling goals, activities, and pre and post conditions.

Source: Mills Davis, Project10x, July 5, 2011.
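
The contrast between procedural flow and modeling with pre and post conditions can be illustrated with a small sketch. This is not Be Informed's meta model or product; the `Activity` class, the fact names, and the selection rule are hypothetical. Each activity declares what must hold before it can run and what holds afterwards, and the order of execution is derived from those conditions rather than drawn as a fixed sequence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    name: str
    pre: frozenset   # facts that must hold before the activity can run
    post: frozenset  # facts that hold once the activity has run


def enabled(activities, facts):
    """Activities whose preconditions hold and that still add something."""
    return [a for a in activities
            if a.pre <= facts and not a.post <= facts]


def run(activities, facts, goal):
    """Run enabled activities until the goal facts hold or nothing can run."""
    facts = set(facts)
    trace = []
    while not goal <= facts:
        ready = enabled(activities, facts)
        if not ready:
            break
        step = ready[0]
        trace.append(step.name)
        facts |= step.post
    return trace, facts


# Hypothetical case-handling activities, deliberately listed out of order.
activities = [
    Activity("decide", frozenset({"registered", "eligibility_checked"}),
             frozenset({"decided"})),
    Activity("check eligibility", frozenset({"registered"}),
             frozenset({"eligibility_checked"})),
    Activity("register", frozenset(), frozenset({"registered"})),
]
trace, facts = run(activities, set(), {"decided"})
print(trace)
```

Adding or removing an activity changes only its own conditions; no flow diagram has to be redrawn, which is one way to read the agility claim above.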


7.5 Be Informed

Source: Specifying Flexible Business Processes using Pre and Post Conditions, Jeroen van Grondelle and Menno Gulpers, Be Informed BV, Apeldoorn, The Netherlands, 13 pp.

Fig. 1. Summary of the Meta Model for Capturing Business Processes