Inverted World: Open Data, Open Government Astana, Kazakhstan Oct. 4-5, 2014 By Eric Kavanagh, CEO The Bloor Group



We Live In Interesting Times!

Are we living in a new era of the Child Emperor?

The world of media has gone from a Push to a Pull model.

Craigslist guts revenues for major newspapers by 50%.

With newspapers, what you read was nobody’s business.

When you browse online today, every motion you make can be monitored and tracked!

What happens when practices invert?

Mark Zuckerberg

Why Open Government? Trust!

“If you always tell the truth, you don’t have to remember anything!”

Mark Twain

Trust breeds trust, while mistrust breeds mistrust.

Trust is the foundation of all social and civil transactions.

Without trust, chaos will ensue.

What Happens When Opacity Rules?

World Bank Data in 2000 stated that US GDP was $10T

In that same year, Russia was listed at $340B!

How could one of two Super Powers be nearly 30x the size of the other, in terms of GDP?

The Black Market!

Russia was still mired in practices & behaviors that stemmed from decades of oppression

Most transactions, small and large, took place under the radar of government officials

Lack of openness on rules and data created widespread mistrust

Lack of trust leads to unrest of various kinds!

Time to Return to First Principles?

Philosopher Lao Tzu cautioned against complicated legal codes 2,500 years ago

“When the laws are complex, the bandits will abound!”

You cannot consciously obey laws you don’t get

Enforcement = Arbitrary

Ancient Wisdom Still Applies Today!

All Roads Lead to Open Government?

Policies must be defined to determine which data sets should be open

Procedures must be designed to enable a smooth flow of data

Practices must be devised that enable all parties to collaborate

This Will Take Time!

Today’s Governments Must Embrace Transparency

But How?

Why Not Open Government?

All data is open to someone! If you don’t open the door, someone else will!

Talk about an inverted world! A US Citizen gets political asylum in Russia? The former Soviet Union?

When you try to control too tightly, you can often lose all control!

How Can Open Government Happen?

An ideal solution exists, and it’s called the Hadoop Distributed File System (HDFS)

HDFS is the foundation of Hadoop, and it’s an open-source system, aka FREE!

HDFS scales out across clusters of commodity servers, and is designed to not lose data: each block is replicated on multiple machines

This foundation can serve as the ultimate storage area for open data
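To make the "designed to not lose data" point concrete, here is a toy sketch of block replication. It assumes a replication factor of 3 (the HDFS default) and a simple round-robin placement; real HDFS placement is rack-aware and more involved, and the node names and function names below are hypothetical.

```python
def place_blocks(num_blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes (toy round-robin)."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    placement = {}
    for block in range(num_blocks):
        # Shift the starting node per block so replicas spread across the cluster.
        placement[block] = [nodes[(block + i) % len(nodes)]
                            for i in range(replication)]
    return placement


def surviving_blocks(placement, failed_nodes):
    """Return the blocks that still have a replica on a healthy node."""
    failed = set(failed_nodes)
    return {block for block, replicas in placement.items()
            if set(replicas) - failed}


# Hypothetical five-node cluster holding a ten-block file.
nodes = ["node-a", "node-b", "node-c", "node-d", "node-e"]
placement = place_blocks(num_blocks=10, nodes=nodes)

# With three replicas on distinct nodes, any two failures leave every block readable.
print(len(surviving_blocks(placement, ["node-a", "node-b"])))
```

In this sketch, data is lost only when every node holding a given block fails at once, which is why replication rather than any single reliable disk is what protects the data.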

Hadoop Players: Strengths & Weaknesses

Purveyors of the pure source code; but navigation is difficult.

$1B invested, esp. by Intel; proprietary enterprise software.

Purely open-source approach; serious ‘support’ costs.

Focus on traditional data management; proprietary.

Enterprise hardened, tandem with Vertica; proprietary.

Hadoop Is Not a Data Warehouse

Big Data and the Data Reservoir

Just the Data Will Not Be Enough

Data without the context of process and meaning provides no value

Open Government also requires transparency of process: who does what, when, why, and how?

The complete picture must be viewable such that public eyes can help

Collaboration 3.0? Many Hands…

A range of functionality can enable citizens to assist the government

Questionable expenses or processes can be flagged; a critical mass leads to formal review

Registered users who find valuable issues will earn points and credibility in this meritocracy!
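The flagging idea above could be sketched roughly as follows. The threshold of three distinct users, the ten-point reward, and all class and method names are assumptions made for illustration; the slides do not specify them.

```python
from collections import defaultdict

REVIEW_THRESHOLD = 3            # assumed: distinct flags needed for formal review
POINTS_PER_CONFIRMED_FLAG = 10  # assumed: reward when a flagged issue is confirmed


class FlagBoard:
    """Tracks citizen flags per item and points per registered user."""

    def __init__(self, threshold=REVIEW_THRESHOLD):
        self.threshold = threshold
        self.flags = defaultdict(set)   # item id -> ids of users who flagged it
        self.points = defaultdict(int)  # user id -> accumulated points

    def flag(self, item_id, user_id):
        """Record a flag; return True once the item reaches critical mass."""
        self.flags[item_id].add(user_id)  # a set ignores repeat flags by one user
        return self.needs_review(item_id)

    def needs_review(self, item_id):
        return len(self.flags.get(item_id, ())) >= self.threshold

    def confirm(self, item_id):
        """A reviewer confirmed the issue: credit every user who flagged it."""
        for user_id in self.flags.get(item_id, ()):
            self.points[user_id] += POINTS_PER_CONFIRMED_FLAG


board = FlagBoard()
board.flag("expense-42", "alice")
board.flag("expense-42", "alice")   # duplicate, does not count twice
board.flag("expense-42", "bob")
reached = board.flag("expense-42", "carol")
board.confirm("expense-42")
```

Counting distinct users rather than raw flags is one way to keep a single account from forcing a review on its own.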

Beware the Specter of Data Quality

Data Quality is notoriously bad in organizations and government entities

Problems go far beyond misspelled names and bad addresses

Business logic and rules trapped in legacy apps

Faulty integration points

Metadata out of sync with data values

Key Code Analysis

– Invoice data sets extracted with correlation
  • CAGE: 984274, DUNS: 973437
– FPDS DUNS and Names extracted & correlated
  • 158181 unique DUNS codes
– Will be included in normalized composite IT Asset records
– Composite records for lookup added to Hadoop
  • By DUNS or Global DUNS: get all related DUNS, CAGE, names
  • By CAGE: get all related DUNS, names
  • By name: get all related DUNS, CAGE, names
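The three lookups listed above can be sketched as a single inverted index over composite records. This is an in-memory illustration only, not the Hadoop job the slides describe; the sample rows, identifiers, and function names are invented.

```python
from collections import defaultdict

# Invented sample rows; the identifiers are placeholders, not real codes.
ROWS = [
    {"duns": "D1", "global_duns": "G1", "cage": "C1", "name": "acme corp"},
    {"duns": "D1", "global_duns": "G1", "cage": "C2", "name": "acme corporation"},
    {"duns": "D2", "global_duns": "G1", "cage": "C3", "name": "acme europe"},
    {"duns": "D3", "global_duns": "G2", "cage": "C4", "name": "other llc"},
]

FIELDS = ("duns", "global_duns", "cage", "name")


def build_index(rows):
    """Index every row under each of its (field, value) pairs."""
    index = defaultdict(list)
    for row in rows:
        for field in FIELDS:
            index[(field, row[field])].append(row)
    return index


def related(index, field, value):
    """Return the distinct values of all other fields for matching rows."""
    matches = index.get((field, value), [])
    return {f: sorted({row[f] for row in matches})
            for f in FIELDS if f != field}


index = build_index(ROWS)
by_global = related(index, "global_duns", "G1")  # related DUNS, CAGE, names
by_cage = related(index, "cage", "C2")           # related DUNS, names
```

Because every record is indexed under each of its keys, a lookup by DUNS, CAGE, or name is the same operation with a different field name.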

Number CAGE Per DUNS Code

[Chart: number of DUNS codes (log scale) with X CAGE codes, for X from 1 to 119]

One DUNS code has 119 CAGE
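A distribution like this one (how many DUNS codes have X CAGE codes) can be computed with a grouping step followed by a count. The pairs below are invented sample data and the function name is hypothetical; this is a sketch, not the analysis code behind the chart.

```python
from collections import Counter, defaultdict


def cage_count_distribution(pairs):
    """Map X -> number of DUNS codes that have exactly X distinct CAGE codes."""
    cages_by_duns = defaultdict(set)
    for duns, cage in pairs:
        cages_by_duns[duns].add(cage)  # sets drop repeated (DUNS, CAGE) pairs
    return Counter(len(cages) for cages in cages_by_duns.values())


# Invented (DUNS, CAGE) pairs, including one duplicate.
PAIRS = [
    ("D1", "C1"), ("D1", "C2"), ("D1", "C2"),
    ("D2", "C3"),
    ("D3", "C4"), ("D3", "C5"), ("D3", "C6"),
]

distribution = cage_count_distribution(PAIRS)
for x in sorted(distribution):
    print(x, distribution[x])
```

A long tail in the output, such as a single code with over a hundred CAGE codes, is the kind of outlier worth reviewing by hand.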

CAGE Codes from LookUp File

[Bar chart: category ToWAWF; axis in millions, 0 to 1.4; series: Found, NotFound]

FPDS Number DUNS with N Global DUNS

[Chart: number of DUNS codes (log scale) with N Global DUNS, for N from 0 to 5]

FPDS: Number DUNS with N Names

[Chart: number of DUNS codes (log scale) with N names, for N from 1 to 112]

6849 instances for code = 123456787

FPDS: Number Global DUNS with N DUNS

[Chart: Number Global DUNS (log scale) against Number DUNS, 0 to 250]

FPDS: Global DUNS with Multiple Names

[Chart: Number Global DUNS (log scale) against Number Names, 0 to 1400]

FPDS DUNS Code Matches to WAWF Codes

[Bar chart: Found and NotFound counts for DUNS and GlobalDUNS; axis 0 to 180000; data labels 140827, 13302, 17363, 942]

DUNS       NGlobalDUNS  Nnames
123456787  0            6849
136666505  0            112
790238851  0            96
103933453  1            35
103385519  1            33
005149120  1            27
067641597  1            25
005103494  0            24
332619535  0            24
020751082  1            22
054781240  1            22
621599893  1            21
790238638  0            21
834476079  1            21

FPDS DUNS With Most Names

123456787 miscellaneous foreign contractors
123456787 etisalat c/o us consulate general dubai
123456787 boswedden house
123456787 turner engine controls b. v.
123456787 swissport hellas cargo s a
123456787 orbit couriers sa
123456787 goldair aviation handling s.a.
123456787 federal egov iae initiative generic duns
123456787 federal egov iae initiative - generic duns
123456787 miscellaneous foreign contractorsan
123456787 prc-desoto
123456787 inversiones sochagota e.u.
123456787 comcel
123456787 transporte y servicio lucio
123456787 jesse james members only maxi taxi svc
123456787 club naval de oficiales
123456787 inchcape shipping services
123456787 dr. thalia abatzi
123456787 central asia development group
123456787 bennett-fouch and associates
123456787 noor al-sabah company
123456787 ait/arc infrasture solutions
123456787 not available
123456787 77 construction company

136666505 adese genc petrol
136666505 amy lily chung
136666505 anderson erin ruth
136666505 andrew william knef
136666505 anduaga-arias laura
136666505 angelica m. de la cruz
136666505 anthony o'brien, 330531-5100194
136666505 batac belle
136666505 bottesini beth ms.
136666505 bouck shannon
136666505 bunn amy b.
136666505 carlene clark
136666505 cho, boong haeng
136666505 choe, sun young
136666505 christina michajlyszyn
136666505 christopher cannon
136666505 christopher l. booth
136666505 chun, kil mo
136666505 conflict + transition consultancies
136666505 cozzone elaine
136666505 deborah p. carney
136666505 denihan patricia joann
136666505 dong sook mcgeorge, 690525-2716816
136666505 dorene d.lukewalton,pharm d.
136666505 dr. terry a. klein
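One simple way to surface codes like these is to flag any DUNS whose distinct-name count exceeds a cutoff. The counts below are the top entries from the DUNS / Nnames table; the cutoff of 50 and the function name are assumptions chosen for illustration, not a rule from the presentation.

```python
# Distinct-name counts per DUNS, taken from the table of DUNS with most names.
NAME_COUNTS = {
    "123456787": 6849,
    "136666505": 112,
    "790238851": 96,
    "103933453": 35,
    "103385519": 33,
    "005149120": 27,
}


def generic_candidates(name_counts, max_names=50):
    """Return codes with more than `max_names` names, largest count first."""
    flagged = [code for code, n in name_counts.items() if n > max_names]
    return sorted(flagged, key=lambda code: name_counts[code], reverse=True)


candidates = generic_candidates(NAME_COUNTS)
print(candidates)
```

A flagged code is only a candidate for review: a large legitimate organization may also carry many name variants, so the cutoff would need tuning against known cases.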

FPDS Global DUNS with Most Names & DUNS

GlobalDUNS  NDUNS  Nnames
877936518   12     27299
624770475   212    21866
148095086   80     21754
027079776   2      17128
103933453   86     17075
026157235   4      15694
963737366   106    15200
134303192   19     14481
067641597   108    13998
064680213   102    13809
077652761   93     12914
002204600   15     12570
039860122   44     12382
805258373   130    11995

GlobalDUNS  NDUNS  Nnames
624770475   212    21866
805258373   130    11995
012003349   128    9748
877987347   127    8253
057272486   124    6935
007250079   123    9076
071767334   123    9474
158140041   117    6671
019710586   116    8163
091441089   116    7813
616924770   116    7217
067641597   108    13998

Prompted Collaboration and New Business Information

Showing these results prompted discussions leading to:
– There are generic DUNS heavily used but these are being removed from use via policy changes
– System validation rules are not current with all policy
– Additional “rules” of how to track, audit, align, merge spread by email
• All put back into Data Normalization system and then into modified Java

New results available over all data sets in < 1 day

Make the Data Useful: Search 3.0!

Traditional Google-based search is very primitive, pays little attention to context or semantics

Most end users have become dulled by it and don’t invest much time

What’s the loneliest place on the Web? Page two of Google results!

The Search Giant Seems Rather Distracted

One Solution? Zakta Research

Zakta provides a comprehensive search platform

Search results begin with a semantic map: Penguins? Antarctic birds, the Linux logo, or Pittsburgh hockey team?

Guided navigation through topics enables solid research

Give Users the Tools To Discover Issues

Beware: Security! Is IT Safe?

Any system can be hacked, whether from inside or outside

Strategies and tactics of hackers change all the time and must be monitored closely

With so much money living in a digital world, security poses grave challenges for us all

Mischief Makers Are Here to Stay!

One Solution? ExtraHop

Replicates all network traffic to create a mirror image of application architectures

Provides multifaceted view of information landscape; identifies all packets of data

Could have prevented the entire Target breach from ever taking place!

Security Requires More Than Tools and People

Truth: A Roadmap to Prosperity

In all of life’s dealings, truth engenders trust

Trust fosters openness

Openness leads to collaboration

The end result? Peace and prosperity, with very little mischief!

“Please, let it have been a goat!”

Fouad Ajami

Your Presenter Today:

Eric Kavanagh, CEO, The Bloor Group

Twitter: @Eric_Kavanagh

+512.426.7725

http://www.insideanalysis.com

Take the Inside Track to Insight!