
Too many websites v2


e-Delivery Team


Too many websites

Too little of interest

30.03.2004

e-Delivery Team

Alan Mather


Pages per site

Using Google's spidering to count the pages of all 3,233 .gov.uk sites shows that almost 90% of sites have fewer than 2,000 pages.

Fewer than 1% of sites have more than 20,000 pages.

Gov.UK - Web Site Page Counts

Page count                   <50   <1000   <2000   <3000   <4000   <5000  <10000  <20000 <100000
Site count                  1891     738     251     120      63     113       4      28      25
Cumulative site count       1891    2629    2880    3000    3063    3176    3180    3208    3233
% of total (cumulative)      58%     81%     89%     93%     95%     98%     98%     99%    100%

Each band runs from the previous bound up to its own, e.g. 0<x<50, 50<x<1000 etc.

[Chart: Gov.UK - Web Site Page Counts. Site count (scale 0 to 2000) and %age of total site count (scale 0% to 100%) for each page-count band.]

The high count of sites with fewer than 50 pages includes many redirects (where the domain has changed or is not active, e.g. direct.gov.uk) and sites that use ASP or frames, which make it impossible for Google to spider behind the first page.

But, notwithstanding the inability to spider some sites, it looks clear that the vast bulk of .gov sites have fewer than 2,000 pages.

Only a few sites have huge page counts (between 20,000 and 100,000), including ir.gov.uk, dh, scotland, ons and hmso.
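The cumulative rows on this slide can be rechecked from the per-band site counts alone. The sketch below is only an illustration of that arithmetic: the variable names are assumptions, and the figures are the ones quoted on the slide.

```python
from itertools import accumulate

# Page-count bands and per-band site counts, copied from the slide.
bands = ["<50", "<1000", "<2000", "<3000", "<4000",
         "<5000", "<10000", "<20000", "<100000"]
site_counts = [1891, 738, 251, 120, 63, 113, 4, 28, 25]

# Running total of sites, band by band.
cumulative = list(accumulate(site_counts))
total = cumulative[-1]

# Cumulative share of all sites, rounded to whole percentages.
percent_of_total = [round(100 * c / total) for c in cumulative]

for band, count, cum, pct in zip(bands, site_counts, cumulative, percent_of_total):
    print(f"{band:>8} {count:>5} {cum:>5} {pct:>4}%")
```

Run as written, this reproduces the cumulative site counts and percentages shown on the slide, including the 89% of sites that fall below 2,000 pages.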


The Google Data - Raw

The Google data shows that more than 80% of the content (in pages) is found in around 10% of the total count of sites.

There are huge numbers of very small sites (per Google), although that may be because Google is unable to spider some sites or does not cover all sites through the entire hierarchy.

Still, errors in Google indexing are likely to be consistent across the entire population of .gov sites, so the shape of the graph is likely to be sound.
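A concentration figure of this kind (the share of all pages held by the largest slice of sites) could be computed along the following lines. This is a sketch only: the function name is hypothetical, and the sample page counts are made up for illustration because the per-site data is not reproduced in this deck.

```python
def share_of_pages_in_largest_sites(page_counts, top_fraction):
    """Return the fraction of all pages held by the largest `top_fraction` of sites."""
    ranked = sorted(page_counts, reverse=True)
    # Always include at least one site, even for very small populations.
    top_n = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:top_n]) / sum(ranked)


# Illustrative, made-up page counts for ten sites (NOT the survey data).
sample_page_counts = [90000, 4000, 2500, 1500, 900, 500, 300, 150, 100, 50]

share = share_of_pages_in_largest_sites(sample_page_counts, 0.10)
print(f"Largest 10% of sites hold {share:.0%} of pages")
```

With the real per-site counts in place of the sample list, the same call would show whether the "80% of content in 10% of sites" reading holds.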

[Chart: Google's site sizes. Site size (scale 0 to 100000) plotted against a percentage scale (0% to 100%).]


Counting Servers

Checking on the servers operating behind the websites in .gov.uk:

Over 1,200 running Apache

And more than 1,500 running Microsoft IIS

These figures don’t include servers that may be configured but not active for, e.g., resilience. They also don’t include servers further down the infrastructure stack, e.g. running content applications or other code

Naturally, each of these servers is likely to be accompanied by firewall and storage configurations

At a conservative cost of £10,000 per server, the total cost of this infrastructure alone is over £29,000,000

Server                    Count
Apache (total)             1209
  Apache                    186
  Apache/1.3.26             274
  Apache/1.3.27             282
  Apache/1.3.28              62
  Apache/1.3.29              99
  Apache/2.0.40              25
  Apache/2.0.45               1
  Apache/2.0.46              32
  other Apache              248
Microsoft-IIS (total)      1547
  IIS/4.0                   377
  IIS/5.0                  1103
  IIS/6.0                    65
  other IIS                   2
Lotus-Domino                109
Netscape-Enterprise          74
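The "over £29,000,000" figure appears to follow from the four server families listed on this slide multiplied by the £10,000 unit cost. The sketch below rechecks the sub-totals and the headline cost; the variable names are assumptions, and the counts are those shown on the slide.

```python
# Per-version counts copied from the slide.
apache_versions = {
    "Apache": 186, "Apache/1.3.26": 274, "Apache/1.3.27": 282,
    "Apache/1.3.28": 62, "Apache/1.3.29": 99, "Apache/2.0.40": 25,
    "Apache/2.0.45": 1, "Apache/2.0.46": 32, "other Apache": 248,
}
iis_versions = {"IIS/4.0": 377, "IIS/5.0": 1103, "IIS/6.0": 65, "other IIS": 2}

# Totals per server family; the last two have no version breakdown.
family_totals = {
    "Apache": sum(apache_versions.values()),
    "Microsoft-IIS": sum(iis_versions.values()),
    "Lotus-Domino": 109,
    "Netscape-Enterprise": 74,
}

COST_PER_SERVER_GBP = 10_000  # the "conservative" unit cost quoted on the slide

total_servers = sum(family_totals.values())
total_cost_gbp = total_servers * COST_PER_SERVER_GBP

print(f"{total_servers} servers, about £{total_cost_gbp:,}")
```

The version rows sum to the Apache and IIS totals on the slide, and the four families together give 2,939 servers, or £29,390,000 at the quoted unit cost.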


Cost of Websites (Benchmarking)

[Chart: website costs plotted on a scale of 250k, 500k, 750k, 1.0m, 1.25m, 1.5m, 1.75m, 2.0m, 2.25m, 2.5m and 3.0m. Sites plotted: HMT, OFT, TheRegister.com, DWP, The Pension Service, JC+, DH, ONS, Worktrain, Business.gov?, JC+ (development), Worktrain (development), DfES, DfT, Large Quasi-Public Sector (fully loaded).]

Not on record: dti, IR, HMCE, Home Office, DEFRA, ODPM

Figures drawn from a recent PQ (and, unless stated, include only hosting charges and not development or development support)


Characteristics of .gov.uk sites

More than 3,200 sites

More than 2.5 million documents

Huge - up to 100,000 pages

Complex - nine levels deep

More than 200 URLs per dept

More than 300 authors

Some parts of the site not linked to others (‘orphan content’)

100s of broken links

Slow - download time more than one minute

Unreliable - poor uptime

Inconsistent - five different look and feels

More than three navigation designs


Looking For The Right Thing?

30/03/2004

Search term                   Results
Disability Living Allowance    14,700
Child Tax Credit                5,790
Carers Allowance                  915
Working Family Tax Credit         546
Attendance Allowance           13,000
Council Tax Benefit            42,000
Housing Benefit                77,800
Statutory Sick Pay              6,200
Self Assessment                14,000

Using Internet search engines in an effort to find “the right thing” can be challenging. The search terms above were entered, with the results restricted to the “.gov.uk” domain only

There is a huge amount of duplication in government online:

Many local authority sites repeat the description of the rules for claiming certain benefits, where to claim, what to claim for and so on … and doubtless, every year or so, each of these mentions must be updated with the correct rules (but what if they’re not?)

Even “self assessment” has only 4,950 mentions on the Inland Revenue’s own site, but a further 9,000 across the rest of government
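The self assessment figures can be turned into a rough duplication share, and the kind of domain-restricted search described here can be expressed with the `site:` operator that search engines such as Google support. The deck does not record the exact query syntax used, so the helper below is a hypothetical sketch under that assumption.

```python
# Mentions of "self assessment" quoted on the slide.
on_inland_revenue_site = 4950
elsewhere_in_government = 9000

total_mentions = on_inland_revenue_site + elsewhere_in_government
share_elsewhere = elsewhere_in_government / total_mentions


def domain_restricted_query(term, domain="gov.uk"):
    """Build a phrase query limited to one domain (hypothetical helper).

    Assumes the search engine supports the `site:` operator.
    """
    return f'"{term}" site:{domain}'


print(domain_restricted_query("Self Assessment"))
print(f"{share_elsewhere:.0%} of {total_mentions} mentions sit outside the owning site")
```

On these figures, roughly two thirds of the mentions of the term sit outside the department that owns the subject.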


And how does .gov look to the consumer?

The variety of sites shows little in the way of consistency

Navigation varies from site to site: sometimes on the left, sometimes tabbed, sometimes graphic, sometimes text

“Search” is called different things, is often not on the home page and often returns poor results, despite research showing that consumers who can’t see what they want instantly will use search

Accessibility is poor, with many sites not attempting to clear even the lowest hurdles

Even sites owned by the same parent are confusing, e.g. pensionservice, pensionguide, agepositive, over50 …


The Missing Data

To complete the picture and allow the proposed plan of action to be fine-tuned, the following data is needed:

Visitor counts (Hitwise may offer an approximation)

Approximate costs to operate (at an infrastructure level, including all servers, network equipment, firewalls, software licences etc): both the price bought at and the price for continued operations projected forwards (to allow for annual licence premiums, renewals etc that may be due in the future)

Contractual agreements around exit arrangements, renewal dates etc, along with whether the contract for web hosting is part of a wider technology outsource agreement (that might, therefore, make it harder to exit)


Proposal For What Next

Principles

Government is in the business of helping citizens by making information easy to find. The total number of websites needs to be rationalised dramatically – from over 3,000 to under 600 in the first stage (including Local Authorities).

Government is in the business of presenting information in a way that citizens will understand; it is not in the user interface design business. The range of navigational and interface styles needs to be harmonised to a single core style.

Government has already spent significant sums on its online presence, yet government is not a technology leader. The cost of the programme outlined must be absorbed through savings generated in the first year of the programme, making it self-funding.

Government buys in cycles and these are likely to be maintained. This cycle will allow work to be completed at a constant pace as contracts come to their natural end, thus incurring no exit penalties.

A programme of rationalisation this large will require multiple parallel streams of work. The cost of the overlap will slightly reduce the inherent savings, but will increase the odds of success by eliminating bottlenecks and delay.


DotP versus Everything Else

Condensing 3,000+ sites to a few hundred is no simple task. It will likely require a variety of approaches and software solutions to ensure that there are no bottlenecks.

DotP’s primary characteristics are:

A managed service model (i.e. hardware, software, network included)

A high-end content management engine allowing customised workflow, complex information architectures and large numbers of geographically dispersed authors

Highly resilient, scalable and secure infrastructure, reducing the risk of failure

A model to allow changes to sites through configuration, not code customisation

A range of features tailored to solve government’s main content problems

Other content engines usually:

Come as a software licence with extensive customisation required

Have a range of features that DotP doesn’t have and that have been developed over several product cycles, primarily for commercial customers. Some of these features will be useful for government

Will develop competitively no matter what government does

But they rarely come as managed services, so hosting and management must be added


Setting Up The Programme

Select a core of important websites based on:

Total size (aiming to isolate 50% of the content in government)

Visitor count (capturing a large chunk of the audience, say 50%)

Transaction generation (targeting the bulk of online transactions for both business and citizen)

Content management status (looking first for unmanaged systems still based on HTML or those that are not well advanced in terms of a content engine)

Outline the information architecture as it is, coupled with the target architecture for how it should be, taking each site and fitting it into an overall architecture and design that is consistent across all of them

It is assumed that these sites, ranking as the most popular and largest in government, will need rearchitecting to make the most of them (including a new layout, new navigation and so on)

This rework will give a good chance to eliminate duplication and inconsistency, as well as remove as much as 30-50% of content as redundant (based on experience with Department of Health).
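The "total size" criterion (isolating 50% of the content) could be applied by taking sites in descending order of page count until the target share is covered. The sketch below illustrates only that one criterion: the function name, site names and page counts are all hypothetical, and the other criteria (visitors, transactions, content management status) would need data the deck lists as still missing.

```python
def select_core_sites(page_counts, target_share=0.5):
    """Pick the largest sites until they cover `target_share` of all pages."""
    total_pages = sum(page_counts.values())
    selected, covered = [], 0
    # Walk the sites from largest to smallest.
    for site, pages in sorted(page_counts.items(), key=lambda kv: kv[1], reverse=True):
        if covered >= target_share * total_pages:
            break
        selected.append(site)
        covered += pages
    return selected, covered / total_pages


# Hypothetical site names and page counts, for illustration only.
example_sites = {
    "site-a": 40000, "site-b": 25000, "site-c": 20000,
    "site-d": 10000, "site-e": 5000,
}

core, share = select_core_sites(example_sites)
print(core, f"{share:.0%}")
```

Because a few sites hold most of the pages, a selection like this tends to reach the 50% target with a small number of sites, which is what makes a "core" population practical.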


Establishing The Target Platforms

To identify the target platforms, the following is proposed:

A “bake off” competition is kicked off where a variety of content management vendors are given an environment (with workspace, hardware and network connectivity).

Each vendor is given the same brief: to take an existing, static website (the “challenge site”) with a known information architecture and transfer it to a new target architecture (also provided).

The vendors then set up their systems, using templates and guidelines provided by government, to deliver the challenge site under strict timescales, including defining the architecture, implementing the style guidelines, integrating the search engine and migrating the content.

At the end of the competition, a subset of the vendors who have met previously agreed and published criteria is passed through to the next stage.

Commercial agreements are then built with the vendors, using standard templates, allowing for volume discounts on licences to be obtained.

Websites in the core population are then allocated across vendors and the implementation task kicked off. Vendors that perform are given more; vendors that don’t perform are gradually eliminated and their work shared across other, more successful vendors.


Why a Bake Off?

Migrating some 3,000 websites is a fearsome task. Here is why there should be more than one solution in play:

The problem is not one of technology alone: the changes required to government editorial processes are enormous. The greater the range of experience thrown at this, the better the result

One single system (or even two or three) would result in bottlenecks that would delay rationalisation. Having several “similar” but independent systems will resolve the bottleneck

One large system would be high risk, since a single outage could take down government’s online presence. Spreading the systems will, in the end, reduce risk versus cost

Competition is healthy: a few players working both together (to complete the goal) and against each other (to complete the goal first and therefore win business) will work well

But we need only a few (5, 6, 7?): too many will bring too high an overhead and risk quality standards


Estimating the Costs

The costs of migration will include:

The initial work to identify candidates

The evaluation of target platforms

The setting up of migration environments

The cost of redesign of some sites to make them consistent with the target standard (e.g. search engine on home page, navigation through tabs, reducing the depth of the site etc)

The cost of redesigning pages to fit the new system, e.g. where the site uses custom techniques that are not easily replicable

The actual migration of data from one format to another (there are tools that claim to do this, with varying success, or manual methods; these too will need to be assessed)


Integrate … Marriott.com

One URL

13 brands

Five major redesigns

2,600 locations

142,000 people


Rationalise … IRS.gov

235 sites … to one

47% e-filing

25 million regular users

AOL cache data at peaks

80% of e-filers do it again

Accountants starting to charge $35 for those who want to do it on paper


Unfocused and disorganised


Organised and Focused