Decisions, decisions, decisions: standards for evaluating international statistics resources



    International Information Update

    Amy West*

    Government Publications Library, 10 Wilson Library, 309-19th Avenue South, University of Minnesota,
    Minneapolis, MN 55455-0414, USA

    Available online 29 December 2003

    Journal of Government Information 29 (2002) 365-370

    1. Introduction

    No institution ever has as much money as it could use for collection development.
    Librarians routinely evaluate new print publications and make purchasing decisions as to
    what best serves their clientele. However, because of ever-evolving software, new products,
    and new delivery methods, librarians are less comfortable evaluating electronic
    statistical publications. Increasing demand for these products and limited resources make
    this evaluation process critical.

    The bulk of this article will focus on resources that are not freely available
    because the choices are more important and often mean weighing one product against
    another. There is so much variation in construction and delivery of international
    statistical resources that articulating a single set of standards by which to judge them
    would be inappropriate. Instead, potential users should apply a variable list of
    standards, some of which apply to all resources and some of which apply to only
    a few. This article describes such a combination of standards, noting resources that
    exemplify them.

    Briefly, the standards groups are: (1) those independent of the resource; (2) those
    dependent on the resource itself; (3) those dependent on the resource in context of other
    related resources; and (4) those dependent on the resource in context of the customer's
    budget. At the end is a select list of resources annotated with their best features and
    biggest shortcomings.

    1352-0237/$ see front matter © 2003 Elsevier Inc. All rights reserved.


    * E-mail address:

    2. Group 1 standards

    2.1. Documentation

    There should be two separate types of documentation for any resource. The first is
    documentation of the interface software, with clear installation instructions (for resources
    delivered via tangible media), a help file, and contact information should problems in
    installation or operation occur. Generally, most resources have enough of this type of
    documentation to suffice. The second type of documentation covers the data itself. It should
    indicate where the numbers came from, whether and how they were modified, what calculations
    were then used to generate the statistics, and what each symbol used in the tables means.

    World Development Indicators (WDI) has pretty good documentation. On the CD there are
    separate files for the index of indicators, acronyms and abbreviations, a bibliography, their
    groups of economies, the primary data documentation, and their statistical methods. The
    documentation for the UNSTATS database takes advantage of hyperlinks to link individual
    definitions of indicators with their original source, so that users can view all the indicators
    contributed by a given source or all the sources for a given indicator.

    2.2. Is it compatible with the local computing environment?

    When a resource is delivered via a tangible medium, it will have to work with the
    local operating system while a security program is running. This isn't too much of a problem
    these days. However, there have been products designed to interface with the hard drive of the
    computer in such a way as to make the hard drive accessible by users. On a library's
    networked public workstation this would conflict with security standards.

    3. Group 2 standards

    Buyers may choose from so many resources with so many different uses, target audiences,
    and methods of delivery that it would be pointless not to use standards that are specific to each
    resource. Broadly, this means testing them to see if they live up to their advertising. For example,
    if a producer says the benefit of a given product is that it provides remote access
    via Internet delivery, then the test of the product should be: "Will this really be usable by an
    end-user using a standard dial-up connection to the Internet?" Even if the end-user has a 56k modem
    and the producer has the biggest, fastest server in the world, a standard connection travels on
    phone lines, and phone lines transmit at 28.8 kbps. In essence, this means one should ask how
    many graphics are used, how large they are, whether there is behind-the-scenes programming
    that is invoked every time a page is loaded, and how many clicks it takes to get to the statistics.

    UNSTATS is an excellent example of how to provide true remote access. Its interface is
    very simple and has minimal graphics. The interface pages, as delivered to the end-user, are
    straightforward HTML and involve no scripting apart from what may be used to initially
    generate and load the page. As a result, it is very fast.
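    The dial-up test described above can be roughed out with simple arithmetic. The sketch
    below is only an illustration: the page sizes and click counts are hypothetical, the
    function names are invented here, and the estimate assumes the 28.8 kbps line speed cited
    in the text while ignoring latency, protocol overhead, and modem compression.

```python
"""Rough page-load estimate for a dial-up user (illustrative figures only)."""

DIALUP_KBPS = 28.8  # line speed cited in the text, in kilobits per second


def transfer_seconds(size_bytes: int, kbps: float = DIALUP_KBPS) -> float:
    """Seconds needed to move size_bytes over a link running at kbps."""
    return (size_bytes * 8) / (kbps * 1000)


def path_seconds(page_sizes: list[int], kbps: float = DIALUP_KBPS) -> float:
    """Total transfer time for every page clicked through on the way to the data."""
    return sum(transfer_seconds(size, kbps) for size in page_sizes)


# Hypothetical page weights in bytes: a plain-HTML route needing two clicks
# versus a graphics-heavy route needing five clicks.
plain_route = [15_000, 20_000]
heavy_route = [300_000, 250_000, 250_000, 200_000, 200_000]

if __name__ == "__main__":
    print(f"plain route: {path_seconds(plain_route):.1f} s")
    print(f"heavy route: {path_seconds(heavy_route):.1f} s")
```

    With these made-up figures the plain route takes about ten seconds and the heavy route
    more than five minutes, which is why graphics, page weight, and click count all belong
    in the evaluation.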

    Resources are also routinely called "easy to use." "Easy" is subjective, but some
    illustrative examples are available. SourceOECD makes good use of web design standards.
    The link to the Statistics section is easily spotted on the home page. When it is clicked, the
    end-user's eye will be drawn to the menu down the left, which contains links to broad subjects,
    e.g., Agriculture. Given that end-users typically think in broad terms, this makes for a good
    match. The end-user thinks, "I want stuff on agriculture. Oh, there is agriculture." Then the user
    clicks on Agriculture. SourceOECD also uses graphics to help orient the end-user, such as
    smiley faces to indicate whether the end-user's institution has access to a given database.
    SourceOECD's implementation of the Beyond 20/20 browser is also well done. Because the
    end-user's operational options are always in view in a menu on the left-hand side of the
    screen, it is easy to change variables, time periods, countries, or output options.

    If a resource is supposed to allow users to download or save the information they've looked
    up, take a look at the file formats users can choose from. Microsoft Excel is the most widely
    used spreadsheet in the world and there should be a format compatible with it available to the
    end-user. One effective instance of this is the WISTAT CD-ROM from the United Nations.
    Users can interact with the statistics using Beyond 20/20, but there is a separate directory that
    contains Excel formats of all the tables in the database. Users who already know what they
    need can go straight to the Excel files, save a copy, and head back to their office to work, while
    users who do not know what they need can browse with Beyond 20/20.

    Ideally, producers would include character-delimited ASCII text file formats in case the
    end-user's file is too big for a disk, the end-user wants to use her file with some software
    other than Excel, or the end-user has another need for a nonproprietary file format.
    WDI and the World Bank Africa Database use the same software and both offer users the
    option of saving as ASCII text, Excel, SAS, and more.

    4. Group 3 standards

    The third group of standards depends on the resource in context: what about it makes it
    worth having: content, querying software, and/or data structure?

    This standard can be the hardest to judge. Most resources for international statistics start
    with the same sources, i.e., they start with data gathered by other international
    intergovernmental organizations. In the absence of a summary comparing sources, potential
    buyers have to go to the resources and try to compare on a series-by-series basis. This is
    virtually impossible due to the massive size of most resources, the limited time available to
    buyers and, most importantly, structural differences in databases that mask similarities
    between sources. The UNSTATS and International Financial Statistics (IFS) databases are
    a good example of this.

    When UNSTATS was introduced, its producers highlighted in particular its inclusion of
    IMF data otherwise only available on the IFS CD. For potential buyers of UNSTATS who
    had already bought IFS, it was then important to determine the extent of overlap because IFS is
    more expensive than UNSTATS. If UNSTATS provided enough data from the IMF, then
    buyers might decide to discontinue the IFS subscription.

    The IFS database is a two-dimensional table in which every row represents a series,
    defined as a set of statistics for a given country over a period of years. There is a minimum
    of 30,000 rows in the IFS database. The number of observations would then be 30,000
    times about 50 years plus an unknown number of quarterly and monthly periods, i.e., a
    minimum of 1,500,000 observations. The maximum is harder to calculate. Not every
    country will have a row for every statistic and IFS treats aggregated groups of countries as
    if they were individual nations. Also, there are several odd series names that are probably
    typographical errors, but which inflate the number of rows and thereby the number of series
    and observations.

    On the surface, UNSTATS appears to provide just under 100 series (with the attendant
    larger number of observations). That implies that very little of the IFS database is captured by
    UNSTATS. However, it turns out that the database structure underlying UNSTATS is
    multidimensional, not two-dimensional. That means that all of the rows that would belong to, say,
    capital account credit, and which would be counted individually in IFS as described above,
    are collapsed in UNSTATS. In UNSTATS there will be a series, like capital account credit,
    which has multiple dimensions including time and place. Thus, while there is a content
    difference between IFS and UNSTATS, it is not as extreme as it might seem, nor is it small
    enough to justify dropping an IFS subscription.

    5. Group 4 standards

    Given all of the above, is a resource worth the cost or not? The answer is, of course, "it
    depends." Certainly, any resource that is cheap will get considered and in all honesty will
    probably get judged less stringently simply because the financial stakes aren't as high.
    Conversely, any really expensive resource, even if it appears to be really, really good, could
    be dismissed out of hand. WDI on CD-ROM is very reasonably priced, works well, has lots of
    content, and is fairly easy to use. WDI Online, to the extent that it performs as well as the free
    Data Query on the World Bank web site, looks to be significantly better. It integrates
    documentation, effectively exploits hyperlinks, and does not overdo the graphics. However,
    compared with the cost of both the network license for the CD and for a similar web-delivered
    service such as UNSTATS, the cost is astronomical.

    6. Conclusion

    In an imperfect world where buyers have limited income, they must critically assess any
    resource that provides access to international statistical resources. Some of the standards for
    assessment will be applicable across the board, some will be specific to the resource, and some
    will be specific to the financial state of the buyer.

    One test that all the producers of resources discussed above pass with flying colors is
    responsiveness to customers. They have each taken critical comments constructively and
    moved to address them, and it has been appreciated by their users.
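    The series counts compared in Section 4 can be made concrete with a small sketch. It is
    a toy under stated assumptions, not a description of how either database is built: the
    records, country codes, values, and the "capital account debit" indicator are invented
    for illustration, as are the function names. It shows the same data yielding different
    series counts depending on whether each row or each indicator is treated as a series.

```python
"""Toy comparison of series counts under two database layouts (invented data)."""

# Arithmetic from the text: at least 30,000 rows times about 50 annual observations.
MIN_ROWS = 30_000
APPROX_YEARS = 50
min_observations = MIN_ROWS * APPROX_YEARS  # 1,500,000

# Flat layout in the style the text attributes to IFS: one row per
# (indicator, country) pair. Indicators, country codes, and values are made up.
flat_rows = [
    {"indicator": indicator, "country": country, "values": {2000: 1.0, 2001: 1.1}}
    for indicator in ("capital account credit", "capital account debit")
    for country in ("AAA", "BBB", "CCC")
]


def flat_counts(rows):
    """Count every row as its own series, as a two-dimensional table would."""
    series = len(rows)
    observations = sum(len(row["values"]) for row in rows)
    return series, observations


def collapsed_counts(rows):
    """Count one series per indicator, treating country and year as dimensions."""
    series = len({row["indicator"] for row in rows})
    observations = sum(len(row["values"]) for row in rows)
    return series, observations


if __name__ == "__main__":
    print("minimum IFS observations:", min_observations)
    print("flat layout (series, observations):", flat_counts(flat_rows))
    print("collapsed layout (series, observations):", collapsed_counts(flat_rows))
```

    In this toy the flat count is six series and the collapsed count is two, while the number
    of observations is identical, so a comparison of advertised series counts alone says
    little about overlap in content.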

    Table of Resources

    Resource name: Eurostat (web site)
    Best feature: Unique content
    Biggest shortcoming: Almost none of it is free

    Resource name: FAOSTAT (web and CD)
    Best feature: Lengthy time series, unique content
    Biggest shortcoming: User doesn't find out web downloads aren't free until after trying to download

    Resource name: Census Bureau International Database (web and downloadable software)
    Best feature: Lengthy time series of demographic data
    Biggest shortcoming: Labeling and descriptions on web site confusing

    Resource name: Foreign Labor Statistics (web)
    Best feature: Excellent documentation, public data query clearly directs user with numbered ...
    Biggest shortcoming: Public Data Query does have a download option, but it is not explicitly
    described that way and users could easily end up doing more work than ...

    Resource name: International Financial Statistics (CD)
    Best feature: Unique content, extremely timely, lengthy time series, lots of series, low ...
    Biggest shortcoming: Interface initially confusing to users

    Resource name: LABORSTA (web)
    Best feature: Free, lengthy time series, includes worker injury and strike statistics
    Biggest shortcoming: Interface uses frames which don't meet accessibility ...

    Resource name: SourceOECD (web)
    Best feature: Provides trade by commodity by country by year; Beyond 20/20 implementation is ...
    Biggest shortcoming: Too many graphics, too long to load each page, too many clicks to get to
    data, down too often

    Resource name: UNSTATS (web)
    Best feature: Fast, tells user coverage for series as a whole and for each country in each series
    Biggest shortcoming: Putting a link to the Advanced Data Selection on every screen falsely
    implies a context-sensitive function; user will not expect to have to start over from scratch

    Resource name: UN Demographic Yearbook Historical Supplement (CD)
    Best feature: 50-year time series in many formats, including raw data and sample SPSS data ...
    Biggest shortcoming: Overly complex frames interface that squeezes target information into a
    very small frame

    Resource name: UNESCO Statistics (web site)
    Best feature: Freely available, stable, easy to use, clear directions for ...
    Biggest shortcoming: Limited statistics as compared with other sources that draw on UNESCO data

    Resource name: WISTAT (CD)
    Best feature: Unique content that's hard to come by
    Biggest shortcoming: Beyond 20/20 software can be difficult to use on a public workstation that
    has other titles also using Beyond 20/20

    Resource name: World Bank Africa Database (CD)
    Best feature: Unique content that's hard to come by, uses the same software as World
    Development Indicators
    Biggest shortcoming: Not as much documentation as on World Development ...

    Resource name: World Development Indicators (CD)
    Best feature: 40 years of a huge number of series drawn from many different sources
    Biggest shortcoming: Software is a little clunky, initial results display is ...






