
ONLINE FILE W3 - Savvas


ONLINE FILE W3.1 DATABASES

In this file, we discuss databases in detail, with examples as application cases. Specifically, we discuss the following:

• The nature and sources of data
• Data collection
• Data problems
• Data quality (DQ)
• Data integration
• The Web/Internet and commercial database systems
• Database management systems (DBMS) in decision support systems (DSS) and business intelligence (BI)
• Database organization and structures

We pepper our discussion with Technology Insights that describe the nature of databases and their issues and with Application Cases that describe their effective use, especially in decision support.


M03_TURB7293_09_SE_WC03.1.QXD 12/22/09 12:55 PM Page 1


3-2 Part II • Computerized Decision Support

ONLINE FILE W3.1.1 The Nature and Sources of Data

To understand a situation, a decision maker needs data, information, and knowledge. These must be integrated and organized in a manner that makes them useful. Then the decision maker must be able to apply analysis tools (e.g., online analytical processing [OLAP], data mining) so that the data, information, and knowledge can be utilized to full benefit. Although these analysis tools are used in DSS, the technical press groups them under the general heading of business intelligence (BI) and business analytics (BA). New tools allow decision makers and analysts to readily identify relationships among data items that enable understanding and provide a competitive advantage. For example, a customer relationship management (CRM) system allows managers to better understand their customers. The managers can then determine a likely candidate for a particular product or service at a specific price. Marketing efforts are improved, and sales are maximized. All enterprise information systems (EIS; e.g., CRM, executive information systems, content management systems, revenue management systems, enterprise resource planning [ERP]/enterprise resource management [ERM] systems, supply-chain management [SCM] systems, knowledge management systems [KMS]) use DBMS, data warehouses, OLAP, and data mining as their data analysis and modeling foundation. These BI/BA (and Web intelligence/Web analytic) tools enable the modern enterprise to compete successfully. In the right hands, these tools provide great decision makers with great capabilities.

Activities at the U.S. Department of Homeland Security (DHS) illustrate what can go wrong in the extreme when you do not gather and integrate data to track the activities of individuals and organizations that affect your organization (e.g., customers, potential customers, the competition). The critical issue for the DHS is to gather and analyze data from disparate sources. These data must be integrated in a data warehouse and analyzed automatically via data mining tools or by analysts, using OLAP tools. Of course, abuses can occur in the process of collecting and utilizing such a massive amount of data (see Technology Insights W3.1.1).

TECHNOLOGY INSIGHTS W3.1.1 Homeland Security Privacy and Cost Concerns

The U.S. government plans to apply analytic technologies on a global scale in the war on terrorism, but will they prove an effective weapon? In the first year and a half after September 11, 2001, supermarket chains, home improvement stores, and others voluntarily handed over massive amounts of customer records to federal law enforcement agencies, almost always in violation of their stated privacy policies. Many others responded to court orders for information, as required by law. The government has a right to gather corporate data under legislation passed after September 11, 2001.

The FBI now mines enormous amounts of data, looking for activity that could indicate a terrorist plot or crime. Law-enforcement agencies expect to find results in transaction data. U.S. businesses are stuck in the middle. Some have to create special systems to generate the data required by law-enforcement agencies. An average-size company will spend an average of $5 million for a system. However, not complying can cost more. Western Union was fined $8 million in December 2002 for not complying properly.

Privacy issues abound. Because the government is acquiring personal data to detect suspicious patterns of activity, there is the prospect of abuse and illegal use of the data. There may be significant privacy costs involved. There are major problems with violating people’s freedoms and rights. There is a need for an oversight organization to “watch the watchers.” The DHS must not mindlessly acquire data. It should acquire only pertinent data and information that can be mined to identify patterns that potentially could lead to stopping terrorist activities.

Sources: Adapted from J. Foley, “Data Debate,” InformationWeek, No. 940, May 19, 2003, pp. 22–24; S. Grimes, “Look Before You Leap,” Intelligent Enterprise, Vol. 6, No. 10, June 2003; and B. Worthen, “What to Do When Uncle Sam Wants Your Data,” CIO, April 15, 2003, pp. 56–66.


Chapter 3 • Decision Support Systems Concepts, Methodologies, and Technologies: An Overview 3-3

APPLICATION CASE W3.1.1 Database Tools Open Up New Revenue Opportunities for Experian Automotive

Experian Automotive has developed new business opportunities from data tools that manage, extract, and integrate. Experian has developed a system with a huge database (the world’s 10th largest) to track automobile sales data. The acquired data are external and come from public records of automobile sales. Experian draws on these data to provide the ownership history of any vehicle bought or sold in the United States for an inexpensive fee per query via the Web. There is a massive market for this service, especially from car dealerships. Experian also focuses on automobile parts companies to identify recalls and consider how to target automobile parts sales.

Sources: Adapted from P. Fox, “Extracting Dollars from Data,” Computerworld, Vol. 36, No. 16, April 15, 2002, p. 42; and experian.com.

The impact of tracking data and then exploiting it for competitive advantage can be enormous. Entire industries, such as travel, banking, and all successful e-commerce ventures, rely totally on their data and information content to flourish. Experian Automotive has developed a business opportunity from modern database, extraction, and integration tools (see Application Case W3.1.1).

Songini (2002) provided an excellent description of databases, data, information, metadata, OLAP, repository, and data mining. Major database vendors include IBM, Oracle, Informix, Microsoft, and Sybase. Database vendors are reviewed on a regular basis by the trade press.

All DSS use data, information, and/or knowledge. These three terms are sometimes used interchangeably and may have several definitions. A common way of looking at them is as follows:

• Data. Data are items about things, events, activities, and transactions that are recorded, classified, and stored but are not organized to convey any specific meaning. Data items can be numeric, alphanumeric, figures, sounds, or images.

• Information. Information is data that have been organized in a manner that gives them meaning for the recipient. Information confirms something the recipient knows or may have surprise value by revealing something not known. A management support system (MSS) application processes data items so that the results are meaningful for an intended action or decision.

• Knowledge. Knowledge consists of data items and/or information organized and processed to convey understanding, experience, accumulated learning, and expertise that is applicable to a current problem or activity. Knowledge can be the application of data and information in making a decision.
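The distinction between data and information can be illustrated with a minimal Python sketch. The transaction records and the function name `sales_by_region` are hypothetical, invented only for this illustration: the raw rows are data, and the per-region totals are one way of organizing them into information for a recipient who needs to compare regions.

```python
from collections import defaultdict

# Hypothetical raw data: recorded transactions that convey no specific meaning yet.
transactions = [
    {"region": "East", "product": "A", "amount": 120.0},
    {"region": "East", "product": "B", "amount": 80.0},
    {"region": "West", "product": "A", "amount": 200.0},
]


def sales_by_region(rows):
    """Organize raw rows into a total per region (data -> information)."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


information = sales_by_region(transactions)
print(information)
```

Applying the totals to a decision, such as where to add sales staff, would correspond to the knowledge level described above.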

MSS data can include documents, pictures, maps, sound, video, and animation. These data can be stored and organized in different ways before and after use. They also include concepts, thoughts, and opinions. Data can be raw or summarized. Many MSS applications use summary or extracted data of three primary types: internal, external, and personal.

Internal Data

Internal data are stored in one or more places. These data are about people, products, services, and processes. For example, data about employees and their pay are usually stored in a corporate database. Data about equipment and machinery may be stored in a maintenance department database. Sales data may be stored in several places: aggregate sales data in the corporate database and details in each region’s database. A DSS can use raw data as well as processed data (e.g., reports, summaries). Internal data are available via an organization’s intranet or other internal network.


External Data

There are many sources of external data. They range from commercial databases to data collected by sensors and satellites. Data are available on CDs and DVDs, on the Internet, as films and photographs, and as music or voices. Government reports and files are a major source of external data, most of which are available on the Web today (e.g., see ftc.gov, the U.S. Federal Trade Commission homepage). External data may also be available by using geographical information systems (GIS), from federal census bureaus and other demographic sources that gather data either directly from customers or from data suppliers. Chambers of commerce, local banks, research institutions, and the like flood the environment with data and information, resulting in information overload for MSS users. Data can come from around the globe. Most external data are irrelevant to a specific MSS. Yet many external data must be monitored and captured to ensure that important items are not overlooked. Using intelligent scanning and interpretation agents may alleviate this problem.

Personal Data and Knowledge

MSS users and other corporate employees have expertise and knowledge that can be stored for future use. These include subjective estimates of sales, opinions about what competitors are likely to do, and interpretations of news articles. What people really know, and the methodologies used to capture, manage, and distribute that knowledge, are the subject of knowledge management.


ONLINE FILE W3.1.2 Data Collection, Problems, and Quality

The need to extract data from many internal and external sources complicates the task of MSS building. Sometimes it is necessary to collect raw data in the field. In other cases, it is necessary to elicit data from people or to find it on the Internet. Regardless of how they are collected, data must be validated and filtered. A classic expression that sums up the situation is “garbage in, garbage out” (GIGO). Therefore, data quality (DQ) is an extremely important issue.

Methods for Collecting Raw Data

Raw data can be collected manually or by instruments and sensors. Examples of data collection methods are time studies, surveys (using questionnaires), observations (e.g., using video cameras, radio frequency identification [RFID], or other real-time electronic scanners and devices), and solicitation of information from experts (e.g., using interviews). In addition, sensors and scanners are increasingly being used in data acquisition. Probably the most reliable method of data collection is from point-of-purchase inventory control. When you buy something, the register records sales information with your personal information collected from your credit card. This has enabled Wal-Mart, Sears, and other retailers to build complete, massive (petabyte-sized) data warehouses in which they collect and store BI data about their customers. This information is then used to identify customer buying patterns to manage local store inventory and identify new merchandising opportunities. It also helps a retail organization manage its suppliers.

Ewalt (2003) described how PDAs are used to collect and utilize data in the field. Logistics companies have been using PDAs for some time. Menlo Worldwide Forwarding, a global freight company, recently equipped more than 800 drivers with PDAs. Radio links are used to dispatch drivers to pick up packages. The driver scans a bar-code label on the package into the PDA, which then beams tracking data back to the home office.

The need for reliable, accurate data for any MSS is universally accepted. However, in real life, developers and users face ill-structured problems in “noisy” and difficult environments. A wide variety of hardware and software is available for data storage, communication, and presentation, but comparatively little effort has gone into developing methods for MSS data capture in less tractable decision environments. Inadequate methods for dealing with these problems may limit the effectiveness of even sophisticated technologies in MSS development and use. Some methods involve physically capturing data via bar codes or by RFID technology; see Chapter 14 for more details. An RFID electronic tag sends an identification signal with some data (several kilobytes, when these devices were new) directly to a nearby receiver. A packing crate, or even an individual consumer product, can readily be identified via RFID. In the early 2000s, manufacturers, airlines, and retailers were experimenting with utilizing RFID devices for security, speeding up processing in receiving, and customer checkout. Wal-Mart announced in June 2003 that by January 2005 its 100 key suppliers must use RFID to track pallets of goods through its supply chain. Success seems to have been achieved by January 2005. See Application Case W3.1.2 for details. Swatch incorporates RFID into select watch models so that ski lift passes at ski resorts can be automatically encoded into them. The resort can readily identify the types of slopes a person likes to ski and share the information with its other properties. Furthermore, these chips have been inserted subcutaneously in pets and in people in danger of becoming lost or kidnapped.

Even biometric (scanning) devices are used to collect real-world data. Biometric systems detect various physical and behavioral features of individuals and assess them to authenticate the identities of visitors and immigrants entering the United States. Databases and data mining methods are also used. Some $400 million was spent on biometrics for U.S. border control in 2003 (see Verton, 2003).


APPLICATION CASE W3.1.2 RFID Tags Help Automate Data Collection and Use

In June 2003, Wal-Mart announced that by 2005 its 100 key suppliers must use RFID to track pallets of goods through its supply chain. Wal-Mart considers this much more than a company-specific effort and urged all retailers and suppliers to embrace RFID and related standards. Wal-Mart’s initiative resulted in deploying about 1 billion RFID tags to track and identify items in the individual crates and pallets. Wal-Mart first concentrated on using the technology to improve inventory management in its supply chain. Wal-Mart’s decision to deploy the technology has helped legitimize it and push it into the mainstream. The Wal-Mart deadline definitely sped adoption by the industry.

The RFID unit price needed to be 5 cents (United States) or less for the Wal-Mart initiative to be cost-effective. In mid-2003, the RFID tags cost between 30 and 50 cents. Based on a cost of 5 cents per tag, the outlay for the tags alone would total $50 million. In 2003, the readers sold for $1,000 or more. According to a January 2005 Incucomm, Inc., report, the Wal-Mart RFID initiative has been a resounding success. Vendors have offered minimal resistance, their costs have been substantially lower than expected, and accuracy of shipped goods is extremely high.

Wal-Mart is not the only retailer moving toward RFID. Marks & Spencer plc, one of Britain’s largest retailers, utilizes RFID technology in its food supply-chain operations. Each of 3.5 million plastic trays used to ship products has an RFID tag on it. Procter & Gamble experimented with RFID for more than 6 months in 2003, running tests with several retailers.

In 2003, Delta Airlines started tests of using RFID to identify baggage while bags were loaded and unloaded on airport tarmacs. Delta loads data into the tags as the bar code is printed. Testing is critical because of potential interference from other airport wireless systems. Delta expected to see a higher level of accuracy than from the existing bar-code system, which enabled Delta to deliver 99 percent of the 100 million or so bags it handles each year. But it still costs Delta a small fortune to find missing bags.

RFID tags were utilized to track the movement of pharmaceuticals through Europe’s gray (i.e., semilegal) markets. At the time, medicines were generally much less expensive in Southern Europe than in Northern Europe, so unscrupulous wholesalers traveled to the South to buy them for resale in the North. RFID tags were installed inside the labels. When a vendor representative visited the dishonest wholesalers, he was able to identify the source of the stock when he got within 3 meters of the containers. All contracts with these wholesalers were immediately cancelled.

Other possible uses of RFID include embedding RFID tags in badges so that doors automatically unlock for an authorized person, and providing access to movies and other events (through a watch-embedded or card-embedded RFID tag). Tags could also be embedded in automobiles for automatic toll charges (as in London), used in automobiles to store an entire maintenance and repair record (as is currently done for industrial forklifts), or even placed under the skin for identification (by ATMs, museums, transit systems, admission to any facility, or law-enforcement officials). Some pet owners have had RFID tags surgically embedded under their pet’s skin for identification in case the pet is lost or stolen. Eventually, consumer product packages and suitcases may be manufactured to contain RFID tags. For example, when you walk out of a store, readers may detect what you have selected, and your account will automatically be charged for what you have, through an RFID tag either under your skin or in a credit card.

Sources: Adapted from R. Brewin, “Delta to Test RFID Tags on Luggage,” Computerworld, Vol. 37, No. 25, June 23, 2003, p. 7; C. Murphy and M. Hayes, “Tag Line,” InformationWeek, No. 944, June 16, 2003, pp. 18–20; J. Vijayan and R. Brewin, “Wal-Mart Backs RFID Technology,” Computerworld, Vol. 37, No. 24, June 16, 2003, pp. 1, 14; and Incucomm, Inc., Wal-Mart’s RFID Deployment: How Is It Going? January 2005, incucomm.com.
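The cost figures in this case follow from simple arithmetic, sketched below in Python. All quantities are taken from the case (about 1 billion tags, a 5-cent target price, a mid-2003 price of 30 to 50 cents, and roughly 100 million bags per year with 99 percent delivered). The function and variable names are our own, and the sketch ignores reader and integration costs.

```python
TAGS = 1_000_000_000           # about 1 billion tags, as stated in the case
TARGET_CENTS = 5               # unit price needed for cost-effectiveness
MID_2003_CENTS = (30, 50)      # mid-2003 unit price range, in cents


def outlay_dollars(tags, cents_per_tag):
    """Total tag cost in whole dollars; integer cents avoid rounding error."""
    return tags * cents_per_tag // 100


target_outlay = outlay_dollars(TAGS, TARGET_CENTS)
low_outlay, high_outlay = (outlay_dollars(TAGS, c) for c in MID_2003_CENTS)

BAGS_PER_YEAR = 100_000_000    # "100 million or so" bags
DELIVERED_PERCENT = 99         # share delivered with the bar-code system
missed_bags = BAGS_PER_YEAR * (100 - DELIVERED_PERCENT) // 100

print(f"Outlay at 5 cents per tag: ${target_outlay:,}")
print(f"Outlay at mid-2003 prices: ${low_outlay:,} to ${high_outlay:,}")
print(f"Bags not delivered as intended each year: {missed_bags:,}")
```

At mid-2003 prices the same tag volume would have cost six to ten times the $50 million figure, which suggests why the unit price mattered so much. Likewise, a 1 percent miss rate at Delta's volume still means about 1 million bags per year.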

Data Problems

All computer-based systems depend on data. The quality and integrity of the data are critical if the MSS is to avoid the GIGO syndrome. MSS depend on data because compiled data that make up information and knowledge are at the heart of any decision-making system.


The major DSS data problems are summarized in Table W3.1.1, along with some possible solutions. Data must be available to the system, or the system must include a data-acquisition subsystem. Data issues should be considered in the planning stage of system development. If too many problems are anticipated, the costs of solving them can be estimated. If they are excessive, the MSS project should not be undertaken or should be put on hold until costs and problems decrease.

Data Quality

Data quality (DQ) is an extremely important issue because quality determines the usefulness of data as well as the quality of the decisions based on them. Data in organizational databases are frequently found to be inaccurate, incomplete, or ambiguous. The economic and social damage from poor-quality data costs billions of dollars.

The Data Warehousing Institute (TDWI; tdwi.org) estimated in 2001 that poor-quality customer data cost U.S. businesses $611 billion per year in postage, printing, and the staff overhead to deal with the mass of erroneous communications and marketing. This cost is increasing dramatically each year because customer data degenerate at a rate of 2 percent per month as a result of customer deaths, divorces, marriages, and moves. Also, data entry errors, missing values, and integrity errors tend to show up only when integrating and aggregating data across the enterprise (see Erickson, 2003). Frighteningly, the real cost of poor-quality data is much higher. Organizations can frustrate and alienate loyal customers by incorrectly addressing letters or failing to recognize them when they call or visit a store or Web site. When a company loses its loyal customers, it loses its base of sales and referrals, as well as future revenue potential. Some typical costs are related to rework, lost customers, late reporting, wrong decisions, wasted project activities, slow response to new needs (i.e., missed opportunities), and delays in implementing large projects that depend on existing databases (see Olson, 2003a, 2003b).
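To get a feel for what a 2 percent monthly degeneration rate implies, the following Python sketch projects the share of customer records that remain accurate if nothing is done. It assumes that the 2 percent applies each month to the records that are still accurate (compounding). The source gives only the monthly rate, so the compounding model and the function names are assumptions made for illustration.

```python
import math


def share_still_accurate(monthly_decay, months):
    """Share of records still accurate after `months`, assuming the decay
    rate applies each month to the records that are still accurate."""
    return (1 - monthly_decay) ** months


def months_until_share(monthly_decay, target_share):
    """Smallest whole number of months after which the accurate share
    has fallen to `target_share` or below."""
    return math.ceil(math.log(target_share) / math.log(1 - monthly_decay))


one_year = share_still_accurate(0.02, 12)
print(f"Accurate after 12 months: {one_year:.1%}")
print(f"Months until half the records are stale: {months_until_share(0.02, 0.5)}")
```

Under this assumption, a little over one-fifth of an unmaintained customer file would be out of date after a year, which is consistent with the text's point that the cost grows each year.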

DQ is one topic that people know is important but tend to neglect. DQ often generates little enthusiasm and is typically viewed as a maintenance function. Firms have clearly been willing to accept poor DQ. Companies can even survive and flourish with poor DQ. It is not considered a life-and-death issue, but sometimes it can be. Data inaccuracies can be extremely costly (see Olson, 2003a, 2003b). Even so, most firms manage DQ in a casual manner (see Eckerson, 2002). According to Hatcher (2003), DQ is a major problem in data warehouse development and BI/BA utilization. DQ can delay the implementation of a warehouse or a data mart six months or more. Inaccurate data stored in a data warehouse and then reported to someone can instantly kill a user’s trust in the new system.

TABLE W3.1.1 Data Problems

| Problem | Typical Cause | Possible Solution |
| --- | --- | --- |
| Data are not correct. | Data were generated carelessly. Raw data were entered inaccurately. Data were tampered with. | Develop a systematic way to enter data. Automate data entry. Introduce quality controls on data generation. Establish appropriate security programs. |
| Data are not timely. | The method for generating data is not rapid enough to meet the need for data. | Modify the system for generating data. Use the Web to get fresh data. |
| Data are not measured or indexed properly. | Raw data are gathered inconsistently with the purposes of the analysis. Complex models are used. | Develop a system for rescaling or recombining improperly indexed data. Use a data warehouse. Use appropriate search engines. Develop simpler or more highly aggregated models. |
| Needed data simply do not exist. | No one ever stored data that are now needed. Required data never existed. | Predict what data may be needed in the future. Use a data warehouse. Generate new data or estimate them. |

Source: Based on S. L. Alter, Decision Support Systems: Current Practices and Continuing Challenges. Addison-Wesley, Reading, MA, 1980, p. 130.

DQ was often overlooked in the early days of data warehousing. Data warehouse practitioners now need to revisit many of the original decisions about DQ in order to keep pace with the demands of enterprise decision making (see Canter, 2002).

DQ is important, especially for CRM, ERP, and other EIS. The problem is that data warehousing, e-business, and CRM projects often expose poor-quality data because they require companies to extract and integrate data from multiple operational systems that are often peppered with errors, missing values, and integrity problems. These problems generally do not show up until someone tries to summarize or aggregate the data.

Improved DQ is the result of a business improvement process designed to identify and eliminate the root causes of bad data. Data warehouse applications require data cleansing every time the warehouse is populated or updated. Improving DQ and maintaining accuracy require an active DQ assurance program. In Technology Insights W3.1.2, we describe a DQ action plan, from a management perspective, and a model for improving DQ that provides a framework. One organization saw a major benefit from improved DQ when it integrated the information systems of two businesses that merged after an acquisition: instead of taking 3 years, the effort was completed in 1 year. Another example involves getting a CRM system completed and serving the sales and marketing organizations in 1 year instead of working on it for 3 years and then canceling it (see Olson, 2003a, 2003b).

We describe some best practices for DQ in Technology Insights W3.1.3. Practitioners have identified these as being important in order for an organization to maintain a high level of DQ and integrity.

DQ issues, methods, and solutions are discussed in great detail in the technical press and in white papers of various research and consulting firms, as well as in the textbook.

TECHNOLOGY INSIGHTS W3.1.2 A Data Quality Action Plan

A DQ action plan is a recommended framework for guiding DQ improvement. It involves these steps:

1. Determine the critical business functions to be considered.
2. Identify criteria for selecting critical data elements.
3. Designate the critical data elements.
4. Identify known DQ concerns for the critical data elements and their causes.
5. Determine the quality standards to be applied to each critical data element.
6. Design a measurement method for each standard.
7. Identify and implement quick-hit DQ improvement initiatives.
8. Implement measurement methods to obtain a DQ baseline.
9. Assess measurements as well as DQ concerns and their causes.
10. Plan and implement additional improvement initiatives.
11. Continue to measure quality levels and tune initiatives.
12. Expand the process to include additional data elements.

Sources: Adapted from D. Berg and C. Heagele, “Improving Data Quality: A Management Perspective and Model,” in R. Barquin and H. Edelstein (eds.), Planning and Designing the Data Warehouse. Upper Saddle River, NJ: Prentice Hall, 1997; and “Data Quality: Mission Critical,” Canadian Institute for Health Information Directions, Vol. 10, No. 3, secure.cihi.ca/cihiweb/dispPage.jsp?cw_page=news_dir_v10n3_quality_e (accessed July 2009).
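Steps 5, 6, and 8 of the action plan (set a standard per critical data element, design a measurement, and obtain a baseline) can be sketched in Python as follows. The sample records, the two validity rules, and the names `STANDARDS` and `dq_baseline` are hypothetical choices for illustration. The action plan itself does not prescribe particular metrics, and completeness and validity are simply two commonly used ones.

```python
import re

# Hypothetical customer records; None marks a missing value.
records = [
    {"customer_id": "C001", "email": "ann@example.com", "postal_code": "30301"},
    {"customer_id": "C002", "email": "bob[at]example.com", "postal_code": "3030"},
    {"customer_id": "C003", "email": None, "postal_code": "30305"},
    {"customer_id": "C004", "email": "dee@example.com", "postal_code": None},
]

# Step 5 (assumed standards): one validity rule per critical data element.
STANDARDS = {
    "email": lambda v: re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", v) is not None,
    "postal_code": lambda v: re.fullmatch(r"\d{5}", v) is not None,
}


def dq_baseline(rows, standards):
    """Steps 6 and 8: measure completeness and validity for each element,
    both expressed as a share of all rows."""
    if not rows:
        return {}
    baseline = {}
    for field, is_valid in standards.items():
        values = [row.get(field) for row in rows]
        present = [v for v in values if v not in (None, "")]
        valid = [v for v in present if is_valid(v)]
        baseline[field] = {
            "completeness": len(present) / len(values),
            "validity": len(valid) / len(values),
        }
    return baseline


for field, scores in dq_baseline(records, STANDARDS).items():
    print(field, scores)
```

Rerunning the same measurement after each improvement initiative would give the ongoing quality levels called for in step 11.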


Data Integrity

One of the major issues of DQ is data integrity. Older filing systems may lack integrity. That is, a change made in the file in one place may not be made in the file in another place or department. This results in conflicting data. DQ-specific issues and measures depend on the application of the data. This is an especially important issue in collaborative computing environments, such as those provided by Lotus Notes/Domino and Groove. In the area of data warehouses, for example, Gray and Watson (1998) distinguished the following five issues:

• Uniformity. During data capture, uniformity checks ensure that the data are within specified limits.

• Version. Version checks are performed when the data are transformed through the use of metadata to ensure that the format of the original data has not been changed.

• Completeness check. A completeness check ensures that the summaries are correct and that all values needed to create the summary are included.

• Conformity check. A conformity check makes sure that the summarized data are in the ballpark. That is, during data analysis and reporting, correlations are run between the value reported and previous values for the same number. Sudden changes can indicate a basic change in the business, analysis errors, or bad data.

• Genealogy check or drill-down. A genealogy check or drill-down is a trace back to the data source through its various transformations.
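Three of these checks lend themselves to a short Python sketch. It is a simplified reading of the list above, not Gray and Watson's procedure: the function names, the sample figures, and the 25 percent tolerance are assumptions, and the conformity check uses a relative deviation from the mean of previous values as a stand-in for the correlations mentioned in the text.

```python
def uniformity_check(value, low, high):
    """Uniformity: the captured value lies within specified limits."""
    return low <= value <= high


def completeness_check(detail_by_region, expected_regions, reported_total):
    """Completeness: every expected value is present and the summary adds up."""
    missing = set(expected_regions) - set(detail_by_region)
    total_ok = abs(sum(detail_by_region.values()) - reported_total) < 1e-9
    return not missing and total_ok


def conformity_check(current, previous_values, tolerance=0.25):
    """Conformity: the new summary is 'in the ballpark' of earlier ones.
    Uses relative deviation from the mean of previous values (an assumed,
    simpler substitute for running correlations)."""
    baseline = sum(previous_values) / len(previous_values)
    return abs(current - baseline) <= tolerance * abs(baseline)


# Hypothetical regional sales detail and history of reported totals.
regional = {"East": 200.0, "West": 150.0}
history = [340.0, 360.0, 355.0]

print(uniformity_check(150, 0, 120))
print(completeness_check(regional, ["East", "West"], 350.0))
print(completeness_check(regional, ["East", "West", "North"], 350.0))
print(conformity_check(350.0, history))
print(conformity_check(900.0, history))
```

A failed conformity check does not by itself say whether the business changed or the data are bad. That is where a genealogy check, tracing the figure back through its transformations, would come in.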

Data Access and Integration

A decision maker typically needs access to multiple sources of data that must be integrated (see The National Strategy for Homeland Security, whitehouse.gov/homeland/book/index.html; and “PetroVantage Launches Commercial Software,” National Petroleum News, January 2002). Before data warehouses, data marts, and BI software, providing access to data sources was a major, laborious process. Even with modern Web-based data management tools, recognizing what data to access and providing it to decision makers are nontrivial tasks that require database specialists. As data warehouses grow in size, the issues of integrating data are exacerbated. This is especially important for the DHS (see The National Strategy for Homeland Security, whitehouse.gov/homeland/book/index.html). See Application Case W3.1.3 for how the DHS is working on a massive enterprise data and application integration project.

TECHNOLOGY INSIGHTS W3.1.3 Best Practices for Data Quality

Here are some best practices for ensuring DQ in practice:

• Data scrubbing is not enough. Be aware that data-cleansing software handles only a few issues: inaccurate numbers, misspellings, and incomplete fields. Comprehensive DQ programs approach data standardization in order to maintain information integrity.

• Start at the top. Be aware of DQ issues and how they affect the organization. Top managers must buy into any repair effort because resources will be needed to address long-standing issues.

• Know your data. Understand what data you have and what they are used for. Determine the appropriate level of precision necessary for each data item.

• Make it a continuous process. Develop a culture of DQ. Institutionalize a methodology and best practices for entering and checking information.

• Measure results. Regularly audit the results to ensure that standards are being enforced and to estimate impacts on the bottom line.

Sources: Adapted from B. Stackpole, “Dirty Data Is the Dirty Little Secret That Can Jeopardize Your CRM Effort,” CIO, February 15, 2001, pp. 101–114; and J. Moad, “Mopping Up Dirty Data,” Baseline, No. 25, December 1, 2003, pp. 74–75.
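As a small illustration of the kind of work that data-cleansing and standardization routines perform (see Technology Insights W3.1.3), the Python sketch below standardizes two fields and flags what it cannot repair. The lookup table, field names, and phone format are hypothetical. Real cleansing tools are far more extensive, and, as the best practices note, scrubbing alone is not a complete DQ program.

```python
import re

# Hypothetical lookup for standardizing free-text state values.
STATE_CODES = {"georgia": "GA", "ga": "GA", "ga.": "GA", "texas": "TX", "tx": "TX"}


def standardize_record(record):
    """Return a cleaned copy of the record plus a list of fields that
    could not be standardized and still need attention."""
    cleaned = dict(record)
    issues = []

    state = (record.get("state") or "").strip().lower()
    if state in STATE_CODES:
        cleaned["state"] = STATE_CODES[state]
    else:
        issues.append("state")

    # Keep only digits, then require a 10-digit number (assumed standard).
    digits = re.sub(r"\D", "", record.get("phone") or "")
    if len(digits) == 10:
        cleaned["phone"] = f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"
    else:
        issues.append("phone")

    return cleaned, issues


print(standardize_record({"state": " Georgia ", "phone": "(404) 555-0100"}))
print(standardize_record({"state": "Jorja", "phone": "555-0100"}))
```

The second record shows the limit of automated scrubbing: the misspelled state and incomplete phone number are flagged rather than guessed, leaving the correction to a person or to the source system.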

APPLICATION CASE W3.1.3Homeland Security Data Integration

Steve Cooper, special assistant to the president andCIO of the DHS, is responsible for determiningwhich existing applications and types of data canhelp the organization meet its goal of migrating thedata into a secure, usable, state-of-the-art frameworkand integrating the disparate networks and datastandards of 22 federal agencies, with 170,000employees, that merged to form the DHS. This taskwas to have been completed by mid-2005. The realproblem is that federal agencies have historicallyoperated autonomously, and their IT systems werenot designed to interoperate with one another.Essentially, the DHS needs to link silos of datatogether. By mid-2005, the data integration prob-lems plagued the Department of Homeland Securityin this endeavor.

The DHS has one of the most complex information-gathering and data-migration projects under way in the federal government. The challenge of moving data from legacy systems within or across agencies is something all departments must address. Complicating the issue is the plethora of rapidly aging applications and databases throughout government. Data integration improvement is under way at the federal, local, and state levels. The government is using tools from the corporate world.

Major problems have occurred because each agency has its own set of business rules that dictate how data are described, collected, and accessed. Some of the data are unstructured and not located in relational databases, and they cannot be easily manipulated and analyzed. Commercial applications will definitely be used in this major integration. Probably the bulk of the effort will be accomplished with data warehouse and data mart technologies. Informatica, among other software vendors, has developed data integration solutions that enable organizations to combine disparate systems to make information more widely accessible throughout an organization. Such software may be ideal for such a large-scale project.

The idea is to decide on and create an enterprise architecture for federal and state agencies involved in homeland security. The architecture will help determine the success of homeland defense. The first step in migrating data is to identify all the applications and data in use. After identifying applications and databases, the next step is to determine which to use and which to discard. When an organization knows which data and applications it wants to keep, the difficult process of moving the data starts. First, it is necessary to identify and build on a common thread in the data. Another major challenge in the data-migration arena is security, especially when dealing with data and applications that are decades old.

The DHS will definitely have an information-analysis and infrastructure-protection component. This may be the single most difficult challenge for the DHS. Not only will the DHS have to make sense of a huge mountain of intelligence gathered from disparate sources, but it will have to get that information to the people who can most effectively act on it, many of whom are outside the federal government.

Even the central government recognizes that data deficiencies may plague the DHS. Moving information to where it is needed, and doing so when it is needed, is critical and exceedingly difficult. Some 650,000 state and local law-enforcement officials “operate in a virtual intelligence vacuum, without access to terrorist watch lists provided by the State Department to immigration and consular officials,” according to the October 2002 Hart–Rudman report America—Still Unprepared, Still in Danger, sponsored by the Council on Foreign Relations. The task force cited the lack of intelligence sharing as a critical problem deserving immediate attention. “When it comes to combating terrorism, the police officers on the beat are effectively operating deaf, dumb and blind,” the report concluded.


The Defense Advanced Research Projects Agency (DARPA) spent $240 million on combined projects on total information awareness to develop ways of treating worldwide, distributed legacy databases as if they were a single, centralized database.

Sources: Adapted from E. Chabrow, “One Nation, Under I.T.,” InformationWeek, No. 914, November 11, 2002, pp. 47–50; T. Datz, “Integrating America,” CIO, December 2002, pp. 44–51; J. Foley, “Data Debate,” InformationWeek, No. 940, May 19, 2003, pp. 22–24; A. R. Nazarov, “Informatica Seeks Partners to Gain Traction in Fed Market,” CRN, June 9, 2003, p. 39; P. Thibodeau, “DHS Sets Timeline for IT Integration,” Computerworld, Vol. 37, No. 24, June 16, 2003, p. 7; K. M. Peters, “5 Homeland Security Hurdles,” Government Executive, Vol. 35, No. 2, February 2003, pp. 18–21; A. Rogers, “Data Sharing Key to Homeland Security Efforts,” CRN, No. 1019, November 4, 2002, pp. 39–40; K. D. Schwartz, “The Data Migration Challenge,” Government Executive, Vol. 34, No. 16, December 2002, pp. 70–72; and G. Hart and W. B. Rudman, America—Still Unprepared, Still in Danger, Council on Foreign Relations Press, October 2002, cfr.org/publication.html?id=5099 (accessed July 2009).

The needs of business analytics continue to evolve. In addition to historical, cleansed, consolidated, and point-in-time data, business users increasingly demand access to real-time, unstructured, and/or remote data. In addition, everything has to be integrated with the contents of existing data warehouses (see Devlin, 2003). Moreover, access via PDAs and through speech recognition and synthesis is becoming more commonplace, further complicating integration issues (see Edwards, 2003).

Enterprise data resources can take many different forms: relational database (RDB) tables, XML documents, electronic data interchange (EDI) messages, COBOL records, and so on. Independent software vendor (ISV) applications, such as ERP, CRM software, and in-house-developed software, define their own input and output schemas. Often, different schemas hold similar information, structured differently. The information model is central in that it represents a neutral semantic view of the enterprise. See Fox (2003) for details. Technology Insights W3.1.4 describes the process of extraction, transformation, and load (ETL), which is the basis for all data-integration efforts. ETL is discussed in greater detail in the textbook.

Many integration projects involve enterprise-wide systems. In Technology Insights W3.1.5, we provide a checklist of what works and what does not work when attempting such a project.
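The idea that different schemas hold similar information, structured differently, and that a neutral information model sits between them, can be sketched in a few lines of Python. The two source layouts below (a CRM record and an ERP record) and all field names are hypothetical; the point is only that each source gets its own small mapping function into one shared, neutral `Customer` view.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """Neutral (canonical) view of a customer, independent of any source schema."""
    customer_id: str
    name: str
    country: str


def from_crm(row: dict) -> Customer:
    # Hypothetical CRM schema: numeric ID, separate first and last names.
    return Customer(str(row["CustID"]),
                    f'{row["First"]} {row["Last"]}',
                    row["Country"].upper())


def from_erp(row: dict) -> Customer:
    # Hypothetical ERP schema: zero-padded ID, "LAST, FIRST" in a single field.
    last, first = [part.strip() for part in row["CUST_NAME"].split(",", 1)]
    return Customer(row["CUST_NO"].lstrip("0"),
                    f"{first.title()} {last.title()}",
                    row["CTRY"].upper())


crm_row = {"CustID": 1042, "First": "Ann", "Last": "Green", "Country": "us"}
erp_row = {"CUST_NO": "0001042", "CUST_NAME": "GREEN, ANN", "CTRY": "US"}

# Once both records are expressed in the neutral model, they can be compared directly.
same_customer = from_crm(crm_row) == from_erp(erp_row)
```

With this design, adding a new source system means writing one more mapping function rather than a translator for every pair of systems.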

TECHNOLOGY INSIGHTS W3.1.4 What Is ETL?

Extraction, transformation, and load (ETL) programs periodically extract data from source systems, transform them into a common format, and then load them into the target data store, typically a data warehouse or data mart. ETL tools also typically transport data between sources and targets, document how data elements change as they move between source and target (i.e., metadata), exchange metadata with other applications as needed, and administer all runtime processes and operations (e.g., scheduling, error management, audit logs, statistics). ETL is extremely important for data integration and data warehousing.

Sources: Adapted from W. Erickson, “The Evolution of ETL,” in What Works: Best Practices in Business Intelligence and Data Warehousing, Vol. 15. The Data Warehousing Institute, Chatsworth, CA, June 2003; and M. L. Songini, “ETL,” Computerworld, Vol. 38, No. 5, February 2, 2004, p. 23.
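The three ETL steps described above can be sketched with Python's standard library. This is an illustration under stated assumptions, not how any commercial ETL tool works: the source is a small in-memory CSV extract, the target "warehouse" is an in-memory SQLite database, and the column names and sample rows are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical source extract with inconsistent names and one missing amount.
SOURCE_CSV = """order_id,customer,amount,order_date
1, green ,19.50,02/15/2001
2,BROWN,5.25,12/01/2003
3,Green,,11/11/2002
"""


def extract(text):
    """Extract: read the source rows as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize names, types, and dates; set aside rows that fail."""
    clean, rejected = [], []
    for row in rows:
        try:
            month, day, year = row["order_date"].split("/")
            clean.append((int(row["order_id"]),
                          row["customer"].strip().title(),
                          float(row["amount"]),
                          f"{year}-{month}-{day}"))
        except ValueError:
            rejected.append(row)  # kept for the error log rather than silently dropped
    return clean, rejected


def load(conn, rows):
    """Load: write the cleaned rows into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS orders "
                 "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL, order_date TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
clean_rows, rejected_rows = transform(extract(SOURCE_CSV))
load(conn, clean_rows)
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
```

Keeping the rejected rows separate mirrors the error-management and audit-log duties that the text assigns to ETL tools.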


TECHNOLOGY INSIGHTS W3.1.5 What to Do and What Not to Do When Implementing an Enterprise-Wide Integration Project

What to Do

1. Think globally and act locally. Plan enterprise-wide; implement incrementally.
2. Define integration framework components.
3. Focus on business-driven goals with high-cost and low-technical complexity.
4. Treat the enterprise system as your strategic application.
5. Pursue reusable, template-based approaches to development.
6. Use prototyping as the project estimate generator.
7. Think of integration at different levels of abstraction.
8. Expect to build application logic into the enterprise infrastructure.
9. Assign project responsibility at the highest corporate level and negotiate, negotiate, negotiate.
10. Plan for message logging and warehousing to track audit and recovery.

What Not to Do

1. Critique business strategy through the enterprise architecture. Instead, evaluate the impact of the business strategy on IT.
2. Purchase more than you need for a given phase.
3. Substitute an enterprise application architecture for a data warehouse.
4. Force use of near-real-time message-based integration unless it is absolutely mandatory.
5. Assume that existing process models will suffice for process integration; they are not the same.
6. Plan to change your business processes as part of the enterprise application implementation.
7. Assume that all relevant knowledge resides within the project team.
8. Be driven by centralizing any enterprise-level business objects as part of the enterprise application implementation.
9. Be intrusive into the existing applications.
10. Use ad hoc process and message modeling techniques.

Sources: Adapted from V. Orovic, “To Do & Not to Do,” EAI Journal, June 2003, pp. 37–43; Enterprise Integration Patterns, enterpriseintegrationpatterns.com (accessed March 2006); and G. Hohpe and B. Woolf, Enterprise Integration Patterns. Reading, MA: Addison Wesley Professional, 2004.

Properly integrating data from various databases and other disparate sources is difficult. But when not done properly, it can lead to disaster in enterprise-wide systems such as CRM, ERP, and supply-chain projects. See Technology Insights W3.1.6 for issues related to data cleansing as a part of data integration.

The following authors discuss data integration issues, models, methods, and solutions: Balen (2000), Calvanese et al. (2001), Devlin (2003), Erickson (2003), Fox (2003), Holland (2000), McCright (2001), Meehan (2002), Nash (2002), Orovic (2003), Pelletier et al. (2003), Vaughan (2002), and Whiting (2003).

Data Integration via XML

XML is quickly becoming the standard language for database integration and data transfer (Balen, 2000). By 2004, some 40 percent of all e-commerce transactions occurred over XML servers. This was up from 16 percent in 2002 (see Savage, 2001). XML may revolutionize electronic data exchange by becoming the universal data translator (Savage, 2001). Systems developers must be extremely careful because XML cannot overcome poor business logic. If the business processes are bad, no data integration method will improve them.


Even though using XML is an excellent way to exchange data among applications and organizations, a critical issue is whether it can function well as a native database format in practice. XML is a mismatch with relational databases: It works, but it is hard to maintain. There are difficulties in performance, specifically in searching large databases. XML uses a lot of space. Even so, there are native XML database engines. See DeJesus (2000) for more information.
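To make the exchange role of XML concrete, the following sketch uses Python's standard `xml.etree.ElementTree` module to turn an XML message into flat rows (as a receiving application might before loading a relational table) and back again. The element and attribute names (`orders`, `order`, `customer`, `amount`, `currency`) are hypothetical and do not follow any particular industry standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical order message exchanged between two applications.
ORDER_XML = """<orders>
  <order id="1"><customer>Green</customer><amount currency="USD">19.50</amount></order>
  <order id="2"><customer>Brown</customer><amount currency="USD">5.25</amount></order>
</orders>"""


def parse_orders(xml_text):
    """Flatten the XML message into rows suitable for a relational table."""
    root = ET.fromstring(xml_text)
    rows = []
    for order in root.findall("order"):
        rows.append({
            "id": int(order.get("id")),
            "customer": order.findtext("customer"),
            "amount": float(order.findtext("amount")),
        })
    return rows


def to_xml(rows):
    """Serialize rows back into the same message layout for the next recipient."""
    root = ET.Element("orders")
    for row in rows:
        order = ET.SubElement(root, "order", id=str(row["id"]))
        ET.SubElement(order, "customer").text = row["customer"]
        ET.SubElement(order, "amount", currency="USD").text = f'{row["amount"]:.2f}'
    return ET.tostring(root, encoding="unicode")


rows = parse_orders(ORDER_XML)
round_trip = parse_orders(to_xml(rows))
```

Note how much of each message consists of tags rather than data, which is one reason XML takes more space than a tabular format.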

Data Integration Software

Developers of document and data capture and management software are increasingly using XML to transport data from sources to destinations. For example, Captiva Software Corp., RTSe USA, Inc., Kofax Image Products, Inc., and Tower Software all use XML to move and upload documents to the Web, intranets, and wireless applications. RosettaNet XML solutions create standard business-to-business (B2B) protocols that increase supply-chain efficiency. BizTalk Server 2000 uses XML to help companies manage their data, conduct data exchanges with e-commerce partners more easily, and lower costs (see Savage, 2001). The ADT (formerly InfoPump) data-transformation tools from Computer Associates track changes in data and applications. The software lets companies extract and transform data from up to 30 sources, including RDB, mainframe IMS and VSAM files, and applications, and load them into a database or data warehouse. Vaughan (2002) provided a list of software tools that use XML to extract and transform data.

TECHNOLOGY INSIGHTS W3.1.6 Enterprise Data House Cleaning

Every organization has redundant data, wrong data, missing data, and miscoded data, probably buried in systems that do not communicate much. This is the attic problem familiar to most homeowners: Throw in enough boxes of seasonal clothes, holiday trim, family-history documents, and other important items, and soon the mess is too big to manage. It happens at companies, too. Multiple operating units, manufacturing plants, and other facilities may all run different vendors’ applications for sales, human resources, and other tasks. The mix of disparate data makes for a pile of unsorted and unreconciled information. Integration becomes a major effort.

Cleaning House

Before any data can be cleansed, your IT department must create a plan for finding and collecting all the data and then decide how to manage them. Practitioners offer this advice:

• Decide what types of information must be captured. Set up a small data-mapping committee to do this.

• Find mapping software that can harvest data from many sources, including legacy applications, PC files, HTML documents, unstructured sources, and enterprise systems. Several vendors have developed such software.

• Start with a high-payoff project. The first integration project should be in a business unit that generates high revenue. This helps obtain upper-management buy-in.

• Create and institutionalize a process for mapping, cleansing, and collating data. Companies must continually capture information from disparate sources.

Sources: Adapted from K. S. Nash, “Merging Data Silos,” Computerworld, Vol. 36, No. 16, April 25, 2002, pp. 30–32; and M. Ferguson, “Let the Data Flow,” Intelligent Enterprise, Vol. 9, No. 3, March 1, 2006, pp. 20–25.
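A process for mapping, cleansing, and collating data can start very small. The sketch below is a simplified illustration, assuming that records from different operating units describe the same person when their normalized names match; real cleansing software uses far more sophisticated matching. The record layout and the `source` labels are hypothetical.

```python
import re


def normalize(record):
    """Cleanse one record: collapse whitespace, fix capitalization, keep phone digits only."""
    name = re.sub(r"\s+", " ", record["name"]).strip().title()
    phone = re.sub(r"\D", "", record.get("phone") or "")
    return {"name": name, "phone": phone, "source": record["source"]}


def merge_duplicates(records):
    """Collate records that share a normalized name, keeping the first known phone."""
    merged = {}
    for rec in map(normalize, records):
        key = rec["name"]  # assumption: the normalized name identifies a person
        if key not in merged:
            merged[key] = {"name": rec["name"], "phone": rec["phone"],
                           "sources": [rec["source"]]}
        else:
            merged[key]["phone"] = merged[key]["phone"] or rec["phone"]
            merged[key]["sources"].append(rec["source"])
    return list(merged.values())


# Hypothetical records harvested from three different applications.
raw = [
    {"name": "ann  green", "phone": "(555) 010-0100", "source": "sales"},
    {"name": "Ann Green ", "phone": "", "source": "hr"},
    {"name": "BOB BROWN", "phone": None, "source": "sales"},
    {"name": "bob brown", "phone": "555.010.0101", "source": "plant"},
]
cleaned = merge_duplicates(raw)
```

Recording which sources contributed to each merged record keeps the result auditable, which helps when the process is institutionalized.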


ONLINE FILE W3.1.3 The Web/Internet and Commercial Database Services

External data pour into organizations from many sources. Some of the data come on a regular basis from business partners through collaboration (e.g., collaborative SCM). The Internet is a major source of data, as we describe below:

• The Web/Internet. Many thousands of databases all over the world are accessible through the Web/Internet. A decision maker can use the Internet to access the homepages of vendors, clients, and competitors; view and download information; and conduct research. The Internet is the major supplier of external data for many decision situations.

• Commercial databanks. An online commercial database service sells access to specialized databases. Such a service can add external data to an MSS in a timely manner and at a reasonable cost. For example, GIS data must be accurate; regular updates are available. Several thousand services are currently available, many of which are accessible via the Internet. Table W3.1.2 lists several representative services.

The collection of data from multiple external sources can be complicated. Products from leading companies (e.g., Oracle, IBM, Sybase) can transfer information from external sources and put it where it is needed, when it is needed, in a usable form. Because most sources of external data are on the Web, it makes sense to use intelligent agents to collect and possibly interpret external data.

The Web and Corporate Databases and Systems

Developments in document management systems (DMS) and content management systems (CMS) enable employees and customers to use Web browsers to access vital information. Issues that were already critical have become even more so in Web-based systems. It is important to maintain accurate, up-to-date versions of documents, data, and other content; otherwise, the value of the information will diminish. Real-time computing, especially as it relates to DMS and CMS, has become a reality. Managers expect their DMS and CMS to produce up-to-the-minute, accurate documents and information about the status of the organization as it relates to their work. This real-time access to data introduces new complications in the design and development of data warehouses and the tools that access them.

A number of other Web developments are occurring. For example, MicroStrategy (microstrategy.com) utilizes a Web-like interface, integrates data from many sources, and provides capabilities to report, analyze, and monitor. Group support systems are typically deployed via Web browsers and servers (e.g., Lotus Notes/Domino, Groove, Cisco/WebEx). Finally, DBMS provide data directly in a format that a Web browser can display, with delivery through the Internet or an intranet.

The “big three” vendors of relational DBMS (Oracle, Microsoft, and IBM) all have core database products to accommodate a world of client/server architecture and Internet/intranet applications that incorporate nontraditional, or rich, multimedia data types. So do other firms in this area. Oracle’s Developer 2000 is able to generate graphical client/server applications in PL/SQL code, Structured Query Language (SQL), COBOL, C++, and HTML. Other tools provide Web browser capabilities, multimedia authoring and content scripting, object class libraries, and OLAP routines. Microsoft’s .NET Framework supports Web-based BI.

Among the suppliers of Web site and database integration are Spider Technology, Ltd. (spidertechnology.co.uk), NetObjects, Inc. (netobjects.com), and Oracle (oracle.com). These vendors link Web technology to database sources and to legacy database systems.

The use of the Web has had a far-reaching impact on collaborative computing in the form of groupware, EIS, KMS, document management systems, and the whole area of interface design, including other EIS: business activity monitoring (BAM), BPM, product life-cycle management (PLM), ERP/ERM, CRM, KMS, and SCM.


TABLE W3.1.2 Examples of Commercial Database (Data Bank) Services

| Service | Description |
| --- | --- |
| Free Lunch (economy.com/freelunch) | A source of free economic data offered by Moody’s Inc. Also offers additional data at a premium. |
| Compustat (compustat.com) | Provides financial statistics about tens of thousands of corporations. |
| Dow Jones Information Service (dowjones.com) | Provides statistical databanks on the stock market and other financial markets and activities. Also provides in-depth financial statistics on all corporations listed on the New York and American stock exchanges, plus thousands of other selected companies. Its Dow Jones News/Retrieval System provides bibliographic databanks on business, financial, and general news from the Wall Street Journal, Barron’s, and the Dow Jones News Service. |
| Lockheed Information Systems (dialog.com) | Is the largest bibliographic distributor. Its DIALOG system offers extracts and summaries of hundreds of different databanks in agriculture, business, economics, education, energy, engineering, environment, foundations, general news publications, government, international business, patents, pharmaceuticals, science, and social sciences. It relies on many economic research firms, trade associations, and government agencies for data. |
| LexisNexis, a division of Reed Elsevier Inc. (lexis.com) | This databank service offers two major bibliographic databanks. Lexis provides legal research information and legal articles. Nexis provides a full-text (not abstract) bibliographic database of hundreds of newspapers, magazines, newsletters, news services, government documents, and so on. It includes full text and abstracts from the New York Times and the complete 29-volume Encyclopedia Britannica. Also provided are the Advertising & Marketing Intelligence (AMI) databank and the National Automated Accounting Research System. |


APPLICATION CASE W3.1.4 Aviall Lands $3 Billion Deal

How important are effective data management and retrieval? Aviall Services, Inc., attributes a $3 billion spare parts distribution contract that it won to its IT infrastructure. The 10-year contract requires the company to distribute spare parts for Rolls-Royce aircraft engines. Aviall cited the ability to offer technology-driven services, such as sales forecasting, down to the line-item level as one of the reasons it was successful. It recently linked information from its ERP, SCM, CRM, and e-business applications to provide access to its marine and aviation parts inventory and distribution, at a cost of some $30 to $40 million. The system is expected to pay for itself by cutting costs associated with “lost” inventory. Timely access to information is proving to be a competitive resource that results in a big payoff.

Sources: Adapted from M. L. Songini, “Distributor: New Apps Helped Seal $3B Deal,” Computerworld, Vol. 36, No. 3, January 14, 2002, p. 16; and Aviall Services, Inc., aviall.com.


ONLINE FILE W3.1.4 Database Management Systems in DSS/BI

The complexity of most corporate databases and large-scale independent MSS databases sometimes makes standard computer operating systems inadequate for an effective and efficient interface between the user and the database. A DBMS supplements standard operating systems by allowing for greater integration of data, complex file structure, quick retrieval and changes, and better data security. Specifically, a database management system (DBMS) is a software program for adding information to a database and updating, deleting, manipulating, storing, and retrieving information. A DBMS combined with a modeling language is a typical system-development pair used in constructing DSS and other MSS. DBMS are designed to handle large amounts of information. Often, data from the database are extracted and put in a statistical, mathematical, or financial model for further manipulation or analysis. Large, complex DSS often do this.

The major role of DBMS is to manage (i.e., create, delete, change, and display) data. DBMS enable users to query data as well as to generate reports. Effective database management and retrieval can lead to immense benefits for organizations, as is evident in the situation of Aviall Services, Inc., described in Application Case W3.1.4.
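The basic DBMS operations named above (create, change, delete, and display or query) can be illustrated with SQLite, a small relational DBMS that ships with Python. The parts table and its contents are hypothetical and are not taken from the Aviall case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Create: define a table and add records.
conn.execute("CREATE TABLE parts (part_no TEXT PRIMARY KEY, description TEXT, on_hand INTEGER)")
conn.executemany(
    "INSERT INTO parts VALUES (?, ?, ?)",
    [("P-100", "Fan blade", 40), ("P-200", "Fuel pump", 5), ("P-300", "Seal kit", 0)],
)

# Change: receive 20 more fuel pumps.
conn.execute("UPDATE parts SET on_hand = on_hand + 20 WHERE part_no = ?", ("P-200",))

# Delete: drop parts that are out of stock.
conn.execute("DELETE FROM parts WHERE on_hand = 0")

# Display: a simple low-stock report produced by a query.
low_stock = conn.execute(
    "SELECT part_no, on_hand FROM parts WHERE on_hand < 30 ORDER BY part_no"
).fetchall()
```

The same statements would work against a much larger table; the DBMS, not the user, takes care of how the records are stored and located.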

Unfortunately, there is some confusion about the appropriate role of DBMS and spreadsheets. This is because many DBMS offer capabilities similar to those available in an integrated spreadsheet such as Excel, which enables the DBMS user to perform DSS spreadsheet work with a DBMS. Similarly, many spreadsheet programs offer a rudimentary set of DBMS capabilities. Although such a combination can be valuable in some cases, it may result in lengthy processing of information and inferior results. The add-in packages are not robust enough and are often very cumbersome. Finally, a computer’s available memory may limit the size and speed of the user’s spreadsheet. For some applications, DBMS work with several databases and deal with many more data than a spreadsheet can.

For DSS applications, it is often necessary to work with both data and models. Therefore, it is tempting to use only one integrated tool, such as Excel. However, interfaces between DBMS and spreadsheets are fairly simple, facilitating the exchange of data between more powerful independent programs. Web-based modeling and database tools are designed to seamlessly interact.
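One simple interface between a DBMS and a spreadsheet is a comma-separated values (CSV) file, which spreadsheet programs such as Excel can open. The sketch below lets the DBMS do the heavy work (grouping and summing) and hands only the summarized result to the spreadsheet. The `sales` table, its columns, and the helper name `export_csv` are hypothetical.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, quarter TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("East", "Q1", 120.0), ("East", "Q2", 135.5), ("West", "Q1", 98.0)],
)


def export_csv(conn, query):
    """Run a query and return the result as CSV text with a header row."""
    cursor = conn.execute(query)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([column[0] for column in cursor.description])  # column names
    writer.writerows(cursor)
    return buffer.getvalue()


csv_text = export_csv(
    conn,
    "SELECT region, SUM(revenue) AS total FROM sales GROUP BY region ORDER BY region",
)
```

In practice the text would be written to a file for the modeling tool to pick up; a string is used here only to keep the example self-contained.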

Small to medium DSS can be built with either enhanced DBMS or integrated spreadsheets. Alternatively, they can be built with a DBMS program and a spreadsheet program. A third approach to DSS development is to use a fully integrated DSS generator (i.e., development package).


ONLINE FILE W3.1.5 Database Organization and Structures

The relationships between the many individual records stored in a database can be expressed by several logical structures (see Hoffer et al., 2005; Kroenke, 2005; Mannino, 2001; Post, 2002; and Riccardi, 2003). DBMS are designed to use these structures to perform their functions. The three conventional structures are relational, hierarchical, and network.

Relational Databases

The relational form of DSS database organization, described as tabular or flat, allows the user to think in terms of two-dimensional tables, which is the way many people see data reports. Relational DBMS allow multiple access queries. Thus, a data file consists of a number of columns proceeding down a page. Each column is considered a separate field. The rows on a page represent individual records made up of several fields, the same design used for spreadsheets. Several files can be related by means of a common data field found in two (or more) data files. The names of the common fields must be spelled exactly alike, and the fields must be the same size (i.e., the same number of bytes) and type (e.g., alphanumeric, dollar). For example, in Figure W3.1.1 the data field Customer Name is found in both the customer and the usage files, and thus the two fields are related. The data field Product Number is found in the product file and the usage file. It is through these common linkages that all three files are related and in combination form a relational database.

Database Structures

The advantage of this type of database is that it is simple for the user to learn, can easily be expanded or altered, and can be accessed in a number of formats not anticipated at the time of the initial design and development of the database. It can support large amounts of data and efficient access. Many data warehouses are organized this way.
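The way common fields tie the customer, usage, and product files together can be sketched with SQLite. The table layout follows the description of Figure W3.1.1 (Customer Name links customer to usage; Product Number links product to usage), but the cities, item names, and quantities below are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_name TEXT PRIMARY KEY, city TEXT);
CREATE TABLE product  (product_number TEXT PRIMARY KEY, product_name TEXT);
CREATE TABLE usage    (customer_name  TEXT REFERENCES customer(customer_name),
                       product_number TEXT REFERENCES product(product_number),
                       quantity INTEGER);
-- Hypothetical sample data.
INSERT INTO customer VALUES ('Green', 'Atlanta'), ('Brown', 'Boston');
INSERT INTO product  VALUES ('S.1', 'Item S.1'), ('T.1', 'Item T.1');
INSERT INTO usage    VALUES ('Green', 'S.1', 2), ('Green', 'T.1', 1), ('Brown', 'S.1', 4);
""")

# The common fields relate all three tables in a single query.
rows = conn.execute("""
    SELECT c.customer_name, c.city, p.product_name, u.quantity
    FROM usage AS u
    JOIN customer AS c ON c.customer_name = u.customer_name
    JOIN product  AS p ON p.product_number = u.product_number
    ORDER BY c.customer_name, p.product_number
""").fetchall()
```

A query such as this one did not have to be anticipated when the tables were designed, which is the flexibility the text attributes to the relational form.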

HIERARCHICAL DATABASES A hierarchical model orders data items in a top-down fashion, creating logical links between related data items. It looks like a tree or an organization chart. It is used mainly in transaction processing, where processing efficiency is a critical element.

NETWORK DATABASES The network database structure permits more complex links, including lateral connections between related items. This structure is also called the CODASYL model. It can save storage space through the sharing of some items. For example, in Figure W3.1.1 Green and Brown share S.1 and T.1.
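The storage saving of the network structure can be shown with plain Python objects. This is only an analogy, assuming that each dictionary stands for one stored record: in the tree-like layout every customer owns its own copies of the product records, whereas in the network layout Green and Brown point to the same shared S.1 and T.1 records.

```python
# Hierarchical layout: each child record belongs to exactly one parent,
# so items used by two customers are stored once per customer.
hierarchical = {
    "Green": [{"product": "S.1"}, {"product": "T.1"}],
    "Brown": [{"product": "S.1"}, {"product": "T.1"}],
}

# Network layout: product records are stored once and linked from several owners.
products = {"S.1": {"product": "S.1"}, "T.1": {"product": "T.1"}}
network = {
    "Green": [products["S.1"], products["T.1"]],
    "Brown": [products["S.1"], products["T.1"]],
}


def stored_records(structure):
    """Count the distinct record objects reachable from the customers."""
    return len({id(item) for items in structure.values() for item in items})


hier_count = stored_records(hierarchical)  # four separate copies
net_count = stored_records(network)        # two shared records
```

Sharing also means that a change to a shared record is seen by every owner, which is convenient but makes the links more complex to maintain.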

OBJECT-ORIENTED DATABASES Comprehensive MSS applications, such as those involving computer-integrated manufacturing, require accessibility to complex data, which may include pictures and elaborate relationships. Such situations cannot be handled efficiently by hierarchical, network, or even relational database architectures, which mainly use an alphanumeric approach. Even the use of SQL to create and access relational databases may not be effective. For such applications, a graphical representation, such as the one used in object-oriented systems, may be useful.

Object-oriented data management is based on the principle of object-oriented programming (see Satzinger and Orvik, 2001; Weisfeld, 2004; and Yourdon, 1994). Object-oriented database systems combine the characteristics of an object-oriented programming language, such as Veritos or UML, with a mechanism for data storage and access. The object-oriented tools focus directly on the databases. An object-oriented database management system (OODBMS) allows us to analyze data at a conceptual level that emphasizes the natural relationships between objects. Abstraction is used to establish inheritance hierarchies, and object encapsulation allows the database designer to store both conventional data and procedural code within the same objects.

An OODBMS defines data as objects and encapsulates data along with their relevant structure and behavior. The system uses a hierarchy of classes and subclasses of objects. Structure (in terms of relationships) and behavior (in terms of methods and procedures) are contained within an object.
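The ideas of encapsulation and class hierarchies can be sketched in Python. The `ObjectStore` class below is a hypothetical in-memory stand-in for an OODBMS, not the interface of any real product; the parts and costs are likewise made up. Each object carries its data (part number, cost), its structure (an assembly's components), and its behavior (the `cost` method).

```python
class Part:
    """Base class: data (part_no, unit_cost) plus behavior (cost)."""

    def __init__(self, part_no, unit_cost):
        self.part_no = part_no
        self.unit_cost = unit_cost

    def cost(self):
        return self.unit_cost


class Assembly(Part):
    """Subclass: a part that contains other parts and overrides cost()."""

    def __init__(self, part_no, unit_cost, components):
        super().__init__(part_no, unit_cost)
        self.components = list(components)

    def cost(self):
        return self.unit_cost + sum(c.cost() for c in self.components)


class ObjectStore:
    """Hypothetical in-memory stand-in for an object-oriented database."""

    def __init__(self):
        self._objects = {}

    def put(self, obj):
        self._objects[obj.part_no] = obj

    def get(self, part_no):
        return self._objects[part_no]

    def instances_of(self, cls):
        # A query by class also returns instances of its subclasses.
        found = (o for o in self._objects.values() if isinstance(o, cls))
        return sorted(found, key=lambda o: o.part_no)


store = ObjectStore()
blade = Part("P-100", 40.0)
pump = Part("P-200", 5.0)
sub = Assembly("A-7", 10.0, [pump])
engine = Assembly("E-1", 100.0, [blade, sub])
for obj in (blade, pump, sub, engine):
    store.put(obj)

total = store.get("E-1").cost()
assembly_ids = [a.part_no for a in store.instances_of(Assembly)]
all_part_ids = [p.part_no for p in store.instances_of(Part)]
```

Because the cost rule lives inside the objects, a caller asks the stored engine for its cost without knowing how assemblies are structured.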

The worldwide relational and object-relational DBMS software market is expected to grow to almost $20 billion by 2006, according to IDC (The Day Group, 2002). Object-oriented database managers are especially useful in distributed DSS for very complex applications. OODBMS have the power to handle the complex data used in MSS applications. For a descriptive example, see Application Case W3.1.5. Trident Systems Group, Inc. (Fairfax, VA), has developed a large-scale OODBMS for the U.S. Navy (see Sgarioto, 1999).

FIGURE W3.1.1 Different Database Structures


APPLICATION CASE W3.1.5 G. Pierce Wood Memorial Hospital Objects

Glenn Palmier, data processing manager for G. Pierce Wood Memorial Hospital (GPW), was not happy that the vendor of his DBMS, InterSystems Corp., was upgrading to an object-oriented architecture in its core product, Caché. At the time, GPW had 45 different systems developed over 15 years at the state mental health facility in Arcadia, Florida. Smooth operations and fast data access were critical to GPW.

The vendor moved quickly, reducing a 5-year conversion plan to 8 months. By then, GPW had converted all its systems to be object-oriented and Web-based. GPW focused on data usability in the conversion process. Databases were updated to work better in the new object-oriented environment. After reengineering the databases and upgrading, the new systems ran faster than ever before. For example, whereas the old system had required almost 2 hours to perform a certain query, the new system takes less than a minute. Personnel have been easily and quickly trained in the new system, and the use of Web browsers to access data fits perfectly into the state’s Internet strategy.

Sources: Adapted from J. T. Toigo, “Objects Are Good for Your Mental Health,” Enterprise Systems, June 2001, pp. 34–35; and InterSystems, Success with Caché: G. Pierce Wood Memorial Hospital Builds Applications Quickly with Caché Server Pages, intersystems.com (accessed July 2009).

Multimedia-Based Databases

Multimedia database management systems (MMDBMS) manage data in a variety of formats, in addition to the standard text or numeric fields. These formats include images, such as digitized photographs, forms of bitmapped graphics (e.g., maps, .PIC files), hypertext images, video clips, sound, and virtual reality (i.e., multidimensional images). Cataloging such data is tricky. Accurate and known keywords must be used. It is critical to develop effective ways to manage such data for GIS and for many other Web applications. Managing multimedia data continues to become more important for BI (see D’Agostino, 2003).

Most corporate information resides outside the computer, in documents, maps, photos, images, and videotapes. For companies to build applications that take advantage of such rich data types, a special DBMS with the ability to manage and manipulate multiple data types must be used. Such systems store rich multimedia data types as binary large objects (BLOB). DBMS are evolving to provide this capability (Hoffer et al., 2005). It is critical to design the management capability up front, with scalability in mind. For an example of a situation that was not developed as such, but luckily worked, see Hurwicz (2002), which describes NASA’s experience when it endeavored to download and catalog images from space for educational purposes, as envisioned by astronaut Sally Ride. Fortunately, there was time and volunteer effort enough to redesign the cataloging mechanism on the Web-based, multimedia database system. See Hurwicz (2002) for details about the development issues and EarthKAM (earthkam.ucsd.edu) for direct access to the online, running database system. Note that similar problems can occur in data warehouse design and development.
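Storing multimedia items as BLOBs and cataloging them by keyword can be sketched with SQLite, which supports a BLOB column type. The table design, the MIME types, and the stand-in byte strings below are hypothetical; a production system would store real image or audio files and a much richer catalog.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE media (media_id INTEGER PRIMARY KEY, mime_type TEXT, content BLOB)")
conn.execute("CREATE TABLE keyword (media_id INTEGER REFERENCES media(media_id), word TEXT)")


def add_media(conn, mime_type, content, keywords):
    """Store the raw bytes as a BLOB and catalog them under lowercase keywords."""
    cur = conn.execute(
        "INSERT INTO media (mime_type, content) VALUES (?, ?)", (mime_type, content)
    )
    conn.executemany(
        "INSERT INTO keyword VALUES (?, ?)",
        [(cur.lastrowid, word.lower()) for word in keywords],
    )
    return cur.lastrowid


def find_by_keyword(conn, word):
    """Retrieve media items through the keyword catalog, not by scanning the bytes."""
    return conn.execute(
        "SELECT m.media_id, m.mime_type, m.content "
        "FROM media AS m JOIN keyword AS k ON k.media_id = m.media_id "
        "WHERE k.word = ? ORDER BY m.media_id",
        (word.lower(),),
    ).fetchall()


# Stand-in bytes; these are not valid image or audio files.
fake_png = b"\x89PNG\r\n\x1a\n" + bytes(16)
photo_id = add_media(conn, "image/png", fake_png, ["Earth", "Coastline"])
add_media(conn, "audio/wav", b"RIFF" + bytes(8), ["Narration"])
hits = find_by_keyword(conn, "earth")
```

The example also shows why accurate keywords matter: the binary content itself is opaque to the query, so an item cataloged under the wrong word is effectively lost.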

For Web-related applications of multimedia databases, see Maybury (1997) and multimedia demonstrations on the Web, including those offered by Adobe (adobe.com) and Visual Intelligence Corporation (affinity.com). Also see Application Case W3.1.6. In Application Case W3.1.7, we describe how an animated film production company used several multimedia databases to develop the Jimmy Neutron: Boy Genius film. The databases and managerial techniques have since led to lower overall production costs for the animated television series.


APPLICATION CASE W3.1.6 Multimedia Database Management Systems: A Sampler

IBM developed its DB2 Digital Library multimedia server architecture for storing, managing, and retrieving text, video, and digitized images over networks. Digital Library consists of several existing IBM software and hardware products combined with consulting and custom development (see ibm.com). Digital Library competes head-to-head with multimedia storage and retrieval packages from other leading vendors.

MediaWay, Inc. (mediaway.com), claims that its multimedia DBMS can store, index, and retrieve multimedia data (e.g., sound, video, graphics) as easily as relational databases handle tabular data. The DBMS is aimed at companies that want to build what MediaWay calls multimedia cataloging applications that manage images, sound, and video across multiple back-end platforms. An advertising agency, for example, might want to use the product to build an application that accesses images of last year’s advertisements, stored on several servers. It is a client/server implementation. MediaWay is not the only vendor to target this niche, however. Relational database vendors, such as Oracle Corporation and Sybase, have incorporated multimedia data features in their database servers. In addition, several desktop software companies promote client databases for storing scanned images. Among the industries that use this technology are health care, real estate, retailing, and insurance.

Source: Condensed and adapted from the Web sites and publicly advertised information of various vendors.

Some computer hardware (including the communication system with the database) may not be capable of playback in real time. A delay with some buffering might be necessary; to see an example of this, try any audio or video player in Windows. Intel Corporation’s Pentium processor chips incorporate multimedia extension (MMX) technology for processing multimedia data for real-time graphics display. Since then, this and other similar technologies have been embedded in many CPU and auxiliary processor chips.
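The need for a buffering delay can be seen with a little arithmetic. The sketch below assumes, purely for illustration, that data is delivered and consumed at constant rates; the function name and the sample rates are hypothetical. If a clip lasts D seconds, is played at rate p, and arrives at rate r < p, the whole clip takes pD/r seconds to arrive, so playback must be delayed by at least pD/r - D seconds to finish without stalling.

```python
def startup_delay(duration_s: float, playback_rate: float, delivery_rate: float) -> float:
    """Smallest start-up delay (in seconds) so that playback never stalls.

    Assumes constant rates expressed in the same units (e.g., megabits
    per second). Real networks fluctuate, so this is a lower bound.
    """
    if delivery_rate <= 0 or playback_rate <= 0:
        raise ValueError("rates must be positive")
    if delivery_rate >= playback_rate:
        return 0.0  # data arrives at least as fast as it is consumed
    # Time needed for the entire clip to arrive over the slower link.
    transfer_time = playback_rate * duration_s / delivery_rate
    # Playback may begin this long before the transfer completes.
    return transfer_time - duration_s


# A 60-second clip encoded at 2.0 units/s, delivered at 1.5 units/s.
delay = startup_delay(duration_s=60, playback_rate=2.0, delivery_rate=1.5)
```

With these sample numbers the clip needs 80 seconds to arrive, so playback has to wait about 20 seconds; when delivery is at least as fast as playback, no delay is needed.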

APPLICATION CASE W3.1.7 Jimmy Neutron: The “I Can Fix That” Database

Producers and animators working on the film Jimmy Neutron: Boy Genius tracked thousands of frames on four massive databases. DNA Productions (dnahelix.com), the animation services company that worked with Nickelodeon and screenwriter and director Steve Oedekerk to produce the film, addressed the problem of assembling the 1,800 shots that comprise the 82-minute film by logging and tracking them in four FileMaker Pro databases. One tracked initial storyboards, another tracked the shots assigned to individual artists, the third tracked the progress of each frame throughout the production process, and the fourth tracked retakes (i.e., changes to completed shots). At the film’s completion, there were 20,000 entries. Each record tracked information about each shot, dating back to the beginning of the project. The databases enabled the film to be completed in a mere 18 months. The best part is that everyone had access to the shots instantly instead of having to track down an individual or walk over to a 4- by 8-foot board and look for it. The Jimmy Neutron TV series continues to utilize this database technology.

Sources: Adapted from S. Overby, “Animation Animation,” CIO, May 15, 2002, pp. 22–24; and public domain sources.
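The four-way split described in this case (storyboards, assignments, progress, and retakes) can be sketched as four linked tables. The example below is only an illustration: it uses Python's sqlite3 module rather than FileMaker Pro, and every table name, column name, and sample row is invented for the sketch, not taken from the production's actual databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE storyboard (shot_id INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE assignment (shot_id INTEGER REFERENCES storyboard(shot_id), artist TEXT);
    CREATE TABLE progress   (shot_id INTEGER REFERENCES storyboard(shot_id),
                             stage TEXT, logged_on TEXT);
    CREATE TABLE retake     (shot_id INTEGER REFERENCES storyboard(shot_id), note TEXT);
    """
)

# Made-up sample data for a single shot.
conn.execute("INSERT INTO storyboard VALUES (1, 'Rocket launch')")
conn.execute("INSERT INTO assignment VALUES (1, 'artist_a')")
conn.executemany(
    "INSERT INTO progress VALUES (?, ?, ?)",
    [(1, "layout", "2001-01-10"), (1, "animation", "2001-02-02")],
)
conn.execute("INSERT INTO retake VALUES (1, 'fix lighting')")


def shot_status(conn, shot_id):
    """Return (description, artist, latest stage, retake count) for one shot."""
    return conn.execute(
        """
        SELECT s.description,
               (SELECT artist FROM assignment WHERE shot_id = s.shot_id),
               (SELECT stage FROM progress WHERE shot_id = s.shot_id
                 ORDER BY logged_on DESC LIMIT 1),
               (SELECT COUNT(*) FROM retake WHERE shot_id = s.shot_id)
        FROM storyboard s
        WHERE s.shot_id = ?
        """,
        (shot_id,),
    ).fetchone()


status = shot_status(conn, 1)
```

A single lookup such as shot_status plays the role of the instant access the case describes: anyone can see who holds a shot and where it stands without consulting a wall board.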


Document-Based Databases

Document-based databases, also known as electronic document management (EDM) systems (Swift, 2001), were developed to alleviate paper storage and shuffling. They are used for information dissemination, form storage and management, shipment tracking, expert license processing, and workflow automation. Many CMS are based on EDM. In practice, most are implemented in Web-based systems. Because EDM uses both object-oriented and multimedia databases, document-based databases were included in the preceding two sections. What is unique to EDM are the implementation and the applications.

Boeing distributes aircraft service bulletins to its customers around the world through the Internet. The company used to distribute a staggering volume of bulletins to more than 200 airlines, using over 4 million pages of documentation every year. Now it is all on the Web, via a DMS, saving money and time for both the company and its customers. Motorola uses DMS not only for document storage and retrieval but also for small-group collaboration and company-wide knowledge sharing. It has developed virtual communities where people can discuss and publish information, all with the Web-enabled DMS.

Web-enabled DMS have become an efficient and cost-effective delivery system. American Express now offers its customers the option of receiving monthly billing statements online, including the ability to download statement details, retrieve past statements, and view activity that has been posted but not yet billed. As this option grows in popularity, it will reduce production and mailing costs. Xerox Corporation developed its first KMS on its EDM platform.

Intelligent Databases

Artificial intelligence (AI) technologies, especially Web-based intelligent agents and artificial neural networks (ANN), simplify access to and manipulation of complex databases. Among other things, they can enhance a DBMS by providing it with an inference capability, resulting in an intelligent database.
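One simple way to picture an inference capability layered on stored data is forward chaining: rules are applied to known facts until nothing new can be derived. The sketch below is a toy illustration under that assumption, not a description of any vendor's product; the predicate names, customer identifiers, and rules are hypothetical.

```python
def forward_chain(facts, rules):
    """Apply rules repeatedly until no new facts appear.

    facts: set of (predicate, subject) tuples, e.g. rows read from a table.
    rules: list of (premises, conclusion) pairs; every premise must hold
           for the same subject before the conclusion is added.
    """
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        subjects = {subject for _, subject in derived}
        for premises, conclusion in rules:
            for subject in subjects:
                holds = all((p, subject) in derived for p in premises)
                if holds and (conclusion, subject) not in derived:
                    derived.add((conclusion, subject))
                    changed = True
    return derived


# Facts as they might be stored in the database (made-up data).
stored = {
    ("high_spender", "c1"),
    ("frequent_buyer", "c1"),
    ("high_spender", "c2"),
}
# Rules supplied by a knowledge base (also made up).
rules = [
    (["high_spender", "frequent_buyer"], "preferred"),
    (["preferred"], "offer_discount"),
]
inferred = forward_chain(stored, rules) - stored
```

Here the database never stored that customer c1 is preferred or should receive a discount; both facts are derived, and customer c2 is left alone because only one premise holds.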

Difficulties in integrating ES into large databases have been a major problem, even for major corporations. Several vendors, recognizing the importance of integration, have developed software products to support it. An example of such a product is the Oracle relational DBMS, which incorporates some ES functionality in the form of a query optimizer that selects the most efficient path for database queries to travel. In a distributed database, for example, a query optimizer recognizes that it is more efficient to transfer two records to a machine that holds 10,000 records than vice versa. (The optimization is important to users because with such a capability they need to know only a few rules and commands to use the database.) Another product is the INGRES 2006 Intelligent Database (ingres.com).
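The transfer decision in the distributed example can be reduced to a single rule: move the smaller set of records to the site that holds the larger set. The following sketch encodes only that rule. The Site class and the site names are hypothetical, and a real optimizer (Oracle's included) weighs many more factors, such as record sizes, indexes, and network cost.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Site:
    name: str
    record_count: int


def plan_transfer(a: Site, b: Site) -> Tuple[Site, Site]:
    """Return (source, destination) so that the smaller record set is shipped."""
    return (a, b) if a.record_count <= b.record_count else (b, a)


# The text's example: 2 records on one machine, 10,000 on the other.
source, destination = plan_transfer(Site("branch", 2), Site("headquarters", 10_000))
```

Because the choice is made inside the DBMS, the user issues the same query either way, which is the point of the parenthetical remark above: users need to know only a few rules and commands.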

Intelligent agents can enhance database searches, especially in large data warehouses. They can also maintain user preferences (e.g., amazon.com) and enhance search capability by anticipating user needs. These are important concepts that ultimately lead to ubiquitous computing. See Technology Insights W3.1.7 for details of recent developments in intelligent agents.
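As a rough illustration of how an agent might maintain user preferences and use them to anticipate needs, the sketch below counts the categories a user has chosen and reorders later search results accordingly. The class name, categories, and titles are hypothetical, and the counting scheme is a deliberately simple stand-in for the adaptive learning that commercial agents use.

```python
from collections import Counter


class PreferenceAgent:
    """Toy agent that learns category preferences from a user's choices."""

    def __init__(self):
        self.seen = Counter()  # category -> number of times chosen

    def observe(self, category: str) -> None:
        """Record that the user picked an item in this category."""
        self.seen[category] += 1

    def rank(self, results):
        """Order (title, category) results by how often the category was chosen.

        sorted() is stable, so results in equally preferred categories
        keep their original relative order.
        """
        return sorted(results, key=lambda r: -self.seen[r[1]])


agent = PreferenceAgent()
for choice in ["databases", "databases", "travel"]:
    agent.observe(choice)

ranked = agent.rank(
    [
        ("Cooking basics", "food"),
        ("Paris guide", "travel"),
        ("SQL primer", "databases"),
    ]
)
titles = [title for title, _ in ranked]
```

After two choices in the databases category and one in travel, the database title moves to the top and the never-chosen category drops to the bottom.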

One of IBM’s main initiatives in commercial AI provides a knowledge-processing subsystem that works with a database, enabling users to extract information from the database and pass it to an ES’s knowledge base in several different knowledge representation structures. Databases now store photographs, sophisticated graphics, audio, and other media. As a result, access to and management of databases are becoming more difficult, and so are the accessibility and retrieval of information. The use of intelligent systems in database access is also reflected in the use of natural language interfaces that can help nonprogrammers retrieve and analyze data.


Key Terms

content management system (CMS)
data
data integrity
data quality (DQ)
database management system (DBMS)
document management system (DMS)
information
intelligent database
knowledge
object-oriented database management system (OODBMS)
relational database

Glossary

content management system (CMS) An electronic document management system that produces dynamic versions of documents and automatically maintains the current set for use at the enterprise level.

data Raw facts that are meaningless by themselves (e.g., names, numbers).

data integrity The accuracy and accessibility of data. Data integrity is a part of data quality.

data quality (DQ) The quality of data, including their accuracy, precision, completeness, and relevance.

database management systems (DBMS) Software for establishing, updating, and querying (e.g., managing) a database.

document management systems (DMS) Information systems (e.g., hardware, software) that allow the flow, storage, retrieval, and use of digitized documents.

information Data that are organized in a meaningful way.

intelligent database A database management system exhibiting artificial intelligence features that assist the user or designer; often includes ES and intelligent agents.

knowledge Understanding, awareness, or familiarity acquired through education or experience. Knowledge is anything that has been learned, perceived, discovered, inferred, or understood, and it is the ability to use information. In a knowledge management system, knowledge is information in action.

object-oriented database management system (OODBMS) A database that is designed and manipulated using the object-oriented approach.

relational database A database whose records are organized into tables that can be processed by either relational algebra or relational calculus.

TECHNOLOGY INSIGHTS W3.1.7 Bots

Many software agents are in use today. They are found in help systems, search engines, and comparison-shopping tools. During the next few years, as technologies mature and agents radically increase their value by communicating with one another, they will significantly affect an organization’s business processes. Training, decision support, and knowledge sharing will be affected, but experts see procurement as the killer application of B2B agents. Intelligent software agents, called bots, feature triggers that allow them to execute without human intervention. Most agents also feature adaptive learning of users’ tendencies and preferences and offer personalization based on what they learn about users.

One goal of software agent developers is to develop machines that perform tasks that people do not want to do. Another is to delegate to machines tasks at which they are vastly superior to humans, such as comparing the price, quality, availability, and shipping cost of items.

BotKnowledge produces agents that can automatically perform intelligent searches, answer questions, tell you when an event occurs, individualize news delivery, tutor, and comparison shop.

Agents migrate from system to system, communicating and negotiating with each other. They are evolving from facilitators into decision makers.

Sources: Adapted from S. Ulfelder, “Undercover Agents,” Computerworld, Vol. 34, No. 23, June 5, 2000, p. 85; and BotKnowledge, botknowledge.com (accessed July 2009).

References

Balen, H. (2000, December). “Deconstructing Babel: XML and Application Integration.” Application Development Trends.

Calvanese, D., G. de Giacomo, M. Lenzerini, D. Nardi, and R. Rosati. (2001, September). “Data Integration in Data Warehousing.” International Journal of Cooperative Information Systems, Vol. 10, No. 3, pp. 237.

Canter, J. (2002, Spring). “Today’s Intelligent Data Warehouse Demands Quality Data.” Journal of Data Warehousing, Vol. 7, No. 2.

D’Agostino, D. (2003, May). “Water from Stone.” CIO Insight, No. 26, p. 53.

Day Group. (2002, July). “Newsletter.” CIO Analysts Outlook.

DeJesus, E. X. (2000, October 30). “XML Enters the DBMS Arena.” Computerworld, Vol. 34, No. 44, p. 80.

Devlin, B. (2003, Quarter 2). “Solving the Data Warehouse Puzzle.” DB2 Magazine.

Eckerson, W. (2002, May). “Data Quality and the Bottom Line.” Application Development Trends.

Edwards, J. (2003, February 15). “Tag, You’re It.” CIO.

Erickson, W. (2003). Data Quality and the Bottom Line. Seattle, WA: The Data Warehousing Institute.

Ewalt, D. M. (2003, June 30). “PDAs Make Inroads into Businesses.” InformationWeek, No. 946, p. 27.

Fox, J. (2003, May). “Active Information Models for Data Transformation.” EAI Journal.

Gray, P., and H. J. Watson. (1998). Decision Support in the Data Warehouse. Upper Saddle River, NJ: Prentice Hall.

Hatcher, D. (2003, June 30). “Sharing the Info Wealth.” Computerworld, Vol. 37, No. 26, p. 30.

Hoffer, J. A., M. B. Prescott, and F. R. McFadden. (2005). Modern Database Management, 7th ed. Upper Saddle River, NJ: Prentice Hall.

Holland, R. (2000, October 16). “XML Standards Are Gaining a Foothold.” eWeek, Vol. 17, No. 42, p. 18.

Hurwicz, M. (2002, August). “Attack of the Space Data: Down-to-Earth Data Management at ISS EarthKAM.” New Architect Magazine.

Kroenke, D. M. (2005). Database Concepts, 2nd ed. Upper Saddle River, NJ: Prentice Hall.

Mannino, M. V. (2001). Database Application Development & Design. New York: McGraw-Hill.

Maybury, M. T. (1997). Intelligent Multimedia Information Retrieval. Boston: MIT Press.

McCright, J. S. (2001, May 7). “XML Eases Data Transport.” eWeek, Vol. 18, No. 18, p. 40.

Meehan, M. (2002, April 15). “Data’s Tower of Babel.” Computerworld, Vol. 36, No. 16, p. 40.

Nash, K. S. (2002, July). “Chemical Reaction.” Baseline, No. 8.

Olson, J. E. (2003a). Data Quality: The Accuracy Dimension. San Francisco: Morgan Kaufmann.

Olson, J. E. (2003b, June). “The Business Case for Accurate Data.” Application Development Trends.

Orovic, V. (2003, June). “To Do & Not to Do.” EAI Journal, pp. 37–43.

Pelletier, S.-J., S. Pierre, and H. H. Hoang. (2003, March). “Modeling a Multi-Agent System for Retrieving Information from Distributed Sources.” Journal of Computing and Information Technology, Vol. 11, No. 1.

Post, G. V. (2002). Database Management Systems, 2nd ed. New York: McGraw-Hill.

Riccardi, G. (2003). Database Management. Boston: Pearson Education.

Satzinger, J. W., and T. U. Orvik. (2001). The Object Oriented Approach: Concepts, Modeling and System Development. Danvers, MA: Boyd & Fraser.

Savage, H. (2001, Winter). “Democratizing Data Exchange.” edirections.

Sgarioto, M. S. (1999, November 29). “Object Databases Move to the Middle.” InformationWeek, No. 763, p. 115.

Songini, M. (2002, April 15). “Collections of Data.” Computerworld, Vol. 36, No. 16, p. 46.

Swift, R. S. (2001). Accelerating Customer Relationships: Using CRM and Relationship Technologies. Upper Saddle River, NJ: Prentice Hall.

Vaughan, J. (2002, December). “Technologies to Watch.” Application Development Trends.

Verton, D. (2003, May 26). “Feds Plan Biometrics for Border Control.” Computerworld, Vol. 37, No. 21, p. 12.

Weisfeld, M. (2004). The Object-Oriented Thought Process, 2nd ed. Indianapolis: Sams.

Whiting, R. (2003, January 13). “Look Within.” InformationWeek, No. 922, pp. 32–47.

Yourdon, E. (1994). Object-Oriented System Design: An Integrated Approach. Upper Saddle River, NJ: Prentice Hall.
