

Technology in Society 22 (2000) 97–109 www.elsevier.com/locate/techsoc

Retail-data quality: evidence, causes, costs, and fixes

Ananth Raman*

Harvard Business School, Morgan T-11, Boston, MA 02163, USA

1. Introduction

US retail sales, much of which goes through point-of-sale (POS) scanners, amount to over $2 trillion a year. We argue, using evidence from multiple retailers, that the quality of POS data is often poor and cannot be taken for granted even at well-run retailers. Primarily, this paper seeks to identify and highlight a problem so that managers and academics can start working on it and solving it. The paper also offers a taxonomy of retail-data quality, quantifies the costs of poor data quality to the extent possible, highlights the impact of data quality on Internet retailing, and offers guidelines to managers for improving quality.

The evidence in the paper, while of obvious significance to retailers, has substantial value to manufacturers and distributors as well for two reasons. First, while the evidence for this paper is drawn from retailing, the data-quality problem itself is probably not confined to retailing. Hence, the paper seeks to inform manufacturers and retailers of an operational problem that is likely to be substantial in their organizations. Second, for manufacturers interested in coordinating their supply chains, poor data can make supply-chain coordination substantially more difficult. To coordinate supply chains effectively, manufacturers should have visibility into retail sales and inventory; inaccurate POS data make it difficult for manufacturers to have this visibility. Moreover, due to the bullwhip effect [1], distortions in retail-demand signals due to inaccurate POS data are likely to be amplified upstream in the supply chain.

Most people, including supply-chain researchers, inadvertently assume that POS data are accurate and widely used in merchandising decisions like replenishment and assortment. However, evidence from multiple researchers directly contradicts this assumption on multiple counts. This paper shows that when POS data are accessible, they tend to be highly inaccurate. This paper introduces a taxonomy for retail-data quality, and provides some field evidence for each type of data inaccuracy. The paper also explains, to the extent possible, the origins of, and the costs associated with, the various types. Moreover, it highlights why even Internet retailers should pay attention to data quality. The paper concludes with some ideas on how retail-data quality management can be improved. Our central message is that retail-data quality needs to be actively managed and cannot be taken for granted.

* Fax: +1-617-496-5265. E-mail address: [email protected] (A. Raman).

0160-791X/00/$ - see front matter © 2000 Published by Elsevier Science Ltd. All rights reserved. PII: S0160-791X(99)00037-8

2. Classifying retail-data quality

We classify retail-data quality into three categories, each of which is described below:

1. “Price-Scan Inaccuracy”. On occasion, the scanned price for an item is not equal to the “posted price” or the “advertised price”. Very often in a retail store when the checkout clerk scans the item, the price that shows up on the scanner is not equal to the price posted in the retail store or the one advertised in newspapers or store flyers. We shall refer to this type of data inaccuracy as “Price-Scan Inaccuracy”.

2. “Inventory-Data Inaccuracy”. Often, “physical inventory” (i.e., what is available in inventory for an SKU at a store) is not equal to “book inventory” (i.e., what the computer or “book” shows as available). At many stores, as we point out later with field evidence, “book inventory” can differ substantially from “physical inventory”.

3. “Phantom Stockouts”. This refers to situations when the item in question is available at the store but cannot be found by the consumer. Phantom stockouts are not “stockouts” (as used in inventory management), since the item is available at the store, yet they, like stockouts, can lead to lost sales.

2.1. Price-scan accuracy

The Federal Trade Commission (FTC) has been studying price-scan inaccuracy in a number of different industries. Based on a sample size of over 17,000 scans in many different sectors of retailing (see Table 1), the FTC found that 2.2% of retail scans were overcharged (i.e., the scanned price was higher than the posted or advertised price), while for 2.6% of the scans the price was undercharged. Thus, the gross rate of error (overcharges plus undercharges) was 4.8%.

In certain sectors of retailing, the price-scan inaccuracy could be substantially higher. At department stores, for example, the total percentage of errors is around 9% (5.9% consist of undercharges, while 3.2% consist of overcharges).

Table 1
Retail pricing accuracyᵃ

                                           Classification
                                  Total    Auto    Department  Discount  Drug    Food    Home    Toy    Miscellaneous
Number of stores                  294      4       30          80        39      113     17      9      2
Percentage of overcharges (%)     2.24     2.02    3.25        1.87      3.56    1.92    2.52    0.20   0.00
Total amount of overcharges ($)   1172.0   6.90    457.41      249.64    80.69   60.47   313.49  4.02   0.00
Percentage of undercharges (%)    2.58     0.67    5.90        2.68      2.75    1.55    2.84    1.60   1.01
Total amount of undercharges ($)  1319.7   2.09    576.92      298.64    59.12   70.30   262.98  39.62  10.00
Total number of items checked     17,298   297     1846        5071      2218    5999    1269    499    99
Total % of errors                 4.82     2.69    9.15        4.56      6.31    3.47    5.36    1.80   1.01

ᵃ Source: Price Check: a report on the accuracy of checkout scanners, a report by the staff of the Federal Trade Commission, Technology Services of the National Institute of Standards and Technology, the states of Florida, Michigan, Tennessee, Vermont and Wisconsin and the Commonwealth of Massachusetts, October 22, 1996.

Why do price-scan inaccuracies occur? There are many reasons. One explanation is that retailers deliberately fail to update their computers to reflect price reductions during sales and promotions, thus resulting in retailers charging higher prices. However, this is an unlikely explanation; if the errors were largely deliberate, we would expect retailers to overcharge more often than they undercharged. Since the reverse is true in Table 1 (i.e., undercharges exceed overcharges), price-scan inaccuracies are probably more often a consequence of inadequate or improper process control, rather than of retailers bilking consumers.
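The comparison of overcharges and undercharges can be checked against the Table 1 percentages. The snippet below is only an illustrative sketch: the dictionary name `rates` and its layout are our own, and the figures are transcribed from Table 1.

```python
# (overcharge %, undercharge %) by store classification, transcribed from Table 1.
rates = {
    "Total": (2.24, 2.58),
    "Auto": (2.02, 0.67),
    "Department": (3.25, 5.90),
    "Discount": (1.87, 2.68),
    "Drug": (3.56, 2.75),
    "Food": (1.92, 1.55),
    "Home": (2.52, 2.84),
    "Toy": (0.20, 1.60),
    "Miscellaneous": (0.00, 1.01),
}

for name, (over, under) in rates.items():
    gross = over + under  # gross error rate: overcharges plus undercharges
    direction = "undercharges exceed overcharges" if under > over else "overcharges exceed undercharges"
    print(f"{name:<14} gross error {gross:5.2f}%  ({direction})")
```

In the pooled sample undercharges exceed overcharges, which is the pattern the argument above relies on; the per-classification rows show that the direction varies by sector.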

Examples of inadequate process control leading to price-scan inaccuracy have been cited in prior FTC reports. Examples include the computer system, or alternatively the displayed price at the store, not being updated after a price change. At times, the same item can have multiple SKU numbers, causing confusion in the store. For example, manufacturers often assign a new SKU number when an item is on promotion. Wrong price scans can result if the retailer advertises the promotion, but the consumer picks up some units that were received prior to the promotion period (due to which the scanned item would have a different SKU and a different price). Sometimes when an item is on a “buy one, get one free” promotion, the cashier is expected to manually enter the price of the second item. Untrained or improperly trained cashiers often fail to do so, causing price-scan inaccuracy.

2.2. Inventory-data accuracy

This section is based on an audit of six stores at a company called “The Gamma Corporation” (name has been disguised). The Gamma Corporation is a very successful retailer with over $1 billion in annual sales, and a few hundred stores. The Gamma Corporation is considered, within the industry, to be a leader in information systems, and is, of the nearly 40 retailers that I have studied as part of the Harvard–Wharton Merchandising Effectiveness Project, one of the most careful with maintaining its data integrity.

Based on an audit of six stores at the Gamma Corporation, we found that inventory records were wrong for over 70% of the SKUs in the store. The audit consisted of physically counting the amount of inventory for each item and comparing the “physical inventory” with the “book inventory”. The average store in the Gamma Corporation had 9000 SKUs. For 2600 of the 9000 SKUs (i.e., for 29% of the SKUs), the “book inventory” exceeded “physical inventory”. For another 3800 items (i.e., 42% of the SKUs), “book inventory” was less than “physical inventory”. In total, 6400 of the 9000 SKUs (i.e., 71% of the SKUs) had either positive or negative variation.

The total unit deviation per store was 61,000 units and the average store had total inventory of roughly 150,000 units. Consequently, the mean absolute deviation was 6.8 units per SKU per store or over 40% of the inventory level for the SKU at the store.
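The two summary figures follow from simple division of the audit totals. A minimal sketch of that arithmetic is given here; the function and variable names are ours, and only the three input numbers come from the audit.

```python
# Audit figures quoted in the text for an average Gamma store.
skus_per_store = 9_000
total_abs_deviation_units = 61_000   # sum over SKUs of |book - physical|
total_inventory_units = 150_000


def audit_summary(skus, abs_deviation, inventory):
    """Return (mean absolute deviation per SKU, deviation as a share of inventory)."""
    mad_per_sku = abs_deviation / skus
    share_of_inventory = abs_deviation / inventory
    return mad_per_sku, share_of_inventory


mad, share = audit_summary(skus_per_store, total_abs_deviation_units, total_inventory_units)
print(f"Mean absolute deviation per SKU: {mad:.1f} units")
print(f"Deviation as a share of inventory: {share:.1%}")
```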

Why are retail inventory data inaccurate? Since inventory is the difference between shipments into a store and sales out of the store, any error in tracking inventory data would have to stem either from an error in tracking shipments into the store or in tracking sales out of the store. Consequently, we classify the sources of inventory-data inaccuracy into “sales-related sources” and “supply-related sources”.

The sales-related sources are easier to understand. Problems with data quality stem partly from improper scanning at the checkout counter. Most of us have been to supermarkets where we bought, for example, a plain yogurt and a vanilla yogurt at the same time, and the checkout clerk scanned one flavor twice, instead of each flavor once. Often, checkout clerks do not care to scan each yogurt individually if the two yogurts have the same price. This simple and initially harmless-looking practice has substantial consequences for the accuracy of inventory data. In scanning one flavor of yogurt twice, the checkout clerk has introduced two errors into the database: “book inventory” will not match “physical inventory” for either product after this seemingly innocuous error. Another source of error is that returns are not processed properly. Most stores expect store personnel to process returned merchandise through the POS scanner even if the original item that was purchased and the item it was exchanged for have identical prices. However, store people usually make the switch without scanning the transaction in the POS scanner if they are too busy. The third and very obvious sales-related source of inventory-data inaccuracy is shrink, which averaged 1.72% of total retail sales in 1998 [2]. Shrink, though large relative to profits, represents only a small percentage of data inaccuracy at the store.
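The yogurt example can be traced in a few lines. This is a sketch under our own assumptions: the SKU names, the starting quantity of 10 units, and the helper `book_vs_physical` are hypothetical and serve only to show why one mis-scan produces two record errors.

```python
from collections import Counter


def book_vs_physical(start, purchased, scanned):
    """Return {sku: book - physical} after one checkout.

    start     -- units on hand per SKU before the sale (book equals physical)
    purchased -- SKUs that actually left the store
    scanned   -- SKUs the clerk scanned at the register
    """
    physical = Counter(start)
    physical.subtract(Counter(purchased))   # what is really left on the shelf
    book = Counter(start)
    book.subtract(Counter(scanned))         # what the computer believes is left
    return {sku: book[sku] - physical[sku] for sku in start}


start = {"plain_yogurt": 10, "vanilla_yogurt": 10}
gap = book_vs_physical(
    start,
    purchased=["plain_yogurt", "vanilla_yogurt"],
    scanned=["plain_yogurt", "plain_yogurt"],   # one flavor scanned twice
)
print(gap)
```

The book record for the flavor scanned twice ends up one unit below the shelf, and the record for the unscanned flavor one unit above it, even though the total unit count and the revenue are correct.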

An anecdote is helpful to illustrate inaccuracies in tracking retail sales. At one supermarket chain, sales of “medium tomatoes” typically exceed shipments into the store by 25% in any period because more expensive tomatoes (e.g., organic tomatoes, vine-ripe tomatoes) were often sold as “medium tomatoes”, resulting in substantial revenue losses to the retailer. Even though this example does not include scanned product (since tomatoes are not scanned at this supermarket), it illustrates the magnitude of checkout inaccuracy in retailing.

Supply-related problems could be substantial as well. At the Gamma Corporation, an interesting experiment was tried to isolate and estimate supply-related problems. The Gamma Corporation announced that a certain store was going to open on February 1, but, in reality, the store opened roughly two weeks later. The distribution people at Gamma Corporation did not know that the store was not going to open on February 1 and stocked the store with the amount of inventory they were supposed to stock it with. Between February 1 and February 15, independent auditors verified the level of inventory at the store, and found that 29% of the SKUs had erroneous inventory, the average error being 3.07 units per SKU. Bear in mind that this was before a single customer had walked into the store and before a single sale had occurred. Thus, there are substantial supply-related problems that contribute to data inaccuracy as well.

Inaccuracy in inventory data tends to be high because errors in tracking sales out of, and shipments into, a store tend to cumulate. Consider the following simple example, where a retailer starts with a certain amount of inventory $Q_1$ for a single item at the beginning of day 1. The retailer follows an inventory policy for $n$ days such that, at the end of each day, the retailer replenishes the store with the number of units that the POS system indicates was sold that day. Demand on day $i$ is denoted by $X_i$. Due to errors in scanning, the retailer’s POS system shows that $X_i + D_i$ units were sold; thus, $D_i$ is the error induced by the scanning process on day $i$. The retailer attempts to replenish $X_i + D_i$ units to the store; the actual quantity shipped does not match the target replenishment quantity precisely. Let us denote the actual replenishment quantity on day $i$ by $X_i + D_i + e_i$, where $e_i$ represents the error induced by the shipment process on day $i$. For example, on day 1, the retailer sells $X_1$ units, the POS system shows that $X_1 + D_1$ units were sold, and the retailer ships $X_1 + D_1 + e_1$ units to the store at the end of day 1. Consequently, the inventory at the beginning of day 2, $Q_2$, is $Q_1 + D_1 + e_1$. Using similar logic, the inventory level at the beginning of the $(n+1)$th day will be

$$Q_{n+1} = Q_1 + \sum_{i=1}^{n} (D_i + e_i).$$

If

$$\mathrm{Var}(D_i + e_i) = \sigma^2,$$

then assuming independence between the errors on different days, and assuming that the stocking quantity is substantially higher than the errors in monitoring sales out of, or supply into, a store,

$$\mathrm{Var}\left(\sum_{i=1}^{n} (D_i + e_i)\right) = \sigma^2 n.$$

Thus, inventory-data inaccuracy (measured by the standard deviation of $Q_{n+1}$) increases with $\sqrt{n}$. Note that $n$ is the number of days since the day the inventory was known to be accurate (e.g., inventory might be known accurately immediately after a physical audit of the store).
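The square-root growth can be illustrated with a small simulation. The sketch below makes assumptions that are not in the text: the combined daily error $D_i + e_i$ is drawn as an independent zero-mean normal with standard deviation `sigma`, and the function names, trial count, and seed are arbitrary choices of ours.

```python
import random
import statistics


def simulate_discrepancy(n_days, sigma, rng):
    """Cumulative error sum_{i=1}^{n} (D_i + e_i) for one store-SKU.

    Assumption (ours, not the paper's): each daily combined error is an
    independent normal draw with mean 0 and standard deviation sigma.
    """
    return sum(rng.gauss(0.0, sigma) for _ in range(n_days))


def discrepancy_std(n_days, sigma=1.0, trials=4000, seed=0):
    """Empirical standard deviation of Q_{n+1} - Q_1 across simulated store-SKUs."""
    rng = random.Random(seed)
    samples = [simulate_discrepancy(n_days, sigma, rng) for _ in range(trials)]
    return statistics.pstdev(samples)


for n in (1, 4, 16, 64):
    print(f"n={n:3d}  simulated std={discrepancy_std(n):.2f}  sigma*sqrt(n)={n ** 0.5:.2f}")
```

Quadrupling the number of days since the last audit should roughly double the simulated standard deviation, in line with the $\sqrt{n}$ result.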

2.3. Phantom stockouts

As explained earlier, “phantom stockouts” refers to the situation where a consumer cannot find a particular item that is available at the store. We explain the context and causes of phantom stockouts, and quantify them at Beta Corporation (“Beta”), a chain of over 200 bookstores (company’s name has been disguised). A typical Beta Bookstore has 130,000 book titles, 50,000 music titles, and between 10,000 and 20,000 video titles. Many years ago, Beta chose to offer higher service (i.e., higher availability) to the consumer while incurring slightly higher costs than competitors. The company invested a substantial amount of money in its internally developed expert system for stocking merchandise, regarded among the finest in the industry by many observers. Like the Gamma Corporation, Beta too is a sophisticated user of information technology in operations and merchandising.

Anecdotally, Beta had discovered that certain book titles could not be found by the consumer or a sales associate even though the Title-Look-Up (“TLU”; i.e., the computer system that lists the book titles available at a store) would indicate that the book was available. A typical store comment about the TLU was: “The TLU shows one is available but that’s probably not true”. Similar anecdotal evidence can be found from other retail chains as well.


With help from Beta management we estimated phantom stockouts at the company in two ways,¹ both of which showed that for those cases where customers sought help from a sales associate in finding a book, roughly one in six books available at the store could not be found. At Beta, unlike many other retail stores, the TLU is very accurate. This conclusion is derived from physical audit reports at the end of the year; for confidentiality reasons, we are unable to reveal actual shrink and data-accuracy numbers for Beta. Both shrinkage and “inventory-data inaccuracy” are low for books at Beta. Thus, in our analysis we assume that a book is available if (and only if) the TLU indicates its availability.

1. We tracked actual customer requests at a single store and directly estimated the fraction of times a book (that was available at the store) could not be found. This approach revealed that 81% of the books that were available at the store could be found. In fact, only 73% of the books available at the store were found at the right location; another 8% of the time the book was found but not at the right location. Thus, according to this procedure, sales associates could not find approximately 19% of the books that were requested by customers and were available at the store.

2. We interviewed 17 store employees (managers and associates) and asked each of them to estimate (based on their recollections) the frequency of phantom stockouts at the stores. On average, store employees estimated that they could find a book only 83% of the time when the Title-Look-Up said the book was available at the store. In other words, on 17% of the occasions, customers were unable to find a book that was available at the store. Hence, the two approaches provided consistent estimates.

Why do phantom stockouts occur? Part of the problem stems from the fact that the magnitude or impact of phantom stockouts is not widely understood at most retailers. Senior Beta managers at corporate headquarters were surprised to learn of the frequency of phantom stockouts. They knew that phantom stockouts occurred, but were unaware of the magnitude. In contrast, store employees, although aware of the problem, do not have appropriate benchmarks for what is acceptable. One store manager, upon seeing the data we collected, said, “…for the inquiries that we find in TLU, we are finding the vast majority of books where we expect to”. In the highly competitive book-retailing environment (especially with the entry of Internet retailers like amazon.com), the “vast majority” is simply not good enough, and is the wrong benchmark for the store.

Other causes for phantom stockouts include customers restocking books and often placing them in the wrong location at the store. Unlike a library, most bookstores don’t have carts for customers to place books. Moreover, there are multiple areas in a Beta store, in addition to the shelves, where a book can be placed including the “backroom area” (where books received from the distribution center are stored prior to being shelved), “display area” (where books are placed to be visible to consumers easily), and “overstock areas” (where slow-moving items are stored). The movement from one area to another is not tracked by the information system, and the company depends on employee knowledge instead to find the book. This approach does not work well, especially during high-traffic periods when employees are too busy to spend the time to find a specific book that a customer is looking for.

¹ Data for phantom stockouts at Beta Corporation are drawn from a field study report conducted under the author’s supervision by Tamar Bruckel, Mandee Heller, Tamara King, and Lilian Kuo (all MBA ’99 students at the Harvard Business School).

3. Cost of poor data quality

We argue in this section that the cost of poor data quality in retailing is substantial. We do not arrive at a precise computation of these costs, leaving that exercise for future research. We are working with multiple retailers, including the Gamma Corporation and Beta Corporation, to quantify the costs associated with data inaccuracy. At this stage, we explain the costs associated with each type of retail-data quality, and point out that each of them is substantial in the different retail contexts considered.

3.1. Cost of price-scan inaccuracy

As noted earlier, expected dollar undercharge exceeds expected dollar overcharge. Hence, retailers are currently losing sales revenue due to “price-scan inaccuracy”. For department stores, for example, Table 1 shows (based on 1846 scans) that undercharges exceeded overcharges by $119.51 (= $576.92 − $457.41). Assuming that each sales scan was for $25 (a reasonable estimate for the average price of an item at a department store obtained from data at one chain), the resulting revenue loss would be 0.26% of sales (or roughly 7% of operating profit²). The revenue-loss problem is compounded by the fact that consumers are more likely to complain about overcharges than about undercharges.
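The department-store estimate can be reproduced step by step. The sketch below uses only figures quoted in the text, Table 1, and footnote 2; the variable names are ours.

```python
# Department-store figures from Table 1 and the text.
undercharges = 576.92      # total dollars undercharged across the sampled scans
overcharges = 457.41       # total dollars overcharged across the sampled scans
items_checked = 1846
avg_item_price = 25.00     # the paper's assumed average price per scanned item
operating_margin = 0.0385  # operating profit as a share of sales (footnote 2)

net_loss = undercharges - overcharges
sampled_sales = items_checked * avg_item_price
loss_share_of_sales = net_loss / sampled_sales
loss_share_of_profit = loss_share_of_sales / operating_margin

print(f"Net undercharge: ${net_loss:.2f}")
print(f"Share of sales: {loss_share_of_sales:.2%}")
print(f"Share of operating profit: {loss_share_of_profit:.1%}")
```

The result is sensitive to the assumed $25 average price: a higher average price spreads the same dollar loss over more sales and lowers the percentage.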

In addition to losing revenue, retailers are also likely to lose some goodwill with customers due to inaccurate price scanning. Clearly, consumer goodwill would be lost when customers find out that they’ve been overcharged. Goodwill is also lost when price-scan inaccuracies are reported in the media. Various TV programs and magazines such as Dateline, Primetime Live, and Money have been conducting “sting operations” at various retailers. In addition to the goodwill lost, there are fines and legal costs associated with price-scan inaccuracy, particularly when some kind of class action lawsuit is filed against the retailer.

3.2. Costs associated with inventory-data inaccuracy

Inventory-data inaccuracy requires retailers to carry additional safety stock to ensure a certain level of customer service. For example, a retailer that replenishes its inventory every day and targets a 95% type-II service level (i.e., to satisfy all demand on 95% of the days) will have to carry safety stock equal to $1.63\sqrt{\sigma_d^2 + \sigma_e^2}$, where $\sigma_d$ is the standard deviation of demand and $\sigma_e$ is the error in measuring the inventory level at the store-SKU level. Currently, we lack the data to quantify precisely the additional safety stock that has to be stored to buffer against the impact of inventory-data inaccuracy. Furthermore, the cost of inventory-data inaccuracy is very high because, owing to the “bullwhip” effect, any distortion of demand at the retail location is likely to be amplified upstream in the supply chain.

² Operating profit was 3.85% of sales for US department stores in 1995 [3].
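The safety-stock expression in Section 3.2 can be evaluated numerically to see how measurement error adds to the buffer. This is a sketch under assumptions of ours: demand and measurement errors are treated as independent and normally distributed, the multiplier is taken from the standard normal quantile (about 1.645 at 95%, close to but not identical to the 1.63 quoted in the text), and the input values 4 and 3 are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def safety_stock(sigma_d, sigma_e, service_level=0.95):
    """Safety stock z * sqrt(sigma_d^2 + sigma_e^2).

    sigma_d -- standard deviation of daily demand
    sigma_e -- standard deviation of the inventory-measurement error
    z is the standard normal quantile for the target service level
    (an assumption of this sketch, not a value taken from the paper).
    """
    z = NormalDist().inv_cdf(service_level)
    return z * sqrt(sigma_d ** 2 + sigma_e ** 2)


# Hypothetical inputs, chosen only for illustration.
with_error = safety_stock(sigma_d=4.0, sigma_e=3.0)
without_error = safety_stock(sigma_d=4.0, sigma_e=0.0)
print(f"Safety stock with accurate records:   {without_error:.2f} units")
print(f"Safety stock with inaccurate records: {with_error:.2f} units")
```

Because the two variances add under the square root, even a measurement error smaller than demand variability raises the required buffer noticeably.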

We can estimate a portion of the costs due to inventory-data inaccuracy at a supermarket by calculating the additional labor cost incurred due to poor retail inventory data. A typical supermarket store with annual sales of roughly $10 million has two to three full-time employees walking the aisles in a store and ordering items for replenishment. Given that these two to three employees cost the retailer roughly $100,000 per year, this cost works out to 1% of sales and could be avoided had data been more accurate and had replenishment been managed through the computer data.

3.3. Cost of phantom stockouts

Estimating the revenue lost to phantom stockouts requires assumptions about the fraction of customers who walk into a Beta store looking for a specific title, and the fraction of these titles that is available at the store (i.e., target fill rate for a title). In addition, we also need to assume the fraction of customers who approach a sales associate for help in finding a book. Industry sources estimate that 55% of customers at a bookstore have a specific title in mind [4]. Assume that the store achieves its target fill rate of 95%. Finally, let us assume a certain fraction, $p$, of customers seek help from a sales associate; let us conservatively assume $p$ is 10%. We know from results presented earlier that roughly 18% of these customers (midway between the 19% and 17% estimates) fail to find the book they seek at the store. Using these assumptions, we can see that roughly 0.94% of the customers (0.55 × 0.95 × 0.10 × 0.18) fail to find the book they seek.

In a customer survey conducted at Beta, we found that 72% of those who experienced phantom stockouts would go to another (i.e., non-Beta) store, 14% would not buy the book, and only 14% would place a special order at Beta. So 86% of the time that a phantom stockout occurred, Beta would lose the sale. Hence, the resulting lost sales due to phantom stockouts alone would amount to 0.81% of sales (0.86 × 0.94%). Using annual sales of $2 billion, Beta stands to lose $16.2 million in annual sales, gross margins on which would be over $4 million.
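The chain of assumptions behind the lost-sales figure can be laid out as a short calculation. The sketch below uses only the percentages and the $2 billion sales figure quoted in the text; the variable names are ours.

```python
specific_title_share = 0.55   # customers who arrive with a specific title in mind [4]
fill_rate = 0.95              # share of sought titles actually in stock at the store
seek_help_share = 0.10        # p: customers who ask a sales associate for help
phantom_rate = 0.18           # in-stock titles that still cannot be found
lost_given_phantom = 0.72 + 0.14   # go to another store, or do not buy at all
annual_sales = 2_000_000_000

# Share of all customers who experience a phantom stockout.
phantom_customers = specific_title_share * fill_rate * seek_help_share * phantom_rate
# Share of sales lost, and the corresponding dollar amount.
lost_sales_share = lost_given_phantom * phantom_customers
lost_sales_dollars = lost_sales_share * annual_sales

print(f"Customers hit by a phantom stockout: {phantom_customers:.2%}")
print(f"Lost sales as a share of sales: {lost_sales_share:.2%}")
print(f"Lost sales: ${lost_sales_dollars / 1e6:.1f} million")
```

The estimate scales linearly in `seek_help_share`, so it should be read as a lower bound to the extent that customers who do not ask for help also fail to find in-stock titles.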

4. The impact of, and implications for, the Internet

Internet retailing will not resolve the data-accuracy problem. In fact, in many ways, Internet retailing will substantially complicate the problem for two reasons. First, with Internet retailing, customers place orders for items electronically, and retailers have to accept these orders based on availability. Clearly, accurate inventory data are essential to execute these transactions smoothly. Second, the costs associated with shipping the wrong item to an Internet consumer are often substantially higher than the costs associated with shipping erroneously to a store. Not surprisingly, many analysts [5] have identified cost-effective fulfillment as a crucial competitive weapon for Internet retailers.

Customers will expect retailers to confirm availability and, hence, promise delivery while “accepting” orders on the Internet. Many Internet retailers have had difficulty in achieving this objective. For example, a consumer who ordered a gift for a friend from a leading Internet book retailer received e-mail roughly a week later informing her that the book could not be shipped since the item was out-of-stock. Similarly, at least one Internet food retailer does not guarantee that an item ordered by a consumer will be delivered to the customer. From a consumer’s perspective, the implication of having wrong data is more substantial at an Internet retailer than at a “brick-and-mortar” store. At a “brick-and-mortar” store, consumers can recognize stockouts instantaneously, and can, if needed, substitute another item for the stocked-out item or take part, or all, of their shopping basket to another retail store. At an Internet retailer, consumers depend on the retailer having accurate inventory data, and using the data to accept or decline their orders instantaneously. Failing to do so would involve substantial costs in informing the customer of the slip-up and the associated goodwill costs.

Shipping mistakes contribute substantially to inventory-data inaccuracy at most retailers. Recall that a substantial portion of the Gamma Corporation’s problem was observed at the store prior to any consumer walking into a store. In fact, 29% of the SKUs had erroneous inventory prior to even a single customer entering the store. The cost of shipping the wrong item to a customer on the Internet can be very high. The Gamma Corporation, for example, is more concerned about data quality for the “.com” portion of the business than for its “brick-and-mortar” stores. Internet customers who receive the wrong item are usually upset, and often demand a refund or replacement. Moreover, they often justifiably refuse to pay for shipping the wrong item back to the retailer, causing the retailer’s transportation costs to rise substantially.

5. Improving retail-data quality

Retail-data quality can be improved; companies in financial services, and overnight carriers like Federal Express, have had excellent data quality for years. For example, the credit card industry is able to track each customer’s expenses down to the individual transaction. Errors in tracking transactions appear to be rare; most of us would be alarmed if even a small portion of our credit card bill was wrong each month. Stock markets also have done a remarkable job of maintaining accurate data on transactions. Finally, companies like Federal Express are able to do a remarkable job of tracking each customer’s package from pick-up to delivery. The success these companies have had with managing data quality shows that retailing can achieve a similar transition.

We see an interesting parallel between retail-data inaccuracy and process re-work in manufacturing. In the 1970s and early 1980s, many manufacturers in the United States and other Western countries found that over 10% of production in various stages of the manufacturing process required some kind of re-work. For a complicated assembly like a car, it was not unusual to find a substantially larger portion of re-work. Quite like inaccuracies in POS data, the frequency of these defects was not tracked and senior management did not know the frequency of, or the cost associated with, these defects. Today, most manufacturers have made substantial progress with product quality, and many of those that did not went out of business. In the automobile industry, companies like Ford have transformed themselves from being laggards to leaders in product quality. Other companies such as Motorola now measure defects in “parts per million”, substantially lower than the “percentages” they had been happy with not so long ago. A similar transition is needed with retail-data quality, and retailers would do well to examine the quality revolution in US manufacturing for guidance and inspiration. Central to achieving the transformation in manufacturing was better measurement of quality problems (e.g., statistical process control charts [6]), and systematic programs to improve process quality (e.g., the “seven-step process” [7]). Moreover, manufacturers realized that improving quality required them to treat factory workers as problem-solvers, and many manufacturers realized that quality could be used as a competitive weapon [8].

To managers interested in improving data quality, we offer two steps that, in our opinion, should be taken immediately. First, companies should make greater use of the data that they have stored. This might appear counter-intuitive since it could be dangerous to use bad data for decision-making. However, the recommendation makes sense because we often find companies in a vicious circle nowadays. Since data are often not used systematically to guide planning and replenishment decisions [9], many companies are not aware of, and have not taken steps to address, data quality. To break the vicious circle, companies should start using the data in decision-making.

Second, we recommend that companies start measuring data quality to the extent possible. One relatively easy step for most retailers is to use their periodic (typically annual) physical audit to measure inventory-data accuracy. Most retailers audit their stores periodically to ensure accuracy of their inventory records. Many companies that we have worked with have not looked carefully at these audit reports to identify the extent of, or the patterns in, inventory-data inaccuracy. Some other retailers use the audit to track “financial inventory”; for example, at an apparel store, they might verify the dollar value of inventory at a store, but not verify the quantity available in each style, color, and size. Thus, the audit will fail to report an error if a store had an extra unit of a blue dress, and was a unit short of a green dress in a particular style as long as the two items had identical prices. Retailers should use the audit to track the inventory of each SKU; all errors should be tracked and analyzed. Similarly, retailers should also attempt to measure price-scan inaccuracy, and phantom stockouts; the methods adopted by the FTC for the former, and by the Harvard Business School team at Beta Corporation for the latter, are reasonable starting points. Beta Corporation, for example, has made remarkable progress in this direction during the last few months, having set up a task force with representatives from various departments to address the problem. At the Gamma Corporation, inventory-data accuracy has the attention of senior management, including the CEO.
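
The difference between a financial-inventory audit and a SKU-level audit, as in the blue-dress and green-dress example above, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a method described in this paper: the record layout (`SkuRecord`), the function names, the SKU labels, the quantities, and the $49 price are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkuRecord:
    """One line of a hypothetical audit report for a single SKU."""
    sku: str
    unit_price: float
    system_qty: int   # quantity shown in the inventory record
    counted_qty: int  # quantity found during the physical audit


def financial_discrepancy(records):
    """Net dollar difference between counted and recorded inventory."""
    return sum((r.counted_qty - r.system_qty) * r.unit_price for r in records)


def sku_level_errors(records):
    """Map each mismatched SKU to (counted - recorded) units."""
    return {
        r.sku: r.counted_qty - r.system_qty
        for r in records
        if r.counted_qty != r.system_qty
    }


def record_accuracy(records):
    """Fraction of SKUs whose recorded quantity matches the count exactly."""
    if not records:
        raise ValueError("no audit records supplied")
    matching = sum(1 for r in records if r.counted_qty == r.system_qty)
    return matching / len(records)


# Hypothetical audit: one extra blue dress, one missing green dress,
# identical prices, and one SKU whose record is correct.
audit = [
    SkuRecord("style7-blue-M", 49.0, system_qty=3, counted_qty=4),
    SkuRecord("style7-green-M", 49.0, system_qty=3, counted_qty=2),
    SkuRecord("style7-red-M", 49.0, system_qty=5, counted_qty=5),
]

if __name__ == "__main__":
    print("Net dollar discrepancy:", financial_discrepancy(audit))
    print("SKU-level errors:", sku_level_errors(audit))
    print("Share of accurate records:", record_accuracy(audit))
```

For this hypothetical audit the net dollar discrepancy is zero, so a check of financial inventory alone would report no error, while the SKU-level check flags two of the three records.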

Some observers, even while agreeing that manufacturing’s quality revolution can be an inspiration for retail-data quality, point to some differences in the two industries that could make it harder to adapt manufacturing quality-revolution principles to retailing. First, in many manufacturing sectors (e.g., automotive assembly), wages on the shop floor tend to be higher than on the retail sales floor, and, possibly as a consequence, education levels tend to be higher and turnover tends to be lower. Consequently, it might be harder to implement continuous-improvement programs in retailing than in manufacturing. Second, retailers might find it difficult to control their data-quality problems completely because part of their problem is caused by consumers’ actions, and attempts to restrict these actions could reduce customer service. For example, the phantom stockout problem is caused in part by consumers who pick up a book at one location and restock it elsewhere in the store. Retailers would find it hard to restrict a customer from moving a book from one section to another. However, manufacturing remains an inspiration in that the magnitude of the quality problem was not realized for many years, and once it was realized, amazing strides were made.

6. Conclusion

This paper illustrates the existence of a substantial problem in retailing that has not been addressed by academics. Practitioners often do not understand the significance of the problem either. A senior manager at the Gamma Corporation estimated the accuracy of inventory data at his company to be “99%” just a week before the first audit results on data quality were available. Thus, above all, the paper makes a contribution by demonstrating the existence of a problem that has so far not been acknowledged.

References

[1] Lee H, Padmanabhan P, Whang S. The bullwhip effect in supply chains. Sloan Management Review 1997;38.
[2] 1998 national retail security survey. University of Florida, 1998.
[3] Financial and operating results for department and specialty stores. National Retail Federation, 1996.
[4] Rangan Kash. Xerox: book-in-time. In: Harvard Business School Case N9-699-155. Boston (MA): Harvard Business School, 1999.
[5] Glass JS, Hone DF, Ma V. Bricks and mortar.com: retailing meets the internet. In: Deutsche Bank Alex Brown Equity Research Report, August, 1999.
[6] McClain JO, Thomas LJ, Mazzola JB. Operations management: production of goods and services. 3rd ed. NJ: Prentice Hall, 1992.
[7] Shoji S, Graham A, Walden D. A new American TQM: four practical revolutions in management. Cambridge (MA): Productivity Press, 1993.
[8] Garvin DA. Managing quality: the strategic and competitive edge. The Free Press, 1988.
[9] Fisher ML, Raman A, McClelland AS. Rocket-science retailing. Boston (MA): Harvard, 1999.


Ananth Raman, Associate Professor, has been on the Harvard Business School faculty since 1993 and specializes in supply-chain management. He teaches an MBA elective course, Coordinating and Managing Supply Chains, and teaches in multiple executive education courses. His research focuses on supply-chain management for short-lifecycle products with unpredictable demand, and emphasizes production and inventory planning, and the role of incentives. He is co-director of a Sloan-Foundation-funded Harvard–Wharton research project to study retail operations and merchandising practices. Over 30 retailers from the United States, Japan, and Europe are currently participating in this study.