[IEEE 2008 16th International Conference on Advanced Computing and Communications (ADCOM) - Chennai, India (2008.12.14-2008.12.17)] 2008 16th International Conference on Advanced Computing

A Decision Support Model for Filtering RFID Read Data

Yu-Ju Tu #1, Selwyn Piramuthu #2

#1 Information Systems, University of Illinois at Urbana-Champaign, USA
[email protected]

#2 Information Systems and Operations Management, University of Florida, USA
[email protected]

Abstract— As RFID tags gain widespread acceptance in a wide variety of domains, there is a need to improve their read rate accuracy. Although this is an important issue, there is relatively little published work in this area. We consider a model for filtering data that is already being gathered in RFID systems, and use it to improve read rate accuracy. We implement the proposed model and illustrate its performance using an example.

I. INTRODUCTION

Radio frequency identification (RFID) technology has gained significant attention, spurred by its adoption by influential enterprises including Wal-Mart, Metro, and Tesco. As of March 2006, Wal-Mart reported having established an RFID program with more than three hundred major suppliers [1]. Unlike barcode technology, RFID can store detailed information about the tagged object and can be read without a direct line of sight. RFID technology is therefore well suited to dynamically identifying, locating, tracking, and monitoring merchandise. Ultimately, a goal is to reduce inventory and improve product sales.

As a technology, RFID is not new. It can be traced back to World War II, when the devices were large, heavy, and relatively power-hungry. With technological improvement, RFID devices have become smaller and cheaper. This is especially true for RFID tags. A tag is a tiny device containing an antenna, an optional power source, and a chip that stores a unique identifier. Generally speaking, there are three types of RFID tag: active, semi-passive, and passive. An active tag has its own power source, a semi-passive tag has a limited power source of its own, and a passive tag gets its power externally from the reader through inductive coupling. Although the active tag is functionally better, in practice the most commonly used tag is the passive tag, primarily because its relatively low production cost is approaching commercially viable levels. This is also one of the crucial reasons why RFID is increasingly being adopted by enterprises and is enjoying popularity across various application domains [1], [2], [8].

From a business operations perspective, RFID technology can improve operational efficiency in tasks such as manual inventory taking, warehouse picking, and order number verification. From an information technology perspective, RFID systems enable enterprises to seamlessly capture source data that can be further processed and used for making decisions. Despite these benefits, challenges revolving around RFID data are anticipated. For example, a recently presented RFID data life cycle matrix in the context of supply chains highlights some of the more common RFID data related issues. Specifically, from an application logic perspective, issues related to reading RFID data are at the top of the list of important topics in the matrix. Indeed, incorrect RFID data generated by the reader is a critical concern for enterprises, and it has been cited as a major hindrance to more widespread adoption of RFID technology [1], [4], [5]. The issue bears directly on the most essential question: whether the enterprise can rely on data generated by reading RFID tags to identify the presence of the object (i.e., tag) of interest. In reality, over 30% of tags are read incorrectly, reducing the usefulness of RFID data in higher-level business applications [3].

When RFID tags are read, we call a reading that correctly reflects the presence or absence of the tag a true reading, and a reading that does not a false reading. True readings can be further classified into true positive and true negative readings, and false readings into false negative and false positive readings. A true positive reading refers to the case where the tag is identified as present by the reader while it is in the field of the reader. A true negative reading refers to the case where the tag is read as absent by the reader because it is truly not in the field of the reader. A false negative reading refers to the case where the tag is read as absent while it is actually present in the field of the reader. For example, a metal obstacle between tag and reader could prevent the reader from accurately receiving the signal, communication between reader and tag could suffer interference from other, irrelevant signal sources, or signals could cancel each other when multiple tags are read simultaneously by a reader.

A false positive reading refers to the case where the tag is read as present by the reader when it is really not in the field of the reader. Compared to a false negative reading, a false positive reading is rarely a problem, although it does occur. For example, at a receiving center the reader may read a pallet but not a case contained in the pallet, while both pallet and case were read at the shipping center before being sent to the receiving center. This could be due to the reader incorrectly reading the case at the shipping center, rather than missing the case at the receiving center, when the reader is malfunctioning or captures a tag that is present outside the normal field of the reader. Beyond the examples mentioned above, RFID read data is also influenced by factors such as signal angle and signal reception, signal deflection, and refraction from materials, among others [2], [3], [6].

978-1-4244-2963-9/08/$25.00 © 2008 IEEE
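The four reading categories defined above can be made concrete with a small sketch. The function names `classify_reading` and `false_read_rate` are hypothetical and are not part of the paper; the sketch assumes that ground truth (whether the tag is really in the reader's field) is available, as it would be in a simulation.

```python
def classify_reading(tag_in_field: bool, reader_reports_present: bool) -> str:
    """Label one read against ground truth using the four categories in the text."""
    if reader_reports_present:
        # Reader says "present": correct only if the tag really is in the field.
        return "true positive" if tag_in_field else "false positive"
    # Reader says "absent": correct only if the tag really is not in the field.
    return "false negative" if tag_in_field else "true negative"


def false_read_rate(truth: list[bool], reported: list[bool]) -> float:
    """Fraction of reads that are false positives or false negatives."""
    if len(truth) != len(reported):
        raise ValueError("truth and reported must have the same length")
    if not truth:
        return 0.0
    wrong = sum(t != r for t, r in zip(truth, reported))
    return wrong / len(truth)


if __name__ == "__main__":
    print(classify_reading(tag_in_field=True, reader_reports_present=False))
    print(false_read_rate([True, True, False, False], [True, False, False, True]))
```

The false read rate computed this way pools false positives and false negatives into a single error measure.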

Generally speaking, low power consumption and low manufacturing cost are two key factors that have prompted enterprises to adopt RFID tag technology. They are also two of the major reasons for incorrect RFID reads, which slow large-scale deployment of RFID technology. The need to effectively filter RFID read data is therefore definite and pressing. In this paper, we attempt to address this issue by proposing a convenient decision support model to help filter RFID read data. The rest of the paper is organized as follows. We first survey related methods that have been proposed in the literature to address the incorrect RFID read problem (Section II), and then propose our model (Section III). Following this, we present a simulation scenario to test our proposed model and evaluate its performance (Sections IV and V). We conclude the paper with a brief discussion and propose further extensions to this study (Section VI).

II. LITERATURE REVIEW

The task of filtering data read from RFID tags is not trivial, simply because RFID tag reads can be either positive or negative, and these in turn are interwoven with both true and false readings, which greatly increases its complexity. So far, most related methods in the published literature employ a sliding-window approach to determine whether an RFID read is true or false. The idea behind this approach is to collect the RFID reads as they enter and exit a pre-defined time window. If the count of reads for a specific tag within this time window is less than a pre-determined threshold, the reads for this tag are considered to be false. We now consider two typical sliding-window based methods.

By controlling two parameters, the size of the window and the threshold for determining a reading to be false, Bai et al. [6] presented an approach to clean false reads while preserving the order in which the reads took place. However, in their method, the solution for filtering out false positive readings (i.e., noise) seems to apply only to the situation where some other, irrelevant tag is read. For example, if the tag of interest has identifier 1 (i.e., tag 1) and the observed RFID reads are 1, 1, and 9, their method readily identifies the third reading as a false positive. But what if the observed RFID reads are 1, 1, and 0, where 0 denotes the situation in which the reader does not read tag 1? In this case, the third reading could be a true negative reading or a false negative reading, and their method might not work well under such circumstances.

In contrast to methods that use deterministic parameters, Jeffery et al. [3] aim to determine a “right” window size automatically and to continually modify it based on the readings observed over the lifetime of the system. They claim that the main challenge for sliding-window based methods is to distinguish between periods of dropped reads and periods when a tag has moved away. Further, they claim that a small window size tends to cause false negative reads, while a large window size tends to cause false positive reads. To address this problem, they view RFID reads as a random sample of the tags in the field of the reader, so that they can develop a method grounded in statistical sampling theory to adaptively derive the size of the sliding window. However, even if the window size is set adaptively, other concerns remain. For example, what if metal shielding is present between tag and reader while the tag is being read? In such a case, there is barely any signal for the reader to detect the tag. Therefore, even with a large window size, the performance of this method is suspect: a series of false negative readings is inevitable, although the tag is truly present in the field of the reader.
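The fixed-parameter sliding-window idea described above can be sketched as follows. This is a generic illustration under simplifying assumptions, not the algorithm of Bai et al. [6] or Jeffery et al. [3]: each read cycle yields a single observed tag identifier (with 0 meaning nothing was read, as in the example in the text), and the function name `sliding_window_filter` and its default parameters are hypothetical.

```python
from collections import deque


def sliding_window_filter(reads, tag_id, window_size=5, threshold=2):
    """Report the tag as present in a cycle only if it was observed at least
    `threshold` times within the last `window_size` read cycles."""
    window = deque(maxlen=window_size)  # oldest cycle drops out automatically
    decisions = []
    for observed in reads:
        window.append(observed == tag_id)
        decisions.append(sum(window) >= threshold)
    return decisions


if __name__ == "__main__":
    # A stray read of tag 9 never reaches the threshold and is filtered out.
    print(sliding_window_filter([1, 1, 9], tag_id=9, window_size=3))
    # A single dropped read of tag 1 (the 0) is smoothed over by the window.
    print(sliding_window_filter([1, 1, 0, 1, 0, 0, 0, 0], tag_id=1, window_size=3))
```

With a window of three cycles and a threshold of two, the isolated 0 in the second stream is bridged, but a run of zeros is eventually reported as absence. The filter cannot tell whether that run is a blocked tag or a departed one, which is the limitation the text raises.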

In a manner different from the sliding-window approach, Tu and Piramuthu [8] develop methods to reduce false reads based on the belief that some of the false read problems can be alleviated if communication between tag and reader is achieved regardless of the presence of signal-blocking entities such as metal shielding. For example, a tag might be “visible” to a reader in one orientation but “invisible” in another, because the obstacle (e.g., metal shielding) affects communication between tag and reader in one orientation but not in the other. Hence, their method primarily relies on simultaneously employing multiple readers or tags in order to take advantage of varied signal orientations, although the idea of using several readers is itself not novel [7]. The essence of their method can be stated as follows: the tagged object is confirmed as present or absent if consistent reads are generated by both readers; otherwise, the tagged object is identified using a pre-determined probability P. Their method introduces a promising way to address false read related problems. In particular, the situation where the two readers’ readings differ needs to be addressed, and they propose a set of heuristics for it. In addition to the case where there is no blockage of communication between tag and reader, their approach addresses the scenario where a tagged object is visible to one reader but invisible to the other. Although a fixed P works reasonably well, there are reasons to believe that it can be improved further. Moreover, their discussion of how a suitable P is determined is limited.
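The two-reader rule summarized above (accept consistent reads, otherwise fall back on a fixed probability P) can be sketched as below. This is only an illustration of that one-sentence summary, not the heuristics of [8]; the function name `two_reader_decision`, the parameter `p_present`, and its default value are assumptions.

```python
import random


def two_reader_decision(reader1_present, reader2_present, p_present=0.5, rng=None):
    """Return True (object present) or False (object absent).

    If both readers agree, their common reading is accepted. If they disagree,
    the object is declared present with the fixed probability `p_present`.
    """
    rng = rng or random
    if reader1_present == reader2_present:
        return reader1_present
    return rng.random() < p_present


if __name__ == "__main__":
    rng = random.Random(42)  # seeded only to make the demo repeatable
    print(two_reader_decision(True, True, rng=rng))
    print(two_reader_decision(True, False, p_present=0.8, rng=rng))
```

Because P is fixed in advance, the fallback ignores any evidence about which reader has been more reliable recently.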

III. PROPOSED MODELS

After carefully considering the published literature on cleaning false RFID reads, we believe that majority voting and multiple orientations are two dominant features that have been used to improve performance in these systems. Therefore, by incorporating these features, we propose an integrated decision support model to help enterprises determine whether the objects (or, tags) are truly present. Specifically, our proposed model is also capable of dealing effectively with the situation where the two readers disagree with each other in identifying a tag. In order to fully utilize the advantages of triangulation and data inference, we use data reads from three tags and two readers. We also assume that all three tags are embedded in our object of interest. Moreover, in addition to our proposed model (Model M1), we present another model (Model M2) for comparison purposes.

A. Model M1

Model M1, our proposed model, operates based on the following three rules.

Rule 1: Use majority voting to determine whether the tagged object is present or not. For example, if the readings are Reader 1 (tag1: present, tag2: present, tag3: present) and Reader 2 (tag1: present, tag2: present, tag3: present), our model will decide that the tagged object is present because a majority of the reads indicate that it is present. If Rule 1 fails because the reads are evenly split, Rule 2 takes over.

Rule 2: Rely on the identification of the relatively more reliable reader. Here, the idea is to see whether a reader generates more consistent readings on the same tagged object, based on reads from the ten most recent rounds. For example, if in the past ten rounds a reader’s readings on the three tags are always either Reader 1 (tag1: present, tag2: present, tag3: present) or Reader 1 (tag1: absent, tag2: absent, tag3: absent), this reader is our preferred reader and our model will rely on it, unless the other reader has been equally consistent. If Rule 2 fails as well, Rule 3 makes the call.

Rule 3: Follow the model’s own decision on the same tagged object from the previous read cycle. For example, if the readings are Reader 1 (tag1: present, tag2: present, tag3: present) and Reader 2 (tag1: absent, tag2: absent, tag3: absent), Rule 1 fails. If both readers’ readings were similarly consistent in the immediately preceding ten rounds, which may not occur often, Rule 2 fails as well. Our model will then determine that the tagged object is present if, in the immediately preceding read cycle, it determined that the object was present.
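One possible reading of the three rules of Model M1 is sketched below. The paper does not spell out several details, so the following are assumptions of this sketch rather than the authors' implementation: Rule 1 pools all six reads and fails on a 3-3 tie; a reader's consistency score is the number of recent rounds in which its three tag reads all agreed; Rule 2 fails when the scores are equal, and otherwise follows the majority of the preferred reader's three reads; and the decision before the first cycle defaults to absent. The class name `ModelM1` is hypothetical.

```python
from collections import deque


class ModelM1:
    def __init__(self, history_len=10, initial_decision=False):
        # Per-reader record of whether each recent round was internally consistent.
        self.history = [deque(maxlen=history_len), deque(maxlen=history_len)]
        self.previous = initial_decision  # assumed default before any read cycle

    def decide(self, reader1, reader2):
        """reader1, reader2: three booleans each (tag1, tag2, tag3), True = present."""
        readings = (tuple(reader1), tuple(reader2))
        decision = self._rule1(readings)
        if decision is None:
            decision = self._rule2(readings)
        if decision is None:
            decision = self.previous  # Rule 3: repeat the previous decision
        # Record this round only after deciding, so Rule 2 uses past rounds.
        for hist, reads in zip(self.history, readings):
            hist.append(len(set(reads)) == 1)
        self.previous = decision
        return decision

    @staticmethod
    def _rule1(readings):
        """Majority vote over all reads; None signals a tie (rule fails)."""
        votes = sum(readings[0]) + sum(readings[1])
        total = len(readings[0]) + len(readings[1])
        if 2 * votes == total:
            return None
        return 2 * votes > total

    def _rule2(self, readings):
        """Follow the reader with more consistent recent rounds; None on a tie."""
        score1, score2 = sum(self.history[0]), sum(self.history[1])
        if score1 == score2:
            return None
        preferred = readings[0] if score1 > score2 else readings[1]
        return 2 * sum(preferred) > len(preferred)


if __name__ == "__main__":
    m1 = ModelM1()
    print(m1.decide((True, True, True), (True, True, True)))     # Rule 1
    print(m1.decide((True, True, True), (False, False, False)))  # Rule 3
    print(m1.decide((True, False, True), (False, False, False))) # Rule 1
    print(m1.decide((True, True, True), (False, False, False)))  # Rule 2
```

In the demo, the last call is decided by Rule 2: Reader 1 produced one internally inconsistent round, so Reader 2 is preferred and its "absent" reading is followed.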

B. Model M2

Model M2, our base model, operates based on the following rules. If both readers agree on tag1’s presence, the model will determine that the tagged object is present; if not, the model will check whether an agreement exists on tag2’s presence; if not, the model will check whether an agreement exists on tag3’s presence; otherwise, the model will make the call randomly. In other words, in this last case there is a 50% chance that the tagged object is identified as being present.
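The cascade used by Model M2 can be sketched as follows. The text only states what happens when the readers agree that a tag is present; this sketch additionally assumes that agreement on a tag's absence yields an "absent" decision, which is an interpretation and not something the paper states. The function name `model_m2` is hypothetical.

```python
import random


def model_m2(reader1, reader2, rng=None):
    """reader1, reader2: three booleans each (tag1, tag2, tag3), True = present.

    Check tag1, then tag2, then tag3. The first tag on which both readers
    agree decides the outcome (assumption: agreement on absence means absent).
    If the readers disagree on every tag, decide at random with probability 0.5.
    """
    rng = rng or random
    for read1, read2 in zip(reader1, reader2):
        if read1 == read2:
            return read1
    return rng.random() < 0.5


if __name__ == "__main__":
    rng = random.Random(7)  # seeded only to make the demo repeatable
    print(model_m2((True, False, True), (True, True, False), rng=rng))
    print(model_m2((True, True, True), (False, False, False), rng=rng))
```

Unlike a majority vote, the cascade stops at the first agreement, so a single agreed read on tag1 outweighs whatever the remaining four reads say.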

IV. MODEL SIMULATION

We simulate both of our models 10 times, with 1000 readings per run, under the following assumptions. The three tags (tag 1, tag 2, and tag 3) are always present (or, absent) together when read by the readers (Reader 1 and Reader 2). If our object of interest (Target 1) is present near the reader, all three of its embedded tags are present in the field of the reader. The probability that the object is present is set to 0.5, so that random transitions of the target are also considered. Moreover, the read accuracy rates for the readers (Reader 1 and Reader 2) are set at 40% (low), 70% (fair), and 90% (good); i.e., when a tag is truly present in the field of a reader, the reader successfully identifies its presence with probability 0.4, 0.7, or 0.9 respectively. Consequently, there are nine probability scenarios to be tested: P1 (Reader 1 = 0.4, Reader 2 = 0.4), P2 (Reader 1 = 0.4, Reader 2 = 0.7), P3 (Reader 1 = 0.7, Reader 2 = 0.4), P4 (Reader 1 = 0.4, Reader 2 = 0.9), P5 (Reader 1 = 0.9, Reader 2 = 0.4), P6 (Reader 1 = 0.7, Reader 2 = 0.7), P7 (Reader 1 = 0.7, Reader 2 = 0.9), P8 (Reader 1 = 0.9, Reader 2 = 0.7), and P9 (Reader 1 = 0.9, Reader 2 = 0.9).
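The scenario grid and a single simulated read cycle can be sketched as below. The scenario values are taken from the text. Two points are assumptions of this sketch: each tag read is drawn independently, and when the object is absent a reader reports absence with the same accuracy (the text only specifies accuracy for a tag that is present). The names `SCENARIOS`, `simulate_reader`, and `simulate_round` are hypothetical.

```python
import random

# (Reader 1 accuracy, Reader 2 accuracy) for each scenario, as listed in the text.
SCENARIOS = {
    "P1": (0.4, 0.4), "P2": (0.4, 0.7), "P3": (0.7, 0.4),
    "P4": (0.4, 0.9), "P5": (0.9, 0.4), "P6": (0.7, 0.7),
    "P7": (0.7, 0.9), "P8": (0.9, 0.7), "P9": (0.9, 0.9),
}


def simulate_reader(present, accuracy, rng, n_tags=3):
    """One reader's reads of the embedded tags; each read is correct with
    probability `accuracy` (assumed symmetric for present and absent)."""
    return tuple(
        present if rng.random() < accuracy else not present
        for _ in range(n_tags)
    )


def simulate_round(accuracies, rng, p_present=0.5):
    """Draw the true state of the object, then the reads of both readers."""
    present = rng.random() < p_present
    reads = tuple(simulate_reader(present, acc, rng) for acc in accuracies)
    return present, reads


if __name__ == "__main__":
    rng = random.Random(2008)  # seeded only to make the demo repeatable
    for name, accuracies in SCENARIOS.items():
        truth, (reader1, reader2) = simulate_round(accuracies, rng)
        print(name, truth, reader1, reader2)
```

A full experiment in the spirit of the text would repeat `simulate_round` 1000 times per run, feed the reads to each model, and average the false read rate over 10 runs.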

V. RESULTS

Based on the scenario settings above, the results, in the form of false read rates, are presented in Figure 1. Figure 1 shows that the false read rates are clearly correlated with the accuracy rates of the readers and that, generally speaking, our proposed model M1 performs better than the base model M2. A few other interesting observations are discussed below.

Firstly, M1 is clearly dominant: it outperforms M2 in almost all of the probability scenarios considered, the exception being P1. P1 is a peculiar case in which both M1 and M2 have rather high false read rates (around 70%). In other words, both models are far worse than a model that simply flips a coin to determine whether the objects (or, tags) are truly present. Since M2 decides randomly (i.e., with probability 0.5) in certain situations, it makes sense that M2 results in a slightly lower false read rate at P1. In practice, however, situations where both readers remain at a low accuracy rate should occur infrequently. An example would be both readers being out of order during the same time period and therefore continuing to generate incorrect RFID read data.

Secondly, we observe that the false read rates under M1 and M2 are very similar at P1, P6, and P9. This is primarily because in these scenarios both readers are at the same accuracy rate level. Since the main strength of M1 is choosing a “better” reader to follow, it can only slightly outperform M2 when both readers are at the same accuracy rate. Nevertheless, M1 performs better at P6 and P9 but worse at P1. The reason is that M1 uses majority voting to make its decision, while M2 does not. At P6 and P9 both readers are at relatively high accuracy rates, so the majority of the reads from the two readers is more likely to be correct, resulting in a lower false read rate for M1. Conversely, at P1 both readers have quite low accuracy rates, so the majority of the reads from the two readers is more likely to be incorrect, which partially explains the lower false read rate of M2 at P1.

Thirdly, the gap between M1 and M2 is consistent (or, flat) across P2 and P3, as well as across P4 and P5 and across P7 and P8. This can be explained by the fact that within each of these pairs the difference in accuracy rates between the two readers is the same. For example, at P2 Reader 1 is at a 40% accuracy rate and Reader 2 is at 70%, while at P3 Reader 2 is at 40% and Reader 1 is at 70%. Therefore, for both M1 and M2, the average accuracy rate of the two readers is the same (55%). Moreover, since the “better” reader for M1 is at the same accuracy rate (70%) in both scenarios, the overall difference in false read rates between M1 and M2 has to be very similar at P2 and P3. The same explanation applies to P4 and P5 as well as to P7 and P8.

Finally, it is not surprising that M1 greatly outperforms M2 at P4 and P5. This is primarily because the difference in accuracy rates between the two readers is highest at P4 and P5 compared to the other cases. At P4 and P5 one reader is at a 40% accuracy rate while the other is at 90%, so that M1 is able to make full use of its edge in selecting a “better” reader to follow and thus generates a lower false read rate than M2.
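The argument that majority voting helps when both readers are accurate and hurts when both are poor can be illustrated with a back-of-the-envelope calculation. This is not an analysis from the paper: it assumes that all six reads (three tags, two readers) are independent and that each read is correct with the reader's accuracy rate. The function name `majority_outcome_probs` is hypothetical.

```python
from math import comb


def majority_outcome_probs(p_reader1, p_reader2, n_tags=3):
    """Return (P(majority correct), P(tie), P(majority wrong)) over the
    2 * n_tags reads, assuming independent reads with the given accuracies."""
    def pmf(p):
        # Binomial distribution of the number of correct reads for one reader.
        return [comb(n_tags, k) * p**k * (1 - p) ** (n_tags - k)
                for k in range(n_tags + 1)]

    pmf1, pmf2 = pmf(p_reader1), pmf(p_reader2)
    correct = tie = wrong = 0.0
    for k1, prob1 in enumerate(pmf1):
        for k2, prob2 in enumerate(pmf2):
            n_correct = k1 + k2
            if n_correct > n_tags:       # more than half of the reads correct
                correct += prob1 * prob2
            elif n_correct == n_tags:    # even split
                tie += prob1 * prob2
            else:
                wrong += prob1 * prob2
    return correct, tie, wrong


if __name__ == "__main__":
    for label, accs in [("P1", (0.4, 0.4)), ("P6", (0.7, 0.7)), ("P9", (0.9, 0.9))]:
        correct, tie, wrong = majority_outcome_probs(*accs)
        print(f"{label}: correct={correct:.3f} tie={tie:.3f} wrong={wrong:.3f}")
```

Under these assumptions, at P1 the majority is wrong more often than it is right (roughly 0.54 against 0.18, with ties making up the rest), whereas at P9 the majority is correct in about 98% of rounds. This is consistent with the direction of the explanation in the text, although the simulated rates in Figure 1 also depend on Rules 2 and 3.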

VI. CONCLUSION

It is clear that RFID technology continues to be of interest to enterprises, and better, more powerful, and lower-cost RFID devices are continually being introduced as they are used in applications across various domains. Nonetheless, data unreliability induced by incorrect RFID reads remains a major concern. Although RFID data reads occur at the lowest level in a decision-making scenario, incorrect reads can translate into significant losses for the corresponding enterprises because of the bullwhip effect; i.e., impaired lower-level outputs directly or indirectly permeate and affect the quality of every layer of higher-level decision making, causing a cascade of lower-quality decisions that could culminate in a disaster for the enterprise. Indeed, since error-free reads exist almost only in laboratories, where researchers’ focus is generally different from that of the enterprises, it is clearly imperative to develop an approachable and workable solution for these enterprises to filter RFID read data.

We presented a convenient decision support model to help enterprises filter out incorrect RFID data reads when an RFID tag is present (or, absent) in the field of a reader. Inheriting the merits of multiple-reader based methods, our model has advantages such as the ability to use inference on the collected read data to minimize the additional burden on the system and the ability to identify tags that move rapidly in and out of the field of the reader, among others. In addition, it is not plagued by inconsistent tag identifications from multiple readers, for example when one reader indicates that a tag is present while the other reader does not. Results from our preliminary evaluations appear promising and show evidence that our model is a step toward the goal of maintaining relatively low false RFID read rates.

Some issues still warrant further consideration. For example, our model performs poorly when multiple readers simultaneously have low read rate accuracy, a situation that is rare but not impossible to encounter in a realistic setting. Additionally, we are in the process of exploring whether a descriptive model is more competitive than a prescriptive model in helping enterprises address the false RFID read problem. This is because, from the perspective of a tagged object of interest, what really matters varies across application domains and enterprises. For instance, it is likely that some firms would never worry about false positive RFID reads, since in their working environment it is extremely rare to encounter such a problem. Therefore, perhaps a customized RFID data read filtering model is more beneficial to these firms. In sum, further improvements and innovations in effectively collecting, refining, and appropriately using RFID data are clearly necessary. This is primarily because enterprises expect RFID technology to provide them with more reliable source data for making decisions, and we hope that our contribution in this paper will have a positive impact on this prospect.

REFERENCES

[1] Y. F. Niederman, R. Mathieu, R. Morley, and I. K. Won, “Examining RFID Applications in Supply Chain Management,” Communications of the ACM, vol. 50, pp. 93-101, Jul. 2007.

[2] S. S. Chawathe, V. Krishnamurthy, S. Ramachandran, and S. Sarma, “Managing RFID Data,” in Proc. of the International Conference on Very Large Data Bases (VLDB), Toronto, Canada, Aug. 2004.

[3] S. R. Jeffery, M. N. Garofalakis, and M. J. Franklin, “Adaptive Cleaning for RFID Data Streams,” in Proc. of the 32nd VLDB Conference, pp. 163-174, 2006.

[4] H. Vogt, “Efficient Object Identification With Passive RFID Tags,” in Proc. International Conference on Pervasive Computing (Pervasive2002), pp. 98-113, Apr. 2002.

[5] J. Brusey, C. Floerkemeier, M. Harrison, and M. Fletcher, “Reasoning about Uncertainty in Location Identification with RFID,” in Proc. of the IJCAI-03 Workshop on Reasoning with Uncertainty in Robotics, 2003.

[6] Y. Bai, F. Wang, and P. Liu, “Efficiently Filtering RFID Data Streams,” in Clean DB Workshop, pp. 50-57, 2006.

[7] B. Carbunar, A. Grama, J. Vitek, and O. Carbunar, “Redundancy and Coverage Detection in Sensor Networks,” ACM Transactions on Sensor Networks, 2(1), pp. 94-128, Feb. 2006.

[8] Y. Tu and S. Piramuthu, “Identifying RFID-Embedded Objects in Pervasive Healthcare Applications,” Decision Support Systems, 2008.

[Figure 1: line plot of false reading rate (vertical axis, 0% to 80%) against probability scenarios P1 to P9 (horizontal axis), with one series each for M2 and M1.]

Fig. 1. Average false read rates for Reader 1 and Reader 2
