This article was downloaded by: [University of Saskatchewan Library] On: 05 May 2013, At: 02:29 Publisher: Routledge Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK Mathematical Thinking and Learning Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/hmtl20 The Longitudinal Development of Understanding of Average Jane M. Watson & Jonathan B. Moritz Published online: 18 Nov 2009. To cite this article: Jane M. Watson & Jonathan B. Moritz (2000): The Longitudinal Development of Understanding of Average, Mathematical Thinking and Learning, 2:1-2, 11-50 To link to this article: http://dx.doi.org/10.1207/S15327833MTL0202_2 PLEASE SCROLL DOWN FOR ARTICLE Full terms and conditions of use: http://www.tandfonline.com/page/terms-and-conditions This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. The publisher does not give any warranty express or implied or make any representation that the contents will be complete or accurate or up to date. The accuracy of any instructions, formulae, and drug doses should be independently verified with primary sources. The publisher shall not be liable for any loss, actions, claims, proceedings, demand, or costs or damages whatsoever or howsoever caused arising directly or indirectly in connection with or arising out of the use of this material.

The Longitudinal Development of Understanding of Average

Jane M. Watson and Jonathan B. Moritz
Faculty of Education
University of Tasmania
Hobart, Australia

The development of the understanding of average was explored through interviews with 94 students from Grades 3 to 9, follow-up interviews with 22 of these students after 3 years, and follow-up interviews with 21 others after 4 years. Six levels of response were observed based on a hierarchical model of cognitive functioning. The first four levels described the development of the concept of average from colloquial ideas into procedural or conceptual descriptions to derive a central measure of a data set. The highest two levels represented transferring this understanding to one or more applications in problem-solving tasks to reverse the averaging process and to evaluate a weighted mean. Usage of ideas associated with the three standard measures of central tendency and with representation is documented, as are strategies for problem solving. Implications for mathematics educators are discussed.

36. What is the average daily attendance at a school, if there are present on Monday, Tuesday, etc., the following numbers of children, 43, 39, 37, 51, and 50?

37. Give yourself the average of 44, and other days’ attendance except Tuesdays [sic], and find it.

40. I buy three parcels of eggs, whose numbers are as 5, 7, 9 and their prices per dozen as 8, 6, 4. How must I sell them so as neither to win or lose? (Capel, 1885, p. 205)

Students have been resolving problems involving the arithmetic mean for over 100 years. The examples just given show that working the algorithm in a forward direction as well as working the algorithm backward and determining a weighted mean have been among the problem-solving tasks for students.
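As an illustration only, the three Capel problems can be worked in a few lines of Python. This sketch assumes that "numbers are as 5, 7, 9" means the parcel sizes are in the ratio 5 : 7 : 9, and that the four known days in Problem 37 are Monday, Wednesday, Thursday, and Friday from Problem 36; the variable names are hypothetical.

```python
from fractions import Fraction

# Problem 36: forward direction, the mean of five days' attendance.
attendance = [43, 39, 37, 51, 50]
mean_attendance = Fraction(sum(attendance), len(attendance))

# Problem 37: backward direction. Given the mean (44) and every day
# except Tuesday, recover Tuesday's attendance.
known_days = [43, 37, 51, 50]  # assumed: Monday, Wednesday, Thursday, Friday
tuesday = 44 * (len(known_days) + 1) - sum(known_days)

# Problem 40: weighted mean. Each price per dozen is weighted by the
# relative size of its parcel (assumed ratio 5 : 7 : 9).
sizes = [5, 7, 9]
prices = [8, 6, 4]
break_even = Fraction(sum(s * p for s, p in zip(sizes, prices)), sum(sizes))

print(mean_attendance)  # 44
print(tuesday)          # 39
print(break_even)       # 118/21
```

Under these assumptions the break-even price is 118/21, or about 5.62 per dozen, which differs from the unweighted mean of the three prices (6).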

MATHEMATICAL THINKING AND LEARNING, 2(1&2), 11–50
Copyright © 2000, Lawrence Erlbaum Associates, Inc.

Requests for reprints should be sent to Jane M. Watson, Faculty of Education, University of Tasmania, GPO Box 252–66, Hobart, Tasmania 7001, Australia. E-mail: [email protected]

For a long time, average in the school curriculum was synonymous with the arithmetic mean (e.g., Pendlebury & Robinson, 1928), and by the middle of the 20th century, the mean was still the measure of central tendency discussed in mathematics books (e.g., Denbow & Goedicke, 1959; Hart, 1953). Throughout the century, books covering elementary statistics for users (e.g., Holman, 1938) noted the mean, median, and mode, but these three measures only reached the school curriculum in a definitive way with the formal inclusion of probability and statistics, or chance and data, in the mathematics curriculum in the past decade. The concept of average, with all three standard measures, has now been included in curricula as part of data handling in meaningful contexts, reflecting the requirement of an expanded curriculum to find representative values for data sets in varying contexts, not just to illustrate an arithmetic procedure.

The importance of the three standard measures of average has been acknowledged in curriculum documents of many countries, including Australia (Australian Education Council [AEC], 1991, 1994), England and Wales (Department for Education [DFE], 1995), and the United States (National Council of Teachers of Mathematics, 1989). These documents have emphasized understanding and usage of measures of average in appropriate contexts. For example, at Key Stage 2 of Data Handling in England and Wales, “collecting, representing and interpreting data” includes “understand and use measures of average, leading toward the mode, the median and the mean in relevant contexts” (DFE, 1995, p. 10). In New Zealand, the curriculum also stresses the importance of “statistical investigations within a range of meaningful contexts” in relation to study of the mean, median, and mode (Ministry of Education, 1992, p. 186).

The importance of context in understanding and applying average has also been recognized in research that documents students’ abilities in this area. Gal, Rothschild, and Wagner (1990) observed that students rarely use the mean spontaneously in the context of comparing two data sets. Mokros and Russell (1995) asked students about average in the contexts of pocket money and of potato chip prices. Watson and Moritz (in press) asked students what it meant “if someone said you were average” and what was meant by the term average in a context where house prices were being reported in a newspaper article. Although the procedures students used to find a measure of average were of interest in these studies, the contexts in which average was applied provided much of the information on student understanding.

Interaction between research and practice has been prominent in this area of mathematics education. From the beginning of the Used Numbers project (e.g., Friel, Mokros, & Russell, 1992), the links between research (Mokros & Russell, 1995) and advocacy of practice (Friel, 1998; Russell & Mokros, 1996) have been clearly seen. Similarly, the work of Gal (1995) and colleagues has led to specific suggestions for teachers in relation to average. Curriculum documents, both official and commercial, and articles for teachers have stressed the importance of procedure and context for understanding. Gal, for example, advocated that “to evaluate students’ understanding of averages, teachers should use tasks presenting a genuine need to use an average” (p. 99). Similar admonitions have been made to teachers in Australia (Watson, 1996, 1998).

This investigation continues research on the concept of average by considering students’ understanding of the term in everyday usage and in the contexts of number of hours watching television and number of children in a family. Two more complex problem-solving tasks include postulating a distribution from a given average and determining a weighted mean.

PREVIOUS RESEARCH

Previous research is summarized in the next two sections. The first reviews studies of understanding of average by researchers over the last two decades to provide a historical background for the research presented here. The second section reviews two studies arising from a research project of which this investigation is a part.

Studies by Other Researchers

Research on students’ understanding of average began at the university level with the complex concept of weighted mean. Pollatsek, Lima, and Well (1981) reported on tertiary students’ inability to handle problems associated with weighted means. These difficulties were confirmed by Mevarech (1983), who considered weighted mean errors from the point of view of the mathematical axioms involved, and by Reed (1984), who considered them for algebra word problems in the contexts of average speed, work time, and mixtures. Hardiman, Well, and Pollatsek (1984) attempted to alleviate the problem with specific instruction on the balancing properties of the mean, whereas Mevarech used a feedback–corrective procedure during instruction. Both reported some success in achieving improved performances.
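A minimal sketch of the kind of weighted-mean error at issue may help. The numbers below are hypothetical and are not taken from any of the cited studies; the sketch simply contrasts averaging two group means directly with weighting each mean by its group size.

```python
# Hypothetical data: two groups with different sizes and mean scores.
group_sizes = [10, 30]
group_means = [80.0, 60.0]

# Common error: average the two means as if the groups were equal in size.
unweighted = sum(group_means) / len(group_means)

# Weighted mean: each group mean counts in proportion to its size.
weighted = sum(n * m for n, m in zip(group_sizes, group_means)) / sum(group_sizes)

print(unweighted)  # 70.0
print(weighted)    # 65.0
```

The two results coincide only when the groups are the same size, which is why ignoring the weights goes unnoticed in simpler problems.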

Subsequent research investigated properties related to the arithmetic mean. Goodchild (1988) found that high school students’ understanding of average generally was not sophisticated in terms of representativeness, location, and expectation. Strauss and Bichler (1988), working with 8- to 14-year-old students, used tasks involving situations of equal sharing without mention of the term average. Seven properties of average they investigated included mathematical properties (e.g., the sum of deviations from the mean is zero), statistically abstract properties (e.g., the average can be a number with no counterpart in physical reality), and representativeness. Different properties emerged in the responses of students at different ages, although no specific developmental structure was posited for the acquisition of understanding of the properties. This work was partially replicated by Leon and Zawojewski (1991), with results confirming those of the original researchers.
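Two of the properties named by Strauss and Bichler (1988) can be checked numerically. The following sketch uses hypothetical data (the number of children in five families) and exact fractions; it is an illustration of the properties, not a reconstruction of the tasks used in that study.

```python
from fractions import Fraction

# Hypothetical data: number of children in each of five families.
children = [1, 2, 2, 3, 5]
mean = Fraction(sum(children), len(children))

# Mathematical property: deviations from the mean sum to zero.
deviations = [c - mean for c in children]
print(sum(deviations))   # 0

# Statistically abstract property: the mean need not be a value that
# could occur in reality (no family has 2.6 children).
print(float(mean))       # 2.6
print(mean in children)  # False
```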

More recently, Mokros and Russell (1995) used a range of problems to investigate the different approaches of students to the average concept and the relation between average and a data set. Their in-depth interviews, with 21 children in Grades 4, 6, and 8, detailed five approaches to solving a series of concrete data-based problems. Two approaches, considering average as mode or as algorithm for the mean, were termed nonrepresentative in the sense that they did not imply a concept of average as a representative value. Three approaches, construing average in terms of reasonable value, midpoint, or point of balance, were considered to reflect some notion of average as a representative value for the data set under consideration.

Cai (1995, 1998) considered the relation between knowing the procedure to calculate the arithmetic mean and understanding the concept of the mean well enough to work backward to provide missing data values in a context. Cai (1995) found that 88% of 250 Grade 6 students could identify correctly the algorithm for the mean in a multiple-choice question, but only half could apply the procedure to determine a missing data value to yield a certain average. Many of those who achieved success used a strategy of reversing the averaging algorithm, including undoing division by multiplying, whereas others appeared restricted to guess-and-check methods while always working the algorithm forward (Cai, 1998). Despite the fact that the data were presented in a pictograph, very few students used the leveling or balancing approaches advocated by some authors (Friel, 1998; Friel et al., 1992; Meyer, Browning, & Channell, 1995).
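The two strategies Cai (1998) describes can be contrasted in a short sketch. The function names, the data values, and the search bound are all hypothetical; the sketch only assumes that one whole-number value is missing and that the target mean is known.

```python
from fractions import Fraction

def missing_by_reversal(known, target_mean):
    """Undo the division by multiplying, then subtract the known total."""
    n = len(known) + 1
    return target_mean * n - sum(known)

def missing_by_guess_and_check(known, target_mean, max_guess=1000):
    """Run the forward algorithm on successive guesses until one fits."""
    for guess in range(max_guess + 1):
        if Fraction(sum(known) + guess, len(known) + 1) == target_mean:
            return guess
    return None  # no whole-number guess up to max_guess works

known = [9, 3, 6]  # hypothetical data values
target = 7         # hypothetical required mean

print(missing_by_reversal(known, target))         # 10
print(missing_by_guess_and_check(known, target))  # 10
```

Both routes reach the same value here, but only the reversal treats the mean as a relation between the total and the number of values, which is the conceptual step the guess-and-check route avoids.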

Studies by the Current Researchers

Watson and Moritz (in press) considered the development of concepts of average for students in Grades 3 to 11 using four survey items administered to 2,250 students. Two items were designed to access everyday meanings of average in authentic contexts, including the use of the more specialized term median. The other two items were multiple choice, including a request for explanation, with one asking for the median value of a small data set. These two items were designed to assess application and calculation in straightforward settings. Using a neo-Piagetian model of cognitive development (Biggs & Collis, 1982, 1991), student responses to each item were classified in a hierarchical fashion that reflected the structure of the observed learning outcomes within the target domain of the task set. For each item, the target domain was associated with the learning that takes place during the years of schooling. Within the expectations set by the tasks, the following levels of performance were observed related to the construction of at least one of the three basic measures of average:

• Prestructural responses did not include any clear concept of average. For example, when asked, “If someone said you were average, what would it mean?” a Grade 5 student replied, “You are not the best friend for me.”

• Unistructural responses consisted of a single relevant aspect from the domain of the task set. Another Grade 5 student in answering the aforementioned question said, “That you were okay.”

• Multistructural responses included two or more aspects of the domain of the task, usually presented in sequence. For example, a Grade 5 student responded to the question with “It is if you have some numbers, you add them together and then divide by how many there are,” reflecting the algorithm for the arithmetic mean. A Grade 9 student responded, “Not really good and not really bad; in between,” reflecting the idea inherent in the median.

• Relational responses exhibited an integrated understanding of the relations involved in the task set. Students responding at this level, for example, knew that the median was the middle value of a data set and could find it after ordering the values given. When asked why the median might be used rather than the mean to report house prices, a Grade 9 student appreciated the merits of each for representation in the context: “Because it shows a fair representation of the prices. If the average was used, a particularly cheap or expensive house would muck up the fair representation.”
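The Grade 9 student's point about house prices can be illustrated with a small sketch. The prices are hypothetical (they are not data from the survey), and the example uses Python's standard `statistics` module to compute the three standard measures.

```python
import statistics

# Hypothetical house prices in thousands of dollars; one house is
# unusually expensive.
prices = [90, 100, 100, 110, 600]

mean_price = statistics.mean(prices)
median_price = statistics.median(prices)
mode_price = statistics.mode(prices)

print(mean_price)    # 200
print(median_price)  # 100
print(mode_price)    # 100
```

With these values the single expensive house doubles the mean while leaving the median and mode unchanged, which is the sense in which the median gives "a fair representation" in that context.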

Over Grades 3 to 11, response levels improved with grade: Seventy percent of Grade 3 students responded at the prestructural level, and none gave relational responses; at Grade 11, only one student response was prestructural, and 54% were relational.

Watson and Moritz (in press) also reported percentages of open-ended responses associated with the three ideas of mean, median, and mode. For the item about the student “being average,” of those responses that could be related to one of the three terms, 1% were related to the mean, 86% to the median, and 18% to the mode. (The percentages add up to more than 100 due to some multiple responses.) When asked what average meant in the phrase “the average wage earner finally can afford to buy the average home,” 9% of responses were related to the mean, 60% to the median, and 36% to the mode (again, some multiple replies). These results indicate that, in these contexts, a majority of students conceptualized average in terms of “middle,” with the idea of “the most frequent” used by fewer students and the algorithmic idea of “mean” seldom employed. This investigation looks more deeply into students’ understanding of average, particularly in contexts requiring the arithmetic mean.

Students’ application of average in a problem-solving setting was reported by Watson and Moritz (1999) based on interviews with 88 students in Grades 3 to 9, a subset of the students discussed in this investigation. Assigned tasks of comparing two data sets presented in graphical form (cf. Gal, Rothschild, & Wagner, 1989), students often compared the data sets by totaling or by making visual comparisons, rather than by employing the arithmetic mean or any other measure of average. When comparing graphs of two data sets of different size, only 2 of 37 middle-school students (Grades 5 to 7) and 8 of 28 Grade 9 students used the mean to compare the sets. In this more complex context of unequal-sized sets, responses that addressed the unequal size appropriately were categorized into three levels: Unistructural responses used a single visual comparison to justify a decision about which group “had done better”; multistructural responses used multiple-step visual comparisons or numerical calculations involving the arithmetic mean to compare groups; and relational responses integrated all available information, both from visual comparison and calculation of means, to support a decision on which group “had done better.” These three response levels paralleled those of the earlier study of the understanding of average (Watson & Moritz, in press). However, the overall cognitive functioning in terms of comparing two data sets represented a higher cycle of functioning related to application of the average concept in the context of the task set.

It is important to note that the task based on comparing data sets analyzed by Watson and Moritz (1999) did not set out to study the students’ use of the arithmetic mean. In fact, the statistic was not mentioned at any time by the interviewer while that task was completed. The observations about the use of the mean were thus based on spontaneous usage of the statistic in the given context. Some of the questions included in this study, although reflecting the level of complexity of the task comparing two unequal-sized data sets, address issues more specifically related to average and the contexts in which the concept is expected to be used.

RESEARCH QUESTIONS

This investigation aimed to extend the previous research on students’ approaches to dealing with the concept of average in several ways. We did not find any previous studies that conducted longitudinal interviews to explore how student understanding changed over a number of years, which was a feature of this investigation. In particular, this investigation sought to do the following:

1. Confirm, in an in-depth interview setting, the survey findings of Watson and Moritz (in press) concerning (a) the structure associated with building an initial concept of average; and (b) the use of ideas associated with mean, median, and mode in various contexts.

2. Document developmental change in approaches to the concept of average and associated usage of ideas related to mean, median, and mode that occurs for students over a 3-year or 4-year period.

3. Document evidence of ideas associated with representativeness in relation to development of the concept of average (Mokros & Russell, 1995).

4. Consider the development of problem-solving skills in relation to applying the arithmetic mean in two complex tasks.

METHOD

Participants

The entire investigation was based on 137 student interviews, involving an initial sample and two longitudinal subsamples. The initial sample consisted of 94 students from two Australian states, South Australia and Tasmania, in Grades 3 to 9. Tasmanian students included nearly equal numbers of boys and girls, drawn from government schools: 21 Grade 3 students (8–9 years old), 23 Grade 6 students (11–12 years old), and 20 Grade 9 students (14–15 years old). South Australian students were selected from a private girls’ school, including 6 Grade 3 students (8–9 years old), and 8 students each from Grades 5 (10–11 years old), 7 (12–13 years old), and 9 (14–15 years old). Elementary school is completed at Grade 6 in Tasmania and Grade 7 in South Australia. All students had taken part in a large-scale written survey of concepts in chance and data (Watson, 1994) and were selected for interview to cover a range of apparent abilities, including students who gave interesting or unusual responses to the survey. Teachers confirmed that the students were able and willing to be interviewed, and students were told they could stop the interview at any time. None did.

For Longitudinal Study 1, which occurred 3 years after the first data collection, 22 of the South Australian students (5 from Grade 3, 6 from Grade 5, 6 from Grade 7, and 5 from Grade 9) were again interviewed using the same protocol. Similarly for Longitudinal Study 2, which occurred 4 years after the first data collection, 21 of the Tasmanian students (8 from Grade 3, 9 from Grade 6, and 4 from Grade 9) were interviewed again. Most attrition was due to change in schools or residency of some students, although ability-related attrition occurred for Tasmanian Grade 9 students who had left school 4 years later; only those studying at the University of Tasmania were located. All students located agreed to participate.
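The sample sizes reported in this section can be tallied as a quick consistency check. The counts below are those stated in the text; the dictionary names are hypothetical, and the follow-up groups are keyed by each student's grade at the first interview.

```python
# Initial sample, by state and grade (counts as reported in the text).
tasmania = {3: 21, 6: 23, 9: 20}
south_australia = {3: 6, 5: 8, 7: 8, 9: 8}
initial = sum(tasmania.values()) + sum(south_australia.values())

# Follow-up interviews, keyed by grade at the first interview.
longitudinal_1 = {3: 5, 5: 6, 7: 6, 9: 5}  # South Australia, 3 years later
longitudinal_2 = {3: 8, 6: 9, 9: 4}        # Tasmania, 4 years later

total_interviews = (
    initial + sum(longitudinal_1.values()) + sum(longitudinal_2.values())
)

print(initial)           # 94
print(total_interviews)  # 137
```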

The specific learning experiences of the students in this investigation with respect to the three standard measures of average are unknown. The most influential model for classroom practice at the time of the initial interviews was A National Statement on Mathematics for Australian Schools (AEC, 1991). The document outlined the mathematics curriculum in four bands (A to D) for Grades 1 to 12. Measures of central tendency were first mentioned in Band C, for Grades 7 to 10, with the terms mean, median, mode, and average introduced with expectations of students being able to choose appropriate measures and interpret them. Between the time of the initial interviews and the longitudinal interviews, an outcomes-based document, Mathematics—A Curriculum Profile for Australian Schools (AEC, 1994), was published. This document was again available to classroom teachers in both states and described “the progression of learning typically achieved during the compulsory years of schooling (Years 1–10) … divided into strands … and into eight levels of achievement” (p. 1). The term mean was mentioned at Levels 4, 5, and 7 of the profile; middle scores at Level 4; median at Levels 5 and 7; and mode at Level 5, although the observation of a most frequent data group was noted at Level 2. Thus, it would be expected that the students in this study would have been exposed to the three standard measures of average by Grade 7.

Protocol

The interview protocol is shown in Figure 1. Parts 1 and 2 are similar to items of Gal et al. (1990). Part 3 is original and reflects the information given in an automobile advertisement shown in Australia at the time the interviews were conducted. Part 4 is adapted from Pollatsek et al. (1981). Each successive part involves additional structural complexity to assess the level of student understanding (cf. Collis & Romberg, 1992).

Another interview protocol based on comparing two data sets presented graphically (Watson & Moritz, 1999), devised from protocols of Gal et al. (1989, 1990), was administered immediately before the interview protocol dealing with average. The final part of this protocol asked students to compare groups of unequal sizes, and those responses that included mention of average were included as part of this investigation to further illuminate responses to the protocol for average. Some students’ responses to this task influenced what was asked about average. For example, if a complete understanding of the mean had already been demonstrated, this was taken as evidence for Parts 1 and 2 of the protocol for average (see Figure 1).

Procedure

The protocol for average was one of up to nine protocols used with students within a 45-min videotaped individual interview session. Other topics included pictograph construction; bar graph interpretation; unfair dice; probability of dice outcomes; comparing data sets presented in graphs; and sample, random, and conditional probability (Watson, 1994). The protocol was occasionally cut short because the interviewer felt that the student was becoming uncomfortably confused or frustrated with the difficult latter parts. For the initial interviews, the Tasmanian interviews were conducted by one of the two authors, and the South Australian interviews were conducted by the second author. The second author conducted all longitudinal interviews. Interviews were conducted at the students’ schools during class time. For Longitudinal Study 2, 4 students who were by then attending the university were interviewed on one of the University of Tasmania campuses and were given a small amount of money to compensate them for their time.

FIGURE 1 Interview protocol for average. (Parts 1 and 2 are based on Gal, Rothschild, & Wagner, 1990; Part 4 is based on Pollatsek, Lima, & Well, 1981.)

Analysis

The primary data were the responses to the average protocol, although individual responses to four survey items related to average (Watson & Moritz, in press) and the protocol of comparing data sets (Watson & Moritz, 1999) were used to gain a more complete picture of each student’s conception of average in different contexts. These data were available for all 94 students in the initial sample. For Longitudinal Study 1, no subsequent survey data were available. For Longitudinal Study 2, students in Grades 7 and 10 had completed two subsequent longitudinal surveys (one of them in the year in which they were interviewed again), whereas Grade 13 (university) students had completed one subsequent survey 2 years prior to the second interview.

Categorization of response levels was based on the structure of observed learning outcomes, as described earlier and in previous studies (Watson & Moritz, 1999, in press). A clustering technique similar to that suggested by Miles and Huberman (1994) was used to determine overall performance, based on the various responses available, due to the lack of total consistency of responses across parts of the protocol and other items. Six levels were distinguished based on characteristics evident in the three sources of responses. We independently read transcripts of student interviews, viewed selected videotapes, and categorized levels of responses, with 80% agreement (110 of 137 interviews). Anomalies were resolved by discussion of characteristics of each response.

Categorization of the usage of mean, median, and mode was based on various expressions that are related to the three terms as determined by Watson and Moritz (in press). For example, a response that average means “the same as most others” was associated with the mode, and “not good, not bad, but in between” was associated with the median. Although Mokros and Russell (1995) chose to classify their 21 students by the predominant measure of average used over the entire interview, the students in this study often expressed multiple constructs. Thus, it was decided to analyze responses by taking into account multiple ideas. Responses to Parts 1 and 2 were analyzed separately from responses to Parts 3 and 4 to explore differences in responses between the contexts of the tasks. We independently categorized usage of ideas of average, agreeing on 95% of classifications (752 of 789 cases, as 11 students were not asked Parts 3 and 4); anomalies were resolved by discussion.
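The agreement figures reported for the two categorizations can be reproduced with simple arithmetic. The breakdown of the 789 cases below is an assumption, not something the text states: it supposes one classification for each of the three ideas (mean, median, mode) in each of the two settings per interview, less the 11 interviews in which Parts 3 and 4 were not asked. That reading happens to match the reported total.

```python
interviews = 137
not_asked_parts_3_4 = 11
ideas = 3  # mean, median, mode

# Assumed breakdown: one case per idea per setting per interview.
cases_parts_1_2 = interviews * ideas
cases_parts_3_4 = (interviews - not_asked_parts_3_4) * ideas
cases = cases_parts_1_2 + cases_parts_3_4

level_agreement = 110 / interviews  # response levels
usage_agreement = 752 / cases       # usage of mean, median, mode

print(cases)                         # 789
print(round(level_agreement * 100))  # 80
print(round(usage_agreement * 100))  # 95
```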

Results are presented according to the four research questions. For the first question, descriptions of the developmental levels of understanding in relation to average are presented in conjunction with specific examples of responses that illustrate levels of performance. Responses for all 137 students interviewed are then classified by grade and developmental level, providing information on the differences in performance among grades. Summaries of the usage of the ideas associated with mean, median, and mode (and more than one of them) are presented for grade levels and response levels for each of the settings: general introductory (Parts 1 and 2) and complex contexts (Parts 3 and 4). For the second research question, responses offered in Longitudinal Studies 1 and 2 are compared with responses of the same students from the initial sample. For the last two research questions, discussions of students’ acknowledgment of the average as representative of a data set and of their problem-solving strategies for Parts 3 and 4 are presented with examples drawn from all 137 student interviews.

RESULTS

Research Question 1: Level of Response and Use of Ideas Related to Average

Characteristics of six levels of understanding of average are described in Table 1. At the Preaverage (P) level, students respond to the context but offer no clear idea of average. The next three levels show increasingly complex structure (cf. Watson & Moritz, in press). The building of the concept of average involves Single Colloquial Usage (U); the construction of Multiple Structures (M) associated with mean, median, or mode; and the recognition of the need for the average to be a Representation (R) of the data set. Applications of Average in Complex Tasks involve working backward from a decimal mean value and finding a weighted mean. The ability of some students to produce correct solutions in only one of these tasks (A1), whereas others can use the mean in both tasks (A2), results in the distinction between the last two levels. Examples follow to clarify the types of responses associated with each level of observed performance.

In presenting excerpts from transcripts, information in square brackets, such as“[ Part 1a]” or “[ Is that what you’d expect?],” refers to questions in Figure 1 orfrom the interviewer. The symbol “[…]” represents deleted phrases, and “ … ”represents a pause by the student. Parentheses are used to indicate nonverbal ac-tions of the student, such as writing on paper. Each extended excerpt is annotatedwith the grade of the student who gave the response and with a serial reference, S1to S24, such as “[S1, Grade 5].”

TABLE 1
Characteristics of Six Levels of Students’ Understanding for the Concept of Average

Preaverage (P)
  - Use no term for average, even in a colloquial sense
  - Tell imaginative stories about the context
  - Often are not asked complex questions by interviewer

Single Colloquial Usage for Average (U)
  - Often use colloquial terms for average, such as normal or okay
  - Often use imaginative ideas related to the context to support response
  - Sometimes refer to “add up” colloquially but not in a calculation sense
  - Make no progress on complex questions

Multiple Structures for Average (M)
  - Use at least one, often two or three, ideas (including most, middle, and the add-and-divide algorithm for the mean) to describe average in straightforward situations
  - Rarely use more than one of these ideas in complex questions and make little progress toward solutions
  - Sometimes acknowledge conflict between incorrect calculations of mean and idea of mode

Representation With Average (R)
  - Refer to add-and-divide algorithm for the mean to describe average in straightforward situations, often also with ideas of most or middle
  - Often realize association of decimal form with algorithm for mean
  - Often express some idea related to the representative nature of average (e.g., prediction, estimation, or representing whole data set)
  - Often refer to most to describe data distributions compatible with mean or offer mode as alternative average concept
  - Know the mean but do not successfully apply it in complex contexts; make partial progress on more complex contexts with prompting, for example, have sense of weighted mean not being exactly in the middle, but lack precision
  - Often use visual features in preference to the mean to compare data sets presented in graphs, although may use mean when prompted

Application of Average in One Complex Task (A1)
  - Refer to add-and-divide algorithm for the mean to describe average in straightforward situations, often also with ideas of most or middle
  - Often realize association of decimal form with algorithm for mean
  - Often express some idea related to the representative nature of average
  - Apply understanding of the mean to determine total (Part 3), or apply weighted mean algorithm directly (Part 4), but not both
  - Rarely refer to most to describe data distributions compatible with mean

Application of Average in Two Complex Tasks (A2)
  - Refer to add-and-divide algorithm for the mean to describe average in straightforward situations, often also with ideas of most or middle
  - Often realize association of decimal form with algorithm for mean
  - Often express some idea related to the representative nature of average
  - Apply understanding of the mean to determine total (Part 3)
  - Solve weighted mean problem by calculation (Part 4); when inappropriate calculations of weighted mean yield an unusual result, revert to proportional reasoning to solve problem
  - Often refer to most to describe data distributions compatible with mean or offer mode as alternative average concept
  - Often use the mean to compare data sets presented in graphs

Preaverage (P). Some students’ initial appreciation for average appeared to derive from out-of-school experiences, perhaps from hearing others use the word in conversation. Preaverage responses illustrate student understanding that has not come to terms with data outside the student’s personal experiences as something that can be described and summarized. The first response shows a student with no experience in the domain where an average would be an appropriate measure. When asked Part 3, the student reflected out-of-school experiences of seeing an automobile advertisement, based on the claim that “the average young family has 2.3 children,” where a small boy appears with a “.3” printed on his shirt.

P: [Part 1a] It means the same. [Part 1b] Around 3 hours. [Part 2] By watching someone in their daily life. [Do you think they would have watched one person or lots of people?] Yes, one person. […] [Part 3] Well, that they have got two full grown kids, and one’s not full grown yet. [S1, Grade 5]

Storytelling was characteristic of many responses at this level, based on the students’ real or imagined experiences. The following comment by S3 about the average of what people look like indicated that, although she considered average as descriptive across a number of people, she did not consider numerical measurement of data to be a part of her concept of average.

P: [Part 2] Probably somehow got a little powerful little camera and disguised it on one student and then another student to see how long they have been watching the TV instead of probably getting on with their homework after school. [S2, Grade 3]

P: [Part 1a] People, you can have an average of people. […] Well, like it could be a certain number of people and what the average of what they look like. […] [Part 2] Well, maybe they might have something to eat and get changed when they come home, and in the middle of that they watch TV and then have tea and might watch a bit more and then probably go to bed. […] [Part 3a] Well, it might be because they might just want that many children or something like that. [S3, Grade 5]

Single colloquial usage for average (U). Responses at the U level consistently involved colloquial language to describe average as normal or included mention of adding in a casual sense not specifically related to an algorithm. Responses at this level were unistructural in nature (Watson & Moritz, in press), based on a single simple idea. The responses cited next were given by a Grade 3 student who wrote “normal” in response to the survey item, “If someone said you were average, what would it mean?” The student acknowledged use of cricket averages from experience but had only a simple sense of what the term means in this context.

U: [Part 1a: Where have you heard “average”?] My stepbrother plays cricket, and he usually gets an average amount of runs, he’s always saying to my stepmum. [What does it mean?] It’s normal. [Part 1b] Usually. […] [Part 3a] That they have from two to three children. [S4, Grade 3]

The Grade 6 student cited next responded to the survey item with, “That I was OK,” but could not respond with a meaning in the interview initially. For Part 3, the student resorted to out-of-domain reasoning to try to resolve the difficulty.

U: [Part 1b] About, sort of around, 3 hours, near. [Part 2] … They probably add up all the children’s hours of TV they watched, and they sort of compared them, and they like, all of them were around, sort of near 3 hours. […] [Part 3a] They have two children and one little one, little kid that’s sort of, doesn’t really get to do everything with them or something. [How can the average be 2.3 and not a counting number?] Well, someone might have two children, a mum might have two children or something, and she might be pregnant. [S5, Grade 6]

The idea of .3 of a child before birth may make more sense in a part–whole decimal sense than the small child with “.3” printed on his shirt in the television advertisement. Cognitive conflict for this student was acknowledged at the end when she appeared puzzled and then exclaimed, “But on the ad they didn’t have it that way!”

Multiple structures for average (M). Responses at the M level consisted of more than a single colloquial idea and described the average in terms of “most,” “middle/in between,” or the add-and-divide algorithm. Many responses included several ideas to handle the situation. Students giving these responses did not go on, however, to apply these ideas in the more complex tasks of Parts 3 and 4. The following excerpt illustrates a Grade 5 student who had ideas of middle, most, and totaling, but struggled with the question of children per family in Part 3.

M: [Part 1a] It would mean that you weren’t really bad at spelling, but you weren’t really good at it either. You were half and half. […] [Part 2: What might some of the children say?] Maybe that they didn’t watch any or that they watched a lot more than 3 hours, and they might have all added up that the average was 3 hours. […What would they do then?] They would probably add them all up and see which is more likely than the others. [Part 3a] Most Australian families have two children, but there are quite a few as well that have three. [S6, Grade 5]

The following Grade 7 student had a grasp of all three ideas and appreciated a colloquial and a technical meaning but again could not cope in more complex tasks:

M: [Part 1a] OK, doesn’t it mean in between, like the average, in between the heights, something like that. … But I know it in a maths term or something. I know how to work out an average. […] [How would you work it out?] You could just get four numbers or five numbers, six numbers whatever you want. They’re all different numbers, or they could be the same. You add them all up, so just say there are four numbers, you add them all up and divide by 4. Then you get your average. [Part 2] Well, probably the most would be three, so they have got three in there. [S7, Grade 6]

Another Grade 7 student had similar ideas of middle and the mean algorithm, the latter being simply applied to the problem in Part 4 without taking account of the weighting.

M: [Part 1a] Average is like … the middle standard. It’s not very, very good but it’s not very, very bad. It’s like the middle. So, if you said the television program was average, it means that it wasn’t very good and it wasn’t very bad. […] [Part 2] By working out the average like. Oh, I’ve forgotten how to do it, add it up and divide by the number of whatever. [Part 4] It’s 6. [Why do you say 6?] Because if you add 8 and 4, and then divide by 2, it’s 6. [S8, Grade 7]
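
The difference between S8’s answer of 6 and a weighted mean can be made concrete with a short sketch. It is illustrative only and not part of the study: the group means (8 and 4 hours) and group sizes (25 country and 75 city students) are assumed from the student calculations quoted in this section, and `weighted_mean` is a hypothetical helper name.

```python
def weighted_mean(means, sizes):
    """Combine group means by weighting each with its group size."""
    total = sum(m * n for m, n in zip(means, sizes))  # total hours across groups
    return total / sum(sizes)                         # divide by all students


# Assumed Part 4 figures, as quoted in student responses in this section.
group_means = [8, 4]    # hours of TV: country group, city group
group_sizes = [25, 75]  # students in each group

# S8's approach: add the two averages and divide by 2, ignoring group sizes.
unweighted = sum(group_means) / len(group_means)

# Weighted approach: (25 * 8 + 75 * 4) / 100.
weighted = weighted_mean(group_means, group_sizes)

print("unweighted:", unweighted)  # 6.0
print("weighted:", weighted)      # 5.0
```

The unweighted result coincides with the midpoint of 4 and 8; the weighted result is pulled toward the city mean because that group is three times as large.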

Representation with average (R). Students whose responses were at this level mentioned a relation between the data set and the average measure that was not the simple definition itself. This relation may have included mentioning how a decimal number can be the result of division in the mean algorithm, mentioning how the average represents the data for use in prediction, or referring to where most of the distribution is likely to be without being limited to a modal definition. For complex tasks, often a result was assessed for reasonableness. For the example in Part 3, however, the algorithm could not be reversed. For Part 4, students appeared to understand the idea of the arithmetic mean as presented for the two groups individually but could not work the algorithm backward to find the total number of hours watched for each group. Some students discussed the possibility of working the problem in the forward direction if it were possible to know the exact number of hours watched by each child. The following Grade 9 student’s responses illustrate some of these features.

R: [Part 1a] You’d use them, like, to average out the score of a class. You’d, like, add them up, you know, to find like the middle of the scores, like how many people got it right, and most people got it right. [You do that by adding up the scores?] And then dividing it. [Part 2] […] They sort of looked at it all and probably added up all the hours and then divided by how many kids or something. [Part 3a: How can the average be 2.3 and not a counting number?] Because they added all the kids up and got a bit left over. [Part 4] Add the amount of kids together and you get 100 and divide it by the 12 hours of TV. […] [So you would expect it to be around 8 hours 20 minutes?] No. [What might you expect it to be around?] I would expect it to be less than 8 hours because there’s more city children watching 4 hours than there is country children watching 8 hours. [S9, Grade 9]

Another student defined average as middle for Parts 1 and 2, but for Part 3 with the decimal 2.3, he referred to mean concepts and almost achieved success by approximating the result of working the mean algorithm forward.

R: [Part 1a] It means the middle number between two or more numbers. [Part 1b] Well, average means, say, there’s 100,000 kids in primary school and something like, about 40% watch 4 hours and the other 60% watch … another 40% watch 2 hours and some watch 3 and a half and they just found out the middle number between all of the people. [Part 3] With the average, it wasn’t an even average. So, they had to go down to points. [Part 3b] They might have two to three because if you add it all up. Say another four families have two, and other three families have, oh … another four families have three. Four times 3 is 12, and 2 times 4 is 8. Eight plus 12 plus 5 should be 25. Ten goes into 25, 2.3 or something like that. [S10, Grade 6]

Application of average in one complex task (A1). Some students applied their understanding of average appropriately to one of the problem-solving tasks in Parts 3 and 4. For Part 3, students either used a systematic trial-and-error strategy or reversed the averaging procedure to realize that there must have been 23 children for the 10 families, and thus there were 18 left to distribute among 8 families. Some students stopped here because they felt they did not have enough information to go further, and the interviewer did not insist they proceed to suggest a distribution. Other students proceeded to find the average for the 18 children divided among the 8 families, giving a result of 2.25 children per family. In the following excerpt, a Grade 6 student used three ideas of average, although not distinguishing them clearly, and then struggled to determine the total needed for an average of 2.3.

A1: [Part 1a] Sometimes in maths we have to work out the average of the score that we got. The amount that we got most. So, like if you got a low score and a high score, the average score was in the middle. [Part 1b] It means most of the children watched TV for 3 hours a day. […] First they asked lots of people what they watched and wrote them all down and then add them together and divide them by the number that you had, the people that you had counted […] [Part 3b] If the other eight families had had, two 8s are 16, plus 4 is 20, plus 1 is 21, then divide by 10 … would be 2.1, so 2.3, so some of them had to have three to make it higher … two would have to have three, and six of them would have to have two … I figured out that 2.1, you’d need .2 more, so just say two of the families had to have three. [Part 4] Add them all together, all the hours and then divide it by 100 this time because there’s a lot of them. [Could you use the information provided to find the average overall?] I don’t know. [S11, Grade 6]

The student cited next offered the mean algorithm as the procedure to calculate the average and distinguished this from a modal idea that was used to describe how an average represents a distribution of scores. The student achieved the result for Part 3 simply but for Part 4 was confused about how to handle the weighting of the two averages.

A1: [Part 1a] Find out if you added all the people’s scores together what was the score that was. … That’s not necessarily the most common, is it? … The score that is, like, don’t know how to say it. […] You add the number of people up and you divide it by the score that they got, and that gives you the average score. […] [Part 3b] (Writes 2, 4, 1, 2, 3, 3, 2, 1.) [How did you decide on what numbers? You have been changing them?] Well, you would have to have 23 children to get the average of 2.3 for 10 families, and I took … there’s 5 here, so I had to get them to add up to 18, so I just wrote down the numbers. [Part 4] (Writes down “4 ÷ 3 = 1.3 then I added 1.3 to the 8 = 9.3.”) I divided that (4 hours) by 3 because that’s 3 times that and added it on to the 8 hours. I don’t know why but … [S12, Grade 9]
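
The Part 3 reasoning described above (reversing the mean to a total of 23 children, then distributing the remaining 18 among 8 families) can be sketched as follows. This is a minimal illustration, not the authors’ procedure: the known family sizes of 4 and 1 are assumed from the student responses quoted in this section, the helper names are hypothetical, and exact fractions are used to avoid rounding issues with 2.3.

```python
from fractions import Fraction

N_FAMILIES = 10                  # families in the Part 3 task
TARGET_MEAN = Fraction(23, 10)   # the reported average of 2.3 children
KNOWN = [4, 1]                   # assumed sizes of the two families already given


def required_remaining_total():
    """Reverse the mean: total children needed, minus the known families."""
    total = TARGET_MEAN * N_FAMILIES  # 23 children across all 10 families
    return total - sum(KNOWN)         # children left for the other 8 families


def mean_with(others):
    """Forward direction: mean family size for a proposed list of 8 sizes."""
    return Fraction(sum(KNOWN) + sum(others), N_FAMILIES)


# Distributions proposed by students in the excerpts above.
proposals = {
    "S10": [3] * 4 + [2] * 4,          # four families of 3, four of 2
    "S11": [3] * 2 + [2] * 6,          # two families of 3, six of 2
    "S12": [2, 4, 1, 2, 3, 3, 2, 1],   # numbers written by the student
}

remaining = required_remaining_total()
print("children left for 8 families:", remaining)           # 18
print("mean for those 8 families:", float(remaining / 8))   # 2.25
for student, sizes in proposals.items():
    print(student, float(mean_with(sizes)))
```

Under these assumptions the S11 and S12 proposals reproduce the mean of 2.3, whereas the S10 proposal gives 2.5, consistent with the description of that response as an approximation.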

In the following responses, a Grade 8 student struggled for Parts 3 and 4, achieving success on Part 4, only to reject the notion. This student provided a point of discussion between the authors. For Part 3, she gave responses similar to those in the R category involving totals to yield an average of approximately 2.3. For Part 4, she initially applied the given averages to find the total and the correct weighted average, justifying classification in the A1 category. When the interviewer probed for reasons, it became clear that the student, like a number of others in the investigation, interpreted the average of 8 hr as if it were 8 hr for each student, just as she had reasoned about Part 3, using two children for each family. She thus knew to reverse the algorithm by multiplication but did not appreciate variability in the distribution of data values. This illustrates how students giving A1 responses did not have the flexibility in usage of the concept to respond to both Parts 3 and 4.

A1: [Part 3b] About five each, no about two, yeah two each for the other families. [I heard you doing lots of calculations. You had 16 in mind for a minute, why were you thinking that?] Eight times 2 for these families, 8 times 2 is 16, and 5 more for the other families is 20, divided by 10 families is about 2.3, is 2. […] [Part 4] A hundred students, if it’s 8 hours, then 8 hours times 25 is 200, and city students, 75 times 4 is 300, that’s the hours each have, that’s 500 hours, and 500 divided by 100 is about 5 hours of TV per week. [Is that what you would expect by looking at the problem?] I expect about 6 because that’s between 8 and 4. […] [Why did you decide to do 25 times 8 and 75 times 4?] You had to work how much hours altogether for, oh that’s the average! … Eight plus 4 is 12, actually I did it wrong before because I thought that 8 hours each, but that’s the average, yes, so 8 plus 4 is 12 divided by 2 is 6. [S13, Grade 8]

Application of average in two complex tasks (A2). Students who responded at the A2 level applied their understanding to both Parts 3 and 4. They demonstrated a more flexible usage of their understanding of average than those in the A1 category. This difference was often evidenced in Part 3: Students who gave A1 responses stuck closely to the definition and did not venture descriptions, whereas those who gave A2 responses added that a mean of 2.3 was likely to indicate that most families have two children or commented that there were many possible solutions to distributing the children among the families. The following response from a 1st-year university student involved a strong sense of representation as well as fluency to calculate and reason in complex contexts.

A2: [Part 1a] It means it’s sort of the central figure. It’s not the same as the median, which is the middle point of the data. … It means it’s sort … of proportionately in the middle. [How would you do it?] You sum all the scores and divide by the number of the scores. It’s a measure of central tendency … it’s what you’d expect … if you took all the scores. It’s … the bit you’d expect most, even though it may not be an actual score. […] [Part 3] Well, there would be, the average for the 10 families would be, that would 2.3 times 10 which would be 23, and take away the Grants of 4 and take away the Coopers of 1, gives you 18, and then you divide by the remaining number of families which is 8, which equals two and a quarter children. So you would expect there would be between two and three children for each of the other families, that there would be more with two than three. [Part 4] Twenty-five times 8 … 75 times 4 … plus … divided by the number of students. Therefore, the average is 5 hours. [Is this what you would expect?] Yes, because you would expect the 4 hours to have a much greater, 3 times proportional impact on the population, because there are 3 times the number of students in the city than country. [S14, Grade 13]

Not all students who gave A2 responses were so fluent; however, all solved Parts 3 and 4, and most also exhibited features of representation. In the following excerpt, a student used reasoning about proportions to solve the weighted mean problem.

A2: [Part 1a] Add them up and divide it by the number of times it would happen. […] Means about middle. [Part 1b] It means that most of the children watch around about 3 hours per day. [Part 3a] It means that the average family has between two and three, usually two, more than three sometimes. […] (Writes down 2, 3, 4, 1, 3, 2, 1, 1 [note arithmetic error, numbers total 17, not 18].) [How did you decide on those numbers?] I worked it out that they had to add up to 23 and I took away 5 and […] [Part 4] […] It would be below a bit (referring to 6, the midpoint of 4 and 8), it would go down to about 5 … because that’s three quarters of the whole survey, the difference. [S15, Grade 9]

Summary. The definition of the first four levels in Table 1, based on the clustering of prominent characteristics described there, adds support to the developmental levels suggested by Watson and Moritz (in press). The greater breadth and depth of responses possible in the interview setting allowed for a richer description of these levels. Notation was made throughout the description of the levels in Table 1 of the usage of ideas associated with mean, median, and mode, as discussed in the next section. Table 2 shows the distribution of response level across grades. Although some grades have small student numbers, the trend for improvement in category with grade is clearly evident.

TABLE 2
Percentage in Response Levels Across Grades

Response                         Grade
level        3     5     6     7     8     9    10    12    13
n           27     8    28    16     6    28    15     5     4
P           30    38     0     0     0     0     0     0     0
U           41    13     7     0     0     0     0     0     0
M           26    50    43    50    33    21    20    20     0
R            4     0    14    44    33    25    27     0     0
A1           0     0    21     6    17    25    27     0     0
A2           0     0    14     0    17    29    27    80   100

Note. N = 137; n = number of students in each grade.

Frequency of use of ideas associated with average. Table 3 shows the prevalence of the three standard ideas for average across the grades as well as the presence of multiple ideas displayed across questions asked in the interview. It should be noted that 11 students were not asked Parts 3 and 4 (7 Grade 3 students, 3 Grade 7 students, and 1 Grade 10 student). No students in Grades 3 or 5 were acquainted with the mean, whereas almost all students in Grade 6 or higher were familiar with it. This is not to say, however, that students from the higher grades applied it correctly in Parts 3 and 4, as shown by their response levels (Table 2). For Parts 1 and 2, which asked about average generally, ideas associated with median and mode were frequently mentioned at most grade levels, and many students from Grade 5 upward offered two or more ideas for average. For Parts 3 and 4, where applying the mean in context was the mathematical expectation, the median idea received less attention. The term “most” was often used by students in response to Parts 3 and 4 (see Table 3) to describe the transition from a summary value to a distribution, that is, to indicate the distribution of values given the average of 2.3 or to indicate that most students were in the city group. “Most” was rarely used, however, to describe the usual transition from the distribution of data to a summary value, that is, to indicate that the mode was the procedure to determine the average value given the original data distribution. Multiple ideas occurred for Parts 3 and 4 but less frequently than for Parts 1 and 2.

TABLE 3
Percentage Use of Ideas Associated With Mean, Median, and Mode Across Grades

Idea of                          Grade
average      3     5     6     7     8     9    10    12    13
n           27     8    28    16     6    28    15     5     4

Response to Parts 1 and 2
Mean         0     0    86   100   100    86   100   100   100
Median      22    63    43    69    50    43    53    60    50
Mode        26    38    39    50    17    43    53    40    50
Two          7    38    36    56    33    36    53    20    50
Three        0     0    18    31    17    18    27    40    25

Response to Parts 3 and 4
Mean         0     0    61    69   100    93    80    80   100
Median      15    25    11    31    17    21    27     0     0
Mode        19    38    25    31    33    39    60    40    75
Two          4    13    14    25    50    32    47    40    75
Three        0     0     4    19     0    14    13     0     0

Note. N = 137; n = number of students in each grade. Two and Three denote usage of expressions associated with two or three of the mean, median, and mode.

Table 4 shows the percentage of student responses at each response level making reference to mean, median, and mode ideas in Parts 1 and 2 and in Parts 3 and 4. It should be noted that 11 students were not asked Parts 3 and 4 (5 at the P level, 2 at the U level, and 4 at the M level). Six responses in the P and U levels casually included ideas associated with middle or most, but other aspects of these responses resulted in classification at these levels. In general discussion (Parts 1 and 2), the mean was mentioned in 65% of responses at the M level and in over 90% of responses at the higher levels. In Parts 3 and 4, where the mean was the appropriate measure for solving the problems, M- and R-level responses were unsuccessful; however, it is interesting to note that M-level students less frequently discussed the mean in the last two parts (from 65% for Parts 1 and 2 to 35% for Parts 3 and 4), whereas students responding at the R level continued to use it in Parts 3 and 4. The A1- and A2-level responses had to use the mean for success to be allocated to these levels; hence, the rate rises to 100%. Multiple usage of the three concepts was generally more common in Parts 1 and 2 than in Parts 3 and 4, which would be expected given the contexts of the questions.

TABLE 4
Percentage Use of Ideas Associated With Mean, Median, and Mode by Response Level

Idea of                Response Level
average      P     U     M     R    A1    A2
n           11    14    43    25    19    25

Response to Parts 1 and 2
Mean         0     7    65    92    95    96
Median       0    14    65    36    63    44
Mode         9    14    44    56    32    48
Two          0     0    49    44    37    32
Three        0     0    14    20    26    28

Response to Parts 3 and 4
Mean         0     0    35    84   100   100
Median       0     7    19    20    26    24
Mode         0    14    33    56    16    56
Two          0     0    13    44    32    48
Three        0     0     5    12     5    16

Note. N = 137; n = number of students at each response level. Two and Three denote usage of expressions associated with two or three of the mean, median, and mode.

Research Question 2: Longitudinal Change

The question of longitudinal change in conceptual understanding and in application of the measures of average is considered by comparing responses in the initial interviews with responses in longitudinal interviews. Longitudinal Study 1 involved 22 students interviewed again after 3 years (Year 1 to Year 4), and Longitudinal Study 2 involved 21 students interviewed again after 4 years (Year 1 to Year 5).

Longitudinal Study 1. Table 5 shows the changes in levels of performance for the 22 students in Longitudinal Study 1 over the 3 years between interviews. No students exhibited a drop in performance, 6 students responded at the same level, and 16 students responded at a higher level. We also found that students from higher grades responded at higher levels: Four of 5 Grade 6 students responded at the M level, whereas 4 of 5 Grade 12 students responded at the A2 level. Examples follow to illustrate the types of change observed after the 3-year interval.

Four students changed from P to M over the period (see Table 5). One of these students, from Grade 3, twice referred to a possibility and its negation in the initial interview. This may have been a result of hearing others refer to average as “not good, not bad, but in between” (M category description); however, the student did not incorporate any coherent sense for average.

P: [Part 1a] Well, I don’t know what it means but I would probably take a guess that it means the truth or no truth. But average I could say two things, that average might say we might go or we might not. […] [Part 2] Well, I am not sure about that one, but if they got 3 hours a day of watching television they would be pretty lucky. […] [S16, Grade 3]

Three years later, the same student had multiple ideas of average, including the mean, and a primitive median central idea using the “not … not …” description again. These ideas were only quoted in the complex contexts of Parts 3 and 4 rather than being applied to achieve reasonable results.

M: [Part 1a] Well, there’s two ways of saying it. Average as in, what’s the average of 21 and 15 and you have to find out. And average as in, you’re all right at that and you would get a bit better but you’re okay at it so you’re not bad, and you’re not good. […] [Part 2] Well … if they had 3, 25, and 27, whatever the numbers are, they would have three and they would add them all up and then they would divide three by the answer of all the addition and then they divide it and get their answer. […] [Part 3a: How does the average work out to be 2.3 children?] Well, they would get all the children and count them up … blah blah blah and they got 10, then they divide 10 into the number of families, 8 and 4 and 1, whatever, and then they would get an answer to get 2.3. [Part 4] (Student writes 25 + 75 = 100, and 100 ÷ 2 = 50.) [S16, Grade 6]

TABLE 5
Frequency of Response Levels of Students Interviewed Longitudinally in Different Years

Response level        Response Level (Year 1)
(Year 4)       P     U     M     R    A1    A2
M              4     1     3     0     0     0
R              1     0     1     2     0     0
A1             0     2     1     0     0     0
A2             0     0     2     3     1     1

Note. Longitudinal Study 1, n = 22.
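
The summary counts reported for Longitudinal Study 1 (no drops, 6 students at the same level, 16 at a higher level) can be checked against the cell frequencies in Table 5. The sketch below is only a consistency check under the assumption that rows give the Year 4 level and columns the Year 1 level; the names `table5` and `summarize` are hypothetical.

```python
# Levels in developmental order, lowest to highest.
LEVELS = ["P", "U", "M", "R", "A1", "A2"]

# Table 5 frequencies: key = Year 4 level, list = counts by Year 1 level
# in the order P, U, M, R, A1, A2 (assumed reading of the table layout).
table5 = {
    "M":  [4, 1, 3, 0, 0, 0],
    "R":  [1, 0, 1, 2, 0, 0],
    "A1": [0, 2, 1, 0, 0, 0],
    "A2": [0, 0, 2, 3, 1, 1],
}


def summarize(table):
    """Count students whose level dropped, stayed the same, or rose."""
    counts = {"lower": 0, "same": 0, "higher": 0}
    for later, row in table.items():
        for earlier, n in zip(LEVELS, row):
            diff = LEVELS.index(later) - LEVELS.index(earlier)
            if diff > 0:
                counts["higher"] += n
            elif diff == 0:
                counts["same"] += n
            else:
                counts["lower"] += n
    return counts


summary = summarize(table5)
print(summary)                # {'lower': 0, 'same': 6, 'higher': 16}
print(sum(summary.values()))  # 22 students in total
```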

Three students’ responses remained at the M level after 3 years (see Table 5).The students were initially in Grades 3, 7, and 9. The Grade 9 student had ideas as-sociated with middle and most; the idea of using a graph to assist finding the modemay have been suggested by the previous protocol in the interview concerningcomparing data sets presented in graphs. The definition ofmostwas identifiedwith the 2.3 average, apparently without any thought of how the decimal repre-sented the situation.

M: [Part 1a] It means that someone is in the middle of everything. Like, average height, average in maths, they are caught between being either smart or not that good, or short and tall. [Part 2] Maybe they asked each student from the primary school how much TV they watched and then made a graph, and they found that 3 was more popular. Like, 3 hours was maybe what the most students had. [Part 3a] That out of a certain amount of Australian families, the most common amount of children is 2.3. [S17, Grade 9]

In Grade 12, the student had retained the idea of “in the middle of good and bad” and had acquired a rough idea of mean as half of the total. This definition used in Part 2 may have caused difficulty in the context of explaining the average of 2.3 children in Part 3. No comment was made on a number to use in division.

M: [Part 1a] I’ve heard it in maths obviously, and you hear it at school. Like, are you an average student, like, are you a middle of the range, like, you’re not really smart and you’re not that bad at schoolwork. You’re just pretty average and pretty normal, like, going along at a steady rate. […] [Part 2] Well, I’d add all the numbers that were given and then divide it by 2 to find out the average score. [Part 3a] When looking at Australian families they usually have between two and three children, because you can’t get .3 of a person, so they would have to have between two and three children. [How can the average be 2.3 and not a counting number?] I guess by working out the average they’d find the total, by working out how many people were surveyed, it wouldn’t work out exactly, so it’s not necessarily going to be a whole number. It could be like 2.1, so it’d be closer to two children. [S17, Grade 12]

Two students made a striking improvement over the 3 years from the U level to the A1 level. One, a Grade 5 student, initially used a weak idea of middle on several occasions, but when asked what one might speak about the “middle of,” she said, “schoolbooks.” The other, a Grade 3 student, initially used the colloquial idea of “roundabout” in describing average and “it might mean you are right and might mean you are wrong as well, it just depends.” She appeared to construct her response for studying the average television watching time using the grammar of the question: “Probably a number of people because, like, if they said average of primary school students, like, they didn’t say student, they said students, and that is more than one.” Three years later, both of the students continued to have difficulty with the average family size of 2.3, but they used the weighted mean algorithm for Part 4. The Grade 3 student, now in Grade 6, had a fairly confident grasp of the mean. She tried to apply it in both complex contexts, handling Part 4 with apparent ease but struggling in Part 3 in reasoning about average as “evened out” and thus each family having 2.3 children.

A1: [Part 1a] You would use it, like, when you were working the score out where they are most likely to get. So, if someone got 97.6% and someone got 98.3% and 100%, their average would be adding them all up and divide by 3. […] [Part 3a] That people, evened out, have 2.3 children. You can’t have .3 of a child, but that is just how it worked out. I know because my aunty has four children, and I’m an only child. That’s five for two families. That’s two and a half people. […] [Part 3b] […] You’ve got eight families there and you’ve already got five children, and so the average is 2.3 children. So, you could average it 2.3, 2.3, and 2.3, or times it, and then add on 5 and sort of then get, because it’s the average and you don’t know how many people, that’s how they average it out. [Part 4] You could get 8 and times by 25 and 4 by 75 and you add the answer to that and that together and divide by 100 or you could just take the 2 zeros off. You probably wouldn’t get an exact number but then you could probably work it out by doing that. […] The answer is 5. [S18, Grade 6]

Three students progressed from the R level to the A2 level (see Table 5). The Grade 7 student cited next initially used the mean as representative in the context of comparing two data sets but did not solve the more complex problems even though she used the middle as a reasonableness criterion for her simple mean calculation.

R: [Part 3a] Most families would have around 2.3 children, and it’s not like there’s a child who’s so small that it’s a .3, but it’s all divided … like, you add up the number of children and then you divide it by the number of families and it came out at 2.3. [Part 4] You add up the 8 and the 4, which is 12, and then you divide it by 2, so the average is 6 hours. … Or else you could just say what’s in the middle and that’s a 6 anyway. So, the average would be 6 hours per student. [S19, Grade 7]

By Grade 10, this student displayed understanding in both of the more complex contexts, using the mean for calculation and the middle for careful descriptions of how the average may represent various data set distributions. While solving Parts 3 and 4 quickly and efficiently, this student conveyed an understanding of average that included balancing involved in calculation of the mean as well as modal ideas as would be observed in graphical representations.

A2: [Part 1a] It’s meant to get somewhere in the middle. But then it also keeps in account that lots of people may have got a higher mark compared to a lower mark. So, it doesn’t just get in the middle of just those marks that are in the vicinity, it also takes into account how many people got each mark. […] It doesn’t really find the middle, but it finds the most probable situation in a count of where the graph skews and everything. [Part 3a] […] Upon adding up all the Australian families and how many children they have, and dividing it by their number, the most common number, well not number but the middle, but keeping in account that some families have more of one score than the other, that 2.3 is the outcome. [Part 3b] Altogether, because to get 2.3 just going backwards you would have had to divide by 10, the number 23 to get 2.3, because that’s what you do to get the average. So, 5 off the 23, so 23 take 5 equals 18. [Part 4] OK, so 25 students, right, so you say 25 by 8, plus 75 by 4, over 100 […] 5 hours per student is the average. [S19, Grade 10]

One Grade 9 student initially illustrated structurally complex out-of-domain reasoning alongside R-level responses. The student used the mean to compare data sets in the previous protocol on comparing two data sets. For Part 3, she developed an elaborate imaginative account of the average of 2.3 children.

R: [Part 3a] It says that most Australian families have two older children and say one infant or child under the age of 5 or whatever. […] [How can the average be 2.3?] Well, because the average is of, like, the older children, which they could say is fully grown or my age or whatever. And the .3 is a child that is growing up to be an older child. So that, like, say the kid is 3 now, once it turns to be 10, it will get to be 1, so they will have three children sort of thing. [S20, Grade 9]

Three years later, this student still retained the account from her out-of-school television experience when asked the same question but also employed appropriate reasoning and solved both of the more complex problems after recognizing a false start in the second.

A2: [Part 3a] That most Australian families have two, um, like, um, grandchildren and maybe some other toddlers under 1, so the 2 would represent two children that are over 1 year old and the .3 would be children that are just below 1. [How can the average be 2.3?] When they would have got the total number of children in the families, the number of people they interviewed might not have divided equally into the total number of children they had, so that’s how you would get a point. [Part 3b] OK, so 23 children, so there is 18 amongst the other eight families, so the other eight families would have 2.25 children each. [Part 4] You would just divide 100 by 12, and then you would get the total average (8.33). […] There is less country than city students. […] If you do 8 times 25, so that’s 200 hours altogether, and then 4 times is 300, so that is 500 divided by 100, so that is 5 hours, on average, for the lot of them, and that works out to be more likely because there are more city students … [S20, Grade 12]

Of the 22 students in Longitudinal Study 1, 12 students did not refer to the mean in Year 1. By Year 4, all but 1 student made reference to the mean, and 16 of the 22 made reference to it in attempting the more complex questions. In Year 4, more responses to Parts 1 and 2 referred to ideas related to median (9 in Year 1, 14 in Year 4) and mode (6 in Year 1, 10 in Year 4). Responses to Parts 3 and 4, however, showed a focusing on the appropriate mean concept, with decreased reference to the median idea (8 in Year 1, 4 in Year 4) and no change in frequency of modal references (7 responses in each year). These changes would appear to reflect student learning associated with either the in-school curriculum or out-of-school experiences: The mean was acquired; the average was more flexible and could be associated with other concepts; and, in applied contexts, the appropriate idea (in this case the mean) was more likely to be selected.

Longitudinal Study 2. Table 6 indicates the changes in levels of response for the 21 students in Longitudinal Study 2. No students performed at a lower level 4 years later. Three students performed at the highest level on both occasions, and another 3 students stayed on the same level. The other 15 students improved their observed level of performance over the 4 years. We also found that students from higher grades responded at higher levels: Seven of 8 Grade 7 students responded at the M or R levels, whereas all 4 Grade 13 students responded at the A2 level.
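The counts quoted above can be checked directly against Table 6. The following is a minimal sketch, assuming the table is read with rows as the Year 5 level and columns as the Year 1 level (an orientation inferred from the surrounding text); the names `LEVELS`, `TABLE_6`, and `tally_changes` are illustrative and not part of the study.

```python
# Ordered response levels, lowest to highest, as named in the paper.
LEVELS = ["P", "U", "M", "R", "A1", "A2"]

# Table 6 counts. Key = level in Year 5; list = counts by Year 1 level,
# in the order of LEVELS. Rows P and U do not appear in the table.
# Assumption: rows are Year 5 and columns are Year 1.
TABLE_6 = {
    "M":  [1, 3, 2, 0, 0, 0],
    "R":  [0, 1, 4, 0, 0, 0],
    "A1": [0, 0, 2, 1, 1, 0],
    "A2": [0, 0, 1, 0, 2, 3],
}


def tally_changes(table):
    """Count students who regressed, stayed level, or improved."""
    tally = {"regressed": 0, "stayed": 0, "improved": 0}
    for later_level, counts in table.items():
        later_rank = LEVELS.index(later_level)
        for earlier_rank, count in enumerate(counts):
            if later_rank < earlier_rank:
                tally["regressed"] += count
            elif later_rank == earlier_rank:
                tally["stayed"] += count
            else:
                tally["improved"] += count
    return tally


summary = tally_changes(TABLE_6)
print(summary)  # {'regressed': 0, 'stayed': 6, 'improved': 15}
```

Under this reading, the 6 students who stayed level are the 3 at A2 on both occasions plus 3 others, which agrees with the figures reported in the text.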

Three students changed from the U to the M level after 4 years (see Table 6). One Grade 3 student initially used colloquial terms such as “just about” and “usually” for Part 1. The student also referred to adding up the hours for Part 2, but without division, although the interviewer specifically asked if any other process was required after adding. In Part 3, the subsequent response concerning the decimal 2.3 used intuitive ideas: “Maybe because they have got another one on the way, or something, like having another baby.” Four years later, the same student had acquired ideas associated with median and mean but still did not apply them in the complex problem setting of Parts 3 and 4.

M: [Part 1b] It is the average, like, the middle amount between how many students. Like, some might have had five and others might have only had one, evened it to a middle number. [Part 2] Add up all the numbers that the students got and divide it by how many students they surveyed. [S21, Grade 7]

Four students changed from the M level to the R level (see Table 6). The Grade 3 student cited next initially displayed an understanding of average based on middle, with a hint of balancing with the comment “more likely to have two though.”

M: [Part 1a] Means sort of in between high and low. [When would you use the word?] When you’re … well, there’s one person here who is no good at handwriting, one here who’s excellent, one here who’s okay. […] [Part 3a] Well, some have two children and others have three. More likely to have two though. [Part 4] Just make it like a plus sum. […] It has got something to do with 100 … yeah … that plus that is 100, so there is 100 students, so that is 1 hour each. [S22, Grade 3]

TABLE 6
Frequency of Response Levels of Students Interviewed Longitudinally in Different Years^a

Response Level   Response Level (Year 1)
(Year 5)         P    U    M    R    A1   A2
M                1    3    2    0    0    0
R                0    1    4    0    0    0
A1               0    0    2    1    1    0
A2               0    0    1    0    2    3

^a Longitudinal Study 2, n = 21.

Four years later, the idea of leveling had developed strongly in a visual sense. However, the student could not tie the algorithm to the concept in order to apply it to calculate a value for the weighted mean in Part 4.

R: [Part 1a] Well, average would be sort of about the middle of the scale, the one with most things on it. [Part 2] Well, they would take it somewhere and take out the calculator and work it all out. [So, what do they do with all the numbers, people who watched different hours?] Well, they would have put it on a graph. […] They would have leveled it off, so take the high ones and make it fairly even. (Draws bar graph with values 6, 3, 0.) If you have got three people and Person 1 watches 6 hours, Person 2 watches 3 hours, and Person 3 watches none. They would have a table looking something like … so, they would take the top 3 off there and put them down there, so there would be 3 up there and 3 down there. (Redraws bar graph with values 3, 3, 3.) [Part 4] I would expect the average would be somewhere around 4.3 or something […] because there’s a lot more people with 4 rather than 8. […] Because there is not many of a lot, not much of a lot, rather than a lot of not much. It would be probably closer to a lot of not much which would be about 4 point something. […] [Is there a way that that could be calculated from the information we have got?] Probably use a computer or something. [Can a computer do it with the numbers or would it be easier to do it with a graph like you have got there?] I expect it would be easier for a computer to do it because they think faster, more accurately. [S22, Grade 7]
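The leveling move this student drew, taking from the tall bar and giving to the short one, can be tied to the add-and-divide algorithm with a short sketch. The function `level_off` below is a hypothetical illustration, not a procedure used in the study; it moves one unit at a time and assumes whole-number data.

```python
def level_off(values):
    """Move one unit at a time from the tallest bar to the shortest
    until no two bars differ by more than one unit."""
    bars = list(values)
    while max(bars) - min(bars) > 1:
        bars[bars.index(max(bars))] -= 1  # take one off the tallest bar
        bars[bars.index(min(bars))] += 1  # give it to the shortest bar
    return bars


hours = [6, 3, 0]                # the student's bar graph
leveled = level_off(hours)       # the redrawn bar graph
mean = sum(hours) / len(hours)   # add-and-divide algorithm
print(leveled, mean)  # [3, 3, 3] 3.0
```

When the total divides evenly, every leveled bar equals the mean, which is the link between the visual idea and the algorithm that the student had not yet made for Part 4.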

Two students’ responses improved from the A1 to the A2 level (see Table 6). The context for thinking of average, the game of cricket, did not change for the following Grade 6 student over a 4-year period. In the first interview, only the mean was used.

A1: [Part 1a] Say there’s a cricket team, and a person gets a certain amount of score each game. And then to find the roundabout score, you add them all up and divide them by the amount of games they’ve had. You’ll get the amount that’s roundabout what they’d get each game if they were put into the same sorts of score. [Part 3b] […] If there were 10 families and 2.3 that would be 23 children altogether, and so if there’s 4, and 1 would be 5, take 5 it would be 18. So, each of them could have any amount of children, as long as they altogether added up to 18. [Part 4] You could add up the 4 and the 8 and then divide it by 2 to find the average, which would be 6. [S23, Grade 6]

Four years later, however, a modal idea supplemented that of the mean, a common characteristic difference between the A1 and A2 categories (see Table 1). In addition, the student was now acquainted with a way of dealing with weighted means.

A2: [Part 1a] You use averaging in a cricket match if, how much average runs they need each over to win. … It means how much total, add up all the different numbers and divide it by how much there is, will be. [Any other ways we use the word “average”?] Some people say, like, “I’m pretty average at that,” which would sort of mean they are the same as most people, how they are good at it. [Part 3b] Well, if there’s 10 families and the average is 2.3, there is 23 children for 10 families, so if you took 5 off that would be 18 for the remaining 8 families. So, you would say that most people would have 2 most likely, and some would have probably … and 2 would have 3. One might have 1 and another one might have 4 or more than that. [Part 4] There are 25 country students, all up would have 200 hours. And the other students would have 300 hours, so 100 students all up would have watched 500, which would be 5 hours average. [S23, Grade 10]

The three students whose responses were correct in two contexts (A2) in both Year 1 and Year 5 were all in Grade 9 in Year 1. The student cited next displayed all three ideas of average in Grade 9 and applied the mean appropriately in both complex contexts after using appropriate estimation and correcting strategies in Part 4.

A2: [Part 1a] Like, the one that is most popular, or the one that’s kind of in-between. [Part 3b] […] There will be 23 all up because there will be 23 divided by 10 to get the average. So that’s 2.3. Minus 5, so that’s 18. So, I reckon the rest of it would be fairly evenly divided. [Part 4] You would add them so that’s 100 and the hours would be 12, so you divided 100 by 12. So, the average would be … according to this it would be 8.33, but that can’t be right. [Why can’t that be right?] Because it would have to be somewhere between 4 and 5. I reckon it would be more towards 4 because more of them are doing it, more of them are watching. […] It would be about 5. […] Twenty-five times 8 is 200, so that’s their total viewing time. Seventy-five times 4 is 300, so that’s their total viewing time. So, the total viewing time is 500, divided by 5 is 100, or 500 divided by 100 is 5. So, the average time would be 5. [S24, Grade 9]

Four years later, this student again displayed modal and mean concepts with some loss of recall of terminology. The student commented on the many possibilities for Part 3. The use of her particular ratio method for weighting the hours in Part 4 was only observed on this one occasion in the entire investigation.

A2: [Part 1a] The average can be the most amount of something. Like, if there is a group, the average person is male, because there are more of them, or like, if you take the total of something and divide it by how many scores you have for that, it would be the average then. There’s mode, median, and I am sure there is another one; I haven’t done it for a year. [Part 3b] There would be 23 kids all up, minus the 5 and that’s 18 for the other 8, and I suppose 8 of them have 3 kids, oh, no […] I suppose there are all different ways. [Part 4] If you divide these by 25, so that’s 1 and that’s 3, so that’s 8 plus 4 plus 4 plus 4 equals 20, and divide that by the 4 because there’s four samples there, so it is 5 hours average. [S24, Grade 13]
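The ratio method in this Part 4 response can be written out as a short sketch: the group sizes (25 and 75) are reduced to the ratio 1:3, each group mean is repeated that many times, and an ordinary mean is taken. The function name `ratio_weighted_mean` is hypothetical, and the generalization to any number of groups is an assumption beyond what the student did.

```python
from functools import reduce
from math import gcd


def ratio_weighted_mean(group_means, group_sizes):
    """Reduce group sizes to their simplest ratio, repeat each group
    mean that many times, then take the ordinary mean of the repeats."""
    divisor = reduce(gcd, group_sizes)
    ratio = [size // divisor for size in group_sizes]
    repeated = [m for m, r in zip(group_means, ratio) for _ in range(r)]
    return ratio, repeated, sum(repeated) / len(repeated)


# 25 students averaging 8 hours and 75 students averaging 4 hours.
ratio, repeated, mean = ratio_weighted_mean([8, 4], [25, 75])
print(ratio, repeated, mean)  # [1, 3] [8, 4, 4, 4] 5.0
```

The result matches the full weighted calculation, (25 × 8 + 75 × 4) ÷ 100 = 5, because dividing every group size by the same number leaves the weighted mean unchanged.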

In Year 1, 8 students made no reference to the mean, whereas by Year 5, all 21 students referred to it. Of those 8, only 4 referred to the mean in Parts 3 and 4 in Year 5, compared to 12 of the 13 who had referred to the mean in Year 1. Changes in usage of median and modal concepts were similar to those observed in Longitudinal Study 1, namely increasing numbers of responses to Parts 1 and 2 referring to ideas related to median (9 in Year 1, 12 in Year 5) and mode (6 in Year 1, 10 in Year 5), and focusing more on the mean and less on the median in Parts 3 and 4. In contrast to Longitudinal Study 1, however, for the 13 students who had used the mean in Year 1, there was increasing reference to the mode in Parts 3 and 4 (4 of 13 students in Year 1, 9 of 13 in Year 5). This was partly due to students who changed to the A2 level and, as noted earlier, more frequently referred to the modal concept to describe the situation intuitively without confusing this with the mean concept.

Research Question 3: Average as Representation

As noted in Table 1, the idea of using an average somehow to represent a data set was a major determinant in assigning responses to the R level for those who could not solve the complex applied problems necessary to be allocated to levels A1 and A2. The types of response acknowledging something representative about the mean varied greatly, and the examples given here reflect the variation of meaning as well as sophistication. As seen in Table 2, only one Grade 3 response was classified as R. This student had a strong idea of middle combined with most. When asked how the average cricket score worked, he replied, “Well, the score they usually get closest to most of the time.” On Part 4 he said, “Well, you would get all the people and you would halven it … and put that as the average.” The response reflected an appreciation of the data set and the average telling something about it.

Other responses, all from students in Grade 6 or higher, illustrated other representative aspects of the average concept. Seven students chose an average to represent and compare two data sets in the previous interview protocol. Expectation was noted on some occasions, for example, with students responding to “What does average mean?” with “the usual expected mark or score,” or with “that gives you what you would expect one person of the whole lot to be doing.” This idea was also sometimes expressed in terms of likelihood: “ … you work out which hobby has the most number, and [you] would be more likely to meet the person that likes that sort of thing” or “what’s most likely to become … be.”

Most students at the A1 and A2 levels also demonstrated a notion of representation somewhere during the interview. Only two students explicitly used the term represent: “It … gives a fairly accurate representation of, say, what that class of a school got for a test” (categorized A1) and “With a group of scores to give a single score that represents how well people have done, or how high or low the scores are in general” (categorized A2). The latter student displayed a quite sophisticated understanding of the nature of the arithmetic mean, for example, for Part 3, “the average isn’t always a possible number … so the average is not going to be the most common number that people have […] sort of what people generally have, it’s just a number to work out so you can compare.” Most responses at the A1 and A2 levels also tended to use the idea of most in a colloquial repetitive sense in answers to Part 3.

Three responses at the M level involved representative ideas, but overall in the clustering process, the levels of response were not judged to be higher than the M level. One Grade 6 student said of average, “Use it sort of if you’ve got lots of scores or something and you want to find the average, see what you usually get.” Overall, however, the response was weak, for example, suggesting for Part 4, “You just add them up, the averages … 12,” with no appreciation for the nonrepresentative nature of the answer. A Grade 7 student suggested an average could be used “When you give a test and you’re working out, if some people are away, what mark to give them” using the add-and-divide algorithm. Similarly, a Grade 8 student mentioned using it for mathematics tests when “He says you average it out … it means sort of, like, if you’re averaging something out … you’re making it equal so it’s the same on both sides, sort of.” These two responses appear to be based on ideas of representation acquired from a particular usage by classroom teachers, and they were not typical of the overall performances of these students.

Overall, students displayed greater appreciation of the average as a representative measure with increasing grade and with increasing developmental level. As is seen in Tables 5 and 6, after 3 or 4 years nearly all students’ responses were at levels reflecting an appreciation of the representative nature of an average.

Research Question 4: Problem-Solving Strategies

Although some examples of responses to the two problems in Parts 3 and 4 of the protocol were given earlier, the purpose of this section is to describe developing strategies for Parts 3 and 4 as observed over the 137 interviews. Some of the lower level responses would appear to reflect a drop in performance from Parts 1 and 2, when students encountered a problem much too difficult for their current cognitive repertoire.

Consider first the problem in Part 3 about 10 families with an average of 2.3 children. Interpreting the decimal mean posed a difficulty for some students. Those who could not comprehend the mathematics sometimes referred to their personal experiences of families. For example, when asked how many children the other eight families would have, one Grade 6 student replied, “I would say two children for each of them … there might be a few ones, but I reckon most of them would have two because you see most people do have two these days.” Others picked reasonable numbers such as two, three, or four for the number of children in families with no justification. A Grade 5 student quoted earlier showed some ability to manipulate the information in suggesting “two each or a few more … because 2.5 would mean half, so that means two and a half of the families … ” [S6]. The child was starting to grapple with the ideas involved but could not put it all together.

Determining the total by reversing the mean algorithm and multiplying was a central idea for success on Part 3. The Grade 6 student [S10] described earlier at the R level illustrated well the struggle some students had when they understood the application of the mean algorithm in a forward direction but could not reverse it. Occasionally, a student applied the forward strategy successfully with compensation that used the total, although not quite signaling the key idea of reversing that made it viable. The Grade 6 student [S11] described at the A1 level was an excellent example of the transition that, when complete, produced a much simpler structure in the response. The typical response showing this use of the algorithm in reverse is seen in several responses at the A1 and A2 levels (e.g., S12, S14, S15, S19, and S20).

The distribution of data values for Part 3 also involved different strategies, beginning with the simple acknowledgment that there must be 23 children in the 10 families. Several students stopped here and worked out the 10 families, including the Grants and Coopers, to sum to 23. Most responses, however, subtracted 5 and discussed 8 families with a total of 18 children. Some stopped here, saying they could not determine the number of children per family, some produced an example (e.g., S12 and S15), and some indicated there were “countless” solutions (e.g., S23 and S24). The open-ended form of the question allowed students to discuss what could be determined with certainty based on the 2.3 average and what was not certain but might be possibly or reasonably supposed. Hence, the variety of responses (being satisfied with a total of 18, an answer of 2.25, or a distribution) was to be expected.
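The successive stopping points for Part 3 (a total of 23, a remainder of 18, a mean of 2.25, or a full distribution) can be laid out as a worked sketch of the reversed algorithm. The variable names are illustrative, and the example distribution is a hypothetical one chosen only because it sums to 18; it is not taken from any student response. The two named families are taken to have 4 and 1 children between them, as the quoted responses indicate.

```python
from fractions import Fraction

families = 10
mean_children = Fraction("2.3")   # exact value, avoids float rounding
known_children = [4, 1]           # the two named families (5 in total)

# Reverse the mean algorithm: total = mean x number of families.
total_children = mean_children * families
remaining_children = total_children - sum(known_children)
remaining_families = families - len(known_children)
mean_remaining = remaining_children / remaining_families

print(total_children, remaining_children, float(mean_remaining))
# 23 18 2.25

# Hypothetical distribution for the other 8 families; many others work.
example = [2, 2, 2, 2, 2, 2, 3, 3]
```

Only the totals 23 and 18 are determined by the 2.3 average; any whole-number distribution over the 8 remaining families that sums to 18 is consistent with it.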

In response to the problem in Part 4, some students again resorted to out-of-school experiences. One Grade 6 student commented, “If you are on a farm or something, you probably wouldn’t be watching [TV], you would be out, going looking after sheep or cattle, or something like that, you wouldn’t be really watching TV” [S5]. Other students performed a single calculation with the numbers provided, such as “4 + 8 = 12, so 12 hours,” “100 ÷ 12 = 8.33,” or “12 ÷ 100 = 0.12,” often not recognizing the unreasonable nature of the answer. Some who had difficulty made another attempt, whereas others gave up. The typical response for those newly introduced to the mean was to work out the average as “(4 + 8)/2 = 6” (e.g., S8, S19, and S23). Some students used the middle idea to justify this answer as well. A few (e.g., S13) struggled with the dilemma of more students watching 4 hr of TV but eventually decided on 6 as a response. This wavering is not unusual as understanding is developing.

For students who were familiar with applying the algorithm for the mean in a forward direction only and realized that 4 and 8 were average values, Part 4 created considerable conflict. After reading the question, a Grade 7 student asked, “Don’t you need everybody’s time?” (cf. S11). It would appear that these students had not been exposed to, or had not consolidated, the procedure for working the algorithm backward.

Some students had an intuitive feel for the information in the problem, especially the different sized groups, and suggested that the answer should be nearer to 4 than to 8. Without reversing the algorithm or using ratio, however, these students did not settle on a specific value and were often uncertain of their reasoning. For example, a Grade 7 student [S22] suggested a mean of around 4.3 because more people were watching 4 hr than 8 hr and used an intuitive notion of ratio in comments like “not much of a lot, rather than a lot of not much,” indicating an impressive start on the problem. Most responses that were correct (26 of 31) employed the algorithm for the weighted mean as part of the solution (e.g., S13, S14, S18, S19, S20, S23, and S24). A few students explicitly used ratio. The Grade 9 student [S15] noted earlier found 5 as three quarters of the way from 8 to 4 because three quarters of the children were in the city.
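The three routes to an answer for Part 4 described here can be compared side by side. This is a minimal sketch using the figures quoted in the interviews (25 country students averaging 8 hours, 75 city students averaging 4 hours); the function and variable names are assumptions made for illustration.

```python
def weighted_mean(group_means, group_sizes):
    """Recover each group's total, add the totals, divide by everyone."""
    total = sum(m * n for m, n in zip(group_means, group_sizes))
    return total / sum(group_sizes)


group_means = [8, 4]     # hours of TV: country, city
group_sizes = [25, 75]   # number of students: country, city

# Typical early response: treat the two averages as two data values.
unweighted = sum(group_means) / len(group_means)

# Weighted mean algorithm: (25 * 8 + 75 * 4) / 100.
weighted = weighted_mean(group_means, group_sizes)

# Ratio argument: move from 8 toward 4 by the city share of students.
city_share = group_sizes[1] / sum(group_sizes)
by_ratio = group_means[0] + city_share * (group_means[1] - group_means[0])

print(unweighted, weighted, by_ratio)  # 6.0 5.0 5.0
```

The weighted algorithm and the ratio argument agree on 5, whereas the unweighted calculation gives 6 because it ignores the different group sizes.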

The most sophisticated solutions to Part 4 were those that used both the algorithm and a weighting argument based on looking at the sample sizes to confirm the answer of 5. These students usually echoed the response of the Grade 12 student [S20] in Longitudinal Study 1. Overall, 12 (of 31) correct responses to Part 4 combined the two approaches.

The increasing facility over time of individuals with problem-solving strategies in the two complex contexts was observed (see Tables 5 and 6), in which 9 and 6 students, respectively, improved to the A1 or A2 levels over 3 or 4 years. Examples of the longitudinal change in strategies are seen for 4 students in Longitudinal Studies 1 and 2 (S18, S19, S20, and S23).
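The figures of 9 and 6 can be recovered from the counts printed in Tables 5 and 6. The sketch below assumes that rows give the later interview level and columns the Year 1 level (an orientation inferred from the text), and that a move from A1 to A2 counts as an improvement; the names used are illustrative only.

```python
# Response levels in order, lowest to highest.
LEVELS = ["P", "U", "M", "R", "A1", "A2"]

# Key = later level; list = counts by Year 1 level, in LEVELS order.
# Assumption: rows are the later interview, columns are Year 1.
TABLE_5 = {
    "M":  [4, 1, 3, 0, 0, 0],
    "R":  [1, 0, 1, 2, 0, 0],
    "A1": [0, 2, 1, 0, 0, 0],
    "A2": [0, 0, 2, 3, 1, 1],
}
TABLE_6 = {
    "M":  [1, 3, 2, 0, 0, 0],
    "R":  [0, 1, 4, 0, 0, 0],
    "A1": [0, 0, 2, 1, 1, 0],
    "A2": [0, 0, 1, 0, 2, 3],
}


def improved_to_applied(table):
    """Count students who ended at A1 or A2 after starting lower."""
    count = 0
    for later_level in ("A1", "A2"):
        later_rank = LEVELS.index(later_level)
        # Columns to the left of the diagonal are lower starting levels.
        count += sum(table[later_level][:later_rank])
    return count


print(improved_to_applied(TABLE_5), improved_to_applied(TABLE_6))  # 9 6
```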

DISCUSSION

Summary of Longitudinal Analysis

The six levels of response described in Table 1 were distinct, and each was based on more than 10 examples provided from a relatively large initial sample of 94 students as well as longitudinal samples totaling 43 students. Our initial interrater reliability of 80% is not unusual for the type of data used to cover a range of questions (e.g., Mokros & Russell, 1995). The clustering and redefining procedure led to considerable confidence in the final allocation of responses to levels.

The levels provide a developmental sequence from which it would be hypothesized that students should not regress but generally improve over time. The tendency for students to perform at higher levels in subsequent interviews reflected the same trends across grades at the time of the initial data collection. These findings support the view that changes over grades do indeed represent common development over time for students. Progression related to ideas of representation and problem-solving strategies was noted as part of the developmental sequence in Table 1. This progress was related both to response level and to grade, as there was a positive association between them. The results of this study reinforce the outcomes observed from the survey data reported by Watson and Moritz (in press). The increasingly complex structure of responses at the first four levels was evident in in-depth interviews as well as in the survey questions.

This investigation also assessed understanding of average in higher order, problem-solving settings. For the more complex questions addressed in Parts 3 and 4 of the protocol, of those students who could initially do one or the other or both, all who were interviewed again could repeat the performance, perhaps using a slightly different strategy; that is, no students at the A1 level in either Longitudinal Study 1 or 2 switched from one problem to the other. This should be encouraging to those concerned about student difficulties in these complex contexts. When students do finally appreciate how to use the algorithm for the mean in a more difficult setting, they appear not to forget it. This, of course, does not solve the problem of how to establish initial student understanding.

The factors that produced the positive changes in outcome cannot be confidently identified with any individual source. As noted by Biggs and Collis (1991), they are likely to be a result of readiness for change in relation to performance at the previous level, development of working memory, social support from teachers and fellow students, and confrontation with cognitive conflict in problem situations. Based on the curriculum documents in Australia at the time, it can be assumed that students in Grades 7 and higher during the 3 or 4 years of the studies were exposed to some classroom work on the three standard measures of average. In fact, responses from Grade 6 students (e.g., S10, S11, S16, S18, and S23) would appear to reflect the students having been taught the algorithm for the mean in that grade or earlier. There is certainly some evidence from interviews of students remembering being taught ideas as well as retrieving out-of-school social experiences. The results of this investigation likely reflect student understanding from various sources that had been internalized and adapted according to the capacities of the students’ maturation levels.

It could be argued that average is the most well known concept in the statistics part of the curriculum and that educators might expect the best long-term learning outcomes for this concept. Although the data are consistent with a developmental progression of outcomes, some might still lament the lack of progress of some students to higher response levels. Perhaps a more coherent ordering of the three standard measures of central tendency in the curriculum (Mokros & Russell, 1995) and more open-ended experiences (Gal, 1995) might lead to better performance overall in later years, but this conjecture awaits further research in classroom teaching settings.

Representation

The idea of an average representing a data set is important both in terms of general usage as well as in being able to apply the arithmetic mean in more complex contexts. Mokros and Russell (1995) stated this relationship of an average to a data set in the following fashion:

Until a data set can be thought of as a unit, not simply as a series of values, it cannot be described and summarized as something that is more than the sum of its parts. An average is a measure of the center of the data, a value that represents aspects of the data set as a whole. An average makes no sense until data sets make sense as real entities. (p. 35)

The responses of some of the younger students in this study (e.g., S1) illustrate that this understanding must be built over the years of schooling; it is not present for all at Grades 3 and 5.

In their study, Mokros and Russell (1995) suggested a trend for older students to view data-based problems in such a way that average is seen as a summary statistic that represents an entire data set. A similar trend was found in this study. Although the ability to apply the arithmetic mean increased with grade, the discussion of average often continued to involve colloquial language rather than terms such as represents to describe an average. It might be suggested that, as students acquire the statement of the algorithm for the mean, they expect this to be the desired answer when someone (interviewer or teacher) asks them what an average is. Some go on to add a descriptive phrase, often related to most or middle. Those who do not may have one of two unexpressed thoughts: (a) They may not know more than the definition; or (b) they may see the definition as, by its very nature, representing all of the values in the data set and see no need to state the representative nature explicitly. Educators may hope for the latter, but reinforcement with discussion in the classroom would be a tactic to ensure that this representative understanding develops.

Whereas Mokros and Russell (1995) classified students who view average as middle as having a representative view of average, the description in their Table 1 is much richer than just the idea of middle. It is assumed that responses that only mentioned average as middle (e.g., S6, S7, and S8 in this investigation) would not have been so classified by Mokros and Russell. In this investigation, another aspect or amplification was required before a response was classified at the R level.

Usage of Mean, Median, and Mode

One goal of this study was to follow the usage of the ideas associated with the three standard measures of central tendency for students of different grades over time. The rich elaborations in many responses illustrated the students’ attempts to grapple with their intuitions about average and reinforced our belief that interviews are important to extend and validate survey work. Comments associated with most were classified as related to the mode concept, and those referring to middle, to the median concept. Although comments associated with balancing a data set would be related to the mean, these virtually never occurred, and instead, mention of the add-and-divide algorithm was designated as the mean. Because of its introduction in the school curriculum at about Grade 6, it was not expected that the mean notion would be observed below this level, and it was not. Although the algorithm was familiar to almost all students from Grade 6 and higher, many were not able to apply it in more complex problem-solving tasks.
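The three ideas coded in the responses can be contrasted on a single small data set. The following is a minimal sketch in Python; the seven values are hypothetical (chosen only so that the add-and-divide result is 3 hr per day) and are not data from this study.

```python
from statistics import median, mode

# Hypothetical data: hours of television watched per day by seven students.
# These values are an assumption for illustration, not data from the study.
hours = [0, 1, 1, 2, 3, 5, 9]

most = mode(hours)                        # "most": the value that occurs most often
middle = median(hours)                    # "middle": the central value of the ordered data
add_and_divide = sum(hours) / len(hours)  # the add-and-divide algorithm for the mean

print(most, middle, add_and_divide)
```

In this hypothetical set the three ideas give three different answers (1, 2, and 3.0), which is the kind of situation in which a student must decide which measure is being asked for.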

Whereas Mokros and Russell (1995) categorized students by their dominant use of measure of central tendency, this investigation instead found it useful to document all usage. This was particularly important in the general context of Parts 1 and 2 of the protocol, in which no measure would be expected to be more useful in discussing the meaning of average or the 3 hr per day average television-watching habits of students. This approach produced responses showing that ideas associated with middle and most were each present for about one fourth of Grade 3 students and about one half of other grades. Furthermore, for students responding at the M level or higher, between one third and one half expressed two of the three ideas, and around 20% expressed all three. It would appear that many students held eclectic ideas associated with average, even when they had not been taught them formally. This is a good sign for teachers because it means there is a foundation on which to discuss and build classroom experiences. When the median and mode are formally introduced, links should be made with the students’ prior understanding. That these prior notions are so common lends support to the recommendations of Mokros and Russell that the mean be introduced in the curriculum last, after intuitive ideas related to mode and median have been consolidated with the idea of a value representing a data set. Eventually these ideas all need to be developed as expressions of mode, median, and mean, both on conceptual and procedural levels, with students able to discern which measure is most appropriate for given situations (Friel, 1998).

For the specific contexts of Parts 3 and 4 of the protocol, where it would be expected that the mean would be the measure of preference, it was interesting to note the continued mention of middle and most. With about 20% of students at the M level and higher mentioning the middle, for those at the M, R, and A1 levels, it may be that the idea was mistakenly used in the weighted mean problem. This could not, however, have been the case in the A2 category, and in fact, students responding in the context of 2.3 children often mentioned that .3 was not the middle of 2 and 3, or for the weighted data set, that 6 was not the appropriate middle. Most was more commonly referred to than middle at three of the four levels (M to A2; see Table 4), and this reflected attempts at explaining the distribution informally as well as formally with the mean (e.g., S7). In some cases, it appears that students were seeing data sets represented in graphical form and seeing a peak (most or mode) as the place where the distribution would also balance. Although this was not always the case, students may have been familiar with examples such as the normal curve where the highest point is the mean. This usage of most did not appear to interfere in any way with suggested solutions to Parts 3 and 4, whether correct or incorrect.
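The difference between taking the middle of two group means and computing the weighted mean can be made concrete with a short sketch. The 8 hr value and the mistaken middle of 6 are taken from this discussion; the second group mean of 4 hr and both group sizes are assumptions chosen only for illustration, and `combined_mean` is a hypothetical helper rather than anything used in the interview protocol.

```python
from fractions import Fraction

def combined_mean(groups):
    """Weighted mean of several groups, each given as (group_mean, group_size).

    Each group's total is recovered as mean * size, the totals are added,
    and the grand total is divided by the combined number of values.
    """
    grand_total = sum(Fraction(group_mean) * size for group_mean, size in groups)
    combined_size = sum(size for _, size in groups)
    return grand_total / combined_size

# Assumed figures for illustration only: 10 country students averaging 8 hr
# and 30 city students averaging 4 hr. Only the 8 hr value appears in the text.
groups = [(8, 10), (4, 30)]

middle_of_means = Fraction(8 + 4, 2)    # the mistaken "middle" strategy: 6
weighted_mean = combined_mean(groups)   # (80 + 120) / 40 = 5

print(middle_of_means, weighted_mean)
```

Under these assumed group sizes the weighted mean is 5, not 6, because the larger group pulls the combined average toward its own mean; the two answers coincide only when the groups are the same size.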

As the more difficult questions in this study required an understanding of the algorithm for the mean to obtain a total, either of the number of children in 10 families in Part 3 or of the hours of television viewing by country and city students in Part 4, it is relevant to look at methods suggested for teaching the mean. Friel (1998) suggested two models for teaching understanding of the arithmetic mean: a balance beam model and an “evening-out” model. Although the balance model no doubt promotes the idea of average as representative of the data set (Mokros & Russell, 1995), a particular strength of the evening-out model lies in its close relation to the algorithm. The need to find the total before an even distribution is made reinforces the importance of the total as part of the representation of the data set. The suggestion of Russell and Mokros (1996) that some students “lose” the original data in this redistribution led them to suggest another model, the “unpacking” model. This model involves the reconstruction of a data set with a given mean where initially all of the data values were equal to the mean, probably shown graphically. As data elements are moved from the mean value to create a realistic data set with variation, the total must remain constant, as must the balance point. By working problems in reverse, this model anticipates problems in Parts 3 and 4 of the protocol and thus could be a useful learning experience. This model may also help to clarify thinking of the average from an evening-out perspective by acknowledging that the even share is only equivalent in relation to the total, not in relation to the way the data are distributed. The type of numerical values used in this investigation may assist in creating the cognitive conflict that leads to a resolution of this problem. Although it is theoretically possible to assign each country student a value of 8 hr of television watching, it is not possible realistically to assign a value of 2.3 to each family; thus, it is the total in each case that holds the relevant information.
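The role of the total in reversing the mean algorithm, and the way the unpacking model keeps that total fixed, can be sketched as follows. This is an illustrative sketch under stated assumptions, not a procedure from the cited authors: the function names `total_from_mean` and `unpack` are hypothetical, and the five-student data set and the particular moves are invented for the example. Only the figures 2.3 children, 10 families, and 8 hr come from the text.

```python
from fractions import Fraction

def total_from_mean(mean_value, count):
    """Reverse the add-and-divide algorithm: total = mean * number of values."""
    return Fraction(str(mean_value)) * count

def unpack(mean_value, count, moves):
    """Build a varied data set from one in which every value equals the mean.

    Each move (giver, receiver, amount) shifts `amount` from one position
    to another, so the total, and therefore the mean, never changes.
    """
    data = [mean_value] * count
    for giver, receiver, amount in moves:
        data[giver] -= amount
        data[receiver] += amount
    return data

# Part 3 context: a mean of 2.3 children across 10 families implies a total
# of 23 children, even though no single family can have 2.3 children.
children_total = total_from_mean("2.3", 10)

# Hypothetical unpacking: five students who each start at the 8 hr mean.
unpacked = unpack(8, 5, [(0, 1, 3), (2, 3, 2)])

print(children_total, unpacked, sum(unpacked) / len(unpacked))
```

Here the unpacked data set is [5, 11, 6, 10, 8]: the values now vary, but the total is still 40 hr and the mean is still 8 hr.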

CONCLUSIONS

The message of this investigation for teachers and curriculum planners reinforces that of others (cf. Friel, 1998; Gal, 1995; Mokros & Russell, 1995; Russell & Mokros, 1996). Students must be presented with learning experiences that provide opportunities (a) to see the need for representing a data set with a particular value, be it the mean, median, or mode; (b) to learn algorithms to produce representative values, particularly the mean and median; (c) to explore the relation between a specific measure of central tendency and possible data sets that could have created it; (d) to consider contexts in which data sets are compared and combined using the mean as the representative measure; and (e) to explore open-ended problems requiring the mean algorithm to be reversed and the total seen as an important link in the process. Complementing these experiences with consideration of variation in the data sets being represented with an average is also advocated following recent research by Shaughnessy and colleagues (Shaughnessy, 1997; Shaughnessy, Watson, Moritz, & Reading, 1999).

Advocating these five recommendations is nothing new, but they must be reiterated, particularly for classroom teachers. The folklore of teaching says that textbook problems will influence what is learned. Obviously, on their own they have not been enough, judging from the examples of Capel (1885) quoted at the beginning of this article. More input to the profession from educator–researchers such as Gal (1995), Friel (1998), and Russell and Mokros (1996) should assist if their messages are heeded.

ACKNOWLEDGMENTS

This research was funded by the Australian Research Council with Large Grants A79231392 and A79800950 and with a Small Grant at the University of Tasmania in 1997.

REFERENCES

Australian Education Council. (1991). A national statement on mathematics for Australian schools. Carlton, Australia: Author.


Australian Education Council. (1994). Mathematics - A curriculum profile for Australian schools. Carlton, Australia: Curriculum Corporation.

Biggs, J. B., & Collis, K. F. (1982). Evaluating the quality of learning: The SOLO taxonomy. New York: Academic.

Biggs, J. B., & Collis, K. F. (1991). Multimodal learning and the quality of intelligent behavior. In H. A. H. Rowe (Ed.), Intelligence: Reconceptualization and measurement (pp. 57–76). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.

Cai, J. (1995). Beyond the computational algorithm: Students’ understanding of the arithmetic average concept. In L. Meira & D. Carraher (Eds.), Proceedings of the 19th Psychology of Mathematics Education Conference (Vol. 3, pp. 144–151). São Paulo, Brazil: PME Program Committee.

Cai, J. (1998). Exploring students’ conceptual understanding of the averaging algorithm. School Science and Mathematics, 98, 93–98.

Capel, A. D. (1885). Catch questions in arithmetic & mensuration and how to solve them. London: Joseph Hughes.

Collis, K. F., & Romberg, T. A. (1992). Collis–Romberg mathematical problem-solving profiles. Hawthorn: Australian Council for Educational Research.

Denbow, C. H., & Goedicke, V. (1959). Foundations of mathematics. New York: Harper & Row.

Department for Education (England and Wales). (1995). Mathematics in the national curriculum. London: Author.

Friel, S. N. (1998). Teaching statistics: What’s average? In L. J. Morrow (Ed.), The teaching and learning of algorithms in school mathematics (pp. 208–217). Reston, VA: National Council of Teachers of Mathematics.

Friel, S. N., Mokros, J. R., & Russell, S. J. (1992). Statistics: Middles, means, and in-betweens. Palo Alto, CA: Dale Seymour.

Gal, I. (1995). Statistical tools and statistical literacy: The case of the average. Teaching Statistics, 17, 97–99.

Gal, I., Rothschild, K., & Wagner, D. A. (1989, April). Which group is better?: The development of statistical reasoning in elementary school children. Paper presented at the meeting of the Society for Research in Child Development, Kansas City, MO.

Gal, I., Rothschild, K., & Wagner, D. A. (1990, April). Statistical concepts and statistical reasoning in school children: Convergence or divergence? Paper presented at the meeting of the American Educational Research Association, Boston.

Goodchild, S. (1988). School pupils’ understanding of average. Teaching Statistics, 10, 77–81.

Hardiman, P. T., Well, A. D., & Pollatsek, A. (1984). Usefulness of a balance model in understanding the mean. Journal of Educational Psychology, 76, 792–801.

Hart, W. L. (1953). College algebra (4th ed.). Boston: Heath.

Holman, L. J. (1938). Simplified statistics. London: Pitman.

Leon, M. R., & Zawojewski, J. S. (1991). Use of the arithmetic mean: An investigation of four properties, issues and preliminary results. In D. Vere-Jones (Ed.), Proceedings of the Third International Conference on Teaching Statistics: Vol. 1. School and general issues (pp. 302–306). Voorburg, The Netherlands: International Statistical Institute.

Mevarech, Z. (1983). A deep structure model of students’ statistical misconceptions. Educational Studies in Mathematics, 14, 415–429.

Meyer, R. A., Browning, C., & Channell, D. (1995). Expanding students’ conceptions of the arithmetic mean. School Science and Mathematics, 95, 114–117.

Miles, M. B., & Huberman, A. M. (1994). Qualitative data analysis: An expanded sourcebook (2nd ed.). Thousand Oaks, CA: Sage.

Ministry of Education. (1992). Mathematics in the New Zealand curriculum. Wellington, New Zealand: Author.


Mokros, J., & Russell, S. J. (1995). Children’s concepts of average and representativeness. Journal for Research in Mathematics Education, 26, 20–39.

National Council of Teachers of Mathematics. (1989). Curriculum and evaluation standards for school mathematics. Reston, VA: Author.

Pendlebury, C., & Robinson, F. E. (1928). New school arithmetic. London: G. Bell.

Pollatsek, A., Lima, S., & Well, A. D. (1981). Concept or computation: Students’ understanding of the mean. Educational Studies in Mathematics, 12, 191–204.

Reed, S. K. (1984). Estimating answers to algebra word problems. Journal of Experimental Psychology: Learning, Memory, and Cognition, 10, 778–790.

Russell, S. J., & Mokros, J. (1996). What do children understand about average? Teaching Children Mathematics, 2, 360–364.

Shaughnessy, J. M. (1997). Missed opportunities in research on the teaching and learning of data and chance. In F. Biddulph & K. Carr (Eds.), People in mathematics education (Vol. 1, pp. 6–22). Waikato, New Zealand: Mathematics Education Research Group of Australasia.

Shaughnessy, J. M., Watson, J., Moritz, J., & Reading, C. (1999, April). School mathematics students’ acknowledgment of statistical variation. In C. Maher (Chair), There’s more to life than centers. Symposium conducted at the Research Presession of the 77th Annual National Council of Teachers of Mathematics Conference, San Francisco.

Strauss, S., & Bichler, E. (1988). The development of children’s concept of the arithmetic average. Journal for Research in Mathematics Education, 19, 64–80.

Watson, J. M. (1994). Instruments to assess statistical concepts in the school curriculum. In National Organizing Committee (Ed.), Proceedings of the Fourth International Conference on Teaching Statistics (Vol. 1, pp. 73–80). Rabat, Morocco: National Institute of Statistics and Applied Economics.

Watson, J. M. (1996). What’s the point? Australian Mathematics Teacher, 52, 40–43.

Watson, J. M. (1998). Statistical literacy: What’s the chance? Reflections, 23, 6–14.

Watson, J. M., & Moritz, J. B. (1999). The beginning of statistical inference: Comparing two data sets. Educational Studies in Mathematics, 37, 145–168.

Watson, J. M., & Moritz, J. B. (in press). The development of concepts of average. Focus on Learning Problems in Mathematics.
