A Statistics Field Trip?

  • Published on
    15-Jul-2016


  • INTRODUCTION

    DURING a recent holiday on the small island of Tiree in the Hebrides, our party, which consisted of two families each of three people, three males and three females, stayed in the Alan Stevenson Centre at Hynish. This centre, which is owned and run by the Hebridean Trust, occupies the former storehouse used during the construction of Skerryvore Lighthouse (built between 1838 and 1843 by Alan Stevenson, the engineer uncle of Robert Louis Stevenson). The centre caters for small school parties at other times of the year and I started thinking about possible statistics projects which could be used to justify a `statistics field trip' to such an idyllic spot.

    Tiree, Gaelic name Tir-Iodh, the land of corn, is one of many islands off the west coast of Scotland. Information about Tiree can be found on the Web site http://www.scotland-info.com/tiree.htm. The west-coast islands are in three groups, the southernmost being the islands of the Clyde, including Bute; further north lie the Inner Hebrides including Tiree, Skye, Mull, Coll and Lismore, and further north still the Western Isles (formerly known as the Outer Hebrides), including Barra and Lewis. An excellent map showing the position of all the islands can be found on the Caledonian MacBrayne Web site http://www.calmac.co.uk/. Tiree is the outermost of the Inner Hebrides; it is a small island of about 30 square miles and with a population of between 800 and 900. Figure 1 gives a map of Tiree showing the location of Hynish (1).

    TRAVEL TO TIREE: CORRELATION AND REGRESSION

    Tiree is reached either by air from Glasgow or on the Caledonian MacBrayne (Calmac) ferry `Lord of the Isles' which sails to Scarinish (2) on the island from Oban on the mainland. At the time of our visit, there were complaints in the local paper, An Tirisdeach, that the costs of the journey to


    KEYWORDS: Teaching; Correlation; Ranks; Hypothesis tests; Poisson.

    Susan Meacock
    University of Southampton, England.
    e-mail: sem1@soton.ac.uk

    Summary
    Ideas for statistical investigations can be found in all sorts of places, even on holiday.

    Fig 1. Map of Tiree.

    Teaching Statistics. Volume 22, Number 1, Spring 2000 . 7

  • Tiree were high compared to the costs of journeys to other islands off the west coast of Scotland. The Calmac timetable (Caledonian MacBrayne 1996) gives travel costs and journey times to all 23 islands which it serves. From this, journey times and costs (for a single passenger ticket) for the 16 islands accessible directly from the mainland were obtained. (Current information is available on the Calmac Web site.) Journey times varied from 30 minutes for Bute and Skye to 5 hours for Barra and costs from £1.95 for Lismore to £15.95 for Barra. At that time Tiree had a journey time of 3 hours 55 minutes (via Mull and Coll) at a cost of £9.85. Using a computer statistical package, the correlation between cost and time was found to be 0.949, a highly significant correlation for a sample size of 16. The regression line estimating cost (in pence) in terms of time (in minutes) was given by:

    $$\text{cost} = 68.0 + 4.89 \times \text{time} + \text{error}$$

    This gives, for a journey time of 3 hours 55 minutes (235 minutes), an expected cost of £12.17. In these terms the Tireasdachs appear to have little to complain about! The plot in figure 2 shows that 6 of the islands are paying less than expected with 10 paying more. Lewis in the Western Isles appears to be suffering most. Of course, it could well be argued that the ferry to Tiree does not take the shortest route and that a fairer measure would be to use sea distances rather than journey times. To expand this project, other costs (passenger returns, vehicles) could be used or a comparison made with air fares where these are available. Another example of the use of correlation and regression could include examining the relationship between population and area on these west-coast islands.
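The expected-cost calculation for Tiree can be checked with a few lines of Python. This is only a sketch: the intercept and slope are the rounded values quoted in the text, the function and variable names are illustrative, and the full data for the 16 islands are not reproduced here, so the line itself is not refitted.

```python
# Fitted line quoted in the article: cost (pence) = 68.0 + 4.89 * time (minutes).
# The coefficients are the rounded published values, not a refit.
INTERCEPT_PENCE = 68.0
SLOPE_PENCE_PER_MINUTE = 4.89


def expected_cost_pence(time_minutes: float) -> float:
    """Cost predicted by the quoted regression line, in pence."""
    return INTERCEPT_PENCE + SLOPE_PENCE_PER_MINUTE * time_minutes


tiree_minutes = 3 * 60 + 55        # 3 hours 55 minutes
tiree_actual_pence = 985           # fare quoted in the article

tiree_expected_pence = expected_cost_pence(tiree_minutes)
# A negative residual means the fare is below the regression line.
tiree_residual_pence = tiree_actual_pence - tiree_expected_pence

print(f"expected: {tiree_expected_pence / 100:.2f} pounds")
print(f"residual: {tiree_residual_pence / 100:.2f} pounds")
```

A negative residual corresponds to an island paying less than the line predicts for its journey time.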

    `SCULPTURE IN WATER': RANKING METHODS

    During our holiday we paid a visit to an exhibition entitled `Sculpture in Water' at Milton (3), in which 10 international artists had created sculptures in a series of watery areas (known as lochans), mainly using material found locally. We saw 11 of the 13 sculptures (in a typical Hebridean downpour) and the 6 members of the party ranked them in order of their personal appeal. Spearman's rank correlation coefficients were then obtained for each pair of rankings. Table 1 gives the values (correct to 3 decimal places) of Spearman's coefficient for each of the pairs. The figures in bold type are significantly greater than zero at the 1% level; those in italics are significant at the 5% but not at the 1% level. (For 11 objects the 5% critical value is 0.536 and the 1% critical value is 0.709.)

    Figure 3 illustrates all the correlations significant at the 5% and 1% levels. Four of the participants clearly agreed fairly well in their orderings, while one, JB, had negative coefficients with three of the other participants.

    For each group of three people the average correlation can be obtained. These were 0.75 and 0.16 for the two families, 0.48 for the women and 0.32 for the men.
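The group averages are simply the means of the three pairwise coefficients within each group. The sketch below recomputes the two family averages from the Table 1 values. Two things are assumptions rather than statements from the article: that the families are the participants sharing a surname initial (TB, JB, SB and SM, EM, GM), and that the JB/SB coefficient is negative, which is the sign that reproduces the quoted family average.

```python
from itertools import combinations

# Pairwise Spearman coefficients taken from Table 1.
# ASSUMPTION: the JB/SB value is negative; the sign is chosen because it
# reproduces the family average quoted in the text.
coefficients = {
    frozenset(("TB", "JB")): 0.191,
    frozenset(("TB", "SB")): 0.373,
    frozenset(("JB", "SB")): -0.082,
    frozenset(("SM", "EM")): 0.764,
    frozenset(("SM", "GM")): 0.709,
    frozenset(("EM", "GM")): 0.782,
}


def average_correlation(group):
    """Mean of the pairwise coefficients among the members of a group."""
    pairs = [frozenset(pair) for pair in combinations(group, 2)]
    return sum(coefficients[pair] for pair in pairs) / len(pairs)


# ASSUMPTION: families are identified by shared surname initial.
family_b = average_correlation(("TB", "JB", "SB"))
family_m = average_correlation(("SM", "EM", "GM"))

print(f"B family: {family_b:.2f}")
print(f"M family: {family_m:.2f}")
```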

    Fig 2. Regression of cost against time.

    It can be shown (Kendall 1943) that for groups of m individuals ranking n objects, the average correlation lies between $-1/(m-1)$ and 1. The maximum value of 1 will always be obtained when each of the m individuals awards the same rankings. The minimum value of $-1/(m-1)$ is obtained when the sum of the rankings given by the m rankers is the same for all of the n objects being ranked and is equal to m times the average rank, $(n+1)/2$. Hence if n is even and m odd and assuming no tied ranks, the minimum cannot be achieved. It is an interesting exercise to find sets of m rankings with minimum average correlation for

             TB      JB      SB      SM      EM      GM
    TB    1.000   0.191   0.373   0.546   0.136   0.427
    JB            1.000  -0.082  -0.118  -0.046   0.473
    SB                    1.000   0.718   0.655   0.564
    SM                            1.000   0.764   0.709
    EM                                    1.000   0.782
    GM                                            1.000

    Table 1. Spearman's rank correlation coefficients


  • m even and any value of n, which is easy, and for m odd and n odd, which is more challenging.
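The `easy' case of m even can be illustrated numerically. In the sketch below (function names and the particular permutation are illustrative choices, not taken from the article), each ranking is paired with its reverse, so every object receives the same rank sum and the average pairwise Spearman coefficient should reach the lower bound $-1/(m-1)$. Spearman's coefficient is computed from the standard no-ties formula $1 - 6\sum d^2 / \{n(n^2-1)\}$.

```python
from itertools import combinations


def spearman(rank_a, rank_b):
    """Spearman's rank correlation for two rankings with no tied ranks."""
    n = len(rank_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))


def average_correlation(rankings):
    """Mean Spearman coefficient over all pairs of rankers."""
    pairs = list(combinations(rankings, 2))
    return sum(spearman(a, b) for a, b in pairs) / len(pairs)


n = 11                                    # objects ranked, as in the exhibition
forward = list(range(1, n + 1))
backward = [n + 1 - r for r in forward]   # exact reversal of `forward`
shuffled = [3, 1, 4, 2, 5, 7, 6, 9, 8, 11, 10]   # an arbitrary permutation
mirrored = [n + 1 - r for r in shuffled]  # exact reversal of `shuffled`

# m = 4 rankers: two rankings, each accompanied by its reverse.
rankings = [forward, backward, shuffled, mirrored]
m = len(rankings)

lowest = average_correlation(rankings)
highest = average_correlation([forward, forward, forward])

print(lowest, -1 / (m - 1))
print(highest)
```

The same construction works for any even m and any n; the odd-m, odd-n case needs a different arrangement and is left as the exercise the text describes.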

    Clearly such ranking experiments could easily be adapted to other situations and to other groups. Such an experiment allows consideration of ranking methods and levels of significance. Consideration of how to obtain an overall ranking of the objects can also be made.

    COWRIE SHELLS: DATA ANALYSIS AND HYPOTHESIS TESTING

    Tiree is famous for, amongst other things, its abundant wildlife and this could provide a fruitful source of data. The sandy beaches contain many shells, in particular those of the two species of cowrie found in the British Isles, the European Cowrie (Trivia monacha) and the Spotted Cowrie (Trivia arctica) (figure 4). The European Cowrie is unspotted but the Spotted Cowrie has three spots which are purplish-brown in colour. The European Cowrie is generally found to be smaller in size than its spotted relative (McMillan 1977).

    Over three days I collected and measured the lengths of 338 cowries (270 unspotted and 68 spotted) on 3 beaches, Hynish (1) and Balephuil (4) on the south-west of the island and Vaul (5) on the north-east. Table 2 summarizes the results obtained.

    The data can be used for exploratory data analysis (stem-and-leaf, box-and-whisker), histograms, Normal plots and chi-squared goodness-of-fit tests. (The Normal fit was accepted for both the spotted and unspotted data.) In one-tailed tests, the hypothesis of equal means for the two species was rejected at the 0.1% level or smaller for the overall data and for each individual beach. In two-tailed tests, the hypothesis of equal specimen size on the different beaches was rejected at the 5% level or smaller except for

    Fig 3. Chart to show significant Spearman correlation coefficients.

    Fig 4. Spotted Cowrie Trivia arctica from the top and from the bottom.


  • Hynish/Balephuil for both spotted and unspotted specimens.

    Independent two-sample tests were used throughout on a statistics package which also tested for equality of variances. Clearly analysis of variance would have been a preferable method with advanced students.
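The overall species comparison can be reproduced from the summary figures in Table 2 alone. The sketch below computes the two-sample t statistic in both its pooled-variance and unequal-variance (Welch) forms; the article does not say which form the package reported, so showing both is an assumption made for illustration, and the function name is hypothetical.

```python
import math


def two_sample_t(n1, mean1, sd1, n2, mean2, sd2, pooled=True):
    """t statistic for the difference in means from summary statistics.

    pooled=True uses the pooled-variance standard error;
    pooled=False uses the unequal-variance (Welch) standard error.
    """
    if pooled:
        pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
        std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    else:
        std_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / std_error


# Totals row of Table 2: spotted (n, mean, SD) against unspotted (n, mean, SD).
spotted = (68, 10.57, 1.43)
unspotted = (270, 8.64, 1.22)

t_pooled = two_sample_t(*spotted, *unspotted, pooled=True)
t_welch = two_sample_t(*spotted, *unspotted, pooled=False)

print(f"pooled t = {t_pooled:.2f}, Welch t = {t_welch:.2f}")
```

Either form gives a statistic far beyond the usual one-tailed 0.1% critical value for samples of this size, in line with the rejection reported above.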

    Statistical tests of the differences between the proportions of spotted/unspotted at the three locations were made. The chi-squared test on the table of numbers of the two species on the three beaches gave a highly significant value of 26.56 with 2 degrees of freedom, thus the hypothesis of equality of proportions can be rejected. Confidence intervals for the proportions on the three beaches could be obtained.
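A chi-squared test of this kind can be sketched directly from the sample sizes in Table 2. The code below builds the expected counts from the row and column totals and sums the usual (observed − expected)²/expected terms; the names are illustrative. The result should land close to the value quoted above, with small differences possible from rounding. For 2 degrees of freedom the upper-tail probability has the closed form exp(−χ²/2), which is used here in place of a statistics library.

```python
import math

# Counts of (unspotted, spotted) cowries on each beach, from Table 2.
observed = {
    "Hynish": (168, 42),
    "Balephuil": (14, 16),
    "Vaul": (88, 10),
}


def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for a two-way table."""
    rows = list(table.values())
    column_totals = [sum(column) for column in zip(*rows)]
    grand_total = sum(column_totals)
    statistic = 0.0
    for row in rows:
        row_total = sum(row)
        for count, column_total in zip(row, column_totals):
            expected = row_total * column_total / grand_total
            statistic += (count - expected) ** 2 / expected
    degrees_of_freedom = (len(rows) - 1) * (len(column_totals) - 1)
    return statistic, degrees_of_freedom


chi2, df = chi_squared(observed)
# Upper-tail probability; this closed form is valid only when df == 2.
p_value = math.exp(-chi2 / 2)

print(f"chi-squared = {chi2:.2f} on {df} degrees of freedom, p = {p_value:.2g}")
```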

    Tests on other types of shells could have been used instead. Possible experiments could include similar tests on the relative sizes of the two species of tortoiseshell limpet, Acmaea, or on the proportions of the different colours of the flat periwinkle, Littorina littoralis, or on the height/diameter ratio of the common limpet, Patella vulgata, which varies with habitat (Yonge 1966).

    Tiree is also famous for its flower-filled machair (low-lying sandy beach or boggy links affording some pasturage) and similar analyses on plant heights in varying locations would have been an alternative use of hypothesis testing.

    MEDICAL EMERGENCIES: THE POISSON DISTRIBUTION

    Medical emergencies on Tiree are dealt with by sending an air ambulance from Glasgow to the airport (6). An Tirisdeach reported that in the two years prior to our visit there were 33 ambulance calls each year. In the week before our visit there were 2 call-outs. Assuming that call-outs are random events with a mean rate of 33/52 per week, what is the probability distribution, $p_x$, of the number of call-outs per week and in particular the probability of two or more? Assuming a Poisson distribution with $m = 33/52 = 0.6346$, we have

    $$p_x = e^{-m} m^x / x!, \qquad x = 0, 1, 2, \ldots$$

    or more conveniently

    $$p_0 = e^{-m} \quad \text{and} \quad p_x = (m/x)\, p_{x-1} \quad \text{for } x = 1, 2, 3, \ldots$$

    This gives table 3, in which the values of $p_x$ are given correct to 3 decimal places.

    Thus in only 13.4% of weeks will there be 2 or more call-outs and in over half the weeks there will be no call-outs. Data on the actual numbers of call-outs over the two years could be obtained and used to assess the randomness, or otherwise, of the distribution. Possible sources of nonrandomness would include seasonal effects due to tourists in the summer and bad weather in the winter. For random data the gaps between call-outs would follow an exponential distribution and this could also be tested.
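The recurrence form of the Poisson probabilities is convenient to program, because each value is obtained from the previous one without computing factorials. A minimal sketch, with illustrative names, assuming the weekly mean of 33/52 used in the text:

```python
import math


def poisson_probabilities(mean, max_x):
    """Return [p_0, ..., p_max_x] using p_0 = exp(-m), p_x = (m / x) * p_(x-1)."""
    probabilities = [math.exp(-mean)]
    for x in range(1, max_x + 1):
        probabilities.append(probabilities[-1] * mean / x)
    return probabilities


m = 33 / 52                       # mean call-outs per week
p = poisson_probabilities(m, 5)

# Probability of two or more call-outs in a week.
p_two_or_more = 1 - p[0] - p[1]

for x, p_x in enumerate(p):
    print(x, f"{p_x:.3f}")
print(f"P(X >= 2) = {p_two_or_more:.4f}")
```

The printed values can be compared with table 3; the same function could supply expected frequencies for a goodness-of-fit test if the actual weekly counts were obtained.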

    Tiree has its own weather station and is a coastal report station which features on the UK shipping forecast. It is also known as the `sunshine island' because of its good sunshine records, principally in May and June. Weather records, which can be obtained from the Web site http://www.wunderground.com, are thus another possible area of investigation which could generate some interesting hypotheses.

                            Unspotted                    Spotted
    Day  Beach        Sample size   Mean    SD     Sample size   Mean    SD
    1    Hynish           168       8.90   1.19        42       10.55   1.31
    2    Balephuil         14       9.50   0.94        16       11.25   1.48
    3    Vaul              88       7.99   1.01        10        9.60   1.35
    Total                 270       8.64   1.22        68       10.57   1.43

    Table 2. Summary of data for cowries

    x       0       1       2       3       4       5
    p_x   0.530   0.336   0.107   0.023   0.004   0.000

    Table 3.


  • INTRODUCTION

    THIS article is the second of a pair containing examples where the graphics calculator can provide students with particularly valuable insights into some of the big ideas in statistics. The first article, entitled `Statistical Nuggets with a Graphics Calculator', appeared in the previous issue of Teaching Statistics (volume 21, number 3, pages 70–3).

    CONCLUSION

    Biology and geography field trips are a common feature of school and college life these days. I have tried to show that a statistics field trip could also be a possibility and could be used to illustrate many of the main features of statistics syllabuses.

    Realistically, I fear that I have probably not succeeded in justifying such an expedition. However, I hope that some of the ideas could be modified for use in a `statistics day' somewhat nearer home, with pupils of all secondary ages.

    The travel example could easily be adapted to use locally available bus, coach or train times and fares. I have carried out a similar experiment using coach journey times, costs and mileages to 20 UK cities from Birmingham, which worked well.

    The ranking experiment could be carried out using pictures, photographs, music, sweets or any other items of interest. Using penny sweets is a cheap option which has worked nicely with students in the past.

    Shells could be replaced by other creatures found locally such as land snails or flowers or grasses in different habitats. Finally, data from any of the emergency services could form the basis for the Poisson analysis. For those near the coasts in the UK, life-boat stations usually give records of their launchings and further inland data from fire or ambulance stations could be used.

    Acknowledgements

    The author would like to thank the editor and the referee for many helpful comments. She would also like to thank Malcomn Marshall and Israel Vieira for their help with the illustrations. Finally she acknowledges the patience of her extended family in allowing her to collect data while on holiday.

    References

    Caledonian MacBrayne (1996). Timetables and Fares.

    Kendall, M.G. (1943). The Advanced Theory of Statistics Vol I. London: Charles Griffin.

    McMillan, N.F. (1977). The Observer's Book of Seashells of the British Isles. London: Frederick Warne.

    Yonge, C.M. (1966). The Sea Shore. London: Collins.

    COMPUTING CORNER

    More Statistical Nuggets with a Graphics Calculator

    KEYWORDS: Teaching; Boxplot; Histogram; Graphs.

    Alan Graham
    The Open University, Milton Keynes, England.
    e-mail: a.t.graham@open.ac.uk

    Summary
    This article follows on from the one in the previous issue concerning exploration of the graphics calculator.
