Estimation and effects of tag-misread rates in capture–recapture studies

Carl James Schwarz and Wayne T. Stobo

Abstract: Virtually all capture–recapture studies assume that tags are read properly when an animal is captured. However, the animals in resighting studies are often not handled, tags are read at a distance, and misreads can occur. In this paper, we develop a simple model to account for misreads, assess the impact of misreads upon estimates of survival and catchability, and analyze a set of data from resightings of previously branded seals. In general, unless survival, capture, and misread rates are relatively high, most studies have poor power to detect misreads and estimates of survival are not severely biased. Estimates of capture rates will be biased downwards and will tend to estimate the product of the original capture rate and the complement of the misread rate.

Résumé: In almost all mark–recapture studies, it is assumed that tags are read correctly when an animal is recaptured. However, in mark–resighting studies, the tagged animals are generally not handled, tags are read from a distance, and readings may be erroneous. In this paper, we present a simple model that accounts for misreads, evaluate the impact of misreads on estimates of survival and catchability, and analyze a set of data on resightings of visually marked seals. In general, unless survival, capture, and misread rates are relatively high, misreads are poorly detected in most studies and estimates of survival are not seriously biased. Capture rates will be underestimated, and the estimates will tend toward the product of the original capture rate and the probability that tags are read correctly.

[Translated by the editorial staff]

Introduction

One of the assumptions made for capture–recapture studies is that tag numbers are read without error whenever an animal is captured. Surprisingly, the effect of tag misreads and how to estimate the misread rate have not been previously considered.

Errors in reading tag numbers are a particular problem in “resighting studies” where the animal is not physically handled. For example, the context of this paper is a mark–resight study of seals branded as pups and resighted as adults several years later, where the subsequent “recaptures” are sightings taken from a distance. The brand may be partially covered with sand, the animals may be moving when sighted, the weather conditions may impede a clear reading of the brand number, or the brand may have been poorly applied, resulting in portions of the brand being indistinct. Other contexts in which misreads may be a problem are where tag numbers deteriorate over time (e.g., ring colors fading over time or rings cracking and pieces breaking off), where untrained personnel (e.g., volunteers in a bird survey) are used to spot and record tag numbers, or where animals are “self-marking” (e.g., whales identified through scars and other markings).

In this paper, we develop a very simple model for misread errors, investigate the general effect of misreads upon the estimates when the usual models ignoring the problem of misreads are applied, and then apply our methodology to a resighting study of previously branded seal pups.

Methods

Statistical notation

Assume that there are $K$ sample times and, for simplicity, that there are neither losses on capture nor injections at any sample time. We consider a single cohort released at time 1 of a known size, in which each animal is tagged with a unique tag number. This is the usual setup for a Cormack–Jolly–Seber (CJS) study (Cormack 1964; Jolly 1965; Seber 1965).

A total of $R_1 = N_1$ animals are originally captured, tagged, and released with individual tag numbers $1, \ldots, N_1$. In actual practice, tag numbers may be codes of letters, numbers, symbols, colors, etc., but for this paper, we will assume that they are sequentially numbered. When misreads are possible, there is no longer a one-to-one correspondence between observing a particular tag number and observing the particular animal originally tagged with that tag number. We will use the terms “observing a tag number” and “capturing an animal”, respectively, to distinguish between the two events.

Can. J. Fish. Aquat. Sci. 56: 551–559 (1999) © 1999 NRC Canada

Received June 9, 1998. Accepted November 13, 1998. J14645

C.J. Schwarz.¹ Department of Statistics and Mathematics, Simon Fraser University, Burnaby, BC V5A 1S6, Canada.
W.T. Stobo. Department of Fisheries and Oceans, Bedford Institute of Oceanography, P.O. Box 1006, Dartmouth, NS B2Y 4A2, Canada.

¹Author to whom all correspondence should be addressed. e-mail: [email protected]

Parameters

$\phi_i$: probability that an animal alive at time $i$ will be alive and in the population at time $i + 1$; permanent emigration is indistinguishable from and confounded with death

$p_i$: probability that an animal alive at time $i$ will be captured; this differs from the usual definition in capture–recapture studies in that we do not assume that the tag number will be read correctly

$\theta_i$: probability that an animal captured at time $i$ will have its tag correctly read

$\nu_i$: probability that if a tag is incorrectly read at time $i$, then it is observed as a valid tag number

$p_i^{\text{misread}}$: probability that the tag number for a particular animal will be observed at time $i$ through a misread of some other captured animal

$p_i^{\text{alive}}$: probability that the tag number for a particular animal will be observed if the animal is alive at time $i$; this could occur through an animal being captured and its tag being read properly or from a misread from another animal captured

$p_i^{\text{dead}}$: probability that the tag number for a particular animal will be observed if the animal is dead at time $i$; this can only occur through a misread of a tag number for another animal captured at the same time

$\lambda_i^{\text{alive}}$: probability that a tag number corresponding to a particular animal alive at time $i$ will be observed after time $i$

$\lambda_i^{\text{dead}}$: probability that a tag number corresponding to a particular animal dead at time $i$ will be observed after time $i$

$N_i$: number of animals (out of those released) that are alive at time $i$; $N_1$ is known; $E[N_i] = N_1 \phi_1 \phi_2 \cdots \phi_{i-1}$

Statistics

$\mathbf{h} = (h_1, h_2, \ldots, h_K)$: generic encounter history vector for a particular tag number, where $h_i = 1$ if the tag number is observed at time $i$ and $h_i = 0$ if the tag number is not observed at time $i$; because all animals were released alive at time 1, $h_1 = 1$ for all animals

$n_{\mathbf{h}}$: number of tag numbers with capture history $\mathbf{h}$; this is distinguished from $n_i$ (below) by the context in which it is used

$R_i = n_i = m_i$: number of tag numbers observed at time $i$; in the CJS model, all recaptures are of previously marked animals, and so $m_i = n_i$; if there are no losses on capture, the number of tag numbers released is equal to the number of tag numbers observed (in our example, recaptures are resightings and so there are no losses)

$r_i$: number of tag numbers recovered (in the sense that the tag number is again observed) out of $R_i$ after time $i$

$z_i$: number of tag numbers observed before time $i$, not observed at time $i$, and observed after time $i$

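Matrices

$\mathbf{f}_i$: $2 \times 2$ matrix with elements $\begin{bmatrix} \phi_i & 1 - \phi_i \\ 0 & 1 \end{bmatrix}$

$\mathbf{p}_i$: $2 \times 1$ matrix with elements $\begin{bmatrix} p_i^{\text{alive}} \\ p_i^{\text{dead}} \end{bmatrix}$

$D(\cdot)$: operator that makes a matrix with zeroes on the off-diagonals and its vector argument along the main diagonal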

Statistical model and its assumptions

We will assume that a standard CJS study (Cormack 1964; Jolly 1965; Seber 1965) is being conducted on a single cohort of animals and will make the usual assumptions.

At the end of the study, each tag number will have a capture history $\mathbf{h} = (h_1, h_2, \ldots, h_K)$. For example, if $K = 4$, the history vector $\mathbf{h} = (1, 0, 1, 1)$ corresponds to this tag number being observed at time 1 (its release) and times 3 and 4. The actual capture history of the corresponding animal may be different because of misreads. For example, the event $h_3 = 1$ can occur either because the corresponding animal was alive, captured at time 3, and its tag read properly or because some other animal was captured, its tag misread, and the misread generated this tag number.

Note that in some cases, misreads will generate tag numbers that are not possible, e.g., a number never applied. If these are recorded, this information will give a lower bound on the tag misread rate.

The major difficulty in computing the probability of a particular history vector (which is needed in order to construct the likelihood function) is the modelling of the misread process.

We will assume that if a misread occurs, then any one of the $N_1 = R_1$ initial tag numbers is equally likely to be generated. In real life, however, the chance that a tag number is misread and the value of the subsequent misread are not equal for all tags and depend upon many factors such as the way the tag is applied to animals, the conditions under which it is read, the particular set of tag numbers used, etc. For example, in our seal study, the brand numbers are observed at a distance of 2–4 m and individual digits may be partially obscured by dirt or shade or be indistinct due to poor brand application. Under these circumstances, the digit “7” is more likely to be misread as a “1” than as an “8”. Nevertheless, our simple model appears to work well, as illustrated in our example later in the paper.

Under this assumption, we can derive a simple expression for the probability that a tag number will be observed through a misread. At sample time $i$, there are $N_i$ animals alive. Each of these has a probability of $p_i$ of being captured, a probability of $(1 - \theta_i)$ that a misread will occur, a probability of $\nu_i$ that the misread will generate a legitimate tag number, and a probability of $1/N_1$ that a particular tag number will be generated through the misread. Consequently, there is a probability of $1 - p_i(1 - \theta_i)\nu_i \frac{1}{N_1}$ that a particular tag number will not be generated by a misread of another animal’s tag. However, this nonevent must occur for every one of the $N_i$ animals alive, and so the probability that a particular tag number will be observed at time $i$ because of a misread is

$$
\begin{aligned}
p_i^{\text{misread}} &= 1 - \left(1 - p_i(1 - \theta_i)\nu_i \frac{1}{N_1}\right)^{N_i} \\
&\approx p_i(1 - \theta_i)\nu_i \frac{N_i}{N_1} \\
&\approx \phi_1 \phi_2 \cdots \phi_{i-1}\, p_i(1 - \theta_i)\nu_i
\end{aligned}
$$

which does not depend upon the number of animals originally tagged or upon the number of animals currently alive at time $i$ (except indirectly through the survival rates).
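As a quick numerical check of how close the two approximations are to the exact expression, the following Python sketch evaluates all three forms. The parameter values ($N_1 = 500$, constant $\phi = 0.9$, $p_i = 0.5$, $\theta_i = 0.9$, $\nu_i = 0.5$, $i = 5$) are illustrative assumptions rather than values from the study, and $N_i$ is replaced by its expectation.

```python
from math import prod

def p_misread_exact(N_i, N1, p_i, theta_i, nu_i):
    """Exact form: 1 - (1 - p_i (1 - theta_i) nu_i / N1) ** N_i."""
    per_animal = p_i * (1.0 - theta_i) * nu_i / N1
    return 1.0 - (1.0 - per_animal) ** N_i

def p_misread_linear(N_i, N1, p_i, theta_i, nu_i):
    """First approximation: p_i (1 - theta_i) nu_i N_i / N1."""
    return p_i * (1.0 - theta_i) * nu_i * N_i / N1

def p_misread_survival(phis, p_i, theta_i, nu_i):
    """Second approximation: phi_1 ... phi_{i-1} p_i (1 - theta_i) nu_i."""
    return prod(phis) * p_i * (1.0 - theta_i) * nu_i

# Hypothetical values, chosen for illustration only.
N1, i = 500, 5
phis = [0.9] * (i - 1)            # phi_1, ..., phi_{i-1}
p_i, theta_i, nu_i = 0.5, 0.9, 0.5
N_i = N1 * prod(phis)             # expected number alive at time i

exact = p_misread_exact(N_i, N1, p_i, theta_i, nu_i)
linear = p_misread_linear(N_i, N1, p_i, theta_i, nu_i)
survival = p_misread_survival(phis, p_i, theta_i, nu_i)
print(f"exact={exact:.6f}  linear={linear:.6f}  survival-only={survival:.6f}")
```

With these inputs the exact value lies slightly below the approximations, as expected from the inequality $(1 - x)^n \ge 1 - nx$.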

The $p_i^{\text{misread}}$ can be used to derive the conditional probabilities of observing an animal’s tag number at time $i$ if it is dead, $p_i^{\text{dead}}$, or if it is alive, $p_i^{\text{alive}}$. A dead animal’s tag number can only be observed through a misread; hence, $p_i^{\text{dead}} = p_i^{\text{misread}}$. On the other hand, an alive animal’s tag number can be observed either because it is captured and the tag is read properly or because a misread of another animal occurs. Because these two possibilities occur independently:

$$
\begin{aligned}
p_i^{\text{alive}} &= p_i\theta_i + p_i^{\text{misread}} - p_i\theta_i\, p_i^{\text{misread}} \\
&= 1 - [1 - p_i\theta_i][1 - p_i^{\text{misread}}].
\end{aligned}
$$

Finally, the probability of the entire history vector is found by examining all the possible (unknown) survival and capture processes that could lead to the observed capture history vector. For example, the history vector $\mathbf{h} = (1, 0, 1, 1)$ could have occurred by the animal surviving until time 4 with the observed $h_i$ depending only upon the $p_i^{\text{alive}}$, or the animal may have died at sample time 3, and the final observation is a misread from another animal, etc. In this case, there are four possible survival and capture processes that could lead to the observed history, and the overall probability of the history is found as

$$
\begin{aligned}
P(\mathbf{h} = (1, 0, 1, 1)) ={}& \phi_1 (1 - p_2^{\text{alive}})\, \phi_2\, p_3^{\text{alive}}\, \phi_3\, p_4^{\text{alive}} \\
&+ \phi_1 (1 - p_2^{\text{alive}})\, \phi_2\, p_3^{\text{alive}} (1 - \phi_3)\, p_4^{\text{dead}} \\
&+ \phi_1 (1 - p_2^{\text{alive}}) (1 - \phi_2)\, p_3^{\text{dead}}\, p_4^{\text{dead}} \\
&+ (1 - \phi_1)(1 - p_2^{\text{dead}})\, p_3^{\text{dead}}\, p_4^{\text{dead}}.
\end{aligned}
$$
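The enumeration over survival processes can be sketched in Python as follows. Each term of the sum corresponds to the last sample time at which the animal is still alive. The function name and the parameter values are hypothetical and serve only to illustrate the bookkeeping.

```python
def history_prob_by_enumeration(h, phi, p_alive, p_dead):
    """P(h) by summing over the last sample time at which the animal is alive.

    h is a tuple (h_1, ..., h_K) with h_1 = 1; phi, p_alive and p_dead are
    dicts keyed by the 1-based time index used in the text.
    """
    K = len(h)
    total = 0.0
    for last_alive in range(1, K + 1):
        prob = 1.0
        # Survival part: survive each interval before last_alive, then die
        # (no factor is needed once the animal is dead).
        for i in range(1, K):
            if i < last_alive:
                prob *= phi[i]
            elif i == last_alive:
                prob *= 1.0 - phi[i]
        # Observation part for times 2..K, using p_alive or p_dead as appropriate.
        for i in range(2, K + 1):
            q = p_alive[i] if i <= last_alive else p_dead[i]
            prob *= q if h[i - 1] == 1 else 1.0 - q
        total += prob
    return total

# Hypothetical parameter values for K = 4.
phi = {1: 0.8, 2: 0.7, 3: 0.9}
p_alive = {2: 0.30, 3: 0.35, 4: 0.40}
p_dead = {2: 0.02, 3: 0.015, 4: 0.01}

prob_1011 = history_prob_by_enumeration((1, 0, 1, 1), phi, p_alive, p_dead)
print(prob_1011)
```

For $K = 4$ the loop over `last_alive` produces exactly the four terms written out above.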

It is the potential for misreads to generate observations of a particular tag number when the animal has died that makes the construction of the likelihood for a particular capture history vector extremely complex. However, it can be greatly simplified by recasting the problem as a multistratum framework, as suggested by Lebreton et al. (1998). In this problem, there are two strata (alive or dead) in which an animal can be present at time $i$ with a transition matrix $\mathbf{f}_i$ for the probability of changing strata between time $i$ and time $i + 1$. Using a similar argument as used in Brownie et al. (1993) and Schwarz et al. (1993), the probability of history $\mathbf{h}$ for a particular tag number can be found as

$$
P(\mathbf{h}) = \begin{bmatrix} 1 \\ 0 \end{bmatrix}^{t}
\times \mathbf{f}_1 \times
\begin{Bmatrix} D(\mathbf{p}_2) & \text{if } h_2 = 1 \\ I - D(\mathbf{p}_2) & \text{if } h_2 = 0 \end{Bmatrix}
\times \mathbf{f}_2 \times
\begin{Bmatrix} D(\mathbf{p}_3) & \text{if } h_3 = 1 \\ I - D(\mathbf{p}_3) & \text{if } h_3 = 0 \end{Bmatrix}
\times \cdots \times \mathbf{f}_{K-1} \times
\begin{Bmatrix} D(\mathbf{p}_K) & \text{if } h_K = 1 \\ I - D(\mathbf{p}_K) & \text{if } h_K = 0 \end{Bmatrix}
\times \begin{bmatrix} 1 \\ 1 \end{bmatrix}.
$$

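For example:

$$
\begin{aligned}
P(\mathbf{h} = (1, 0, 1, 1)) ={}& \begin{bmatrix} 1 \\ 0 \end{bmatrix}^{t}
\times \begin{bmatrix} \phi_1 & 1 - \phi_1 \\ 0 & 1 \end{bmatrix}
\times \begin{bmatrix} 1 - p_2^{\text{alive}} & 0 \\ 0 & 1 - p_2^{\text{dead}} \end{bmatrix} \\
&\times \begin{bmatrix} \phi_2 & 1 - \phi_2 \\ 0 & 1 \end{bmatrix}
\times \begin{bmatrix} p_3^{\text{alive}} & 0 \\ 0 & p_3^{\text{dead}} \end{bmatrix} \\
&\times \begin{bmatrix} \phi_3 & 1 - \phi_3 \\ 0 & 1 \end{bmatrix}
\times \begin{bmatrix} p_4^{\text{alive}} & 0 \\ 0 & p_4^{\text{dead}} \end{bmatrix}
\times \begin{bmatrix} 1 \\ 1 \end{bmatrix}
\end{aligned}
$$

which gives the result earlier. The final matrix $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ is needed because the two strata cannot be directly observed and an observed tag number could have arisen from either stratum. The initial $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ vector is needed because, presumably, all animals were originally released in the alive state. However, the initial matrix could be modified if there were a known acute mortality due to marking effects, or it could be made into a parameter to estimate this acute effect.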
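The matrix product can be evaluated without a linear-algebra library by carrying the two-element row vector (alive, dead) through the chain. The Python sketch below does this and checks two properties: it reproduces the four-term sum for $\mathbf{h} = (1, 0, 1, 1)$, and the probabilities of all histories with $h_1 = 1$ sum to 1. The function name and the parameter values are hypothetical.

```python
from itertools import product

def history_prob(h, phi, p_alive, p_dead):
    """P(h) = [1 0] f_1 D_2 f_2 D_3 ... f_{K-1} D_K [1 1]^t.

    D_i is D(p_i) when h_i = 1 and I - D(p_i) when h_i = 0. The dicts phi,
    p_alive and p_dead are keyed by the 1-based time index used in the text.
    """
    alive, dead = 1.0, 0.0                      # initial row vector [1, 0]
    for i in range(1, len(h)):                  # interval from time i to i + 1
        # Right-multiply by f_i = [[phi_i, 1 - phi_i], [0, 1]].
        alive, dead = alive * phi[i], alive * (1.0 - phi[i]) + dead
        # Right-multiply by D(p_{i+1}) or I - D(p_{i+1}); h[i] holds h_{i+1}.
        if h[i] == 1:
            alive, dead = alive * p_alive[i + 1], dead * p_dead[i + 1]
        else:
            alive, dead = alive * (1.0 - p_alive[i + 1]), dead * (1.0 - p_dead[i + 1])
    return alive + dead                         # final product with [1, 1]^t

# Hypothetical parameter values for K = 4.
phi = {1: 0.8, 2: 0.7, 3: 0.9}
p_alive = {2: 0.30, 3: 0.35, 4: 0.40}
p_dead = {2: 0.02, 3: 0.015, 4: 0.01}

prob_1011 = history_prob((1, 0, 1, 1), phi, p_alive, p_dead)
total = sum(
    history_prob((1,) + tail, phi, p_alive, p_dead)
    for tail in product((0, 1), repeat=3)
)
print(prob_1011, total)
```

Carrying only two numbers per step keeps the cost linear in $K$, whereas direct enumeration of survival processes grows with the number of possible death times.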

The $R_1$ animals released can be partitioned into mutually exclusive classes based upon their subsequent capture histories. Consequently, assuming that animals are independent of each other, the counts of the number of animals in each capture history class $\{n_{\mathbf{h}}\}$ can be modelled as a multinomial distribution with the probability of observing the history $\mathbf{h}$ as given above. The likelihood function is then the multinomial probability function

$$
L(\{n_{\mathbf{h}}\} \mid \{\phi_i\}, \{p_i\}, \{\theta_i\}, \{\nu_i\}) = \binom{R_1}{\{n_{\mathbf{h}}\}} \prod_{\mathbf{h}} P(\mathbf{h})^{n_{\mathbf{h}}}.
$$
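A minimal Python sketch of the multinomial log-likelihood is given below. It takes the history probabilities as a function argument, so any way of computing $P(\mathbf{h})$ can be plugged in. The toy two-occasion model and the counts are hypothetical and are used only to show the call.

```python
from math import lgamma, log

def multinomial_log_likelihood(counts, history_prob):
    """log L = log R_1! - sum_h log n_h! + sum_h n_h log P(h).

    counts maps each history (a tuple) to its count n_h and must include the
    tag numbers never observed again, i.e. the history (1, 0, ..., 0).
    history_prob is any function returning P(h) for a history tuple.
    """
    R1 = sum(counts.values())
    log_lik = lgamma(R1 + 1)                     # log R_1!
    for h, n_h in counts.items():
        log_lik += n_h * log(history_prob(h)) - lgamma(n_h + 1)
    return log_lik

# Hypothetical two-occasion example without misreads: P((1, 1)) = phi_1 * p_2.
phi1, p2 = 0.6, 0.5

def toy_history_prob(h):
    seen = phi1 * p2
    return seen if h == (1, 1) else 1.0 - seen

counts = {(1, 1): 3, (1, 0): 7}
log_lik = multinomial_log_likelihood(counts, toy_history_prob)
print(log_lik)
```

Histories with a zero count contribute nothing and can be left out of `counts`. The multinomial coefficient does not depend on the parameters, so it could be dropped when only maximizing.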

An additional component could be appended to the likelihood if records are kept of tag numbers reported that are obviously not possible. However, this information is rarely available and this additional component will be ignored.

Because of the complex form of the $P(\mathbf{h})$ terms, there does not appear to be any factorization of the likelihood possible, and it appears that the minimal set of sufficient statistics is the observed capture histories and their corresponding counts.

Inspection of the $p_i^{\text{alive}}$ and $p_i^{\text{dead}}$ shows that not all parameters are identifiable. In particular, only the products $p_i\theta_i$ (the probability that an animal is captured and its tag read properly) and $p_i(1 - \theta_i)\nu_i$ (the probability that an animal is captured, its tag misread, and a valid tag number generated) can be estimated. There is also the usual confounding at the end of the study between $\phi_{K-1}$ and $p_K$.
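The identifiability statement can be illustrated numerically. The Python sketch below uses the large-sample form of $p_i^{\text{misread}}$ and shows that two different triples $(p_i, \theta_i, \nu_i)$ sharing the same products $p_i\theta_i$ and $p_i(1 - \theta_i)\nu_i$ give identical observation probabilities. Both parameter sets and the cumulative survival value are hypothetical.

```python
def observation_probs(surv_to_i, p_i, theta_i, nu_i):
    """Return (p_alive, p_dead) at time i.

    surv_to_i is phi_1 ... phi_{i-1}; p_misread uses the large-sample
    approximation phi_1 ... phi_{i-1} p_i (1 - theta_i) nu_i.
    """
    p_misread = surv_to_i * p_i * (1.0 - theta_i) * nu_i
    p_alive = 1.0 - (1.0 - p_i * theta_i) * (1.0 - p_misread)
    return p_alive, p_misread          # p_dead equals p_misread

# Two hypothetical parameter sets with p*theta = 0.45 and p*(1-theta)*nu = 0.025.
set_a = observation_probs(0.6, p_i=0.5, theta_i=0.9, nu_i=0.5)
set_b = observation_probs(0.6, p_i=0.6, theta_i=0.75, nu_i=1.0 / 6.0)
print(set_a)
print(set_b)
```

Because the history probabilities depend on the capture and misread parameters only through $p_i^{\text{alive}}$ and $p_i^{\text{dead}}$, the data cannot separate the two sets.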

Standard numerical methods can be used to maximize the likelihood function. For example, we used the conjugate gradient optimization method provided in PROC IML of SAS (SAS Institute, Inc. 1995). As is the case with most multistrata models, the likelihood is very flat, and care must be taken not to select a local maximum. As well, in many cases, the misread rate is likely small, and a fully specified model with separate misread rates for each sample time may be too general to be useful. A simpler model that assumes that $\theta_i$ and $\nu_i$ are constant over time, i.e., $\theta_i = \theta$ and $\nu_i = \nu$ for all $i$, would be more useful as a starting point.

The usual CJS model with no misreads corresponds to the model with $\theta_i = 1$ for all $i$ (with $\nu_i$ unspecified). A likelihood ratio test could be constructed to test if the misread rate is 0, but care must be taken in this nonstandard situation because the null hypothesis is on the boundary of the parameter space and is not two-dimensional (Davies 1977; Self and Liang 1987). In this situation, the easiest way to find the significance level of the likelihood ratio test is based upon a parametric bootstrap sample using the estimates from a fit to the standard CJS model (Buckland and Garthwaite 1991; Manly 1997).

Standard errors for the parameter estimates from the misread model can be found using the usual inverse of the information matrix returned by the numerical optimization process. For example, we used a central finite difference method to estimate the information matrix after convergence was obtained. As well, standard errors and confidence intervals for all parameters, but particularly those whose estimates fall on the boundary of the parameter space (e.g., estimates of survival equal to 1), can be found using bootstrap methods on the observed capture histories, as outlined in Buckland and Garthwaite (1991).

Because the likelihood cannot be factored into a smaller set of sufficient statistics, there are no simple goodness-of-fit statistics that “fall out of the likelihood”. However, one possible goodness-of-fit test would be to compare the observed and expected counts of tag numbers classified by the number of times the tag number is observed.
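One way to obtain the expected counts for such a comparison is to sum $R_1 P(\mathbf{h})$ over all histories with the same number of observations after release. The Python sketch below does this by brute-force enumeration of the $2^{K-1}$ histories. It is an illustration under assumed, hypothetical parameter values, not the computation used for the seal data.

```python
from itertools import product

def history_prob(h, phi, p_alive, p_dead):
    """P(h) via the two-stratum (alive, dead) recursion; dicts are 1-based."""
    alive, dead = 1.0, 0.0
    for i in range(1, len(h)):
        alive, dead = alive * phi[i], alive * (1.0 - phi[i]) + dead
        if h[i] == 1:                  # h[i] holds h_{i+1}
            alive, dead = alive * p_alive[i + 1], dead * p_dead[i + 1]
        else:
            alive, dead = alive * (1.0 - p_alive[i + 1]), dead * (1.0 - p_dead[i + 1])
    return alive + dead

def expected_counts_by_times_seen(R1, K, phi, p_alive, p_dead):
    """expected[k] = expected number of tag numbers observed k times after release."""
    expected = [0.0] * K
    for tail in product((0, 1), repeat=K - 1):
        expected[sum(tail)] += R1 * history_prob((1,) + tail, phi, p_alive, p_dead)
    return expected

# Hypothetical values, constant over time.
K, R1 = 5, 200
phi = {i: 0.8 for i in range(1, K)}
p_alive = {i: 0.4 for i in range(2, K + 1)}
p_dead = {i: 0.01 for i in range(2, K + 1)}

expected = expected_counts_by_times_seen(R1, K, phi, p_alive, p_dead)
for k, count in enumerate(expected):
    print(f"observed {k} time(s) after release: {count:.2f}")
```

For a study with 15 sample times this enumerates 16 384 histories, which is still cheap.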

Results

Effect of misreads

Case when $\nu_i = 0$

If misreads never generate other valid tag numbers, the likelihood simplifies to that of the standard CJS model except that only the products $p_i\theta_i$ are estimable. It is fairly simple to show that in large samples, $E[\hat{p}_i^{\text{CJS}} \mid \nu_i = 0] = p_i\theta_i$ and that $E[\hat{\phi}_i^{\text{CJS}} \mid \nu_i = 0] = \phi_i$, i.e., only estimates of the recapture rates are biased downwards. This is not surprising: misreads that never lead to valid tag numbers are indistinguishable from failing to capture the animal.

Case when $\nu_i > 0$

We investigated the likely effects of misreads in these situations by computing the expected CJS summary statistics and then treating the expected values as “data” in the usual estimation formula, as outlined in Lebreton et al. (1992, p. 83).

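The probability of observing a tag number corresponding to live and dead animals was derived earlier. Consequently, assuming no losses on capture and no injections at any sample time, the expected number of animals captured and released at time $i$ is found as

$$
\begin{aligned}
n_1 &= R_1 = N_1 \\
n_i &= m_i = R_i = R_i^{\text{alive}} + R_i^{\text{dead}} \\
&= N_i\, p_i^{\text{alive}} + (N_1 - N_i)\, p_i^{\text{dead}} \quad \text{for } i = 2, \ldots, K.
\end{aligned}
$$

Analogous to similar quantities defined in Lebreton et al. (1992), we compute

$$
\begin{aligned}
\lambda_K^{\text{alive}} &= 0 \\
\lambda_i^{\text{alive}} &= \phi_i\, p_{i+1}^{\text{alive}} + \phi_i (1 - p_{i+1}^{\text{alive}})\, \lambda_{i+1}^{\text{alive}} + (1 - \phi_i)\, p_{i+1}^{\text{dead}} \\
&\quad + (1 - \phi_i)(1 - p_{i+1}^{\text{dead}})\, \lambda_{i+1}^{\text{dead}} \quad \text{for } i = 1, \ldots, K - 1 \\
\lambda_K^{\text{dead}} &= 0 \\
\lambda_i^{\text{dead}} &= 1 - (1 - p_{i+1}^{\text{dead}})(1 - \lambda_{i+1}^{\text{dead}}) \quad \text{for } i = 1, \ldots, K - 1.
\end{aligned}
$$

Then

$$
r_i = R_i^{\text{alive}} \lambda_i^{\text{alive}} + R_i^{\text{dead}} \lambda_i^{\text{dead}}.
$$

Finally, using the relationships

$$
\begin{aligned}
z_1 &= 0 \\
z_{i+1} &= z_i + r_i - m_{i+1} \quad \text{for } i = 1, \ldots, K - 1
\end{aligned}
$$

allows us to compute the expected values for all sufficient statistics for the CJS estimates.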
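The recursions above translate directly into code. The Python sketch below is an independent illustration, not the authors' program. It assumes parameters constant over time, replaces $N_i$ by its expectation, and takes the "usual estimation formula" to be the standard closed-form CJS estimators $\hat{M}_i = m_i + R_i z_i / r_i$, $\hat{\phi}_i = \hat{M}_{i+1} / (\hat{M}_i - m_i + R_i)$, and $\hat{p}_i = m_i / \hat{M}_i$, with $m_1 = 0$. The function name and input values are hypothetical.

```python
def expected_cjs_estimates(K, N1, phi, p, theta, nu):
    """Expected CJS statistics under misreads, plugged into the CJS estimators.

    Parameters are constant over time; lists are 1-based (index 0 unused).
    Returns (phi_hat, p_hat) as dicts keyed by the time index.
    """
    N = [0.0] + [N1 * phi ** (i - 1) for i in range(1, K + 1)]     # E[N_i]
    p_alive = [0.0] * (K + 1)
    p_dead = [0.0] * (K + 1)
    for i in range(2, K + 1):
        p_mis = 1.0 - (1.0 - p * (1.0 - theta) * nu / N1) ** N[i]
        p_dead[i] = p_mis
        p_alive[i] = 1.0 - (1.0 - p * theta) * (1.0 - p_mis)

    # Tag numbers observed at time i, split by whether the animal is alive.
    R_alive = [0.0, float(N1)] + [N[i] * p_alive[i] for i in range(2, K + 1)]
    R_dead = [0.0, 0.0] + [(N1 - N[i]) * p_dead[i] for i in range(2, K + 1)]
    R = [a + d for a, d in zip(R_alive, R_dead)]
    m = [0.0, 0.0] + R[2:]            # m_1 = 0; m_i = R_i for i >= 2

    # Backward recursions for lambda (both are 0 at time K).
    lam_alive = [0.0] * (K + 1)
    lam_dead = [0.0] * (K + 1)
    for i in range(K - 1, 0, -1):
        lam_dead[i] = 1.0 - (1.0 - p_dead[i + 1]) * (1.0 - lam_dead[i + 1])
        lam_alive[i] = (
            phi * p_alive[i + 1]
            + phi * (1.0 - p_alive[i + 1]) * lam_alive[i + 1]
            + (1.0 - phi) * p_dead[i + 1]
            + (1.0 - phi) * (1.0 - p_dead[i + 1]) * lam_dead[i + 1]
        )

    r = [R_alive[i] * lam_alive[i] + R_dead[i] * lam_dead[i] for i in range(K + 1)]
    z = [0.0] * (K + 1)
    for i in range(1, K):
        z[i + 1] = z[i] + r[i] - m[i + 1]

    # Closed-form CJS estimators applied to the expected statistics.
    M = [0.0] * (K + 1)
    for i in range(2, K):
        M[i] = m[i] + R[i] * z[i] / r[i]
    phi_hat = {i: M[i + 1] / (M[i] - m[i] + R[i]) for i in range(1, K - 1)}
    p_hat = {i: m[i] / M[i] for i in range(2, K)}
    return phi_hat, p_hat

# Hypothetical scenario: K = 10, phi = 0.6, p = 0.3, theta = 0.8, nu = 0.5.
phi_hat, p_hat = expected_cjs_estimates(10, 500, 0.6, 0.3, 0.8, 0.5)
for i in sorted(phi_hat):
    print(f"i={i}: E[phi_hat] ~ {phi_hat[i]:.3f}")
```

Setting `theta = 1.0` gives a useful sanity check, since the expected estimates should then equal the true $\phi$ and $p$.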

There do not appear to be any simple closed-form expressions for the expected values of the ordinary CJS estimators as a function of $\theta$ or $\nu$. A simple program to perform the above computations is available from the first author. The results are asymptotic, but simulations using finite samples confirmed the results except for the usual small sample biases introduced into the estimates.

We investigated all possible combinations for $K = 10$ of $\phi = \{0.3, 0.6, 0.9\}$, $p = \{0.1, 0.3, 0.5\}$, $\theta = \{0.80, 0.90, 0.95, 0.99\}$, and $\nu = 0.5$ where all parameters are held constant over time. These represent a range of scenarios likely encountered in practice.

The results for the estimates of survival are shown in Fig. 1. The estimates of survival can be severely positively biased early on in the study, particularly if the misread rate is high. This is caused by the fact that a misread anytime later in the study that generates a tag number corresponding to an already dead animal “implies” that the animal was still alive early in the study. By midstudy, the bias dies out because by then, estimates of survival are based upon animals “known to be alive” from this point onwards, and there are fewer chances that the tag misreads will select animals from this subset. Bias decreases in the early part of the study as the survival rates increase because longer lived animals have a greater chance of legitimately being observed. But this has a price, as the bias extends further into the study because the animals are captured more often and each recapture may lead to a misread. The bias appears to be relatively insensitive to values of $p$ because the range of expected values (shown by the small vertical lines) is very small over the values of $p$ examined.

The effects of misreads on the estimates of the capture rate are summarized in Fig. 2. Bias becomes more pronounced as capture rates increase: presumably, more captures lead to greater chances of misreads generating other “valid” tag numbers. The mean of the CJS estimate is close to $p_i\theta_i$, the net probability of seeing an animal and properly reading its tag number, but is usually below this quantity because the misreads generate other valid tag numbers that inflate the apparent population at risk of being captured. Because estimates of abundance are, for the most part, based upon estimates of recapture, this indicates that abundance estimators are positively biased throughout a study.

Pradel et al. (1997) noticed that transients tend to cause the Robson (1969) goodness-of-fit test to reject the CJS model. A similar effect is expected to occur in the presence of misreads. In the Robson (1969) goodness-of-fit test, a series of contingency tables for each sample time $i$ ($i = 2, \ldots, K - 1$) is constructed of the form:

| | Tag numbers subsequently observed after time $i$ | Tag numbers never observed again after time $i$ |
|---|---|---|
| Tag numbers first observed at time $i$ | | |
| Tag numbers first observed before time $i$ | | |

The row corresponding to tag numbers first observed at time $i$ will tend to include misreads that have inadvertently generated a valid tag number from a dead animal, while the row corresponding to tag numbers first observed before time $i$ will tend to include mostly live animals. The probability of subsequent observations of each subset will be different weighted averages of $\lambda_i^{\text{dead}}$ and $\lambda_i^{\text{alive}}$, and the $\chi^2$ test for homogeneity of proportions will reject the hypothesis of equal subsequent observation rates. Early in the study, the difference in the subsequent observations will be small because many of the first-observed animals will, in fact, be valid reads, but the sample sizes will tend to also be large. Later in the study, the first-observed group will be predominantly misreads, the difference in the subsequent observation rates will be larger, but sample sizes will be smaller because many animals will not have survived.
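The homogeneity test on one such table can be sketched in Python using only the standard library. The Pearson statistic is computed without a continuity correction, and the p-value uses the identity $P(\chi^2_1 > x) = \operatorname{erfc}(\sqrt{x/2})$. The counts in the example table are hypothetical.

```python
from math import erfc, sqrt

def chi_square_2x2(table):
    """Pearson chi-square statistic (no continuity correction) and 1-df p-value.

    table = [[a, b], [c, d]]; rows are "first observed at time i" and
    "first observed before time i", columns are "observed again" and
    "never observed again".
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    p_value = erfc(sqrt(stat / 2.0))   # upper tail of chi-square with 1 df
    return stat, p_value

# Hypothetical counts for a single sample time.
table = [[10, 40], [30, 20]]
stat, p_value = chi_square_2x2(table)
print(f"chi-square = {stat:.3f}, p = {p_value:.2e}")
```

In this made-up table, tag numbers first observed at time $i$ are resighted far less often than those first observed earlier, which is the pattern misreads are expected to produce.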

The power to detect tag misreads was investigated via computer simulation by generating the expected statistics for the $2 \times 2$ contingency tables above used for goodness-of-fit to the CJS model (Lebreton et al. 1992) and treating these as data to obtain the noncentrality parameter, as was done in Burnham et al. (1987, pp. 214–217). We investigated all combinations of $K = \{10, 15\}$, $N_1 = \{250, 500, 1000\}$, $\phi = \{0.3, 0.6, 0.9, 0.95\}$, $p = \{0.1, 0.3, 0.5\}$, $\theta = \{0.80, 0.90, 0.95, 0.99, 1.0\}$, and $\nu = \{0.5, 0.75, 1.0\}$ when all parameters were held constant over time. For $\alpha = 0.10$, the power never exceeded 25% except for combinations where $\phi \ge 0.90$, $p \ge 0.5$, $N_1 \ge 500$, and $\theta \le 0.90$; even then, the maximal power was only 75% when $K = 10$, but increased substantially and was never below 80% for this subset when $K$ was increased to 15. Power can actually decline if the survival rate is very high; in these cases, most animals survive and misreads only generate tag numbers of animals still alive.

Example applied to seals

Schwarz and Stobo (1997) described a capture–recapture study where cohorts of seal pups on Sable Island were sexed and branded with unique brand numbers. The females return to breed starting at about age 5. During breeding season, multiple surveys of the island are conducted and, at distances of 2–4 m, observers read and record brand numbers of seals spotted. On surveys conducted after inclement weather, the fur of some animals is partially covered with sand or snow; some animals have indistinct brands due to poor application technique, and they may be moving. Multiple observations of the same brand number within a single breeding season are collapsed to a single occurrence of the brand number for that year.

We will demonstrate our methodology using the 1973 female cohort. One additional year of sighting data is now available, and so the summary statistics in this paper differ slightly from those presented in Schwarz and Stobo (1997).

A total of 285 female pups were branded when 1–2 months old with unique brand numbers of the form Bnnn, where nnn would represent up to three digits. The values of nnn were assigned to pups sequentially with both genders intermingled. Resighting trips took place in 1981, when individuals were 8 years old, and in following years. Some information is available from one earlier year (1978), but the survey work in that year was not systematic. Consequently, that year’s information was not used. Also, female seals usually start to recruit to breeding status at age 4, but may not be fully recruited until age 8.

At the end of the field season, the logbooks were returned to the laboratory and transcribed. Cases where obvious illegal brand numbers were reported (e.g., Bxxx where no seals were branded with xxx) were excluded. Unfortunately, the number of times this occurred is not easily determined; otherwise, this could have provided additional information about the joint probability of making a misread and generating an invalid brand number.

Fig. 1. Approximate expectation of $\hat{\phi}$ in the presence of misreads. Each set of four lines represents (from top to bottom in each set) $\theta$ = 0.80, 0.90, 0.95, and 0.99 for $\phi$ = 0.9, 0.6, and 0.3, respectively. The vertical bars represent the range of expected values over all combinations of $p = \{0.1, 0.3, 0.5\}$, $\nu = 0.5$, and $K = 10$ with all parameters held constant over all occasions. Note that the vertical bars are extremely small in this graph, indicating that the mean values of $\hat{\phi}$ are very similar over the ranges of $p$ investigated in the simulation.

A total of 134 different capture histories were observed. Each capture history consists of a vector of length 15 where year 1 = 1973 (the year of release), year 2 = 1981, year 3 = 1982, ..., year 15 = 1994.

A standard CJS model was fit that corresponds to a model with $\theta = 1$ and $\nu = 0$ (Table 1). A comparison of the observed and expected number of brands by the number of times seen in total during the study (Table 2) showed an obvious lack of fit, particularly with the number of brands reported only once after release. As in our earlier paper (Schwarz and Stobo 1997), the Robson (1969) goodness-of-fit tests strongly rejected this model because there was a large difference between the subsequent sighting probabilities of brands observed for the first time after branding and those previously observed after branding, with the latter much more likely to be resighted in the future. Again, as noted in Schwarz and Stobo (1997), we suspect that the cause is brand misreads.

A model with constant $\theta$ and $\nu$ over time was fit to the data (Table 1). The likelihood ratio test statistic comparing the standard CJS model (Model 1) with that with $\theta$ and $\nu$ constant over time (Model 2) is approximately $290 = -2(-1617.7 - (-1472.9))$. The usual large sample $\chi^2$ approximation to the distribution of the test statistic is not applicable in this case because the null hypothesis has the parameters on the boundary of the parameter space and is not two-dimensional (i.e., $\nu$ disappears in Model 1). As noted earlier, the significance level was computed from the distribution of the likelihood ratio test statistic formed by a parametric bootstrap using the CJS estimates in Table 1 and is well below 0.001. Table 2 also shows that Model 2 had a dramatic improvement in fit.
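The arithmetic of the test statistic, and one common way of turning bootstrap replicates into a significance level, can be sketched in Python. The log-likelihoods are those reported in Table 1. The $(1 + b)/(1 + B)$ convention for the bootstrap p-value is an assumption, since the paper does not state which convention was used, and the list of simulated statistics is a placeholder that only shows the call.

```python
def likelihood_ratio_statistic(loglik_null, loglik_alt):
    """-2 * (log-likelihood of null model - log-likelihood of alternative)."""
    return -2.0 * (loglik_null - loglik_alt)

def bootstrap_p_value(observed, simulated):
    """(1 + number of simulated statistics >= observed) / (1 + B).

    simulated holds the statistic recomputed on B data sets generated from
    the fitted null (standard CJS) model.
    """
    exceed = sum(1 for s in simulated if s >= observed)
    return (1 + exceed) / (1 + len(simulated))

# Log-likelihoods of Model 1 (standard CJS) and Model 2 from Table 1.
lr = likelihood_ratio_statistic(-1617.7, -1472.9)
print(round(lr, 1))   # 289.6

# Placeholder replicate values, NOT results from the seal data.
placeholder_stats = [0.0, 1.2, 0.4, 3.1]
print(bootstrap_p_value(lr, placeholder_stats))
```

With $B$ replicates the smallest attainable p-value under this convention is $1/(1 + B)$, so reporting a level below 0.001 requires at least 1000 replicates.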

Fig. 2. Approximate expectation of $\hat{p}$ in the presence of misreads. Each set of four lines represents (from top to bottom in each set) $\theta$ = 0.99, 0.95, 0.90, and 0.80 for $p$ = 0.5, 0.3, and 0.1, respectively. The vertical bars represent the range of expected values over all combinations of $\phi = \{0.3, 0.6, 0.9\}$, $\nu = 0.5$, and $K = 10$.

The estimated misread rate is approximately 0.10 (SE = 0.015). The probability that a misread generates a valid brand number is estimated to be 1.00 (SE = 0.020 as determined from a bootstrap procedure). This is largely a consequence of the fact that the observers were familiar with the general range of valid brand numbers and of the data cleaning that has taken place in the laboratory, which implies that misreads that generate invalid brand numbers are removed from the data. The bootstrap SE for the estimates in Table 2 were very close to the SE as computed from the inverse of the information matrix (except for those estimates on the bounds of the parameter space). The revised estimates of $p_i$ are much larger than the corresponding estimates under the CJS model, even after multiplying by the estimate of $\theta$ (Table 1). The estimates of $p_i$ under the CJS model are severely biased downward, which will severely inflate any estimate of abundance based upon them. Fortunately, Schwarz and Stobo (1997) showed that a similar inflation also affects the estimated returning population size, and so the effect of misreads cancels when estimating the return rate.
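The comparison between the CJS capture estimates and the Model 2 estimates multiplied by $\hat{\theta}$ can be checked directly from the values in Table 1. The short Python sketch below tabulates both for each year; the variable names are arbitrary.

```python
# Estimates copied from Table 1 (keys are the two-digit survey years).
theta_hat = 0.899
p_cjs = {81: 0.049, 82: 0.187, 83: 0.374, 84: 0.440, 85: 0.467, 86: 0.569,
         87: 0.594, 88: 0.591, 89: 0.677, 90: 0.686, 91: 0.838, 92: 0.779,
         93: 0.835}
p_model2 = {81: 0.061, 82: 0.264, 83: 0.526, 84: 0.643, 85: 0.667, 86: 0.797,
            87: 0.782, 88: 0.786, 89: 0.881, 90: 0.866, 91: 1.000, 92: 0.920,
            93: 0.938}

for year in sorted(p_cjs):
    net = theta_hat * p_model2[year]        # estimated p_i * theta
    print(f"19{year}: CJS p = {p_cjs[year]:.3f}   theta * p (Model 2) = {net:.3f}")

# True when the Model 2 product exceeds the CJS estimate in every year.
all_larger = all(theta_hat * p_model2[y] > p_cjs[y] for y in p_cjs)
print(all_larger)
```

The gap is widest in the middle years and narrows toward the end of the study. The 1994 value is omitted because $\hat{p}_{94}$ was constrained to 1.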

Table 1. Estimates from three different models applied to the seal data.

Model 1: θ = 1, ν = 0 (a standard CJS model) (log-likelihood –1617.7)
Model 2: θ and ν constant over time (log-likelihood –1472.9)
Model 3: θ = 1, ν = 0 (a standard CJS model applied to tags seen at least twice)a

Parameter   Model 1             Model 2             Model 3
            Estimate  SE        Estimate  SE        Estimate  SE
p̂81         0.049     0.016     0.061     0.023     0.064     0.021
p̂82         0.187     0.030     0.264     0.041     0.250     0.037
p̂83         0.374     0.039     0.526     0.050     0.472     0.042
p̂84         0.440     0.043     0.643     0.052     0.580     0.042
p̂85         0.467     0.043     0.667     0.051     0.573     0.044
p̂86         0.569     0.050     0.797     0.048     0.680     0.043
p̂87         0.594     0.053     0.782     0.048     0.680     0.044
p̂88         0.591     0.054     0.786     0.052     0.669     0.045
p̂89         0.677     0.058     0.881     0.048     0.740     0.043
p̂90         0.686     0.059     0.866     0.049     0.733     0.044
p̂91         0.838     0.065     1.000     0.020d    0.869     0.038
p̂92         0.779     0.067     0.920     0.048     0.810     0.042
p̂93         0.835     0.042     0.938     0.047     0.835     0.041
p̂94b        1.000     –         1.000     –         1.000     –
θ̂           1.000     –         0.899     0.014     1.000     –
ν̂           0.000     –         1.000     0.020d    0.000     –
φ̂73c        0.785     0.115     0.637     0.149     0.490     0.030
φ̂81         0.837     0.120     0.744     0.181     1.000     –
φ̂82         1.000     0.015d    1.000     0.015d    1.000     –
φ̂83         0.995     0.025     0.993     0.024     1.000     –
φ̂84         1.000     0.017d    0.980     0.032     0.987     0.028
φ̂85         0.907     0.033     0.928     0.039     0.960     0.026
φ̂86         0.947     0.034     0.991     0.026     0.977     0.020
φ̂87         0.921     0.034     0.961     0.034     0.981     0.021
φ̂88         0.954     0.029     0.948     0.029     0.958     0.024
φ̂89         0.946     0.029     0.988     0.020     0.987     0.017
φ̂90         0.953     0.028     0.974     0.022     0.977     0.020
φ̂91         0.893     0.034     0.954     0.031     0.926     0.029
φ̂92         0.954     0.033     0.965     0.032     0.948     0.032
φ̂93         0.789     0.076     0.930     0.051     0.826     0.044

a The log-likelihood was not given for this model because it uses a different set of data (i.e., it discards histories with only one observation after branding).
b Only the product φ93 p94 can be estimated; p̂94 was constrained to the value 1 and hence has no SE reported.
c The parameter φ73 estimates the survival rate from the time of branding in 1973 to the time of the survey performed in 1981.
d The SEs for these estimates, which are on the boundary of the parameter space, were determined from bootstrap simulations.

The estimates of survival are less affected, but show that the CJS estimates are likely positively biased at the beginning of the sampling chain and negatively biased from the middle of the study onwards. As noted earlier, this is not surprising. A brand misread late in the study that generates a brand number for an animal that died early in the study leads the CJS method to assume that the animal has been alive from the beginning of the study, which tends to inflate earlier estimates. By midstudy, survival estimates are based more upon subsequent recoveries of animals known to be alive from that point onwards, and the “nonappearance” of a brand misread looks like a “death”, depressing the estimates. However, the bias is not severe because of the long-lived nature of the seal, which allows it to be properly sighted in a future year as well.
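The mechanism described above, in which a misread credits a sighting to the brand of an animal that is already dead, can be illustrated with a small simulation. This is a minimal sketch under stated assumptions: a misread that yields a valid number is assigned uniformly at random to one of the other released brands, all rates are constant over time, and the default values (n, K, phi, p, theta, nu) are arbitrary illustrations, not the seal estimates. The function name and signature are hypothetical.

```python
import random


def simulate_misreads(n=500, K=10, phi=0.6, p=0.5, theta=0.9, nu=1.0, seed=1):
    """Simulate recorded sightings for n branded animals over K occasions.

    Returns (recorded, spurious_dead): recorded[i][t] is 1 if brand i was
    recorded at occasion t; spurious_dead counts records credited to brands
    whose animal was already dead at that occasion.
    """
    rng = random.Random(seed)
    alive = [True] * n
    recorded = [[0] * K for _ in range(n)]
    spurious_dead = 0
    for t in range(K):
        # Each animal still alive survives to occasion t with probability phi.
        alive = [a and rng.random() < phi for a in alive]
        for i in range(n):
            if not alive[i] or rng.random() >= p:
                continue                      # dead, or alive but not sighted
            if rng.random() < theta:
                recorded[i][t] = 1            # brand read correctly
            elif rng.random() < nu:
                # Misread yielding a valid number: assumed uniform over the
                # other n - 1 released brands, alive or dead.
                j = rng.randrange(n - 1)
                if j >= i:
                    j += 1
                recorded[j][t] = 1
                if not alive[j]:
                    spurious_dead += 1
            # Otherwise the misread gives an invalid number and is discarded.
    return recorded, spurious_dead


recorded, spurious = simulate_misreads()
print("records credited to dead animals:", spurious)
```

Setting theta to 1 (no misreads) or nu to 0 (misreads never yield valid numbers) removes all such spurious records, which corresponds to the standard CJS setting.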

In this study, the seals have a high initial mortality after branding, but subsequent survival is high. The surviving animals tend to be seen many times and are usually faithful to their breeding grounds. Yet, we noticed that some brand numbers were observed only once after release. Indeed, the goodness-of-fit statistics in Table 2 show a large excess in the number of animals seen twice (at initial branding and only once afterwards). A similar behavior was noted by Pradel et al. (1997) when transients were present, i.e., there could be a subpopulation of seals that only visits Sable Island irregularly. Pradel et al. (1997) suggested that an ad hoc method be used with transients where all brands resighted only once after release are removed prior to an analysis using a standard CJS model. This will completely remove all brands observed only once after branding (corresponding presumably to misreads), but will also remove legitimate histories corresponding to animals suffering an early death. The results of fitting the standard CJS to the modified capture histories are shown in Tables 1 and 2. The estimates of the capture rates seem to have improved early in the study, but show little improvement later in the study. Removal of spurious brand sightings of dead animals late in the study implies that the estimated number of animals alive at the beginning of the study is reduced. This improves estimates of p_i, but later in the study, removal of misreads implies that estimates of p_i tend to estimate the product p_iθ. Estimates of survival are severely affected early in the study because removal of brands observed only once after branding tends to remove histories corresponding to legitimate sightings of animals captured only once before their death, particularly given the low sighting rates and comparably higher mortality rates early in the study. Estimates later in the study are less affected because they are based upon animals typically having a large number of recaptures given the high annual survival and recapture rates late in the study.
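The ad hoc adjustment can be sketched as a simple transformation of the capture histories. Following the footnote to Table 2, brands with exactly one resighting are reclassified as never resighted. The data layout (one 0/1 list per brand, covering resighting occasions only and excluding the release) and the function name are assumptions made for illustration.

```python
def drop_single_resightings(histories):
    """Blank out histories that contain exactly one resighting.

    histories: list of 0/1 lists, one per brand, covering the resighting
    occasions only (the initial release is assumed not to be included).
    Returns a new list; the input is left unchanged.
    """
    cleaned = []
    for h in histories:
        if sum(h) == 1:
            cleaned.append([0] * len(h))   # treated as never resighted
        else:
            cleaned.append(list(h))
    return cleaned


example = [
    [0, 0, 0, 0],   # never resighted
    [0, 1, 0, 0],   # one resighting: possibly a misread
    [1, 0, 1, 1],   # resighted repeatedly
    [0, 0, 0, 1],   # one resighting: could equally be a legitimate sighting
    [1, 1, 0, 0],   # resighted twice, then not again
]
print(drop_single_resightings(example))
```

The second and fourth histories are treated identically, which shows why the adjustment also discards legitimate single sightings.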

A power analysis using the estimates shown in Table 1 with sample sizes similar to those in the study shows that the power to detect misreads in this study is very high, well over 95%. This study was successful in detecting possible misreads because of the relatively lower survival rate after branding until recruitment, which generates a pool of dead animals that misreads can select, and the high subsequent survival and recapture rates, which give a good chance for misreads to occur.
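The strength of the evidence for misreads in these data can also be read from the log-likelihoods reported with Table 1. The sketch below computes the likelihood ratio statistic for Model 2 against Model 1 and a naive reference p-value from a chi-square distribution with 2 degrees of freedom. That reference distribution is an assumption made for illustration: θ = 1 lies on the boundary of the parameter space and ν has no effect when θ = 1, so the usual asymptotic result is only a rough guide here (Davies 1977 and Self and Liang 1987 in the reference list address such cases).

```python
import math

loglik_cjs = -1617.7       # Model 1: theta = 1, nu = 0 (Table 1)
loglik_misread = -1472.9   # Model 2: theta and nu constant over time (Table 1)

# Likelihood ratio statistic: twice the gain in log-likelihood.
lrt = 2.0 * (loglik_misread - loglik_cjs)

# For a chi-square variable with 2 degrees of freedom, P(X > x) = exp(-x / 2).
# Used only as a naive reference; see the caveat in the lead-in.
p_naive = math.exp(-lrt / 2.0)

print(f"LRT statistic = {lrt:.1f}, naive p-value = {p_naive:.2e}")
```

The statistic is far beyond any conventional critical value, so the conclusion does not hinge on the exact reference distribution.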

Discussion

This very simple model of tag misreads worked surprisingly well in practice. We suspect that even though misreads do not generate valid numbers from all possible tag numbers, they occur so infrequently that it is not likely that the same misread will take place very often. If the misread rate is higher, this model may not work well, e.g., a tag number 354 may be misread consistently as 854, but a tag number 711 could be misread as 111, 777, or any other combination of 1’s and 7’s.
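The contrast between the two tag numbers in this example can be made concrete by enumerating the numbers reachable through digit confusion. The confusion groups used below ("1" with "7", "3" with "8") are hypothetical and chosen only to reproduce the example in the text; real confusion patterns would depend on the brand font and viewing conditions.

```python
from itertools import product


def possible_misreads(tag, confusable):
    """List the tag numbers reachable by swapping digits within confusion groups.

    tag: string of digits.
    confusable: list of strings, each a group of mutually confusable digits
    (hypothetical groups supplied by the caller).
    """
    options = []
    for ch in tag:
        # Digits outside every group can only be read as themselves.
        group = next((g for g in confusable if ch in g), ch)
        options.append(sorted(group))
    candidates = ("".join(combo) for combo in product(*options))
    return sorted(c for c in candidates if c != tag)


groups = ["17", "38"]
print(possible_misreads("354", groups))
print(possible_misreads("711", groups))
```

Under these assumed groups, 354 has a single possible misread while 711 has seven, so misreads of different tags are spread over very different sets of numbers.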

Table 2. Goodness-of-fit test for the three models fit to the seal data.

Model 1: θ = 1, ν = 0 (a standard CJS model)
Model 2: θ and ν constant over time
Model 3: θ = 1, ν = 0 (applied to tags seen at least twice)b

Predicted values are the predicted numbers of animals under each model.

Times       Observed no.   Model 1             Model 2             Model 3
observeda   of animals     Predicted  χ2       Predicted  χ2       Predicted  χ2
1           100            100.0      0.0      98.5       0.0      146.0      0.0
2           46             13.3       80.8     46.1       0.0      2.9        2.9
3           14             15.6       0.2      14.4       0.0      4.8        17.7
4           6              14.6       5.1      7.2        0.2      5.1        0.2
5           9              13.3       1.4      5.7        1.9      5.3        2.7
6           6              15.1       5.5      5.6        0.0      6.8        0.1
7           8              20.3       7.5      7.4        0.1      10.8       0.7
8           12             26.1       7.7      12.5       0.0      17.5       1.8
9           18             27.6       3.3      20.1       0.2      24.5       1.7
10          17             21.7       1.0      25.5       2.8      26.5       3.4
11          20             12.0       5.4      23.0       0.4      20.5       0.0
12          20             4.3        56.7     13.6       3.0      10.6       8.3
13+c        9              1.0        64.0     5.4        2.4      3.8        7.1
Total       285            284.9      238.4    285.0      11.1     285.0      46.6

a Includes initial release, i.e., a brand observed once would correspond to a brand number released in 1973, but never again subsequently resighted.
b History vectors were modified so that brand numbers observed only twice were reclassified into the observed only once class.
c The last few counts were pooled; otherwise, the small value of the predicted number of animals would inflate the cell’s χ2 contribution.

One possible alternative explanation for these observed data is the presence of a pool of nonbreeders (transients) who visit Sable Island at extremely irregular intervals. This cannot be ruled out given the observed data, but the current model still gives useful estimates of survival of the breeding population and estimates of capture rates that can be used with other cohorts branded in other years to investigate recruitment to breeding status. This model also differs from the transient model of Pradel et al. (1997) in that not only do misreads generate histories that look similar to those from transient animals, but the capture histories of nontransients are also affected.
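The χ2 columns of Table 2 appear consistent with Pearson contributions of the form (O − E)²/E. This is inferred from the tabulated values rather than stated in the text. The sketch below recomputes the contributions for Models 1 and 2 from the observed and predicted counts transcribed from Table 2. Because the predicted counts are rounded to one decimal, the recomputed totals only approximate the reported 238.4 and 11.1.

```python
# Counts transcribed from Table 2 (classes 1, 2, ..., 12, 13+).
observed = [100, 46, 14, 6, 9, 6, 8, 12, 18, 17, 20, 20, 9]
pred_cjs = [100.0, 13.3, 15.6, 14.6, 13.3, 15.1, 20.3,
            26.1, 27.6, 21.7, 12.0, 4.3, 1.0]      # Model 1 (standard CJS)
pred_mis = [98.5, 46.1, 14.4, 7.2, 5.7, 5.6, 7.4,
            12.5, 20.1, 25.5, 23.0, 13.6, 5.4]     # Model 2 (misread model)


def pearson_contributions(obs, expected):
    """Per-class Pearson chi-square contributions, (O - E)^2 / E."""
    return [(o - e) ** 2 / e for o, e in zip(obs, expected)]


chi_cjs = pearson_contributions(observed, pred_cjs)
chi_mis = pearson_contributions(observed, pred_mis)
total_cjs = sum(chi_cjs)
total_mis = sum(chi_mis)

print(f"Model 1 total: {total_cjs:.1f}")
print(f"Model 2 total: {total_mis:.1f}")
```

Under Model 1 the largest single contribution comes from the class of brands observed twice, which is the excess discussed in the text.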

Tag misreads can also be viewed as a source of heterogeneity in the capture probabilities. Live animals and dead animals are two groups in the population with different capture probabilities at each sample time. Previous investigations of the effects of heterogeneity in capture probabilities have shown that the estimates of survival are relatively unaffected by modest heterogeneity, while estimates of the capture rates are more sensitive (Pollock et al. 1990). Our results confirm these findings.

Based upon our simulation results and power analyses, misreads could have serious implications in capture–recapture studies where tags are applied to young animals with a high initial mortality until adult age followed by a long period of high survival. In these types of studies, it is often the case that the animal is only physically handled when young, and subsequent recaptures are resightings done at a distance. Under these conditions, estimates of abundance could be severely biased by misreads. When planning studies where future observations may include misreads, some care in the actual tag numbers used could reduce the incidence of misreads. For example, the researcher may avoid using both the digits “1” and “7” in the tag numbers so that they are not confused, or may use tag numbers with a built-in error-correcting code. If animals are physically handled, degraded tags should be replaced.
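One way to picture the suggestion about tag numbering is a scheme that omits the digit “7” and appends a check digit. The sketch below is a simplification: it uses an error-detecting check digit, which can flag a single misread digit but cannot correct it, so it is weaker than the error-correcting code mentioned in the text. The digit set, function names, and checksum rule are all hypothetical.

```python
ALPHABET = "012345689"   # hypothetical digit set that omits "7"


def add_check_digit(body):
    """Append a check digit so that the symbol indices sum to 0 modulo 9.

    Raises ValueError if body contains a character outside ALPHABET.
    """
    total = sum(ALPHABET.index(ch) for ch in body)
    return body + ALPHABET[(-total) % len(ALPHABET)]


def is_valid(tag):
    """Return True if every symbol is allowed and the checksum is satisfied."""
    if any(ch not in ALPHABET for ch in tag):
        return False
    return sum(ALPHABET.index(ch) for ch in tag) % len(ALPHABET) == 0


tag = add_check_digit("354")
print(tag, is_valid(tag))
print(is_valid("8" + tag[1:]))   # first digit misread as 8
```

Any single-digit substitution changes the checksum, so a misread of one digit produces an invalid number that can be discarded during data cleaning instead of being credited to another animal.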

The current models have been fit to data using a general-purpose optimization routine commonly available in many statistical packages or in libraries in other computer languages. The major difficulty in fitting these models using the common mark–recapture software packages such as SURGE, MARK, POPAN, and SURVIV is the complex form of the probability function for each history. A symbolic algebra system could be used to generate the expression that could subsequently be used in SURVIV, but this is not easily done by the casual user. The models could be fit by a general-purpose multistrata analysis program (e.g., MS-SURVIV), but these programs assume that the stratum is known when the tag is observed, i.e., the analyst would need to know if the animal is alive or dead when the tag number is observed, and so these programs would also have to be modified.

Acknowledgements

This research was supported by the Natural Sciences and Engineering Research Council of Canada and by the Bedford Institute of Oceanography. The complete set of data and the programs used to analyze the data are available from the first author and can be downloaded from http://www.math.sfu.ca/~cschwarz/papers.

References

Brownie, C., Hines, J.E., Nichols, J.D., Pollock, K.H., and Hestbeck, J.B. 1993. Capture–recapture studies for multiple strata including non-Markovian transition probabilities. Biometrics, 49: 1173–1187.

Buckland, S.T., and Garthwaite, P.H. 1991. Quantifying precision of mark–recapture estimates using bootstrap and related methods. Biometrics, 47: 255–268.

Burnham, K.P., Anderson, D.R., White, G.C., Brownie, C., and Pollock, K.H. 1987. Design and analysis methods for fish survival experiments based on release–recapture. Am. Fish. Soc. Monogr. No. 5.

Cormack, R.M. 1964. Estimates of survival from the sighting of marked animals. Biometrika, 51: 429–438.

Davies, R.B. 1977. Hypothesis testing when a nuisance parameter is present only under the alternative. Biometrika, 64: 247–254.

Jolly, G.M. 1965. Explicit estimates from capture–recapture data with both death and immigration – stochastic models. Biometrika, 52: 225–247.

Lebreton, J.-D., Burnham, K.P., Clobert, J., and Anderson, D.R. 1992. Modelling survival and testing biological hypotheses using marked animals. A unified approach with case studies. Ecol. Monogr. 62: 67–118.

Lebreton, J.-D., Almeras, T., and Pradel, R. 1998. Competing events, mixture of information, and multi-strata recapture models. Bird Stud. In press.

Manly, B.F.J. 1997. Randomization, bootstrap, and Monte Carlo methods in biology. 2nd ed. Chapman and Hall, London, U.K.

Pollock, K.H., Nichols, J.D., Brownie, C., and Hines, J.E. 1990. Statistical inference for capture–recapture experiments. Wildl. Monogr. No. 107.

Pradel, R., Hines, J.E., Lebreton, J.-D., and Nichols, J.D. 1997. Capture–recapture survival models taking account of transients. Biometrics, 53: 60–72.

Robson, D.S. 1969. Mark–recapture methods of population estimation. In New developments in survey sampling. Edited by N.L. Johnson and H. Smith, Jr. Wiley-Interscience, New York. pp. 120–140.

SAS Institute, Inc. 1995. SAS/IML software: changes and enhancements through release 6.11. SAS Institute, Inc., Cary, N.C.

Schwarz, C.J., and Stobo, W.T. 1997. Estimating temporary migration using the robust design. Biometrics, 53: 178–194.

Schwarz, C.J., Schweigert, J.M., and Arnason, A.N. 1993. Estimating migration rates using tag–recovery data. Biometrics, 49: 177–193.

Seber, G.A.F. 1965. A note on the multiple recapture census. Biometrika, 52: 249–259.

Self, S.G., and Liang, K.-Y. 1987. Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions. J. Am. Stat. Assoc. 82: 605–610.
