This article was downloaded by: [Virginia Commonwealth University Libraries] On: 25 September 2013, At: 17:20 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK CHANCE Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/ucha20 Statistics and Slobodan: Using Data Analysis and Statistics in the War Crimes Trial of Former President Milosevic Patrick Ball & Jana Asher Published online: 20 Sep 2012. To cite this article: Patrick Ball & Jana Asher (2002) Statistics and Slobodan: Using Data Analysis and Statistics in the War Crimes Trial of Former President Milosevic, CHANCE, 15:4, 17-24 To link to this article: http://dx.doi.org/10.1080/09332480.2002.10554820 PLEASE SCROLL DOWN FOR ARTICLE Taylor & Francis makes every effort to ensure the accuracy of all the information (the “Content”) contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content. This article may be used for research, teaching, and private study purposes. 
Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. Terms & Conditions of access and use can be found at http://www.tandfonline.com/page/terms-and-conditions

Statistics and Slobodan: Using Data Analysis and Statistics in the War Crimes Trial of Former President Milosevic



Statisticians recount experiences and share results from investigating the alleged war crimes of Slobodan Milosevic

Significant positive evidence for the first two questions would support a position (favorable to the defense) that the government of Yugoslavia was not

Wanted poster for Yugoslav President Slobodan Milosevic and Bosnian Serb leaders Radovan Karadzic and Ratko Mladic released by the State Department Thursday, March 2, 2000. The State Department announced a reward of up to $5 million for information leading to the arrest and conviction of the men indicted by an international tribunal.

patterns of airstrikes carried out by the North Atlantic Treaty Organization (NATO) and the activities of the Kosovo Liberation Army (KLA). The report is now known as Exhibit 67 in the Prosecutor v. Slobodan Milosevic. The authors of the study (Patrick Ball, Wendy Betts, Fritz Scheuren, Jana Dudukovic, and Jana Asher) combined their specialties in data management, statistics, data matching and international relations in order to answer the following questions:

1. Did action by the Kosovo Liberation Army (KLA) motivate Kosovar Albanians to leave their homes and result (directly or indirectly) in Albanians' deaths?

2. Did air attacks by the North Atlantic Treaty Organization (NATO) create local conditions that led to Kosovar Albanians being killed and leaving their homes?

3. Did a systematic campaign by Yugoslav forces lead to Kosovar Albanian deaths and expel Kosovar Albanians from their homes?

Patrick Ball and Jana Asher

Today in The Hague, history is being made: for the first time in over fifty years, a state leader is on trial for crimes against humanity. Former Yugoslav President Slobodan Milosevic has been indicted for orchestrating atrocities in Croatia, Bosnia, and Kosovo between 1991 and 1999. The current trial is for war crimes allegedly committed in Kosovo; it will be followed by trials of allegations of crimes in Croatia and Bosnia. On the Kosovo portion of the trial alone, the judges' decision will be based on the testimony of more than one hundred eyewitnesses, diplomats, military experts, and scientists. The International Criminal Tribunal for the Former Yugoslavia (ICTY) is trying the case and will make a decision that includes a finding of fact on dozens of questions. As statisticians, the question we addressed in the trial is this: were the forced migration and deaths of ethnic Albanians in Kosovo between March and June of 1999 the result of a systematic Yugoslav government campaign?

The first author of this article gave testimony over two days, March 13-14, 2002. The testimony was based on a detailed statistical study of the pattern of deaths and migration in relation to the


responsible for the forced migration and deaths of ethnic Albanians. Significant negative evidence for the first two questions, however, coupled with positive evidence for the last question, could undermine Mr. Milosevic's defense.

In order to address these questions, data had to be gathered describing both the migration patterns and the patterns of deaths of ethnic Albanians in Kosovo during the period in question. Additionally, if available, data on NATO bombing, KLA activity, and Yugoslav army activity were required. Analyzing these data together was the goal of the study that Patrick presented to the court. This article describes the steps we and our coauthors took to piece together the puzzle of what happened in Kosovo.

The Forced Migration of Ethnic Albanians

Figure 1. Records maintained by the Albanian border guards at Morina, March-June 1999.

The first step taken by our scientific team was to establish patterns of forced migration of ethnic Albanians; in other words, how many people had left each village in Kosovo at each point in time? If we could describe the statistical patterns of migration over time and space, the patterns might lead us to favor one hypothesis relative to the others. To gain some understanding of the refugee flow into Albania during the crisis in Kosovo, and to identify a source of data, Fritz Scheuren and Patrick Ball visited the Albania-Kosovo border in early May 1999.

During this visit, Fritz noticed that the Albanian guards were recording data about the parties crossing into Albania. Through their translator, Fritz and Patrick discovered that the guards were registering every refugee they could in detailed border records. The guards reported a few caveats: sometimes there was shooting or shelling on the Kosovo side of the border, and at other times, the number of people crossing the border became overwhelming. During those moments of chaos, the border guards suspended the registration process and just waved people through as quickly as they could. During a later visit, our Albanian partners convinced the Albanian government to grant us permission to copy the border records.


Determining a way to make those copies, however, proved more difficult. The most plausible method for obtaining these data was to scan the images at the border. There were 690 pages of records, all of which were scanned.

The data were of high quality but incomplete. Several days seemed to be missing chunks of records, and two days in mid-May were entirely missing. We were concerned about the missing information, including the missing days and the moments when the guards suspended their registration. Fortunately, the UN High Commissioner for Refugees (UNHCR) had conducted an independent count of people on the road from the border to the town of Kukes, some 30 km southwest of the border. UNHCR published this count in daily press briefings. By comparing the UNHCR daily counts to the counts in the border records, we found that the two counts differed most often and most significantly on days of very high refugee movement. Furthermore, when the counts differed, the UNHCR count was always higher. These observations are consistent with the guards' explanation that when security conditions were threatening, they suspended registration.

We built a dataset of migrations by combining the data from the border records with the UNHCR counts. The people counted by UNHCR but not registered in the border data were allocated to villages carefully, by using the distribution of origin places found in surveys conducted among Kosovar Albanian refugees in camps. This method avoided the assumption that the people with unknown origin places have the same distribution as the people with known origin places. The adjusted dataset modeled the total number of people crossing the border by the date of crossing and the place of origin. Approximately 404,000 people were in this dataset.
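The allocation step described above can be illustrated with a minimal sketch. It is not the study's code: the function name, the village labels, the daily counts, and the survey shares are all hypothetical, and the sketch assumes the shortfall for a day is simply the UNHCR count minus the border-record count.

```python
def allocate_unregistered(border_count, unhcr_count, survey_shares):
    """Split the people counted by UNHCR but missing from the border records
    across origin places, in proportion to survey-based shares (assumed input)."""
    shortfall = max(0, unhcr_count - border_count)
    return {place: shortfall * share for place, share in survey_shares.items()}

# Hypothetical survey distribution of origin places for one day (sums to 1).
shares = {"village_a": 0.5, "village_b": 0.3, "village_c": 0.2}

# Hypothetical counts: the border guards registered 8,000 people, UNHCR counted 10,000.
extra = allocate_unregistered(border_count=8000, unhcr_count=10000, survey_shares=shares)

# On a day when the two counts agree, nothing is allocated.
none_needed = allocate_unregistered(border_count=500, unhcr_count=500, survey_shares=shares)
```

The design point is that the shares come from the camp surveys rather than from the registered records, which is what lets the method avoid assuming both groups have the same origin distribution.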

To study the potential reasons for the migrations, however, we needed to know the times people left their homes, not the times they crossed the border. The difference between people crossing the border and people leaving home is the time it took people to travel from their homes to the border - the transit time. From surveys conducted in refugee camps, we were able to determine distributions of transit times. The distributions were disaggregated by places of origin and border-crossing dates. We found that transit distributions did not vary much by origin locations, but the distributions did change over time. We used time-specific transit time distributions to shift backward the border-crossing data, transforming it into home-leaving estimates. That is, approx-


Figure 2. Close up of one border record.

percent in Poland). The refugees were interviewed in refugee camps and private homes. All interviews were conducted using a standardized questionnaire that allowed for a narrative description of events. The information on the questionnaire was then keyed into a database. 5,089 records were created.

• Exhumations by international teams on behalf of the International Criminal Tribunal for the Former Yugoslavia (EXH): 1,767 exhumations were conducted in locations thought to contain graves of Kosovars killed during the months leading up to the Yugoslav withdrawal. Although exhumations were not evenly spread across Kosovo, exhumations were conducted in 24 of Kosovo's 29 municipalities.

[Figure 2 image: scanned border registration form headed "LISTA E REGJISTRIMIT TE REFUGJATEVE TE KOSOVES"; the handwritten entries are not legible.]

The Data on Albanian Deaths

Estimating the total number of deaths in Kosovo between March and June of 1999, as well as the pattern over time and space of those deaths, turned out to be very complex. Data on killings were available from four sources:

• The American Bar Association Central and East European Law Initiative (ABA/CEELI) and its partners: This non-profit organization collected data via 1,674 interviews with ethnic Albanian refugees in five different countries (35 percent in Albania, 16 percent in Macedonia, 10 percent in the United States, 38 percent in Kosovo, Yugoslavia, and less than 1

imately 60 percent of people crossing the border each day had left home that same day, approximately 15 percent had left home the previous day, 8 percent two days earlier, and so forth. Thus on day Y, 60 percent of the people crossing from village X were allocated to leaving home also on day Y, 15 percent were allocated to leaving home on day Y-1, 8 percent on day Y-2, etc. The resulting distribution estimates the pattern of people leaving home over time and space.

Plotting the pattern over time, we noted that during the early phase of the conflict, most of the people leaving Kosovo originated from municipalities in the south and west of Kosovo. During the middle phase, a much greater proportion of people leaving Kosovo originated from the northern regions. Then in the third phase, most of the people leaving Kosovo came from the southern region. Each phase was characterized by a high volume of refugee flow followed by a much lower level of refugee flow. The wave-like pattern suggested a cause that stopped and started.
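The backward shift from border-crossing dates to home-leaving dates can be sketched as follows. This is an illustration only, not the study's code: the 60, 15, and 8 percent shares come from the text, but the remaining tail of the distribution is hypothetical (chosen so the shares sum to 1), the crossing counts are made up, and a single distribution is used here although the study used time-specific ones.

```python
from collections import defaultdict

# Share of people crossing on day Y who left home 0, 1, 2, ... days earlier.
# First three values are from the text; the tail is an assumption for illustration.
TRANSIT_SHARES = [0.60, 0.15, 0.08, 0.07, 0.06, 0.04]

def shift_to_home_leaving(crossings, shares=TRANSIT_SHARES):
    """crossings maps a day index to the number of people crossing the border
    from one village; the result maps day indices to estimated home-leavers."""
    leaving = defaultdict(float)
    for day, count in crossings.items():
        for lag, share in enumerate(shares):
            leaving[day - lag] += count * share
    return dict(leaving)

# Hypothetical crossings from one village on two consecutive days.
crossings = {10: 1000, 11: 500}
leaving = shift_to_home_leaving(crossings)
```

Because the shares sum to 1, the shift moves people between days without changing the total, which is the property the home-leaving estimates need.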

The original analysis of migration patterns was published in a report authored by Patrick Ball in March 2000. Patterns of NATO airstrikes were compared to the migration patterns by plotting the dates on which airstrikes occurred in a horizontal bar on top of the time series plots of the quantity of people leaving home for each municipality. Even in this exploratory analysis it was clear that most of the airstrikes occurred after the major flow of refugees in the first phase. Only occasionally did airstrikes precede a local peak in refugee flow. More often we found that there were no airstrikes in a given municipality until after most of the refugee flow had occurred. The possibility that NATO airstrikes had caused migration was not supported by these findings.

We were pleased with the migration data available for our study. The fact that data were available from two independent sources - the border records and the UNHCR reports - helped to support the statistics we developed on migration flow. The next step, determining patterns of killings during the same time period, required completely different data and methods.


• Human Rights Watch (HRW): Human Rights Watch is the largest non-governmental human rights organization based in the United States. From March to June 1999, HRW interviewed refugees as they left Kosovo. Of all the interviewees who gave statements to HRW, 25 percent were interviewed as they crossed the border into Albania or when they had settled in refugee camps or private homes; 11 percent were interviewed in Macedonia, and 3 percent in Montenegro. From June through December 1999, HRW conducted the remaining 60 percent of interviews in Kosovo. The geographic regions within Kosovo were selected based on refugee reports of mass human rights violations and on reports of mass violations from sources other than refugees. All interviews were conducted to elicit open narratives of what the interviewee had seen. Standardized questionnaires were not used. There were 337 interviews and 1,717 incidents reported in total.

• The Organization for Security and Cooperation in Europe (OSCE): The OSCE Kosovo Verification Mission (OSCE-KVM) collected 1,837 interviews that mention one or more killings. The statements were taken from March through June 1999. The interviews were conducted in more than 90 distinct locations in Albania (37 percent of the interviews) and at least six locations in Macedonia (61 percent of the interviews). There was a small number of interviews (22) for which the place of interview was not noted. No information was gathered in Kosovo itself. Most of the interviews (over 80 percent) were conducted in refugee camps; the remainder of the interviews were collected in public gathering spaces or private homes. OSCE-KVM used standardized interview forms similar to those used by ABA/CEELI. The information was then entered into a database, also similar to that used by the ABA.

Information common to the three lists created from interview data includes place and time of death, as well as the name of the deceased, if available. The exhumation data did not contain place or time of death.

These four sources of data, or lists, represent the highest quality information available about killings of Albanians in Kosovo in 1999, but our staff had to make substantial efforts to standardize and unduplicate each individual dataset, remove anonymous records from each dataset, and then combine the four datasets. It was essential for us to match accurately across the four lists in order to determine their overlap. In other words, we invested most of our data processing effort in identifying which killings appeared in more than one of the datasets. This process would allow us both to determine the number of killings listed uniquely in one or more of the four datasets and to analyze the patterns of overlap between the datasets. From the patterns of overlap, estimates of the total number of killings not captured by any of the datasets could be created.

The process by which two large datasets containing records about the same people are merged to create one record per person is called record linkage. The Kosovo killings data represented a record linkage challenge for two reasons. First, it is not possible to automate linking four datasets jointly given current record linkage theory and technology. As a result, the six possible pairs of lists (e.g., HRW and OSCE, HRW and ABA) were separately linked. Second, pre-existing standardization software could not easily be utilized due to substantial differences between Albanian and United States naming conventions. Such software will process each record of a dataset and substitute standardized names for nicknames and alternate spellings; e.g., "William" for "Bill" and "Billy." When two lists are standardized, computer matching is facilitated.

Because of the name standardization issues, and due to the small size of the lists, computer matching via a pre-existing software package was not viable for this study. Instead, native Albanians and Albanian name indices were consulted to determine different spellings of and nicknames for specific Albanian names, and human coders were employed to do the matching across lists. For each pair of lists, the smaller list was designated as the source and the larger as the target. For each source record, possible matches in the target set were determined via software created by Patrick and his American Association for the Advancement of Science (AAAS) colleague Matt Zimmerman. Human coders then used the match-facilitation software to process the source dataset record by record, determining the best match (if any) in the target dataset. Each individual source record in the six pairs of lists was matched at least twice by different coders to measure the inter-rater reliability of the matching process. Several subsequent rounds of additional matching (using different blocking variables) were applied in order to identify matches that had been missed in earlier rounds.
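To make the idea of proposing candidate matches within a block concrete, here is a minimal sketch. It is not the match-facilitation software described above, whose internals the article does not give: the record fields, the choice of municipality as the blocking variable, the 0.8 threshold, and the use of Python's `difflib.SequenceMatcher` for string similarity are all assumptions, and the names are placeholders.

```python
from difflib import SequenceMatcher

def candidates(source, targets, block_key="municipality", threshold=0.8):
    """Return (score, record) pairs for target records that share the source
    record's block and have a similar name, best score first."""
    found = []
    for target in targets:
        if target[block_key] != source[block_key]:
            continue  # blocking: only compare records within the same block
        score = SequenceMatcher(
            None, source["name"].lower(), target["name"].lower()
        ).ratio()
        if score >= threshold:
            found.append((round(score, 2), target))
    return sorted(found, key=lambda pair: pair[0], reverse=True)

# Placeholder records; a human coder would review the proposed candidates.
source = {"id": "s1", "name": "William Smith", "municipality": "A"}
targets = [
    {"id": 1, "name": "William Smith", "municipality": "A"},
    {"id": 2, "name": "Wiliam Smith", "municipality": "A"},   # spelling variant
    {"id": 3, "name": "William Smith", "municipality": "B"},  # different block
    {"id": 4, "name": "Robert Jones", "municipality": "A"},   # different person
]
proposed = candidates(source, targets)
```

Rerunning with a different `block_key` mirrors the later matching rounds, which used different blocking variables to catch matches missed earlier.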

The overall results of the matching are listed in Table 1. The total number of individual deaths (killings where the victim can be named) within these data is 4,400. Note that the total number of


Figure 3. Estimated total refugee migration and killings over time, in Kosovo.

Allowing for explanatory variables that included all possible list interaction terms, there were over 100 potential models we could use to estimate the total number of Albanian killings in Kosovo between March and June of 1999. We selected one of these models by balancing the parsimony (simplicity) of the model with the overall fit of the model. This represents a common issue in model selection; the more complicated the model, the better the model "fits" the data (in other words, the more closely the fitted values produced by the model represent the data). Both overfitting and underfitting a model are considered problems, as is making the model too complicated. We chose the model for which the $\chi^2$ statistic divided by the degrees of freedom was minimized. The end result was an overall estimate of 10,356 Kosovar Albanian deaths.
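The selection rule just described, choosing the model with the smallest chi-square statistic per degree of freedom, can be sketched in a few lines. The model names and the chi-square values below are hypothetical; only the degrees of freedom follow from the setup (15 observed cells, minus the number of parameters in each candidate model).

```python
def select_model(fits):
    """fits is a list of (name, chi2, df) tuples; return the entry with the
    smallest chi2/df. Saturated models (df == 0) have no ratio and are skipped."""
    usable = [fit for fit in fits if fit[2] > 0]
    return min(usable, key=lambda fit: fit[1] / fit[2])

# Hypothetical fit statistics for three candidate log-linear models.
fits = [
    ("independence", 95.0, 10),      # intercept + 4 main effects
    ("one interaction", 30.0, 9),    # adds one two-list interaction
    ("two interactions", 29.0, 8),   # adds a second interaction
]
best = select_model(fits)
```

In this made-up example the second interaction barely improves the fit, so dividing by the degrees of freedom favors the simpler model.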

To determine the number of deaths for individual two-day time periods, we again used multiple systems estimation, but this time we used the overlap of three lists instead of four. This was simply to allow the most accurate estimation possible. We were concerned that the cell sizes in the cross-classification table for all four lists for a two-day period were very small, which could have caused highly variable estimation. Again, we obtained many potential models for each two-day period, and used the $\chi^2$ statistic divided by the degrees of freedom to determine the model of choice.

[Figure 3 plot: two time series, "People killed" and "People leaving their homes".]

The technical details of this process of model selection and estimate creation, as well as a history of the development of multiple systems estimation techniques, are outlined thoroughly in Appendix 2 of our study. The important point is that we were able to create estimates of the number of ethnic Albanian deaths in Kosovo in particular two-day periods, and for particular regions of the country. We were then able to compare these estimates to both the migration data and to the data we acquired about NATO bombing and KLA activity. The results of this comparison can now be discussed.

Putting it All Together

The first clues toward answering the question of what happened in Kosovo can be found by simply comparing the pattern of deaths and migrations in Kosovo over time. As can be seen in Figure 3, counts of deaths and migrations follow the same trend, with three distinct phases. Each phase is characterized by high points separated by relatively much lower points. This suggests that migrations and deaths are associated, either because one caused the other, or more likely, because both were caused by the same social force. The next logical step was to examine how patterns of NATO, KLA, and Yugoslav army activity related to the pat-


Estimating the Patterns of Deaths

deaths for each list is smaller than the original number of incidents recorded when the datasets were created. This is due both to the removal of duplicate records within each individual list, and also to the removal of anonymous/group killing reports for all lists. The missing cell in Table 1 represents those deaths which were not identified in any of the four lists; it is to the estimation of this number that we turn next.

The judges for Milosevic's trial are not particularly interested in the number of deaths we were able to record, because this number by itself does not address the cause of those deaths. Of more interest is an estimate of both the true number of deaths and the true pattern of deaths over time. In order to make estimates of the true number of deaths overall and at particular points in time, we used a well-established statistical technique called multiple systems estimation.

Multiple systems estimation, as we implemented it, requires the following:

1. The dataset is partitioned into each possible pattern of exclusions and inclusions in the four lists. As an example, in order to estimate the total number of deaths in Kosovo, four lists are used. There are fifteen possible patterns of inclusion and exclusion for a death in this database (e.g., 1111 represents inclusion in every list; 1010 represents inclusion in two lists only; 0001 represents inclusion in one list only).

2. The number of deaths for each pattern of inclusion and exclusion is treated as a response variable, and the particular pattern of inclusions and exclusions is used to form explanatory variables. The independent variables can include interaction terms.

3. A log-linear model is estimated using these response and explanatory variables.

4. The intercept term for the model represents the log number of deaths not captured by any of the lists.

5. The total number of deaths is then the number of deaths in the dataset (4,400) plus the estimate of the number of deaths not captured by any of the four lists.
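The five steps above can be illustrated with a small, self-contained sketch. It is not the study's implementation: it fits only an intercept and one main effect per list (no interaction terms), uses a hand-written Newton iteration for the Poisson log-linear model instead of a statistics package, and runs on made-up two-list counts. With 0/1 coding of list membership, the intercept is the log of the expected count for the pattern that appears on no list.

```python
import math
from itertools import product

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_main_effects(counts, iterations=50):
    """Poisson log-linear fit with an intercept and one main effect per list.
    counts maps an inclusion pattern such as (1, 0) to its observed count."""
    rows = [[1.0] + [float(bit) for bit in pattern] for pattern in counts]
    y = [float(count) for count in counts.values()]
    dim = len(rows[0])
    beta = [math.log(sum(y) / len(y))] + [0.0] * (dim - 1)
    for _ in range(iterations):
        mu = [math.exp(sum(b * x for b, x in zip(beta, row))) for row in rows]
        # Newton step: (X' W X) delta = X' (y - mu), with W = diag(mu)
        info = [[sum(w * r[i] * r[j] for w, r in zip(mu, rows)) for j in range(dim)]
                for i in range(dim)]
        score = [sum((yi - mi) * r[i] for yi, mi, r in zip(y, mu, rows))
                 for i in range(dim)]
        beta = [b + d for b, d in zip(beta, solve(info, score))]
    return beta

# Step 1: with four lists there are 2**4 - 1 = 15 observable patterns.
observable_patterns = [p for p in product((0, 1), repeat=4) if any(p)]

# Steps 2-5 on illustrative two-list counts: 2 on both lists, 2 only on
# list 1, 4 only on list 2 (so list 1 has 4 people and list 2 has 6).
counts = {(1, 1): 2, (1, 0): 2, (0, 1): 4}
beta = fit_main_effects(counts)
unobserved = math.exp(beta[0])              # step 4: intercept is the log missing count
total = sum(counts.values()) + unobserved   # step 5: observed plus estimated missing
```

For two lists the main-effects model is saturated, so the fit reproduces the simple ratio estimate (4 × 6 / 2 = 12); with three or four lists the same function leaves degrees of freedom for assessing fit, and interaction columns would be added to `rows` to model list dependence.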


The results of this analysis are presented in Table 2. The stars next to the coefficients indicate their significance levels: one star indicates p < 0.05; two stars indicate p < 0.01; and three stars indicate p < 0.001. Looking first at the first two columns, the only significant variables were the regional dummies in the second column. As a result, we were unable to find an association between NATO activity and/or KLA activity and deaths. We therefore reject the hypotheses that NATO or KLA activity was a plausible cause of Kosovar Albanians' deaths.

The second two columns examine the relationships of the explanatory vari-

terns of killing and migration. To do this, we needed more data.

NATO and KLA data of reasonably high quality were available to us. Yugoslav press and government sources published information on NATO attacks contemporaneously with the airstrikes, documenting when and where the attacks occurred. We counted the reports, tabulating them by municipality and date. Information on KLA activity was obtained from interview accounts and a variety of non-governmental reports summarized and provided to us by the ICTY. Using that information, we counted the number of reported battles between the KLA and Yugoslav forces occurring in each municipality over time. Isolated KLA attacks that resulted in the injury, disappearance, or deaths of ethnic Serbs were also tabulated by the number of reported casualties. We were unable, however, to obtain data on Yugoslav army activity independent of interactions with the KLA.

To test the hypotheses that KLA or NATO activity (or both) caused migrations and killings of ethnic Albanians, we created several regression models in which the number of killings or migrations served as a response variable, and explanatory variables were formed from the KLA and NATO data. The unit of analysis was either the time point or the region of Kosovo (N, S, E, W) by time point. The explanatory variables used were as follows:

• Dummies indicating the region (for region by time);

• The number of reported Serb casualties caused by conflict with KLA forces (KLA-kill);

• The number of reported conflicts between Serb and KLA forces (KLA-battle);

• The number of reported NATO airstrikes (NATO);

• The number of reported Serb casualties caused by conflict with KLA forces in the previous period (lag-KLA-kill);

• The number of reported conflicts between Serb and KLA forces in the previous period (lag-KLA-battle); and

• The number of reported NATO airstrikes in the previous period (lag-NATO).
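The lagged variables in the list above are the same counts taken from the previous period. A minimal sketch of building them for one region's time-ordered rows follows; the field names (`kla_kill`, `kla_battle`, `nato`) and the example counts are hypothetical stand-ins for the variables named in the list.

```python
def add_lags(rows, fields=("kla_kill", "kla_battle", "nato")):
    """rows is a time-ordered list of dicts for one region. Returns new dicts
    with lag_<field> set to the previous period's value (None for the first)."""
    lagged = []
    previous = None
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left unchanged
        for field in fields:
            new_row["lag_" + field] = previous[field] if previous is not None else None
        lagged.append(new_row)
        previous = row
    return lagged

# Hypothetical counts for two consecutive periods in one region.
rows = [
    {"period": 1, "kla_kill": 0, "kla_battle": 1, "nato": 0},
    {"period": 2, "kla_kill": 2, "kla_battle": 0, "nato": 3},
]
with_lags = add_lags(rows)
```

The first period has no predecessor, so its lagged values are left as `None`; a regression on these rows would have to drop or otherwise handle that period.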


Table 2 - Regression Coefficients

Explanatory         Killings        Killings over      Refugee flow          Refugee flow over
Variables           over time       time and region    over time             time and region

Region 2                            *-52.3 (20.4)                            -721.4 (665.3)
Region 3                            *-57.0 (21.3)                            **3,017.8 (1,048.9)
Region 4                            -34.4 (25.2)                             -193.9 (86~.5)
KLA (kill)          -1.1 (5.2)      1.6 (4.1)          *-634.9 (3Ut9)        -184.2 (130.8)
KLA (battle)        34.7 (32.1)     13.3 (12.2)        2,728.6 (1,030.5)     **1,879.3 (583.2)
Lag-KLA (kill)      0.2 (4.4)       3.3 (3.4)          491.9 (384.4)         277.9 (167.1)
Lag-KLA (battle)    21.2 (17.7)     11.6 (11.9)        ***2,794.3 (827.7)    ***2,138.4 (633.7)
NATO                10.9 (11.1)     11.4 (6.7)         327.6 (390.3)         565.8 (379.7)
Lag-NATO            -4.8 (6.9)      -2.5 (4.1)         -28.0 (325.5)         29.2 (234.0)
Constant            83.9 (51.1)     **62.7 (19.9)      122.3 (3,933.2)       -484.8 (608.8)
R²                  0.3             0.1                0.7                   0.5

ables with refugee flow. Within the regression models, NATO activity has no association with refugee movement. There are, however, significant relationships between KLA killings (marginally) and battles (more strongly) and refugee flow. Separate analyses of the relationship between refugee flow and the explanatory variables, however, suggested that the significance of this relationship is specific to the North and East regions of Kosovo only. We therefore made a qualified finding that there is some evidence of an association between KLA activity and migration patterns in the northern and eastern


A Short Introduction to Multiple Systems Estimation

The origin of multiple systems estimation, capture-recapture, goes back at least to the late nineteenth century as a technique for counting fish populations, and was later extended to multiple captures for other wildlife and human populations. The capture-recapture estimation technique uses two separately collected but incomplete lists of a population to estimate the total population size. A simple example appears in the Venn diagram below.

The basic assumption used to estimate the population size is that the ratio of the number of people captured in both list 1 and list 2 to the number of people captured in list 1 is equal to the ratio of the number of people captured in list 2 to the number of people in the entire population. In this example, where 4 people are captured in list 1, 6 in list 2, and 2 in both, this means:

$$\frac{2}{4} = \frac{6}{N},$$

where N, the total number of people, is unknown. The estimated value of N is then taken as the closest integer to the solution for N in this equality, or

$$\hat{N} = \frac{4 \times 6}{2} = 12.$$

There are many assumptions embedded in this simple ratio solution. Several of these are quite logical (i.e., the individuals on a list have been randomly sampled from the population, individuals aren't moving in or out of the population between the creation of the lists, a list never has the same individual listed twice, and the matching across lists is accurate). Another assumption is that there is no dependency between the lists; in other words, that the probability that an individual is captured in list 2 is not dependent on whether or not that individual is captured in list 1. A final assumption is homogeneity: that each individual in the population has an equal probability of capture in a given list.

If any of these assumptions are violated, capture-recapture may not accurately estimate the population size. If there are additional lists available for the population, however, problems such as list dependency can be addressed via modeling. Using the population from the simple example above, we add a third list in the following Venn diagram:

In this case, we can model for dependencies between the three lists via log-linear models (with constraints) of the following form (where the subscript i (or j or k) for a count is 0 if it does not include people from list 1 (or 2 or 3), and 1 if it does):

$$\log m_{ijk} = u + u_{1(i)} + u_{2(j)} + u_{3(k)} + u_{12(ij)} + u_{13(ik)} + u_{23(jk)}$$

If we believe there are no list dependencies, we use just the first four terms on the right hand side of this equation. If we believe there is only dependency between the first and third list, we add that interaction term. In this way, a reduced model can be created (one that contains fewer parameters). Fitting the data from this simple example to the model with a list 1 & 3 interaction term yields an estimated population size of 12.

Other violations of assumptions can also be addressed via modeling or, in certain situations, can be or are (cautiously) ignored. Further details appear in Appendix 2 of the 2001 Ball et al. report listed in the Additional Reading section of this article.
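The calculations in this sidebar can be sketched in a few lines of Python. The first function solves the two-list ratio for N using the sidebar's own numbers (4 on list 1, 6 on list 2, 2 on both). The second handles three lists when only lists 1 and 3 are allowed to be dependent. Rather than fitting a log-linear model numerically, it uses a closed-form shortcut that we believe is equivalent for this particular model: treat "on list 1 or list 3" as one combined list and apply the two-list ratio against list 2. That shortcut, the function names, and the three-list counts are all assumptions made for illustration. The counts are made up to agree with the two-list totals and are not the figures from the article's diagram.

```python
def two_list_estimate(n1, n2, both):
    """Solve both / n1 = n2 / N for N, the two-list capture-recapture estimate."""
    if both <= 0:
        raise ValueError("the two lists must share at least one person")
    return n1 * n2 / both


def three_list_estimate_dep13(counts):
    """Estimate N from three lists when only lists 1 and 3 may be dependent.

    counts[(i, j, k)] is the number of people with i, j, k equal to 1 if they
    appear on list 1, 2, 3 respectively and 0 if not. The cell (0, 0, 0),
    people on no list, is unobserved and must not be supplied.
    """
    only_list2 = counts[(0, 1, 0)]
    # People on list 1 and/or list 3, split by whether list 2 also caught them.
    missed_by_2 = counts[(1, 0, 0)] + counts[(0, 0, 1)] + counts[(1, 0, 1)]
    caught_by_2 = counts[(1, 1, 0)] + counts[(0, 1, 1)] + counts[(1, 1, 1)]
    if caught_by_2 <= 0:
        raise ValueError("list 2 must overlap with list 1 or list 3")
    unseen = only_list2 * missed_by_2 / caught_by_2  # estimated size of cell (0, 0, 0)
    return sum(counts.values()) + unseen


# Two-list example from the sidebar: 4 on list 1, 6 on list 2, 2 on both.
print(round(two_list_estimate(4, 6, 2)))  # 12

# Hypothetical three-list counts (made up for illustration only).
hypothetical = {
    (1, 0, 0): 1, (0, 1, 0): 2, (0, 0, 1): 2,
    (1, 1, 0): 1, (1, 0, 1): 1, (0, 1, 1): 2,
    (1, 1, 1): 1,
}
print(round(three_list_estimate_dep13(hypothetical)))  # 12
```

With these made-up counts, 10 people are observed on at least one list and the estimated number missed by all three lists is 2.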

regions. However, this association does not fully explain the pattern of migration, especially in the western and southern regions. With this qualification, we concluded that the overall effect of KLA activity and NATO airstrikes does not much change the killing and refugee flow patterns.

But what about the third hypothesis, that the migrations and deaths were due to a systematic governmental campaign? As we explained before, there were no data available to us that detailed Yugoslav government force deployments or movements during the period in question. We


A Practical Joke

Obtaining the data on migrations of ethnic Albanians out of Kosovo was more of an adventure than either Fritz or Patrick first assumed it would be. When Patrick returned to the Kosovo-Albania border in June of 1999, accompanied by translator Ilir Gocaj, he discovered that the Albanian border guards had withdrawn about 500 meters from their original post. There had been shooting just a few days earlier, and a Chilean journalist had been seriously wounded by the gunfire. Patrick asked about the border records and was told that they had been abandoned at the old post.

The Albanian border guards made a point of stating that the walk between the new and old border posts would expose Patrick and Ilir to potential sniper fire. Using his honed logic skills, Patrick quickly concluded that if they would be in danger en route to the old border post, then the data itself might be in danger as well! Patrick and Ilir decided to risk the walk to the old post, and a border guard accompanied them.

The old post was in shambles due to a Serbian attack that occurred after the Albanian border guards had withdrawn. Some sort of small explosive had damaged the windows, and the precious border records were scattered on the ground, mixed with bits of broken glass and splintered wood. In spite of their ominous surroundings, Patrick, Ilir, and the border guard carefully gathered together the sheets of records that they could find and searched for stray fragments of paper. They then bravely carried the border records back to the safety of the new Albanian border post.

Upon their safe return, Patrick and Ilir found the Albanian guards laughing. It turns out that they had been the butt of a practical joke. The Serbs on the Kosovo side of the border had been cut off for several weeks from their cigarette suppliers by the NATO bombing, and the Albanian guards had been filling the gap for them. You can understand, the guards told Patrick and Ilir, that the Serbs might shoot journalists (they nodded) but certainly they would not shoot us, their only source of nicotine. Patrick and Ilir nodded again, dazed at having been so easily and thoroughly hoaxed.


could not, therefore, directly test the hypothesis that Yugoslav forces were responsible for ethnic Albanians' deaths and migration. However, on the evening of Tuesday, April 6, official Yugoslav sources announced that Yugoslav forces would observe a unilateral cease-fire beginning at midnight, April 7, in respect of Orthodox Easter. The holiday itself fell on Sunday, April 11, but the cease-fire was to begin earlier. At 3 AM on April 7, the Kosovo-Albania border closed, and there was no refugee movement over the border until Saturday, April 10 (people were leaving their homes during this period, but at a very low rate). Similarly, killings fell from very high daily rates in early April nearly to zero during the period April 7-10. The sequence we observe is thus that the announced unilateral Yugoslav cease-fire is followed within hours by the nearly complete halt to killings and migration. During the same four-day period April 7-10, however, KLA and NATO activity increased dramatically relative to the previous four-day period. The increase highlights the unilateral nature of the cease-fire - it was not observed by NATO or by the KLA. So while NATO and KLA activity was increasing, killings and migration were at their lowest points since the conflict's beginning.

The coincidence of the Orthodox Easter cease-fire and the drastic reduction in killings and migration does not prove that Yugoslav forces were responsible for the violations. However, the findings of the analysis we performed are consistent with this hypothesis. Other evidence presented by the prosecution provides additional support for the claim that Yugoslav forces conducted a systematic campaign of ethnic cleansing against Albanians in some parts of Kosovo.

What Did the Statisticians Prove?

In the end, after more than two years of effort - including trips to Albania and a significant contribution to the gross profits of local caffeine suppliers - our work does not prove that Slobodan Milosevic orchestrated systematic war crimes in Kosovo in 1999. The significance in our work, instead, lies in what it disproves. Given the results of our study, key hypotheses which might support the defendant's innocence are simply not plausible.

The trial of Slobodan Milosevic will continue for many months; only at its conclusion when the judges render their decision will we learn what weight they chose to give our analysis. Our hope is that Exhibit 67 will help them to determine who is truly responsible for the suffering of the Kosovar Albanians during the spring of 1999.

Additional Reading

Ball, Patrick, Wendy Betts, Fritz Scheuren, Jana Dudukovic, and Jana Asher. 2002. "Killings and Refugee Flow in Kosovo, March-June 1999: A Report to the International Criminal Tribunal for Former Yugoslavia." Washington, DC: American Association for the Advancement of Science.

Ball, Patrick. 2000. Policy or Panic: The Flight of Ethnic Albanians from Kosovo, March-May 1999. Washington, DC: American Association for the Advancement of Science.

Human Rights Watch. 2001. Under Orders: War Crimes in Kosovo. New York: Human Rights Watch.

Organization for Security and Cooperation in Europe. 1999. Kosovo/Kosova As Seen, As Told: An Analysis of the Human Rights Findings of the OSCE Kosovo Verification Mission October 1998 to June 1999. Warsaw, Poland: OSCE Office for Democratic Institutions and Human Rights.
