66
Page | 1 REPORT – Version March 2013 A systematic review of the sensitivity and specificity of symptom- and chest-radiography screening for active pulmonary tuberculosis in HIV-negative persons and persons with unknown HIV status. A.H. van’t Hoog, M.W. Langendam, E. Mitchell, F.G. Cobelens, D. Sinclair, M.M.G. Leeflang, K. Lonnroth.

REPORT – DRAFT 2

  • Upload
    voliem

  • View
    216

  • Download
    0

Embed Size (px)

Citation preview

Page 1: REPORT – DRAFT 2

Page | 1

REPORT – Version March 2013 A systematic review of the sensitivity and specificity of symptom- and chest-radiography screening for active pulmonary tuberculosis in HIV-negative persons and persons with

unknown HIV status. A.H. van’t Hoog, M.W. Langendam, E. Mitchell, F.G. Cobelens, D. Sinclair, M.M.G. Leeflang, K. Lonnroth.

Page 2: REPORT – DRAFT 2

Page | 2

Summary

Background: Screening for active pulmonary tuberculosis is increasingly recommended as a strategy for early detection of TB, reducing mortality, morbidity and transmission, and improving equity in access to care. In screening, the target population is systematically examined with a screening tool to identify people with suspected active TB, who should be tested with a confirmative diagnostic test. The accuracy of the screening tool is an important consideration in the design of a screening program and determines, in combination with the confirmatory tests accuracy, the yield of a screening program and the burden on individuals and the health service. Questioning patients for symptoms and chest radiography (CXR) are the screening tools most widely available. Objectives: The primary objective was to assess the sensitivity and specificity of questioning for presence of one or more symptoms, CXR abnormalities, and combinations of symptoms and CXR as screening tools for detecting bacteriologically confirmed active pulmonary tuberculosis in HIV-negative persons and persons with unknown HIV status considered eligible for TB screening. The secondary objective was to investigate sources of heterogeneity. Search Methods: The data bases MEDLINE, EMBASE, LILIACS and HTA (Health Technology Assessment) were searched from 1992-2012, supplemented by reference lists of relevant reviews and studies, websites of the World Health Organization (WHO) Stop TB department, and expert consultation for relevant studies and unpublished reports. Selection Criteria: Eligible studies were selected for inclusion using broad criteria: (i) the publication be original research; and (ii) titles, abstracts, or key words suggest that symptom or CXR screening, or active case finding for TB took place. Full text articles of these studies were obtained and assessed for study eligibility using the predefined inclusion and exclusion criteria. Data collection and analysis: Of 3596 identified studies, the full text was retrieved of 187 and 24 publications were included reporting on 17 studies. Of those, 11 studies were conducted in the general adult population and six evaluated screening in specific sub-groups. All but three studies had one or more concerns related to methodological quality. Results: For studies conducted in the general population meta-analysis using a bivariate model was performed, to obtain pooled estimated of sensitivity and specificity for the following screens:

Screen Reference standard

Population HIV prevalence/ region

No. of studies Sensitivity [95% CI] Specificity [95% CI]

CXR any abnormality Bact+ Combined 3 0.978 [0.951; 1.00] 0.754 [0.720; 0.788]

CXR TB abnormality Bact+ Combined 4 0.868 [0.792; 0.945] 0.894 [0.867; 0.920]

Prolonged Cough Bact+ High HIV /SSA* 4 0.492 [0.389; 0.597] 0.923 [0.891; 0.956]

Prolonged Cough Bact+ Low HIV /Asia 4 0.247 [0.176; 0.317] 0.963 [0.947; 0.979]

Prolonged Cough Bact+ Combined 8 0.351 [0.244; 0.457] 0.947 [0.925; 0.968]

Prolonged Cough Sm+ Combined 4 0.578 [0.417; 0.738] 0.922 [0.868; 0.976]

Any cough Bact+ Combined 6 0.627 [0.493; 0.761] 0.775 [0.655; 0.895]

Any Symptom Bact+ High HIV /SSA* 4 0.842 [0.756; 0.927] 0.740 [0.531; 0.949]

Any Symptom Bact+ Low HIV / Asia 4 0.698 [0.579; 0.818] 0.606 [0.347; 0.866]

Any Symptom Bact+ Combined 8 0.770 [0.680; 0.860] 0.677 [0.502; 0.851] *SSA= sub-Saharan Africa Bact+=bacteriologically positive When screening with CXR and symptoms in parallel, the addition of prolonged cough to CXR increased sensitivity by 0-9% and decreased specificity by 2-5%. Of CXR as a 2nd screen after symptom screening, reported in one study, sensitivity was 0.90 [0.81; 0.96] and specificity 0.56 [0.54; 0.58].

Page 3: REPORT – DRAFT 2

Page | 3

Conclusion: CXR screening had greater accuracy compared to symptoms screening. Of the symptom screens, prolonged cough had low sensitivity and high specificity. The sensitivity of screening for ‘any TB symptom’ was higher but specificity low, and considerable heterogeneity. The pooled estimates of symptom and CXR screening can inform the choice of screening and diagnostic algorithms, but should be interpreted with caution due to the small number of studies, methodological differences, and heterogeneity.

Page 4: REPORT – DRAFT 2

Page | 4

Background

Target condition being diagnosed Tuberculosis (TB) is one of the world’s most important infectious causes of morbidity and mortality among adults. An estimated one-third of the world’s population is infected with TB. In 2010, there were 8.8 million new, and 12.0 million prevalent cases of TB; 1.1 million TB deaths occurred among HIV-uninfected people and an additional 0.35 million deaths among HIV-infected people [1]. Tuberculosis is caused by Mycobacterium tuberculosis complex, and most commonly affects the lungs. In humans M.tuberculosis is responsible for most of the disease and spreads through airborne transmission [2]. Patients with active tuberculosis spread bacilli, most commonly through cough. After initial infection approximately 5% develop active tuberculosis, while 90-95% develop a latent infection which may reactivate at a later stage, more so in the presence of conditions that affect immunity (HIV infection, undernutrition, old age, etc) [3]. In the absence of diagnosis and treatment, people with active TB may stay infectious for prolonged periods. In HIV-negative people with active TB the average duration until self-cure or death is 3 years, and case fatality with no treatment is approximately 70% for smear-positive and 20% for smear-negative TB [4]. It has been estimated that only 65% of incident TB cases were detected globally in 2010 [1]. Most TB cases are detected passively, i.e. among symptomatic people seeking care [5]. Passive case detection can involve considerable delay and is associated with higher morbidity [6–8]. Improving TB case detection is important to reduce incidence, prevalence and mortality of tuberculosis, which are the goals of TB control [9]. Recent prevalence surveys have revealed a considerable burden of culture positive smear negative TB, and a minority of those cases report classical symptoms [10–13]. Screening The Strategic and Technical Advisory Group TB (STAG-TB) recommends that the World Health Organization (WHO), working with partners, develops guidelines on TB screening [14]. The WHO has defined screening as “the presumptive identification of unrecognized disease or defect by the application of tests, examinations, or other procedures which can be applied rapidly. Screening tests sort out apparently well persons who probably have a disease from those who probably do not, and are not intended to be diagnostic. Persons with positive or suspicious findings must be referred to their physicians for diagnosis and necessary treatment [15]. For the purpose of guideline development, TB screening is defined as “systematic identification, in a predetermined target group, of people with suspected active TB, by the application of tests, examinations, or other procedures which can be applied rapidly” and these people should be tested with a confirmative diagnostic test. Screening could be offered to both those who seek health care (with or without symptoms/signs compatible with TB) and those who do not. Screening is offered systematically to predetermined groups, and not only in response to a specific request or complaint by an individual seeking care [16]. The two main goals of systematic screening for active TB are (1) Better health outcomes for people with TB, through earlier detection and treatment; and (2) More effective reduction of TB transmission and incidence, through shortening the average duration of TB infectiousness [16]. Index Tests This review focuses on symptom and chest radiography (CXR) screening. In symptom screening individuals are questioned about the presence of one or more symptoms that are considered to be suggestive of pulmonary tuberculosis, which are respiratory symptoms like persistent cough and haemoptysis, and systemic symptoms like weight loss, fever, night sweats and fatigue [17]. Chest radiography as a screening tool involves having participants undergo one posterio-anterior CXR recording. Different technologies exist: conventional CXR (producing a 36*43cm film), digital radiography, and mass miniature radiography (MMR) [18]. CXR classification systems may distinguish between any abnormality versus normal, or among abnormal CXRs only abnormalities suggestive of TB may qualify as a positive screen [19]. The latter requires interpretation by

Page 5: REPORT – DRAFT 2

Page | 5

specialist readers (usually radiologists or pulmonologists), while presence for ’any abnormality’ can more easily be interpreted by health workers with a general medical background (e.g. medical officers, clinical officers, radiographers) [20,21]. Screening may be done with either symptom or CXR screening, or with symptom and CXR screening combined in parallel or sequentially (Figure 1) [22]. Sequential (or serial) screening means that in the first step persons are screened for symptoms, and as a second step, CXR screening is offered only to symptom positives (Figure 1). Parallel screening implies that both symptom and CXR screening are offered, and persons found to have symptoms and/or abnormalities on chest X-ray are eligible for further examination. This is for instance practiced in TB prevalence surveys, in order to have as high sensitivity as possible, while at the same time avoiding the need for laboratory investigation on all study subjects [21]. Clinical pathway In a TB screening program, the screening test(s) are offered as part of a diagnostic algorithm that also includes one or more confirmatory tests. Individuals with a positive screen are offered further confirmatory testing to establish a TB diagnosis, with any of the tests mentioned below as reference tests or a new test that may become available. True screen positives are people rightfully referred for confirmatory testing, and false screen positives are people who are referred for confirmatory testing while they do not have TB. They may or may not be ruled out by the confirmatory test. Individuals with a positive screen, but negative confirmatory test would not necessarily be declared disease-free, but may be advised on further examination or follow up if warranted by the actual finding on screening (e.g. severity of symptoms or the CXR finding). Persons with a negative screen would not be further evaluated. This group includes both the true screen negatives who do not have TB, and false screen negatives, who will not be evaluated further although they do have TB. Reference Test Of the available tests with high specificity, mycobacterial culture followed by mycobacterium speciation is considered the reference test for bacteriologically confirmed tuberculosis. Culture on liquid medium is considered to be the most sensitive. Prior to the availability of automated reading of mycobacterial growth inhibitor tubes (MGIT culture), culture on solid medium (Löwenstein-Jensen [LJ]) had been the mainstay, and may still be the only available method in resource constrained settings. MGIT culture increases the recovery of mycobacteria by 11-18% compared to LJ culture but MGIT culture alone may have slightly lower specificity due to higher contamination rates [23–26]. The yield of mycobacterial culture also increases if two or three specimens per patient are tested [27]. Sputum smear-microscopy is the most commonly available TB diagnostic test. Compared to culture, the sensitivity of the Ziehl-Neelsen method [ZN] shows wide variation, and is between 50-70% in a majority of studies [28]. The specificity of direct ZN microscopy is 98% (95% CI 97-99) [28–30] The sensitivity of auramine-stained fluorescence microscopy [FM] is on average 10% higher than of ZN, but with slightly reduced specificity [28]. Processing sputum by use of centrifugation and various chemicals, including bleach and NaOH, show varying levels of increase in the sensitivity of microscopy compared with the direct smear method, and similar or slightly lower specificity [29,30]. Compared to culture, the sensitivity of the nucleic acid amplification tests (NAAT) Xpert MTB/RIF test is 92%, and specificity 99% in smear-positive and smear-negative patients combined [31]. Although sputum culture is considered the gold standard for demonstrating the presence of M. tuberculosis, culture negative active TB is characterized by clinical disease and highly suggestive CXR abnormalities [17]. Clinical diagnosis, which is commonly practiced, is however not considered as a reference test, since it has poor accuracy [32]. Serological tests, which are not recommended for TB diagnosis [33] and other tests that are not endorsed by WHO for TB diagnosis are not considered as a reference test for this review. Rationale The accuracy of the screening tools and the confirmatory tests, as well as the TB prevalence in the screened population, will determine the potential yield of a screening program and the burden on individuals and the health service. The latter includes the required amount of confirmatory tests and possibly diagnostics and care for other conditions. This systematic review of the literature aims to contribute to the development of TB screening guidelines, by generating estimates of the

Page 6: REPORT – DRAFT 2

Page | 6

sensitivity and specificity of symptoms, chest radiography (CXR) and combinations of those if used as TB screening tools. The TB screening guidelines aim to provide guidance to decision makers on the choice of diagnostic algorithms [combinations of one or more screening test(s) and confirmatory test(s)] in different populations and settings. Therefore as part of the guideline development process, the yield, positive and negative predictive value and diagnostic cost of different diagnostic algorithms will be calculated for different levels of TB prevalence. This information should help decision makers to choose the best diagnostic algorithm option for their specific setting, taking into account the TB prevalence, resource availability and logistical aspects (e.g. are X-ray or Xpert MTB/RIF equipment at all available). The pooled estimates of sensitivity and specificity of symptom and CXR screening from this review are a prerequisite for these calculations and recommendations. This review covers TB screening of HIV-negative persons and persons with unknown HIV status (part of whom may be HIV-infected). Individuals with a known HIV-positive status are expected to be referred for HIV-care and treatment, which includes periodic TB screening. A systematic review to determine a screening rule in HIV-infected persons has recently been published [34]. Objectives Primary Objective To assess the sensitivity and specificity of questioning for presence of one or more symptoms, chest radiography, and combinations of symptoms and chest radiography as screening tools for detecting bacteriologically confirmed active pulmonary tuberculosis in persons considered eligible for TB screening. Secondary Objectives To investigate heterogeneity, for as far as the data will allow, in relation to the background epidemiology, risk groups targeted, reference standard, representativeness of the study population for intended screening practice, study participants characteristics, geographic area and economic region. METHODS We searched online data bases MEDLINE, EMBASE, LILACS and HTA (Health Technology Assessment) from 1992-2012, up till 2nd July 2012, to identify titles and abstracts of peer-reviewed papers that met criteria for initial review. Search terms included combinations of three domains: (i) “tuberculosis” and related terms, (ii) terms related to “screening”, “survey”, “sensitivity”, “specificity”, and (iii) search terms related to the reference standard, “bacterial culture”, “microscopy”. A complete list of search terms for each database is in Appendix 1.

In addition we checked reference lists of relevant reviews and studies, searched the websites of the World Health Organization (WHO) Stop TB department and asked experts for relevant studies and still unpublished reports. Unpublished reports were only included if permission was granted by the investigators.

The titles and abstracts of the records that remained after electronic removal of duplicates were independently reviewed by two reviewers, to identify records that were possibly eligible for inclusion in the next review stage. In this screening phase studies were selected for inclusion using broad criteria: (i) that the publication be original research; and (ii) titles, abstracts, or key words suggest that symptom or CXR screening, or active case finding for TB took place in humans (including TB prevalence surveys) and data to determine accuracy of a screening tool may be available. If a determination could not be made at this stage, title and abstracts were included for further review. We excluded studies that only identified persons with TB infection and not active TB, studies in chest clinics and other clinical settings among patients who came to a health facility due to illness and were already considered a TB suspects by prevailing guidelines for passive case detection, and studies in HIV-infected persons only. Case control studies were excluded because of the risk of bias in diagnostic accuracy studies [35]. We included studies that screened participants for the first time or only once, as well as reports from longitudinal screening programs with

Page 7: REPORT – DRAFT 2

Page | 7

repeated rounds at predetermined intervals. The primary focus of the review is on adults (15 years and older). Studies that combine adults and children are included. Studies focusing on young children (0-5 years old) or pediatric TB only were excluded1 because the clinical presentation of tuberculosis in especially young children differs from the presentation in adults and older children. Extra-pulmonary disease is more common. If the lungs are affected, young children have more often paucibacillary disease, and obtaining sputum is difficult. Clinical presentation including X-ray findings are often part of the reference standard [36,37].

Full text articles were obtained and assessed for study eligibility using the predefined inclusion and exclusion criteria by one reviewer. The 2nd reviewer reviewed a 10% sample, and all records with any doubts. Full text articles were included if published in English, German, Dutch, French, Portuguese, Spanish, Japanese, Swedish, Norwegian and Danish, and if diagnostic two-by-two tables could be generated for a specific screen (symptom definition and/or chest radiography finding), i.e. studies that reported data from which we could extract true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). Index tests Studies on symptom screens were selected if one or more author-defined symptoms or symptom combinations were evaluated. For CXR screening, studies were included that used conventional radiography (i.e. large films, chemical development), digital radiography, or computed radiography (which is an ‘upgrade’ that allows the production of digital radiographs by conventional X-ray equipment). Studies evaluating MMR only were excluded, since the method is not expected to be used for future screening purposes and has generally lower sensitivity compared to conventional or digital radiography. With respect to the classification of abnormalities, all author-defined classification systems were included. Studies evaluating CXR screening in a population that was pre-screened with symptoms were included, since there may be relevance in accuracy estimates of CXR as a second screen in a sequential screening algorithm. In such an algorithm persons would first be offered symptoms screening, and if positive CXR screening. If the CXR screen is also positive, confirmatory testing would be offered. Target condition The target condition is bacteriologically confirmed active pulmonary tuberculosis, characterized by the presence of M. tuberculosis in sputum, and some suggestion of active disease, e.g. a symptom, CXR abnormality, and/or repeated positive sputum cultures. A positive sputum culture or smear at one point in time in a person without any symptom or CXR abnormality may reflect transient primary MTB infection or laboratory cross contamination [38], and neither are primary targets of screening programs. If studies had defined persons with one positive sputum culture only but without symptoms or CXR abnormalities as a bacteriologically positive TB case they were not excluded but rated differently in the assessment of methodological quality. Studies evaluating tests to investigate for the presence of latent TB infection only were excluded. Reference standard For inclusion the reference standard was any author-defined combination of mycobacterial culture (on solid or liquid medium), and/or sputum smear microscopy, and/or Xpert MTB/RIF. Studies in which not all participants received the reference standard were included. This is a common design in TB prevalence surveys [21] whereby it is assumed that persons with a negative symptom screen and a negative CXR screen do not have TB. It should be noted that especially if only prolonged cough and TB suggestive CXR abnormalities are used to screen, the proportion of TB cases identified in such surveys overestimate sensitivity and reflect a relative yield rather than sensitivity. The estimate of specificity from prevalence surveys with partial verification bias from this type of design is however more reliable, due to the large numbers of TB-negatives in prevalence surveys. Sources of Heterogeneity

1 Studies that included young children (0-5 years old) or pediatric TB only were not upfront excluded from the search, but were marked and were reviewed separately. The details are in appendix 2.

Page 8: REPORT – DRAFT 2

Page | 8

To be able to assess sources of heterogeneity, we collected information (if available) on TB prevalence and notification rates, HIV prevalence in the study population, the type of risk group targeted (e.g. migrants, occupational, prisoners, or general population), age and sex of the study population, the prevalence of smoking, and the geographic area.

Data extraction Data were extracted by one reviewer and entered in an Excel database through pre-designed electronic forms. The second reviewer checked the extract of all included articles. Discrepancies between the database and the publication were corrected. Assessment of methodological quality The methodological quality of included studies was assessed by two reviewers using the modified Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) instrument [39,40]. The tool (see Appendix 3) comprises four domains: patient selection, index test, reference standard, and flow and timing. Each domain is assessed in terms of risk of bias, and the first three domains are also assessed in terms of concerns regarding applicability. Signaling questions are included to help judge risk of bias. They are answered as “yes,” “no,” or “unclear” and are phrased such that “yes” indicates low risk of bias. Risk of bias is judged as “low,” “high,” or “unclear.” If the answers to all signaling questions for a domain are “yes,” then risk of bias can be judged low. If any signaling question is answered “no,” potential for bias exists. Applicability sections are structured in a way similar to that of the bias sections but do not include signaling questions. Concerns about applicability are rated as “low,” “high,” or “unclear.” Because we perceived a large contrast between “low” and “high” we added a category “moderate” for the applicability sections. The “unclear” category is used only when insufficient data are reported to permit a judgment.

The methodological quality assessment was done by discussing all items with two researchers.

The results of the QUADAS assessment should not be used to generate a summary “quality score” because of the problems associated with such scores. If a study is judged as “low” on all domains relating to bias or applicability, then it is appropriate to have an overall judgment of “low risk of bias” or “low concern regarding applicability” for that study. If a study is judged “high” or “unclear” in 1 or more domains, then it may be judged “at risk of bias” or as having “concerns regarding applicability.”

Data analysis Diagnostic two-by-two tables were generated, from which sensitivities and specificities for each index test with 95% confidence intervals were calculated and presented in paired forest plots for each study. In addition, a Receiver Operating Characteristic (ROC) plot of sensitivity versus 1-specificity was used to display the data for each test. If studies showed sufficient clinical homogeneity (e.g. same index test, similar definition of tuberculosis, similar screening population) a meta-analysis of pairs of sensitivity and specificity was done by the use of bivariate random-effects methods. The bivariate model was chosen because we deal with binary decisions for which an implicit threshold is assumed. Forest plots and ROC curves were generated in Review Manager 5.1. The bivariate model was developed in SAS 9.1.

Quality of the evidence The quality of the evidence on accuracy of symptom and chest X ray screening was assessed by applying the GRADE methodology [41].

RESULTS From the online database search 3596 titles and abstracts were identified after (electronic) removal of duplicates (Figure 2). Fifty-eight studies with a focus on young children (0-5 years old) or pediatric TB only were taken to a separate database. Of those, only one study evaluated a screening tool in HIV-negative children [42] and is described in appendix 2.

Page 9: REPORT – DRAFT 2

Page | 9

Of the 225 titles and abstracts that we identified to obtain full text to further assess eligibility, 4 appeared duplicates, 8 were conference abstracts [43–50], 11 were ineligible due to language (Chinese [51–60] or Russian [61]). Of 14 the full text could not be retrieved [62–75], in none of the latter the title or abstract directly suggested an evaluation of a symptom or CXR screening tool.

We further assessed the full text of 188 publications, of which 167 were not eligible for inclusion (Figure 2). The most common reason for exclusion was that the study design did not allow evaluation of the sensitivity and specificity of a screening test, usually because one screening method was used (often symptoms) and only screen positives were further evaluated by a reference test (n=101, 60%). In 32 (19%) publications the data required to make a diagnostic 2 by 2 table were not or insufficiently reported. The full list of references and their reasons for ineligibility is available from the first author.

Out of the online database search 21 publications were identified for inclusion [10,11,76–94]. In addition three publications were identified through the WHO website and the authors [13,95,96]. Those 24 publications reported on 17 studies. The general characteristics of these studies are shown in Table 1. Eleven were community prevalence surveys [13,84,86,92–103]; all 11 reported on one or more symptom screens and from five studies data on CXR screening could be obtained [13,85,86,86,95,96,102,104]. Two studies were in attendants of a voluntary counseling and testing center (VCT), where TB screening was offered to both HIV-negative and HIV-positive clients. Of those studies one reported on a symptom screen [105], and one on CXR in persons pre-selected on a history of cough [106]. One study was among residents of a homeless shelter and presented a symptom screen [88]; one study was in immigrants providing data on CXR [107]; one study was in a general outpatient setting, providing data on CXR in a pre-selected population that was normally not considered a TB suspect [108], and one study was in an occupational setting providing data on symptom screening [89]. The latter study involved annual medical examination of miners of whom 27% had some signs of silicosis, and was the only study in which repeated screening at predetermined intervals was common. The studies in miners and in homeless included only males. All but two studies involved adults only, either from 15 or 18 years and older. One study included persons from 10 years and above, and one study from one year and older. The studies with children <15 years of age did not provide sufficient data to disaggregate the results by age. Information on smoking was reported in five studies only, and ranged from 1.8% to 28.7% of the study population reporting current smoking [13,92–94,100]. Eleven studies were in low income countries, two in lower-middle-income countries (India, Vietnam), four in upper-middle-income (all South-Africa) and one in a high-income country (USA).

The results are presented for the study populations and index tests, as follows.

Population Screen No of studies

A. General population 1. CXR any abnormality 3 2. CXR abnormalities suggestive of TB 5 3. CXR and symptom screening in parallel 5 4. CXR as a 2nd screen 1 5. Prolonged cough 8 6. Cough of any duration 7 7. Any TB Symptom 8 B. Special populations 1. VCT Cough 1 Any Symptom 1 CXR 1 2. Homeless Cough 1 3. Migrants CXR 1 4. Outpatient CXR 1 5. Occupational - Miners Cough 1 Any Symptom 1

Page 10: REPORT – DRAFT 2

Page | 10

Tables A1-7, B1-5 and Figures B1-5 in the Tables and Figures section follow the same order and present the QUADAS ratings for the studies followed by a forest plot and a summary ROC curve if applicable. The QUADAS tables also show the screen definition applied in each study, the proportion of participants that had a bacteriological examination, and whether the HIV prevalence in the study population was high or low. Although HIV prevalence was not exactly reported in some studies, in low HIV prevalence populations HIV prevalence was less than 3% and in high HIV-prevalence populations HIV prevalence was in the order of 10% or above. The studies in populations with high HIV prevalence were all from sub Saharan Africa (SSA) and were also the studies with the highest proportions of participants verified by the reference standard. The studies in populations with low HIV prevalence were all from Asian countries. Information on smoking was provided for a minority of studies only, but generally in three of the included Asian countries smoking was more prevalent than in the SSA countries [109]. Therefore, the effects of these sources of heterogeneity could not be separated and the distinction between ‘high-HIV / SSA’ and ‘low-HIV / Asia’ reflects more than HIV-prevalence or geographic region alone. Two studies from high HIV populations reported accuracy for the HIV-positive and HIV-negative study population separately. In sensitivity analysis, the effect of including these HIV-negative populations on the pooled estimate of the low HIV prevalence stratum was assessed. Some studies reported many different symptom screens including questions about single symptoms other than cough (e.g. fever) or combinations that were explored by the authors in their analysis, but were uncommon in other studies. Those are not included in the report.

A. GENERAL POPULATION

CXR screening Five prevalence surveys provided data on CXR screening in the general population, of which three reported on any CXR abnormality (Table A1) and five on abnormalities suggestive of TB (Table A2). With respect to the assessment of methodological quality (Tables A1, A2) of three studies the risk of bias related to interpretation of the reference test could not be assessed, and all five studies had a high risk of bias in the flow and timing domain. The participants were selected for sputum culture examination based on CXR findings and symptom screening in all except the study by den Boon et al. Therefore partial verification bias may result in an overestimation of sensitivity in these studies. The extent may differ depending on the symptom selection criteria. In the Myanmar and Cambodia surveys the symptom screen to select persons for sputum examination included only prolonged cough. In the study by den Boon et al. verification bias was also likely, since 55% of the participants who had a CXR made did not provide sputum. In the authors’ report the participants who did not provide sputum were excluded from the analysis [indicated in the tables as version (a)]. The excluded participants had significantly fewer symptoms and fewer CXR abnormalities, which results in overestimation of specificity, and limits the applicability for screening practice. We therefore recalculated the accuracy of the reported screens, and included participants without sputum examination, for as far as the reported data allowed, assuming that all persons who had a CXR but did not give sputum did not have TB [indicated in the tables as version (b)]. This makes the specificity more realistic for a screening practice. The direction of bias in the sensitivity estimate is less predictable. Two studies had moderate applicability concerns related to the definition of a TB case. Two studies were in a moderate to high HIV prevalence population in SSA and three in Asian countries with low HIV prevalence (Tables A1, A2), but the results were not stratified due to the small number of studies. 1. Any Abnormality Of the three studies reporting on ‘Any CXR Abnormality’ we used the CXR reading that was done in the field by medical personnel with a general background in the Myanmar and Kenya surveys. The

Page 11: REPORT – DRAFT 2

Page | 11

study by den Boon et el. reported on expert readers who read the films at a later point in time. Later reading for any abnormality by expert readers gave, compared to the field reading, a lower sensitivity and higher specificity in the Myanmar survey (94% sensitivity, 84% specificity) and in the Kenya survey, where rereading by two experts of a sample of the CXRs had 81% and 83% sensitivity, and 80% and 72% specificity for each expert respectively. The accuracy of the three studies was homogeneous (Figure A2) and the pooled estimate is presented below. As explained above, verification bias may have resulted in overestimation of sensitivity in all three studies. ANY CXR ABNORMALITY - SUMMARY ESTIMATES Sensitivity Specificity

(n=3) 0.978 [0.951; 1.00] 0.754 [0.720; 0.788] Stratification and sensitivity analyses not done 2. Abnormalities suggestive of TB Of the five studies reporting ‘Abnormalities suggestive of TB’ CXR reading for abnormalities suggestive of TB was done by experts at a later point in time in the surveys in Myanmar, South Africa and Kenya. In Kenya a random sample of the survey CXRs of participants without TB was reread. In the Vietnam and the Cambodia survey reading for TB abnormalities was in the field, and by physicians with general training in Vietnam. The qualifications of the readers in Cambodia were not specified. The accuracy reported in the five studies was homogeneous. The summary estimates are presented below. Of the two results from Kenya, the result of expert 1 was maintained since it was the most conservative. As described above the risk of verification bias was likely the strongest in the studies from Vietnam and Cambodia, since CXR abnormalities suggestive of TB were – besides symptom screening- used as the CXR screen to select survey participants for culture examination, while the other studies selected persons with any CXR abnormality or all participants for culture examination. In sensitivity analysis the effect was assessed of excluding these studieswith the lowest proportion of participants ascertained by mycobacterial culture (Vietnam and Cambodia surveys (Figure A2.ii), These summary estimates of sensitivity (0.868) and specificity (0.894) are presented as they are more conservative. CXR ABNORMALITIES SUGGESTIVE OF TB - SUMMARY ESTIMATES Stratification / included studies Sensitivity Specificity All (n=5) Kenya expert 1 0.900 [0.837; 0.963] 0.914 [0.820; 0.945] All (n=5) Kenya expert 2 0.901 [0.834; 0.967] 0.922 [0.896; 0.948] Excluding the Cambodia and Vietnam study (lowest proportion of participants verified by mycobacterial culture) (n=3)

0.868 [0.792; 0.945] 0.894 [0.867; 0.920]

2.1. Smear-positive TB as the reference standard The sensitivity and specificity of CXR with smear-positive TB as the reference standard could be obtained from two of the above mentioned studies (Figure A2.1). Within each study the sensitivity and specificity were the same or almost the same as when bacteriologically-positive PTB was the reference standard. In the study by den Boon et al. data to adjust calculations for persons who did not give sputum were not available for the smear-positive reference standard (referred to as version b), so only results that exclude persons without sputum examination could be used for this subgroup. We therefore consider the same pooled estimates for CXR accuracy regardless whether the reference standard is all bacteriologically positive TB or smear-positive TB only. 3. Symptom and/or CXR (parallel screening) Of the studies that provided data on CXR screening we compared CXR screening alone with a combined CXR-symptom screen that would be considered positive if either the CXR screen was positive, or a prolonged cough screen was positive (Figure 1 option C) Table A3). The study by den Boon et al. did not provide sufficient data to obtain accuracy of such a screen, but in this study adding symptoms did not result in the identification of more TB cases. Of the other four studies two reported CXR screening for any abnormality, and one study CXR screening for abnormalities

Page 12: REPORT – DRAFT 2

Page | 12

suggestive of TB. The addition of cough lowered specificity by (an absolute) 2-5%. The SROC plot (Figure A3) relates CXR to symptom/CXR screen for each study. Adding cough to the CXR screen increased sensitivity by (an absolute) 3-9% in 3 studies, and reduced sensitivity in one (Table A3). The reduction in sensitivity is explained by different denominators: to calculate the accuracy of CXR screening all persons who had a CXR were included in the denominator. For the combined symptom-CXR screen all participants and all cases identified in the survey were included. Cases were also identified among persons who did not go for CXR and did not meet the prolonged cough criteria. In the Myanmar and Vietnam studies the sensitivity of the parallel screen would be 100% if only persons with CXR and prolonged cough would be included in the denominator, since that was the prevalence survey screening method. 4. CXR as a 2nd screen, in persons pre-screened with symptoms Data on CXR as a second screen in sequential screening (Figure 1 option D), i.e. in persons who first screen positive on a symptom screen were only reported in studies in the VCT and OPD populations (see paragraph B1 and B5). In the Kenya study, recalculation of published data gave a sensitivity of 90% [81; 96] for any CXR abnormality in persons with “TB suggestive symptoms” (Cough >7 days or hemoptysis or 2 out of 3 of fever, night sweats (each >7 days), weight loss resulting in a changed fit of cloths), and 56% [54; 58] specificity.

SYMPTOM SCREENING

5. Prolonged cough Eight studies provided data on prolonged cough, of which five had data on cough for two or more weeks and three on cough for three or more weeks only. With respect to the assessment of methodological quality (Table A5), all but two studies had a high risk of bias in the flow and timing domain, and for three of these in addition the risk of bias related to the reference standard was unclear. For three studies the applicability of the reference standard was of moderate concern, since the case definition included persons with a positive culture or smear but absent or minimal clinical signs. The summary estimates of sensitivity and specificity were significantly different in the SSA populations with high HIV prevalence compared to the Asian countries that all had low HIV prevalence. As explained above, this distinction implies more than population HIV prevalence alone. The summary estimates for the high HIV prevalence/SSA, low HIV prevalence/Asian subgroups and of all eight studies combined are presented below. Excluding the two studies that used MMR to preselect (part of) the survey participants for culture (Datta et al., Hoa et al.), and studies with the lowest percentage of study participants verified by the reference standard (Hoa et al., Cambodia survey), the summary estimates combining SSA /high HIV and Asia /low HIV prevalence were 41.8% [95% CI 28.5; 55.0] for sensitivity, and 93.8% [95% CI 60.4; 97.1] for specificity. This contrasts with the expectation that studies with likely more verification bias overestimate sensitivity. If the estimates of the HIV-negative population in the two African studies that provided separate data by HIV-status were added to the Asia/low HIV populations, the summary estimate for low HIV populations increased slightly. In those two studies, in the HIV-negative population the sensitivity of cough for 3 or more weeks was 36% [21; 54] and specificity 94% [93; 95] (Ayles et al.), and of cough for 2 or more weeks in the study by Corbett et al. 45% [27; 64] and 98% [97; 98] respectively. To model the yield of diagnostic algorithms, the overall summary estimate was used.

PROLONGED COUGH - SUMMARY ESTIMATES Stratification / included studies Sensitivity Specificity High HIV prevalence population / SSA (n=4) 0.492 [0.389; 0.597] 0.923 [0.891; 0.956] Low HIV prevalence population / Asia (n=4) 0.247 [0.176; 0.317] 0.963 [0.947; 0.979] HIV-negative population in Ayles, Corbett et al. and studies in low HIV prevalence populations (n=6)

0.28 [0.19; 0.37] 0.96 [0.95; 0.98]

Low and high HIV combined (n=8) 0.351 [0.244; 0.457] 0.947 [0.925; 0.968]

Page 13: REPORT – DRAFT 2

Page | 13

Low and high HIV combined , exclude low proportion with reference standard and/or MMR pre-screening (excl. Hoa et al., Cambodia survey, Datta et al.; n=5)

0.418 [0.285; 0.550] 0.938 [0.604; 0.971]

5.1 Prolonged cough with smear-positive TB as the reference standard. Four studies (included in the studies reporting prolonged cough) also provided data from which the accuracy of prolonged cough could be obtained if smear-positive TB was considered the reference standard (Figure A7). For each of the four studies the specificity of prolonged cough did not differ by reference standard. The sensitivity of prolonged cough to identify smear-positive TB was respectively 13, 14, 21 and 21% higher than for all bacteriologically positive PTB as the reference standard (median/mean 17%). The pooled sensitivity of prolonged cough with smear-positive TB as the reference standard, multiplied by an assumed sensitivity of 0.6 for smear-microscopy [28–30] compared to culture gives almost the same value (0.347) as the pooled sensitivity of prolonged cough with bacteriologically positive TB as the reference standard (0.351), suggesting that distortion of the pooled estimates by heterogeneity may be limited.

PROLONGED COUGH WITH SMEAR-POSITIVE TB AS REF STANDARD - SUMMARY ESTIMATE Stratification / included studies Sensitivity Specificity Low and high HIV combined (n=4) 0.578 [0.417; 0.738] 0.922 [0.868; 0.976] 6. Any cough Seven studies reported on cough of any duration of which five were also included in the prolonged cough analysis. Two studies, by Wood et al. and by Sebatu et al., only reported on cough of any duration only. The latter, a prevalence survey from Eritrea, used smear-microscopy only as the reference standard. Of two other studies the risk of bias related to the reference standard was unclear (Table A6) and three studies had a high risk of bias in the flow and timing domain. Generally cough of any duration had higher sensitivity and lower specificity than prolonged cough. In the bivariate model the pooled estimates differed by HIV prevalence, but less so than the prolonged cough screen. After excluding the study with smear-positive TB as the reference standard, the sensitivity did not differ by the HIV co-variate. The pooled estimate includes studies with bacteriologically positive TB only. In three of the seven prevalence surveys participants were selected for sputum culture examination based on CXR abnormalities or on a symptom screen. In those studies some persons who reported cough of any duration were not subjected to the full reference standard, which may have underestimated the actual sensitivity of the screen.

ANY COUGH - SUMMARY ESTIMATES Stratification / included studies Sensitivity Specificity High HIV prevalence population / SSA (n=4) 0.621 [0.419; 0.824] 0.815 [0.692; 0.939] Low HIV prevalence population / (n=3) 0.519 [0.277; 0.760] 0.767 [0.598; 0.935] HIV-negative population in Ayles, Corbett and studies in low HIV prevalence populations (n=5)

0.565 [0.404; 0.725] 0.830 [0.713; 0.946]

Low and high HIV combined (n=7) 0.567 [0.396; 0.739] 0.795 [0.693; 0.898] Low and high HIV combined, excluding smear-positive reference standard (n=6)

0.627 [0.493; 0.761] 0.775 [0.655; 0.895]

7. Any TB Symptom Eight studies provided data on ‘any TB symptom’ as a symptom screen. The number of symptoms and durations of each symptom that would qualify differed across studies, from four to eight symptoms. Cough, haemoptysis, fever, night sweats and weight loss were the most common. The Myanmar study also asked for fatigue (Table A7). The issues related to methodological quality (Table A7) are similar as for prolonged cough as seven out of eight were the same studies. The summary estimates of sensitivity and specificity were significantly different in populations with a high HIV/SSA vs. low HIV prevalence/ Asian studies. As mentioned before, this distinction implied

Page 14: REPORT – DRAFT 2

Page | 14

besides difference in HIV prevalence also differences in continent reflecting geographic, demographic and cultural factors, smoking prevalence etc.. The studies from sub-Saharan African had the highest proportions of participants verified by the reference standard. The pooled estimates are listed below. In fiv of the eight prevalence surveys participants were selected for sputum culture examination based on CXR abnormalities or on a symptom that was more restricted than ‘any TB symptom’. In those studies not all persons who reported any symptom were subjected to the full reference standard. This may have underestimated the actual sensitivity of the screen, but has less effect on the specificity due to the large numbers of TB-negatives in prevalence surveys. SUMMARY ESTIMATES – ANY SYMPTOM Stratification / included studies Sensitivity Specificity High HIV prevalence population / SSA (n=4) 0.842 [0.756; 0.927] 0.606 [0.347; 0.866] Low HIV prevalence population / Asia (n=4) 0.698 [0.579; 0.818] 0.740 [0.531; 0.949] HIV-negative population in Ayles et al., Corbett et al. and studies in Asian low HIV prevalence populations (n=6)

0.723 [0.628; 0.819] 0.726 [0.538; 0.913]

Low HIV/Asia and high HIV/SSA combined (n=8) 0.770 [0.680; 0.860] 0.677 [0.502; 0.851] Low HIV/Asia and high HIV/SSA combined, excluding low proportion with reference standard and/or MMR pre-screening (Gopi et al., Cambodia survey, Datta et al.) n=5

0.796 [0.687; 0.905] 0.611 [0.391; 0.831]

B. OTHER POPULATIONS

Six studies were included that evaluated screening tools in groups other than the general population: in VCT clients, in residents of a homeless shelter, in migrants, in outpatients who were not considered a TB suspect by the prevailing definition, and one occupational cohort of gold miners. The results of the QUADAS assessment and the sensitivity and specificity are shown in tables B1-B5. Since there was only one study per screen per population, pooling of estimates was not applicable.

1. Attendants of a voluntary counseling and HIV-testing centre (VCT) Two studies evaluated a screening tool in clients of a VCT center (Tables B1i, B1ii). TB screening and sputum culture were offered to clients regardless of their HIV status. In the study by Chheng et al. symptom screens were evaluated. VCT clients who declined TB screening (about half) were not included, which may have biased the accuracy estimates. Symptomatic persons may have been more likely to go for VCT, and for TB examinations as a next step, which could explain the higher sensitivity and lower specificity of cough and ‘any TB symptom’ screening compared to the general population. The study by Burgess et al. provided data on any CXR abnormality in VCT clients who had a cough of any duration. The estimate may be biased because it was unclear how pre-screening was done, and CXR was not specifically done as a screening CXR, but was part of the TB diagnostic work up. The sensitivity of ‘Any CXR abnormality’ was 100% [92; 100] and specificity 53% [45; 60], in the same order as the accuracy of ‘Any CXR abnormality’ as a 2nd screen in the general population (paragraph A4).

2. Residents of a homeless shelter One study evaluated a screening tool in a homeless population in the USA (Tables B2i, B2ii). All participants were male. Screening for ‘cough of any duration’ and sputum culture on 1 sample were offered to all participants. Compared to the general population, the sensitivity of the screen is low and the specificity high in this population. This may be explained by underreporting of symptoms, and possibly by the small sample size (n=127; 4 TB cases). For the reference standard

Page 15: REPORT – DRAFT 2

Page | 15

one sputum sample only was collected, which may also contribute to low sensitivity since the probability of a positive culture increases when more samples are cultured [27].

3. Occupational setting – gold miners One study (Lewis et al. 2009) evaluated several screening tools in males working in a South African gold mine (Tables B3i, B3ii). The gold miners were screened for tuberculosis on an annual basis, including with CXR (MMR). Sensitivity of the symptoms screens was considerably lower compared to symptom screening in the general population, and specificity was high. The low sensitivity may be because in longitudinal screening programs with screening rounds at regular/predetermined intervals, the proportion of persons with advanced disease due to delay in getting diagnosed reduces due to the early case finding. High specificity may be explained by the healthy worker bias (a low proportion of persons reporting symptoms due to other diseases, compared to the general population), and the definition of the reference standard that avoided diagnosing TB based on a single positive sputum sample only.

4. Migrants One study (Mor et al., 2012) provided data on CXR screening in Ethiopians preparing for migration to Israel. This study included some children of 1 year and older (Tables B4i, B4ii). Only 1% of the study population was however verified by the full reference standard, and the specificity of CXR abnormalities suggestive of TB was very high (99%), suggesting that CXR was used as a diagnostic rather than a screening tool. The classification included only CXR abnormalities suggestive of TB. TB cases will have been missed among persons who were not eligible for sputum culture. The applicability of the results of this study for screening as anticipated by the TB screening guidelines is limited.

5. Outpatients One study (Banda et al., 1998) provided data on CXR screening in outpatients of a large hospital in Malawi. The study included OPD attendants with a short duration of cough (1-2 weeks) who were not considered a TB suspect, which would require cough>3 weeks in that setting. The risk of bias was high, since the study did not clearly describe how patients were referred within the hospital for inclusion in the study. Although the study formally met the inclusion criteria, the TB prevalence (35% of participants) was much higher than in other studies and reflects a population of TB suspects identified by passive case detection rather than by screening. This is a concern for applicability to other outpatient settings. The applicability of the index test is also a concern: the CXR classification 'consistent with TB' was intended more as a diagnostic rather than a screening tool, which may explain the low sensitivity 26% [13; 44%] compared to other studies.

DISCUSSION This review provided summary estimates of the sensitivity and specificity of symptom- and CXR- screening for active tuberculosis, obtained from 11 prevalence surveys that were conducted in general populations with a high TB burden. These pooled estimates, summarized below, can be used to estimate the expected yield of diagnostic algorithms for the purpose of TB screening guideline development, and may guide the design of screening programs. Another six studies provided data on symptom and/or CXR screening in special populations, each in one specific population.

Page 16: REPORT – DRAFT 2

Page | 16

Screen Reference standard

population HIV prev

No. of studies

Sensitivity [95% CI]

Specificity [95% CI]

CXR any abnormality Bact+ combined 3 0.978 [0.951; 1.00] 0.754 [0.720; 0.788] CXR TB abnormality Bact+ combined 4 0.868 [0.792; 0.945] 0.894 [0.867; 0.920] Prolonged Cough Bact+ high 4 0.492 [0.389; 0.597] 0.923 [0.891; 0.956] Prolonged Cough Bact+ low 4 0.247 [0.176; 0.317] 0.963 [0.947; 0.979] Prolonged Cough Bact+ combined 8 0.351 [0.244; 0.457] 0.947 [0.925; 0.968] Prolonged Cough Sm+ combined 4 0.578 [0.417; 0.738] 0.922 [0.868; 0.976] Any cough Bact+ combined 6 0.627 [0.493; 0.761] 0.775 [0.655; 0.895] Any TB Symptom Bact+ high 4 0.842 [0.756; 0.927] 0.740 [0.531; 0.949] Any TB Symptom Bact+ low 4 0.698 [0.579; 0.818] 0.606 [0.347; 0.866] Any TB Symptom Bact+ combined 8 0.770 [0.680; 0.860] 0.677 [0.502; 0.851]

Bact+ = bacteriologically positive ; Sm+ = smear-positive Symptom screening Of the symptom screens, prolonged cough i.e. cough for two or more weeks had the lowest pooled sensitivity (35%) and highest specificity (95%). Screening for prolonged cough in a programmatic setting would limit the resource needs for confirmatory tests, but leave a majority of the TB cases undetected. Screening for ‘any TB symptom’ i.e. presence of one or more symptoms out of a list of 4-7 had the highest sensitivity (77%) but lower specificity (68%). The sensitivity and specificity of a cough of any duration (‘any cough’) were in between those of prolonged cough and ‘any TB symptom’. The studies on symptom screening showed considerable heterogeneity. The sensitivity of prolonged cough ranged between 14 and 54 % (between 9-73% when considering the extremes of the 95% confidence intervals). The specificity of prolonged cough varied less, between 87-98%. Studies that reported a higher sensitivity of prolonged cough generally showed a lower specificity. Screening for ‘any cough’ and ‘any TB symptom’ showed considerable heterogeneity across the studies for both sensitivity and specificity. In the three studies that reported the highest sensitivity when screening for ‘any TB symptom’ (between 84 and 90%), the specificity was between 32% and 42% [110–112]. For programmatic settings this would imply that up to two-thirds of the screened population would require further confirmatory testing. Some of the heterogeneity was explained by the HIV prevalence in the study population. The studies in the general population that reported stratified results for the HIV-positive and HIV-negative study participants [100,110] showed higher sensitivity and lower specificity of all symptom screens in the HIV-positive participants. This may be explained by a generally higher prevalence of symptoms and ailments in HIV-infected persons, and by a faster progression of TB-disease implying that the time period that persons with TB have no or only mild symptoms is shorter in HIV-infected compared to HIV-uninfected persons [113]. Our results also showed higher sensitivity and lower specificity of symptom screening in populations with high HIV prevalence compared to the populations with low HIV prevalence. However, the stratification by HIV-prevalence in the analysis comprised more than population HIV prevalence alone, since the studies in populations with high HIV prevalence were all from sub-Saharan Africa and had the highest proportions of participants verified by the reference standard. The studies providing data on prolonged cough and ‘any TB symptom’ in low HIV-prevalence populations were all from Asia. Information on smoking was reported by a minority of studies, but in three of the Asian countries smoking is more prevalent than in the SSA countries [109]. Differences between SSA and Asian populations in age structure of the population, cultural aspects affecting responses to symptom questionnaires, and difference in access to health care in general may also have contributed to the observed heterogeneity. Other likely sources of heterogeneity were differences in the actual questions and number of symptom questions that were asked, the fact that symptoms in general are subjective, and different definitions of the reference standard. The small number of studies that was available for each screen and the coinciding risk factors limited further exploration of the

Page 17: REPORT – DRAFT 2

Page | 17

effect of different sources heterogeneity on the pooled estimates. The implication of heterogeneity for screening programs is however that the expected sensitivity and specificity of a symptom screening tool in a specific setting cannot be predicted with great precision. CXR screening Compared to symptom screening, CXR screening generally showed greater accuracy and less heterogeneity. The pooled sensitivity of CXR reading was higher than of the symptom screens, 98% for ‘any CXR abnormality’, with 75% specificity, and 87% pooled sensitivity for ‘TB abnormalities’, with 89% specificity. The high pooled sensitivity of ‘any CXR abnormality’ should be interpreted with some caution. The number of studies providing data on CXR screening was low, in all studies verification bias may have resulted in overestimation of sensitivity, and in two of the three studies the results were from field readers in prevalence surveys who were encouraged to over-read. The possible effect of verification bias from not performing bacteriological examinations on persons who did not report the required symptoms and had normal CXRs was assessed by the investigators of the Myanmar survey. All CXRs were read in the field and read again by expert readers at a central level at a later point in time. Through multiple imputation the investigators adjusted for missed cases and assumed that CXRs classified as abnormal by the field readers, but normal at central level, were normal, and that the probability of being a TB case among persons with this type of discrepant CXR reading was the same as among persons with a normal CXR during field reading. Before imputation, the sensitivity of ‘any CXR abnormality’ at central reading was 91% and after imputation the sensitivity was estimated to 89% (Nyamada, preliminary data presented at “WHO TB screening meeting” in May 2012). A limitation of this approach was that in general the agreement between CXR readers is modest [104,114], and differences between field and central readers may also be due to chance. We were unable to assess the possible effect of factors that may cause heterogeneity on the pooled estimates for CXR reading, due to the low number of studies. Older age and smoking are associated with increased prevalence of CXR abnormalities in general [13,115] , and may affect the performance of CXR screening in different programmatic settings. In HIV-infected persons bacteriologically confirmed TB in the presence of a normal or atypical CXR is more common than in HIV-negative persons, and a slightly lower sensitivity of CXR reading may be expected in populations with a high HIV-prevalence [116,117]. The definitions of what was considered an abnormality or a TB abnormality were not specified in all studies, and if specified differed slightly across studies. When studies presented ‘any CXR abnormality’ results for different types of readers, we assumed that field reading conditions of prevalence surveys, where CXRs readers have a general medical background, were more representative for a screening program setting than CXR reading by expert readers at a later point in time. In many high TB burden countries expert readers may not be available for deployment in large scale screening programs. Field reading for ‘any CXR abnormality’ had higher sensitivity, but lower specificity compared to expert readers [[13,104]]. In programmatic conditions, limiting the number of persons requiring confirmatory tests will be a concern, so a higher specificity and thus in more ‘restrictive reading’ may be encouraged. Reading for ‘TB abnormalities’ had a pooled specificity of 89%. The accuracy of TB abnormalities obtained from the included studies was based on expert readers and cannot be assumed to be the same when readers with a general medical background will be expected to classify abnormalities as suggestive for TB or not. Inter-reader variability is also a concern for the generalizability of the summary estimates to screening programs, since reader agreement is less for CXRs of persons who do not have TB [20]. Computer aided classification of digital radiographs (CAD) is a recent development that can overcome some of these issues [118]. As mentioned the low number of studies that could be included in the review, and methodological concerns were the main limitations of this review, resulting in mostly low quality of the evidence. Although there were considerably more studies that described TB screening, the majority was not eligible because screening was done with one screening tool only (e.g. cough, or CXR), and the

Page 18: REPORT – DRAFT 2

Page | 18

screen-negatives were not further evaluated with a reference test, which precludes evaluation of the accuracy of the screening tool. Of the 11 prevalence surveys conducted in the general population, only three had sputum culture examinations of all participants [94,97,100]. None of those evaluated CXR screening. All other studies included verification bias by design because symptom and CXR screening were applied to select persons for culture examination. The proportion of participants verified by culture varied between 8-32%. Persons without sputum culture examination were included in the analysis and assumed not to have TB. This design on one hand avoids that persons with a positive sputum culture, but without clinical signs of disease are considered a TB case. On the other hand this design results in overestimation of sensitivity [35] if the symptom and CXR screening methods that were applied in the survey had low sensitivity. However, specificity is less affected, due to the large number of TB-negatives in prevalence surveys. Since several countries are conducting or planning to conduct prevalence surveys [1] a few more studies can be expected over time. The ideal study in which a sufficiently large sample of participants are all evaluated by a bacteriological reference standard and by symptom questions and CXR is however expected to remain a rare event due to the large resource requirements. Two evaluations of screening strategies from prevalence surveys reported 95% confidence intervals that were adjusted for the cluster sampling design [93,102], which were slightly wider for some symptom screens compared to the binomial exact confidence intervals used in the analysis in Review Manager (e.g. 0.41-0.63 rather than 0.43-0.61). In the meta-analysis the design effect was ignored, since the effect on the 95% confidence intervals of the pooled estimates was expected to be very small. The pooled estimates only apply to populations of which the majority were screened for the first time. In screening programs the accuracy of the tools may change if screening is repeated over time, due to an increase in the proportion of patients with mild/early disease. The accuracy of annual screening in the gold mines was considerably lower [89], but the population was very specific in age, sex and general health status as well. For a screening program the choice of screening method will depend on several considerations. Symptom screening may be an attractive choice since it does not involve expensive equipment and exposure to radiation. When considering an algorithm (Figure 1) with one of the screening tools described in this review, followed by a confirmatory test that had 90% sensitivity and 98% specificity, our pooled estimates would imply that if a population of 100 000 persons with a TB prevalence of 2% (i.e. 2000 TB cases) would be screened, screening for prolonged cough (35% sensitivity, 95% specificity) would require 5600 persons to undergo confirmatory testing, 32% (630) of the true TB cases would be detected, and 98 persons would receive a false positive TB diagnosis, reflecting a positive predictive value (PPV) of the whole diagnostic algorithm of 87%. CXR screening for TB abnormalities would result in 12 520 persons requiring confirmatory testing, detection of 78% (1566) of the TB cases, and 216 persons would receive a false positive TB diagnosis, reflecting a PPV of the whole algorithm of 88%. When screening for ‘any TB symptom’ 32 900 persons would require confirmatory testing, 69% (1386) of the TB cases would be detected, and 627 persons would receive a false positive TB diagnosis, reflecting a PPV of the whole algorithm of 69%. A low specificity of the screening method would not only result in a high proportion of persons requiring confirmatory testing, but also result in a high proportion of false positives among the persons receiving a TB diagnosis if the TB prevalence in the screened population is low. If the TB prevalence were 0.5%, screening for ‘any TB symptom’ would detect 347 of 500 true TB cases, and 634 persons would receive a false positive TB diagnosis, reflecting a PPV of the diagnostic algorithm of 35%. The PPV could improve if the confirmatory test has very high specificity, e.g. 99.8%, the false positive TB diagnoses would be 64, reflecting a PPV 84%. Sequential screening (Figure 1D), i.e. first applying e.g. symptoms screening for ‘any TB symptom’ followed by CXR screening for ‘TB abnormalities’ if the symptom screen is positive will also improve the specificity of the algorithm, but at some loss of sensitivity.

Page 19: REPORT – DRAFT 2

Page | 19

Conclusion In conclusion, the pooled estimates of symptom and CXR screening can inform the choice of screening and diagnostic algorithms, but should be interpreted with some caution due to the small number of studies, methodological concerns, and heterogeneity. Symptom screening is less accurate and more variable compared to CXR screening. The expected sensitivity and specificity of especially symptom screening in a specific setting cannot be predicted with great precision. When the TB prevalence in the screened population is low screening tools with low specificity may potentially lead to a high proportion of false diagnoses among the persons diagnosed with TB, unless the specificity of the confirmatory test is very high. Evaluation of the performance of screening tools under programmatic conditions is important. A simple screening test with accuracy in the order of CXR screening for TB abnormalities but without the need for CXR equipment would be highly desirable asset to facilitate TB screening as a way to improve TB case detection.

Reference List

1. World Health Organization (2010) global tuberculosis control: who report 2010.

2. Lawn SD, Zumla AI (2011) Tuberculosis. 378: 57-72. Epub 2011 Mar 21.;Seminar.

3. Rieder HL (1999) Tuberculosis. Etiologic epidemiology: risk factors for disease given that infection has occurred. In: Epidemiologic basis of tuberculosis control. Paris: International Union Against Tuberculosis and Lung Disease. pp. 63-119.

4. Tiemersma EW, van der Werf MJ, Borgdorff MW, Williams BG, Nagelkerke NJ (2011) Natural history of tuberculosis: duration and fatality of untreated pulmonary tuberculosis in HIV negative patients: a systematic review. 6: e17601.

5. Golub JE, Mohan CI, Comstock GW, Chaisson RE (2005) Active case finding of tuberculosis: historical perspective and future prospects. 9: 1183-1203.

6. Sreeramareddy CT, Panduru KV, Menten J, Van den Ende J (2009) Time delays in diagnosis of pulmonary tuberculosis: a systematic review of literature. 9: 91.

7. den Boon S, Verver S, Lombard CJ, Bateman ED, Irusen EM, Enarson DA, Borgdorff MW, Beyers N (2008) Comparison of symptoms and treatment outcomes between actively and passively detected tuberculosis cases: the additional value of active case finding. 136: 1342-1349.

8. van't Hoog AH, Marston BJ, Ayisi JG, Agaya JA, Muhenje O, Odeny LO, Hongo J, Laserson KF, Borgdorff MW (2011) Risk Factors for Inadequate TB case finding in Kenya: a Comparison of Prevalent and Self-reported TB Patients.

9. World Health Organization (2006) The Stop TB Strategy; Building on and enhancing DOTS to meet the TB-related Millenium Development Goals.

10. Ayles H, Schaap A, Nota A, Sismanidis C, Tembwe R, de Haas P, Muyoyeta M, Beyers N, Godfrey-Faussett P (2009) Prevalence of tuberculosis, HIV and respiratory symptoms in two Zambian communities: Implications for tuberculosis control in the era of HIV. PLoS ONE 4. http://dx.doi.org/10.1371/journal.pone.0005602.

11. Corbett EL, Bandason T, Cheung YB, Makamure B, Dauya E, Munyati SS, Churchyard GJ, Williams BG, Butterworth AE, Mungofa S, Hayes RJ, Mason PR (2009) Prevalent infectious tuberculosis in Harare,

Page 20: REPORT – DRAFT 2

Page | 20

Zimbabwe: burden, risk factors and implications for control. Int J Tuberc Lung Dis 13: 1231-1237. Research Support, Non-U.S. Gov't.

12. van't Hoog AH, Laserson KF, Githui WA, Meme HK, Agaya JA, Odeny LO, Muchiri BG, Marston BJ, DeCock KM, Borgdorff MW (2011) High prevalence of pulmonary tuberculosis and inadequate case finding in rural western Kenya. Am J Respir Crit Care Med 183: 1245-1253. Research Support, Non-U.S. Gov't.

13. Ministy of Health DoHGoM (2012) Report on National TB Prevalence Survey 2009-2010, Myanmar.

14. Stop TB Department World Health Organization. (2011 June) Draft meeting report of the Scoping meeting for the development of guidelines on screening for active TB. World Health Organization, Geneva, Switzerland. Available.

15. Wilson J, Jungner G (1968) Principles and practice of screening for disease. In: Originally appearing in: Commission on Chronic Illness (1957) Chronic illness in the United States: Volume I. Prevention of Chronic Illness, Cambridge, Mass., Harvard University Press, p.45). Geneva, Switzerland: WHO.

16. Lonnroth K (2012) DRAFT: Systematic screening for active tuberculosis. Proposed principles and recommendations. Proposed outline and draft principles and recommendations for discussion in the technical consultation to review evidence on screening for active TB 23-24 May 2012.

17. Maher D (2009) Clinical features and index of suspicion in adults (HIV-negative and HIV-positive). In: Tuberculosis. A comprehensive clinical reference. Philadelphia: Saunders. pp. 164-168.

18. Kerley P, Major T (1942) Technique in Mass Miniature Radiography. The British Journal of Radiology 15: 346-347.

19. den Boon S, White NW, van Lill SW, Borgdorff MW, Verver S, Lombard CJ, Bateman ED, Irusen E, Enarson DA, Beyers N (2006) An evaluation of symptom and chest radiographic screening in tuberculosis prevalence surveys. 10: 876-882.

20. van't Hoog AH, Meme HK, van DH, Mithika AM, Olunga C, Onyino F, Borgdorff MW (2011) High sensitivity of chest radiograph reading by clinical officers in a tuberculosis prevalence survey. Int J Tuberc Lung Dis 15: 1308-1314.

21. World Health Organization (2010) Tuberculosis Prevalence Surveys: a handbook, second edition (pre-publication copy). World Health Organization.

22. Hayen A, Macaskill P, Irwig L, Bossuyt P (2010) Appropriate statistical methods are required to assess diagnostic tests for replacement, add-on, and triage. J Clin Epidemiol 63: 883-891. S0895-4356(09)00306-0 [pii];10.1016/j.jclinepi.2009.08.024 [doi].

23. Whitelaw A, Sturm WA (2009) Microbiological testing for Mycobactrium tuberculosis. In: Schaaf HS, Zumla AI, editors. Tuberculosis. A comprehensive clinical reference. Philadelphia: Saunders. pp. 164-168.

24. Chien HP, Yu MC, Wu MH, Lin TP, Luh KT (2000) Comparison of the BACTEC MGIT 960 with Lowenstein-Jensen medium for recovery of mycobacteria from clinical specimens. Int J Tuberc Lung Dis 4: 866-870.

25. Hanna BA, Ebrahimzadeh A, Elliott LB, Morgan MA, Novak SM, Rusch-Gerdes S, Acio M, Dunbar DF, Holmes TM, Rexer CH, Savthyakumar C, Vannier AM (1999) Multicenter evaluation of the BACTEC MGIT 960 system for recovery of mycobacteria. J Clin Microbiol 37: 748-752.

Page 21: REPORT – DRAFT 2

Page | 21

26. Somoskovi A, Kodmon C, Lantos A, Bartfai Z, Tamasi L, Fuzy J, Magyar P (2000) Comparison of recoveries of mycobacterium tuberculosis using the automated BACTEC MGIT 960 system, the BACTEC 460 TB system, and Lowenstein-Jensen medium. J Clin Microbiol 38: 2395-2397.

27. Monkongdee P, McCarthy KD, Cain KP, Tasaneeyapan T, Nguyen HD, Nguyen TN, Nguyen TB, Teeratakulpisarn N, Udomsantisuk N, Heilig C, Varma JK (2009) Yield of acid-fast smear and mycobacterial culture for tuberculosis diagnosis in people with human immunodeficiency virus. 180: 903-908.

28. Steingart KR, Henry M, Ng V, Hopewell PC, Ramsay A, Cunningham J, Urbanczik R, Perkins M, Aziz MA, Pai M (2006) Fluorescence versus conventional sputum smear microscopy for tuberculosis: a systematic review. 6: 570-581.

29. Cattamanchi A, Davis JL, Pai M, Huang L, Hopewell PC, Steingart KR (2010) Does bleach processing increase the accuracy of sputum smear microscopy for diagnosing pulmonary tuberculosis? J Clin Microbiol 48: 2433-2439. Research Support, N.I.H., Extramural Research Support, Non-U.S. Gov't Review.

30. Steingart KR, Ng V, Henry M, Hopewell PC, Ramsay A, Cunningham J, Urbanczik R, Perkins MD, Aziz MA, Pai M (2006) Sputum processing methods to improve the sensitivity of smear microscopy for tuberculosis: a systematic review. 6: 664-674.

31. Boehme CC, Nicol MP, Nabeta P, Michael JS, Gotuzzo E, Tahirli R, Gler MT, Blakemore R, Worodria W, Gray C, Huang L, Caceres T, Mehdiyev R, Raymond L, Whitelaw A, Sagadevan K, Alexander H, Albert H, Cobelens F, Cox H, Alland D, Perkins MD (2011) Feasibility, diagnostic accuracy, and effectiveness of decentralised use of the Xpert MTB/RIF test for diagnosis of tuberculosis and multidrug resistance: a multicentre implementation study. 377: 1495-1505.

32. Vassall A, van Kampen S., Sohn H, Michael JS, John K, den Boon S., Davis L, Whitelaw A, Nicol MP, Gler MT, Khaliqov A, Zamudio C, Perkins MD, Boehme CC, Cobelens F (2011) Rapid diagnosis of tuberculosis with the Xpert MTB/RIF assay in high burden countries: a cost-effectiveness analysis. PLoS Med 8: e1001120. 10.1371/journal.pmed.1001120 [doi];PMEDICINE-D-11-00805 [pii].

33. Steingart KR, Henry M, Laal S, Hopewell PC, Ramsay A, Menzies D, Cunningham J, Weldingh K, Pai M (2007) Commercial serological antibody detection tests for the diagnosis of pulmonary tuberculosis: a systematic review. 4: e202.

34. Getahun H, Kittikraisak W, Heilig CM, Corbett EL, Ayles H, Cain KP, Grant AD, Churchyard GJ, Kimerling M, Shah S, Lawn SD, Wood R, Maartens G, Granich R, Date AA, Varma JK (2011) Development of a standardized screening rule for tuberculosis in people living with HIV in resource-constrained settings: individual participant data meta-analysis of observational studies. PLoS Med 8: e1000391. 10.1371/journal.pmed.1000391 [doi].

35. Rutjes AW, Reitsma JB, Di NM, Smidt N, van Rijn JC, Bossuyt PM (2006) Evidence of bias and variation in diagnostic accuracy studies. CMAJ 174: 469-476. 174/4/469 [pii];10.1503/cmaj.050090 [doi].

36. Graham SM, Ahmed T, Amanullah F, Browning R, Cardenas V, Casenghi M, Cuevas LE, Gale M, Gie RP, Grzemska M, Handelsman E, Hatherill M, Hesseling AC, Jean-Philippe P, Kampmann B, Kabra SK, Lienhardt C, Lighter-Fisher J, Madhi S, Makhene M, Marais BJ, McNeeley DF, Menzies H, Mitchell C, Modi S, Mofenson L, Musoke P, Nachman S, Powell C, Rigaud M, Rouzier V, Starke JR, Swaminathan S, Wingfield C (2012) Evaluation of tuberculosis diagnostics in children: 1. Proposed clinical case definitions for classification of intrathoracic tuberculosis disease. Consensus from an expert panel. J Infect Dis 205 Suppl 2: S199-S208. jis008 [pii];10.1093/infdis/jis008 [doi].

37. Luabeya KK, Mulenga H, Moyo S, Tameris M, Sikhondze W, Geldenhuys H, Mohamed H, Hanekom W, Hussey G, Hatherill M (2012) Diagnostic features associated with culture of Mycobacterium

Page 22: REPORT – DRAFT 2

Page | 22

tuberculosis among young children in a vaccine trial setting. Pediatr Infect Dis J 31: 42-46. Clinical Trial, Phase IV Research Support, Non-U.S. Gov't.

38. Corbett EL, Bandason T, Cheung YB, Makamure B, Dauya E, Munyati SS, Churchyard GJ, Williams BG, Butterworth AE, Mungofa S, Hayes RJ, Mason PR (2009) Prevalent infectious tuberculosis in Harare, Zimbabwe: burden, risk factors and implications for control. 13: 1231-1237.

39. Whiting PF, Rutjes AW, Westwood ME, Mallett S, Deeks JJ, Reitsma JB, Leeflang MM, Sterne JA, Bossuyt PM (2011) QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med 155: 529-536. 155/8/529 [pii];10.1059/0003-4819-155-8-201110180-00009 [doi].

40. Reitsma JB, Rutjes AWS, Whiting P, Vlassov V, Leeflang MMG, et al. (2009) Chapter 9: Assessing methodological quality. In: Deeks JJ BPGCe, editors. Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy. The Cochrane Collaboration.

41. Schunemann HJ, Oxman AD, Brozek J, Glasziou P, Jaeschke R, Vist GE, Williams JW, Jr., Kunz R, Craig J, Montori VM, Bossuyt P, Guyatt GH (2008) Grading quality of evidence and strength of recommendations for diagnostic tests and strategies. BMJ 336: 1106-1110. 336/7653/1106 [pii];10.1136/bmj.39500.677199.AE [doi].

42. Kruk A, Gie RP, Schaaf HS, Marais BJ (2008) Symptom-based screening of child tuberculosis contacts: improved feasibility in resource-limited settings. Pediatrics 121: e1646-e1652.

43. Aldridge RW, Story A, Stagg H, Lipman M, Knight J, Taubman D, Shaji K, Quinn D, Watson J, Abubakar I, Hayward A (2010) Sensitivity and specificity of mobile digital chest radiography for the diagnosis of active pulmonary tuberculosis. a cohort study in high risk groups in London. Thorax 65: A6. http://dx.doi.org/10.1136/thx.2010.150912.6;Conference Abstract.

44. Borra H, Si TM, Morris MJ, Waibel KH, Arroyo RA, Merrill GA, Battafarano DF (2009) Reliability of tuberculosis screening tests in patients receiving tumor necrosis factor antagonist therapy in a united states rheumatology clinic. UNKN 60: 137. http://dx.doi.org/10.1002/art.25220;Conference Abstract.

45. Costantino F, Faure G, Carvalho MD, Rat AC, Dintinger H, Loeuille D, Chary-Valckenaere I (2010) High level of disease activity in chronic inflammatory rheumatisms increases the rate of indeterminate interferon-gamma-release assay results for latent tuberculosis infection detection. UNKN 62: 768. http://dx.doi.org/10.1002/art.28536;Conference Abstract.

46. Duenas E, Sanchez M, De La Eva R, Putulin A, Mendoza A (2009) Household TB case finding results and clinical improvement of children with endobronchial tuberculosis. UNKN 14: A145. http://dx.doi.org/10.1111/j.1440-1843.2009.01657.x;Conference Abstract.

47. Fox GJ, Sy DNS, Nhung NV, Lien LT, Cuong NK, Britton WJ, Marks GB (2012) Active case-finding of tuberculosis among household contacts in hanoi, vietnam-A 12-month prospective study. UNKN 17: 83. http://dx.doi.org/10.1111/j.1440-1843.2012.02143.x;Conference Abstract.

48. Haldar P, Thuraisingam H, Patel H, Hoskyns W, Woltmann G (2010) Quantiferon testing in close contacts of smear positive pulmonary TB identifies people at low risk of secondary progression. UNKN 65: A148. http://dx.doi.org/10.1136/thx.2010.151043.17;Conference Abstract.

49. Karabela S, Papaventsis D, Georgoulas S, Nikolaou S, Ioannidis P, Konstantinidou E, Sainti A, Marinou I, Kanavaki S (2010) Epidemiological monitoring of pulmonary tuberculosis in a correctional facility population, Athens, Greece, 2005-2009. UNKN 16: S620. http://dx.doi.org/10.1111/j.1469-0691.2010.03239.x;Conference Abstract.

50. Yates S, Story A, Hayward AC (2009) Screening prisoners for tuberculosis: What should the UK do? UNKN 64: A105-A106. http://dx.doi.org/10.1136/thx.2009.127142w;Conference Abstract.

Page 23: REPORT – DRAFT 2

Page | 23

51. 1992) [The third nationwide random survey for the epidemiology of tuberculosis in 1990]. Chung Hua Chieh Ho Ho Hu Hsi Tsa Chih 15: 69-71, 125.

52. 1995) [A three-year retrospective cohort study on pulmonary tuberculosis in Shanghai. Shanghai Pulmonary Tuberculosis Epidemiological Research Group]. Chung Hua Chieh Ho Ho Hu Hsi Tsa Chih 18: 309-12, 319.

53. 2007) An epidemiological study of tuberculosis in Guangzhou in the year of 2005. [Chinese]. UNKNOWN 30: 415-418.

54. Duanmu H (2002) Report on fourth national epidemiological sampling survey of tuberculosis. [Chinese]. UNKN 25: 3-7.

55. Group Of Technical Guidance For Tuberculosis Epidemiological Survey In Heilongjiang Province Office Of Tuberculosis Epidemiological Survey In Heilongjiang P, Wang P (2002) [Report on fourth epidemiological survey for tuberculosis in Heilongjiang province]. Chung Hua Chieh Ho Ho Hu Hsi Tsa Chih 25: 8-11.

56. Guangzhou Technique Steering Group of the Epidemiological Sampling Survey for T (2007) [An epidemiological study of tuberculosis in Guangzhou in the year of 2005]. Chung Hua Chieh Ho Ho Hu Hsi Tsa Chih 30: 415-418.

57. Jiao X, Liu C, Wang L, Song H, Wang K, Zhuang Y (2002) [Epidemic trends of pulmonary tuberculosis and case finding in Henan province]. Chung Hua Chieh Ho Ho Hu Hsi Tsa Chih 25: 15-17.

58. Wang P (2002) Report on fourth epidemiological survey for tuberculosis in Heilongjiang province. [Chinese]. UNKN 25: 8-11.

59. Wang Wb, Wang Fd, Xu B, Zhu Jf, Shen W, Xiao Xr, Jiang Qw (2006) [A cost-effectiveness study on a case-finding program of tuberculosis through screening those suspects with chronic cough symptoms in the rich rural areas]. Chung Hua Liu Hsing Ping Hsueh Tsa Chih 27: 857-860.

60. Wu J, Xiong G, Feng S, Cao H, Rao Z, Jiang T, Liu Y, Duan W, Tang X (2002) [Study on epidemic trend and control policy of tuberculosis in Sichuan province]. Chung Hua Chieh Ho Ho Hu Hsi Tsa Chih 25: 12-14.

61. Meisner AF, Ovsiankina ES, Stakheeva LB (2008) [Tuberculin diagnosis in children. Latent tuberculosis?]. Probl 29-32.

62. 2005) Short-course chemotherapy significantly reduces the prevalence of tuberculosis in China. UNKNOWN 9: 71-72. http://dx.doi.org/10.1016/j.ehbc.2004.11.007.

63. Alseda M, Godoy P (2003) [Study investigating infection in contacts of tuberculosis patients in a semi-urban area]. Enferm Infecc Microbiol Clin 21: 281-286.

64. Bucher H, Weinbacher M, Gyr K (1994) [Compliance of asylum seekers with an expanded border health screening program in the Basel-Stadt canton 1992-1993]. Schweiz Med Wochenschr 124: 1660-1665.

65. Clarke M, Dick J, Zwarenstein M, Diwan V (2003) DOTS for temporary workers in the agricultural sector. An exploratory study in tuberculosis case detection. Curationis 26: 66-71. Research Support, Non-U.S. Gov't.

66. Huizinga TWJ, Arend SM (2006) Is the tuberculin skin test an accurate method of detecting tuberculosis in patients with rheumatoid arthritis? UNKN 2: 188-189. http://dx.doi.org/10.1038/ncprheum0161;Short Survey.

Page 24: REPORT – DRAFT 2

Page | 24

67. Isawa T, Saotome N (1998) A nosocomial infection of tuberculosis among hospital employees in a general hospital. [Japanese]. UNKN 52: 643-649.

68. Liippo KK, Kulmala K, Tala EO (1993) Focusing tuberculosis contact tracing by smear grading of index cases. Am Rev Respir Dis 148: 235-236. Research Support, Non-U.S. Gov't.

69. Neill MA, Mates S (1992) Screening for tuberculosis: everything you already knew, probably forgot and meant to look up. R I Med 75: 453-456.

70. Schmitt HJ (2004) Tuberculosis screening in Germany. [German] Tuberkulosescreening in Deutschland. UNKN 28: 765-766. Note.

71. Shimao T (2009) [Tuberculosis prevalence survey in Japan]. Kekkaku 84: 713-720.

72. Sriyabhaya N, Payanandana V, Bamrungtrakul T, Konjanart S (1993) Status of tuberculosis control in Thailand. Southeast Asian J Trop Med Public Health 24: 410-419.

73. Vidal R, Miravitlles M, Cayla JA, Torrella M, Martin N, de Gracia J (1997) [A contagiousness study in 3071 familial contacts of tuberculosis patients]. Med Clin (Barc) 108: 361-365. Research Support, Non-U.S. Gov't.

74. Woo J, Chan HS, Hazlett CB, Ho SC, Chan R, Sham A, Davies PD (1996) Tuberculosis among elderly Chinese in residential homes: tuberculin reactivity and estimated prevalence. Gerontology 42: 155-162.

75. Yagi T, Yamagishi F, Sasaki Y, Hashimoto T, Bekku R, Yamanaka M, Tsuyusaki J (2006) [Clinical review of patients with pulmonary tuberculosis who were detected by the screening of homeless persons admitted in the shelter facilities]. Kekkaku 81: 371-374.

76. Balasubramanian R, Garg R, Santha T, Gopi PG, Subramani R, Chandrasekaran V, Thomas A, Rajeswari R, Anandakrishnan S, Perumal M, Niruparani C, Sudha G, Jaggarajamma K, Frieden TR, Narayanan PR (2004) Gender disparities in tuberculosis: report from a rural DOTS programme in south India. Int J Tuberc Lung Dis 8: 323-332. Research Support, U.S. Gov't, Non-P.H.S.

77. Banda HT, Harries AD, Welby S, Boeree MJ, Wirima JJ, Subramanyam VR, Maher D, Nunn PA (1998) Prevalence of tuberculosis in TB suspects with short duration of cough. Trans R Soc Trop Med Hyg 92: 161-163. Research Support, Non-U.S. Gov't.

78. Burgess AL, Fitzgerald DW, Severe P, Joseph P, Noel E, Rastogi N, Johnson WD, Jr., Pape JW (2001) Integration of tuberculosis screening at an HIV voluntary counselling and testing centre in Haiti. Aids 15: 1875-1879. Research Support, Non-U.S. Gov't Research Support, U.S. Gov't, P.H.S.

79. Chheng P, Tamhane A, Natpratan C, Tan V, Lay V, Sar B, Kimerling ME (2008) Pulmonary tuberculosis among patients visiting a voluntary confidential counseling and testing center, Cambodia. Int J Tuberc Lung Dis 12: 54-62.

80. Corbett EL, Zezai A, Cheung YB, Bandason T, Dauya E, Munyati SS, Butterworth AE, Rusikaniko S, Churchyard GJ, Mungofa S, Hayes RJ, Mason PR (2010) Provider-initiated symptom screening for tuberculosis in Zimbabwe: diagnostic value and the effect of HIV status. Bull World Health Organ 88: 13-21. Research Support, Non-U.S. Gov't.

81. Datta M, Radhamani MP, Sadacharam K, Selvaraj R, Rao DL, Rao RS, Gopalan BN, Prabhakar R (2001) Survey for tuberculosis in a tribal population in North Arcot District. Int J Tuberc Lung Dis 5: 240-249.

82. den Boon S, White NW, van Lill SWP, Borgdorff MW, Verver S, Lombard CJ, Bateman ED, Irusen E, Enarson DA, Beyers N (2006) An evaluation of symptom and chest radiographic screening in tuberculosis prevalence surveys. Int J Tuberc Lung Dis 10: 876-882. Research Support, Non-U.S. Gov't.

Page 25: REPORT – DRAFT 2

Page | 25

83. den Boon S, van Lill SWP, Borgdorff MW, Enarson DA, Verver S, Bateman ED, Irusen E, Lombard CJ, White NW, de Villiers C, Beyers N (2007) High prevalence of tuberculosis in previously treated patients, Cape Town, South Africa. Emerg Infect Dis 13: 1189-1194. Research Support, Non-U.S. Gov't.

84. Gopi PG, Subramani R, Radhakrishna S, Kolappan C, Sadacharam K, Devi TS, Frieden TR, Narayanan PR (2003) A baseline survey of the prevalence of tuberculosis in a community in south India at the commencement of a DOTS programme. Int J Tuberc Lung Dis 7: 1154-1162. Research Support, Non-U.S. Gov't Research Support, U.S. Gov't, Non-P.H.S.

85. Hoa NB, Cobelens FGJ, Sy DN, Nhung NV, Borgdorff MW, Tiemersma EW (2012) Yield of interview screening and chest X-ray abnormalities in a tuberculosis prevalence survey. UNKN 16: 762-767. http://dx.doi.org/10.5588/ijtld.11.0581.

86. Hoa NB, Sy DN, Nhung NV, Tiemersma EW, Borgdorff MW, Cobelens FGJ (2010) National survey of tuberculosis prevalence in Viet Nam. Bull World Health Organ 88: 273-280. Research Support, Non-U.S. Gov't.

87. van't Hoog AH, Meme HK, van Deutekom H, Mithika AM, Olunga C, Onyino F, Borgdorff MW (2011) High sensitivity of chest radiograph reading by clinical officers in a tuberculosis prevalence survey. Int J Tuberc Lung Dis 15: 1308-1314. Comparative Study Research Support, Non-U.S. Gov't.

88. Kimerling ME, Shakes CF, Carlisle R, Lok KH, Benjamin WH, Dunlap NE (1999) Spot sputum screening: evaluation of an intervention in two homeless shelters. Int J Tuberc Lung Dis 3: 613-619. Research Support, U.S. Gov't, P.H.S.

89. Lewis JJ, Charalambous S, Day JH, Fielding KL, Grant AD, Hayes RJ, Corbett EL, Churchyard GJ (2009) HIV infection does not affect active case finding of tuberculosis in South African gold miners. Am J Respir Crit Care Med 180: 1271-1278. Research Support, Non-U.S. Gov't.

90. Mor Z, Leventhal A, Weiler-Ravell D, Peled N, Lerman Y (2012) Chest radiography validity in screening pulmonary tuberculosis in immigrants from a high-burden country. Respir Care 57: 1137-1144.

91. Santha T, Renu G, Frieden TR, Subramani R, Gopi PG, Chandrasekaran V, Selvakumar N, Thomas A, Rajeswari R, Balasubramanian R, Kolappan C, Narayanan PR (2003) Are community surveys to detect tuberculosis in high prevalence areas useful? Results of a comparative study from Tiruvallur District, South India. Int J Tuberc Lung Dis 7: 258-265. Comparative Study Evaluation Studies.

92. Sebhatu M, Kiflom B, Seyoum M, Kassim N, Negash T, Tesfazion A, Borgdorff MW, van der Werf MJ (2007) Determining the burden of tuberculosis in Eritrea: a new approach. Bull World Health Organ 85: 593-599. Research Support, Non-U.S. Gov't.

93. van't Hoog AH, Laserson KF, Githui WA, Meme HK, Agaya JA, Odeny LO, Muchiri BG, Marston BJ, DeCock KM, Borgdorff MW (2011) High prevalence of pulmonary tuberculosis and inadequate case finding in rural western Kenya. Am J Respir Crit Care Med 183: 1245-1253. Research Support, Non-U.S. Gov't.

94. Wood R, Middelkoop K, Myer L, Grant AD, Whitelaw A, Lawn SD, Kaplan G, Huebner R, McIntyre J, Bekker LG (2007) Undiagnosed tuberculosis in a community with high HIV prevalence: implications for tuberculosis control. Am J Respir Crit Care Med 175: 87-93. Research Support, N.I.H., Extramural Research Support, Non-U.S. Gov't.

95. Ministry of Health Kingdom of Cambodia (2005) report national tb prevalence survey, 2002 cambodia.

96. Van't Hoog AH, Meme HK, Laserson KF, Agaya JA, Muchiri BG, Githui WA, Odeny LO, Marston BJ, Borgdorff MW (2012) Screening strategies for tuberculosis prevalence surveys: the value of chest radiography and symptoms. PLoS One 7: e38691. 10.1371/journal.pone.0038691 [doi];PONE-D-11-20600 [pii].

Page 26: REPORT – DRAFT 2

Page | 26

97. Ayles H, Schaap A, Nota A, Sismanidis C, Tembwe R, de Haas P, Muyoyeta M, Beyers N, Godfrey-Faussett P (2009) Prevalence of tuberculosis, HIV and respiratory symptoms in two Zambian communities: Implications for tuberculosis control in the era of HIV. PLoS ONE 4. http://dx.doi.org/10.1371/journal.pone.0005602.

98. Balasubramanian R, Garg R, Santha T, Gopi PG, Subramani R, Chandrasekaran V, Thomas A, Rajeswari R, Anandakrishnan S, Perumal M, Niruparani C, Sudha G, Jaggarajamma K, Frieden TR, Narayanan PR (2004) Gender disparities in tuberculosis: report from a rural DOTS programme in south India. Int J Tuberc Lung Dis 8: 323-332. Research Support, U.S. Gov't, Non-P.H.S.

99. Corbett EL, Bandason T, Cheung YB, Munyati S, Godfrey-Faussett P, Hayes R, Churchyard G, Butterworth A, Mason P (2007) Epidemiology of tuberculosis in a high HIV prevalence population provided with enhanced diagnosis of symptomatic disease. PLoS Med 4: e22. Research Support, Non-U.S. Gov't.

100. Corbett EL, Zezai A, Cheung YB, Bandason T, Dauya E, Munyati SS, Butterworth AE, Rusikaniko S, Churchyard GJ, Mungofa S, Hayes RJ, Mason PR (2010) Provider-initiated symptom screening for tuberculosis in Zimbabwe: diagnostic value and the effect of HIV status. Bull World Health Organ 88: 13-21. Research Support, Non-U.S. Gov't.

101. Datta M, Radhamani MP, Sadacharam K, Selvaraj R, Rao DL, Rao RS, Gopalan BN, Prabhakar R (2001) Survey for tuberculosis in a tribal population in North Arcot District. Int J Tuberc Lung Dis 5: 240-249.

102. den Boon S, White NW, van Lill SWP, Borgdorff MW, Verver S, Lombard CJ, Bateman ED, Irusen E, Enarson DA, Beyers N (2006) An evaluation of symptom and chest radiographic screening in tuberculosis prevalence surveys. Int J Tuberc Lung Dis 10: 876-882. Research Support, Non-U.S. Gov't.

103. den Boon S, van Lill SWP, Borgdorff MW, Enarson DA, Verver S, Bateman ED, Irusen E, Lombard CJ, White NW, de Villiers C, Beyers N (2007) High prevalence of tuberculosis in previously treated patients, Cape Town, South Africa. Emerg Infect Dis 13: 1189-1194. Research Support, Non-U.S. Gov't.

104. van't Hoog AH, Meme HK, van Deutekom H, Mithika AM, Olunga C, Onyino F, Borgdorff MW (2011) High sensitivity of chest radiograph reading by clinical officers in a tuberculosis prevalence survey. Int J Tuberc Lung Dis 15: 1308-1314. Comparative Study Research Support, Non-U.S. Gov't.

105. Chheng P, Tamhane A, Natpratan C, Tan V, Lay V, Sar B, Kimerling ME (2008) Pulmonary tuberculosis among patients visiting a voluntary confidential counseling and testing center, Cambodia. Int J Tuberc Lung Dis 12: 54-62.

106. Burgess AL, Fitzgerald DW, Severe P, Joseph P, Noel E, Rastogi N, Johnson WD, Jr., Pape JW (2001) Integration of tuberculosis screening at an HIV voluntary counselling and testing centre in Haiti. Aids 15: 1875-1879. Research Support, Non-U.S. Gov't Research Support, U.S. Gov't, P.H.S.

107. Mor Z, Lerman Y, Leventhal A (2009) Pre-immigration screening for pulmonary tuberculosis. Eur Respir J 33: 701-702. Comment Letter.

108. Banda HT, Harries AD, Welby S, Boeree MJ, Wirima JJ, Subramanyam VR, Maher D, Nunn PA (1998) Prevalence of tuberculosis in TB suspects with short duration of cough. Trans R Soc Trop Med Hyg 92: 161-163. Research Support, Non-U.S. Gov't.

109. Eriksen M, Mackay J, Ross H (2013) The Tobacco Atlas. http://www.tobaccoatlas.org/.

110. Ayles H, Schaap A, Nota A, Sismanidis C, Tembwe R, De Haas P, Muyoyeta M, Beyers N (2009) Prevalence of tuberculosis, HIV and respiratory symptoms in two Zambian communities: implications for tuberculosis control in the era of HIV. 4: e5602.

Page 27: REPORT – DRAFT 2

Page | 27

111. van't Hoog AH, Laserson KF, Githui WA, Meme HK, Agaya JA, Odeny LO, Muchiri BG, Marston BJ, Decock KM, Borgdorff MW (2011) High Prevalence of Pulmonary Tuberculosis and Inadequate Case Finding in Rural Western Kenya. Am J Respir Crit Care Med 183: 1245-1253.

112. Ministry of Health Kingdom of Cambodia (2005) report national tb prevalence survey, 2002 cambodia.

113. Corbett EL, Charalambous S, Moloi VM, Fielding K, Grant AD, Dye C, De Cock KM, Hayes RJ, Williams BG, Churchyard GJ (2004) Human immunodeficiency virus and the prevalence of undiagnosed tuberculosis in African gold miners. Am J Respir Crit Care Med 170: 673-679. Research Support, Non-U.S. Gov't.

114. den Boon S, Bateman ED, Enarson DA, Borgdorff MW, Verver S, Lombard CJ, Irusen E, Beyers N, White NW (2005) Development and evaluation of a new chest radiograph reading and recording system for epidemiological surveys of tuberculosis and lung disease. 9: 1088-1096.

115. Pinsky PF, Freedman M, Kvale P, Oken M, Caporaso N, Gohagan J (2006) Abnormalities on chest radiograph reported in subjects in a cancer screening trial

15. Chest 130: 688-693. 130/3/688 [pii];10.1378/chest.130.3.688 [doi].

116. Marciniuk DD, McNab BD, Martin WT, Hoeppner VH (1999) Detection of pulmonary tuberculosis in patients with a normal chest radiograph. 115: 445-452.

117. Davis JL, Worodria W, Kisembo H, Metcalfe JZ, Cattamanchi A, Kawooya M, Kyeyune R, den Boon S, Powell K, Okello R, Yoo S, Huang L (2010) Clinical and radiographic factors do not accurately diagnose smear-negative tuberculosis in HIV-infected inpatients in Uganda: a cross-sectional study. PLoS ONE 5: e9859. Research Support, N.I.H., Extramural.

118. Hogeweg L, Mol C, de Jong PA, Dawson R, Ayles H, van Ginneken B (2010) Fusion of local and global detection systems to detect tuberculosis in chest radiographs. Med Image Comput Comput Assist Interv 13: 650-657.

Page 28: REPORT – DRAFT 2

Page | 28

FIGURES AND TABLES

Figure 1. Different screening and diagnostic algorithms.

A. 1 screening test followed by 1 confirmatory test

B. 1 screening test followed by 2 sequential confirmatory tests

C. 2 parallel screening tests D. 2 sequential screening tests

Page 29: REPORT – DRAFT 2

Page | 29

Figure 2. Flow of inclusion.

Systematic search of MEDLINE and EMBASE (n=3513), LILIACS

(n=83) and HTA (n=0) n=3596

Eligible for full text retrieval n=225

Publication about TB in children only n=58

Not eligible (n=3313)

Duplication: n=4 Full text not retrieved n=33 of which:

• Conference abstract only n=8 • Chinese (i.e. ineligible) n=10 • Russian (i.e. ineligible) n=1 • Full text not available n=14

Full text retrieved n=188

Not eligible (n=167), reason (more than 1 may apply): • Insufficient data for 2x2 table:

o Only one screen applied and screen negatives are not further evaluated (n=101)

o Required data not reported (n=32) • No cases of active bact+ TB identified (n=5) • Cohort study, cases identified at a later

time point (n=13) • Population ineligible (n=11) • Not original data (n=7) • Other (n=12 )

17 studies in 24 publications

Other sources n=3

Eligible publications n=21

Page 30: REPORT – DRAFT 2

Page | 30

Table 1 – Characteristics of included studies

Population TB rates per 100 000

Age, Sex Reference standard

First author, year of publication

Country - region

Setting Age-inclusion

TB prev**

Notification in region

HIV prev**

Age (years) Sex (% fe-male)

Sample size (%RS)

Type of Screen Bacteriology s=sample

Case definition

Ayles, 2009 Zambia, Lusaka Province

General population, prevalence survey

≥ 15 years

870 275 (rural) ; 438 (urban

28.6% Median between 25 and 34

54% 8,044 (100%)

Symptom. 1s MGIT, ZN if +

1 pos culture

Banda, 1998 Malawi, Queen Elizabeth Hosp Blantyre

General outpatient, 'TB suspects with short duration of cough'

≥ 15 years

35000 55% had HIV-related illness

Mean 33; sd 12

39% 98 (100%)

CXR abnormalities consistent with a diagnosis of TB

3s LJ and ZN

≥2 pos smears and/or ≥1 pos culture

Burgess, 2001 Haiti VCT center. Attendants for first visit pre-test counselling

>18 years 570 250 36% Median 34 54% 241 (100%)

CXR, in patients with a history of cough.

3s LJ and ZN

pos smear or culture with TB symptoms or TB-CXR

Chheng, 2008 Cambodia VCT center. Attendants for first visit pre-test counselling

≥ 19 years

580 508 25% Median 31; IQR 25-39

38% 496 (100%)

Symptom 3s LJ and ZN

2 pos smears or 1pos culture

Corbett, 2010; Corbett, 2009

Zimbabwe, Harare, high-density suburbs

General population, prevalence survey

≥ 16 years

700 Smear-pos 500

20.7% Median 25; IQR 20-35

59% 8,979 (100%)

Symptom 2s LJ and FM

definite: 2pos cult or 1pos cult with clinical evidence

Datta, 2001 India, tribal population in North Arcot District

General population, prevalence survey

≥ 15 years

800 nr, very low

Median between 25-34

49% 3,301 (21%)

Symptom 2s LJ, FM and ZN

≥1 pos ZN and/or ≥1 pos culture

den Boon, 2006; den Boon 2007

South Africa, Cape Town

General population, prevalence survey

≥ 15 years

1000 Sm-positive 314

<12.4% (estimated)

Median between 35-44

51% 1170 /2608 (45%)

Symptom, CXR (conventional)

1s LJ and ZN

1 pos culture and/or smear

Page 31: REPORT – DRAFT 2

Page | 31

Population TB rates per 100 000

Age, Sex Reference standard

First author, year of publication

Country - region

Setting Age-inclusion

TB prev**

Notification in region

HIV prev**

Age (years) Sex (% fe-male)

Sample size (%RS)

Type of Screen Bacteriology s=sample

Case definition

Gopi, 2003; Balasubramanian, 2004; Santha 2003

India, Tiruvallur District, South India

General population, prevalence survey

≥ 15 years

605 Smear-pos incidence 85/100000

nr, very low

34 53% 88,833 (11%)

Symptom (CXR screening was by MMR - not evaluated as a screen)

2s LJ and FM

1 pos culture and/or smear

Hoa, 2012; Hoa, 2010

Vietnam General population, prevalence survey

≥ 15 years

307 Smear-positive 93

nr. low Median between 35-44

55% 93,758 (8%)

Symptom, CXR (MMR and digital were used)

1s LJ, 2s ZN

1 pos culture or 2 pos smears, or 1 pos smear with CXR abnormality

Kimerling, 1999

USA, Alabama

Homeless and/or drug users

adults 310 1103 (among homeless )

nr Mean 40.8; sd 9.4

0% 127 (95%)

Symptom 1s LJ and ZN

1 pos culture (with or without smear)

Lewis, 2009 South Africa Occupational - Miners, annual medical examination

≥ 20 years

2609 29% Median 41; IQR 20-61

0% 1,955 (100%)

Symptom (CXR screening was by MMR - not evaluated as a screen)

3s LJ and FM

1 pos culture with clinical or radiological features, or 2pos culture

Ministy of Health Myanmar, 2012

Myanmar General population, prevalence survey

≥ 15 years

613 279 ± 2.7% Median between 35-44

56% 51,367 (23%)

Symptom, CXR (conventional)

2s LJ and FM

2pos smear or 1 pos smear with TB-CXR, or 1 pos culture

MoH Cambodia, 2005

Cambodia General population, prevalence survey

≥ 10 years

846 189 1% Median between 25-34

54% 22,160 (15%)

Symptom, CXR (conventional)

2s LJ and ZN

2 pos smear or 1 pos smear with TB-CXR or 1pos culture

Mor, 2012 Ethiopia Immigrants, refugees, border, Jewish Ethiopian immigrants to Israel

≥ 1 years 321 344 ±2% Mean 34; sd 25

53% 13,379 (1%)

CXR (type not stated)

3s LJ and ZN

symptomatic pulmonary disease and confirmed Mtb culture.

Page 32: REPORT – DRAFT 2

Page | 32

Population TB rates per 100 000

Age, Sex Reference standard

First author, year of publication

Country - region

Setting Age-inclusion

TB prev**

Notification in region

HIV prev**

Age (years) Sex (% fe-male)

Sample size (%RS)

Type of Screen Bacteriology s=sample

Case definition

Sebatu, 2007 Eritrea General population, prevalence survey

≥ 15 years

90 23 nr. Low Median between 30-40

65% 19,185 (95%)

Symptom 2s FM 2 pos ZN or 1pos ZN with TB-CXR

van't Hoog, 2012; van't Hoog, 2011; van't Hoog, 2011

western Kenya

General population, prevalence survey

≥ 15 years

600 440 16.8% Median 35 63% 20,566 (32%)

Symptom, CXR (conventional)

1s LJ,MGIT and FM, 2s FM

1pos culture or 2pos smears or 1 pos smear andTB-CXR

Wood, 2006 South Africa General population, prevalence survey

≥ 15 years

1575 1931 23% Median 27 nr. 762 (100%)

Symptom 2s MGIT and FM

2pos smear or 2pos culture

** in study population TB-CXR=radiological evidence of pulmonary tuberculosis %RS = percentage of the study sample verified by the reference standard pos=positive. A positive culture refers to a culture positive for M.tuberculosis

Page 33: REPORT – DRAFT 2

Page | 33

Tables and Figures on Screening Tools A. GENERAL POPULATION

1. CXR screening – Any Abnormality Table A1. QUADAS ratings for studies reporting CXR screening for any abnormality in the general population, with bacteriologically positive PTB as the reference standard

Population General population - low and high HIV prevalence combined Screen CXR - Any abnormality

Reference standard Culture and/or smear-positive

RISK OF BIAS APPLICABILITY CONCERNS

Study Screen HIV prevalence

/ region

Sample size PATIENT SELECTION

INDEX TEST

REFERENCE STANDARD

FLOW AND

TIMING

PATIENT SELECTION

INDEX TEST

REFERENCE STANDARD

Author Author definition N % RS den Boon, 2006 (b)† [Cape Town] Any abnormality high / SSA 2608 45% High Low Low High Low Low Moderate van't Hoog, 2012 [Kenya]

CXR any abnormality (field reading) high / SSA 19216 32% Low Low Low High Low Low Moderate

MoH Myanmar, 2012 CXR any abnormality at field level Low / Asia 50241 23% Low Low Unclear High Low Low Low

% RS= percentage examined by the reference standard CXR=chest radiography SSA=sub Saharan Africa †(b) refers to the sample including all participants with a CXR

Figure A1.i. Forest plot of the sensitivity and specificity of CXR screening for any abnormality in the general population, with bacteriologically positive PTB as the reference standard.

Page 34: REPORT – DRAFT 2

Page | 34

Figure A1.ii. SROC plot of ensitivity and specificity of CXR screening for any abnormality in the general population, with bacteriologically positive PTB as the reference standard.

Page 35: REPORT – DRAFT 2

Page | 35

2. CXR screening – Abnormalities suggestive of TB Table A2. QUADAS ratings for studies reporting CXR screening for abnormalities suggestive of TB in the general population, with bacteriologically positive PTB as the reference standard

Population General population - low and high HIV prevalence combined Screen CXR abnormalites suggestive of TB

Reference standard Culture and/or smear-positive

RISK OF BIAS APPLICABILITY CONCERNS

Study Screen HIV prevalence

/ region

Sample size PATIENT SELECTION

INDEX TEST

REFERENCE STANDARD

FLOW AND

TIMING

PATIENT SELECTION

INDEX TEST

REFERENCE STANDARD

Author Author definition N % RS den Boon, 2006 (b)† TB-related abnormalities High / SSA 2608 45% High Low Low High Low Low Moderate

Hoa, 2012 Chest X-ray abnormality suggestive of TB Low / Asia 87642 8% Low Low Unclear High Low High Low

MoH Myanmar, 2012

CXR abnormal findings consistent with TB, central level Low / SSA 50241 23% Low Low Unclear High Low Low Low

MoH Cambodia, 2005

TB related shadows or other lung disease Low / Asia 22012 15% Low Low Unclear High Low Moderate Low

van't Hoog, 2012 (b)

Abnormality consistent with TB by expert 1 or expert 2 High / SSA 1143 32% Low Low Low High Low Low Moderate

% RS= percentage examined by the reference standard CXR=chest radiography SSA=sub Saharan Africa †(b) refers to the sample including all participants with a CXR

Figure A2.i Forest plot of the sensitivity and specificity of CXR screening for abnormalities suggestive of TB in the general population, with bacteriologically positive PTB as the reference standard

*The SROC curve only includes one (the *marked) observations from the van’t Hoog(b) study

Page 36: REPORT – DRAFT 2

Page | 36

Figure A2.ii SROC curves of the sensitivity and specificity of CXR screening for abnormalities suggestive of TB in the general population, with bacteriologically positive PTB as the reference standard. At the left including all 5 studies, the plot at the right excludes 2 studies with the lowest proportion participants verified by the reference standard.

2.1 CXR screening Abnormalities suggestive of TB; smear positive PTB as the reference standard. Figure A2.1. Sensitivity and specificity of CXR screening for any abnormality in the general population, with smear positive PTB as the reference standard

Page 37: REPORT – DRAFT 2

Page | 37

3. CXR AND SYMPTOM screening in parallel Table A3. Sensitivity and specificity of screening for CXR and/or prolonged in the general population, with bacteriologically positive PTB as the reference standard

Population General population - low and high HIV prevalence combined

Screen Symptom and/or CXR Reference standard Culture and/or smear-positive Study Screen HIV

prevalence / region

Sample size Sensitivity Specificity

Author Category Author definition N % RS PE 95% CI PE 95% CI

den Boon, 2006 (b)

Symptom and CXR Any symptom and/or TB related abnormality high / SSA 2592 45% 90% 73% 98% CXR TB-related abnormalities 2608 90% 73% 98% 88% 87 89%

Hoa, 2012 Symptom and CXR

Productive cough for >2 weeks or TB history or Chest X-ray abnormality suggestive of TB low / Asia 87642 8% 94% 91% 97% 92% 92% 92%

CXR Chest X-ray abnormality suggestive of TB 87642 85% 80% 89% 96% 96% 96% van't Hoog, 2012

Symptom and CXR

Cough for 2 or more weeks or any CXR abnormality high / SSA 20566 32% 97% 92% 99% 69% 68% 70%

CXR CXR any abnormality (field reading) 19216 94% 88% 98% 73% 72% 73%

MoH Myanmar, 2012

Symptom and CXR

Cough >=3 weeks, haemoptysis or any CXR lesion in lungfield or mediastinum low / Asia 51367 23% 96% 93% 98% 77% 76% 77%

CXR CXR any abnormality at field level 50241 99% 98% 100% 79% 79% 80%

MoH Cambodia, 2005

Symptom and CXR Cough for >= 3 weeks and/or TB related

shadows or other lung disease high / SSA 22160 15% 100% 99% 100% 85% 85% 86% CXR TB related shadows or other lung disease 22012 97% 94% 99% 90% 90% 91%

%RS=% examined by the reference standard CXR=chest radiography PE=point estimate SSA=sub Saharan Africa

Page 38: REPORT – DRAFT 2

Page | 38

Figure A3. SROC plot comparing screening for CXR and/or prolonged cough with CXR alone for each study.

Page 39: REPORT – DRAFT 2

Page | 39

SYMPTOM SCREENING

5. Prolonged cough

Table A5. QUADAS ratings for studies reporting prolonged cough in the general population, with bacteriologically positive PTB as the reference standard

Population General population - low and high HIV prevalence combined Screen Prolonged cough

Reference standard culture and/or smear-pos

RISK OF BIAS APPLICABILITY CONCERNS

Study Screen HIV prevalence /

region

Sample size PATIENT SELECTION

INDEX TEST

REFERENCE STANDARD

FLOW AND

TIMING

PATIENT SELECTION

INDEX TEST

REFERENCE STANDARD

Author Author definition N % RS den Boon, 2006 (b)†

Cough for 2 or more weeks vs. no cough or < 2 weeks high / SSA 2511 44% High Low Low High Low Low Moderate

van't Hoog, 2012 Cough for 2 or more weeks high / SSA 20566 32% Low Low Low High Low Low Moderate

Corbett, 2010 Cough for 2 or more weeks high / SSA 8979 100% Low Low Low Low Low Low Low

Ayles, 2009 TB suspect: Cough 3 weeks or more or haemoptysis high / SSA 8044 100% Low Low Low Low Low Low Moderate

MoH Cambodia, 2005 Cough for 3 weeks or more low / Asia 22160 15% Low Low Unclear High Low Low Low

Hoa, 2012

Productive cough for >2 weeks (i.e. cough with sputum production) low / Asia 93758 8% Low Unclear Unclear High Low Unclear Low

MoH Myanmar, 2012 Cough for 21+ days low / Asia 51367 23% Low Low Unclear High Low Low Low

Datta, 2001 Cough for >2 weeks low / Asia 16017 21% Low Low Low High Low Low Low % RS= percentage examined by the reference standard CXR=chest radiography SSA=sub Saharan Africa †(b) refers to the sample including all participants with a CXR

Page 40: REPORT – DRAFT 2

Page | 40

Figure A5.i. Forest plot of sensitivity and specificity of prolonged cough in the general population, with bacteriologically positive PTB as the reference standard

Figure A5.ii. SROC plot of sensitivity and specificity of prolonged cough in the general population, with bacteriologically positive PTB as the reference standard. At the left all studies combined, at the right stratified by high and low HIV prevalence populations.

Page 41: REPORT – DRAFT 2

Page | 41

5.1 Prolonged cough with smear-positive TB as the reference standard. Figure A5.1.i Forest plot of sensitivity and specificity of prolonged cough in the general population, with smear-positive PTB as the reference standard

Figure A5.1.ii SROC plot of sensitivity and specificity of prolonged cough in the general population, with smear-positive PTB as the reference standard

Page 42: REPORT – DRAFT 2

Page | 42

6. Any cough Table A6. QUADAS ratings for studies reporting ‘Any cough’ in the general population, with bacteriologically positive PTB as the reference standard

Population General population - low and high HIV prevalence combined Screen Cough of any duration

Reference standard Any

RISK OF BIAS APPLICABILITY CONCERNS

Study Screen HIV prevalence

/ region

Reference Sample size PATIENT SELECTION

INDEX TEST

REFERENCE STANDARD

FLOW AND

TIMING

PATIENT SELECTION

INDEX TEST

REFERENCE STANDARD

Author standard N % RS van't Hoog, 2012

Any cough high / SSA

culture and/or smear-positive 20566 32% Low Low Low High Low Low Moderate

Ayles, 2009 Any cough high / SSA

culture and/or smear- positive 8044 100% Low Low Low Low Low Low Moderate

Corbett, 2010 Any cough high / SSA

culture and/or smear- positive 8979 100% Low Low Low Low Low Low Low

Wood, 2006 Any cough high / SSA

culture and/or smear- positive 751 100% Low Low Low Low Low Low Low

MoH Cambodia, 2005

Any cough Low / Asia

culture and/or smear- positive 22160 15% Low Low Unclear High Low Low Low

MoH Myanmar, 2012

Any cough Low / Asia

culture and/or smear- positive 51367 23% Low Low Unclear High Low Low Low

Sebhatu, 2007 Any cough Low / SSA smear-pos only 19185 95% Low Unclear High Low Low Low Low

% RS= percentage examined by the reference standard CXR=chest radiography SSA=sub Saharan Africa Figure A6.i. Forest plot of sensitivity and specificity of ‘any cough’ in the general population, with bacteriologically positive or smear-positive PTB as the reference standard.

* Reference standard in this study is smear-positive pulmonary tuberculosis only, and bacteriologically positive (smear and/or culture) in the other studies

Page 43: REPORT – DRAFT 2

Page | 43

Figure A6.i. SROC curves of sensitivity and specificity of ‘any cough’ in the general population, with at the left bacteriologically positive TB only as the reference standard (n=6), and at the right stratified by population HIV prevalence, with bacteriologically positive or smear-positive PTB as the reference standard.

Page 44: REPORT – DRAFT 2

Page | 44

7. Any Symptom Table A7. QUADAS ratings for studies reporting ‘Any Symptom’ in the general population, with bacteriologically positive PTB as the reference standard

Population General population - low and high HIV prevalence combined Screen Any symptom

Reference standard culture and/or smear-pos

RISK OF BIAS APPLICABILITY CONCERNS

Study Screen HIV prevalence

/ region

Sample size PATIENT SELECTION

INDEX TEST

REFERENCE STANDARD

FLOW AND TIMING

PATIENT SELECTION

INDEX TEST

REFERENCE STANDARD

Author Author definition N % RS

van't Hoog, 2012

Any TB symptom (cough, haemoptysis, fever, night sweats or weight loss) of any duration or severity high / SSA 20566 32% Low Low Low High Low Low Moderate

Ayles, 2009

Any TB symptom (cough, shortness of breath; fever; night sweats; weight loss; chest pain). high / SSA 8044 100% Low Low Low Low Low Low Moderate

Corbett, 2010

Any TB symptom (cough, haemoptysis, fever, night sweats or weight loss) high / SSA 8979 100% Low Low Low Low Low Low Low

den Boon, 2006 (b)†

Any TB symptoms (cough, haemoptysis, weight loss, night sweats and fever) high / SSA 2511 44% High Low Low High Low Low Moderate

MoH Cambodia, 2005

Any TB symptoms (cough, sputum, haemoptysis, chest pain, weight loss, fatigue, fever, night sweats) and duration Low / Asia 22160 15% Low Low Unclear High Low Low Low

Datta, 2001 cough of >2 weeks, chest pain, fever >1 month, haemoptysis Low / Asia 16017 21% Low Low Low High Low Low Low

MoH Myanmar, 2012

Any of symptoms out of 7 (Cough, sputum, haemoptysis, chest pain, weight loss, fever) Low / Asia 51367 23% Low Low Unclear High Low Low Low

Gopi, 2003

cough ≥ 2 wks, chest pain ≥ 1 month , fever ≥ 1 month and/or haemoptysis at any time. Low / Asia 88832 11% Low Low Low High Low Low Moderate

% RS= percentage examined by the reference standard CXR=chest radiography †(b) refers to the sample including all participants with a CXR SSA=sub Saharan Africa

Page 45: REPORT – DRAFT 2

Page | 45

Figure A7.i Forest plot of sensitivity and specificity of ‘Any Symptom’ in the general population, with bacteriologically positive PTB as the reference standard.

Figure A7.ii SROC curve of sensitivity and specificity of ‘Any Symptom’ in the general population, with bacteriologically positive PTB as the reference standard. All studies combined at the left, and stratified by high vs. low HIV prevalence at the right.

Page 46: REPORT – DRAFT 2

Page | 46

B. OTHER POPULATIONS 1. Attendants of a voluntary counseling and HIV-testing centre (VCT) Table B1.i QUADAS ratings for the studies in VCT attendants (HIV-negative and HIV-positive)

Population VCT clients

Screen CXR, Cough STUDY RISK OF BIAS APPLICABILITY CONCERNS

SCREEN PATIENT

SELECTION INDEX TEST

REFERENCE STANDARD

FLOW AND

TIMING

PATIENT SELECTION

INDEX TEST

REFERENCE STANDARD

AUTHOR CATEGORY Reference Standard % RS

Burgess, 2001 Any CXR abnormality culture and/or smear-pos 100% High Unclear Low Low High High Low

Chheng, 2008 Cough for 3 or more weeks, Any symptom

culture and/or smear-pos 100% High Low Low Low High Low Moderate

% RS= percentage examined by the reference standard CXR=chest radiography Table B1.ii Sensitivity and specificity of screens in VCT attendants

Population VCT clients reporting a cough of any duration Screen CXR Study Screen

HIV status*

RS Sensitivity Specificity Sample size

Author Category Author definition Type PE (95% CI) PE (95% CI) N % RS

Burgess, 2001 Any CXR abnormality

Any abnormality on CXR high

culture and/or smear-pos 1.00 [0.92, 1.00] 0.53 [0.45, 0.60] 217 100%

Population VCT clients Screen Symptoms

Chheng, 2008 Cough for 3 or more weeks Cough >3 weeks HIV negative

culture and/or smear-pos 0.67 [0.30, 0.93] 0.63 [0.58, 0.68] 372 100%

Chheng, 2008 Cough for 3 or more weeks Cough >3 weeks high

culture and/or smear-pos 0.59 [0.39, 0.76] 0.62 [0.58, 0.67] 496 100%

Chheng, 2008 Any symptom Any TB Symptom HIV negative culture and/or smear-pos 1.00 [0.66, 1.00] 0.28 [0.24, 0.33] 372 100%

Chheng, 2008 Any symptom Any TB symptom

high culture and/or smear-pos 1.00 [0.88, 1.00] 0.26 [0.22, 0.30] 496 100%

% RS= percentage examined by the reference standard CXR=chest radiography * HIV negative means the HIV-negative clients only. High HIV refers to all VCT attendants in the study (HIV-positive and HIV-negative).

Page 47: REPORT – DRAFT 2

Page | 47

2. Residents of a homeless shelter Table B2.i QUADAS ratings for the study in the homeless population

Population Homeless

Screen Cough of any duration STUDY RISK OF BIAS APPLICABILITY CONCERNS

SCREEN PATIENT SELECTIO

N

INDEX TEST

REFERENCE STANDARD

FLOW AND TIMING

PATIENT SELECTION

INDEX TEST

REFERENCE STANDARD AUTHOR CATEGORY Reference Standard % RS

Kimerling, 1999 Any cough

culture and/or smear-positive 95% Low Low Low Unclear Low Low Moderate

% RS= percentage examined by the reference standard CXR=chest radiography Table B2.ii Sensitivity and specificity of cough screening in the homeless population

Population Homeless Screen Cough of any duration Study Screen

HIV status RS Sensitivity Specificity Sample size

Author Category Author definition

Type PE [95% CI] PE [95% CI] N % RS Kimerling, 1999 Any cough Productive cough

HIV negative

culture and/or smear-positive 0.25 [0.01, 0.81] 0.93 [0.88, 0.97] 127 95%

RS=reference standard % RS= percentage examined by the reference standard PE=point estimate

Page 48: REPORT – DRAFT 2

Page | 48

3. Occupational setting – gold miners Table B3.i QUADAS ratings for the study in gold miners

Population Occupational – Gold Miners

Screen Cough, Any symptom STUDY RISK OF BIAS APPLICABILITY CONCERNS

SCREEN PATIENT SELECTIO

N

INDEX TEST

REFERENCE STANDARD

FLOW AND

TIMING

PATIENT SELECTION

INDEX TEST

REFERENCE STANDARD AUTHOR CATEGORY Ref standard % RS

Lewis, 2009 All symptom screens culture and/or smear-positive 100% Low Low High Low Unclear Low Low

RS=reference standard % RS= percentage examined by the reference standard PE=point estimate

Table B3.ii Sensitivity and specificity of symptom screens in gold miners Population Occupational - Gold Miners

Screen Cough, Any symptom

Study Screen

HIV status

RS Sample size Sensitivity Specificity

Author Category Author definition Type N % RS PE [95% CI] PE [95% CI] Lewis, 2009

Cough for 2 or more weeks

New or worsening cough of >2 weeks

HIV negative

culture and/or smear-positive 1378 100% 0.10 [0.02, 0.26] 0.99 [0.99, 1.00]

Lewis, 2009

Cough for 2 or more weeks

New or worsening cough of >2 weeks

high HIV prev

culture and/or smear-positive 1942 100% 0.08 [0.02, 0.19] 0.99 [0.99, 0.99]

Lewis, 2009

Cough for 3 or more weeks

New or worsening cough of >3 weeks

HIV negative

culture and/or smear-positive 1378 100% 0.06 [0.01, 0.21] 0.99 [0.99, 1.00]

Lewis, 2009

Cough for 3 or more weeks

New or worsening cough of >3 weeks

high HIV prev

culture and/or smear-positive 1942 100% 0.06 [0.01, 0.16] 0.99 [0.99, 1.00]

Lewis, 2009 Any cough

new or worsening cough of any duration

HIV negative

culture and/or smear-positive 1378 100% 0.16 [0.05, 0.34] 0.99 [0.98, 0.99]

Lewis, 2009 Any cough

new or worsening cough of any duration

high HIV prev

culture and/or smear-positive 1942 100% 0.16 [0.07, 0.29] 0.98 [0.97, 0.99]

Lewis, 2009 Any symptom At least one symptom

HIV negative

culture and/or smear-positive 1378 100% 0.29 [0.14, 0.48] 0.91 [0.90, 0.93]

Lewis, 2009 Any symptom At least one symptom

high HIV prev

culture and/or smear-positive 1942 100% 0.29 [0.17, 0.44] 0.90 [0.89, 0.92]

RS=reference standard % RS= percentage examined by the reference standard PE=point estimate

Page 49: REPORT – DRAFT 2

Page | 49

4. Migrants Table B4.i QUADAS ratings for the study in migrants

Population Migrants

Screen CXR STUDY RISK OF BIAS APPLICABILITY CONCERNS

SCREEN PATIENT SELECTION

INDEX TEST

REFERENCE STANDARD

FLOW AND TIMING

PATIENT SELECTION

INDEX TEST

REFERENCE STANDARD AUTHOR CATEGORY Ref standard % RS

Mor, 2012 CXR abnormalites suggestive of TB

culture and/ or smear-pos 1% Low Low Low High Unclear Unclear Low

RS=reference standard % RS= percentage examined by the reference standard PE=point estimate

Table B4.ii Sensitivity and specificity of CXR screening in migrants Population Migrants Screen CXR

Study Screen HIV

prevalence*

RS Sample size Sensitivity Specificity

Author Category Author definition Type N % RS PE [95% CI] PE [95% CI]

Mor, 2012

CXR abnormalities suggestive of TB

CXR showing changes suggestive of PTB

low HIV prevalence

culture and/or smear-positive 13379 1% 0.86 [0.72, 0.95] 0.99 [0.99, 0.99]

RS=reference standard % RS= percentage examined by the reference standard PE=point estimate

5. Out-patients Table B5.i QUADAS ratings for the study in out-patients

Population Out Patient Attendants

Screen CXR STUDY RISK OF BIAS APPLICABILITY CONCERNS

SCREEN PATIENT SELECTION

INDEX TEST

REFERENCE STANDARD

FLOW AND

TIMING

PATIENT SELECTION

INDEX TEST

REFERENCE STANDARD AUTHOR CATEGORY

% RS

Banda, 1998

CXR abnormalites suggestive of TB 100% High Low Low Low High High Low

RS=reference standard % RS= percentage examined by the reference standard PE=point estimate

Page 50: REPORT – DRAFT 2

Page | 50

Table B5.ii Sensitivity and specificity of CXR screening in out-patients

Screen CXR Population OPD attendants with short duration of cough (1-2 wks), and not considered a TB suspect (i.e. cough>3 wks)

Study Screen HIV prevalence

RS Sample size Sensitivity Specificity Author Category Author definition Type N % RS PE (95% CI) PE (95% CI)

Banda, 1998

CXR abnormalites suggestive of TB

CXR abnormalities consistent with a diagnosis of TB high

culture and/ or smear-pos 98 100% 0.26 [0.13, 0.44] 0.92 [0.83, 0.97]

RS=reference standard % RS= percentage examined by the reference standard PE=point estimate CXR= chest radiograph

Page 51: REPORT – DRAFT 2

Page | 51

APPENDIX 1 - Search strategy MEDLINE and EMBASE

3. Results A. Primary studies electronic bibliographic databases:

Database # Records MEDLINE(R) In-Process & Other Non-Indexed Citations (OvidSP)

2881

Embase 1980—current (OvidSP) 2665 Total after duplicate removal 3731 From 1992 onwards 3513 4. Search Strategy A. Medline Platform: OvidSP Database: MEDLINE(R) In-Process & Other Non-Indexed Citations Date: 06-07-2012 Limits: no limits were used Methodological filters: none 1 exp Mycobacterium/ 2 mycobacterium.ti,ab. 3 tuberculosis/ 4 peritonitis, tuberculous/ 5 exp tuberculoma/ 6 tuberculosis, bovine/ 7 exp tuberculosis, cardiovascular/ 8 exp tuberculosis, central nervous system/ 9 tuberculosis, cutaneous/ 10 erythema induratum/ 11 tuberculosis, endocrine/ 12 tuberculosis, gastrointestinal/ 13 tuberculosis, hepatic/ 14 exp tuberculosis, lymph node/ 15 tuberculosis, miliary/ 16 tuberculosis, multidrug-resistant/ 17 tuberculosis, ocular/ 18 tuberculosis, oral/ 19 tuberculosis, osteoarticular/ 20 tuberculosis, pleural/ 21 tuberculosis, pulmonary/ 22 tuberculosis, splenic/ 23 tuberculosis, urogenital/ 24 (tuberculo* or TB or scrofuloderma).ti,ab. 25 or/1-24 26 (case adj finding).ti,ab. 27 screen*.ti,ab.

Page 52: REPORT – DRAFT 2

Page | 52

28 exp Mass Screening/ 29 exp Population Surveillance/ 30 (disease adj3 surveillance).ti,ab. 31 case-finding.ti,ab. 32 (case adj detection).ti,ab. 33 exp Contact Tracing/ 34 (contact adj tracing).ti,ab. 35 exp Health Surveys/ 36 survey.ti,ab. 37 or/26-36 38 exp "Sensitivity and Specificity"/ 39 (false adj negative).ti,ab. 40 odds.mp. 41 exp "Predictive Value of Tests"/ 42 (predictive adj3 value).ti,ab. 43 specificit*.ti,ab. 44 accuracy.ti,ab. 45 or/38-44 46 prevalence.mp or exp Prevalence/ 47 exp Cross-Sectional Studies/ or cross sectional.mp. 48 46 or 47 49 35 or 36 or 48 50 (mycobacteri$ adj2 culture).ti,ab. 51 (microscopy adj2 (sputum smear or ZN or Ziehl-neelsen or FM or

fluorescence)).ti,ab. 52 lowenstein-jensen.ti,ab. 53 (LJ adj2 medium).ti,ab. 54 "mycobacteria growth incubator tube".ti,ab. 55 mgit.ti,ab. 56 geneXpert.ti,ab. 57 (auramine adj2 staining).ti,ab. 58 ((culture or smear) adj positiv*).ti,ab. 59 or/50-58 60 25 and 49 and 59 61 25 and 37 and 45 62 60 or 61 B. Embase Platform: OvidSP Database: Embase 1980-present Date: 06-07-2012 Limits: no limits were used Methodological filters: no filters used 1 exp Mycobacterium/ or exp tuberculosis/ or central nervous system tuberculosis/ or

congenital tuberculosis/ or drug resistant tuberculosis/ or extrapulmonary tuberculosis/ or intestine tuberculosis/ or kidney tuberculosis/ or laryngeal

Page 53: REPORT – DRAFT 2

Page | 53

tuberculosis/ or latent tuberculosis/ or lung tuberculosis/ or miliary tuberculosis/ or ocular tuberculosis/ or postprimary tuberculosis/ or primary tuberculosis/ or skin tuberculosis/ or tuberculoma/ or (tuberculo* or TB or scrofuloderma).ti,ab. or mycobacterium.ti,ab.

2 exp case finding/ 3 (case adj finding).ti,ab. 4 screen*.ti,ab. 5 exp screening/ 6 (disease adj3 surveillance).ti,ab. 7 exp disease surveillance/ 8 case-finding.ti,ab. 9 (case adj detection).tw. 10 (contact adj tracing).ti,ab. 11 exp health survey/ 12 survey.ti,ab. 13 or/2-12 14 exp "sensitivity and specificity"/ 15 (false adj negative).ti,ab. 16 odds.mp. 17 (predictive adj3 value).ti,ab. 18 specificit*.ti,ab. 19 accuracy.tw. 20 or/14-19 21 exp prevalence/ 22 prevalence.ti,ab. 23 cross sectional.ti,ab. 24 exp cross-sectional study/ 25 or/21-23 26 11 or 12 or 25 27 (mycobacteri$ adj2 culture).ti,ab. 28 (microscopy adj2 sputum smear).ti,ab. 29 (microscopy adj2 ZN).ti,ab. 30 (microscopy adj2 Ziehl-neelsen).ti,ab. 31 (microscopy adj2 FM).ti,ab. 32 (microscopy adj2 fluorescence).ti,ab. 33 lowenstein-jensen.ti,ab. 34 (LJ adj2 medium).ti,ab. 35 "Mycobacteria growth incubator tube".ti,ab. 36 MGIT.ti,ab. 37 geneXpert.ti,ab. 38 (auramine adj2 staining).ti,ab. 39 or/27-38 40 1 and 13 and 20 41 1 and 26 and 39 42 ((culture or smear) adj positiv*).ti,ab. 43 39 or 42 44 1 and 26 and 43 45 40 or 44 46 animal/ not human/ 47 45 not 46

Page 54: REPORT – DRAFT 2

Page | 54

APPENDIX 2 – Search results for studies on TB screening in children.

Among the identified publications 58 studies had a focus on young children (0-5 years old) or pediatric TB only. These studies were taken were taken to a separate database and titles and abstracts were assessed if they possibly evaluated TB screening tools in children. Of those 58 studies, only one study evaluated a screening tool in predominantly HIV-negative children [42]. This study assessed symptom screening of child TB contacts under the age of 5 years. Presence of cough, fever, weight-loss or fatigue, had a sensitivity of 76% (95%CI 58-89) and specificity 77% (95%CI 71-82) when the reference was all children treated for active TB. The positive predictive value was 33% and negative predictive value 96% in this study population with 13% TB prevalence. Among the 8 children with negative screen who were diagnosed with active TB in the study, all had uncomplicated hilar adenopathy on CXR only, which may represent transient hilar adenopathy after infection rather than active tuberculosis disease. Excluding asymptomatic children with uncomplicated hilar adenopathy from the reference group gave a sensitivity of 100% (95%CI 85-100) and specificity of 76% (95%CI 70-81). In addition one unpublished study was identified, but was ineligible since it evaluated a TB screening tool in HIV-infected children only.

Page 55: REPORT – DRAFT 2

Page | 55

APPENDIX 3- QUADAS 2 tool Key questions Signaling questions

Domain 1: Patient selection

Risk of Bias: could the selection of patients have introduced bias?

1. Was a consecutive or random sample of patients enrolled? • Yes: if all eligible patients were also included; or if there were no clear indications that this

was not the case • No: if the authors report that the selection was based on clinical judgement of health

workers, or a large proportion of randomly selected persons did not participate in the study. • Unclear: if there is discrepancy between the numbers of eligible people and the number of

included people, but no reasons given for that, or the selection procedure is not clearly described.

2. Was a case–control design avoided? • Yes: if a case–control design was avoided.

• No: if a case–control design was not avoided.

• Unclear: if not reported or insufficient information is provided to decide.

3. Did the study avoid inappropriate exclusions? • Yes: if no study participants were excluded after inclusion.

• No: if study participants were excluded with e.g. mild or severe symptoms or signs.

• Unclear: if insufficient information is provided to decide.

Applicability: Are there concerns that the included patients and setting do not match the review question?

• High concern: if the study population does not resemble a population that would be considered for a screening TB screening program in practice.

• Low concern: if the study population does resemble a population that would be considered for a screening TB screening program in practice.

• Unclear: if not reported or insufficient information is provided to decide.

Domain 2: Index Test

Risk of Bias: Could the conduct or interpretation 1. Were the index test results interpreted without knowledge of the results of the reference standard?

Page 56: REPORT – DRAFT 2

Page | 56

of the index test have introduced bias?

• Yes: if the screening test was performed without knowing whether the person had active TB.

• No: if symptom questions were asked after the results of the reference test were known, or the CXR was interpreted with knowledge of the results of the reference test.

• Unclear: if insufficient information is provided to decide. E.g. if it was unclear whether the CXR reader was blinded to the results of the reference test.

2. If a threshold was used, was it pre-specified? This question was not applicable for our review question.

Applicability: Are there concerns that the index test, its conduct, or its interpretation differ from the review question?

• High concern: if the symptom questions or CXR classification were intended as a diagnostic rather than a screening tool; or if part of the population was screened with MMR.

• Low concern: if the symptom questions or CXR assessment were done with the intention to screen.

• Unclear: if insufficient information is provided to decide.

Domain 3: Reference Standard

Risk of Bias: Could the reference standard, its conduct, or its interpretation have introduced bias?

1. Is the reference standard likely to correctly classify the target condition? • Yes: if

• No: if.

• Unclear: if insufficient information is provided to decide.

2. Were the reference standard results interpreted without knowledge of the results of the index test? • Yes: if

• No: if.

• Unclear: if insufficient information is provided to decide.

Applicability: Are there concerns that the target condition as defined by the reference standard does not match the question?

• High concern: if there was a high probability that a considerable proportion of the TB cases identified in the study did may not have had bacteriologically confirmed TB or did not have active TB .

• Low concern: if the TB cases in the study have TB symptoms or CXR abnormalities in

Page 57: REPORT – DRAFT 2

Page | 57

addition to a positive culture and/or smear microscopy, and/or at least two different samples positive on culture and/or smear microscopy.

• Moderate concern: Because we perceived a large contrast between “low” and “high” we added a category “moderate” for the applicability sections. We applied the “moderate” category if the TB cases in the study could include persons with one positive sputum culture or smear only, without the presence or symptoms or CXR abnormalities.

• Unclear: if insufficient information is provided to decide.

Domain 4: Flow and Timing

Risk of Bias: could the patient flow have introduced bias?

1. Was there an appropriate interval between the index test and reference standard? • Yes: if the screening test and reference standard were applied (or samples taken) at the same time or within a short time period.

• No: if the time between the screening test and reference standard was more than 1 week.

• Unclear: if insufficient information is provided to decide.

2. Did all patients receive the same reference standard? • Yes: if all participants were evaluated with the reference standard, and if all or a large majority of participants were evaluated with the same test(s).

• No: if not all participants were evaluated with the reference standard, or participants received different tests (e.g. some smear only, some culture, or different numbers of samples were submitted for testing).

• Unclear: if insufficient information is provided to decide.

3. Were all patients included in the analysis? • Yes: if all participants were included.

• No: if participants who participated were excluded. For instance because they did not provide sputum for a reference test.

• Unclear: if insufficient information is provided to decide.

Page 58: REPORT – DRAFT 2

Page | 58

APPENDIX 4 - GRADE TABLES What is the accuracy of chest x-ray to identify active TB in a screening situation in the general population? Index test: CXR – Any abnormality | Reference test: Sputum culture and/or smear Place of the test: TRIAGE Test-treatment pathway: CXR positive => confirmatory test (mycobacterial culture or GeneXpert) => anti-TB chemotherapy (6-9 months antibiotics) Included studies: Den Boon, 2006; MoH Myanmar, 2012, Van ‘t Hoog, 2012

Se = sensitivity; Sp = specificity; * numbers in brackets are 95% confidence intervals 1. Limitations in study design (see QUADAS-2): High risk of selection bias in 1 study (Den boon, 2006). In all studies less than half of the participants received the

reference standard (23% to 45%); accuracy was calculated under the assumption that those who did not receive the reference standard were culture and/or smear negative (no active TB).

2. Indirectness (see QUADAS-2): Some concern about applicability of reference standard in 2 studies – no downgrading. 3. Inconsistency: Homogenous results for sensitivity and specificity (based on visual inspection CIs). 4. Imprecision: Precise estimates for sensitivity and specificity. 5. Publication bias: Not applicable (the evidence base for publication bias in DTA studies is very limited).

Outcome No. of studies (participants)

Study design

Factors that may decrease quality of evidence Quality of evidence

Effect per 100,000 Se 98% (95-100%)* Sp 75% (72-79%)

Limitations in study design1 Indirectness2 Inconsistency 3 Imprecision4 Publication bias5

True positives (active TB)

3 (72,065)

Cross-sectional Serious None None None NA

⊕⊕⊕O

Moderate

Prevalence 0.25%: 245 (238-250) 1%: 980 (950-100)

False negatives (incorrectly classified as no active TB)

Prevalence 0.25%: 5 (0-13) 1%: 20 (0-50)

True negatives (no active TB)

3 (72,065)

Cross-sectional Serious None None None NA

⊕⊕⊕O Moderate

Prevalence 0.25%: 74,813 (71,820-78,803) 1%: 74,250 (72,280-78,210)

False positives (incorrectly classified as active TB)

Prevalence 0.25%: 24,938 (20,948-27,930) 1%: 24,750 (20,790-27,720)

Page 59: REPORT – DRAFT 2

Page | 59

What is the accuracy of chest x-ray to identify active TB in a screening situation in the general population? Index test: CXR – TB related abnormalities| Reference test: Sputum culture and/or smear Place of the test: TRIAGE Test-treatment pathway: CXR positive => confirmatory test (mycobacterial culture or GeneXpert) => anti-TB chemotherapy (6-9 months antibiotics) Included studies: Den Boon, 2006; Hoa, 2012; MoH Cambodja, 2005; MoH Myanmar, 2012, Van ‘t Hoog, 2012

Se = sensitivity; Sp = specificity; * numbers in brackets are 95% confidence intervals 1. Limitations in study design (see QUADAS-2): High risk of selection bias in 1 study (Den Boon, 2006) and in 3 studies unclear risk of bias for the reference standard. In

all studies less than half of the participants received the reference standard (8% to 45%); accuracy was calculated under the assumption that those who did not receive the reference standard were culture and/or smear negative (no active TB).

2. Indirectness (see QUADAS-2): Concern about applicability of reference standard in 2 studies – no downgrading. 3. Inconsistency: Moderate heterogeneity for sensitivity (based on visual inspection CIs); little heterogeneity for specificity – no downgrading. 4. Imprecision: Precise estimates for sensitivity and specificity. 5. Publication bias: Not applicable (the evidence base for publication bias in DTA studies is very limited).

Outcome No. of studies (participants)

Study design

Factors that may decrease quality of evidence Quality of evidence

Effect per 100,000 Se 87% (79-95%)* Sp 89% (87-92%)

Limitations in study design1 Indirectness2 Inconsistency 3 Imprecision4 Publication bias5

True positives (active TB)

5 (163,646)

Cross-sectional Very serious None None None NA

⊕⊕OO

Low

Prevalence 0.25%: 218 (198-238) 1%: 870 (790-950)

False negatives (incorrectly classified as no active TB)

Prevalence 0.25%: 33 (13-53) 1%: 130 (50-210)

True negatives (no active TB)

5 (163,646)

Cross-sectional Very serious None None None NA

⊕⊕OO Low

Prevalence 0.25%: 88,778 (86,783-91,770) 1%: 88,110 (86,130-91,080)

False positives (incorrectly classified as active TB)

Prevalence 0.25%: 10,973 (7,980-12,968) 1%: 10,890 (7,920-12,870)

Page 60: REPORT – DRAFT 2

Page | 60

What is the accuracy of screening for cough to identify active TB in a screening situation in the general population? Index test: Prolonged cough| Reference test: Sputum culture and/or smear Place of the test: TRIAGE Test-treatment pathway: Cough positive => confirmatory test (mycobacterial culture or GeneXpert) => anti-TB chemotherapy (6-9 months antibiotics) Included studies: Ayles, 2009; Corbett, 2010, Datta, 2001; Den Boon, 2006; Hoa, 2012; MoH Cambodja, 2005; MoH Myanmar, 2012, Van ‘t Hoog, 2012

Se = sensitivity; Sp = specificity; * numbers in brackets are 95% confidence intervals 1. Limitations in study design (see QUADAS-2): High risk of selection bias in 1 study (Den Boon, 2006), in 1 study unclear risk of bias for the index test and in 3 studies for

the reference standard. In 6 of the 8 studies less than half of the participants received the reference standard (8% to 45%); accuracy was calculated under the assumption that those who did not receive the reference standard were culture and/or smear negative (no active TB).

2. Indirectness (see QUADAS-2): Concern about applicability of reference standard in 3 studies – no downgrading. 3. Inconsistency: Moderate heterogeneity for sensitivity (based on visual inspection CIs); little heterogeneity for specificity – no downgrading. 4. Imprecision: Imprecise estimate for sensitivity; precise estimate for specificity. 5. Publication bias: Not applicable (the evidence base for publication bias in DTA studies is very limited).

Outcome No. of studies (participants)

Study design

Factors that may decrease quality of evidence Quality of evidence

Effect per 100,000 Se 35% (24-46%)* Sp 95% (93-97%)

Limitations in study design1 Indirectness2 Inconsistency 3 Imprecision4 Publication bias5

True positives (active TB)

8 (223,402)

Cross-sectional Very serious None None Serious NA

⊕OOO

Very low

Prevalence 0.25%: 88 (60-115) 1%: 350 (240-460)

False negatives (incorrectly classified as no active TB)

Prevalence 0.25%: 163 (135-190) 1%: 650 (540-760)

True negatives (no active TB)

8 (223,402)

Cross-sectional Very serious None None None NA

⊕⊕OO Low

Prevalence 0.25%: 94,763 (92,768-96,758) 1%: 94,050 (92,070-96,030)

False positives (incorrectly classified as active TB)

Prevalence 0.25%: 4,988 (2,993-6,983) 1%: 4,950 (2,970-6,930)

Page 61: REPORT – DRAFT 2

Page | 61

What is the accuracy of screening for cough to identify active TB in a screening situation in the general population? Index test: Prolonged cough| Reference test: Sputum smear Place of the test: TRIAGE Test-treatment pathway: Cough positive => confirmatory test (mycobacterial culture or GeneXpert) => anti-TB chemotherapy (6-9 months antibiotics) Included studies: Den Boon, 2006; MoH Cambodja, 2005; MoH Myanmar, 2012, Van ‘t Hoog, 2012

Se = sensitivity; Sp = specificity; * numbers in brackets are 95% confidence intervals 1. Limitations in study design (see QUADAS-2): High risk of selection bias in 1 study (Den Boon, 2006) and in 2 studies unclear risk of bias for the reference standard. In

all studies less than half of the participants received the reference standard (15% to 45%); accuracy was calculated under the assumption that those who did not receive the reference standard were culture and/or smear negative (no active TB).

2. Indirectness (see QUADAS-2): No major concerns on applicability. 3. Inconsistency: Moderate heterogeneity for sensitivity (based on visual inspection CIs); little heterogeneity for specificity – no downgrading. 4. Imprecision: Imprecise estimates for sensitivity and specificity (wide CI for false negatives and false positives). 5. Publication bias: Not applicable (the evidence base for publication bias in DTA studies is very limited).

Outcome No. of studies (participants)

Study design

Factors that may decrease quality of evidence Quality of evidence

Effect per 100,000 Se 56% (42-74%)* Sp 92% (87-98%)

Limitations in study design1 Indirectness2 Inconsistency 3 Imprecision4 Publication bias5

True positives (active TB)

4 (95,188)

Cross-sectional Very serious None None Serious NA

⊕OOO

Very low

Prevalence 0.25%: 140 (105-185) 1%: 560 (420-740)

False negatives (incorrectly classified as no active TB)

Prevalence 0.25%: 110 (65-145) 1%: 440 (260-580)

True negatives (no active TB)

4 (95,188)

Cross-sectional Very serious None None Serious NA

⊕OOO Very low

Prevalence 0.25%: 91,770 (86,783-97,755) 1%: 91,080 (86,130-97,020)

False positives (incorrectly classified as active TB)

Prevalence 0.25%: 7,980 (1,995-12,968) 1%: 7,920 (1,980-12,870)

Page 62: REPORT – DRAFT 2

Page | 62

What is the accuracy of screening for cough to identify active TB in a screening situation in the general population? Index test: Any cough| Reference test: Sputum culture and/or smear Place of the test: TRIAGE Test-treatment pathway: Cough positive => confirmatory test (mycobacterial culture or GeneXpert) => anti-TB chemotherapy (6-9 months antibiotics) Included studies: Ayles, 2009; Corbett, 2010; MoH Cambodja, 2005; MoH Myanmar, 2012; Sebhatu, 2007; Van ‘t Hoog, 2012, Wood, 2006

Se = sensitivity; Sp = specificity; * numbers in brackets are 95% confidence intervals 1. Limitations in study design (see QUADAS-2): In 3 of the 7 studies less than half of the participants received the reference standard (15% to 32%); accuracy was

calculated under the assumption that those who did not receive the reference standard were culture and/or smear negative (no active TB). 2. Indirectness (see QUADAS-2): No major concerns on applicability. 3. Inconsistency: Moderate heterogeneity for sensitivity and specificity (based on visual inspection CIs) – no downgrading. 4. Imprecision: Imprecise estimates for sensitivity and specificity (wide CI for false negatives and false positives). 5. Publication bias: Not applicable (the evidence base for publication bias in DTA studies is very limited).

Outcome No. of studies (participants)

Study design

Factors that may decrease quality of evidence Quality of evidence

Effect per 100,000 Se 56% (40-74%)* Sp 80% (69-90%)

Limitations in study design1 Indirectness2 Inconsistency 3 Imprecision4 Publication bias5

True positives (active TB)

7 (131,052)

Cross-sectional Serious None None Serious NA

⊕⊕OO

Low

Prevalence 0.25%: 140 (100-185) 1%: 560 (400-470)

False negatives (incorrectly classified as no active TB)

Prevalence 0.25%: 110 (65-150) 1%: 440 (260-600)

True negatives (no active TB)

7 (131,052)

Cross-sectional Serious None None Serious NA

⊕⊕OO Low

Prevalence 0.25%: 79,800 (68,828-89,775) 1%: 79,200 (68,310-89,100)

False positives (incorrectly classified as active TB)

Prevalence 0.25%: 19,800 (9,975-30,923) 1%: 7,920 (9,900-30,690)

Page 63: REPORT – DRAFT 2

Page | 63

What is the accuracy of screening for any symptom to identify active TB in a screening situation in the general population with low HIV prevalence? Index test: Any symptom| Reference test: Sputum culture and/or smear Place of the test: TRIAGE Test-treatment pathway: Symptom positive => confirmatory test (mycobacterial culture or GeneXpert) => anti-TB chemotherapy (6-9 months antibiotics) Included studies: Datta, 2001; Gopi, 2003; MoH Cambodja, 2005; MoH Myanmar, 2012

Se = sensitivity; Sp = specificity; * numbers in brackets are 95% confidence intervals 1. Limitations in study design (see QUADAS-2): In all studies less than 25% of the participants received the reference standard (11% to 23%); accuracy was calculated

under the assumption that those who did not receive the reference standard were culture and/or smear negative (no active TB). 2. Indirectness (see QUADAS-2): No major concerns on applicability. 3. Inconsistency: Moderate heterogeneity for sensitivity and specificity (based on visual inspection CIs) – no downgrading. 4. Imprecision: Imprecise estimates for sensitivity and specificity (wide CI for false negatives and false positives). 5. Publication bias: Not applicable (the evidence base for publication bias in DTA studies is very limited).

Outcome No. of studies (participants)

Study design

Factors that may decrease quality of evidence Quality of evidence

Effect per 100,000 Se 70% (58-82%)* Sp 61% (35-87%)

Limitations in study design1 Indirectness2 Inconsistency 3 Imprecision4 Publication bias5

True positives (active TB)

4 (178,376)

Cross-sectional Very serious None None Serious NA

⊕OOO

Very low

Prevalence 0.25%: 175 (145-205) 1%: 700 (580-820)

False negatives (incorrectly classified as no active TB)

Prevalence 0.25%: 75 (45-105) 1%: 300 (180-420)

True negatives (no active TB)

4 (178,376)

Cross-sectional Very serious None None Serious NA

⊕OOO Very low

Prevalence 0.25%: 60,848 (34,913-86,783) 1%: 60,390 (34,650-86,130)

False positives (incorrectly classified as active TB)

Prevalence 0.25%: 38,903 (12,968-64,838) 1%: 38,610 (12,870-64,350)

Page 64: REPORT – DRAFT 2

Page | 64

What is the accuracy of screening for any symptom to identify active TB in a screening situation in the general population with high HIV prevalence? Index test: Any symptom| Reference test: Sputum culture and/or smear Place of the test: TRIAGE Test-treatment pathway: Symptom positive => confirmatory test (mycobacterial culture or GeneXpert) => anti-TB chemotherapy (6-9 months antibiotics) Included studies: Ayles, 2009; Corbett, 2010; Den Boon, 2006; Van ‘t Hoog, 2012

Se = sensitivity; Sp = specificity; * numbers in brackets are 95% confidence intervals 1. Limitations in study design (see QUADAS-2): In 2 of the 4 studies less than half of the participants received the reference standard (32% and 44%); accuracy was

calculated under the assumption that those who did not receive the reference standard were culture and/or smear negative (no active TB). 2. Indirectness (see QUADAS-2): No major concerns on applicability. 3. Inconsistency: Homogenous results for sensitivity, considerable heterogeneity for specificity (based on visual inspection CIs) – no downgrading. 4. Imprecision: Imprecise estimates for sensitivity and specificity (wide CI for false negatives and false positives). 5. Publication bias: Not applicable (the evidence base for publication bias in DTA studies is very limited).

Outcome No. of studies (participants)

Study design

Factors that may decrease quality of evidence Quality of evidence

Effect per 100,000 Se 84% (77-93%)* Sp 74% (53-95%)

Limitations in study design1 Indirectness2 Inconsistency 3 Imprecision4 Publication bias5

True positives (active TB)

4 (40,100)

Cross-sectional Serious None None Serious NA

⊕⊕OO

Low

Prevalence 0.25%: 210 (193-233) 1%: 840 (770-930)

False negatives (incorrectly classified as no active TB)

Prevalence 0.25%: 40 (18-58) 1%: 160 (70-230)

True negatives (no active TB)

4 (40,100)

Cross-sectional Serious None None Very serious NA

⊕OOO Very low

Prevalence 0.25%: 73,815 (52,868-94,763) 1%: 73,260 (52,470-94,050)

False positives (incorrectly classified as active TB)

Prevalence 0.25%: 25,935 (4,988-46,883) 1%: 25,740 (4,950-46,530)

Page 65: REPORT – DRAFT 2

Page | 65

What is the accuracy of screening for any symptom to identify active TB in a screening situation in the general population? Index test: Any symptom| Reference test: Sputum culture and/or smear Place of the test: TRIAGE Test-treatment pathway: Symptom positive => confirmatory test (mycobacterial culture or GeneXpert) => anti-TB chemotherapy (6-9 months antibiotics) Included studies: Ayles, 2009; Corbett, 2010; Den Boon, 2006; Van ‘t Hoog, 2012; Datta, 2001; Gopi, 2003; MoH Cambodja, 2005; MoH Myanmar, 2012

Se = sensitivity; Sp = specificity; * numbers in brackets are 95% confidence intervals 1. Limitations in study design (see QUADAS-2): High risk of selection bias in 1 study (Den Boon, 2006) and in 2 studies unclear risk of bias for the reference standard. In 6

of the 8 studies less than half of the participants received the reference standard (11% to 44%); accuracy was calculated under the assumption that those who did not receive the reference standard were culture and/or smear negative (no active TB).

2. Indirectness (see QUADAS-2): No major concerns on applicability. 3. Inconsistency: Moderate heterogeneity for sensitivity, considerable heterogeneity for specificity (based on visual inspection CIs) – no downgrading. 4. Imprecision: Imprecise estimates for sensitivity and specificity (wide CI for false negatives and false positives). 5. Publication bias: Not applicable (the evidence base for publication bias in DTA studies is very limited).

Outcome No. of studies (participants)

Study design

Factors that may decrease quality of evidence Quality of evidence

Effect per 100,000 Se 77% (68-86%)* Sp 68% (50-85%)

Limitations in study design1 Indirectness2 Inconsistency 3 Imprecision4 Publication bias5

True positives (active TB)

8 (218,476)

Cross-sectional Very serious None None Serious NA

⊕OOO

Very low

Prevalence 0.25%: 193 (170-215) 1%: 770 (680-860)

False negatives (incorrectly classified as no active TB)

Prevalence 0.25%: 58 (35-80) 1%: 230 (140-320)

True negatives (no active TB)

8 (218,476)

Cross-sectional Very serious None None Very serious NA

⊕OOO Very low

Prevalence 0.25%: 67,830 (49,875-84,788) 1%: 67,320 49,500-84,150)

False positives (incorrectly classified as active TB)

Prevalence 0.25%: 31,920 (14,963-49,875) 1%: 31,680 (14,850-49,500)

Page 66: REPORT – DRAFT 2

Page | 66