
Supplementary Information

Beyond single marker analyses: Mining whole genome scans for insights into treatment responses in severe sepsis

Michael Man, PhD1, Sandra L. Close, PhD2, Andrew D. Shaw, MD3, Gordon R. Bernard, MD4, Ivor S. Douglas, MD5, Robert J. Kaner, MD6, Didier Payen, MD, PhD7, Jean-Louis Vincent, MD, PhD8, Stewart Fossceco, PhD1, Jonathan M. Janes, FRCP1, Amy G. Leishman, PhD1, Lee O’Brien, PhD1, Mark D. Williams, MD9, and Joe G. N. Garcia, MD10.

Affiliations
1 Lilly Research Laboratories, Eli Lilly and Company, Lilly Corporate Center, Indianapolis, IN 46285, USA
2 Department of Clinical Pharmacology, Indiana University School of Medicine, Indianapolis, IN 46202, USA
3 Department of Anesthesiology, Duke University Medical Center, Durham, NC 27710, USA
4 Division of Allergy, Pulmonary and Critical Care, Vanderbilt University School of Medicine, Nashville, TN 37232, USA
5 Denver Health Medical Center and University of Colorado, School of Medicine, Denver, CO 80204, USA
6 Division of Pulmonary and Critical Care Medicine, Departments of Medicine and Genetic Medicine, Weill Cornell Medical College, New York, NY 10021, USA
7 Department of Anesthesiology & Critical Care & SAMU, University Paris 7, Hôpital Lariboisière, 2 rue Ambroise Paré, 75074 Paris, France
8 Department of Intensive Care, Erasme University Hospital, Université Libre de Bruxelles, Route de Lennik 808, 1070 Brussels, Belgium
9 BioCritica, Inc., P.O. Box 40967, Indianapolis, IN 46240, USA
10 Institute for Personalized Respiratory Medicine, University of Illinois at Chicago, Chicago, IL 60612, USA

Corresponding author:

Name: Michael Man, PhD

Address: Lilly Research Laboratories, Eli Lilly and Company, Lilly Corporate Center, Indianapolis, IN 46285, USA

Phone: +1-317-276-5796 Fax: +1-317-651-8834 Email: [email protected]


Supplementary Note, Tables, Figures and Methods

Contents

Supplementary Note Genetic data quality control

Supplementary Table 1 Number of patient samples removed for quality control issues (in Supplementary Note)

Supplementary Table 2 Number of SNPs removed for quality control issues (in Supplementary Note)

Supplementary Table 3 Table listing all clinical or other markers included in the analysis

Supplementary Table 4 Top single markers with logistic regression P values

Supplementary Table 5 One hundred highest-ranked 3-SNP genetic combination markers from the GWAS for treatment response

Supplementary Table 6 One hundred highest-ranked 3-SNP genetic single markers from the multimarker response

Supplementary Table 7 Listing of the 20 single markers above the LOESS line for prognostic markers after the first stage

Supplementary Figure 1 Manhattan plots of the GWAS results for genotype AA versus not AA and AB versus not AB

Supplementary Figure 2 Quantile-Quantile plots from GWAS in the entire genetic cohort

Supplementary Figure 3 Multidimensional scaling of the highest-ranking 100, 3-SNP combination markers

Supplementary Figure 4 Effect of permutation size on the variability of the top markers

Supplementary Methods Methodology for single marker analyses using a traditional association approach

Methodology for the multimarker search

Methodology for linkage disequilibrium analyses

Methodology for permutation testing (and modified prognostic markers)


Supplementary Note. Genetic data quality control.

Genetic data quality control was performed for each patient sample and for each genetic marker. The following criteria had to be met for samples or single nucleotide polymorphisms (SNPs) to be included in analyses (refs. 30-32).

Patient Samples

• Have a genotyping success rate of at least 80% (data for at least 80% of variants on the BeadChip)

• Be unique (unintentional duplicates were removed)

• Have high genotyping concordance for the 199 SNPs common to both the CGS (candidate gene study, previously performed on genes thought to be involved in the biology of sepsis) and the GWAS

• Pass a gender check using multidimensional scaling (MDS); the sample needed to be in a cluster consistent with its reported gender

Sufficient quantities of DNA were available for genotyping studies from 1568 subjects. Twelve samples were included as duplicates to assess batch effects, so a total of 1580 samples were sent to be genotyped (1568 + 12). Of the 1580 results received, 16 samples did not have clinical data, 67 samples from France were not analyzed due to restrictions on their use, and an additional 51 samples did not pass internal quality control assessments. In total, 1446 samples were available for analyses (Supplementary Table 1).


Supplementary Table 1. Number of patient samples removed for quality control issues.

QC Reason for Removing Samples | Samples Removed | Samples Remaining
Samples with insufficient quantities of DNA for genotyping | 0 | 1568
Samples sent for genotyping (12 duplicates) | (+12) | 1580
Samples removed because clinical data were not collected | 16 | 1564
Samples from France removed due to restrictions | 67 | 1497
Samples with genotyping success rate <80% | 8 | 1489
Samples with gender misidentified | 9 | 1480
Remaining duplicate samples removed | 9 | 1471
Samples with inconsistent results between CGS and GWAS | 25 | 1446

Abbreviations: CGS=candidate gene study; GWAS=genome-wide association study; QC=quality control.
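
As a cross-check, the sample accounting in Supplementary Table 1 can be replayed step by step. The following Python sketch is illustrative only: the counts are copied from the table, while the names `STEPS` and `replay` are assumptions introduced here and are not part of the original analysis.

```python
# Replay the sample attrition reported in Supplementary Table 1.
# Counts are copied from the table; all names are illustrative only.
STEPS = [
    ("Duplicates added to assess batch effects", +12),
    ("No clinical data collected", -16),
    ("Samples from France (use restricted)", -67),
    ("Genotyping success rate <80%", -8),
    ("Gender misidentified", -9),
    ("Remaining duplicate samples", -9),
    ("Inconsistent CGS vs. GWAS results", -25),
]


def replay(start, steps):
    """Return the running number of samples after each step."""
    remaining = start
    totals = []
    for _reason, change in steps:
        remaining += change
        totals.append(remaining)
    return totals


totals = replay(1568, STEPS)
for (reason, change), remaining in zip(STEPS, totals):
    print(f"{reason:45s} {change:+5d} {remaining:6d}")

# The last four steps are the internal quality control failures.
internal_qc_failures = -sum(change for _reason, change in STEPS[3:])
print("Internal QC failures:", internal_qc_failures)
```

The last four steps together account for the samples that did not pass internal quality control assessments.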

Single Nucleotide Polymorphisms

All SNPs were included in the initial analysis; combination markers (CMs) containing SNPs that met the following criteria were then kept:

• Missing rate ≤10%

• Minor allele frequency (MAF) ≥5%

• Hardy-Weinberg equilibrium (HWE) P value >10^-15

Samples were run in 3 batches, using 2 different amplification kits. Logistic regression was used to determine batch and amplification kit effects. Any SNPs with P values >10e-6 were kept for further analysis. SNP quality control was performed for the 1,199,187 variants included on the BeadChip; of these, 856,627 SNPs passed the criteria (Supplementary Table 2).


Supplementary Table 2. Number of SNPs removed for quality control issues.

QC Criteria | Number of SNPs Removed | Number of SNPs Remaining
All variants on BeadChip | 0 | 1,199,187
Monoallelic variants | 70,172 | 1,129,015
SNPs with success rate <90% | 29,609 | 1,099,406
SNPs with MAF <5% | 208,054 | 891,352
SNPs with HWE P value <1e-15 | 34,654 | 856,698
SNPs with batch or kit effect | 71 | 856,627

Abbreviations: HWE=Hardy-Weinberg equilibrium; MAF=minor allele frequency; QC=quality control; SNP=single nucleotide polymorphism.

All 1 199 187 variants were included in the initial analysis, and then CMs containing SNPs that met the quality control criteria were kept.
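
To make the per-SNP thresholds concrete, the sketch below applies the missing-rate, MAF, and HWE criteria to genotype counts for a single SNP. It is a minimal illustration under stated assumptions, not the pipeline used in the study: the source does not say which HWE test was used, so a Pearson chi-square test with 1 degree of freedom is assumed here, and the function names and example counts are hypothetical. The batch and amplification kit check (logistic regression) is not covered.

```python
import math


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """Pearson chi-square (1 df) P value for Hardy-Weinberg equilibrium.

    Assumption: the source does not state which HWE test was applied.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper-tail probability of a chi-square variable with 1 df.
    return math.erfc(math.sqrt(chi2 / 2.0))


def snp_passes_qc(n_aa, n_ab, n_bb, n_missing,
                  max_missing=0.10, min_maf=0.05, min_hwe_p=1e-15):
    """Apply the missing-rate, MAF, and HWE thresholds to one SNP."""
    called = n_aa + n_ab + n_bb
    if called == 0:
        return False
    missing_rate = n_missing / (called + n_missing)
    freq_a = (2 * n_aa + n_ab) / (2 * called)
    maf = min(freq_a, 1.0 - freq_a)
    if missing_rate > max_missing or maf < min_maf:
        return False
    return hwe_chi2_pvalue(n_aa, n_ab, n_bb) > min_hwe_p


# Hypothetical genotype counts (not data from the study).
print(snp_passes_qc(500, 400, 100, n_missing=20))   # common, well-called SNP
print(snp_passes_qc(980, 20, 0, n_missing=0))       # rare variant
print(snp_passes_qc(500, 0, 500, n_missing=0))      # no heterozygotes at all
```

Monoallelic variants fail the MAF check before the HWE test is reached, which keeps the expected counts in the chi-square calculation strictly positive.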


Supplementary Table 3. Table listing all clinical or other markers included in the analysis.

Name | Label
AGE | Age at Admission (years)
GENDER | Gender
ORIGIN | Racial Origin
PRAPACHE | Pre-infusion APACHE II Score
ORGANNUM | Number of Baseline Organ Failures
RECSURG | Recent Surgery (w/in last 30 days)
BLGCS | Baseline Glasgow Coma Score
BLGCSEYE | Baseline GCS Eye Score
BLGCSVRB | Baseline GCS Verbal Score
BLGCSMOT | Baseline GCS Motor Score
BLHPRN | BL Heparin Exposure
BLSTER | BL Steroid Exposure
BLSHOCK | BL Shock Status
BLPCACT | BL Protein C Activity
BLDIC | BL DIC Status
BLVENT | BL Ventilation Status
BLVASO | BL Vasopressor Status
BLSOFCV | BL SOFA Cardiovascular Score
BLSOFHEM | BL SOFA Hematology Score
BLSOFHEP | BL SOFA Hepatic Score
BLSOFREN | BL SOFA Renal Score
BLSOFRES | BL SOFA Respiration Score
BLAPTT | BL APTT
BLAT3 | BL AT3 Activity
BLCPLAT | BL Central Lab Platelet Count
BLDDIMER | BL d-Dimer
BLIL6 | BL IL-6
BLPSACT | BL Protein S Activity
BLPT | BL Prothrombin Time
BLADL76 | BL ADL Level (Katz 76)
BLWEIGHT | BL Weight kg
BLCRTCLR | BL Creatinine Clearance (Cockcroft-Gault)
BLARDS | BL ARDS Status

Abbreviations: ADL=activities of daily living; APACHE II=acute physiology and chronic health evaluation II; APTT=activated partial thromboplastin time; ARDS=acute respiratory distress syndrome; AT=antithrombin; BL=baseline; DIC=disseminated intravascular coagulation; GCS=Glasgow coma score; IL=interleukin; SOFA=sequential organ failure assessment.


Supplementary Table 4. Top single markers with logistic regression P values.

Unadjusted P values by SNP genotype are presented for the total PROWESS cohort. SNPs with at least one model showing significance (P<0.0001) are presented. SNPs are sorted by chromosome and position. One marker, rs17513961 in the 5’ untranslated region of LOC222052 (located near the insulin-like growth factor I binding protein [IGF1-BP] gene), met the threshold for genome-wide significance (P<5×10^-7) (shown in blue). Inadequate sample size for modeling is denoted by ‘-’.

SNP | Chr | Position | Alleles (A:B) | Gene | Location | trt*SNP P value (AA vs. not AA) | trt*SNP P value (AB vs. not AB) | trt*SNP P value (BB vs. not BB)
rs2076977 | 1 | 30346745 | C:T | MATN1 | flanking_3UTR | 3.6E-05 | 0.0018 | 0.2066
rs2173399 | 1 | 76982973 | C:T | LOC729708 | flanking_3UTR | 6.4E-05 | 0.0335 | 0.0442
rs2942917 | 1 | 194045294 | C:T | KCNT2 | flanking_3UTR | 9.5E-05 | 0.0376 | 0.0342
rs1915279 | 1 | 237142337 | A:G | KRT18P32 | flanking_3UTR | 0.4256 | 2.0E-05 | 2.2E-06
rs12618741 | 2 | 114834542 | C:T | LOC391428 | flanking_5UTR | 6.8E-05 | 0.0067 | 0.0146
rs1829975 | 2 | 161493415 | A:G | TANK | flanking_5UTR | 0.1425 | 6.0E-06 | 0.0002
rs1605461 | 2 | 189264326 | C:T | DIRC1 | flanking_3UTR | 0.0364 | 2.9E-05 | 0.0078
rs9815663 | 3 | 3589887 | C:T | LOC728221 | flanking_5UTR | 8.8E-05 | 0.0017 | 0.0513
rs2728981 | 3 | 22228198 | A:G | LOC728516 | flanking_3UTR | 9.8E-05 | 0.0372 | 0.1255
rs1478842 | 3 | 35127164 | C:T | KRT8P18 | flanking_3UTR | 7.8E-05 | 0.0034 | 0.0520
rs1920386 | 3 | 117259269 | A:G | LSAMP | intron | 6.8E-05 | 0.4444 | 0.1707
rs993691 | 3 | 141603432 | C:T | CLSTN2 | intron | 0.0345 | 2.4E-05 | 0.0007
rs7619971 | 3 | 141610022 | A:C | CLSTN2 | intron | 0.0008 | 5.5E-05 | 0.0379
rs16862676 | 3 | 151661629 | C:T | TSC22D2 | flanking_3UTR | 5.8E-05 | 3.7E-05 | 0.8099
rs6446731 | 4 | 3254549 | A:G | LOC345222 | flanking_3UTR | 0.9224 | 7.4E-05 | 0.0001
rs4525972 | 4 | 19101354 | A:G | LOC645174 | flanking_5UTR | 0.0013 | 9.8E-05 | 0.2492
rs6835841 | 4 | 19102675 | G:T | LOC645174 | flanking_5UTR | 0.0013 | 6.4E-05 | 0.1936
rs4608848 | 4 | 187247098 | C:T | TLR3 | flanking_3UTR | 5.7E-05 | 0.4568 | 0.0341
rs7702195 | 5 | 2142221 | C:T | IRX4 | flanking_5UTR | 0.0583 | 9.2E-05 | 0.0072
rs1897833 | 5 | 89684874 | A:C | CETN3 | flanking_3UTR | 0.1335 | 5.7E-05 | 0.0006
rs3952709 | 5 | 89688112 | A:G | CETN3 | flanking_3UTR | 0.0011 | 7.1E-05 | -
rs13153368 | 5 | 119045408 | C:T | LOC391824 | flanking_5UTR | 0.0047 | 8.1E-05 | 0.0892
rs359447 | 5 | 173203769 | C:T | CPEB4 | flanking_5UTR | 0.0006 | 2.5E-05 | 0.2598
rs7725278 | 5 | 174693553 | C:T | DRD1 | flanking_3UTR | 0.9190 | 6.6E-05 | 8.4E-05
rs12652255 | 5 | 174693701 | A:C | DRD1 | flanking_3UTR | 0.5204 | 1.4E-05 | 8.1E-06
rs9460540 | 6 | 20756741 | A:G | CDKAL1 | intron | 7.4E-05 | 0.0025 | 0.2252
rs10946835 | 6 | 26641736 | C:T | HMGN4 | flanking_5UTR | 8.9E-05 | 0.0013 | 0.6683
rs10484442 | 6 | 26663858 | A:G | HMGN4 | flanking_3UTR | 3.1E-05 | 0.0005 | 0.7141
rs6456733 | 6 | 26674783 | C:T | HMGN4 | flanking_3UTR | 2.8E-05 | 0.0004 | 0.7569
rs1078679 | 6 | 26676720 | C:T | HMGN4 | flanking_3UTR | 3.0E-05 | 0.0005 | 0.7053
rs6925895 | 6 | 26680144 | G:T | HMGN4 | flanking_3UTR | 4.0E-05 | 0.0006 | 0.7171
rs3130454 | 6 | 31216464 | A:G | PSORS1C1 | flanking_3UTR | 0.0005 | 7.4E-05 | 0.4521
rs6458690 | 6 | 49519578 | A:G | MUT | intron | 7.7E-05 | 0.0126 | 0.0433
rs6922979 | 6 | 99734696 | A:G | C6orf168 | flanking_3UTR | 0.2709 | 1.7E-05 | 0.0003
rs9488845 | 6 | 116559351 | A:C | COL10A1 | flanking_5UTR | 0.0364 | 4.3E-05 | 0.0139
rs174948 | 7 | 29855758 | A:G | WIPF3 | intron | 0.0055 | 6.9E-05 | 0.0969
rs174957 | 7 | 29876752 | A:G | WIPF3 | intron | 0.0150 | 3.9E-05 | 0.0236
rs2159688 | 7 | 31722887 | A:G | C7orf16 | flanking_3UTR | 8.4E-05 | 4.8E-05 | 0.9550
rs2159499 | 7 | 38195427 | A:G | STARD3NL | intron | 0.0234 | 1.7E-05 | 0.0040
rs4723738 | 7 | 38207433 | A:G | STARD3NL | intron | 0.0072 | 6.2E-06 | 0.0090
rs6462817 | 7 | 38217390 | C:T | STARD3NL | intron | 0.0044 | 1.1E-05 | 0.0180
rs7795499 | 7 | 38234920 | C:T | STARD3NL | intron | 0.0468 | 3.1E-05 | 0.0023
rs17513961 | 7 | 46359092 | G:T | LOC222052 | flanking_5UTR | - | 8.4E-07 | 1.3E-07
rs17172693 | 7 | 46397198 | C:T | LOC222052 | flanking_5UTR | 9.2E-07 | 3.7E-06 | 0.4927
rs10275461 | 7 | 46416755 | G:T | LOC222052 | flanking_5UTR | 6.7E-05 | 0.0049 | 0.0086
rs6953258 | 7 | 94213744 | C:T | LOC645973 | flanking_3UTR | 3.0E-05 | 0.0670 | 0.0429
rs2394824 | 7 | 97577253 | A:G | LMTK2 | intron | 0.5935 | 7.8E-05 | 4.4E-05
rs4729408 | 7 | 97610594 | C:T | LMTK2 | intron | 4.4E-05 | 3.2E-05 | 0.9316
rs11772060 | 7 | 97620757 | G:T | LMTK2 | intron | 3.6E-05 | 5.4E-05 | 0.6489
rs13273073 | 8 | 23640171 | G:T | NKX2-6 | flanking_5UTR | 5.3E-05 | 7.2E-06 | 0.5101
rs12680523 | 8 | 114208773 | C:T | CSMD3 | intron | - | 2.4E-05 | 0.0004
rs11984724 | 8 | 114323285 | C:T | CSMD3 | intron | 0.3105 | 9.3E-05 | 0.0006
rs2721952 | 8 | 116714231 | C:T | TRPS1 | intron | 0.7850 | 6.5E-05 | 9.9E-05
rs11786372 | 8 | 135561345 | C:T | ZFAT1 | intron | 0.0014 | 9.1E-05 | 0.2258
rs321224 | 9 | 20211607 | A:G | MLLT3 | flanking_3UTR | 0.0007 | 5.8E-05 | -
rs12380611 | 9 | 71838485 | C:T | MAMDC2 | flanking_5UTR | 7.3E-05 | 0.0119 | 0.2168
rs10869665 | 9 | 77708819 | C:T | PCSK5 | intron | 0.0028 | 8.1E-05 | 0.1098
rs4553010 | 9 | 132051053 | A:G | FREQ | flanking_3UTR | 3.9E-05 | 1.3E-05 | -
rs10999122 | 10 | 71507881 | C:T | H2AFY2 | intron | 9.3E-05 | 0.0371 | 0.0103
rs10500982 | 11 | 24505621 | A:G | LUZP2 | intron | 0.0002 | 4.5E-05 | 0.6083
rs11027980 | 11 | 24509005 | A:G | LUZP2 | intron | 0.0003 | 8.0E-05 | 0.6278
rs10767205 | 11 | 24510016 | A:C | LUZP2 | intron | 0.1823 | 6.0E-05 | 0.0006
rs611003 | 11 | 69154465 | A:C | CCND1 | flanking_5UTR | 8.0E-05 | 0.4848 | 0.0013
rs471814 | 11 | 119299719 | A:G | LOC390255 | flanking_5UTR | 5.4E-05 | 0.0476 | 0.0226
rs7134223 | 12 | 72141185 | G:T | LOC645654 | flanking_5UTR | 0.9899 | 7.9E-05 | 0.0001
rs1581403 | 12 | 72882519 | A:G | LOC387869 | flanking_3UTR | 3.5E-05 | 3.0E-05 | 0.7369
rs589258 | 13 | 93605357 | C:T | GPC6 | intron | 0.0340 | 4.3E-05 | 0.0074
rs497836 | 13 | 93605509 | G:T | GPC6 | intron | 0.0085 | 3.5E-05 | 0.0279
rs9587280 | 13 | 106480426 | C:T | LOC728215 | flanking_3UTR | 9.5E-05 | 0.0630 | 0.2311
rs17114618 | 14 | 43512877 | C:T | LOC645086 | flanking_3UTR | 9.2E-05 | 0.0002 | -
rs227397 | 14 | 69477522 | C:T | SMOC1 | intron | 0.2745 | 5.9E-05 | 0.0009
rs1011061 | 15 | 53774790 | C:T | PRTG | intron | 0.0003 | 6.0E-05 | 0.3344
rs8028880 | 15 | 53806945 | C:T | PRTG | intron | 0.0002 | 7.6E-05 | 0.3895
rs168841 | 16 | 8865964 | A:G | CARHSP1 | flanking_5UTR | - | 9.3E-05 | 0.0009
rs16953047 | 16 | 52687671 | G:T | FTO | intron | 0.0009 | 6.9E-05 | 0.1107
rs2716601 | 16 | 72633753 | C:T | LOC441506 | flanking_5UTR | 0.3214 | 4.0E-05 | 1.0E-06
rs12951391 | 17 | 67423177 | C:T | FLJ37644 | flanking_3UTR | 0.0009 | 9.7E-05 | 0.3207
rs7409 | 17 | 71547138 | A:G | SRP68 | 3UTR | 0.0032 | 5.9E-05 | 0.1110
rs2598414 | 17 | 71578694 | C:T | SRP68 | intron | 0.0035 | 2.5E-05 | 0.0623
rs2246277 | 17 | 71592940 | C:T | EXOC7 | intron | 0.0332 | 4.5E-05 | 0.0137
rs2665981 | 17 | 71600906 | C:T | EXOC7 | intron | 0.1730 | 6.4E-05 | 0.0052
rs2250054 | 17 | 71601856 | C:T | EXOC7 | intron | 0.0050 | 6.6E-05 | 0.1696
rs2243536 | 17 | 71601921 | C:T | EXOC7 | intron | 0.1713 | 5.7E-05 | 0.0044
rs2457693 | 17 | 71603745 | C:T | EXOC7 | intron | 0.0059 | 8.3E-05 | 0.1821
rs4140512 | 22 | 39362809 | A:G | MKL1 | flanking_5UTR | 3.9E-05 | 6.2E-05 | 0.5727
rs5995886 | 22 | 39363747 | A:G | MKL1 | flanking_5UTR | 0.5751 | 5.4E-05 | 3.4E-05
rs6001990 | 22 | 39364131 | A:G | MKL1 | flanking_5UTR | 4.5E-05 | 0.0001 | 0.4000
rs12159200 | 22 | 39372037 | A:C | MKL1 | flanking_5UTR | 5.9E-07 | 4.9E-06 | 0.1831
rs7292804 | 22 | 39377502 | G:T | MKL1 | flanking_5UTR | 2.0E-05 | 1.1E-05 | 0.9585

Abbreviations: Chr=chromosome; PROWESS=Recombinant Human Activated Protein C Worldwide Evaluation in Severe Sepsis trial; SNP=single nucleotide polymorphism.


Supplementary Table 5. One hundred highest-ranked 3-SNP genetic combination markers from the GWAS for treatment response.

ID SNP1 Gene1 SNP2 Gene2 SNP3 Gene3 ARR N

1 rs7725278..C/C T/T DRD1 rs2256527..T/C T/T LOC391273 rs10910651..A/A PCNXL2 41.65 372

2 rs7725278..C/C T/T DRD1 rs2256527..T/C T/T LOC391273 rs6695322..C/C PCNXL2 41.65 372

3 rs7725278..C/C T/T DRD1 rs2256527..T/C T/T LOC391273 rs10910651..A/A G/G PCNXL2 39.76 395

4 rs7725278..C/C T/T DRD1 rs2256527..T/C T/T LOC391273 rs6695322..C/C T/T PCNXL2 39.76 395



7 rs7725278..C/C T/T DRD1 rs3820116..G/G PCNXL2 rs2253239..T/C T/T LOC391273 37.22 437


9 rs7725278..T/T DRD1 rs2256527..T/C T/T LOC391273 rs10910651..A/A PCNXL2 41.30 359

10 rs7725278..T/T DRD1 rs2256527..T/C T/T LOC391273 rs6695322..C/C PCNXL2 41.30 359

11 rs7725278..C/C T/T DRD1 rs2844315..G/G T/G LOC391273 rs3820116..A/A G/G PCNXL2 36.57 446

12 rs7725278..C/C T/T DRD1 rs2844315..G/G T/G LOC391273 rs10910651..A/A PCNXL2 39.40 392

13 rs7725278..C/C T/T DRD1 rs2844315..G/G T/G LOC391273 rs6695322..C/C PCNXL2 39.40 392

14 rs7725278..C/C T/T DRD1 rs3820116..G/G PCNXL2 rs2844315..G/G T/G LOC391273 36.93 435

15 rs7725278..C/C T/T DRD1 rs3820116..G/G PCNXL2 rs2256527..T/C T/T LOC391273 38.35 410






20 rs7725278..C/C T/T DRD1 rs2844315..G/G T/G LOC391273 rs6695322..C/C T/T PCNXL2 37.92 415

21 rs2716601..T/T LOC441506 rs10488192..C/C T/T TMEM106B rs4638843..C/C MSH2 42.91 309

22 rs7725278..T/T DRD1 rs2256527..T/C T/T LOC391273 rs10910651..A/A G/G PCNXL2 39.30 381

23 rs7725278..T/T DRD1 rs2256527..T/C T/T LOC391273 rs6695322..C/C T/T PCNXL2 39.30 381

24 rs7725278..T/T DRD1 rs2253239..T/C T/T LOC391273 rs10910651..A/A PCNXL2 39.27 380

25 rs7725278..T/T DRD1 rs2253239..T/C T/T LOC391273 rs6695322..C/C PCNXL2 39.27 380

26 rs4723738..A/A G/G STARD3NL rs6703979..T/C T/T C1orf212 rs247447..A/A G/G TMEM170 42.31 317

27 rs7725278..T/T DRD1 rs2844315..G/G T/G LOC391273 rs12624775..T/T C20orf82 42.68 307

28 rs7725278..C/C T/T DRD1 rs2844315..G/G T/G LOC391273 rs12624775..T/T C20orf82 41.89 319

29 rs7725278..T/T DRD1 rs2844315..G/G T/G LOC391273 rs10910651..A/A PCNXL2 38.95 378

30 rs7725278..T/T DRD1 rs2844315..G/G T/G LOC391273 rs6695322..C/C PCNXL2 38.95 378


32 rs1829975..A/A G/G TANK rs2033483..C/C T/T ZNF317 rs2671822..A/A A/G SLFN5 22.95 787

33 rs7725278..C/C T/T DRD1 rs2253239..T/C T/T LOC391273 rs10489576..G/G PCNXL2 34.21 462

34 rs7725278..T/T DRD1 rs2256527..T/C T/T LOC391273 rs3820116..G/G PCNXL2 37.92 397


36 rs7725278..C/C T/T DRD1 rs2844315..G/G T/G LOC391273 rs10489576..G/G PCNXL2 34.29 459

37 rs12652255..A/A C/C DRD1 rs8121302..C/C T/C SNX5 rs583582..T/C T/T KRT8P5 33.03 481

38 rs7725278..T/T DRD1 rs2253239..T/C T/T LOC391273 rs3820116..G/G PCNXL2 36.63 421


40 rs17513961..T/T LOC222052 rs1555772..T/C T/T GJA1 rs2571490..C/C T/T PCDH7 19.86 919



42 rs7725278..C/C T/T DRD1 rs8121302..C/C T/C SNX5 rs12530988..C/C T/T GLI3 38.58 382


44 rs7725278..C/C T/T DRD1 rs2256527..T/C T/T LOC391273 rs16858553..T/T PCNXL2 38.47 384

45 rs2716601..T/T LOC441506 rs10488192..C/C T/T TMEM106B rs4638843..C/C G/G MSH2 41.79 317

46 rs4140512..A/A G/G MKL1 rs6746008..T/G T/T FAM49A rs10093480..C/C T/T PLEKHF2 26.56 646



49 rs6462817..C/C T/T STARD3NL rs4899362..C/C T/C TTC9 rs3934594..C/C T/C C9orf47 29.71 554

50 rs7725278..T/T DRD1 rs2844315..G/G T/G LOC391273 rs3820116..A/A G/G PCNXL2 35.98 430

51 rs2716601..T/T LOC441506 rs1394713..A/A G/G OR51D1 rs2400517..C/C LOC730224 42.19 307

52 rs6462817..C/C T/T STARD3NL rs6703979..T/C T/T C1orf212 rs247447..A/A G/G TMEM170 40.96 331


54 rs16862676..C/C T/T TSC22D2 rs1829975..A/A G/G TANK rs7632166..A/C C/C CNTN6 27.57 607

55 rs17513961..T/T LOC222052 rs2993446..T/C T/T LGR6 rs6908897..A/A A/G GJA1 17.93 999

56 rs16862676..C/C TSC22D2 rs1829975..A/A G/G TANK rs7632166..A/C C/C CNTN6 27.95 595

57 rs4723738..A/A G/G STARD3NL rs6703979..T/C T/T C1orf212 rs2073619..C/C T/T CFDP1 41.84 311

58 rs13080378..A/A G/G KBTBD8 rs9957770..A/A G/G LOC653322 rs17513961..T/T LOC222052 28.93 568

59 rs17513961..T/T LOC222052 rs10864734..A/C C/C GALNT2 rs2571490..C/C T/T PCDH7 19.84 904

60 rs6441260..A/A G/G SCHIP1 rs2027871..C/C T/C PKIG rs3732062..A/A SLC1A4 29.66 549

61 rs2716601..T/T LOC441506 rs10488192..C/C TMEM106B rs4638843..C/C G/G MSH2 42.27 302


63 rs7725278..T/T DRD1 rs2253239..T/C T/T LOC391273 rs6695322..C/C T/T PCNXL2 37.27 403


64 rs1362578..C/C CBLN1 rs906691..C/C T/C LOC643307 rs7725278..C/C T/T DRD1 26.83 628

65 rs16862676..C/C TSC22D2 rs1829975..G/G TANK rs7632166..A/C C/C CNTN6 29.05 564

66 rs7725278..T/T DRD1 rs2844315..G/G T/G LOC391273 rs10910651..A/A G/G PCNXL2 37.39 400

67 rs7725278..T/T DRD1 rs2844315..G/G T/G LOC391273 rs6695322..C/C T/T PCNXL2 37.39 400



70 rs17513961..T/T LOC222052 rs168841..A/A G/G CARHSP1 rs2028548..C/C T/C IDH3A 21.46 827

71 rs1915279..A/A A/G KRT18P32 rs12159200..A/A MKL1 rs17098287..C/C T/T AK5 41.33 320

72 rs7725278..T/T DRD1 rs2253239..T/C T/T LOC391273 rs12624775..T/T C20orf82 41.82 310

73 rs2716601..T/T LOC441506 rs4638843..C/C MSH2 rs2409571..A/A A/G TNKS 41.75 311

74 rs12652255..A/A C/C DRD1 rs8121302..C/C T/C SNX5 rs7916830..T/C T/T BTRC 29.85 542

75 rs7292804..G/G T/T MKL1 rs6746008..T/G T/T FAM49A rs10093480..C/C T/T PLEKHF2 26.02 653

76 rs17513961..T/T LOC222052 rs6908897..A/A A/G GJA1 rs2571490..C/C T/T PCDH7 19.15 936


78 rs6462817..C/C T/T STARD3NL rs3934594..C/C T/C C9orf47 rs10496695..A/A G/G NULL 29.82 542

79 rs7725278..T/T DRD1 rs2844315..G/G T/G LOC391273 rs3820116..G/G PCNXL2 36.32 419

80 rs1420589..C/C CBLN1 rs906691..C/C T/C LOC643307 rs7725278..C/C T/T DRD1 26.57 633

81 rs4723738..A/A G/G STARD3NL rs6703979..T/C T/T C1orf212 rs247447..A/A TMEM170 41.90 307

82 rs12652255..A/A C/C DRD1 rs2468133..T/C T/T EXT1 rs8121302..C/C T/C SNX5 30.40 527

83 rs1362578..C/C CBLN1 rs906691..C/C T/C LOC643307 rs12652255..A/A C/C DRD1 26.33 640

84 rs17513961..T/T LOC222052 rs2993446..T/C T/T LGR6 rs2028548..C/C T/C IDH3A 18.49 966

85 rs6462817..C/C T/T STARD3NL rs4899362..C/C T/C TTC9 rs2294360..A/G G/G SMCR7L 30.13 533


86 rs5995886..A/A G/G MKL1 rs6746008..T/G T/T FAM49A rs10093480..C/C T/T PLEKHF2 26.12 647

87 rs17513961..T/T LOC222052 rs1555772..T/C T/T GJA1 rs2571490..C/C PCDH7 19.53 912

88 rs16862676..C/C T/T TSC22D2 rs2993446..T/C T/T LGR6 rs10778233..A/A A/G LOC642550 19.27 925

89 rs4140512..A/A G/G MKL1 rs6746008..T/G T/T FAM49A rs10093480..T/T PLEKHF2 26.61 630

90 rs1420589..C/C CBLN1 rs906691..C/C T/C LOC643307 rs12652255..A/A C/C DRD1 26.12 646

91 rs16862676..C/C T/T TSC22D2 rs1829975..G/G TANK rs7632166..A/C C/C CNTN6 28.54 573

92 rs7725278..C/C T/T DRD1 rs2253239..T/C T/T LOC391273 rs12624775..T/T C20orf82 41.08 322

93 rs2716601..T/T LOC441506 rs4638843..C/C G/G MSH2 rs2409571..A/A A/G TNKS 41.27 318

94 rs583582..T/C T/T KRT8P5 rs426357..A/G ST8SIA6 rs2173399..T/C T/T LOC729708 41.96 304

95 rs17172693..C/C LOC222052 rs1555772..T/C T/T GJA1 rs2571490..C/C T/T PCDH7 19.94 888

96 rs1829975..A/A G/G TANK rs2033483..C/C T/T ZNF317 rs1503938..A/A A/G GALNTL4 22.11 794

97 rs966775..C/C T/T DRD1 rs2253239..T/C T/T LOC391273 rs3820116..G/G PCNXL2 35.35 433


99 rs10500413..G/G LOC644649 rs8121302..C/C T/C SNX5 rs12652255..A/A C/C DRD1 31.28 505

100 rs1829975..A/A G/G TANK rs11656883..T/C T/T KCNJ16 rs2046545..A/A G/G ACTR3 24.59 700

Abbreviations: ARR=absolute risk reduction; GWAS=genome-wide association study; ID=combination marker identification (based on ranking using the LOESS method); LOESS=local polynomial regression fitting; N=subgroup size; SNP=single nucleotide polymorphism.
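
For readers less familiar with the ARR column, the sketch below shows how an absolute risk reduction is conventionally computed within one subgroup: the mortality risk in the control arm minus the risk in the treated arm, expressed here in percentage points. This is an assumption about the convention rather than a statement of the exact calculation used for the table, and the counts are hypothetical, not data from the study.

```python
def absolute_risk_reduction(deaths_placebo, n_placebo,
                            deaths_treated, n_treated):
    """ARR in percentage points: control-arm risk minus treated-arm risk."""
    if n_placebo <= 0 or n_treated <= 0:
        raise ValueError("both arms need at least one patient")
    risk_placebo = deaths_placebo / n_placebo
    risk_treated = deaths_treated / n_treated
    return 100.0 * (risk_placebo - risk_treated)


# Hypothetical subgroup of 372 patients split evenly between arms.
arr = absolute_risk_reduction(deaths_placebo=93, n_placebo=186,
                              deaths_treated=31, n_treated=186)
print(f"ARR = {arr:.2f} percentage points")
```

A positive value indicates lower mortality in the treated arm of the subgroup; N in the table corresponds to the combined size of both arms.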


Supplementary Table 6. One hundred highest-ranked 3-SNP genetic single markers from the multimarker response.

ID SNP Gene ARR N

1 rs12772424..T/T TCF7L2 20.39 509

2 rs2716601..T/T LOC441506 19.24 569

3 rs12951391..T/C FLJ37644 20.82 464

4 rs6462817..C/C T/T STARD3NL 16.52 761

5 rs497836..T/G GPC6 18.06 622

6 rs1078679..C/C HMGN4 21.13 421

7 rs6456733..C/C HMGN4 21.13 421

8 rs589258..T/C GPC6 17.85 621

9 rs10484442..A/A HMGN4 21.05 420

10 rs7204404..A/A KIAA1576 22.98 331

11 rs7205198..A/A KIAA1576 22.98 331

12 rs4723738..A/A G/G STARD3NL 16.43 727

13 rs2159499..A/A G/G STARD3NL 16.06 759

14 rs1915279..A/A A/G KRT18P32 17.30 646

15 rs767471..G/G HMGN4 20.87 421

16 rs9461271..G/G HMGN4 20.87 421

17 rs174957..A/G WIPF3 16.73 686

18 rs6925895..G/G HMGN4 20.71 422

19 rs1570061..T/T ABT1 20.55 430

20 rs1677991..A/G PPP2R5C 20.85 414

21 rs7795499..C/C T/T STARD3NL 15.73 758

22 rs359447..T/C CPEB4 18.86 517

23 rs9467782..G/G HMGN4 20.70 412

24 rs2224380..G/G HMGN4 20.48 422

25 rs6941022..T/T HMGN4 20.48 422

26 rs12159200..A/A MKL1 13.84 924

27 rs993629..C/C DEFB112 23.06 310

28 rs1321482..T/T ABT1 20.25 429


29 rs6918854..G/G ABT1 20.25 429

30 rs2716555..C/C LOC441506 20.22 427

31 rs2256537..T/C PPP2R5C 19.75 453

32 rs6564472..C/C KIAA1576 21.97 346

33 rs4871..C/C HMGN4 20.38 417

34 rs10946835..C/C HMGN4 20.44 414

35 rs9986382..T/T HMGN4 20.25 419

36 rs1535276..T/T BTN1A1 20.26 415

37 rs9393729..C/C BTN1A1 20.26 415

38 rs3917490..G/G PON1 22.19 327

39 rs11772060..T/G T/T LMTK2 17.66 564

40 rs1056667..T/T BTN1A1 20.18 414

41 rs2638458..C/C T/C LOC727878 14.25 841

42 rs12495889..T/G NAALADL2 17.34 586

43 rs4729408..T/C LMTK2 18.77 491

44 rs6456735..G/G ABT1 20.17 411

45 rs3803237..T/C T/T COL4A2 18.86 484

46 rs529998..C/C VCAN 19.70 436

47 rs2159688..A/A C7orf16 16.78 618

48 rs4312025..A/A A/C TMEM16C 21.90 333

49 rs2393670..A/A HMGN4 20.10 413

50 rs9467783..C/C HMGN4 20.10 413

51 rs611003..A/A CCND1 20.98 372

52 rs706723..A/A LOC728399 20.95 373

53 rs1982774..C/C CCND1 18.99 473

54 rs137749..A/G G/G EFCAB6 22.05 326

55 rs10869665..T/C PCSK5 17.53 563

56 rs7619971..A/A C/C CLSTN2 14.57 805

57 rs6505113..G/G MYO18A 18.45 503

58 rs2159688..A/A G/G C7orf16 14.75 787

59 rs12159200..A/A C/C MKL1 12.67 986


60 rs12652255..A/A C/C DRD1 12.04 1050

61 rs1829975..A/A G/G TANK 11.99 1055

62 rs7894316..C/C T/C PITRM1 15.47 711

63 rs12573176..A/C LOC645120 20.40 393

64 rs9295695..T/T HMGN4 19.94 414

65 rs17057618..A/A A/C ZFAND5 22.28 314

66 rs12652255..C/C DRD1 12.33 1014

67 rs11772060..T/G LMTK2 18.50 493

68 rs1894629..T/C T/T EFCAB6 21.78 330

69 rs4729408..T/C T/T LMTK2 17.37 562

70 rs17724172..T/C DLGAP1 19.62 427

71 rs7796440..C/C T/C DGKB 12.99 933

72 rs592483..T/T CCND1 18.65 480

73 rs1001594..A/G PCLO 15.55 691

74 rs426357..A/G ST8SIA6 16.76 602

75 rs10006329..T/T LOC646187 21.42 342

76 rs1478842..T/C T/T KRT8P18 15.13 729

77 rs993691..C/C T/T CLSTN2 14.91 752

78 rs16953047..T/G FTO 21.02 359

79 rs4754011..A/G JRKL 18.74 471

80 rs10895059..T/C T/T JRKL 18.67 475

81 rs1677991..A/G G/G PPP2R5C 18.02 512

82 rs4754011..A/A A/G JRKL 17.78 526

83 rs1897833..A/C CETN3 21.72 326

84 rs1915279..A/G KRT18P32 17.69 530

85 rs739234..A/A A/G EFCAB6 21.50 334

86 rs12941303..T/C FLJ37644 18.94 456

87 rs3952709..A/G CETN3 22.24 309

88 rs2394824..A/A A/G LMTK2 17.15 565

89 rs6895193..A/A G/G PRLR 14.14 811

90 rs1894631..A/A A/G EFCAB6 21.38 336


91 rs11918193..T/C SERPINI2 15.50 678

92 rs11263509..C/C CCND1 16.00 639

93 rs16862676..C/C TSC22D2 10.93 1132

94 rs16862676..C/C T/T TSC22D2 10.69 1158

95 rs17176958..A/G GVIN1 19.47 420

96 rs6953258..C/C LOC645973 20.97 352

97 rs11128801..A/G G/G LOC644638 15.68 659

98 rs17513961..T/T LOC222052 10.22 1207

99 rs12951391..T/C T/T FLJ37644 17.34 542

100 rs6780177..T/C ZNF385D 21.16 342

Abbreviations: ARR=absolute risk reduction; GWAS=genome-wide association study; ID=combination marker identification (based on ranking using the LOESS method); LOESS=local polynomial regression fitting; N=subgroup size; SNP=single nucleotide polymorphism.


Supplementary Table 7. Listing of the 20 single markers above the LOESS line for prognostic markers after the first stage. A cutoff was used for continuous variables based on what has been previously defined and accepted in the critical care field or by the regulatory agencies (for example, APACHE II score ≥25 or multiple organ dysfunction ≥2) or determined via a recursive partition method.

ID Subgroup Marker Risk N

1 blcrtclr..<52.34 BL Creatinine Clearance (Cockcroft-Gault) 46.36 302

2 prapache..≥25 Pre-infusion APACHE II Score 43.23 347

3 age..q4 Age at Admission (yrs) 47.34 169

4 age..q3 q4 Age at Admission (yrs) 41.13 355

5 blpcact..q1 BL Protein C Activity 46.63 163

6 rs11657397..A/A A/G C17orf79 47.90 119

7 blpt..q4 BL Prothrombin Time 45.06 162

8 rs2908435..T/T NXPH3 48.51 101

9 rs190254..T/C TSHZ2 47.79 113

10 rs12772424..T/T TCF7L2 41.35 237

11 rs2814471..A/G G/G TMCO1 44.97 149

12 rs6506428..A/A A/G ARHGAP28 41.47 217

13 blsofren..q2 q3 q4 BL SOFA Renal Score 36.72 433

14 rs1454371..A/G G/G FAM92A3 44.10 161

15 rs1800802..T/C MGP 41.63 209

16 rs12610468..C/C OLFM2 41.75 206

17 rs7204404..A/A KIAA1576 43.71 167

18 rs7205198..A/A KIAA1576 43.71 167

19 rs1454371..A/G FAM92A3 44.67 150

20 rs7555523..A/C C/C TMCO1 44.67 150

Abbreviations: APACHE II=acute physiology and chronic health evaluation II; BL=baseline; ID=marker identification (based on ranking using LOESS method); LOESS=local polynomial regression fitting; N=subgroup size; Risk=28-day mortality; SOFA=sequential organ failure assessment.


Supplementary Figure 1. Manhattan plots of the GWAS results for genotype AA versus not AA (a) and AB versus not AB (b). GWAS=genome-wide association study.

a. AA vs. not AA

b. AB vs. not AB


Supplementary Figure 2. Quantile-Quantile plots from GWAS in the entire genetic cohort.

Quantile-quantile plots showing the distribution of P values for each genotype contrast: BB versus not BB (a), that is, one homozygous genotype (BB) versus the heterozygous (AB) or other homozygous (AA) genotype; AA versus not AA (b); and AB versus not AB (c). The vast majority of markers have observed P values below the X=Y line expected due to chance. GWAS=genome-wide association study.

a. BB vs. not BB

b. AA vs. not AA

c. AB vs. not AB


Supplementary Figure 3. Multidimensional scaling of the highest-ranking 100, 3-SNP combination markers.

The top 3 multidimensional scaling (MDS) components were used to cluster the highest-ranking 100, 3-SNP combination markers (CMs). Six discrete clusters were observed, each defined largely by a SNP common to the CMs in that cluster. For example, cluster 1 (red cluster) contains 54 CMs, all of which include the DRD1 and LOC391273 genes and the majority of which include PCNXL2. The CMs in cluster 5 are more diverse by comparison, and cluster 5 contains multiple sets of genes; this is visually depicted by the diffuse distribution in the MDS plot (green plot). SNP=single nucleotide polymorphism.


Supplementary Figure 4. Effect of permutation size on the variability of the top markers.

From the 1000 permutation results (top 1-, 2-, and 3-marker results), bootstrap samples of various sizes (10, 20, 50, 100, 200, 500, and 1000) were generated, with 10 repetitions per size. We then calculated the standard deviation of the residuals from the LOESS fit on the combined top results for each bootstrap sample size. LOESS=local polynomial regression fitting.


Supplementary Methods.

Methodology for single marker analyses using a traditional association approach

For genetic markers, the following derived variables (to model different modes of inheritance) were generated for each single nucleotide polymorphism (SNP) passing quality control:

• AA (two major alleles) or Not AA (0 or 1 copy of a major allele)

• Aa (heterozygous) or Not Aa (not heterozygotes)

• aa (two minor alleles) or Not aa (0 or 1 copy of a minor allele)
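
The three derived variables listed above can be illustrated with a short recoding function. This is a minimal sketch, assuming genotypes are stored as two-character strings and that the major allele is known for each SNP; the function name and the input format are assumptions, not the coding used in the study.

```python
def derive_indicators(genotype, major_allele):
    """Return the AA, Aa, and aa indicator variables for one genotype.

    genotype: two-character string such as "CT" (assumed input format).
    major_allele: the more frequent allele at this SNP, e.g. "C".
    """
    if len(genotype) != 2:
        raise ValueError("expected a two-allele genotype such as 'CT'")
    n_major = sum(1 for allele in genotype if allele == major_allele)
    return {
        "AA": int(n_major == 2),  # two major alleles vs. not AA
        "Aa": int(n_major == 1),  # heterozygous vs. not heterozygous
        "aa": int(n_major == 0),  # two minor alleles vs. not aa
    }


for g in ("CC", "CT", "TC", "TT"):
    print(g, derive_indicators(g, major_allele="C"))
```

Exactly one of the three indicators equals 1 for any genotype, so each binary variable contrasts one genotype class against the other two combined.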

Logistic regression was used to identify a SNP-by-treatment interaction association with treatment response. Based on a forward selection procedure, age and baseline Acute Physiology and Chronic Health Evaluation (APACHE) II score were included as covariates in the model, in addition to treatment. This was completed for the total PROWESS cohort of severe sepsis patients to maximize the power to see an effect. The analysis was then repeated in the subgroup of interest, e.g., multiple organ dysfunction (MOD) ≥2 (the indicated population in Europe for DAA), to examine this subgroup for consistency.

Likewise, analyses were completed on the total cohort population (all ethnicities) as well as the Caucasian subgroup because of the possibility of population substructure confounding. A population substructure adjustment was not included in the model because of the overwhelming majority of Caucasians in the data.

A logistic model was fit to 28-day survival. Model 1 was used to test for a therapy*SNP interaction. The general form of the logistic model was:

$$\hat{P} = \frac{1}{1 + e^{-x'b}} \qquad [1],$$

where the response and parameters (contained in the full x’b vector) were defined as

P̂ = probability of 28-day survival
b_0 = common intercept
b_age = effect of age
b_prapache = effect of baseline APACHE II score
b_therapy = main effect of therapy
b_SNP = main effect of SNP
b_therapy*SNP = therapy*SNP interaction.
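
To show how equation [1] combines these terms, the sketch below evaluates the predicted probability of 28-day survival for a single patient. It is illustrative only: the coefficient values are hypothetical (they are not estimates from the study), therapy and SNP are assumed to be coded as 0/1 indicators, and the function and key names are introduced here for the example.

```python
import math


def survival_probability(b, age, prapache, therapy, snp):
    """Evaluate equation [1], P = 1 / (1 + exp(-x'b)), for one patient.

    b maps term names to coefficients; therapy and snp are 0/1 indicators
    (an assumed coding, e.g. snp = 1 for AA and 0 for not AA).
    """
    xb = (b["intercept"]
          + b["age"] * age
          + b["prapache"] * prapache
          + b["therapy"] * therapy
          + b["snp"] * snp
          + b["therapy*snp"] * therapy * snp)
    return 1.0 / (1.0 + math.exp(-xb))


# Hypothetical coefficients, chosen only to make the example run.
coef = {"intercept": 3.0, "age": -0.02, "prapache": -0.05,
        "therapy": 0.2, "snp": -0.3, "therapy*snp": 0.6}

for therapy in (0, 1):
    for snp in (0, 1):
        p = survival_probability(coef, age=60, prapache=25,
                                 therapy=therapy, snp=snp)
        print(f"therapy={therapy} snp={snp} P(survival)={p:.3f}")
```

The interaction coefficient only contributes when both indicators equal 1, which is why its test asks whether the treatment effect differs by genotype.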

Log likelihood-ratio (LR) tests, in the form of the deviance (D), were used to estimate the P

value of therapy*SNP interaction effects (2-sided at 0.05 level) in the presence of additional

regressors. The deviance is asymptotically equivalent to taking the difference in the log

likelihoods of full and reduced models, LR_(full model) – LR_(reduced model). For example, when the

therapy*SNP interaction is eliminated using model 1, the resulting deviance D(b_therapy*SNP | b_0,

b_age, b_prapache, b_therapy, b_SNP) = LR_(model1:full) – LR_(model1:reduced) represents the amount of variability in the

data explained by the therapy*SNP interaction in the presence of a common intercept, main

effects of age, baseline APACHE II score, therapy, and SNP, and is distributed as a chi-square

with 2 degrees-of-freedom for a SNP with 3 levels (if a marker is tested as AA versus not AA,

then the therapy*marker interaction is distributed as a chi-square with 1 degree-of-freedom).
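The likelihood-ratio test described above can be sketched as follows. This assumes the log likelihoods of the full and reduced models are already available from a fitted logistic regression, and it uses the standard statistic 2 × (log L_full − log L_reduced). The function names are hypothetical, and only the 1- and 2-degree-of-freedom cases mentioned in the text are handled, because those have closed-form tail probabilities.

```python
import math

def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable for df = 1 or 2."""
    if x < 0:
        raise ValueError("test statistic must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("only df = 1 or 2 are handled in this sketch")

def interaction_lr_test(loglik_full: float, loglik_reduced: float, df: int):
    """Return (statistic, P value) for dropping the therapy*SNP term."""
    stat = 2.0 * (loglik_full - loglik_reduced)
    # Guard against tiny negative values caused by numerical noise.
    return stat, chi2_sf(max(stat, 0.0), df)
```

For example, `interaction_lr_test(-500.0, -503.0, df=2)` gives a statistic of 6.0, which falls just below the 0.05 level for a 3-level SNP.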

Similarly, logistic regression was used to identify a prognostic marker. Logistic regression tests

were used to estimate the P value of marker effect (2-sided at 0.05 level) in the presence of

additional regressors.

Methodology for the multimarker search

For clinical markers, important subgroups of patients as defined by clinical criteria such as

APACHE II ≥25 and multiorgan dysfunction (MOD) ≥2 were included as factors in the modeling

so as to identify their interaction and relative importance in the presence of genetic and other

non-genetic markers.


In addition, clinical markers, such as protein C levels, interleukin 6 (IL-6) levels, disseminated

intravascular coagulation (DIC) score, thrombin-antithrombin (TAT) complex, and baseline

Cockcroft-Gault estimation of creatinine clearance <52.3, were evaluated as single markers as

described below in Step 1 and could be included in the modeling as described in Step 2 (see

below).

For selected clinical markers, a CUTOFF may be used that has been previously defined and

accepted in the critical care field or by the regulatory agencies (e.g., APACHE II score ≥25 or

MOD ≥2) or determined via a recursive partition method. The following derived variable was

generated for selected clinical markers or additional subgroups:

• ≥ (or >) CUTOFF or < (or ≤) CUTOFF

We could further subset clinical data and derived variables (subgroup of patients) by

quartiles to generate subgroups that could include:

• X ≤ 1st quartile or X > 1st quartile

• X ≤ 1st quartile, 1st quartile < X ≤ 3rd quartile or X > 3rd quartile

• X ≤ 1st quartile, 1st quartile < X ≤ 2nd quartile, 2nd quartile < X ≤ 3rd quartile or

X > 3rd quartile

• X ≤ 2nd quartile or X > 2nd quartile

• X ≤ 3rd quartile or X > 3rd quartile
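The quartile-based subgroups listed above could be derived along these lines. This is a sketch only: the helper names are hypothetical, and the "inclusive" quantile method is an assumption, since the source does not state how quartiles were computed.

```python
import statistics

def quartile_bands(values):
    """Label each value 1-4 by quartile band: 1 means X <= Q1, 4 means X > Q3."""
    # The quantile method is an assumption; the source does not specify one.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    bands = []
    for x in values:
        if x <= q1:
            bands.append(1)
        elif x <= q2:
            bands.append(2)
        elif x <= q3:
            bands.append(3)
        else:
            bands.append(4)
    return bands

def binary_split(bands, cut):
    """Return 1 where the value lies above the cut-th quartile (cut in 1, 2, 3)."""
    return [int(b > cut) for b in bands]
```

The four-level variable is `quartile_bands` itself; the three-level variable follows by merging bands 2 and 3, and the two-level variables come from `binary_split` with `cut` set to 1, 2, or 3.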

Each subgroup defined by one marker (genetic, clinical, or subgroup) or combination of markers

with sufficient size (>20% of total population) was assessed for differential treatment response

(as defined by absolute risk reduction [ARR] or number needed to treat [NNT]). To reduce

computation time, ARR or NNT could be calculated as observed instead of as model based and

the confidence interval of ARR or NNT may not be generated.
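A minimal sketch of the observed (not model-based) ARR and NNT calculation with the subgroup-size requirement is given below. The function name and the 0/1 input encoding are assumptions for illustration; ARR is taken as the survival rate on treatment minus the survival rate on placebo within the subgroup, and NNT as its inverse.

```python
def observed_arr_nnt(treated, survived, member, min_fraction=0.20):
    """Observed ARR and NNT for one subgroup, or None if it cannot be scored.

    treated, survived, member: equal-length sequences of 0/1 (or bool) flags,
    one entry per patient.
    """
    n_total = len(treated)
    idx = [i for i in range(n_total) if member[i]]
    # The text requires subgroups larger than 20% of the total population.
    if len(idx) <= min_fraction * n_total:
        return None

    def survival_rate(arm):
        arm_idx = [i for i in idx if bool(treated[i]) == arm]
        if not arm_idx:
            return None
        return sum(1 for i in arm_idx if survived[i]) / len(arm_idx)

    rate_treated = survival_rate(True)
    rate_placebo = survival_rate(False)
    if rate_treated is None or rate_placebo is None:
        return None
    arr = rate_treated - rate_placebo
    nnt = 1.0 / arr if arr != 0 else float("inf")
    return arr, nnt
```

No confidence interval is computed here, consistent with the computational shortcut described above.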


A 2-step method was used to identify markers that define subgroups with better differential

treatment response in a sizable number of subjects [23-25,28]. (Marchini et al. 2005 proposed a

2-stage approach, and subsequently other authors (Evans et al. 2006; Ionita and Man 2006; and

Zhang et al. 2010) recommended alternative approaches shown to have more power. In this

manuscript we utilized a 2-step conditional approach augmented by LOESS and MDS

strategies.)

• Step 1. Calculated ARR or NNT for each subgroup defined by a single marker. ARR could be calculated as the observed difference in survival rate between the treatment and placebo groups. Alternatively, ARR could be determined from a logistic model with appropriate covariates, e.g., logit (survival) ~ treatment + marker + treatment*marker + age + additional covariates (first few principal components or other covariates). The calculation of observed ARR is computationally more efficient and was therefore the primary analysis. NNT can be calculated as the inverse of ARR.

• Step 2. Selected the top markers from the first step (selected on the merit of best balancing maximal ARR and subgroup size). The number of markers (M) selected from the first step depended on the computing capacity, since the maximal computation is bounded by M*N (N is the total number of derived marker variables). If M is 1000 and N is 5 million, the total number of subgroups is 5 billion. If computer resources allowed 1 billion calculations, then M needed to be reduced to 200. Each “significant” marker, whether it was clinical or genetic, was paired with another marker to define a new subgroup, then the ARR or NNT was calculated for the new subgroup. Step 2 could be iterated to include one or more additional markers.
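The two steps above can be sketched as follows. This is a simplified illustration under stated assumptions: markers are 0/1 indicator lists, step-1 ranking uses observed ARR alone (the source balances ARR against subgroup size), and all function names are hypothetical.

```python
def arr(treated, survived, member):
    """Observed ARR within a subgroup; None if either arm is empty."""
    rates = []
    for arm in (1, 0):
        rows = [s for t, s, m in zip(treated, survived, member) if m and t == arm]
        if not rows:
            return None
        rates.append(sum(rows) / len(rows))
    return rates[0] - rates[1]

def max_top_markers(max_calcs, n_markers):
    """Largest M such that M * N stays within the computing budget."""
    return max(1, max_calcs // n_markers)

def two_step_search(markers, treated, survived, max_calcs, min_fraction=0.20):
    """markers: dict mapping marker name to a list of 0/1 flags per patient."""
    n = len(treated)

    def score(member):
        if sum(member) <= min_fraction * n:  # subgroup too small
            return None
        return arr(treated, survived, member)

    # Step 1: score every single-marker subgroup.
    single = {name: score(flags) for name, flags in markers.items()}
    ranked = sorted((k for k, v in single.items() if v is not None),
                    key=lambda k: single[k], reverse=True)
    top = ranked[:max_top_markers(max_calcs, len(markers))]

    # Step 2: pair each top marker with every other marker (logical AND).
    pairs = {}
    for a in top:
        for b in markers:
            key = tuple(sorted((a, b)))
            if a == b or key in pairs:
                continue
            member = [int(bool(x) and bool(y))
                      for x, y in zip(markers[a], markers[b])]
            value = score(member)
            if value is not None:
                pairs[key] = value
    return single, pairs
```

With the figures from the text, `max_top_markers(1_000_000_000, 5_000_000)` returns 200. A third marker could be added by applying the same pairing step to the top two-marker subgroups.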

Because multiple subgroups defined by many markers were examined, a chance finding of a

subgroup with favorable ARR and subgroup size could occur. Bonferroni correction is commonly

used for multiple comparisons. The total number of subgroups that could be examined may be

as high as 10^12. Using N = 10^12 for Bonferroni correction would virtually guarantee that no

significant result would be found due to the diminished power with the current sample size.

Hence, it was not reasonable to adjust the P value based on total number of all possible

subgroups or tests. For this reason, it has been argued that the P value threshold should not be


based on the number of tests performed since larger studies with more markers are more likely

to succeed than studies with a smaller number of markers [30]. In this study, we used ARR

(differential treatment effect) instead of P value as the selection criterion for the 2-step search. We

selected subgroups with ARR >12.5% and subgroup frequency >20%. The total number of

selected subgroups was M, and the subgroups may not be independent.
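To make the Bonferroni argument above concrete, the implied per-test threshold can be worked out directly. This assumes the 0.05 level used elsewhere in these methods is taken as the family-wise level, which the source does not state explicitly.

```latex
\alpha_{\text{per test}} = \frac{\alpha}{N} = \frac{0.05}{10^{12}} = 5 \times 10^{-14}
```

A threshold this small is far stricter than anything the available sample size could plausibly reach, which is the reason given above for selecting on ARR and subgroup frequency instead.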

Methodology for linkage disequilibrium analyses

Linkage disequilibrium (LD) is a population genetics measure of the non-random association of

alleles at two or more loci. The pair-wise LD structure, as determined by the r^2 correlation

coefficient among SNPs, was calculated and visualized using Haploview software [33]. The Gabriel

method [34] was then used to infer population-based haplotype blocks (that is, genomic regions

over which there is little evidence for historical recombination and within which only a limited

number of haplotypes are observed). Haploview default settings were used, which assume no

LD among SNPs that are more than 500 kb apart. Linkage disequilibrium plots using all SNPs

identified in the analyses were generated by chromosome in the overall cohort and Caucasian

population for every chromosome with at least 2 SNPs represented in the top 100 CMs.
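For reference, the r^2 measure between two biallelic SNPs can be sketched as below. This is the textbook definition, not Haploview's code, and it assumes the haplotype frequency is already known; with unphased genotype data that frequency has to be estimated first. The function name is hypothetical.

```python
def ld_r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A (locus 1) and allele B (locus 2)
    """
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return d * d / denom
```

A value of 0 corresponds to independent loci and a value of 1 to complete LD.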

Methodology for permutation testing (and modified prognostic markers)

Permutation – treatment response

The permutation runs to establish a reference distribution for the top 100 combination markers

from GWAS for treatment response were performed as follows:

1. Fix patient information, candidate gene data, and response (Xp, Xc, and Y, respectively) and shuffle

the patient labels of the GWAS genetic data (Xg), e.g.,

Y1, Xp1, Xc1, Xg1 → Y1, Xp1, Xc1, Xg1' (1' denotes a shuffled/permuted patient)

2. Perform 2-stage search and save 1-, 2-, 3-marker top results (use cutoff: arr.minus.fit > 0).

3. Iterate 1000 times (permutation runs).


4. All permutation results (top hits) are combined. Due to the skewness of the data, a

LOESS-smoothed line based on the 95th (or 99th) percentile for each N was generated to determine

whether observed top GCMs fall outside the 95% line.
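The permutation scheme in steps 1 to 4 can be sketched as follows. This is a heavily simplified illustration: only single markers are scored, the reference is a plain percentile of the per-run top ARR, not a LOESS-smoothed line across subgroup sizes, and all names are hypothetical.

```python
import random

def arr(treated, survived, member):
    """Observed ARR within a subgroup; None if either arm is empty."""
    rates = []
    for arm in (1, 0):
        rows = [s for t, s, m in zip(treated, survived, member) if m and t == arm]
        if not rows:
            return None
        rates.append(sum(rows) / len(rows))
    return rates[0] - rates[1]

def permutation_reference(markers, treated, survived, n_perm=1000, seed=0):
    """Top ARR per permutation run, with genotype rows shuffled across patients."""
    rng = random.Random(seed)
    n = len(treated)
    null_top = []
    for _ in range(n_perm):
        # One shared shuffle moves whole genotype rows between patients, so
        # correlations among markers are preserved; Y and therapy stay fixed.
        order = list(range(n))
        rng.shuffle(order)
        best = None
        for flags in markers.values():
            score = arr(treated, survived, [flags[j] for j in order])
            if score is not None and (best is None or score > best):
                best = score
        if best is not None:
            null_top.append(best)
    return null_top

def percentile(values, q):
    """Nearest-rank percentile for an integer q between 1 and 100."""
    ordered = sorted(values)
    rank = max(1, -(-q * len(ordered) // 100))  # ceiling division
    return ordered[rank - 1]
```

An observed top ARR would then be compared against `percentile(null_top, 95)`; the source additionally conditions this reference on subgroup size N through the LOESS fit.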

GWAS for prognostic markers

Identified markers for risk of 28-day mortality in placebo population using 2-stage multimarker

search. Top markers were selected based on the LOESS method that balances the tradeoff

between risk (28-day mortality rate in placebo) and subgroup size defined by the marker. Selected markers met the following criteria:

a. at least 100 subjects (~14%, out of 720 subjects on placebo) and

b. at least 36% mortality (approximately 20% increase over the overall placebo population's

mortality rate) and

c. at least –0.01 (or 0 for 3-marker) relative to LOESS line.
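Criteria a to c can be expressed as a simple filter. The sketch below assumes the LOESS residual (observed mortality minus the fitted LOESS value at that subgroup size) has already been computed elsewhere; the function and parameter names are hypothetical.

```python
def passes_prognostic_filter(n_subjects: int, mortality: float,
                             loess_residual: float, n_markers: int) -> bool:
    """Apply criteria a-c to one candidate prognostic subgroup.

    mortality is the 28-day mortality rate in placebo as a fraction (0-1).
    loess_residual is assumed to be precomputed relative to the LOESS line.
    """
    min_residual = 0.0 if n_markers == 3 else -0.01
    return (
        n_subjects >= 100                    # criterion a
        and mortality >= 0.36                # criterion b
        and loess_residual >= min_residual   # criterion c
    )
```

For example, a 2-marker subgroup of 120 placebo subjects with 40% mortality and a residual of -0.005 passes, whereas the same subgroup defined by 3 markers does not.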

Permutation – prognostic markers

The permutation runs to establish a reference distribution for the top markers identified in

GWAS for prognostic markers were performed as follows:

1. Shuffle survival status and therapy together (Y, T) while fixing patient information, candidate

gene data (Xp and Xc), and GWAS genetic data (Xg). e.g.,

(Y1,T1), Xp1, Xc1, Xg1 → (Y1',T1'), Xp1, Xc1, Xg1 (1' denotes a shuffled/permuted patient)

2. Perform 2-stage search and save 1-, 2-, 3-marker top results (use cutoff: arr.minus.fit > 0).

3. Iterate 1000 times (permutation runs).

4. All permutation results (top hits) are combined. Due to the skewness of the data, a

LOESS-smoothed line based on the 95th (or 99th) percentile for each N was generated to determine

whether observed top CMs fall outside the 95% line.
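The distinguishing feature of this permutation, compared with the treatment-response permutation, is that survival status and therapy move together. A minimal sketch of that shuffling step is shown below; the function name is hypothetical.

```python
import random

def shuffle_outcome_with_therapy(survival, therapy, seed=None):
    """Permute (Y, T) pairs across patients while keeping each pair intact."""
    if len(survival) != len(therapy):
        raise ValueError("survival and therapy must have the same length")
    rng = random.Random(seed)
    pairs = list(zip(survival, therapy))
    rng.shuffle(pairs)  # Xp, Xc, and Xg are left in their original order
    return [y for y, _ in pairs], [t for _, t in pairs]
```

Keeping each (Y, T) pair intact preserves the mortality rate within each treatment arm while breaking any link between outcome and the markers.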


Supplementary References

23. Marchini J, Donnelly P, Cardon LR. Genome-wide strategies for detecting multiple loci that influence complex diseases. Nat Genet 2005; 37: 413-417.
24. Evans DM, Marchini J, Morris AP, Cardon LR. Two-stage two-locus models in genomewide association. PLoS Genet 2006; 2: e157.
25. Ionita I, Man M. Optimal two-stage strategy for detecting interacting genes in complex diseases. BMC Genet 2006; 7: 39.
28. Zhang Z, Niu A, Sha Q. Identification of interacting genes in genome-wide association studies using a model-based two-stage approach. Ann Hum Genet 2010; 74: 406-415.
30. [WTCCC] Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 2007; 447: 661-678.
31. Fellay J, Shianna KV, Ge D, Colombo S, Ledergerber B, Weale M et al. A whole-genome association study of major determinants for host control of HIV-1. Science 2007; 317: 944-947.
32. Sullivan PF, Lin D, Tzeng JY, van den Oord E, Perkins D, Stroup TS et al. Genomewide association for schizophrenia in the CATIE study: results of stage 1. Mol Psychiatry 2008; 13: 570-584.
33. Barrett JC, Fry B, Maller J, Daly MJ. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 2005; 21: 263-265.
34. Gabriel SB, Schaffer SF, Nguyen H, Moore JM, Roy J, Blumenstiel B et al. The structure of haplotype blocks in the human genome. Science 2002; 296: 2225-2229.
