Supplementary Information
Beyond single marker analyses: Mining whole genome scans for insights into treatment responses in severe sepsis
Michael Man, PhD1, Sandra L. Close, PhD2, Andrew D. Shaw, MD3, Gordon R. Bernard, MD4, Ivor S. Douglas, MD5, Robert J. Kaner, MD6, Didier Payen, MD, PhD7, Jean-Louis Vincent, MD, PhD8, Stewart Fossceco, PhD1, Jonathan M. Janes, FRCP1, Amy G. Leishman, PhD1, Lee O’Brien, PhD1, Mark D. Williams, MD9, and Joe G. N. Garcia, MD10.
Affiliations
1 Lilly Research Laboratories, Eli Lilly and Company, Lilly Corporate Center, Indianapolis, IN 46285, USA
2 Department of Clinical Pharmacology, Indiana University School of Medicine, Indianapolis, IN 46202, USA
3 Department of Anesthesiology, Duke University Medical Center, Durham, NC 27710, USA
4 Division of Allergy, Pulmonary and Critical Care, Vanderbilt University School of Medicine, Nashville, TN 37232, USA
5 Denver Health Medical Center and University of Colorado, School of Medicine, Denver, CO 80204, USA
6 Division of Pulmonary and Critical Care Medicine, Departments of Medicine and Genetic Medicine, Weill Cornell Medical College, New York, NY 10021, USA
7 Department of Anesthesiology & Critical Care & SAMU, University Paris 7, Hôpital Lariboisière, 2 rue Ambroise Paré, 75074 Paris, France
8 Department of Intensive Care, Erasme University Hospital, Université Libre de Bruxelles, Route de Lennik 808, 1070 Brussels, Belgium
9 BioCritica, Inc., P.O. Box 40967, Indianapolis, IN 46240, USA
10 Institute for Personalized Respiratory Medicine, University of Illinois at Chicago, Chicago, IL 60612, USA
Corresponding author:
Name: Michael Man, PhD
Address: Lilly Research Laboratories, Eli Lilly and Company, Lilly Corporate Center, Indianapolis, IN 46285, USA
Phone: +1-317-276-5796 Fax: +1-317-651-8834 Email: [email protected]
Supplementary Note, Tables, Figures and Methods
Contents
Supplementary Note Genetic data quality control
Supplementary Table 1 Number of patient samples removed for quality control issues (in Supplementary Note)
Supplementary Table 2 Number of SNPs removed for quality control issues (in Supplementary Note)
Supplementary Table 3 Table listing all clinical or other markers included in the analysis
Supplementary Table 4 Top single markers with logistic regression P values
Supplementary Table 5 One hundred highest-ranked 3-SNP genetic combination markers from the GWAS for treatment response
Supplementary Table 6 One hundred highest-ranked 3-SNP genetic single markers from the multimarker response
Supplementary Table 7 Listing of the 20 single markers above the LOESS line for prognostic markers after the first stage
Supplementary Figure 1 Manhattan plots of the GWAS results for genotype AA versus not AA and AB versus not AB
Supplementary Figure 2 Quantile-Quantile plots from GWAS in the entire genetic cohort
Supplementary Figure 3 Multidimensional scaling of the highest-ranking 100, 3-SNP combination markers
Supplementary Figure 4 Effect of permutation size on the variability of the top markers
Supplementary Methods Methodology for single marker analyses using a traditional association approach
Methodology for the multimarker search
Methodology for linkage disequilibrium analyses
Methodology for permutation testing (and modified prognostic markers)
Supplementary Note. Genetic data quality control.
Genetic data quality control was performed for each patient sample and for each genetic
marker. The following criteria had to be met for samples or single nucleotide
polymorphisms (SNPs) to be included in analyses (references 30-32).
Patient Samples
• At least an 80% genotyping success rate (data for at least 80% of variants on BeadChip)
• Be unique (remove unintentional duplicates)
• Have high genotyping concordance between the 199 SNPs that were common to both
the CGS (candidate gene study, previously performed on genes thought to be involved
in the biology of sepsis) and GWAS
• Pass gender check using multidimensional scaling (MDS); the sample needed to be in a
cluster consistent with its reported gender.
Sufficient quantities of DNA were available for genotyping studies from 1568 subjects. Twelve
samples were included as duplicates to assess batch effects. A total of 1580 samples were sent
to be genotyped (1568 + 12). Of the 1580 results received, 16 samples did not have clinical
data, 67 samples from France were not analyzed because of restrictions on their use, and an
additional 51 samples did not pass internal quality control assessments. In total, 1446 samples
were available for analyses (Supplementary Table 1).
Supplementary Table 1. Number of patient samples removed for quality control issues.
QC Reason for Removing Samples | Samples Removed | Samples Remaining
Samples with insufficient quantities of DNA for genotyping | 0 | 1568
Samples sent for genotyping (12 duplicates) | (+12) | 1580
Samples removed because clinical data was not collected | 16 | 1564
Samples from France removed due to restrictions | 67 | 1497
Samples with genotyping success rate <80% | 8 | 1489
Samples with gender misidentified | 9 | 1480
Remaining duplicate samples removed | 9 | 1471
Samples with inconsistent results between CGS and GWAS | 25 | 1446
Abbreviations: CGS=candidate gene study; GWAS=genome-wide association study; QC=quality control.
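The sample accounting in Supplementary Table 1 can be checked with a short running tally. The sketch below is illustrative only: the step labels and counts are copied from the table, while the names `STEPS` and `tally` are hypothetical and not part of the original analysis.

```python
# Steps copied from Supplementary Table 1 as (reason, change in sample count).
STEPS = [
    ("Duplicates added for batch-effect assessment", +12),
    ("Clinical data not collected", -16),
    ("Samples from France (use restrictions)", -67),
    ("Genotyping success rate <80%", -8),
    ("Gender misidentified", -9),
    ("Remaining duplicate samples", -9),
    ("Inconsistent results between CGS and GWAS", -25),
]


def tally(start, steps):
    """Return (reason, samples remaining) after applying each step in order."""
    remaining = start
    counts = []
    for reason, delta in steps:
        remaining += delta
        counts.append((reason, remaining))
    return counts


# 1568 subjects had sufficient DNA for genotyping.
counts = tally(1568, STEPS)
for reason, remaining in counts:
    print(f"{reason}: {remaining}")
```

The last four steps remove 8 + 9 + 9 + 25 = 51 samples, which matches the 51 samples reported as failing internal quality control.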
Single Nucleotide Polymorphisms
All SNPs were included in the initial analysis; combination markers (CMs) containing SNPs
that met the following criteria were then kept:
• Missing rate ≤10%
• Minor allele frequency (MAF) ≥5%
• Hardy Weinberg equilibrium (HWE) P values >10^-15
Samples were run in 3 batches, using 2 different amplification kits. Logistic regression was used
to determine batch and amplification kit effects. Any SNPs with P values >10e-6 were kept for
further analysis. SNP quality control was performed for the 1 199 187 variants included on the
BeadChip; of these, 856 627 SNPs passed the criteria (Supplementary Table 2).
Supplementary Table 2. Number of SNPs removed for quality control issues.
QC Criteria | Number of SNPs Removed | Number of SNPs Remaining
All variants on BeadChip | 0 | 1,199,187
Monoallelic variants | 70,172 | 1,129,015
SNPs with success rate <90% | 29,609 | 1,099,406
SNPs with MAF <5% | 208,054 | 891,352
SNPs with HWE p value <1e-15 | 34,654 | 856,698
SNPs with batch or kit effect | 71 | 856,627
Abbreviations: HWE=Hardy Weinberg equilibrium; MAF=minor allele frequency; QC=quality control; SNP=single nucleotide polymorphism.
All 1 199 187 variants were included in the initial analysis, and then CMs containing SNPs that met the quality control criteria were kept.
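The three per-SNP criteria (missing rate, MAF, and HWE P value) can be sketched as a single filter function. This is a minimal illustration, not the authors' pipeline: the function name `snp_qc`, the genotype encoding, and the use of a 1-degree-of-freedom chi-square test for HWE are assumptions, since the source does not state which HWE test was used.

```python
import math


def snp_qc(genotypes, max_missing=0.10, min_maf=0.05, min_hwe_p=1e-15):
    """Apply the three SNP filters to one SNP.

    genotypes: non-empty list of calls, each 'AA', 'AB', 'BB',
               or None for a missing call (hypothetical encoding).
    Returns a dict with the computed statistics and a 'passed' flag.
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    missing_rate = 1.0 - n / len(genotypes)
    if n == 0:
        return {"missing_rate": missing_rate, "maf": 0.0,
                "hwe_p": None, "passed": False}

    n_aa, n_ab, n_bb = (called.count(g) for g in ("AA", "AB", "BB"))
    freq_a = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    maf = min(freq_a, 1.0 - freq_a)        # minor allele frequency

    # Chi-square goodness-of-fit test against Hardy-Weinberg proportions
    # (assumed test; 1 degree of freedom).
    expected = (n * freq_a ** 2,
                2 * n * freq_a * (1 - freq_a),
                n * (1 - freq_a) ** 2)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df.
    hwe_p = math.erfc(math.sqrt(chi2 / 2.0))

    passed = (missing_rate <= max_missing
              and maf >= min_maf
              and hwe_p > min_hwe_p)
    return {"missing_rate": missing_rate, "maf": maf,
            "hwe_p": hwe_p, "passed": passed}


# Hypothetical genotype counts, for illustration only.
example = ["AA"] * 50 + ["AB"] * 40 + ["BB"] * 10
result = snp_qc(example)
print(result)
```

A monoallelic variant has a MAF of 0 and therefore fails the MAF filter in this sketch, whereas Supplementary Table 2 counts monoallelic variants as a separate first step.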
Supplementary Table 3. Table listing all clinical or other markers included in the analysis.
Name | Label
AGE | Age at Admission (years)
GENDER | Gender
ORIGIN | Racial Origin
PRAPACHE | Pre-infusion APACHE II Score
ORGANNUM | Number of Baseline Organ Failures
RECSURG | Recent Surgery (w/in last 30 days)
BLGCS | Baseline Glasgow Coma Score
BLGCSEYE | Baseline GCS Eye Score
BLGCSVRB | Baseline GCS Verbal Score
BLGCSMOT | Baseline GCS Motor Score
BLHPRN | BL Heparin Exposure
BLSTER | BL Steroid Exposure
BLSHOCK | BL Shock Status
BLPCACT | BL Protein C Activity
BLDIC | BL DIC Status
BLVENT | BL Ventilation Status
BLVASO | BL Vasopressor Status
BLSOFCV | BL SOFA Cardiovascular Score
BLSOFHEM | BL SOFA Hematology Score
BLSOFHEP | BL SOFA Hepatic Score
BLSOFREN | BL SOFA Renal Score
BLSOFRES | BL SOFA Respiration Score
BLAPTT | BL APTT
BLAT3 | BL AT3 Activity
BLCPLAT | BL Central Lab Platelet Count
BLDDIMER | BL d-Dimer
BLIL6 | BL IL-6
BLPSACT | BL Protein S Activity
BLPT | BL Prothrombin Time
BLADL76 | BL ADL Level (Katz 76)
BLWEIGHT | BL Weight kg
BLCRTCLR | BL Creatinine Clearance (Cockcroft-Gault)
BLARDS | BL ARDS Status
Abbreviations: ADL=activities of daily living; APACHE II=acute physiology and chronic health evaluation II; APTT=activated partial thromboplastin time; ARDS=acute respiratory distress syndrome; AT=antithrombin; BL=baseline; DIC=disseminated intravascular coagulation; GCS=Glasgow coma score; IL=interleukin; SOFA=sequential organ failure assessment.
Supplementary Table 4. Top single markers with logistic regression P values.
Unadjusted P values are presented by SNP genotype for the total PROWESS cohort. SNPs with at least one model showing significance (P<0.0001) are presented, sorted by chromosome and position. One marker, rs17513961 in the 5’ untranslated region of LOC222052 (located near the insulin-like growth factor I binding protein [IGF1-BP] gene), met the threshold for genome-wide significance (P<5×10^-7) (shown in blue). Inadequate sample size for modeling is denoted by a ‘-’.
SNP | Chr | Position | Alleles (A:B) | Gene | Location | trt*SNP P value (AA vs. not AA) | trt*SNP P value (AB vs. not AB) | trt*SNP P value (BB vs. not BB)
rs2076977 | 1 | 30346745 | C:T | MATN1 | flanking_3UTR | 3.6E-05 | 0.0018 | 0.2066
rs2173399 | 1 | 76982973 | C:T | LOC729708 | flanking_3UTR | 6.4E-05 | 0.0335 | 0.0442
rs2942917 | 1 | 194045294 | C:T | KCNT2 | flanking_3UTR | 9.5E-05 | 0.0376 | 0.0342
rs1915279 | 1 | 237142337 | A:G | KRT18P32 | flanking_3UTR | 0.4256 | 2.0E-05 | 2.2E-06
rs12618741 | 2 | 114834542 | C:T | LOC391428 | flanking_5UTR | 6.8E-05 | 0.0067 | 0.0146
rs1829975 | 2 | 161493415 | A:G | TANK | flanking_5UTR | 0.1425 | 6.0E-06 | 0.0002
rs1605461 | 2 | 189264326 | C:T | DIRC1 | flanking_3UTR | 0.0364 | 2.9E-05 | 0.0078
rs9815663 | 3 | 3589887 | C:T | LOC728221 | flanking_5UTR | 8.8E-05 | 0.0017 | 0.0513
rs2728981 | 3 | 22228198 | A:G | LOC728516 | flanking_3UTR | 9.8E-05 | 0.0372 | 0.1255
rs1478842 | 3 | 35127164 | C:T | KRT8P18 | flanking_3UTR | 7.8E-05 | 0.0034 | 0.0520
rs1920386 | 3 | 117259269 | A:G | LSAMP | intron | 6.8E-05 | 0.4444 | 0.1707
rs993691 | 3 | 141603432 | C:T | CLSTN2 | intron | 0.0345 | 2.4E-05 | 0.0007
rs7619971 | 3 | 141610022 | A:C | CLSTN2 | intron | 0.0008 | 5.5E-05 | 0.0379
rs16862676 | 3 | 151661629 | C:T | TSC22D2 | flanking_3UTR | 5.8E-05 | 3.7E-05 | 0.8099
rs6446731 | 4 | 3254549 | A:G | LOC345222 | flanking_3UTR | 0.9224 | 7.4E-05 | 0.0001
rs4525972 | 4 | 19101354 | A:G | LOC645174 | flanking_5UTR | 0.0013 | 9.8E-05 | 0.2492
rs6835841 | 4 | 19102675 | G:T | LOC645174 | flanking_5UTR | 0.0013 | 6.4E-05 | 0.1936
rs4608848 | 4 | 187247098 | C:T | TLR3 | flanking_3UTR | 5.7E-05 | 0.4568 | 0.0341
rs7702195 | 5 | 2142221 | C:T | IRX4 | flanking_5UTR | 0.0583 | 9.2E-05 | 0.0072
rs1897833 | 5 | 89684874 | A:C | CETN3 | flanking_3UTR | 0.1335 | 5.7E-05 | 0.0006
rs3952709 | 5 | 89688112 | A:G | CETN3 | flanking_3UTR | 0.0011 | 7.1E-05 | -
rs13153368 | 5 | 119045408 | C:T | LOC391824 | flanking_5UTR | 0.0047 | 8.1E-05 | 0.0892
rs359447 | 5 | 173203769 | C:T | CPEB4 | flanking_5UTR | 0.0006 | 2.5E-05 | 0.2598
rs7725278 | 5 | 174693553 | C:T | DRD1 | flanking_3UTR | 0.9190 | 6.6E-05 | 8.4E-05
rs12652255 | 5 | 174693701 | A:C | DRD1 | flanking_3UTR | 0.5204 | 1.4E-05 | 8.1E-06
rs9460540 | 6 | 20756741 | A:G | CDKAL1 | intron | 7.4E-05 | 0.0025 | 0.2252
rs10946835 | 6 | 26641736 | C:T | HMGN4 | flanking_5UTR | 8.9E-05 | 0.0013 | 0.6683
rs10484442 | 6 | 26663858 | A:G | HMGN4 | flanking_3UTR | 3.1E-05 | 0.0005 | 0.7141
rs6456733 | 6 | 26674783 | C:T | HMGN4 | flanking_3UTR | 2.8E-05 | 0.0004 | 0.7569
rs1078679 | 6 | 26676720 | C:T | HMGN4 | flanking_3UTR | 3.0E-05 | 0.0005 | 0.7053
rs6925895 | 6 | 26680144 | G:T | HMGN4 | flanking_3UTR | 4.0E-05 | 0.0006 | 0.7171
rs3130454 | 6 | 31216464 | A:G | PSORS1C1 | flanking_3UTR | 0.0005 | 7.4E-05 | 0.4521
rs6458690 | 6 | 49519578 | A:G | MUT | intron | 7.7E-05 | 0.0126 | 0.0433
rs6922979 | 6 | 99734696 | A:G | C6orf168 | flanking_3UTR | 0.2709 | 1.7E-05 | 0.0003
rs9488845 | 6 | 116559351 | A:C | COL10A1 | flanking_5UTR | 0.0364 | 4.3E-05 | 0.0139
rs174948 | 7 | 29855758 | A:G | WIPF3 | intron | 0.0055 | 6.9E-05 | 0.0969
rs174957 | 7 | 29876752 | A:G | WIPF3 | intron | 0.0150 | 3.9E-05 | 0.0236
rs2159688 | 7 | 31722887 | A:G | C7orf16 | flanking_3UTR | 8.4E-05 | 4.8E-05 | 0.9550
rs2159499 | 7 | 38195427 | A:G | STARD3NL | intron | 0.0234 | 1.7E-05 | 0.0040
rs4723738 | 7 | 38207433 | A:G | STARD3NL | intron | 0.0072 | 6.2E-06 | 0.0090
rs6462817 | 7 | 38217390 | C:T | STARD3NL | intron | 0.0044 | 1.1E-05 | 0.0180
rs7795499 | 7 | 38234920 | C:T | STARD3NL | intron | 0.0468 | 3.1E-05 | 0.0023
rs17513961 | 7 | 46359092 | G:T | LOC222052 | flanking_5UTR | - | 8.4E-07 | 1.3E-07
rs17172693 | 7 | 46397198 | C:T | LOC222052 | flanking_5UTR | 9.2E-07 | 3.7E-06 | 0.4927
rs10275461 | 7 | 46416755 | G:T | LOC222052 | flanking_5UTR | 6.7E-05 | 0.0049 | 0.0086
rs6953258 | 7 | 94213744 | C:T | LOC645973 | flanking_3UTR | 3.0E-05 | 0.0670 | 0.0429
rs2394824 | 7 | 97577253 | A:G | LMTK2 | intron | 0.5935 | 7.8E-05 | 4.4E-05
rs4729408 | 7 | 97610594 | C:T | LMTK2 | intron | 4.4E-05 | 3.2E-05 | 0.9316
rs11772060 | 7 | 97620757 | G:T | LMTK2 | intron | 3.6E-05 | 5.4E-05 | 0.6489
rs13273073 | 8 | 23640171 | G:T | NKX2-6 | flanking_5UTR | 5.3E-05 | 7.2E-06 | 0.5101
rs12680523 | 8 | 114208773 | C:T | CSMD3 | intron | - | 2.4E-05 | 0.0004
rs11984724 | 8 | 114323285 | C:T | CSMD3 | intron | 0.3105 | 9.3E-05 | 0.0006
rs2721952 | 8 | 116714231 | C:T | TRPS1 | intron | 0.7850 | 6.5E-05 | 9.9E-05
rs11786372 | 8 | 135561345 | C:T | ZFAT1 | intron | 0.0014 | 9.1E-05 | 0.2258
rs321224 | 9 | 20211607 | A:G | MLLT3 | flanking_3UTR | 0.0007 | 5.8E-05 | -
rs12380611 | 9 | 71838485 | C:T | MAMDC2 | flanking_5UTR | 7.3E-05 | 0.0119 | 0.2168
rs10869665 | 9 | 77708819 | C:T | PCSK5 | intron | 0.0028 | 8.1E-05 | 0.1098
rs4553010 | 9 | 132051053 | A:G | FREQ | flanking_3UTR | 3.9E-05 | 1.3E-05 | -
rs10999122 | 10 | 71507881 | C:T | H2AFY2 | intron | 9.3E-05 | 0.0371 | 0.0103
rs10500982 | 11 | 24505621 | A:G | LUZP2 | intron | 0.0002 | 4.5E-05 | 0.6083
rs11027980 | 11 | 24509005 | A:G | LUZP2 | intron | 0.0003 | 8.0E-05 | 0.6278
rs10767205 | 11 | 24510016 | A:C | LUZP2 | intron | 0.1823 | 6.0E-05 | 0.0006
rs611003 | 11 | 69154465 | A:C | CCND1 | flanking_5UTR | 8.0E-05 | 0.4848 | 0.0013
rs471814 | 11 | 119299719 | A:G | LOC390255 | flanking_5UTR | 5.4E-05 | 0.0476 | 0.0226
rs7134223 | 12 | 72141185 | G:T | LOC645654 | flanking_5UTR | 0.9899 | 7.9E-05 | 0.0001
rs1581403 | 12 | 72882519 | A:G | LOC387869 | flanking_3UTR | 3.5E-05 | 3.0E-05 | 0.7369
rs589258 | 13 | 93605357 | C:T | GPC6 | intron | 0.0340 | 4.3E-05 | 0.0074
rs497836 | 13 | 93605509 | G:T | GPC6 | intron | 0.0085 | 3.5E-05 | 0.0279
rs9587280 | 13 | 106480426 | C:T | LOC728215 | flanking_3UTR | 9.5E-05 | 0.0630 | 0.2311
rs17114618 | 14 | 43512877 | C:T | LOC645086 | flanking_3UTR | 9.2E-05 | 0.0002 | -
rs227397 | 14 | 69477522 | C:T | SMOC1 | intron | 0.2745 | 5.9E-05 | 0.0009
rs1011061 | 15 | 53774790 | C:T | PRTG | intron | 0.0003 | 6.0E-05 | 0.3344
rs8028880 | 15 | 53806945 | C:T | PRTG | intron | 0.0002 | 7.6E-05 | 0.3895
rs168841 | 16 | 8865964 | A:G | CARHSP1 | flanking_5UTR | - | 9.3E-05 | 0.0009
rs16953047 | 16 | 52687671 | G:T | FTO | intron | 0.0009 | 6.9E-05 | 0.1107
rs2716601 | 16 | 72633753 | C:T | LOC441506 | flanking_5UTR | 0.3214 | 4.0E-05 | 1.0E-06
rs12951391 | 17 | 67423177 | C:T | FLJ37644 | flanking_3UTR | 0.0009 | 9.7E-05 | 0.3207
rs7409 | 17 | 71547138 | A:G | SRP68 | 3UTR | 0.0032 | 5.9E-05 | 0.1110
rs2598414 | 17 | 71578694 | C:T | SRP68 | intron | 0.0035 | 2.5E-05 | 0.0623
rs2246277 | 17 | 71592940 | C:T | EXOC7 | intron | 0.0332 | 4.5E-05 | 0.0137
rs2665981 | 17 | 71600906 | C:T | EXOC7 | intron | 0.1730 | 6.4E-05 | 0.0052
rs2250054 | 17 | 71601856 | C:T | EXOC7 | intron | 0.0050 | 6.6E-05 | 0.1696
rs2243536 | 17 | 71601921 | C:T | EXOC7 | intron | 0.1713 | 5.7E-05 | 0.0044
rs2457693 | 17 | 71603745 | C:T | EXOC7 | intron | 0.0059 | 8.3E-05 | 0.1821
rs4140512 | 22 | 39362809 | A:G | MKL1 | flanking_5UTR | 3.9E-05 | 6.2E-05 | 0.5727
rs5995886 | 22 | 39363747 | A:G | MKL1 | flanking_5UTR | 0.5751 | 5.4E-05 | 3.4E-05
rs6001990 | 22 | 39364131 | A:G | MKL1 | flanking_5UTR | 4.5E-05 | 0.0001 | 0.4000
rs12159200 | 22 | 39372037 | A:C | MKL1 | flanking_5UTR | 5.9E-07 | 4.9E-06 | 0.1831
rs7292804 | 22 | 39377502 | G:T | MKL1 | flanking_5UTR | 2.0E-05 | 1.1E-05 | 0.9585
Abbreviations: Chr=chromosome; PROWESS=Recombinant Human Activated Protein C Worldwide Evaluation in Severe Sepsis trial; SNP=single nucleotide polymorphism.
Supplementary Table 5. One hundred highest-ranked 3-SNP genetic combination markers from the GWAS for treatment response.
ID SNP1 Gene1 SNP2 Gene2 SNP3 Gene3 ARR N
1 rs7725278..C/C T/T DRD1 rs2256527..T/C T/T LOC391273 rs10910651..A/A PCNXL2 41.65 372
2 rs7725278..C/C T/T DRD1 rs2256527..T/C T/T LOC391273 rs6695322..C/C PCNXL2 41.65 372
3 rs7725278..C/C T/T DRD1 rs2256527..T/C T/T LOC391273 rs10910651..A/A G/G PCNXL2 39.76 395
4 rs7725278..C/C T/T DRD1 rs2256527..T/C T/T LOC391273 rs6695322..C/C T/T PCNXL2 39.76 395
5 rs7725278..C/C T/T DRD1 rs2253239..T/C T/T LOC391273 rs10910651..A/A PCNXL2 39.71 394
6 rs7725278..C/C T/T DRD1 rs2253239..T/C T/T LOC391273 rs6695322..C/C PCNXL2 39.71 394
7 rs7725278..C/C T/T DRD1 rs3820116..G/G PCNXL2 rs2253239..T/C T/T LOC391273 37.22 437
8 rs7725278..C/C T/T DRD1 rs2253239..T/C T/T LOC391273 rs3820116..A/A G/G PCNXL2 36.48 449
9 rs7725278..T/T DRD1 rs2256527..T/C T/T LOC391273 rs10910651..A/A PCNXL2 41.30 359
10 rs7725278..T/T DRD1 rs2256527..T/C T/T LOC391273 rs6695322..C/C PCNXL2 41.30 359
11 rs7725278..C/C T/T DRD1 rs2844315..G/G T/G LOC391273 rs3820116..A/A G/G PCNXL2 36.57 446
12 rs7725278..C/C T/T DRD1 rs2844315..G/G T/G LOC391273 rs10910651..A/A PCNXL2 39.40 392
13 rs7725278..C/C T/T DRD1 rs2844315..G/G T/G LOC391273 rs6695322..C/C PCNXL2 39.40 392
14 rs7725278..C/C T/T DRD1 rs3820116..G/G PCNXL2 rs2844315..G/G T/G LOC391273 36.93 435
15 rs7725278..C/C T/T DRD1 rs3820116..G/G PCNXL2 rs2256527..T/C T/T LOC391273 38.35 410
16 rs7725278..C/C T/T DRD1 rs2253239..T/C T/T LOC391273 rs10910651..A/A G/G PCNXL2 37.80 418
17 rs7725278..C/C T/T DRD1 rs2253239..T/C T/T LOC391273 rs6695322..C/C T/T PCNXL2 37.80 418
18 rs7725278..C/C T/T DRD1 rs2256527..T/C T/T LOC391273 rs3820116..A/A G/G PCNXL2 37.52 422
19 rs7725278..C/C T/T DRD1 rs2844315..G/G T/G LOC391273 rs10910651..A/A G/G PCNXL2 37.92 415
20 rs7725278..C/C T/T DRD1 rs2844315..G/G T/G LOC391273 rs6695322..C/C T/T PCNXL2 37.92 415
21 rs2716601..T/T LOC441506 rs10488192..C/C T/T TMEM106B rs4638843..C/C MSH2 42.91 309
22 rs7725278..T/T DRD1 rs2256527..T/C T/T LOC391273 rs10910651..A/A G/G PCNXL2 39.30 381
23 rs7725278..T/T DRD1 rs2256527..T/C T/T LOC391273 rs6695322..C/C T/T PCNXL2 39.30 381
24 rs7725278..T/T DRD1 rs2253239..T/C T/T LOC391273 rs10910651..A/A PCNXL2 39.27 380
25 rs7725278..T/T DRD1 rs2253239..T/C T/T LOC391273 rs6695322..C/C PCNXL2 39.27 380
26 rs4723738..A/A G/G STARD3NL rs6703979..T/C T/T C1orf212 rs247447..A/A G/G TMEM170 42.31 317
27 rs7725278..T/T DRD1 rs2844315..G/G T/G LOC391273 rs12624775..T/T C20orf82 42.68 307
28 rs7725278..C/C T/T DRD1 rs2844315..G/G T/G LOC391273 rs12624775..T/T C20orf82 41.89 319
29 rs7725278..T/T DRD1 rs2844315..G/G T/G LOC391273 rs10910651..A/A PCNXL2 38.95 378
30 rs7725278..T/T DRD1 rs2844315..G/G T/G LOC391273 rs6695322..C/C PCNXL2 38.95 378
31 rs7725278..C/C T/T DRD1 rs2253239..T/C T/T LOC391273 rs10489576..A/A G/G PCNXL2 33.81 469
32 rs1829975..A/A G/G TANK rs2033483..C/C T/T ZNF317 rs2671822..A/A A/G SLFN5 22.95 787
33 rs7725278..C/C T/T DRD1 rs2253239..T/C T/T LOC391273 rs10489576..G/G PCNXL2 34.21 462
34 rs7725278..T/T DRD1 rs2256527..T/C T/T LOC391273 rs3820116..G/G PCNXL2 37.92 397
35 rs7725278..C/C T/T DRD1 rs2844315..G/G T/G LOC391273 rs10489576..A/A G/G PCNXL2 33.89 466
36 rs7725278..C/C T/T DRD1 rs2844315..G/G T/G LOC391273 rs10489576..G/G PCNXL2 34.29 459
37 rs12652255..A/A C/C DRD1 rs8121302..C/C T/C SNX5 rs583582..T/C T/T KRT8P5 33.03 481
38 rs7725278..T/T DRD1 rs2253239..T/C T/T LOC391273 rs3820116..G/G PCNXL2 36.63 421
39 rs7725278..T/T DRD1 rs2253239..T/C T/T LOC391273 rs3820116..A/A G/G PCNXL2 35.88 433
40 rs17513961..T/T LOC222052 rs1555772..T/C T/T GJA1 rs2571490..C/C T/T PCDH7 19.86 919
41 rs966775..C/C T/T DRD1 rs2253239..T/C T/T LOC391273 rs3820116..A/A G/G PCNXL2 35.10 446
42 rs7725278..C/C T/T DRD1 rs8121302..C/C T/C SNX5 rs12530988..C/C T/T GLI3 38.58 382
43 rs7725278..C/C T/T DRD1 rs2256527..T/C T/T LOC391273 rs16858553..C/C T/T PCNXL2 37.75 399
44 rs7725278..C/C T/T DRD1 rs2256527..T/C T/T LOC391273 rs16858553..T/T PCNXL2 38.47 384
45 rs2716601..T/T LOC441506 rs10488192..C/C T/T TMEM106B rs4638843..C/C G/G MSH2 41.79 317
46 rs4140512..A/A G/G MKL1 rs6746008..T/G T/T FAM49A rs10093480..C/C T/T PLEKHF2 26.56 646
47 rs966775..C/C T/T DRD1 rs2256527..T/C T/T LOC391273 rs10910651..A/A PCNXL2 39.09 371
48 rs966775..C/C T/T DRD1 rs2256527..T/C T/T LOC391273 rs6695322..C/C PCNXL2 39.09 371
49 rs6462817..C/C T/T STARD3NL rs4899362..C/C T/C TTC9 rs3934594..C/C T/C C9orf47 29.71 554
50 rs7725278..T/T DRD1 rs2844315..G/G T/G LOC391273 rs3820116..A/A G/G PCNXL2 35.98 430
51 rs2716601..T/T LOC441506 rs1394713..A/A G/G OR51D1 rs2400517..C/C LOC730224 42.19 307
52 rs6462817..C/C T/T STARD3NL rs6703979..T/C T/T C1orf212 rs247447..A/A G/G TMEM170 40.96 331
53 rs7725278..T/T DRD1 rs2256527..T/C T/T LOC391273 rs3820116..A/A G/G PCNXL2 37.08 409
54 rs16862676..C/C T/T TSC22D2 rs1829975..A/A G/G TANK rs7632166..A/C C/C CNTN6 27.57 607
55 rs17513961..T/T LOC222052 rs2993446..T/C T/T LGR6 rs6908897..A/A A/G GJA1 17.93 999
56 rs16862676..C/C TSC22D2 rs1829975..A/A G/G TANK rs7632166..A/C C/C CNTN6 27.95 595
57 rs4723738..A/A G/G STARD3NL rs6703979..T/C T/T C1orf212 rs2073619..C/C T/T CFDP1 41.84 311
58 rs13080378..A/A G/G KBTBD8 rs9957770..A/A G/G LOC653322 rs17513961..T/T LOC222052 28.93 568
59 rs17513961..T/T LOC222052 rs10864734..A/C C/C GALNT2 rs2571490..C/C T/T PCDH7 19.84 904
60 rs6441260..A/A G/G SCHIP1 rs2027871..C/C T/C PKIG rs3732062..A/A SLC1A4 29.66 549
61 rs2716601..T/T LOC441506 rs10488192..C/C TMEM106B rs4638843..C/C G/G MSH2 42.27 302
62 rs7725278..T/T DRD1 rs2253239..T/C T/T LOC391273 rs10910651..A/A G/G PCNXL2 37.27 403
63 rs7725278..T/T DRD1 rs2253239..T/C T/T LOC391273 rs6695322..C/C T/T PCNXL2 37.27 403
64 rs1362578..C/C CBLN1 rs906691..C/C T/C LOC643307 rs7725278..C/C T/T DRD1 26.83 628
65 rs16862676..C/C TSC22D2 rs1829975..G/G TANK rs7632166..A/C C/C CNTN6 29.05 564
66 rs7725278..T/T DRD1 rs2844315..G/G T/G LOC391273 rs10910651..A/A G/G PCNXL2 37.39 400
67 rs7725278..T/T DRD1 rs2844315..G/G T/G LOC391273 rs6695322..C/C T/T PCNXL2 37.39 400
68 rs966775..C/C T/T DRD1 rs2256527..T/C T/T LOC391273 rs10910651..A/A G/G PCNXL2 37.76 392
69 rs966775..C/C T/T DRD1 rs2256527..T/C T/T LOC391273 rs6695322..C/C T/T PCNXL2 37.76 392
70 rs17513961..T/T LOC222052 rs168841..A/A G/G CARHSP1 rs2028548..C/C T/C IDH3A 21.46 827
71 rs1915279..A/A A/G KRT18P32 rs12159200..A/A MKL1 rs17098287..C/C T/T AK5 41.33 320
72 rs7725278..T/T DRD1 rs2253239..T/C T/T LOC391273 rs12624775..T/T C20orf82 41.82 310
73 rs2716601..T/T LOC441506 rs4638843..C/C MSH2 rs2409571..A/A A/G TNKS 41.75 311
74 rs12652255..A/A C/C DRD1 rs8121302..C/C T/C SNX5 rs7916830..T/C T/T BTRC 29.85 542
75 rs7292804..G/G T/T MKL1 rs6746008..T/G T/T FAM49A rs10093480..C/C T/T PLEKHF2 26.02 653
76 rs17513961..T/T LOC222052 rs6908897..A/A A/G GJA1 rs2571490..C/C T/T PCDH7 19.15 936
77 rs966775..C/C T/T DRD1 rs2844315..G/G T/G LOC391273 rs3820116..A/A G/G PCNXL2 34.81 444
78 rs6462817..C/C T/T STARD3NL rs3934594..C/C T/C C9orf47 rs10496695..A/A G/G NULL 29.82 542
79 rs7725278..T/T DRD1 rs2844315..G/G T/G LOC391273 rs3820116..G/G PCNXL2 36.32 419
80 rs1420589..C/C CBLN1 rs906691..C/C T/C LOC643307 rs7725278..C/C T/T DRD1 26.57 633
81 rs4723738..A/A G/G STARD3NL rs6703979..T/C T/T C1orf212 rs247447..A/A TMEM170 41.90 307
82 rs12652255..A/A C/C DRD1 rs2468133..T/C T/T EXT1 rs8121302..C/C T/C SNX5 30.40 527
83 rs1362578..C/C CBLN1 rs906691..C/C T/C LOC643307 rs12652255..A/A C/C DRD1 26.33 640
84 rs17513961..T/T LOC222052 rs2993446..T/C T/T LGR6 rs2028548..C/C T/C IDH3A 18.49 966
85 rs6462817..C/C T/T STARD3NL rs4899362..C/C T/C TTC9 rs2294360..A/G G/G SMCR7L 30.13 533
86 rs5995886..A/A G/G MKL1 rs6746008..T/G T/T FAM49A rs10093480..C/C T/T PLEKHF2 26.12 647
87 rs17513961..T/T LOC222052 rs1555772..T/C T/T GJA1 rs2571490..C/C PCDH7 19.53 912
88 rs16862676..C/C T/T TSC22D2 rs2993446..T/C T/T LGR6 rs10778233..A/A A/G LOC642550 19.27 925
89 rs4140512..A/A G/G MKL1 rs6746008..T/G T/T FAM49A rs10093480..T/T PLEKHF2 26.61 630
90 rs1420589..C/C CBLN1 rs906691..C/C T/C LOC643307 rs12652255..A/A C/C DRD1 26.12 646
91 rs16862676..C/C T/T TSC22D2 rs1829975..G/G TANK rs7632166..A/C C/C CNTN6 28.54 573
92 rs7725278..C/C T/T DRD1 rs2253239..T/C T/T LOC391273 rs12624775..T/T C20orf82 41.08 322
93 rs2716601..T/T LOC441506 rs4638843..C/C G/G MSH2 rs2409571..A/A A/G TNKS 41.27 318
94 rs583582..T/C T/T KRT8P5 rs426357..A/G ST8SIA6 rs2173399..T/C T/T LOC729708 41.96 304
95 rs17172693..C/C LOC222052 rs1555772..T/C T/T GJA1 rs2571490..C/C T/T PCDH7 19.94 888
96 rs1829975..A/A G/G TANK rs2033483..C/C T/T ZNF317 rs1503938..A/A A/G GALNTL4 22.11 794
97 rs966775..C/C T/T DRD1 rs2253239..T/C T/T LOC391273 rs3820116..G/G PCNXL2 35.35 433
98 rs7725278..C/C T/T DRD1 rs2256527..T/C T/T LOC391273 rs10489576..A/A G/G PCNXL2 34.86 441
99 rs10500413..G/G LOC644649 rs8121302..C/C T/C SNX5 rs12652255..A/A C/C DRD1 31.28 505
100 rs1829975..A/A G/G TANK rs11656883..T/C T/T KCNJ16 rs2046545..A/A G/G ACTR3 24.59 700
Abbreviations: ARR=absolute risk reduction; GWAS=genome-wide association study; ID=combination marker identification (based on ranking using the LOESS method); LOESS=local polynomial regression fitting; N=subgroup size; SNP=single nucleotide polymorphism.
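The ARR column can be read with the usual definition of absolute risk reduction. The sketch below assumes that ARR is the 28-day mortality in the placebo arm minus the 28-day mortality in the treated arm, expressed in percentage points, within the subgroup defined by a marker. The source gives only the abbreviation, so this definition, the function name, and the counts in the example are hypothetical.

```python
def absolute_risk_reduction(deaths_placebo, n_placebo, deaths_treated, n_treated):
    """Return ARR in percentage points for one subgroup.

    Assumed definition: placebo-arm mortality minus treated-arm mortality.
    A positive value means lower mortality in the treated arm.
    """
    risk_placebo = deaths_placebo / n_placebo
    risk_treated = deaths_treated / n_treated
    return 100.0 * (risk_placebo - risk_treated)


# Hypothetical subgroup: 100 patients per arm, 60 vs. 20 deaths by day 28.
arr = absolute_risk_reduction(60, 100, 20, 100)
subgroup_size = 100 + 100  # assumed to correspond to column N
print(f"ARR = {arr:.2f} percentage points, N = {subgroup_size}")
```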
Supplementary Table 6. One hundred highest-ranked 3-SNP genetic single markers from the multimarker response.
ID SNP Gene ARR N
1 rs12772424..T/T TCF7L2 20.39 509
2 rs2716601..T/T LOC441506 19.24 569
3 rs12951391..T/C FLJ37644 20.82 464
4 rs6462817..C/C T/T STARD3NL 16.52 761
5 rs497836..T/G GPC6 18.06 622
6 rs1078679..C/C HMGN4 21.13 421
7 rs6456733..C/C HMGN4 21.13 421
8 rs589258..T/C GPC6 17.85 621
9 rs10484442..A/A HMGN4 21.05 420
10 rs7204404..A/A KIAA1576 22.98 331
11 rs7205198..A/A KIAA1576 22.98 331
12 rs4723738..A/A G/G STARD3NL 16.43 727
13 rs2159499..A/A G/G STARD3NL 16.06 759
14 rs1915279..A/A A/G KRT18P32 17.30 646
15 rs767471..G/G HMGN4 20.87 421
16 rs9461271..G/G HMGN4 20.87 421
17 rs174957..A/G WIPF3 16.73 686
18 rs6925895..G/G HMGN4 20.71 422
19 rs1570061..T/T ABT1 20.55 430
20 rs1677991..A/G PPP2R5C 20.85 414
21 rs7795499..C/C T/T STARD3NL 15.73 758
22 rs359447..T/C CPEB4 18.86 517
23 rs9467782..G/G HMGN4 20.70 412
24 rs2224380..G/G HMGN4 20.48 422
25 rs6941022..T/T HMGN4 20.48 422
26 rs12159200..A/A MKL1 13.84 924
27 rs993629..C/C DEFB112 23.06 310
28 rs1321482..T/T ABT1 20.25 429
29 rs6918854..G/G ABT1 20.25 429
30 rs2716555..C/C LOC441506 20.22 427
31 rs2256537..T/C PPP2R5C 19.75 453
32 rs6564472..C/C KIAA1576 21.97 346
33 rs4871..C/C HMGN4 20.38 417
34 rs10946835..C/C HMGN4 20.44 414
35 rs9986382..T/T HMGN4 20.25 419
36 rs1535276..T/T BTN1A1 20.26 415
37 rs9393729..C/C BTN1A1 20.26 415
38 rs3917490..G/G PON1 22.19 327
39 rs11772060..T/G T/T LMTK2 17.66 564
40 rs1056667..T/T BTN1A1 20.18 414
41 rs2638458..C/C T/C LOC727878 14.25 841
42 rs12495889..T/G NAALADL2 17.34 586
43 rs4729408..T/C LMTK2 18.77 491
44 rs6456735..G/G ABT1 20.17 411
45 rs3803237..T/C T/T COL4A2 18.86 484
46 rs529998..C/C VCAN 19.70 436
47 rs2159688..A/A C7orf16 16.78 618
48 rs4312025..A/A A/C TMEM16C 21.90 333
49 rs2393670..A/A HMGN4 20.10 413
50 rs9467783..C/C HMGN4 20.10 413
51 rs611003..A/A CCND1 20.98 372
52 rs706723..A/A LOC728399 20.95 373
53 rs1982774..C/C CCND1 18.99 473
54 rs137749..A/G G/G EFCAB6 22.05 326
55 rs10869665..T/C PCSK5 17.53 563
56 rs7619971..A/A C/C CLSTN2 14.57 805
57 rs6505113..G/G MYO18A 18.45 503
58 rs2159688..A/A G/G C7orf16 14.75 787
59 rs12159200..A/A C/C MKL1 12.67 986
60 rs12652255..A/A C/C DRD1 12.04 1050
61 rs1829975..A/A G/G TANK 11.99 1055
62 rs7894316..C/C T/C PITRM1 15.47 711
63 rs12573176..A/C LOC645120 20.40 393
64 rs9295695..T/T HMGN4 19.94 414
65 rs17057618..A/A A/C ZFAND5 22.28 314
66 rs12652255..C/C DRD1 12.33 1014
67 rs11772060..T/G LMTK2 18.50 493
68 rs1894629..T/C T/T EFCAB6 21.78 330
69 rs4729408..T/C T/T LMTK2 17.37 562
70 rs17724172..T/C DLGAP1 19.62 427
71 rs7796440..C/C T/C DGKB 12.99 933
72 rs592483..T/T CCND1 18.65 480
73 rs1001594..A/G PCLO 15.55 691
74 rs426357..A/G ST8SIA6 16.76 602
75 rs10006329..T/T LOC646187 21.42 342
76 rs1478842..T/C T/T KRT8P18 15.13 729
77 rs993691..C/C T/T CLSTN2 14.91 752
78 rs16953047..T/G FTO 21.02 359
79 rs4754011..A/G JRKL 18.74 471
80 rs10895059..T/C T/T JRKL 18.67 475
81 rs1677991..A/G G/G PPP2R5C 18.02 512
82 rs4754011..A/A A/G JRKL 17.78 526
83 rs1897833..A/C CETN3 21.72 326
84 rs1915279..A/G KRT18P32 17.69 530
85 rs739234..A/A A/G EFCAB6 21.50 334
86 rs12941303..T/C FLJ37644 18.94 456
87 rs3952709..A/G CETN3 22.24 309
88 rs2394824..A/A A/G LMTK2 17.15 565
89 rs6895193..A/A G/G PRLR 14.14 811
90 rs1894631..A/A A/G EFCAB6 21.38 336
91 rs11918193..T/C SERPINI2 15.50 678
92 rs11263509..C/C CCND1 16.00 639
93 rs16862676..C/C TSC22D2 10.93 1132
94 rs16862676..C/C T/T TSC22D2 10.69 1158
95 rs17176958..A/G GVIN1 19.47 420
96 rs6953258..C/C LOC645973 20.97 352
97 rs11128801..A/G G/G LOC644638 15.68 659
98 rs17513961..T/T LOC222052 10.22 1207
99 rs12951391..T/C T/T FLJ37644 17.34 542
100 rs6780177..T/C ZNF385D 21.16 342
Abbreviations: ARR=absolute risk reduction; GWAS=genome-wide association study; ID=combination marker identification (based on ranking using the LOESS method); LOESS=local polynomial regression fitting; N=subgroup size; SNP=single nucleotide polymorphism.
Supplementary Table 7. Listing of the 20 single markers above the LOESS line for prognostic markers after the first stage. A cutoff was used for continuous variables based on what has been previously defined and accepted in the critical care field or by the regulatory agencies (for example, APACHE II score ≥25 or multiple organ dysfunction ≥2) or determined via a recursive partition method.
ID Subgroup Marker Risk N
1 blcrtclr..<52.34 BL Creatinine Clearance (Cockcroft-Gault) 46.36 302
2 prapache..≥25 Pre-infusion APACHE II Score 43.23 347
3 age..q4 Age at Admission (yrs) 47.34 169
4 age..q3 q4 Age at Admission (yrs) 41.13 355
5 blpcact..q1 BL Protein C Activity 46.63 163
6 rs11657397..A/A A/G C17orf79 47.90 119
7 blpt..q4 BL Prothrombin Time 45.06 162
8 rs2908435..T/T NXPH3 48.51 101
9 rs190254..T/C TSHZ2 47.79 113
10 rs12772424..T/T TCF7L2 41.35 237
11 rs2814471..A/G G/G TMCO1 44.97 149
12 rs6506428..A/A A/G ARHGAP28 41.47 217
13 blsofren..q2 q3 q4 BL SOFA Renal Score 36.72 433
14 rs1454371..A/G G/G FAM92A3 44.10 161
15 rs1800802..T/C MGP 41.63 209
16 rs12610468..C/C OLFM2 41.75 206
17 rs7204404..A/A KIAA1576 43.71 167
18 rs7205198..A/A KIAA1576 43.71 167
19 rs1454371..A/G FAM92A3 44.67 150
20 rs7555523..A/C C/C TMCO1 44.67 150
Abbreviations: APACHE II=acute physiology and chronic health evaluation II; BL=baseline; ID=marker identification (based on ranking using LOESS method); LOESS=local polynomial regression fitting; N=subgroup size; Risk=28-day mortality; SOFA=sequential organ failure assessment.
Supplementary Figure 1. Manhattan plots of the GWAS results for genotype AA versus not AA (a) and AB versus not AB (b). GWAS=genome-wide association study.
a. AA vs. not AA
b. AB vs. not AB
Supplementary Figure 2. Quantile-Quantile plots from GWAS in the entire genetic cohort.
Quantile-quantile plots showing the distribution of P values for three genotype contrasts: a homozygous genotype (BB) versus heterozygous (AB) or homozygous for the other allele (AA), that is, BB versus not BB (a); AA versus not AA (b); and AB versus not AB (c). The vast majority of markers have observed P values below the X=Y line that is expected due to chance. GWAS=genome-wide association study.
a. BB vs. not BB
b. AA vs. not AA
c. AB vs. not AB
Supplementary Figure 3. Multidimensional scaling of the highest-ranking 100, 3-SNP combination markers.
The top 3 multidimensional scaling (MDS) components were used to cluster the highest-ranking 100, 3-SNP combination markers (CMs). Six discrete clusters were observed, and each is defined largely by a SNP common to its members. For example, cluster 1 (red cluster) contains 54 CMs, all of which include the DRD1 and LOC391273 genes and the majority of which include PCNXL2. The CMs in cluster 5 are more diverse by comparison, and cluster 5 contains multiple sets of genes; this is visually depicted by the diffuse distribution in the MDS plot (green plot). SNP=single nucleotide polymorphism.
Supplementary Figure 4. Effect of permutation size on the variability of the top markers.
From the 1000 permutation results (1-, 2-, 3-marker top results), bootstrap samples of various sizes (10, 20, 50, 100, 200, 500, 1000) were generated, with each size repeated 10 times. We then calculated the standard deviation of the residuals from the LOESS fit on the combined top results for each bootstrap sample size. LOESS=local polynomial regression fitting.
Supplementary Methods.
Methodology for single marker analyses using a traditional association approach
For genetic markers, the following derived variables (to model different modes of inheritance)
were generated for each SNP (single nucleotide polymorphism) passing quality control:
• AA (two major alleles) or Not AA (0 or 1 copy of a major allele)
• Aa (heterozygous) or Not Aa (not heterozygotes)
• aa (two minor alleles) or Not aa (0 or 1 copy of a minor allele)
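The three derived variables listed above amount to one 0/1 indicator per genotype class. The following is a minimal sketch of that coding, assuming genotype calls are stored as the strings 'AA', 'Aa', and 'aa'; the function name and dictionary keys are hypothetical.

```python
def derive_indicators(genotype):
    """Map one genotype call to the three binary derived variables.

    genotype: 'AA' (two major alleles), 'Aa' (heterozygous),
              or 'aa' (two minor alleles); assumed encoding.
    """
    if genotype not in ("AA", "Aa", "aa"):
        raise ValueError(f"unexpected genotype call: {genotype!r}")
    return {
        "AA_vs_notAA": int(genotype == "AA"),
        "Aa_vs_notAa": int(genotype == "Aa"),
        "aa_vs_notaa": int(genotype == "aa"),
    }


# Exactly one indicator is 1 for any valid call.
for call in ("AA", "Aa", "aa"):
    print(call, derive_indicators(call))
```

Each indicator can then be entered as the SNP term in a separate model, which corresponds to the three P value columns reported per SNP in Supplementary Table 4.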
Logistic regression was used to identify a SNP-by-treatment interaction association with
treatment response. A forward selection procedure determined that age and baseline Acute
Physiology and Chronic Health Evaluation (APACHE) II score were included as covariates in the
model, in addition to treatment. This was completed for the total PROWESS cohort of severe
sepsis patients to maximize the power to see an effect. The analysis was then repeated in the
subgroup of interest, e.g., multiple organ dysfunction (MOD) ≥2 (the indicated population in
Europe for DAA), to examine this subgroup for consistency.
Likewise, analyses were completed on the total cohort population (all ethnicities) as well as the
Caucasian subgroup because of the possibility of population substructure confounding. A
population substructure adjustment was not included in the model because of the overwhelming
majority of Caucasians in the data.
A logistic model was fit to 28-day survival. Model 1 was used to test for a therapy*SNP
interaction. The general form of the logistic model was:
$$\hat{P} = \frac{1}{1 + e^{-x'b}} \qquad [1]$$
where the response and parameters (contained in the full x’b vector) were defined as
P̂ = probability of 28-day survival
b_0 = common intercept
b_age = effect of age
b_prapache = effect of baseline APACHE II score
b_therapy = main effect of therapy
b_SNP = main effect of SNP
b_therapy*SNP = therapy*SNP interaction.
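As an illustration, and assuming equation [1] is the standard logistic form, the linear predictor x'b for a marker tested as a single binary indicator (e.g., AA versus not AA) can be written out term by term. The symbols age, APACHE, T (therapy indicator), and G (marker indicator) are notation introduced here for readability and are not taken from the original analysis.

$$x'b = b_0 + b_{age}\,\mathrm{age} + b_{prapache}\,\mathrm{APACHE} + b_{therapy}\,T + b_{SNP}\,G + b_{therapy*SNP}\,(T \times G)$$

For a SNP modeled with 3 levels, G would be replaced by two indicator variables, giving two interaction terms.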
Log likelihood-ratio (LR) tests, in the form of the deviance (D), were used to estimate the P
value of therapy*SNP interaction effects (2-sided at 0.05 level) in the presence of additional
regressors. The deviance is asymptotically equivalent to taking the difference in the log
likelihoods of full and reduced models, LR_{full model} - LR_{reduced model}. For example, when the
therapy*SNP interaction is eliminated using model 1, the resulting deviance D(b_therapy*SNP | b_0,
b_age, b_prapache, b_therapy, b_SNP) = LR_{model1:full} - LR_{model1:reduced} represents the amount of variability in the
data explained by the therapy*SNP interaction in the presence of a common intercept, main
effects of age, baseline APACHE II score, therapy, and SNP, and is distributed as a chi-square
with 2 degrees-of-freedom for a SNP with 3 levels (if a marker is tested as AA versus not AA,
then the therapy*marker interaction is distributed as a chi-square with 1 degree-of-freedom).
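The P value calculation for the interaction test can be sketched as follows. This is a minimal illustration, not the software used in the study: the function name and the example log-likelihood values are hypothetical, and the deviance is taken as twice the difference in log likelihoods, which is the usual definition of the likelihood-ratio statistic. Closed-form chi-square tail probabilities are used for 1 and 2 degrees of freedom, the two cases mentioned above.

```python
import math

def lr_test_p_value(loglik_full, loglik_reduced, df):
    """Return (deviance, P value) for a likelihood-ratio test.

    loglik_full / loglik_reduced are the maximized log likelihoods of the
    model with and without the therapy*SNP interaction term(s).
    """
    deviance = 2.0 * (loglik_full - loglik_reduced)
    if deviance < 0:
        raise ValueError("full model cannot fit worse than the nested reduced model")
    if df == 1:
        # Chi-square(1) upper tail: P(Z^2 > d) = erfc(sqrt(d / 2))
        p_value = math.erfc(math.sqrt(deviance / 2.0))
    elif df == 2:
        # Chi-square(2) upper tail: exp(-d / 2)
        p_value = math.exp(-deviance / 2.0)
    else:
        raise ValueError("only df = 1 or df = 2 are handled in this sketch")
    return deviance, p_value

# Hypothetical log likelihoods for a marker tested as AA versus not AA (df = 1)
deviance, p_value = lr_test_p_value(-100.0, -102.5, df=1)
```

With these made-up inputs the deviance is 5.0, which falls beyond the 0.05 critical value of 3.84 for 1 degree of freedom.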
Similarly, logistic regression was used to identify a prognostic marker. Logistic regression tests
were used to estimate the P value of marker effect (2-sided at 0.05 level) in the presence of
additional regressors.
Methodology for the multimarker search
For clinical markers, important subgroups of patients as defined by clinical criteria such as
APACHE II ≥25 and multiorgan dysfunction (MOD) ≥2 were included as factors in the modeling
so as to identify their interaction and relative importance in the presence of genetic and other
non-genetic markers.
In addition, clinical markers, such as protein C levels, interleukin 6 (IL-6) levels, disseminated
intravascular coagulation (DIC) score, thrombin-antithrombin (TAT) complex, baseline
Cockcroft-Gault estimation of creatinine clearance <52.3, were evaluated as single markers as
described below in Step 1 and could be included in the modeling as described in Step 2 (see
below).
For selected clinical markers, a CUTOFF may be used that has been previously defined and
accepted in the critical care field or by the regulatory agencies (e.g., APACHE II score ≥25 or
MOD ≥2) or determined via a recursive partition method. The following derived variable was
generated for selected clinical markers or additional subgroups:
• ≥ (or >) CUTOFF or < (or ≤) CUTOFF
We could further subset clinical data and derived variables (subgroup of patients) by
quartiles to generate subgroups that could include:
• X ≤ 1st quartile or X > 1st quartile
• X ≤ 1st quartile, 1st quartile < X ≤ 3rd quartile or X > 3rd quartile
• X ≤ 1st quartile, 1st quartile < X ≤ 2nd quartile, 2nd quartile < X ≤ 3rd quartile or
X > 3rd quartile
• X ≤ 2nd quartile or X > 2nd quartile
• X ≤ 3rd quartile or X > 3rd quartile
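A small sketch of the quartile-based splits is given below. It assumes the "inclusive" quantile method of Python's standard library; the source does not state which quartile definition was used, and the function name and example values are hypothetical.

```python
import statistics

def quartile_splits(values):
    """Return the quartile cut points and the binary splits at each one."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "cutpoints": (q1, q2, q3),
        "le_q1": [x <= q1 for x in values],  # X <= 1st quartile or X > 1st quartile
        "le_q2": [x <= q2 for x in values],  # X <= 2nd quartile or X > 2nd quartile
        "le_q3": [x <= q3 for x in values],  # X <= 3rd quartile or X > 3rd quartile
    }

# Hypothetical baseline values for nine patients
splits = quartile_splits([1, 2, 3, 4, 5, 6, 7, 8, 9])
```

The 3- and 4-level groupings in the list above follow by combining these binary splits.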
Each subgroup defined by one marker (genetic, clinical, or subgroup) or combination of markers
with sufficient size (>20% of total population) was assessed for differential treatment response
(as defined by absolute risk reduction [ARR] or number needed to treat [NNT]). To reduce
computation time, ARR or NNT could be calculated as observed instead of as model based and
the confidence interval of ARR or NNT may not be generated.
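The observed (not model-based) calculation described above can be sketched as follows. The function name and the toy data are hypothetical; survival is coded 1 for alive at day 28, and no confidence interval is computed, in line with the shortcut described in the text.

```python
def observed_arr_nnt(treated, survived, in_subgroup):
    """Return (ARR, NNT, subgroup frequency) from observed 28-day survival.

    treated, survived, and in_subgroup are equal-length lists of 0/1 flags,
    one entry per patient.
    """
    n_total = len(treated)
    trt = [s for t, s, g in zip(treated, survived, in_subgroup) if g and t]
    pbo = [s for t, s, g in zip(treated, survived, in_subgroup) if g and not t]
    if not trt or not pbo:
        raise ValueError("subgroup needs patients in both arms")
    # ARR: observed survival rate on treatment minus survival rate on placebo
    arr = sum(trt) / len(trt) - sum(pbo) / len(pbo)
    # NNT is the inverse of ARR; left undefined when there is no benefit
    nnt = 1.0 / arr if arr > 0 else None
    frequency = (len(trt) + len(pbo)) / n_total
    return arr, nnt, frequency

# Hypothetical toy data: 10 patients, the first 8 belong to the subgroup
treated     = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
survived    = [1, 1, 1, 0, 1, 1, 0, 0, 0, 1]
in_subgroup = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
arr, nnt, frequency = observed_arr_nnt(treated, survived, in_subgroup)
```

In this toy example, survival is 75% on treatment and 50% on placebo within the subgroup, so ARR is 0.25, NNT is 4, and the subgroup frequency is 0.8.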
A 2-step method was used to identify markers that define subgroups with better differential
treatment response in a sizable number of subjects [23-25,28]. Marchini et al. (2005) proposed a
2-stage approach, and other authors (Evans et al. 2006; Ionita and Man 2006; Zhang et al. 2010)
subsequently recommended alternative approaches shown to have more power. In this
manuscript we used a 2-step conditional approach augmented by LOESS and MDS
strategies.
• Step 1. Calculated ARR or NNT for each subgroup defined by a single marker. ARR could be calculated as the observed difference in survival rate between the treatment and placebo groups. Alternatively, ARR could be determined from a logistic model with appropriate covariates, e.g., logit (survival) ~ treatment + marker + treatment*marker + age + additional covariates (the first few principal components or other covariates). The calculation of observed ARR is computationally more efficient and was therefore the primary analysis. NNT can be calculated as the inverse of ARR.
• Step 2. Selected the top markers from the first step (chosen for the best balance of maximal ARR and subgroup size). The number of markers (M) selected from the first step depended on the computing capacity, since the maximal computation is bounded by M*N (N is the total number of derived marker variables). If M is 1000 and N is 5 million, the total number of subgroups is 5 billion. If computer resources allow only 1 billion calculations, then M needs to be reduced to 200. Each “significant” marker, whether clinical or genetic, was paired with another marker to define a new subgroup, and the ARR or NNT was then calculated for the new subgroup. Step 2 could be iterated to include one or more additional markers.
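The two steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not the study's implementation: Step 1 ranks markers by observed ARR alone (the text instead balances ARR against subgroup size), a pair of markers defines a subgroup as the patients positive for both, and all function names and toy data are hypothetical.

```python
def arr_and_frequency(treated, survived, member):
    """Observed ARR (treated minus placebo survival) and subgroup frequency."""
    trt = [s for t, s, m in zip(treated, survived, member) if m and t]
    pbo = [s for t, s, m in zip(treated, survived, member) if m and not t]
    freq = (len(trt) + len(pbo)) / len(member)
    if not trt or not pbo:
        return None, freq
    return sum(trt) / len(trt) - sum(pbo) / len(pbo), freq

def max_top_markers(budget, n_derived):
    """Largest M such that M * N stays within the computation budget."""
    return budget // n_derived

def two_step_search(treated, survived, markers, m_top, min_freq=0.20):
    # Step 1: score every single-marker subgroup and keep the top M
    step1 = []
    for name, member in markers.items():
        arr, freq = arr_and_frequency(treated, survived, member)
        if arr is not None and freq > min_freq:
            step1.append((arr, name))
    step1.sort(reverse=True)
    top = [name for _, name in step1[:m_top]]
    # Step 2: pair each top marker with every other marker
    step2 = []
    for first in top:
        for second, member2 in markers.items():
            if second == first:
                continue
            both = [a and b for a, b in zip(markers[first], member2)]
            arr, freq = arr_and_frequency(treated, survived, both)
            if arr is not None and freq > min_freq:
                step2.append((arr, freq, first, second))
    step2.sort(reverse=True)
    return top, step2

# Hypothetical toy data: 8 patients, 3 binary marker variables
treated  = [1, 1, 1, 1, 0, 0, 0, 0]
survived = [1, 1, 1, 0, 1, 0, 0, 1]
markers = {
    "m1": [1, 1, 1, 0, 0, 1, 1, 0],
    "m2": [1, 1, 0, 0, 1, 1, 0, 0],
    "m3": [1, 1, 1, 1, 1, 1, 1, 1],
}
top, pairs = two_step_search(treated, survived, markers, m_top=1)
```

Calling `max_top_markers(1_000_000_000, 5_000_000)` reproduces the budget example in Step 2 and returns 200.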
Because multiple subgroups defined by many markers were examined, a chance finding of a
subgroup with favorable ARR and subgroup size could occur. Bonferroni correction is commonly
used for multiple comparisons. The total number of subgroups that could be examined may be
as high as 10^12. Using N=10^12 for Bonferroni correction would virtually guarantee that no
significant result would be found due to the diminished power with the current sample size.
Hence, it was not reasonable to adjust the P value based on total number of all possible
subgroups or tests. For this reason, it has been argued that the P value threshold should not be
based on the number of tests performed since larger studies with more markers are more likely
to succeed than studies with a smaller number of markers [30]. In this study, we used ARR
(differential treatment effect) instead of P value as the selection criterion for the 2-step search. We
selected subgroups with ARR >12.5% and subgroup frequency >20%. The total number of
selected subgroups was M, and the subgroups may not be independent.
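For scale, the Bonferroni-adjusted per-test threshold discussed above can be worked out directly, taking the 2-sided 0.05 level used elsewhere in these methods and N = 10^{12} possible subgroups:

$$\alpha_{\mathrm{Bonferroni}} = \frac{\alpha}{N} = \frac{0.05}{10^{12}} = 5 \times 10^{-14}$$

A threshold of this size is far beyond what the available sample size could support, which motivates selecting on ARR and subgroup frequency instead.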
Methodology for linkage disequilibrium analyses
Linkage disequilibrium (LD) is a population genetics measure of the non-random association of
alleles at two or more loci. The pair-wise LD structure, as determined by the r^2 correlation
coefficient among SNPs, was calculated and visualized using Haploview software [33]. The Gabriel
method [34] was then used to infer population-based haplotype blocks (that is, genomic regions
over which there is little evidence for historical recombination and within which only a limited
number of haplotypes are observed). Haploview default settings were used, which assume no
LD among SNPs that are more than 500 kb apart. Linkage disequilibrium plots using all SNPs
identified in the analyses were generated by chromosome in the overall cohort and Caucasian
population for every chromosome with at least 2 SNPs represented in the top 100 CMs.
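For reference, the pair-wise r^2 statistic for two biallelic SNPs can be computed from haplotype and allele frequencies as sketched below. The sketch assumes the frequencies are already known; estimating haplotype frequencies from unphased genotypes, which LD software must do, is not shown. The function name and example frequencies are hypothetical.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    # D: deviation of the haplotype frequency from independence
    d = p_ab - p_a * p_b
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined for a monomorphic locus")
    return d * d / denom

# Hypothetical frequencies: alleles A and B always occur together
r2_complete = ld_r_squared(p_ab=0.3, p_a=0.3, p_b=0.3)
# Hypothetical frequencies: alleles occur independently
r2_none = ld_r_squared(p_ab=0.06, p_a=0.2, p_b=0.3)
```

The first example gives r^2 of 1 (complete LD) and the second gives r^2 of 0 (no LD), up to floating-point rounding.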
Methodology for permutation testing (and modified prognostic markers)
Permutation – treatment response
The permutation runs to establish a reference distribution for the top 100 combination markers
from GWAS for treatment response were performed as follows:
1. Fix patient info, candidate gene data, and response (Xp, Xc, and Y, respectively) and shuffle the
patient labels of the GWAS genetic data (Xg), e.g.,
Y1, Xp1, Xc1, Xg1 → Y1, Xp1, Xc1, Xg1' (1' is a shuffled/permuted patient)
2. Perform 2-stage search and save 1-, 2-, 3-marker top results (use cutoff: arr.minus.fit > 0).
3. Iterate 1000 times (permutation runs).
4. All permutation results (top hits) are combined. Due to the skewness of the data, a LOESS-smoothed
line based on the 95th (or 99th) percentile for each N was generated to determine
whether observed top GCMs fall outside the 95% line.
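The permutation scheme in steps 1 to 4 can be sketched as follows. This is a deliberately reduced illustration: it searches single markers only, uses observed ARR as the statistic, and replaces the LOESS-smoothed percentile line with a plain empirical percentile of the per-run maxima. All names and the toy data are hypothetical.

```python
import random

def max_observed_arr(treated, survived, genotypes, min_freq=0.20):
    """Largest observed ARR over all single-marker subgroups.

    genotypes is a list with one tuple of 0/1 marker indicators per patient.
    """
    n = len(treated)
    best = float("-inf")
    for j in range(len(genotypes[0])):
        trt = [survived[i] for i in range(n) if genotypes[i][j] and treated[i]]
        pbo = [survived[i] for i in range(n) if genotypes[i][j] and not treated[i]]
        if not trt or not pbo or (len(trt) + len(pbo)) / n <= min_freq:
            continue
        best = max(best, sum(trt) / len(trt) - sum(pbo) / len(pbo))
    return best

def permutation_reference(treated, survived, genotypes, n_perm=1000, pct=95, seed=0):
    """Reference distribution of the top ARR under shuffled genotype labels."""
    rng = random.Random(seed)
    shuffled = list(genotypes)
    null_max = []
    for _ in range(n_perm):
        # Whole genotype rows are reassigned to patients; response, therapy,
        # and all other patient data stay fixed.
        rng.shuffle(shuffled)
        null_max.append(max_observed_arr(treated, survived, shuffled))
    null_max.sort()
    # Empirical percentile, index = ceil(pct * n_perm / 100) - 1
    cutoff = null_max[(pct * n_perm + 99) // 100 - 1]
    return null_max, cutoff

# Hypothetical toy data: 20 patients, 3 marker indicators each
treated = [i % 2 for i in range(20)]
survived = [1 if i % 3 else 0 for i in range(20)]
genotypes = [(1, int(i < 10), int(i % 4 == 0)) for i in range(20)]
null_max, cutoff = permutation_reference(treated, survived, genotypes, n_perm=200, seed=1)
```

An observed top ARR would then be compared with `cutoff`; the study instead compares observed markers with a smoothed percentile line that varies with subgroup size.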
GWAS for prognostic markers
We identified markers for risk of 28-day mortality in the placebo population using the 2-stage multimarker
search. Top markers were selected based on the LOESS method that balances the tradeoff
between risk (28-day mortality rate in placebo) and subgroup size defined by the marker, and were required to have:
a. at least 100 subjects (~14%, out of 720 subjects on placebo) and
b. at least 36% mortality (approximately 20% increase over the overall placebo population's
mortality rate) and
c. at least –0.01 (or 0 for 3-marker) relative to LOESS line.
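The three criteria above can be expressed as a simple filter. This sketch assumes that "relative to LOESS line" means the observed mortality minus the LOESS-fitted mortality at the same subgroup size; the LOESS fit itself is taken as an input, and the function name and example values are hypothetical.

```python
def passes_prognostic_criteria(n_subjects, mortality, loess_fit, n_markers):
    """Apply criteria a to c to one candidate marker subgroup.

    n_subjects: placebo subjects in the subgroup
    mortality:  observed 28-day mortality rate in the subgroup (0 to 1)
    loess_fit:  LOESS-fitted mortality at this subgroup size (assumed input)
    n_markers:  number of markers defining the subgroup (1, 2, or 3)
    """
    residual_floor = 0.0 if n_markers == 3 else -0.01
    return (
        n_subjects >= 100                              # criterion a
        and mortality >= 0.36                          # criterion b
        and (mortality - loess_fit) >= residual_floor  # criterion c
    )

# Hypothetical 2-marker subgroup: 120 subjects, 40% mortality, LOESS fit 40.5%
keep = passes_prognostic_criteria(120, 0.40, 0.405, n_markers=2)
```

The same subgroup defined by 3 markers would be rejected, because its residual of about -0.005 is below the stricter floor of 0.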
Permutation – prognostic markers
The permutation runs to establish a reference distribution for the top markers identified in
GWAS for prognostic markers were performed as follows:
1. Shuffle survival status and therapy together (Y, T) while fixing patient information, candidate
gene data (Xp and Xc), and GWAS genetic data (Xg). e.g.,
(Y1,T1), Xp1, Xc1, Xg1 → (Y1',T1'), Xp1, Xc1, Xg1 (1' is a shuffled/permuted patient)
2. Perform 2-stage search and save 1-, 2-, 3-marker top results (use cutoff: arr.minus.fit > 0).
3. Iterate 1000 times (permutation runs).
4. All permutation results (top hits) are combined. Due to the skewness of the data, a LOESS-smoothed
line based on the 95th (or 99th) percentile for each N was generated to determine
whether observed top CMs fall outside the 95% line.
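The joint shuffle in step 1 differs from the treatment-response permutation in that survival status and therapy move together. A minimal sketch of that shuffle, with a hypothetical function name and toy data:

```python
import random

def shuffle_outcome_and_therapy(survived, treated, seed=None):
    """Permute (Y, T) pairs across patients while keeping each pair intact.

    Patient information and genetic data are left in their original order,
    so only the link between (Y, T) and the markers is broken.
    """
    rng = random.Random(seed)
    pairs = list(zip(survived, treated))
    rng.shuffle(pairs)
    shuffled_survived = [y for y, _ in pairs]
    shuffled_treated = [t for _, t in pairs]
    return shuffled_survived, shuffled_treated

# Hypothetical toy data for six patients
survived = [1, 0, 1, 1, 0, 1]
treated  = [0, 0, 1, 1, 1, 0]
y_perm, t_perm = shuffle_outcome_and_therapy(survived, treated, seed=3)
```

Because the pairs are kept intact, the mortality rate within each therapy arm is unchanged by the shuffle.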
Supplementary References
23. Marchini J, Donnelly P, Cardon LR. Genome-wide strategies for detecting multiple loci that influence complex diseases. Nat Genet 2005; 37: 413-417.
24. Evans DM, Marchini J, Morris AP, Cardon LR. Two-stage two-locus models in genomewide association. PLoS Genet 2006; 2: e157.
25. Ionita I, Man M. Optimal two-stage strategy for detecting interacting genes in complex diseases. BMC Genet 2006; 7: 39.
28. Zhang Z, Niu A, Sha Q. Identification of interacting genes in genome-wide association studies using a model-based two-stage approach. Ann Hum Genet 2010; 74: 406-415.
30. [WTCCC] Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 2007; 447: 661-678.
31. Fellay J, Shianna KV, Ge D, Colombo S, Ledergerber B, Weale M et al. A whole-genome association study of major determinants for host control of HIV-1. Science 2007; 317: 944-947.
32. Sullivan PF, Lin D, Tzeng JY, van den Oord E, Perkins D, Stroup TS et al. Genomewide association for schizophrenia in the CATIE study: results of stage 1. Mol Psychiatry 2008; 13: 570-584.
33. Barrett JC, Fry B, Maller J, Daly MJ. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 2005; 21: 263-265.
34. Gabriel SB, Schaffer SF, Nguyen H, Moore JM, Roy J, Blumenstiel B et al. The structure of haplotype blocks in the human genome. Science 2002; 296: 2225-2229.