KAROLINSKAINSTITUTET
International Biobank and Cohort Studies: Developing a Harmonious Approach
February 7-8, 2005, Atlanta, GA
Karolinska Institutet
GenomEUtwin, Karolinska Institutet Biobank, LifeGene
Jan-Eric Litton, Karolinska Institutet, Stockholm, Sweden
Karolinska Institutet, Stockholm, Sweden
Stockholm… an archipelago with 30,000 islands
A letter from King Karl XIII to the Collegium Medicum in 1810 authorized the immediate establishment of a ”college for the corps of field surgeons”.
War and cholera led to the royal decree ...
From a school for army surgeons to a medical university
1810 The Royal Carolinska Medico-Surgical Institute was founded ”for the creation of skilled Army Surgeons”.
1811 Jöns Jacob Berzelius became one of KI’s first professors and laid the foundation for the Institute’s natural-scientific orientation.
1895 Alfred Nobel appointed Karolinska Institutet to decide who should be awarded the Nobel Prize in Physiology or Medicine.
1993 A comprehensive reorganization of KI began: 150 departments became 27.
GenomEUtwin
• The European Consortium of Twin Registries for Analyses of Complex Traits; www.genomeutwin.org
• Aims to capitalize on the special advantages of Europe in population genetics
• The goal is to identify critical genetic and life-style risk factors for common diseases using European strengths in genetics, epidemiology and biocomputing
GenomEUtwin
• Funded by EU as one of three Centers of Excellence in Genomics in 2002, PI: Leena Peltonen
• Twin cohorts from Denmark, Finland, Italy, the Netherlands, Norway and Sweden (England and Australia): over 600,000 twin pairs in total, 65% dizygotic
• MORGAM cohort from 9 European countries, more than 144,000 participants in total
GenomEUtwin
• Twin cohorts have been used for decades to estimate the role of genetics in the background of common diseases
• Phenotype information has been gathered for decades
• In Sweden:
– Civic registration number, 1947
– Swedish Cancer Registry, 1958
– Cause of Death Registry, started 1756, modern classification 1961
TwinNET
• First ten months: a database structure established
• Database completion: a common, secure database established in Europe for all relevant scientific information in GenomEUtwin
TwinNET
• EUid number (EUIDNUM) 752000021210
– The EUid number consists of four parts:
Country code 3 digits – ISO 3166
Randomized number 7 digits
Identification number 1 digit
Check sum 1 digit
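A minimal sketch of how the four parts could be read out of an EUIDNUM by position. It assumes the parts appear in the string in the order listed above (3 + 7 + 1 + 1 = 12 digits); the names `EuId` and `parse_euidnum` are hypothetical. The slide does not say how the check sum is calculated, so the sketch only extracts the digit and does not verify it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EuId:
    country_code: str   # 3 digits, ISO 3166 numeric country code
    random_number: str  # 7 digits
    ident_number: str   # 1 digit
    check_digit: str    # 1 digit; the algorithm is not given, so it is not checked


def parse_euidnum(euidnum: str) -> EuId:
    """Split a 12-digit EUIDNUM into its four parts (assumed order: as listed)."""
    if len(euidnum) != 12 or not (euidnum.isascii() and euidnum.isdigit()):
        raise ValueError("EUIDNUM must be exactly 12 digits")
    return EuId(
        country_code=euidnum[0:3],
        random_number=euidnum[3:10],
        ident_number=euidnum[10],
        check_digit=euidnum[11],
    )


# The example number from the slide; 752 is the ISO 3166 numeric code for Sweden.
parts = parse_euidnum("752000021210")
```

Keeping the parts as strings preserves the leading zeros of the randomized number.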
TwinNET
VARIABLE FORMAT STANDARD – DRAFT 1
Subject identifier
– Country and/or center
– Pair number/family number
– Twin number
Individual specific – permanent
– Date of birth
– Gender
– Date of death
– Place of birth (region of birth)
– Zygosity
– Birth weight
Changeable phenotypes/socio-demographic issues
– Weight
– Height
(include age of assessment, time of assessment, and mode of assessment for each item)
Appendix A – Migraine Phenotype
Version: 1.1
Author: Ann Björklund
Participants: Ingunn Brandt, Lars Hvidberg, Anne Leinonen
Phenotype Group of Migraine: Elles Mulder, David Gaist, Mikko Kallela, Dorret Boomsma, Aarno Palotie
TwinNET
2 Data format

Variable name: MIGR_DIAGNOSIS
Description: Type of migraine
Type: NUMBER
Length: 2
Values:
1 – Migraine with aura
2 – Migraine without aura
3 – Not classified
4 – Migraine with or without aura
5 – Never had migraine
97 – Irrelevant/non-participant/non-response
98 – Do not know
99 – Unknown

Variable name: MIGR_LEVEL
Description: Different levels of diagnostic accuracy, level A being the lowest and level D being the highest
Type: NUMBER
Length: 2
Values:
1 – Level A. Affirmative response to at least one of the questions (see section 1)
2 – Level B. Self-reported migraine
3 – Level C. Interview-based migraine
4 – Level D. Diagnosed migraine
97 – Irrelevant/non-participant/non-response
98 – Do not know
99 – Unknown

Variable name: MIGR_DATE_DIAG
Description: Date of diagnosis
Type: DATE
Length: 8
Values:
YYYYMMDD – Year, month, day
YYYYMM15 – Day unknown
YYYY0701 – Day and month unknown
NULL/blank – Irrelevant/non-participant/non-response; Irrelevant/structural; Do not know; Unknown

Variable name: MIGR_DATE_ERROR
Description: Error code for date of migraine diagnosis
Type: NUMBER
Length: 1
Values:
1 – Date is complete
2 – Day is missing, coded 15
3 – Day and month are missing, coded 0701
6 – Irrelevant/non-participant/non-response
7 – Irrelevant, structural
8 – Do not know
9 – Unknown
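The pairing of MIGR_DATE_DIAG and MIGR_DATE_ERROR can be sketched as a small encoder for partial dates. This is an illustration of the coding rules listed above, not the project's actual loader; the function name `encode_diag_date` and its `missing_code` parameter are hypothetical.

```python
import datetime
from typing import Optional, Tuple


def encode_diag_date(year: Optional[int],
                     month: Optional[int] = None,
                     day: Optional[int] = None,
                     missing_code: int = 9) -> Tuple[Optional[str], int]:
    """Return (MIGR_DATE_DIAG, MIGR_DATE_ERROR) for a possibly partial date."""
    if year is None:
        # No usable date: NULL/blank, with the reason given as code 6, 7, 8 or 9.
        if missing_code not in (6, 7, 8, 9):
            raise ValueError("missing_code must be 6, 7, 8 or 9")
        return None, missing_code
    if month is None:
        # Day and month missing (a day without a month is ignored): coded 0701.
        return f"{year:04d}0701", 3
    if day is None:
        # Day missing: coded as the 15th of the month.
        datetime.date(year, month, 15)  # raises ValueError for an invalid month
        return f"{year:04d}{month:02d}15", 2
    datetime.date(year, month, day)  # raises ValueError for an impossible date
    return f"{year:04d}{month:02d}{day:02d}", 1
```

Storing the error code next to the date lets an analyst tell a true 15th or 1 July apart from an imputed one.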
Description of the migraine phenotype
Migraine is a complex neurovascular disorder and can be divided into migraine with aura and migraine without aura. These subtypes are clinically distinguishable, but it is as yet unclear whether different genes are involved. The levels described below reflect different levels of diagnostic accuracy, with level A being the lowest and level D the highest.
Level B is the minimal condition for initiating a genome scan for migraine. Countries that are currently at level A will therefore need to confirm their diagnoses using the extensive migraine screening questionnaires (Palotie, Kallela, and Ferrari, 2003) that have been validated in a Finnish population sample. The questionnaire will also require translation and back-translation checking, as well as review by national migraine specialists for appropriate wording.
Level A: Affirmative response to at least one of the following questions:
– Have you ever had migraine?
– Have you ever had visual disturbances of 5 to 60 minutes' duration followed by headaches?
– Has a doctor ever told you that you have migraine?
– Have you suffered from headache attacks during the previous year and, if so, were the attacks associated specifically with any of the following symptoms: visual disturbances, nausea or vomiting, unilateral location?
Level B: Self-reported migraine
Positive classification for migraine with aura or migraine without aura based on the extensive screening questionnaires (under construction). The classification has not been confirmed by a structured interview conducted by trained personnel, a neurologist or a physician.
Level C: Interview-based migraine
Positive classification for migraine with aura or migraine without aura by a (locally designed) questionnaire that includes the IHS criteria. The classification has been confirmed by telephone or home interviews conducted by trained personnel.
Level D: Diagnosed migraine
Positive classification for migraine with aura or migraine without aura based on IHS criteria during a face-to-face interview and examination conducted by a neurologist or physician.
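As an illustration of how the MIGR_LEVEL codes relate to the rules above, the following sketch checks the Level A condition and the "Level B is the minimal condition" rule for a genome scan. The names `MigrLevel`, `meets_level_a` and `eligible_for_genome_scan`, and the answer keys in the usage lines, are hypothetical.

```python
from enum import IntEnum
from typing import Mapping


class MigrLevel(IntEnum):
    A = 1  # affirmative response to at least one screening question
    B = 2  # self-reported migraine
    C = 3  # interview-based migraine
    D = 4  # diagnosed migraine


def meets_level_a(answers: Mapping[str, bool]) -> bool:
    """Level A: at least one of the screening questions was answered yes."""
    return any(answers.values())


def eligible_for_genome_scan(migr_level: int) -> bool:
    """Level B or higher; the codes 97, 98 and 99 are never eligible."""
    return MigrLevel.B <= migr_level <= MigrLevel.D


# Hypothetical answers to two of the Level A questions.
answers = {"ever_had_migraine": False, "doctor_told_migraine": True}
level_a = meets_level_a(answers)
```

The upper bound in `eligible_for_genome_scan` matters because the missing-value codes 97 to 99 are numerically larger than the level codes.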
Federated data approach
• Data stay untouched
– Integrates heterogeneous local or remote data sources through wrappers
• Just need to know what data should be available to whom and how to access them
• It makes all data look like one virtual database, hiding the complexity of the data layer
ODBC – JDBC and more
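The wrapper idea can be sketched in a few lines: each source stays where it is, a wrapper maps it to a common row shape, and a hub queries all wrappers as if they were one table. This is a toy illustration, not the federation product used in the project; the class names, column names and sample rows are all hypothetical.

```python
import csv
import io
import sqlite3
from typing import Dict, Iterator, List, Protocol


class Wrapper(Protocol):
    def rows(self) -> Iterator[Dict[str, str]]: ...


class CsvWrapper:
    """Wraps a flat file (CSV); the source text is read, never modified."""

    def __init__(self, text: str, source: str) -> None:
        self._text, self._source = text, source

    def rows(self) -> Iterator[Dict[str, str]]:
        for rec in csv.DictReader(io.StringIO(self._text)):
            yield {"source": self._source, "id": rec["id"], "zygosity": rec["zygosity"]}


class SqlWrapper:
    """Wraps a relational table and renames its columns to the common schema."""

    def __init__(self, conn: sqlite3.Connection, source: str) -> None:
        self._conn, self._source = conn, source

    def rows(self) -> Iterator[Dict[str, str]]:
        for twin_id, zyg in self._conn.execute("SELECT twin_id, zyg FROM twins"):
            yield {"source": self._source, "id": str(twin_id), "zygosity": zyg}


class FederatedHub:
    """Presents all wrapped sources as one virtual table."""

    def __init__(self, wrappers: List[Wrapper]) -> None:
        self._wrappers = wrappers

    def select(self, **conditions: str) -> List[Dict[str, str]]:
        return [row for w in self._wrappers for row in w.rows()
                if all(row.get(k) == v for k, v in conditions.items())]


# Two made-up sources: a CSV file and a relational table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE twins (twin_id INTEGER, zyg TEXT)")
conn.executemany("INSERT INTO twins VALUES (?, ?)", [(1, "DZ"), (2, "MZ")])
hub = FederatedHub([CsvWrapper("id,zygosity\n10,DZ\n11,MZ\n", "FI"),
                    SqlWrapper(conn, "SE")])
dz = hub.select(zygosity="DZ")  # one query, answered from both sources
```

The caller of `select` never sees whether a row came from a flat file or a database, which is the point of hiding the data layer.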
TwinNET
Relational wrappers list
DB2 family
Informix (Informix Client SDK)
Oracle (SQLNet or Net8 client)
MS SQL Server (ODBC driver version 3.0 or later)
Sybase (Sybase open client)
Teradata (Teradata CLI)
ODBC (ODBC driver version 3.x)
Non-relational wrappers list
Lotus Extended search
Excel
Flatfile (CSV format)
XML (1.0 specifications)
OLE DB
BLAST
Documentum (Documentum client API)
Entrez (version 1.0)
TwinNET
[Diagram: the GenomEUtwin centres' databases (Finnish Twin Registry, Swedish Twin Registry, Genotypes) are connected through a federated database hub to the GenomEUtwin user.]
TwinNET December 2004
[Diagram, labels as shown: GT (three genotyping sites): FGC Helsinki; Mol Med; Rudbeck lab Uppsala. Genotypes; SMdb; Genotype submission database, Helsinki. Phenotypes: DK, NO, FI, SE; transfer by SSH and Excel sheet. Database federation – Information Integrator, Stockholm; WEB; SQL. Hardware: IBM xSeries 325, 2x Opteron, 4 GB, SAN.]
Karolinska Institutet Biobank
• Karolinska Institutet Biobank constitutes a national, non-commercial resource for the collection, handling and storage of human biological material, aiming to promote scientific excellence in molecular and genetic research.
• Biobank informatics is a central part of the Karolinska Institutet Biobank
Biobank Information Management System - vision
A system that will enable a researcher to search and navigate the enormous amounts of information available for samples and their donors in various data sources.
Biobank Information Management System
[Diagram, labels as shown: BIMS; LIMS; Freezer; Web Interface; Lab Robot; DB:s; Other Biobanks]
Biobank Information Management System
Architectural Overview Diagram
[Diagram, components as labelled: WebSphere Application Server; WebSphere Portal; DB2 Portal Database; Web Server; Portal Admin; LDAP Directory; Web Interface; Data Collection; DDQB Query Interface; DDQB Data Abstraction; Replication; DB2 II Data Federation; Federated DB; LIMS DB; BRAIN DB; PIMS DB; Swedish Twin Register; Internet VPN; BIMS Research Data; BIMS Admin Data; Data Analysis; Std Data Analysis Tools; User Admin; Study Mgmt]
Biobank Information Management System
DDQB – Abstract Data Model
[Diagram, labels as shown: User View; Abstract Query; Data Abstraction Model (DAM); Access Methods; Database; boxes marked C and F]
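The slide gives only the labels of the diagram, so the following is a loose sketch of the general idea of a data abstraction layer, not of DDQB itself: the user queries logical field names, and a mapping (standing in for the "access methods") translates them into physical tables and columns. The `DAM` mapping, the `to_sql` function, the table and the sample rows are all hypothetical.

```python
import sqlite3
from typing import Dict, List, Tuple

# Hypothetical abstraction model: logical field name -> (table, column).
DAM: Dict[str, Tuple[str, str]] = {
    "Sample type": ("samples", "stype"),
    "Donor birth year": ("samples", "byear"),
}


def to_sql(select: List[str], where: Dict[str, object]) -> Tuple[str, list]:
    """Translate an abstract query on logical fields into parameterised SQL."""
    tables = sorted({DAM[f][0] for f in list(select) + list(where)})
    if len(tables) != 1:
        raise ValueError("this sketch supports single-table queries only")
    cols = ", ".join(f'{DAM[f][1]} AS "{f}"' for f in select)
    conds = " AND ".join(f"{DAM[f][1]} = ?" for f in where)
    sql = f"SELECT {cols} FROM {tables[0]}" + (f" WHERE {conds}" if conds else "")
    return sql, list(where.values())


# Made-up physical table that the user never has to know about.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (stype TEXT, byear INTEGER)")
conn.executemany("INSERT INTO samples VALUES (?, ?)",
                 [("plasma", 1950), ("DNA", 1950), ("DNA", 1962)])

sql, params = to_sql(["Sample type"], {"Donor birth year": 1950})
result = sorted(row[0] for row in conn.execute(sql, params))
```

Because the user view refers only to logical names, a physical column can be renamed or moved by editing the mapping alone.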
Biobank Information Management System
LifeGene
A Swedish Population-based Project on Lifestyle, Genomics and Health
LifeGene
• Study of the aetiology of diseases with high health care and economic impact
• Prospective, highly motivated and well characterized cohort with 500,000-800,000 participants
• Ascertainment of nuclear families
• Ages 0-39: infections, asthma and allergy, obesity and diabetes, mental disorders and cognitive dysfunction
• Ages 40-64: CVD, diabetes, neurodegenerative disorders, cancer