Exploring Your Personal Genome with Free, Online
Bioinformatics Tools by
Shannon Bohle, BA, MLIS, CDS (Cantab), FRAS, AHIP
.org 2014 Tech Conference
What is the future of genomic sciences and bioinformatics? Ethical considerations of newborn screening: privacy, inaccuracy, discrimination, eugenics
Video: Gattaca (1997): http://www.youtube.com/watch?v=1Q67bMYOm7E
Reduced cost: The $1,000 genome. “Illumina’s DNA Supercomputer Ushers in the $1,000 Human Genome” (January 14, 2014) http://www.businessweek.com/articles/2014-01-14/illuminas-dna-supercomputer-ushers-in-the-1-000-human-genome
+ Genome sequencing at birth: “Baby DNA Analysis Ushers in Brave New World of Treatment” (January 16, 2014) http://www.bloomberg.com/news/2014-01-16/baby-dna-analysis-ushers-in-brave-new-world-of-treatment-health.html
= Big industry: “Illumina and a Billionaire Want to Jump-Start Genomics Upstarts” (February 17, 2014) http://www.businessweek.com/articles/2014-02-17/illimuna-and-billionaire-yuri-milner-to-aid-genomics-startups
The future of genomic sciences and bioinformatics is NOW.
Presentation Overview: Predictive Pathology
Hopefully you will learn a great deal today about the biological basis of disease. Specifically, we will discuss the following pathways by which disease can occur:
• At conception, chromosomes from both parents combine to pass on genetic material to a child. Before that, while eggs and sperm form, chromosomes exchange material in a crossing-over process (the crossover points are called chiasmata). Sometimes problems occur in this process, and the resulting variations are de novo rather than inherited from a parent.
• Chromosomal abnormalities like an addition, deletion, translocation, inversion, or insertion can be inherited. A well-known chromosomal abnormality is Down syndrome, where there is an additional copy of chromosome 21 (in most cases this arises de novo rather than being inherited).
• Also at conception, because chromosomes contain DNA, the genetic code (the genotype) and the traits it produces (the phenotype) are also transferred. Genotypes are always present, while traits may be expressed (dominant) or hidden (recessive) in an individual. Dominant traits tend to appear in every generation, while recessive traits can skip generations and reappear further down the family line. A common example of an autosomal recessive heritable disease is sickle cell anemia.
• During childhood and adulthood, factors like the environment (such as exposure to chemicals), diet, exercise, aging, et cetera can also damage genes, mutating them, and this may lead to disease. The branch of study examining context-dependent changes in gene activity that do not alter the DNA sequence is called epigenetics. Protein misfolding is another route to disease that arises after conception.
• Inherited and de novo (chiasmic, protein misfolding, and epigenetically-caused) variations can be studied in detail at the level of either DNA (which is made of nucleotides) or proteins (which are made of amino acids). Therefore, sequencing of plants, animals, and other forms of life has been done to try to understand and control biology, specifically biological function. The field of functional genomics designs technology tools that aid in diagnoses when biology malfunctions. About 40-60% of genes in a sequenced genome are related to biological function. Under different conditions, genes may be expressed in novel, transient ways, and these expression patterns are difficult to detect.
Trained professionals identify specific biomarkers, like JAK2, that have a high association with diseases. Knowing these in advance can sometimes influence a person’s lifestyle choices, such as having children, diet, and medical decisions. Because bioinformatics is a very new field, a genetic counselor should interpret test results to provide patients with guidance on two items: first, their level of risk as a percentage, and second, the level of confidence scientists have that a specific biomarker actually causes a disease. Scientists assess this confidence partly by looking across species, through phylogenetics. Most importantly, they learn about the genetic basis of human disease by using bioinformatics tools to compare the DNA of patients who share the same disease and by creating cell lines. That is why projects like the Personal Genome Project not only benefit the individual participant, but also contribute to advances in medicine and personalized medicine. “Personalized medicine is an emerging practice of medicine that uses an individual's genetic profile to guide decisions made in regard to the prevention, diagnosis, and treatment of disease” (NLM’s GHR glossary).
Having your genome sequenced provides an overview of your genetic background as well as the state of your genes at a given time.
Courtesy of the Genetics & Public Policy Center with support from The Pew Charitable Trusts
Genome: all hereditary genetic material of an organism
Chromosome: DNA, protein, and RNA found in cells
Gene: strands of 5’ to 3’ DNA (promoters, exons, introns) (Humans have about 22,000 genes)
Allele: one of 2 or more variants of each gene (two of which are inherited from parents)
Genotype: coded information
2 Types:
Homozygote: same alleles – AA, aa
Heterozygote: different alleles – Aa
Phenotype: physical manifestation of a characteristic
Dominant Trait: expressed
Recessive Trait: not expressed
a) Autosomal Recessive: Two abnormal copies must be present to get the disorder
b) X-linked Recessive: Females are usually carriers; males, who have a single X chromosome, are more often affected
GENETICS (GENES/CHROMOSOMES)
A Short Overview of Biological Inheritance (Heredity) Described Through Cell Biology
CELL, GENOME, CHROMOSOME, GENE, DNA, AMINO ACIDS
Image Courtesy of Mayo Clinic: http://www.mayoclinic.org/procedure/genetic-testing/multimedia/genetic-disorders/sls-20076216
If you have a genetic disorder or are a carrier, will your children inherit it?
NOT NECESSARILY. SEE AN MD OR GENETIC COUNSELOR.
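Why the answer is “not necessarily” can be illustrated with a Punnett-square calculation. The Python sketch below is only an illustration, not a clinical tool. It assumes a single gene with two alleles written as A (normal) and a (abnormal), and an equal chance of passing on either allele. The function name offspring_genotypes is made up for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_genotypes(parent1, parent2):
    """Return the probability of each child genotype for one gene.

    Each parent is a two-letter genotype such as "Aa". Every pairing of
    one allele from each parent is assumed to be equally likely.
    """
    # sorted() puts the uppercase allele first, so "aA" and "Aa" are merged.
    counts = Counter("".join(sorted(a + b)) for a, b in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Two carriers of an autosomal recessive disorder.
carriers = offspring_genotypes("Aa", "Aa")
# One affected parent and one parent with two normal copies.
affected_and_normal = offspring_genotypes("aa", "AA")
```

Under these assumptions, two carriers (Aa and Aa) give a 1/4 chance of AA, 1/2 of Aa, and 1/4 of aa, so on average one child in four inherits two abnormal copies. An affected parent (aa) and a parent with two normal copies (AA) have only carrier children.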
GENETICS (GENES/CHROMOSOMES)
Mitosis v. Meiosis
Some chromosome abnormalities are not inherited. De novo variants appear
for the first time in an individual. They can occur in recombination or
“crossing over” during mitosis or meiosis.
Image Credit: OpenStax College. "Laws of Inheritance." Connexions. February 24, 2014.
http://cnx.org/content/m44479/1.3.
Mitosis occurs with somatic cells. It results in two cells that are duplicates of the original cell. In other words, one cell with 46 chromosomes becomes two cells with 46 chromosomes each. This kind of cell division occurs throughout the body, except in the reproductive organs. This is how most of the cells that make up our body are made and replaced. Mutations that arise in somatic cells during mitosis are not passed on to children.
Meiosis occurs with germ cells. It results in cells with half the number of chromosomes (in diploid humans, 23 instead of the normal 46). These are the eggs and sperm. Mutations that arise in germ cells can be passed on to children, where they are present in the embryo’s stem cells. During gestation, the stem cells gain specificity as somatic cells of various types and as germ cells while the child develops.
Source: http://www.genome.gov/11508982#6
Chiasma
During meiosis chromosomal material crosses over
Video: Cell Division and the Cell Cycle
http://www.youtube.com/watch?v=Q6ucKWIIFmg
BIOCHEMISTRY (PROTEINS)
A Short Overview of Molecular Biology and Bioinformatics
Video: Central dogma of molecular biology (1958): replication, transcription and translation
Variations (mutations) can occur during these processes, sometimes causing diseases that can be passed on to children.
http://www.youtube.com/watch?v=Q_WRFw8KQk4
http://www.youtube.com/watch?v=D3fOXt4MrOM
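The three steps of the central dogma can be sketched in a few lines of Python. This is a teaching sketch under stated assumptions: the two nine-base sequences are invented for illustration, the codon table is deliberately partial (only the four codons used here), and the function names are hypothetical. It shows how a single base change (G to T) replaces valine with phenylalanine, the same kind of amino acid change that the name V617F describes.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Partial codon table: only the codons that appear in this example.
CODONS = {"AUG": "Met", "GUC": "Val", "UUC": "Phe", "UGA": "Stop"}


def replicate(dna):
    """Replication: return the complementary strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def transcribe(coding_dna):
    """Transcription: the mRNA matches the coding strand with U in place of T."""
    return coding_dna.replace("T", "U")


def translate(mrna):
    """Translation: read codons three bases at a time until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODONS[mrna[i:i + 3]]
        if residue == "Stop":
            break
        protein.append(residue)
    return protein


normal_dna = "ATGGTCTGA"   # invented example sequence
variant_dna = "ATGTTCTGA"  # same sequence with one G changed to T

normal_protein = translate(transcribe(normal_dna))
variant_protein = translate(transcribe(variant_dna))
```

Here normal_protein is Met, Val and variant_protein is Met, Phe. A real analysis would use the full 64-codon table and the actual gene sequence.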
Video animation: The central dogma of molecular biology
“DNA The Secret of Life” by PBS
After proteins are formed, they fold into various shapes based on their chemical makeup. Misfolding is a second cause of de novo variation. Misfolding sometimes causes disease, and some misfolding disorders are passed on to children. A linear analysis of the amino acid chain in a protein cannot anticipate which amino acids end up near each other once the protein folds, so 3D modeling is used.
Video: Protein folding
“Simulating How Proteins Self-Assemble, Or Fold” by Stanford University: http://www.youtube.com/watch?v=Pjt1Q2ZZVjA
When cells go bad, control decisions must be made that regulate the micro-“society.” Reform or Remove?
DNA ligase, an enzyme (shown left, in color), repairs mistakes in DNA. Some proteins, like p53 (shown below), enforce cell death (apoptosis). p53 malfunction is one cause of cancer, where cells with mutations grow out of control.
The Life Cycle of DNA
Sir John Gurdon: Epigenetics Founder &
Nobel Laureate "for the discovery that
mature cells can be reprogrammed to become
pluripotent"
Turning back the clock on disease: Mature, specialized cells can be reverted to their embryonic stem cell state.
University of Cambridge, 2012, the year Gurdon won the Nobel Prize
Xenopus
Protein-Protein Interaction
How proteins interact with one another is key to understanding their function in the body.
Only 1% of the human genome codes for our 20,000 proteins. Function is largely determined by how proteins interact.
Epigenetics
“Epigenetic mechanisms are affected by several factors and processes including development in utero and in childhood, environmental chemicals, drugs and pharmaceuticals, aging, and diet. DNA methylation is what occurs when methyl groups, an epigenetic factor found in some dietary sources, can tag DNA and activate or repress genes. Histones are proteins around which DNA can wind for compaction and gene regulation. Histone modification occurs when the binding of epigenetic factors to histone “tails” alters the extent to which DNA is wrapped around histones and the availability of genes in the DNA to be activated. All of these factors and processes can have an effect on people’s health and influence their health possibly resulting in cancer, autoimmune disease, mental disorders, or diabetes among other illnesses.”
Image and description credit: National Institutes of Health
Comparative Genomics and Phylogeny
To locate new disease markers and learn how pathogens function, it is helpful to examine ultra-conserved regions in cross-species protein & nucleic acid production, because these are most often linked to important bodily functions, disease and health.
(See: 1) Kumar S, Sanderford M, Gray VE, Ye J, Liu L. Evolutionary diagnosis method for variants in personal exomes. Nature Methods (2012) 9(9):855-6. doi:10.1038/nmeth.2147.
2) Liu L, Kumar S. (2013) Evolutionary Balancing is Critical for Correctly Forecasting Disease Associated Amino Acid Variants. Molecular Biology and Evolution 30:1252-1257 (Epub 2013 March 5))
About 5%-10% of the human genome consists of regulatory motifs shared across species, which turn genes “on” and “off” to control gene expression, in addition to the 1% used for coding proteins.
Visualization of a Phylogenetic Tree Using MEGA 6
Newick notation: ((((Cucumis sativus,Ricinus communis),Solanum lycopersicum),Medicago truncatula),(Arabidopsis thaliana,Capsella rubella));
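Newick notation is just nested parentheses, so it can be read with a small recursive parser. The sketch below is a minimal Python illustration, not MEGA’s own parser: it assumes names only (no branch lengths or quoted labels), sibling groups separated by commas, and a string ending in a semicolon. The function names are hypothetical.

```python
import re


def parse_newick(text):
    """Parse a names-only Newick string into nested Python lists."""
    # Drop the trailing semicolon and any whitespace around punctuation.
    text = re.sub(r"\s*([(),])\s*", r"\1", text.strip().rstrip(";"))
    pos = 0

    def peek():
        return text[pos] if pos < len(text) else ""

    def parse_node():
        nonlocal pos
        if peek() == "(":
            pos += 1                      # consume "("
            children = [parse_node()]
            while peek() == ",":
                pos += 1                  # consume ","
                children.append(parse_node())
            if peek() != ")":
                raise ValueError("expected ',' or ')' at position %d" % pos)
            pos += 1                      # consume ")"
            return children
        start = pos
        while peek() and peek() not in "(),":
            pos += 1
        return text[start:pos]

    tree = parse_node()
    if pos != len(text):
        raise ValueError("unexpected text at position %d" % pos)
    return tree


def leaves(tree):
    """Return the leaf names of a parsed tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [name for child in tree for name in leaves(child)]


plants = parse_newick(
    "((((Cucumis sativus,Ricinus communis),Solanum lycopersicum),"
    "Medicago truncatula),(Arabidopsis thaliana,Capsella rubella));"
)
```

Each inner list is one branch point of the tree. Calling leaves(plants) returns the six species in the order they appear. A string with a missing comma between two groups raises a ValueError instead of being silently misread.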
“Proteins are clustered on branches on the basis of the similarity of their amino acid sequences. The phylogenetic representation tends to cluster structurally (and sometimes functionally) related proteins. Drugs targeting a specific protein are more likely to be active against other proteins on the same branch. Distinct phylogenetic branches are highlighted with distinct colours (in the case of the malignant brain tumour (MBT) family, where only a few MBT domains are actually binding methyl-lysines, the red colour coding indicates the branch where all known methyl-lysine-binding domains are clustered). We assembled protein families by looking for domains associated with 'writing', 'reading' and 'erasing' acetyl and methyl marks in the Human Protein Reference Database, and by complementing the list with data from the literature, as well as data from the Pfam protein family database and the SMART (Simple Modular Architecture Research Tool) database. The phylogeny outlined in the trees is derived from multiple sequence alignments of the domain after which the family was named (full-length sequences were used for acetyltransferases as the catalytic domain is not always clearly defined for this family). If a domain is present multiple times in a protein, the protein is shown multiple times in the corresponding tree, followed by the sequential iteration of the domain in parenthesis for example, L3MBTL(2) corresponds to the second MBT domain of the protein L3MBTL. If multiple variants with insertions or deletions were reported for a gene, the variant number according to Swiss-Prot nomenclature is indicated after a hyphen: for example, TRIM33-2 in the tree of bromodomain-containing proteins corresponds to the second Swiss-Prot variant of the TRIM33 (tripartite motif-containing protein 33) bromodomain. For each tree, a seed alignment was derived from available protein structures by aligning residues that were superimposed in the three-dimensional space. 
Additional sequences were appended by aligning them to the closest seed sequence.”
http://www.nature.com/nrd/journal/v11/n5/fig_tab/nrd3674_F2.html
Phylogenetic trees of epigenetic protein families.
Mega-genomics and
Next Generation Sequence Analysis
Sequencing human DNA: The Human Genome Project and the Personal Genome Project
First Human Genomes Sequenced:
1) Dr. J. Craig Venter
2) Dr. James D. Watson: Molecular Biology Founder & Nobel Laureate
3) Personal Genome Project
4) Hundred Person Wellness Project
5) UK Personal Genome Project
Cold Spring Harbor Laboratory, 2006
Genome-Wide Association Studies (GWAS) compare the genomes of many people, with and without a disease, to look for similarities and differences that might cause disease.
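The comparison behind a GWAS can be sketched for a single variant. The Python example below is a simplified illustration, not a full GWAS pipeline: it assumes allele counts for one variant in a case group and a control group, uses the standard 2x2 chi-square statistic and odds ratio, and the counts shown are invented for the example. The function name is hypothetical.

```python
def allele_association(case_alt, case_ref, control_alt, control_ref):
    """Return (odds_ratio, chi_square) for a 2x2 table of allele counts.

    All four counts must be greater than zero, otherwise a division
    by zero occurs.
    """
    a, b, c, d = case_alt, case_ref, control_alt, control_ref
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    # Pearson chi-square for a 2x2 table, without continuity correction.
    chi_square = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return odds_ratio, chi_square


# Invented counts: the variant allele is seen 30 times in 100 case alleles
# and 10 times in 100 control alleles.
odds_ratio, chi_square = allele_association(30, 70, 10, 90)
```

A real study repeats this test for hundreds of thousands of variants, so the significance threshold must be corrected for multiple testing.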
Current understanding of the human genome, categorized by function of each gene product,
given both as number of genes and as percentage of all genes.
Image description and credit: Mikael Häggström (Wikimedia Commons)
Our understanding of function within the human genome is incomplete. More samples are needed for improved results.
The Cost Reduction for Sequencing Genomes Greatly Outpaced Moore’s Law
State Direct-to-Consumer Testing Statutes and Regulations
Courtesy of the Genetics & Public Policy Center with support from The Pew Charitable Trusts
Limitations of GINA
“The Genetic Information Nondiscrimination Act, known as GINA, does not apply to three types of insurance — life, disability and long-term care — that are especially important to people who may have serious inherited diseases … The American Medical Association’s code of ethics states that ‘it may be necessary’ for doctors to maintain a separate file for genetic test results so the information is not sent to insurers.”
-- “Fearing Punishment for Bad Genes,” The New York Times, April 7, 2014. http://www.nytimes.com/2014/04/08/science/fearing-punishment-for-bad-genes.html?_r=1
Genetic Information Nondiscrimination Act (GINA) of 2008: http://www.genome.gov/24519851
Henrietta Lacks: The Ethics of Cell Line Development and Research
Henrietta Lacks, 1945. Image courtesy of The Lacks Family. (Source: Wikipedia).
Do you own your DNA?
Testing Companies
23andMe; 454 Life Sciences; Advanced Healthcare, Inc; AIBioTech; Ancestry DNA; Atlas Sports Genetics; Athleticode; Biologis Personal Genomics Service; Bioresolve; Counsyl; Complete Genomics; deCODE Genetics; deCODEme.com; DNA-CARDIOCHECK; DNA DTC; DNATraits; Eastern Biotech & Life Sciences; easyDNA; EnteroLab; Family Tree DNA; Future Genetics; Geenitesti; Genelex
GenePlanet; Genetic Health; Genetic Technologies; Genetic Testing Laboratories; Geneyouin; Genographic Project; Genotek; Gentle Labs; Graceful Earth; HealthCheckUSA; HelloGene / HelloGenome; Holistic Health; IDNA.com; i-gene; Illumina; Indian Biosciences; InoLife Technologies; Interleukin Genetics; JCVI; Knome; Lumigenix; Map My Gene
MapMyGenome; meragenome.com; MyGene23; Navigenics; Oxford Nanopore Technologies; Pacific Biosciences; Pathway Genomics; Pediatrix Medical Group; Perkin Elmer Genetics; Personal Genome Project; Personalis; PHENOM Biosciences; Positive Bioscience; Sequenom; SNPedia; Test Country; Ubiome; Viaguard/Accu-metrics; vuGene; Xcode Life Sciences
As of March 10, 2014, 23andMe had 650,000+ genotyped customers
Screenings
More than 420 Conditions and Traits are Screened for During Genetic Testing.
CONDITIONS: CANCER; LIVER; HEART; HEARING; SIGHT; DIABETES; PSYCHIATRIC/PSYCHOLOGICAL; REPRODUCTIVE / STD (FERTILITY); REGULATORY FUNCTIONS (BREATHING, SLEEP, WEIGHT, RENAL); ADDICTION (ALCOHOL, DRUG); IMMUNE SYSTEM (HIV, AIDS); MUSCULO-SKELETAL (MARFAN SYNDROME); PHARMACOGENOMICS/DRUG EFFICACY (CANCER, WARFARIN); NEUROLOGICAL (PARKINSON’S, ALZHEIMER’S, MS); SKIN
ABILITIES & PHYSICAL TRAITS: INTELLIGENCE; ENDURANCE; EYE & HAIR COLOR
NCBI Resources
https://www.ncbi.nlm.nih.gov/variation
http://www.ncbi.nlm.nih.gov/guide/genetics-medicine
http://www.ncbi.nlm.nih.gov/books/NBK1116
http://www.ncbi.nlm.nih.gov/medgen
http://www.ncbi.nlm.nih.gov/mesh
Other Resources
http://www.omim.org
http://www.orpha.net
http://www.genome.gov
http://www.dnapolicy.org
See handout for
specific tests
Asclepius
How to Submit Your DNA for Sequencing and Analysis with the Personal Genome Project
Basic eligibility:
1. US citizen age 21 or older
2. Additional details: http://www.personalgenomes.org/harvard/protocols
How it Works: We will be using an existing volunteer’s genome for this presentation.
Steps:
1. Provide Open Consent (form)
2. Supply Medical History (form)
3. Donate DNA Samples (saliva, hair, blood, tissue) by self-collection or at a designated facility
4. Samples Sent to Lab (blood = DNA, tissue = exome, saliva = microbiome); tissue samples may be used to develop cell lines for research purposes
5. Harvard’s PGP Team Analyzes Data for Anomalies and Creates a Personalized Health Prognosis Report
6. The PGP Team Publishes Your Information Online (Your data is associated with a volunteer number, but your name can also be used if you would like to do this)
7. Safety follow-up monitoring by email
8. Additional details: http://www.personalgenomes.org/harvard/howitworks
Volunteer huA90CE6
John Lauerman In His Own Words
Whole Genome Sequence (WGS) Analysis
http://youtu.be/YGIxMYiPLOU
Volunteer huA90CE6 = Case Study: John Lauerman (Harvard Analysis)
JAK2-V617F and APOE-C130R variations
Step 1 Create a C:\data folder and download John Lauerman’s genome from the PGP website: https://my.pgp-hms.org/profile/huA90CE6. Examine the variant report on the same page.
Locating and Interpreting Errors: Cytogenetic Location
JAK2-V617F is located on the short arm of chromosome 9 (9p; 9pLOH). Source: Kralovics R, Passamonti F, Buser AS, Teo SS, Tiedt R, Passweg JR, Tichelli A, Cazzola M, Skoda RC. A gain-of-function mutation of JAK2 in myeloproliferative disorders. N Engl J Med. 2005 Apr 28;352(17):1779-90.
There are 22 numbered chromosomes plus X and Y. The first part of a location is the chromosome number (or X or Y). The second part is the letter p or q, where p is the “short arm” and q is the “long arm.” The position is usually designated by two digits (representing a region and a band), which are sometimes followed by a decimal point and one or more additional digits (representing sub-bands within a light or dark area). http://ghr.nlm.nih.gov/handbook/howgeneswork/genelocation
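The naming scheme above can be checked with a short Python sketch. It is only an illustration of the notation described here: it assumes the simple form chromosome, arm, region digit, band digit, and an optional sub-band after a decimal point, and it does not handle ranges such as 9p24-p23. The function name is hypothetical.

```python
import re

# chromosome (1-22, X or Y), arm (p or q), region digit, band digit,
# then an optional sub-band after a decimal point.
LOCATION = re.compile(r"^(\d{1,2}|X|Y)([pq])(\d)(\d)(?:\.(\d+))?$")


def parse_cytogenetic_location(text):
    """Split a location such as '9p24' or '9p24.1' into its parts."""
    match = LOCATION.match(text.strip())
    if match is None:
        raise ValueError("not a simple cytogenetic location: %r" % text)
    chromosome, arm, region, band, sub_band = match.groups()
    return {
        "chromosome": chromosome,
        "arm": "short" if arm == "p" else "long",
        "region": int(region),
        "band": int(band),
        "sub_band": sub_band,  # None when no sub-band is given
    }


jak2_location = parse_cytogenetic_location("9p24")
```

For 9p24 this reads as chromosome 9, short arm, region 2, band 4, with no sub-band.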
LIST OF COMMON ERRORS BY CHROMOSOME NUMBER: http://ghr.nlm.nih.gov/chromosomes
9pLOH
Janus kinase 2 – Cytogenetic Location: 9p24
http://ghr.nlm.nih.gov/chromosome/9
Human Gene JAK2
Transcript (Including UTRs) Position: chr9:4,985,245-5,128,183 Size: 142,939 Total Exon Count: 25 Strand: +
Coding Region Position: chr9:5,021,988-5,126,791 Size: 104,804 Coding Exon Count: 23
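The sizes listed above follow directly from the positions, because both the start and end base are counted. A small Python sketch, assuming the chrN:start-end format shown above with commas as thousands separators (the function names are hypothetical):

```python
def parse_position(position):
    """Split 'chr9:4,985,245-5,128,183' into (chromosome, start, end)."""
    chromosome, span = position.split(":")
    start_text, end_text = span.split("-")
    start = int(start_text.replace(",", ""))
    end = int(end_text.replace(",", ""))
    return chromosome, start, end


def span_size(position):
    """Number of bases in the span, counting both end points."""
    _, start, end = parse_position(position)
    return end - start + 1


transcript_size = span_size("chr9:4,985,245-5,128,183")
coding_size = span_size("chr9:5,021,988-5,126,791")
```

These give 142,939 bases for the transcript and 104,804 bases for the coding region, matching the sizes on the slide.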
JAK2-V617F
Human Reference Genome - “Normal” JAK2 using the UCSC Genome Browser
Right click over JAK2 and choose “Get DNA for JAK2.” Then, in the popup window, choose “get DNA.” Using the shift key, highlight all the information. “Save As” JAK2. Open the file with Notepad to see JAK2 in more detail.
http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg19&position=chr9%3A4985245-5128183
We will examine a volunteer’s “Variant” JAK2 with two free bioinformatics tools using Windows.
At the end of the talk there will be a list of additional tools for non-Windows systems such as Linux, Mac, and iPad.
PGA: Personal Genome Analyzer from Archivopedia
BLAST: National Center for Biotechnology Information (NCBI) (Web-based)
Volunteer huA90CE6 -- Case Study: John Lauerman (Looking Closer with PGA & PyMOL)
SETTING UP PYTHON (Win 7, 64-bit)
Step 1 Download and install Python: https://www.python.org/download/releases/2.7.5 (Windows X86-64 MSI Installer (2.7.5))
Step 2 Download and install the PyMOL extension for a free 3D molecule viewer: http://www.lfd.uci.edu/~gohlke/pythonlibs/#pymol (pymol-1.7.1.0.win-amd64-py2.7.exe)
Find the PyMOL application file in C:\Python27\PyMOL\ and create a shortcut. Drag the shortcut to the desktop. Double click the icon on the desktop to run PyMOL.
Note: Installing the extension may open a command prompt window to compile.
Step 3 Download and install the wxPython extension: http://downloads.sourceforge.net/wxpython/wxPython3.0-win64-3.0.0.0-py27.exe
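After the three installs, a quick check from Python can confirm that the extensions are visible to the interpreter. This is a minimal sketch, assuming the packages import under their usual module names (wx for wxPython, pymol for PyMOL). The helper name check_modules is made up for this example.

```python
import importlib


def check_modules(names):
    """Return a dict mapping each module name to True if it can be imported."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status
```

Running check_modules(["wx", "pymol"]) should report True for both once Steps 2 and 3 have succeeded. A False usually means the installer targeted a different Python installation.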
Volunteer huA90CE6 -- Case Study: John Lauerman (Looking Closer with PGA)
Step 4
Use the Python-driven tool designed for this project to convert an isolated chromosome in your whole genome sequence from TSV to FASTA and SQL formats in under 5 minutes.
Note: The following sources were used to create the tool: Search engine - http://wiki.personal-genome.org/index.php?title=Talk:MtDNA_haplogroup
Human reference genome (rCRS) - http://www.ncbi.nlm.nih.gov/nuccore/251831106?report=fasta
Insert:
• Browse your hard drive for the Volunteer’s Whole Genome Sequence. File name: huA90CE6--GS000006909-ASM.tsv
• Enter a single chromosome number you wish to examine: 1-22, X, or Y; or leave blank for whole genome. [Enter 9]
• Enter an exact location or leave at defaults if you wish to scan the whole chromosome or whole genome. [Use Default]
Check mark “Generate FA” for FASTA. Check mark “Generate SQL” for SQL.
Click the PROCESS BUTTON. Go to C:\data for the converted files in FASTA and SQL formats.
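The general idea of a TSV-to-FASTA conversion can be sketched in Python. This is not PGA’s code, and the column layout is an assumption: the example uses a simplified, hypothetical header with chromosome, begin, end, and alleleSeq columns, and it simply joins the allele sequences for one chromosome in file order. A real converter would also need to handle ploidy, gaps, and the reference sequence between variants.

```python
import csv
import io


def tsv_to_fasta(tsv_text, chromosome, width=60):
    """Build one FASTA record from the rows of a single chromosome.

    Assumes a tab-separated table with a header row containing the
    hypothetical columns 'chromosome' and 'alleleSeq'.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    pieces = [row["alleleSeq"] for row in reader if row["chromosome"] == chromosome]
    sequence = "".join(pieces)
    lines = [">" + chromosome]
    # FASTA sequence lines are conventionally wrapped at a fixed width.
    lines.extend(sequence[i:i + width] for i in range(0, len(sequence), width))
    return "\n".join(lines) + "\n"


# Tiny invented table, only to show the shape of the input and output.
demo_tsv = (
    "chromosome\tbegin\tend\talleleSeq\n"
    "chr9\t0\t4\tACGT\n"
    "chr1\t0\t2\tGG\n"
    "chr9\t4\t6\tTT\n"
)
demo_fasta = tsv_to_fasta(demo_tsv, "chr9", width=4)
```

The result is a header line (>chr9) followed by the sequence wrapped at the chosen width, which is the shape of file a BLAST search expects.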
Volunteer huA90CE6 -- Case Study: John Lauerman (Looking Closer with PGA)
UNDER DEVELOPMENT
After a single search of a whole genome or chromosome, use PGA to view the FASTA file in the “View FASTA” window. Or, view the exact location of variants simply by clicking on the “Variants” tab.
This image shows some variants in John Lauerman’s Chr1 compared to the Human Reference Genome.
Future plans include adding reports with graphs and other
visualizations.
Volunteer huA90CE6 Case Study: John Lauerman
(Looking Closer with BLAST)
Step 5 Use the generated FASTA file to perform a BLASTn search. In this case, John Lauerman’s Chr9 file was used (after using PGA, it is located in C:\data with a .fa extension).
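A whole-chromosome file is large for a web search, so it can help to cut out just the region of interest before pasting it into BLASTn. The Python sketch below is one way to do that, under stated assumptions: the file is plain FASTA, coordinates are 1-based and inclusive, and the positions in the file line up with the coordinates you want, which may not hold for every converted file. The function names are hypothetical.

```python
def read_fasta(text):
    """Return a dict of {record name: sequence} from FASTA-formatted text."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else ""
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


def extract_region(sequence, start, end):
    """Return bases start..end using 1-based, inclusive coordinates."""
    return sequence[start - 1:end]


# Tiny invented record standing in for a real .fa file.
demo_records = read_fasta(">chr9 demo\nACGTAC\nGTTT\n")
demo_query = extract_region(demo_records["chr9"], 3, 6)
```

With a real file, the text would come from open(path).read() and the start and end values from the gene position of interest.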
Free Tools for Other Platforms
CGATools (MacOS or LINUX only)
Download Complete Genomics Analysis Tools software and User Guide documentation: http://cgatools.sourceforge.net
CGA Tools 1.8.0 Software:
CGA Tools 1.8.0 User Guide: http://cgatools.sourceforge.net/docs/1.8.0/cgatools-user-guide.pdf
Illumina’s MyGenome App
Requires iOS 6.1 or later. Compatible with iPad.
http://www.illumina.com/clinical/clinical_informatics/mygenome_app.ilmn
Complete Genomics’ Genome Voyager: http://www.completegenomics.com/analysis-tools/voyager
Complete Genomics’ List of Third Party Tools: http://www.completegenomics.com/analysis-tools/third-party-tools
PyMOL for Linux and Mac:
http://www.pymolwiki.org/index.php/Linux_Install
http://www.pymolwiki.org/index.php/MAC_Install
Create your own database: Analyzing Collections of Whole Human Genomes Through Multiple Sequence Alignments and Analysis
Using a MySQL database, it is possible to import many whole human genome sequences from the PGP project by following the example in Slide #36 using PGA.
In silico human genome scientific studies can be conducted for the following applications:
● disease biomarker identification
● pharmacogenetics
TO GET STARTED
1. Determine the minimum and ideal sample sizes (number of volunteer DNA sequences) for significance in your study (usually 10,000). The PGP aims for a collection of 100,000 sequenced genomes.
2. Consider the needed space allocation.
● Each unzipped TSV file of an entire genome is about 1.3 MB
3. Consider needed time for conversion and import into a MySQL database.
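The shape of such a database can be tried out before committing to a server. The sketch below uses Python’s built-in sqlite3 module as a stand-in for MySQL, so it runs without any setup. The table layout, column names, volunteer IDs, and rows are all invented for illustration and do not reflect the SQL that PGA generates.

```python
import sqlite3


def build_variant_db(rows):
    """Create an in-memory database and load variant rows into it.

    Each row is (volunteer, chromosome, begin_pos, end_pos, allele_seq).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE variants ("
        "volunteer TEXT, chromosome TEXT, "
        "begin_pos INTEGER, end_pos INTEGER, allele_seq TEXT)"
    )
    conn.executemany("INSERT INTO variants VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def variants_per_volunteer(conn, chromosome):
    """Count the stored variants on one chromosome for each volunteer."""
    cursor = conn.execute(
        "SELECT volunteer, COUNT(*) FROM variants "
        "WHERE chromosome = ? GROUP BY volunteer ORDER BY volunteer",
        (chromosome,),
    )
    return dict(cursor.fetchall())


# Invented placeholder rows for two hypothetical volunteers.
demo_rows = [
    ("vol1", "chr9", 100, 101, "T"),
    ("vol1", "chr9", 250, 251, "G"),
    ("vol1", "chr1", 500, 501, "A"),
    ("vol2", "chr9", 100, 101, "T"),
]
demo_conn = build_variant_db(demo_rows)
demo_counts = variants_per_volunteer(demo_conn, "chr9")
```

The same CREATE TABLE and INSERT statements carry over to MySQL with minor changes. With thousands of genomes, an index on the chromosome and position columns becomes important for query speed.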
WHAT’S NEXT?
Archivopedia LLC and Real Data, Inc are discussing a possible collaboration to provide sequencing and analysis of personal genomic data. This includes further development of the bioinformatics tool, the Personal Genome Analyzer (free basic version), and a service plan for personalized medical services.
BUSINESS PLAN: What would be different about us?
• Profit sharing with participants if cell lines are developed based upon their DNA samples
• FDA Friendly: Results delivery and counseling provided by a Staff Physician
• Confidential: No insurance reporting (all participants must be self-pay)
• Private: Sequences are not published on the internet (protects against mosaic effect)
• Free tool plus a new subscription service for medical professionals
• Fast turnaround
• Illumina 2500 sequencers for human genome sequencing
• Storage in 2 encrypted supercomputers to help prevent data breaches
• Expertise creating novel algorithms & performing data analysis, text mining, AI, NLP
• Probability value (p value) reporting percentages of likelihood of pathogenesis
• Reporting and visualization tools integrating a variety of free online tools and databases
• Electronic Health Record (EHR) integration to bridge research and clinical domains and to help achieve meaningful use objectives
Workflow Overview (Simplified)
Diagram labels (in extracted order): Create & Deliver Ordered Lab Reports; Sequence; Compare & Compute; Perform Text Mining on Medical Literature (MeSH, Informatics Taxonomies); Samples; Isolate, Replicate & Monetize Cell Lines for Research ($$$); National Human Genome Research Institute’s Undiagnosed Diseases Program & FDA pharmacogenomics list; ID Biomarkers & Publish Academic Papers
Search Variant Databases
Everyone has about 3-4 million variants. Which are pathogenic and which are not?
Key: Look for nucleotide changes in exon regions that code for different amino acids than the human reference genome, because these can affect function.
Databases of Known Human Variants:
NCBI Variant Databases
Database of Genomic Variants
Cancer Genome Consortium
Cancer Genome Atlas
Online Mendelian Inheritance in Man
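The “Key” above can be turned into a first-pass filter: compare the reference codon with the variant codon and see whether the amino acid changes. The Python sketch below uses the standard genetic code and is an illustration only. The function name classify_codon_change and the category labels are choices made for this example, and a change in amino acid does not by itself show that a variant is pathogenic.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(ref_codon, alt_codon):
    """Classify a codon change by its effect on the encoded amino acid."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"   # same amino acid, protein unchanged
    if alt_aa == "*":
        return "nonsense"     # new stop codon, protein cut short
    if ref_aa == "*":
        return "stop-lost"    # stop codon removed, protein extended
    return "missense"         # different amino acid


silent_change = classify_codon_change("GTC", "GTT")    # Val to Val
changed_residue = classify_codon_change("GTC", "TTC")  # Val to Phe
early_stop = classify_codon_change("TGG", "TGA")       # Trp to stop
```

Variants flagged as missense, nonsense, or stop-lost are the ones worth looking up in the databases listed above.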
$$$
Selected Bibliography
ALBERTS, B. (1983). Molecular biology of the cell. New York, Garland Pub.
CAREY, N. (2013). Epigenetics revolution: how modern biology is rewriting our understanding of genetics, disease, and inheritance.
CHURCH, G. M., & REGIS, E. (2012). Regenesis: how synthetic biology will reinvent nature and ourselves. New York, Basic Books.
SCHRODINGER, E. (2012). What is life?: the physical aspect of the living cell. Cambridge, Univ. Press.
SKLOOT, R. (2010). The immortal life of Henrietta Lacks. New York, Crown Publishers.
VENTER, J. C. (2007). A life decoded: my genome, my life. New York, Viking.
WATSON, J. D. (1968). The double helix; a personal account of the discovery of the structure of DNA. New York, Atheneum. [SIGNED FIRST EDITION]
WATSON, J. D. (2008). Molecular biology of the gene. San Francisco, Pearson/Benjamin Cummings.
ZVELEBIL, M. J., & BAUM, J. O. (2008). Understanding bioinformatics. New York, Garland Science.
Credits
Personal Genome Project (Harvard)
MITx: 7.00x: Introduction to Biology - The Secret of Life (14 weeks): Eric Lander (MIT, Harvard)
Bioinformatic Methods I | Coursera (6 weeks): Nicholas Provart (University of Toronto)
Bioinformatic Methods II | Coursera (6 weeks): Nicholas Provart (University of Toronto)
Illumina
Gattaca (screenshot)
Genetics & Public Policy Center
Mayo Clinic
Stanford University
Mega 6
JMOL
NIH
UCSC Genome Database
PyMOL
CGA Tools
Complete Genomics
MyGenome App
NCBI – BLAST
PBS
MG – RAST
Nature
Real Data, Inc.
Dr. Frank N. Kautzmann III, PhD.
John Lauerman
Tracy Kovach
Mikael Häggström
Database of Genomic Variants
National Human Genome Research Institute
International Society of Genetic Genealogy
Personal Genome Analyzer: Architect: S. Bohle, Programmers: D. Yount, W. McCready
Contact Information
Archivopedia.com