25
Sanguinetti, 2012 Bio2 lecture 1 Bioinformatics 2 “My main problem these days is that I don’t understand how we go from an experiment in the lab to a number on the screen…” Prof. XXX, CBE, FRS, FRSE

Sanguinetti, 2012Bio2 lecture 1 Bioinformatics 2 “My main problem these days is that I don’t understand how we go from an experiment in the lab to a number

Embed Size (px)

Citation preview

Page 1: Sanguinetti, 2012Bio2 lecture 1 Bioinformatics 2 “My main problem these days is that I don’t understand how we go from an experiment in the lab to a number

Sanguinetti, 2012 Bio2 lecture 1

Bioinformatics 2

“My main problem these days is that I don’t understand how we go from

an experiment in the lab to a number on the screen…”

Prof. XXX, CBE, FRS, FRSE

Page 2: Sanguinetti, 2012Bio2 lecture 1 Bioinformatics 2 “My main problem these days is that I don’t understand how we go from an experiment in the lab to a number

Sanguinetti, 2011 Bio2 lecture 1

Lecture 1

• Course Overview & Assessment

• Introduction to Bioinformatics Research

• Careers and PhD options

• Core topics in Bioinformatics– the central dogma of molecular biology

Page 3: Sanguinetti, 2012Bio2 lecture 1 Bioinformatics 2 “My main problem these days is that I don’t understand how we go from an experiment in the lab to a number

Sanguinetti, 2011 Bio2 lecture 1

About us...• Degree in Physics

• PhD in Mathematics (mathematical physics)

• Got interested in mathematical modelling

• Worked in machine learning & bioinformatics & systems biology

• Office in forum, IF 1.44

• Shahzad Asif – finishing PhD student – lab (week 5)

Page 4: Sanguinetti, 2012Bio2 lecture 1 Bioinformatics 2 “My main problem these days is that I don’t understand how we go from an experiment in the lab to a number

Sanguinetti, 2011 Bio2 lecture 1

Course Outcomes

• An appreciation of statistical methods in bioinformatics, with special focus on networks and probabilistic models

• Experience in using and/or implementing simple solutions

• Appreciate the current ‘state of the art’

• Be familiar with some available resources

Page 5: Sanguinetti, 2012Bio2 lecture 1 Bioinformatics 2 “My main problem these days is that I don’t understand how we go from an experiment in the lab to a number

Sanguinetti, 2011 Bio2 lecture 1

Course Design

• Lectures cover (selective) background/ general concepts

• Guest lectures present current research/ applications of data analysis

• Self-study and assignments designed to cover practical implementation

• Lab to give hands on experience support

Page 6: Sanguinetti, 2012Bio2 lecture 1 Bioinformatics 2 “My main problem these days is that I don’t understand how we go from an experiment in the lab to a number

Sanguinetti, 2011 Bio2 lecture 1

Guest lectures and topics

Donald Dunbar (Centre for Inflammation Research)Transcriptomic technologies, week 4 (08/02)

Ian Simpson (School of Informatics)ChIP technologies, week 8 (07/03)

Chris Larminie (GlaxoSmithKline) Bioinformatics in the Pharmaceutical Sector, week 9 (14/03)

Page 7: Sanguinetti, 2012Bio2 lecture 1 Bioinformatics 2 “My main problem these days is that I don’t understand how we go from an experiment in the lab to a number

Sanguinetti, 2011 Bio2 lecture 1

Assessment (Bio2)

• Written assignment (released on 24/02, due 10/03)– Data analysis mini project– Plagiarism will be refereed externally

• Cite all sources!!!

– Late submissions get 0 marks!

Page 8: Sanguinetti, 2012Bio2 lecture 1 Bioinformatics 2 “My main problem these days is that I don’t understand how we go from an experiment in the lab to a number

Sanguinetti, 2011 Bio2 lecture 1

Bioinformatics?

• Introduce yourselves to each other.• What is Bioinformatics?• What does Bioinformatics do for CS?• What does Bioinformatics do for Biology?• What guest Bioinformatics lecture would you

like?

• Discuss in groups for 10 min.

Page 9: Sanguinetti, 2012Bio2 lecture 1 Bioinformatics 2 “My main problem these days is that I don’t understand how we go from an experiment in the lab to a number

Sanguinetti, 2011 Bio2 lecture 1

Answers

Page 10: Sanguinetti, 2012Bio2 lecture 1 Bioinformatics 2 “My main problem these days is that I don’t understand how we go from an experiment in the lab to a number

Sanguinetti, 2011 Bio2 lecture 1

What is BioInformatics?

• Sequence analysis and genome building

• Molecular Structure prediction

• Evolution, phylogeny and linkage

• Automated data collection and analysis

• Simulations and modelling

• Biological databases and resources

Page 11: Sanguinetti, 2012Bio2 lecture 1 Bioinformatics 2 “My main problem these days is that I don’t understand how we go from an experiment in the lab to a number

Sanguinetti, 2011 Bio2 lecture 1

BioInf and CS

• Provides CS with new challenges with clear bio-medical significance.

• Complex and large datasets sometimes very noisy with hidden structures.

• Can biological solutions be used to inspire new computational tools and methods?

Page 12: Sanguinetti, 2012Bio2 lecture 1 Bioinformatics 2 “My main problem these days is that I don’t understand how we go from an experiment in the lab to a number

Sanguinetti, 2011 Bio2 lecture 1

BioInf and Biology

• High-throughput biology:– around 1989, the sequence of a 1.8kb gene

would be a PhD project – by 1993, the same project was an undergraduate

project – in 2000 we generated 40kb sequence per week

in a non-genomics lab– Illumina/Solexa systems Gigabases per expt.

Page 13: Sanguinetti, 2012Bio2 lecture 1 Bioinformatics 2 “My main problem these days is that I don’t understand how we go from an experiment in the lab to a number

Sanguinetti, 2011 Bio2 lecture 1

Bioinformatics

• http://www.bbsrc.ac.uk/science/grants/index.html– Awarded grants database

• http://bioinformatics.oxfordjournals.org

• www.biomedcentral.org/bmcbioinformatics

• www.nature.com/msb

• http://bib.oxfordjournals.org/

Page 14: Sanguinetti, 2012Bio2 lecture 1 Bioinformatics 2 “My main problem these days is that I don’t understand how we go from an experiment in the lab to a number

Sanguinetti, 2011 Bio2 lecture 1

Bioinformatics@ed

• Database integration

• Data provenance

• Evolutionary and genetic computation

• Gene expression databases

• High performance data structures for semi-structured data (Vectorised XML)

1/2

Page 15: Sanguinetti, 2012Bio2 lecture 1 Bioinformatics 2 “My main problem these days is that I don’t understand how we go from an experiment in the lab to a number

Sanguinetti, 2011 Bio2 lecture 1

Bioinformatics@ed

• Machine learning • Microarray data analysis• Natural language and bio-text mining• Neural computation, visualisation and

simulation• Protein complex modeling• Systems Biology• Synthetic Biology

2/2

Page 16: Sanguinetti, 2012Bio2 lecture 1 Bioinformatics 2 “My main problem these days is that I don’t understand how we go from an experiment in the lab to a number

Sanguinetti, 2011 Bio2 lecture 1

Career Options

• Academic Routes– Get Ph.D, do Postdoctoral Research -

lectureship and independent group– M.Sc. RA - becomes semi independent usually

linked to one or more academic groups. Career structure is less defined but improving. RAs can do Ph.D. part-time.

Page 17: Sanguinetti, 2012Bio2 lecture 1 Bioinformatics 2 “My main problem these days is that I don’t understand how we go from an experiment in the lab to a number

Sanguinetti, 2011 Bio2 lecture 1

Career Options

• Commercial Sector– Big Pharma - Accept PhD and MSc entry.

Normally assigned to projects and work within defined teams. Defined career structure (group leaders, project managers etc)

– Spin-out/Small biotech - Accept PhD and MSc entry. More freedom and variety. A degree of ‘maintenance’ work is to be expected.

Page 18: Sanguinetti, 2012Bio2 lecture 1 Bioinformatics 2 “My main problem these days is that I don’t understand how we go from an experiment in the lab to a number

Sanguinetti, 2011 Bio2 lecture 1

Bioinformatics 2

Basic biology and roadmap

Page 19: Sanguinetti, 2012Bio2 lecture 1 Bioinformatics 2 “My main problem these days is that I don’t understand how we go from an experiment in the lab to a number

How do you characterise life?

Sanguinetti, 2011 Bio2 lecture 1

Page 20: Sanguinetti, 2012Bio2 lecture 1 Bioinformatics 2 “My main problem these days is that I don’t understand how we go from an experiment in the lab to a number

The central “dogma” of molecular biology

• Static genetic information is stored in DNA

• Genes are portions of DNA which are “transcribed” into mRNA

• mRNA is “translated” by ribosomes into proteins

• Proteins carry out the essential cellular functions: enzymatic, regulatory, structural

Sanguinetti, 2011 Bio2 lecture 1

Page 21: Sanguinetti, 2012Bio2 lecture 1 Bioinformatics 2 “My main problem these days is that I don’t understand how we go from an experiment in the lab to a number

Sanguinetti, 2011 Bio2 lecture 1

protein-gene interactions

protein-protein interactions

PROTEOME

GENOME

Slide from http://www.nd.edu/~networks/

Cell structure Metabolism

Page 22: Sanguinetti, 2012Bio2 lecture 1 Bioinformatics 2 “My main problem these days is that I don’t understand how we go from an experiment in the lab to a number

Multiple control points

• DNA replication is regulated

• mRNA transcription is regulated by transcription factor proteins

• mRNA degradation is regulated by RNA binding proteins/ small RNAs

• mRNA translation is regulated by tRNA

• Enzymes “regulate” metabolism

Sanguinetti, 2011 Bio2 lecture 1

Page 23: Sanguinetti, 2012Bio2 lecture 1 Bioinformatics 2 “My main problem these days is that I don’t understand how we go from an experiment in the lab to a number

Multiple data types

• Sequencing (deep) measures genomic data

• Microarrays (lecture 3), PCR, RNA-seq measure mRNA

• Flow cytometry/ fluorescence measures single cell protein abundance

• Mass spectrometry measures proteins/ metabolites

Sanguinetti, 2011 Bio2 lecture 1

Page 24: Sanguinetti, 2012Bio2 lecture 1 Bioinformatics 2 “My main problem these days is that I don’t understand how we go from an experiment in the lab to a number

Multiple data types

• Chromatin immunoprecipitation measures protein binding (lecture 8)

• Yeast 2 hybrid measures protein interactions

• Nuclear magnetic resonance measures metabolites

Sanguinetti, 2011 Bio2 lecture 1

Page 25: Sanguinetti, 2012Bio2 lecture 1 Bioinformatics 2 “My main problem these days is that I don’t understand how we go from an experiment in the lab to a number

Roadmap

• Lectures 1, 2, 3, 6 aim at covering basic statistical/ ML tools to handle diverse data types

• Guest lectures 4, 7, 8 illustrate particular experimental tools and industrial apps

• Tutorials help clarifying concepts and applying to problems

Sanguinetti, 2011 Bio2 lecture 1