Understanding the function of conserved non-coding regions in the human genome Sofie Salama – Haussler lab CS273A, November 17, 2008


Page 1: Understanding the function of conserved non-coding regions in the human genome

Understanding the function of conserved non-coding regions in the human genome

Sofie Salama – Haussler lab
CS273A, November 17, 2008

Page 2: Understanding the function of conserved non-coding regions in the human genome

Haussler Lab

• Dry lab – comparative genomics research

• Browser staff – UCSC genome browser, ENCODE data coordination center, 1000 genomes

• Wet lab – experimental analysis of interesting human genomic regions

Page 3: Understanding the function of conserved non-coding regions in the human genome

Understanding the function of conserved non-coding regions in the human genome

• Origin of conserved non-coding regions and co-regulated gene networks

• Function of ultraconserved elements

• Discovery of novel non-coding RNA genes

• Detailed analysis of Human Accelerated Regions (HARs)

Page 4: Understanding the function of conserved non-coding regions in the human genome

How are we different from chimps?

• Brain anatomy
  – 3X larger, especially cortex
  – More later-developing neurons of the upper cortical layers projecting within the cortex
  – Functional asymmetries

• What are the genotypic differences responsible for these phenotypic differences?

Hill, R. S. & Walsh, C. A. Nature 437, 64–67 (2005)

Page 5: Understanding the function of conserved non-coding regions in the human genome

Clues from comparative genomics

• Human vs. chimpanzee genome
  – Genomes are almost identical
  – BUT, almost 29 million differences
  – What are the important differences?

• Multiple mammalian genomes sequenced
  – Conservation used to identify functional elements
  – Only 1/3 of conserved regions are protein coding

Page 6: Understanding the function of conserved non-coding regions in the human genome

The HAR screen

• Identify previously conserved regions
  – ≥100 bp, 96% identical between the chimpanzee, mouse and rat genomes
  – ~35,000 mammalian conserved regions

• Compare to human sequence to identify Human Accelerated Regions
  – Look for orthologous segments with a large number of changes
  – Develop statistical methods to rank and evaluate each HAR

• Identified 49 regions with a significantly increased substitution rate in humans (genome-wide FDR < 5%)
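The two filtering steps of the screen can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the inputs are assumed to be pre-aligned sequences of equal length, and a raw count of human-specific changes stands in for the statistical ranking used in the actual screen, which is not reproduced here.

```python
def percent_identity(seqs):
    """Fraction of alignment columns where every sequence carries the same base."""
    columns = list(zip(*seqs))
    same = sum(1 for col in columns if len(set(col)) == 1 and col[0] != "-")
    return same / len(columns)


def is_candidate(chimp, mouse, rat, min_len=100, min_identity=0.96):
    """Apply the length and identity thresholds quoted on the slide."""
    return (len(chimp) >= min_len
            and percent_identity([chimp, mouse, rat]) >= min_identity)


def human_specific_changes(human, chimp, mouse, rat):
    """Count columns where chimp, mouse and rat agree but human differs."""
    count = 0
    for h, c, m, r in zip(human, chimp, mouse, rat):
        if c == m == r and c != "-" and h != "-" and h != c:
            count += 1
    return count


# Toy 120 bp alignment (made-up sequence, not real HAR data).
chimp = "ACGT" * 30
mouse = chimp
rat = "CA" + chimp[2:]            # two rat-specific differences
human = list(chimp)
for i in (10, 20, 30, 40, 50):    # five human-specific differences
    human[i] = "T"
human = "".join(human)

print(is_candidate(chimp, mouse, rat))
print(human_specific_changes(human, chimp, mouse, rat))
```

Regions passing `is_candidate` would then be ranked by how unexpected their human change count is, which is where the statistical methods and the FDR threshold come in.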

Katie Pollard

Page 7: Understanding the function of conserved non-coding regions in the human genome

Wet lab HAR projects

• HAR population resequencing

• Analysis of HAR1

• Characterization of HAR2 knockout and knockin mice

Page 8: Understanding the function of conserved non-coding regions in the human genome

Why resequence the HARs?

• Positive selection
  – Beneficial mutation enters population
  – Spreads. Nearby (neutral) alleles from the mutated chromosome hitchhike towards fixation (a selective sweep)
  – Skews the derived allele frequency (DAF) spectrum towards both ends

• Confounding factor: time
  – Neutral drift removes variation in 4Neff generations (~1 MYr in human)

• Human/chimp ancestor 5-7 MYA
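The ~1 MYr figure follows from simple arithmetic. The sketch below assumes an effective population size of 10,000 and a 25-year generation time; both are commonly quoted round numbers, not values given on the slide.

```python
def drift_timescale_years(effective_size, generation_years):
    """Years spanned by 4 * Ne generations."""
    return 4 * effective_size * generation_years


# Assumed values (not from the slide): Ne = 10,000, 25 years per generation.
years = drift_timescale_years(effective_size=10_000, generation_years=25)
print(f"{years:,} years")
```

Under these assumptions the window is about 1 million years, well short of the 5-7 million years since the human/chimp ancestor, so a sweep that happened early on the human lineage would no longer leave a signal in present-day variation.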

Stringer Nature 2003

Noonan et al. Science 2006

Page 9: Understanding the function of conserved non-coding regions in the human genome

Resequence HARs 1 to 49

• 40kb around each HAR (~2.5Mb total with 13 control regions)

• 24 YRI HapMap samples (48 chromosomes; panel P2, SeattleSNPs)

• Enough to do population genetic analysis on a HAR-by-HAR basis (unlike our paper on ultraconserved elements, which analyzed them in aggregate)

• High-throughput sequencing technology enables cost-effective investigation.
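One per-region statistic such a panel supports is the derived allele frequency spectrum. A minimal sketch, assuming phased haplotypes as equal-length strings, biallelic sites, and an ancestral allele at each site (for example inferred from chimp); the function name and toy data are hypothetical.

```python
from collections import Counter


def daf_spectrum(haplotypes, ancestral):
    """Map derived-allele count -> number of segregating sites with that count."""
    n = len(haplotypes)
    spectrum = Counter()
    for site, anc in enumerate(ancestral):
        derived = sum(1 for hap in haplotypes if hap[site] != anc)
        if 0 < derived < n:          # keep segregating sites only
            spectrum[derived] += 1
    return spectrum


# Toy example with 4 chromosomes and 4 sites (made-up data).
haps = ["AAAA", "AATA", "CATA", "CATG"]
anc = "AAAA"
print(sorted(daf_spectrum(haps, anc).items()))
```

With 48 chromosomes the counts run from 1 to 47; an excess of sites near both ends of that range is the skew a selective sweep is expected to produce.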

Sol Katzman

Page 10: Understanding the function of conserved non-coding regions in the human genome

“Next-Gen” Sequencing

• ABI SOLiD (fluorescent sequencing by repeated ligation)
  – 35bp reads (fragment, not mate-pair)
  – $3-4K per run
  – 2 slides per run
  – Multiple samples per slide
    • Barcoded samples
    • Isolated drops on a slide
  – 50 to 100 million reads per slide
    • Total 2.5Gb of reads
    • 50% mapped? 50% enriched?
    • 250X coverage of 2.5Mb target regions?
    • Divide by number of samples in run for sample coverage
  – From 1000 Genomes project:
    • Need 11X to get both alleles @ 99% prob
    • Need 27X average to get 11X @ 99% prob
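The coverage estimate can be checked with a short calculation. The sketch below uses the slide's own figures (2.5Gb of reads, 50% mapped, 50% on target, 2.5Mb of target); the function name is hypothetical, and the two 50% fractions are marked as guesses on the slide itself.

```python
def target_coverage(total_bases, mapped_fraction, on_target_fraction, target_bases):
    """Average depth over the target after mapping and enrichment losses."""
    return total_bases * mapped_fraction * on_target_fraction / target_bases


run_cov = target_coverage(
    total_bases=2.5e9,        # 2.5Gb of reads
    mapped_fraction=0.5,      # "50% mapped?"
    on_target_fraction=0.5,   # "50% enriched?"
    target_bases=2.5e6,       # 2.5Mb of target regions
)
print(run_cov)                # 250.0

# Largest number of pooled samples that still averages 27X each.
max_samples = int(run_cov // 27)
print(max_samples)            # 9
```

Dividing the run coverage by the number of barcoded samples gives the per-sample depth, so under these guesses roughly nine samples could share the sequence before the average falls below the 27X target.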

Page 11: Understanding the function of conserved non-coding regions in the human genome

Project Overview (part 1 of 2)

Sol Katzman

Page 12: Understanding the function of conserved non-coding regions in the human genome

Project Overview (part 2 of 2)

Sol Katzman

Page 13: Understanding the function of conserved non-coding regions in the human genome

Wet lab HAR projects

• HAR population resequencing

• Analysis of HAR1

• Characterization of HAR2 knockout and knockin mice

Page 14: Understanding the function of conserved non-coding regions in the human genome

And the winner is… HAR1!

• 118 bp segment with 18 changes between the human and chimp sequences
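To get a feel for how extreme 18 changes in 118 bp is, one can compare it with a background expectation. The sketch below assumes a genome-wide human-chimp nucleotide divergence of about 1.2% and independent sites; both are assumptions brought in for illustration, and this is not the ranking statistic used in the HAR screen.

```python
from math import comb


def binomial_tail(n_sites, n_changes, rate):
    """P(X >= n_changes) for X ~ Binomial(n_sites, rate)."""
    return sum(
        comb(n_sites, k) * rate**k * (1 - rate) ** (n_sites - k)
        for k in range(n_changes, n_sites + 1)
    )


ASSUMED_RATE = 0.012                 # assumed background divergence per site

observed_fraction = 18 / 118         # about 15% of sites differ
expected_changes = 118 * ASSUMED_RATE
tail = binomial_tail(118, 18, ASSUMED_RATE)

print(f"observed fraction: {observed_fraction:.3f}")
print(f"expected changes:  {expected_changes:.2f}")
print(f"tail probability:  {tail:.1e}")
```

The naive tail probability ignores that HAR1 was picked as the most extreme of ~35,000 regions, which is why the screen reports a genome-wide FDR rather than a per-region probability.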

Page 15: Understanding the function of conserved non-coding regions in the human genome

HAR1 genomic landscape

• Browser gazing suggested the HAR1 element may be expressed in both orientations

• RT-PCR on human tissue RNA preps suggested brain-specific expression of the HAR1 element

• Used RACE to clone both forward and reverse transcripts from cortical and cerebellar RNA

Page 16: Understanding the function of conserved non-coding regions in the human genome

HAR1 is transcribed

• HAR1F expressed in brain (cerebellum, forebrain structures), ovary and testes (~1/10 of brain expression)

• HAR1R expressed in brain (1/10 of HAR1F) and testes

• Outside HAR1 element, little conservation beyond primates

Page 17: Understanding the function of conserved non-coding regions in the human genome

RNA in situ hybridization

• Fix tissue (whole embryo or sections)

• Synthesize digoxigenin-labelled probe antisense to desired target

• Hybridize, wash, visualize using enzyme-linked anti-DIG antibody
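An antisense probe is the reverse complement of the target transcript, so that it base-pairs with the RNA in the tissue. A minimal sketch of that relationship, assuming an RNA probe and a made-up six-base target (real probes are far longer); the function name is hypothetical.

```python
# Complement table producing RNA output; accepts DNA (T) or RNA (U) input.
_COMPLEMENT = str.maketrans("ACGUT", "UGCAA")


def antisense_probe(target):
    """Return the RNA reverse complement of a target sequence, written 5' to 3'."""
    return target.upper().translate(_COMPLEMENT)[::-1]


print(antisense_probe("AUGGCC"))   # GGCCAU
```

A sense-strand probe (the target sequence itself) would not hybridize to the transcript, which makes it a common negative control.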

superfly.ucsd.edu

Page 18: Understanding the function of conserved non-coding regions in the human genome

HAR1F is expressed in the neocortex

Nelle Lambert, Marie-Alexandra Lambot, Sandra Coppens, Pierre Vanderhaeghen

Scale bars: 500 µm, 250 µm

Page 19: Understanding the function of conserved non-coding regions in the human genome

Reelin and cortical development

Amadio, JP & Walsh, CA, Cell 126:1033-1035 (2006)

Page 20: Understanding the function of conserved non-coding regions in the human genome

HAR1F is expressed in the marginal zone and the cortical plate

Nelle Lambert, Marie-Alexandra Lambot, Sandra Coppens, Pierre Vanderhaeghen

Scale bar: 125 µm

Page 21: Understanding the function of conserved non-coding regions in the human genome

Expression of HAR1F in the neocortex continues through 19 GW (gestational weeks)

Nelle Lambert, Marie-Alexandra Lambot, Sandra Coppens, Pierre Vanderhaeghen

Scale bars: 250 µm, 1000 µm

Page 22: Understanding the function of conserved non-coding regions in the human genome

Co-expression of Reelin and HAR1F in Cajal-Retzius neurons

Nelle Lambert, Marie-Alexandra Lambot, Sandra Coppens, Pierre Vanderhaeghen

Scale bars: 250 µm, 250 µm

Page 23: Understanding the function of conserved non-coding regions in the human genome

Expression of HAR1F elsewhere in the brain at later embryonic stages

Nelle Lambert, Marie-Alexandra Lambot, Sandra Coppens, Pierre Vanderhaeghen

Page 24: Understanding the function of conserved non-coding regions in the human genome

The HAR1F neocortical expression pattern is found in macaque

• Expression pattern conserved since the divergence of hominoids and Old World monkeys 25 MYA

Colette Dehay, Pierre Vanderhaeghen

Page 25: Understanding the function of conserved non-coding regions in the human genome

HAR1F is predicted to form a stable RNA structure

Jakob Pedersen

Page 26: Understanding the function of conserved non-coding regions in the human genome

Structure probing reveals differences in the human and chimp structures

[Figure: structure-probing gels for the human and chimp RNAs. Lanes for each species: U, G, C, A, -, and DMS 0, 10, 30; positions 40, 50, 60 and 70 marked.]

Haller Igel, Manny Ares

Page 27: Understanding the function of conserved non-coding regions in the human genome

Human HAR1F differs from the ancestral RNA structure

Page 28: Understanding the function of conserved non-coding regions in the human genome

Resequencing/population genetics

• Samples
  – 24-member human diversity panel (HAR1 element)
  – 70 Caucasian and African American (6.5 kb region)
  – Other primates (gorilla, orangutan, macaque)

• Findings
  – Human-specific changes fixed in the populations (NO SNPs!)
  – Changes happened at least 1 MYA, no evidence of a recent selective sweep
  – Large number of human changes extends throughout HAR1F 1st exon

Sol Katzman, Bryan King, Andy Kern

Page 29: Understanding the function of conserved non-coding regions in the human genome

Summary

• HAR1 is the most extreme of a set of genomic regions showing increased substitutions specifically in the human lineage

• HAR1 overlaps 2 divergent ncRNA genes, HAR1F and HAR1R

• HAR1F is expressed in the neocortex in reelin-producing Cajal-Retzius neurons, which are critical for creating the architecture of the human cortex, and also in other structures patterned by the reelin pathway

• HAR1F forms a stable RNA structure and the human substitutions appear to alter this structure

Page 30: Understanding the function of conserved non-coding regions in the human genome

What does HAR1 do?

• What is the cellular role of HAR1 ncRNAs?

• Where are they localized?

• Who do they interact with?

• What is their role in neural development?

• How do human HAR1 ncRNAs differ from other mammalian HAR1 ncRNAs?

Page 31: Understanding the function of conserved non-coding regions in the human genome

Wet lab HAR projects

• HAR population resequencing

• Analysis of HAR1

• Characterization of HAR2 knockout and knockin mice

Page 32: Understanding the function of conserved non-coding regions in the human genome

HAR2

• 12 human substitutions in a 119 bp segment

• Highly conserved in amniotes, present in frog

• Not in a mature transcript, no RNA secondary structure

Page 33: Understanding the function of conserved non-coding regions in the human genome

HAR2 Genomic Neighborhood

• HAR2 located in an intron of Centaurin-gamma 2 (CENTG2)

• Closest neighbor is Gastrulation and brain-specific homeobox protein 2 (GBX2)

• CENTG2-HAR2-GBX2 relationship conserved back to frog-human ancestor

Page 34: Understanding the function of conserved non-coding regions in the human genome

Transgenic assay for enhancer activity

[Construct diagram: HAR2, minimal promoter, LacZ]

Harvest at embryonic timepoints. Stain to visualize LacZ activity.

How does LacZ expression compare with that of nearby genes (Centg2 and Gbx2)?

Page 35: Understanding the function of conserved non-coding regions in the human genome

HAR2 is a neural-specific enhancer

Bryan King and Armen Shamamian

Page 36: Understanding the function of conserved non-coding regions in the human genome

HAR2 is a limb-specific enhancer

• Human HAR2 shows significant activity in the limb buds

• Human HAR2 is stronger and shows a broader pattern of expression

• Making the human substitutions in the chimp construct is sufficient for increased limb bud staining

Prabhakar et al. (2008) Science

Page 37: Understanding the function of conserved non-coding regions in the human genome

HAR2 targeted mutants

• HAR2 knockout: marked allele made; breeding with a constitutive Cre mouse to remove vector/marker sequences

• HAR2 knock-in of human HAR2: have ES cell line, no chimeras yet

• HAR2 knock-in of mouse HAR2: have construct

Robert Sellers, Armen Shamamian

Page 38: Understanding the function of conserved non-coding regions in the human genome

Acknowledgements

Haussler Lab
Jeff Long, Ting Wang, Danielle Gomez

UCSC Collaborators
Manny Ares, Haller Igel
Harry Noller, David Feldheim, Jena Yamada
Nader Pourmand

Pierre Vanderhaeghen – Univ. of Brussels
Katie Pollard – UCD, UCSF/Gladstone
Andy Kern – Dartmouth

Funding
HHMI, NIDA
