1
Incorrect pairing of direct homologous repeats during meiosis can result in non-allelic recombination. This reciprocal event leads to one gamete with duplication of the intervening sequence and one with deletion (Fig. 1a). Recombination between inverted homologous repeats results in inversion of the intervening sequence, which can disrupt gene function by inverting exons, or by disconnecting coding and regulatory regions (Fig. 1b). If the affected regions contain dosage-sensitive genes, the effect can range from mild to devastating, depending on the locus (Fig. 1c). Long reads enable assembly of repetitive regions and can also be mapped more unambiguously than short reads, enabling precise characterisation of recombination breakpoints. Resolving highly complex rearrangements of genomic architecture using long PromethION reads Genomic disorders are diseases that result from chromosomal rearrangements, rather than from nucleotide-scale changes, and lead to the loss or gain of chromosomal material, or to inversions Fig. 1 Non-allelic homologous recombination: a) and b) mechanisms c) examples of disorders Deletion, duplication and inversion of dosage-sensitive genes Fully characterising a duplication-inverted triplication-duplication on chromosome 14 Fig. 2 Mechanism of the duplication-inverted triplication-duplication rearrangement © 2019 Oxford Nanopore Technologies. All rights reserved. Oxford Nanopore Technologies’ products are currently for research use only. P18002 - Version 2.0 Fig. 3 PromethION sequence data showing AOH; copy number changes; structural variant calls Contact: [email protected] More information at: www.nanoporetech.com and publications.nanoporetech.com We generated 16x of reads with an aligned read N50 of 15 kb. Reads were aligned against the hg38 reference using ngmlr and structural variants were detected by Sniffles. Copy number was estimated by comparing read depth at each position to control data and applying a correction factor for different amounts sequenced per sample. Sniffles detected 26,036 bp and 2,263 bp inversions (the latter as an inverted duplication) at the centromeric and telomeric ends of the locus respectively. In both cases one or more reads spanned the entire duplicated region. We determined both breakpoints and confirmed these by capillary sequencing. To identify AOH we called SNPs using bcftools and compared the density of heterozygous SNPs in the target sample against the controls – a moving average of 10 bins is plotted in the top panel of Fig. 3. 0 1 2 3 4 5 Copy number Sniffles inversion Chr 14 p12 q11.2 14q12 23.1 q24.3 31.1 q32.2 p13 14p11.2 q21.2 31.3 0 2 4 6 61.5 62.0 62.5 63.0 63.5 64.0 64.5 65.0 65.5 Copy number Position along chromosome (Mb) 0 Heterozygosity 20 40 60 80 100 Position along chromosome (Mb) 0 0.5 1.0 1.5 62.37 62.38 62.39 62.40 62.41 62.42 62.43 64.63 64.64 64.65 64.66 64.67 Position along chromosome (Mb) Position along chromosome (Mb) ngmlr alignments +ve strand -ve strand Dosage- sensitive gene Homologous chromosomes Recombination hotspot Deletion Duplication Inversion Hotspot Location of repeats Repeat length (kb) Inter-repeat distance (Mb) Sequence similarity (%) Approximate frequency Disorder CMT1A-REP 17p11.2 24 98.7 1.5 1 x 10 −5 Charcot-Marie-Tooth disease (del.) Hereditary Neuropathy (dup.) AZFa-HERV Yq11.21 10 94.0 0.8 1 x 10 −5 Azoospermia (del.) Normal fertility (dup.) WBS-LCR 7q11.23 143 99.6 1.5 1 x 10 −5 Williams-Beuren syndrome (del.) 7q11.23 duplication syndrome (dup.) Int22h- Xq28 9.5 99.7 0.6 1 x 10 −4 Haemophilia A (inv.) a) Direct repeats c) b) Inverted repeats Fork collapse and processing Product a b c d’ c’ b’ C D E 0 1 2 3 4 Copy number Adapted from Carvalho, C.M.B. et al. Absence of heterozygosity due to template switching during replicative rearrangements. American Journal of Human Genetics 96 (4), 555–564 (2015) a) Homologous chromosome q a b c d e a’ b’ c’ d’ e’ Homologous chromosome Q A B C D E A’ B’ C’ D’ E’ Template switching 2: initiation of replication on homologous chromosome Parent strands DNA replication e d e d Leading strand Lagging strand c b a c b a d’ c’ a’ c b a d’ c’ b’ a’ Fork collapse and processing Template switching 1: newly synthesised strand initiates replication in different place in opposite direction a b c d e b’ The duplication-inverted triplication-duplication structure (Fig. 2) is a class of complex genomic rearrangement that can lead to severe phenotypic consequences. Just two breakpoints can give rise to both duplicated and triplicated regions. To determine the mechanism that gives rise to the rearrangement it is necessary to i) distinguish between duplicated and triplicated regions, ii) identify breakpoint junctions and iii) identify any absence of heterozygosity (AOH) created by the event. We analysed a patient sample where the rearranged locus had previously been identified using array-CGH, but this method had given insufficient resolution to identify the duplicated regions and thus it had not been possible to ascertain the breakpoints (Fig. 3). Elucidating the mechanism of complex genomic rearrangements with long reads Haplotype 1 Haplotype 2 Log-likelihood ratio 30 20 10 0 -10 -20 -30 Position (kb) 101,288 101,290 101,292 101,294 101,296 a) Control (NA12878) Haplotype 1 Haplotype 2 Log-likelihood ratio 30 20 10 0 -10 -20 -30 Position (kb) 101,288 101,290 101,292 101,294 101,296 b) Patient Over a 42 Mb region of chromosome 14, both of the patient’s chromosomes contain maternally inherited DNA. AOH in this region can affect gene dosing through the disruption of normal imprinting mechanisms. Reads aligning to the differentially methylated region of the MEG3 gene were first assigned to haplotypes using MarginPhase based on the phasing of single nucleotide variants. Next, methylation of CpG sites for each haplotype was assessed using Nanopolish and the average of the resulting log-likelihood ratios plotted for each site. A positive LLR indicates CpG methylation, while a negative LLR indicates CpG non-methylation. The control sample (Fig. 4a) shows the normal parent-of-origin specific methylation pattern, while the patient sample (Fig. 4b) lacks any hyper-methylated haplotype, a phenomenon termed loss of imprinting. Absence of the hyper-methylated paternal MEG3 allele in a patient genome Fig. 4 Methylation of the MEG3 gene a) differential methylation in NA12878 b) AOH in the patient Adapted from Carvalho C.M.B. et al. Interchromosomal template-switching as a novel molecular mechanism for imprinting perturbations associated with Temple syndrome. Genome Medicine 11:25 (2019).

Resolving highly complex rearrangements of genomic … · 2019-06-07 · Genomic disorders are diseases that result from chromosomal rearrangements, rather than from nucleotide-scale

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Resolving highly complex rearrangements of genomic … · 2019-06-07 · Genomic disorders are diseases that result from chromosomal rearrangements, rather than from nucleotide-scale

Incorrect pairing of direct homologous repeats during meiosis can result in non-allelic recombination. This reciprocal event leads to one gamete with duplication of the intervening sequence and one with deletion (Fig. 1a). Recombination between inverted homologous repeats results in inversion of the intervening sequence, which can disrupt gene function by inverting exons, or by disconnecting coding and regulatory regions (Fig. 1b). If the affected regions contain dosage-sensitive genes, the effect can range from mild to devastating, depending on the locus (Fig. 1c). Long reads enable assembly of repetitive regions and can also be mapped more unambiguously than short reads, enabling precise characterisation of recombination breakpoints.

Resolving highly complex rearrangements of genomic architecture using long PromethION readsGenomic disorders are diseases that result from chromosomal rearrangements, rather than from nucleotide-scale changes, and lead to the loss or gain of chromosomal material, or to inversions

Fig. 1 Non-allelic homologous recombination: a) and b) mechanisms c) examples of disorders

Deletion, duplication and inversion of dosage-sensitive genes

Fully characterising a duplication-inverted triplication-duplication on chromosome 14

Fig. 2 Mechanism of the duplication-inverted triplication-duplication rearrangement

© 2019 Oxford Nanopore Technologies. All rights reserved. Oxford Nanopore Technologies’ products are currently for research use only.P18002 - Version 2.0

Fig. 3 PromethION sequence data showing AOH; copy number changes; structural variant calls

Contact: [email protected] More information at: www.nanoporetech.com and publications.nanoporetech.com

We generated 16x of reads with an aligned read N50 of 15 kb. Reads were aligned against the hg38 reference using ngmlr and structural variants were detected by Sniffles. Copy number was estimated by comparing read depth at each position to control data and applying a correction factor for different amounts sequenced per sample. Sniffles detected 26,036 bp and 2,263 bp inversions (the latter as an inverted duplication) at the centromeric and telomeric ends of the locus respectively. In both cases one or more reads spanned the entire duplicated region. We determined both breakpoints and confirmed these by capillary sequencing. To identify AOH we called SNPs using bcftools and compared the density of heterozygous SNPs in the target sample against the controls – a moving average of 10 bins is plotted in the top panel of Fig. 3.

012345

Copynumber

Snifflesinversion

Chr 14 p12 q11.2 14q12 23.1 q24.3 31.1 q32.2p13 14p11.2 q21.2 31.3

0

2

4

6

61.5 62.0 62.5 63.0 63.5 64.0 64.5 65.0 65.5

Copynumber

Position along chromosome (Mb)

0

Heterozygosity

20 40 60 80 100

Position along chromosome (Mb)0

0.5

1.0

1.5

62.37 62.38 62.39 62.40 62.41 62.42 62.43 64.63 64.64 64.65 64.66 64.67

Position along chromosome (Mb) Position along chromosome (Mb)

ngmlralignments+ve strand-ve strand

Dosage-sensitive gene

Homologous chromosomes

Recombinationhotspot

Deletion

Duplication Inversion

HotspotLocation of repeats

Repeat length (kb)

Inter-repeatdistance (Mb)

Sequence similarity (%)

Approximatefrequency

Disorder

CMT1A-REP 17p11.2 24 98.7 1.5 1 x 10−5 Charcot-Marie-Tooth disease (del.)Hereditary Neuropathy (dup.)

AZFa-HERV Yq11.21 10 94.0 0.8 1 x 10−5 Azoospermia (del.)Normal fertility (dup.)

WBS-LCR 7q11.23 143 99.6 1.5 1 x 10−5 Williams-Beuren syndrome (del.)7q11.23 duplication syndrome (dup.)

Int22h- Xq28 9.5 99.7 0.6 1 x 10−4 Haemophilia A (inv.)

a) Direct repeats

c)

b) Inverted repeats

Fork collapseand processing

Producta b c d’ c’ b’ C D E

0

1

2

3

4

Cop

y nu

mbe

r

Adapted from Carvalho, C.M.B. et al. Absence of heterozygosity due to template switching during replicative rearrangements. American Journal of Human Genetics 96 (4), 555–564 (2015)

a)Homologous

chromosome q

a b c d e

a’ b’ c’ d’ e’

Homologous chromosome Q

A B C D E

A’ B’ C’ D’ E’

Template switching 2: initiation of replication on

homologous chromosome

Parent strands

DNA replication

e d e d

Leading strand

Lagging strand

c

b

a

c

b

a

b

c

a

d’ c’a’

c

b

a

d’ c’ b’a’

Fork collapseand processing

Template switching 1: newly synthesised strand initiates replication in different place

in opposite direction

a

b

c

d

e

b’

The duplication-inverted triplication-duplication structure (Fig. 2) is a class of complex genomic rearrangement that can lead to severe phenotypic consequences. Just two breakpoints can give rise to both duplicated and triplicated regions. To determine the mechanism that gives rise to the rearrangement it is necessary to i) distinguish between duplicated and triplicated regions, ii) identify breakpoint junctions and iii) identify any absence of heterozygosity (AOH) created by the event. We analysed a patient sample where the rearranged locus had previously been identified using array-CGH, but this method had given insufficient resolution to identify the duplicated regions and thus it had not been possible to ascertain the breakpoints (Fig. 3).

Elucidating the mechanism of complex genomic rearrangements with long reads

Haplotype 1

Haplotype 2

Log-

likel

ihoo

d ra

tio

30

20

10

0

-10

-20

-30

Position (kb)

101,288 101,290 101,292 101,294 101,296

a) Control (NA12878)

Haplotype 1

Haplotype 2

Log-

likel

ihoo

d ra

tio

30

20

10

0

-10

-20

-30

Position (kb)

101,288 101,290 101,292 101,294 101,296

b) Patient

Over a 42 Mb region of chromosome 14, both of the patient’s chromosomes contain maternally inherited DNA. AOH in this region can affect gene dosing through the disruption of normal imprinting mechanisms. Reads aligning to the differentially methylated region of the MEG3 gene were first assigned to haplotypes using MarginPhase based on the phasing of single nucleotide variants. Next, methylation of CpG sites for each haplotype was assessed using Nanopolish and the average of the resulting log-likelihood ratios plotted for each site. A positive LLR indicates CpG methylation, while a negative LLR indicates CpG non-methylation. The control sample (Fig. 4a) shows the normal parent-of-origin specific methylation pattern, while the patient sample (Fig. 4b) lacks any hyper-methylated haplotype, a phenomenon termed loss of imprinting.

Absence of the hyper-methylated paternal MEG3 allele in a patient genome

Fig. 4 Methylation of the MEG3 gene a) differential methylation in NA12878 b) AOH in the patient

Adapted from Carvalho C.M.B. et al. Interchromosomal template-switching as a novel molecular mechanism for imprinting perturbations associated with Temple syndrome. Genome Medicine 11:25 (2019).