Supplementary Information 3D Chromatin Architecture of Large Plant Genomes Determined by Local A/B Compartments



Supplementary Data Table 1 | Pearson correlation between replicates

species         min     median
foxtail millet  0.9562  0.9671
maize           0.9489  0.9863
rice            0.8843  0.9451
sorghum         0.8907  0.9550
tomato          0.9731  0.9835

To compute the correlation, the genome-wide interaction matrix was broken down into random 200 × 200 sub-matrices along the main diagonal, and their Pearson correlation was calculated. Bin size = 10 kb.
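The sub-matrix correlation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the number of sampled sub-matrices (`n_samples`) and the random seed are assumptions, and the matrices are plain nested lists of contact values binned at the same resolution.

```python
import random
from math import sqrt
from statistics import median


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (assumes non-constant input)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def diagonal_block(matrix, start, size):
    """Flatten the size x size sub-matrix whose top-left corner lies on the main diagonal."""
    return [v for row in matrix[start:start + size] for v in row[start:start + size]]


def replicate_correlations(rep1, rep2, size=200, n_samples=100, seed=0):
    """Return (min, median) Pearson correlation over random diagonal sub-matrices.

    n_samples and seed are hypothetical parameters; the source does not state
    how many sub-matrices were drawn.
    """
    rng = random.Random(seed)
    last_start = len(rep1) - size
    values = []
    for _ in range(n_samples):
        start = rng.randint(0, last_start)  # same window in both replicates
        values.append(pearson(diagonal_block(rep1, start, size),
                              diagonal_block(rep2, start, size)))
    return min(values), median(values)
```

Reporting the minimum alongside the median, as the table does, shows the worst-agreeing window as well as the typical one.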

Supplementary Data Table 2 | Pearson correlation of rice seedling Hi-C data from Liu et al. and rice protoplast Hi-C

bin size  min     median
10K       0.9292  0.9487
20K       0.9483  0.9716

Supplementary Data Table 3 | Domain and loop consistency between replicates

feature      foxtail millet  maize    rice    sorghum  tomato
compartment  0.9237          0.86038  0.7935  0.8891   0.9642
domain       0.8320          0.8037   0.8316  0.7728   0.8130
loop         -               0.5935   -       -        0.6999

A domain border is considered consistent if its position in the two replicates differs by no more than 2 bins.

A loop is considered reproducible when the distance between the two pixels in the heatmap matrix is within 50K and within 20% of the loop's genomic distance. When comparing loop reproducibility, we set the p-value threshold for the other replicate to 0.1.
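The loop reproducibility criterion can be sketched as below. This is a hedged illustration: the source does not say how the distance between two pixels is measured, so the larger of the two anchor offsets is assumed here, and the shorter of the two loop spans is used as the genomic distance. The function names are hypothetical.

```python
def loops_match(loop1, loop2, max_dist=50_000, max_frac=0.2):
    """Return True if two loops, given as (anchor1_bp, anchor2_bp), satisfy both cutoffs.

    Assumption: pixel distance = larger of the two anchor offsets.
    Assumption: genomic distance = the shorter of the two loop spans.
    """
    (a1, b1), (a2, b2) = loop1, loop2
    pixel_dist = max(abs(a1 - a2), abs(b1 - b2))
    span = min(abs(b1 - a1), abs(b2 - a2))
    return pixel_dist <= max_dist and pixel_dist <= max_frac * span


def reproducible_fraction(loops1, loops2):
    """Fraction of loops in replicate 1 that have a matching loop in replicate 2."""
    if not loops1:
        return 0.0
    hits = sum(any(loops_match(l1, l2) for l2 in loops2) for l1 in loops1)
    return hits / len(loops1)
```

In this sketch `loops2` would be the loop calls of the other replicate made at the relaxed p-value threshold of 0.1; the loop calling itself is outside its scope.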

Supplementary Data Table 4 | Domain border consistency between rice seedling Hi-C data from Liu et al. and rice protoplast Hi-C

category        percentage of domain border overlap
single border   24.63%
both borders    71.98%
no overlap      3.38%

A domain border is considered consistent if the distance between the two datasets was within 2 bins. Minimal domain size is 100 kb.
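A minimal sketch of how domains might be sorted into the three categories of this table. The function names are hypothetical, and the default `bin_size` of 10 kb is an assumption, since the bin size is not stated for this table. Domains are (start_bp, end_bp) tuples on one chromosome.

```python
from bisect import bisect_left


def border_near(pos, sorted_borders, tol):
    """True if any border in sorted_borders lies within tol bp of pos."""
    i = bisect_left(sorted_borders, pos - tol)
    return i < len(sorted_borders) and sorted_borders[i] <= pos + tol


def classify_domains(domains_a, domains_b, bin_size=10_000, tol_bins=2, min_size=100_000):
    """Count domains of set A with both, one, or no border matched in set B.

    bin_size is an assumed value; tol_bins and min_size follow the table note.
    """
    tol = tol_bins * bin_size
    borders_b = sorted({p for s, e in domains_b if e - s >= min_size for p in (s, e)})
    counts = {"both": 0, "single": 0, "none": 0}
    for s, e in domains_a:
        if e - s < min_size:  # skip domains below the minimal size
            continue
        hits = border_near(s, borders_b, tol) + border_near(e, borders_b, tol)
        counts[("none", "single", "both")[hits]] += 1
    return counts
```

Dividing each count by the number of retained domains gives percentages comparable in form to those in the table.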

Supplementary Data Table 5 | Comparison of three compartment calling methods

comparison             maize   tomato  sorghum  foxtail millet  rice
Global A Block B       50.46%  31.91%  37.75%   40.91%          38.08%
Global A Chromosome B  15.59%  15.96%  14.93%   27.91%          11.21%
Global B Block A       35.76%  22.58%  30.43%   30.15%          48.38%
Global B Chromosome A  1.39%   11.28%  10.96%   12.41%          4.94%

In mammalian Hi-C studies, global A/B compartments can be easily identified using the whole-genome Hi-C contact probability matrix. A compartments correspond to gene-rich euchromatin, while B compartments correspond to heterochromatin. For small plant genomes such as Arabidopsis and rice, A/B compartments were often called on individual chromosomes or euchromatin arms using the chromosome-wide Hi-C matrix. We found that for large plant genomes, it was best to partition each chromosome into blocks with similar chromatin interaction patterns by constrained clustering of the chromosome-wide Hi-C contact probability. In each block, local A/B compartments could then be called using the block-wide Hi-C contact probability matrix. This block-based method is more accurate in assigning A/B compartment status to euchromatin and heterochromatin islands that are often misidentified by the global and chromosome-wide methods. For example, 50.46% of the maize A compartments identified by the global method are assigned as local B compartments by the block-based method, while the chromosome-wide method reassigned only 15.59% of them. Genetic and epigenetic feature analysis of these local B compartments initially assigned as global A compartments showed that those regions are indeed associated with heterochromatin signatures such as high DNA methylation and low gene density (Fig 4 and Fig S9).
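The comparison between calling methods can be illustrated with a small cross-tabulation. This sketch assumes each genomic bin already carries a global label and a local (block-based) label, each "A" or "B", and that percentages are computed over bins; the source does not state whether bins or compartments were counted, and the function names are hypothetical.

```python
from collections import Counter


def cross_tabulate(global_labels, local_labels):
    """Count bins per two-letter class: global label first, local label second."""
    valid = ("A", "B")
    return Counter(g + l for g, l in zip(global_labels, local_labels)
                   if g in valid and l in valid)


def reassigned_fraction(counts, global_label):
    """Fraction of bins with the given global label whose local label differs."""
    other = "B" if global_label == "A" else "A"
    total = counts[global_label + "A"] + counts[global_label + "B"]
    return counts[global_label + other] / total if total else 0.0


# Toy example with eight bins (not data from the study).
counts = cross_tabulate(list("AAAABBBB"), list("ABABBBBA"))
print(reassigned_fraction(counts, "A"))  # share of global A bins called local B
print(reassigned_fraction(counts, "B"))  # share of global B bins called local A
```

The two-letter keys ("AA", "AB", "BA", "BB") follow the naming used in the figure legends, where the first letter is the global call and the second the local call.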

Supplementary Data Table 6 | Whole genome bisulfite data summary

species         total reads    mapping ratio  C coverage  conversion ratio
maize           1,639,759,875  0.8547         20.69       0.9920
tomato          542,702,956    0.7312         9.11        0.9940
sorghum         407,967,136    0.7273         31.54       0.9930
rice            134,828,220    0.51390.6449   15.66       0.9886
foxtail millet  223,022,358    0.7891         28.716      0.9933

Supplementary Data Table 7 | Open chromatin and histone modification data summary

species         assay      total reads  mapped reads  overall alignment rate  unique reads  peak number
maize           ATAC-seq   41236249     39738033      96.37%                  16149797      52681
maize           H3K4me3    40328044     39203141      97.21%                  25821001      23064
maize           H3K27me3   34386418     33372525      97.05%                  19982130      15974
maize           H3K27ac    29790662     28967191      97.24%                  21340313      32702
tomato          DNase-seq  30220717     29037745      96.09%                  15674974      21062
tomato          H3K4me3    9830887      5546586       56.42%                  4808246       19295
tomato          H3K27me3   30410104     29285755      96.30%                  22325272      41530
tomato          H3K27ac    35188056     33118998      94.12%                  26815802      23824
sorghum         ATAC-seq   51875055     50491161.8    97.33%                  30184868      30512
sorghum         H3K4me3    8574346      8309399       96.91%                  6915264       20187
sorghum         H3K27me3   13185584     11537386      87.50%                  7614675       9323
rice            ATAC-seq   34470691     23842109      69.17%                  6824383       18593
rice            H3K4me3    5340841      4907699       91.89%                  4178335       22338
rice            H3K27me3   4698878      4297594       91.46%                  3530393       8993
foxtail millet  ATAC-seq   27812365     23012261      82.74%                  6122432       28677
foxtail millet  H3K4me3    28056010     27377055      97.58%                  26016696      25188
foxtail millet  H3K27me3   29897680     27063380      90.52%                  23200655      9177


Supplementary Data Table 8 | RNA-seq data summary

species         replicate  total reads  rRNA mapping ratio  mapping rate
maize           rep 1      9360564      13.19%              89%
maize           rep 2      10431234     11.46%              89.30%
sorghum         rep 1      13337364     6.70%               98.50%
sorghum         rep 2      13721834     3.32%               98.50%
sorghum         rep 3      13431232     4.47%               98.50%
rice            rep 1      2623743      4.35%               77.30%
rice            rep 2      6256279      5.17%               73.80%
foxtail millet  rep 1      6973237      21.57%              92.70%
foxtail millet  rep 2      6973237      20.43%              92.60%

Supplementary Data Table 9 | Accession numbers of other NGS data used in this study

dataset            accession
tomato DNase-Seq   SRP102870
tomato H3K4me3     SRP102870
tomato H3K27me3    SRP102870

Supplementary Data Table 10 | Number of syntenic domains (foxtail millet vs sorghum)

domain             repressive  polycomb  intermediate  active  all   P value
number             192         341       243           506     1871
number (>4 genes)  3           58        391           818     1270
80% conserved      0           10        107           186     303   1
randomized         0           13        90            190     293
90% conserved      0           5         66            114     185   0.999
randomized         0           6         56            114     176
100% conserved     0           5         59            90      154   0.9997
randomized         0           5         50            101     156


Supplementary Data Table 11 | Number of syntenic domains (sorghum vs maize)

domain          repressive  polycomb  intermediate  active  all   P value
num domain      777         364       241           967     2349
gene>=5         29          177       124           890     1220
50% conserved   12          43        23            88      166   1
randomized      10          43        26            94      173
80% conserved   1           3         5             5       14    0.9926
randomized      1           4         4             7       16
100% conserved  0           1         2             2       5     0.8549
randomized      0           0         2             2       4

Supplementary Data Table 12 | Loop position relative to domain

species  border-border  border-domain  border-other  within domain  other
maize    3346           1197           674           336            63
tomato   441            423            355           279            152


Supplementary Data Table 13 | Domain border and compartment border overlap

species         cutoff  actual domain:   actual domain:  random domain:   random domain:  p-value
                        not overlapped   overlapped      not overlapped   overlapped
maize           0K      10373            7677            11776            5206            6.62E-118
maize           20K     6157             11893           8411             8571            8.75E-189
maize           40K     4465             13585           6294             10688           3.46E-138
tomato          0K      3783             1637            3654             728             2.17E-56
tomato          20K     2861             2559            3123             1259            1.20E-78
tomato          40K     2422             2998            2757             1625            9.89E-73
sorghum         0K      3265             1433            3175             685             6.91E-43
sorghum         20K     2446             2252            2698             1162            1.10E-63
sorghum         40K     2048             2650            2339             1521            1.91E-55
foxtail millet  0K      2291             1451            2011             811             2.17E-17
foxtail millet  20K     1604             2138            1575             1247            2.73E-25
foxtail millet  40K     1293             2449            1244             1578            5.46E-15
rice            0K      2835             999             2114             512             8.19E-10
rice            20K     2216             1618            1759             867             7.64E-14
rice            40K     1864             1970            1462             1164            2.46E-08

The random domains are generated by shifting the actual domains 1 Mb along the chromosome.
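The shifted-domain control can be sketched as follows. This is an illustration under stated assumptions: shifted domains that run past the chromosome end are dropped (the source does not say how they were handled), each unique border position is counted once, and the function names are hypothetical. The statistical test behind the p-values is not specified in the source and is not reproduced here.

```python
from bisect import bisect_left


def shift_domains(domains, chrom_length, shift=1_000_000):
    """Shift (start_bp, end_bp) domains along the chromosome by `shift` bp.

    Assumption: domains that would extend past the chromosome end are dropped.
    """
    return [(s + shift, e + shift) for s, e in domains if e + shift <= chrom_length]


def count_border_overlap(domains, compartment_borders, cutoff):
    """Return (not_overlapped, overlapped) counts of unique domain borders.

    A domain border counts as overlapped when a compartment border lies
    within `cutoff` bp of it (cutoff=0 requires an identical position).
    """
    comp = sorted(compartment_borders)
    overlapped = not_overlapped = 0
    for pos in sorted({p for d in domains for p in d}):
        i = bisect_left(comp, pos - cutoff)
        if i < len(comp) and comp[i] <= pos + cutoff:
            overlapped += 1
        else:
            not_overlapped += 1
    return not_overlapped, overlapped
```

Running `count_border_overlap` on the actual domains and on `shift_domains(...)` at cutoffs of 0, 20,000 and 40,000 bp would yield the four count columns of one species in the table.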

Supplementary Fig 1 | Examples of tomato scaffold assembly errors in the older genome version SL2.5, shown by Hi-C contact matrices. A scaffold assembly error (inversion) shows a high contact ratio between the scaffold borders. Two such scaffold assembly errors on tomato chromosome 12 are corrected in the latest genome version, SL3.0.

[Figure: Hi-C contact heatmaps titled "Tomato chr12 (SL2.5)" and "Tomato chr12 (SL3.0)", both axes 0M to 60M.]

Supplementary Fig 2 | Examples of maize scaffold assembly errors in the current maize genome version 4. The Hi-C contact matrix heatmap shows a region of maize chromosome 2 with 7 potential assembly errors. The abnormal Hi-C contact hotspots disappeared when we manually rearranged the scaffolds.

[Figure: two Hi-C contact heatmaps of maize chromosome 2, both axes 240M to 244M. One heatmap is labeled with scaffolds 1, 2, 3, 4, 5, 6, 7; the other is labeled 1, 4, 2 (rev), 5 (rev), 6, 7 (rev), 3 (rev).]

Supplementary Fig 3 | Example of a tomato scaffold assembly error in the current tomato genome version SL3.0. The Hi-C contact matrix heatmap shows a region of tomato chromosome 10 with a potential assembly error. The abnormal Hi-C contact hotspot disappeared when the scaffold was reversed.

[Figure: Hi-C contact heatmaps titled "tomato chr10 before" and "tomato chr10 after correction", both axes 24M200k to 24M600k.]

Supplementary Fig 4 | Examples of sorghum scaffold assembly errors. Potential scaffold assembly errors (inversion) at sorghum chr7 11.08-13.07M and 13.07-14.17M are indicated by red rectangles. We also found a potential orientation error at 13.07-14.17M (green). (a) Interaction heatmap showing the potential scaffold assembly errors. (b) Interaction heatmap after manually reordering the scaffolds.

[Figure: Hi-C contact heatmaps titled "sorghum chr7 before" (a) and "sorghum chr7 after correction" (b), both axes 10M to 16M.]

Supplementary Fig 5 | Examples of foxtail millet scaffold assembly errors. (a) Interaction heatmap showing the potential scaffold assembly errors. (b) Interaction heatmap after manually reordering the scaffolds. Potential ordering errors are at foxtail millet chr5 26.8-31.6M and 31.6-32.3M, with an orientation error at 26.8-31.6M.

[Figure: Hi-C contact heatmaps titled "foxtail millet chr5 before" (a) and "foxtail millet chr5 after correction" (b), both axes 26M to 33M.]

Supplementary Fig 6 | Local A/B compartments showed distinct euchromatin and heterochromatin epigenetic features. All maize chromosomes were first partitioned into blocks by constrained clustering of the interaction matrix; block-based eigenvector analysis was then performed to identify the local A/B compartments. (a) The heterochromatin-like epigenetic features of the local B compartments previously characterized as global A compartments. (b) The euchromatin-like epigenetic features of the local A compartments previously characterized as global B compartments.

[Figure: average profiles of DHS, K4me3, Gene, TE, CG, CHG and CHH, plotted from −40K to the compartment border and from the border to 40K. Curve groups: Genome wide A Block B (identified by both…), Genome wide A Block B (Block specific), Genome wide A Block A (control), Genome wide B Block B (control), Genome wide B Block A (identified by both…), Genome wide B Block A (Block specific).]

Supplementary Fig 7 | Segregation of sorghum, foxtail millet and rice genome into local A/B compartments. Comparison of the genome-wide, chromosome-wide and block-specific eigenvector analysis of the sorghum (a), foxtail millet (b) and rice (c) chromosome 1. In all three cases, partitioning the chromosome into blocks by constrained clustering and the use of block-based eigenvector improved the compartment identification.

[Figure: panels a, b and c, each with three tracks: Genome Wide Eigen Vector (500K bin), Chromosome Wide Eigen Vector (40K bin) and Chromosome Block Eigen Vector (40K bin). Position axes run 0M to 80M in one panel and 0M to 40M in the other two.]

Supplementary Fig 8 | Comparison of the local and genome-wide A/B compartments in the maize genome. After calling the global and block-specific local A/B compartments, we can partition the chromosomes into four types of regions: AA, AB, BA, BB. For example, chromosome regions classified as A compartment both globally and locally are referred to as “AA”, while a region classified as global A and local B compartment is referred to as “AB”. The AA and BA regions are associated with typical euchromatin features, with higher gene density and active chromatin marks, while the AB and BB regions are associated with typical heterochromatin features, with higher TE density and repressive marks. Different genetic and epigenetic features of these compartments and the surrounding regions are shown.

[Figure: average profiles of DHS, K4me3, K27me3, K27ac, Gene, mRNA, CG, CHG, CHH, TE and cis_ratio, plotted from −40K to the compartment border and from the border to 40K. Surviving legend entry: Genome wide A Block A.]

Supplementary Fig 9 | Comparison of the local and genome-wide A/B compartments in the tomato genome. After calling the global and block-specific local A/B compartments, we can partition the chromosomes into four types of regions: AA, AB, BA, BB. For example, chromosome regions classified as A compartment both globally and locally are referred to as “AA”, while a region classified as global A and local B compartment is referred to as “AB”. The AA and BA regions are associated with typical euchromatin features, with higher gene density and active chromatin marks, while the AB and BB regions are associated with typical heterochromatin features, with higher TE density and repressive marks. Different genetic and epigenetic features of these compartments and the surrounding regions are shown.

[Figure: average profiles of DHS, K4me3, K27me3, K27ac, Gene, mRNA, CG, CHG, CHH, TE and cis_ratio, plotted from −40K to the compartment border and from the border to 40K. Legend: Genome wide A Block A, Genome wide A Block B, Genome wide B Block A, Genome wide B Block B.]

Supplementary Fig 10 | Comparison of the local and genome-wide A/B compartments in the sorghum genome. After calling the global and block-specific local A/B compartments, we can partition the chromosomes into four types of regions: AA, AB, BA, BB. For example, chromosome regions classified as A compartment both globally and locally are referred to as “AA”, while a region classified as global A and local B compartment is referred to as “AB”. The AA and BA regions are associated with typical euchromatin features, with higher gene density and active chromatin marks, while the AB and BB regions are associated with typical heterochromatin features, with higher TE density and repressive marks. Different genetic and epigenetic features of these compartments and the surrounding regions are shown.

[Figure: average profiles of DHS, K4me3, K27me3, Gene, mRNA, CG, CHG, CHH, TE and cis_ratio, plotted from −40K to the compartment border and from the border to 40K. Surviving legend entry: Genome wide A Block A.]

Supplementary Fig 11 | Comparison of the local and genome-wide A/B compartments in the foxtail millet genome. After calling the global and block-specific local A/B compartments, we can partition the chromosomes into four types of regions: AA, AB, BA, BB. For example, chromosome regions classified as A compartment both globally and locally are referred to as “AA”, while a region classified as global A and local B compartment is referred to as “AB”. The AA and BA regions are associated with typical euchromatin features, with higher gene density and active chromatin marks, while the AB and BB regions are associated with typical heterochromatin features, with higher TE density and repressive marks. Different genetic and epigenetic features of these compartments and the surrounding regions are shown.

[Figure: average profiles of DHS, K4me3, K27me3, Gene, mRNA, CG, CHG, CHH, TE and cis_ratio, plotted from −40K to the compartment border and from the border to 40K. Surviving legend entry: Genome wide A Block A.]

Supplementary Fig 12 | Comparison of the local and genome-wide A/B compartments in the rice genome. After calling the global and block-specific local A/B compartments, we can partition the chromosomes into four types of regions: AA, AB, BA, BB. For example, chromosome regions classified as A compartment both globally and locally are referred to as “AA”, while a region classified as global A and local B compartment is referred to as “AB”. The AA and BA regions are associated with typical euchromatin features, with higher gene density and active chromatin marks, while the AB and BB regions are associated with typical heterochromatin features, with higher TE density and repressive marks. Different genetic and epigenetic features of these compartments and the surrounding regions are shown.

[Figure: average profiles of DHS, K4me3, K27me3, H3K27ac, Gene, mRNA, CG, CHG, CHH, TE and cis_ratio, plotted from −40K to the compartment border and from the border to 40K. Surviving legend entry: Genome wide A Block A.]

Supplementary Fig 13 | A sub-group of the AB regions is associated with H3K27me3. The AB regions (defined as genome-wide A compartment and local B compartment) in all five genomes we analyzed have the highest average H3K27me3 level compared to AA, BA and BB regions (Supplementary Fig 8-12). We can further separate these AB regions into two groups based on their average H3K27me3 levels. The top 75% (indicated by the purple line) has higher DHS density and lower TE density than the bottom 25%. This suggests that some of these AB regions are TE-rich, gene-poor heterochromatin, while others are gene regions associated with the repressive H3K27me3 polycomb mark.

[Figure: average profiles of DHS, K27me3 and TE, plotted from −40K to the compartment border and from the border to 40K. Legend: Genome wide A Block B TE rich (top 25%), Genome wide A Block B TE poor (down 75%), Genome wide A Block A.]

Supplementary Fig 14 | Genetic and epigenetic features of maize domains. (a) Genetic and epigenetic features of each domain type in the region 100 kb upstream and downstream of the domain border. (b) Same enrichment analysis using only the top 10% of the domains.

[Figure: profiles of DHS, K4me3, K27me3, K27ac, Gene, mRNA, CG, CHG, CHH and TE against distance to the domain border (−100K to 100K), shown for all domains and for the "top percentile" subset. Legend: active, polycomb, intermediate, repressive.]

Supplementary Fig 15 | Genetic and epigenetic features of tomato domains. (a) Genetic and epigenetic features of each domain type in the region 100 kb upstream and downstream of the domain border. (b) Same enrichment analysis using only the top 10% of the domains.

[Figure: profiles of DHS, K4me3, K27me3, K27ac, Gene, mRNA, CG, CHG, CHH and TE against distance to the domain border (−100K to 100K), shown for all domains and for the "top percentile" subset. Legend: active, polycomb, intermediate, repressive.]

Supplementary Fig 16 | Genetic and epigenetic features of sorghum domains. (a) Genetic and epigenetic features of each domain type in the region 100 kb upstream and downstream of the domain border. (b) Same enrichment analysis using only the top 10% of the domains.

[Figure: profiles of DHS, K4me3, K27me3, Gene, mRNA, CG, CHG, CHH and TE against distance to the domain border (−100K to 100K), shown for all domains (a) and for the "top percentile" subset (b). Legend: active, polycomb, intermediate, repressive.]

Supplementary Fig 17 | Genetic and epigenetic features of foxtail millet domains. (a) Genetic and epigenetic features of each domain type in the region 100 kb upstream and downstream of the domain border. (b) Same enrichment analysis using only the top 10% of the domains.

[Figure: profiles of DHS, K4me3, K27me3, Gene, mRNA, CG, CHG, CHH and TE against distance to the domain border (−100K to 100K), shown for all domains and for the "top percentile" subset. Legend: active, polycomb, intermediate, repressive.]

Supplementary Fig 18 | Genetic and epigenetic features of the rice domains. (a) Genetic and epigenetic features of each domain type in the region 100 kb upstream and downstream of the domain border. (b) Same enrichment analysis using only the top 10% of the domains.

[Figure: profiles of DHS, K4me3, K27me3, H3K27ac, Gene, mRNA, CG, CHG, CHH and TE against distance to the domain border (−100K to 100K), shown for all domains and for the "top percentile" subset. Legend: active, polycomb, intermediate, repressive.]

Supplementary Fig 19 | Sorghum chr5 and foxtail millet chr8 H3K27me3 level and synteny relationship.

[Figure: H3K27me3 tracks (y-axis 0.0 to 0.8) for sorghum Chr5 (0M to 60M) and foxtail millet Chr8 (0M to 40M).]

Supplementary Fig 20 | Epigenetic feature enrichment analysis of the loci involved in the tomato chromatin loops.

[Figure: profiles of DHS, K27ac, Gene and TE against genomic position (−100K to 100K, two panels per feature). Legend: all (1650).]

Page 27:  · Web view2349 gene>=5 29 177 124 890 1220 50% conserved 12 43 23 88 166 1 randomized 10 43 26 94 173 80% conserved 1 3 5 5 14 0.9926 randomized 1 4 4 7 16 100% conserved 0 1 2

compartment

domain

Supplementary Fig 21 | Example of tomato loop located within domain. Unlike the TAD corner loops in the mammalian genomes, most of the plant chromatin loops are located outside domains and are formed between two gene islands that are often separated by the repressive domain or local B compartment. In some regions, we could observe chromatin loops that are not associated with gene island. The figure illustrates the 600-1000 kb region on tomato chromosome 4 that contain a loop that is located within a domain.