54
Manuscript Template Page 1 of 54 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity 1 Pengxiang Fan 1 , Peipei Wang 2 , Yann-Ru Lou 1 , Bryan J. Leong 2 , Bethany M. Moore 2 , Craig A. Schenck 1 , Rachel 2 Combs 3, Pengfei Cao 2,4 , Federica Brandizzi 2,4 , Shin-Han Shiu 2,5 , Robert L. Last 1,2* 3 1 Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824 4 2 Department of Plant Biology, Michigan State University, East Lansing, MI 48824 5 3 Division of Biological Sciences, University of Missouri, Columbia, MO 65211 6 4 MSU-DOE Plant Research Laboratory, Michigan State University, East Lansing, United States, East Lansing, MI 7 48824 8 5 Department of Computational Mathematics, Science, and Engineering, Michigan State University, East Lansing, MI 9 48824 10 Current address: Center for Applied Plant Sciences, The Ohio State University, Columbus, OH 43210 11 *Corresponding Author: [email protected] 12 Short title: Plant Metabolic Evolution 13 Abstract (<150 words) 14 Plants produce phylogenetically and spatially restricted, as well as structurally diverse specialized 15 metabolites via multistep metabolic pathways. Hallmarks of specialized metabolic evolution 16 include enzymatic promiscuity, recruitment of primary metabolic enzymes and genomic 17 clustering of pathway genes. Solanaceae plant glandular trichomes produce defensive acylsugars, 18 with aliphatic sidechains that vary in length across the family. We describe a tomato gene cluster 19 on chromosome 7 involved in medium chain acylsugar accumulation due to trichome specific 20 acyl-CoA synthetase and enoyl-CoA hydratase genes. This cluster co-localizes with a tomato 21 steroidal alkaloid gene cluster forming a ‘supercluster’, and is syntenic to a chromosome 12 22 region containing another acylsugar pathway gene. We reconstructed the evolutionary events 23 leading to emergence of this gene cluster and found that its phylogenetic distribution correlates 24 with medium chain acylsugar accumulation across the Solanaceae. This work reveals dynamics 25 behind emergence of novel enzymes from primary metabolism, gene cluster evolution and cell- 26 type specific metabolite diversity. 27 28 29 (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint this version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231 doi: bioRxiv preprint

Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 1 of 54

Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity 1

Pengxiang Fan1, Peipei Wang2, Yann-Ru Lou1, Bryan J. Leong2, Bethany M. Moore2, Craig A. Schenck1, Rachel 2

Combs3†, Pengfei Cao2,4, Federica Brandizzi2,4, Shin-Han Shiu2,5, Robert L. Last1,2* 3

1Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824 4

2Department of Plant Biology, Michigan State University, East Lansing, MI 48824 5

3Division of Biological Sciences, University of Missouri, Columbia, MO 65211 6

4MSU-DOE Plant Research Laboratory, Michigan State University, East Lansing, United States, East Lansing, MI 7

48824 8

5Department of Computational Mathematics, Science, and Engineering, Michigan State University, East Lansing, MI 9

48824 10

†Current address: Center for Applied Plant Sciences, The Ohio State University, Columbus, OH 43210 11

*Corresponding Author: [email protected] 12

Short title: Plant Metabolic Evolution 13

Abstract (<150 words) 14

Plants produce phylogenetically and spatially restricted, as well as structurally diverse specialized 15

metabolites via multistep metabolic pathways. Hallmarks of specialized metabolic evolution 16

include enzymatic promiscuity, recruitment of primary metabolic enzymes and genomic 17

clustering of pathway genes. Solanaceae plant glandular trichomes produce defensive acylsugars, 18

with aliphatic sidechains that vary in length across the family. We describe a tomato gene cluster 19

on chromosome 7 involved in medium chain acylsugar accumulation due to trichome specific 20

acyl-CoA synthetase and enoyl-CoA hydratase genes. This cluster co-localizes with a tomato 21

steroidal alkaloid gene cluster forming a ‘supercluster’, and is syntenic to a chromosome 12 22

region containing another acylsugar pathway gene. We reconstructed the evolutionary events 23

leading to emergence of this gene cluster and found that its phylogenetic distribution correlates 24

with medium chain acylsugar accumulation across the Solanaceae. This work reveals dynamics 25

behind emergence of novel enzymes from primary metabolism, gene cluster evolution and cell-26

type specific metabolite diversity. 27

28 29

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 2: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 2 of 54

30 Introduction 31

Despite the enormous structural diversity of plant specialized metabolites, they are derived from a 32

relatively small number of primary metabolites, such as sugars, amino acids, nucleotides, and 33

fatty acids (1). These lineage-, tissue- or cell- type specific specialized metabolites mediate 34

environmental interactions, such as herbivore and pathogen deterrence or pollinator and symbiont 35

attraction (2, 3). Specialized metabolism evolution is primarily driven by gene duplication (4, 5), 36

and relaxed selection of the resulting gene pairs allows modification of cell- and tissue-specific 37

gene expression and changes in enzymatic activity. This results in expanded substrate recognition 38

and/or diversified product formation (6, 7). The neofunctionalized enzymes can prime the origin 39

and diversification of specialized metabolic pathways (8–10). 40

There are many examples of mechanisms that lead to novel enzymatic activities in 41

specialized cell or tissue types, however, the principles that govern assembly of multi-enzyme 42

specialized metabolic pathways are less well established. One appealing hypothesis involves the 43

stepwise recruitment of pathway enzymes (11). Clustering of non-homologous specialized 44

metabolic enzyme genes, which participate in the same biosynthetic pathway, is an unusual 45

feature of plant genomes (12, 13). This phenomenon is seen for pathways leading to products of 46

diverse structures and biosynthetic origins including terpenes (14, 15), cyclic hydroxamic acids 47

(16), alkaloids (17, 18), polyketides (19), cyanogenic glucosides (20), and modified fatty acids 48

(21). However, while these clusters encode multiple non-homologous enzymes of a biosynthetic 49

pathway, evolution of their assembly is not well understood. 50

Acylsugars are a group of insecticidal (22) and anti-inflammatory (23) chemicals mainly 51

observed in glandular trichomes of Solanaceae species (24, 25). These specialized metabolites are 52

sugar aliphatic esters with three levels of structural diversity across the Solanaceae family: acyl 53

chain length, acylation position, and sugar core (24). The primary metabolites sucrose and 54

aliphatic acyl-CoAs are the biosynthetic precursors of acylsucroses in plants as evolutionarily 55

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 3: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 3 of 54

divergent as the cultivated tomato Solanum lycopersicum (26) (Fig. 1), Petunia axillaris (27) and 56

Salpiglossis sinuata (28). The core tomato acylsucrose biosynthetic pathway involves four BAHD 57

[BEAT, AHCT, HCBT, DAT(29)] family acylsucrose acyltransferases (Sl-ASAT1 through Sl-58

ASAT4), which are specifically expressed in the type I/IV trichome tip cells (26, 30, 31). These 59

enzymes catalyze consecutive reactions utilizing sucrose and acyl-CoA substrates to produce the 60

full set of cultivated tomato acylsucroses in vitro (26). 61

Co-option of primary metabolic enzymes contributed to the evolution of acylsugar 62

biosynthesis and led to interspecific structural diversification across Solanum tomato clade. One 63

example is an invertase-like enzyme originating from carbohydrate metabolism that generates 64

acylglucoses in the wild tomato S. pennellii through cleavage of the acylsucrose glycosidic bond 65

(32). In another case, allelic variation of a truncated isopropylmalate synthase-like enzyme 66

(IPMS3) – from branched chain amino acid metabolism – leads to acylsugar iC4/iC5 (2-67

methylpropanoic/3-methylbutanoic acid) acyl chain diversity in S. pennellii and S. lycopersicum 68

(33). Acylsugar structural diversity is even more striking across the family. Previous studies 69

revealed variation in acyl chain length (28, 34, 35): Nicotiana, Petunia and Salpiglossis species 70

were reported to accumulate acylsugars containing only short acyl chains (carbon number, C≤8). 71

In contrast, some species in Solanum and other closely related genera produce acylsugars with 72

medium acyl chains (C≥10). These results are consistent with the hypothesis that the capability to 73

produce medium chain acylsugars varies across the Solanaceae family. 74

In this study, we identify a metabolic gene cluster on tomato chromosome 7 containing 75

two non-homologous genes – acylsugar acyl-CoA synthetase (AACS) and acylsugar enoyl-CoA 76

hydratase (AECH) – affecting medium chain acylsugar biosynthesis. Genetic and biochemical 77

results show that the trichome enriched AACS and AECH are involved in generating medium 78

chain acyl-CoAs, which are donor substrates for acylsugar biosynthesis. Genomic analysis 79

revealed a syntenic region of the gene cluster on chromosome 12, where the acylsucrose 80

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 4: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 4 of 54

biosynthetic Sl-ASAT1 is located (26). Phylogenetic analysis of the syntenic regions in Solanaceae 81

and beyond led to evolutionary reconstruction of the origin of the acylsugar gene cluster. We infer 82

that sequential gene insertion facilitated emergence of this gene cluster, while gene duplication 83

and pseudogenization rearranged and fragmented it. These results provide insights into how co-84

option of primary metabolic enzymes, emergence of cell-type specific gene expression, and the 85

formation of metabolic gene clusters play a role in specialized metabolism evolution. 86

Results 87

Identification of a metabolic gene cluster that affects tomato trichome medium chain 88

acylsugar biosynthesis 89

S. pennellii natural accessions (36), as well as the S. lycopersicum M82 × S. pennellii LA0716 90

chromosomal substitution introgression lines (ILs) (37), offer convenient resources to investigate 91

interspecific genetic variation that affects acylsugar metabolic diversity (36, 38). In a rescreen of 92

ILs for S. pennellii genetic regions that alter trichome acylsugar profiles (38), IL7-4 was found to 93

accumulate increased C10 medium chain containing acylsugars compared with M82 (Fig. 2, A 94

and B). The genetic locus that contributes to the acylsugar phenotype was narrowed down to a 95

685 kb region through screening selected backcross inbred lines (BILs) (39) that have 96

recombination breakpoints on chromosome 7 (Fig. 2C). Because tomato acylsucrose biosynthesis 97

occurs in trichomes, candidate genes in this region were filtered based on their trichome-specific 98

expression patterns. This analysis identified a locus containing multiple tandem duplicated genes 99

of three types – acyl-CoA synthetases (ACS), enoyl-CoA hydratases (ECH), and BAHD family 100

acyltransferases. Three genes with expression enriched in tomato trichomes were selected for 101

further analysis (Fig. 2D). 102

The three candidate genes were tested for involvement in tomato acylsugar biosynthesis 103

by making loss of function mutations using the CRISPR-Cas9 gene editing system. Two guide 104

RNAs (gRNAs) were designed to target one or two exons of each gene to assist site-specific DNA 105

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 5: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 5 of 54

cleavage by hCas (40) (fig. S1, A-C). In the self-crossed T1 progeny of stably transformed M82 106

plants, at least two homozygous mutants were obtained in Solyc07g043630, Solyc07g043660, and 107

Solyc07g043680 (fig. S1, A-C), and these were analyzed for leaf trichome acylsugar changes. 108

Altered acylsugar profiles were observed in the ACS-annotated Solyc07g043630 or ECH-109

annotated Solyc07g043680 mutants (Fig. 3, A and B), but not in the ACS-annotated 110

Solyc07g043660 mutant (fig. S1D). Despite carrying mutations in distinctly annotated genes 111

(ACS or ECH), the two mutants exhibited the same phenotype – no detectable medium acyl chain 112

(C10 or C12) containing acylsugars (Fig. 3, A and B). We renamed Solyc07g043630 as acylsugar 113

acyl-CoA synthetase 1 (Sl-AACS1) and Solyc07g043680 as acylsugar enoyl-CoA hydratase 1 (Sl-114

AECH1) based on this analysis. 115

Further genomic analysis revealed that Sl-AACS1 and Sl-AECH1 belong to a syntenic region 116

shared with a locus on chromosome 12, where Sl-ASAT1 is located (Fig. 3C and fig. S2). Sl-117

ASAT1 is specifically expressed in trichome tip cells and encodes the enzyme catalyzing the first 118

step of tomato acylsucrose biosynthesis (26). This led us to test the cell-type expression pattern of 119

Sl-AACS1 and Sl-AECH1. Like Sl-ASAT1, the promoters of both genes drove GFP expression in 120

the trichome tip cells of stably transformed M82 plants (Fig. 3C). This supports our hypothesis 121

that Sl-AACS1 and Sl-AECH1 are involved in tomato trichome acylsugar biosynthesis. Taken 122

together, we identified a metabolic gene cluster involved in medium chain acylsugar biosynthesis, 123

which is composed of two cell-type specific genes. 124

In vitro analysis of Sl-AACS1 and Sl-AECH1 implicates their roles in medium chain acyl-125

CoA metabolism 126

ACS and ECH are established to function in multiple cell compartments for the metabolism of 127

acyl-CoA(41), the acyl donor substrates for ASAT enzymes. We sought to understand the 128

organelle targeting of Sl-AACS1 and Sl-AECH1, to advance our knowledge of acylsugar 129

machinery at the subcellular level. We constructed expression cassettes of Sl-AACS1, Sl-AECH1 130

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 6: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 6 of 54

and Solyc07g043660 with C-terminal cyan fluorescent protein (CFP), hypothesizing that the 131

targeting peptides reside at the N-terminus of precursor proteins. When co-expressed in tobacco 132

leaf epidermal cells, three CFP-tagged recombinant proteins co-localized with the mitochondrial 133

marker MT-RFP (42) (Fig. 4A and fig. S3A). To rule out the possibility of peroxisomal 134

localization, we fused Sl-AACS1, Sl-AECH1, or Solyc07g043660 with N-terminus fused yellow 135

fluorescent protein (YFP), considering that potential peroxisomal targeting peptides are usually 136

located on the C-terminus (43). The expressed YFP-recombinant proteins were not co-localized 137

with the peroxisomal marker RFP-PTS (42) (fig. S3B). Instead, they appeared distributed in the 138

cytosol (fig. S3B), presumably because the N-terminal YFP blocked the mitochondria targeting 139

signal. Taken together, protein expression and co-localization analyses suggest that Sl-AACS1, 140

Sl-AECH1, and Solyc07g043660 encode enzymes targeted to mitochondria. 141

Sl-AACS1 belongs to a group of enzymes that activate diverse carboxylic acid substrates to 142

produce acyl-CoAs. We hypothesized that Sl-AACS1 uses medium chain fatty acids as substrates, 143

because ablation of Sl-AACS1 eliminated acylsugars with medium acyl chains. To characterize the 144

in vitro activity of Sl-AACS1, we purified recombinant His-tagged proteins from Escherichia 145

coli. Enzyme assays were performed by supplying fatty acid substrates with even carbon numbers 146

from C2 through C18 (Fig. 4B). The results showed that Sl-AACS1 utilized fatty acid substrates 147

with lengths ranging from C6 to C12, including those with a terminal branched carbon (iC10:0) or 148

an unsaturated bond (trans-2-decenoic acid, C10:1) (Fig. 4, B and C). However, no activity was 149

observed with the 3-hydroxylated C12 and C14 fatty acids as substrates (Fig. 4B). These results 150

support our hypothesis that Sl-AACS1 produces medium chain acyl-CoAs, which are in vivo 151

substrates for acylsugar biosynthesis. 152

To test whether Sl-AACS1 and Sl-AECH1 can produce medium chain acyl-CoAs in planta, 153

we transiently expressed these genes in Nicotiana benthamiana leaves using Agrobacterium-154

mediated infiltration (44). It is challenging to directly measure plant acyl-CoAs, due to their low 155

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 7: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 7 of 54

concentration and separate organellar pools. We used an alternative approach and characterized 156

membrane lipids, which are produced from acyl-CoA intermediates. We took advantage of the 157

observation that N. benthamiana membrane lipids do not accumulate detectable acyl chains of 12 158

carbons or shorter. N. benthamiana leaves were infiltrated with constructs containing Sl-AACS1 159

or Sl-AECH1 individually, or together (Fig. 4D). In contrast to the empty vector control, 160

infiltration of Sl-AECH1 led to detectable levels of C12 acyl chains in the leaf membrane lipid 161

phosphatidylcholine (PC) (Fig. 4D). We also observed increased C14 acyl chains in PC, 162

phosphatidylglycerol (PG), sulfoquinovosyl diacylglycerol (SQDG), and 163

digalactosyldiacylglycerol (DGDG) in Sl-AECH1 infiltrated plants (Fig. 4D and fig. S3C). These 164

results suggest that Sl-AECH1 participates in generation of medium chain acyl-CoAs in planta, 165

which are channeled into lipid biosynthesis. No medium chain acylsugars were detected, 166

presumably due to the lack of core acylsugar biosynthetic machinery in N. benthamiana 167

mesophyll cells. 168

We asked whether the closest known homologs of Sl-AECH1 from Solanum species can 169

generate medium chain lipids when transiently expressed in N. benthamiana. Two SQDGs with 170

C12 chains were monitored by LC/MS as peaks diagnostic of lipids containing medium chain 171

fatty acids (fig. S4, A and B). The results showed that only the putative Sl-AECH1 orthologs 172

Sopen07g023250 (Sp-AECH1) and Sq_c37194 (Sq-AECH1) – from S. pennellii and S. quitoense 173

respectively – generated medium chain lipids in the infiltrated leaves (fig. S4C). This confirms 174

that not all ECHs can produce medium chain lipids and suggests that the function of Sl-AECH1 175

evolved recently, presumably as a result of neofunctionalization after gene duplication (fig. S4C). 176

AACS1 and AECH1 are evolutionarily conserved in the Solanum 177

Medium chain acylsugars were documented in Solanum species besides cultivated tomato, 178

including S. pennellii (32), S. nigrum (28), as well as the more distantly related S. quitoense (45) 179

and S. lanceolatum (23). We hypothesized that evolution of AACS1 and AECH1 contributed to 180

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 8: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 8 of 54

medium chain acylsugar biosynthesis in Solanum. As a test, we analyzed the genomes of Solanum 181

species other than cultivated tomato. Indeed, the acylsugar gene cluster containing ACS and ECH 182

was found in both S. pennellii and S. melongena (eggplant), suggesting that the cluster assembly 183

evolved before divergence of the tomato and eggplant lineage (Fig. 5A). 184

We applied gene expression and genetic approaches to test the in vivo functions of ACS and 185

ECH in selected Solanum species. To explore the expression pattern of S. pennellii ACS and ECH 186

cluster genes, we performed RNA-seq analysis on trichomes and shaved stems to identify 187

acylsugar biosynthetic candidates (table S1). The expression pattern of S. pennellii cluster genes 188

is strikingly similar to S. lycopersicum: one ECH and two ACS genes are highly enriched in 189

trichomes, including the orthologs of Sl-AACS1 and Sl-AECH1. Sp-AACS1 function 190

(Sopen07g023200) was first tested by asking whether it can reverse the cultivated tomato sl-aacs1 191

mutant acylsugar phenotype. Indeed, Sp-AACS1 restored C12 containing acylsugars in the stably 192

transformed sl-aacs1 plants (fig. S5). To directly test Sp-AACS1 and Sp-AECH1 function, we 193

used CRISPR-Cas9 to make single mutants in S. pennellii LA0716. No medium chain acylsugars 194

were detected in T0 generation mutants with edits for each gene (Fig. 5B and fig. S6, A and C). 195

Similar to the ACS-annotated Solyc07g043660 cultivated tomato mutant (fig. S1D), deletion of S. 196

pennellii ortholog Sopen07g023220 has no observed effects on S. pennellii trichome acylsugars 197

(fig. S6D). 198

The medium chain acylsugar producer S. quitoense (45) was used for AACS1 and AECH1 199

functional analysis because of its phylogenetic distance from the tomato clade, it is in the 200

Solanum Leptostemonum clade (including eggplant), and the fact that it produces medium chain 201

acylsugars. We found trichome-enriched putative orthologs of AACS1 and AECH1 in the 202

transcriptome dataset of S. quitoense (28), and tested their in vivo function through virus-induced 203

gene silencing (VIGS) (fig. S6E). Silencing either gene led to decreased total acylsugars (Fig. 5C 204

and fig. S6F), which correlated with the degree of expression reduction in each sample (Fig. 5D). 205

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 9: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 9 of 54

These results are consistent with the hypothesis that Sq-AACS1 and Sq-AECH1 are involved in 206

medium chain acylsugar biosynthesis, because all acylsugars in S. quitoense carry two medium 207

chains (45). The importance of AACS1 and AECH1 in medium chain acylsugar biosynthesis in 208

distinct Solanum clades inspired us to explore the evolutionary origins of the gene cluster. 209

Evolution of the gene cluster correlates with the distribution of medium chain acylsugars 210

across Solanaceae 211

We sought to understand how the acylsugar gene cluster evolved and whether it correlates with 212

the distribution of medium chain acylsugars across the Solanaceae family. Taking advantage of 213

the available genome sequences of 13 species from Solanaceae and sister families, we analyzed 214

the regions that are syntenic with the tomato acylsugar gene cluster (fig. S7). This synteny was 215

found in all these plants, including the most distantly related species analyzed, Coffea canephora 216

(Rubiaceae) (fig. S7). BAHD acyltransferase genes were the only cluster genes observed in the 217

syntenic regions both inside and outside the Solanaceae, in contrast to ECH and ACS, which are 218

restricted to the family (Fig. 6A and fig. S7). Within the syntenic regions of the species analyzed, 219

ECH homologs, including pseudogenes, are present in all Solanaceae except for Capsicum 220

species, while ACS is more phylogenetically restricted, being found only in Nicotiana and 221

Solanum (Fig. 6A and fig. S7). 222

Based on the species phylogeny, we reconstructed the evolutionary history of the 223

acylsugar gene cluster in Solanaceae according to the presence or absence of these three genes in 224

the syntenic block (Fig. 6B). We propose two sequential gene insertion events that took place 225

during the evolution of Solanaceae followed by gene duplications and gene losses leading to the 226

cluster structures in extant plants (Fig. 6B). In this model, the BAHD acyltransferase was present 227

in the most recent common ancestor of Solanaceae, Convolvulaceae, and Rubiaceae, and was 228

likely lost in Ipomoea. An ancestral ECH gene insertion into the syntenic region prior to the 229

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 10: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 10 of 54

divergence of Petunia and its sister clade was the next event. This was followed by the insertion 230

of an ancestral ACS gene into the cluster in the most recent common ancestor of Nicotiana, 231

Capsicum, and Solanum (Fig. 6B). The entire cluster was lost in Capsicum (Fig. 6B). 232

We propose that the emergence of both ACS and ECH genes in the synteny was a 233

prerequisite for the rise of medium chain acylsugars in Solanaceae species (Fig. 6B). Consistent 234

with the hypothesis, only short chain acylsugars were observed in Petunia (35) (fig. S8), which 235

correlates with the absence of ACS homolog (Fig. 6B). In contrast, medium chain acylsugars 236

were detected throughout Solanum (fig. S8), supported by the observation that both ACS and 237

ECH homologs are present in extant Solanum species (Fig. 6B). Interestingly, although Nicotiana 238

species collectively have both ACS and ECH genes (Fig. 6B), they do not produce medium chain 239

acylsugars (fig. S8) due to differential gene losses in different species. For example, the ECH 240

homologs are pseudogenes in N. benthamiana and N. tomentosiformis (Fig. 6A), and the ECH 241

gene in N. attenuata, is not located within the same syntenic region with ACS gene (fig. S7), and 242

has very low expression in all the tissues tested (46), including in trichome-rich tissues such as 243

stems and pedicels (table S2). These results show that the presence of both functional ACS and 244

ECH genes are associated with the accumulation of medium chain acylsugars, supporting our 245

hypothesis above. 246

To further support our hypothesis, we extended our phenotypic analysis to six additional 247

Solanaceae genera (fig. S8). According to our hypothesis, the presence of ACS and ECH genes 248

should correlate with accumulation of medium chain acylsugars. Based on our phylogenetic and 249

synteny analysis, the cluster with both ACS and ECH should only be present in the lineage after 250

its divergence from the Petunia lineage. Consistent with our hypothesis, species such as 251

Jaltomata, Physalis, Iochroma, Atropa, and Hyoscyamus, which are more closely related to 252

Solanum species than they are to Petunia, have medium chain acylsugars (fig. S8). In contrast, 253

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 11: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 11 of 54

Salpiglossis sinuata, diverged before the lineage leading to Petunia and Solanum, produces no 254

detectable medium chain acylsugars (28) (fig. S8). 255

Discussion 256

This study identified a S. lycopersicum chromosome 7 cluster of ACS, ECH, and BAHD 257

acyltransferase genes including two involved in medium chain acylsugar biosynthesis. The 258

discovery of this locus was prompted by our observation of increased C10 containing acylsugars 259

in tomato recombinant lines carrying this region from the wild tomato S. pennellii chromosome 7. 260

In vitro biochemistry revealed that Sl-AACS1 produces acyl-CoAs using C6-C12 fatty acids as 261

substrates. The function of AACS1 and AECH1 in cultivated and wild tomato medium chain 262

acylsugar biosynthesis was confirmed by genome editing, and extended to the phylogenetically 263

distant S. quitoense using VIGS. The trichome tip cell-specific expression of these genes is 264

similar to that of previously characterized acylsugar pathway genes (24). 265

There are increasing examples of plant specialized metabolic innovation evolving from gene 266

duplication and neofunctionalization of primary metabolic enzymes (1, 4, 47). Recruitment of Sl-267

AACS1 and Sl-AECH1 from fatty acid metabolism provides new examples of ‘hijacking’ primary 268

metabolic genes into acylsugar biosynthesis, in addition to an isopropylmalate synthase (Sl-269

IPMS3) and an invertase (Sp-ASFF1) (32, 33). We hypothesize that both AACS1 and AECH1 270

participate in generation of medium chain acyl-CoAs, the acyl donor substrates for ASAT-271

catalyzed acylsugar biosynthesis. Indeed, Sl-AACS1 exhibits in vitro function consistent with this 272

hypothesis, efficiently utilizing medium chain fatty acids as substrates to synthesize acyl-CoAs. 273

Strikingly, Sl-AECH1 perturbs membrane lipid composition when transiently expressed in N. 274

benthamiana leaves, generating unusual C12-chain membrane lipids. 275

These results suggest that evolution of trichome tip cell-specific gene expression potentiated 276

the co-option of AACS1 and AECH1 in medium chain acylsugar biosynthesis. Analogous to 277

trichomes producing medium chain acylsugars, seeds of phylogenetically diverse plants 278

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 12: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 12 of 54

accumulate medium chain fatty acid storage lipids (48). In contrast, fatty acids with unusual 279

structures, including those of medium chain lengths, are rarely found in membrane lipids, 280

presumably because these would perturb membrane bilayer structure and function (49). For 281

example, seed embryo-specific expression of three neofunctionalized enzyme variant genes in 282

Cuphea species – an acyl-ACP thioesterase (50), a 3-ketoacyl-ACP synthase (51), and a 283

diacylglycerol acyltransferase (52) – lead to production of medium chain seed storage lipids (53). 284

Trichome tip cell restricted expression of AACS1 and AECH1 represents an analogous example of 285

diverting neofunctionalized fatty acid enzymes from general metabolism into cell-specific 286

specialized metabolism. It is notable that we obtained evidence that Sl-AACS1 and Sl-AECH1 287

are targeted to mitochondria. Because the other characterized acylsugar biosynthetic enzymes – 288

ASATs and Sl-IPMS3 – appear to be cytoplasmic, these results suggest that medium chain 289

acylsugar substrates are intracellularly transported within the trichome tip cell. 290

Beyond employing functional approaches, this study demonstrates the value of a combined 291

comparative genomic and metabolomic analysis in reconstructing the evolutionary history of a 292

gene cluster: in this case over tens of millions of years. We propose that the acylsugar gene 293

cluster started with a ‘founder’ BAHD acyltransferase gene, followed by sequential insertion of 294

ECH and ACS (Fig. 6B). This de novo assembly process is analogous to evolution of the 295

antimicrobial triterpenoid avenacin cluster in oat (14, 54). There are two noteworthy features of 296

our approach. First, reconstructing acylsugar gene cluster evolution in a phylogenetic context 297

allows us to deduce cluster composition in extant species (Fig. 6B). Second, it links cluster 298

genotype with acylsugar phenotype and promotes prediction of acylsugar diversity across the 299

Solanaceae (Fig. 6 and fig. S8): absence of either ECH or ACS in the synteny correlates with 300

deficiency of medium chain acylsugars in the plant. The short chain acylsugar producer 301

Salpiglossis, which diverged prior to the common ancestor of Petunia, validates the predictive 302

power of this model. 303

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 13: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 13 of 54

The current architecture of the acylsugar cluster merely represents a snapshot of a dynamic 304

genomic region. De novo assembly of the gene cluster was accompanied by gene duplication, 305

transposition, pseudogenization, and deletion in different genera. In the extreme case of 306

Capsicum, the cluster genes were deleted entirely (Fig. 6), which is consistent with lack of 307

detectable trichome acylsugars in Capsicum plants. In Nicotiana, the ACH genes became 308

pseudogenized (Fig. 6B) or poorly expressed (table S2), which is associated with lack of 309

detectable plant medium chain acylsugars (fig. S8). In tomatoes, the trichome expressed 310

Solyc07g043660 is a tandem duplicate of AACS1 (Fig. 2D), yet its deletion has no effect on 311

trichome acylsugars (fig. S1D). A parsimonious explanation is that Solyc07g043660 are 312

experiencing functional divergence, which may eventually lead to pseudogenization as observed 313

for other genes in the syntenic region. 314

The acylsugar gene cluster on chromosome 7 is syntenic to a chromosome 12 region 315

containing Sl-ASAT1 (fig. S2). This resembles the tomato steroidal alkaloid gene cluster 316

consisting of eight genes that are dispersed into two syntenic chromosome regions (17). In fact, 317

the tomato alkaloid cluster genes are directly adjacent to the acylsugar cluster, with six alkaloid 318

genes on chromosome 7 and two on chromosome 12 (fig. S2). To our knowledge, this is the first 319

discovery of two plant specialized metabolic gene clusters colocalizing in a genomic region, thus 320

forming a ‘supercluster’. Tomato steroidal alkaloids and acylsugars both serve defensive roles in 321

plants but are structurally distinct and stored in different tissues. This raises intriguing questions: 322

how common are ‘superclusters’ encoding enzymes of distinct biosynthetic pathways in plant 323

genomes; does this colocalization confer selective advantage through additive or synergistic 324

effects of multiple classes of defensive metabolites? Answering these questions requires 325

continued mining of metabolic gene clusters across broader plant species and analyzing the 326

impacts of clustering in evolutionary and ecological contexts. 327

Materials and methods 328

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 14: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 14 of 54

Plant materials and trichome metabolite extraction 329

The seeds of cultivated tomato Solanum lycopersicum M82 were obtained from the C.M. Rick 330

Tomato Genetic Resource Center (http://tgrc/ucdvis.edu). Tomato introgression lines (ILs) and 331

tomato backcross inbred lines (BILs) were from Dr. Dani Zamir (Hebrew University of 332

Jerusalem). The tomato seeds were treated with ½ strength bleach for 30 minutes and washed 333

with de-ionized water three or more times before placing on wet filter paper in a Petri dish. After 334

germination, the seedlings were transferred to peat-based propagation mix (Sun Gro Horticulture) 335

and transferred to a growth chamber for two or three weeks under 16 h photoperiod, 28 °C day 336

and 22 °C night temperatures, 50% relative humidity, and 300 μmol m-2 s-1 photosynthetic photon 337

flux density. The youngest fully developed leaf was submerged in 1 mL extraction solution in a 338

1.5 mL screw cap tube and agitated gently for 2 min. The extraction solution contains 339

acetonitrile/isopropanol/water (3:3:2) with 0.1% formic acid and 10 μM propyl-4-340

hydroxybenzoate as internal standard. The interactive protocol of acylsugar extraction is available 341

in Protocols.io at http://dx.doi.org/10.17504/protocols.io.xj2fkqe. 342

DNA construct assembly and tomato transformation 343

Assembly of the CRISPR-Cas9 constructs was as described (32, 40). Two guide RNAs (gRNAs) 344

were designed targeting one or two exons of each gene to be knocked out by the CRISPR-Cas9 345

system. The gRNAs were synthesized gene (gBlocks) by IDT (Integrated DNA Technologies, 346

location) (table S3). For each CRISPR construct, two gBlocks and four plasmids from Addgene, 347

pICH47742::2x35S-5′UTR-hCas9 (STOP)-NOST (Addgene no. 49771), pICH41780 (Addgene 348

no. 48019), pAGM4723 (Addgene no. 48015), pICSL11024 (Addgene no. 51144), were mixed for 349

DNA assembly using the Golden Gate assembly kit (NEB). 350

For in planta tissue specific reporter gene analysis, 1.8 kb upstream of the annotated 351

translational start site of Sl-AACS1 and Sl-AECH were amplified using the primer pairs 352

SlAACS1-pro_F/R and SlECH1-pro_F/R (table S3). The amplicon was inserted into the entry 353

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 15: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 15 of 54

vector pENTR/D-TOPO, followed by cloning into the GATEWAY vector pKGWFS7. For 354

ectopically expressing Sp-AACS1 in the cultivated tomato CRISPR mutant sl-aacs1, Sp-AACS1 355

gene including 1.8 kb upstream of the translational start site of Sp-AACS1 was amplified using the 356

primer pair SpAACS1-pro-gene_F/R (table S3). The amplicon was inserted into the entry vector 357

pENTR/D-TOPO, followed by cloning into the GATEWAY vector pK7WG. 358

The plant transformation of S. lycopersicum M82 and S. pennellii LA0716 was performed 359

using the Agrobacterium tumefaciens strain AGL0 following published protocols (32, 55). The 360

primers used for genotyping the S. lycopersicum M82 transgenic plants harboring pK7WG or 361

pKGWFS7 construct are listed in table S3. For genotyping the S. lycopersicum M82 CRISPR 362

mutants in the T1 generation, the sequencing primers listed in table S3 were used to amplify the 363

genomic regions harboring the gRNAs and the resultant PCR products were sent for Sanger 364

sequencing. For genotyping the S. pennellii LA0716 CRISPR mutants in the T0 generation, the 365

sequencing primers listed in table S3 were used to amplify the genomic regions harboring the 366

gRNAs. The resulting PCR products were cloned into the pGEM-T easy vector (Promega) and 367

transformed into E. coli. Plasmids from at least six individual E. coli colonies containing the 368

amplified products were extracted and verified by Sanger sequencing. 369

Protein subcellular targeting in tobacco mesophyll cells 370

For protein subcellular targeting analysis, the open reading frame (ORF) of Sl-AACS1, Sl-AECH, 371

and Solyc07g043660 were amplified using the primers listed in table S3. These amplicons were 372

inserted into pENTR/D-TOPO respectively, followed by subcloning into the GATEWAY vectors 373

pEarleyGate102 (no. CD3-684) and pEarleyGate104 (no. CD3-686), which were obtained from 374

Arabidopsis Biological Resource Center (ABRC). For the pEarleyGate102 constructs, the CFP 375

was fused to the C-terminal of the tested proteins. For the pEarleyGate104 constructs, the YFP 376

was fused to the protein N-terminus. Transient expressing the tested proteins was performed 377

following an established protocol (56) with minor modifications. In brief, cultures of A. 378

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 16: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 16 of 54

tumefaciens (strain GV3101) harboring the expression vectors were washed and resuspended with 379

the infiltration buffer (20 mM acetosyringone, 50 mM MES pH 5.7, 0.5% glucose [w/v] and 2 380

mM Na3PO4) to reach OD600nm = 0.05. Four-week-old tobacco (Nicotiana tabacum cv. Petit 381

Havana) plants grew in 21� and 8 h short-day conditions were infiltrated, and then maintained in 382

the same growth condition for three days before being sampled for imaging. The GV3101 cultures 383

containing the mitochondria marker MT-RFP (42) were co-infiltrated to provide the control 384

signals for mitochondrial targeting. In separate experiments, the GV3101 cultures containing the 385

peroxisome marker RFP-PTS (42) were co-infiltrated to provide the control signals for 386

peroxisomal targeting. 387

Confocal Microscopy 388

A Nikon A1Rsi laser scanning confocal microscope and Nikon NIS-Elements Advanced Research 389

software were used for image acquisition and handling. For visualizing GFP fluorescence in 390

trichomes of the tomato transformants, the excitation wavelength at 488 nm and a 505- to 525-nm 391

emission filter were used for the acquisition. For visualizing signals of fluorescence proteins in 392

the tobacco mesophyll cells, CFP, YFP and RFP, respectively, were detected by excitation lasers 393

of 443 nm, 513 nm, 561 nm and emission filters of 467-502 nm, 521-554 nm, 580-630 nm. 394

N. benthamiana transient gene expression and membrane lipid analysis 395

For N. benthamiana transient expression of Sl-AACS1, Sl-AECH1, and homologs of AECH1, the 396

ORFs of these genes were amplified using primers listed in table S3, followed by subcloning into 397

pEAQ-HT vector using the Gibson assembly kit (NEB). Linearization of pEAQ-HT vector was 398

performed by XhoI and AgeI restriction enzyme double digestion. A. tumefaciens (strain 399

GV3101) harboring the pEAQ-HT constructs were grown in LB medium containing 50 µg/mL 400

kanamycin, 30 µg/mL gentamicin, and 50 µg/ml rifampicin at 30 �. The cells were collected by 401

centrifugation at 5000 g for 5 min and washed once with the resuspension buffer (10 mM MES 402

buffer pH 5.6, 10 mM MgCl2, 150 µM acetosyringone). The cell pellet was resuspended in the 403

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 17: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 17 of 54

resuspension buffer to reach OD600nm = 0.5 for each strain and was incubated at room temperature 404

for 1 h prior to infiltration. Leaves of 4 to 5-week-old N. benthamiana grown under 16 h 405

photoperiod were used for infiltration. Five days post infiltration, the infiltrated leaves were 406

harvested, ground in liquid nitrogen, and stored at -80 � for later analysis. 407

The membrane lipid analysis was performed as previously described (57). In brief, the N. 408

benthamiana leaf polar lipids were extracted in the organic solvent containing methanol, 409

chloroform, and formic acid (20:10:1, v/v/v), separated by thin layer chromatography (TLC), 410

converted to fatty acyl methylesters (FAMEs), and analyzed by gas-liquid chromatography (GLC) 411

coupled with flame ionization. The TLC plates (TLC Silica gel 60, EMD Chemical) were 412

activated by ammonium sulfate before being used for lipid separation. Iodine vapor was applied 413

to TLC plates after lipid separation for brief reversible staining. Different lipid groups on the TLC 414

plates were marked with a pencil and were scraped for analysis. For LC/MS analysis, lipids were 415

extracted using the buffer containing acetonitrile/isopropanol/water (3:3:2) with 0.1% formic acid 416

and 10 μM propyl-4-hydroxybenzoate as the internal standard.

417

Protein expression and ACS enzyme assay 418

To express His-tagged recombinant protein Sl-AACS1, the full-length Sl-AACS ORF sequence 419

was amplified using the primer pair SlAACS1-pET28_F/R (table S3) and was cloned into 420

pET28b (EMD Millipore) using the Gibson assembly kit (NEB). The pET28b vector was 421

linearized by digesting with BamHI and XhoI to create overhangs compatible for Gibson 422

assembly. The pET28b constructs were transformed into BL21 Rosetta cells (EMD Millipore). 423

The protein expression was induced by adding 0.05 mM isopropyl β-D-1-thiogalactopyranoside 424

to the cultures when the OD600nm = 0.5. The E. coli cultures were further grown overnight at 16 �, 425

120 rpm. The His-tagged proteins were purified by Ni-affinity gravity-flow chromatography 426

using the Ni-NTA agarose (Qiagen) following the product manual. 427

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 18: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 18 of 54

Measurement of acyl-CoA synthetase activity was performed using minor modifications 428

of the coupled enzyme assay described by Schneider et al. (58). A multimode plate reader 429

(PerkinElmer, mode EnVision 2104) compatible with the 96-well UV microplate was used for the 430

assays. The fatty acid substrates were dissolved in 5% Triton X-100 (v/v) to make 5 mM stock 431

solutions. The enzyme assay premix was prepared containing 0.1 M Tris-HCl (pH 7.5), 2 mM 432

dithiothreitol, 5 mM ATP, 10 mM MgCl2, 0.5 mM CoA, 0.8 mM NADH, 250 µM fatty acid 433

substrate, 1 mM phosphoenolpyruvate, 20 units myokinase (Sigma-Aldrich, catalog no. M3003), 434

10 units pyruvate kinase (Roche, 10128155001), 10 units lactate dehydrogenase (Roche, 435

10127230001), and was aliquoted 95 µL each to the 96-well microplate. The reaction was started 436

by adding 5 µL (1-2 µg) proteins. The chamber of the plate reader was set to 30 � and the OD at 437

340 nm was recorded every 5 min for 40 min. Oxidation of NADH, which is monitored by the 438

decrease of OD340nm, was calculated using the NADH extinction coefficient 6.22 cm2 µmol-1. 439

Every two moles of oxidized NADH is equivalent to one mole of acyl-CoA product generated in 440

the reaction. To measure the parameters of enzyme kinetics, the fatty acid substrate concentration 441

was varied from 0 to 500 µM, with NADH set at 1 mM. The fatty acid substrates, sodium acetate 442

(C2:0), sodium butyrate (C4:0), sodium hexanoate (C6:0), sodium octanoate (C8:0), sodium 443

decanoate (C10:0), sodium laurate (C12:0), sodium myristate (C14:0), sodium palmitate (C16:0), 444

and sodium stearate (C18:0), were purchased from Sigma-Aldrich. Trans-2-decenoic acid 445

(C10:1), 8-methylnonanoic acid (iC10:0), 3-hydroxy lauric acid (C12:OH), and 3-hydroxy 446

myristic acid (C14:OH) were purchased from Cayman Chemical. 447

RNA extraction, sequencing, and differential gene expression analysis 448

Total RNA was extracted from trichomes isolated from stems and shaved stems of 7-week-old S. 449

pennellii LA0716 plants using the RNAeasy Plant Mini kit (Qiagen) and digested with DNase I. 450

A total of four RNA samples extracted from two tissues with two replicates were used for RNA 451

sequencing. The sequencing libraries were prepared using the KAPA Stranded RNA-Seq Library 452

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 19: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 19 of 54

Preparation Kit. Libraries went through quality control and quantification using a combination of 453

Qubit dsDNA high sensitivity (HS), Applied Analytical Fragment Analyzer HS DNA and Kapa 454

Illumina Library Quantification qPCR assays. The libraries were pooled and loaded onto one lane 455

of an Illumina HiSeq 4000 flow cell. Sequencing was done in a 2x150bp paired end format using 456

HiSeq 4000 SBS reagents. Base calling was done by Illumina Real Time Analysis (RTA) v2.7.6 457

and output of RTA was demultiplexed and converted to FastQ format with Illumina Bcl2fastq 458

v2.19.1. 459

The paired end reads were filtered and trimmed using Trimmomatic v0.32 (59) with the 460

setting (LLUMINACLIP: TruSeq3-PE.fa:2:30:10 LEADING:3 TRAILING:3 461

SLIDINGWINDOW:4:30), and then mapped to the S. pennellii LA0716 genome v2.0 (60) using 462

TopHat v1.4 (61) with the following parameters: -p (threads) 8, -i (minimum intron length) 50, -I 463

(maximum intron length) 5000, and -g (maximum hits) 20. The FPKM (Fragments Per Kilobase 464

of transcript per Million mapped reads) values for the genes were analyzed via Cufflinks v2.2 465

(62).For differential expression analysis, the HTseq package (63) in Python was used to get raw 466

read counts, then Edge R version 3.22.5 (64) was used to compare read counts between trichome-467

only RNA and shaved stem RNA using a generalized linear model (glmQLFit). 468

VIGS and qRT-PCR 469

For VIGS analysis of Sq-AACS1 and Sq-AECH1 in S. quitoense, the fragments of these two 470

genes, as well as the phytoene desaturase (PDS) gene fragment, were amplified using the primers 471

listed in table S3, cloned into pTRV2-LIC (ABRC no. CD3-1044) using the ligation-independent 472

cloning method (65), and transformed into A. tumefaciens (strain GV3101). The VIGS 473

experiments were performed as described (66). In brief, the Agrobacterium strains harboring 474

pTRV2 constructs, the empty pTRV2, or pTRV1 were grown overnight in separate LB cultures 475

containing 50 µg/mL kanamycin, 10 µg/mL gentamicin, and 50 µg/ml rifampicin at 30 �. The 476

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 20: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 20 of 54

cultures were re-inoculated in the induction media (50 mM MES pH5.6, 0.5% glucose [w/v], 2 477

mM NaH2PO4, 200 µM acetosyringone) for overnight growth. The cells were harvested, washed, 478

and resuspended in the buffer containing 10 mM MES, pH 5.6, 10 mM MgCl2, and 200 µM 479

acetosyringone with the OD600nm = 1. Different cultures containing pTRV2 constructs were mixed 480

with equal volume of pTRV1 cultures prior to infiltration. The 2- to 3-week-old young S. 481

quitoense seedlings grown under 16 h photoperiod at 24 °C were used for infiltration: the two 482

fully expanded cotyledons were infiltrated. Approximately three weeks post inoculation, the 483

fourth true leaf of each infiltrated plant was cut in half for acylsugar quantification and gene 484

expression analysis, respectively. The onset of the albino phenotype of the control group 485

infiltrated with the PDS construct was used as a visual marker to determine the harvest time and 486

leaf selection for the experimental groups. At least fourteen plants were analyzed for each 487

construct. The trichome acylsugars were extracted using the solution containing 488

acetonitrile/isopropanol/water (3:3:2) with 0.1% formic acid and 1 μM telmisartan as internal 489

standard, following the protocol at http://dx.doi.org/10.17504/protocols.io.xj2fkqe. 490

The leaf RNA was extracted using RNeasy Plant Mini kits (Qiagen) and digested with 491

DNase I. The first-strand cDNA was synthesized by Superscript II (Thermofisher Scientific) 492

using total RNA as templates. Quantitative real-time PCR was performed to analyze the Sq-493

AACS1 or Sq-AECH1 mRNA in S. quitoense leaves using the primers listed in table S3. EF1α was 494

used as a control gene. A Quantstudio 7 Flex Real-Time PCR System with Fast SYBR Green 495

Master Mix (Applied Biosystems) was used for the analysis. The relative quantification method 496

(2-ΔΔCt) was used to evaluate the relative transcripts levels. 497

LC/MS analysis 498

Trichome acylsugars extracted from tomato IL and BILs were analyzed using a Shimadzu LC-499

20AD HPLC system connected to a Waters LCT Premier ToF mass spectrometer. Ten microliter 500

samples were injected into a fused core Ascentis Express C18 column (2.1 mm × 10 cm, 2.7 μm 501

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 21: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 21 of 54

particle size; Sigma-Aldrich) for reverse-phase separation with column temperature set at 40 °C. 502

The starting condition was 90% solvent A (0.15% formic acid in water) and 10% solvent B 503

(acetonitrile) with flow rate set to 0.4 mL/min. A 7 min linear elution gradient was used: ramp to 504

40% B at 1 min, then to 100% B at 5 min, hold at 100% B to 6 min, return to 90% A at 6.01 min 505

and hold until 7 min. 506

For analyzing trichome acylsugars extracted from S. pennellii transgenic plants and 507

membrane lipids from N. benthamiana, a Shimadzu LC-20AD HPLC system connected to a 508

Waters Xevo G2-XS QToF mass spectrometer was used. The starting condition were 95% solvent 509

A (10 mM ammonium formate, pH 2.8) and 5% solvent B (acetonitrile) with flow rate set to 0.3 510

mL/min. A 7 min linear elution gradient used for acylsugar analysis was: ramp to 40% B at 1 min, 511

then to 100% B at 5 min, hold at 100% B to 6 min, return to 95% A at 6.01 min and hold until 7 512

min. A 12 min linear elution gradient used for the lipid analysis was: ramp to 40% B at 1 min, 513

then to 100% B at 5 min, hold at 100% B to 11 min, return to 95% A at 11.01 min and hold until 514

12 min. 515

For analyzing trichome acylsugars extracted from other plants, a Waters Acquity UPLC 516

was coupled to a Waters Xevo G2-XS QToF mass spectrometer. The starting condition was 95% 517

solvent A (10 mM ammonium formate, pH 2.8) and 5% solvent B (acetonitrile) with flow rate set 518

to 0.3 mL/min. A 7 min linear elution gradient was: ramp to 40% B at 1 min, then to 100% B at 5 519

min, hold at 100% B to 6 min, return to 95% A at 6.01 min and held until 7 min. A 14 min linear 520

elution gradient was: ramp to 35% B at 1 min, then to 85% B at 12 min, then to 100% B at 12.01 521

min, hold at 100% B to 13 min, return to 95% A at 13.01 min and held until 14 min. 522

For Waters LCT Premier ToF mass spectrometer, the MS settings of electrospray 523

ionization in negative mode were: 2.5 kV capillary voltage, 100°C source temperature, 350°C 524

desolvation temperature, 350 liters/h desolvation nitrogen gas flow rate, 10 V cone voltage, and 525

mass range m/z 50 to 1500 with spectra accumulated at 0.1 seconds/function. Three collision 526

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 22: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 22 of 54

energies (10, 40, and 80 eV) were used in separate acquisition functions to generate both 527

molecular ion adducts and fragments. For Waters Xevo G2-XS QToF mass spectrometer, the MS 528

settings of the negative ion-mode electrospray ionization were as follows: 2.00 kV capillary 529

voltage, 100 °C source temperature, 350 °C desolvation temperature, 600 liters/h desolvation 530

nitrogen gas flow rate, 35V cone voltage, mass range of m/z 50 to 1000 with spectra accumulated 531

at 0.1 seconds/function. Three collision energies (0, 15, and 35 eV) were used in separate 532

acquisition functions. The MS settings for positive ion-mode electrospray ionization were: 3.00 533

kV capillary voltage, 100 °C source temperature, 350 °C desolvation temperature, 600 liters/h 534

desolvation nitrogen gas flow rate, 35V cone voltage, mass range of m/z 50 to 1000 with spectra 535

accumulated at 0.1 seconds/function. Three collision energies (0, 15, and 45 eV) were used in 536

separate acquisition functions. The Waters Quanlynx software was used to integrate peak areas of 537

the selected ion relative to the internal standard. For quantification purpose, data collected with 538

the lowest collision energy was used in the analysis. 539

Synteny scan 540

Protein sequences of annotated genes and the corresponding annotation files in General Feature 541

Format (GFF) of 11 Solanaceae species, Ipomoea trifida, and Coffea canephora were downloaded 542

from National Center for Biotechnology Information (NCBI, 543

https://www.ncbi.nlm.nih.gov/genome/) or Solanaceae Genomics Network (SGN, 544

https://solgenomics.net/). The GFF files contain the coordinates of annotated genes on assembled 545

chromosomes or scaffolds. The sources and version numbers of sequences and GFF files used are: 546

S. lycopersicum ITAG3.2 (SGN) and V2.5 (NCBI), S. pennellii SPENNV200 (NCBI) and v2.0 547

(SGN), S. tuberosum V3.4 (SGN), S. melongena r2.5.1 (SGN), Capsicum annuum L. zunla-1 548

V2.0 (SGN), C. annuum_var. glabriusculum V2.0 (SGN), Nicotiana attenuata NIATTr2 (SGN), 549

N. tomentosiformis V01 (NCBI), N. benthamiana V1.0.1 (SGN), Petunia axillaris V1.6.2 (SGN), 550

P. inflata V1.0.1 (SGN), I. trifida V1.0 (NCBI), and C. canephora Vx (SGN). 551

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 23: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 23 of 54

To hypothesize the evolutionary history of genes in the acylsugar gene cluster, putative 552

pseudogenes, which are homologs to protein-coding genes but with predicted premature 553

stops/frameshifts and/or protein sequence truncation, were also identified for each species as 554

described (67). Genome-wide syntenic analysis was conducted using annotated protein-coding 555

genes and putative pseudogenes from all the species with MCScanX-transposed (68) as described 556

(67). The MCScanX-based analysis did not lead to a syntenic block of acylsugar gene cluster on 557

chromosome 7 of S. melongena r2.5.1, which can be due to true absence of the gene cluster, 558

issues with genome assembly, or lack of coverage. To verify this, protein sequences of S. 559

lycopersicum genes in genomic blocks on chromosome 7 were searched against an updated S. 560

melongena genome from The Eggplant Genome Project (http://ddlab.dbt.univr.it/eggplant/) that 561

led to the identification of the gene cluster. 562

Acylsugar acyl chain composition analysis by GC-MS 563

Acyl chains were characterized from the corresponding fatty acid ethyl esters following 564

transesterification of acylsugar extractions as previously reported (33). Plants were grown for 4-8 565

weeks and approximately ten leaves were extracted for 3 minutes in 10 mL of 1:1 566

isopropanol:acetonitrile with 0.01% formic acid. Extractions were dried to completeness using a 567

rotary evaporator and then 300 µL of 21% (v/v) sodium ethoxide in ethanol (Sigma) was added 568

and incubated for 30 minutes with gentle rocking and vortexing every five minutes and 400 µL 569

hexane was added and vortexed for 30 seconds. To the hexane layer, 500 µL of saturated sodium 570

chloride in water was added and vortexed to facilitate a phase separation. After phase separation, 571

the top hexane layer was transferred to a new tube. The phase separation by addition of 500 µL 572

hexane was repeated twice, with the final hexane layer transferred to a 2 mL glass vial with a 573

glass insert. 574

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 24: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 24 of 54

The fatty acid ethyl esters were analyzed using an Agilent 5975 single quadrupole GC-MS 575

equipped with a 30-m, 0.25-mm internal diameter fused silica column with a 0.25-µm film 576

thickness VF5 stationary phase (Agilent). Injection of 1 µL of each hexane extract was performed 577

using splitless mode. The gas chromatography program was as follows: inlet temperature, 250°C; 578

initial column temperature, 70°C held for 2 min; ramped at 20°C/min until 270°C, then held at 579

270°C 3 min. The helium carrier gas was used with 70 eV electron ionization. Acyl chain type 580

was determined through NIST Version 2.3 library matches of the mass spectra of the 581

corresponding ethyl ester and relative abundances were determined through integrating the 582

corresponding peak area over the total acyl chain peak area. 583

Acknowledgements: We thank the C.M. Rick Tomato Genetics Resource Center (University of 584

California Davis, CA USA) for providing tomato seeds, Zamir lab in Hebrew University of 585

Jerusalem for providing tomato ILs and BILs seeds. We acknowledge Dr. Kun Wang and Dr. 586

Christoph Benning for their helpful guidance in lipid analysis. We thank Krystle Wiegert-587

Rininger and Cornelius Barry for their help in RNA sequencing and Dr. Kent Chapman from 588

University of North Texas for helpful discussions. We acknowledge Kathleen Imre and Sara 589

Haller for their help with tomato transformation. We thank the MSU Center for Advanced 590

Microscopy and RTSF Mass Spectrometry and Metabolomics Core Facilities for their support 591

with LC/MS analysis. 592

Funding: This work was supported by the US National Science Foundation (NSF) Plant Genome 593

Research Program grant IOS-1546617 to R.L.L and S.H.S; NSF grant DEB-1655386 and the U.S. 594

Department of Energy Great Lakes Bioenergy Research Center (BER DE-SC0018409) to S.H.S; 595

NSF grant MCB1727362 and AgBioResearch to F.B. This publication was made possible by a 596

predoctoral training award to B.J.L. from Grant Number T32-GM110523 from the National 597

Institute of General Medical Sciences of the National Institutes of Health. R.C. was supported by 598

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 25: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 25 of 54

the Plant Genomics at MSU REU Program funded in part by NSF 1757043. C.A.S. was supported 599

by a postdoctoral fellowship from the NSF IOS�PGRP�1811055. 600

Author contribution: P.F. and R.L.L. conceived the study. P.F., P.W., Y.R.L, B.J.L., B.M.M, 601

C.A.S., R.C., P.C., F.B., S.H.S, and R.L.L. designed the experiments and analyzed the data. P.F., 602

Y.R.L, B.J.L., C.A.S., R.C., and P.C. performed the experiments. P.F. and R.L.L wrote the first 603

draft. All authors contributed to the final version of the manuscript. 604

Competing interests: The authors declare that they have no competing interests. 605

Data and materials availability: All data needed to evaluate the conclusions in the paper are 606

present in the paper and/or the Supplementary Materials. The RNA-seq reads were deposited in 607

the National Center for Biotechnology Information Sequence Read Archive under the accession 608

number PRJNA605501. Sequence data used in this study are in the GenBank/EMBL data libraries 609

under these accession numbers: Sl-AACS1(MT078737), Sl-AECH1(MT078736), Sp-610

AACS1(MT078735), Sp-AECH1(MT078734), Sq-AACS1(MT078732), Sq-AECH1(MT078731), 611

Sq_c35719 (MT078733). The following materials require a material transfer agreement: pEAQ-612

HT, pK7WG, pKGWFS7, pEarleyGate102, pEarleyGate104, pTRV2-LIC, pICH47742::2x35S-613

5’UTR-hCas9(STOP)-NOST, pICH41780, pAGM4723, and pICSL11024. Requests for 614

biological materials or data should be submitted to R.L.L. at [email protected]. 615

616

References 617

1. H. A. Maeda, Evolutionary diversification of primary metabolism and its contribution to 618

plant chemical diversity. Front. Plant Sci. 10, 881 (2019). 619

2. E. Pichersky, E. Lewinsohn, Convergent evolution in plant specialized metabolism. Annu. 620

Rev. Plant Biol. 62, 549–566 (2011). 621

3. A. Mithöfer, W. Boland, Plant defense against herbivores: Chemical aspects. Annu. Rev. 622

Plant Biol. 63, 431–450 (2012). 623

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 26: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 26 of 54

4. G. Moghe, R. L. Last, Something old, something new: Conserved enzymes and the 624

evolution of novelty in plant specialized metabolism. Plant Physiol. 169, 1512–1523 625

(2015). 626

5. N. Panchy, M. D. Lehti-Shiu, S.-H. Shiu, Evolution of gene duplication in plants. Plant 627

Physiol. 171, 2294–2316 (2016). 628

6. B. J. Leong, R. L. Last, Promiscuity, impersonation and accommodation: evolution of plant 629

specialized metabolism. Curr. Opin. Struct. Biol. 47, 105–112 (2017). 630

7. O. Khersonsky, D. S. Tawfik, Enzyme promiscuity: a mechanistic and evolutionary 631

perspective. Annu. Rev. Biochem. 79, 471–505 (2010). 632

8. J.-K. Weng, R. N. Philippe, J. P. Noel, The rise of chemodiversity in plants. Science. 336, 633

1667–1670 (2012). 634

9. J. K. Weng, The evolutionary paths towards complexity: A metabolic perspective. New 635

Phytol. 201, 1141–1149 (2014). 636

10. C. A. Schenck, R. L. Last, Location, location! cellular relocalization primes specialized 637

metabolic diversification. FEBS J. (2019), doi:10.1111/febs.15097. 638

11. L. Noda-Garcia, W. Liebermeister, D. S. Tawfik, Metabolite–enzyme coevolution: From 639

single enzymes to metabolic pathways and networks. Annu. Rev. Biochem. 87, 187–216 640

(2018). 641

12. H. W. Nützmann, A. Huang, A. Osbourn, Plant metabolic clusters – from genetics to 642

genomics. New Phytol. 211, 771–789 (2016). 643

13. H. W. Nützmann, A. Osbourn, Gene clustering in plant specialized metabolism. Curr. 644

Opin. Biotechnol. 26, 91–99 (2014). 645

14. X. Qi et al., A gene cluster for secondary metabolism in oat: Implications for the evolution 646

of metabolic diversity in plants. Proc. Natl. Acad. Sci. USA. 101, 8233–8238 (2004). 647

15. Y. Shang et al., Biosynthesis, regulation, and domestication of bitterness in cucumber. 648

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 27: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 27 of 54

Science. 346, 1084–1088 (2014). 649

16. M. Frey et al., Analysis of a chemical plant defense mechanism in grasses. Science. 277, 650

696–699 (1997). 651

17. M. Itkin et al., Biosynthesis of antinutritional alkaloids in Solanaceous crops is mediated 652

by clustered genes. Science. 341, 175–179 (2013). 653

18. T. Winzer et al., A papaver somniferum 10-gene cluster for synthesis of the anticancer 654

alkaloid noscapine. Science. 6089, 1704–1708 (2012). 655

19. L. M. Schneider et al., The Cer-cqu gene cluster determines three key players in a β-656

diketone synthase polyketide pathway synthesizing aliphatics in epicuticular waxes. J. Exp. 657

Bot. 67, 2715–2730 (2016). 658

20. A. M. Takos et al., Genomic clustering of cyanogenic glucoside biosynthetic genes aids 659

their identification in Lotus japonicus and suggests the repeated evolution of this chemical 660

defence pathway. Plant J. 68, 273–286 (2011). 661

21. J. E. Jeon et al., A pathogen-responsive gene cluster for highly modified fatty acids in 662

tomato. Cell. 180, 176-187.e19 (2020). 663

22. B. M. Leckie et al., Differential and synergistic functionality of acylsugars in suppressing 664

oviposition by insect herbivores. PLoS ONE. 11, e0153345 (2016). 665

23. Y. Herrera-Salgado, L. Va, Y. Rios, L. Alvarez, Myo-inositol-Derived glycolipids with 666

anti-inflammatory activity from Solanum lanceolatum. J. Nat. Prod. 68, 1031–1036 667

(2005). 668

24. P. Fan, B. J. Leong, R. L. Last, Tip of the trichome: evolution of acylsugar metabolic 669

diversity in Solanaceae. Curr. Opin. Plant Biol. 49, 8–16 (2019). 670

25. R. Schuurink, A. Tissier, Glandular trichomes: micro-organs with model status? New 671

Phytol. 225, 2251–2266 (2019). 672

26. P. Fan et al., In vitro reconstruction and analysis of evolutionary variation of the tomato 673

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 28: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 28 of 54

acylsucrose metabolic network. Proc. Natl. Acad. Sci. USA. 113, E239–E248 (2016). 674

27. S. S. Nadakuduti, J. B. Uebler, X. Liu, A. D. Jones, C. S. Barry, Characterization of 675

trichome-expressed BAHD acyltransferases in Petunia axillaris reveals distinct acylsugar 676

assembly mechanisms within the Solanaceae. Plant Physiol. 175, 36–50 (2017). 677

28. G. D. Moghe, B. J. Leong, S. Hurney, A. D. Jones, R. L. Last, Evolutionary routes to 678

biochemical innovation revealed by integrative analysis of a plant-defense related 679

specialized metabolic pathway. eLife. 6, e28468 (2017). 680

29. J. C. D’Auria, Acyltransferases in plants: a good time to be BAHD. Curr. Opin. Plant Biol. 681

9, 331–340 (2006). 682

30. A. L. Schilmiller, A. L. Charbonneau, R. L. Last, Identification of a BAHD 683

acetyltransferase that produces protective acyl sugars in tomato trichomes. Proc. Natl. 684

Acad. Sci. USA. 109, 16377–82 (2012). 685

31. A. L. Schilmiller et al., Functionally divergent alleles and duplicated loci encoding an 686

acyltransferase contribute to acylsugar metabolite diversity in Solanum trichomes. Plant 687

Cell. 27, 1002–1017 (2015). 688

32. B. J. Leong et al., Evolution of metabolic novelty: A trichome-expressed invertase creates 689

specialized metabolic diversity in wild tomato. Sci. Adv. 5, eaaw3754 (2019). 690

33. J. Ning et al., A feedback insensitive isopropylmalate synthase affects acylsugar 691

composition in cultivated and wild tomato. Plant Physiol. 160, 1821–1835 (2015). 692

34. B. Ghosh, T. C. Westbrook, A. D. Jones, Comparative structural profiling of trichome 693

specialized metabolites in tomato (Solanum lycopersicum) and S. habrochaites: acylsugar 694

profiles revealed by UHPLC/MS and NMR. Metabolomics. 10, 496–507 (2014). 695

35. X. Liu, M. Enright, C. S. Barry, A. D. Jones, Profiling, isolation and structure elucidation 696

of specialized acylsucrose metabolites accumulating in trichomes of Petunia species. 697

Metabolomics. 13, 1–10 (2017). 698

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 29: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 29 of 54

36. S. Mandal, W. Ji, T. D. McKnight, Candidate gene networks for acylsugar metabolism and 699

plant defense in wild tomato Solanum pennellii. Plant Cell. 32, 81–99 (2020). 700

37. Y. Eshed, D. Zamir, An introgression line population of Lycopersicon pennellii in the 701

cultivated tomato enables the identification and fine mapping of yield-associated QTL. 702

Genetics. 141, 1147–1162 (1995). 703

38. A. Schilmiller et al., Mass spectrometry screening reveals widespread diversity in trichome 704

specialized metabolites of tomato chromosomal substitution lines. Plant J. 62, 391–403 705

(2010). 706

39. I. Ofner, J. Lashbrooke, T. Pleban, A. Aharoni, D. Zamir, Solanum pennellii backcross 707

inbred lines (BILs) link small genomic bins with tomato traits. Plant J. 87, 151–160 708

(2016). 709

40. C. Brooks, V. Nekrasov, Z. B. Lippman, J. Van Eck, Efficient gene editing in tomato in the 710

first generation using the clustered regularly interspaced short palindromic 711

repeats/CRISPR-associated9 system. Plant Physiol. 166, 1292–1297 (2014). 712

41. R. L. Buchanan, B. B. Gruissem, W. Jones, Biochemistry and Molecular Biology of Plants 713

(American Society of Plant Physiologists, Rockville, MD, ed. 2ND, 2015). 714

42. B. K. Nelson, X. Cai, A. Nebenführ, A multicolored set of in vivo organelle markers for co-715

localization studies in Arabidopsis and other plants. Plant J. 51, 1126–1136 (2007). 716

43. C. Brocard, A. Hartig, Peroxisome targeting signal 1: Is it really a simple tripeptide? 717

Biochim. Biophys. Acta - Mol. Cell Res. 1763, 1565–1573 (2006). 718

44. F. Sainsbury, E. C. Thuenemann, G. P. Lomonossoff, PEAQ: Versatile expression vectors 719

for easy and quick transient expression of heterologous proteins in plants. Plant 720

Biotechnol. J. 7, 682–693 (2009). 721

45. S. M. Hurney, thesis, Michigan State University (2018). 722

46. T. Brockmöller et al., Nicotiana attenuata Data Hub (NaDH): An integrative platform for 723

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 30: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 30 of 54

exploring genomic, transcriptomic and metabolomic data in wild tobacco. BMC Genomics. 724

18, 79 (2017). 725

47. R. Milo, R. L. Last, Achieving diversity in the face of constraints: Lessons from 726

metabolism. Science. 336, 1663–1667 (2012). 727

48. J. Ohlrogge et al., PlantFAdb: a resource for exploring hundreds of plant fatty acid 728

structures synthesized by thousands of plants and their phylogenetic relationships. Plant J. 729

96, 1299–1308 (2018). 730

49. A. A. Millar, M. A. Smith, L. Kunst, All fatty acids are not equal: Discrimination in plant 731

membrane lipids. Trends Plant Sci. 5, 95–101 (2000). 732

50. K. Dehesh, A. Jones, D. S. Knutzon, T. A. Voelker, Production of high levels of 8:0 and 733

10:0 fatty acids in transgenic canola by overexpression of Ch FatB2, a thioesterase cDNA 734

from Cuphea hookeriana. Plant J. 9, 167–172 (1996). 735

51. K. Dehesh, P. Edwards, J. A. Fillatti, M. Slabaugh, J. Byrne, KAS IV: A 3-ketoacyl-ACP 736

synthase from Cuphea sp. is a medium chain specific condensing enzyme. Plant J. 15, 737

383–390 (1998). 738

52. U. Iskandarov et al., A specialized diacylglycerol acyltransferase contributes to the 739

extreme medium-chain fatty acid content of Cuphea seed oil. Plant Physiol. 174, 97–109 740

(2017). 741

53. T. Voelker, A. J. Kinney, Variations in the Biosynthesis of Seed-Storage Lipids. Annu Rev 742

Plant Physiol Plant Mol Biol. 52, 335–361 (2001). 743

54. X. Qi et al., A different function for a member of an ancient and highly conserved 744

cytochrome P450 family: From essential sterols to plant defense. Proc. Natl. Acad. Sci. 745

USA. 103, 18848–18853 (2006). 746

55. S. McCormick, in Plant Tissue Culture Manual, K. Lindsey, Ed. (Springer Netherlands, 747

Dordrecht, 1997), vol. 13. 748

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 31: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 31 of 54

56. H. Batoko, H. Q. Zheng, C. Hawes, I. Moore, A Rab1 GTPase is required for transport 749

between the endoplasmic reticulum and Golgi apparatus and for normal Golgi movement 750

in plants. Plant Cell. 12, 2201–2218 (2000). 751

57. Z. Wang, C. Benning, Arabidopsis thaliana polar glycerolipid profiling by thin layer 752

chromatography (TLC) coupled with gas-liquid chromatography (GLC). J. Vis. Exp., 2–7 753

(2011). 754

58. K. Schneider et al., A new type of peroxisomal acyl-coenzyme a synthetase from 755

Arabidopsis thaliana has the catalytic capacity to activate biosynthetic precursors of 756

jasmonic acid. J. Biol. Chem. 280, 13962–13972 (2005). 757

59. A. M. Bolger, M. Lohse, B. Usadel, Trimmomatic: A flexible trimmer for Illumina 758

sequence data. Bioinformatics. 30, 2114–2120 (2014). 759

60. A. Bolger et al., The genome of the stress-tolerant wild tomato species Solanum pennellii. 760

Nat Genet. 46, 1034–1038 (2014). 761

61. C. Trapnell, L. Pachter, S. L. Salzberg, TopHat: Discovering splice junctions with RNA-762

Seq. Bioinformatics. 25, 1105–1111 (2009). 763

62. C. Trapnell et al., Transcript assembly and quantification by RNA-Seq reveals unannotated 764

transcripts and isoform switching during cell differentiation. Nat. Biotechnol. 28, 511–515 765

(2010). 766

63. S. Anders, P. T. Pyl, W. Huber, HTSeq — A Python framework to work with high-767

throughput sequencing data. Bioinformatics. 31, 166–169 (2015). 768

64. D. J. McCarthy, Y. Chen, G. K. Smyth, Differential expression analysis of multifactor 769

RNA-Seq experiments with respect to biological variation. Nucleic Acids Res. 40, 4288–770

4297 (2012). 771

65. Y. Dong, T. M. Burch-Smith, Y. Liu, P. Mamillapalli, S. P. Dinesh-Kumar, A ligation-772

independent cloning tobacco rattle virus vector for high-throughput virus-induced gene 773

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 32: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 32 of 54

silencing identifies roles for NbMADS4-1 and -2 in floral development. Plant Physiol. 145, 774

1161–1170 (2007). 775

66. B. J. Leong, thesis, Michigan State University (2019). 776

67. P. Wang et al., Factors influencing gene family size variation among related species in a 777

plant family, Solanaceae. Genome Biol. Evol. 10, 2596–2613 (2018). 778

68. Y. Wang, J. Li, A. H. Paterson, MCScanX-transposed: Detecting transposed gene 779

duplications based on multiple colinearity scans. Bioinformatics. 29, 1458–1460 (2013). 780

69. T. Sarkinen et al., A phylogenetic framework for evolutionary study of the nightshades 781

(Solanaceae): a dated 1000-tip tree. BMC Evol. Biol. 13, 214 (2013). 782

783

784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 33: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 33 of 54

Main Figures 814 815 816

817

818

Fig. 1. Primary metabolites are biosynthetic precursors of tomato trichome acylsugars. In cultivated 819

tomatoes, the trichome acylsucroses are synthesized by four Sl-ASATs using the primary metabolites – 820

sucrose and different types of acyl-CoAs – as substrates. 821

822

823

824

825

826

827

828

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 34: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 34 of 54

829

Fig. 2. Mapping of a genetic locus related to acylsugar variations in tomato interspecific 830

introgression lines. (A) Electrospray ionization negative (ESI-) mode, base-peak intensity (BPI) LC/MS 831

chromatogram of trichome metabolites from cultivated tomato S. lycopersicum M82 and introgression line 832

IL7-4. The orange bars highlight two acylsugars that have higher abundance in IL7-4 than in M82. For the 833

acylsucrose nomenclature, “S” refers to a sucrose backbone, “3:22” means three acyl chains with twenty-834

two carbons in total. The length of each acyl chain is shown in the parentheses. (B) Peak area percentage 835

of seven major trichome acylsugars in M82 and IL7-4. The sum of the peak area percentage of each 836

acylsugar is equal to 100% in each sample. ***p < 0.001, Welch two-sample t test, n=3. (C) Mapping the 837

genetic locus contributing to the IL7-4 acylsugar phenotype using selected backcross inbred lines (BILs) 838

that have recombination break points within the introgression region of IL7-4. (D) Narrowing down 839

candidate genes in the locus using trichome/stem RNA-seq datasets generated from previous study (33). 840

A region with duplicated genes of three types – acyl-CoA synthetase (ACS), BAHD acyltransferase, and 841

enoyl-CoA hydratase (ECH) – is shown. The red-blue color gradient provides a visual marker to rank the 842

expression levels represented by read counts. 843

844

845

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 35: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 35 of 54

846

Fig. 3. CRISPR/Cas9-mediated gene knockout of tomato Sl-AACS1 or Sl-AECH1 eliminates 847

detectable medium chain containing acylsugars. (A) Combined LC/MS extracted ion chromatograms 848

of trichome metabolites from CRISPR mutants sl-aacs1 and sl-aech1. The medium chain acylsugars that 849

are not detected in the two mutants are denoted by pairs of vertical dotted lines. (B) Quantification of 850

seven major trichome acylsugars in sl-aacs1 and sl-aech1 mutants. Two independent T2 generation 851

transgenic lines for each mutant were used for analysis. The peak area/internal standard (IS) normalized 852

by leaf dry weight (DW) is shown from six biological replicates ± SE. (C) Confocal fluorescence images 853

showing that GFP fluorescence driven by Sl-AACS1 or Sl-AECH1 is located in the tip cells of type I/IV 854

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 36: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 36 of 54

trichomes. Their tissue specific expressions are similar to Sl-ASAT1 (26), which locates in a chromosome 855

12 region that is syntenic to the locus containing Sl-AACS1 and Sl-AECH1. 856

857

858

859

860

861

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 37: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 37 of 54

862

Fig. 4. Functional analysis of Sl-AACS1 and Sl-AECH1 in N. benthamiana and recombinant Sl-863

AACS1 enzyme analysis. (A) Confocal images of co-expression analysis in tobacco leaf epidermal cells 864

using C-terminal CFP-tagged either Sl-AACS1 or Sl-AECH1 and the mitochondrial marker MT-RFP. 865

Arrowheads point to mitochondria that are indicated by MT-RFP fluorescent signals. Scale bar equals 10 866

μm. (B) Aliphatic fatty acids of different chain lengths were used as the substrates to test Sl-AACS1 acyl-867

CoA synthase activity. Mean amount of acyl-CoAs generated (nmol min-1 mg-1 proteins) was used to 868

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 38: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 38 of 54

represent enzyme activities. The results are from three measurements ± SE. (C) Enzyme activity of Sl-869

AACS1 for six fatty acid substrates. (D) Identification of membrane lipid phosphatidylcholine (PC), which 870

contains medium acyl chains, following transient expression of Sl-AECH1 in N. benthamiana leaves. The 871

results from expressing Sl-AACS1 and co-expressing both Sl-AECH1 and Sl-AACS1 are also shown. Mole 872

percentage (Mol %) of the acyl chains from membrane lipids with carbon number 12, 14, 16, and 18 are 873

shown for three biological replicates ± SE. *p < 0.05, **p < 0.01. Welch two-sample t test was performed 874

comparing with the empty vector control. Acyl groups of the same chain lengths with saturated and 875

unsaturated bonds were combined in the calculation. 876

877

878

879

880

881

882

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 39: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 39 of 54

883

Fig. 5. AACS1 and AECH1 are evolutionarily conserved in Solanum plants. (A) A conserved syntenic 884

genomic region containing AACS1 and AECH1 was found in three selected Solanum species. Nodes 885

representing estimated dates since the last common ancestors (69) shown on the left. The closest 886

homologs of AACS1 and AECH1 in Solanum quitoense are shown without genomic context because the 887

genes were identified from RNA-seq and genome sequences are not available. The lines connect genes 888

representing putative orthologs across the four species. The trichome/stem RNA-seq data of two biological 889

S. pennellii replicates are summarized (table S1) for genes in the syntenic region. The red-blue color 890

gradient provides a visual marker to rank the expression levels in Fragments Per Kilobase of transcript per 891

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 40: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 40 of 54

Million mapped reads (FPKM). Structures of representative medium chain acylsugars from S. quitoense 892

(acylinositol, I4:26) (45) and S. pennellii (acylglucose, G3:19) (32) are on the right. (B) CRISPR/Cas9-893

mediated gene knockout of Sp-AACS1 or Sp-AECH1 in S. pennellii produce no detectable medium chain 894

containing acylsugars. The ESI+ mode LC/MS extracted ion chromatograms of trichome metabolites are 895

shown for each mutant. The m/z 127.01 (left panel) corresponds to the glucopyranose ring fragment that 896

both acylsucroses and acylglucoses generate under high collision energy positive-ion mode. The m/z 897

155.14 (center panel) and 183.17 (right panel) correspond to the acylium ions from acylsugars with chain 898

length of C10 and C12, respectively. (C) Silencing Sq-AACS1 (Sq-c34025) or Sq-AECH1 (Sq-c37194) in 899

S. quitoense using VIGS leads to reduction of total acylsugars. The peak area/internal standard (IS) 900

normalized by leaf dry weight was shown from fifteen biological replicates. ***p < 0.001, Welch two-sample 901

t test. (D) Reduced gene expression of Sq-AACS1 or Sq-AECH1 correlates with decreased acylsugar 902

levels in S. quitoense. The qRT-PCR gene expression data are plotted with acylsugar levels of the same 903

leaf as described in Fig. S6E. 904

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 41: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 41 of 54

905

Fig. 6. Evolution of the acylsugar gene cluster is associated with acylsugar acyl chain diversity 906

across the Solanaceae family. (A) The acylsugar gene cluster syntenic regions of 11 Solanaceae 907

species and two outgroup species Ipomea trifida (Convolvulaceae) and Coffea canephora (Rubiaceae). 908

This is a simplified version adapted from fig. S7. Only genes from the three families – ACS (blue), BAHD 909

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 42: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 42 of 54

acyltransferase (green), and ECH (orange) – are shown. The vertical lines connect putative orthologs. For 910

information about the syntenic region size in each species refer to fig. S7. (B) The evolutionary history of 911

the acylsugar gene cluster and its relation to the acylsugar phenotypic diversity. Structures of 912

representative short and medium chain acylsugars were shown on the right. 913

914

915

916

917

918

919

920

921

922

923

924

925

926

927

928

929

930

931

932

933

934

935

936

937

938

939

940

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 43: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 43 of 54

Supplementary Materials for 941

Evolution of a plant gene cluster in Solanaceae and emergence of metabolic 942

diversity 943

Pengxiang Fan, Peipei Wang, Yann-Ru Lou, Bryan J. Leong, Bethany M. Moore, Craig A. Schenck, Rachel Combs, 944

Pengfei Cao, Federica Brandizzi, Shin-Han Shiu, Robert L. Last* 945

*Corresponding Author: [email protected] 946

947

This file includes: 948

Fig. S1. CRISPR-Cas9-mediated gene knockouts in cultivated tomato S. lycopersicum. 949

Fig. S2. The ‘supercluster’ syntenic region of cultivated tomato S. lycopersicum chromosome 7 and 12 950

harboring the acylsugar and steroidal glycoalkaloid gene clusters. 951

Fig. S3. Characterization of cluster genes using leaf transient expression: protein subcellular targeting and 952

impacts on lipid metabolism. 953

Fig. S4. The closest homologs of Sl-AECH1 from other Solanum species generate medium chain lipids 954

when transiently expressed in N. benthamiana. 955

Fig. S5. Stable Sp-AACS1 transformation of the M82 CRISPR mutant sl-aacs1 restores C12 containing 956

acylsugars. 957

Fig. S6. Functional analysis of AACS1 and AECH1 in S. pennellii and S. quitoense via CRISPR-Cas9 958

system and VIGS, respectively. 959

Fig. S7. Syntenic regions containing the acylsugar gene cluster. 960

Fig. S8. Phylogenetic distribution of acylsugar acyl chains with different lengths across the Solanaceae 961

family. 962

Other Supplementary Materials not in this file: 963

Table S1. Gene expression levels of all analyzed transcripts in S. pennellii LA0716. 964

Table S2. The relative transcript abundance of ACS- and ECH-annotated genes in the syntenic region of 965

Nicotiana attenuate. 966

Table S3. Synthesized gene fragments and primers used in this study. 967

968

969

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 44: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 44 of 54

970

971

Fig. S1. CRISPR-Cas9-mediated gene knockouts in cultivated tomato S. lycopersicum. The gRNAs 972

targeting Sl-AACS1 (A), Sl-AECH1(B), and Solyc07g043660 (C) in cultivated tomato are highlighted with 973

red lines and text. Two pairs of gRNAs were designed that target Sl-AECH1, which were followed by two 974

trials of plant transformation. DNA sequences of the self-crossed T1 generation transgenic lines carrying 975

homozygous gene edits are shown beneath the gene model. Dotted rectangle boxes highlight edited 976

sequences. (D) Electrospray ionization negative (ESI-) mode, LC/MS extracted ion chromatograms of 977

seven major trichome acylsugars of the CRISPR mutant solyc07g043660 and the M82 parent. 978

979

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 45: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 45 of 54

980

981

982

983

Fig. S2. The ‘supercluster’ syntenic region of cultivated tomato S. lycopersicum chromosome 7 984

and 12 harboring the acylsugar and steroidal glycoalkaloid gene clusters. The gene models are 985

represented by rectangles, with pseudogenes labeled with dotted lines. The lines linking the gene models 986

of the two chromosomes denote putative orthologous genes in the synteny. GAME genes involved in 987

glycoalkaloid metabolism were previously reported (17). 988

989

990

991

992

993

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 46: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 46 of 54

994

Fig. S3. Characterization of cluster genes using leaf transient expression: protein subcellular 995

targeting and impacts on lipid metabolism. (A) Confocal images of co-expression analysis in N. 996

tabacum leaf epidermal cells using C-terminal CFP-tagged Solyc07g043660 and the mitochondrial marker 997

MT-RFP. Arrowheads point to mitochondria that are indicated by MT-RFP fluorescent signals. White bar 998

equals 10 μm. (B) Confocal images of co-expression analysis in N. tabacum leaf epidermal cells using N-999

terminal YFP-tagged Sl-AACS1, Sl-AECH1, or Solyc07g043660, and the peroxisomal marker RFP-PTS. 1000

Arrowheads point to peroxisomes that are indicated by RFP-PTS fluorescent signals. Bar equals 10 μm. 1001

(C) N. benthamiana leaf membrane lipid acyl chain composition. The results from infiltrating Sl-AACS1 or 1002

Sl-AECH1 individually, and infiltrating Sl-AECH1 and Sl-AACS1 together are shown. The lipid 1003

abbreviations are: phosphatidylglycerol (PG), sulfoquinovosyl diacylglycerol (SQDG), 1004

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 47: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 47 of 54

digalactosyldiacylglycerol (DGDG), monogalactosyldiacylglycerol (MGDG), phosphtatidylinositols (PI), and 1005

phosphtatidylethanolamine (PE). Mole percentage (Mol %) of the acyl chains from membrane lipids with 1006

carbon number 12, 14, 16, and 18 are shown for three biological replicates ± SE. *p < 0.05, **p < 0.01. 1007

Welch two-sample t test was performed comparing each experimental with the empty vector control. PI 1008

and PE lipids were adjacent on the TLC plates and were pooled for analysis. Acyl groups of the same 1009

chain lengths with saturated and unsaturated bonds were combined in the calculation. 1010

1011

1012

1013

1014

1015

1016

1017

1018

1019

1020

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 48: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 48 of 54

1021

Fig. S4. The closest homologs of Sl-AECH1 from other Solanum species generate medium chain 1022

lipids when transiently expressed in N. benthamiana. (A) ESI- mode, LC/MS extracted ion 1023

chromatograms of two SQDGs with C12 as peaks diagnostic of lipids containing medium chain fatty acids. 1024

(B) Mass spectra of two SQDGs contain C12 acyl chain. Fragmentation of SQDG (16:0, 12:0) and SQDG 1025

(18:3, 12:0) in ESI- mode revealed the fragment ion C12 fatty acid (m/z: 199.17) and the SQDG head 1026

group (m/z: 225.0). (C) Among the putative Sl-AECH1 homologs tested, Sopen07g023250 (Sp-AECH1) 1027

and Sq_c37194 (Sq-AECH1), from S. pennellii and S. quitoense respectively, generated medium chain 1028

SQDG in the infiltrated leaves. The maximum likelihood phylogenetic gene tree shown on the left was 1029

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 49: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 49 of 54

generated using the nucleotide sequences of ECH-annotated variants, with the Arabidopsis AT1G06550 1030

serving as an outgroup. The bootstrap values were obtained with 1000 replicates. 1031

1032

1033

1034

1035

1036

1037

1038

1039

1040

1041

1042

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 50: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 50 of 54

1043

Fig. S5. Stable Sp-AACS1 transformation of the M82 CRISPR mutant sl-aacs1 restores C12 1044

containing acylsugars. (A) ESI- mode, LC/MS extracted ion chromatograms are shown for seven major 1045

acylsugar peaks extracted from trichome of S. lycopersicum M82, sl-aacs1, and two independent T0 1046

generation suppressed sl-aacs1 transgenic lines expressing Sp-AACS1 under its own promoter. (B) Peak 1047

area percentage of seven major trichome acylsugars of the suppression transgenic plants. The sum of the 1048

peak area percentage of each acylsugar equals to 100%. The results of acylsugar peak area percentage 1049

were calculated from six independent T0 transgenic lines ± SE. 1050

1051

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 51: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 51 of 54

1052

Fig. S6. Functional analysis of AACS1 and AECH1 in S. pennellii and S. quitoense via CRISPR-Cas9 1053

system and VIGS, respectively. The design of gRNAs targeting wild tomato S. pennellii LA0716 Sp-1054

AACS1 (A), Sopen07g023250 (B), and Sp-AECH1 (C) is highlighted with red lines and text. The 1055

transgenic T0 generation carrying chimeric or biallelic gene edits are shown beneath the gene model. DNA 1056

sequence of the gene edits was obtained through Sanger sequencing of cloned plant DNA fragments. The 1057

gene edits are highlighted with dotted rectangular boxes. (D) ESI+ mode, LC/MS extracted ion 1058

chromatograms shown for C10 (m/z: 155.14) and C12 (m/z: 183.17) fatty acid ions corresponding to 1059

medium chain trichome acylsugars extracted from the CRISPR mutants sopen07g023220 and the S. 1060

pennellii LA0716 parent. (E) Experimental design of VIGS in S. quitoense. A control group silencing the 1061

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 52: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 52 of 54

PDS genes was performed in parallel with the experimental groups. The onset of the albino phenotype of 1062

the control group was used as a visual marker to determine the harvest time and leaf selection in the 1063

experimental groups. The fourth true leaves were harvested and cut in half for gene expression analysis 1064

and acylsugar quantification, respectively. (F) ESI- mode, LC/MS extracted ion chromatograms of six major 1065

acylsugars of S. quitoense for the experimental group. The three LC/MS chromatograms show 1066

representative acylsugar profiles of the empty vector control plants and the VIGS plants targeting Sq-1067

AACS1 and Sq-AECH1. 1068

1069

1070

1071

1072

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 53: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 53 of 54

Fig. S7. Syntenic regions containing the acylsugar gene cluster. The species name and 1073

chromosome/scaffold identifier are indicated with S. lycopersinum in blue font. Rectangle: protein-coding 1074

gene (solid line) or pseudogene (dotted line) colored according to the type of genes. Line connecting two 1075

genes: putative orthologous genes. Numbers underneath chromosomes: chromosome coordinates in 1076

million bases (Mb). 1077

1078

1079

1080

1081

1082

1083

1084

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint

Page 54: Evolution of a plant gene cluster in Solanaceae and ... · Manuscript Template Page 1 of 54 1 Evolution of a plant gene cluster in Solanaceae and emergence of metabolic diversity

Manuscript Template Page 54 of 54

1085

Fig. S8. Phylogenetic distribution of acylsugar acyl chains with different lengths across the 1086

Solanaceae family. (A) The collated results of acylsugar acyl chain distribution across different 1087

Solanaceae species. Red rectangles indicate detectable acyl chains in the acylsugars produced in the 1088

tested species and white rectangles indicate no detectable signals. The data source where the results are 1089

derived is listed on the right. (B) Results of acylsugar acyl chain characterization of selected Solanaceae 1090

species. Mole percentage (Mol %) of acylsugar acyl chains with different lengths were obtained from 1091

GC/MS analysis of fatty acid ethyl esters. 1092

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted March 5, 2020. . https://doi.org/10.1101/2020.03.04.977231doi: bioRxiv preprint