ORIGINAL ARTICLE

On the Origin and Evolution of Plant Brassinosteroid Receptor Kinases

Hao Wang • Hongliang Mao

Received: 6 September 2013 / Accepted: 18 December 2013 / Published online: 27 December 2013

© Springer Science+Business Media New York 2013

Abstract The brassinosteroid (BR) signaling pathway is so far the best-understood receptor-kinase signaling pathway in plants. In Arabidopsis, the activation of this pathway requires binding of BRs to the receptor kinase BRASSINOSTEROID-INSENSITIVE I (AtBRI1). Although the function of AtBRI1 has been extensively studied, it is not known when the binding function emerged and how this important component of the BR signaling pathway and related genes (the BRI1–BRL gene family) have evolved in plants. We define BRI1–BRL genes in sequenced plant genomes, construct profiles for critical protein domains, scan them against all accessible plant gene/EST resources, and reveal the evolution of the domain configuration of this family. We also investigate its evolutionary pattern through phylogenetic analysis. The complete BR receptor domain configuration originated through two domain gain events in the ancestral receptor-like kinase: first, the juxtamembrane domain was gained during the early diversification of land plants, and then the island domain (ID) was acquired in the common ancestor of angiosperms and gymnosperms after its divergence from spike moss. The 70 amino acid ID has sequences characteristic of the BRI1–BRL family. The family has kept relatively stable copy numbers during the history of angiosperms, and the majority of duplications and losses have occurred in terminal taxa in the current taxon sampling. This study reveals important events shaping the structural and functional characteristics of plant BR receptors. It answers the question of how and when BR receptors originated, which provides insights into the origin and evolution of the BR signaling pathway.

Keywords Brassinosteroid · BRI1 · Protein domain · Gene family · Evolution

Introduction

Brassinosteroids (BRs) are a group of plant steroid hormones that play critical roles in a wide range of developmental and physiological processes such as cell elongation, vascular differentiation, root growth, light response, stress resistance, and senescence (Clouse and Sasse 1998; Kim and Wang 2010; Vert et al. 2005). The last two decades have seen great advances in assembling the BR signal transduction pathway (or network; Wang et al. 2012) in Arabidopsis. A number of component genes involved in the BR transduction pathway have been defined, and important signal transduction steps, from BR perception by the receptor kinase to the activation of the most upstream transcription factors of the BR-dependent transcriptional network, have been revealed (e.g. see reviews in Clouse 2011; Kim and Wang 2010).

Electronic supplementary material The online version of this article (doi:10.1007/s00239-013-9609-5) contains supplementary material, which is available to authorized users.

H. Wang · H. Mao
T-Life Research Center, Department of Physics, Fudan University, Shanghai 200433, People’s Republic of China

H. Wang (✉)
Department of Genetics, University of Georgia, 120 Green Street, Athens, GA 30602, USA
e-mail: [email protected]

J Mol Evol (2014) 78:118–129
DOI 10.1007/s00239-013-9609-5

In early events that activate the Arabidopsis BR signaling network, the binding of BR requires the function of the BR receptor kinase BRASSINOSTEROID-INSENSITIVE I (BRI1 or AtBRI1). Genetic and biochemical studies have established AtBRI1 as the major receptor of BRs in Arabidopsis (Kinoshita et al. 2005; Li and Chory 1997), and great efforts have been made to elucidate the functional regions or protein domains in this gene (see review in Kim and Wang 2010). The currently accepted domain configuration of AtBRI1 is LRR{20}–ID–LRR21–LRR{3}–TM–JM–KD–CT (Kim and Wang 2010; Vert et al. 2005), where LRR, ID, LRR21, TM, JM, KD, and CT denote leucine-rich repeats, island domain, the 21st LRR domain, transmembrane region, juxtamembrane region, kinase domain, and C-terminal region, respectively. LRR21 and its upstream adjacent ID play important roles together, so this LRR is marked specially. The numbers in braces are copy numbers of tandem domain units. For instance, LRR{20} means 20 tandem LRR domains. AtBRI1 in fact has 25 LRRs (Kinoshita et al. 2005), but the one located at the N-terminus is irregular (She et al. 2011). If the irregular LRR is taken as the first LRR, ID is located between LRR21 and LRR22. In this study, we count the first regular LRR as the first LRR, following the numbering system used by several previous works (Kim and Wang 2010; Li and Chory 1997; Vert et al. 2005).
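The configuration notation above is compact enough to be read mechanically. The following is a minimal sketch, not part of the original analysis, that parses such a string and counts the regular LRR units; the function names `parse_configuration` and `count_lrrs` are hypothetical and introduced only for illustration.

```python
import re

def parse_configuration(config):
    """Split a configuration string into (domain, copy_number) pairs.

    Tokens are separated by an en dash or a hyphen; "LRR{20}" means
    20 tandem copies, and a token without braces counts as one copy.
    """
    units = []
    for token in re.split(r"[–-]", config):
        match = re.fullmatch(r"([A-Za-z]+\d*)(?:\{(\d+)\})?", token.strip())
        if match is None:
            raise ValueError(f"unrecognized domain token: {token!r}")
        name, copies = match.group(1), match.group(2)
        units.append((name, int(copies) if copies else 1))
    return units

def count_lrrs(units):
    """Sum the copy numbers of every unit whose name starts with LRR."""
    return sum(copies for name, copies in units if name.startswith("LRR"))

atbri1 = parse_configuration("LRR{20}–ID–LRR21–LRR{3}–TM–JM–KD–CT")
regular_lrrs = count_lrrs(atbri1)  # 20 + 1 + 3 regular LRRs
```

Under this reading the notation accounts for 24 regular LRRs, which together with the irregular N-terminal repeat gives the 25 LRRs reported for AtBRI1.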

Several homologs of AtBRI1 have been identified: (1) AtBRI1 orthologs have been defined in a number of crops such as tomato (Curl3/tBRI1/SR160; Koka et al. 2000; Montoya et al. 2002), rice (OsBRI1; Yamamuro et al. 2000), barley (HvBRI1; Chono et al. 2003), cotton (GhBRI1; Sun et al. 2004), grape (VvBRI1; Symons et al. 2006), and pea (LKA/PsBRI1; Nomura et al. 2003), and their functions in BR perception and plant growth have been confirmed by mutational studies (see Clouse 2011; Morillo and Tax 2006 for brief reviews). (2) AtBRI1 is the major, but not the only, BR receptor in Arabidopsis. Three paralogs of AtBRI1 have been cloned and two of them, AtBRL1 and AtBRL3, can also bind BR, mediate cell-type-specific BR responses, and rescue the bri1 mutant when expressed under the control of the BRI1 promoter (Cano-Delgado et al. 2004). The investigation of the counterparts of AtBRI1 and AtBRL genes in rice seems to support a model in which BRI1 acts as the major BR receptor, with some of its homologs providing partial functional backup, in rice as well (Nakamura et al. 2006).

The structure and function of AtBRI1 have been extensively studied. In contrast, the evolution of AtBRI1 and related genes (called the BRI1–BRL gene family hereafter) is largely unexplored. AtBRI1 belongs to the LRR receptor-like kinases (LRR RLKs), which are characterized by extracellular LRR arrays, a single-pass TM, and a cytoplasmic KD (Shiu and Bleecker 2001a). Although the relationships and diversity of LRRs and KDs have attracted much attention, and these studies have established that BRI1–BRL genes form a clade in the KD phylogenetic tree (Dolan et al. 2007; Kobe and Kajava 2001; Lehti-Shiu et al. 2009; Matsushima et al. 2009, 2010; Shiu and Bleecker 2001a, b; Shiu et al. 2004), none of these studies has focused on the BRI1–BRL family per se. To date, the relationship between BRI1 and BRL genes has been discussed in only a few organisms (Cano-Delgado et al. 2004; Gish and Clark 2011; Nakamura et al. 2006). The pioneering work based on the phylogeny of the 10 available BRI1–BRL genes (Cano-Delgado et al. 2004) discovered that the early duplications splitting BRL2 from BRI1 and BRL1–3 occurred prior to the divergence of monocots and dicots. However, owing to the lack of data, this work failed to correctly resolve the time of the split of BRL1 and BRL3 and could not address many other questions, such as the birth and death patterns and lineage distribution of family members. In short, systematic investigations of this family with extensive taxon sampling have not yet been reported, and how and when this family originated remains an open question.

The recent genomics revolution has uncovered genomes from many major lineages of the plant tree of life and deposited a large amount of EST/transcriptome data for lineages without fully sequenced genomes. With the aim of illuminating the origin and evolution of the structure and function of plant BRI1–BRL genes, we have performed comparative analyses of the BRI1–BRL gene family in major clades of the plant kingdom. Focusing on the 44 sequenced plant genomes [32 land plants and 12 green algae (Supporting Information Table S1)], we have identified the BRI1–BRL gene repertoire in 29 genomes using bioinformatic methods and tried to answer the following questions: (1) how and when the family originated, (2) the relationships between family members, and (3) the pattern of gene duplication and loss. We also discuss a possible picture of the origin of the single-exon structure of this family.

Materials and Methods

Sequence Data

Fully sequenced plant genomes and gene annotations were downloaded from Phytozome version 8 (http://phytozome.net/). Data of chlorophytes that were deposited in JGI but not included in Phytozome were downloaded from the JGI genome portal (http://genome.jgi.doe.gov/). The genome and annotations of Amborella trichopoda were downloaded from the Amborella Genome Database (http://www.amborella.org/). ESTs, EST assemblies, transcriptome assemblies, and cDNAs (mRNAs) of conifers as well as other nonangiosperm plants were collected from TreeGene (http://dendrome.ucdavis.edu/treegenes/) and plantGDB (http://www.plantgdb.org/). A translated EST dataset was constructed by predicting the coding regions of the ESTs with ORFPredictor (http://proteomics.ysu.edu/tools/OrfPredictor.html). Plant proteins deposited in Uniprot (http://www.uniprot.org/) were also downloaded. The combined protein set was made of nonredundant sequences from genome annotation, the translated EST dataset, and Uniprot.
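How redundancy was removed when building the combined protein set is not described, so the following is only a sketch under a stated assumption: two records are treated as redundant when their sequences are identical after upper-casing and removing a trailing stop symbol. The helper names `read_fasta` and `combine_nonredundant` are hypothetical.

```python
def read_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

def combine_nonredundant(*sources):
    """Merge FASTA sources, keeping the first record of each distinct sequence.

    Assumption: redundancy means exact identity of the normalized sequence;
    the original study may have used a different criterion.
    """
    first_header_for = {}
    for text in sources:
        for header, seq in read_fasta(text):
            key = seq.upper().rstrip("*")
            first_header_for.setdefault(key, header)
    return {header: seq for seq, header in first_header_for.items()}

# Toy records standing in for genome annotation, translated ESTs, and Uniprot.
genome = ">g1\nMKT\nLLA\n>g2\nMAAA\n"
ests = ">e1\nMKTLLA*\n"
uniprot = ">u1\nmaaa\n>u2\nMPPP\n"
combined = combine_nonredundant(genome, ests, uniprot)
```

Sources listed first take precedence, so in this toy case the genome annotation records are kept over identical EST or Uniprot entries.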

BRI1–BRL Gene Identification

The combined protein set was scanned against PFAM v26.0 (Punta et al. 2012) to obtain all genes containing KD using the Pfam_scan.pl script (ftp://ftp.sanger.ac.uk/pub/databases/Pfam/Tools). These genes were then assigned to subfamilies according to the similarity of their KDs to known KDs of plant RLK/Pelle genes (Lehti-Shiu et al. 2009). A previous study (Shiu and Bleecker 2003) suggested that AtBRI1 and AtBRL genes belonged to the LRR-Xb-1 subfamily (called RLK/Pelle-LRR-Xb-1 genes hereafter). All genes belonging to the RLK/Pelle-LRR-Xb-1 subfamily were extracted and an amino acid neighbor-joining (NJ) tree was built using their KD sequences (see below). Genes that fell within the same well-supported clade as known BRI1–BRL genes were extracted as BRI1–BRL candidates (Supporting Information Fig. S1).

We manually curated the BRI1–BRL candidates from fully sequenced organisms to improve gene models. Every gene locus was re-annotated with AUGUSTUS (Stanke et al. 2008) using EST evidence. Each predicted gene model was compared to that released by the genome sequencing centers and the better model was chosen based on EST support and sequence alignment quality. This manual inspection excluded nine models from further study because they were far shorter than other BRI1–BRL genes, lacked other essential domains, or had no EST support.

Protein Domain Profile Construction

Multiple sequence alignment (MSA) of protein sequences of BRI1–BRL genes was constructed using MUSCLE (Edgar 2004) and the alignment was manually inspected. In the alignment, blocks representing ID, LRR21, JM, and KD domains were extracted according to their locations in AtBRI1 (Vert et al. 2005). Profile hidden Markov models (HMMs) of domains were then constructed by using HMMER v3.0 package (Finn et al. 2011).

Domain Configuration Identification

The identification of ID, JM, and KD was performed by hmmsearch (E value = 1e-5) using the profiles constructed above. Besides hmmsearch, PFAM profiles (v26.0) were also scanned against the genome sequences of the 12 chlorophytes to detect the LRR–KD configuration in these species. The presence of TM in genes was predicted using the TMHMM webserver (http://www.cbs.dtu.dk/services/TMHMM/).
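Turning per-domain hits into a configuration string can be illustrated with a small sketch. It does not parse real hmmsearch or TMHMM output; the `Hit` record, the coordinates, and the reading of the E value of 1e-5 as an upper bound are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Hit:
    domain: str    # e.g. "LRR", "ID", "TM", "JM", "KD"
    start: int     # 1-based start position on the protein
    evalue: float  # 0.0 for predictions that carry no E value (e.g. TM)

def domain_configuration(hits, max_evalue=1e-5):
    """Order the accepted hits along the protein and collapse tandem LRRs."""
    kept = sorted((h for h in hits if h.evalue <= max_evalue),
                  key=lambda h: h.start)
    config = []
    for hit in kept:
        label = "LRR array" if hit.domain == "LRR" else hit.domain
        if not config or config[-1] != label:  # merge consecutive repeats
            config.append(label)
    return "–".join(config)

# Hypothetical hits for one protein; positions and E values are made up.
hits = [
    Hit("KD", 50, 1e-2),  # weak hit, rejected by the cutoff
    Hit("LRR", 100, 1e-8), Hit("LRR", 124, 1e-9), Hit("LRR", 148, 1e-7),
    Hit("ID", 590, 1e-20),
    Hit("LRR", 660, 1e-8), Hit("LRR", 684, 1e-6),
    Hit("TM", 790, 0.0), Hit("JM", 815, 1e-12), Hit("KD", 880, 1e-90),
]
configuration = domain_configuration(hits)
```

For these toy hits the result is the string "LRR array–ID–LRR array–TM–JM–KD", the form in which configurations are written in this study.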

Phylogenetic Analysis

KD regions were aligned using MUSCLE, and the NJ tree (Saitou and Nei 1987) of all RLK/Pelle-LRR-Xb-1 genes was built using MEGA 5 (Tamura et al. 2011) with the following parameters: Poisson model; uniform rate; and pairwise deletion of gaps/missing data. Here, we excluded sequences shorter than 140 aa (50 % of the average size of experimentally verified BRI1–BRL genes), so the size of the KD sequences used in this NJ tree construction ranged from 140 to 355 aa.
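The distance setting named above can be made concrete. Under the Poisson model the distance between two aligned proteins is d = −ln(1 − p), where p is the proportion of differing sites, and pairwise deletion means that a site is skipped only for the pair in which one sequence has a gap or missing data. The sketch below is an independent illustration rather than MEGA's code, and treating "-", "?" and "X" as gap or missing characters is an assumption.

```python
import math

def poisson_distance(seq_a, seq_b, missing="-?X"):
    """Poisson-corrected distance between two aligned protein sequences.

    Sites where either sequence has a gap/missing character are ignored
    for this pair only (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in missing or b in missing:
            continue
        compared += 1
        differences += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    p = differences / compared
    if p >= 1.0:
        return math.inf  # the correction is undefined when every site differs
    return -math.log(1.0 - p)

# Five comparable sites, one difference: p = 0.2
d = poisson_distance("MKT-LA", "MRTALA")
```

With pairwise deletion, short partial sequences such as EST-derived KDs still contribute every site they share with each partner, which is presumably why a minimum length of 140 aa was enforced.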

An amino acid maximum likelihood (ML) phylogenetic tree of BRI1–BRL genes from sequenced angiosperms was constructed by RAxML (Stamatakis 2006) using the conserved region stretching from ID to KD (ID–LRR{4}–TM–JM–KD; sequence lengths 536–648 aa). Sequence alignments were generated by MUSCLE and low quality regions were excluded from the alignment using trimAL (Capella-Gutierrez et al. 2009) with the option "automated1". Protein model selection was performed by Prottest 3.0 (Abascal et al. 2005). Parameters of ML tree construction were as follows: JTT + I + G model; rapid bootstrap replicates = 100.

Of the 227 Arabidopsis genes whose KDs belong to the LRR RLK domain group, 21 highly diverged ones (for which no valid evolutionary distance to the others could be calculated) were excluded. The DNA MSA of the coding sequences of the KD domain (sequence lengths 417–900 nt) was constructed with MUSCLE and trimmed with trimAL. The ML tree was built by RAxML with the following parameter settings: model = GTR + I + G; rapid bootstrap replicates = 100. Model selection was performed by jModeltest (Posada 2008).

Gene Duplication–Loss History

The species tree used in this study was constructed by modifying the Phytozome 8 plant tree of life: A. trichopoda was added as the sister taxon of monocots and dicots and Panicum virgatum as the sister species of Setaria italica. Gene tree and species tree reconciliation was performed by Notung 2.6 (Chen et al. 2000). When reconstructing the gene duplication–loss history, weakly supported edges (bootstrap value <90 %) of the gene tree were allowed to be rearranged to minimize duplication and loss events.

Selection Analysis

Coding DNA sequence alignments of paralogous pairs were obtained under the guidance of protein alignments using PAL2NAL (Suyama et al. 2006). Dn, Ds, and Dn/Ds values were calculated by the yn00 program of the PAML package (Yang 2007) according to the Nei–Gojobori method (Nei and Gojobori 1986).
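The final step of the Nei–Gojobori calculation is short enough to sketch. Once the numbers of synonymous and nonsynonymous sites and differences have been counted (the laborious part, left here to yn00), each proportion of differences is corrected with the Jukes–Cantor formula d = −(3/4) ln(1 − 4p/3). The function names and the input counts below are hypothetical.

```python
import math

def jukes_cantor(p):
    """Jukes-Cantor correction of a proportion of differences."""
    if not 0.0 <= p < 0.75:
        raise ValueError("proportion must lie in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def dn_ds(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Return (Dn, Ds, Dn/Ds) from counted sites and differences."""
    dn = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ds = jukes_cantor(syn_diffs / syn_sites)
    ratio = dn / ds if ds > 0 else float("nan")
    return dn, ds, ratio

# Made-up counts for one paralogous pair, for illustration only.
dn, ds, omega = dn_ds(nonsyn_diffs=12, nonsyn_sites=900,
                      syn_diffs=30, syn_sites=300)
```

A ratio below 1, as in this toy case, is the pattern conventionally read as purifying selection, whereas a ratio above 1 would point to positive selection.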

Results

BRI1–BRL Genes in Sequenced Genomes

Phylogenetically, the BRI1–BRL gene family was defined in this study as the sub-clade of RLK/Pelle-LRR-Xb-1 genes (Lehti-Shiu et al. 2009) that included all of the previously cloned BRI1–BRL genes. We built an NJ tree for all KDs detected in the combined plant protein set, which was made of the whole gene repertoire of the 44 sequenced genomes, all plant proteins in the Uniprot database, and all translated ESTs of nonangiosperm plants (see "Materials and Methods" section for details). In the tree (Fig. S1), the BRI1–BRL clade (bootstrap value = 95 %) included a total of 220 sequences, with 136 being full-length BRI1–BRL gene models (Table S2). Here, full-length means that the open reading frame contains both a start and a stop codon. Of the 136 full-length genes, 117 were from 29 sequenced species (Table S3). Our manual inspection predicted 4 novel models in the apple genome (Malus domestica) and suggested that 35 gene models should be modified based on expression and/or conservation evidence. The correspondence between corrected models and automatic gene annotation is shown in Table S3. The other 19 full-length BRI1–BRL genes were from Uniprot. They belonged to organisms without whole-genome sequences when this study was done (Table S2). It is worth noting that all of the BRI1–BRL genes are from angiosperms and gymnosperms.

Domain Configuration of BRI1–BRL Family

The MSA of the BRI1–BRL family showed that most domains were well conserved (Fig. S2). Low conservation was found in the first several LRRs, CT, and the internal region of TM. Overall, LRR10–24 (covering ID) and KD were the two most conserved parts in the entire alignment.

We compared the domain configuration of AtBRI1 with those of other previously identified BRI1–BRL genes and found that all of them had the same configuration as AtBRI1. Therefore, we constructed profiles of the LRR, ID, JM, and KD domains (see "Materials and Methods" section) and used them to scan the above 220 BRI1–BRL genes. In total, 152 sequences exhibited a domain configuration like "LRR array"–ID–"LRR array"–TM–JM–KD (Table S2), including 136 full-length genes (117 in sequenced species + 19 in other species from Uniprot), 11 partial proteins from conifer EST/transcriptome assemblies, and 5 partial proteins from Uniprot. The other 68 sequences had KD but not ID: 23 from Uniprot, 6 from plantGDB, and 39 from TreeGene EST/transcriptome assemblies. Unlike protein sequences, several EST/transcriptome assemblies might come from the same gene, and thus the number of EST/transcriptome assemblies might not correctly reflect the number of genes. However, the occurrence of a certain domain in these assemblies was sufficient to confirm its occurrence in the corresponding species.

In this scan, we identified TM domains through the TMHMM webserver, but not through domain profile analysis, because of low conservation at the sequence level. Sequences of the CT region were also highly divergent (Fig. S2), and the instability of CT was reported previously (Xu et al. 2009). Therefore, although it might play an inhibitory role in BRI1 function (Wang et al. 2005), CT was not suitable for profile analysis and was excluded from this study. In short, all full-length genes had a domain configuration like "LRR array"–ID–"LRR array"–TM–JM–KD, so this was a valid definition of the BRI1–BRL gene family in terms of domain configuration.

Presence and Absence of Domains in Major Lineages of the Plant Kingdom

We refined the profiles for the four domains (LRR, ID, JM, and KD) based on the MSA of the 136 full-length BRI1–BRL genes and scanned them against the combined plant protein set. The results showed the following. (1) The LRR–KD combination was detected in 22 genes from 6 chlorophytes (Table S4). In contrast, we failed to detect the LRR–KD configuration in the sequenced red alga (Cyanidioschyzon merolae). In 13 cases, LRR occurred as long tandem arrays (≥5 LRR units). (2) A TM domain was located between LRR and KD in some (nine cases) but not all of the LRR–KD structures. (3) The JM domain was not detected in any chlorophyte, but was found in all available major groups of land plants, including liverworts, mosses, lycophytes, and seed plants such as cycads, gnetophytes, conifers (Table S5), and angiosperms. Whenever full-length proteins were available, their JMs co-occurred with an LRR array, TM, and KD, but not always with ID. (4) IDs were detected only in gymnosperms (conifers and gnetophytes) and angiosperms. A total of 29 gymnosperm EST/transcriptome assemblies contained IDs, with 11 of them showing the complete BRI1–BRL gene domain configuration (Table S6). In contrast, this domain configuration analysis failed to detect BRI1–BRL members in plant genomes that diverged before seed plants, and this result was consistent with the KD phylogenetic analysis described above. In summary, according to currently available data, the first occurrences of LRR–TM–KD, LRR–TM–JM–KD, and LRR–ID–LRR–TM–JM–KD were observed in chlorophytes, liverworts, and the common ancestor of seed plants, respectively (Fig. 1).

ID Was the Diagnostic Domain of BRI1–BRL Genes

In all full-length sequences within the combined plant protein set, we found that whenever ID was present, the entire BRI1–BRL domain configuration, i.e. "LRR array"–ID–"LRR array"–TM–JM–KD, was present as well. Furthermore, we found that if a sequence contained both ID and KD, the KD fell in the BRI1–BRL clade. These observations indicate that domain configuration and the phylogenetic relationship of KD give equivalent definitions of the BRI1–BRL family, and that ID is present only in the BRI1–BRL family. This family can now be described as plant RLKs containing ID.

Characteristics of ID Domain

The investigation of the alignment profile of the ID domain of BRI1–BRL genes (Figs. 2a, S3) identified six residues conserved in all genes: C23–G27–L29–E31–C49–Y57, where letters are the codes of residues and numbers are the locations of residues in the alignment. Besides the CGLECY hexad, another six highly conserved residues or motifs (i.e., occurring in >95 % of the sequences) were also revealed within the ID domain (Fig. 2a). Moreover, the ID domain sequences could be further categorized into three groups according to their conservation pattern (Fig. 2b). The three ID groups were congruent with the three major clades of the BRI1–BRL family (Fig. 2b and see below).

Phylogenetics of Plant BRI1–BRL Genes

The ML phylogeny of BRI1–BRL genes from the 29 sequenced angiosperms (Fig. 3) supported with 100 % bootstrap value that the BRI1–BRL family was composed of three major clades: the basal split was between BRL2 and all others, and the second split was between BRI1 and BRL1–3 genes. Here, based on the names of the Arabidopsis and rice members, we named the three clades BRI1, BRL1–3, and BRL2, respectively. Reconciliation of the BRI1–BRL gene tree with the sequenced angiosperm species tree (Fig. S4) suggested that the three major clades were derived from two gene duplication events before the divergence of angiosperms (Fig. S5). These results were consistent with previous results obtained from far fewer sequences (Cano-Delgado et al. 2004; Gish and Clark 2011).

We investigated the gene duplication and loss pattern of the family. (1) The majority of duplications (85 %, 28 out of 33) and losses (77 %, 10 out of 13) happened after the divergence of leaf nodes from their closest relative species in the current taxon sampling (Fig. S5). This indicated that no drastic expansion or contraction of gene number occurred in the ancestral nodes. (2) Copy number variation of BRI1–BRL genes was found in all three clades (Table 1), with the highest number of genes in M. domestica (nine; three in each clade) and the lowest in Medicago truncatula (one, from the BRI1 clade). However, 62 % (18 out of 29) of species had three or four genes and, except for M. domestica, all gene copy numbers per species were within mean ± 2 × SD. In this sense, gene numbers showed no strong organism bias. (3) The three major clades had similar numbers of genes (41 in BRI1, 41 in BRL1–3, and 35 in BRL2). These observations support the view that gene numbers were quite stable during the evolution of this gene family.
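The counts quoted in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text; the per-species copy numbers of Table 1 are not reproduced here, so the helper `outside_two_sd` is left without data, and its use of the sample standard deviation is an assumption because the text does not say which SD was applied.

```python
import statistics

def percent(part, whole):
    """Percentage rounded to the nearest integer."""
    return round(100 * part / whole)

terminal_duplications = percent(28, 33)   # reported as 85 %
terminal_losses = percent(10, 13)         # reported as 77 %
three_or_four_genes = percent(18, 29)     # reported as 62 %

clade_sizes = {"BRI1": 41, "BRL1-3": 41, "BRL2": 35}
total_genes = sum(clade_sizes.values())   # 117 genes in sequenced species
mean_copy_number = total_genes / 29       # about four genes per species

def outside_two_sd(copy_numbers):
    """Return the species whose copy number lies outside mean ± 2 × SD.

    copy_numbers maps species name to gene count (e.g. taken from Table 1).
    """
    values = list(copy_numbers.values())
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD, an assumption
    return sorted(s for s, n in copy_numbers.items() if abs(n - mean) > 2 * sd)
```

The clade sizes sum to the 117 full-length genes found in the 29 sequenced species, which corresponds to a mean of roughly four genes per species.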

According to the phylogeny of BRL1–3 (Fig. 3b), the Arabidopsis and rice BRL1 and BRL3 genes had independent origins. AtBRL1 and AtBRL3 were derived from a duplication in the common ancestor of Brassicaceae, which left two descendants in each of the five sequenced Brassicaceae organisms. Rice BRL1 and BRL3 originated through a duplication in the common ancestor of grasses after its divergence from dicots, and both descendants were also preserved in most of the investigated grass species, except for Brachypodium. In switchgrass (P. virgatum), a second round of duplications occurred independently in each of the two copies and resulted in four BRL1–3 genes.

Selection on Paralogs

The 28 terminal species duplications generated 22 groups of in-paralogs (Koonin 2005): 17 pairs related by 1 duplication, 4 triplets by 2, and 1 quartet by 3 duplications (Fig. S5). Estimating selection pressure with the Nei–Gojobori method gave Dn/Ds values <1 for all within-group gene pairs (Table S7) and detected no positive selection.
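The group counts above are internally consistent, as a short check shows: a group of n in-paralogs requires n − 1 duplications and contains n(n − 1)/2 gene pairs. The sketch below uses only the numbers in the text; that every within-group pair was compared is an assumption, so the pair count need not match the number of rows in Table S7.

```python
from math import comb

# Group size -> number of groups, as reported in the text.
groups = {2: 17, 3: 4, 4: 1}

n_groups = sum(groups.values())                                      # 22
n_duplications = sum((size - 1) * n for size, n in groups.items())   # 28
n_pairs = sum(comb(size, 2) * n for size, n in groups.items())       # 35
```

The 22 groups and 28 duplications match the reported figures, and 35 is the number of within-group pairs implied if all pairs in each group were tested.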

Fig. 1 Domain configuration evolution of BRI1–BRL family. The relationship within seed plants remains unresolved because of controversial results from different datasets and methods (see e.g. Mathews 2009). The three stages of BRI1–BRL domain configuration evolution are mapped to the tree. Arrows point to the internal nodes which are lower bounds of the first presence of configurations. Right table shows the presence/absence of domains. "+": presence and "−": absence

Discussion

Origin of the BRI1–BRL Family: A Three-Stage Process of Domain Gain

The evolution of the domain configuration provides a scenario of gene family evolution at the level of the organization of structural and/or functional units. Our results show a perfect consistency between phylogeny and domain configuration when defining the BRI1–BRL gene family. If the origin of the family is marked by the formation of the "LRR array"–ID–"LRR array"–TM–JM–KD domain configuration, we have revealed a stepwise domain acquisition process in its "prehistory".

LRRs are present in numerous proteins from all major branches of the tree of life and with diverse functions (Kobe and Kajava 2001). To date, at least seven groups of LRRs have been identified, and most, if not all, LRRs have been found to follow the so-called mutual exclusion rule, i.e., LRRs from different groups never occur together in the same protein. Although the phylogenetic relationships of the seven groups have not yet been well resolved (Andrade et al. 2000; Kajava 1998), it is clear that the LRRs present in BRI1–BRL genes belong to the plant-specific group, which is found in plants and protists (Kobe and Kajava 2001).

The ancient origin of the KDs of BRI1–BRL genes has been known for more than 10 years: this domain belongs to the plant RLK domain family, which together with animal Pelle and related cytoplasmic KDs forms the RLK/Pelle domain family. The RLK/Pelle domain family, animal receptor serine/threonine kinases, animal receptor tyrosine kinases, and Raf kinases form a clade within the serine/threonine/tyrosine kinases (Shiu and Bleecker 2001b). The domain configuration of RLKs is defined as "extracellular domain"–KD, where extracellular domain represents a wide range of domains including LRR (Shiu and Bleecker 2001a). It has been suggested that such domain combination is the structural basis of the evolution of new signaling pathways (Lehti-Shiu et al. 2009).

Lehti-Shiu et al. (2009) suggested that the LRR–KD configuration first occurred in streptophytes after their split from chlorophytes and before the divergence of land plants (embryophytes). However, their finding that the LRR–KD configuration is absent from chlorophytes was based on limited data: only two species, Chlamydomonas reinhardtii and Ostreococcus tauri, were available at that time. In this study, we have revisited this question by identifying the domain configuration in the 12 sequenced chlorophyte genomes deposited in the DOE-JGI genome portal with two methods (see "Materials and Methods" section and Table S4). We have added data from the red alga C. merolae (Matsuzaki et al. 2004) to this investigation since red algae have been placed as the sister group of green plants in the tree of life by many molecular phylogenetic studies (Adl et al. 2005). Our results support the view that the plant LRR–KD configuration originated in green plants after the split between them and red algae but before the split of streptophytes from chlorophytes.

JM domains are found in all major lineages of land plants but not in any of the 12 chlorophytes. In contrast, ID only occurs in gymnosperms and angiosperms. This indicates that in the evolutionary trajectory of BRI1–BRL genes, the acquisition of JM preceded the acquisition of ID. If the absence of the JM domain in the 12 chlorophytes correctly reflects the absence of this domain in chlorophytes, the origin of JM probably occurred after the streptophytes diverged from chlorophytes but before the divergence of liverworts from other land plants. It has been suggested that Charales and Coleochaetales are more closely related to land plants than other green algae (Karol et al. 2001; Qiu et al. 2006). However, whole-genome data on these lineages are not available, so we cannot decide whether the origin of JM was before or after the plant colonization of land.

Both KD similarity and domain configuration analyses suggest that the
origin of ID is bounded within euphyllophytes: after their split from
lycophytes but before the divergence of angiosperms and extant
gymnosperms. Ferns and horsetails form a major lineage that diverged after
lycophytes but before seed plants. Unfortunately, currently accessible data
are not sufficient to resolve whether ID occurs in this lineage.

The diversification of these three clades, and perhaps the differentiation
of their functions (see below), can also be dated to before the divergence
of angiosperms and gymnosperms: in the RLK/Pelle-LRR-Xb-1 NJ tree
(Fig. S1), all three major clades of the BRI1–BRL family have members from
conifers (although bootstrap support at some nodes is low). If gymnosperms
form the sister clade of angiosperms, as argued by recent molecular studies
(see review in Mathews 2009), the origin of the three clades can probably
be traced to the ancestor of seed plants. If not, the lower bound can be
placed at the common ancestor of angiosperms and conifers.

Fig. 2 ID domain of the BRI1–BRL family. a Alignment after redundancy
elimination at a cutoff of 90 % sequence identity. The complete alignment
is shown in Fig. S3. In each subgroup, conserved residues of ID are
highlighted with dots. Blue stripes mark the columns of conserved residues.
Numbers at the top of the alignment show the locations of residues in the
alignment. b Schematic representation of the ID domain for each group. ID
domains are represented by horizontal blue stripes. Vertical bars with
amino acid symbols above the domains show the locations of conserved
residues. Vertical bars at the two termini of the domains show the
locations of the first and last residues. The relationship between the
three groups is shown on the left. This topology is based on the NJ tree of
ID sequences, but only the basal branching pattern of the three groups is
shown. Bootstrap values (1,000 replicates) of the three groups are shown on
the branches (Color figure online)

In short, the evolution of the domain configuration of the BRI1–BRL family
can be resolved into a three-stage process (Fig. 1): first, the LRR–TM–KD
configuration was ancestral to all green plants; then, in streptophytes,
either before or after the divergence of Coleochaetales and Charales, the
JM domain was added; and finally, ID appeared in the common ancestor of
angiosperms and gymnosperms after it diverged from lycophytes.
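The bracketing logic behind these three stages can be illustrated with a short Python sketch. It is not the pipeline used in this study; it assumes a single gain without subsequent loss for each domain and a fixed order in which lineages diverged from the line leading to angiosperms. The lineage list, the presence sets (transcribed from the statements above), and the function name `origin_window` are illustrative assumptions.

```python
# Lineages ordered by when they diverged from the line leading to angiosperms.
# This ordering is a simplifying assumption used only for illustration.
LINEAGES = ["red algae", "chlorophytes", "liverworts",
            "lycophytes", "gymnosperms", "angiosperms"]

# Presence of each domain configuration, transcribed from the text above.
PRESENCE = {
    "LRR-KD": {"chlorophytes", "liverworts", "lycophytes",
               "gymnosperms", "angiosperms"},
    "JM": {"liverworts", "lycophytes", "gymnosperms", "angiosperms"},
    "ID": {"gymnosperms", "angiosperms"},
}


def origin_window(domain):
    """Bracket the origin of a domain under a single-gain, no-loss model.

    Returns (after, before): the gain is placed after the split from
    lineage `after` and before the divergence of lineage `before`.
    """
    present = PRESENCE[domain]
    first = next(i for i, name in enumerate(LINEAGES) if name in present)
    # Under the no-loss assumption, every later-diverging lineage has the domain.
    missing_later = [name for name in LINEAGES[first:] if name not in present]
    if missing_later:
        raise ValueError(f"{domain} absent from {missing_later}; model violated")
    after = LINEAGES[first - 1] if first > 0 else None
    return after, LINEAGES[first]


if __name__ == "__main__":
    for domain in PRESENCE:
        after, before = origin_window(domain)
        print(f"{domain}: after split from {after}, before divergence of {before}")
```

Under these assumptions the windows match the three stages described above. Lineages without genome data (ferns, horsetails, Charales, Coleochaetales) would narrow the windows if they could be added to the list.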


Origin of BR Recognition via BRI1

Although island regions that interrupt the LRRs in proteins containing LRR
arrays have been widely observed, little is known about their origin,
evolution, and function (Matsushima et al. 2009). To date, island regions
have been proposed to be functional in only a few genes, e.g. AtBRI1 in
Arabidopsis, DcPSKR in Daucus carota and Toll in Drosophila (Gibbard et al.
2006; Kinoshita et al. 2005; Shinohara et al. 2007). In all three cases,
the functional island regions (or ID) have been inferred to play a role in
ligand interaction.

In AtBRI1, the requirement of ID (along with its downstream LRR) for BR
binding has been confirmed (Kinoshita et al. 2005). The recently resolved
three-dimensional structure of AtBRI1 has revealed in detail how ID
functions in BR perception (Hothorn et al. 2011; She et al. 2011): ID folds
back into the interior of the LRR superhelix to form a surface pocket to
which brassinolide, the first isolated BR, can bind. Since ID acquisition
is the last step in the evolution of the domain configuration of the
BRI1–BRL family, this domain gain event gave an LRR RLK gene the ability to
recognize BRs and made it possible for this new gene to be recruited as the
receptor that initiates the BR signaling cascade.

An interesting follow-up question is when and how many times this
occurred. Although exceptions have been reported (Lynch and Wagner 2008;
Nehrt et al. 2011), the orthology conjecture, i.e. that orthologues carry
out equivalent functions whereas paralogues undergo functional
diversification, seems applicable in general (Gabaldon and Koonin 2013). In
our case, the AtBRI1 orthologs in tomato (Holton et al. 2007) have also
been established as active BR receptors. In rice, evidence also suggests
that OsBRI1 is a BR receptor (Yamamuro et al. 2000; Zhao et al. 2002). If
the orthology conjecture holds in our case and the orthology of BRI1 genes
in Arabidopsis, tomato and rice reflects conservation of an ancient
function, our results suggest that the recruitment of the BRI1 gene as a
component of the BR signaling pathway had a single origin in plant
evolution.

We failed to detect BRI1–BRL homologs with ID in nonseed plants in this
study. However, BRs have been detected throughout the plant kingdom in
every species that has been examined, including nonseed plants such as fern
(Equisetum arvense), liverwort (Marchantia polymorpha), and green algae
(Chlorella vulgaris, Hydrodictyon reticulatum) (Bajguz and Tretyn 2003;
Clouse 2011). Therefore, if BR signaling exists in these “lower” plant
lineages, they should use a different BR receptor. Collectively, we
estimate that the origin of the Arabidopsis BR signaling paradigm is no
earlier than the divergence of lycophytes.

Fig. 3 Amino acid ML phylogeny of BRI1–BRL genes in sequenced plant
genomes. a Relationship of deep nodes and the BRI1 clade. The KD of
AT5G07280 is used as the outgroup. b, c Phylogeny of the BRL1–3 and BRL2
clades, respectively. Bootstrap values from 100 replicates are shown on the
branches. Only values of 50 % or more are shown. The format of leaf node
names is “three-letter species code” “Gene ID”/“location of the region used
in tree construction”. The correspondence between species codes and
standard species names can be found in Table S1. Detailed information about
Gene IDs can be found in Table S3

The Consistency Between Phylogenetic and Functional

Differentiation

Our results show a strong correlation between evolutionary pattern and gene
function. It is known (Cano-Delgado et al. 2004; Clay and Nelson 2002; Li
and Chory 1997; Nakamura et al. 2006) that Arabidopsis BRI1–BRL genes can
be functionally categorized into two groups: (1) AtBRI1, AtBRL1 and AtBRL3
act as receptors of BR; (2) AtBRL2 does not induce a BR response but plays
a role in transducing other extracellular spatial and temporal signals into
downstream cell differentiation responses in provascular/procambial cells
(Ceserani et al. 2009; Clay and Nelson 2002). In the BR receptor group,
AtBRI1 is the major BR receptor and is ubiquitously expressed in growing
cells, while AtBRL1 and AtBRL3 act as functionally redundant partners of
AtBRI1 and induce cell-type-specific BR responses in vascular tissues
(Cano-Delgado et al. 2004). Consistent with this functional diversity, the
ML phylogeny (Fig. 3) shows that the basal split of BRI1–BRL genes is
between BRL2 and all others, and that the second split is between BRI1 and
BRL1–3 genes. Both splits are caused by duplications that induce neo- and
sub-functionalization.

At nine sites in the ID, amino acids are conserved in the BR-binding genes
AtBRI1, AtBRL1–3, and OsBRI1, but not in the confirmed non-BR-binding gene
AtBRL2. Moreover, seven of the nine sites (i.e.
9K–11Y/F–61T–64T–68N–69G–71S) are highly conserved (identity >90 %) in all
BRI1 and BRL1–3 gene models (Fig. 2). Therefore, it is possible that one or
several of these substitutions played a role in the functional
diversification of AtBRL2. It is worth noting that substitutions in the ID
may not be the (or the only) source of the functional alteration of AtBRL2;
changes in other domains may also contribute to it or be fully responsible
for it. Further analyses are needed to reveal why AtBRL2 does not function
as a BR receptor.
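The kind of site screen described here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure used in this study: the toy sequences are invented, gaps are not handled, and the function names `column_identity` and `diagnostic_sites` are hypothetical. A site is reported when the most common residue among the BR-binding sequences exceeds the identity threshold and the non-binding sequence carries a different residue.

```python
from collections import Counter


def column_identity(column):
    """Return (most common residue, its fraction) for one alignment column."""
    residue, count = Counter(column).most_common(1)[0]
    return residue, count / len(column)


def diagnostic_sites(binding_seqs, nonbinding_seq, threshold=0.9):
    """List 1-based alignment positions conserved in the binding group
    (identity > threshold) where the non-binding sequence differs."""
    sites = []
    for pos, column in enumerate(zip(*binding_seqs), start=1):
        consensus, identity = column_identity(column)
        other = nonbinding_seq[pos - 1]
        if identity > threshold and other != consensus:
            sites.append((pos, consensus, other))
    return sites


# Invented toy alignment (NOT real ID sequences), equal-length rows.
TOY_BINDING = ["KAYT", "KAYT", "KGFT"]
TOY_NONBINDING = "RAYS"

if __name__ == "__main__":
    for pos, consensus, other in diagnostic_sites(TOY_BINDING, TOY_NONBINDING):
        print(f"site {pos}: {consensus} in binding group, {other} in non-binding")
```

A strict identity criterion like this one would miss a site such as 11Y/F, where two similar residues alternate; grouping residues by physicochemical class before counting is one way to relax it.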

A Possible Picture of the Evolution of BRI1–BRL

Exon–Intron Structure

Of the 117 BRI1–BRL genes, 111 (95 %) were single-exon genes. Mapping
exon–intron structures onto the phylogeny of the BRI1–BRL family and
related genes could resolve the evolution of the exon–intron structure.
However, if the phylogeny were constructed using all BRI1–BRL-related genes
(i.e. all LRR RLK genes) from the 29 sequenced species, the relatively
short length of the KD (<300 aa) and the huge number of genes (Lehti-Shiu
et al. 2009) would make the tree highly unreliable. Here, we used a
sampling strategy, investigating LRR RLK genes in the Arabidopsis genome.
The reasons for choosing Arabidopsis thaliana were as follows: (1) its LRR
RLK genes are distributed in most of the subclades of the LRR RLK domain
family (Shiu et al. 2004) and (2) this genome has so far the best gene
annotation quality among plants. In this study, we investigated exon–intron
structure only within the protein-coding ORFs and not in untranslated
regions, because the prediction of untranslated regions is less reliable.

The relationship between the DNA sequences of the KDs of Arabidopsis LRR
RLK genes is shown in Fig. S6. Consistent with previous results (Shiu and
Bleecker 2003; Shiu et al. 2004), this phylogeny suggested that the
BRI1–BRL family and its sister gene EMS1 (EXCESS MICROSPOROCYTES1,
Ath:AT5G07280, a putative LRR RLK gene that controls somatic and
reproductive cell fates in Arabidopsis; Zhao et al. 2002) formed a
monophyletic group (bootstrap value = 55 %) and that most intron-containing
genes diverged earlier. If the grouping of BRI1–BRL and EMS1 as a clade is
valid and ancient LRR RLK genes were intron containing, the origin of the
single-exon structure of the BRI1–BRL family could be parsimoniously
explained by intron loss event(s) in their common ancestor before the
divergence of EMS1. However, we note that both observations have rather
poor statistical support, and further studies are needed to fully resolve
this question.

Table 1 BRI1–BRL genes in sequenced angiosperm genomes

Organism                  BRI1  BRL1–3  BRL2  Sum
Aquilegia coerulea           1       0     1    2
Arabidopsis lyrata           1       2     1    4
Arabidopsis thaliana         1       2     1    4
Amborella trichopoda         1       0     2    3
Brachypodium distachyon      1       1     1    3
Brassica rapa                3       2     1    6
Citrus clementina            1       1     1    3
Carica papaya                1       1     1    3
Capsella rubella             1       2     1    4
Cucumis sativus              1       0     1    2
Citrus sinensis              1       1     1    3
Eucalyptus grandis           1       2     2    5
Glycine max                  2       2     2    6
Linum usitatissimum          4       0     2    6
Malus domestica              3       3     3    9
Manihot esculenta            2       2     2    6
Mimulus guttatus             2       1     1    4
Medicago truncatula          1       0     0    1
Oryza sativa                 1       2     1    4
Panicum virgatum             2       4     0    6
Prunus persica               1       1     1    3
Populus trichocarpa          2       2     2    6
Phaseolus vulgaris           1       1     1    3
Ricinus communis             1       1     1    3
Sorghum bicolor              1       1     1    3
Setaria italica              1       2     1    4
Thellungiella halophila      1       2     1    4
Vitis vinifera               1       1     1    3
Zea mays                     1       2     1    4
Total                       41      41    35  117
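For readers who wish to reuse the copy numbers in Table 1, the following Python sketch re-tabulates them and checks that the row sums and column totals are internally consistent. The counts are transcribed from Table 1; the variable and function names (`TABLE1`, `check_table`, `species_lacking`) are illustrative and not part of the original analysis.

```python
# Copy numbers transcribed from Table 1: (BRI1, BRL1-3, BRL2, Sum).
TABLE1 = {
    "Aquilegia coerulea": (1, 0, 1, 2),
    "Arabidopsis lyrata": (1, 2, 1, 4),
    "Arabidopsis thaliana": (1, 2, 1, 4),
    "Amborella trichopoda": (1, 0, 2, 3),
    "Brachypodium distachyon": (1, 1, 1, 3),
    "Brassica rapa": (3, 2, 1, 6),
    "Citrus clementina": (1, 1, 1, 3),
    "Carica papaya": (1, 1, 1, 3),
    "Capsella rubella": (1, 2, 1, 4),
    "Cucumis sativus": (1, 0, 1, 2),
    "Citrus sinensis": (1, 1, 1, 3),
    "Eucalyptus grandis": (1, 2, 2, 5),
    "Glycine max": (2, 2, 2, 6),
    "Linum usitatissimum": (4, 0, 2, 6),
    "Malus domestica": (3, 3, 3, 9),
    "Manihot esculenta": (2, 2, 2, 6),
    "Mimulus guttatus": (2, 1, 1, 4),
    "Medicago truncatula": (1, 0, 0, 1),
    "Oryza sativa": (1, 2, 1, 4),
    "Panicum virgatum": (2, 4, 0, 6),
    "Prunus persica": (1, 1, 1, 3),
    "Populus trichocarpa": (2, 2, 2, 6),
    "Phaseolus vulgaris": (1, 1, 1, 3),
    "Ricinus communis": (1, 1, 1, 3),
    "Sorghum bicolor": (1, 1, 1, 3),
    "Setaria italica": (1, 2, 1, 4),
    "Thellungiella halophila": (1, 2, 1, 4),
    "Vitis vinifera": (1, 1, 1, 3),
    "Zea mays": (1, 2, 1, 4),
}
CLADES = ("BRI1", "BRL1-3", "BRL2")


def check_table(table):
    """Return (species whose row sum is inconsistent, column totals)."""
    bad_rows = [sp for sp, (a, b, c, s) in table.items() if a + b + c != s]
    totals = tuple(sum(col) for col in zip(*table.values()))
    return bad_rows, totals


def species_lacking(table, clade):
    """List species with no annotated member of the given clade."""
    idx = CLADES.index(clade)
    return [sp for sp, counts in table.items() if counts[idx] == 0]


if __name__ == "__main__":
    bad_rows, totals = check_table(TABLE1)
    print("inconsistent rows:", bad_rows)
    print("totals (BRI1, BRL1-3, BRL2, Sum):", totals)
    for clade in CLADES:
        print(f"no {clade} gene model:", species_lacking(TABLE1, clade))
```

A zero in this table means that no gene model was found in the annotation used, which may reflect either a true loss or an incomplete assembly.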

Mapping the domain configuration of Arabidopsis LRR RLK genes onto the
phylogeny (Fig. S6) identified only five genes containing the JM domain:
the four BRI1–BRL genes and EMS1. Although statistical support is low, the
pattern that all of the JM-containing genes formed a clade indicated that
the acquisition of JM might have followed the origin of the single-exon
structure of the BRI1–BRL family.

Intron-Gain Events

The other six BRI1–BRL genes with two introns were probably derived from
intron gain. The coding sequences flanking these introns were highly
conserved (two examples are shown in Fig. S7), and all of these introns
were located at different sites in the genes; therefore, the six introns
were likely gained through independent events.

Nature of Ancient Intron Loss

Ancient intron loss event(s) that shaped the current exon–intron structure
of BRI1–BRL genes may have removed a single intron or multiple introns at a
time. Both reverse transcriptase-mediated (RT-mediated) intron loss (Derr
1998; Roy and Gilbert 2006) and retroposition (Brosius 1991; McCarrey and
Thomas 1987) can lead to simultaneous loss of multiple introns.
Retroposition usually generates an intronless copy located at a different
locus from the original copy, through the activity of retroelements such as
LINEs or LTR retrotransposons. RT-mediated intron loss, or gene conversion
by a cDNA, in contrast, removes introns from the original gene but does not
change its physical position in the genome. The two mechanisms can thus be
distinguished when the compared species are evolutionarily close enough
that synteny is detectable at the target loci. In the BRI1–BRL case,
however, it is difficult to distinguish the two mechanisms because of the
rapid erosion of gene synteny in plant genome evolution.
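The synteny test described above can be illustrated with a minimal Python sketch. It is a hypothetical illustration, not an analysis performed in this study: the function name `classify_intron_loss`, the threshold `min_shared`, and the ortholog-group labels are all assumptions. The idea is to compare the genes flanking an intron-containing gene in one species with those flanking its intronless counterpart in another.

```python
def classify_intron_loss(flank_intron_bearing, flank_intronless, min_shared=0.5):
    """Compare flanking genes (as ortholog-group labels) at the two loci.

    Returns "in_place" when the loci share enough flanking genes, which is
    the pattern expected for RT-mediated intron loss; "relocated_or_eroded"
    when they do not, which is expected for retroposition but also arises
    when synteny has simply decayed; and "undetermined" when either
    flanking set is empty.
    """
    a, b = set(flank_intron_bearing), set(flank_intronless)
    if not a or not b:
        return "undetermined"
    shared = len(a & b) / min(len(a), len(b))
    return "in_place" if shared >= min_shared else "relocated_or_eroded"


if __name__ == "__main__":
    # Hypothetical ortholog-group labels for the neighbours of each locus.
    ancestral_like = ["og1", "og2", "og3", "og4"]
    print(classify_intron_loss(ancestral_like, ["og1", "og2", "og3", "og9"]))
    print(classify_intron_loss(ancestral_like, ["og7", "og8", "og9", "og10"]))
```

The second outcome is deliberately labelled as ambiguous: as noted above, rapid erosion of synteny in plant genomes can produce the same signal as retroposition, so only the "in_place" result is informative on its own.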

Acknowledgments This work was supported by the National Basic

Research Program of China (973 Project No. 2007CB814800 and

2013CB834100) and the Shanghai Leading Academic Discipline

Project (No. B111). This study was also supported in part by

resources and technical expertise from the Georgia Advanced Com-

puting Resource Center, a partnership between the University of

Georgia’s Office of the Vice President for Research and Office of the

Vice President for Information Technology.

Conflict of interest The authors declare that they have no conflict

of interest.

References

Abascal F, Zardoya R, Posada D (2005) ProtTest: selection of best-fit

models of protein evolution. Bioinformatics 21:2104

Adl SM, Simpson AG, Farmer MA, Andersen RA, Anderson OR,

Barta JR, Bowser SS, Brugerolle G, Fensome RA, Fredericq S,

James TY, Karpov S, Kugrens P, Krug J, Lane CE, Lewis LA,

Lodge J, Lynn DH, Mann DG, McCourt RM, Mendoza L,

Moestrup O, Mozley-Standridge SE, Nerad TA, Shearer CA,

Smirnov AV, Spiegel FW, Taylor MF (2005) The new higher

level classification of eukaryotes with emphasis on the taxonomy

of protists. J Eukaryot Microbiol 52:399

Andrade MA, Ponting CP, Gibson TJ, Bork P (2000) Homology-

based method for identification of protein repeats using statis-

tical significance estimates. J Mol Biol 298:521

Bajguz A, Tretyn A (2003) The chemical characteristic and distri-

bution of brassinosteroids in plants. Phytochemistry 62:1027

Brosius J (1991) Retroposons—seeds of evolution. Science 251:753

Cano-Delgado A, Yin Y, Yu C, Vafeados D, Mora-Garcia S, Cheng

JC, Nam KH, Li J, Chory J (2004) BRL1 and BRL3 are novel

brassinosteroid receptors that function in vascular differentiation

in Arabidopsis. Development 131:5341

Capella-Gutierrez S, Silla-Martinez JM, Gabaldon T (2009) trimAL: a

tool for automated alignment trimming in large-scale phyloge-

netic analyses. Bioinformatics 25:1972

Ceserani T, Trofka A, Gandotra N, Nelson T (2009) VH1/BRL2

receptor-like kinase interacts with vascular-specific adaptor

proteins VIT and VIK to influence leaf venation. Plant J 57:1000

Chen K, Durand D, Farach-Colton M (2000) NOTUNG: a program

for dating gene duplications and optimizing gene family trees.

J Comput Biol 7:429

Chono M, Honda I, Zeniya H, Yoneyama K, Saisho D, Takeda K,

Takatsuto S, Hoshino T, Watanabe Y (2003) A semidwarf

phenotype of barley uzu results from a nucleotide substitution in

the gene encoding a putative brassinosteroid receptor. Plant

Physiol 133:1209

Clay NK, Nelson T (2002) VH1, a provascular cell-specific receptor

kinase that influences leaf cell patterns in Arabidopsis. Plant Cell

14:2707

Clouse SD (2011) Brassinosteroid signal transduction: from receptor

kinase activation to transcriptional networks regulating plant

development. Plant Cell 23:1219

Clouse SD, Sasse JM (1998) BRASSINOSTEROIDS: essential

regulators of plant growth and development. Annu Rev Plant

Physiol Plant Mol Biol 49:427

Derr LK (1998) The involvement of cellular recombination and repair

genes in RNA-mediated recombination in Saccharomyces cere-

visiae. Genetics 148:937

Dolan J, Walshe K, Alsbury S, Hokamp K, O’Keeffe S, Okafuji T,

Miller SF, Tear G, Mitchell KJ (2007) The extracellular leucine-

rich repeat superfamily; a comparative survey and analysis of


evolutionary relationships and expression patterns. BMC

Genomics 8:320

Edgar RC (2004) MUSCLE: multiple sequence alignment with high

accuracy and high throughput. Nucleic Acids Res 32:1792

Finn RD, Clements J, Eddy SR (2011) HMMER web server:

interactive sequence similarity searching. Nucleic Acids Res

39:W29

Gabaldon T, Koonin EV (2013) Functional and evolutionary impli-

cations of gene orthology. Nat Rev Genet 14:360

Gibbard RJ, Morley PJ, Gay NJ (2006) Conserved features in the

extracellular domain of human toll-like receptor 8 are essential

for pH-dependent signaling. J Biol Chem 281:27503

Gish LA, Clark SE (2011) The RLK/Pelle family of kinases. Plant J

66:117

Holton N, Cano-Delgado A, Harrison K, Montoya T, Chory J, Bishop

GJ (2007) Tomato BRASSINOSTEROID INSENSITIVE1 is

required for systemin-induced root elongation in Solanum

pimpinellifolium but is not essential for wound signaling. Plant

Cell 19:1709

Hothorn M, Belkhadir Y, Dreux M, Dabi T, Noel JP, Wilson IA,

Chory J (2011) Structural basis of steroid hormone perception by

the receptor kinase BRI1. Nature 474:467

Kajava AV (1998) Structural diversity of leucine-rich repeat proteins.

J Mol Biol 277:519

Karol KG, McCourt RM, Cimino MT, Delwiche CF (2001) The

closest living relatives of land plants. Science 294:2351

Kim TW, Wang ZY (2010) Brassinosteroid signal transduction from

receptor kinases to transcription factors. Annu Rev Plant Biol

61:681

Kinoshita T, Cano-Delgado A, Seto H, Hiranuma S, Fujioka S,

Yoshida S, Chory J (2005) Binding of brassinosteroids to the

extracellular domain of plant receptor kinase BRI1. Nature

433:167

Kobe B, Kajava AV (2001) The leucine-rich repeat as a protein

recognition motif. Curr Opin Struct Biol 11:725

Koka CV, Cerny RE, Gardner RG, Noguchi T, Fujioka S, Takatsuto

S, Yoshida S, Clouse SD (2000) A putative role for the tomato

genes DUMPY and CURL-3 in brassinosteroid biosynthesis and

response. Plant Physiol 122:85

Koonin EV (2005) Orthologs, paralogs, and evolutionary genomics.

Annu Rev Genet 39:309

Lehti-Shiu MD, Zou C, Hanada K, Shiu SH (2009) Evolutionary

history and stress regulation of plant receptor-like kinase/Pelle

genes. Plant Physiol 150:12

Li J, Chory J (1997) A putative leucine-rich repeat receptor kinase

involved in brassinosteroid signal transduction. Cell 90:929

Lynch VJ, Wagner GP (2008) Resurrecting the role of transcription

factor change in developmental evolution. Evolution 62:2131

Mathews S (2009) Phylogenetic relationships among seed plants:

persistent questions and the limits of molecular data. Am J Bot

96:228

Matsushima N, Mikami T, Tanaka T, Miyashita H, Yamada K,

Kuroki Y (2009) Analyses of non-leucine-rich repeat (non-LRR)

regions intervening between LRRs in proteins. Biochim Biophys

Acta 1790:1217

Matsushima N, Miyashita H, Mikami T, Kuroki Y (2010) A nested

leucine rich repeat (LRR) domain: the precursor of LRRs is a ten

or eleven residue motif. BMC Microbiol 10:235

Matsuzaki M, Misumi O, Shin IT, Maruyama S, Takahara M,

Miyagishima SY, Mori T, Nishida K, Yagisawa F, Yoshida Y,

Nishimura Y, Nakao S, Kobayashi T, Momoyama Y, Higash-

iyama T, Minoda A, Sano M, Nomoto H, Oishi K, Hayashi H,

Ohta F, Nishizaka S, Haga S, Miura S, Morishita T, Kabeya Y,

Terasawa K, Suzuki Y, Ishii Y, Asakawa S, Takano H, Ohta N,

Kuroiwa H, Tanaka K, Shimizu N, Sugano S, Sato N, Nozaki H,

Ogasawara N, Kohara Y, Kuroiwa T (2004) Genome sequence

of the ultrasmall unicellular red alga Cyanidioschyzon merolae

10D. Nature 428:653

McCarrey JR, Thomas K (1987) Human testis-specific PGK gene

lacks introns and possesses characteristics of a processed gene.

Nature 326:501

Montoya T, Nomura T, Farrar K, Kaneta T, Yokota T, Bishop GJ

(2002) Cloning the tomato curl3 gene highlights the putative

dual role of the leucine-rich repeat receptor kinase tBRI1/SR160

in plant steroid hormone and peptide hormone signaling. Plant

Cell 14:3163

Morillo SA, Tax FE (2006) Functional analysis of receptor-like

kinases in monocots and dicots. Curr Opin Plant Biol 9:460

Nakamura A, Fujioka S, Sunohara H, Kamiya N, Hong Z, Inukai Y,

Miura K, Takatsuto S, Yoshida S, Ueguchi-Tanaka M, Hase-

gawa Y, Kitano H, Matsuoka M (2006) The role of OsBRI1 and

its homologous genes, OsBRL1 and OsBRL3, in rice. Plant

Physiol 140:580

Nehrt NL, Clark WT, Radivojac P, Hahn MW (2011) Testing the

ortholog conjecture with comparative functional genomic data

from mammals. PLoS Comput Biol 7:e1002073

Nei M, Gojobori T (1986) Simple methods for estimating the numbers

of synonymous and nonsynonymous nucleotide substitutions.

Mol Biol Evol 3:418

Nomura T, Bishop GJ, Kaneta T, Reid JB, Chory J, Yokota T (2003)

The LKA gene is a BRASSINOSTEROID INSENSITIVE 1

homolog of pea. Plant J 36:291

Posada D (2008) jModelTest: phylogenetic model averaging. Mol

Biol Evol 25:1253

Punta M, Coggill PC, Eberhardt RY, Mistry J, Tate J, Boursnell

C, Pang N, Forslund K, Ceric G, Clements J, Heger A, Holm

L, Sonnhammer EL, Eddy SR, Bateman A, Finn RD (2012)

The PFAM protein families database. Nucleic Acids Res

40:D290

Qiu YL, Li L, Wang B, Chen Z, Knoop V, Groth-Malonek M,

Dombrovska O, Lee J, Kent L, Rest J, Estabrook GF, Hendry

TA, Taylor DW, Testa CM, Ambros M, Crandall-Stotler B, Duff

RJ, Stech M, Frey W, Quandt D, Davis CC (2006) The deepest

divergences in land plants inferred from phylogenomic evidence.

Proc Natl Acad Sci USA 103:15511

Roy SW, Gilbert W (2006) The evolution of spliceosomal introns:

patterns, puzzles and progress. Nat Rev Genet 7:211

Saitou N, Nei M (1987) The neighbor-joining method: a new method

for reconstructing phylogenetic trees. Mol Biol Evol 4:406

She J, Han Z, Kim TW, Wang J, Cheng W, Chang J, Shi S, Yang M,

Wang ZY, Chai J (2011) Structural insight into brassinosteroid

perception by BRI1. Nature 474:472

Shinohara H, Ogawa M, Sakagami Y, Matsubayashi Y (2007)

Identification of ligand binding site of phytosulfokine receptor

by on-column photoaffinity labeling. J Biol Chem 282:124

Shiu SH, Bleecker AB (2001a) Plant receptor-like kinase gene family:

diversity, function, and signaling. Sci STKE 2001:re22

Shiu SH, Bleecker AB (2001b) Receptor-like kinases from Arabi-

dopsis form a monophyletic gene family related to animal

receptor kinases. Proc Natl Acad Sci USA 98:10763

Shiu SH, Bleecker AB (2003) Expansion of the receptor-like kinase/

Pelle gene family and receptor-like proteins in Arabidopsis.

Plant Physiol 132:530

Shiu SH, Karlowski WM, Pan R, Tzeng YH, Mayer KF, Li WH

(2004) Comparative analysis of the receptor-like kinase family

in Arabidopsis and rice. Plant Cell 16:1220

Stamatakis A (2006) RAxML-VI-HPC: maximum likelihood-based

phylogenetic analyses with thousands of taxa and mixed models.

Bioinformatics 22:2688

Stanke M, Diekhans M, Baertsch R, Haussler D (2008) Using native

and syntenically mapped cDNA alignments to improve de novo

gene finding. Bioinformatics 24:637


Sun Y, Fokar M, Asami T, Yoshida S, Allen RD (2004) Character-

ization of the brassinosteroid insensitive 1 genes of cotton. Plant

Mol Biol 54:221–232

Suyama M, Torrents D, Bork P (2006) PAL2NAL: robust conversion

of protein sequence alignments into the corresponding codon

alignments. Nucleic Acids Res 34:W609

Symons GM, Davies C, Shavrukov Y, Dry IB, Reid JB, Thomas MR

(2006) Grapes on steroids. Brassinosteroids are involved in grape

berry ripening. Plant Physiol 140:150

Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S

(2011) MEGA5: molecular evolutionary genetics analysis using

maximum likelihood, evolutionary distance, and maximum

parsimony methods. Mol Biol Evol 28:2731

Vert G, Nemhauser JL, Geldner N, Hong F, Chory J (2005) Molecular

mechanisms of steroid hormone signaling in plants. Annu Rev

Cell Dev Biol 21:177

Wang X, Li X, Meisenhelder J, Hunter T, Yoshida S, Asami T, Chory

J (2005) Autoregulation and homodimerization are involved in

the activation of the plant steroid receptor BRI1. Dev Cell 8:855

Wang ZY, Bai MY, Oh E, Zhu JY (2012) Brassinosteroid signaling

network and regulation of photomorphogenesis. Annu Rev Genet

46:701

Xu G, Ma H, Nei M, Kong H (2009) Evolution of F-box genes in

plants: different modes of sequence divergence and their

relationships with functional diversification. Proc Natl Acad

Sci USA 106:835

Yamamuro C, Ihara Y, Wu X, Noguchi T, Fujioka S, Takatsuto S,

Ashikari M, Kitano H, Matsuoka M (2000) Loss of function of a

rice brassinosteroid insensitive1 homolog prevents internode

elongation and bending of the lamina joint. Plant Cell 12:1591

Yang Z (2007) PAML 4: phylogenetic analysis by maximum

likelihood. Mol Biol Evol 24:1586

Zhao DZ, Wang GF, Speal B, Ma H (2002) The excess microsporo-

cytes1 gene encodes a putative leucine-rich repeat receptor

protein kinase that controls somatic and reproductive cell fates in

the Arabidopsis anther. Genes Dev 16:2021
