Welcome to TAIR!
Kate Dreher
curator
TAIR/PMN
TAIR = The Arabidopsis Information Resource
Why Arabidopsis?
What does TAIR do?
What can you do with TAIR?
Introduction to TAIR
Arabidopsis
Introduction to Arabidopsis
Basic facts:
- “small weed related to mustard”
- also known as “mouse ear cress”
- can grow to 20-25 cm tall
- annual (or occasionally biennial) plant
- member of the Brassicaceae, like broccoli, cauliflower, radish, and cabbage
- found around the northern hemisphere
Why do so many people study THIS plant?
Arabidopsis has good model organism traits
- Fast life cycle (6 weeks)
- Thousands of plants fit in a small space
- Fairly easy to grow
- Thousands of seeds produced by each plant
- Self-fertile (in-breeding)
- Many different subspecies/ecotypes
- Serves as a good model for crop plants
But why Arabidopsis instead of other plants?
Arabidopsis offers some advantages
“Good” genome
- very small: 125 Mb
- diploid
- 5 haploid chromosomes
- fewer/smaller regions of repetitive DNA than many plants
Quite easily transformable with Agrobacterium
- NO tissue culture required
Inertia!
- A group of scientists lobbied for Arabidopsis
- The genome was sequenced (2000)
- MANY resources have been developed
Arabidopsis research can be successfully applied to “real plants”
- Over-expression of the HARDY gene from Arabidopsis can improve water use efficiency in rice (Karaba 2007)
- cDNAs from castor bean were over-expressed in Arabidopsis, and a high-throughput screen of fatty acid content in Arabidopsis seeds identified three cDNAs that increase hydroxy fatty acid levels (Lu 2006)
- Endosperm-specific over-expression of the Arabidopsis GTPCHI and ADCS biosynthetic genes can increase folate (vitamin B9) levels by up to 100-fold in rice (Storozhenko 2007)
- Studies on a sodium transporter (HKT1) in Arabidopsis helped to identify a durum wheat homolog. It has been introgressed into bread wheat lines and appears to improve their yield on saline soils (Hwang 2006; Byrt et al. 2007)
Both basic and translational experiments using Arabidopsis continue . . .
Arabidopsis data explosion
TONS of data are generated about Arabidopsis
Over 2400 “Arabidopsis” articles published each year are indexed in PubMed
Tens of thousands of mutants have been generated
Hundreds of microarray experiments have been performed
Proteomics and metabolomics studies are becoming popular
“1001” Arabidopsis genomes are being sequenced
Large-scale phenotypic studies are scheduled to start soon
TAIR tries to bring data together to benefit scientists and society
That includes all of you . . .
What does TAIR do?
Curators and computer tech team members work together under great directors
TAIR develops internal data sets and resources
TAIR links to external data sets and resources
TAIR provides free on-line access to everyone: www.arabidopsis.org
Funded by the National Science Foundation of the USA
Started in 1999
Dr. Eva Huala, Director
Dr. Sue Rhee, Co-PI
Curators
Computer tech team members
Internal TAIR data sets
Structural curators try to correctly define gene sequences
Functional curators try to correctly describe gene function
Structural curation at TAIR
Structural curators try to answer the question: What are ALL of the genes in Arabidopsis?
Use many types of data
- ESTs
- full-length cDNAs
- peptides
- orthology
Determine gene coordinates and features
- Establish intron, exon, and UTR boundaries
- Add alternative splice variants
Classify genes
- protein coding
- miRNA
- pseudogene
Keep updating! (even though the genome was sequenced in 2000!)
- TAIR9 (released June 2009)
- 282 new loci and 739 new gene models
Structural curation at TAIR
Apollo is a program to assist with structural curation
[Screenshot labels: ESTs, protein similarity, cDNAs]
Functional curation at TAIR
Functional curators try to answer the questions:
- What does every gene/protein in Arabidopsis do?
- When and where does it act?
Functional curation requires controlled vocabularies
- Allow cross-species comparisons
- TAIR curators work to develop and agree upon common terms
Example: FRUIT (Plant Ontology: Structure: PO:0009001)
Definition: The seed-bearing structure in angiosperms, formed from the ovary after flowering
Terms grouped under FRUIT: achene, berry, capsule, caryopsis, circumscissile capsule, cypsela, drupe, follicle, grain, kernel, legume, loculicidal capsule, lomentum, nut, pod, pome, poricidal capsule, schizocarp, septicidal capsule, septifragal capsule, silique
Functional curation at TAIR
Functional curators try to correctly describe gene function
Functional curators try to help build controlled vocabularies
- Allow cross-species comparisons
- Develop and agree upon common terms
Example: indole-3-acetate beta-glucosyltransferase activity (Gene Ontology: Molecular function: GO:0047215)
Definition: Catalysis of the reaction: auxin + UDP-D-glucose = indole-3-acetyl-beta-1-D-glucose + UDP
Synonyms:
- IAA-Glu synthetase activity
- IAA-glucose synthase activity
- IAGlu synthase activity
- indol-3-ylacetylglucose synthase activity
- UDP-glucose:(indol-3-yl)acetate beta-D-glucosyltransferase activity
- UDP-glucose:indol-3-ylacetate glucosyl-transferase activity
- UDP-glucose:indol-3-ylacetate glucosyltransferase activity
- UDPG-indol-3-ylacetyl glucosyl transferase activity
- UDPglucose:indole-3-acetate beta-D-glucosyltransferase activity
- uridine diphosphoglucose-indoleacetate glucosyltransferase activity
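The idea behind a controlled vocabulary can be sketched in a few lines of Python: many names that appear in the literature resolve to one identifier. This is only an illustration, not how TAIR or the Gene Ontology store their data. The function names `build_index` and `normalize` are hypothetical, and only a few of the synonyms listed for GO:0047215 are included.

```python
# Hypothetical lookup table: one GO term plus a few of its listed synonyms.
GO_ID = "GO:0047215"
CANONICAL_NAME = "indole-3-acetate beta-glucosyltransferase activity"
SYNONYMS = [
    "IAA-Glu synthetase activity",
    "IAA-glucose synthase activity",
    "IAGlu synthase activity",
    "indol-3-ylacetylglucose synthase activity",
]


def build_index(go_id, name, synonyms):
    """Map the lower-cased name and every lower-cased synonym to one GO ID."""
    index = {name.lower(): go_id}
    for synonym in synonyms:
        index[synonym.lower()] = go_id
    return index


def normalize(term, index):
    """Return the GO ID for a free-text term, or None if the term is unknown."""
    return index.get(term.strip().lower())


INDEX = build_index(GO_ID, CANONICAL_NAME, SYNONYMS)

# Different spellings used by different authors resolve to the same identifier.
print(normalize("IAA-glucose synthase activity", INDEX))
print(normalize("  iaglu synthase activity ", INDEX))
```

Because every synonym points at the same identifier, annotations made with different wording can still be compared across genes and across species.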
Functional curation at TAIR
Functional curators use controlled vocabularies to annotate genes
- Molecular function
- Subcellular localization
- Biological process
- Expression pattern
  - Development stage
  - Tissue / organ / cell type
Gene
- Enter common name, e.g. Nitrate Transporter 2.7, NRT2.7
- Prefer to track using AGI (Arabidopsis Genome Initiative) Locus Codes
AT5G14570
- AT: Arabidopsis thaliana
- 5: Chromosome 5
- G: Gene
- 14570: Position along chromosome (between 14560 and 14580)
Data Sources
- Published Literature
- Researchers
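The structure of an AGI locus code such as AT5G14570 can be made concrete with a short Python sketch. It assumes the layout described on the slide: "AT", a chromosome, "G", and a five-digit number that reflects position along the chromosome. Accepting C and M for the chloroplast and mitochondrial genomes, and accepting mixed case such as At5g63980, are assumptions added here. The function name `parse_agi` is hypothetical.

```python
import re

# Assumed layout: AT + chromosome (1-5, or C/M for organelles) + G + five digits.
AGI_PATTERN = re.compile(r"^AT([1-5CM])G(\d{5})$", re.IGNORECASE)


def parse_agi(code):
    """Split an AGI locus code into its chromosome and gene number."""
    cleaned = code.strip()
    match = AGI_PATTERN.match(cleaned)
    if match is None:
        raise ValueError(f"not a recognizable AGI locus code: {code!r}")
    chromosome, number = match.groups()
    return {
        "locus": cleaned.upper(),
        "chromosome": chromosome.upper(),
        "number": int(number),
    }


print(parse_agi("AT5G14570"))
print(parse_agi("At5g63980"))
```

Tracking genes by locus code rather than by common name avoids ambiguity when one gene has several names, as with Nitrate Transporter 2.7 and NRT2.7.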
Functional curation at TAIR Functional curators capture mutant phenotypes
alx8 mutant – mutation in gene At5g63980
External data sets MANY different external data sets are linked to specific genes
- EST sequences (Arabidopsis and other species)
- Transcript expression data
- Peptide expression data
- Biochemical pathway data (. . . described in the PMN talk)
- Epigenetic features
- Ecotype-specific polymorphisms
- Publications
- Seed stocks
- DNA vectors
- Interaction partners
- Promoter elements
- Post-translational modifications
- Orthologs
New data types are frequently added
Providing Tools at TAIR
Tech (computer) team members and curators:
Provide links to external databases from every gene page
Load TAIR and external data sets into existing tools
- BLAST
- GBrowse
- Synteny Viewer (very new)
- NBrowse Interaction Viewer (coming soon . . .)
Genbank Green Plant
Develop new tools and modify existing tools
- SeqViewer
- Patmatch
- . . . several others
Create advanced search pages
Other Resources at TAIR
- Ordering system for the Arabidopsis Biological Resource Center (ABRC)
  - DNA stocks
  - Seed stocks
- Community member information
- Arabidopsis lab protocols
- Gene Symbol Registry
- Information Portals
Are these data and tools useful?
[Figure: TAIR usage by month, 1999-2006, spanning TAIR I and TAIR II. Panels: Bytes per Month, Visits per Month, Unique Visitors per Month.]
Who uses TAIR? (June 4 – July 4, 2009)
Why might you use TAIR?
Do you work with plants?
Do you want to take advantage of the tremendous amount of Arabidopsis data?
Do you want to know more about
- a gene?
- an enzyme?
- a protein domain?
- a DNA regulatory region?
- an abnormal phenotype?
- a chromosomal region?
- a set of orthologous proteins?
- a biological process?
- natural variation across populations?
Then please come see if TAIR can help you
Putting TAIR to work for you . . .
You are studying drought tolerance in potato plants
You do a subtractive hybridization study to identify cDNAs that are up-regulated in the roots of drought-stressed plants
You find that a number of the up-regulated cDNAs code for proteins with a new domain: Ser-x-Glu-x-Cys-x-Ala = (SxExCxA)
One of the family members, SECA1, appears to be present at particularly high levels
How can TAIR help?
Putting TAIR to work for you . . .
Are there any proteins with the SxExCxA domain in Arabidopsis?
What do they do in Arabidopsis?
Do they share additional domains?
What is the closest homolog to SECA1?
Are there any phenotypes when SECA1 is mutated?
Can I get a cDNA of this homolog to over-express in my species?
Are there putative SECA1 orthologs in other plant species?
Are there any SxExCxA proteins in Arabidopsis? Find all of the proteins that have the SxExCxA domain
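What a search for the SxExCxA pattern does can be sketched with a regular expression in Python. This is a standalone illustration, not the Patmatch tool itself, and it is not a TAIR API. The two protein sequences are made up for the example, and the function name `find_motif` is hypothetical. Each "x" is read as "any single residue".

```python
import re

# S-x-E-x-C-x-A, where x is any single residue. The lookahead lets
# overlapping occurrences be reported as separate hits.
MOTIF = re.compile(r"(?=(S.E.C.A))")


def find_motif(sequence):
    """Return (1-based start position, matched residues) for every hit."""
    return [(m.start() + 1, m.group(1)) for m in MOTIF.finditer(sequence.upper())]


# Made-up sequences, for illustration only.
proteins = {
    "example_1": "MKTSAEGCLAVV",
    "example_2": "MGGLLKKPPRR",
}

for name, sequence in proteins.items():
    hits = find_motif(sequence)
    if hits:
        print(name, hits)
```

Running the same pattern over a full protein data set would give the list of candidate family members to carry into the next questions.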
What do the SxExCxA proteins do? Option 1: Get the individual GO annotations for each gene
What do SxExCxA proteins do? Option 2: Get an overview of the information for the set of genes
GO categorization
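The overview that a GO categorization gives for a set of genes amounts to counting how many genes carry each term. A minimal Python sketch follows, assuming the annotations have already been downloaded into a dictionary. The gene names and terms are placeholders rather than real TAIR annotations, and the function name `categorize` is hypothetical.

```python
from collections import Counter

# Placeholder annotations: gene name -> list of GO terms (not real data).
annotations = {
    "gene_A": ["response to water deprivation", "transport"],
    "gene_B": ["response to water deprivation"],
    "gene_C": ["transport", "signal transduction"],
}


def categorize(gene_to_terms):
    """Count how many genes in the set are annotated to each term."""
    counts = Counter()
    for terms in gene_to_terms.values():
        counts.update(set(terms))  # count each gene at most once per term
    return counts


for term, n_genes in sorted(categorize(annotations).items()):
    print(f"{term}: {n_genes} gene(s)")
```

Terms shared by many genes in the set suggest what the family as a whole may be doing, which is the point of looking at the overview before reading each gene's annotations.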
What do SxExCxA proteins do? Option 3: Get the description of each gene
What other domains do SxExCxA proteins share? Identify all of the other domains found in those proteins
What is the closest homolog to SECA1? Blast SECA1 against the TAIR9 protein data set
Are there any mutant phenotypes associated with At2g04240?
Use the Seed/Germplasm Search page
. . . or look in the Germplasm section of the Locus page
Can I get a cDNA for At2g04240 to overexpress in potato?
Use the DNA Clones Search page
Are there putative SECA1 orthologs in other plant species?
Look for putative orthologs and paralogs using GBrowse
- Phytozome (orthologs)
- InParanoid (paralogs)
We are here to help: www.arabidopsis.org
- Please use our data
- Please use our tools
- Please use TAIR to help improve your research on IMPORTANT plants!
- Please contact us if we can be of any help!
Make an appointment to meet with me during my visit
(I can try to speak in Spanish)
www.arabidopsis.org
Acknowledgements
TAIR
Eva Huala (Director)
Sue Rhee (Co-PI)
Current Curators:
- Tanya Berardini (lead curator, functional annotation)
- David Swarbreck (lead curator, structural annotation)
- Peifen Zhang (Director and lead curator, metabolism)
- A. S. Karthikeyan (curator)
- Philippe Lamesch (curator)
- Donghui Li (curator)
- Rajkumar Sasidharan (curator)
Recent Past Contributors:
- Debbie Alexander (curator)
Tech Team Members:
- Bob Muller (Manager)
- Larry Ploetz (Sys. Administrator)
- Raymond Chetty
- Anjo Chi
- Vanessa Kirkup
- Cynthia Lee
- Tom Meyer
- Shanker Singh
- Chris Wilks
Why Arabidopsis?
Plant research can benefit from focusing on a “model” plant
Other model organisms include:
- roundworm
- yeast
- mouse
- fruit fly
- zebrafish
Model organisms are easy to do experiments on
- Fast life cycle
- Don’t need much space
- Easy to take care of
- Lots of offspring (for genetics)
- Can be genetically transformed
- Good model for the really interesting species: humans, CROP PLANTS
Communities develop to study model organisms
Many resources become available for model organisms
- Lab protocols
- Mutant maps
- Stock centers
- Genome sequences
- . . . and more!
What should be the model plant?
Have I been able to get useful information at TAIR?
We hope so!
But, if you have any trouble finding the information you want . . .