TreeBASE and Phyloinformatics

Roderic Page

University of Glasgow

At the core of a ToL effort must be a “phyloinformatic infrastructure”

Tools for:

• data and tree storage

• analysis (supertrees, supermatrices)

• collaboration

• meta-analysis

It’s a scandal

• We cannot answer even the most basic question: “what is the phylogeny for group x?”

• GenBank is currently the best phylogenetic database(!)

• Can't even say how many species are in a given group

• Little idea of who is doing what

Tree of Life

tolweb.org

• Provides text and images

• Relies on extensive manual effort (e.g., writing text)

• Can’t do any computations with it

• Limited research value

TreeBASE

www.treebase.org

• Relational database

• Query by author, taxon, study number

• Compute supertrees

• Submit NEXUS data files
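As a rough illustration of what "relational database, queried by author, taxon, or study number" could look like, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and the two demo records are invented for this example; they are not TreeBASE's actual schema or data.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a hypothetical three-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE study (
            study_id TEXT PRIMARY KEY,
            author   TEXT NOT NULL
        );
        CREATE TABLE tree (
            tree_id  INTEGER PRIMARY KEY,
            study_id TEXT NOT NULL REFERENCES study(study_id),
            newick   TEXT NOT NULL
        );
        CREATE TABLE tree_taxon (
            tree_id INTEGER NOT NULL REFERENCES tree(tree_id),
            taxon   TEXT NOT NULL
        );
        """
    )
    # Made-up records, for illustration only.
    conn.executemany(
        "INSERT INTO study VALUES (?, ?)",
        [("S1", "Author A"), ("S2", "Author B")],
    )
    conn.executemany(
        "INSERT INTO tree VALUES (?, ?, ?)",
        [
            (1, "S1", "((Homo_sapiens,Pan_troglodytes),Gorilla_gorilla);"),
            (2, "S2", "((Pan_troglodytes,Gorilla_gorilla),Pongo_pygmaeus);"),
        ],
    )
    conn.executemany(
        "INSERT INTO tree_taxon VALUES (?, ?)",
        [
            (1, "Homo sapiens"), (1, "Pan troglodytes"), (1, "Gorilla gorilla"),
            (2, "Pan troglodytes"), (2, "Gorilla gorilla"), (2, "Pongo pygmaeus"),
        ],
    )
    return conn


def trees_for_taxon(conn, taxon):
    """Return (study_id, author, newick) for every tree that contains the taxon."""
    cursor = conn.execute(
        """
        SELECT s.study_id, s.author, t.newick
        FROM tree_taxon AS x
        JOIN tree  AS t ON t.tree_id  = x.tree_id
        JOIN study AS s ON s.study_id = t.study_id
        WHERE x.taxon = ?
        ORDER BY s.study_id
        """,
        (taxon,),
    )
    return cursor.fetchall()
```

Calling `trees_for_taxon(make_demo_db(), "Pan troglodytes")` returns one row per study containing that taxon. The exact-match `WHERE` clause only works if taxon names are stored consistently.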

TreeBASE and mincut supertrees

• User selects two or more trees

• Clicks on a button, and a script on darwin.zoology.gla.ac.uk is run to create the supertree

• Can view as PS, PDF, treefile, or in Java applet (ATV)
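The slides do not show the script itself. As a rough illustration of the recursion that mincut supertree methods build on, the sketch below merges rooted trees in the style of the BUILD algorithm (Aho et al.), with trees written as nested Python tuples. It is not the TreeBASE script: where the input trees conflict it simply returns None, whereas a mincut method would cut a minimum-weight set of edges in the taxon graph and carry on. The function names and the tuple representation are assumptions made for this example.

```python
def leaves(tree):
    """Return the set of leaf labels in a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    found = set()
    for child in tree:
        found |= leaves(child)
    return found


def restrict(tree, keep):
    """Prune tree to the leaves in keep, collapsing nodes left with one child."""
    if isinstance(tree, str):
        return tree if tree in keep else None
    children = [c for c in (restrict(ch, keep) for ch in tree) if c is not None]
    if not children:
        return None
    if len(children) == 1:
        return children[0]
    return tuple(children)


def build_supertree(trees, taxa=None):
    """Combine compatible rooted trees; return None if they conflict."""
    if taxa is None:
        taxa = set().union(*(leaves(t) for t in trees))
    if len(taxa) == 1:
        return next(iter(taxa))
    if len(taxa) == 2:
        return tuple(sorted(taxa))

    # Union-find over taxa: two taxa are joined when some input tree
    # places them under the same child of its root.
    parent = {t: t for t in taxa}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    restricted = [restrict(t, taxa) for t in trees]
    restricted = [t for t in restricted if isinstance(t, tuple)]
    for tree in restricted:
        for child in tree:
            group = sorted(leaves(child))
            for other in group[1:]:
                parent[find(other)] = find(group[0])

    components = {}
    for t in taxa:
        components.setdefault(find(t), set()).add(t)
    if len(components) == 1:
        # Conflict: this is the point where a mincut method would cut edges.
        return None

    subtrees = []
    for component in sorted(components.values(), key=min):
        subtree = build_supertree(restricted, component)
        if subtree is None:
            return None
        subtrees.append(subtree)
    return tuple(subtrees)
```

For example, `build_supertree([(("a", "b"), "c"), (("b", "c"), "d")])` combines two overlapping three-taxon trees into one four-taxon tree, while two trees that disagree about the same three taxa yield None.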

Dependencies amongst studies (Gatesy et al.)

What’s wrong with TreeBASE?

• No consistency of taxon names (e.g., Human, Homo sapiens, Homo sapiens X54666-1)

• No consistency of data names (e.g., gene names, morphological characters, etc.)
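To make the taxon-name problem concrete, here is a minimal sketch of mapping the three labels from the example above onto a single name. The synonym table and the accession-number pattern are assumptions invented for this illustration; a real solution would need a proper taxonomic name database rather than a hand-written dictionary.

```python
import re

# Hypothetical lookup from common names to scientific names (illustration only).
COMMON_NAMES = {"human": "Homo sapiens"}

# Assumed shape of a trailing sequence identifier such as "X54666-1":
# one or two capital letters, five or six digits, optional "-n" or ".n" suffix.
ACCESSION_SUFFIX = re.compile(r"\s+[A-Z]{1,2}\d{5,6}(?:[-.]\d+)?$")


def normalise_taxon(label):
    """Map a raw tree label to a single taxon name, as far as the rules allow."""
    # Treat underscores as spaces and collapse repeated whitespace.
    name = " ".join(label.replace("_", " ").split())
    # Drop a trailing accession-like token, if present.
    name = ACCESSION_SUFFIX.sub("", name)
    # Replace a known common name with its scientific name.
    return COMMON_NAMES.get(name.lower(), name)
```

With these rules, "Human", "Homo sapiens", and "Homo sapiens X54666-1" all map to "Homo sapiens"; labels the rules do not recognise pass through unchanged.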

What needs to be done to TreeBASE?

• Consistency of taxon names

• Consistency of data names (e.g., gene names)

General issues

• Develop tools for rapid construction of supertrees and supermatrices

• Visualisation of trees (and other graphs)

• Queries to highlight areas of uncertainty

• Easy submission of rigorously annotated data

• Resolve centralisation versus distributed (one database or many?)

The single most important thing we could do is to create a phyloinformatic infrastructure to support ToL studies (IMHO)

[Diagram; only the box labels survive:]

• Primary Database

• Comparative Data

• Phylogenetic Trees

• Higher Taxon Name Database

• Synthetic View of Tree of Life (shown twice, followed by "....additional syntheses")

• Secondary Databases

• PII

• Sequence Databases

• Species Name Databases

• Collections and Voucher Specimen Databases

• Biological Databases (shown twice)

• Phylogenetically driven queries