Relational Databases: Object Relational Mappers – SQLObject II BCHB524 2013 Lecture 23 11/20/2013 BCHB524 - 2013 - Edwards


Relational Databases: Object Relational Mappers – SQLObject II

BCHB524 2013
Lecture 23


Relational Databases

- Store information in a table
  - Rows represent items
  - Columns represent items' properties or attributes

Name           Continent      Region                     Surface Area  Population  GNP
Brazil         South America  South America              8547403       170115000   776739
Indonesia      Asia           Southeast Asia             1904569       212107000   84982
India          Asia           Southern and Central Asia  3287263       1013662000  447114
China          Asia           Eastern Asia               9572900       1277558000  982268
Pakistan       Asia           Southern and Central Asia  796095        156483000   61289
United States  North America  North America              9363520       278357000   8510700


... as Objects

- Objects have data members or attributes.
- Store objects in a list or iterable.
- Abstract away details of the underlying RDBMS.

    c1 = Country()
    c1.name = 'Brazil'
    c1.continent = 'South America'
    c1.region = 'South America'
    c1.surfaceArea = 8547403
    c1.population = 170115000
    c1.gnp = 776739

    # initialize c2, ..., c6

    countryTable = [ c1, c2, c3, c4, c5, c6 ]

    for cnty in countryTable:
        if cnty.population > 100000000:
            print cnty.name, cnty.population
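The snippet above relies on a Country class that the slide does not show. A minimal sketch, assuming a plain Python class (hypothetical, with no database behind it) and a 200 million population cutoff picked only for illustration, might look like this:

```python
class Country(object):
    """Hypothetical plain-Python stand-in for one row of the country table."""

    def __init__(self, name, continent, region, surfaceArea, population, gnp):
        self.name = name
        self.continent = continent
        self.region = region
        self.surfaceArea = surfaceArea
        self.population = population
        self.gnp = gnp


# One object per row of the country table.
countryTable = [
    Country('Brazil', 'South America', 'South America', 8547403, 170115000, 776739),
    Country('Indonesia', 'Asia', 'Southeast Asia', 1904569, 212107000, 84982),
    Country('India', 'Asia', 'Southern and Central Asia', 3287263, 1013662000, 447114),
    Country('China', 'Asia', 'Eastern Asia', 9572900, 1277558000, 982268),
    Country('Pakistan', 'Asia', 'Southern and Central Asia', 796095, 156483000, 61289),
    Country('United States', 'North America', 'North America', 9363520, 278357000, 8510700),
]

# Rows are filtered with ordinary Python; no SQL is written by hand.
# The cutoff is an arbitrary value chosen for this sketch.
large = [c.name for c in countryTable if c.population > 200000000]
print(large)
```

An object relational mapper aims to keep this style of access while the rows actually live in a database.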


Taxonomy Database, from scratch

- Specify the model
  - Tables: Taxonomy and Name
- Populate basic data-values in the Taxonomy table from “small_nodes.dmp”
- Populate the Name table from “small_names.dmp”
  - Insert basic data-values
  - Insert relationship with Taxonomy table
- Fix Taxonomy parent relationship
- Fix Taxonomy derived information
- Use in a program…


Taxonomy Database: model.py

    from sqlobject import *
    import os.path, sys

    dbfile = 'small_taxa.db3'

    def init(new=False):
        # Magic formatting for database URI
        conn_str = os.path.abspath(dbfile)
        conn_str = 'sqlite:' + conn_str
        # Connect to database
        sqlhub.processConnection = connectionForURI(conn_str)
        if new:
            # Create new tables (remove old ones if they exist)
            Taxonomy.dropTable(ifExists=True)
            Name.dropTable(ifExists=True)
            Taxonomy.createTable()
            Name.createTable()


Taxonomy Database: model.py

    # model.py continued…

    class Taxonomy(SQLObject):
        taxid = IntCol(alternateID=True)
        scientific_name = StringCol()
        rank = StringCol()
        parent = ForeignKey("Taxonomy")

    class Name(SQLObject):
        taxonomy = ForeignKey("Taxonomy")
        name = StringCol()
        name_class = StringCol()


Taxonomy Database structure

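[Diagram: the Taxonomy and Name tables with numbered rows. Name rows carry a "taxonomy" foreign key (e.g. taxonomy: 4) that points to a Taxonomy row; Taxonomy rows carry a "parent" foreign key (e.g. parent: 2) that points to another Taxonomy row.]

Foreign Key: id number of some other row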
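To make "id number of some other row" concrete, here is a rough sketch that uses Python's built-in sqlite3 module instead of SQLObject. The table and column names (taxonomy, name, parent_id, taxonomy_id) are assumptions modelled on the classes in model.py and may not match what SQLObject generates exactly; the three rows are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.executescript("""
CREATE TABLE taxonomy (
    id INTEGER PRIMARY KEY,
    taxid INTEGER UNIQUE,
    scientific_name TEXT,
    rank TEXT,
    parent_id INTEGER REFERENCES taxonomy(id)
);
CREATE TABLE name (
    id INTEGER PRIMARY KEY,
    taxonomy_id INTEGER REFERENCES taxonomy(id),
    name TEXT,
    name_class TEXT
);
""")

# Row ids (1, 2) are distinct from the taxonomy ids (9605, 9606).
conn.execute("INSERT INTO taxonomy VALUES (1, 9605, 'Homo', 'genus', NULL)")
conn.execute("INSERT INTO taxonomy VALUES (2, 9606, 'Homo sapiens', 'species', 1)")
conn.execute("INSERT INTO name VALUES (1, 2, 'human', 'genbank common name')")

# Follow name -> taxonomy -> parent by matching foreign keys to row ids.
row = conn.execute("""
    SELECT n.name, t.scientific_name, p.scientific_name
    FROM name AS n
    JOIN taxonomy AS t ON n.taxonomy_id = t.id
    JOIN taxonomy AS p ON t.parent_id = p.id
""").fetchone()
print(row)
```

Note that the foreign keys store the row id, not the taxid, which is why the loading scripts look up the Taxonomy object first.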


Populate Taxonomy table: load_taxa.py

    import sys
    from model import *

    init(new=True)

    # Read in the taxonomy nodes, populate taxid and rank
    h = open(sys.argv[1])
    for l in h:
        l = l.strip('\t|\n')
        sl = l.split('\t|\t')
        taxid = int(sl[0])
        rank = sl[2]
        t = Taxonomy(taxid=taxid, rank=rank,
                     scientific_name=None,
                     parent=None)
    h.close()
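The strip/split idiom used in the loop can be tried in isolation. A small sketch, assuming a layout in which fields are separated by tab, pipe, tab and each line ends with tab, pipe, newline; the helper name parse_node_line and the shortened sample line are made up for illustration:

```python
def parse_node_line(line):
    """Return (taxid, parent_taxid, rank) from one line of a nodes .dmp file."""
    # Remove the trailing "\t|\n" terminator, then split on the field separator.
    fields = line.strip('\t|\n').split('\t|\t')
    return int(fields[0]), int(fields[1]), fields[2]


# Shortened sample line; real files carry additional fields after the rank.
sample = '9606\t|\t9605\t|\tspecies\t|\n'
print(parse_node_line(sample))
```

Stripping first means the last field does not end up with a dangling separator attached to it.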


Populate Name table: load_names.py

    import sys
    from model import *

    init()

    # Read in the names, populate name, class, and id of
    # taxonomy row
    h = open(sys.argv[1])
    for l in h:
        l = l.strip('\t|\n')
        sl = l.split('\t|\t')
        taxid = int(sl[0])
        name_class = sl[3]
        name = sl[1]
        t = Taxonomy.byTaxid(taxid)
        n = Name(name=name, name_class=name_class, taxonomy=t)
    h.close()


Fix up the Taxonomy table: fix_taxa.py

    import sys
    from model import *

    init()

    # Read in the taxonomy nodes, get self and parent taxonomy objects,
    # and fix the parent field appropriately
    h = open(sys.argv[1])
    for l in h:
        l = l.strip('\t|\n')
        sl = l.split('\t|\t')
        taxid = int(sl[0])
        parent_taxid = int(sl[1])
        t = Taxonomy.byTaxid(taxid)
        p = Taxonomy.byTaxid(parent_taxid)
        t.parent = p
    h.close()

    # Find all scientific names and fix their taxonomy objects' scientific
    # name fields appropriately
    for n in Name.select(Name.q.name_class == 'scientific name'):
        n.taxonomy.scientific_name = n.name
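One reason to fix the parent link in a separate pass is that a node's parent may not have been created yet when the node's own line is read. The sketch below illustrates the idea with plain dictionaries instead of SQLObject rows; the toy records and variable names are invented for illustration and do not come from the course data files.

```python
# Toy (taxid, parent_taxid, rank) records, invented for this sketch.
# The child (3) is listed before its parent (2); the root (1) is its own parent.
records = [(3, 2, 'species'), (2, 1, 'genus'), (1, 1, 'no rank')]

# Toy (taxid, name, name_class) records.
names = [
    (3, 'toy species', 'scientific name'),
    (3, 'toy', 'common name'),
    (2, 'toy genus', 'scientific name'),
]

# Pass 1: create every node with the parent left unset.
by_taxid = {}
for taxid, parent_taxid, rank in records:
    by_taxid[taxid] = {'taxid': taxid, 'rank': rank,
                       'parent': None, 'scientific_name': None}

# Pass 2: every node now exists, so each parent lookup succeeds.
for taxid, parent_taxid, rank in records:
    by_taxid[taxid]['parent'] = by_taxid[parent_taxid]

# Derived information: copy the scientific name onto its node.
for taxid, name, name_class in names:
    if name_class == 'scientific name':
        by_taxid[taxid]['scientific_name'] = name

print(by_taxid[3]['parent']['scientific_name'])
```

Linking parents during pass 1 would fail on the first record here, because node 2 does not exist yet when node 3 is read.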


Back to the Taxonomy example

- Each taxonomy entry can have multiple names
  - Many names can point (ForeignKey) to a single taxonomy entry
  - name → taxonomy is easy...
  - taxonomy → list of names requires a select statement

    from model import *
    init()
    hs = Taxonomy.byTaxid(9606)
    for n in Name.select(Name.q.taxonomy==hs):
        print n.name


Taxonomy Database structure

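[Diagram: the Taxonomy and Name tables with numbered rows. Name rows carry a "taxonomy" foreign key (e.g. taxonomy: 4) that points to a Taxonomy row; Taxonomy rows carry a "parent" foreign key (e.g. parent: 2) that points to another Taxonomy row.]

Foreign Key: id number of some other row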


Taxonomy table relationships

- This relationship (one-to-many) is called a multiple join.
- Related joins (many-to-many) exist too...

    class Taxonomy(SQLObject):
        # other data members
        names = MultipleJoin("Name")
        children = MultipleJoin("Taxonomy", joinColumn='parent_id')

    from model import *
    init()
    hs = Taxonomy.byTaxid(9606)
    for n in hs.names:
        print n.name
    for c in hs.children:
        print c.scientific_name
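Conceptually, a multiple join turns the reverse lookup into an attribute. The plain-Python sketch below illustrates that idea only; it is not how SQLObject implements MultipleJoin, and the classes and module-level lists are hypothetical stand-ins for the two tables.

```python
ALL_TAXA = []   # stand-in for the Taxonomy table
ALL_NAMES = []  # stand-in for the Name table


class Taxonomy(object):
    def __init__(self, scientific_name, parent=None):
        self.scientific_name = scientific_name
        self.parent = parent
        ALL_TAXA.append(self)

    @property
    def names(self):
        # Reverse of Name.taxonomy: every name pointing at this entry.
        return [n for n in ALL_NAMES if n.taxonomy is self]

    @property
    def children(self):
        # Reverse of Taxonomy.parent: every entry whose parent is this entry.
        return [t for t in ALL_TAXA if t.parent is self]


class Name(object):
    def __init__(self, name, taxonomy):
        self.name = name
        self.taxonomy = taxonomy
        ALL_NAMES.append(self)


homo = Taxonomy('Homo')
hs = Taxonomy('Homo sapiens', parent=homo)
Name('Homo sapiens', hs)
Name('human', hs)

print([n.name for n in hs.names])
print([c.scientific_name for c in homo.children])
```

In the sketch each access rescans the whole list, whereas a database-backed join would issue a query on the foreign key column instead.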


SQLObject Exceptions

What happens when the row isn't in the table?

    from model import *
    init()

    try:
        hs = Taxonomy.get(7921)
        hs = Taxonomy.byTaxid(9606)
    except SQLObjectNotFound:
        # if row id 7921 / Tax id 9606 is not in table...
        pass

    results = Taxonomy.selectBy(taxid=9606)
    if results.count() == 0:
        # No rows satisfy the constraint!
        pass

    try:
        first_item = results[0]
    except IndexError:
        # No first item in the results
        pass


Example Program

    import sys
    from model import *
    init()

    try:
        taxid = int(sys.argv[1])
    except IndexError:
        print >>sys.stderr, "Need a taxonomy id argument"
        sys.exit(1)
    except ValueError:
        print >>sys.stderr, "Taxonomy id should be an integer"
        sys.exit(1)

    # Get taxonomy row
    try:
        t = Taxonomy.byTaxid(taxid)
    except SQLObjectNotFound:
        print >>sys.stderr, "Taxonomy id",taxid,"does not exist"
        sys.exit(1)

    for n in t.names:
        print "Organism",t.scientific_name,"has name",n.name
    for c in t.children:
        print "Organism",t.scientific_name,"has child",c.scientific_name,c.taxid
    print "Organism",t.scientific_name,"has parent",t.parent.scientific_name,t.parent.taxid


Example Program

    # Continued...

    # Iterate up through the taxonomy tree from t, to find its genus
    r = t
    g = None
    while r != r.parent:
        if r.rank == 'genus':
            g = r
            break
        r = r.parent

    if g is None:
        print "Organism",t.scientific_name,"has no genus"
    else:
        print "Organism",t.scientific_name,"has genus",g.scientific_name
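The loop stops when a node is its own parent, which is presumably how the root of the tree is stored. A self-contained sketch of the same walk, using a hypothetical Node class and a three-level toy tree in place of the database:

```python
class Node(object):
    """Hypothetical stand-in for a Taxonomy row."""

    def __init__(self, scientific_name, rank, parent=None):
        self.scientific_name = scientific_name
        self.rank = rank
        # A node created without a parent is treated as the root (its own parent).
        self.parent = parent if parent is not None else self


def find_genus(t):
    """Walk from t towards the root; return the first genus node, or None."""
    r = t
    while r != r.parent:
        if r.rank == 'genus':
            return r
        r = r.parent
    return None


root = Node('root', 'no rank')
homo = Node('Homo', 'genus', parent=root)
hs = Node('Homo sapiens', 'species', parent=homo)

print(find_genus(hs).scientific_name)
print(find_genus(root))
```

As in the slide's loop, the root node itself is never examined, so a tree whose root had rank genus would report no genus.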


Exercises

- Write a Python program using SQLObject to find the taxonomic lineage of a user-supplied organism name.
  - Make sure you use the small_taxa.db3 file from the course data-folder.