Relational Database Access with Python
Why you should be using SQLAlchemy
Mark Rees, CTO
Century Software (M) Sdn. Bhd.
Is This Your Current Relational Database Access Style?
# Django ORM
>>> from ip2country.models import Ip2Country
>>> Ip2Country.objects.all()
[<Ip2Country: Ip2Country object>, <Ip2Country: Ip2Country object>, '...(remaining elements truncated)...']
>>> myp = Ip2Country.objects.filter(assigned__year=2015)\
...     .filter(countrycode2='MY')
>>> myp[0].ipfrom
736425984.0
Is This Your Current Relational Database Access Style?
# SQLAlchemy ORM
>>> from sqlalchemy import create_engine, extract
>>> from sqlalchemy.orm import sessionmaker
>>> from models import Ip2Country
>>> engine = create_engine('postgresql://ip2country_rw:secret@localhost/ip2country')
>>> Session = sessionmaker(bind=engine)
>>> session = Session()
>>> all_data = session.query(Ip2Country).all()
>>> myp = session.query(Ip2Country).\
...     filter(extract('year', Ip2Country.assigned) == 2015).\
...     filter(Ip2Country.countrycode2 == 'MY')
>>> print(myp[0].ipfrom)
736425984.0
SQL Relational Database Access
SELECT * FROM ip2country;
"id";"ipfrom";"ipto";"registry";"assigned";"countrycode2";"countrycode3";"countryname"
1;1729522688;1729523711;"apnic";"2011-08-05";"CN";"CHN";"China"
2;1729523712;1729524735;"apnic";"2011-08-05";"CN";"CHN";"China"
. . .
SELECT * FROM ip2country
WHERE date_part('year', assigned) = 2015
AND countrycode2 = 'MY';
"id";"ipfrom";"ipto";"registry";"assigned";"countrycode2";"countrycode3";"countryname"
5217;736425984;736427007;"apnic";"2015-01-13 00:00:00";"MY";"MYS";"Malaysia"
5218;736427008;736428031;"apnic";"2015-01-13 00:00:00";"MY";"MYS";"Malaysia"
. . .
SELECT ipfrom FROM ip2country
WHERE date_part('year', assigned) = 2015
AND countrycode2 = 'MY';
"ipfrom"
736425984
736427008
. . .
Python + SQL == Python DB-API 2.0
• The Python standard for a consistent interface to relational databases is the Python DB-API (PEP 249)
• The majority of Python database interfaces adhere to this standard
Python DB-API UML Diagram
Python DB-API Connection Object
Access the database via the connection object
• Use the connect constructor to create a connection with the database
conn = psycopg2.connect(parameters…)
• Create a cursor via the connection
cur = conn.cursor()
• Transaction management (implicit begin)
conn.commit()
conn.rollback()
• Close the connection (will roll back the current transaction)
conn.close()
• Check module capabilities via module globals
psycopg2.apilevel
psycopg2.threadsafety
psycopg2.paramstyle
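The connection calls above can be tried without a database server. The following is a minimal sketch that assumes the standard-library sqlite3 driver in place of psycopg2; the registry table and its contents are illustrative only.

```python
import sqlite3

# DB-API module globals, as reported by the sqlite3 driver.
api_level = sqlite3.apilevel
param_style = sqlite3.paramstyle

conn = sqlite3.connect(":memory:")  # in-memory database, nothing touches disk
cur = conn.cursor()
cur.execute("CREATE TABLE registry (id INTEGER PRIMARY KEY, name TEXT)")
conn.commit()

# A transaction begins implicitly with the first INSERT.
cur.execute("INSERT INTO registry (name) VALUES (?)", ("apnic",))
conn.commit()    # make the row permanent

cur.execute("INSERT INTO registry (name) VALUES (?)", ("lacnic",))
conn.rollback()  # discard the uncommitted row

cur.execute("SELECT name FROM registry ORDER BY id")
names = [row[0] for row in cur.fetchall()]
print(api_level, param_style, names)

cur.close()
conn.close()
```

Only the committed row survives the rollback, which is the implicit-begin behaviour the slide describes.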
Python DB-API Cursor Object
A cursor object represents a database cursor, which manages the context of fetch operations.
• Cursors created from the same connection are not isolated
cur = conn.cursor()
cur2 = conn.cursor()
• Cursor methods
cur.execute(operation, parameters)
cur.executemany(op, seq_of_parameters)
cur.fetchone()
cur.fetchmany([size=cursor.arraysize])
cur.fetchall()
cur.close()
Python DB-API Cursor Object
• Optional cursor methods
cur.scroll(value[, mode='relative'])
cur.next()
cur.callproc(procname[, parameters])
cur.__iter__()
• Results of an operation
cur.description
cur.rowcount
cur.lastrowid
• DB adaptor specific “proprietary” cursor methods
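A short sketch of the core cursor methods in use, again assuming sqlite3 as a stand-in driver. The table is cut down to three columns for brevity, and the values are taken from the query output shown earlier in the deck.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE ip2country (ipfrom INTEGER, ipto INTEGER, countrycode2 TEXT)"
)

# executemany runs the same statement once per parameter tuple.
sample_rows = [
    (1729522688, 1729523711, "CN"),
    (1729523712, 1729524735, "CN"),
    (736425984, 736427007, "MY"),
    (736427008, 736428031, "MY"),
]
cur.executemany("INSERT INTO ip2country VALUES (?, ?, ?)", sample_rows)
inserted = cur.rowcount  # sqlite3 sums the row counts of an executemany
conn.commit()

cur.execute("SELECT ipfrom, ipto, countrycode2 FROM ip2country ORDER BY ipfrom")
column_names = [col[0] for col in cur.description]  # first item of each entry is the name
first = cur.fetchone()       # a single row, or None when exhausted
next_two = cur.fetchmany(2)  # a list of up to 2 rows
rest = cur.fetchall()        # whatever remains

cur.close()
conn.close()
```

Each fetch call continues from where the previous one stopped, so the three calls together consume the result set exactly once.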
Python DB-API Parameter Styles
Allows you to keep SQL separate from parameters
Improves performance & security
Warning: Never, never, NEVER use Python string concatenation (+) or string parameters interpolation (%) to pass variables to a SQL query string. Not even at gunpoint.
From http://initd.org/psycopg/docs/usage.html#query-parameters
Python DB-API Parameter Styles
The module global paramstyle names the style the adaptor supports
qmark      Question mark style           WHERE countrycode2 = ?
numeric    Numeric positional style      WHERE countrycode2 = :1
named      Named style                   WHERE countrycode2 = :code
format     ANSI C printf format style    WHERE countrycode2 = %s
pyformat   Python format style           WHERE countrycode2 = %(name)s
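To illustrate why bound parameters are safer than string interpolation, here is a small sketch assuming sqlite3, which accepts the qmark and named styles (psycopg2 uses format and pyformat instead). The hostile input string is a made-up example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE ip2country (ipfrom INTEGER, countrycode2 TEXT)")
cur.executemany(
    "INSERT INTO ip2country VALUES (?, ?)",
    [(736425984, "MY"), (1729522688, "CN")],
)

# qmark style: positional placeholders, parameters passed as a sequence.
cur.execute("SELECT ipfrom FROM ip2country WHERE countrycode2 = ?", ("MY",))
qmark_result = cur.fetchall()

# named style: placeholders looked up by key in a mapping.
cur.execute(
    "SELECT ipfrom FROM ip2country WHERE countrycode2 = :code", {"code": "MY"}
)
named_result = cur.fetchall()

# A hostile value is compared as plain data, so it matches no rows
# instead of rewriting the WHERE clause.
hostile = "MY' OR '1'='1"
cur.execute("SELECT ipfrom FROM ip2country WHERE countrycode2 = ?", (hostile,))
hostile_result = cur.fetchall()

conn.close()
```

Had the hostile value been spliced into the SQL text with + or %, the condition would have matched every row.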
Python + SQL: INSERT
import csv, datetime, psycopg2
conn = psycopg2.connect("dbname=ip2countrydb user=ip2country_rw password=secret")
cur = conn.cursor()
with open("IpToCountry.csv", "rt") as f:
    reader = csv.reader(f)
    try:
        for row in reader:
            # skip comment lines in the csv file
            if row[0][0] != "#":
                # the assigned column holds a Unix timestamp
                row[3] = datetime.datetime.utcfromtimestamp(float(row[3]))
                cur.execute("""INSERT INTO ip2country(
                    ipfrom, ipto, registry, assigned,
                    countrycode2, countrycode3, countryname)
                    VALUES (%s, %s, %s, %s, %s, %s, %s)""", row)
    except Exception as error:
        print(error)
        conn.rollback()
    else:
        conn.commit()
    finally:
        cur.close()
        conn.close()
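The same load pattern can be sketched with a single executemany call instead of one execute per row. This version assumes sqlite3 and an in-memory CSV. SAMPLE_CSV is a hypothetical two-row extract in the seven-column layout the loader above expects, and its timestamp value is illustrative.

```python
import csv
import io
import sqlite3
from datetime import datetime, timezone

# Hypothetical sample data; not taken from the real IpToCountry.csv file.
SAMPLE_CSV = '''# comment lines start with a hash
"1729522688","1729523711","apnic","1312502400","CN","CHN","China"
"1729523712","1729524735","apnic","1312502400","CN","CHN","China"
'''


def load_rows(text):
    """Parse the csv text, skipping comments and converting the timestamp."""
    rows = []
    for row in csv.reader(io.StringIO(text)):
        if not row or row[0].startswith("#"):
            continue
        assigned = datetime.fromtimestamp(float(row[3]), tz=timezone.utc)
        row[3] = assigned.date().isoformat()  # stored as ISO text in SQLite
        rows.append(tuple(row))
    return rows


conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""CREATE TABLE ip2country (
    ipfrom INTEGER, ipto INTEGER, registry TEXT, assigned TEXT,
    countrycode2 TEXT, countrycode3 TEXT, countryname TEXT)""")

rows = load_rows(SAMPLE_CSV)
try:
    cur.executemany(
        "INSERT INTO ip2country VALUES (?, ?, ?, ?, ?, ?, ?)", rows
    )
except sqlite3.Error as error:
    print(error)
    conn.rollback()  # nothing is kept if any row fails
else:
    conn.commit()

cur.execute("SELECT COUNT(*) FROM ip2country")
row_count = cur.fetchone()[0]
conn.close()
```

Parsing first and inserting in one call keeps the file handling separate from the transaction, at the cost of holding all rows in memory.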
Python + SQL: SELECT
# Find ipv4 address ranges assigned to Malaysia
import psycopg2, socket, struct

def num_to_dotted_quad(n):
    """convert long int to dotted quad string
    http://code.activestate.com/recipes/66517/"""
    return socket.inet_ntoa(struct.pack('!L', n))

conn = psycopg2.connect("dbname=ip2countrydb user=ip2country_rw password=secret")
cur = conn.cursor()
# name the columns so row[0] and row[1] are ipfrom and ipto
# whatever the table's column order
cur.execute("""SELECT ipfrom, ipto FROM ip2country
    WHERE countrycode2 = 'MY'
    ORDER BY ipfrom""")
for row in cur:
    print("%s - %s" % (num_to_dotted_quad(int(row[0])),
                       num_to_dotted_quad(int(row[1]))))
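The integer-to-address conversion can be checked on its own, without a database. This sketch applies it to one range from the sample data; dotted_quad_to_num is a hypothetical inverse helper added for illustration, and the standard-library ipaddress module is shown as an alternative.

```python
import ipaddress
import socket
import struct


def num_to_dotted_quad(n):
    """Convert an unsigned 32-bit integer to a dotted quad string."""
    return socket.inet_ntoa(struct.pack("!L", n))


def dotted_quad_to_num(addr):
    """Hypothetical inverse helper: dotted quad string back to an integer."""
    return struct.unpack("!L", socket.inet_aton(addr))[0]


start = num_to_dotted_quad(736425984)
end = num_to_dotted_quad(736427007)
print("%s - %s" % (start, end))

# ipaddress gives the same conversion without manual packing.
via_ipaddress = str(ipaddress.IPv4Address(736425984))
```

The "!L" format packs the number as a big-endian unsigned 32-bit value, which is the byte order inet_ntoa expects.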
SQLite
• sqlite3
  • CPython 2.5 & 3
  • DB-API 2.0
  • Part of CPython distribution since 2.5
PostgreSQL
• psycopg
  • CPython 2 & 3
  • DB-API 2.0, level 2 thread safe
  • Appears to be most popular
  • http://initd.org/psycopg/
• py-postgresql
  • CPython 3
  • DB-API 2.0
  • Written in Python with optional C optimizations
  • pg_python - console
  • http://python.projects.postgresql.org/
• PyGreSQL
  • CPython 2.5+
  • Classic & DB-API 2.0 interfaces
  • http://www.pygresql.org/
• pyPgSQL
  • CPython 2
  • Classic & DB-API 2.0 interfaces
  • http://pypgsql.sourceforge.net/
  • Last release 2006
• pypq
  • CPython 2.7 & pypy 1.7+
  • Uses ctypes
  • DB-API 2.0 interface
  • psycopg2-like extension API
  • https://bitbucket.org/descent/pypq
• psycopg2cffi
  • CPython 2.6+ & pypy 2.0+
  • Uses cffi
  • DB-API 2.0 interface
  • psycopg2 compat layer
  • https://github.com/chtd/psycopg2cffi
MySQL
• MySQL-python
  • CPython 2.3+
  • DB-API 2.0 interface
  • http://sourceforge.net/projects/mysql-python/
• PyMySQL
  • CPython 2.4+ & 3
  • Pure Python DB-API 2.0 interface
  • http://www.pymysql.org/
• MySQL-Connector
  • CPython 2.4+ & 3
  • Pure Python DB-API 2.0 interface
  • https://launchpad.net/myconnpy
Other “Enterprise” Databases
• cx_Oracle
  • CPython 2 & 3
  • DB-API 2.0 interface
  • http://cx-oracle.sourceforge.net/
• informixdb
  • CPython 2
  • DB-API 2.0 interface
  • http://informixdb.sourceforge.net/
  • Last release 2007
• ibm-db
  • CPython 2
  • DB-API 2.0 for DB2 & Informix
  • http://code.google.com/p/ibm-db/
ODBC
• mxODBC
  • CPython 2.3+
  • DB-API 2.0 interfaces
  • http://www.egenix.com/products/python/mxODBC/doc
  • Commercial product
• PyODBC
  • CPython 2 & 3
  • DB-API 2.0 interfaces with extensions
  • https://github.com/mkleehammer/pyodbc
• ODBC interfaces not limited to Windows thanks to iODBC and unixODBC
Jython + SQL
• zxJDBC
  • DB-API 2.0, written in Java using the JDBC API so it can utilize JDBC drivers
  • Support for connection pools and JNDI lookup
  • Included with standard Jython installation http://www.jython.org/
• jyjdbc
  • DB-API 2.0 compliant
  • Written in Python/Jython so it can utilize JDBC drivers
  • Decimal data type support
  • https://bitbucket.org/clach04/jyjdbc/
IronPython + SQL
• adodbapi
  • IronPython 2+
  • Also works with CPython 2.3+ with pywin32
  • http://adodbapi.sourceforge.net/
Gerald, the half a schema
import gerald
s1 = gerald.PostgresSchema('public', 'postgres://ip2country_rw:secret@localhost/ip2country')
s2 = gerald.PostgresSchema('public', 'postgres://ip2country_rw:secret@localhost/ip2countryv4')
print(s1.schema['ip2country'].compare(s2.schema['ip2country']))
DIFF: Definition of assigned is different
DIFF: Column countryname not in ip2country
DIFF: Definition of registry is different
DIFF: Column countrycode3 not in ip2country
DIFF: Definition of countrycode2 is different
• Database schema toolkit
• via DB-API currently supports
  • PostgreSQL
  • MySQL
  • Oracle
• http://halfcooked.com/code/gerald/
SQLPython
$ sqlpython --postgresql ip2country ip2country_rw
Password:
0:ip2country_rw@ip2country> select * from ip2country where countrycode2='SG';
...
1728830464.0 1728830719.0 apnic 2011-11-02 SG SGP Singapore
551 rows selected.
0:ip2country_rw@ip2country> select * from ip2country where countrycode2='SG'\j
[...
{"ipfrom": 1728830464.0, "ipto": 1728830719.0, "registry": "apnic", "assigned": "2011-11-02", "countrycode2": "SG", "countrycode3": "SGP", "countryname": "Singapore"}]
• A command-line interface to relational databases
• via DB-API currently supports
  • PostgreSQL
  • MySQL
  • Oracle
• http://packages.python.org/sqlpython/
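JSON output like the \j example can be approximated with plain DB-API calls by pairing cursor.description with the fetched rows. This is only a sketch of the idea, assuming sqlite3 and a cut-down table; it is not how sqlpython itself produces its output, and resultset_as_json is a hypothetical helper name.

```python
import json
import sqlite3


def resultset_as_json(cur):
    """Serialize the rows pending on a DB-API cursor as a JSON array of objects."""
    names = [col[0] for col in cur.description]
    return json.dumps([dict(zip(names, row)) for row in cur.fetchall()])


conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE ip2country "
    "(ipfrom INTEGER, ipto INTEGER, registry TEXT, countrycode2 TEXT)"
)
cur.execute(
    "INSERT INTO ip2country VALUES (?, ?, ?, ?)",
    (1728830464, 1728830719, "apnic", "SG"),
)
cur.execute("SELECT * FROM ip2country WHERE countrycode2 = ?", ("SG",))
as_json = resultset_as_json(cur)
print(as_json)
conn.close()
```

Column types that json cannot serialize directly, such as dates, would need converting first.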
SQLPython, batteries included
0:ip2country_rw@ip2country> select * from ip2country where countrycode2 = 'MY';
...
1728830464.0 1728830719.0 apnic 2011-11-02 MY MYS Malaysia
551 rows selected.
0:ip2country_rw@ip2country> py
Python 2.6.6 (r266:84292, May 20 2011, 16:42:25)
[GCC 4.4.5 20110214 (Red Hat 4.4.5-6)] on linux2
    py <command>: Executes a Python command.
    py: Enters interactive Python mode.
    End with `Ctrl-D` (Unix) / `Ctrl-Z` (Windows), `quit()`, `exit()`.
    Past SELECT results are exposed as list `r`;
    most recent resultset is `r[-1]`.
    SQL bind, substitution variables are exposed as `binds`, `substs`.
    Run python code from external files with ``run("filename.py")``
>>> r[-1][-1]
(1728830464.0, 1728830719.0, 'apnic', datetime.date(2011, 11, 2), 'MY', 'MYS', 'Malaysia')
>>> import socket, struct
>>> def num_to_dotted_quad(n):
...     return socket.inet_ntoa(struct.pack('!L',n))
...
>>> num_to_dotted_quad(int(r[-1][-1].ipfrom))
'103.11.220.0'
SpringPython – Database Templates
# Find ipv4 address ranges assigned to Malaysia
# using SpringPython DatabaseTemplate & DictionaryRowMapper
from springpython.database.core import *
from springpython.database.factory import *
conn_factory = PgdbConnectionFactory(
    user="ip2country_rw",
    password="secret",
    host="localhost",
    database="ip2countrydb")
dt = DatabaseTemplate(conn_factory)
results = dt.query(
    "SELECT * FROM ip2country WHERE countrycode2=%s",
    ("MY",),
    DictionaryRowMapper())
# num_to_dotted_quad is the helper defined in the Python + SQL: SELECT example
for row in results:
    print("%s - %s" % (num_to_dotted_quad(int(row['ipfrom'])),
                       num_to_dotted_quad(int(row['ipto']))))
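Access to rows by column name, which DictionaryRowMapper provides here, has a rough standard-library analogue in sqlite3's row_factory. The sketch below assumes sqlite3 and a cut-down table; it is not SpringPython code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # cursors from this connection return name-addressable rows

cur = conn.cursor()
cur.execute(
    "CREATE TABLE ip2country (ipfrom INTEGER, ipto INTEGER, countrycode2 TEXT)"
)
cur.execute(
    "INSERT INTO ip2country VALUES (?, ?, ?)", (736425984, 736427007, "MY")
)

cur.execute("SELECT * FROM ip2country WHERE countrycode2 = ?", ("MY",))
rows = cur.fetchall()
first_ipfrom = rows[0]["ipfrom"]         # lookup by column name
results = [dict(row) for row in rows]    # plain dicts, one per row
print(results)
conn.close()
```

Rows addressed by name keep working when the column order of SELECT * changes, which positional indexing does not.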
SQLAlchemy
http://www.sqlalchemy.org/
First release in 2005
Now at version 1.0.8
What is it
• Provides helpers, tools & components to assist with database access
• Provides a consistent and full featured façade over the Python DBAPI
• Provides an optional object relational mapper (ORM)
• Foundation for many Python third party libraries & tools
• It doesn’t hide the database; you need to understand SQL
SQLAlchemy Overview
SQLAlchemy Core – The Engine
from sqlalchemy import create_engine
engine = create_engine('postgresql://ip2country_rw:secret@localhost/ip2countrydb')
engine.execute("""
    create table registry (
        id serial primary key,
        name text
    )
""")
engine.execute("""insert into registry(name) values('apnic')""")
engine.execute("""insert into registry(name) values('aprn')""")
engine.execute("""insert into registry(name) values('lacnic')""")
SQLAlchemy Core – SQL Expression Language
from sqlalchemy import create_engine, Table, Column, Integer, String, MetaData
engine = create_engine('postgresql://ip2country_rw:secret@localhost/ip2countrydb', echo=True)
metadata = MetaData()
registry = Table('registry', metadata,
    Column('id', Integer, autoincrement=True, primary_key=True),
    Column('name', String(10)))
metadata.create_all(engine)  # create table if it doesn't exist
# auto construct insert statement with binding parameters
ins = registry.insert().values(name='dummy')
conn = engine.connect()  # get database connection
# insert multiple rows in one call; the dicts supply the bound values
# (no commit is issued here: SQLAlchemy 1.0 autocommits this statement)
conn.execute(ins, [{'name': 'apnic'}, {'name': 'aprn'}, {'name': 'lacnic'}])
SQLAlchemy Core – SQL Expression Language
from sqlalchemy import create_engine, Table, Column, Integer, String, MetaData
from sqlalchemy.sql import select
engine = create_engine('postgresql://ip2country_rw:secret@localhost/ip2countrydb', echo=True)
metadata = MetaData()
registry = Table('registry', metadata,
    Column('id', Integer, autoincrement=True, primary_key=True),
    Column('name', String(10)))
# auto create select statement
s = select([registry])
conn = engine.connect()
result = conn.execute(s)
for row in result:
    print(row)
SQLAlchemy ORM
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import create_engine, Table, Column, Integer, String
Base = declarative_base()

class Registry(Base):
    __tablename__ = 'registry'
    id = Column(Integer, autoincrement=True, primary_key=True)
    name = Column(String(10))

    def __repr__(self):
        return "<Registry(%r, %r)>" % (self.id, self.name)

engine = create_engine('postgresql://ip2country_rw:secret@localhost/ip2countrydb', echo=True)
Base.metadata.create_all(engine)
from sqlalchemy.orm import Session
session = Session(bind=engine)
apnic = session.query(Registry).filter_by(name='apnic').first()
print(apnic)
SQLAlchemy ORM
. . .
Base = declarative_base()

class Registry(Base):
    __tablename__ = 'registry'
    id = Column(Integer, autoincrement=True, primary_key=True)
    name = Column(String(10))

    def __repr__(self):
        return "<Registry(%r, %r)>" % (self.id, self.name)

engine = create_engine('postgresql://ip2country_rw:secret@localhost/ip2countrydb', echo=True)
Base.metadata.create_all(engine)
from sqlalchemy.orm import Session
session = Session(bind=engine)
mynic = Registry(name='mynic')
session.add(mynic)
session.commit()
Attributions
DB-API 2.0 PEP http://www.python.org/dev/peps/pep-0249/
Travis Spencer’s DB-API UML Diagram http://travisspencer.com/
Andrew Kuchling's introduction to the DB-API http://www.amk.ca/python/writing/DB-API.html
Andy Todd’s OSDC paper http://halfcooked.com/presentations/osdc2006/python_databases.html
Source of csv data used in examples from WebNet77 licensed under GPLv3 http://software77.net/geo-ip/
Contact Details
Mark Rees
mark at censof dot com
+Mark Rees
@hexdump42
hex-dump.blogspot.com