BMMB 597D - Practical Data Analysis for Life Scientists Week 15 - Lecture 29 István Albert BMB, Bioinformatics

Week 15 - Lecture 29 · Week 15 - Lecture 30 István Albert BMB, Bioinformatics. The Zen of Python import this Write and run: import this. About the course • I hope it was interesting

BMMB 597D - Practical Data Analysis for Life Scientists

Week 15 - Lecture 29

István Albert

BMB, Bioinformatics

Databases and Object Persistence

• Maintaining data across several runs

• Storing data in a relational database

Object persistence with the shelve module
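
A minimal sketch of saving an object with Python's standard shelve module. The file location, the key "genes" and the stored dictionary are made up for illustration.

```python
import os
import shelve
import tempfile

# Hypothetical location for the shelf; any writable path works.
path = os.path.join(tempfile.mkdtemp(), "mydata")

# Open (or create) the shelf and store an object under a string key.
# Leaving the "with" block closes the shelf and writes the data to disk.
with shelve.open(path) as db:
    db["genes"] = {"MYCN": "chr2", "TP53": "chr17"}
```

A shelf behaves like a dictionary whose keys must be strings; the values are written to disk instead of being kept only in memory.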

Read back the object in a different program
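
A sketch of the round trip, assuming a shelf stored at a made-up path under the hypothetical key "counts". The two "with" blocks stand in for two separate programs; in practice the second one would live in its own script and only needs to know the path and the key.

```python
import os
import shelve
import tempfile

path = os.path.join(tempfile.mkdtemp(), "mydata")  # hypothetical location

# "Program 1": save the object.
with shelve.open(path) as db:
    db["counts"] = [10, 20, 30]

# "Program 2": open the same shelf read-only and recall the object.
with shelve.open(path, flag="r") as db:
    counts = db["counts"]

print(counts, type(counts))
```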

You can save and recall just about any Python object
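
shelve stores values with pickle, so lists, dictionaries, sets, tuples and nested combinations of them all survive the round trip. The sketch below checks this with a few made-up objects; things that cannot be pickled, such as open file handles, cannot be stored.

```python
import os
import shelve
import tempfile

path = os.path.join(tempfile.mkdtemp(), "objects")  # hypothetical location

# A few illustrative objects of different types.
originals = {
    "a_list": [1, 2.5, "three"],
    "a_dict": {"name": "MYCN", "exons": (15998133, 15999520, 16003065)},
    "a_set": {"chr1", "chr2"},
}

# Save every object under its own key.
with shelve.open(path) as db:
    for key, value in originals.items():
        db[key] = value

# Recall them and compare with what was stored.
with shelve.open(path) as db:
    restored = {key: db[key] for key in originals}

print(restored == originals)
```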

Relational databases

• Data is organized in tables (rows and columns), and each column is described by a type (integer, float, string)

• Standardized query language called SQL

• Some people say SQL stands for Structured Query Language. True, with some caveats:

– SQL is not actually structured

– SQL is not just for query

– SQL is not a programming language

Most common model: client - server

• There is a server (for example UCSC MySQL server) that can be connected to via the MySQL client

• You can send a query to the server

• The MySQL client needs to be installed on your computer

• You need to know how the data is modeled at the UCSC servers.

MySQL query example

mysql -h genome-mysql.cse.ucsc.edu -A -u genome -D hg18 -e 'select * from refGene where name="NM_005378"'

*************************** 1. row ***************************
bin: 707
name: NM_005378
chrom: chr2
strand: +
txStart: 15998133
txEnd: 16004580
cdsStart: 15999637
cdsEnd: 16003670
exonCount: 3
exonStarts: 15998133,15999520,16003065,
exonEnds: 15998316,16000427,16004580,
id: 0
name2: MYCN
cdsStartStat: cmpl
cdsEndStat: cmpl
exonFrames: -1,0,1,

SQLite relational database in Python

We run this only once to initialize the database table that stores our data
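
A minimal sketch of such an initialization step with the standard sqlite3 module. The database file name, the table name "features" and its columns are assumptions chosen for illustration.

```python
import os
import sqlite3
import tempfile

# Hypothetical database file; SQLite keeps the whole database in this one file.
db_path = os.path.join(tempfile.mkdtemp(), "features.db")

conn = sqlite3.connect(db_path)  # creates the file if it does not exist

# "end" is an SQL keyword, so the end coordinate is called "stop" here.
conn.execute(
    """
    CREATE TABLE IF NOT EXISTS features (
        name  TEXT,
        chrom TEXT,
        start INTEGER,
        stop  INTEGER
    )
    """
)
conn.commit()
conn.close()
```

With IF NOT EXISTS the statement is harmless when run a second time; without it, SQLite reports an error because the table is already there.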

Populate the database
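
One way to populate a table, sketched with an in-memory database so that the example is self-contained. The table layout and the three rows are made up. The "?" placeholders let sqlite3 insert the values safely instead of building the SQL text by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway database for the example
conn.execute(
    "CREATE TABLE features (name TEXT, chrom TEXT, start INTEGER, stop INTEGER)"
)

# Illustrative rows: (name, chrom, start, stop)
rows = [
    ("geneA", "chr1", 100, 500),
    ("geneB", "chr1", 800, 1200),
    ("geneC", "chr2", 50, 300),
]

# executemany runs the same INSERT once per row.
conn.executemany("INSERT INTO features VALUES (?, ?, ?, ?)", rows)
conn.commit()  # make the inserts permanent

count = conn.execute("SELECT COUNT(*) FROM features").fetchone()[0]
print(count)
```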

Query the database
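
A sketch of a simple query, again on a made-up in-memory table. It selects the features on one chromosome that start after a given position and returns them sorted by start coordinate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE features (name TEXT, chrom TEXT, start INTEGER, stop INTEGER)"
)
conn.executemany(
    "INSERT INTO features VALUES (?, ?, ?, ?)",
    [
        ("geneA", "chr1", 100, 500),
        ("geneB", "chr1", 800, 1200),
        ("geneC", "chr2", 50, 300),
    ],
)

# The values for the "?" placeholders are passed separately as a tuple.
cursor = conn.execute(
    "SELECT name, start, stop FROM features "
    "WHERE chrom = ? AND start > ? ORDER BY start",
    ("chr1", 200),
)
hits = cursor.fetchall()  # a list of tuples, one per matching row

for name, start, stop in hits:
    print(name, start, stop)
```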

Populate your database with the GFF file

For simplicity, we store the feature type in a column called name
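
A sketch of loading GFF features into SQLite. GFF is a tab-separated format with nine columns (sequence, source, type, start, end, score, strand, phase, attributes). The three data lines, the "demo" source and the table layout below are invented for illustration; a real file would be opened with open() instead of io.StringIO.

```python
import io
import sqlite3

# Made-up GFF content standing in for a real file.
gff_text = (
    "##gff-version 3\n"
    "chr1\tdemo\tgene\t100\t500\t.\t+\t.\tID=g1\n"
    "chr1\tdemo\texon\t100\t250\t.\t+\t.\tParent=g1\n"
    "chr2\tdemo\tgene\t50\t300\t.\t-\t.\tID=g2\n"
)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE features "
    "(name TEXT, chrom TEXT, start INTEGER, stop INTEGER, strand TEXT)"
)


def parse_gff(stream):
    """Yield (name, chrom, start, stop, strand) for each feature line."""
    for line in stream:
        if line.startswith("#") or not line.strip():
            continue  # skip comments, directives and blank lines
        cols = line.rstrip("\n").split("\t")
        # Column 3 is the feature type, stored here in the "name" column.
        yield cols[2], cols[0], int(cols[3]), int(cols[4]), cols[6]


# executemany accepts any iterable of rows, including a generator.
conn.executemany(
    "INSERT INTO features VALUES (?, ?, ?, ?, ?)",
    parse_gff(io.StringIO(gff_text)),
)
conn.commit()

loaded = conn.execute("SELECT COUNT(*) FROM features").fetchone()[0]
print(loaded)
```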

Query your database
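
Two queries that are often useful on a feature table, sketched on made-up data: counting the features of each type, and finding every feature that overlaps a given interval. The table layout is the same assumed one, with the feature type in the "name" column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE features (name TEXT, chrom TEXT, start INTEGER, stop INTEGER)"
)
conn.executemany(
    "INSERT INTO features VALUES (?, ?, ?, ?)",
    [
        ("gene", "chr1", 100, 500),
        ("exon", "chr1", 100, 250),
        ("exon", "chr1", 300, 500),
        ("gene", "chr2", 50, 300),
    ],
)

# How many features of each type are there?
per_type = conn.execute(
    "SELECT name, COUNT(*) FROM features GROUP BY name ORDER BY name"
).fetchall()

# Which features overlap chr1:200-350? Two intervals overlap when
# each one starts before the other one ends.
overlapping = conn.execute(
    "SELECT name, start, stop FROM features "
    "WHERE chrom = ? AND start <= ? AND stop >= ? "
    "ORDER BY start, stop",
    ("chr1", 350, 200),
).fetchall()

print(per_type)
print(overlapping)
```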

Exercise: explore various queries

You may want to google: SQL tutorial

BMMB 597D - Practical Data Analysis for Life Scientists

Week 15 - Lecture 30

István Albert

BMB, Bioinformatics

The Zen of Python

import this

Write and run: import this

About the course

• I hope it was interesting

• I hope it was useful

• Future plans: expand on the subjects and tackle more difficult problems in a second lecture series

• This course may or may not be offered in the future.

• It depends on you (the potential audience), on advisors and on the administration.

• If you liked it, mention this to your advisor, committee members, etc.

Computation == Thought

Advice

If you know what an object IS, then you will know what it DOES

Print it. Check its type. Check its content.
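
The three checks can be sketched as follows, using a made-up dictionary as the object under inspection.

```python
# An illustrative object; in practice this is whatever a function returned.
record = {"name2": "MYCN", "chrom": "chr2", "exonCount": 3}

print(record)          # print it
print(type(record))    # check its type
print(len(record))     # check its content: how many items ...
print(sorted(record))  # ... and which keys it holds

# dir() lists what the object can do; names starting with "_" are internal.
methods = [name for name in dir(record) if not name.startswith("_")]
print(methods)
```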

Simplicity – the essential ingredient

• It is always easier to create a process than to comprehend it!

• Keep it simple! Don’t repeat yourself.

• Don’t be afraid to toss the program away. If you can’t debug it, toss it away and start fresh with a slightly different perspective

Visit Biostar – over 1000 questions!

http://biostar.stackexchange.com/

Ask your questions there! We’ll try to build it into an extensive knowledge base!

Bioinformatics with Python

• BioPython – has parsers for a large number of bioinformatics formats

• PyFasta – indexes very large files so that you can quickly access any part of a genome

• BX-python – very good interval handling data structures

• Pygr – graph representation for biological data

• PyCogent – evolutionary algorithms

Search for the name to find them

Optimizing programs

• All programs have bottlenecks

• You only need to optimize if there is a problem (or you foresee one)

• Don’t optimize prematurely

• Make it work, make it right, make it fast

How to collect profiling information from a program
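
A minimal sketch using the standard cProfile module. The functions slow_sum and main and the output file name are hypothetical stand-ins for the program being profiled.

```python
import cProfile
import os
import tempfile


def slow_sum(n):
    """Deliberately simple work so that the profiler has something to time."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def main():
    return [slow_sum(20000) for _ in range(20)]


# Hypothetical file that will hold the collected statistics.
stats_path = os.path.join(tempfile.mkdtemp(), "profile.out")

profiler = cProfile.Profile()
profiler.enable()    # start recording function calls
main()
profiler.disable()   # stop recording
profiler.dump_stats(stats_path)  # save the raw statistics for later
```

A whole script can also be profiled without changing it by running python -m cProfile -o profile.out script.py from the command line.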

Display the statistics
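
Collected statistics can be sorted and printed with the standard pstats module. The sketch below profiles a made-up function, slow_sum, and reports the entries with the largest cumulative time; writing into a StringIO buffer is only there so that the report is available as a string.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Hypothetical workload to be profiled."""
    total = 0
    for i in range(n):
        total += i * i
    return total


profiler = cProfile.Profile()
profiler.enable()
for _ in range(20):
    slow_sum(20000)
profiler.disable()

buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)

# Shorten file paths, sort by cumulative time, show at most five entries.
stats.strip_dirs().sort_stats("cumulative").print_stats(5)

report = buffer.getvalue()
print(report)
```

pstats.Stats also accepts the name of a file written by dump_stats, so the statistics can be examined long after the profiled program has finished.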

From XKCD: http://xkcd.com/353/

Thanks!