PhD Lesson
Construction of Biomedical Database Applications
Case Study: Laboratory Assistant Suite (LAS)
Started in May 2011, LAS stems from the joint efforts of IRCC and the Politecnico di Torino
Players
IRCC contribution
• Strategy
• Workflow and data-flow analysis
• User interface definition
• On-site implementation
POLITO contribution
• Database & data warehouse
• Analytical tools & software features
• IT
Context – Personalized Medicine in Oncology
Figure 11.12 The Biology of Cancer (© Garland Science 2007)
ASSUMPTION I:
Cancer is a genetic disease caused by the progressive accumulation of gene mutations
ASSUMPTION II:
If mutations are causative, their nature is likely to influence the behaviour (biology) of the system; in particular, they are predicted to determine responses to perturbations (e.g. drugs)
ASSUMPTION III:
Mutations (or their surrogates) can be exploited to stratify patients for therapy
EVIDENCE I:
Precision cancer medicine works: Selective inhibition of ‘driver’ mutations can result in dramatic
clinical benefit
EVIDENCE II:
a. Precision cancer medicine stands on exceptions
b. ‘Drivers’ are not always ‘targets’
Exceptions become rules only if confirmed on a population basis:
• Only 10% of NSCLCs harbour EGFR mutations, and only 40% of EGFR-mutant tumours respond to EGFR inhibitors:
  • overall prevalence of responders: 4%
• Only 4% of NSCLCs harbour ALK translocations, and only 50% of ALK-translocated tumours respond to ALK inhibitors:
  • overall prevalence of responders: 2%
• Response to BRAF or MEK inhibition in BRAF-mutant melanoma: 60%
• Response to BRAF or MEK inhibition in BRAF-mutant CRC: 2%
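The overall prevalence of responders quoted above is the product of two fractions: the share of tumours carrying the alteration and the share of those that respond. A minimal sketch of that arithmetic, using only the percentages from the slide (the function name `overall_responders` is illustrative):

```python
def overall_responders(alteration_prevalence, response_rate):
    """Fraction of all patients who both carry the alteration and respond."""
    return alteration_prevalence * response_rate


# Percentages taken from the slide, expressed as fractions.
egfr = overall_responders(0.10, 0.40)  # EGFR-mutant NSCLC
alk = overall_responders(0.04, 0.50)   # ALK-translocated NSCLC

print(f"EGFR responders among all NSCLCs: {egfr:.0%}")
print(f"ALK responders among all NSCLCs: {alk:.0%}")
```

This is why a treatment that works well in a molecularly defined subgroup can still look ineffective in an unselected population.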
CONSIDERATION I:
Reliable preclinical models are needed to prioritize hypothesis validation in patients (clinical trials), owing to ethical, economic and social constraints.
• Understanding inter-individual tumour heterogeneity needs a reference background:
  • Focus on one specific tumour type
• Pinpointing exceptions needs big numbers:
  • Collect many cases
• Identifying exceptions (and the contextual mutational milieu) needs integrated approaches with reliable outcomes:
  • Multi-dimensional genomic exploration of high-quality tumour material
CONSIDERATION II:
Direct transplantation of surgical specimens into immunocompromised mice can generate a high-fidelity preclinical platform for anticipating clinical results
• Reliable simulation of phase II trials for investigational drugs
• Identification of new predictive biomarkers for approved drugs
• Multiparametric evaluation of genetic determinants for patient stratification
• Comparative evaluation of alternative treatment protocols
Context – Experimental Model
[Timeline figure] Dates: Oct 2008, Oct 2010, May 2011, Apr 2012, Jan 2013
Events: CRC banking started; Evaluation of commercial LIMS solution; LAS project started; LAS started working
N° of collected specimens: 22, 148, 235, 480, 614
Context – Facts & Numbers
LAS manages (since April 2012):
• a collection of 622 surgical samples
• 7158 mice
• 18537 measurements of tumour growth
• 1656 mice treated with 44 different protocols and schedules
• 51131 archived aliquots of biological material
Data Flow
[Diagram labels: Operation, Tissue Aliquots, Derived Aliquots, Storage, BIOBANKING, EXPERIMENTS, Molecular Experiments, Implants, Mouse, Explants, XENOPATIENTS, Treatments, Measurements, Images, Next Generation Sequencing]
Requirements
Data Entry
• Real-time
• Time saving
• User friendly
• Error proof
Data Analysis
• Integrative
• Reproducible
• Intuitive
• No programming skills required
From theory to practice
Waterfall model
Requirements
• Feasibility study
• Requirements analysis
• Requirements definition
Design
• Define software system functions
• Establish an overall system architecture
• Unified Modeling Language (UML)
Implementation
• Code generation
• Definition of logically separable parts of the software (units)
• Unit testing done by the developer
Verification
• Integration and testing of the complete system
• Testing units against the requirements as specified
• System delivered to the client
Maintenance
• Identification of problems
• Errors fixed
• Performance improvements
Agile model
• Customer satisfaction by rapid delivery of useful software
• Welcome changing requirements, even late in development
• Working software is delivered frequently
• Working software is the principal measure of progress
• Sustainable development
• Close cooperation
• Face-to-face conversation is the best form of communication (co-location)
• Continuous attention to technical excellence and good design
• Simplicity - the art of maximizing the amount of work not done - is essential
• Self-organizing teams
• Regular adaptation to changing circumstances
Database design
• Conceptual design. The purpose is to represent the informal requirements of an application in terms of a conceptual schema that refers to a conceptual data model
• Logical design. Translation of the conceptual schema, defined in the preceding phase, into the logical schema of the database that refers to a logical data model
• Physical design. The logical schema is completed with the details of the physical implementation (file organization and indexes) on a given DBMS. The product is called the physical schema and refers to a physical data model
The Entity Relationship model
• Conceptual data model
• Provides a series of constructs capable of describing the data requirements
• Easy to understand
• Independent of the criteria for the management and organization of data on a database system
• For every construct, there is a corresponding graphical representation
• Allows an E-R schema to be defined diagrammatically
ER constructs
• Entity
  • represents classes of objects (facts, things, people, for example) that have properties in common and an autonomous existence
• Attribute
  • describes the elementary properties of entities or relationships
• Relationship
  • represents logical links between two or more entities
• Cardinalities
  • specified for each entity participating in a relationship
  • describe the maximum and minimum number of relationship occurrences in which an entity occurrence can participate
  • for the minimum cardinality, zero or one
  • for the maximum cardinality, one or many (N)
ER constructs
• Identifiers
  • specified for each entity
  • describe the concepts (attributes and/or entities) of the schema allowing the unambiguous identification of the entity occurrences
  • internal identifier (key)
    • formed by one or more attributes of the entity itself
  • external identifier (foreign key)
    • used when the attributes of an entity are not sufficient to identify its occurrences unambiguously
    • other entities need to be involved in the identification
    • the entity to identify participates in the relationship with cardinality (1,1)
• Generalization
  • represents logical links between one parent entity and one or more child entities
  • the parent entity is more general in the sense that it comprises child entities as a particular case
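The (min, max) cardinality notation can be read as a simple range check on how many relationship occurrences an entity occurrence takes part in. The sketch below is only an illustration: the `Cardinality` class is hypothetical. The (1,1) side follows the external-identifier rule above, while the (0,N) side for a mouse is an assumption.

```python
from dataclasses import dataclass
from typing import Union

MANY = "N"  # marker for an unbounded maximum cardinality


@dataclass(frozen=True)
class Cardinality:
    minimum: int              # 0 (optional) or 1 (mandatory)
    maximum: Union[int, str]  # 1 or MANY

    def allows(self, occurrences: int) -> bool:
        """True if this number of relationship occurrences is permitted."""
        if occurrences < self.minimum:
            return False
        return self.maximum == MANY or occurrences <= self.maximum


# An externally identified entity participates with cardinality (1,1).
explant_side = Cardinality(1, 1)
# Assumption: a mouse may have zero or many explants.
mouse_side = Cardinality(0, MANY)

print(explant_side.allows(1), explant_side.allows(0))
print(mouse_side.allows(0), mouse_side.allows(12))
```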
Logical design
Goals
• Construction of a relational schema
• Representing correctly and efficiently all of the information described by an ER schema
Design steps
• Restructuring of the Entity-Relationship schema
• Optimization of the schema
• Translation into the logical model
Entity1 (ID1, attr_a, attr_b, …)
Entity2 (ID2, attr_x, attr_y, …)
Relationship1 (ID1, ID2, attr_r, …)
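As a rough illustration of the translation step, the generic schema above can be written as relational tables. This sketch uses Python's built-in `sqlite3` module and assumes a many-to-many relationship, so `Relationship1` gets a composite key made of the two entity identifiers. The column types and sample values are assumptions, not part of the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE Entity1 (ID1 INTEGER PRIMARY KEY, attr_a TEXT, attr_b TEXT);
CREATE TABLE Entity2 (ID2 INTEGER PRIMARY KEY, attr_x TEXT, attr_y TEXT);
-- The relationship becomes its own table, keyed by both entity identifiers.
CREATE TABLE Relationship1 (
    ID1 INTEGER NOT NULL REFERENCES Entity1(ID1),
    ID2 INTEGER NOT NULL REFERENCES Entity2(ID2),
    attr_r TEXT,
    PRIMARY KEY (ID1, ID2)
);
""")

conn.execute("INSERT INTO Entity1 VALUES (1, 'a', 'b')")
conn.execute("INSERT INTO Entity2 VALUES (10, 'x', 'y')")
conn.execute("INSERT INTO Relationship1 VALUES (1, 10, 'r')")

# Joining through the relationship table reconstructs the linked pair.
rows = conn.execute("""
    SELECT e1.attr_a, e2.attr_x, r.attr_r
    FROM Relationship1 r
    JOIN Entity1 e1 ON e1.ID1 = r.ID1
    JOIN Entity2 e2 ON e2.ID2 = r.ID2
""").fetchall()
print(rows)
```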
Data flow example
Database design example
Mouse (barcode, status, gender, deathDate, birthDate)
Database design example
Mouse (barcode, status, gender, deathDate, birthDate, mouseStrainName)
MouseStrain (mouseStrainName, description, linkToDoc)
Database design example
Explant (date, operator, mouseBarcode, scope)
Aliquot (barcode, tissueType, tumorType, explantDate*, explantOperator*, mouseBarcode*)
Implant (date, operator, aliquotBarcode, badQuality, site, mouseBarcode)
MeasurementSerie (date, time, type)
MouseIsMeasured (date, time, mouseBarcode, value)
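One way to see how the measurement tables tie back to the mice is to declare them with foreign keys and let the database reject inconsistent rows. The sketch below uses `sqlite3` and a reduced `Mouse` table. The primary keys, column types and sample values are assumptions, because the slide text does not show which attributes are underlined as keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Mouse (barcode TEXT PRIMARY KEY, status TEXT);
CREATE TABLE MeasurementSerie (
    date TEXT, time TEXT, type TEXT,
    PRIMARY KEY (date, time)              -- assumed key
);
CREATE TABLE MouseIsMeasured (
    date TEXT, time TEXT,
    mouseBarcode TEXT REFERENCES Mouse(barcode),
    value REAL,
    PRIMARY KEY (date, time, mouseBarcode),  -- assumed key
    FOREIGN KEY (date, time) REFERENCES MeasurementSerie(date, time)
);
""")

# Hypothetical sample data.
conn.execute("INSERT INTO Mouse VALUES ('M001', 'Implanted')")
conn.execute("INSERT INTO MeasurementSerie VALUES ('2012-04-02', '10:00', 'caliper')")
conn.execute("INSERT INTO MouseIsMeasured VALUES ('2012-04-02', '10:00', 'M001', 120.5)")

# A measurement for a mouse that does not exist violates the foreign key.
try:
    conn.execute("INSERT INTO MouseIsMeasured VALUES ('2012-04-02', '10:00', 'M999', 80.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

stored = conn.execute("SELECT COUNT(*) FROM MouseIsMeasured").fetchone()[0]
print(rejected, stored)
```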
Querying the database
• SQL (Structured Query Language)
  • designed for managing data held in a relational database management system
  • example:
SELECT M.barcode, M.mouseStrainName
FROM Mouse M, Explant E
WHERE M.barcode = E.mouseBarcode
AND M.status = 'Implanted';
• ORM (Object-relational mapping)
  • programming technique for converting data between incompatible type systems in object-oriented programming languages
  • creates a "virtual object database" used from within the programming language
  • maps database table rows to objects
  • allows relations to be established between those objects
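To make the two ideas above concrete, the sketch below runs the example query against a small in-memory SQLite database and then maps each result row to a Python object, which is the core of what an ORM automates. The reduced tables, the sample rows and the `Mouse` dataclass are hypothetical, and this is not how Django's ORM is implemented.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Mouse:
    barcode: str
    mouseStrainName: str


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Mouse (barcode TEXT PRIMARY KEY, status TEXT, mouseStrainName TEXT);
CREATE TABLE Explant (date TEXT, operator TEXT, mouseBarcode TEXT, scope TEXT);
INSERT INTO Mouse VALUES ('M001', 'Implanted', 'StrainA');
INSERT INTO Mouse VALUES ('M002', 'Dead', 'StrainA');
INSERT INTO Mouse VALUES ('M003', 'Implanted', 'StrainB');
INSERT INTO Explant VALUES ('2012-05-01', 'op1', 'M001', 'archive');
INSERT INTO Explant VALUES ('2012-05-02', 'op1', 'M002', 'archive');
""")

# Same logic as the slide's query, written with an explicit JOIN
# and a bound parameter instead of a literal.
query = """
SELECT M.barcode, M.mouseStrainName
FROM Mouse M JOIN Explant E ON M.barcode = E.mouseBarcode
WHERE M.status = ?
"""

# "Maps database table rows to objects": one Mouse instance per row.
mice = [Mouse(*row) for row in conn.execute(query, ("Implanted",))]
print(mice)
```

Only M001 is returned: M002 has an explant but is not in the 'Implanted' status, and M003 has no explant.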
Model View Controller
• Django: a high-level Python Web framework that encourages rapid development and clean, pragmatic design
• Makes it easier to build better Web apps more quickly and with less code
• The Web framework for perfectionists with deadlines
Features
• MVC architecture
• Object-Relational Mapper
• Templating Language
• Automatic Language
• Elegant URLs
• Unicode support
• Cache framework
• Testing framework
• Solid security emphasis
• Send emails easily
• Nice support for forms
• Great docs
• Friendly community
Build a Django project
$ django-admin.py startproject xenopatients
• manage.py: command-line utility to interact with the Django project
• xenopatients/: the actual Python package of the project, used to import anything inside it
• xenopatients/__init__.py: indicates that this directory is a Python package
• xenopatients/settings.py: settings/configuration for the project
• xenopatients/urls.py: URL declarations
• xenopatients/wsgi.py: an entry-point for WSGI-compatible web servers to serve your project
Run server
$ ./manage.py runserver
Validating models...
0 errors found
March 07, 2013 - 15:50:53
Django version 1.5, using settings 'xenopatients.settings'
Development server is running at http://127.0.0.1:8000/
Quit the server with CONTROL-C.
Create application
$ python manage.py startapp xenos
• xenos/: application belonging to the Django project
• __init__.py: indicates that this directory is a Python package
• models.py: defines Python classes mapped onto database tables
• tests.py: simple routines to check the operation of the code
• views.py: each view defines a “type” of Web page that serves a specific function with a specific template, and is represented by a simple Python function
Define the model
Edit the file /xenos/models.py
from django.db import models

class Mice(models.Model):
    barcode = models.BigIntegerField(primary_key=True, editable=False)
    # blank=True only relaxes form validation; null=True lets the database store an empty value
    birth_date = models.DateField(db_column='birthdate', blank=True, null=True)
    death_date = models.DateField(db_column='deathdate', blank=True, null=True)
    gender = models.CharField(max_length=1)
    status = models.CharField(max_length=20)
    id_mouse_strain = models.ForeignKey('Mouse_strain', blank=True, null=True, db_column='id_mouse_strain')

    def __unicode__(self):
        # __unicode__ must return text, and barcode is an integer
        return unicode(self.barcode)

class Mouse_strain(models.Model):
    id_strain = models.BigIntegerField(primary_key=True, editable=False)
    mouse_strain_name = models.CharField(max_length=45, unique=True)
    description = models.TextField()
    linkToDoc = models.CharField(max_length=80)

    def __unicode__(self):
        return self.mouse_strain_name
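Define the URLs and views
Edit the file /xenos/urls.py
from django.conf.urls import patterns
from xenos import views

# Each entry maps a regular expression to the view function that handles it
urlpatterns = patterns('',
    (r'^$', views.index),
    (r'^miceloading/$', views.miceLoading),
    (r'^miceStatus/$', views.changeStatus),
    …
)
Edit the file /xenos/views.py
from django.contrib.auth.decorators import login_required
from django.shortcuts import render_to_response
from django.template import RequestContext

@login_required  # only authenticated users can reach this view
def index(request):
    if request.method == 'GET':
        name = request.user.username
        return render_to_response('index.html', {'name': name}, RequestContext(request))
    …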
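The URL patterns above are ordinary regular expressions that are tried in order against the requested path. The sketch below imitates that lookup with the standard `re` module. It is a simplification and not Django's actual resolver: the `resolve` function is hypothetical, and view functions are replaced by their names as strings.

```python
import re

# Same patterns as in urls.py; views are represented by name only.
urlpatterns = [
    (r'^$', 'index'),
    (r'^miceloading/$', 'miceLoading'),
    (r'^miceStatus/$', 'changeStatus'),
]


def resolve(path):
    """Return the name of the first view whose pattern matches, else None.

    `path` is given without the leading slash, which is how the
    patterns above are written.
    """
    for pattern, view_name in urlpatterns:
        if re.search(pattern, path):
            return view_name
    return None


print(resolve(''))
print(resolve('miceloading/'))
print(resolve('unknown/'))
```

Because every pattern is anchored with `^` and `$`, a path such as `miceloading/extra` matches nothing here.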
Activate the admin site
Edit the file /xenopatients/urls.py
# django.conf.urls.defaults is deprecated; Django 1.5 projects import from django.conf.urls
from django.conf.urls import patterns, include

# Uncomment the next two lines to enable the admin:
from django.contrib import admin
admin.autodiscover()

urlpatterns = patterns('',
    # Uncomment the next line to enable the admin:
    (r'^admin/', include(admin.site.urls)),
)
References
• Software Engineering
  • I. Sommerville (2010) “Software Engineering (9th Edition)”
  • I. Sommerville (2007) “Ingegneria del software”
  • R. Miles, K. Hamilton (2006) “Learning UML 2.0”
  • M. Fowler (2010) “UML distilled. Guida rapida al linguaggio di modellazione standard”
• Database
  • C. Coronel, S. Morris, P. Rob (2012) “Database Systems: Design, Implementation, and Management”
  • P. Atzeni, S. Ceri, S. Paraboschi, R. Torlone (2009) “Basi di dati – Modelli e linguaggi di interrogazione”
• Python & Django
  • A. Martelli (2006) “Python in a Nutshell, Second Edition”
  • M. Lutz (2009) “Learning Python: Powerful Object-Oriented Programming”
  • M. Dawson (2010) “Python Programming for the Absolute Beginner, 3rd Edition”
  • Django website https://www.djangoproject.com/
  • A. Holovaty, J. Kaplan-Moss (2009) “The Definitive Guide to Django: Web Development Done Right”
  • M. Beri (2009) “Sviluppare applicazioni web con Django”
References (context)
• Personalized medicine in oncology
  • Hait WN, Cancer Discov 1, 383 (2011).
  • MacConaill LE et al., Cancer Discov 1, 297 (2011).
  • Haber DA, Gray NS, Baselga J, Cell 145, 19 (2011).
• Unmet needs and preclinical models
  • de Bono JS, Ashworth A, Nature 467, 543 (2010).
  • Tentler JJ et al., Nat Rev Clin Oncol 9, 338 (2012).
• Our work
  • Baralis E et al., J Med Systems (2012).
  • Migliardi G et al., Clin Cancer Res 18, 2515 (2012).
  • Bertotti A et al., Cancer Discov 1, 508 (2011).
  • Galimi F et al., Clin Cancer Res 17, 3146 (2011).