COURSE OVERVIEW
LESSON 2: BASICS OF PYTHON
- Variables
- Operators
- Loop & Conditional
- Functions??
VARIABLES
Numbers = Integer, float
String = "string"
List = []
Tuple = ()
Dictionary = {"A": 1}
Boolean = True, False
variable
Function drink()
Let's write your variable!
VARIABLES
Numbers = Integer, float
String = "string" - List of characters
List = []
Tuple = () - read only list
Dictionary = {"A": 1}
Boolean = True, False
- Function type()
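As a minimal sketch of the variable types listed above, the snippet below passes one value of each type to type(). The sample values and the names samples, type_names and read_only are made up for illustration. It also tries to modify a tuple, to show what "read only list" means in practice.

```python
# One sample value per variable type from the slide (values are illustrative)
samples = [42, 3.14, "string", [1, 2, 3], (1, 2, 3), {"A": 1}, True]

# type(value) returns the type object; __name__ gives its short name
type_names = [type(value).__name__ for value in samples]
for value, name in zip(samples, type_names):
    print(repr(value), "->", name)

# A tuple is read only: assigning to one of its elements raises a TypeError
read_only = (1, 2, 3)
try:
    read_only[0] = 99
    tuple_is_read_only = False
except TypeError:
    tuple_is_read_only = True
print("tuple is read only:", tuple_is_read_only)
```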
print(mylist)       # Prints complete list
print(mylist[0])    # Prints first element of the list
print(mylist[1:3])  # Prints the 2nd and 3rd elements
print(mylist[2:])   # Prints elements starting from 3rd element
VARIABLES
print(mystring)       # Prints complete string
print(mystring[0])    # Prints first element of the string
print(mystring[1:3])  # Prints the 2nd and 3rd elements
print(mystring[2:])   # Prints elements starting from 3rd element
print(mytuple)        # Prints complete tuple
print(mytuple[0])     # Prints first element of the tuple
…
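The indexing rules above apply in the same way to lists, strings and tuples. The following sketch, with made-up values and names (word, numbers), shows them on a string and a tuple. Note that in a slice [start:stop] the element at position stop is not included, and that a negative index counts from the end.

```python
word = "python"            # a string behaves like a list of characters
numbers = (10, 20, 30, 40)  # a tuple: indexed like a list, but read only

first = word[0]      # first character
middle = word[1:3]   # characters at positions 1 and 2 (position 3 is excluded)
rest = word[2:]      # everything from position 2 to the end
last = word[-1]      # negative indices count from the end

pair = numbers[1:3]  # slicing a tuple returns a tuple

print(first, middle, rest, last, pair)
```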
OPERATORS
Arithmetic Operators : +, - , *, /, %, **
Comparison Operators : ==, !=, <, >, <=, >=
Assignment Operators : =, +=, -=, *=, /=, %=, **=
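A short sketch of the three operator families listed above, using the arbitrary values 7 and 2. The variable names are illustrative only. Note that / always returns a float in Python 3, so the running total becomes a float from the /= step onward.

```python
a, b = 7, 2

# Arithmetic operators
arithmetic = {
    "a + b": a + b,
    "a - b": a - b,
    "a * b": a * b,
    "a / b": a / b,    # true division, always returns a float
    "a % b": a % b,    # remainder of the division
    "a ** b": a ** b,  # a raised to the power b
}

# Comparison operators always produce a Boolean
comparisons = {
    "a == b": a == b,
    "a != b": a != b,
    "a < b": a < b,
    "a > b": a > b,
    "a <= b": a <= b,
    "a >= b": a >= b,
}

# Assignment operators update a variable in place
total = 10
total += 5    # same as total = total + 5
total -= 3
total *= 2
total /= 4    # total is a float from here on
total %= 4
total **= 3

print(arithmetic)
print(comparisons)
print(total)
```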
LOOPS & CONDITIONAL
Repetition : while, for (with break, continue, pass)
for LOOP_VARIABLE in SEQUENCE:
    STATEMENTS
Separation : if, if … else, if … elif … else
if BOOLEAN EXPRESSION:
    STATEMENTS_1
elif BOOLEAN EXPRESSION:
    STATEMENTS_2
else:
    ALTERNATIVE STATEMENTS
for LOOP_VARIABLE in SEQUENCE:
    STATEMENTS
if LOOP_VARIABLE in SEQUENCE:
    STATEMENTS
else:
    STATEMENTS
Iterations
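The keywords break, continue and pass change how a loop iterates. As a small sketch with made-up names (evens, last_seen): continue skips to the next iteration, break leaves the loop entirely, and pass does nothing and only serves as a placeholder where a statement is required.

```python
evens = []
for n in range(10):
    if n % 2 == 1:
        continue      # odd number: skip the rest of this iteration
    if n > 6:
        break         # stop the whole loop once n is larger than 6
    evens.append(n)
print(evens)

# 'pass' is an empty statement: the loop runs, but its body does nothing
for n in range(3):
    pass
last_seen = n         # the loop variable keeps its last value after the loop
print(last_seen)
```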
FUNCTIONS
def functionname( parameters ):
    "function_docstring"
    statements
    return [expression]
A function is a block of organized, reusable code that is used to perform a single, related action. Functions provide better modularity for your application and a high degree of code reusing.
https://www.tutorialspoint.com/python/python_functions.htm
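As one possible instance of the template above, here is a small function with a parameter, a docstring, a statement and a return value. The function name fahrenheit_to_celsius and the test values are chosen only for illustration.

```python
def fahrenheit_to_celsius(temp_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    temp_c = (temp_f - 32) * 5 / 9
    return temp_c


# Calling the function: the argument is passed in, the result is returned
boiling = fahrenheit_to_celsius(212)
freezing = fahrenheit_to_celsius(32)
print(boiling, freezing)

# The docstring is stored on the function and is what help() displays
print(fahrenheit_to_celsius.__doc__)
```

Because the conversion lives in one function, it can be reused anywhere in a program instead of repeating the formula.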
WHERE TO GET HELP
World Wide Web
• www.google.com
Developer Community
• www.stackoverflow.com
Python Website (www.python.org)
• Python 3 Documentation: https://docs.python.org/3/
• Python 3 Tutorials: https://docs.python.org/3/tutorial/index.html
Variables
Types of variables
In [1]:
# Defining a variable and displaying its type
variable = [1, 3, 4, 5]
type(variable)
Out[1]:
list
In [2]:
# The 'input' function prompts the user to enter a value.
# input() returns a string, so int() is used to convert it to an integer.
# The type of the value is displayed after pressing Enter.
variable = int(input("insert variable: "))
type(variable)  # Sometimes how variables are defined is important
insert variable: 10
Out[2]:
int
In [3]:
# Multiple assignments on the same line are possible
a, b, c = 1, 2, 3
print(a, b, c, sep=" , ")
1 , 2 , 3
Working with lists
In [33]:
# Defining a string, which behaves like a list of individual characters
variable = "1,2,3,4,5,6"
type(variable)
Out[33]:
str
In [34]:
print(variable)
1,2,3,4,5,6
In [35]:
# Displaying a subset of a list by specifying indices.
# An index refers to the location of an element in a list.
# Indices start at 0 in Python, i.e. printing mylist[0] will display the first element of the variable mylist.
# In this string the commas are characters too, so each one occupies an index.
print(variable[0:3], variable[4:])
1,2 3,4,5,6
In [36]:
# Assigning list of numbers to variable
variable2 = [1, 2, 3, 4, 5, 6]
In [37]:
# Printing subset of elements in list of numbers
print(variable2[0:3], variable2[4:])
[1, 2, 3] [5, 6]
In [32]:
# Retrieving individual list elements from their indices
start = variable2[1]  # 2nd element of list stored in 'variable2' (again Python starts counting at 0)
end = variable2[5]    # 6th element of list stored in 'variable2'
print(start, end)
2 6
In [38]:
mylist = []               # initializing empty list
mylist.append([1, 2, 3])  # appending element to list, here the element is another list of numbers
mylist.append(2)          # appending element to list
print(mylist)
[[1, 2, 3], 2]
Operators
In [40]:
# Addition operator
variable = 10
print(variable + 5)
15
In [41]:
# Now let's try with str
variable = "four"
print(variable + "five")
fourfive
In [26]:
# We said that a str is like a list of individual characters, right?
# So what happens if we sum two lists using the '+' operator?
variable1 = [1, 2, 3]
variable2 = [4, 5, 6]
print(variable1 + variable2)
[1, 2, 3, 4, 5, 6]
Exercise: Try to sum one str or list with one int and see what happens
What would happen if we multiplied them?
Loops and Conditionals
In [1]:
# for loop example to construct a list dynamically
variablelist = []
for letter in [0, 1, 2]:
    # print(letter)
    variablelist.append(letter * 5)
print(variablelist)
[0, 5, 10]
list( ) and range( ) function
In [28]:
# converting a string to a list of characters and printing it
lista = list("1234")
print(lista)
['1', '2', '3', '4']
In [46]:
# using the 'range' function to automatically generate a list
# here we define a list of integer values that start from 10 and go up to, but do not include, 100 by increments of 5
listb = list(range(10, 100, 5))
print(listb)
[10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95]
In [30]:
# using a for loop to print the individual values in a range of 10 consecutive integers
for value in range(10):
    print(value)
0
1
2
3
4
5
6
7
8
9
In [12]:
# Converting a range of values to a list, then printing it
print(list(range(50, 100, 5)))
[50, 55, 60, 65, 70, 75, 80, 85, 90, 95]
In [48]:
# Comparing variables using the comparison operator '<' to check if one variable is smaller than another one
variable = 1
variable2 = 5
print(variable < variable2)
True
While
In [49]:
# Using a while loop to count up to 5, printing the count value at every iteration
count = 0
while count < 5:
    print(count)
    count += 1  # This is the same as count = count + 1
0
1
2
3
4
Conditional
In [33]:
# Small program to determine what to wear based on user-specified temperature
temperature = int(input('What is the temperature? '))
if temperature > 70:
    print('Wear shorts.')
elif temperature < 70:
    print('Wear long pants.')
else:
    print('You are wearing no pants!!')
What is the temperature? 30
Wear long pants.
Break, Continue, and Pass
In [14]:
# The for loop would repeat the code block 10 times,
# but 'break' exits the loop as soon as the counter variable reaches 5
number = 0
for number in range(10):
    number = number + 1
    if number == 5:
        break
    print('Number is ' + str(number))
print('Out of loop')
Number is 1
Number is 2
Number is 3
Number is 4
Out of loop
Below is extra material not covered in class, try it at your own risk ;)
Let's start coding!!
In [3]:
# Biopython is a Python library that can work with .ab1 files.
# It can be installed with the command: conda install biopython
from Bio import SeqIO
from Bio import pairwise2
from Bio.Seq import Seq
In [4]:
# This library provides a new type of variable called Bio.Seq.Seq.
# It behaves like a string but has extra functions, such as giving the complement of a sequence.
my_seq = Seq("AGTACACTGGT")
type(my_seq)
Out[4]:
Bio.Seq.Seq
In [5]:
my_seq.complement()
Out[5]:
Seq('TCATGTGACCA')
Opening the files
In [8]:
# We can open .ab1 files with the function SeqIO.read, giving it the path of the file on our computer and the format 'abi'.
file1 = SeqIO.read("/Users/Rafa/Desktop/file1.ab1", 'abi')
In [9]:
# This is everything that is inside the file; it is like a dictionary of dictionaries.
file1.annotations.keys()
Out[9]:
dict_keys(['sample_well', 'dye', 'polymer', 'machine_model', 'run_start', 'run_finish', 'abif_raw'])
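To make clear what the complement() call on my_seq returns, here is a minimal sketch in plain Python that does not use Biopython. The helper names complement and reverse_complement, and the lookup table COMPLEMENT, are written for this illustration and handle only the four uppercase letters A, C, G and T; Biopython's Seq class is the tool to use in practice.

```python
# Each nucleotide pairs with its partner: A with T, and C with G
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(sequence):
    """Return the complement of a DNA sequence given as a plain string."""
    return sequence.translate(COMPLEMENT)


def reverse_complement(sequence):
    """Return the complement read in the opposite direction."""
    return complement(sequence)[::-1]


my_seq_text = "AGTACACTGGT"  # same letters as my_seq in the notebook
print(complement(my_seq_text))
print(reverse_complement(my_seq_text))
```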
In [10]:
# There are even more...
file1.annotations['abif_raw'].keys()
Out[10]:
dict_keys(['AEPt1', 'AEPt2', 'APFN2', 'APXV1', 'APrN1', 'APrV1', 'APrX1', 'ARTN1', 'ASPF1', 'ASPt1', 'ASPt2', 'AUDT1', 'B1Pt1', 'B1Pt2', 'BCTS1', 'BufT1', 'CTID1', 'CTNM1', 'CTOw1', 'CTTL1', 'CpEP1', 'DATA1', 'DATA2', 'DATA3', 'DATA4', 'DATA5', 'DATA6', 'DATA7', 'DATA8', 'DATA9', 'DATA10', 'DATA11', 'DATA12', 'DCHT1', 'DSam1', 'DySN1', 'Dye#1', 'DyeN1', 'DyeN2', 'DyeN3', 'DyeN4', 'DyeW1', 'DyeW2', 'DyeW3', 'DyeW4', 'EPVt1', 'EVNT1', 'EVNT2', 'EVNT3', 'EVNT4', 'FTab1', 'FVoc1', 'FWO_1', 'Feat1', 'GTyp1', 'HCFG1', 'HCFG2', 'HCFG3', 'HCFG4', 'InSc1', 'InVt1', 'LANE1', 'LIMS1', 'LNTD1', 'LsrP1', 'MCHN1', 'MODF1', 'MODL1', 'NAVG1', 'NLNE1', 'NOIS1', 'PBAS1', 'PBAS2', 'PCON1', 'PCON2', 'PDMF1', 'PDMF2', 'PLOC1', 'PLOC2', 'PSZE1', 'PTYP1', 'PXLB1', 'RGNm1', 'RMXV1', 'RMdN1', 'RMdV1', 'RMdX1', 'RPrN1', 'RPrV1', 'RUND1', 'RUND2', 'RUND3', 'RUND4', 'RUNT1', 'RUNT2', 'RUNT3', 'RUNT4', 'Rate1', 'RunN1', 'S/N%1', 'SCAN1', 'SMED1', 'SMLt1', 'SMPL1', 'SPAC1', 'SPAC2', 'SPAC3', 'SVER1', 'SVER2', 'SVER3', 'Scal1', 'Scan1', 'TUBE1', 'Tmpr1', 'User1', 'phAR1', 'phCH1', 'phDY1', 'phQL1', 'phTR1', 'phTR2'])
In [11]:
# We can find the date when our file was created.
print(file1.annotations['run_start'])
2017-04-21 21:54:21
In [12]:
# The sequence in the file is stored in: file1.annotations['abif_raw']['PBAS1']
# but we can also access the raw spectrophotometry data in:
# file1.annotations['abif_raw']["DATA9"], file1.annotations['abif_raw']["DATA10"],
# file1.annotations['abif_raw']["DATA11"] and file1.annotations['abif_raw']["DATA12"].
# Each DATA list holds the values for one nucleotide; the information to decode them is in:
# file1.annotations['abif_raw']['FWO_1']
In [13]:
# There is not only raw data (Sample), but also probability data.
# The latter tells where the peaks are (e.g. Probability.peak_index(100)) and how good they are.
# This will be very helpful. That data is saved here:
PLOC1 = file1.annotations['abif_raw']['PLOC1']
In [14]:
# FWO_1 tells which nucleotide each of the DATA lists (DATA9, DATA10, DATA11 and DATA12) belongs to
print(file1.annotations['abif_raw']['FWO_1'])
# So DATA9 is G
# DATA10 is A
# ...
GATC
In [15]:
# Give them easy names
G = file1.annotations['abif_raw']["DATA9"]
A = file1.annotations['abif_raw']["DATA10"]
T = file1.annotations['abif_raw']["DATA11"]
C = file1.annotations['abif_raw']["DATA12"]

# Make a list with the 4 nucleotides together
datamerged = []
for e in range(len(G)):
    datamerged.append([G[e], A[e], T[e], C[e]])

# Keep only the values of the list that are in the peaks
peaks = []
for pos, val in enumerate(datamerged):
    if pos in PLOC1:
        peaks.append(val)

# At each peak, keep the nucleotide whose value is larger than the other three
total = []
for value in peaks:
    if value[0] > value[1] and value[0] > value[2] and value[0] > value[3]:
        total.append("G")
    if value[1] > value[0] and value[1] > value[2] and value[1] > value[3]:
        total.append("A")
    if value[2] > value[0] and value[2] > value[1] and value[2] > value[3]:
        total.append("T")
    if value[3] > value[0] and value[3] > value[1] and value[3] > value[2]:
        total.append("C")

# Join the nucleotides into one string
secuence = ""
for value in total:
    secuence += value
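The peak-picking steps above can also be tried without an .ab1 file. The sketch below runs the same idea on a tiny made-up data set: at every peak position, take the nucleotide with the strongest signal. The function call_bases, the traces dictionary and the numbers in it are invented for illustration. One difference from the notebook code is that on an exact tie this version picks the first nucleotide in the order string, whereas the chained comparisons above add nothing.

```python
def call_bases(traces, peak_positions, order="GATC"):
    """Return the nucleotide with the largest signal at each peak position.

    traces maps each nucleotide letter to its list of signal values;
    order plays the role of FWO_1 (which trace belongs to which nucleotide).
    """
    calls = []
    for pos in peak_positions:
        # Signal of each nucleotide at this position, in the given order
        signals = [traces[base][pos] for base in order]
        # Index of the strongest signal (first one wins on a tie)
        strongest = max(range(len(order)), key=lambda k: signals[k])
        calls.append(order[strongest])
    return "".join(calls)


# Made-up signal values for six positions; real traces have thousands
traces = {
    "G": [0, 9, 1, 0, 2, 0],
    "A": [0, 1, 1, 8, 0, 0],
    "T": [0, 0, 7, 1, 1, 0],
    "C": [0, 2, 0, 0, 9, 0],
}
peak_positions = [1, 3, 4]  # stands in for PLOC1

called = call_bases(traces, peak_positions)
print(called)
```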
In [16]:
print(secuence)
GGGGGAGCTGCATTGTTGTCAAGGCCATAGAGCCTCCCTAATTCTTTACAGTGATATCACACTCACTGATAAAAACCTCATTATCTTCTCCAGCATAGGTAAGGAAGGATATAAATCCATCTAGATGCCCAACCCACCCACTCACCTTGAGCTTGGCTAGCTCTTAGTTGGTGCACCACTTTCAGTGACAAATCTCACTTCCTGCCCTCACTGCTCTAAACCTTGTCCACTCTGTGTACTTCTGACCATTGATGTTGGCTCTGCTCTCTCCACACAGGTCACAACATTAACTCAGGAGGTTTCCCAGCTAGGGAGAGATATGAGAAGCATCATGCAACTTCTGGAAAACATCTTGTCACCTCAGCAGCCATCCCAGTTTTGTTCTCTACATCCCACCCCAATGTGTCCTTCCAGAGAAAGTTTACAGACTAGGGTGAGTTGGAGTGCTCACCAGCCTTGCCTACATTTGCAGGCAGGTGGAGCACATCTTTACCATGGTAATGTCGCCTCTGGATCCTGGAGTAGCGGTGGGAAGTTGTTATCCGCTCACAATTCCACACAACTTGCTAGCCGGAAGCAAAAAGCTAAAGCCCGTGGAGCCT
In [17]:
sec = file1.annotations['abif_raw']['PBAS1']
In [18]:
# The pairwise2 module allows us to compare sequences.
pairwise2.align.localms(secuence, sec, 2, -1, -5, -5)
Out[18]:
[('GGGGGAGCTGCATTGTTGTCAAGGCCATAGAGCCTCCCTAATTCTTTACAGTGATATCACACTCACTGATAAAAACCTCATTATCTTCTCCAGCATAGGTAAGGAAGGATATAAATCCATCTAGATGCCCAACCCACCCACTCACCTTGAGCTTGGCTAGCTCTTAGTTGGTGCACCACTTTCAGTGACAAATCTCACTTCCTGCCCTCACTGCTCTAAACCTTGTCCACTCTGTGTACTTCTGACCATTGATGTTGGCTCTGCTCTCTCCACACAGGTCACAACATTAACTCAGGAGGTTTCCCAGCTAGGGAGAGATATGAGAAGCATCATGCAACTTCTGGAAAACATCTTGTCACCTCAGCAGCCATCCCAGTTTTGTTCTCTACATCCCACCCCAATGTGTCCTTCCAGAGAAAGTTTACAGACTAGGGTGAGTTGGAGTGCTCACCAGCCTTGCCTACATTTGCAGGCAGGTGGAGCACATCTTTACCATGGTAATGTCGCCTCTGGATCCTGGAGTAGCGGTGGGAAGTTGTTATCCGCTCACAATTCCACACAACTTGCTAGCCGGAAGCAAAAAGCTAAAGCCCGTGGAGCCT', 'GCCGGAGCTGCATTGATGTCAAGGCCATAGAGCCTCCCTAATTCTTTACAGTGATATCACACTCACTGATAAAAACCTCATTATCTTCTCCAGCATAGGTAAGGAAGGATATAAATCCATCTAGATGCCCAACCCACCCACTCACCTTGAGCTTGGCTAGCTCTTAGTTGGTGCACCACTTTCAGTGACAAATCTCACTTCCTGCCCTCACTGCTCTAAACCTTGTCCACTCTGTGTACTTCTGACCATTGATGTTGGCTCTGCTCTCTCCACACAGGTCACAACATTAACTCAGGAGGTTTCCCAGCTAGGGAGAGATATGAGAAGCATCATGCAACTTCTGGAAAACATCTTGTCACCTCAGCAGCCATCCCAGTTTTGTTCTCTACATCCCACCCCAATGTGTCCTTCCAGAGAAAGTTTACAGACTAGGGTGAGTTGGAGTGCTCACCAGCCTTGCCTACATTTGCAGGCAGGTGGAGCACATCTTTACCATGGTAATGTCGCCTCTGGATCCTGGAGTAGCGGTGGGAGGTTGTTATCCGCTCACAATTCCACTCAACATGCTAGCCGGAAGCAAAAAGCTAAAGCCTGTGGAGCCA', 1181.0, 3, 601)]
In [19]:
import matplotlib.pyplot as plt
from collections import defaultdict
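In the pairwise2.align.localms call shown earlier, the four numbers are, in order, the score for a match (2), the score for a mismatch (-1), the penalty for opening a gap (-5) and the penalty for extending it (-5). To show how such numbers turn into an alignment score, here is a simplified sketch of local alignment scoring (the Smith-Waterman approach) in plain Python. It is not Biopython's implementation: the function local_alignment_score is written for this illustration, it returns only the best score and not the aligned sequences, and it assumes that every gap position simply costs 5.

```python
def local_alignment_score(seq_a, seq_b, match=2, mismatch=-1, gap=-5):
    """Return the best local alignment score between two strings.

    Simplified sketch: every gap position costs 'gap', and only the
    score is returned, not the alignment itself.
    """
    best = 0
    previous_row = [0] * (len(seq_b) + 1)
    for i in range(1, len(seq_a) + 1):
        current_row = [0] * (len(seq_b) + 1)
        for j in range(1, len(seq_b) + 1):
            # Align the two letters with each other
            pair = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            diagonal = previous_row[j - 1] + pair
            # Or align one letter with a gap in the other sequence
            gap_in_b = previous_row[j] + gap
            gap_in_a = current_row[j - 1] + gap
            # A local alignment may restart anywhere, so never go below 0
            current_row[j] = max(0, diagonal, gap_in_b, gap_in_a)
            best = max(best, current_row[j])
        previous_row = current_row
    return best


identical = local_alignment_score("ACGT", "ACGT")
one_mismatch = local_alignment_score("ACGTACGT", "ACGAACGT")
nothing_shared = local_alignment_score("AAAA", "TTTT")
print(identical, one_mismatch, nothing_shared)
```

With these scores, four matching letters give 4 x 2 = 8, and seven matches with one mismatch give 7 x 2 - 1 = 13. In the notebook output, the tuple holds the two aligned sequences, the score (1181.0) and the start and end positions of the alignment (3 and 601).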
In [20]:
plt.plot(G, color = 'blue')
plt.plot(A, color = 'red')
plt.plot(T, color = 'green')
plt.plot(C, color = 'yellow')
plt.xlim(3000,3500)
plt.ylim(0,1500)
plt.show()
In [21]:
# More information about this library:
# http://biopython.org/DIST/docs/tutorial/Tutorial.pdf