mansi-arora
7/25/2019 Numpy for Python
1/27
1/11/2016 01-numpy
file:///home/fractaluser/Downloads/01-numpy.html 1/27
Numpy highlights
ndarray: fast and space-efficient multidimensional array with vectorized arithmetic and
sophisticated broadcasting
standard vectorized math
reading / writing arrays to disk
memory-mapped file access
linear algebra, random number generation, Fourier transforms
integration with C, C++, and FORTRAN code
Creating an ndarray
The array function
The array function is the workhorse for creating numpy ndarrays on the fly from other
Python sequence-like objects, primarily tuples and lists.
In [2]:
import numpy as np
import random
np.array(range(3))
Out[2]:
array([0, 1, 2])
In [2]:
np.array((1, 2, 3))
Out[2]:
array([1, 2, 3])
In [3]:
np.array([1, 2, 3])
Out[3]:
array([1, 2, 3])
In [4]:
np.array(list('hello'))
Out[4]:
array(['h', 'e', 'l', 'l', 'o'], dtype='<U1')
Nested lists are treated as multi-dimensional arrays.
In [5]:
random.seed(3.141)
dataList = [[random.uniform(0, 9) for x in range(3)] for y in range(4)]
dataList
Out[5]:
[[4.844700289907117, 3.285473931529339, 2.1797684393413155],
 [5.824634396536993, 0.8824946651389621, 2.76732952458187],
 [1.8329068547314877, 4.527186437261438, 3.5724501538885134],
 [2.144914100647332, 4.951733405544532, 4.325440230285053]]
In [6]:
data = np.array(dataList)
data
Out[6]:
array([[ 4.84470029, 3.28547393, 2.17976844],
       [ 5.8246344 , 0.88249467, 2.76732952],
       [ 1.83290685, 4.52718644, 3.57245015],
       [ 2.1449141 , 4.95173341, 4.32544023]])
In [7]:
# Important information about an ndarray
print(data.ndim)   # Number of dimensions
print(data.shape)  # Shape of the ndarray
print(data.dtype)  # Data type contained in the array
2
(4, 3)
float64
Other functions to create arrays
arange
This is equivalent to Python's range function, except that it returns a one-dimensional array instead
of a range object:
In [8]:
np.arange(10)
Out[8]:
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
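Like range, arange also accepts start, stop, and step arguments. The following is a minimal sketch; the variable names evens and halves are made up for illustration and do not appear elsewhere in this notebook.

```python
import numpy as np

# start, stop (exclusive), and step behave as they do for range
evens = np.arange(2, 11, 2)
print(evens)    # values 2, 4, 6, 8, 10

# Unlike range, arange also accepts a non-integer step
halves = np.arange(0, 2, 0.5)
print(halves)   # values 0.0, 0.5, 1.0, 1.5
```

With a floating point step the number of elements can be affected by rounding, so an integer step is the safer choice when an exact length matters.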
ones, ones_like, zeros, zeros_like
To create arrays filled with ones or zeroes with a given shape or with a shape similar to a given
object:
In [9]:
np.ones(3)
Out[9]:
array([ 1., 1., 1.])
In [10]:
np.ones((3, 3))
Out[10]:
array([[ 1., 1., 1.], [ 1., 1., 1.], [ 1., 1., 1.]])
In [11]:
np.zeros((4))
Out[11]:
array([ 0., 0., 0., 0.])
In [12]:
np.zeros((4, 4, 4))
Out[12]:
array([[[ 0., 0., 0., 0.], [ 0., 0., 0., 0.], [ 0., 0., 0., 0.], [ 0., 0., 0., 0.]],
[[ 0., 0., 0., 0.], [ 0., 0., 0., 0.], [ 0., 0., 0., 0.], [ 0., 0., 0., 0.]],
[[ 0., 0., 0., 0.], [ 0., 0., 0., 0.], [ 0., 0., 0., 0.], [ 0., 0., 0., 0.]],
[[ 0., 0., 0., 0.], [ 0., 0., 0., 0.], [ 0., 0., 0., 0.], [ 0., 0., 0., 0.]]])
In [13]:
np.ones_like(data)
Out[13]:
array([[ 1., 1., 1.], [ 1., 1., 1.], [ 1., 1., 1.], [ 1., 1., 1.]])
In [14]:
np.zeros_like(data)
Out[14]:
array([[ 0., 0., 0.], [ 0., 0., 0.], [ 0., 0., 0.],
[ 0., 0., 0.]])
empty, empty_like
Just like ones and zeros, but the array is allocated without being initialized, so it holds whatever garbage values were already in memory rather than zeros.
In [15]:
np.empty((2, 3))
Out[15]:
array([[ 6.91635841e-310, 6.91636044e-310, 6.91635874e-310], [ 6.91635549e-310, 6.91635128e-310, 6.91635185e-310]])
In [16]:
np.empty_like(data)
Out[16]:
array([[ 0., 0., 0.], [ 0., 0., 0.], [ 0., 0., 0.], [ 0., 0., 0.]])
The ndarray constructor is a lower-level way to create numpy arrays that offers even more control.
However, we do not explore that here.
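Because empty leaves its contents uninitialized, a typical pattern is to allocate first and then overwrite every element before reading any of them. The sketch below illustrates this under that assumption; the names squares, template, and blank are hypothetical.

```python
import numpy as np

# Allocate without initializing, then overwrite every element before reading
squares = np.empty(5)
squares[:] = np.arange(5) ** 2

# The *_like functions copy both the shape and the dtype of their argument
template = np.array([[1, 2], [3, 4]], dtype=np.int16)
blank = np.zeros_like(template)

print(squares)                   # values 0, 1, 4, 9, 16 stored as floats
print(blank.shape, blank.dtype)  # (2, 2) int16
```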
Choosing a data type
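Most array creation functions take a dtype argument which can be used to explicitly specify the data
type with which the array should be created. The built-in Python types int, float, bool, and object are
accepted as dtypes; the numpy aliases np.int, np.float, and np.object that older code used for them
were deprecated in NumPy 1.20 and removed in NumPy 1.24.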
In [17]:
np.array(dataList, dtype=int)  # Everything is truncated to integers
Out[17]:
array([[4, 3, 2], [5, 0, 2], [1, 4, 3], [2, 4, 4]])
In [18]:
np.array(dataList, dtype=float)
Out[18]:
array([[ 4.84470029, 3.28547393, 2.17976844], [ 5.8246344 , 0.88249467, 2.76732952], [ 1.83290685, 4.52718644, 3.57245015],
[ 2.1449141 , 4.95173341, 4.32544023]])
In [19]:
np.array(dataList, dtype=bool)  # All non-zero numbers are True
Out[19]:
array([[ True, True, True], [ True, True, True], [ True, True, True],
[ True, True, True]], dtype=bool)
In [20]:
np.array(dataList, dtype=np.str_)  # Unicode strings
Out[20]:
array([['4.844700289907117', '3.285473931529339', '2.1797684393413155'], ['5.824634396536993', '0.8824946651389621', '2.76732952458187'],
['1.8329068547314877', '4.527186437261438', '3.5724501538885134'], ['2.144914100647332', '4.951733405544532', '4.325440230285053']],
dtype='
In [21]:
np.array(dataList, dtype=object)  # Arrays containing arbitrary objects.
Out[21]:
array([[4.844700289907117, 3.285473931529339, 2.1797684393413155], [5.824634396536993, 0.8824946651389621, 2.76732952458187], [1.8329068547314877, 4.527186437261438, 3.5724501538885134], [2.144914100647332, 4.951733405544532, 4.325440230285053]], dtype=object)
In [3]:
np.array([3, 3.141, 'Pi'], dtype=object)
Out[3]:
array([3, 3.141, 'Pi'], dtype=object)
Numpy provides a richer set of types than this. One may choose integers and floats of various
sizes based on requirements. Please refer to the book for the details.
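As a rough illustration of what choosing a size means, the sketch below prints the bytes used per element for a few common types and the value range of an 8-bit integer. The arrays small and large are made-up examples, not data from this notebook.

```python
import numpy as np

# itemsize reports the number of bytes used per element
for dtype in (np.int8, np.int32, np.int64, np.float32, np.float64):
    print(np.dtype(dtype).name, np.dtype(dtype).itemsize)

# iinfo gives the representable range of an integer type
limits = np.iinfo(np.int8)
print(limits.min, limits.max)       # -128 127

small = np.array([1, 2, 3], dtype=np.int8)
large = np.array([1, 2, 3], dtype=np.int64)
print(small.nbytes, large.nbytes)   # 3 24
```

Smaller types save memory but overflow sooner, so the range reported by iinfo is worth checking against the data.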
Casting to a data type
A numpy ndarray carries an astype method which may be used to cast the elements of the array to
a new type. Note that, by default, this operation creates a copy of the original array even if the
elements are being cast to the same data type. Let's look at a few examples.
In [22]:
data
Out[22]:
array([[ 4.84470029, 3.28547393, 2.17976844], [ 5.8246344 , 0.88249467, 2.76732952],
[ 1.83290685, 4.52718644, 3.57245015], [ 2.1449141 , 4.95173341, 4.32544023]])
In [23]:
data.astype(np.int8)
Out[23]:
array([[4, 3, 2], [5, 0, 2],
[1, 4, 3], [2, 4, 4]], dtype=int8)
In [24]:
data.astype(np.uint32)
Out[24]:
array([[4, 3, 2], [5, 0, 2], [1, 4, 3], [2, 4, 4]], dtype=uint32)
In [25]:
data.astype(np.float128)  # Extended precision; not available on every platform
Out[25]:
array([[ 4.8447003, 3.2854739, 2.1797684], [ 5.8246344, 0.88249467, 2.7673295], [ 1.8329069, 4.5271864, 3.5724502],
[ 2.1449141, 4.9517334, 4.3254402]], dtype=float128)
In [26]:
data.astype(np.complex64)
Out[26]:
array([[ 4.84470034+0.j, 3.28547382+0.j, 2.17976832+0.j], [ 5.82463455+0.j, 0.88249469+0.j, 2.76732945+0.j], [ 1.83290684+0.j, 4.52718639+0.j, 3.57245016+0.j],
[ 2.14491415+0.j, 4.95173359+0.j, 4.32544041+0.j]], dtype=complex64)
In [27]:
data.astype(np.bytes_)
Out[27]:
array([[b'4.844700289907117', b'3.285473931529339', b'2.1797684393413155'],
       [b'5.824634396536993', b'0.8824946651389621', b'2.76732952458187'],
       [b'1.8329068547314877', b'4.527186437261438', b'3.5724501538885134'],
       [b'2.144914100647332', b'4.951733405544532', b'4.325440230285053']],
      dtype='|S32')
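The copy semantics of astype can be checked directly. This is a small sketch using a made-up array called original; np.shares_memory reports whether two arrays overlap in memory.

```python
import numpy as np

original = np.array([1.7, 2.2, 3.9])

same_type = original.astype(np.float64)  # same dtype, still a new array
same_type[0] = 100.0                     # does not touch original

truncated = original.astype(np.int64)    # fractional part is dropped, not rounded

print(original)                                # unchanged: 1.7, 2.2, 3.9
print(truncated)                               # 1, 2, 3
print(np.shares_memory(original, same_type))   # False
```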
Vectorized math
Numpy's ndarrays allow concise mathematical expressions without the need for explicit iteration using for loops, much like R.
In [28]:
data + data
Out[28]:
array([[ 9.68940058, 6.57094786, 4.35953688], [ 11.64926879, 1.76498933, 5.53465905], [ 3.66581371, 9.05437287, 7.14490031], [ 4.2898282 , 9.90346681, 8.65088046]])
In [29]:
data * 3
Out[29]:
array([[ 14.53410087, 9.85642179, 6.53930532], [ 17.47390319, 2.647484 , 8.30198857], [ 5.49872056, 13.58155931, 10.71735046],
[ 6.4347423 , 14.85520022, 12.97632069]])
In [30]:
1 / data
Out[30]:
array([[ 0.20641112, 0.30437009, 0.45876433], [ 0.1716846 , 1.13315133, 0.36135921], [ 0.54558146, 0.22088774, 0.27991993],
[ 0.46621914, 0.20194948, 0.23119034]])
In [31]:
2 ** data.astype(int)
Out[31]:
array([[16, 8, 4], [32, 1, 4], [ 2, 16, 8], [ 4, 16, 16]])
In [7]:
data * [1, 2, 3]  # Elementwise multiplication of columns
Out[7]:
array([[ 4.84470029, 6.57094786, 6.53930532], [ 5.8246344 , 1.76498933, 8.30198857], [ 1.83290685, 9.05437287, 10.71735046], [ 2.1449141 , 9.90346681, 12.97632069]])
In [8]:
data * [1, 2, 3, 4]
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
in ()
----> 1 data * [1, 2, 3, 4]
ValueError: operands could not be broadcast together with shapes (4,3) (4,)
In [12]:
x = np.array([[1], [2], [3], [4]])
print(x, "\n\n", x.shape)
[[1]
 [2]
 [3]
 [4]]

 (4, 1)
In [9]:
data * [[1], [2], [3], [4]]  # Elementwise multiplication of rows
Out[9]:
array([[ 4.84470029, 3.28547393, 2.17976844],
       [ 11.64926879, 1.76498933, 5.53465905],
       [ 5.49872056, 13.58155931, 10.71735046],
       [ 8.5796564 , 19.80693362, 17.30176092]])
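The examples above follow numpy's broadcasting rule: shapes are compared from the last axis backwards, and two axes are compatible when they are equal or when one of them is 1. The sketch below restates this with a made-up array called grid; all names in it are hypothetical.

```python
import numpy as np

grid = np.ones((4, 3))

row_factors = np.array([1, 2, 3])             # shape (3,): matches the last axis
col_factors = np.array([[1], [2], [3], [4]])  # shape (4, 1): matches the first axis

print((grid * row_factors).shape)   # (4, 3)
print((grid * col_factors).shape)   # (4, 3)

# A length-4 vector becomes compatible once a trailing axis is added
vector = np.array([1, 2, 3, 4])
scaled = grid * vector[:, np.newaxis]
print(scaled[:, 0])                 # first column holds 1, 2, 3, 4
```

Indexing with np.newaxis is one way to turn shape (4,) into (4, 1); reshape(-1, 1) would work as well.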
In [34]:
diag = np.diag([1, 2, 3])
diag
Out[34]:
array([[1, 0, 0], [0, 2, 0], [0, 0, 3]])
In [35]:
data * diag  # Will not do matrix multiplication
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
in ()
----> 1 data * diag # Will not do matrix multiplication
ValueError: operands could not be broadcast together with shapes (4,3) (3,3)
In [36]:
np.dot(data, diag)
Out[36]:
array([[ 4.84470029, 6.57094786, 6.53930532], [ 5.8246344 , 1.76498933, 8.30198857], [ 1.83290685, 9.05437287, 10.71735046], [ 2.1449141 , 9.90346681, 12.97632069]])
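On Python 3.5 or later, the @ operator is an alternative spelling of matrix multiplication for ndarrays. The sketch below uses small made-up integer arrays (a and d) so that the results can be compared exactly.

```python
import numpy as np

a = np.arange(12).reshape(4, 3)
d = np.diag([1, 2, 3])

product = a @ d   # matrix multiplication, shape (4, 3)
print(np.array_equal(product, np.dot(a, d)))    # True

# Multiplying by a diagonal matrix on the right scales the columns,
# which matches broadcasting against the diagonal entries
print(np.array_equal(product, a * [1, 2, 3]))   # True
```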
In [37]:
print(np.sqrt(data), "\n\n", np.log(data), "\n\n", np.exp(data))
[[ 2.20106799 1.81258763 1.47640389]
 [ 2.41342793 0.93941187 1.66352924]
 [ 1.3538489 2.1277186 1.89009263]
 [ 1.46455253 2.22524907 2.07976927]]

 [[ 1.57788538 1.18951091 0.77921865]
 [ 1.76209623 -0.12500254 1.01788278]
 [ 0.60590315 1.51010065 1.27325168]
 [ 0.76309951 1.5997377 1.46451392]]

 [[ 127.06519357 26.72164554 8.84425804]
 [ 338.53734003 2.4169216 15.91607372]
 [ 6.25203402 92.49794585 35.60372097]
 [ 8.54130751 141.4198896 75.59878641]]
In numpy parlance, these functions that apply the same operation to each element of an array
are called universal functions or ufuncs. On the other hand, there are functions that return
aggregations or perform other types of operations on ndarrays. Refer to McKinney (2012) for details.
In [38]:
print(np.sum(data), "\n\n", np.min(data), "\n\n", np.max(data), "\n\n", np.mean(data))
41.1390324294
0.882494665139
5.82463439654
3.42825270245
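The aggregations above collapse the whole array to a single number. They also accept an axis argument to aggregate along one dimension only, as this sketch with a small made-up array called table shows.

```python
import numpy as np

table = np.array([[1, 2, 3],
                  [4, 5, 6]])

print(table.sum())         # 21, aggregate over every element
print(table.sum(axis=0))   # column sums: 5, 7, 9
print(table.sum(axis=1))   # row sums: 6, 15
print(table.mean(axis=0))  # column means: 2.5, 3.5, 4.5
```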
Indexing ndarrays
Indexing by position
ndarrays are collections of data of a given shape, size, and type. Indexing is the way to access
elements, or sub-collections of elements, in a given ndarray.
Numpy provides a rich set of indexing semantics that allow concise expression of various indexing
operations.
One-dimensional arrays
The simplest indexing operation into any one dimension of a numpy ndarray mimics the semantics
of Python's indexing operator [] together with the slice notation :.
Let's look at this using a one-dimensional array.
In [39]:
oneD = np.arange(10)
oneD
Out[39]:
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [40]:
# Standard Python indexing: similar to lists and tuples
print(oneD[0])      # 0-based indexing
print(oneD[2:4])    # Indexing a range: excludes right limit
print(oneD[:4])     # Indexing from the beginning implicitly
print(oneD[4:])     # Indexing to the end implicitly
print(oneD[1:5:2])  # Indexing with jumps
print(oneD[::-1])   # Reversing an array
print(oneD[:-2])    # Negative indices to index from the end
0
[2 3]
[0 1 2 3]
[4 5 6 7 8 9]
[1 3]
[9 8 7 6 5 4 3 2 1 0]
[0 1 2 3 4 5 6 7]
Indexing can be combined with the = operator to assign values in an ndarray.
In [41]:
oneD[3] = 13
oneD
Out[41]:
array([ 0, 1, 2, 13, 4, 5, 6, 7, 8, 9])
In [42]:
oneD[1:3] = [11, 12]
oneD
Out[42]:
array([ 0, 11, 12, 13, 4, 5, 6, 7, 8, 9])
In [43]:
oneD[4:7] = [1, 2]  # The shape of the replacement must match
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
in ()
----> 1 oneD[4:7] = [1, 2] # The shape of the replacement must match
ValueError: cannot copy sequence with size 2 to array axis with dimension 3
In [44]:
oneD[7:9] = 20  # Scalars are replicated to fill space
oneD
Out[44]:
array([ 0, 11, 12, 13, 4, 5, 6, 20, 20, 9])
Aside: Indexing by position creates views
Numpy intends to be conservative with memory usage and is designed such that indexing / slicing into
an ndarray does not copy elements unless explicitly asked to. Therefore, slices are views into the
original array: the elements of a slice are the very same elements as those of the original.
In [45]:
oneDSlice = oneD[2:5]
oneDSlice
Out[45]:
array([12, 13, 4])
Since the slice is just a view into the original ndarray, changes to the slice are also reflected into the
original. Therefore code that works on slices must be careful with introducing any unwanted changes
into the original ndarray.
In [46]:
oneDSlice[2] = 14
print(oneDSlice)
oneD
[12 13 14]
Out[46]:
array([ 0, 11, 12, 13, 14, 5, 6, 20, 20, 9])
To create an explicit copy of the view / slice, one may use the copy method available with an array.
For example:
In [47]:
oneDCopy = oneD[2:5].copy()
oneDCopy
Out[47]:
array([12, 13, 14])
In [48]:
oneDCopy[2] = 24
print(oneDCopy)
oneD
[12 13 24]
Out[48]:
array([ 0, 11, 12, 13, 14, 5, 6, 20, 20, 9])
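Whether an array is a view or an independent copy can also be checked without modifying anything. The sketch below uses a fresh array named values (a made-up name) together with the base attribute and np.shares_memory.

```python
import numpy as np

values = np.arange(10)
view = values[2:5]
copy = values[2:5].copy()

print(view.base is values)              # True: the slice borrows memory from values
print(np.shares_memory(values, view))   # True
print(np.shares_memory(values, copy))   # False
```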
Indexing multi-dimensional arrays
Multi-dimensional arrays differ from one-dimensional arrays because where elements of a one-
dimensional array are themselves scalars, the elements of a multi-dimensional array are arrays
themselves.
For example, a two-dimensional array (or a matrix) can be considered as an array of row-arrays.
In [49]:
data
Out[49]:
array([[ 4.84470029, 3.28547393, 2.17976844], [ 5.8246344 , 0.88249467, 2.76732952], [ 1.83290685, 4.52718644, 3.57245015], [ 2.1449141 , 4.95173341, 4.32544023]])
In [50]:
row1 = data[0]
row1
Out[50]:
array([ 4.84470029, 3.28547393, 2.17976844])
However, a two-dimensional array can also be considered as an array of column-arrays. How do we
access that first column for example?
Numpy provides n indices into any arbitrary n-dimensional array. Any specific element in the
ndarray can be accessed by specifying the position of the element along the n dimensions.
In [51]:
data[1, 1]  # Access the second diagonal element in 'data'
Out[51]:
0.88249466513896213
In [52]:
data[:, 1]  # Access the second column where `:` stands for all rows
Out[52]:
array([ 3.28547393, 0.88249467, 4.52718644, 4.95173341])
In [53]:
data[2, :]  # Access the third row
Out[53]:
array([ 1.83290685, 4.52718644, 3.57245015])
In [54]:
data[:2, :]  # First and second row, all columns
Out[54]:
array([[ 4.84470029, 3.28547393, 2.17976844], [ 5.8246344 , 0.88249467, 2.76732952]])
In [55]:
data[:2, ::-1]  # First and second row, all columns reversed
Out[55]:
array([[ 2.17976844, 3.28547393, 4.84470029],
[ 2.76732952, 0.88249467, 5.8246344 ]])
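One detail worth noting in the examples above is that an integer index drops its axis while a slice keeps it. The following sketch illustrates the difference on a small made-up array m.

```python
import numpy as np

m = np.arange(12).reshape(4, 3)

column = m[:, 1]        # integer index drops the axis: shape (4,)
column_2d = m[:, 1:2]   # length-one slice keeps it: shape (4, 1)
block = m[1:3, :2]      # rows 1 and 2, columns 0 and 1: shape (2, 2)

print(column.shape, column_2d.shape, block.shape)
print(block)            # rows [3, 4] and [6, 7]
```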
Fancy Indexing
All the position-based indexing that we have seen till now uses slice objects, which create views out
of the ndarray instead of copying its elements.
The slice-based indexing notation, however, does not allow one to take values at an arbitrary set of
positions out of the array. For example, consider an array of ten elements with the problem of taking
the first, the fourth, and the ninth element out of the array.
These are the situations where 'fancy indexing' is used instead. An important thing to note is that
fancy indexing does not create a view into the existing array; instead it creates a new ndarray into
which the desired elements are copied. Therefore, fancy indexing should be avoided when slices can
do the job.
In [56]:
data
Out[56]:
array([[ 4.84470029, 3.28547393, 2.17976844], [ 5.8246344 , 0.88249467, 2.76732952], [ 1.83290685, 4.52718644, 3.57245015], [ 2.1449141 , 4.95173341, 4.32544023]])
In [57]:
data[[2, 0, 1]]
Out[57]:
array([[ 1.83290685, 4.52718644, 3.57245015], [ 4.84470029, 3.28547393, 2.17976844],
[ 5.8246344 , 0.88249467, 2.76732952]])
In [58]:
data[:, [-1, 2, 0]]
Out[58]:
array([[ 2.17976844, 2.17976844, 4.84470029], [ 2.76732952, 2.76732952, 5.8246344 ], [ 3.57245015, 3.57245015, 1.83290685], [ 4.32544023, 4.32544023, 2.1449141 ]])
In [59]:
data[[2, 0, 1], [-1, 2, 0]]
Out[59]:
array([ 3.57245015, 2.17976844, 5.8246344 ])
In [60]:
data[[2, 0, 1], [-1, 2]]
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
in ()
----> 1 data[[2, 0, 1], [-1, 2]]
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (3,) (2,)
In [61]:
data[np.ix_([2, 0, 1], [-1, 2])]
Out[61]:
array([[ 3.57245015, 3.57245015], [ 2.17976844, 2.17976844], [ 2.76732952, 2.76732952]])
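The ten-element problem mentioned earlier in this section can be sketched as follows. The array ten is made up for illustration; the point is that the result of fancy indexing is a copy, so writing to it leaves the source untouched.

```python
import numpy as np

ten = np.arange(10) * 10    # 0, 10, 20, ..., 90

picked = ten[[0, 3, 8]]     # first, fourth, and ninth elements
print(picked)               # 0, 30, 80

picked[0] = -1              # modifies the copy only
print(ten[0])               # still 0
print(np.shares_memory(ten, picked))   # False
```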
Boolean indexing
The indexing semantics that we have seen till now are useful when the position of the subset to be
extracted is known. However, there are often situations where one wants to select a subset of a
collection based not on position but on some predicate.
For example, in the following data ndarray comprised of numbers between zero and nine, one may
want to select elements that are less than 4. In such situations boolean indexing comes in useful.
In [62]:
print(data)  # Let's recap what is in data.
[[ 4.84470029 3.28547393 2.17976844]
 [ 5.8246344 0.88249467 2.76732952]
 [ 1.83290685 4.52718644 3.57245015]
 [ 2.1449141 4.95173341 4.32544023]]
In [63]:
data[np.array([True, False, True, False])]  # Subset rows
Out[63]:
array([[ 4.84470029, 3.28547393, 2.17976844], [ 1.83290685, 4.52718644, 3.57245015]])
In [64]:
data[:, np.array([True, False, True])]  # Subset columns
Out[64]:
array([[ 4.84470029, 2.17976844], [ 5.8246344 , 2.76732952], [ 1.83290685, 3.57245015], [ 2.1449141 , 4.32544023]])
In [65]:
data[-np.array([True, False, True, False])]  # Invert bool array (older numpy only; current releases raise TypeError, prefer ~)
Out[65]:
array([[ 5.8246344 , 0.88249467, 2.76732952], [ 2.1449141 , 4.95173341, 4.32544023]])
In [66]:
data[~np.array([True, False, True, False])]  # Invert bool array
Out[66]:
array([[ 5.8246344 , 0.88249467, 2.76732952], [ 2.1449141 , 4.95173341, 4.32544023]])
Unlike numerical indices, a boolean indexing array is expected to match the length of the axis that it
indexes. Older numpy releases, such as the one used to produce the output below, treated the
positions missing from a shorter boolean array as False; current releases raise an IndexError instead.
Here is an example. Best to avoid such indexing.
In [21]:
x = data[np.array([True, False, True]), np.array([True, False, True])]
print(x)
print(x.ndim)
print(x.shape)
[ 4.84470029 3.57245015]
1
(2,)
Aside: More on booleans
Conditional operations on other arrays generate boolean arrays as well. For example:
In [68]:
data < 4  # Returns an array of booleans
Out[68]:
array([[False, True, True], [False, True, True], [ True, False, True], [ True, False, False]], dtype=bool)
These multidimensional boolean arrays can be used to index other arrays, but with surprising yet
correct behavior: the result is flattened to one dimension. If the shape must be preserved, as in
conditional assignments, the numpy.where function is handy.
In [69]:
data[data < 4]  # Mangles the shape of the array
Out[69]:
array([ 3.28547393, 2.17976844, 0.88249467, 2.76732952,
1.83290685, 3.57245015, 2.1449141 ])
In [70]:
np.where(data < 4, data, -data)  # Use `where` for conditional assignments
Out[70]:
array([[-4.84470029, 3.28547393, 2.17976844], [-5.8246344 , 0.88249467, 2.76732952], [ 1.83290685, -4.52718644, 3.57245015], [ 2.1449141 , -4.95173341, -4.32544023]])
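A boolean mask can also appear on the left-hand side of an assignment, which changes the array in place, whereas np.where builds a new array. The sketch below contrasts the two on a made-up array called scores; the cap of 5 is an arbitrary choice for illustration.

```python
import numpy as np

scores = np.array([[3.5, 7.0], [1.2, 9.9]])

# np.where keeps the original shape and leaves the input untouched
capped = np.where(scores > 5, 5.0, scores)

# Assigning through a boolean mask changes the array in place
clipped = scores.copy()
clipped[clipped > 5] = 5.0

print(capped)                            # rows [3.5, 5.0] and [1.2, 5.0]
print(np.array_equal(capped, clipped))   # True
print(scores[0, 1])                      # 7.0, the input is unchanged
```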
Logical combinations on booleans
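Numpy provides the standard logical operations: and (&), or (|), and not (~) that we have already
seen in action. Older releases also accepted unary minus (-) as not on boolean arrays, but current
releases raise a TypeError for it. Besides the symbolic operators, these logical operations are also
provided as functions in the numpy library.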
In [71]:
np.array([True, False, True, False]) | np.array([True] * 4)
Out[71]:
array([ True, True, True, True], dtype=bool)
In [72]:
np.array([True, False, True, False]) & np.array([True] * 4)
Out[72]:
array([ True, False, True, False], dtype=bool)
In [73]:
np.logical_or(np.array([True, False] * 2), np.array([True] * 4))
Out[73]:
array([ True, True, True, True], dtype=bool)
In [74]:
np.logical_and(np.array([True, False] * 2), np.array([True] * 4))
Out[74]:
array([ True, False, True, False], dtype=bool)
In [75]:
np.logical_not(np.array([True, False]))
Out[75]:
array([False, True], dtype=bool)
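In practice these operators are mostly used to combine comparisons into a single mask. The following sketch uses a made-up array called readings; note the parentheses around each comparison.

```python
import numpy as np

readings = np.array([0.5, 2.0, 3.5, 5.0, 6.5])

# Parentheses are required: & and | bind more tightly than < and >
in_range = (readings > 1) & (readings < 6)
print(in_range)              # False, True, True, True, False
print(readings[in_range])    # 2.0, 3.5, 5.0

outside = ~in_range
print(readings[outside])     # 0.5, 6.5
```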
Logical aggregations
Any array of logicals can be reduced to a single logical value in two different ways: whether
all values in the array are true, or whether any value in the array is true. Numpy provides .any() and
.all() methods on arrays to do these aggregations. For example:
In [76]:
np.array([True, False, True, True]).all()
Out[76]:
False
In [77]:
np.array([True, False, True, True]).any()
Out[77]:
True
In [78]:
np.array([True, True, True, True]).all()
Out[78]:
True
In [79]:
np.array([False, False, False, False]).all()
Out[79]:
False
Exercise: Create a function which:
1. none: When given a logical array returns True if all elements in the array are False.
2. notall: When given a logical array returns True if any elements in the array are False.
Transposing arrays
There are two ways to transpose a numpy array. Each array has a .T attribute which returns a view
that is the transpose of the original array. Alternatively, one may use the numpy.transpose
function, which likewise returns a transposed view rather than a copy.
The thing of particular note is that both of these methods provide a view, and one must use
the .copy() method to obtain a true copy. Transposing arrays is only one of the many places
where the programmer needs to exercise special caution to ensure that there is no action at a
distance, which can be the cause of many subtle bugs.
In [80]:
data.T
Out[80]:
array([[ 4.84470029, 5.8246344 , 1.83290685, 2.1449141 ], [ 3.28547393, 0.88249467, 4.52718644, 4.95173341], [ 2.17976844, 2.76732952, 3.57245015, 4.32544023]])
In [81]:
np.transpose(data)
Out[81]:
array([[ 4.84470029, 5.8246344 , 1.83290685, 2.1449141 ], [ 3.28547393, 0.88249467, 4.52718644, 4.95173341], [ 2.17976844, 2.76732952, 3.57245015, 4.32544023]])
In [82]:
# NB: Remember that the transpose is only a shallow copy:
dataT = data.T
print(dataT)
[[ 4.84470029 5.8246344 1.83290685 2.1449141 ]
 [ 3.28547393 0.88249467 4.52718644 4.95173341]
 [ 2.17976844 2.76732952 3.57245015 4.32544023]]
In [83]:
dataT[1, 1] = 1
print(data)
[[ 4.84470029 3.28547393 2.17976844]
 [ 5.8246344 1. 2.76732952]
 [ 1.83290685 4.52718644 3.57245015]
 [ 2.1449141 4.95173341 4.32544023]]
Therefore, with numpy it is always better to use .copy() explicitly when copies are desired (or take a
careful read of the documentation). Let's look at an example:
In [84]:
dataT2 = data.T.copy()
dataT2[1, 1] = 2
print(dataT2, "\n\n", data)
[[ 4.84470029 5.8246344 1.83290685 2.1449141 ]
 [ 3.28547393 2. 4.52718644 4.95173341]
 [ 2.17976844 2.76732952 3.57245015 4.32544023]]

 [[ 4.84470029 3.28547393 2.17976844]
 [ 5.8246344 1. 2.76732952]
 [ 1.83290685 4.52718644 3.57245015]
 [ 2.1449141 4.95173341 4.32544023]]
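The same behavior can be confirmed on a small made-up array (here called a) with np.shares_memory, which avoids having to modify the data in order to find out whether a transpose is a view.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

t_view = a.T
t_copy = a.T.copy()

print(np.shares_memory(a, t_view))   # True
print(np.shares_memory(a, t_copy))   # False

t_view[0, 1] = 99    # writes through to a[1, 0]
print(a[1, 0])       # 99
```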
Reading from and writing to text files
Numpy provides two simple functions to read and write delimited text-based datasets:
numpy.loadtxt and numpy.savetxt. Let's look at a simple example.
In [85]:
!head -10 ../../../data/pythagorean-triples.txt
3,4,5
5,12,13
15,8,17
7,24,26
21,20,29
35,12,37
9,40,41
45,28,52
11,60,61
33,56,65
In [22]:
# Load the pythagorean triples into a two dimensional array
pyTrips = np.loadtxt("../../../data/pythagorean-triples.txt", dtype=np.uint, delimiter=",")
print(pyTrips[:10], "\n\n", pyTrips.shape)
[[ 3 4 5]
 [ 5 12 13]
 [15 8 17]
 [ 7 24 26]
 [21 20 29]
 [35 12 37]
 [ 9 40 41]
 [45 28 52]
 [11 60 61]
 [33 56 65]]

 (101, 3)
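Writing works the same way in the other direction. The sketch below saves a small made-up array with numpy.savetxt and reads it back; the file name triples.txt and the temporary directory are assumptions chosen so that the example does not touch the data file used above.

```python
import os
import tempfile

import numpy as np

triples = np.array([[3, 4, 5], [5, 12, 13], [15, 8, 17]], dtype=np.uint)

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "triples.txt")   # hypothetical file name
    # fmt="%d" writes plain integers instead of the default scientific notation
    np.savetxt(path, triples, fmt="%d", delimiter=",")
    restored = np.loadtxt(path, dtype=np.uint, delimiter=",")

print(np.array_equal(triples, restored))   # True
```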
In [23]:
pyTrips
Out[23]:
array([[ 3, 4, 5], [ 5, 12, 13], [ 15, 8, 17], [ 7, 24, 26], [ 21, 20, 29], [ 35, 12, 37],
[ 9, 40, 41], [ 45, 28, 52], [ 11, 60, 61], [ 33, 56, 65], [ 63, 16, 65], [ 55, 48, 73], [ 13, 84, 85], [ 77, 36, 86], [ 39, 80, 89], [ 65, 72, 97], [ 99, 20, 101],
[ 91, 60, 109], [ 15, 112, 111], [117, 44, 125], [105, 88, 137], [ 17, 144, 145], [143, 24, 145], [ 51, 140, 147], [ 85, 132, 157], [119, 120, 169], [165, 52, 173], [ 19, 180, 181],
[ 57, 176, 185], [153, 104, 185], [ 95, 168, 193], [195, 28, 199], [133, 156, 205], [187, 84, 205], [ 21, 220, 221], [171, 140, 221], [221, 60, 229], [105, 208, 235], [209, 120, 241],
[255, 32, 257], [ 23, 264, 265], [247, 96, 265], [ 69, 260, 269], [115, 252, 271], [231, 160, 281], [161, 240, 289], [285, 68, 293], [207, 224, 305], [273, 136, 305], [ 25, 312, 313],
[ 75, 308, 317], [253, 204, 325], [323, 36, 325], [175, 288, 337], [299, 180, 349],
[225, 272, 353], [ 27, 364, 368], [357, 76, 365], [275, 252, 373], [135, 352, 377], [345, 152, 379], [189, 340, 389], [325, 228, 397],
[399, 40, 401], [391, 120, 409], [ 29, 420, 421], [ 87, 416, 425], [297, 304, 425], [145, 408, 433], [203, 396, 445], [437, 84, 445], [351, 280, 449], [425, 168, 457], [261, 380, 465],
[ 31, 480, 481], [319, 360, 481], [ 93, 476, 485], [483, 44, 485], [155, 468, 493], [475, 132, 493], [217, 456, 505], [377, 336, 505], [459, 220, 590], [279, 440, 521], [435, 308, 533],
[525, 92, 533], [341, 420, 541], [ 33, 544, 545], [513, 184, 545], [165, 532, 557], [403, 396, 565], [493, 276, 556], [231, 520, 569], [575, 48, 577], [465, 368, 593], [551, 240, 601],
[ 35, 612, 613], [105, 608, 617], [527, 336, 627], [429, 460, 629], [621, 100, 629]], dtype=uint64)
In [25]:
?np.where
Exercise
Using the array on Pythagorean triples loaded above, write numpy code to do the following:
1. Create three arrays, viz. baseSq, altitudeSq, and hypotenuseSq respectively by
squaring the first, second, and third columns of the pyTrips array.
2. Use numpy.where and the three arrays created above to find which of these triples are not really Pythagorean.
3. Count the number of non-Pythagorean triples.