Use the .lower() method to convert all upper case letters to lower case.
There is also a string method translate() that can be used to scrub certain characters from a string, but it is a little complicated (see https://machinelearningmastery.com/clean-text-machine-learning-python/).
The function clean_string below deletes the characters that appear in the string delete_chars.
The default value of delete_chars is string.punctuation:

import string
print(string.punctuation)
## !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
def clean_string(s,delete_chars=string.punctuation):
    for i in delete_chars:
        s = s.replace(i,"")
    return s
x = "ab,Cde!?Q@#$I"
print(clean_string(x))
## abCdeQI
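The translate() route mentioned earlier can be sketched in a few lines. This is only an illustration, not part of the original exercise: the name clean_string_translate is hypothetical, and the sketch assumes we want the .lower() conversion done in the same step.

```python
import string

def clean_string_translate(s, delete_chars=string.punctuation):
    """Hypothetical variant of clean_string built on str.translate()."""
    # str.maketrans("", "", chars) builds a table that maps every
    # character in chars to None, which means "delete it".
    table = str.maketrans("", "", delete_chars)
    # Delete the unwanted characters, then convert to lower case.
    return s.translate(table).lower()

cleaned = clean_string_translate("ab,Cde!?Q@#$I")
print(cleaned)
## abcdeqi
```

The table is built once per call, so the string is scanned a single time instead of once per punctuation character.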
markov_create function (outline)

def markov_create(file_name, sentence_length = 20):
    ## open the file and store its contents in a string
    ## (the with statement closes the file for us)
    with open(file_name, 'r') as text_file:
        text = text_file.read()
    ## clean the text and then split it into words
    clean_text = clean_string(text)
    word_list = clean_text.split()
    ## create the markov dictionary
    text_dict = markov_dict(word_list)
    ## Produce a sentence (a list of strings) of length
    ## sentence_length using the dictionary
    sentence = markov_sentence(text_dict, sentence_length)
    ## return the sentence as a single string using
    ## the .join() method.
    return " ".join(sentence)
To complete this exercise, we need to produce the following functions:
clean_string(s,delete_chars = string.punctuation) strips the text of punctuation and converts upper case words into lower case.
markov_dict(word_list) creates a dictionary from a list of words.
markov_sentence(text_dict, sentence_length) randomly produces a sentence using the dictionary.

random module

The random module can be used to generate pseudo-random numbers or to pseudo-randomly select items.
randrange() picks a random integer from a prescribed range.
choice(seq) randomly chooses an element from a sequence, such as a list or tuple.
shuffle() shuffles (permutes) the items in a list; sample() samples elements from a sequence such as a list or tuple (in Python 3.11 and later a set must first be converted to a list).
random.seed() sets the starting value for a (pseudo-)random number sequence [important].

random examples

import random
random.seed(101) ## any integer you want
random.randrange(2, 102, 2) # random even integers
## 76
random.choice([1, 2, 3, 4, 5]) # random choice from list
## 2
# random.choices([1, 2, 3, 4, 5], k=9) # multiple choices (Python >=3.6); k must be a keyword
random.sample([1, 2, 3, 4, 5], 3) # rand. sample of 3 items
## [5, 3, 2]
random.random() # uniform random float between 0 and 1
## 0.048520987208713895
random.uniform(3, 7) # uniform random between 3 and 7
## 5.014081424907534
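shuffle() and choices() appear in these notes without a worked call, so here is a small sketch of both. The variable names deck and picks are made up for illustration, and no output is shown because the exact values depend on the seed and the Python version.

```python
import random

random.seed(101)  # any integer; fixes the sequence so runs are repeatable

deck = list(range(1, 11))
random.shuffle(deck)   # shuffles the list in place and returns None
print(deck)

# k must be passed by keyword: the second positional argument of
# choices() is a list of weights, not a count.
picks = random.choices([1, 2, 3, 4, 5], k=9)   # sampling with replacement
print(picks)
```

Because choices() samples with replacement, it can return more items than the list contains, which sample() cannot do.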
random.seed(101)
for i in range(3):
    print(random.randrange(10))
## 9
## 3
## 8
random.seed(101) ## resetting the seed reproduces the same sequence
for i in range(3):
    print(random.randrange(10))
## 9
## 3
## 8
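With random.choice() available, the two remaining functions from the exercise, markov_dict and markov_sentence, might be sketched as follows. This is one possible approach under stated assumptions, not the intended solution: it assumes the dictionary maps each word to the list of words that follow it, and that a walk which reaches a word with no followers restarts at a random key.

```python
import random

def markov_dict(word_list):
    """Map each word to the list of words that follow it in word_list."""
    text_dict = {}
    # Pair every word with its successor.
    for current, following in zip(word_list, word_list[1:]):
        text_dict.setdefault(current, []).append(following)
    return text_dict

def markov_sentence(text_dict, sentence_length):
    """Random walk through the dictionary; returns a list of words."""
    keys = list(text_dict)
    word = random.choice(keys)
    sentence = [word]
    while len(sentence) < sentence_length:
        followers = text_dict.get(word)
        # Assumption: at a dead end (no followers), restart at a random key.
        word = random.choice(followers) if followers else random.choice(keys)
        sentence.append(word)
    return sentence

words = "the cat sat on the mat and the cat ran".split()
d = markov_dict(words)
print(d["the"])
## ['cat', 'mat', 'cat']

random.seed(101)
sentence = markov_sentence(d, 8)
print(" ".join(sentence))
```

Keeping repeated followers in the list (two copies of 'cat' after 'the') means random.choice() picks frequent successors more often, without any explicit probabilities.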
numpy is the fundamental package for scientific computing with Python. It contains among other things:
numpy should already be installed with Anaconda or on syzygy. If not, you will need to install it yourself.
Good documentation can be found here and here.
The array, created with array(), is numpy’s main data structure.
It is similar to a list, but must be homogeneous (e.g. floating point (float64) or integer (int64) or str).
(float64 is a 64-bit floating point number.)

import numpy as np ## use "as np" so we can abbreviate
x = [1, 2, 3]
a = np.array([1, 4, 5, 8], dtype=float)
print(a)
## [1. 4. 5. 8.]
print(type(a))
## <class 'numpy.ndarray'>
print(a.shape)
## (4,)
The shape of an array is a tuple that lists its dimensions.
np.array([1,2]) produces a 1-dimensional (1-D) array of length 2 whose entries have type int.
np.array([1,2], float) produces a 1-dimensional (1-D) array of length 2 whose entries have type float64.

a1 = np.array([1,2])
print(a1.dtype)
## int64
print(a1.shape)
## (2,)
print(len(a1))
## 2
a2 = np.array([1,2],float)
print(a2.dtype)
## float64
Arrays can be created from lists or from the range function.
numpy has a function called np.arange (like range) that creates arrays.
np.zeros() and np.ones() create arrays of all zeros or all ones.

x = [1, 'a', 3]
a = np.array(x) ## what happens?
b = np.array(range(10), float)
c = np.arange(5, dtype=float)
d = np.arange(2,4, 0.5, dtype=float)
np.ones(10)
## array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])
np.zeros(4)
## array([0., 0., 0., 0.])
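The "what happens?" question above can be checked directly. The sketch below shows that numpy keeps the array homogeneous by converting every entry of a mixed list to a string, and it lists the values that np.arange produces with a non-integer step.

```python
import numpy as np

x = [1, 'a', 3]
a = np.array(x)          # mixed list: every entry is converted to a string
print(a.dtype.kind)      # 'U' is the code for a Unicode string dtype
## U
print(a.tolist())
## ['1', 'a', '3']

d = np.arange(2, 4, 0.5, dtype=float)   # start, stop (excluded), step
print(d.tolist())
## [2.0, 2.5, 3.0, 3.5]
```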
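Array elements can be read and changed with square brackets, as with lists (e.g. a[1]=0).
Assigning an array to a new name does not copy it: both names refer to the same data. Use the .copy() method to make a new, independent copy (works for lists etc. too!)

a1 = np.array([1.0, 2, 3, 4, 5, 6])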
a1[1]
## 2.0
a1[:-3]
## array([1., 2., 3.])
b1 = a1
c1 = a1.copy()
b1[1] = 23
a1[1]
## 23.0
c1[1]
## 2.0
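A related point that is easy to miss: a slice of an array is also a view onto the same data, not a copy. The sketch below illustrates this and repeats the alias-versus-copy comparison for a plain list; the variable names are made up for the example.

```python
import numpy as np

a = np.array([1.0, 2, 3, 4, 5, 6])
v = a[:3]             # a slice of an array is a view onto the same data
v[0] = 99
print(a[0])
## 99.0
print(np.shares_memory(a, v))
## True

lst = [1, 2, 3]
alias = lst           # second name for the same list
indep = lst.copy()    # new, independent list
alias[0] = 99
print(lst[0], indep[0])
## 99 1
```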
Multidimensional arrays can be created from nested lists with the np.array() function.
Index them with a[i,j] rather than a[i][j].

nested = [[1, 2, 3], [4, 5, 6]]
a = np.array(nested, float)
nested[0][2]
## 3
a[0,2]
## 3.0
a
## array([[1., 2., 3.],
##        [4., 5., 6.]])
a.shape
## (2, 3)
Slicing works along each dimension. A bare : indicates that everything along a dimension will be used.

a = np.array([[1, 2, 3], [4, 5, 6]], float)
a[1, :] ## row index 1
## array([4., 5., 6.])
a[:, 2] ## column index 2
## array([3., 6.])
a[-1:, -2:] ## slicing rows and columns
## array([[5., 6.]])
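The last result has two pairs of brackets because slices keep a dimension while integer indices drop it. A short sketch of the difference, comparing shapes:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]], float)

print(a[1, :].shape)      # integer index: the row dimension is dropped
## (3,)
print(a[1:2, :].shape)    # slice of length 1: the row dimension is kept
## (1, 3)
print(a[-1:, -2:].shape)  # both are slices, so the result is still 2-D
## (1, 2)
```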
An array can be reshaped using the reshape(t) method, where we specify a tuple t that gives the new dimensions of the array.
a = np.array(range(10), float)
a = a.reshape((5,2))
print(a)
## [[0. 1.]
## [2. 3.]
## [4. 5.]
## [6. 7.]
## [8. 9.]]
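The new shape must account for exactly the same number of elements as the old one. The sketch below shows two consequences: one dimension may be given as -1 so that numpy infers it, and an incompatible shape raises a ValueError.

```python
import numpy as np

a = np.arange(10, dtype=float)

b = a.reshape((-1, 2))    # -1 asks numpy to infer this dimension
print(b.shape)
## (5, 2)

try:
    a.reshape((3, 4))     # 12 slots for 10 elements cannot work
except ValueError as err:
    print("reshape failed:", err)
```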
.flatten() converts an array with a given shape to a 1-D array:
a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(a)
## [[1 2 3]
## [4 5 6]
## [7 8 9]]
print(a.flatten())
## [1 2 3 4 5 6 7 8 9]
np.zeros(shape) and np.ones(shape) work for multidimensional arrays if we provide a tuple of length > 1.
Use np.ones_like(), np.zeros_like(), or the .fill() method to create arrays of just zeros or ones (or some other value) that are the same shape as an existing array.

b = np.ones_like(a)
b.fill(33)
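To make the tuple form concrete, here is a small sketch. It also uses np.full(), which is not mentioned in these notes but builds a constant-valued array in one step; the array names are illustrative only.

```python
import numpy as np

z = np.zeros((2, 3))          # shape given as a tuple: 2 rows, 3 columns
print(z.shape)
## (2, 3)

a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.ones_like(a)           # same shape and dtype as a
b.fill(33)                    # overwrite every entry in place

c = np.full((2, 3), 33)       # one-step alternative to ones_like + fill
print(np.array_equal(b, c))
## True
```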
Use np.identity() or np.eye() to create an identity matrix (all zeros except for ones down the diagonal).
np.eye() also lets you fill in off-diagonal elements.

print(np.identity(4, dtype=float))
## [[1. 0. 0. 0.]
## [0. 1. 0. 0.]
## [0. 0. 1. 0.]
## [0. 0. 0. 1.]]
print(np.eye(4, k = -1, dtype=int))
## [[0 0 0 0]
## [1 0 0 0]
## [0 1 0 0]
## [0 0 1 0]]
For lists, the + operation concatenates two objects to create a longer one.
For arrays, use np.concatenate() to stick two suitably shaped arrays together; +, *, and ** instead operate element by element.

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
b = np.array([[10, 11,12], [13, 14, 15], [16, 17, 18]])
print(np.concatenate((a,b)))
## [[ 1 2 3]
## [ 4 5 6]
## [ 7 8 9]
## [10 11 12]
## [13 14 15]
## [16 17 18]]
print(a+b)
## [[11 13 15]
## [17 19 21]
## [23 25 27]]
print(a*b)
## [[ 10 22 36]
## [ 52 70 90]
## [112 136 162]]
print(a**b)
## [[ 1 2048 531441]
## [ 67108864 6103515625 470184984576]
## [ 33232930569601 2251799813685248 150094635296999121]]
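The contrast with lists, and the optional axis argument of np.concatenate(), can be sketched as follows. The names rows and cols are made up for the example.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
b = np.array([[10, 11, 12], [13, 14, 15], [16, 17, 18]])

print([1, 2] + [3, 4])                  # lists: + concatenates
## [1, 2, 3, 4]

rows = np.concatenate((a, b))           # default axis=0: stack b below a
cols = np.concatenate((a, b), axis=1)   # axis=1: place b to the right of a
print(rows.shape, cols.shape)
## (6, 3) (3, 6)
print(cols[0].tolist())
## [1, 2, 3, 10, 11, 12]
```

For axis=0 the arrays must agree in the number of columns, and for axis=1 in the number of rows.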
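Arithmetic between an array and a single number applies the operation to every element, e.g. a + 1.
The same holds for -, *, **, /, . . .

print(a + 1)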
## [[ 2 3 4]
## [ 5 6 7]
## [ 8 9 10]]
print(a/2)
## [[0.5 1. 1.5]
## [2. 2.5 3. ]
## [3.5 4. 4.5]]
print(a ** 3)
## [[ 1 8 27]
## [ 64 125 216]
## [343 512 729]]
numpy comes with a large library of common functions (sin, cos, log, exp, . . .): these work element-wise.
a.sum() and a.prod() will produce the sum and the product of the items in a:

print(np.sin(a))
## [[ 0.84147098 0.90929743 0.14112001]
## [-0.7568025 -0.95892427 -0.2794155 ]
## [ 0.6569866 0.98935825 0.41211849]]
print(a.sum())
## 45
print(a.prod())
## 362880
print(a.mean())
## 5.0
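These reductions also accept an axis argument, which collapses only one dimension instead of the whole array. A brief sketch using the same 3 x 3 array:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

col_sums = a.sum(axis=0)    # collapse the rows: one sum per column
row_sums = a.sum(axis=1)    # collapse the columns: one sum per row
print(col_sums.tolist())
## [12, 15, 18]
print(row_sums.tolist())
## [6, 15, 24]
print(a.mean(axis=0).tolist())
## [4.0, 5.0, 6.0]
```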