50
Vocabulary Size of Moby Dick March 5, 2015 CSCI 0931 Intro. to Comp. for the HumaniDes and Social Sciences 1

Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

Embed Size (px)

Citation preview

Page 1: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

Vocabulary  Size  of  Moby  Dick  

March  5,  2015  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   1  

Page 2: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

The  Big  Picture  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   2  

Overall  Goal  Build  a  Concordance  of  a  text  

•  Loca%ons  of  words  •  Frequency  of  words  

Today:  Summary  Sta2s2cs  •  Get  the  vocabulary  size  of  Moby  Dick  (AQempt  1)  •  Write  test  cases  to  make  sure  our  program  works  

•  Think  of  a  faster  way  to  compute  the  vocabulary  size  

Page 3: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

Last  Class  AcDvity  &  Homework  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   3  

•  Let’s  review  common  office  hours  quesDons  

•  Understanding  how  ‘for  loops’  work  – Using  a  variable  to  accumulate  some  value  

•  (e.g.,  a  count,  a  sum,  a  list)  over  the  course  of  running  the  loop  

•  Understanding  how  ‘if  statements’  work  •  Do  I  need  the  ‘else’  part?  

Page 4: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

From  Last  Class  

•  Finding  average  word  length,  longest  word  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   4  

Page 5: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

Compute  the  Average  Word  Length  of  Moby  Dick  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   5  

def avgWordLengthInMobyDick(): '''Gets the average word length in MobyDick.txt''' myList = readMobyDick() s = 0 # tally of lengths of all words encountered so far for word in myList: s = s + len(word) avg = s/float(len(myList)) return avg

Page 6: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

How  it  works  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   6  

[“cat”,  “puppy”,  “dog”,  “kiQy”]  

s:  

def avgWordLengthInMobyDick(): '''Gets the average word length in MobyDick.txt''' myList = readMobyDick() s = 0 # tally of lengths of all words encountered so far for word in myList: s = s + len(word) avg = s/float(len(myList)) return avg

word:  

0  

cat  

Page 7: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

How  it  works  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   7  

[“cat”,  “puppy”,  “dog”,  “kiQy”]  

s:  

def avgWordLengthInMobyDick(): '''Gets the average word length in MobyDick.txt''' myList = readMobyDick() s = 0 # tally of lengths of all words encountered so far for word in myList: s = s + len(word) avg = s/float(len(myList)) return avg

word:  

3  

cat  

Page 8: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

How  it  works  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   8  

[“cat”,  “puppy”,  “dog”,  “kiQy”]  

s:  

def avgWordLengthInMobyDick(): '''Gets the average word length in MobyDick.txt''' myList = readMobyDick() s = 0 # tally of lengths of all words encountered so far for word in myList: s = s + len(word) avg = s/float(len(myList)) return avg

word:  

3  

puppy  

Page 9: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

How  it  works  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   9  

[“cat”,  “puppy”,  “dog”,  “kiQy”]  

s:  

def avgWordLengthInMobyDick(): '''Gets the average word length in MobyDick.txt''' myList = readMobyDick() s = 0 # tally of lengths of all words encountered so far for word in myList: s = s + len(word) avg = s/float(len(myList)) return avg

word:  

8  

puppy  

Page 10: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

How  it  works  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   10  

[“cat”,  “puppy”,  “dog”,  “kiQy”]  

s:  

def avgWordLengthInMobyDick(): '''Gets the average word length in MobyDick.txt''' myList = readMobyDick() s = 0 # tally of lengths of all words encountered so far for word in myList: s = s + len(word) avg = s/float(len(myList)) return avg

word:  

8  

dog  

Page 11: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

How  it  works  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   11  

[“cat”,  “puppy”,  “dog”,  “kiQy”]  

s:  

def avgWordLengthInMobyDick(): '''Gets the average word length in MobyDick.txt''' myList = readMobyDick() s = 0 # tally of lengths of all words encountered so far for word in myList: s = s + len(word) avg = s/float(len(myList)) return avg

word:  

11  

dog  

Page 12: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

How  it  works  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   12  

[“cat”,  “puppy”,  “dog”,  “kiQy”]  

s:  

def avgWordLengthInMobyDick(): '''Gets the average word length in MobyDick.txt''' myList = readMobyDick() s = 0 # tally of lengths of all words encountered so far for word in myList: s = s + len(word) avg = s/float(len(myList)) return avg

word:  

11  

kiQy  

Page 13: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

How  it  works  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   13  

[“cat”,  “puppy”,  “dog”,  “kiQy”]  

s:  

def avgWordLengthInMobyDick(): '''Gets the average word length in MobyDick.txt''' myList = readMobyDick() s = 0 # tally of lengths of all words encountered so far for word in myList: s = s + len(word) avg = s/float(len(myList)) return avg

word:  

16  

kiQy  

Page 14: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

How  it  works  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   14  

[“cat”,  “puppy”,  “dog”,  “kiQy”]  

s:  

def avgWordLengthInMobyDick(): '''Gets the average word length in MobyDick.txt''' myList = readMobyDick() s = 0 # tally of lengths of all words encountered so far for word in myList: s = s + len(word) avg = s/float(len(myList)) return avg

word:  

16  

kiQy  

avg:   4.0  return  this  

Page 15: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

Now  the  longest  word…  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   15  

Page 16: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

Get  the  Longest  Word  in  Moby  Dick  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   16  

def getLongestWordInMobyDick(): '''Returns the longest word in MobyDick.txt''' return longestword

Page 17: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

Get  the  Longest  Word  in  Moby  Dick  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   17  

def getLongestWordInMobyDick(): '''Returns the longest word in MobyDick.txt''' myList = readMobyDick() longestWord = "" for word in myList: if len(word) > len(longestWord):

longestWord = word return longestWord

Page 18: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

Get  the  Longest  Word  in  Moby  Dick  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   18  

def getLongestWordInMobyDick(): '''Returns the longest word in MobyDick.txt''' myList = readMobyDick() longestWord = "" for word in myList: if len(word) > len(longestWord):

longestWord = word return longestWord

Is  our  program  correct?  

Page 19: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

How  it  works  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   19  

def getLongestWordInMobyDick(): '''Returns the longest word in MobyDick.txt''' myList = readMobyDick() longestword = "" for word in myList: if len(word) > len(longestword):

longestword = word return longestword

[“cat”,  “elephant”,  “zebra”,  “flying  squirrel”]  

longestWord:  

word:   cat  

Page 20: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

How  it  works  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   20  

def getLongestWordInMobyDick(): '''Returns the longest word in MobyDick.txt''' myList = readMobyDick() longestword = "" for word in myList: if len(word) > len(longestword):

longestword = word return longestword

[“cat”,  “elephant”,  “zebra”,  “flying  squirrel”]  

longestWord:   cat  

word:   cat  

Page 21: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

How  it  works  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   21  

def getLongestWordInMobyDick(): '''Returns the longest word in MobyDick.txt''' myList = readMobyDick() longestword = "" for word in myList: if len(word) > len(longestword):

longestword = word return longestword

[“cat”,  “elephant”,  “zebra”,  “flying  squirrel”]  

longestWord:   cat  

word:   elephant  

Page 22: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

How  it  works  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   22  

def getLongestWordInMobyDick(): '''Returns the longest word in MobyDick.txt''' myList = readMobyDick() longestword = "" for word in myList: if len(word) > len(longestword):

longestword = word return longestword

[“cat”,  “elephant”,  “zebra”,  “flying  squirrel”]  

longestWord:   elephant  

word:   elephant  

Page 23: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

How  it  works  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   23  

def getLongestWordInMobyDick(): '''Returns the longest word in MobyDick.txt''' myList = readMobyDick() longestword = "" for word in myList: if len(word) > len(longestword):

longestword = word return longestword

[“cat”,  “elephant”,  “zebra”,  “flying  squirrel”]  

longestWord:   elephant  

word:   zebra  

Page 24: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

How  it  works  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   24  

def getLongestWordInMobyDick(): '''Returns the longest word in MobyDick.txt''' myList = readMobyDick() longestword = "" for word in myList: if len(word) > len(longestword):

longestword = word return longestword

[“cat”,  “elephant”,  “zebra”,  “flying  squirrel”]  

longestWord:   elephant  

word:   flying  squirrel  

Page 25: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

How  it  works  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   25  

def getLongestWordInMobyDick(): '''Returns the longest word in MobyDick.txt''' myList = readMobyDick() longestword = "" for word in myList: if len(word) > len(longestword):

longestword = word return longestword

[“cat”,  “elephant”,  “zebra”,  “flying  squirrel”]  

longestWord:   flying  squirrel  

word:   flying  squirrel  

Page 26: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

How  it  works  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   26  

def getLongestWordInMobyDick(): '''Returns the longest word in MobyDick.txt''' myList = readMobyDick() longestword = "" for word in myList: if len(word) > len(longestword):

longestword = word return longestword

[“cat”,  “elephant”,  “zebra”,  “flying  squirrel”]  

longestWord:   flying  squirrel  

word:   flying  squirrel  

return  this  

Page 27: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

Why  use  funcDons?  

•  Break  up  tasks  into  smaller  tasks  – Test  smaller  tasks;  then  assemble!  

•  FuncDons  allow  generalizaDon!  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   27  

Page 28: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

Compute  the  Average  Word  Length  of  a  word  list  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   28  

def avgWordLength (wordList): ''‘Average word length in a nonempty list of words''' s = 0 # tally of lengths of all words encountered so far for word in wordList: s = s + len(word) avg = s/float(len(wordList)) #assumes wordList nonempty! return avg

Page 29: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

Designing  funcDons  

•  What  consDtutes  a  “smaller  task”?    •  Bad  choice:  “find  average  word  length  in  first  third  of  Moby  Dick”  

•  Good  choice:  “read  in  list  of  all  words  of  Moby  Dick”;  “compute  average  word  length  in  list”  

•  For  now...we’ll  guide  you  on  this.    

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   29  

Page 30: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

ACT2-­‐4  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   30  

•  Do  Task  1  –  PracDce  spomng  errors  in  funcDons  

Page 31: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

Debugging  Programs  

Func2on   Example  

addTwo(x,y) addTwo(2,0)  

subtractTwo(x,y) subtractTwo(2,0)  

multiplyTwo(x,y) z = multiply(2,0)  

divideTwo(x,y) divideTwo(2,0)  

addList(myList) myList([2,0])  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   31  

Page 32: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

Debugging  Programs  

Func2on   Example  

addTwo(x,y) addTwo(2,0) – output  is  wrong  

subtractTwo(x,y) subtractTwo(2,0) – any  input  causes  an  error  

multiplyTwo(x,y) z = multiply(2,0) – What  is  z  aner  this  assignment?  

divideTwo(x,y) divideTwo(2,0) – What  happens  when  I  run  this?  Use  an  “if”  to  catch  the  error  and  print  a  message  to  the  screen.  

addList(myList) myList([2,0]) – any  input  causes  an  error  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   32  

Page 33: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

Boolean  Expressions  on  Numbers  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   33  

Expression Evaluation Final

(100 > 101) and (-1 != -1)

(1 <= 2) or ((1 == 1) and (1 != 2)

x = 1 not(x > 2) or (x <= x+1)

y = 100 not(not(y == 100))

Page 34: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

Boolean  Expressions  on  Numbers  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   34  

Expression Evaluation Final

(100 > 101) and (-1 != -1) False and False False

(1 <= 2) or ((1 == 1) and (1 != 2)

x = 1 not(x > 2) or (x <= x+1)

y = 100 not(not(y == 100))

Page 35: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

Boolean  Expressions  on  Numbers  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   35  

Expression Evaluation Final

(100 > 101) and (-1 != -1) False and False False

(1 <= 2) or ((1 == 1) and (1 != 2)

True or (True and True)

True

x = 1 not(x > 2) or (x <= x+1)

y = 100 not(not(y == 100))

Page 36: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

Boolean  Expressions  on  Numbers  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   36  

Expression Evaluation Final

(100 > 101) and (-1 != -1) False and False False

(1 <= 2) or ((1 == 1) and (1 != 2)

True or (True and True)

True

x = 1 not(x > 2) or (x <= x+1)

True or True True

y = 100 not(not(y == 100))

Page 37: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

Boolean  Expressions  on  Numbers  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   37  

Expression Evaluation Final

(100 > 101) and (-1 != -1) False and False False

(1 <= 2) or ((1 == 1) and (1 != 2)

True or (True and True)

True

x = 1 not(x > 2) or (x <= x+1)

True or True True

y = 100 not(not(y == 100))

not(False) True

Page 38: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

Boolean  Expressions  on  Strings  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   38  

Boolean  Operators  on  Strings  Operator   Example   Result  Equality   'a' == 'b' False

Inequality   'a' != 'b' True

Page 39: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

Boolean  Expressions  on  Strings  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   39  

>>> 'apple' == 'apple' True >>> 'apple' == 'Apple' False >>> 'apple' == 'apple!' False

Boolean  Operators  on  Strings  Operator   Example   Result  Equality   'a' == 'b' False

Inequality   'a' != 'b' True

Page 40: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

The  Big  Picture  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   40  

Overall  Goal  Build  a  Concordance  of  a  text  

•  Loca%ons  of  words  •  Frequency  of  words  

Today:  Summary  Sta2s2cs  •  Get  the  vocabulary  size  of  Moby  Dick  (AQempt  1)  •  Write  test  cases  to  make  sure  our  program  works  

•  Think  of  a  faster  way  to  compute  the  vocabulary  size  

Save  ACT2-­‐4.py  and  MobyDick.txt  to  the  same  directory  

Page 41: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

WriDng  a  vocabSize  FuncDon  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   41  

def vocabSize(): myList = readMobyDickShort() uniqueList = noReplicates(myList) return len(uniquelist)

Page 42: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

WriDng  a  vocabSize  FuncDon  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   42  

def noReplicates(wordList): '''takes a list as argument, returns a list free of replicate items. slow implementation.'''

def isElementOf(myElement,myList): '''takes a string and a list and returns True if the string is in the list and False otherwise.'''

def vocabSize(): myList = readMobyDickShort() uniqueList = noReplicates(myList) return len(uniquelist)

Page 43: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

WriDng  a  vocabSize  FuncDon  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   43  

def noReplicates(wordList): '''takes a list as argument, returns a list free of replicate items. slow implementation.'''

def isElementOf(myElement,myList): '''takes a string and a list and returns True if the string is in the list and False otherwise.'''

def vocabSize(): myList = readMobyDickShort() uniqueList = noReplicates(myList) return len(uniquelist)

def testNoReplicates(): def testIsElementOf():

WriDng  test  cases  is  important  to  make  sure  your  program  works!  

Page 44: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

The  Big  Picture  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   44  

Overall  Goal  Build  a  Concordance  of  a  text  

•  Loca%ons  of  words  •  Frequency  of  words  

Today:  Summary  Sta2s2cs  •  Get  the  vocabulary  size  of  Moby  Dick  (AQempt  1)  •  Write  test  cases  to  make  sure  our  program  works  

•  Think  of  a  faster  way  to  compute  the  vocabulary  size  

Page 45: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

What  does  slow implementation mean?      

•  Replace  readMobyDickShort()  with  readMobyDickAll()

•  Now,  run  vocabSize() – Hint:  Ctrl-­‐C  (or  Command-­‐C)  will  abort  the  call.  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   45  

Page 46: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

What  does  slow implementation mean?      

•  Replace  readMobyDickShort()  with  readMobyDickAll()

•  Now,  run  vocabSize() – Hint:  Ctrl-­‐C  (or  Command-­‐C)  will  abort  the  call.  

•  Faster  way  to  write  noReplicates() – What  if  we  can  sort  the  list?  [‘a’,’a’,’a’,’at’,’and’,’and’,…,’zebra’]

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   46  

Page 47: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

SorDng  Lists  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   47  

Preloaded  Func2ons  

Name   Inputs   Outputs   CHANGES  

sort List   Original  List!  

Page 48: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

SorDng  Lists  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   48  

Preloaded  Func2ons  

Name   Inputs   Outputs   CHANGES  

sort List   Original  List!  

>>> myList = [0,4,1,5,-1,6] >>> myList.sort() >>> myList [-1, 0, 1, 4, 5, 6]

Page 49: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

SorDng  Lists  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   49  

Preloaded  Func2ons  

Name   Inputs   Outputs   CHANGES  

sort List   Original  List!  

>>> myList = [0,4,1,5,-1,6] >>> myList.sort() >>> myList [-1, 0, 1, 4, 5, 6] >>> myList = ['b','d','c','a','z','i'] >>> myList.sort() >>> myList ['a', 'b', 'c', 'd', 'i', 'z']

Page 50: Vocabulary*Size*of*Moby*Dick* - Brown Universitycs.brown.edu/courses/csci0931/2015-spring/2-text_analysis/LEC2-5.pdf · Vocabulary*Size*of*Moby*Dick* March*5,*2015* CSCI0931*=*Intro.*to*Comp.*for*the*HumaniDes*and*Social*Sciences*

The  Big  Picture  

CSCI  0931  -­‐  Intro.  to  Comp.  for  the  HumaniDes  and  Social  Sciences   50  

Overall  Goal  Build  a  Concordance  of  a  text  

•  Loca%ons  of  words  •  Frequency  of  words  

Today:  Summary  Sta2s2cs  •  Get  the  vocabulary  size  of  Moby  Dick  (AQempt  1)  •  Write  test  cases  to  make  sure  our  program  works  

•  Think  of  a  faster  way  to  compute  the  vocabulary  size