Python: Strings - Methods


• As discussed earlier, strings are sequences of characters delimited by single or double quotes

• String methods are associated with the string class (str), and as such are available on each individual string object

• These methods are accessed using the syntax string.method(arguments)

• These do not affect the string itself: strings are immutable, so each method returns a new value

• Built-in methods:

1. Construction/exploding operations

(a) join

– Syntax: string.join(list of string-args)

– Semantics: Creates a string made from the string-arg parameters separated by string

– Example: ":".join(['hours', 'minutes']) ⇒ 'hours:minutes'

(b) split

– Syntax: string.split([separator [, int]])

– Semantics: Creates a list of strings created by breaking string apart

∗ Essentially the opposite of join

∗ If no arguments are provided, the splits occur on any run of white space

∗ If the separator string argument is provided, the splits occur on that string

∗ If the int argument is provided, at most that number of splits occur on the provided string, working left to right


– Examples:

"Every good boy does fine".split() ⇒ ['Every', 'good', 'boy', 'does', 'fine']

"15:21:00".split(":") ⇒ ['15', '21', '00']

"15:21:00".split(":", 1) ⇒ ['15', '21:00']
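The join and split entries above can be tried together. The following is a minimal sketch; the variable names are illustrative and not part of the notes.

```python
# Split a timestamp into fields, then rebuild it with join.
timestamp = "15:21:00"
fields = timestamp.split(":")
rebuilt = ":".join(fields)

# A split limit stops after that many splits, working left to right.
hour_and_rest = timestamp.split(":", 1)

# With no argument, any run of white space acts as a single separator.
words = "Every  good\tboy".split()

print(fields)         # ['15', '21', '00']
print(rebuilt)        # 15:21:00
print(hour_and_rest)  # ['15', '21:00']
print(words)          # ['Every', 'good', 'boy']
```

Note that join is called on the separator, not on the list, which is why the two methods read as mirror images of each other.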

2. Formatting operations

(a) strip

– Syntax: string.strip([strip-string])

– Semantics: Creates a string made by removing characters from the beginning and end of string

∗ By default (no argument), white space is removed from the ends of the string

∗ If strip-string is included, any characters contained in strip-string are removed from the ends instead; it is treated as a set of characters, not as a prefix or suffix

– Examples:

"  hello  ".strip() ⇒ 'hello'

"abc-keep-axa".strip("abcde-") ⇒ 'keep-ax'

(b) lstrip

– Syntax: See strip

– Semantics: Same as strip but only affects beginning of string

(c) rstrip

– Syntax: See strip

– Semantics: Same as strip but only affects end of string
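A short sketch comparing the three strip variants on the same inputs may help; the variable names are made up for illustration.

```python
padded = "  hello  "
both_ends = padded.strip()
left_end = padded.lstrip()
right_end = padded.rstrip()

# repr makes the remaining spaces visible.
print(repr(both_ends), repr(left_end), repr(right_end))

# The argument is a set of characters to remove, not a prefix or suffix.
source = "abc-keep-axa"
stripped = source.strip("abcde-")
left_only = source.lstrip("abcde-")
right_only = source.rstrip("abcde-")

print(stripped)    # keep-ax
print(left_only)   # keep-axa
print(right_only)  # abc-keep-ax
```

Stripping stops at the first character that is not in the set, which is why the inner hyphens and the inner a survive.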


(d) maketrans and translate

– These are used in tandem for translating/encoding characters

– First, a table is generated using maketrans:

∗ Syntax: string.maketrans(source-chars, code-chars)

∗ Example: t = "abcde".maketrans("bd", ":;")

∗ The result must be saved to be used by translate

– Then, the table is used to convert the source chars to code chars by translate:

∗ Syntax: string.translate(table)

∗ Example: "bbbddd".translate(t) ⇒ ':::;;;'

∗ Note that the string used in making the table does not need to be the string that the table is applied to, nor do the characters in source chars need to be contained in the string used when making the table
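Because the string used to build the table is irrelevant, maketrans can also be called on the str class itself. The sketch below assumes that style and builds a second, reverse table to undo the encoding; the sample text and names are illustrative only. The two arguments to maketrans must have the same length.

```python
# Build an encoding table and its inverse.
encode_table = str.maketrans("bd", ":;")
decode_table = str.maketrans(":;", "bd")

message = "bad deed"
encoded = message.translate(encode_table)
decoded = encoded.translate(decode_table)

print(encoded)  # :a; ;ee;
print(decoded)  # bad deed

# The table is a dictionary that maps character codes to character codes.
print(encode_table == {ord("b"): ord(":"), ord("d"): ord(";")})  # True
```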

(e) lower

– Syntax: string.lower()

– Semantics: Converts string to lower case

(f) upper

– Syntax: string.upper()

– Semantics: Converts string to upper case

(g) capitalize

– Syntax: string.capitalize()

– Semantics: Capitalizes the first character of string and converts the rest to lower case


(h) title

– Syntax: string.title()

– Semantics: Capitalizes the first character of each word of string and converts the remaining letters to lower case

(i) swapcase

– Syntax: string.swapcase()

– Semantics: Changes the case of each letter in string
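The five case-changing methods can be compared on a single input. This is only a sketch; the sample phrase is arbitrary.

```python
phrase = "hELLO wORLD"

lowered = phrase.lower()
uppered = phrase.upper()
capitalized = phrase.capitalize()
titled = phrase.title()
swapped = phrase.swapcase()

print(lowered)      # hello world
print(uppered)      # HELLO WORLD
print(capitalized)  # Hello world
print(titled)       # Hello World
print(swapped)      # Hello World

# The original string is unchanged by all of these calls.
print(phrase)       # hELLO wORLD
```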

(j) ljust

– Syntax: string.ljust(field-width)

– Semantics: Left justifies string in a field of field-width characters padded with spaces

– Example: "abc".ljust(8) ⇒ 'abc     '

(k) rjust

– Syntax: string.rjust(field-width)

– Semantics: Right justifies string in a field of field-width characters padded with spaces

– Example: "abc".rjust(8) ⇒ '     abc'

(l) center

– Syntax: string.center(field-width)

– Semantics: Centers string in a field of field-width characters padded with spaces

– Example: "abc".center(11) ⇒ '    abc    '


(m) zfill

– Syntax: string.zfill(field-width)

– Semantics: Adds zeroes to the left of string in a field of field-width characters

– Example: "1234".zfill(8) ⇒ '00001234'
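The justification methods are mostly useful for lining up columns. The sketch below uses a made-up stock list and column widths to show ljust, rjust, center and zfill working together.

```python
rows = [("apples", "7"), ("kiwis", "12"), ("figs", "130")]

# Names in a 10-column field on the left, quantities in a 5-column field on the right.
lines = [name.ljust(10) + quantity.rjust(5) for name, quantity in rows]
heading = "STOCK".center(15)

print(heading)
for line in lines:
    print(line)

# zfill pads with zeroes and keeps a leading sign in front of them.
plain_code = "42".zfill(6)
signed_code = "-42".zfill(6)
print(plain_code, signed_code)  # 000042 -00042
```

Every line comes out exactly 15 characters wide, so the quantities share a right-hand edge.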

3. Query operations

(a) find

– Syntax: string.find(key [, start [, end]])

– Semantics: Returns index of the first occurrence of key in string, working left to right

∗ If start is included, searching starts at position start

∗ If end is included, searching ends immediately before position end

∗ If key is not found, -1 is returned

– Examples:

"singsing".find("in") ⇒ 1

"singsing".find("in", 2) ⇒ 5

"singsing".find("in", 2, 6) ⇒ -1

(b) rfind

– Syntax: See find

– Semantics: Same as find, but returns index of the last occurrence of key in string, working right to left

(c) index

– Syntax: string.index(key [, start [, end]])

– Semantics: Same as find, but raises an exception (ValueError) instead of returning -1 when key is not found


(d) rindex

– Syntax: See index

– Semantics: index version of rfind
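The difference between the find family and the index family is only in how a missing key is reported. The sketch below shows both; locate is a hypothetical helper written for this example, not a built-in.

```python
text = "singsing"

first = text.find("in")      # leftmost match
last = text.rfind("in")      # rightmost match
missing = text.find("xy")    # no match: -1, no exception

print(first, last, missing)  # 1 5 -1


def locate(haystack, key):
    """Hypothetical helper: behaves like find, but is built on index."""
    try:
        return haystack.index(key)
    except ValueError:
        # index raises instead of returning -1.
        return -1


print(locate(text, "in"))  # 1
print(locate(text, "xy"))  # -1
```

Using index is convenient when a missing key is a genuine error; find suits cases where absence is an expected outcome to be tested for.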

(e) count

– Syntax: string.count(key)

– Semantics: Returns number of non-overlapping occurrences of key in string

– Examples:

"singsing".count("in") ⇒ 2

"ininini".count("ini") ⇒ 2

(f) startswith

– Syntax: string.startswith(key) or string.startswith((key, key, ...))

– Semantics: Returns True/False depending on whether any of the listed keys is a prefix of string; several keys must be passed as a single tuple

– Examples:

"singsing".startswith("in") ⇒ False

"singsing".startswith("sin") ⇒ True

"singsing".startswith(("ins", "isn", "nis", "nsi", "sin", "sni")) ⇒ True

(g) endswith

– Syntax: See startswith

– Semantics: Same as startswith but for suffix
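A brief sketch of count, startswith and endswith follows. The file name is invented, and count_overlapping is a hypothetical helper showing how overlapping matches could be counted with find, since count itself skips them.

```python
filename = "report_2024.csv"

is_report = filename.startswith("report")
is_table = filename.endswith((".csv", ".tsv"))  # a tuple means "any of these"
print(is_report, is_table)  # True True


def count_overlapping(text, key):
    """Hypothetical helper: counts matches even when they overlap."""
    total = 0
    position = text.find(key)
    while position != -1:
        total += 1
        # Resume one character later so overlapping matches are seen.
        position = text.find(key, position + 1)
    return total


print("ininini".count("ini"))               # 2
print(count_overlapping("ininini", "ini"))  # 3
```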

(h) isdigit

– Syntax: string.isdigit()

– Semantics: Returns True/False depending on whether all of the characters of string are digits


(i) isalpha

– Syntax: string.isalpha()

– Semantics: Returns True/False depending on whether all of the characters of string are alphabetic

(j) isalnum

– Syntax: string.isalnum()

– Semantics: Returns True/False depending on whether all of the characters of string are alphanumeric

(k) islower

– Syntax: string.islower()

– Semantics: Returns True/False depending on whether all of the letters of string are lower case (there must be at least one letter)

(l) isupper

– Syntax: string.isupper()

– Semantics: Returns True/False depending on whether all of the letters of string are upper case (there must be at least one letter)
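The is... tests are typically used to check input before converting it. The sketch below shows a few edge cases and a hypothetical helper, parse_quantity, that is not part of the notes.

```python
print("2024".isdigit())         # True
print("-5".isdigit())           # False: the sign is not a digit
print("3.14".isdigit())         # False: neither is the decimal point
print("".isdigit())             # False: an empty string fails every test
print("abc123".isalnum())       # True
print("abc123".isalpha())       # False
print("hello world".islower())  # True: the space is ignored, letters decide
print("123".islower())          # False: there are no letters at all


def parse_quantity(text):
    """Hypothetical helper: returns an int, or None if text is not all digits."""
    cleaned = text.strip()
    if cleaned.isdigit():
        return int(cleaned)
    return None


print(parse_quantity(" 42 "))  # 42
print(parse_quantity("4x2"))   # None
```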


4. Conversion functions

(a) int

– Syntax: int(string [, base])

– Semantics: Returns the integer equivalent of string interpreted as base ten, or raises an exception (ValueError)

– If base is included, string is interpreted in that base

(b) float

– Syntax: float(string)

– Semantics: Returns the real equivalent of string, or raises an exception (ValueError)

(c) repr

– Syntax: repr(object)

– Semantics: Returns a printable version of object

– This can be used with any Python object

– Where possible, the result is something that can be used to reproduce the original object

(d) str

– Syntax: str(object)

– Semantics: Returns a printable version of object

– This differs from repr in that the result is intended for people to read
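The conversion functions above are ordinary built-in functions rather than string methods, so the string is passed as an argument. A minimal sketch, with illustrative values:

```python
import ast

decimal_value = int("42")
hex_value = int("ff", 16)
binary_value = int("101", 2)
real_value = float("3.5")
print(decimal_value, hex_value, binary_value, real_value)  # 42 255 5 3.5

# A string that is not a valid integer raises ValueError.
try:
    int("4.2")
    conversion_failed = False
except ValueError:
    conversion_failed = True
print(conversion_failed)  # True

# str is meant for people; repr shows quotes and escape sequences.
text = "hi\n"
for_people = str(text)
for_python = repr(text)
print(for_python)  # 'hi\n'

# The repr of a string can be evaluated to get the original back.
round_trip = ast.literal_eval(for_python)
print(round_trip == text)  # True
```

ast.literal_eval is used here instead of eval only because it accepts nothing but literals; that choice is an assumption of this sketch, not something the notes prescribe.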


Python: Strings - Formatting

• Formatting refers to controlling the way output is displayed

– You know how to combine strings and duplicate them (+, ∗)

– You know how to control how values are printed (end =, sep =)

• But these are really not sufficient if you want to display output that looks like this:

                                  Profits
                         First         Second          Third         Fourth
Branch                 Quarter        Quarter        Quarter        Quarter
---------------------------------------------------------------------------
North Branch     $1,213,440.98    $856,112.80    $967,765.42  $1,317,444.00
South Branch        $99,345.75    $233,665.12  $2,008,933.78 $10,000,134.62
---------------------------------------------------------------------------

• Python3 provides two formatting techniques

– These notes discuss the format method

– This is a string method

• Using the format method

– Syntax: format-template.format(parameter-list)

∗ The format template is a string with embedded formatting specifications

∗ The parameter list is a comma-separated list of values to be substituted for the embedded formatting specs

– Semantics: The format template is returned with parameters substituted for the formatting specs. Everything else in the formatting string stays exactly as it appears


– Formatting specs

∗ These are indicated by a pair of curly braces

∗ There are three flavors

1. Empty braces

· Parameters are substituted in order for the braces

· For example: 'The {} in the {}'.format('cat', 'hat') returns 'The cat in the hat'

where the first parameter (cat) is substituted for the first spec and the second parameter (hat) is substituted for the second spec

2. Braces with integers

· Parameters are implicitly numbered (starting with zero)

· If an integer appears in the braces, it refers to the parameter with that number. This is referred to as positional referencing

· This allows you to substitute parameters in any order and multiple times

· For example: '{2} porridge {1}, {2} porridge {0}'.format('cold', 'hot', 'pease') returns 'pease porridge hot, pease porridge cold'

3. Braces with keywords

· In this approach, braces contain an identifier (keyword), and parameters have the form <keyword>=<value>

· The value associated with a keyword is substituted for that keyword in the template

· '{z} porridge {y}, {z} porridge {x}'.format(x = 'cold', y = 'hot', z = 'pease') returns 'pease porridge hot, pease porridge cold'

4. Positional and keyword referencing can both appear in the same format call: positional parameters must appear first, followed by keyword parameters

· '{z} porridge {y}, {z} porridge {0}'.format('cold', y = 'hot', z = 'pease') returns 'pease porridge hot, pease porridge cold'
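The flavors of formatting spec described above can be checked side by side. A minimal sketch, reusing the examples from the notes and adding one repeated-parameter case:

```python
in_order = 'The {} in the {}'.format('cat', 'hat')
numbered = '{2} porridge {1}, {2} porridge {0}'.format('cold', 'hot', 'pease')
keyed = '{z} porridge {y}, {z} porridge {x}'.format(x='cold', y='hot', z='pease')
mixed = '{z} porridge {y}, {z} porridge {0}'.format('cold', y='hot', z='pease')

print(in_order)  # The cat in the hat
print(numbered)  # pease porridge hot, pease porridge cold
print(numbered == keyed == mixed)  # True

# A numbered parameter may be used more than once.
repeated = '{0}-{0}-{1}'.format('a', 'b')
print(repeated)  # a-a-b
```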


– In addition to a parameter, a spec may contain a set of codes that perform various formatting tasks

∗ These are preceded by a colon

– Conversion codes

∗ As discussed so far, Python displays a parameter using its default string form

∗ Conversion codes tell Python to display the parameter as a specific data type (NOTE: This is not a type coercion. The value must be able to be represented as the indicated type.)

∗ Codes (incomplete):

Code   Meaning
d      Signed integer decimal
e      Floating point exponential format (lower-case e)
E      Floating point exponential format (upper-case E)
f      Floating point decimal format (for float and complex)
F      Floating point decimal format (for float and complex)
c      Single character (from an integer character code)
s      String
%      Percentage: the value is multiplied by 100 and displayed in f format followed by a per cent sign

Additional codes include o, x, X, g, G

The codes i, u and r belong to the % operator (the other formatting technique) and are rejected by the format method. A repr conversion is written as !r before the colon, as in {!r}

∗ For example: 'x = {:f}'.format(x) returns 'x = 12.300000' (when variable x has value 12.3)

∗ The conversion code must come last
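The behavior of individual conversion codes can be checked directly. This is a minimal sketch with arbitrary sample values.

```python
as_integer = '{:d}'.format(42)
as_fixed = '{:f}'.format(12.3)
as_exponent = '{:e}'.format(12345.678)
as_character = '{:c}'.format(65)
as_hex = '{:x}'.format(255)
as_percent = '{:%}'.format(0.25)
as_repr = '{!r}'.format('hi')

print(as_integer)    # 42
print(as_fixed)      # 12.300000
print(as_exponent)   # 1.234568e+04
print(as_character)  # A
print(as_hex)        # ff
print(as_percent)    # 25.000000%
print(as_repr)       # 'hi'

# No coercion takes place: a float cannot be displayed with the d code.
try:
    '{:d}'.format(12.3)
    rejected = False
except ValueError:
    rejected = True
print(rejected)  # True
```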


– Field width and precision

∗ Field width is an integer that specifies the number of columns to use for a value

· It is applicable for any data type

· If the value requires more columns than specified in the field width, Python will ignore the field width

· Note that by default strings are left-justified; numerics are right-justified

· For example:

'x ={:5}'.format(10) returns 'x =   10'

'x ={:5}'.format(856392) returns 'x =856392'

'x ={:5}'.format('abc') returns 'x =abc  '

∗ Precision is an integer that specifies the number of digits to the right of the decimal point

· It is preceded by a decimal point

· It is applicable for float and complex

· Note that the value displayed is rounded, but this has no effect on the actual value

· For example:

'x ={:8.2f}'.format(10.9) returns 'x =   10.90'

'x ={:8.2f}'.format(8.56392) returns 'x =    8.56'

– Justification codes

∗ < left justifies

∗ > right justifies

∗ ^ centers

∗ These must appear before the field width

∗ For example:

'x ={:<8}'.format(10.9) returns 'x =10.9    '

'x ={:^8}'.format('ab') returns 'x =   ab   '
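Padding is hard to see on the page, so the sketch below wraps each field in vertical bars to make the field edges visible. The bars are only a display aid chosen for this example.

```python
default_number = '|{:5}|'.format(10)       # numbers default to the right
default_string = '|{:5}|'.format('abc')    # strings default to the left
too_wide = '|{:5}|'.format(856392)         # width is ignored when too small
rounded = '|{:8.2f}|'.format(8.56392)      # width 8, two decimal places
forced_left = '|{:<8}|'.format(10.9)
forced_right = '|{:>8}|'.format('ab')
centered = '|{:^8}|'.format('ab')

for shown in (default_number, default_string, too_wide, rounded,
              forced_left, forced_right, centered):
    print(shown)
```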


– Fill characters

∗ A fill character is used instead of blank spaces when padding a field

∗ It is placed immediately after the colon and must be followed by a justification code

∗ A zero placed directly before the field width also pads with zeroes, with no justification code needed

∗ For example

· Zero after the colon pads with zeroes: 'x ={:08}'.format(10.9) returns 'x =000010.9'

· An asterisk after the colon pads with asterisks: '{:*^9}'.format(3) returns '****3****'

· Note that the character is not quoted

– Numeric formatting codes

∗ A comma after the field width inserts commas as thousands separators

· For example: 'x ={:20,}'.format(10000000000) returns 'x =      10,000,000,000'

∗ A + before the field width forces a preceding sign

· For example: 'x ={:+5}'.format(10) returns 'x =  +10'
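With width, precision, justification, fill and the comma code in hand, the data rows of the profits table shown at the start of this topic can be produced. The sketch below is one possible approach, not the only one: the 15-column widths, the money helper and the variable names are all assumptions made for illustration. The dollar sign is attached first, and the result is then right-justified as a string.

```python
branches = [
    ("North Branch", [1213440.98, 856112.80, 967765.42, 1317444.00]),
    ("South Branch", [99345.75, 233665.12, 2008933.78, 10000134.62]),
]


def money(amount):
    """Hypothetical helper: dollar sign, thousands separators, two decimals."""
    return '${:,.2f}'.format(amount)


# A fill character with no value gives a rule of 75 dashes.
rule = '{:-<75}'.format('')

lines = []
for name, profits in branches:
    cells = ''.join('{:>15}'.format(money(profit)) for profit in profits)
    lines.append('{:<15}{}'.format(name, cells))

print(rule)
for line in lines:
    print(line)
print(rule)
```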


– Formatting specs as parameters

∗ The examples above use fixed specifications for specs like width, precision, fill characters, etc.

∗ But what if you wanted these to be variable; e.g., what if you wanted the person running your program to decide how wide a field should be?

∗ Formatting specs can be represented as variables in exactly the same way as data: with curly braces

∗ For example

· A variable field width

x = 8

'*{:^{}}*'.format(10, x) returns '*   10   *'

· A variable field width and fill character

x = 8

'*{:{}^{}}*'.format(10, 'a', x) returns '*aaa10aaa*'

∗ Like any other specs, these nested specs can be explicitly numbered or keyed

∗ If positional parameters are used (as in the above examples), the outer braces are numbered first, then the inner braces within them, left to right
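Nested specs are convenient inside a function whose caller chooses the layout. The sketch below wraps the examples above in a hypothetical function named boxed and shows the numbered and keyed equivalents.

```python
def boxed(value, width, fill=' '):
    """Hypothetical helper: centers value in width columns between asterisks."""
    # Outer braces take value; the inner braces take fill, then width.
    return '*{:{}^{}}*'.format(value, fill, width)


print(boxed(10, 8))       # *   10   *
print(boxed(10, 8, 'a'))  # *aaa10aaa*

# The same spec with explicit numbers, then with keywords.
numbered = '*{0:{2}^{1}}*'.format(10, 8, 'a')
keyed = '*{value:{fill}^{width}}*'.format(value=10, fill='a', width=8)
print(numbered == keyed == boxed(10, 8, 'a'))  # True
```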


Python: Characters - Intro

• Python doesn’t distinguish between strings and characters as data types

– A character is just a string containing one character

– Remember - a string is an iterable type, so iterating over it yields its characters one at a time, each as a one-character string

• Characters are represented internally as integers

– ASCII (American Standard Code for Information Interchange)

∗ Standard encoding

∗ ASCII codes can be found online (see resources page)

– Unicode

∗ A superset of ASCII: the first 128 Unicode codes are the ASCII codes

∗ Uses larger integers so can represent more characters (i.e., international character sets)


Python: Characters - Representation

• As indicated above, characters are simply single-character strings

– As such, printing characters are represented using quotes

• Non-printing characters cannot be represented with quotes. Alternatives must be used

1. Escape sequences

– An escape sequence is a sequence of characters indicated by a special marker character

– For characters, this is the backslash (\)

– Commonly used escape sequences:

Character        Escape sequence
single quote     \'
double quote     \"
backslash        \\
alert (bell)     \a
backspace        \b
formfeed         \f
newline          \n
return           \r
tab              \t
vertical tab     \v

2. Octal

– Octal is base 8 representation

– Uses digits 0 - 7

– Octal characters are represented by a backslash followed by up to three octal digits (enclosed in quotes)

– If not a printable character, Python displays the character as an escape sequence on evaluation (in hexadecimal form, e.g. '\001' is shown as '\x01')


3. Hexadecimal

– Hexadecimal is base 16 representation

– Uses digits 0 - 9 plus A - F (for 10 - 15)

– Hexadecimal characters are represented by a backslash and 'x' followed by two hexadecimal digits

∗ Exactly two hex digits are read; anything after them is treated as ordinary text

4. Unicode

– Unicode characters are represented by a backslash and 'u' followed by four hexadecimal digits
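The four representations can be compared by writing the same character each way. A minimal sketch using capital A (code 65, octal 101, hexadecimal 41); the variable names are illustrative.

```python
plain = 'A'
octal = '\101'             # backslash + octal digits
hexadecimal = '\x41'       # backslash + x + two hex digits
unicode_escape = '\u0041'  # backslash + u + four hex digits

print(plain == octal == hexadecimal == unicode_escape)  # True

# An escape sequence stands for a single character.
print(len('\n'), len('\\'))  # 1 1

# Only two hex digits are consumed, so the B is a separate character.
two_characters = '\x41B'
print(two_characters)  # AB

# A non-printing character is shown as an escape sequence on evaluation.
shown = repr('\001')
print(shown)  # '\x01'
```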


Python: Characters - Functions

• Since characters are strings, all of the functions related to strings apply to characters

– However, there are a few functions exclusive to characters

• Character functions

1. ord

– Syntax: ord(char)

– Semantics: Returns the integer code of char (its ASCII value for ASCII characters, its Unicode value otherwise)

– (ord stands for ordinal, meaning the position of the character in the ordered sequence of characters; i.e., the number of its position)

2. chr

– Syntax: chr(int)

– Semantics: Returns the character whose ASCII (or Unicode) value is int
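Since ord and chr are inverses, they can be combined to do arithmetic on characters. The sketch below includes a hypothetical helper, shift_letter, which assumes its argument is a single lower-case ASCII letter.

```python
code_of_a = ord('A')
letter = chr(97)
next_letter = chr(ord('a') + 1)
print(code_of_a, letter, next_letter)  # 65 a b

# Codes beyond ASCII work the same way.
print(ord('é'))  # 233


def shift_letter(ch, offset):
    """Hypothetical helper: moves a lower-case letter along the alphabet,
    wrapping around from z to a."""
    position = ord(ch) - ord('a')
    return chr((position + offset) % 26 + ord('a'))


print(shift_letter('a', 3))  # d
print(shift_letter('y', 3))  # b
```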
