Chp-4: Strings#

Section Title: Strings

Chapter Objectives

By the end of this chapter, the student should be able to:

  • Define strings and use them to represent textual data in Python.

  • Declare strings using single quotes, double quotes, or triple quotes

  • Perform string manipulation operations including concatenation, slicing, and indexing.

  • Use built-in string methods in Python.

  • Use f-strings for formatting strings.

  • Use escape sequences to control whitespace characters in printed output.

  • Examine encoding schemes such as ASCII and Unicode.

  • Perform string operations such as concatenation and repetition.

  • Access individual characters or substrings in a string using indexing and slicing.

  • Recognize the immutability of string.

  • Solve real-world problems, such as parsing strings.

Create Strings#

Strings are ordered sequences of characters. There are several ways to create strings:

  • Single quotes: 'text'

  • Double quotes: "text"

  • Triple single quotes: '''text'''

  • Triple double quotes: """text"""

A space is also a character that can be in a string. Triple single and double quotes are used to create strings spanning multiple lines.

# Single quotes 
name = 'Mary'
# Double quotes 
name = "Mary"
# Triple single quotes
name = '''Mary'''
# Triple double quotes
name = """Mary"""

Double single/double quotes cannot be used to create strings.

# ERROR: Double single quotes.
name = ''Mary''
# ERROR: Double double quotes. 
name = ""Mary""

If a single quote is a character in the string, using single quotes to create the string is an error. This is because the character single quote will end the creation of the string.

  • It behaves like the second single quote, instead of being a character in the string.

  • You can use double quotes to create the string to overcome the confusion. Alternatively, you can use escaping characters.

# ERROR
'Mary's book.'
  • In the code above, the second single quote ends the creation of the string.

  • The s book.' part causes the error because it has only one single quote.

  • If you use double quotes, then there will not be any problem.

# no ERROR
sentence = "Mary's book."
# ERROR
"Mary said "I am here"."
  • In the code above, the second double quote ends the creation of the string.

  • The I am here part causes the error because it is not enclosed by single or double quotes.

  • If you use single quotes, then there will not be any problem.

# no ERROR
sentence = 'Mary said "I am here".'

Single/Double quotes cannot be used with strings spanning more than one line.

# ERROR: Multiple lines.
name = 'John
     Steinbeck'
print(x)
# ERROR: Multiple lines.
name = "John
     Steinbeck"

Triple single/double quotes are used to create strings with multiple lines.

# no ERROR
name = '''John
     Steinbeck'''
# no ERROR
name = """John
     Steinbeck"""

Quick Check!

Use a single string to print the following text:
He's a very
hardworking student.

Solution

text = '''He's a very   
hardworking student.'''

print(text)

Escaping characters#

An escape character is a backslash \ followed by a character. It is used to write special characters in a string.

  • \' : Single quote (apostroph)

  • \" : Double quote

  • \n : New line

  • \t : Tabulation

  • \b : Backspace

  • \\ : Backslash

  • \r : Carriage return

# \' is the character '
print('Mary\'s book.')
Mary's book.
# \" is the character "
print("Mary said \"I am here\".")
Mary said "I am here".
# \n is a new line
print('John\nSteinbeck')
John
Steinbeck
  • In the following code, ‘\t’ inserts 4 spaces because after ‘John’, there are 4 more spaces until the next tab stop, which occurs at the 8th character.

# \t is a tab 
print('John\tSteinbeck')
print('John'+' '*4+'Steinbeck')
John	Steinbeck
John    Steinbeck
  • In the following code, ‘\t’ inserts 7 spaces because ‘Steinbeck’ has 9 characters and there are 7 more spaces until the next tab stop, which occurs at the 16th character.

print('Steinbeck\tJohn')
print('Steinbeck'+' '*7+'John')
Steinbeck	John
Steinbeck       John
  • \b (backspace) moves one character back, deleting the preceding character.

print('John\b Steinbeck')
  • Output: Joh Steinbeck

  • \b\b (two backspaces) moves two characters back, deleting the preceding two characters.

print('John\b\b Steinbeck')
  • Output: Jo Steinbeck

# \\ is the \ (backslash) character
print('John\\ Steinbeck')
John\ Steinbeck

Quick Check!

Find the output of the following code:

print('abc\ndef\b\bgh')

Solution

abc
dgh

Raw Strings#

Each character in a raw string has no special meaning; they are just characters.

  • It is created using r in front of the string: r'text'

  • In the example below, \t,\n,\',\" have no special meanings; they are just characters \, t, n,',"

# \t means two characters, \ and t. 
# \t does not mean a tab in a raw string

text = r'Good \t bye.'
print(text)
Good \t bye.
text = r'Hello.\t my name\n is Tom. I am\' from\" England.'
print(text)
Hello.\t my name\n is Tom. I am\' from\" England.

f-strings#

It is a great way to combine constants and variable values.

  • It is in the form of: f'text {variable} text'

  • Variables are enclosed in curly brackets {variable} called placeholders.

  • An f-string generates a new string.

  • You can also perform rounding or algebraic operations within curly brackets.

  • It is much easier to use f-strings than comma-separated values in a print() function.

name = 'Tom'
country = 'Spain'
age = 25
weight = 173.6294
  • You can compare the next two cells below to see the advantage of using f-strings.

# using an f-string
print(f'My name is {name}.')
My name is Tom.
# using comma separated values
print('My name is ', name, '.', sep='')
My name is Tom.
# using an f-string
text = f'My name is {name}.'   # text is a string
print(text)
My name is Tom.
# multiple placholders
print(f'My name is {name} and I am from {country}. I am {age} years old.')
My name is Tom and I am from Spain. I am 25 years old.
  • Algebraic operations can be performed inside curly brackets

 print(f'I will be {age+1} years old next year.')
I will be 26 years old next year.
  • Rounding can be performed inside curly brackets

# round() function

print(f'My weight is {weight}.')
print(f'My rounded weight is {round(weight,2)}.')
My weight is 173.6294.
My rounded weight is 173.63.
# rounding by a different way

print(f'My weight is {weight}.')
print(f'My rounded weight is {weight:.2f}.') # 2 means second decimal place (hundredth), f means float
My weight is 173.6294.
My rounded weight is 173.63.

Quick Check!

Find the output of the following code:

percentage, order, country = 0.25, 5, 'France'
print(f'{percentage}, order, {country}')

Solution

0.25, order, France

Unicode Characters#

These are symbols, accented letters, non-Latin characters, and emojis—kind of different characters.

  • You can find the list of Unicode characters on the official website of Unicodes.

  • Each Unicode character has a code that is unique to it.

    • You can search for a specific Unicode code from the Unicode character charts.

    • If the code has four characters, use

      • \uXXXX where XXXX is the code.

    • If a code has four characters or more, pad it with 0 from the left to make the length of the code eight and use:

      • \Uxxxxxxxx where xxxxxxxx is the 0-padded code.

Example: If the Unicode is 1F639, three zeros are added to the left.

print('\U0001F639')
😹

Example: If the Unicode is 1F602, three zeros are added to the left.

print('\U0001F602')
😂

Example: If the Unicode is 2764, no padding is necessary.

print('\u2764')

Example: Padding can also be applied if the Unicode is 2764.

print('\U00002764')

Operations on strings#

Concatenation#

The + operator is used to concatenate two strings.

  • String + String: combines two strings.

x = 'John'
y = 'Steinbeck'
# concatenation of x and y

author = x+y
print(author)
JohnSteinbeck
  • A single space ' ' can be inserted between x and y.

name = x + ' ' + y
print(name)
John Steinbeck

Repetition#

The * operator is used to repeat a string a certain number of times.

  • String * Integer or Integer * String makes copies of the string Integer many times.

  • Floats cannot be used for repetitions.

# four copies of x
x = 'John'
fourjohns = x*4
print(fourjohns)
JohnJohnJohnJohn
  • In the following code, there is an error due to the attempt to use the float value 4.3 in a repetition operation.

# ERROR: float * str
4.3*'Hi'

Example: Forming a triangle pattern with the ‘$’ character using repetitions.

print(' '*6+'$')
print(' '*5+'$'*2)
print(' '*4+'$'*3)
print(' '*3+'$'*4)
print(' '*2+'$'*5)
print(' '*1+'$'*6)
print(' '*0+'$'*7)
      $
     $$
    $$$
   $$$$
  $$$$$
 $$$$$$
$$$$$$$

Quick Check!

Find the output of the following code.

print('$')
print('$'*2)
print('$'*3)
print('$'*4)
print('$'*5)
print('$'*6)
print('$'*7)

Solution

$
$$
$$$
$$$$
$$$$$
$$$$$$
$$$$$$$

Length Function#

The built-in len() function returns the number of characters in a string.

  • The space ‘ ‘ is a character.

  • Escaping characters such as ‘\n’ and ‘\t’ are counted as single characters.

Example: The string ‘hello’ contains 5 characters.

print(len('hello'))
5

Example: The string ‘hel lo’ contains 6 characters, including the space.

print(len('hel lo'))
6

Example: The string ‘hel\nlo’ contains 6 characters, including the newline character.

print(len('hel\nlo'))
6

Quick Check!

Find the output of the following code.

print(len('one day.'))

Solution

8

String Indexing#

Indexing is used to access individual characters or sets of characters.

  • Indexing starts with zero.

  • The index of the first character is 0.

  • The index of the second character is 1, and so on.

  • The index is written in square brackets: string[index].

  • Negative numbers can also be used for indexing.

  • The index of the last character is -1.

  • The index of the second character from the end is -2, and so on.

The list of positive and negative indexes for each character in the word "CALIFORNIA":Index of C is 0 or -10,Index of the first A is 1 or -9,Index of L is 2 or -8,Index of the first I is 3 or -7,Index of F is 4 or -6,Index of O is 5 or -5,Index of R is 6 or -4,Index of N is 7 or -3,Index of the second I is 8 or -2,Index of the second A is 9 or -1.The length CALIFORNIA is 10.

  • index of C is 0 or -10

  • index of first A is 1 or -9

  • index of second A is 9 or -1

Warning: There is a character with index -10 (the second ‘A’), but there is no character with index positive 10 because indexing starts with 0, and it does not reach 10, which is the length of the string.

  • For any string, there is no character with an index equal to the length of the string.

state = 'CALIFORNIA'
# character at index 0 
print(state[0])
C
# character at index 6 
print(state[6])
R
# character at index -1 
print(state[-1])
A
# character at index -3 
print(state[-3])
N
  • There is an error in the following code because there is no character with index 10.

# ERROR: out of range
state[10]
  • The length of state is 10, and there is an error in the following code. This applies to all strings.

# ERROR: out of range
state[len(state)]
  • There is no error in the following code because len(state) - 1 = 9 is the index of the last character.

print(state[len(state)-1])
A

Quick Check!

Find the output of the following code:

state = 'arizona'
print(state[3])
print(state[-5])

Solution

z
i

String Slices#

You can access more than one character of a string by using index numbers.

  • It is in the form of string[start: end] with inclusive start and exclusive end.

  • Use : (colon) inside square brackets between the start and end indexes.

  • It consists of characters starting with index start up to the character with index end-1.

  • The character with index end is not included.

  • It returns a substring.

  • Examples:

    • string[2:5] returns characters with indexes 2, 3, 4 (5 is not included).

    • string[-4:-1] returns characters with indexes -4, -3, -2 (-1 is not included).

The list of positive and negative indexes for each character in the word "CALIFORNIA":Index of C is 0 or -10,Index of the first A is 1 or -9,Index of L is 2 or -8,Index of the first I is 3 or -7,Index of F is 4 or -6,Index of O is 5 or -5,Index of R is 6 or -4,Index of N is 7 or -3,Index of the second I is 8 or -2,Index of the second A is 9 or -1.

state = 'CALIFORNIA'
print(state[2:5])  # index=2,3,4
LIF
state = 'CALIFORNIA'
print(state[-4:-1])  # index=-4,-3,-2
RNI

Default start and end values:

  • string[:end]: the default value of start is 0, which means it starts from the very beginning.

    • For example, string[:5] returns characters with indexes 0, 1, 2, 3, 4 (5 is not included).

  • string[start:]: the default value of end is the length, which means it goes all the way to the end.

    • For example, string[2:] returns characters with indexes 2, 3, 4, 5, 6, 7, 8, 9 (all characters starting from index 2).

  • string[:]: starting from the very beginning and going all the way to the end, representing the whole string.

state = 'CALIFORNIA'
print(state[2:])  # index = 2,3,4,5,6,7,8,9
LIFORNIA
state = 'CALIFORNIA'
print(state[:5])  # index = 0,1,2,3,4
CALIF
state = 'CALIFORNIA'
print(state[:])  # index = all of them = 0,1,2,...,9
CALIFORNIA

Quick Check!

Find the output of the following code:

state = 'arizona'
print(state[3:5])
print(state[2:])
print(state[:3])

Solution

zo
izona
ari

Step parameter

Slicing can also be done by taking steps in the form of: string[start: end: step].

  • string[start: end: step] means starting with the character at index = start up to the character at index = length - 1, as before, but not necessarily including all characters between them.

  • The first index is start, the second index is start + step, and the third index is start + 2 * step.

  • It continues in this way, but the largest index can be at most length - 1.

  • step can also be considered as an increment, but it can also be a negative number.

  • The default value of step is 1.

state = 'CALIFORNIA'
print(state[2:7:2])  # index = 2,2+2=4, 4+2=6
LFR
state = 'CALIFORNIA'
print(state[1:8:3])  # index = 1,1+3=4, 4+3=7
AFN
state = 'CALIFORNIA'
print(state[7:2:-2])  # index = 7,7+(-2)=5,5+(-2)=3 
NOI
state = 'CALIFORNIA'
print(state[-8:-2:3])  # index = -8, -8+3=-5
LO

Warning:

  • For negative step default value of start is 9 (-1)

  • For negative step default value of end is 0 (-10)

state = 'CALIFORNIA'
print(state[::-1])  # index = 9,8,...,0  
AINROFILAC
state = 'CALIFORNIA'
print(state[-3::-1])  # index = -3,-4,...,-10
NROFILAC
state = 'CALIFORNIA'
print(state[:-4:-1])  # index = 9,8,7 or -1,-2,-3
AIN

Quick Check!

Find the output of the following code:

state = 'arizona'
print(state[1::2])

Solution

rzn

String module#

It contains functions to process strings, as well as some constants.

  • Use help(string) for more explanations.

# constants and functions

import string
print(dir(string))
['Formatter', 'Template', '_ChainMap', '__all__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', '_re', '_sentinel_dict', '_string', 'ascii_letters', 'ascii_lowercase', 'ascii_uppercase', 'capwords', 'digits', 'hexdigits', 'octdigits', 'printable', 'punctuation', 'whitespace']
# lowercase letters
print(string.ascii_lowercase)
abcdefghijklmnopqrstuvwxyz
# lowercase and uppercase letters
print(string.ascii_letters)
abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
# digits
print(string.digits)
0123456789
# punctuations
print(string.punctuation)
!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~

Immutable#

Strings are immutable, which means they cannot be modified.

  • For example, if you try to change the first character of ‘CALIFORNIA’, you will get an error message.

# ERROR:  try to change the first character of state
state = 'CALIFORNIA'
state[0] = 'R'  

You can use the state variable to produce a new string without changing the original one. In the following code:

  • The value of the variable new_state is the concatenation of the string ‘R’ and a slice of state starting from the character at index 1 and continuing to the end.

state = 'CALIFORNIA'
new_state = 'R' + state[1:]   

print(new_state)
RALIFORNIA
  • In the following code, a new value is assigned to the variable state. The string ‘CALIFORNIA’ is not modified.

state = 'CALIFORNIA'
state = 'R' + state[1:]   

print(state)
RALIFORNIA

in and not in#

These operators are used to check if a character or slice is present in a string.

  • They return a boolean value: True or False.

  • Python is case-sensitive.

print( 'a' in 'FLORIDA' )  # 'a' is not in FLORIDA
False
print( 'a' not in 'FLORIDA' )  # 'a' is not in FLORIDA
True
print( 'A' in 'FLORIDA' )  # 'A' is in FLORIDA
True
print( 'A' not in 'FLORIDA' )  # 'A' is not in FLORIDA
False
print( 'ORI'  in 'FLORIDA' )  # 'ORI' is in FLORIDA
True
print( 'ORI' not in 'FLORIDA' )  # 'ORI' is not in FLORIDA
False

Quick Check!

Write a program to verify if the character 'r' exists in the word 'Radio'.

Solution

print('r' in 'Radio')

String Methods#

String methods do not modify the original string because strings are immutable.

  • String methods return a new value.

  • If you run dir(str), you will see that there are many methods because there can be so many things that can be done with strings.

  • We will cover some of them here, but you can check help(str) for more details.

print(dir(str))
['__add__', '__class__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mod__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmod__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'capitalize', 'casefold', 'center', 'count', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'format_map', 'index', 'isalnum', 'isalpha', 'isascii', 'isdecimal', 'isdigit', 'isidentifier', 'islower', 'isnumeric', 'isprintable', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'maketrans', 'partition', 'removeprefix', 'removesuffix', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']

capitalize()#

  • Produces a duplicate of the string where only the initial character is in uppercase, while all other characters are converted to lowercase.

text = 'tOm aNd jerRy.'
print(text.capitalize())   # 't' is capitalized, a new string is produced
print(text)                # no change on text (immutable)
Tom and jerry.
tOm aNd jerRy.

upper()#

  • Produces a duplicate of the string where all characters are converted to uppercase.

text = 'tOm aNd jerRy.'
print(text.upper())        # all characters are in uppercase
print(text)                # no change on text (immutable)
TOM AND JERRY.
tOm aNd jerRy.

lower()#

  • Produces a duplicate of the string where all characters are converted to lowercase.

text = 'tOm aNd jerRy.'
print(text.lower())        # all characters are in lowercase
print(text)                # no change on text (immutable)
tom and jerry.
tOm aNd jerRy.

title()#

  • Produces a duplicate of the string where all words are capitalized.

text = 'tOm aNd jerRy.'
print(text.title())        # all words are capitalized
print(text)                # no change on text (immutable)
Tom And Jerry.
tOm aNd jerRy.

Quick Check!

Display the capitalized, uppercase, and lowercase forms of the string 'aBcDef'

Solution

print('aBcDef'.capitalize())
print('aBcDef'.upper())
print('aBcDef'.lower())

find()#

  • It provides the earliest occurrence of a given substring within a string.

  • It returns the lowest index.

  • If the substring is not present, it returns -1.

  • Additionally, you have the option to begin the search from a specific character to find the index of the given substring.

    • find('a', N): find index of first ‘a’ starting from index=N

    • default value of N is 0

state = 'CALIFORNIA'
print(state.find('L')) # index of 'L' is 2
2
  • In the following code, the value -1 indicates that ‘W’ does not exist. It does not represent an index.

state = 'CALIFORNIA'
print(state.find('W')) 
-1
state = 'CALIFORNIA'
print(state.find('A')) # index of first 'A'
1
  • In the following code, the find() method returns the index of the first occurrence of ‘A’, starting from the character at index 3.

state = 'CALIFORNIA'
print(state.find('A', 3)) 
9
# 'FOR' begins from the character at index 4.

state = 'CALIFORNIA'
print(state.find('FOR')) 
4
# 'WE' does not exist in CALIFORNIA

state = 'CALIFORNIA'
print(state.find('WE')) 
-1

Quick Check!

Write a program that returns the index of the second occurrence of ‘A’ in the word ‘CALIFORNIA’.

Solution

'CALIFORNIA'.find('A', 2)

rfind()#

  • It returns the maximum index in a string where the substring is located.

state = 'CALIFORNIA'
print(state.find('A'))   # index of first 'A'
print(state.rfind('A'))  # index of last 'A'
1
9

strip(), rstrip(), lstrip()#

  • strip(): Removes white spaces from the beginning and end of a string.

  • rstrip(): Removes white spaces from the end of a string.

  • lstrip(): Removes white spaces from the beginning of a string.

country = '  GERMANY   '
print('---' + country + '---')          
print('---' + country.strip() + '---')     # white spaces on the left and right are removed
print('---' + country.rstrip()+ '---')     # white spaces on the right are removed
print('---' + country.lstrip()+ '---')     # white spaces on the left  are removed
---  GERMANY   ---
---GERMANY---
---  GERMANY---
---GERMANY   ---

startswith()#

  • It returns True if the string starts with the specified prefix; otherwise, it returns False.

# 'CALIFORNIA' does not startswith 'H'

state = 'CALIFORNIA'
print(state.startswith('H')) 
False
# 'CALIFORNIA' does startswith 'C'

state = 'CALIFORNIA'
print(state.startswith('C')) 
True

endswith()#

  • It returns True if the string end with the specified prefix; otherwise, it returns False.

# 'CALIFORNIA' does not endswith 'H'

state = 'CALIFORNIA'
print(state.endswith('H')) 
False
# 'CALIFORNIA' does endswith 'A'

state = 'CALIFORNIA'
print(state.endswith('A')) 
True

Quick Check!

Write a program to verify

  • if the first letter of the word 'table' is 't'

  • if the last letter of the word 'table' is 'k'

Solution

print('table'.startswith('t'))
print('table'.endswith('k'))

count()#

  • Returns the number of non-overlapping occurrences of a substring within a string

# number of 'A's in 'CALIFORNIA'

state = 'CALIFORNIA'
print(state.count('A'))   
2
# number of 'I's in CALIFORNIA'

state = 'CALIFORNIA'
print(state.count('I'))   
2
 # number of 'C's in 'CALIFORNIA'

state = 'CALIFORNIA'
print(state.count('C'))  
1
# number of 'C's in 'CALIFORNIA'

state = 'CALIFORNIA'
print(state.count('W'))   
0

Quick Check!

Find the number of occurences of 'a' in 'abbabba'

Solution

print('abbabba'.count('a'))

isdigit()#

  • Returns True if the string consists of digits, False otherwise.

# not all characters are digits

print('hello'.isdigit())         
False
# all characters are digits

print('123456'.isdigit())       
True
# not all characters are digits

print('h1234'.isdigit())       
False

isalpha()#

  • Returns True if the string consists of alphabetic characters, False otherwise.

# all characters are alphabetic

print('hello'.isalpha())         
True
# not all characters are alphabetic

print('123456'.isalpha())        
False
# not all characters are alphabetic

print('h1234'.isalpha())         
False

replace()#

  • Returns a duplicate with all occurrences of the old substring replaced by the new one.

  • It is in the form of replace(old, new)

print(state)
print(state.replace('A', 'W')) # replace 'A' by 'W'
CALIFORNIA
CWLIFORNIW
print(state)
print(state.replace('T', 'W')) # no 'T' to replace by 'W'
CALIFORNIA
CALIFORNIA
print(state)
print(state.replace('LI', '***')) # replace 'LI' by '***'
CALIFORNIA
CA***FORNIA

swapcase()#

  • Transform uppercase characters to lowercase and lowercase characters to uppercase.

name = 'aRThUr'

print(name)
print(name.swapcase()) # 'a' becomes 'A', 'R' becomes 'r', and so on
aRThUr
ArtHuR

join()#

  • Concatenate a list of strings.

  • Insert the string, whose method is called, between each given string.

  • Return the result as a new string.

  • Example: '--'.join(['ab', 'pq', 'rs']) returns 'ab--pq--rs'

print('--'.join(['ab', 'pq', 'rs']))
ab--pq--rs

STRING METHODS

String

Method

Output

‘hoW are yOu’

capitalize()

‘How are you’

‘hoW are yOu’

upper()

‘HOW ARE YOU’

‘hoW are yOu’

lower()

‘how are you’

‘hoW are yOu’

title()

‘How Are You’

‘eraser’

find(‘s’)

3

‘eraser’

find(‘e’)

0

‘eraser’

find(‘e’,2)

4

‘eraser’

find(‘w’)

-1

‘eraser’

rfind(‘s’)

3

‘eraser’

rfind(‘e’)

4

‘  table      ‘

strip()

‘table’

‘  table      ‘

rstrip()

‘  table’

‘  table      ‘

lstrip()

‘tablenbsp;    ‘

‘desk’

startswith(‘d’)

True

‘desk’

startswith(‘w’)

False

‘desk’

endswith(‘k’)

True

‘desk’

endswith(‘x’)

False

‘cross-section’

count(‘s’)

3

‘cross-section’

count(‘w’)

0

‘1234’

isdigit()

True

‘1234s’

isdigit()

False

‘chair’

isalpha()

True

‘chair1’

isalpha()

False

‘python’

replace(‘p’, ‘r’)

‘rython’

‘python’

replace(‘w’, ‘r’)

‘python’

‘cAr’

swapcase()

‘CaR’

‘???’

join([‘a’,’b’,’c’])

‘a???b???c’

Parsing Strings#

By using string methods, you can analyze a string and extract meaningful information about the string.

  • You can also perform specific operations based on the structure and content of the string.

  • Example:

    • From the given message below, extract the company name and capitalize it.

    • The company name is between the characters @ and .

    • You can use the find() method to find the indexes of these two characters.

    • There are multiple . characters, so we need to find the first one that comes after @.

message = 'Hello. My name is Tom. I live in California. My email address is tom@tesla.com. I will be in NY next week.'
index_at = message.find('@')                    # index of @
index_period = message.find('.', index_at)      # index of first . after @ 
  • To extract the company name, we need to use slicing.

  • Slicing must start from index_at + 1 because if you start from index_at, the slice will include @.

  • Slicing must end at index_period because the end index is not included.

print(message[index_at+1:index_period].capitalize())
Tesla