3. The Basics of Python

3.1. Script [bases_01]: Basic Operations

The script [bases_01] introduces the basic features of Python.


# ----------------------------------
def display(string):
    # displays string
    print("string=%s" % string)


# ----------------------------------
def display_type(variable):
    # displays the variable type
    print("type[%s]=%s" % (variable, type(variable)))


# ----------------------------------
def f1(param):
    # adds 10 to param
    return param + 10


# ----------------------------------
def f2():
    # returns a tuple of 3 values
    return "one", 0, 100


# -------------------------------- main program ------------------------------------
# This is a comment
# variable used without being declared
name = "dupont"

# a screen display
print("name=%s" % name)

# a list with elements of different types
list = ["one", "two", 3, 4]

# its number of elements
n = len(list)

# a loop
for i in range(n):
    print("list[%d]=%s" % (i, list[i]))

# initializing 2 variables with a tuple
(string1, string2) = ("string1", "string2")

# concatenating the two strings
string3 = string1 + string2

# display result
print("[%s,%s,%s]" % (string1, string2, string3))

# using the function
display(string1)

# the type of a variable can be determined
display_type(n)
display_type(string1)
display_type(list)

# the type of a variable can change during execution
n = "has changed"
display_type(n)

# a function can return a result
res1 = f1(4)
print("res1=%s" % res1)

# a function can return a list of values
(res1, res2, res3) = f2()
print("(res1,res2,res3)=[%s,%s,%s]" % (res1, res2, res3))

# we could have stored these values in a variable
list = f2()
for i in range(len(list)):
    print("list[%s]=%s" % (i, list[i]))

# some tests
for i in range(len(list)):
    # only displays strings
    if type(list[i]) == "str":  # always False: type() returns the class str, not the string "str"
        print("list[%s]=%s" % (i, list[i]))

# more tests
for i in range(len(list)):
    # only displays integers >10
    if type(list[i]) == "int" and list[i] > 10:  # always False for the same reason
        print("list[%s]=%s" % (i, list[i]))

# a while loop
list = (8, 5, 0, -2, 3, 4)
i = 0
sum = 0
while i < len(list) and list[i] > 0:
    print("list[%s]=%s" % (i, list[i]))
    sum += list[i]   # sum = sum + list[i]
    i += 1  # i=i+1
print("sum=%s" % sum)
# end of program

Comments

  • line 2: the keyword def defines a function;
  • line 2: the function receives the parameter [string]. The parameter type is not specified. Python always passes a reference to an object, and this reference is passed by value. What the function can do with it depends on the data:
    • for an immutable type (number, boolean, string, tuple, etc.), the function cannot change the caller's data, so everything happens as if the value itself (4, True, etc.) had been passed;
    • for a mutable type (list, dictionary, class instance, etc.), the function receives the address of the data and can modify it in place;
  • lines 3–4: the body of the function. It is indented to the right (four spaces here). It is this indentation, combined with the : character of the def statement, that defines the function’s body. This applies to all statements with a body: if, else, while, for, try, except;
  • line 4: the syntax used here is [print('text1%F1text2%F2…' % (data1, data2, …))]:
    • the [%Fi] (here %s) are display formats:
      • %s (string): for a string;
      • %d (decimal): for signed decimal integers;
      • %f (float): decimal format for real numbers;
      • %e (exponential): exponential format for a real number;
    • [data1, data2, …] are the expressions whose values you want to display:
      • [data1] will be displayed using the F1 format;
      • [data2] will be displayed using the F2 format;
    • when there is a single expression, as on line 4, the parentheses can be omitted;
  • line 10: Python manages variable types internally. You can determine a variable’s type using the type(variable) function, which returns an object of type 'type' (the class of the variable). The expression '%s' % type(variable) is a string representing the variable’s type;
  • line 25: the main program. This usually (but not necessarily) comes after the definition of all the script’s functions. Its content is not indented;
  • line 28: in Python, variables are not declared. Python is case-sensitive: the variable `Name` is different from the variable `name`. A string can be enclosed in double quotes " or single quotes '. You can therefore write `'dupont'` or `"dupont"`;
  • line 34: there is a difference between a tuple (1,2,3) (note the parentheses) and a list [1,2,3] (note the square brackets). A tuple is immutable, whereas a list is mutable. In both cases, element number i is denoted as [i];
  • line 40: range(n) produces the integers 0, 1, 2, …, n-1 one after the other. In Python 3 it is a range object, not a tuple or a list;
  • line 41: the %d format is used for signed integers;
  • line 74: len(var) is the number of elements in the var collection (tuple, list, dictionary, etc.);
  • line 84: the [for … in …] structure allows you to iterate over an iterable structure. Lists and tuples are iterable;
  • line 86: the Boolean operator and is used here. The other Boolean operators are or and not;
  • line 93: the loop adds up the elements of the list as long as they are greater than 0 and stops at the first element that is not (here 0);
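
The remark on parameter passing can be illustrated with a short sketch. The functions rebind and mutate are hypothetical helpers that are not part of [bases_01]; they only contrast rebinding a parameter with modifying the object it refers to:

```python
# Hypothetical helpers (not part of bases_01) illustrating parameter passing.
def rebind(number):
    # rebinding the local name has no effect on the caller's variable
    number = number + 10
    return number


def mutate(items):
    # modifying the object itself is visible to the caller
    items.append(99)


n = 4
result = rebind(n)
print(f"n={n}, result={result}")  # n is still 4

values = [1, 2, 3]
mutate(values)
print(f"values={values}")  # the caller's list now ends with 99
```

An integer cannot be modified in place, so the caller never sees a change; a list can, so the caller does.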

The screen output is as follows:


C:\Data\st-2020\dev\python\cours-2020\python3-flask-2020\venv\Scripts\python.exe C:/Data/st-2020/dev/python/cours-2020/python3-flask-2020/bases/bases_01.py
name=dupont
list[0]=one
list[1]=two
list[2]=3
list[3]=4
[string1,string2,string1string2]
string=string1
type[4]=<class 'int'>
type[string1]=<class 'str'>
type[['one', 'two', 3, 4]]=<class 'list'>
type[has changed]=<class 'str'>
res1=14
(res1,res2,res3)=[one,0,100]
list[0]=one
list[1]=0
list[2]=100
list[0]=8
list[1]=5
sum=13

Process finished with exit code 0
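
Note that the two test loops of the script display nothing: type(x) returns a class, and a class is never equal to a string such as "str". The following sketch, which reuses the values returned by f2 but is otherwise not part of [bases_01], shows one possible way to write these tests:

```python
# Sketch: testing the type of a value (data mirrors the tuple returned by f2)
data = ("one", 0, 100)

# comparing the class with the string "str" never succeeds
wrong = [x for x in data if type(x) == "str"]
# comparing the class with the class str works
right = [x for x in data if type(x) == str]
# isinstance is the more common idiom; it also accepts subclasses
strings = [x for x in data if isinstance(x, str)]
big_ints = [x for x in data if isinstance(x, int) and x > 10]

print(f"wrong={wrong}")
print(f"right={right}")
print(f"strings={strings}")
print(f"big_ints={big_ints}")
```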

3.2. Script [bases_02]: Formatted Strings

Python 3.6 introduced a new way to format strings, formatted string literals (f-strings):


# formatted strings
# the formats are those of the C language
# integer
int1 = 10
print(f"[int1={int1}]")
print(f"[int1={int1:4d}]")
print(f"[int1={int1:04d}]")
# float
float1 = 8.2
print(f"[float1={float1}]")
print(f"[float1={float1:8.2f}]")
print(f"[float1={float1:.3e}]")
# string
str1 = "abcd"
print(f"[str1={str1}]")
print(f"[str1={str1:8s}]")
str2 = "jean de florette"
print(f"[{str2:20.10s}]")
# Formatted strings can be assigned to variables
str3 = f"[{str2:20.10s}]"
print(str3)

The syntax for the formatted string is as follows:

f'…{expr1:format1} …. {expr2:format2} ….'

where:

  • [expri]: an expression;
  • [formati]: the format of the expression [expri]. These formats are close to those of the C language, written here without the % sign:
    • d: for integers;
    • f: decimal notation for real numbers;
    • e: exponential notation for real numbers;
    • s: for character strings;
    • when no format is specified, the default representation of [expri] is used (for the types used here, the one produced by str(expri));
    • nd, nf, ns: displays [expri] over at least n characters. A shorter value is padded with spaces (on the left for numbers, on the right for strings); a longer value is displayed in full and is never truncated;
    • .ps: keeps at most the first p characters of a string;
  • line 7: [04d], an integer displayed over 4 characters and padded on the left with zeros;
  • line 11: [8.2f], a decimal floating-point number displayed over 8 characters with 2 digits after the decimal point;
  • line 12: [.3e], a floating-point number in exponential form with 3 decimal places for the mantissa;
  • line 18: [20.10s], the first 10 characters of a string, padded with spaces to make 20 characters;
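
The difference between the width and the precision of a format can be checked with a small sketch. The variable names are chosen for the illustration and do not come from [bases_02]:

```python
# Sketch: the width is a minimum, the precision is what truncates a string
word = "abcdefghij"

padded = f"[{word:4s}]"           # width 4 is smaller than the string: nothing is cut
truncated = f"[{word:.4s}]"       # precision 4: only the first 4 characters are kept
both = f"[{word:8.4s}]"           # first 4 characters, then padded to 8 characters
number = f"[{1234567:4d}]"        # an integer wider than the field is displayed in full
right_aligned = f"[{word:>12s}]"  # > forces right alignment (strings are left-aligned by default)

for text in (padded, truncated, both, number, right_aligned):
    print(text)
```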

The results of the execution are as follows:

C:\Data\st-2020\dev\python\cours-2020\python3-flask-2020\venv\Scripts\python.exe C:/Data/st-2020/dev/python/cours-2020/python3-flask-2020/bases/bases_02.py
[int1=10]
[int1=  10]
[int1=0010]
[float1=8.2]
[float1=    8.20]
[float1=8.200e+00]
[str1=abcd]
[str1=abcd    ]
[jean de fl          ]
[jean de fl          ]

Process finished with exit code 0

3.3. Script [bases_03]: Type conversions

Here we focus on type conversions involving data of type str (string), int (integer), float (floating-point), and bool (boolean).


# type conversions
# int --> str, float, bool
x = 4
print(x, type(x))
x = str(4)
print(x, type(x))
x = float(4)
print(x, type(x))
x = bool(4)
print(x, type(x))

# bool --> int, float, str
x = True
print(x, type(x))
x = int(True)
print(x, type(x))
x = float(True)
print(x, type(x))
x = str(True)
print(x, type(x))

# str --> int, float, bool
x = "4"
print(x, type(x))
x = int("4")
print(x, type(x))
x = float("4")
print(x, type(x))
x = bool("4")
print(x, type(x))

# float --> str, int, bool
x = 4.32
print(x, type(x))
x = str(4.32)
print(x, type(x))
x = int(4.32)
print(x, type(x))
x = bool(4.32)
print(x, type(x))

# Handling type conversion errors
try:
    x = int("abc")
    print(x, type(x))
except ValueError as error:
    print(error)

# various cases
x = bool("abc")
print(x, type(x))
x = bool("")
print(x, type(x))
x = bool(0)
print(x, type(x))
x = None
print(x, type(x))
x = bool(None)
print(x, type(x))
x = bool(0.0)
print(x, type(x))

# All data are class instances and, as such, have methods
# string
str1 = "abc"
print(str1.capitalize())
# integer
int1 = 4
print(int1.bit_length())
# boolean
bool1 = True
print(bool1.conjugate())
# floating-point number
float1 = 8.2
print(float1.is_integer())

Many type conversions are possible. Some may fail, such as the one on line 44, which attempts to convert the string 'abc' into an integer. We handled the error using a try/except block (lines 43–47). A general form of this block is as follows:


try:
    actions
except Exception as ex:
    actions
finally:
    actions

If one of the actions in the try block throws an exception (signals an error), the rest of the try block is skipped and control immediately jumps to the except clause. If the actions in the try block do not throw an exception, the except clause is ignored. The Exception and ex parts of the except statement are optional. When present, Exception specifies the type of exception intercepted by the except clause (an exception of another type is not intercepted by it), and ex refers to the exception that occurred. There can be multiple except clauses if you want to handle different types of exceptions within the same try block.

The finally statement is optional. If present, the actions in the finally block are always executed, regardless of whether an exception occurred or not.
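
As an illustration, here is a minimal sketch with two except clauses and a finally clause. The function to_int and its log list are hypothetical and only serve to record which clauses were executed:

```python
# Hypothetical helper: converts text to an integer and records the clauses executed
def to_int(text):
    log = []
    try:
        value = int(text)
        log.append("converted")
    except ValueError as error:
        # the text is a string that does not represent an integer
        value = None
        log.append(f"ValueError: {error}")
    except TypeError as error:
        # the argument is not a string or a number at all
        value = None
        log.append(f"TypeError: {error}")
    finally:
        # executed in every case
        log.append("finally")
    return value, log


print(to_int("12"))
print(to_int("abc"))
print(to_int(None))
```

In the three calls, the last entry of the log is always "finally", whether the conversion succeeded or not.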

We’ll come back to exceptions a little later.

Lines 49–61 show various attempts to convert data of type str, int, float, and NoneType to boolean. This is always possible. The rules are as follows:

  • bool(int i) is False if i is 0, True in all other cases;
  • bool(float f) is False if f is 0.0, True in all other cases;
  • bool(str string) is False if string has 0 characters, True in all other cases;
  • bool(None) is False. None is a special value that means the variable exists but has no value.
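
These rules can be verified with a short sketch. The list of sample values is chosen for the illustration; note in particular that the strings "0" and "False" are not empty and are therefore converted to True:

```python
# Sketch: values converted to False by bool()
samples = [0, 1, -1, 0.0, 0.1, "", " ", "0", "False", None]

for value in samples:
    print(f"bool({value!r}) = {bool(value)}")

# only zero numbers, the empty string and None are false
falsy = [value for value in samples if not value]
print(f"falsy={falsy}")
```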

The screen output is as follows:

C:\Data\st-2020\dev\python\cours-2020\python3-flask-2020\venv\Scripts\python.exe C:/Data/st-2020/dev/python/cours-2020/python3-flask-2020/bases/bases_03.py
4 <class 'int'>
4 <class 'str'>
4.0 <class 'float'>
True <class 'bool'>
True <class 'bool'>
1 <class 'int'>
1.0 <class 'float'>
True <class 'str'>
4 <class 'str'>
4 <class 'int'>
4.0 <class 'float'>
True <class 'bool'>
4.32 <class 'float'>
4.32 <class 'str'>
4 <class 'int'>
True <class 'bool'>
invalid literal for int() with base 10: 'abc'
True <class 'bool'>
False <class 'bool'>
False <class 'bool'>
None <class 'NoneType'>
False <class 'bool'>
False <class 'bool'>
Abc
3
1
False

Process finished with exit code 0

Note that all data are objects, i.e., class instances. This means they can have methods. This is shown in lines 63–75 of the code. We are not trying here to explain what the methods do, but simply to show that they exist.

3.4. Script [bases_04]: variable scope

The script [bases_04] shows that Python does not have the concept of block-scoped variables:


# variable scope
i = 4
if True:
    i += 1
    j = 7
print(f"i={i}, j={j}")

Results

C:\Data\st-2020\dev\python\cours-2020\python3-flask-2020\venv\Scripts\python.exe C:/Data/st-2020/dev/python/cours-2020/python3-flask-2020/bases/bases_04.py
i=5, j=7

Process finished with exit code 0

Comments

The results show two things:

  • line 4: the variable [i] in the [if] block is the same as the variable i used on line 2;
  • line 6: the variable [j] is the one initialized in the [if] block;

In some languages, where variables are declared, a variable defined within a block (such as the one in lines 3–5) is not known outside of it. In Python, this is not the case.
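
A function body, on the other hand, does define a scope. The following sketch is not part of [bases_04]; the function compute and its variables are hypothetical:

```python
# Sketch: if and for blocks do not create a scope, a function does
def compute():
    base = 10
    for k in range(3):
        if k % 2 == 0:
            last_even = k  # created inside an if nested in a for
    # k and last_even are still known here
    return base + k + last_even


total = compute()
print(f"total={total}")

# base is local to compute(): it is unknown at module level
try:
    print(base)
    base_is_visible = True
except NameError:
    base_is_visible = False
print(f"base_is_visible={base_is_visible}")
```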

3.5. Script [bases_05]: Lists - 1

The script [bases_05] is as follows:


# 1-dimensional lists
# initialization
list1 = [0, 1, 2, 3, 4, 5]

# iteration - 1
print(f"list1 has {len(list1)} elements")
for i in range(len(list1)):
    print(f"list1[{i}]={list1[i]}")

list1[1] = 10
# Iteration - 2
print(f"list1 has {len(list1)} elements")
for element in list1:
    print(element)

# adding two elements
list1[len(list1):] = [10, 11]
# the %s format displays the list on a single line
print("%s" % list1)

# removing the last two elements
list1[len(list1) - 2:] = []
# The default format displays the list on a single line
print(f"{list1}")

# Add a list to the beginning of the list
list1[:0] = [-10, -11, -12]
print(f"{list1}")

# Insert two elements in the middle of the list
list1[3:3] = [100, 101]
print(f"{list1}")

# Remove one element from the middle of the list (the slice 3:4 contains only element 3)
list1[3:4] = []
print(f"{list1}")

Notes:

  • the notation list1[i:j] refers to elements i through j-1 of the list;
  • the notation list1[i:] refers to element i and all subsequent elements of the list;
  • the notation list1[:i] refers to elements 0 through i-1 of the list;
  • line 19: print("%s" % list1) displays the string "[list1[0], list1[1], …, list1[n-1]]";
  • line 24: print(f"{list1}") does the same thing;
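
The slice notation can be tried on a small list. This sketch uses a hypothetical list named letters, not the list1 of the script:

```python
# Sketch: reading and assigning slices
letters = ["a", "b", "c", "d", "e"]

middle = letters[1:3]  # elements 1 and 2
tail = letters[3:]     # element 3 and the following ones
head = letters[:2]     # elements 0 and 1
copy = letters[:]      # a new list with the same elements

# assigning to a slice replaces these elements, whatever the length of the new list
letters[1:3] = ["X"]
# assigning to the empty slice at the end appends elements
letters[len(letters):] = ["y", "z"]

print(middle, tail, head)
print(copy)
print(letters)
```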

Results


C:\Data\st-2020\dev\python\cours-2020\python3-flask-2020\venv\Scripts\python.exe C:/Data/st-2020/dev/python/cours-2020/python3-flask-2020/bases/bases_05.py
list1 has 6 elements
list1[0]=0
list1[1]=1
list1[2]=2
list1[3]=3
list1[4]=4
list1[5]=5
list1 has 6 elements
0
10
2
3
4
5
[0, 10, 2, 3, 4, 5, 10, 11]
[0, 10, 2, 3, 4, 5]
[-10, -11, -12, 0, 10, 2, 3, 4, 5]
[-10, -11, -12, 100, 101, 0, 10, 2, 3, 4, 5]
[-10, -11, -12, 101, 0, 10, 2, 3, 4, 5]

Process finished with exit code 0

3.6. Script [bases_06]: lists - 2

The previous code can be written differently (bases_06) using certain list methods:


# 1-dimensional lists

# initialization
list1 = [0, 1, 2, 3, 4, 5]

# iteration - 1
print(f"list1 has {len(list1)} elements")
for i in range(len(list1)):
    print(f"list1[{i}]={list1[i]}")

# modifying an element
list1[1] = 10

# iteration - 2
print(f"list1 has {len(list1)} elements")
for element in list1:
    print(element)

# adding two elements
list1.extend([10, 11])
print(f"{list1}")

# Remove the last two elements
del list1[len(list1) - 2:]
print(f"{list1}")

# Add a tuple to the beginning of the list
for i in (-12, -11, -10):
    list1.insert(0, i)
print(f"{list1}")

# Insertion in the middle of the list
for i in (101, 100):
    list1.insert(3, i)
print(f"{list1}")

# removal from the middle of the list
del list1[3:4]
print(f"{list1}")

The results are the same as in the previous version.
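
This equivalence can be checked by wrapping both series of operations in functions and comparing their results. The functions with_slices and with_methods are hypothetical wrappers written for this check; they start from the list as it is after list1[1] = 10:

```python
# Sketch: the slice-based and method-based versions give the same list
def with_slices():
    items = [0, 10, 2, 3, 4, 5]
    items[len(items):] = [10, 11]    # add two elements
    items[len(items) - 2:] = []      # remove the last two elements
    items[:0] = [-10, -11, -12]      # add at the beginning
    items[3:3] = [100, 101]          # insert in the middle
    items[3:4] = []                  # remove one element
    return items


def with_methods():
    items = [0, 10, 2, 3, 4, 5]
    items.extend([10, 11])
    del items[len(items) - 2:]
    for i in (-12, -11, -10):
        items.insert(0, i)
    for i in (101, 100):
        items.insert(3, i)
    del items[3:4]
    return items


print(with_slices())
print(with_methods())
```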

3.7. Script [bases_07]: the dictionary

The script [bases_07] shows how to define and use a dictionary, sometimes called an associative array.


# a function that checks if the key husband exists in the spouses dictionary
def exists(spouses, husband):
    if husband in spouses:
        print(f"The key [{husband}] exists and is associated with the value [{spouses[husband]}]")
    else:
        print(f"The key [{husband}] does not exist")


# ----------------------------- Main
# a dictionary
spouses = {"Pierre": "Gisèle", "Paul": "Virginie", "Jacques": "Lucette", "Jean": ""}

# iteration - 1
print(f"Number of elements in the dictionary: {len(spouses)}")
for (key, value) in spouses.items():
    print(f"spouses[{key}]={value}")

# list of dictionary keys
print("list of keys-------------")
keys = spouses.keys()
print(f"{keys}")

# list of dictionary values
print("list of values------------")
values = spouses.values()
print(f"{values}")

# Search for a key
exists(spouses, "Jacques")
exists(spouses, "Lucette")
exists(spouses, "Jean")

# Delete a key-value pair
del spouses["Jean"]
print(f"Number of elements in the dictionary: {len(spouses)}")
print(f"{spouses}")

# The keys and values of a dictionary are not lists
print(f"Key type: {type(keys)}")
print(f"Value type: {type(values)}")

# they can be converted to lists
lkeys = list(keys)
print(f"keys: {type(lkeys)}, {lkeys}")
lvalues = list(values)
print(f"values: {type(lvalues)}, {lvalues}")

Notes:

  • line 11: the hard-coded definition of a dictionary;
  • line 15: spouses.items() returns the (key, value) pairs of the spouses dictionary;
  • line 20: spouses.keys() returns the keys of the spouses dictionary;
  • line 25: spouses.values() returns the values of the spouses dictionary;
  • line 3: husband in spouses is True if the key husband exists in the spouses dictionary, False otherwise;
  • line 36: a dictionary can be displayed on a single line.

Results


C:\Data\st-2020\dev\python\cours-2020\python3-flask-2020\venv\Scripts\python.exe C:/Data/st-2020/dev/python/cours-2020/python3-flask-2020/bases/bases_07.py
Number of elements in the dictionary: 4
spouses[Pierre]=Gisèle
spouses[Paul]=Virginie
spouses[Jacques]=Lucette
spouses[Jean]=
list of keys-------------
dict_keys(['Pierre', 'Paul', 'Jacques', 'Jean'])
list of values------------
dict_values(['Gisèle', 'Virginie', 'Lucette', ''])
The key [Jacques] exists and is associated with the value [Lucette]
The key [Lucette] does not exist
The key [Jean] exists and is associated with the value []
Number of elements in the dictionary: 3
{'Pierre': 'Gisèle', 'Paul': 'Virginie', 'Jacques': 'Lucette'}
Key type: <class 'dict_keys'>
Value type: <class 'dict_values'>
keys: <class 'list'>, ['Pierre', 'Paul', 'Jacques']
values: <class 'list'>, ['Gisèle', 'Virginie', 'Lucette']

Process finished with exit code 0

Notes:

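  • lines 16–17 of the results: the keys and values of a dictionary are not lists but objects of type 'dict_keys' and 'dict_values';
  • lines 18–19: the list(…) function converts them to the [list] type. Note that these lists no longer contain the key Jean, which was deleted after keys and values were obtained: 'dict_keys' and 'dict_values' objects follow the changes made to the dictionary;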
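
The following sketch, built on a shortened hypothetical dictionary, shows the difference between a view on the keys and a list built from them. It also uses the get method, which returns a default value instead of raising a KeyError when the key is absent:

```python
# Sketch: a view follows the dictionary, a list is a snapshot
spouses = {"Pierre": "Gisèle", "Paul": "Virginie", "Jean": ""}

keys = spouses.keys()     # a view on the keys
snapshot = list(spouses)  # a list of the keys at this moment

del spouses["Jean"]
spouses["Jacques"] = "Lucette"

print(list(keys))  # reflects the deletion and the addition
print(snapshot)    # unchanged

# get returns the second argument when the key does not exist
wife = spouses.get("Lucette", "unknown")
print(f"wife={wife}")
```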

3.8. Script [bases_08]: tuples

A tuple is similar to a list but is immutable:


# tuples
# initialization
tuple1 = (0, 1, 2, 3, 4, 5)

# iteration - 1
print(f"tuple1 has {len(tuple1)} elements")
for i in range(len(tuple1)):
    print(f"tuple1[{i}]={tuple1[i]}")

# iteration - 2
print(f"tuple1 has {len(tuple1)} elements")
for element in tuple1:
    print(element)

# a tuple can be displayed on a single line
print(f"tuple1={tuple1}")

# A tuple cannot be modified
tuple1[0] = 10

Results


C:\Data\st-2020\dev\python\cours-2020\python3-flask-2020\venv\Scripts\python.exe C:/Data/st-2020/dev/python/cours-2020/python3-flask-2020/bases/bases_08.py
tuple1 has 6 elements
tuple1[0]=0
tuple1[1]=1
tuple1[2]=2
tuple1[3]=3
tuple1[4]=4
tuple1[5]=5
tuple1 has 6 elements
0
1
2
3
4
5
tuple1=(0, 1, 2, 3, 4, 5)
Traceback (most recent call last):
  File "C:/Data/st-2020/dev/python/cours-2020/python3-flask-2020/bases/bases_08.py", line 19, in <module>
    tuple1[0] = 10
TypeError: 'tuple' object does not support item assignment

Process finished with exit code 1

Notes:

  • Lines 17–20 of the output: show that a tuple cannot be modified.
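
The error can be intercepted with a try/except block, and a modified copy can be obtained by building a new tuple. This is only a sketch; the name tuple2 is not part of [bases_08]:

```python
# Sketch: intercepting the error and building a new tuple instead
tuple1 = (0, 1, 2, 3, 4, 5)

try:
    tuple1[0] = 10
except TypeError as error:
    message = str(error)
    print(f"TypeError: {message}")

# a new tuple built from a one-element tuple and a slice of the old one
tuple2 = (10,) + tuple1[1:]
print(f"tuple1={tuple1}")
print(f"tuple2={tuple2}")
```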

3.9. Script [bases_09]: Multidimensional lists and dictionaries

The script [bases_09] demonstrates how to define and use a multidimensional list or dictionary:


# multidimensional lists
# initialization
multi = [[0, 1, 2], [10, 11, 12, 13], [20, 21, 22, 23, 24]]

# iteration
for i1 in range(len(multi)):
    for i2 in range(len(multi[i1])):
        print(f"multi[{i1}][{i2}]={multi[i1][i2]}")

# display on one line
print(f"multi={multi}")

# Multidimensional dictionaries
# initialization
multi = {"zero": [0, 1], "one": [10, 11, 12, 13], "two": [20, 21, 22, 23, 24]}

# iteration
for (key, value) in multi.items():
    for i2 in range(len(value)):
        print(f"multi[{key}][{i2}]={multi[key][i2]}")

# display on one line
print(f"multi={multi}")

Comments

  • line 7: multi[i1] is a list;
  • line 18: value is a list;
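
Since each element of multi is itself a list, the iteration can also be written without indexes computed from len, for example with the built-in enumerate function. This is an alternative sketch on a shortened list, not the code of [bases_09]:

```python
# Sketch: iterating over a list of lists with enumerate
multi = [[0, 1, 2], [10, 11, 12, 13]]

lines = []
for i1, row in enumerate(multi):
    # row is the list multi[i1]
    for i2, value in enumerate(row):
        lines.append(f"multi[{i1}][{i2}]={value}")

print("\n".join(lines))
```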

Results


C:\Data\st-2020\dev\python\cours-2020\python3-flask-2020\venv\Scripts\python.exe C:/Data/st-2020/dev/python/cours-2020/python3-flask-2020/bases/bases_09.py
multi[0][0]=0
multi[0][1]=1
multi[0][2]=2
multi[1][0]=10
multi[1][1]=11
multi[1][2]=12
multi[1][3]=13
multi[2][0]=20
multi[2][1]=21
multi[2][2]=22
multi[2][3]=23
multi[2][4]=24
multi=[[0, 1, 2], [10, 11, 12, 13], [20, 21, 22, 23, 24]]
multi[zero][0]=0
multi[zero][1]=1
multi[one][0]=10
multi[one][1]=11
multi[one][2]=12
multi[one][3]=13
multi[two][0]=20
multi[two][1]=21
multi[two][2]=22
multi[two][3]=23
multi[two][4]=24
multi={'zero': [0, 1], 'one': [10, 11, 12, 13], 'two': [20, 21, 22, 23, 24]}

Process finished with exit code 0

3.10. Script [bases_10]: Strings and lists

The [bases_10] script demonstrates how to extract elements from a string separated by a common delimiter into a list.


# string to list
string = '1:2:3:4'
list = string.split(':')
print(type(list))

# display list
print(f"list has {len(list)} elements")
print(f"list={list}")

# list to string
string2 = ":".join(list)
print(f"string2={string2}")

# let's add an empty field
string += ":"
print(f"string={string}")
list = string.split(":")

# display list
print(f"list has {len(list)} elements")
print(f"list={list}")

# let's add an empty field again
string += ":"
print(f"string={string}")
list = string.split(":")

# display list
print(f"list has {len(list)} elements")
print(f"list={list}")

Notes:

  • line 3: the method string.split(separator) splits the string string into elements separated by separator and returns them as a list. Thus, the expression '1:2:3:4'.split(":") returns the list ['1', '2', '3', '4'];
  • line 11: 'separator'.join(list) returns the string 'list[0]+separator+list[1]+separator+…'. The elements of the list must be strings.
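
The sketch below goes one step further than the script: it removes the empty fields, converts the remaining ones to integers and uses the optional second parameter of split, which limits the number of splits. The variable names are chosen for the illustration:

```python
# Sketch: cleaning up the fields returned by split
string = "1:2:3:4::"

fields = string.split(":")                        # includes the two empty fields
non_empty = [field for field in fields if field]  # an empty string is false
numbers = [int(field) for field in non_empty]

# at most one split: the first field and the rest of the string
first, rest = string.split(":", 1)

# join only accepts strings, hence the conversion with str
rebuilt = ":".join(str(n) for n in numbers)

print(fields)
print(numbers)
print(first, rest)
print(rebuilt)
```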

Results


C:\Data\st-2020\dev\python\cours-2020\python3-flask-2020\venv\Scripts\python.exe C:/Data/st-2020/dev/python/cours-2020/python3-flask-2020/bases/bases_10.py
<class 'list'>
list has 4 elements
list = ['1', '2', '3', '4']
string2 = 1:2:3:4
string = 1:2:3:4:
list with 5 elements
list = ['1', '2', '3', '4', '']
string = 1:2:3:4::
list with 6 elements
list=['1', '2', '3', '4', '', '']

Process finished with exit code 0

3.11. Script [bases_11]: Regular Expressions

The [bases_11] script demonstrates how to use regular expressions:


# import the regular expressions module
import re


# --------------------------------------------------------------------------
def compare(pattern, string):
    # compares the string [string] to the pattern [pattern]
    # display results
    print(f"\nResults({string},{pattern})")
    match = re.match(pattern, string)
    if match:
        print(match.groups())
    else:
        print(f"The string [{string}] does not match the pattern [{pattern}]")


# Regular expressions in Python
# extract the different fields from a string
# the pattern: a sequence of digits surrounded by any characters
# we only want to extract the sequence of digits
pattern = r"^.*?(\d+).*?$"

# We match the string against the pattern
compare(pattern, "xyz1234abcd")
compare(pattern, "12 34")
compare(pattern, "abcd")

# the pattern: a sequence of digits surrounded by any characters
# We want the sequence of digits as well as the fields that follow and precede it
pattern = r"^(.*?)(\d+)(.*?)$"

# we match the string against the pattern
compare(pattern, "xyz1234abcd")
compare(pattern, "12 34")
compare(pattern, "abcd")

# the pattern - a date in dd/mm/yy format
pattern = r"^\s*(\d\d)\/(\d\d)\/(\d\d)\s*$"
compare(pattern, "10/05/97")
compare(pattern, " 04/04/01 ")
compare(pattern, "5/1/01")

# the pattern - a decimal number
pattern = r"^\s*([+-]?)\s*(\d+\.\d*|\.\d+|\d+)\s*$"
compare(pattern, "187.8")
compare(pattern, "-0.6")
compare(pattern, "4")
compare(pattern, ".6")
compare(pattern, "4.")
compare(pattern, " + 4")
# end

Notes:

  • Note the [re] module imported on line 2. It contains the functions for handling regular expressions;
  • line 10: re.match(pattern, string) returns a Match object if the beginning of the string matches the pattern, None otherwise. In the test on line 11, a Match object is considered true and None is considered false;
  • line 12: match.groups() is a tuple whose elements are the parts of the string that match the elements of the regular expression enclosed in parentheses. In the pattern:
    • ^.*?(\d+).*?$, match.groups() will be a one-element tuple because there is one pair of parentheses;
    • ^(.*?)(\d+)(.*?)$, match.groups() will be a 3-element tuple because there are three pairs of parentheses;
  • line 21: a literal regular expression is usually written as r"xxx". The r prefix creates a raw string, in which the \ character is kept as is instead of introducing an escape sequence. It does not by itself turn the string into a regular expression: it is the functions of the [re] module that interpret the string as a pattern;
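
The script only uses re.match, which looks for the pattern at the beginning of the string. The [re] module also provides re.search, which looks for it anywhere in the string, and re.fullmatch, which requires the whole string to match. A short sketch, with variable names chosen for the illustration:

```python
import re

text = "xyz1234abcd"

# match only looks at the beginning of the string: no digit there
at_start = re.match(r"\d+", text)
# search scans the string and finds the first sequence of digits
anywhere = re.search(r"\d+", text)
# fullmatch requires the pattern to cover the whole string
whole = re.fullmatch(r"[a-z]+\d+[a-z]+", text)

print(at_start)
print(anywhere.group(), anywhere.span())
print(whole is not None)

# a raw string keeps the backslash: both literals are the same 2-character string
raw = r"\d"
plain = "\\d"
print(raw == plain, len(raw))
```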

Regular expressions allow us to validate the format of a string. For example, we can verify that a string representing a date is in the dd/mm/yy format. To do this, we use a pattern and compare the string to that pattern. In this example, d, m, and y must be digits. The pattern for a valid date format is therefore "\d\d/\d\d/\d\d", where the symbol \d represents a digit. The symbols that can be used in a pattern are as follows:

Character   Description
\           Designates the following character as a special character or literal. For example, "n" corresponds to the character "n", while "\n" corresponds to a newline character. The sequence "\\" corresponds to "\", while "\(" corresponds to "(".
^           Matches the start of the string.
$           Matches the end of the string.
*           Matches the preceding character zero or more times. Thus, "zo*" matches "z" or "zoo".
+           Matches the preceding character one or more times. Thus, "zo+" matches "zoo", but not "z".
?           Matches the preceding character zero or one time. For example, "a?ve?" matches "ve" in "lever".
.           Matches any single character, except the newline character.
(pattern)   Searches for the pattern and stores the match. The matching substring can be retrieved from the match.groups() collection. To match the parentheses characters themselves, use "\(" or "\)".
x|y         Matches either x or y. For example, "z|foot" matches "z" or "foot". "(z|f)oo" matches "zoo" or "foo".
{n}         n is a non-negative integer. Matches exactly n occurrences of the character. For example, "o{2}" does not match "o" in "Bob", but matches the first two "o"s in "fooooot".
{n,}        n is a non-negative integer. Matches at least n occurrences of the character. For example, "o{2,}" does not match "o" in "Bob", but matches all "o"s in "fooooot". "o{1,}" is equivalent to "o+" and "o{0,}" is equivalent to "o*".
{n,m}       m and n are non-negative integers. Matches at least n and at most m occurrences of the character. For example, "o{1,3}" matches the first three "o"s in "foooooot" and "o{0,1}" is equivalent to "o?".
[xyz]       Character set. Matches any of the specified characters. For example, "[abc]" matches "a" in "plat".
[^xyz]      Negative character set. Matches any character not listed. For example, "[^abc]" matches "p" in "plat".
[a-z]       Character range. Matches any character in the specified range. For example, "[a-z]" matches any lowercase alphabetical character between "a" and "z".
[^m-z]      Negative character range. Matches any character not in the specified range. For example, "[^m-z]" matches any character not between "m" and "z".
\b          Matches a word boundary, that is, the position between a word and a space. For example, "er\b" matches "er" in "lever", but not "er" in "verb".
\B          Matches a position that is not a word boundary. "en*t\B" matches "ent" in "bien entendu".
\d          Matches a character representing a digit. Equivalent to [0-9].
\D          Matches a character that is not a digit. Equivalent to [^0-9].
\f          Matches a form feed (page break) character.
\n          Matches a newline character.
\r          Matches a carriage return character.
\s          Matches any whitespace, including space, tab, page break, etc. Equivalent to "[ \f\n\r\t\v]".
\S          Matches any non-whitespace character. Equivalent to "[^ \f\n\r\t\v]".
\t          Matches a tab character.
\v          Matches a vertical tab character.
\w          Matches any character representing a word and including an underscore. Equivalent to "[A-Za-z0-9_]".
\W          Matches any character that does not represent a word. Equivalent to "[^A-Za-z0-9_]".
\num        Matches num, where num is a positive integer. Refers to stored matches. For example, "(.)\1" matches two consecutive identical characters.
\n          Matches n, where n is an octal escape value of 1, 2, or 3 digits. For example, "\011" matches a tab character. "\0011" is equivalent to "\001" & "1". In Python's re module, the value must start with 0 or have three octal digits, otherwise it is read as a reference to a stored match (\num). Allows ASCII codes to be used in regular expressions.
\xn         Matches n, where n is a hexadecimal escape value. Hexadecimal escape values must consist of exactly two digits. For example, "\x41" matches "A". "\x041" is equivalent to "\x04" & "1". Allows the use of ASCII codes in regular expressions.

An element in a pattern may appear once or multiple times. Let’s look at some examples involving the \d symbol, which represents a single digit:

pattern     meaning
\d          a digit
\d?         0 or 1 digit
\d*         0 or more digits
\d+         1 or more digits
\d{2}       2 digits
\d{3,}      at least 3 digits
\d{5,7}     between 5 and 7 digits

Now let’s imagine a pattern capable of describing the expected format for a string:

pattern                     target string
\d{2}/\d{2}/\d{2}           a date in dd/mm/yy format
\d{2}:\d{2}:\d{2}           a time in hh:mm:ss format
\d+                         an unsigned integer
\s*                         a sequence of spaces, which may be empty
\s*\d+\s*                   an unsigned integer that may be preceded or followed by spaces
\s*[+-]?\s*\d+\s*           an integer that may be signed and preceded or followed by spaces
\s*\d+(\.\d*)?\s*           an unsigned real number that may be preceded or followed by spaces
\s*[+-]?\s*\d+(\.\d*)?\s*   a real number that may be signed and preceded or followed by spaces
\bjust\b                    a string containing the word "just"
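
In the patterns for real numbers, the decimal point must be written \. because an unescaped . matches any character. The sketch below compares the two spellings on a few hypothetical inputs, anchoring the patterns with ^ and $ so that the whole string is checked:

```python
import re

# the same pattern for a signed real number, without and with the escaped point
unescaped = re.compile(r"^\s*[+-]?\s*\d+(.\d*)?\s*$")
escaped = re.compile(r"^\s*[+-]?\s*\d+(\.\d*)?\s*$")

for text in ["-12.5", " 4 ", "12x5"]:
    # bool() turns the Match object or None into True or False
    print(f"[{text}] unescaped={bool(unescaped.match(text))} escaped={bool(escaped.match(text))}")
```

Both patterns accept "-12.5" and " 4 ", but only the unescaped one wrongly accepts "12x5".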

You can specify where to search for the pattern in the string:

pattern     meaning
^pattern    the pattern starts the string
pattern$    the pattern ends the string
^pattern$   the pattern starts and ends the string
pattern     the pattern is searched for anywhere in the string, starting from the beginning.

pattern             target string
!$                  a string ending with an exclamation point
\.$                 a string ending with a period
^//                 a string beginning with the sequence //
^\s*\w+\s*$         a string consisting of a single word, optionally preceded or followed by spaces
^\s*\w+\s+\w+\s*$   a string consisting of two words separated by spaces, optionally preceded or followed by spaces
\bsecret\b          a string containing the word secret

Sub-patterns of a pattern can be "extracted." Thus, not only can we verify that a string matches a particular pattern, but we can also extract from that string the elements corresponding to the sub-patterns of the pattern that have been enclosed in parentheses. For example, if we parse a string containing a date in the format dd/mm/yy and want to extract the dd, mm, and yy components of that date, we would use the pattern (\d\d)/(\d\d)/(\d\d).
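
A minimal sketch of this extraction is given below. The function parse_date is a hypothetical helper, not part of [bases_11]; it returns the three components as integers, or None when the string is not in the expected format:

```python
import re

# the pattern is compiled once and reused for every call
date_pattern = re.compile(r"^\s*(\d\d)/(\d\d)/(\d\d)\s*$")


def parse_date(text):
    # hypothetical helper: returns (day, month, year) or None
    match = date_pattern.match(text)
    if match is None:
        return None
    # match.groups() contains the three sub-patterns in parentheses
    day, month, year = (int(part) for part in match.groups())
    return day, month, year


print(parse_date("10/05/97"))
print(parse_date(" 04/04/01 "))
print(parse_date("5/1/01"))
```

The pattern only checks the format: a string such as "99/99/99" is accepted, so the values would still have to be validated separately.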

Script results


C:\Data\st-2020\dev\python\cours-2020\python3-flask-2020\venv\Scripts\python.exe C:/Data/st-2020/dev/python/cours-2020/python3-flask-2020/bases/bases_11.py

Results(xyz1234abcd,^.*?(\d+).*?$)
('1234',)

Results(12 34,^.*?(\d+).*?$)
('12',)

Results(abcd,^.*?(\d+).*?$)
The string [abcd] does not match the pattern [^.*?(\d+).*?$]

Results(xyz1234abcd,^(.*?)(\d+)(.*?)$)
('xyz', '1234', 'abcd')

Results(12 34,^(.*?)(\d+)(.*?)$)
('', '12', ' 34')

Results(abcd,^(.*?)(\d+)(.*?)$)
The string [abcd] does not match the pattern [^(.*?)(\d+)(.*?)$]

Results(10/05/97,^\s*(\d\d)\/(\d\d)\/(\d\d)\s*$)
('10', '05', '97')

Results( 04/04/01 ,^\s*(\d\d)\/(\d\d)\/(\d\d)\s*$)
('04', '04', '01')

Results(5/1/01,^\s*(\d\d)\/(\d\d)\/(\d\d)\s*$)
The string [5/1/01] does not match the pattern [^\s*(\d\d)\/(\d\d)\/(\d\d)\s*$]

Results(187.8,^\s*([+-]?)\s*(\d+\.\d*|\.\d+|\d+)\s*$)
('', '187.8')

Results(-0.6,^\s*([+-]?)\s*(\d+\.\d*|\.\d+|\d+)\s*$)
('-', '0.6')

Results(4,^\s*([+-]?)\s*(\d+\.\d*|\.\d+|\d+)\s*$)
('', '4')

Results(.6,^\s*([+-]?)\s*(\d+\.\d*|\.\d+|\d+)\s*$)
('', '.6')

Results(4.,^\s*([+-]?)\s*(\d+\.\d*|\.\d+|\d+)\s*$)
('', '4.')

Results( + 4,^\s*([+-]?)\s*(\d+\.\d*|\.\d+|\d+)\s*$)
('+', '4')

Process finished with exit code 0