3. The Basics of Python

3.1. Script [bases_01]: Basic Operations

The script [bases_01] introduces the basic features of Python.

# ----------------------------------
def affiche(chaine):
    #  displays a string
    print("chaine=%s" % chaine)


# ----------------------------------
def affiche_type(variable):
    #  displays variable type
    print("type[%s]=%s" % (variable, type(variable)))


# ----------------------------------
def f1(param):
    #  adds 10 to param
    return param + 10


# ----------------------------------
def f2():
    #  returns a tuple of 3 values
    return "un", 0, 100


#  -------------------------------- main program ------------------------------------
#  this is a comment
#  variable used without being declared
nom = "dupont"

#  a screen display
print("nom=%s" % nom)

#  a list with elements of different types
liste = ["un", "deux", 3, 4]

#  its number of elements
n = len(liste)

#  a loop
for i in range(n):
    print("liste[%d]=%s" % (i, liste[i]))

#  initialize 2 variables with a tuple
(chaine1, chaine2) = ("chaine1", "chaine2")

#  concatenation of the 2 strings
chaine3 = chaine1 + chaine2

#  result display
print("[%s,%s,%s]" % (chaine1, chaine2, chaine3))

#  use function
affiche(chaine1)

#  the type of a variable can be known
affiche_type(n)
affiche_type(chaine1)
affiche_type(liste)

#  the type of a variable can change at runtime
n = "a changé"
affiche_type(n)

#  a function can return a result
res1 = f1(4)
print("res1=%s" % res1)

#  a function can return several values (as a tuple)
(res1, res2, res3) = f2()
print("(res1,res2,res3)=[%s,%s,%s]" % (res1, res2, res3))

#  we could have retrieved these values in a single variable (a tuple)
liste = f2()
for i in range(len(liste)):
    print("liste[%s]=%s" % (i, liste[i]))

#  tests
for i in range(len(liste)):
    #  meant to display only strings; type() returns a class, never the string "str", so this test is always False
    if type(liste[i]) == "str":
        print("liste[%s]=%s" % (i, liste[i]))

#  other tests
for i in range(len(liste)):
    #  meant to display only integers >10; for the same reason, this test is always False
    if type(liste[i]) == "int" and liste[i] > 10:
        print("liste[%s]=%s" % (i, liste[i]))

#  a while loop
liste = (8, 5, 0, -2, 3, 4)
i = 0
somme = 0
while i < len(liste) and liste[i] > 0:
    print("liste[%s]=%s" % (i, liste[i]))
    somme += liste[i]  #  somme=somme+liste[i]
    i += 1  #  i=i+1
print("somme=%s" % somme)
#  end of program

Comments

  • line 2: the keyword def defines a function;
  • line 2: the function receives the parameter [chaine]. The parameter type is not specified. Python passes arguments by value, and the value passed is always a reference to an object. The practical effect depends on the data:
    • for an immutable type (number, boolean, string, etc.), the function cannot alter the caller’s data, so everything happens as if it had received the value itself (4, True, etc.);
    • for a mutable type (list, dictionary, class instance, etc.), the function receives the address of the data and can modify that data in place;
  • lines 3–4: the body of the function. It is indented one level to the right. It is this indentation, combined with the : character of the def statement, that defines the function’s body. This applies to all statements with a body: if, else, while, for, try, except;
  • line 4: the syntax used here is [print('text1%F1text2%F2…' % (data1, data2, …))]:
    • the [%Fi] (here %s) are display formats:
      • %s (string): for a string;
      • %d (decimal): for signed decimal integers;
      • %f (float): decimal format for real numbers;
      • %e (exponential): exponential format for a real number;
    • [data1, data2…] are the expressions whose values you want to display:
      • [data1] will be displayed using the F1 format;
      • [data2] will be displayed using the F2 format;
  • line 10: Python manages variable types internally. You can determine a variable’s type using the type(variable) function, which returns the class of the value (an object of type 'type'). The expression '%s' % (type(variable)) is a string representing the variable’s type;
  • line 25: the main program. This usually (but not necessarily) comes after the definition of all the script’s functions. Its content is not indented;
  • line 28: in Python, variables are not declared. Python is case-sensitive. The variable `Nom` is different from the variable `nom`. A string can be enclosed in double quotes " or single quotes '. You can therefore write `'dupont'` or `"dupont"`;
  • line 34: there is a difference between a tuple (1,2,3) (note the parentheses) and a list [1,2,3] (note the square brackets). A tuple is immutable, whereas a list is mutable. In both cases, element number i is written x[i];
  • line 40: range(n) produces the integers 0, 1, 2, …, n-1. In Python 3 it is a range object, not a tuple or a list;
  • line 41: the %d format is used for signed integers;
  • line 74: len(var) is the number of elements in the var collection (tuple, list, dictionary, etc.);
  • line 84: the [for … in …] structure allows you to iterate over an iterable structure. Lists and tuples are iterable;
  • line 86: the other Boolean operators are or and not;
  • line 93: the loop sums the elements of the tuple for as long as they are greater than 0 and stops at the first element that is not (here 8 + 5 = 13);
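The way arguments are passed can be checked with a short experiment. This is only a minimal sketch: the functions rebind and mutate are hypothetical names that do not appear in [bases_01].

```python
def rebind(param):
    # rebinding the local name has no effect on the caller's variable
    param = param + 10
    return param


def mutate(items):
    # the function holds a reference to the caller's list,
    # so a change made in place is visible to the caller
    items.append(99)


n = 4
rebind(n)
print(n)      # n still refers to 4

liste = [1, 2]
mutate(liste)
print(liste)  # the list now has a third element
```

The integer is untouched because integers are immutable, whereas the list is modified in place because the function works on the same object as the caller.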

The screen output is as follows:


C:\Data\st-2020\dev\python\cours-2020\python3-flask-2020\venv\Scripts\python.exe C:/Data/st-2020/dev/python/cours-2020/python3-flask-2020/bases/bases_01.py
nom=dupont
liste[0]=un
liste[1]=deux
liste[2]=3
liste[3]=4
[chaine1,chaine2,chaine1chaine2]
chaine=chaine1
type[4]=<class 'int'>
type[chaine1]=<class 'str'>
type[['un', 'deux', 3, 4]]=<class 'list'>
type[a changé]=<class 'str'>
res1=14
(res1,res2,res3)=[un,0,100]
liste[0]=un
liste[1]=0
liste[2]=100
liste[0]=8
liste[1]=5
somme=13
 
Process finished with exit code 0
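Note that the two loops testing type(liste[i]) print nothing in the output above: type() returns a class such as <class 'str'>, never the string "str". One possible correction, sketched here with isinstance (the names strings and big_ints are illustrative and not part of the original script), is:

```python
liste = ("un", 0, 100)

# keep only the strings
strings = [x for x in liste if isinstance(x, str)]
# keep only the integers greater than 10
big_ints = [x for x in liste if isinstance(x, int) and x > 10]

print(strings)
print(big_ints)

# comparing a class with a string is always False
print(type("un") == "str")
# comparing a class with a class works
print(type("un") == str)
```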

3.2. Script [bases_02]: Formatted Strings

Python 3.6 introduced formatted string literals (f-strings), a new way to format strings:

#  formatting strings
#  formats are those of the C language
#  integer
int1 = 10
print(f"[int1={int1}]")
print(f"[int1={int1:4d}]")
print(f"[int1={int1:04d}]")
#  float
float1=8.2
print(f"[float1={float1}]")
print(f"[float1={float1:8.2f}]")
print(f"[float1={float1:.3e}]")
#  string
str1="abcd"
print(f"[str1={str1}]")
print(f"[str1={str1:8s}]")
str2="jean de florette"
print(f"[{str2:20.10s}]")
#  formatted strings can be assigned to variables
str3=f"[{str2:20.10s}]"
print(str3)

The syntax for the formatted string is as follows:

f'…{expr1:format1} …. {expr2:format2} ….'

where:

  • [expri]: an expression;
  • [formati]: the format of the expression [expri]. These formats resemble those of the C language, written without the leading %:
    • d: for integers;
    • f: decimal notation for real numbers;
    • e: exponential notation for real numbers;
    • s: for character strings. When no format is specified, the value is simply converted to a string;
    • nd, nf, ns: displays [expri] over at least n characters: shorter values are padded with spaces (on the left for numbers, on the right for strings), longer values are not truncated;
  • line 7: [04d], an integer displayed over 4 characters, padded on the left with zeros;
  • line 11: [8.2f], a decimal floating-point number displayed over 8 characters with 2 digits after the decimal point;
  • line 12: [.3e], a floating-point number in exponential form with 3 decimal places for the mantissa;
  • line 18: [20.10s], the first 10 characters of the string, padded with spaces to make 20 characters;
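A few additional cases may help to see how width and precision interact. This is a small sketch with illustrative variable names (value, name, short), not an excerpt from [bases_02]:

```python
value = 3.14159
name = "jean de florette"
short = "ab"

# a width is a minimum: a longer value is not truncated
wide = f"[{name:5s}]"
# a precision applied to a string truncates it
cut = f"[{name:.4s}]"
# alignment: > right, ^ centered (strings are left-aligned by default)
right = f"[{short:>6s}]"
center = f"[{short:^6s}]"
# width and precision applied to a float
num = f"[{value:8.3f}]"

for text in (wide, cut, right, center, num):
    print(text)
```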

The results of the execution are as follows:


C:\Data\st-2020\dev\python\cours-2020\python3-flask-2020\venv\Scripts\python.exe C:/Data/st-2020/dev/python/cours-2020/python3-flask-2020/bases/bases_02.py
[int1=10]
[int1=  10]
[int1=0010]
[float1=8.2]
[float1=    8.20]
[float1=8.200e+00]
[str1=abcd]
[str1=abcd    ]
[jean de fl          ]
[jean de fl          ]
 
Process finished with exit code 0

3.3. Script [bases_03]: Type conversions

Here we focus on type conversions involving data of type str (string), int (integer), float (floating-point), and bool (boolean).

#  type changes
#  int --> str, float, bool
x = 4
print(x, type(x))
x = str(4)
print(x, type(x))
x = float(4)
print(x, type(x))
x = bool(4)
print(x, type(x))

#  bool --> int, float, str
x = True
print(x, type(x))
x = int(True)
print(x, type(x))
x = float(True)
print(x, type(x))
x = str(True)
print(x, type(x))

#  str --> int, float, bool
x = "4"
print(x, type(x))
x = int("4")
print(x, type(x))
x = float("4")
print(x, type(x))
x = bool("4")
print(x, type(x))

#  float --> str, int, bool
x = 4.32
print(x, type(x))
x = str(4.32)
print(x, type(x))
x = int(4.32)
print(x, type(x))
x = bool(4.32)
print(x, type(x))

#  type change error handling
try:
    x = int("abc")
    print(x, type(x))
except ValueError as erreur:
    print(erreur)

#  various cases
x = bool("abc")
print(x, type(x))
x = bool("")
print(x, type(x))
x = bool(0)
print(x, type(x))
x = None
print(x, type(x))
x = bool(None)
print(x, type(x))
x = bool(0.0)
print(x, type(x))

#  all data are class instances and as such have their own methods
#  character string
str1 = "abc"
print(str1.capitalize())
#  whole number
int1 = 4
print(int1.bit_length())
#  boolean
bool1 = True
print(bool1.conjugate())
#  real number
float1 = 8.2
print(float1.is_integer())

Many type conversions are possible. Some may fail, such as the one on line 44, which attempts to convert the string 'abc' into an integer. We handled the error using a try/except block (lines 43–47). A general form of this block is as follows:


try:
    actions
except Exception as ex:
    actions
finally:
    actions

If any of the actions in the try block raises an exception (signals an error), control immediately jumps to the except clause. If the actions in the try block raise no exception, the except clause is skipped. In the except statement, both Exception and ex are optional. When present, Exception specifies the type of exception intercepted by the except statement, and ex refers to the exception that occurred. There can be several except clauses if you want to handle different types of exceptions within the same try block.

The finally statement is optional. If present, the actions in the finally block are always executed, regardless of whether an exception occurred or not.
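The general form can be illustrated with a short sketch. The helper to_int is a hypothetical function written for this example; it uses two except clauses and a finally clause:

```python
def to_int(text):
    # hypothetical helper: returns an int, or None if the conversion fails
    try:
        value = int(text)
    except ValueError as erreur:
        # raised for example by int("abc")
        print(f"conversion failed: {erreur}")
        value = None
    except TypeError as erreur:
        # raised for example by int(None)
        print(f"wrong type: {erreur}")
        value = None
    finally:
        # always executed, with or without an exception
        print(f"to_int({text!r}) done")
    return value


print(to_int("12"))
print(to_int("abc"))
print(to_int(None))
```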

We’ll come back to exceptions a little later.

Lines 49–61 show various attempts to convert data of type str, int, float, and NoneType to boolean. This is always possible. The rules are as follows:

  • bool(int i) is False if i is 0, True in all other cases;
  • bool(float f) is False if f is 0.0, True in all other cases;
  • bool(str string) is False if string has 0 characters, True in all other cases;
  • bool(None) is False. None is a special value that means the variable exists but has no value.
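These rules can be checked in one pass. The sketch below uses an illustrative list named values; it also shows that if applies the same conversion implicitly:

```python
values = [0, 7, 0.0, -1.5, "", "abc", None]
results = [bool(v) for v in values]

for v, r in zip(values, results):
    print(f"bool({v!r}) = {r}")

# an if statement applies the same rule without an explicit bool()
nom = ""
if not nom:
    print("nom is empty")
```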

The screen output is as follows:

C:\Data\st-2020\dev\python\cours-2020\python3-flask-2020\venv\Scripts\python.exe C:/Data/st-2020/dev/python/cours-2020/python3-flask-2020/bases/bases_03.py
4 <class 'int'>
4 <class 'str'>
4.0 <class 'float'>
True <class 'bool'>
True <class 'bool'>
1 <class 'int'>
1.0 <class 'float'>
True <class 'str'>
4 <class 'str'>
4 <class 'int'>
4.0 <class 'float'>
True <class 'bool'>
4.32 <class 'float'>
4.32 <class 'str'>
4 <class 'int'>
True <class 'bool'>
invalid literal for int() with base 10: 'abc'
True <class 'bool'>
False <class 'bool'>
False <class 'bool'>
None <class 'NoneType'>
False <class 'bool'>
False <class 'bool'>
Abc
3
1
False

Process finished with exit code 0

Note that all data are objects, i.e., class instances. This means they can have methods. This is shown in lines 63–75 of the code. We are not trying here to explain what the methods do, but simply to show that they exist.

3.4. Script [bases_04]: variable scope

The script [bases_04] shows that Python does not have the concept of block-scoped variables:

#  variable scope
i = 4
if True:
    i += 1
    j = 7
print(f"i={i}, j={j}")

Results

C:\Data\st-2020\dev\python\cours-2020\python3-flask-2020\venv\Scripts\python.exe C:/Data/st-2020/dev/python/cours-2020/python3-flask-2020/bases/bases_04.py
i=5, j=7

Process finished with exit code 0

Comments

The results show two things:

  • line 4: the variable [i] in the [if] block is the same as the variable i used on line 2;
  • line 6: the variable [j] is the one initialized in the [if] block;

In some languages, where variables are declared, a variable defined within a block (such as the one in lines 3–5) is not known outside of it. In Python, this is not the case.
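The same holds for loops, whereas a function does create its own scope. The following sketch illustrates both cases; the names f, k, last and k_visible are chosen for the example:

```python
def f():
    # k is local to the function f
    k = 10
    return k


for n in range(3):
    last = n * 2

# n and last were created in the for block but are still known here
print(n, last)

f()
try:
    print(k)
    k_visible = True
except NameError as erreur:
    # k does not exist outside f
    print(erreur)
    k_visible = False
```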

3.5. Script [bases_05]: Lists - 1

The script [bases_05] is as follows:

#  1-dimensional lists
#  initialization
list1 = [0, 1, 2, 3, 4, 5]

#  traversal - 1
print(f"list1 a {len(list1)} éléments")
for i in range(len(list1)):
    print(f"list1[{i}]={list1[i]}")

list1[1] = 10
#  traversal - 2
print(f"list1 a {len(list1)} éléments")
for element in list1:
    print(element)

#  addition of two elements
list1[len(list1):] = [10, 11]
#  the %s format displays the list on one line
print("%s" % list1)

#  deletion of last two items
list1[len(list1) - 2:] = []
#  the default format displays the list on one line
print(f"{list1}")

#  add to the beginning of a list
list1[:0] = [-10, -11, -12]
print(f"{list1}")

#  mid-list insertion of two elements
list1[3:3] = [100, 101]
print(f"{list1}")

#  deletion of one item in the middle of the list (index 3)
list1[3:4] = []
print(f"{list1}")

Notes:

  • the notation list1[i:j] refers to elements i through j-1 of the list;
  • the notation [i:] refers to element i and all subsequent elements of the list;
  • the notation [:i] refers to elements 0 through i-1 of the list;
  • line 19: print("%s" % list1) displays the string "[list1[0], list1[1], …, list1[n-1]]";
  • line 24: print(f"{list1}") does the same thing;
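The slice notations can be tried on a small list. This is a sketch with an illustrative list named letters; it shows both reading a slice and assigning to one:

```python
letters = ["a", "b", "c", "d", "e"]

middle = letters[1:3]  # elements 1 and 2
tail = letters[2:]     # element 2 and the following ones
head = letters[:2]     # elements 0 and 1
print(middle, tail, head)

# assigning to a slice replaces that range, whatever the lengths
letters[1:3] = ["X"]            # two elements replaced by one
letters[len(letters):] = ["f"]  # empty slice at the end: adds at the end
letters[:0] = ["start"]         # empty slice at the beginning: adds at the start
print(letters)
```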

Results


C:\Data\st-2020\dev\python\cours-2020\python3-flask-2020\venv\Scripts\python.exe C:/Data/st-2020/dev/python/cours-2020/python3-flask-2020/bases/bases_05.py
list1 a 6 éléments
list1[0]=0
list1[1]=1
list1[2]=2
list1[3]=3
list1[4]=4
list1[5]=5
list1 a 6 éléments
0
10
2
3
4
5
[0, 10, 2, 3, 4, 5, 10, 11]
[0, 10, 2, 3, 4, 5]
[-10, -11, -12, 0, 10, 2, 3, 4, 5]
[-10, -11, -12, 100, 101, 0, 10, 2, 3, 4, 5]
[-10, -11, -12, 101, 0, 10, 2, 3, 4, 5]
 
Process finished with exit code 0

3.6. Script [bases_06]: lists - 2

The previous code can be written differently (bases_06) using certain list methods:

#  1-dimensional lists

#  initialization
list1 = [0, 1, 2, 3, 4, 5]

#  traversal - 1
print(f"list1 a {len(list1)} éléments")
for i in range(len(list1)):
    print(f"list1[{i}]={list1[i]}")

#  element modification
list1[1] = 10

#  traversal - 2
print(f"list1 a {len(list1)} éléments")
for element in list1:
    print(element)

#  addition of two elements
list1.extend([10, 11])
print(f"{list1}")

#  deletion of last two items
del list1[len(list1) - 2:]
print(f"{list1}")

#  add a tuple at the beginning of the list
for i in (-12, -11, -10):
    list1.insert(0, i)
print(f"{list1}")

#  mid-list insertion
for i in (101, 100):
    list1.insert(3, i)
print(f"{list1}")

#  mid-list deletion
del list1[3:4]
print(f"{list1}")

The results are the same as in the previous version.
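The equivalence between the two styles can be verified directly. In this sketch, the lists a and b are illustrative; note that the values are inserted in reverse order, because each insert() places its element in front of the previous one:

```python
a = [0, 10, 2, 3, 4, 5]
b = list(a)  # independent copy of a

# slice assignments, in the style of bases_05
a[len(a):] = [10, 11]
a[:0] = [-10, -11, -12]
a[3:3] = [100, 101]

# list methods, in the style of bases_06
b.extend([10, 11])
for i in (-12, -11, -10):
    b.insert(0, i)
for i in (101, 100):
    b.insert(3, i)

print(a)
print(a == b)
```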

3.7. Script [bases_07]: the dictionary

The script [bases_07] shows how to define and use a dictionary, sometimes called an associative array.

#  a function that checks whether the key [mari] exists in the dictionary [conjoints]
def existe(conjoints, mari):
    if mari in conjoints:
        print(f"La clé [{mari}] existe associée à la valeur [{conjoints[mari]}]")
    else:
        print(f"La clé [{mari}] n'existe pas")


#  ----------------------------- Main
#  a dictionary
conjoints = {"Pierre": "Gisèle", "Paul": "Virginie", "Jacques": "Lucette", "Jean": ""}

#  traversal - 1
print(f"Nombre d'éléments du dictionnaire : {len(conjoints)}")
for (clé, valeur) in conjoints.items():
    print(f"conjoints[{clé}]={valeur}")

#  list of dictionary keys
print("liste des clés-------------")
clés = conjoints.keys()
print(f"{clés}")

#  list of dictionary values
print("liste des valeurs------------")
valeurs = conjoints.values()
print(f"{valeurs}")

#  key search
existe(conjoints, "Jacques")
existe(conjoints, "Lucette")
existe(conjoints, "Jean")

#  deleting a key-value
del (conjoints["Jean"])
print(f"Nombre d'éléments du dictionnaire : {len(conjoints)}")
print(f"{conjoints}")

#  dictionary keys and values are not lists
print(f"type des clés : {type(clés)}")
print(f"type des valeurs : {type(valeurs)}")

#  we can transform them into lists
lclés = list(clés)
print(f"clés : {type(lclés)}, {lclés}")
lvaleurs = list(valeurs)
print(f"valeurs : {type(lvaleurs)}, {lvaleurs}")

Notes:

  • line 11: the literal definition of a dictionary;
  • line 15: conjoints.items() returns a view of the (key, value) pairs of the conjoints dictionary, over which you can iterate;
  • line 20: conjoints.keys() returns a view of the keys of the conjoints dictionary;
  • line 25: conjoints.values() returns a view of the values of the conjoints dictionary;
  • line 3: [mari in conjoints] returns True if the key mari exists in the conjoints dictionary, False otherwise;
  • line 36: a dictionary can be displayed on a single line.

Results


C:\Data\st-2020\dev\python\cours-2020\python3-flask-2020\venv\Scripts\python.exe C:/Data/st-2020/dev/python/cours-2020/python3-flask-2020/bases/bases_07.py
Nombre d'éléments du dictionnaire : 4
conjoints[Pierre]=Gisèle
conjoints[Paul]=Virginie
conjoints[Jacques]=Lucette
conjoints[Jean]=
liste des clés-------------
dict_keys(['Pierre', 'Paul', 'Jacques', 'Jean'])
liste des valeurs------------
dict_values(['Gisèle', 'Virginie', 'Lucette', ''])
La clé [Jacques] existe associée à la valeur [Lucette]
La clé [Lucette] n'existe pas
La clé [Jean] existe associée à la valeur []
Nombre d'éléments du dictionnaire : 3
{'Pierre': 'Gisèle', 'Paul': 'Virginie', 'Jacques': 'Lucette'}
type des clés : <class 'dict_keys'>
type des valeurs : <class 'dict_values'>
clés : <class 'list'>, ['Pierre', 'Paul', 'Jacques']
valeurs : <class 'list'>, ['Gisèle', 'Virginie', 'Lucette']
 
Process finished with exit code 0

Notes:

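  • Note that in lines 16–17 of the results, the keys and values of a dictionary are not lists but objects of type 'dict_keys' and 'dict_values'. These are views of the dictionary: once the key "Jean" has been deleted, they no longer contain it (lines 18–19);
  • lines 18–19: the list() function converts them to the [list] type;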
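The difference between a view and a list can be observed with a short experiment. This is a sketch with illustrative names (keys_view, snapshot); it also shows dict.get, which returns a default value instead of raising KeyError for a missing key:

```python
conjoints = {"Pierre": "Gisèle", "Paul": "Virginie", "Jean": ""}

keys_view = conjoints.keys()  # a view on the dictionary, not a copy
snapshot = list(keys_view)    # a list: a copy made at this moment

del conjoints["Jean"]

print(list(keys_view))  # the view reflects the deletion
print(snapshot)         # the list does not

# get() returns the second argument when the key is missing
print(conjoints.get("Lucette", "unknown"))
```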

3.8. Script [bases_08]: tuples

A tuple is similar to a list but is immutable:

#  tuples
#  initialization
tuple1 = (0, 1, 2, 3, 4, 5)

#  traversal - 1
print(f"tuple1 a {len(tuple1)} elements")
for i in range(len(tuple1)):
    print(f"tuple1[{i}]={tuple1[i]}")

#  traversal - 2
print(f"tuple1 a {len(tuple1)} elements")
for element in tuple1:
    print(element)

#  a tuple can be displayed in one line
print(f"tuple1={tuple1}")

#  a tuple cannot be modified
tuple1[0] = 10

Results


C:\Data\st-2020\dev\python\cours-2020\python3-flask-2020\venv\Scripts\python.exe C:/Data/st-2020/dev/python/cours-2020/python3-flask-2020/bases/bases_08.py
tuple1 a 6 elements
tuple1[0]=0
tuple1[1]=1
tuple1[2]=2
tuple1[3]=3
tuple1[4]=4
tuple1[5]=5
tuple1 a 6 elements
0
1
2
3
4
5
tuple1=(0, 1, 2, 3, 4, 5)
Traceback (most recent call last):
  File "C:/Data/st-2020/dev/python/cours-2020/python3-flask-2020/bases/bases_08.py", line 19, in <module>
    tuple1[0] = 10
TypeError: 'tuple' object does not support item assignment
 
Process finished with exit code 1

Notes:

  • Lines 17–20 of the output: show that a tuple cannot be modified.
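If the script must continue after such an attempt, the TypeError can be intercepted. The sketch below, with the illustrative names tuple2 and tuple3, also shows two ways to obtain a modified copy of a tuple:

```python
tuple1 = (0, 1, 2, 3, 4, 5)

try:
    tuple1[0] = 10
except TypeError as erreur:
    print(f"error: {erreur}")

# a modified version is obtained by building a new tuple
tuple2 = (10,) + tuple1[1:]
print(tuple2)

# or by going through a list
liste = list(tuple1)
liste[0] = 10
tuple3 = tuple(liste)
print(tuple3 == tuple2)
```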

3.9. Script [bases_09]: Multidimensional lists and dictionaries

The script [bases_09] demonstrates how to define and use a multidimensional list or dictionary:

#  multidimensional lists
#  initialization
multi = [[0, 1, 2], [10, 11, 12, 13], [20, 21, 22, 23, 24]]

#  traversal
for i1 in range(len(multi)):
    for i2 in range(len(multi[i1])):
        print(f"multi[{i1}][{i2}]={multi[i1][i2]}")

#  one-line display
print(f"multi={multi}")

#  multidimensional dictionaries
#  initialization
multi = {"zéro": [0, 1], "un": [10, 11, 12, 13], "deux": [20, 21, 22, 23, 24]}

#  traversal
for (clé, valeur) in multi.items():
    for i2 in range(len(valeur)):
        print(f"multi[{clé}][{i2}]={multi[clé][i2]}")

#  one-line display
print(f"multi={multi}")

Comments

  • line 7: multi[i1] is a list;
  • line 18: [valeur] is a list;
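Because the inner elements are themselves lists, they can also be traversed directly, without going through range(len(…)). This is one possible variant, using the built-in enumerate function and illustrative names (lines, multi_dict):

```python
multi = [[0, 1, 2], [10, 11, 12, 13]]
lines = []

# enumerate yields (index, element) pairs
for i1, row in enumerate(multi):
    for i2, value in enumerate(row):
        lines.append(f"multi[{i1}][{i2}]={value}")

multi_dict = {"zéro": [0, 1], "un": [10, 11]}
for key, values in multi_dict.items():
    for i2, value in enumerate(values):
        lines.append(f"multi[{key}][{i2}]={value}")

print("\n".join(lines))
```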

Results


C:\Data\st-2020\dev\python\cours-2020\python3-flask-2020\venv\Scripts\python.exe C:/Data/st-2020/dev/python/cours-2020/python3-flask-2020/bases/bases_09.py
multi[0][0]=0
multi[0][1]=1
multi[0][2]=2
multi[1][0]=10
multi[1][1]=11
multi[1][2]=12
multi[1][3]=13
multi[2][0]=20
multi[2][1]=21
multi[2][2]=22
multi[2][3]=23
multi[2][4]=24
multi=[[0, 1, 2], [10, 11, 12, 13], [20, 21, 22, 23, 24]]
multi[zéro][0]=0
multi[zéro][1]=1
multi[un][0]=10
multi[un][1]=11
multi[un][2]=12
multi[un][3]=13
multi[deux][0]=20
multi[deux][1]=21
multi[deux][2]=22
multi[deux][3]=23
multi[deux][4]=24
multi={'zéro': [0, 1], 'un': [10, 11, 12, 13], 'deux': [20, 21, 22, 23, 24]}
 
Process finished with exit code 0

3.10. Script [bases_10]: Strings and lists

The [bases_10] script demonstrates how to split a string whose elements are separated by a common delimiter into a list, and how to join a list back into a string.

#  string to list
chaine = '1:2:3:4'
liste = chaine.split(':')
print(type(liste))

#  list display
print(f"liste a {len(liste)} éléments")
print(f"liste={liste}")

#  list to string
chaine2 = ":".join(liste)
print(f"chaine2={chaine2}")

#  add an empty field
chaine += ":"
print(f"chaine={chaine}")
liste = chaine.split(":")

#  list display
print(f"liste a {len(liste)} éléments")
print(f"liste={liste}")

#  let's add another empty field
chaine += ":"
print(f"chaine={chaine}")
liste = chaine.split(":")

#  list display
print(f"liste a {len(liste)} éléments")
print(f"liste={liste}")

Notes:

  • line 3: the method chaine.split(separator) splits the string chaine into elements separated by separator and returns them as a list. Thus, the expression '1:2:3:4'.split(":") returns the list ['1', '2', '3', '4'];
  • line 11: 'separator'.join(list) returns the string 'list[0]+separator+list[1]+separator+…'.
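Two related points are worth trying out: join() only accepts strings, and split() called without an argument behaves differently from split(":"). The sketch below uses illustrative names (champs, nombres, mots):

```python
chaine = "1:2:3:4"

# split then join gives back the original string
champs = chaine.split(":")
roundtrip = ":".join(champs) == chaine

# join only accepts strings: numbers must be converted first
nombres = [int(c) for c in champs]
chaine2 = ":".join(str(n) for n in nombres)

# without an argument, split cuts on runs of whitespace
# and produces no empty fields
mots = "  un   deux trois ".split()

print(roundtrip, nombres, chaine2, mots)
```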

Results


C:\Data\st-2020\dev\python\cours-2020\python3-flask-2020\venv\Scripts\python.exe C:/Data/st-2020/dev/python/cours-2020/python3-flask-2020/bases/bases_10.py
<class 'list'>
liste a 4 éléments
liste=['1', '2', '3', '4']
chaine2=1:2:3:4
chaine=1:2:3:4:
liste a 5 éléments
liste=['1', '2', '3', '4', '']
chaine=1:2:3:4::
liste a 6 éléments
liste=['1', '2', '3', '4', '', '']
 
Process finished with exit code 0

3.11. Script [bases_11]: Regular Expressions

The [bases_11] script demonstrates how to use regular expressions:

#  import the regular expressions module
import re


# --------------------------------------------------------------------------
def compare(modèle, chaine):
    #  compares the string [chaine] with the pattern [modèle]
    #  and displays the results
    print(f"\nRésultats({chaine},{modèle})")
    match = re.match(modèle, chaine)
    if match:
        print(match.groups())
    else:
        print(f"La chaîne [{chaine}] ne correspond pas au modèle [{modèle}]")


#  regular expressions in python
#  retrieve the various fields of a string
#  the pattern: a sequence of digits surrounded by any characters
#  we only want to retrieve the sequence of digits
modèle = r"^.*?(\d+).*?$"

#  the string is compared with the pattern
compare(modèle, "xyz1234abcd")
compare(modèle, "12 34")
compare(modèle, "abcd")

#  the pattern: a sequence of digits surrounded by any characters
#  we want the sequence of digits and the fields that precede and follow it
modèle = r"^(.*?)(\d+)(.*?)$"

#  the string is compared with the pattern
compare(modèle, "xyz1234abcd")
compare(modèle, "12 34")
compare(modèle, "abcd")

#  the pattern: a date in dd/mm/yy format
modèle = r"^\s*(\d\d)\/(\d\d)\/(\d\d)\s*$"
compare(modèle, "10/05/97")
compare(modèle, " 04/04/01 ")
compare(modèle, "5/1/01")

#  the pattern: a decimal number
modèle = r"^\s*([+-]?)\s*(\d+\.\d*|\.\d+|\d+)\s*$"
compare(modèle, "187.8")
compare(modèle, "-0.6")
compare(modèle, "4")
compare(modèle, ".6")
compare(modèle, "4.")
compare(modèle, " + 4")
#  end

Notes:

  • Note the [re] module imported on line 2. It contains the functions for handling regular expressions;
  • line 10: re.match(modèle, chaine) compares the beginning of the string with the regular expression (pattern). It returns a match object if the string matches the pattern, None otherwise. The test on line 11 relies on the fact that a match object is considered True and None is considered False;
  • line 12: match.groups() is a tuple whose elements are the parts of the string that match the elements of the regular expression enclosed in parentheses. In the pattern:
    • ^.*?(\d+).*?$, match.groups() will be a one-element tuple because there is one pair of parentheses;
    • ^(.*?)(\d+)(.*?)$, match.groups() will be a 3-element tuple because there are three pairs of parentheses;
  • line 21: the pattern is written as a raw string r"xxx". The r prefix does not create a regular expression: it tells Python not to interpret the backslashes in the string, so that sequences such as \d reach the [re] module unchanged;
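The [re] module offers several comparison functions. The sketch below contrasts re.match with re.search and re.fullmatch on an illustrative pattern, and shows the effect of the r prefix on a string:

```python
import re

pattern = r"\d+"

m1 = re.match(pattern, "xyz1234")      # tries only at the start of the string
m2 = re.search(pattern, "xyz1234")     # scans the string for a match
m3 = re.fullmatch(pattern, "1234")     # the whole string must match
m4 = re.fullmatch(pattern, "1234abc")  # trailing characters: no match

print(m1, m2.group(0), m3.group(0), m4)

# in a raw string the backslash is kept as a character
print(len("\n"), len(r"\n"))
```

In [bases_11] the patterns start with ^ and end with $, so re.match is enough to check the whole string.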

Regular expressions allow us to validate the format of a string. For example, we can verify that a string representing a date is in the dd/mm/yy format. To do this, we use a pattern and compare the string to that pattern. In this example, d, m, and y must be digits. The pattern for a valid date format is therefore "\d\d/\d\d/\d\d", where the symbol \d represents a digit. The symbols that can be used in a pattern are as follows:

Character   Description
\           Marks the following character as a special character or as a literal. For example, "n" matches the character "n", while "\n" matches a newline character. The sequence "\\" matches "\", while "\(" matches "(".
^           Matches the start of the string.
$           Matches the end of the string.
*           Matches the preceding character zero or more times. Thus, "zo*" matches "z" or "zoo".
+           Matches the preceding character one or more times. Thus, "zo+" matches "zoo", but not "z".
?           Matches the preceding character zero or one time. For example, "a?ve?" matches "ve" in "lever".
.           Matches any single character, except the newline character.
(pattern)   Searches for the pattern and stores the match. The matching substring can be retrieved from the match.groups() collection. To match the parenthesis characters ( ) themselves, use "\(" or "\)".
x|y         Matches either x or y. For example, "z|foot" matches "z" or "foot". "(z|f)oo" matches "zoo" or "foo".
{n}         n is a non-negative integer. Matches exactly n occurrences of the character. For example, "o{2}" does not match "o" in "Bob", but matches the first two "o"s in "fooooot".
{n,}        n is a non-negative integer. Matches at least n occurrences of the character. For example, "o{2,}" does not match "o" in "Bob", but matches all "o"s in "fooooot". "o{1,}" is equivalent to "o+" and "o{0,}" is equivalent to "o*".
{n,m}       m and n are non-negative integers. Matches at least n and at most m occurrences of the character. For example, "o{1,3}" matches the first three "o"s in "foooooot" and "o{0,1}" is equivalent to "o?".
[xyz]       Character set. Matches any of the specified characters. For example, "[abc]" matches "a" in "plat".
[^xyz]      Negative character set. Matches any character not listed. For example, "[^abc]" matches "p" in "plat".
[a-z]       Character range. Matches any character in the specified range. For example, "[a-z]" matches any lowercase alphabetical character between "a" and "z".
[^m-z]      Negative character range. Matches any character not in the specified range. For example, "[^m-z]" matches any character not between "m" and "z".
\b          Matches a word boundary, that is, the position between a word and a space. For example, "er\b" matches "er" in "lever", but not "er" in "verb".
\B          Matches a position that is not a word boundary. "en*t\B" matches "ent" in "bien entendu".
\d          Matches a character representing a digit. Equivalent to [0-9] for ASCII text.
\D          Matches a character that is not a digit. Equivalent to [^0-9] for ASCII text.
\f          Matches a form feed (page break) character.
\n          Matches a newline character.
\r          Matches a carriage return character.
\s          Matches any whitespace, including space, tab, page break, etc. Equivalent to "[ \f\n\r\t\v]" for ASCII text.
\S          Matches any non-whitespace character. Equivalent to "[^ \f\n\r\t\v]" for ASCII text.
\t          Matches a tab character.
\v          Matches a vertical tab character.
\w          Matches any character that can be part of a word, including the underscore. Equivalent to "[A-Za-z0-9_]" for ASCII text.
\W          Matches any character that cannot be part of a word. Equivalent to "[^A-Za-z0-9_]" for ASCII text.
\num        Matches num, where num is a positive integer. Refers to stored matches. For example, "(.)\1" matches two consecutive identical characters.
\n          Matches n, where n is an octal escape value. Octal escape values must consist of 1, 2, or 3 digits. For example, "\11" and "\011" both match a tab character. "\0011" is equivalent to "\001" & "1". Octal escape values must not exceed 256. If they do, only the first two digits are taken into account in the expression. Allows ASCII codes to be used in regular expressions.
\xn         Matches n, where n is a hexadecimal escape value. Hexadecimal escape values must consist of exactly two digits. For example, "\x41" matches "A". "\x041" is equivalent to "\x04" & "1". Allows the use of ASCII codes in regular expressions.

An element in a pattern may appear once or multiple times. Let’s look at some examples involving the \d symbol, which represents a single digit:

pattern     meaning
\d          a digit
\d?         0 or 1 digit
\d*         0 or more digits
\d+         1 or more digits
\d{2}       2 digits
\d{3,}      at least 3 digits
\d{5,7}     between 5 and 7 digits

Now let’s look at patterns that describe the expected format of a string:

pattern                     target string
\d{2}/\d{2}/\d{2}           a date in dd/mm/yy format
\d{2}:\d{2}:\d{2}           a time in hh:mm:ss format
\d+                         an unsigned integer
\s*                         a sequence of spaces, which may be empty
\s*\d+\s*                   an unsigned integer that may be preceded or followed by spaces
\s*[+-]?\s*\d+\s*           an integer that may be signed and preceded or followed by spaces
\s*\d+(\.\d*)?\s*           an unsigned real number that may be preceded or followed by spaces
\s*[+-]?\s*\d+(\.\d*)?\s*   a real number that may be signed and preceded or followed by spaces
\bjuste\b                   a string containing the word "juste"
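Patterns of this kind can be tested with re.fullmatch, which requires the whole string to match. The sketch below uses a hypothetical helper is_valid. It also shows two pitfalls: inside a character class, | is an ordinary character, and outside one, an unescaped . matches any character, so a literal decimal point must be written \.

```python
import re


def is_valid(pattern, text):
    # hypothetical helper: True if the whole text matches the pattern
    return re.fullmatch(pattern, text) is not None


signed_int = r"\s*[+-]?\s*\d+\s*"
real = r"\s*\d+(\.\d*)?\s*"

# looser variants, shown only to illustrate the two pitfalls
loose_int = r"\s*[+|-]?\s*\d+\s*"
loose_real = r"\s*\d+(.\d*)?\s*"

checks = [
    is_valid(signed_int, " - 42 "),  # valid signed integer
    is_valid(signed_int, "|42"),     # rejected
    is_valid(loose_int, "|42"),      # accepted by mistake: | is in the class
    is_valid(real, "3.14"),          # valid real number
    is_valid(real, "3x14"),          # rejected
    is_valid(loose_real, "3x14"),    # accepted by mistake: . matches x
]
print(checks)
```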

You can specify where to search for the pattern in the string:

pattern     meaning
^pattern    the pattern starts the string
pattern$    the pattern ends the string
^pattern$   the pattern starts and ends the string
pattern     the pattern is searched for anywhere in the string, starting from the beginning

pattern             string sought
!$                  a string ending with an exclamation point
\.$                 a string ending with a period
^//                 a string beginning with the sequence //
^\s*\w+\s*$         a string consisting of a single word, optionally preceded or followed by spaces
^\s*\w+\s+\w+\s*$   a string consisting of two words separated by spaces, optionally preceded or followed by spaces
\bsecret\b          a string containing the word secret

Sub-patterns of a pattern can be "extracted." Thus, not only can we verify that a string matches a particular pattern, but we can also extract from that string the elements corresponding to the sub-patterns of the pattern that have been enclosed in parentheses. For example, if we parse a string containing a date in the format dd/mm/yy and want to extract the dd, mm, and yy components of that date, we would use the pattern (\d\d)/(\d\d)/(\d\d).
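As an illustration, the components of a date can be extracted as follows. This is a sketch: the variable names are illustrative, and the second part uses named groups, an optional feature of the [re] module that the script itself does not use:

```python
import re

pattern = r"^\s*(\d\d)/(\d\d)/(\d\d)\s*$"
match = re.match(pattern, " 04/05/01 ")
if match:
    # one element per pair of parentheses
    dd, mm, yy = match.groups()
    print(dd, mm, yy)

# named groups make the extracted fields self-describing
named = r"^(?P<dd>\d\d)/(?P<mm>\d\d)/(?P<yy>\d\d)$"
match2 = re.match(named, "10/05/97")
fields = match2.groupdict()
print(match2.group("dd"), fields)
```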

Script results


C:\Data\st-2020\dev\python\cours-2020\python3-flask-2020\venv\Scripts\python.exe C:/Data/st-2020/dev/python/cours-2020/python3-flask-2020/bases/bases_11.py
 
Résultats(xyz1234abcd,^.*?(\d+).*?$)
('1234',)
 
Résultats(12 34,^.*?(\d+).*?$)
('12',)
 
Résultats(abcd,^.*?(\d+).*?$)
La chaîne [abcd] ne correspond pas au modèle [^.*?(\d+).*?$]

Résultats(xyz1234abcd,^(.*?)(\d+)(.*?)$)
('xyz', '1234', 'abcd')
 
Résultats(12 34,^(.*?)(\d+)(.*?)$)
('', '12', ' 34')
 
Résultats(abcd,^(.*?)(\d+)(.*?)$)
La chaîne [abcd] ne correspond pas au modèle [^(.*?)(\d+)(.*?)$]
 
Résultats(10/05/97,^\s*(\d\d)\/(\d\d)\/(\d\d)\s*$)
('10', '05', '97')
 
Résultats( 04/04/01 ,^\s*(\d\d)\/(\d\d)\/(\d\d)\s*$)
('04', '04', '01')
 
Résultats(5/1/01,^\s*(\d\d)\/(\d\d)\/(\d\d)\s*$)
La chaîne [5/1/01] ne correspond pas au modèle [^\s*(\d\d)\/(\d\d)\/(\d\d)\s*$]
 
Résultats(187.8,^\s*([+-]?)\s*(\d+\.\d*|\.\d+|\d+)\s*$)
('', '187.8')
 
Résultats(-0.6,^\s*([+-]?)\s*(\d+\.\d*|\.\d+|\d+)\s*$)
('-', '0.6')
 
Résultats(4,^\s*([+-]?)\s*(\d+\.\d*|\.\d+|\d+)\s*$)
('', '4')
 
Résultats(.6,^\s*([+-]?)\s*(\d+\.\d*|\.\d+|\d+)\s*$)
('', '.6')
 
Résultats(4.,^\s*([+-]?)\s*(\d+\.\d*|\.\d+|\d+)\s*$)
('', '4.')
 
Résultats( + 4,^\s*([+-]?)\s*(\d+\.\d*|\.\d+|\d+)\s*$)
('+', '4')
 
Process finished with exit code 0