4. Solving the three problems with Google Gemini
We present screenshots of the three Gemini sessions used to solve the three problems, in considerable detail. We will not repeat this level of detail for the other AIs tested: they operate in a similar manner, so only the most notable points will be reported for them.
4.1. Introduction
We refer back to the first Gemini screenshot provided earlier:
![]() |
- In [1], the Gemini URL;
- In [2], the version of Gemini used;
- In [3-5], the three problems posed to Gemini;
Gemini is a Google product available at the URL [https://gemini.google.com/]. To view a history of your question-and-answer sessions as shown above, you must create an account. Furthermore, like most of the other AIs tested, Gemini limits the number of questions you can ask and the number of files you can upload. When this limit is reached, the session ends, and you’re offered the option to continue it later. Since it’s quite frustrating to stop in the middle of a session, I signed up for a subscription. Fortunately, the first month of the Gemini subscription is free. I did the same with the other AIs that had these limits, namely ChatGPT, MistralAI, and ClaudeAI: I signed up for a one-month subscription, but in those cases the first month was paid. I didn’t encounter any limits with Grok. DeepSeek doesn’t announce any limits but sometimes responds with [Server busy] and interrupts the session, which amounts to setting limits without saying so.
From here on, I’ll refer to question-and-answer sessions simply as “sessions.” AIs most often use the English term “chat” or “conversation.”
Gemini’s interface for asking a question is as follows:
![]() | ![]() |
- In [1], your question;
- In [2], the icon that launches the AI to calculate the response;
- In [3-4], you can attach files;
4.2. Problem 1
The session for Problem 1 is as follows:
![]() |
- In [1], the question;
- In [2], the beginning of Gemini’s answer;
The rest of the answer is as follows:
![]() |
![]() |
![]() |
![]() |
The answer is correct. The other five AIs will also give the correct answer in a similar form.
4.3. Problem 2
4.3.1. Introduction
Here we recall the initial problem from the [python3-flask-2020] course. This is a text given to students in a tutorial.
![]() |
The table above allows us to calculate the tax in the simplified case of a taxpayer who has only their salary to declare. As indicated in note (1), the tax calculated in this way is the tax before three mechanisms are applied:
- The capping of the family quotient, which applies to high incomes;
- The discount, which applies to low-income earners;
- The tax reduction, which also applies to low-income earners;
Thus, the tax calculation involves the following steps [http://impotsurlerevenu.org/comprendre-le-calcul-de-l-impot/1217-calcul-de-l-impot-2019.php]:
![]() |
We propose to write a program to calculate a taxpayer’s tax liability for 2019 in the simplified case of a taxpayer who has only their salary to report.
4.3.1.1. Calculation of Gross Tax
Gross tax can be calculated as follows:
First, calculate the taxpayer’s number of shares:
- Each parent contributes 1 share;
- The first two children each contribute 1/2 share;
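- Subsequent children each contribute one share;
The number of shares is therefore:
- nbParts=1+nbChildren*0.5+max(0, nbChildren-2)*0.5 if the employee is unmarried;
- nbParts=2+nbChildren*0.5+max(0, nbChildren-2)*0.5 if they are married;
- where nbChildren is the number of children. The max(0, nbChildren-2) term adds the extra half-share only from the third child onward, so that two children give 2*0.5 = 1 share;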
- We calculate the taxable income R = 0.9 * S, where S is the annual salary;
- The family quotient QF is calculated as QF = R / nbParts;
- We calculate the gross tax I based on the following data (2019):
9,964 | 0 | 0 |
27,519 | 0.14 | 1,394.96 |
73,779 | 0.3 | 5,798 |
156,244 | 0.41 | 13,913.69 |
0 | 0.45 | 20,163.45 |
Each row has 3 fields: field1, field2, field3. To calculate tax I, we find the first row where QF <= field1 and take the values from that row. For example, for a married employee with two children and an annual salary S of 50,000 euros:
Taxable income: R=0.9*S=45,000
Number of shares: nbParts=2+2*0.5=3
Family quotient: QF=45,000/3=15,000
The first row where QF <= field1 is as follows:
27,519 | 0.14 | 1,394.96 |
Tax I is then equal to 0.14*R - 1394.96*nbParts = [0.14*45000 - 1394.96*3] = 2115. The tax is rounded down to the nearest euro.
If the condition QF <= field1 holds on the first line, then the tax is zero.
If QF is such that the condition QF <= field1 is never satisfied, then the coefficients from the last line are used. Here:
0 | 0.45 | 20,163.45 |
which gives the gross tax I = 0.45*R - 20163.45*nbParts.
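The steps above can be sketched in Python as follows. This is a minimal sketch, not the course's own code: the names `BRACKETS`, `nb_parts` and `gross_tax` are made up for illustration, the extra half-share is assumed to start at the third child, and the fourth rate is taken as 0.41, the value that matches the 41% rate reported in the results table of this section.

```python
import math

# 2019 rows as (field1, field2, field3); the last row is the fallback row.
# Assumption: the fourth rate is 0.41 (see the 41% rate in the results table).
BRACKETS = [
    (9964, 0.0, 0.0),
    (27519, 0.14, 1394.96),
    (73779, 0.30, 5798.0),
    (156244, 0.41, 13913.69),
    (0, 0.45, 20163.45),
]


def nb_parts(married: bool, nb_children: int) -> float:
    """Number of shares: 1 per adult, 1/2 per child, 1/2 more from the third child."""
    adults = 2 if married else 1
    return adults + 0.5 * nb_children + 0.5 * max(0, nb_children - 2)


def gross_tax(taxable_income: float, parts: float) -> int:
    """Gross tax I = field2 * R - field3 * nbParts, rounded down to the euro."""
    qf = taxable_income / parts
    # First row where QF <= field1, otherwise the last row.
    _, rate, constant = next(
        (row for row in BRACKETS[:-1] if qf <= row[0]), BRACKETS[-1]
    )
    # round(..., 2) removes floating-point noise before rounding down.
    return math.floor(round(rate * taxable_income - constant * parts, 2))


# Married employee, two children, taxable income R = 45,000 euros.
print(gross_tax(45000, nb_parts(True, 2)))
```

With the married employee of the example (R = 45,000 euros, 3 shares), the call returns the 2,115 euros computed by hand above.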
4.3.1.2. Family Quotient Cap
![]() |
To determine whether the family quotient (QF) cap applies, we recalculate the gross tax without the children. Again, for the married employee with two children and an annual salary S of 50,000 euros:
Taxable income: R = 0.9 * S = 45,000
Number of shares: nbParts=2 (children are no longer counted)
Family quotient: QF = 45,000 / 2 = 22,500
The first line where QF <= field1 is as follows:
27,519 | 0.14 | 1,394.96 |
Tax I is then equal to 0.14*R - 1394.96*nbParts = [0.14*45,000 - 1394.96*2] = 3,510.
Maximum child-related benefit: 1551 * 2 = 3102 euros
Minimum tax: 3,510 – 3,102 = 408 euros
The gross tax with 3 shares, already calculated in the previous paragraph (2,115 euros), is greater than the minimum tax (408 euros), so the family cap does not apply here.
In general, the gross tax is the greater of (tax1, tax2) where:
- [tax1]: is the gross tax calculated including children;
- [tax2]: is the gross tax calculated without children and reduced by the maximum credit (here 1,551 euros per half-share) related to children;
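The rule can be written as a single comparison. The sketch below assumes that the two gross taxes have already been computed; the name `capped_gross_tax` and its parameters are hypothetical, and only the 1,551 euro figure comes from the text.

```python
# Maximum benefit per half-share contributed by the children (2019 figure from the text).
CAP_PER_HALF_SHARE = 1551


def capped_gross_tax(tax_with_children: int, tax_without_children: int,
                     child_half_shares: int) -> int:
    """Return the greater of tax1 (with children) and tax2 (without children,
    reduced by the maximum child-related benefit)."""
    minimum_tax = tax_without_children - CAP_PER_HALF_SHARE * child_half_shares
    return max(tax_with_children, minimum_tax)


# Married employee, two children (2 half-shares): the cap does not apply.
print(capped_gross_tax(2115, 3510, 2))
```

For the example above the function returns 2,115 euros, because the minimum tax is only 3,510 - 3,102 = 408 euros.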
4.3.1.3. Calculation of the discount
![]() |
Still for the married employee with two children and an annual salary S of 50,000 euros:
The gross tax (2,115 euros) from the previous step is less than 2,627 euros for a couple (1,595 euros for a single person): the discount therefore applies. It is calculated as follows:
discount = threshold (couple = 1,970 / single = 1,196) - 0.75 * gross tax
discount = 1,970 - 0.75 * 2,115 = 383.75, rounded up to 384 euros.
New gross tax = 2,115 - 384 = 1,731 euros
Two rules must be observed when calculating the discount (some AI tools have stumbled on this issue):
- The discount cannot be negative;
- The discount cannot exceed the tax already calculated;
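A possible Python rendering of the discount, including the two rules just listed, is sketched below. The function name `discount` is an assumption made for illustration, and the result is rounded up to the next euro, which is the convention this document applies to discounts.

```python
import math


def discount(gross_tax: int, married: bool) -> int:
    """Discount as described in the text, rounded up to the next euro."""
    ceiling = 2627 if married else 1595    # the discount applies below this gross tax
    threshold = 1970 if married else 1196
    if gross_tax >= ceiling:
        return 0
    raw = threshold - 0.75 * gross_tax
    raw = max(0.0, raw)        # rule 1: the discount cannot be negative
    raw = min(raw, gross_tax)  # rule 2: it cannot exceed the tax already calculated
    return math.ceil(raw)


# Couple with a gross tax of 2,115 euros.
print(discount(2115, True))
```

The second rule matters for a couple whose gross tax is 720 euros: the raw formula gives 1,970 - 540 = 1,430 euros, but the discount is limited to 720 euros and the tax falls to zero.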
4.3.1.4. Calculation of the tax reduction
![]() |
Below a certain threshold, a 20% reduction is applied to the gross tax resulting from the previous calculations. In 2019, the thresholds are as follows:
- Single: 21,037 euros;
- couple: 42,074 euros; (the figure 37,968 used in the example above appears to be incorrect);
This threshold is increased by the value: 3,797 * (number of half-shares contributed by the children).
Again, for the married employee with two children and an annual salary S of 50,000 euros:
- His taxable income (45,000 euros) is below the threshold (42,074 + 2 × 3,797) = 49,668 euros;
- He is therefore entitled to a 20% reduction in his tax: 1,731 * 0.2 = 346.2 euros, rounded up to 347 euros;
- The taxpayer’s gross tax becomes: 1,731 – 347 = 1,384 euros;
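The reduction step, as stated in this section, can be sketched as follows. The function name `reduction` and its parameters are hypothetical. The sketch treats "below the threshold" as a strict comparison and rounds the reduction up to the next euro; both points are assumptions.

```python
import math


def reduction(tax: int, taxable_income: float, married: bool,
              child_half_shares: int) -> int:
    """20% reduction when the taxable income is below the threshold, else 0."""
    threshold = (42074 if married else 21037) + 3797 * child_half_shares
    if taxable_income >= threshold:
        return 0
    # 20% is one fifth; dividing by 5 avoids floating-point noise from 0.2 * tax.
    return math.ceil(tax / 5)


# Couple, two children (2 half-shares), taxable income 45,000, tax after discount 1,731.
print(reduction(1731, 45000, True, 2))
```

For the example, the threshold is 49,668 euros and the function returns 347 euros, which leaves a tax of 1,384 euros.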
4.3.1.5. Calculation of net tax
Our calculation ends here: the net tax due will be 1,384 euros. In reality, the taxpayer may be eligible for other deductions, particularly for donations to public or general interest organizations.
4.3.1.6. High-Income Cases
Our previous example applies to the majority of employees. However, the tax calculation differs for high-income earners.
4.3.1.6.1. Cap on the 10% deduction from annual income
In most cases, taxable income is calculated using the formula: R = 0.9 × S, where S is the annual salary. This is known as the 10% deduction. This deduction is bounded. In 2019:
- It cannot exceed 12,502 euros;
- It cannot be less than 437 euros;
Let’s consider the case of an unmarried employee with no children and an annual salary of 200,000 euros:
- The 10% deduction is 20,000 euros > 12,502 euros. It is therefore capped at 12,502 euros, and the taxable income is R = 200,000 - 12,502 = 187,498 euros;
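The bounded 10% deduction amounts to clamping a value between two limits. The sketch below is illustrative only: the name `taxable_income` is made up, and the last guard (the deduction never exceeds the salary itself) is an assumption that the text does not state.

```python
MIN_DEDUCTION = 437     # euros, 2019
MAX_DEDUCTION = 12502   # euros, 2019


def taxable_income(salary: float) -> float:
    """Taxable income R after the 10% deduction, bounded as described in the text."""
    deduction = 0.1 * salary
    deduction = max(deduction, MIN_DEDUCTION)
    deduction = min(deduction, MAX_DEDUCTION)
    # Assumption: the deduction cannot be larger than the salary.
    deduction = min(deduction, salary)
    return salary - deduction


print(taxable_income(200000))  # deduction capped at 12,502 euros
print(taxable_income(50000))   # plain 10% deduction
```

For the 200,000 euro salary the taxable income is 187,498 euros rather than 180,000 euros, which is what raises the tax of high-income earners.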
4.3.1.6.2. Family Quotient Cap
Let’s consider a case where the family cap described in the section |Family Quotient Cap| applies. Let’s take the case of a couple with three children and an annual income of 100,000 euros. Let’s go through the calculation steps again:
- The 10% deduction is 10,000 euros < 12,502 euros. The taxable income R is therefore 100,000 - 10,000 = 90,000 euros;
- The couple has nbParts = 2 + 0.5 × 2 + 1 = 4 shares;
- Their family quotient is therefore QF = R / nbParts = 90,000 / 4 = 22,500 euros;
- Their gross tax I1 with children is I1 = 0.14 × 90,000 - 1,394.96 × 4 = 7,020 euros;
- Their gross tax I2 without children:
- QF = 90,000 / 2 = 45,000 euros;
- I2 = 0.3 × 90,000 – 5,798 × 2 = 15,404 euros;
- The family quotient cap rule states that the benefit derived from children cannot exceed (1,551 × 4 half-shares) = 6,204 euros. However, here, it is I2 – I1 = 15,404 – 7,020 = 8,384 euros, which is greater than 6,204 euros;
- The gross tax is therefore recalculated as I3 = I2 - 6,204 = 15,404 - 6,204 = 9,200 euros;
- Since I3 > I1, tax I3 will be retained;
This couple will receive neither a discount nor a reduction, and their final tax will be 9,200 euros.
4.3.1.7. Official figures
Tax calculation is complex. Throughout this document, tests will be conducted using the following examples. The results are from the tax administration’s simulator |https://www3.impots.gouv.fr/simulateur/calcul_impot/2019/simplifie/index.htm|:
Taxpayer | Official results | Results from the document’s algorithm |
Couple with 2 children and an annual income of 55,555 euros | Tax = 2,815 euros; Tax rate = 14% | Tax = 2,814 euros; Tax rate = 14% |
Couple with 2 children and an annual income of 50,000 euros | Tax = 1,385 euros; Discount = 384 euros; Reduction = 346 euros; Tax rate = 14% | Tax = 1,384 euros; Discount = 384 euros; Reduction = 347 euros; Tax rate = 14% |
Couple with 3 children and an annual income of 50,000 euros | Tax = 0 euros; Discount = 720 euros; Reduction = 0 euros; Tax rate = 14% | Tax = 0 euros; Discount = 720 euros; Reduction = 0 euros; Tax rate = 14% |
Single with 2 children and an annual income of 100,000 euros | Tax = 19,884 euros; Discount = 0 euros; Reduction = 0 euros; Tax rate = 41% | Tax = 19,884 euros; Surcharge = 4,480 euros; Discount = 0 euros; Reduction = 0 euros; Tax rate = 41% |
Single with 3 children and an annual income of 100,000 euros | Tax = 16,782 euros; Discount = 0 euros; Reduction = 0 euros; Tax rate = 41% | Tax = 16,782 euros; Surcharge = 7,176 euros; Discount = 0 euros; Reduction = 0 euros; Tax rate = 41% |
Couple with 3 children and an annual income of 100,000 euros | Tax = 9,200 euros; Discount = 0 euros; Reduction = 0 euros; Tax rate = 30% | Tax = 9,200 euros; Surcharge = 2,180 euros; Discount = 0 euros; Reduction = 0 euros; Tax rate = 30% |
Couple with 5 children and an annual income of 100,000 euros | Tax = 4,230 euros; Discount = 0 euros; Reduction = 0 euros; Tax rate = 14% | Tax = 4,230 euros; Discount = 0 euros; Reduction = 0 euros; Tax rate = 14% |
Single with no children and an annual income of 100,000 euros | Tax = 22,986 euros; Discount = 0 euros; Reduction = 0 euros; Tax rate = 41% | Tax = 22,986 euros; Surcharge = 0 euros; Discount = 0 euros; Reduction = 0 euros; Tax rate = 41% |
Couple with 2 children and an annual income of 30,000 euros | Tax = 0 euros; Discount = 0 euros; Reduction = 0 euros; Tax rate = 0% | Tax = 0 euros; Discount = 0 euros; Reduction = 0 euros; Tax rate = 0% |
Single with no children and an annual income of 200,000 euros | Tax = 64,211 euros; Discount = 0 euros; Reduction = 0 euros; Tax rate = 45% | Tax = 64,210 euros; Surcharge = 7,498 euros; Discount = 0 euros; Reduction = 0 euros; Tax rate = 45% |
Couple with 3 children and an annual income of 200,000 euros | Tax = 42,843 euros; Discount = 0 euros; Reduction = 0 euros; Tax rate = 41% | Tax = 42,842 euros; Surcharge = 17,283 euros; Discount = 0 euros; Reduction = 0 euros; Tax rate = 41% |
In the table above, the “surcharge” refers to the additional amount paid by high-income earners due to two factors:
- The cap on the 10% deduction from annual income;
- The cap on the family allowance;
This indicator could not be verified because the tax authority’s simulator does not provide it.
We can see that the document’s algorithm calculates the correct tax amount every time, though with a margin of error of 1 euro. This margin of error stems from rounding. All monetary amounts are rounded up to the nearest euro in some cases and down to the nearest euro in others. Since I was not familiar with the official rules, the monetary amounts in the document’s algorithm were rounded:
- Up to the next euro for discounts and reductions;
- Down to the nearest euro for surcharges and the final tax;
We will ask the AI to perform this tax calculation.
4.3.2. Gemini Session Configuration
The question posed to Gemini is accompanied by two files:
![]() |
- In [1], the calculation just detailed has been put into a PDF that is provided to Gemini. Gemini will find there the exact rules for the simplified calculation of the 2019 tax on 2018 income;
- In [2], our instructions;
- In [3], the command to launch the AI;
Our instructions in the text file [instructionsAvecPDF.txt] are as follows:
These instructions are the result of numerous questions asked of Gemini. It quickly becomes clear that the AI needs to be very tightly guided if we want to get what we want. It was because of all this trial and error that the Gemini session was ultimately terminated for exceeding limits. Let’s examine the rest of these instructions:
- Line 1: We specify that the conversation should be in French. This instruction is for DeepSeek, which tended to speak English;
- Line 3: what we want;
- Line 5: We tell the AI to use the PDF we provided;
- Lines 7–14: a number of useful tips, especially for Problem 3 without the PDF. Several AIs got lost in the tax calculation;
- Lines 15–44: the 11 unit tests we want included in the generated script. Once the script is generated, we’ll run it in PyCharm and see if all 11 tests pass;
- Lines 46–53: Without these instructions, the AIs would generate unit tests seeking exact results that would fail;
- Lines 55–56: I tell the AI not to go online. The simplest solution is to use the PDF;
- Lines 58–59: The AI did not follow this instruction. I had to explicitly write it in a prompt when I noticed that a test had failed;
- Lines 61–65: I specify what type of Python script I want;
- Lines 67–69: I would have preferred a link to retrieve the generated script because displaying the code on screen takes time. It turned out that most AIs cannot do this. The links provided did not work;
- Lines 71–72: I would have liked to know the time the AI took to answer the question. Only Gemini was able to provide this information. The other AIs either did not respond to this instruction or provided arbitrary numbers, indicating they did not understand it;
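The point about exact results can be illustrated with a small sketch. It is not the test suite given to the AIs (that file is not reproduced here). It only shows, with the tax amounts from the results table above, why a tolerance of 1 euro is needed. The names `TAX_PAIRS` and `OneEuroToleranceTest` are made up for illustration.

```python
import unittest

# (official tax, tax from the document's algorithm), in euros,
# copied from the results table of this section.
TAX_PAIRS = [
    (2815, 2814), (1385, 1384), (0, 0), (19884, 19884),
    (16782, 16782), (9200, 9200), (4230, 4230), (22986, 22986),
    (0, 0), (64211, 64210), (42843, 42842),
]


class OneEuroToleranceTest(unittest.TestCase):
    def test_within_one_euro(self):
        # delta=1 accepts the rounding differences between the two columns.
        for official, computed in TAX_PAIRS:
            with self.subTest(official=official):
                self.assertAlmostEqual(computed, official, delta=1)

    def test_exact_comparison_would_fail(self):
        # Four of the eleven cases differ by exactly 1 euro, so a test
        # written with assertEqual would report four failures.
        mismatches = [pair for pair in TAX_PAIRS if pair[0] != pair[1]]
        self.assertEqual(len(mismatches), 4)
```

Saved in a file, the sketch can be run with `python -m unittest`, which is also how PyCharm launches unit tests.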
4.3.3. Gemini’s response
Gemini’s first response is as follows:
![]() |
- In [1-4], Gemini provides links to the part of the PDF or text file containing the instructions it is using at a given moment;
The rest is as follows:
![]() |
- In [1], Gemini states that it successfully executed all 11 unit tests. Most AIs made this claim for both Problem 2 and Problem 3, and often when the generated script was loaded, it did not work. This claim should therefore be taken with a grain of salt. For Gemini, however, this will prove to be true;
- In [2], a link that turns out not to work;
- In [3], only Gemini provided a realistic execution time;
So the link [2] does not work. We tell Gemini:
![]() |
Gemini’s response:
![]() |
- In [1], the Python script generated by Gemini;
We load this script into PyCharm and run it:
![]() |
- In [1], [gemini1] is the script generated by Gemini;
When the script is run, the following compilation errors appear:
- Line 30, the compilation error. [cite_start] is a marker used to generate a specific type of text;
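In the session, the fix was obtained by sending the logs back to Gemini. As an alternative, stray markers of this kind could also be removed locally before running the script. The sketch below is only an illustration under stated assumptions: it assumes the marker appears literally as `[cite_start]`, the `sample` line is hypothetical (the actual line 30 is not shown here), and the helper names are made up.

```python
import re

# Assumption: the stray marker appears literally as "[cite_start]".
CITE_MARKER = re.compile(r"\[cite_start\]")


def strip_cite_markers(source: str) -> str:
    """Remove every literal [cite_start] marker from the source text."""
    return CITE_MARKER.sub("", source)


def compiles(source: str) -> bool:
    """Return True if Python can compile the source, False on a syntax error."""
    try:
        compile(source, "<generated>", "exec")
        return True
    except SyntaxError:
        return False


# Hypothetical line polluted by a marker, then the cleaned version.
sample = "[cite_start]x = 1\n"
print(compiles(sample), compiles(strip_cite_markers(sample)))
```

Checking the cleaned text with `compile` before running it gives a quick way to see whether other markers or syntax problems remain.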
We put the logs above into a file [logs gemini1.txt] and give it to Gemini:
![]() |
Gemini’s response is then as follows:
![]() |
When loaded into PyCharm, running the new script generates exactly the same error. We tell Gemini this by providing the execution logs again:
![]() |
Gemini’s response is as follows:
![]() |
This time it works. All 11 unit tests pass. We tell Gemini:
![]() |
To which it replies:
![]() |
The script generated by Gemini followed the instructions given in the text file [instructionsWithPDF.txt]:
[Listing of the script generated by Gemini: 287 lines]
I haven’t verified this code. Since the 11 unit tests passed, I consider it “probably correct.” I haven’t done anything more for my own code than verifying these 11 tests.
4.4. Problem 3
Problem 3 is identical to Problem 2, except that we no longer provide the AI with the PDF containing the calculation rules to follow.
The initial question to Gemini is as follows:
![]() |
The instructions file in [1] is almost the same as for Problem 2, with the following differences:
1 - Express yourself in French.
2 - Can you generate a Python script to calculate the tax paid by families in 2019 on their 2018 income?
3 - You will use sources you find on the internet. In your answer, please list these sources.
4 - You must pay attention to the following points:
…
- Line 3: the AI is asked to find the rules for calculating the 2019 tax on 2018 income online. This is a more difficult exercise than the previous one;
Below, I am providing only parts of Gemini’s first answer:
![]() |
![]() |
The estimated time is plausible. We wait a long time for Gemini’s response.
As before, Gemini provided a download link for the generated script, but the link does not work. We tell it:
![]() |
Gemini’s response:
![]() |
We load the script into PyCharm under the name [gemini2]:
![]() |
We run it and… it doesn’t work. The execution logs are as follows:
"C:\Program Files\Python313\python.exe" "C:/Program Files/JetBrains/PyCharm 2025.2.0.1/plugins/python-ce/helpers/pycharm/_jb_unittest_runner.py" --path "C:\Data\st-2025\dev\python\code\python-flask-2025-cours\outils ia\gemini\gemini2.py"
Testing started at 5:23 PM ...
Launching unittests with arguments python -m unittest C:\Data\st-2025\dev\python\code\python-flask-2025-cours\outils ia\gemini\gemini2.py in C:\Data\st-2025\dev\python\code\python-flask-2025-cours
Failure
Traceback (most recent call last):
File "C:\Data\st-2025\dev\python\code\python-flask-2025-cours\outils ia\gemini\gemini2.py", line 278, in test_cas_2
self.assertAlmostEqual(import, 1385, delta=1)
~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^
AssertionError: 1691 != 1385 within 1 delta (306 difference)
Error
Traceback (most recent call last):
File "C:\Data\st-2025\dev\python\code\python-flask-2025-cours\outils ia\gemini\gemini2.py", line 291, in test_cas_3
tax, _, _ = calculate_final_tax(2, 3, 50000)
~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^
File "C:\Data\st-2025\dev\python\code\python-flask-2025-cours\outils ia\gemini\gemini2.py", line 187, in calculate_final_tax
discount, tax_after_discount = calculate_discount(tax_after_capping, adults)
~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Data\st-2025\dev\python\code\python-flask-2025-cours\outils ia\gemini\gemini2.py", line 134, in calculate_discount
discount = discount_threshold - (tax_before_discount * DISCOUNT_COEFFICIENT)
^^^^^^^^^^^^^^^^^
NameError: name 'COFFICIENT_DECOTE' is not defined. Did you mean: 'COEFFICIENT_DECOTE'?
Error
Traceback (most recent call last):
File "C:\Data\st-2025\dev\python\code\python-flask-2025-cours\outils ia\gemini\gemini2.py", line 316, in test_cas_9
self._verifier_cas(2, 2, 30000, (0, 0, 0))
~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^
File "C:\Data\st-2025\dev\python\code\python-flask-2025-cours\outils ia\gemini\gemini2.py", line 216, in _verifier_cas
calculated_tax, calculated_discount, calculated_reduction = calculate_final_tax(adults, children, income)
~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Data\st-2025\dev\python\code\python-flask-2025-cours\outils ia\gemini\gemini2.py", line 187, in calculate_final_tax
discount, tax_after_discount = calculate_discount(tax_after_capping, adults)
~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Data\st-2025\dev\python\code\python-flask-2025-cours\outils ia\gemini\gemini2.py", line 134, in calculate_discount
discount = discount_threshold - (tax_before_discount * DISCOUNT_COEFFICIENT)
^^^^^^^^^^^^^^^^^
NameError: name 'COFFICIENT_DECOTE' is not defined. Did you mean: 'COEFFICIENT_DECOTE'?
Ran 11 tests in 0.038s
FAILED (failures=1, errors=2)
Process finished with exit code 1
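- Line 11, a test failed;
- Lines 25 and 42: the same error in both cases. It is a runtime NameError rather than a compilation error: the script refers to a constant through the misspelled name COFFICIENT_DECOTE instead of COEFFICIENT_DECOTE;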
We put these logs into a text file that we give to Gemini:
![]() |
Gemini’s response:
![]() |
![]() |
We load the script into PyCharm and run it. More errors. We tell Gemini, attaching the execution logs again:
![]() |
Gemini’s response:
![]() |
![]() |
![]() |
We load this new script into PyCharm and run it. This time, all 11 unit tests pass:
The code generated by Gemini is as follows:
[Listing of the script generated by Gemini: 174 lines]
Again, I haven’t inspected this code. I simply noted that it passed all 11 tests successfully.
But one might be curious to know its reasoning, particularly for specific cases in tax calculation. Let’s ask it:
![]() |
This is a high-income case with both a possible 10% deduction cap and a possible family quotient cap.
Gemini’s response is as follows:
![]() |
![]() |
![]() |
![]() |
![]() |
These last two screenshots are interesting. Gemini uses a calculation method different from the one explained in the PDF. This calculation method can indeed be found online. The two methods are equivalent.
![]() |
![]() |
![]() |
The explanation is remarkably clear. It could be given as is to students to explain the tax calculation method.
Now let’s take another example, this time with low income. In this case, there may be a tax credit and a reduction:
![]() |
Gemini’s response is as follows:
![]() |
![]() |
![]() |
![]() |
![]() |
Here, we see that Gemini applies a rule that is not in the PDF. It probably found it online, but is the source reliable?
![]() |
Here, Gemini continues to apply an unknown rule (the special rule mentioned above).
![]() |
![]() |
![]() |
So Gemini’s results match those of the official tax simulator. But it used a rule not found in the PDF. Where is the error? We ask Gemini, attaching the PDF:
![]() |
Gemini’s response:
![]() |
![]() |
![]() |
![]() |
I think Gemini is right and that my PDF is incorrect. To verify this, I ask it to run a test:
- Where its reasoning would yield the same results as the official tax simulator;
- Where the reasoning in the PDF would give results different from those of the simulator;
![]() |
Gemini’s response is as follows:
![]() |
Here Gemini is wrong. I ran the simulator on this example and found the following:
![]() |
However, we’ll see that Gemini’s reasoning does indeed yield the results above. Let’s continue:
![]() |
![]() |
![]() |
![]() |
Very well. Noted. Let’s continue:
![]() |
![]() |
![]() |
![]() |
So Gemini found (tax, discount, reduction) = (431, 325, 1296), whereas the simulator I used gives (431, 324, 1297). So Gemini found the correct results to within 1 euro, but it doesn’t know it. We tell it:
![]() |
Gemini responds:
![]() |
![]() |
Now, we wonder if Gemini could generate a corrected PDF:
![]() |
Gemini’s response:
![]() |
So Gemini didn’t give me a link to a PDF, but it generated text so I could create the PDF myself. Although it’s cumbersome to include screenshots of the PDF here, I’m doing so to show readers the generative aspect of AI:
![]() |













To be honest, I haven't checked whether everything in this PDF is true. In any case, it's a perfect document for a tutorial, generated in just a few seconds.
However, we can have Gemini itself verify that its PDF is correct. We start a new conversation:
![]() |
- in [1], we included the PDF generated by Gemini [The Problem According to Gemini.pdf];
- in [2], [instructionsWithPDF2.txt] is identical to the instructions in [instructionsWithPDF.txt], except that we’ve added a twelfth unit test—the very one that showed the initial PDF was incorrect:
Curiously, it took several back-and-forth iterations before Gemini generated the correct script:
Question 2
![]() |
Question 3
![]() |
As has been done several times now, when the script generated and loaded into PyCharm fails, we provide Gemini with the text file containing the execution logs. Gemini understands them very well.
Question 4
![]() |
Question 5
![]() |
Question 6 and conclusion
![]() | ![]() |
We are now confident in the validity of the PDF generated by Gemini. The calculation rules provided therein are correct.
We will now do the same for the other five AIs, but we will keep our explanations very brief, except for ChatGPT, the current leader in AI. What interests us is whether or not the AI solves the three problems we present to it. In fact, the interfaces of all these AIs are very similar, and I proceeded with them in the same way as with Gemini. Readers are encouraged to replay the Gemini conversations with the AI of their choice.




















































































