5. XML and PHP
In this chapter, we will provide an introduction to using XML (eXtensible Markup Language) documents with PHP. We will do so in the context of the tax application discussed in the previous chapter.
5.1. XML Files and XSL Style Sheets
Consider the following XML file, which could represent the results of simulations:
<?xml version="1.0" encoding="windows-1252"?>
<simulations>
<simulation married="yes" children="2" salary="200000" tax="22504"/>
<simulation married="no" children="2" salary="200000" tax="33388"/>
</simulations>
When viewed with IE 6, the following result is obtained:

IE 6 recognizes that it is dealing with an XML file (thanks to the file’s .xml extension) and formats it in its own way. With Netscape, you get a blank page. However, if you look at the source code (View/Source), you can see the original XML file:

Why doesn’t Netscape display anything? Because it needs a stylesheet to tell it how to transform the XML file into an HTML file that it can then display. It turns out that IE 6 has a default stylesheet when the XML file does not provide one, which was the case here.
There is a language called XSL (eXtensible StyleSheet Language) that allows you to describe the transformations needed to convert an XML file into any text file. XSL supports numerous instructions and closely resembles programming languages. We won’t go into detail here, as that would take dozens of pages. We’ll simply describe two examples of XSL stylesheets. The first is the one that transforms the XML file simulations.xml into HTML code. We modify the XML file so that it names the stylesheet that browsers can use to transform it into an HTML document they can display:
<?xml version="1.0" encoding="windows-1252"?>
<?xml-stylesheet type="text/xsl" href="simulations.xsl"?>
<simulations>
<simulation married="yes" children="2" salary="200000" tax="22504"/>
<simulation married="no" children="2" salary="200000" tax="33388"/>
</simulations>
The XML instruction
<?xml-stylesheet type="text/xsl" href="simulations.xsl"?>
designates the simulations.xsl file as an xml-stylesheet of type text/xsl, i.e., a text file containing XSL code. This stylesheet will be used by browsers to transform the XML text into an HTML document. Here is the result obtained with Netscape 7 when loading the XML file simulations.xml:

When we view the document's source code (View/Source), we see the original XML document rather than the displayed HTML document:

Netscape used the simulations.xsl stylesheet to transform the XML document above into a displayable HTML document. Now it’s time to look at the contents of this stylesheet:
<?xml version="1.0" encoding="windows-1252"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" indent="yes"/>
<xsl:template match="/">
<html>
<head>
<title>Tax Calculation Simulations</title>
</head>
<body>
<center>
<h3>Tax Calculation Simulations</h3>
<hr/>
<table border="1">
<th>married</th><th>children</th><th>salary</th><th>tax</th>
<xsl:apply-templates select="/simulations/simulation"/>
</table>
</center>
</body>
</html>
</xsl:template>
<xsl:template match="simulation">
<tr>
<td><xsl:value-of select="@married"/></td>
<td><xsl:value-of select="@children"/></td>
<td><xsl:value-of select="@salary"/></td>
<td><xsl:value-of select="@tax"/></td>
</tr>
</xsl:template>
</xsl:stylesheet>
- An XSL stylesheet is an XML file and therefore follows XML rules. Among other things, it must be "well-formed," meaning that every opening tag must be closed.
- The file begins with two lines that can open any XSL stylesheet under Windows: the XML declaration and the opening tag of the root element <xsl:stylesheet>:
<?xml version="1.0" encoding="windows-1252"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
The encoding="windows-1252" attribute allows accented characters to be used in the stylesheet.
- The <xsl:output method="html" indent="yes"/> tag tells the XSL interpreter that we want to produce "indented" HTML.
- The <xsl:template match="element"> tag is used to define the element in the XML document to which the instructions found between <xsl:template ...> and </xsl:template> will be applied.
In the example above, the "/" element refers to the root of the document. This means that as soon as the start of the XML document is encountered, the XSL commands located between the two tags will be executed.
- Anything that is not an XSL tag is included as-is in the output stream. The XSL tags themselves are executed. Some of them produce a result in the output stream. Let’s examine the following example:
<xsl:template match="/">
<html>
<head>
<title>Tax Calculation Simulations</title>
</head>
<body>
<center>
<h3>Tax Calculation Simulations</h3>
<hr/>
<table border="1">
<th>married</th><th>children</th><th>salary</th><th>tax</th>
<xsl:apply-templates select="/simulations/simulation"/>
</table>
</center>
</body>
</html>
</xsl:template>
Note that the XML document being analyzed is as follows:
<?xml version="1.0" encoding="windows-1252"?>
<simulations>
<simulation married="yes" children="2" salary="200000" tax="22504"/>
<simulation married="no" children="2" salary="200000" tax="33388"/>
</simulations>
From the start of the parsed XML document (match="/"), the XSL processor will output the text
<html>
<head>
<title>Tax Calculation Simulations</title>
</head>
<body>
<center>
<h3>Tax Calculation Simulations</h3>
<hr>
<table border="1">
<th>married</th><th>children</th><th>salary</th><th>tax</th>
Note that in the stylesheet we wrote <hr/> and not <hr>. We could not write <hr> there because, while it is a valid HTML tag, it is not well-formed XML: a stylesheet is an XML document, so every tag must be closed. We therefore write <hr/>, and because we wrote <xsl:output method="html" .../>, the interpreter converts <hr/> into <hr> in the output. This text is followed by the text produced by the XSL command:
<xsl:apply-templates select="/simulations/simulation"/>
We will see later what this text is. Finally, the interpreter will add the text:
</table>
</center>
</body>
</html>
The command <xsl:apply-templates select="/simulations/simulation"/> instructs the interpreter to select every /simulations/simulation node, that is, every <simulation>..</simulation> or <simulation/> tag located inside a <simulations>..</simulations> tag of the parsed XML text, and to apply the matching template to each of them. For each such tag, the interpreter executes the instructions of the following template:
<xsl:template match="simulation">
<tr>
<td><xsl:value-of select="@married"/></td>
<td><xsl:value-of select="@children"/></td>
<td><xsl:value-of select="@salary"/></td>
<td><xsl:value-of select="@tax"/></td>
</tr>
</xsl:template>
Consider the following XML lines:
<simulations>
<simulation married="yes" children="2" salary="200000" tax="22504"/>
The line <simulation ..> is one of the nodes selected by the XSL instruction <xsl:apply-templates select="/simulations/simulation"/>. The XSL interpreter therefore looks for a template that matches it, finds <xsl:template match="simulation">, and executes it. Recall that anything that is not an XSL command is passed through unchanged by the XSL interpreter, while XSL commands are replaced by the result of their execution. The XSL instruction <xsl:value-of select="@field"/> is thus replaced by the value of the "field" attribute of the current node (here, a <simulation> node). Processing the <simulation> line above will produce the following output:
XSL | output |
<tr><td> | <tr><td> |
<xsl:value-of select="@married"/> | yes |
</td><td> | </td><td> |
<xsl:value-of select="@children"/> | 2 |
</td><td> | </td><td> |
<xsl:value-of select="@salary"/> | 200000 |
</td><td> | </td><td> |
<xsl:value-of select="@tax"/> | 22504 |
</td></tr> | </td></tr> |
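In total, the XML line
<simulation married="yes" children="2" salary="200000" tax="22504"/>
will be converted into the following HTML line:
<tr><td>yes</td><td>2</td><td>200000</td><td>22504</td></tr>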
All these explanations are a bit rudimentary, but it should now be clear to the reader that the following XML text:
<?xml version="1.0" encoding="windows-1252"?>
<?xml-stylesheet type="text/xsl" href="simulations.xsl"?>
<simulations>
<simulation married="yes" children="2" salary="200000" tax="22504"/>
<simulation married="no" children="2" salary="200000" tax="33388"/>
</simulations>
accompanied by the following XSL stylesheet simulations.xsl:
<?xml version="1.0" encoding="windows-1252"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" indent="yes"/>
<xsl:template match="/">
<html>
<head>
<title>Tax Calculation Simulations</title>
</head>
<body>
<center>
<h3>Tax Calculation Simulations</h3>
<hr/>
<table border="1">
<th>married</th><th>children</th><th>salary</th><th>tax</th>
<xsl:apply-templates select="/simulations/simulation"/>
</table>
</center>
</body>
</html>
</xsl:template>
<xsl:template match="simulation">
<tr>
<td><xsl:value-of select="@married"/></td>
<td><xsl:value-of select="@children"/></td>
<td><xsl:value-of select="@salary"/></td>
<td><xsl:value-of select="@tax"/></td>
</tr>
</xsl:template>
</xsl:stylesheet>
produces the following HTML text:
<html>
<head>
<title>Tax Calculation Simulations</title>
</head>
<body>
<center>
<h3>Tax Calculation Simulations</h3>
<hr>
<table border="1">
<th>married</th><th>children</th><th>salary</th><th>tax</th>
<tr>
<td>yes</td><td>2</td><td>200000</td><td>22504</td>
</tr>
<tr>
<td>no</td><td>2</td><td>200000</td><td>33388</td>
</tr>
</table>
</center>
</body>
</html>
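To make the two templates more concrete, here is a minimal PHP sketch that performs the same two steps by hand: it selects every simulation element under the root and writes one table row per element from its attributes. It is not an XSL processor and is not part of the tax application; the helper name simulationRows is hypothetical, and the sketch assumes that PHP's SimpleXML extension is available.

```php
<?php
// Hand-written equivalent of the "simulation" template of simulations.xsl.
// This is NOT an XSL processor: it only mirrors the same two steps.

// Hypothetical helper: builds the <tr> rows produced for each <simulation> node.
function simulationRows(string $xmlText): string {
    $doc = simplexml_load_string($xmlText);
    if ($doc === false) {
        // the text is not well-formed XML
        return "";
    }
    $rows = "";
    // same selection as select="/simulations/simulation"
    foreach ($doc->simulation as $simulation) {
        $rows .= "<tr>";
        // same role as <xsl:value-of select="@name"/>
        foreach (array("married", "children", "salary", "tax") as $name) {
            $rows .= "<td>" . htmlspecialchars((string) $simulation[$name]) . "</td>";
        }
        $rows .= "</tr>\n";
    }
    return $rows;
}

$xmlText = <<<XML
<simulations>
<simulation married="yes" children="2" salary="200000" tax="22504"/>
<simulation married="no" children="2" salary="200000" tax="33388"/>
</simulations>
XML;

echo simulationRows($xmlText);
```

Run from the command line, the sketch prints the two table rows, one per simulation, with the attribute values copied unchanged.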
The following XML file simulations.xml
<?xml version="1.0" encoding="windows-1252"?>
<?xml-stylesheet type="text/xsl" href="simulations.xsl"?>
<simulations>
<simulation married="yes" children="2" salary="200000" tax="22504"/>
<simulation married="no" children="2" salary="200000" tax="33388"/>
</simulations>
when viewed in a modern browser (here Netscape 7), is displayed as follows:

5.2. Tax Application: Version 5
5.2.1. The XML files and XSL style sheets of the tax application
Let’s return to our starting point, the tax web application, and recall that we want to modify it so that the response sent to clients is in XML format rather than HTML. This XML response will be accompanied by an XSL stylesheet so that browsers can display it. In the previous section, we presented:
- the simulations.xml file, which is a prototype of an XML response containing tax calculation simulations
- the simulations.xsl file, which will be the XSL stylesheet accompanying this XML response
We must also account for the case of a response containing errors. The prototype for the XML response in this case will be the following errors.xml file:
<?xml version="1.0" encoding="windows-1252"?>
<?xml-stylesheet type="text/xsl" href="errors.xsl"?>
<errors>
<error>error 1</error>
<error>error 2</error>
</errors>
The errors.xsl stylesheet used to display this XML document in a browser will be as follows:
<?xml version="1.0" encoding="windows-1252"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" indent="yes"/>
<xsl:template match="/">
<html>
<head>
<title>Tax Calculation Simulations</title>
</head>
<body>
<center>
<h3>Tax Calculation Simulations</h3>
</center>
<hr/>
The following errors occurred:
<ul>
<xsl:apply-templates select="/errors/error"/>
</ul>
</body>
</html>
</xsl:template>
<xsl:template match="error">
<li><xsl:value-of select="."/></li>
</xsl:template>
</xsl:stylesheet>
This stylesheet introduces an XSL command not yet encountered: <xsl:value-of select="."/>. This command outputs the value of the parsed node, in this case a <error>text</error> node. The value of this node is the text between the opening and closing tags, in this case "text".
The errors.xml code is transformed by the errors.xsl stylesheet into the following HTML document:
<html>
<head>
<title>Tax Calculation Simulations</title>
</head>
<body>
<center>
<h3>Tax Calculation Simulations</h3>
</center>
<hr>
The following errors occurred:
<ul>
<li>error 1</li>
<li>error 2</li>
</ul>
</body>
</html>
The errors.xml file, along with its style sheet, is displayed by a browser as follows:

5.2.2. The xmlsimulations application
We create an xmlsimulations.html file and place it in the impots application directory. The displayed page is as follows:

This HTML document is a static document. Its code is as follows:
<html>
<head>
<title>impots</title>
<script language="JavaScript" type="text/javascript">
function clear(){
// clear the form
with(document.frmImpots){
optMarie[0].checked=false;
optMarie[1].checked=true;
txtEnfants.value="";
txtSalary.value="";
}//with
}//clear
function calculate(){
// check the parameters before sending them to the server
with(document.frmImpots){
// number of children
fields=/^\s*(\d+)\s*$/.exec(txtEnfants.value);
if(fields==null){
// the pattern does not match
alert("The number of children was not provided or is incorrect");
txtEnfants.focus();
return;
}//if
// salary
fields=/^\s*(\d+)\s*$/.exec(txtSalary.value);
if(fields==null){
// the pattern does not match
alert("Salary was not provided or is incorrect");
txtSalary.focus();
return;
}//if
// OK - submit
submit();
}//with
}//calculate
</script>
</head>
<body background="/poly/impots/7/images/standard.jpg">
<center>
Tax Calculation
<hr>
<form name="frmImpots" action="/poly/impots/7/xmlsimulations.php" method="POST">
<table>
<tr>
<td>Are you married?</td>
<td>
<input type="radio" name="optMarie" value="yes">yes
<input type="radio" name="optMarie" value="no" checked>no
</td>
</tr>
<tr>
<td>Number of children</td>
<td><input type="text" size="3" name="txtEnfants" value=""></td>
</tr>
<tr>
<td>Annual salary</td>
<td><input type="text" size="10" name="txtSalary" value=""></td>
</tr>
<tr></tr>
<tr>
<td><input type="button" value="Calculate" onclick="calculate()"></td>
<td><input type="button" value="Clear" onclick="clear()"></td>
</tr>
</table>
</form>
</center>
</body>
</html>
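The checks made by calculate() run only in the browser, so the server cannot rely on them. As an illustration, the same pattern can be applied in PHP with preg_match. The helper name extractDigits below is hypothetical and is not the application's own validation function; this is only a sketch of the pattern test.

```php
<?php
// Hypothetical helper mirroring the JavaScript pattern /^\s*(\d+)\s*$/ :
// returns the digits found in $value, or null when the pattern does not match.
function extractDigits(string $value): ?string {
    if (preg_match('/^\s*(\d+)\s*$/', $value, $fields) === 1) {
        // $fields[1] holds the text captured by (\d+)
        return $fields[1];
    }
    return null;
}

var_dump(extractDigits(" 2 "));     // pattern matches, the digits are "2"
var_dump(extractDigits("200000"));  // pattern matches, the digits are "200000"
var_dump(extractDigits("two"));     // no match: null
var_dump(extractDigits(""));        // no match: null
```

Returning null rather than false keeps "no match" distinct from a valid value such as "0".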
Note that the form data is posted to the URL /poly/impots/7/xmlsimulations.php. The code for the xmlsimulations.php application is very similar to that of the impots.php application. The reader is encouraged to review the latter. Here is a reminder of the input code:
<?php
// processes the tax form
// libraries
include "ImpotsDSN.php";
// start session
session_start();
// application configuration
ini_set("register_globals", "off");
ini_set("display_errors", "off");
$taxForm = "impots_form.php";
$taxErrors = "tax_errors.php";
$taxDB = array(dsn => "mysql-dbimpots", user => "admimpots", pwd => "mdpimpots",
table=>taxes,limits=>limits,coeffR=>coeffR,coeffN=>coeffN);
// retrieve session parameters
$session = $_SESSION["session"];
// Is the session valid?
if(!isset($session) || !isset($session[objImpots]) || !isset($session[simulations])){
// start a new session
$session = array(objImpots => new ImpotsDSN($taxDB), simulations => array());
// any errors?
if(count($session[objImpots]->errors)!=0){
$request = array(errors => $session[objImpots]->errors);
// display error page
include $taxErrors;
// end
$session = array();
endSession($session);
}//if
}//if
// retrieve the parameters of the current transaction
$request[married] = $_POST["optMarie"];
$request[children] = $_POST["txtEnfants"];
$request[salary] = $_POST["txtSalary"];
// Do we have all the parameters?
if(!isset($request[married]) || !isset($request[children]) || !isset($request[salary])){
// display empty form
$request = array(chkoui => "", chknon => "checked", children => "", salary => "", taxes => "",
errors=>array(),simulations=>array());
include $taxForm;
// end
endSession($session);
}//if
// parameter validation
$request = validate($request);
// any errors?
if(count($request[errors])!=0){
// display form
include "$taxForm";
// end
endSession($session);
}//if
// calculate tax due
$request[taxes] = $session[objImpots]->calculate(array(married => $request[married],
children=>$request[children],salary=>$request[salary]));
// another simulation
$session[simulations][] = array($request[married], $request[children], $request[salary], $request[taxes]);
$request[simulations] = $session[simulations];
// Display form
include "$taxForm";
// end
endSession($session);
...
The HTML pages were displayed using the "include" lines. Here, we want to generate XML instead of HTML. Simply create two new files, impots_erreurs.php and impots_simulations.php, to generate XML instead of HTML. The rest of the application remains unchanged. The code then becomes as follows:
<?php
// process the tax form
// libraries
include "ImpotsDSN.php";
// start session
session_start();
// application configuration
ini_set("register_globals", "off");
ini_set("display_errors", "off");
$taxForm="xmlsimulations.html";
$taxErrors = "impots_erreurs.php";
$taxSimulations = "impots_simulations.php";
$taxDB = array(dsn => "mysql-dbimpots", user => "admimpots", pwd => "mdpimpots",
table=>taxes,limits=>limits,coeffR=>coeffR,coeffN=>coeffN);
// retrieve session parameters
$session = $_SESSION["session"];
// Is the session valid?
if(!isset($session) || !isset($session[objImpots]) || !isset($session[simulations])){
// start a new session
$session = array(objImpots => new ImpotsDSN($taxDB), simulations => array());
// any errors?
if(count($session[objImpots]->errors)!=0){
$request = array(errors => $session[objImpots]->errors);
// display error page in XML format
header("Content-type: text/xml");
include $taxErrors;
// end
$session = array();
endSession($session);
}//if
}//if
// retrieve the parameters of the current transaction
$request[married] = $_POST["optMarie"];
$request[children] = $_POST["txtEnfants"];
$request[salary] = $_POST["txtSalary"];
// Do we have all the parameters?
if(!isset($request[married]) || !isset($request[children]) || !isset($request[salary])){
// display empty form
include $taxForm;
// end
endSession($session);
}//if
// parameter validation
$request = validate($request);
// any errors?
if(count($request[errors])!=0){
// display errors in XML format
header("Content-type: text/xml");
include "$taxErrors";
// end
endSession($session);
}//if
// calculate the tax due
$request[taxes] = $session[objImpots]->calculate(array(married => $request[married],
children=>$request[children],salary=>$request[salary]));
// another simulation
$session[simulations][] = array($request[married], $request[children], $request[salary], $request[taxes]);
$request[simulations] = $session[simulations];
// display simulations in XML format
header("Content-type: text/xml");
include "$taxSimulations";
// end
endSession($session);
We previously presented and examined the two types of XML responses to be provided, as well as the style sheets that must accompany them. The code for the impots_simulations.php application is as follows:
<?php
// generates the XML code for the application's tax simulation page
// some constants
$xslSimulations="simulations.xsl";
// XML headers
echo "<?xml version=\"1.0\" encoding=\"ISO-8859-1\" ?>\n";
echo "<?xml-stylesheet type=\"text/xsl\" href=\"$xslSimulations\" ?>\n";
// the simulations
echo "<simulations>\n";
for ($i=0; $i<count($request[simulations]); $i++){
// simulation $i
echo "<simulation married=\"".$request[simulations][$i][0]."\" ".
"children=\"".$request[simulations][$i][1]."\" ".
"salary=\"".$request[simulations][$i][2]."\" ".
"tax=\"".$request[simulations][$i][3]."\" />\n";
}//for
echo "</simulations>\n";
?>
This code uses the $request dictionary to generate XML code similar to the following:
<?xml version="1.0" encoding="ISO-8859-1" ?>
<?xml-stylesheet type="text/xsl" href="simulations.xsl" ?>
<simulations>
<simulation married="no" children="3" salary="200000" tax="22504" />
<simulation married="yes" children="3" salary="200000" tax="16400" />
<simulation married="yes" children="2" salary="200000" tax="22504" />
</simulations>
The simulations.xsl stylesheet will transform this XML code into HTML code.
The code for the impots_erreurs.php application is as follows:
<?php
// generates the XML code for the application's error page
// some constants
$xslErrors="errors.xsl";
// XML headers
echo "<?xml version=\"1.0\" encoding=\"ISO-8859-1\" ?>\n";
echo "<?xml-stylesheet type=\"text/xsl\" href=\"$xslErrors\" ?>\n";
// the errors
echo "<errors>\n";
for ($i=0;$i<count($request[errors]);$i++){
// error $i
echo "<error>".$request[errors][$i]."</error>";
}//for
echo "</errors>\n";
?>
This code uses the $request dictionary to generate XML code similar to the following:
<?xml version="1.0" encoding="ISO-8859-1" ?>
<?xml-stylesheet type="text/xsl" href="errors.xsl" ?>
<errors>
<error>Unable to open the DSN database [mysql-dbimpots] (S1000)</error>
</errors>
The errors.xsl stylesheet will transform this XML code into HTML code.
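One point the code above leaves open is what happens when an error message contains a character that is special in XML, such as & or <: written as-is, it would make the response malformed. A minimal sketch of one way to guard against this, assuming the messages are ISO-8859-1 text, is to pass each message through htmlspecialchars. The helper name buildErrorsXml is hypothetical and is not part of the application.

```php
<?php
// Hypothetical helper: builds the <errors> document from a list of messages,
// escaping the characters that are special in XML (&, <, >, quotes).
function buildErrorsXml(array $errors, string $xslErrors = "errors.xsl"): string {
    $xml  = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\" ?>\n";
    $xml .= "<?xml-stylesheet type=\"text/xsl\" href=\"$xslErrors\" ?>\n";
    $xml .= "<errors>\n";
    foreach ($errors as $error) {
        // ENT_XML1 asks for XML-style entities, ENT_QUOTES also escapes quotes
        $xml .= "<error>" . htmlspecialchars($error, ENT_QUOTES | ENT_XML1, "ISO-8859-1") . "</error>\n";
    }
    $xml .= "</errors>\n";
    return $xml;
}

echo buildErrorsXml(array("salary < 0 is not allowed", "children & salary are required"));
```

In the generated text, < appears as &lt; and & as &amp;, so the document stays well-formed and the stylesheet displays the original characters.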
Let's look at a first example:

The MySQL DBMS is not running. We then receive the following response:

If we look at the source code received by the browser, we see the following:
<?xml version="1.0" encoding="ISO-8859-1" ?>
<?xml-stylesheet type="text/xsl" href="errors.xsl" ?>
<errors>
<error>Unable to open the DSN database [mysql-dbimpots] (S1000)</error>
</errors>
Now, we launch the MySQL DBMS and run a series of simulations. We get the following response:

The code received by the browser is as follows:
<?xml version="1.0" encoding="ISO-8859-1" ?>
<?xml-stylesheet type="text/xsl" href="simulations.xsl" ?>
<simulations>
<simulation married="yes" children="2" salary="200000" tax="22504" />
<simulation married="no" children="2" salary="200000" tax="33388" />
<simulation married="no" children="3" salary="200000" tax="22504" />
<simulation married="yes" children="3" salary="200000" tax="16400" />
</simulations>
Note that our new application is easier to maintain than the previous one. Some of the work has been transferred to the XSL stylesheets. The advantage of this new division of tasks is that once the XML format of the responses has been established, the development of the stylesheets is independent of that of the application.
5.3. Parsing an XML Document in PHP
The next version of our tax application will be a client program for the previous xmlsimulations application. Our client will therefore receive XML code that it must parse to extract the information it needs. We will now take a break from our various versions and learn how to parse an XML document in PHP. We will do this using the following example:
dos>e:\php43\php.exe xmlParser.php
syntax: xmlParser.php XMLfile
The xmlParser.php application accepts one parameter: the URI (Uniform Resource Identifier) of the XML document to be parsed. In our example, this URI will simply be the name of an XML file located in the xmlParser.php application directory. Let’s consider two examples of execution. In the first example, the XML file being parsed is the following errors.xml file:
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/xsl" href="errors.xsl"?>
<errors>
<error>error 1</error>
<error>error 2</error>
</errors>
The analysis yields the following results:
dos>e:\php43\php.exe xmlParser.php errors.xml
ERRORS
ERROR
[error 1]
/ERROR
ERROR
[error 2]
/ERROR
/ERRORS
We haven't yet described what the xmlParser.php application does, but here we can see that it displays the structure of the parsed XML document. The second example parses the following XML file, simulations.xml:
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/xsl" href="simulations.xsl"?>
<simulations>
<simulation married="yes" children="2" salary="200000" tax="22504"/>
<simulation married="no" children="2" salary="200000" tax="33388"/>
</simulations>
The analysis yields the following results:
dos>e:\php43\php.exe xmlParser.php simulations.xml
SIMULATIONS
SIMULATION,(MARRIED,yes) (CHILDREN,2) (SALARY,200000) (TAX,22504)
/SIMULATION
SIMULATION,(MARRIED,no) (CHILDREN,2) (SALARY,200000) (TAX,33388)
/SIMULATION
/SIMULATIONS
The xmlParser.php application contains everything we need in our tax application since it was able to retrieve both the errors and the simulations that the web server might send. Let's examine its code:
<?php
// syntax $0 XMLFile
// displays the structure and content of the XML file
// call verification
if(count($argv)!=2){
// error message
fwrite(STDERR, "syntax: $argv[0] XMLfile\n");
// end
exit(1);
}//if
// initializations
$file = $argv[1]; // the XML file
$depth=0; // indentation level = depth in the tree
// the program
// create an XML parsing object
$xml_parser = xml_parser_create();
// specify which functions to execute at the start and end of tags
xml_set_element_handler($xml_parser, "startElement", "endElement");
// specify which function to execute when data is encountered
xml_set_character_data_handler($xml_parser, "displayData");
// Open the XML file for reading
if (! ($fp = @fopen($file, "r"))) {
fwrite(STDERR, "Unable to open the XML file $file\n");
exit(2);
}//if
// process the XML file in 4096-byte blocks
while($data=fread($fp,4096)){
// parse the read data
if (! xml_parse($xml_parser,$data,feof($fp))){
// An error occurred
fprintf(STDERR, "XML error: %s on line %d\n",
xml_error_string(xml_get_error_code($xml_parser)),
xml_get_current_line_number($xml_parser));
// end
exit(3);
}//if
}//while
// the file has been parsed
// free the resources used by the XML parser
xml_parser_free($xml_parser);
// end
exit(0);
// -----------------------------------------------------------
// function called when an opening tag is encountered
function startElement($parser, $name, $attributes) {
global $depth;
// a sequence of spaces (indentation)
for($i=0;$i<$depth;$i++){
print " ";
}//for
// attributes
$details="";
foreach($attributes as $attrib => $value){
$details.="($attrib,$value) ";
}//foreach
// Display the tag name and any attributes
if($details)
print "$name,$details\n";
else print "$name\n";
// one more level in the tree
$depth++;
}//startElement
// -----------------------------------------------------------
// the function called when an end tag is encountered
function endElement($parser, $name) {
// end tag
// indentation level
global $depth;
$depth--;
// a sequence of spaces (indentation)
for($i=0;$i<$depth;$i++){
echo " ";
}//for
// the tag name
echo "/$name\n";
}//endElement
// -----------------------------------------------------------
// the data display function
function displayData($parser, $data) {
// indentation level
global $depth;
// the data is displayed
$data = trim($data);
if($data!=""){
// a sequence of spaces (indentation)
for($i=0;$i<$depth;$i++){
echo " ";
}//for
echo "[$data]\n";
}//if
}//displayData
?>
Let's examine the code related to XML. To parse an XML document, our application needs an XML parser.
When the parser analyzes the XML document, it will trigger events such as: I have encountered the start of the document, the start of a tag, a tag attribute, the content of a tag, the end of a tag, the end of the document, ... It passes these events to methods that we must specify:
<?php
...
// specify which functions to execute at the start and end of a tag
xml_set_element_handler($xml_parser, "startElement", "endElement");
// specify which function to execute when data is encountered
xml_set_character_data_handler($xml_parser, "displayData");
event emitted by the parser | processing method |
start of a tag | function startElement($parser, $name, $attributes). $parser: the document parser. $name: name of the element, in uppercase; if the element is <simulations>, $name is "SIMULATIONS". $attributes: dictionary of the tag's attributes, indexed by the attribute name in uppercase. |
content of a tag | function displayData($parser, $data). $parser: the document parser. $data: the data of the tag. |
end of a tag | function endElement($parser, $name). The parameters have the same meaning as in startElement. |
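The order in which these handlers are called can be observed with a small experiment. The sketch below is not part of xmlParser.php: it registers three anonymous functions as handlers and records each call in a hypothetical $events array.

```php
<?php
// Minimal sketch: record every event the parser emits for a small document.
$events = array();

$parser = xml_parser_create();
xml_set_element_handler(
    $parser,
    // called at the start of each tag
    function ($parser, $name, $attributes) use (&$events) { $events[] = "start $name"; },
    // called at the end of each tag
    function ($parser, $name) use (&$events) { $events[] = "end $name"; }
);
xml_set_character_data_handler(
    $parser,
    // called when tag content is read; whitespace-only data is ignored
    function ($parser, $data) use (&$events) {
        if (trim($data) !== "") { $events[] = "data [" . trim($data) . "]"; }
    }
);

// the third argument (true) says that this string is the whole document
xml_parse($parser, "<errors><error>error 1</error></errors>", true);

print_r($events);
```

The recorded list shows the nesting of the document: the start of ERRORS, then the start, the content and the end of ERROR, then the end of ERRORS.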
The startElement function retrieves the element's attributes via the $attributes parameter. This parameter is a dictionary of the tag's attributes, indexed by the attribute names in uppercase. So, if we have the following tag:
<simulation married="yes" children="2" salary="200000" tax="22504"/>
the $attributes dictionary will be as follows: array("MARRIED"=>"yes","CHILDREN"=>"2","SALARY"=>"200000","TAX"=>"22504")
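Tag and attribute names reach the handlers in uppercase because the parser's case-folding option is enabled by default. As a sketch, the option can be switched off with xml_parser_set_option so that names keep the case used in the document. The helper name firstTagAttributes is hypothetical and exists only for this illustration.

```php
<?php
// Hypothetical helper: returns the attribute dictionary of the first tag
// of $xmlText, with case folding switched on or off.
function firstTagAttributes(string $xmlText, bool $caseFolding): array {
    $found = array();
    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, $caseFolding ? 1 : 0);
    xml_set_element_handler(
        $parser,
        // keep the attributes of the first opening tag only
        function ($parser, $name, $attributes) use (&$found) {
            if (count($found) === 0) { $found = $attributes; }
        },
        // nothing to do at the end of a tag
        function ($parser, $name) { }
    );
    xml_parse($parser, $xmlText, true);
    return $found;
}

$tag = '<simulation married="yes" children="2" salary="200000" tax="22504"/>';
print_r(firstTagAttributes($tag, true));   // keys in uppercase: MARRIED, CHILDREN, ...
print_r(firstTagAttributes($tag, false));  // keys as written: married, children, ...
```

Whichever setting is chosen, the handlers must use the same case when they read the dictionary.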
Once the parser and the preceding methods have been defined, a document is parsed using the xml_parse function:
<?php
...
// Process the XML file in 4096-byte blocks
while($data=fread($fp,4096)){
// parse the read data
if (! xml_parse($xml_parser,$data,feof($fp))){
...
}//if
}//while
function xml_parse($parser, $doc, $end): the parser $parser parses the document $doc, which may be a fragment of a larger document. The $end parameter indicates whether this is the last fragment (true) or not (false). While $doc is being parsed, the functions registered with xml_set_element_handler are called at the start and end of each tag, and the function registered with xml_set_character_data_handler is called each time the content of a tag is read.
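The role of the $end parameter is easier to see on a document held in memory. In the sketch below, which is not taken from xmlParser.php, one document is deliberately cut in the middle of a closing tag and passed to xml_parse in two fragments; the variable names are chosen for the illustration.

```php
<?php
// Minimal sketch: feed one document to xml_parse in two fragments.
$names = array();

$parser = xml_parser_create();
xml_set_element_handler(
    $parser,
    // remember the name of every opening tag
    function ($parser, $name, $attributes) use (&$names) { $names[] = $name; },
    function ($parser, $name) { }
);

// the cut falls inside the closing tag </error>
$fragments = array("<errors><error>error 1</er", "ror><error>error 2</error></errors>");
$last = count($fragments) - 1;
foreach ($fragments as $i => $fragment) {
    // the third argument is true only for the final fragment
    if (!xml_parse($parser, $fragment, $i === $last)) {
        echo "XML error\n";
        break;
    }
}

print_r($names);
```

The parser keeps its state between calls, so the tag cut in two is handled as if the document had been given in one piece.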
During the parsing of the XML document, errors may occur, particularly if the XML document is "malformed," for example, when closing tags are missing. In this case, the xml_parse function returns a value of "false":
<?php
...
if (! xml_parse($xml_parser,$data,feof($fp))){
// an error occurred
fprintf(STDERR, "XML error: %s on line %d\n",
xml_error_string(xml_get_error_code($xml_parser)),
xml_get_current_line_number($xml_parser));
// end
exit(3);
}//if
- function xml_get_error_code($parser): returns the code of the last error that occurred
- function xml_error_string($code): returns the error message associated with the code passed as a parameter
- function xml_get_current_line_number($parser): returns the line number of the XML document currently being parsed
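These functions can be tried on a deliberately malformed document. The sketch below is only an illustration: the closing tag of the error element is missing, and the exact wording of the message depends on the underlying XML library, so none is shown here.

```php
<?php
// Minimal sketch: parse a malformed document and report the error.
$parser = xml_parser_create();

// the closing tag </error> is missing: the document is not well-formed
$ok = xml_parse($parser, "<errors><error>error 1</errors>", true);

if (!$ok) {
    $code    = xml_get_error_code($parser);           // numeric error code
    $message = xml_error_string($code);               // text associated with the code
    $line    = xml_get_current_line_number($parser);  // line reached by the parser
    echo "XML error: $message on line $line\n";
}
```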
Once the document has been parsed, the resources allocated to the parser are released:
function xml_parser_free($parser)
With this explanation, the previous program and its execution examples are self-explanatory.
5.4. Tax application: version 6
We now have all the elements to write a client program for our tax service that delivers XML. We’ll use version 4 of our application for the client and keep version 5 for the server. In this client-server application:
- the tax calculation simulation service is handled by the xmlsimulations.php application. The server’s response is therefore in XML format, as we saw in version 5.
- The client is no longer a browser but a standalone PHP client. Its console interface is that of version 4.
The reader is invited to review the code of the cltImpots.php application, which was the client program for version 4. It received a $document from the server. This was then an HTML document. It is now an XML document. The HTML document $document was parsed by the following function:
<?php
...
function getInfo($document){
// $document: HTML document
// we look for either the list of errors
// or the simulation table
// preparing the result
$taxes[errors] = array();
$taxes[simulations] = array();
........
return $taxes;
}//getInfo
The function received the HTML document $document, parsed it, and returned a dictionary $taxes with two attributes:
- errors: an array of errors
- simulations: an array of simulations, each simulation being itself an array with four elements (married, children, salary, tax).
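As an illustration of this structure, here is what the dictionary could look like for the two simulations of simulations.xml, with a loop that prints one line per simulation. The values are copied from that file; the printing loop is an assumption and is not the client's actual display code.

```php
<?php
// Illustrative content of the dictionary returned to the client.
$taxes = array(
    "errors" => array(),
    "simulations" => array(
        // each simulation: married, children, salary, tax
        array("yes", "2", "200000", "22504"),
        array("no", "2", "200000", "33388"),
    ),
);

// one line per simulation
foreach ($taxes["simulations"] as $simulation) {
    list($married, $children, $salary, $tax) = $simulation;
    echo "married=$married children=$children salary=$salary tax=$tax\n";
}
```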
The cltImpots.php application is now renamed cltXmlSimulations.php. Only the part that processes the document received from the server needs to be changed to account for the fact that it is now an XML document. The getInfos function then becomes the following:
<?php
...
// --------------------------------------------------------------
function getInfos($document){
// $document: XML document
// we look for either the list of errors
// or the simulation table
global $taxes, $tags;
// prepare the result
$taxes[errors] = array();
$taxes[simulations] = array();
// current tags
$tags = array();
// create an XML text parser object
$xml_parser = xml_parser_create();
// specify which functions to execute at the start and end of tags
xml_set_element_handler($xml_parser, "startElement", "endElement");
// specify which function to execute when data is encountered
xml_set_character_data_handler($xml_parser, "getData");
// parse $document
xml_parse($xml_parser, $document);
// Free the resources used by the XML parser
xml_parser_free($xml_parser);
// end
return $taxes;
}//getInfos
// -----------------------------------------------------------
// function called when an opening tag is encountered
function startElement($parser, $name, $attributes) {
global $taxes, $tag, $tags, $content;
// record the tag name and its content
$tag = strtolower($name);
$content = "";
// add it to the tag stack
array_push($tags, $tag);
// Is this a simulation?
if($tag == "simulation") {
// record the simulation attributes
$taxes['simulations'][] = array($attributes['MARRIED'], $attributes['CHILDREN'], $attributes['SALARY'], $attributes['TAX']);
}//if
}//startElement
// -----------------------------------------------------------
// the function called when an end tag is encountered
function endElement($parser, $name) {
// global variables
global $taxes, $tag, $tags, $content;
// retrieve the current tag
$tag = array_pop($tags);
// Is this an error tag?
if($tag == "error") {
// another error
$taxes['errors'][] = trim($content);
}//if
}//endElement
// -----------------------------------------------------------
// the function for processing the content of a tag
function getData($parser, $data) {
// global variables
global $tag, $content;
// Is this an error tag?
if($tag == "error") {
// append to the content of the current tag
$content.=$data;
}//if
}//getData
Comments:
- The getInfos($document) function begins by creating a parser, then configures it, and finally parses $document:
<?php
...
// we create an XML parsing object
$xml_parser = xml_parser_create();
// specify which functions to execute at the start and end of tags
xml_set_element_handler($xml_parser, "startElement", "endElement");
// specify which function to execute when data is encountered
xml_set_character_data_handler($xml_parser, "getData");
// parse $document
xml_parse($xml_parser, $document);
- After parsing, it releases the resources allocated to the parser and returns the $taxes dictionary.
<?php
...
// Free the resources occupied by the XML parser
xml_parser_free($xml_parser);
// end
return $taxes;
- The startElement($parser, $name, $attributes) function is called at the start of each tag. It
- adds the $name tag, converted to lowercase, to the tag array $tags. This array is managed as a stack: when the matching end tag is encountered, the last tag pushed onto $tags is popped. The current tag is also stored in $tag. The $attributes dictionary contains the attributes of the encountered tag, with the attribute names in uppercase, since the parser folds names to uppercase by default.
- stores the attributes in the global dictionary $taxes['simulations'], if it is a simulation tag.
- The getData($parser, $data) function is called when character data inside a tag has been read. A precaution has been taken here. In certain XML document processing APIs, notably in Java, this function may be called repeatedly for the same tag, i.e., the content of a tag is not necessarily available all at once. Here, the PHP documentation does not mention this restriction. As a precaution, we accumulate each $data value received in a global variable $content. Only when the end tag is encountered do we consider that we have obtained the entire content of the tag. The only tag affected by this processing is the <error> tag.
- The endElement($parser, $name) function is called at the end of each tag. It updates the current tag by popping the last tag from the tag stack and, when the closing tag is <error>, adds its accumulated content to the $taxes['errors'] array.
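The precaution described for getData can be tried out with the sketch below. It is not the client's code: it uses closures instead of global functions, and the <response> and <errors> element names are assumptions, since only <error> and <simulation> are named in this chapter. The document is fed to the parser in small pieces so that the text of the <error> tag may reach the data handler in several calls; the variable $calls counts them without assuming any particular value.

```php
<?php
// Sketch only: the <response> and <errors> element names are assumptions.
$document = '<response><errors>'
    . '<error>Unable to open DSN database [mysql-dbimpots] (S1000)</error>'
    . '</errors></response>';

$errors = array();   // contents of the completed <error> tags
$content = '';       // text accumulated for the current tag
$calls = 0;          // number of times the data handler was invoked

$parser = xml_parser_create();
xml_set_element_handler(
    $parser,
    // opening tag: start a new accumulation
    function ($p, $name, $attributes) use (&$content) {
        $content = '';
    },
    // closing tag: the accumulated content is now complete
    function ($p, $name) use (&$errors, &$content) {
        if (strtolower($name) == 'error') {
            $errors[] = trim($content);
        }
    }
);
xml_set_character_data_handler(
    $parser,
    // data may arrive in several pieces, so append rather than assign
    function ($p, $data) use (&$content, &$calls) {
        $calls++;
        $content .= $data;
    }
);

// feed the document in pieces of 32 bytes; the last call marks the end
$chunks = str_split($document, 32);
$last = count($chunks) - 1;
foreach ($chunks as $i => $chunk) {
    xml_parse($parser, $chunk, $i == $last);
}

echo $errors[0], "\n";
echo "data handler calls: $calls\n";
```

Because the handler appends to $content and the content is only used when the end tag arrives, the error message is recovered whole however many times the handler is called.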
Here are a few examples of execution, first with a DBMS that has not been started:
dos>e:\php43\php.exe cltXmlSimulations.php http://localhost/poly/impots/8/xmlsimulations.php yes 2 200000
Session token=[e8c29ea12f79e4771960068d161229fd]
The following errors occurred:
Unable to open DSN database [mysql-dbimpots] (S1000)
Then, with the DBMS running:
dos>e:\php43\php.exe cltXmlSimulations.php http://localhost/poly/impots/8/xmlsimulations.php yes 3 200000
Session token=[69a54d79db10b70ed0a2d55d5026ac8b]
Simulations:
[yes,3,200000,16400]
dos>e:\php43\php.exe cltXmlSimulations.php http://localhost/poly/impots/8/xmlsimulations.php yes 2 200000 69a54d79db10b70ed0a2d55d5026ac8b
Session token=[69a54d79db10b70ed0a2d55d5026ac8b]
Simulations:
[yes,3,200000,16400]
[yes,2,200000,22504]
dos>e:\php43\php.exe cltXmlSimulations.php http://localhost/poly/impots/8/xmlsimulations.php no 2 200000 69a54d79db10b70ed0a2d55d5026ac8b
Session token=[69a54d79db10b70ed0a2d55d5026ac8b]
Simulations:
[yes,3,200000,16400]
[yes,2,200000,22504]
[no,2,200000,33388]
5.5. Conclusion
Thanks to its XML response, the tax application has become easier to manage for both its developer and the developers of client applications.
- The design of the server application can now be entrusted to two types of people: the PHP developer of the server application and the graphic designer who manages how the response appears in browsers. The latter simply needs to know the structure of the server’s XML response to build the stylesheets that will accompany it. Note that these are contained in separate XSL files independent of the PHP application. The graphic designer can therefore work independently of the developer.
- Client application designers, too, simply need to know the structure of the server’s XML response. Any changes the front-end designer might make to the style sheets have no impact on this XML response, which always remains the same. This is a huge advantage.
- How can the developer update their PHP application without breaking everything? First of all, as long as the XML response remains unchanged, they can organize their application however they like. They can also update the XML response as long as they retain the <error> and <simulation> elements expected by their clients. They can thus add new tags to this response. The front-end designer will account for them in their style sheets, and browsers will be able to display the updated version of the response. Programmatic clients, however, will continue to function using the old model, with the new tags simply being ignored. For this to work, the tags being looked for must be clearly identified in the XML parsing of the server’s response. This is what was done in our XML client for the tax application, where the handlers explicitly test for the <error> and <simulation> tags. As a result, the other tags are ignored.
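The last point can be checked with a small sketch. It assumes a hypothetical parseResponse function written with closures rather than the global functions of the client, and the <response>, <simulations> and <version> element names are invented for the example; only <error> and <simulation> come from this chapter. The same simulation is parsed from an old-style response and from a response carrying an extra tag.

```php
<?php
// Sketch only: <response>, <simulations> and <version> are assumed element
// names; only <error> and <simulation> come from the chapter.
function parseResponse($document) {
    $taxes = array('errors' => array(), 'simulations' => array());
    $content = '';
    $parser = xml_parser_create();
    xml_set_element_handler(
        $parser,
        // opening tag: tag and attribute names arrive in uppercase
        function ($p, $name, $attributes) use (&$taxes, &$content) {
            $content = '';
            if ($name == 'SIMULATION') {
                $taxes['simulations'][] = array(
                    $attributes['MARRIED'], $attributes['CHILDREN'],
                    $attributes['SALARY'], $attributes['TAX']
                );
            }
        },
        // closing tag: only <error> contributes to the result
        function ($p, $name) use (&$taxes, &$content) {
            if ($name == 'ERROR') {
                $taxes['errors'][] = trim($content);
            }
        }
    );
    xml_set_character_data_handler(
        $parser,
        function ($p, $data) use (&$content) {
            $content .= $data;
        }
    );
    xml_parse($parser, $document, true);
    return $taxes;
}

$simulation = '<simulation married="yes" children="2" salary="200000" tax="22504"/>';
$oldResponse = "<response><simulations>$simulation</simulations></response>";
$newResponse = "<response><version>2</version><simulations>$simulation</simulations></response>";

$old = parseResponse($oldResponse);
$new = parseResponse($newResponse);
echo ($old == $new) ? "same result\n" : "different result\n";
```

Since the handlers only react to the two tags they name, the added <version> element leaves the resulting dictionary unchanged.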