Finally, a simple XML => array class.
It works much like the SimpleXML library.
<?php
class xml {
private $parser;
private $pointer;
public $dom;
public function __construct($data) {
$this->pointer =& $this->dom;
$this->parser = xml_parser_create();
xml_set_object($this->parser, $this);
xml_parser_set_option($this->parser, XML_OPTION_CASE_FOLDING, false);
xml_set_element_handler($this->parser, "tag_open", "tag_close");
xml_set_character_data_handler($this->parser, "cdata");
xml_parse($this->parser, $data);
}
private function tag_open($parser, $tag, $attributes) {
if (isset($this->pointer[$tag]['@attributes'])) {
$content = $this->pointer[$tag];
$this->pointer[$tag] = array(0 => $content);
$idx = 1;
} else if (isset($this->pointer[$tag]))
$idx = count($this->pointer[$tag]);
if (isset($idx)) {
$this->pointer[$tag][$idx] = Array(
'@idx' => $idx,
'@parent' => &$this->pointer);
$this->pointer =& $this->pointer[$tag][$idx];
} else {
$this->pointer[$tag] = Array(
'@parent' => &$this->pointer);
$this->pointer =& $this->pointer[$tag];
}
if (!empty($attributes))
$this->pointer['@attributes'] = $attributes;
}
private function cdata($parser, $cdata) {
$this->pointer['@data'] = $cdata;
}
private function tag_close($parser, $tag) {
$current = & $this->pointer;
if (isset($this->pointer['@idx']))
unset($current['@idx']);
$this->pointer = & $this->pointer['@parent'];
unset($current['@parent']);
if (isset($current['@data']) && count($current) == 1)
$current = $current['@data'];
else if (empty($current['@data'])||$current['@data']==0)
unset($current['@data']);
}
}
?>
Maybe I'll post some explanations on Habr.
XML Parser Functions
Table of Contents
- utf8_decode — Converts a UTF-8 encoded string to ISO-8859-1
- utf8_encode — Encodes an ISO-8859-1 string to UTF-8
- xml_error_string — Get the XML parser error string
- xml_get_current_byte_index — Get the current byte index for an XML parser
- xml_get_current_column_number — Get the current column number for an XML parser
- xml_get_current_line_number — Get the current line number for an XML parser
- xml_get_error_code — Get the XML parser error code
- xml_parse_into_struct — Parse XML data into an array structure
- xml_parse — Start parsing an XML document
- xml_parser_create_ns — Create an XML parser with namespace support
- xml_parser_create — Create an XML parser
- xml_parser_free — Free an XML parser
- xml_parser_get_option — Get options from an XML parser
- xml_parser_set_option — Set options in an XML parser
- xml_set_character_data_handler — Set up character data handler
- xml_set_default_handler — Set up default handler
- xml_set_element_handler — Set up start and end element handlers
- xml_set_end_namespace_decl_handler — Set up end namespace declaration handler
- xml_set_external_entity_ref_handler — Set up external entity reference handler
- xml_set_notation_decl_handler — Set up notation declaration handler
- xml_set_object — Use an XML parser within an object
- xml_set_processing_instruction_handler — Set up processing instruction (PI) handler
- xml_set_start_namespace_decl_handler — Set up start namespace declaration handler
- xml_set_unparsed_entity_decl_handler — Set up unparsed entity declaration handler
wolfon.AT-DoG.inbox.ru
29-Jul-2008 12:09
Anonymous
18-May-2008 09:18
This is a piece of code that edits an XML file.
<?php
$songs = Array();
function start_element($parser, $name, $attrs){
global $songs;
if($name == "song"){
array_push($songs, $attrs);
}
}
function end_element ($parser, $name){}
$playlist_string = file_get_contents("test.xml");
$parser = xml_parser_create();
xml_set_element_handler($parser, "start_element", "end_element");
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_parse($parser, $playlist_string) or die("Error parsing XML document.");
print "<br />";
$action = isset($_POST['action']) ? $_POST['action'] : '';
$songs_final = $songs; // default: leave the playlist unchanged
if($action == "ins"){
array_push($songs_final, Array(
"title" => $_POST['title'],
"artist" => $_POST['artist'],
"path" => $_POST['path']));
}else if($action == "del"){
$songs_final = Array();
foreach($songs as $song){
if($song['title'] != $_POST['title']){
array_push($songs_final, $song);
}
}
}
// write each song back as attributes of <song>, which is where start_element() reads them from
$write_string = "<songs>";
foreach($songs_final as $song){
$write_string .= "<song";
foreach(array("title", "artist", "path") as $field){
// escape the value so quotes, < and & cannot break the file
$write_string .= " ".$field."=\"".htmlspecialchars($song[$field])."\"";
}
$write_string .= " />";
}
$write_string .= "</songs>";
$fp = fopen("test.xml", "w+");
fwrite($fp, $write_string) or die("Error writing to file");
fclose($fp);
print "<em>Song inserted or deleted successfully :)</em><br />";
print "<a href=\"index.php\" title=\"return\">Return</a>";
?>
galen dot senogles at gmail dot com
29-Apr-2008 10:48
An update to the function below. It fixes a bug where the data of the first tag would occasionally get prepended to the data of the second tag.
<?php
foreach($dom['child_nodes'][0]['child_nodes'] as $key => $value) {
$tagname = $value['tag_name'];
if(isset($value['child_nodes'][0])) {
$numarrays = count($value['child_nodes']);
if($numarrays > 1) {
$contents = "";
foreach($value['child_nodes'] as $key => $value2) {
$contents .= $value2;
}
}else {
$contents = $value['child_nodes'][0];
}
}else {
$contents = 'isempty';
}
$artmp = array($tagname => $contents);
array_push_associative($xmlarray,$artmp);
unset($artmp);
}
?>
galen dot senogles at gmail dot com
25-Apr-2008 11:28
If anyone else is having trouble figuring out how to use the xml class that people have created and modified, don't worry, you are not alone. It took me a while to come up with a solution that I liked, but I feel this does the job quite nicely.
I read through the entire structure of the xml file and create an associative array based on the tag names.
I didn't worry about tag attributes as I didn't need to use them; so remember that if you use this method, you are only getting the tag name and the data inside the tag...that is all, no attributes!!
I am not going to include the xml class, as it has been copy-pasted multiple times already in this thread. Just scroll down for the xml class.
First let me show just an example of the EXTREMELY simple xml structure I was working with. Again, you will need to make modifications depending on the structure of the xml file you are working with! (I know I could use simplexml but I have php4 and not 5).
<?xml version="1.0"?>
<menuitems>
<menutype>1</menutype>
<product>Just some product info</product>
<shipping>some stuff</shipping>
</menuitems>
The custom associative array push function taken from:
http://us.php.net/manual/en/function.array-push.php#58705
The xml class file is located here:
http://us.php.net/manual/en/ref.xml.php#81910
<?php
// Obtain the exact path to the xml file
$xmlfile = "mydata.xml";
$fp = fopen($xmlfile,"r"); // open the xml file
$xml = fread($fp, filesize($xmlfile)); // read the entire file into the variable xml
fclose($fp); // close the stream
$xml_parser = new xml(); // create a new xml class instance
$xml_parser->parse($xml); // parse the variable xml which contains our xml data
$dom = $xml_parser->dom; // make a variable that holds the entire dom
/*
This part extracts the xml nodes from the dom and places them into an associative array.
The associative array key is the name of the tag; the value is the tag contents.
We simply create an array on the fly using the name and contents, and hit that array
with our original array using the array_push_associative function. We then check if
isset to prevent errors from being displayed. If the tag contents are empty,
I put the string isempty inside so I can easily check to see later if there is contents or not.
*/
$xmlarray = array(); // the array we are going to store the information within the tag
$contents = "";
foreach($dom['child_nodes'][0]['child_nodes'] as $key => $value) {
$tagname = $value['tag_name'];
if(isset($value['child_nodes'][0])) {
$numarrays = count($value['child_nodes']);
if($numarrays > 1) {
foreach($value['child_nodes'] as $key => $value2) {
$contents .= $value2;
}
}else {
$contents = $value['child_nodes'][0];
}
}else {
$contents = 'isempty';
}
$artmp = array($tagname => $contents);
array_push_associative($xmlarray,$artmp);
unset($artmp);
}
unset($xml); // free up resources
unset($xml_parser); // free up resources
unset($dom); // free up resources
?>
You may be wondering why there is a nested count and foreach loop inside the main foreach loop. The xml class I am using in this example (the one four posts down from this one) splits long character data: once the text reaches 1024 characters, it creates a new element in the array and puts the next 1024 characters into that element, and so on. This caused me massive confusion as to why some of my data was getting cut off.
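The splitting described above is normal for this kind of parser: the character data handler may be called several times for a single element, so a handler should append rather than assign. Here is a minimal sketch, assuming a standalone handler and made-up sample XML; the $text variable and the closure are illustrative and not part of the xml class discussed here.

```php
<?php
// Hypothetical sample: the entity typically makes the parser deliver the text in pieces.
$xml = '<product>Fish &amp; chips</product>';

$text = '';
$parser = xml_parser_create();
xml_set_character_data_handler($parser, function ($parser, $data) use (&$text) {
    // Append: the handler can fire more than once per element.
    $text .= $data;
});
xml_parse($parser, $xml, true);

echo $text, "\n";
```

Because the pieces are concatenated, the result does not depend on how many times the handler was called.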
So say I wanted to display the data inside the product tag, all I would need to do is this:
<?php
echo $xmlarray['product']
?>
I sincerely hope this helps people figure out how to utilize the xml class quicker than I did!
If anyone has suggestions, modifications, or whatever, please post it here!
Thanks
galen dot senogles at gmail dot com
19-Apr-2008 09:10
I used shawn's code that is an ongoing fix/update of a very nice php 4 & 5 compatible class.
It works great, except that it kept raising errors when the array wasn't set (I have error reporting set to show everything).
<?php
// Here is the old function that gave errors:
function makeChildNode() {
if (!is_array($this->pointer['child_nodes'])){
$this->pointer['child_nodes'] = array();
}
return count($this->pointer['child_nodes']);
}
// Here is the new function that does not spit errors:
function makeChildNode() {
if (!isset($this->pointer['child_nodes'])){
$this->pointer['child_nodes'] = array();
}
return count($this->pointer['child_nodes']);
}
?>
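To see why the isset() version is quiet, here is a small sketch that records whatever PHP raises for each check. The $pointer array stands in for a node that has no 'child_nodes' key yet; the error handler and variable names are assumptions made for this illustration.

```php
<?php
// Collect every notice or warning PHP raises so the difference is visible.
$raised = array();
set_error_handler(function ($severity, $message) use (&$raised) {
    $raised[] = $message;
    return true; // handled here, so PHP does not print it
});

$pointer = array(); // hypothetical node without a 'child_nodes' key

// Reading a missing key in order to pass it to is_array() raises a notice
// (a warning on PHP 8).
$viaIsArray = is_array($pointer['child_nodes']);

// isset() is a language construct that tests the key without reading it.
$viaIsset = isset($pointer['child_nodes']);

restore_error_handler();

var_dump($viaIsArray, $viaIsset);
echo count($raised), " message(s) raised\n";
```

Both checks report that there is no array yet, but only the is_array() call produces a message.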
shawn dot rapp at gmail dot com
03-Apr-2008 09:24
Well, I posted my script with an example fread($fp, 4096), meaning that it will only read 4 KB. It was just a quick example. If you used that to feed a really long XML file to the parser, that would be the problem.
You could replace the 4096 with filesize("file.xml") or try replacing that example test code with:
$xml = implode('',file("http://localhost/test.xml"));
$xml_parser = new XML_Class();
$xml_parser->parse($xml);
print_r($xml_parser->dom);
I've tried to recreate your problems by posting entire howto of installing LDAP into character data space of a node and can't get it to fail. Please email with more info if the above isn't the problem.
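Another way around the 4 KB limit is to keep the fixed-size reads but call xml_parse() once per chunk, setting its third argument only on the last one. A rough sketch follows, assuming an in-memory string in place of the file; the sample document, the $items counter and the closures are made up for illustration.

```php
<?php
// Hypothetical in-memory document standing in for a large file.
$xml = '<list>' . str_repeat('<item>x</item>', 1000) . '</list>';

$items = 0;
$parser = xml_parser_create();
xml_set_element_handler(
    $parser,
    function ($parser, $name, $attrs) use (&$items) {
        // Case folding is on by default, so names arrive upper-cased.
        if ($name === 'ITEM') {
            $items++;
        }
    },
    function ($parser, $name) {
        // nothing to do on close tags in this sketch
    }
);

// Feed the parser 4096 bytes at a time; only the last call sets is_final.
$chunks = str_split($xml, 4096);
$last = count($chunks) - 1;
$ok = true;
foreach ($chunks as $i => $chunk) {
    if (!xml_parse($parser, $chunk, $i === $last)) {
        $ok = false;
        echo xml_error_string(xml_get_error_code($parser)), "\n";
        break;
    }
}

echo $items, "\n";
```

With a real file, the loop would read with fread($fp, 4096) and pass feof($fp) as the third argument, as the XMLParser class further down this page does.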
As for the routine you posted from that website: it seems to pad the result with unnecessary arrays, and it will overwrite different nodes with the same name if they are within the same parent. The number one issue for me is that it drops attributes. That is totally bogus. It's a lot cleaner to store most values in attributes than to make a zillion nodes and store something small like an integer or a float as character data.
Example:
<coords x="1.53234" y="56.287" z="4.32" />
VS
<coords><x>1.53234</x><y>56.287</y><z>4.32</z></coords>
To me the top is very readable, whereas the latter makes my eyes bleed.
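Reading values stored as attributes takes nothing more than a start-element handler. A minimal sketch, assuming the coords element from the example above; the $coords array and the closures are illustrative names only.

```php
<?php
$xml = '<coords x="1.53234" y="56.287" z="4.32" />';

$coords = array();
$parser = xml_parser_create();
// Keep tag and attribute names in their original case.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_set_element_handler(
    $parser,
    function ($parser, $name, $attrs) use (&$coords) {
        if ($name === 'coords') {
            // $attrs is a name => value array; the values arrive as strings.
            foreach ($attrs as $key => $value) {
                $coords[$key] = (float) $value;
            }
        }
    },
    function ($parser, $name) {
        // nothing to do on close tags in this sketch
    }
);
xml_parse($parser, $xml, true);

print_r($coords);
```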
Anyway, what is good about the linked code is the error checking, isolating all the code in the parse method instead of the constructor so the object is reusable, and then releasing the XML parser.
I'm definitely going to be putting that stuff into my class after I post this note.
But let me know if it's still truncating.
shawn dot rapp at gmail dot com
19-Mar-2008 03:52
The reason you would want to make your own simplistic DOM parser is the lack of compatibility between PHP 4's domxml and PHP 5's dom.
So it is for portability without having to wrap the two different DOMs.
If you need a simple, lightweight XML parser that is portable, this is the best way. If you are writing applications for a particular server and are more concerned with functionality and speed, go with a compiled-in DOM.
Here is the fix to Emmett's code...
<?PHP
$fp = fopen("test.xml","r");
$xml = fread($fp, 4096);
fclose($fp);
$xml_parser = new xml();
$xml_parser->parse($xml);
$dom = $xml_parser->dom;
print_r($dom);
class xml {
var $parser;
var $pointer;
var $dom;
function xml() {
$this->pointer =& $this->dom;
$this->parser = xml_parser_create();
xml_set_object($this->parser, $this);
xml_parser_set_option($this->parser, XML_OPTION_CASE_FOLDING, false);
xml_set_element_handler($this->parser, "tag_open", "tag_close");
xml_set_character_data_handler($this->parser, "cdata");
}
function parse($data) {
xml_parse($this->parser, $data);
}
function makeChildNode() {
if (!is_array($this->pointer['child_nodes'])){
$this->pointer['child_nodes'] = array();
}
return count($this->pointer['child_nodes']);
}
function tag_open($parser, $tag, $attributes) {
$idx = $this->makeChildNode();
$this->pointer['child_nodes'][$idx] = Array(
'_idx' => $idx,
'_parent' => &$this->pointer,
'tag_name' => $tag,
'attributes' => $attributes,
);
$this->pointer =& $this->pointer['child_nodes'][$idx];
}
function cdata($parser, $cdata) {
//drop text nodes that are just white space formatting characters
if (trim($cdata) != "") {
$idx = $this->makeChildNode();
$this->pointer['child_nodes'][$idx] = $cdata;
}
}
function tag_close($parser, $tag) {
$idx =& $this->pointer['_idx'];
$this->pointer =& $this->pointer['_parent'];
unset($this->pointer['child_nodes'][$idx]['_idx']);
unset($this->pointer['child_nodes'][$idx]['_parent']);
}
}
?>
jesdisciple at gmail dot com
07-Mar-2008 11:00
@[emmett dot thesane at yahoo dot com]: That code didn't work for me, but it seems that using the DOM functions (http://php.net/manual/en/ref.dom.php) would be more efficient.
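For comparison, this is roughly what the DOM route looks like. It is only a sketch: it assumes PHP 5 or later with the DOM extension enabled, and the sample document and the $titles array are made up for illustration.

```php
<?php
// Hypothetical sample document.
$xml = '<songs><song title="One" artist="A"/><song title="Two" artist="B"/></songs>';

$doc = new DOMDocument();
$doc->loadXML($xml);

$titles = array();
// getElementsByTagName() returns every matching element in document order.
foreach ($doc->getElementsByTagName('song') as $song) {
    $titles[] = $song->getAttribute('title');
}

print_r($titles);
```

Repeated tags and attributes are both kept, which are the two things the hand-rolled array builders in this thread struggle with.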
emmett dot thesane at yahoo dot com
11-Dec-2007 01:19
There's a couple of vital flaws in aquariusrick's example:
1. Multiple tags of the same name will overwrite one another.
2. Text nodes within an element are all strung together, with no information saved regarding their order with respect to non-text nodes.
It provided a good starting point, however, for a DOM-builder that *does* allow those things. This should be a more familiar structure for people used to DOM-walking in the browser; children of each node are stored in "childNodes". Text nodes are simply a child node that is only a string, instead of an array.
$xml_parser = new xml();
$xml_parser->parse($xml);
$dom = $xml_parser->dom;
print_r($dom);
class xml {
var $parser;
var $pointer;
var $dom;
function xml() {
$this->pointer =& $this->dom;
$this->parser = xml_parser_create();
xml_set_object($this->parser, $this);
xml_parser_set_option($this->parser, XML_OPTION_CASE_FOLDING, false);
xml_set_element_handler($this->parser, "tag_open", "tag_close");
xml_set_character_data_handler($this->parser, "cdata");
}
function parse($data) {
xml_parse($this->parser, $data);
}
function makeChildNode() {
if (!isset($this->pointer['childNodes'])){
$this->pointer['childNodes'] = array();
}
return count($this->pointer['childNodes']);
}
function tag_open($parser, $tag, $attributes) {
$idx = $this->makeChildNode();
$this->pointer['childNodes'][$idx] = Array(
'_idx' => $idx,
'tagName' => $tag,
'parentNode' => &$this->pointer,
'attributes' => $attributes,
);
$this->pointer =& $this->pointer['childNodes'][$idx];
}
function cdata($parser, $cdata) {
$idx = $this->makeChildNode();
$this->pointer['childNodes'][$idx] = $cdata;
//text node -- has no other attributes than the content
}
function tag_close($parser, $tag) {
$idx =& $this->pointer['_idx'];
$this->pointer =& $this->pointer['_parent'];
unset($this->pointer['childNodes'][$idx]['_idx']);
}
}
aquariusrick
06-Dec-2007 05:43
Here's another attempt at a very simple script that parses XML into a structure:
<?php
#Usage:
//$xml_parser = new xml();
//$xml_parser->parse($xml);
//$dom = $xml_parser->dom;
class xml {
var $parser;
var $pointer;
var $dom;
function xml() {
$this->pointer =& $this->dom;
$this->parser = xml_parser_create();
xml_set_object($this->parser, $this);
xml_parser_set_option($this->parser, XML_OPTION_CASE_FOLDING, false);
xml_set_element_handler($this->parser, "tag_open", "tag_close");
xml_set_character_data_handler($this->parser, "cdata");
}
function parse($data) {
xml_parse($this->parser, $data);
}
function tag_open($parser, $tag, $attributes) {
$this->pointer[$tag] = Array(
'_parent' => &$this->pointer,
'_content' => null,
'_attributes' => $attributes,
);
$this->pointer =& $this->pointer[$tag];
}
function cdata($parser, $cdata) {
$this->pointer['_content'] .= $cdata;
}
function tag_close($parser, $tag) {
$this->pointer =& $this->pointer['_parent'];
unset($this->pointer[$tag]['_parent']);
}
} // end xml class
?>
yousuf at philipz dot com
25-Nov-2007 08:53
Here is my modification of < dmeekins att gmail doot com >'s XMLParser class, which I have used quite a bit. There were two problems with his post, which was itself a modification of an earlier post, so the problems carried through the many versions. Both were in the dataHandler function. The first was '$data = trim($data);', which removed line breaks from data that spanned several lines, and the second occurred when a tag had the value 0. Here is the corrected function.
<?php
function dataHandler($parser, $data)
{
if(!empty($data) || strval($data) != "" )
{
if(isset($this->currTag['data']))
$this->currTag['data'] .= $data;
else
$this->currTag['data'] = $data;
}
}
?>
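The second problem, a tag whose value is 0, comes from how empty() treats the string "0". The short sketch below wraps the condition from the function above in a helper so it can be tried on a few inputs; keepsData() is a name invented for this illustration.

```php
<?php
// Mirrors the condition used in dataHandler() above.
function keepsData($data) {
    return !empty($data) || strval($data) != "";
}

var_dump(empty("0"));      // bool(true): empty() treats the string "0" as empty
var_dump(keepsData("0"));  // bool(true): the strval() check lets "0" through
var_dump(keepsData(""));   // bool(false): a truly empty string is still dropped
var_dump(keepsData("\n")); // bool(true): whitespace survives once trim() is gone
```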
By removing '$data = trim($data);', you will notice that some [data] elements, mainly the root ones, will contain a lot of line breaks and no actual data.
The code by < geoffers [at] gmail [dot] com > was also quite good, as it keeps things a lot smaller than XMLParser. Here's my modification of part of his code, as I preferred to have it look similar to how XMLParser has it (it removes the ['child'] entry and changes 'attribs' to 'attr').
<?php
function parse($data)
{
$this->parser = xml_parser_create('UTF-8');
xml_set_object($this->parser, $this);
xml_parser_set_option($this->parser, XML_OPTION_SKIP_WHITE, 1);
xml_set_element_handler($this->parser, 'tag_open', 'tag_close');
xml_set_character_data_handler($this->parser, 'cdata');
if (!xml_parse($this->parser, $data))
{
$this->data = array();
$this->error_code = xml_get_error_code($this->parser);
$this->error_string = xml_error_string($this->error_code);
$this->current_line = xml_get_current_line_number($this->parser);
$this->current_column = xml_get_current_column_number($this->parser);
}
else
{
$this->data = $this->data;
}
xml_parser_free($this->parser);
}
function tag_open($parser, $tag, $attribs)
{
$this->data[$tag][] = array('data' => '', 'attr' => $attribs);
$this->datas[] =& $this->data;
$this->data =& $this->data[$tag][count($this->data[$tag])-1];
}
?>
The code by < adamaflynn at criticaldevelopment dot net > and < geoff at spacevs dot com > is also quite good, but it uses an xmlObject object rather than standard arrays.
geoff at spacevs dot com
08-Nov-2007 06:13
Reading xml into a class:
<?php
class XmlData {
var $data = '';
}
$elements = array();
$elements[] = new XmlData(); // root container; objects are handles in PHP 5+, so =& is not needed
function startElement($parser, $name, $attrs) {
global $elements;
$element = new XmlData();
// attach the new node to its parent under the tag name, then make it the current node
$elements[count($elements)-1]->$name = $element;
$elements[] = $element;
}
function endElement($parser, $name) {
global $elements;
array_pop($elements);
}
function characterData($parser, $data) {
global $elements;
// append, because the parser may deliver the text of one element in several pieces
$elements[count($elements)-1]->data .= $data;
}
$xml_parser = xml_parser_create();
xml_set_element_handler($xml_parser, "startElement", "endElement");
xml_set_character_data_handler($xml_parser, "characterData");
xml_parse($xml_parser, $xml, true);
xml_parser_free($xml_parser);
$request = array_pop($elements);
// tag names are upper-cased because case folding is on by default
echo $request->LOGIN->USER->data;
?>
demonpants at gmail dot com
23-Oct-2007 05:59
I wanted to access the ISBN database, and was previously parsing the HTML string generated from their main page, that is until I discovered they have an API that returns XML.
So, if anyone wants to get some information from the ISBN database, all you need to do is the following.
<?php
//Search the ISBN database for the book.
$url = "http://www.isbndb.com/api/books.xml?access_key=KEY&index1=isbn&value1=" . urlencode($_GET['ISBN']);
$p = xml_parser_create();
xml_parse_into_struct($p,file_get_contents($url),$results,$index);
xml_parser_free($p);
// array keys are quoted; bare words such as TITLELONG would be read as undefined constants
$title = $results[$index['TITLELONG'][0]]['value'];
$author = $results[$index['AUTHORSTEXT'][0]]['value'];
$publisher = $results[$index['PUBLISHERTEXT'][0]]['value'];
?>
You will need to get an access key from isbndb.com, but it takes two seconds and is free. When you get it, replace KEY in the URL with your own key. Also, my code above will search for the book that fits the ISBN number stored in the GET variable ISBN - you can search by other parameters and return more than one result, but my example is for a simple ISBN search.
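The two arrays filled in by xml_parse_into_struct() can be tried without network access. In the sketch below the response is a made-up stand-in: only the TitleLong and AuthorsText names follow the index keys used above, and the surrounding element names are assumptions.

```php
<?php
// Hypothetical response, much smaller than a real one.
$xml = '<BookList><BookData><TitleLong>Example Title</TitleLong>'
     . '<AuthorsText>Jane Doe</AuthorsText></BookData></BookList>';

$p = xml_parser_create();
xml_parse_into_struct($p, $xml, $values, $index);

// $index maps each (upper-cased) tag name to its positions in $values.
$title  = $values[$index['TITLELONG'][0]]['value'];
$author = $values[$index['AUTHORSTEXT'][0]]['value'];

echo $title, " by ", $author, "\n";
```

A search that returns several books would give several positions under each tag name, so [0] would become a loop over $index['TITLELONG'].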
TeerachaiJ at GMail dot com
15-Oct-2007 11:17
I enhanced xml2array (I can't remember who the author is) to work with duplicate key indexes by changing the "tagData" function to this:
<?php
function tagData($parser, $tagData) {
// set the latest open tag equal to the tag data
$strEval = "\$this->arrOutput";
foreach ($this->arrName as $value) {
$strEval .= "[" . $value . "]";
$arr .= "[" . $value . "]"; //*Enhance by T•J (array when dup)
}
eval("\$x=\$this->arrOutput" . $arr . ";"); //*Enhance by T•J (array when dup)
if($x) { $strEval = $strEval . "[" . ++$this->arrOutput[$arr] . "] = \$tagData;"; } //*Enhance by T•J (array when dup)
else { $strEval = $strEval . " = \$tagData;"; }
eval ($strEval);
}
?>
I'm not sure whether someone else has already done this.
I hope it helps with your work.
Zvjezdan Patz
09-Sep-2007 05:22
The problem I had was I needed to generate xml on the screen for users to actually see and copy to a file.
I'm generating the xml manually from a php file and the browser kept interpreting the xml...not very helpful.
This is how you get around it:
<?php
$file = file_get_contents("http://fileurl/xml.php?whatever=$whatever");
print nl2br(htmlentities($file));
?>
Prints all my xml quite nicely.
v9 at fakehalo dot us
14-Jul-2007 05:04
I needed this for work and personal use. Sometimes you'll have an XML string generated as one long string with no line breaks (NuSOAP in today's case at work, but any number of other things can generate these). Anyway, this simply takes a long XML string and returns an indented version with line breaks for display and readability.
<?php
function xmlIndent($str){
$ret = "";
$indent = 0;
$indentInc = 3;
$noIndent = false;
$i = 0; // search offset for the next "<"
$r = 0; // offset just past the previous ">"
while(($l = strpos($str,"<",$i))!==false){
if($l!=$r && $indent>0){ $ret .= "\n" . str_repeat(" ",$indent) . substr($str,$r,($l-$r)); }
$i = $l+1;
$r = strpos($str,">",$i)+1;
$t = substr($str,$l,($r-$l));
if(strpos($t,"/")==1){
$indent -= $indentInc;
$noIndent = true;
}
else if(($r-$l-strpos($t,"/"))==2 || substr($t,0,2)=="<?"){ $noIndent = true; }
if($indent<0){ $indent = 0; }
if($ret){ $ret .= "\n"; }
$ret .= str_repeat(" ",$indent);
$ret .= $t;
if(!$noIndent){ $indent += $indentInc; }
$noIndent = false;
}
$ret .= "\n";
return($ret);
}
?>
(...this was only tested for what i needed at work, could POSSIBLY need additions)
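Where the DOM extension is available (PHP 5 and later), the same result can be had without string scanning. Note that this is a different technique from the function above, not a rewrite of it, and the one-line sample input is made up.

```php
<?php
$xml = '<a><b>1</b><c/></a>'; // hypothetical one-line input

$doc = new DOMDocument();
$doc->preserveWhiteSpace = false; // drop existing whitespace-only text nodes
$doc->formatOutput = true;        // ask libxml to indent on output
$doc->loadXML($xml);

$pretty = $doc->saveXML();
echo $pretty;
```

The output also gains an XML declaration on the first line, which the hand-written function does not add.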
ricardo at sismeiro dot com
08-Jun-2007 01:29
<?php
/**
* correction of the previous code
*/
/**
* Converts XML into Array
*
* @param array $result
* @param object $root
* @param string $rootname
*/
function convert_xml2array(&$result,$root,$rootname='root'){
$n=count($root->children());
if ($n>0){
/**
* start of the correction
*/
if (!isset($result[$rootname]['@attributes'])){
$result[$rootname]['@attributes']=array();
foreach ($root->attributes() as $atr=>$value){
$result[$rootname]['@attributes'][$atr]=(string)$value;
}
}
/**
* end of the correction
*/
foreach ($root->children() as $child){
$name=$child->getName();
convert_xml2array($result[$rootname][],$child,$name);
}
} else {
$result[$rootname]= (array) $root;
if (!isset($result[$rootname]['@attributes'])){
$result[$rootname]['@attributes']=array();
}
}
}
/**
* Example how to use the function convert_xml2array
*/
/**
* Return Array from a xml string
*
* @param string $xml
* @return array
*/
function get_array_fromXML($xml){
$result=array();
$doc=simplexml_load_string($xml);
convert_xml2array($result,$doc);
return $result['root'];
}
?>
adamaflynn at criticaldevelopment dot net
14-Apr-2007 08:50
Here is an example of another XML parsing script that parses the document into an array/object structure instead of relying on startElement, endElement, etc handlers.
You can find the documentation at:
http://www.criticaldevelopment.net/xml/doc.php
And the code (both PHP4 and PHP5 versions):
http://www.criticaldevelopment.net/xml/parser_php4.phps
http://www.criticaldevelopment.net/xml/parser_php5.phps
If you have any questions about it, just drop me an e-mail.
phpzmurf[at]yahoo.com
12-Apr-2007 01:19
/*
* Parse rss news, quotes etc.
*
* author : phpZmurf <phpzmurf[at]yahoo.com>
* created: 12.04.2007
* ver : 1.0
*
*/
$data = implode("", file("http://feeds.feedburner.com/quotationspage/qotd/"));
$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);
xml_parse_into_struct($parser, $data, $values, $tags);
xml_parser_free($parser);
# data saved here
$arrQuotes = array();
# at the beginning, the tag is set to closed
$tagOpen = false;
foreach($values as $key => $item) {
if(!$tagOpen and $item['tag'] == 'item' and $item['type'] == 'open') {
# item tag opens
$tagOpen = true;
# empty temporary variables
$temp_title = '';
$temp_description = '';
$temp_guid = '';
$temp_link = '';
} elseif($item['tag'] == 'item' and $item['type'] == 'close') {
# item tag ends
$tagOpen = false;
# if all 4 tags contain data... add them to output array
if($temp_title != '' and $temp_description != '' and $temp_guid != '' and $temp_link != '') {
$arrQuotes[] = array(
'title' => $temp_title,
'description' => $temp_description,
'guid' => $temp_guid,
'link' => $temp_link
);
}
} else {
# save data into temporary variables
switch($item['tag']) {
case 'title':
$temp_title = $item['value'];
break;
case 'description':
# cut the description at the first "<" because the feed appends a stray <p> to it
# (keep the whole value when there is no "<" at all)
$pos = strpos($item['value'], '<');
$temp_description = ($pos === false) ? $item['value'] : substr($item['value'], 0, $pos);
break;
case 'guid':
$temp_guid = $item['value'];
break;
case 'link':
$temp_link = $item['value'];
break;
default: break;
}
}
}
foreach($arrQuotes as $key => $item) {
print_r($item);
}
Sheer Pullen
15-Mar-2007 03:27
I took the code posted by forqoun and modified it to be somewhat more readable (by me), somewhat more friendly to the idea of parsing multiple files with the same object, and compatible with an HTTP POST of XML data. Anyone who's interested in my version of associative array output can check it out at http://www.sheer.us/code/php/xml-parse-to-associative-array.phpsrc
Be nice to me, this is my first published php code
geoffers [at] gmail [dot] com
31-Dec-2006 12:27
Time to add my attempt at a very simple script that parses XML into a structure:
<?php
class Simple_Parser
{
var $parser;
var $error_code;
var $error_string;
var $current_line;
var $current_column;
var $data = array();
var $datas = array();
function parse($data)
{
$this->parser = xml_parser_create('UTF-8');
xml_set_object($this->parser, $this);
xml_parser_set_option($this->parser, XML_OPTION_SKIP_WHITE, 1);
xml_set_element_handler($this->parser, 'tag_open', 'tag_close');
xml_set_character_data_handler($this->parser, 'cdata');
if (!xml_parse($this->parser, $data))
{
$this->data = array();
$this->error_code = xml_get_error_code($this->parser);
$this->error_string = xml_error_string($this->error_code);
$this->current_line = xml_get_current_line_number($this->parser);
$this->current_column = xml_get_current_column_number($this->parser);
}
else
{
$this->data = $this->data['child'];
}
xml_parser_free($this->parser);
}
function tag_open($parser, $tag, $attribs)
{
$this->data['child'][$tag][] = array('data' => '', 'attribs' => $attribs, 'child' => array());
$this->datas[] =& $this->data;
$this->data =& $this->data['child'][$tag][count($this->data['child'][$tag])-1];
}
function cdata($parser, $cdata)
{
$this->data['data'] .= $cdata;
}
function tag_close($parser, $tag)
{
$this->data =& $this->datas[count($this->datas)-1];
array_pop($this->datas);
}
}
$xml_parser = new Simple_Parser;
$xml_parser->parse('<foo><bar>test</bar></foo>');
?>
Didier: dlvb ** free * fr
24-Dec-2006 06:53
Hi !
After parsing the XML and modifying it, I just added a method to rebuild the XML from the internal structure (xmlp->document).
The method xmlp->toXML writes into the xmlp->XML attribute. Then you just have to output it.
I hope it helps.
class XMLParser {
var $parser;
var $filePath;
var $document;
var $currTag;
var $tagStack;
var $XML;
var $_tag_to_close = false;
var $TAG_ATTRIBUT = 'attr';
var $TAG_DATA = 'data';
function XMLParser($path) {
$this->parser = xml_parser_create();
$this->filePath = $path;
$this->document = array();
$this->currTag =& $this->document;
$this->tagStack = array();
$this->XML = "";
}
function parse() {
xml_set_object($this->parser, $this);
xml_set_character_data_handler($this->parser, 'dataHandler');
xml_set_element_handler($this->parser, 'startHandler', 'endHandler');
if(!($fp = fopen($this->filePath, "r"))) {
die("Cannot open XML data file: $this->filePath");
return false;
}
while($data = fread($fp, 4096)) {
if(!xml_parse($this->parser, $data, feof($fp))) {
die(sprintf("XML error: %s at line %d",
xml_error_string(xml_get_error_code($this->parser)),
xml_get_current_line_number($this->parser)));
}
}
fclose($fp);
xml_parser_free($this->parser);
return true;
}
function startHandler($parser, $name, $attribs) {
if(!isset($this->currTag[$name]))
$this->currTag[$name] = array();
$newTag = array();
if(!empty($attribs))
$newTag[$this->TAG_ATTRIBUT] = $attribs;
array_push($this->currTag[$name], $newTag);
$t =& $this->currTag[$name];
$this->currTag =& $t[count($t)-1];
array_push($this->tagStack, $name);
}
function dataHandler($parser, $data) {
$data = trim($data);
if(!empty($data)) {
if(isset($this->currTag[$this->TAG_DATA]))
$this->currTag[$this->TAG_DATA] .= $data;
else
$this->currTag[$this->TAG_DATA] = $data;
}
}
function endHandler($parser, $name) {
$this->currTag =& $this->document;
array_pop($this->tagStack);
for($i = 0; $i < count($this->tagStack); $i++) {
$t =& $this->currTag[$this->tagStack[$i]];
$this->currTag =& $t[count($t)-1];
}
}
function clearOutput () {
$this->XML = "";
}
function openTag ($tag) {
$this->XML.="<".strtolower ($tag);
$this->_tag_to_close = true;
}
function closeTag () {
if ($this->_tag_to_close) {
$this->XML.=">";
$this->_tag_to_close = false;
}
}
function closingTag ($tag) {
$this->XML.="</".strtolower ($tag).">";
}
function output_attributes ($contenu_fils) {
foreach ($contenu_fils[$this->TAG_ATTRIBUT] as $nomAttribut => $valeur) {
$this->XML.= " ".strtolower($nomAttribut)."=\"".$valeur."\"";
}
}
function addData ($texte) {
// to be completed
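$ca = array ("é", "è", "ê", "à");
// numeric character references, because XML predefines no named entities for accented letters
$par = array ("&#233;", "&#232;", "&#234;", "&#224;");
// escape markup characters first so the references added afterwards are not escaped again
return str_replace ($ca, $par, htmlspecialchars($texte, ENT_NOQUOTES));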
}
function toXML ($tags="") {
if ($tags=="") {
$tags = $this->document;
$this->clearOutput ();
}
foreach ($tags as $tag => $contenu) {
$this->process ($tag, $contenu);
}
}
function process ($tag, $contenu) {
// Pour tous les TAGs
foreach ($contenu as $indice => $contenu_fils) {
$this->openTag ($tag);
// Pour tous les fils (non attribut et non data)
foreach ($contenu_fils as $tagFils => $fils) {
switch ($tagFils) {
case $this->TAG_ATTRIBUT:
$this->output_attributes ($contenu_fils);
$this->closeTag ();
break;
case $this->TAG_DATA:
$this->closeTag ();
$this->XML.= $this->addData ($contenu_fils [$this->TAG_DATA]);
break;
default:
$this->closeTag ();
$this->process ($tagFils, $fils);
break;
}
}
$this->closingTag ($tag);
}
}
}
dmeekins att gmail doot com
20-Dec-2006 04:02
I reworked some of the code I found posted previously here, mainly so I could access the structure of the parsed xml file by the tags' names. So if I was parsing html that's also valid xml, I could access the page title by $xmlp->document['HTML'][0]['HEAD'][0]['TITLE'][0]['data']. The index after the tag name corresponds to the occurrence of that tag. If there were two <head></head> in the same depth, then the second one could get accessed by ['HEAD'][1].
<?php
class XMLParser
{
var $parser;
var $filePath;
var $document;
var $currTag;
var $tagStack;
function XMLParser($path)
{
$this->parser = xml_parser_create();
$this->filePath = $path;
$this->document = array();
$this->currTag =& $this->document;
$this->tagStack = array();
}
function parse()
{
xml_set_object($this->parser, $this);
xml_set_character_data_handler($this->parser, 'dataHandler');
xml_set_element_handler($this->parser, 'startHandler', 'endHandler');
if(!($fp = fopen($this->filePath, "r")))
{
die("Cannot open XML data file: $this->filePath");
return false;
}
while($data = fread($fp, 4096))
{
if(!xml_parse($this->parser, $data, feof($fp)))
{
die(sprintf("XML error: %s at line %d",
xml_error_string(xml_get_error_code($this->parser)),
xml_get_current_line_number($this->parser)));
}
}
fclose($fp);
xml_parser_free($this->parser);
return true;
}
function startHandler($parser, $name, $attribs)
{
if(!isset($this->currTag[$name]))
$this->currTag[$name] = array();
$newTag = array();
if(!empty($attribs))
$newTag['attr'] = $attribs;
