PHP scripts are usually limited (via the memory_limit setting) to somewhere between 16 MB and 128 MB of RAM. So what happens if you want to parse a 1.1 GB file of exported product data (over 500,000 products) without hitting that limit?
At first this seems like a nearly impossible task, because a tree-based parser needs the entire XML document in memory before it can build its tree.
Usually you would call something like file_get_contents() and then parse the returned string, but that would load the entire 1.1 GB of XML into memory and put you well beyond most PHP memory limits.
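What you need to do instead is read the XML in small chunks (the example below uses 128 KB chunks) and feed them to the parser bit by bit. PHP's event-based XML parser accepts its input piece by piece and calls your handler functions as elements open and close, so it never needs the whole document at once. This way you parse through the XML file quickly while steering clear of the PHP memory limit.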
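<?php
set_time_limit(0); // a file this size can take a while, so disable the execution time limit
define('__BUFFER_SIZE__', 131072); // bytes per read: 128 KB
define('__XML_FILE__', 'pf_1360591.xml');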
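// $p is the parser, $n the element name, $a its attributes
function elementStart($p, $n, $a) {
    // handle opening of elements
}
function elementEnd($p, $n) {
    // handle closing of elements
}
// $d is a piece of character data; this can be called more than once per element
function elementData($p, $d) {
    // handle character data in elements
}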
$xml = xml_parser_create();
xml_parser_set_option($xml, XML_OPTION_TARGET_ENCODING, 'UTF-8');
xml_parser_set_option($xml, XML_OPTION_CASE_FOLDING, 0);
xml_parser_set_option($xml, XML_OPTION_SKIP_WHITE, 1);
xml_set_element_handler($xml, 'elementStart', 'elementEnd');
xml_set_character_data_handler($xml, 'elementData');
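$f = fopen(__XML_FILE__, 'rb');
if ($f) {
    while (!feof($f)) {
        // read the next chunk; the last argument tells the parser when the input ends
        $content = fread($f, __BUFFER_SIZE__);
        if (!xml_parse($xml, $content, feof($f))) {
            // stop on malformed XML and report where it went wrong
            die(sprintf('XML error: %s at line %d',
                xml_error_string(xml_get_error_code($xml)),
                xml_get_current_line_number($xml)));
        }
        unset($content);
    }
    fclose($f);
}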
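
The handlers above are left empty, so here is a minimal sketch of how they might be filled in. It assumes a hypothetical layout in which each product element has name and price children (the real export's element names are not shown here), uses closures instead of named functions so the shared state does not need globals, and feeds a small in-memory sample to the parser 16 bytes at a time to stand in for fread(). The important detail is that the character data handler can fire several times for a single element, for example when a chunk boundary falls in the middle of the text, so the text has to be accumulated and only used once the closing tag arrives.

```php
<?php
// Hypothetical sample data: a tiny stand-in for the large product export.
$sample = '<products>'
    . '<product><name>Red &amp; blue kettle</name><price>19.99</price></product>'
    . '<product><name>Toaster</name><price>24.50</price></product>'
    . '</products>';

$products = [];   // finished products (only sensible for a small demo)
$current  = null; // product being built, or null when outside <product>
$text     = '';   // character data collected for the element being read

$xml = xml_parser_create('UTF-8');
xml_parser_set_option($xml, XML_OPTION_CASE_FOLDING, 0);

xml_set_element_handler(
    $xml,
    // opening tag: start a new product, and reset the text buffer
    function ($p, $n, $a) use (&$current, &$text) {
        if ($n === 'product') {
            $current = [];
        }
        $text = '';
    },
    // closing tag: store the collected text, or finish the product
    function ($p, $n) use (&$products, &$current, &$text) {
        if ($n === 'product') {
            $products[] = $current;
            $current = null;
        } elseif ($current !== null) {
            $current[$n] = trim($text);
        }
        $text = '';
    }
);

// character data: append, because one element's text may arrive in pieces
xml_set_character_data_handler($xml, function ($p, $d) use (&$text) {
    $text .= $d;
});

// Feed 16 bytes at a time so that tags and text are split across chunks.
$chunks = str_split($sample, 16);
$last = count($chunks) - 1;
foreach ($chunks as $i => $chunk) {
    if (!xml_parse($xml, $chunk, $i === $last)) {
        throw new RuntimeException(xml_error_string(xml_get_error_code($xml)));
    }
}

print_r($products);
```

With a real 1.1 GB file you would not keep every product in $products. Write each one to the database (or wherever it is going) in the closing-tag handler and then discard it, otherwise the array itself will eat the memory you just saved.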