PHP for loop with SimpleXMLElement is very slow: memory issue?

I currently have some PHP code that reads an XML file and creates a SimpleXML object with $products = new SimpleXMLElement($xmlString); I then go through it with a for loop, in which I set the product details for each product in the XML document. Each product is then saved to a MySQL database.

While the script runs, products are added less and less frequently, until they eventually stop being added before the maximum is reached. I have tried running garbage collection in between, to no avail, as well as unsetting various variables, which doesn't seem to help either.

Part of the code is shown below:

<?php
$servername = "localhost";
$username = "database.database";
$password = "demwke";
$database = "databasename";
$conn = new mysqli($servername, $username, $password, $database);

$file = "large.xml";
$xmlString = file_get_contents($file);
$products = new SimpleXMLElement($xmlString);
unset($xmlString, $file);
$total = count($products->datafeed[0]);

echo 'Starting<br><br>';

for($i=0;$i<$total;$i++){
    $id = $products->datafeed->prod[$i]['id'];
    // etc etc
    $sql = "INSERT INTO products (id, name, uid, cat, prodName, brand, desc, link, imgurl, price, subcat) VALUES ('$id', '$store', '$storeuid', '$category', '$prodName', '$brand', '$prodDesc', '$link', '$image', '$price', '$subCategory')";
}
echo '<br>Finished';
?>

All the PHP variables are defined with a line similar to the one for $id; those lines were removed here to make the code easier to read.

Update: do not use indexes with SimpleXML unless you have very few elements. Use foreach instead:

// Before, with [index]:
for ($i=0;$i<$total;$i++) {
    $id = $products->datafeed->prod[$i]['id'];
    ...

// After, with foreach():
$i = -1; // Start at -1 so $i matches the old [index] inside the loop
foreach ($products->datafeed->prod as $prod) {
    $i++; // Remove if you don't actually need $i
    $id = $prod['id'];
    ...

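Accessing ...->node[$i] walks the node[] collection from the start up to the requested index, so iterating over all the nodes is not O(N) but O(N^2). There is no way around this, because nothing guarantees that when you access item K you have just accessed item K-1 (and so on). foreach keeps track of its position, so it runs in O(N).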

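For the same reason, it can pay to foreach over the whole collection even if you only need a few known items (unless they are few and very close to the start of the collection):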

    $a[0] = $products->datafeed->prod[15]['id'];
    ...
    $a[35] = $products->datafeed->prod[1293]['id'];

// After, with foreach():
$want = [ 15, /* ... */ 1293 ];
$i = -1; // Start at -1 so the first element is index 0, as with prod[0]
foreach ($products->datafeed->prod as $prod) {
    if (!in_array(++$i, $want)) {
        continue;
    }
    $a[] = $prod['id'];
}

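Also, I notice that you are putting values from the XML straight into a MySQLi query. You should (really, must) use prepared statements or proper escaping to avoid SQL injection and broken queries, even if you trust the source of the XML (you never know... :-)).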

I also noticed that you access the XML by index:

for($i=0;$i<$total;$i++){
    $id = $products->datafeed->prod[$i]['id'];

... which gets slower and slower with a SimpleXML object. This is the Schlemiel the Painter problem.

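The fix is the same as for Schlemiel: "remember where you are" instead of starting over each time, that is, "use foreach".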

So, try this:

$i = -1;
foreach ($products->datafeed->prod as $prod) {
    $i++;
    $id = $prod['id'];
    ...
}
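
To make the foreach version concrete, here is a minimal runnable sketch. The inline XML, the name and price child elements, and the $rows array are assumptions made up for illustration; only the datafeed, prod and id names come from the question. The casts are an addition of this sketch: SimpleXML hands back SimpleXMLElement objects, so casting to a scalar lets you keep plain values instead of references into the XML tree.

```php
<?php
// Tiny stand-in for the real feed. Everything except datafeed/prod/id
// is a hypothetical name chosen for this sketch.
$xmlString = '<feed><datafeed>'
    . '<prod id="101"><name>Lamp</name><price>9.50</price></prod>'
    . '<prod id="102"><name>Desk</name><price>120</price></prod>'
    . '</datafeed></feed>';

$products = new SimpleXMLElement($xmlString);

$rows = [];
foreach ($products->datafeed->prod as $prod) {
    // Each $prod is reached by moving forward from the previous one,
    // so no element is looked up from the start of the list again.
    $rows[] = [
        'id'    => (int) $prod['id'],      // attribute
        'name'  => (string) $prod->name,   // child element
        'price' => (float) $prod->price,
    ];
}

print_r($rows);
```

In the real script the body of the loop would hand these values to the database instead of collecting them in an array.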

I tested this with a generated XML:

// Stage 1. Create a large XML.
$xmlString = '<?xml version="1.0" encoding="UTF-8" ?>';
$xmlString .= '<content><package>';
for ($i = 0; $i < 100000; $i++) {
    $xmlString .=  "<entry><id>{$i}</id><text>The quick brown fox did what you would expect</text></entry>";
}
$xmlString .= '</package></content>';

// Stage 2. Load the XML.
$xml    = new SimpleXMLElement($xmlString);

$tick   = microtime(true);
for ($i = 0; $i < 100000; $i++) {
    $id = $xml->package->entry[$i]->id;
    if (0 === ($id % 5000)) {
        $t = microtime(true) - $tick;
        print date("H:i:s") . " id = {$id} at {$t}\n";
        $tick = microtime(true);
    }
}

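After the XML is created, a loop walks through it and prints how long each batch of 5000 elements takes. To check that the elements are really being read, the ID is decoded. The time per batch keeps growing.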

21:22:35 id = 0 at 2.7894973754883E-5
21:22:35 id = 5000 at 0.38135695457458
21:22:38 id = 10000 at 2.9452259540558
21:22:44 id = 15000 at 5.7002019882202
21:22:52 id = 20000 at 8.0867099761963
21:23:02 id = 25000 at 10.477082967758
21:23:15 id = 30000 at 12.81209897995
21:23:30 id = 35000 at 15.120756149292

So this does seem to be the case: accessing the XML by index gets slower and slower.

Now, with foreach:

// Stage 1. Create a large XML.
$xmlString = '<?xml version="1.0" encoding="UTF-8" ?>';
$xmlString .= '<content><package>';
for ($i = 0; $i < 100000; $i++) {
    $xmlString .=  "<entry><id>{$i}</id><text>The quick brown fox did ENTRY {$i}.</text></entry>";
}
$xmlString .= '</package></content>';

// Stage 2. Load the XML.
$xml    = new SimpleXMLElement($xmlString);

$i      = 0;
$tick   = microtime(true);
foreach ($xml->package->entry as $data) {
    // $id = $xml->package->entry[$i]->id;
    $id = $data->id;
    $i++;
    if (0 === ($id % 5000)) {
        $t = microtime(true) - $tick;
        print date("H:i:s") . " id = {$id} at {$t} ({$data->text})\n";
        $tick = microtime(true);
    }
}

... the "slowdown" is gone: every batch of 5000 elements now takes about the same time.

(I also changed the text of each entry and print it, to verify that the loop really reads the XML.)

21:33:42 id = 0 at 3.0994415283203E-5 (The quick brown fox did ENTRY 0.)
21:33:42 id = 5000 at 0.0065329074859619 (The quick brown fox did ENTRY 5000.)
...
21:33:42 id = 95000 at 0.0065121650695801 (The quick brown fox did ENTRY 95000.)
Another option is to process the file in chunks, for example no more than 5k products per page load.

<?php
$servername = "localhost";
$username = "database.database";
$password = "demwke";
$database = "databasename";
$conn = new mysqli($servername, $username, $password, $database);

$file = "large.xml";
$xmlString = file_get_contents($file);
$products = new SimpleXMLElement($xmlString);
unset($xmlString, $file);

$total = count($products->datafeed[0]);

//get your starting value for this iteration
$start = isset($_GET['start'])?(int)$_GET['start']:0;

//determine when to stop
//process no more than 5k at a time
$step = 5000;
//where to stop, either after our step (max) or the end
$limit = min($start+$step, $total);

echo 'Starting<br><br>';

//modified loop so $i starts at our start value and stops at our $limit for this load.
for($i=$start;$i<$limit;$i++){
    $id = $products->datafeed->prod[$i]['id'];
    // etc etc
    $sql = "INSERT INTO products (id, name, uid, cat, prodName, brand, desc, link, imgurl, price, subcat) VALUES ('$id', '$store', '$storeuid', '$category', '$prodName', '$brand', '$prodDesc', '$link', '$image', '$price', '$subCategory')";
}

if($limit >= $total){
    echo '<br>Finished';
} else {
    echo<<<HTML
<html><head>
<meta http-equiv="refresh" content="2;URL=?start={$limit}">
</head><body>
Done processing {$start} through {$limit}. Moving on to next set in 2 seconds.
</body></html>
HTML;
}
?>


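There are a few separate issues here:

file_get_contents() and SimpleXML both load the entire file into memory. With a large file that adds up quickly.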

XMLReader reads the file node by node instead:

$reader = new XMLReader;
$reader->open($file);
$dom = new DOMDocument;
$xpath = new DOMXpath($dom);

// look for the first product element
while ($reader->read() && $reader->localName !== 'product') {
  continue;
}

// while you have a product element
while ($reader->localName === 'product') {
  // expand product element to a DOM node
  $node = $reader->expand($dom);
  // use XPath to fetch values from the node
  var_dump(
    $xpath->evaluate('string(@category)', $node),
    $xpath->evaluate('string(name)', $node),
    $xpath->evaluate('number(price)', $node)
  );
  // move to the next product sibling
  $reader->next('product');
}
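
If you would rather keep the SimpleXML syntax for the fields, the same streaming idea can be wrapped in a generator. This is only a sketch under assumptions: streamProducts() is a hypothetical helper name, the product, name and price elements follow the example above rather than the question's feed, and the XML is an inline string so the snippet runs on its own (use $reader->open($file) for a real file).

```php
<?php
/**
 * Yield each matching element as a SimpleXMLElement, one at a time.
 * Hypothetical helper for illustration; not part of XMLReader.
 */
function streamProducts(XMLReader $reader, string $tag = 'product'): Generator
{
    // Skip ahead to the first matching element.
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->localName === $tag) {
            break;
        }
    }

    while ($reader->nodeType === XMLReader::ELEMENT && $reader->localName === $tag) {
        // Only the current element is turned into a SimpleXML object.
        yield new SimpleXMLElement($reader->readOuterXml());

        // Jump to the next sibling with the same name; stop when there is none.
        if (!$reader->next($tag)) {
            break;
        }
    }
}

$reader = new XMLReader();
$reader->XML(
    '<products>'
    . '<product category="a"><name>Lamp</name><price>9.5</price></product>'
    . '<product category="b"><name>Desk</name><price>120</price></product>'
    . '</products>'
);

$names = [];
foreach (streamProducts($reader) as $product) {
    $names[] = (string) $product->name;
}
$reader->close();

print_r($names);
```

The design choice here is that memory use depends on the size of one product, not on the size of the whole file.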


Your script may also hit a timeout. If so, raise the limit with set_time_limit().

Inserting the rows one at a time is slow as well. Collect a batch of rows and send them in a single INSERT:

INSERT INTO table 
   (field1, field2) 
VALUES 
   (value1_1, value1_2), 
   (value2_1, value2_2), ...
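
The multi-row INSERT above can be combined with placeholders, so that values from the XML never end up inside the SQL text. A minimal sketch, assuming a hypothetical helper called buildBatchInsert() (it is not part of mysqli or PDO) and made-up sample rows. The table and column names are put into the statement as-is, so they must come from your own code, never from the XML. The backticks also matter for the question's table: desc is a reserved word in MySQL.

```php
<?php
/**
 * Build one multi-row INSERT with "?" placeholders plus the flat
 * parameter list that goes with it.
 * Hypothetical helper for illustration only.
 */
function buildBatchInsert(string $table, array $columns, array $rows): array
{
    // One "(?, ?, ...)" group per row.
    $group = '(' . implode(', ', array_fill(0, count($columns), '?')) . ')';

    $sql = sprintf(
        'INSERT INTO `%s` (`%s`) VALUES %s',
        $table,
        implode('`, `', $columns),
        implode(', ', array_fill(0, count($rows), $group))
    );

    // Flatten the rows in column order so they line up with the placeholders.
    $params = [];
    foreach ($rows as $row) {
        foreach ($columns as $column) {
            $params[] = $row[$column];
        }
    }

    return [$sql, $params];
}

[$sql, $params] = buildBatchInsert('products', ['id', 'desc'], [
    ['id' => '101', 'desc' => "It's a lamp"],
    ['id' => '102', 'desc' => 'A desk'],
]);

echo $sql, "\n";

// With mysqli, the pair could then be used roughly like this:
//   $stmt = $conn->prepare($sql);
//   $stmt->bind_param(str_repeat('s', count($params)), ...$params);
//   $stmt->execute();
```

Flushing a batch every few hundred rows is a reasonable starting point; the right batch size depends on the row size and the server's packet limit.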

Alternatively, write the SQL to a file and import it with the mysql command-line client, for example through exec().

Here are 2 things you can try.

1) Increase PHP's maximum execution time from the default 30 seconds to something larger.
   ini_set('max_execution_time', 300000);

2) If that fails, try running your code through a cron job or another back-end process.

Split the XML into several files, such as file1, file2, file3, and process them one at a time.

You can split your XML with a text editor that can open large files. Don't waste time on PHP just to chop up a file.

Edit: I found an answer for huge XML files, and I think it is the best one for this purpose: "Parsing huge XML files in PHP".
