Catch 404 error in DOMDocument->load()

I load a bunch of RSS feeds using the DOM, and sometimes one of them returns a 404 instead of the feed. The problem is that the web server then sends an HTML 404 page instead of the expected XML file. I use this code:

$rssDom = new DOMDocument();
$rssDom->load($url);
$channel = $rssDom->getElementsByTagName('channel');
$channel = $channel->item(0);
$items = $channel->getElementsByTagName('item');

I get this warning:

Warning: DOMDocument::load() [domdocument.load]: Entity 'nbsp' not defined

Followed by this error:

Fatal error: Call to a member function getElementsByTagName() on a non-object

Usually this code works fine, but on the occasions when I get a 404, it breaks. I tried a standard try-catch around the load statement, but it didn't seem to catch anything.

+3
5 answers

You can suppress the output of parsing errors with

libxml_use_internal_errors(true);
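
With internal error handling enabled, libxml buffers the parse errors instead of printing them, and they can be read back with libxml_get_errors(). A minimal sketch, assuming the response body is already in a string; the function name collectXmlErrors and the sample markup are made up for illustration:

```php
<?php
// Hypothetical helper: parse a string and return libxml's error messages
// instead of letting PHP print them as warnings.
function collectXmlErrors(string $xml): array
{
    $previous = libxml_use_internal_errors(true); // buffer errors, remember old setting
    libxml_clear_errors();                        // start from an empty buffer

    $dom = new DOMDocument();
    $dom->loadXML($xml);

    $messages = [];
    foreach (libxml_get_errors() as $error) {
        // Each entry is a LibXMLError with level, code, line and message.
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }

    libxml_clear_errors();
    libxml_use_internal_errors($previous);        // restore the caller's setting

    return $messages;
}

// An HTML error page like the one in the question: &nbsp; is not an XML entity.
$errors = collectXmlErrors('<html><body>Not&nbsp;Found</body></html>');
print_r($errors);
```

An empty result means the string parsed cleanly; anything else can be logged rather than shown to the visitor.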

To check whether the response is a 404, you can inspect $http_response_header after calling DOMDocument::load()

Example:

libxml_use_internal_errors(true);
$rssDom = new DOMDocument();
$rssDom->load($url);
if (isset($http_response_header[0]) && strpos($http_response_header[0], '404') !== false) {
    die('file not found. exiting.');
}

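You could also fetch the feed with file_get_contents, check the response for a 404, and then pass the content to DOMDocument::loadXml. That keeps DOMDocument from parsing anything that is not XML.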

, , 404 .
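
The file_get_contents route could look roughly like the sketch below. The helper names lastStatusCode and loadFeed are invented for this example, and treating only a 200 as success is an assumption, not something the answer specifies. The last status line is used so that a redirect followed by a 200 still counts as success.

```php
<?php
// Hypothetical helper: return the status code of the last "HTTP/..." line,
// e.g. "HTTP/1.1 404 Not Found" -> 404. Returns 0 if none is found.
function lastStatusCode(array $headers): int
{
    $status = 0;
    foreach ($headers as $line) {
        if (preg_match('~^HTTP/\S+\s+(\d{3})~', $line, $m)) {
            $status = (int) $m[1];
        }
    }
    return $status;
}

// Hypothetical helper: fetch $url over HTTP and return a DOMDocument,
// or null on any failure.
function loadFeed(string $url): ?DOMDocument
{
    // ignore_errors makes file_get_contents return the body even for 4xx/5xx,
    // so the status is checked explicitly below.
    $context = stream_context_create(['http' => ['ignore_errors' => true]]);
    $body = @file_get_contents($url, false, $context);

    // The HTTP wrapper fills $http_response_header in the local scope.
    $status = lastStatusCode($http_response_header ?? []);
    if ($body === false || $body === '' || $status !== 200) {
        return null;
    }

    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $ok = $dom->loadXML($body);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok ? $dom : null;
}
```

A caller would then test the result of loadFeed($url) against null before touching getElementsByTagName().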

+3

Fetch the HTML with file_get_contents or curl and, if the request succeeds, pass the result to DOMDocument::loadHTML.

With curl, the HTTP status code is available through curl_getinfo.
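
A rough sketch of the curl variant, assuming the curl extension is installed. The helper names, the 10-second timeout and the "only 200 is success" rule are illustrative choices, not part of the answer:

```php
<?php
// Hypothetical helper: GET $url with curl and return the status code and body.
function fetchWithCurl(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // assumed timeout in seconds
    $body = curl_exec($ch);                         // false on transport errors
    $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);

    return ['status' => $status, 'body' => $body];
}

// Hypothetical helper: hand the body to DOMDocument only when the request succeeded.
function htmlDocumentFromResponse(array $response): ?DOMDocument
{
    if ($response['status'] !== 200
        || !is_string($response['body'])
        || $response['body'] === '') {
        return null;
    }

    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $ok = $dom->loadHTML($response['body']);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok ? $dom : null;
}
```

Usage would be along the lines of htmlDocumentFromResponse(fetchWithCurl($url)), with a null result meaning the feed should be skipped. For an RSS feed, loadXML may be the better fit than loadHTML.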

+2

To get rid of the warning, pass LIBXML_NOWARNING as an option to load().

The more important issue here is the fatal error. To avoid it, you should check whether the document was loaded correctly: store the return value of load() and test it:

$loaded = $rssDom->load($url, LIBXML_NOWARNING);
if($loaded){
    $channel = $rssDom->getElementsByTagName('channel');
    $channel = $channel->item(0);
    $items = $channel->getElementsByTagName('item');
}else{
    // show error-message or something like that
}
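
A successful load only means the response was well-formed XML. A well-formed error page would still load, and item(0) would then return null, which is what triggers the fatal error. A sketch that guards both cases, parsing from a string so it can be tried without a network call; feedItems is a made-up name, and libxml_use_internal_errors is used here in place of LIBXML_NOWARNING:

```php
<?php
// Hypothetical helper: return the <item> elements of the first <channel>,
// or an empty array when the document is not a usable feed.
function feedItems(string $xml): array
{
    if ($xml === '') {
        return []; // loadXML() rejects an empty string
    }

    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $loaded = $dom->loadXML($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return []; // not well-formed XML, e.g. an HTML error page
    }

    $channel = $dom->getElementsByTagName('channel')->item(0);
    if ($channel === null) {
        return []; // well-formed XML, but no <channel> element
    }

    return iterator_to_array($channel->getElementsByTagName('item'));
}
```

Calling feedItems() on an HTML 404 body yields an empty array instead of a fatal error, so the caller can simply loop over the result.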

0

Like this:

$rssDom = new DOMDocument();
if($rssDom->load($url)) {
   $channel = $rssDom->getElementsByTagName('channel');
   $channel = $channel->item(0);
   $items = $channel->getElementsByTagName('item');
}

0

If someone needs a solution, this works like a charm:

$objDOM = new DOMDocument();
$loaded = @$objDOM->load($url);

if (!$loaded){
    //something went terribly wrong
} else {
    //this is going ok!!
}

This works because the "@" operator suppresses the warning, and load() returns true on success and false on failure.

0
