Search for Broken Images

12 posts by 2 authors in: Forums > CMS Builder
Last Post: August 5, 2010

Hey,

Can anyone think of a way to search for broken images, i.e.:

A table has an upload field, and each record has an image "uploaded" (it has urlPath etc. in SQL).

However, in some instances, the physical image is missing on the server.

So, the output is that there is a filepath/image, but it's not displayed as it's physically not there.

Is there any way I can write a script which will check whether each image URL (urlPath) works or returns a 404?

I just need to be able to identify the records in question (with ultimate aim to delete the urlPath from the SQL, but I can do that once I have identified the broken paths).

Is that even possible?

Cheers!

Re: [rjbathgate] Search for Broken Images

By Chris - August 2, 2010

Hi rjbathgate,

Here's a super simple solution that might help:

<?php header('Content-type: text/html; charset=utf-8'); ?>
<?php
require_once "C:/wamp/www/sb/CMS Builder/cmsAdmin/lib/viewer_functions.php";

list($uploads,) = getRecords(array(
  'tableName' => 'uploads',
));

foreach ($uploads as $upload) {
  echo("<a href=\"/cmsAdmin/admin.php?menu={$upload['tableName']}&action=edit&num={$upload['recordNum']}\"><img src=\"{$upload['urlPath']}\"></a>");
}
?>


You'll want to replace the require_once path and the /cmsAdmin/admin.php URL above with your own.

This script will list all the uploads in your database (as images), and clicking on one should take you to edit the record which owns the upload. Would looking through them for broken images help?

If not, or if you have any questions, please let me know.
All the best,
Chris

Re: [chris] Search for Broken Images

By rjbathgate - August 2, 2010 - edited: August 2, 2010

Hey,

Thanks for the reply...

Only issue is I have 110,000 records, so ideally I was after a way to only display those which are broken...

The below will try to display the images so at least I can go thru and identify broken ones, but going thru a page of 110,000 results will be quite time consuming...

EDIT: cannot run the above, memory limit exhausted. I could break it down into limit/offset but again that's even more time consuming... :(

But logically I can't see a way, as the php can't know if the url is 404 or not... can it?

Cheers
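
On the "can PHP know?" question: if the uploads live on the same server as the script, one option is to skip HTTP entirely and check the disk. A minimal sketch, assuming each urlPath starts with "/" and is relative to the web server's document root; findMissingUploads and $docRoot are made-up names for illustration, not part of CMS Builder:

```php
<?php
// Hypothetical helper: returns the upload rows whose file is absent on disk.
// Assumes each row has a 'urlPath' that starts with "/" and is relative to $docRoot.
function findMissingUploads(array $uploads, string $docRoot): array {
    $missing = array();
    foreach ($uploads as $upload) {
        // Decode %20 etc. so the path matches the real file name on disk.
        $diskPath = rtrim($docRoot, '/\\') . rawurldecode($upload['urlPath']);
        if (!is_file($diskPath)) {
            $missing[] = $upload;
        }
    }
    return $missing;
}
```

Passing it one batch of upload rows at a time, with $_SERVER['DOCUMENT_ROOT'] as the second argument, should keep memory use down. It would not catch files that exist on disk but are blocked by the web server.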

Re: [rjbathgate] Search for Broken Images

By Chris - August 3, 2010

Hi rjbathgate,

How about this?

<?php header('Content-type: text/html; charset=utf-8'); ?>
<?php
require_once "C:/wamp/www/sb/CMS Builder/cmsAdmin/lib/viewer_functions.php";

// Request $url from the local web server and return the HTTP status code (0 on failure).
function check404($url) {
  $handle = @fsockopen("tcp://localhost", 80, $errno, $errstr, 5);
  if (!$handle) { return 0; }

  $url = str_replace(' ', '%20', $url);

  $request = "GET " . $url . " HTTP/1.0\r\n\r\n";
  fwrite($handle, $request);

  $response = '';
  while (!feof($handle)) {
    $buffer = fgets($handle, 128);
    if ($buffer === false) { break; } // fgets returns false on error/EOF; prevents infinite loops
    $response .= $buffer;
  }
  fclose($handle);

  // the status code is on the first line of the response, e.g. "HTTP/1.1 404 Not Found"
  $httpStatusCode = 0;
  if (preg_match("/^HTTP\S+ (\d+) /", $response, $matches)) { $httpStatusCode = $matches[1]; }

  return intval($httpStatusCode);
}

// walk the uploads table 100 records at a time to stay within the memory limit
$page = 0;
while (true) {
  $page++;
  list($uploads,) = getRecords(array(
    'tableName' => 'uploads',
    'perPage'   => 100,
    'pageNum'   => $page,
    'orderBy'   => 'tableName, recordNum+0',
  ));
  if (empty($uploads)) { break; }

  foreach ($uploads as $upload) {
    $httpStatusCode = check404($upload['urlPath']);
    if ($httpStatusCode == 404) {
      // link to the record that owns the broken upload
      echo "<a href=\"/cmsAdmin/admin.php?menu={$upload['tableName']}&action=edit&num={$upload['recordNum']}\">";
      echo "{$upload['tableName']} {$upload['recordNum']}";
      echo "</a><br>";
    }
  }
}
?>


Does that help?
All the best,
Chris

Re: [chris] Search for Broken Images

Wow, thanks heaps!

It's almost working, although it's returning 404 on everything (i.e. including those which are valid).

I've checked the code through, and it's using the right file paths throughout, i.e. checking the right path for the image, which is a valid URL, but 404 is still returned in $httpStatusCode.

Is there likely to be some server specific limitations/settings preventing it from working?

Many thanks Chris

Re: [chris] Search for Broken Images

Hey Chris,

On checking '/' $httpStatusCode = 403 (forbidden)

On ../index.php I get 400 (bad request) so presume it needs to be root.

On http://www.domain.com/index.php I get 404

and on full path to root I get 404 too

So it doesn't seem to successfully get anything.

Re what URLs I'm checking: above I've checked basic ones (/, index.php etc.)

And for the images, the urls being checked are in format:

/cmsAdmin/uploads/ist2_11299731-young-serious-man.jpg

for example.

Thanks heaps!
Rob

Re: [rjbathgate] Search for Broken Images

By Chris - August 4, 2010

Hyphens and underscores are "safe" characters, so those URLs look fine.

Hmm, getting a 403 for / is troubling...

I wonder if you should be sending the Host header?

Try changing this:

$request = "GET " . $url . " HTTP/1.0\r\n\r\n";

...to this:

$request = "GET " . $url . " HTTP/1.0\r\nHost: www.mywebsite.com\r\n\r\n";

If that doesn't help, can you tell me what your host is and I can try some things from here?
All the best,
Chris

Re: [chris] Search for Broken Images

Hey

Same problem I'm afraid.

Will send u host details in email.

Thanks heaps,
Rob

Re: [rjbathgate] Search for Broken Images

Just thought the fopen script could help here maybe...

if (@fopen($url, "r"))
{
  echo "This image is here";
}
else
{
  echo "This image isn't here";
}


Will give this a go later on... might be barking up the wrong tree, but I've just used it successfully in a different instance, same principle though...

Cheers
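
For the fopen idea, here is a rough sketch using PHP's get_headers() instead, which exposes the actual status line rather than just success or failure. It assumes allow_url_fopen is enabled and that $url is a full URL (scheme and host, not just the urlPath); statusFromHeaders and urlStatus are hypothetical helper names:

```php
<?php
// Extracts the final HTTP status code from a get_headers()-style array.
// Redirects produce several "HTTP/..." lines; the last one is the final answer.
function statusFromHeaders(array $headers): int {
    $status = 0;
    foreach ($headers as $line) {
        if (is_string($line) && preg_match('/^HTTP\/\S+\s+(\d{3})/', $line, $m)) {
            $status = (int) $m[1];
        }
    }
    return $status;
}

// Hypothetical wrapper: returns 0 when the request itself fails.
// Needs allow_url_fopen enabled and a full URL including scheme and host.
function urlStatus(string $url): int {
    $headers = @get_headers($url);
    return is_array($headers) ? statusFromHeaders($headers) : 0;
}
```

Because the request goes to the real host name, the web server should pick the right virtual host, which a bare request to localhost may not.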