Search for Broken Images
12 posts by 2 authors in: Forums > CMS Builder
Last Post: August 5, 2010
By rjbathgate - August 2, 2010
Can anyone think of a way to search for broken images, i.e.:
A table has an upload field, and each record has an image "uploaded" (it has a urlPath etc. in SQL).
However, in some instances, the physical image is missing on the server.
So, the output is that there is a filepath/image, but it's not displayed as it's physically not there.
Is there any way I can write a script which will check whether each image URL (urlPath) works or returns a 404?
I just need to be able to identify the records in question (with ultimate aim to delete the urlPath from the SQL, but I can do that once I have identified the broken paths).
Is that even possible?
Cheers!
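Because the images are described as missing on the server itself, one option is to skip HTTP and test for the file on disk. The sketch below assumes each record's urlPath maps directly onto a path under the web root, which may not hold on every setup. findMissingUploads and its $docRoot parameter are hypothetical names for illustration, not part of CMS Builder.

```php
<?php
// Hypothetical helper: returns the upload records whose file is not on disk.
// Assumption: urlPath is relative to $docRoot (the site's web root).
function findMissingUploads(array $uploads, string $docRoot): array
{
    $missing = [];
    foreach ($uploads as $upload) {
        // Decode %20 and similar escapes before touching the filesystem.
        $relative = ltrim(rawurldecode($upload['urlPath']), '/');
        $path     = rtrim($docRoot, '/') . '/' . $relative;
        if (!is_file($path)) {
            $missing[] = $upload;
        }
    }
    return $missing;
}
```

It could be called with a batch of upload records and the site's document root, for example $_SERVER['DOCUMENT_ROOT'] when run through the web server.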
Re: [rjbathgate] Search for Broken Images
By Chris - August 2, 2010
Here's a super simple solution that might help:
<?php header('Content-type: text/html; charset=utf-8'); ?>
<?php
require_once "C:/wamp/www/sb/CMS Builder/cmsAdmin/lib/viewer_functions.php";
list($uploads,) = getRecords(array(
  'tableName' => 'uploads',
));
foreach ($uploads as $upload) {
  // link each image to the edit page of the record that owns it
  echo("<a href=\"/cmsAdmin/admin.php?menu={$upload['tableName']}&action=edit&num={$upload['recordNum']}\"><img src=\"{$upload['urlPath']}\"></a>");
}
?>
You'll want to replace the require_once path and the /cmsAdmin/admin.php URL above with your own.
This script will list all the uploads in your database (as images), and clicking on one should take you to edit the record which owns the upload. Would looking through them for broken images help?
If not, or if you have any questions, please let me know.
Chris
Re: [chris] Search for Broken Images
By rjbathgate - August 2, 2010 - edited: August 2, 2010
Thanks for the reply...
Only issue is I have 110,000 records, so ideally I was after a way to only display those which are broken...
The script above will try to display the images so at least I can go thru and identify broken ones, but going thru a page of 110,000 results will be quite time consuming...
EDIT: cannot run the above, memory limit exhausted. I could break it down into limit/offset but again that's even more time consuming... :(
But logically I can't see a way, as the php can't know if the url is 404 or not... can it?
Cheers
Re: [rjbathgate] Search for Broken Images
By Chris - August 3, 2010
How about this?
<?php header('Content-type: text/html; charset=utf-8'); ?>
<?php
require_once "C:/wamp/www/sb/CMS Builder/cmsAdmin/lib/viewer_functions.php";
// Request $url from the local web server and return the HTTP status code
// as an integer (0 if the connection fails or no status line is found).
function check404($url) {
  $handle = @fsockopen("tcp://localhost", 80, $errno, $errstr, 5);
  if (!$handle) { return 0; }

  $url = str_replace(' ', '%20', $url); // spaces are not valid in a request line
  $request = "GET " . $url . " HTTP/1.0\r\n\r\n";
  fwrite($handle, $request);

  // read the whole response
  $response = '';
  while (!feof($handle)) {
    $buffer = fgets($handle, 128);
    if ($buffer === false) { break; } // prevent infinite loops on fgets errors
    $response .= $buffer;
  }
  fclose($handle);

  // pull the status code out of the first header line
  $httpStatusCode = null;
  if ($response) {
    list($header, $html) = preg_split("/(\r?\n){2}/", $response, 2);
    if (preg_match("/^HTTP\S+ (\d+) /", $header, $matches)) { $httpStatusCode = $matches[1]; }
  }
  return intval($httpStatusCode);
}

// walk the uploads table 100 records at a time to stay within the memory limit
$page = 0;
while (true) {
  $page++;
  list($uploads,) = getRecords(array(
    'tableName' => 'uploads',
    'perPage'   => 100,
    'pageNum'   => $page,
    'orderBy'   => 'tableName, recordNum+0',
  ));
  if (empty($uploads)) { break; }

  foreach ($uploads as $upload) {
    $httpStatusCode = check404($upload['urlPath']);
    if ($httpStatusCode == 404) {
      // link to the record that owns the broken upload
      echo "<a href=\"/cmsAdmin/admin.php?menu={$upload['tableName']}&action=edit&num={$upload['recordNum']}\">";
      echo "{$upload['tableName']} {$upload['recordNum']}";
      echo "</a><br>";
    }
  }
}
?>
Does that help?
Chris
Re: [chris] Search for Broken Images
By rjbathgate - August 4, 2010
It's almost working, although it's returning 404 on everything (i.e. including those which are valid).
I've checked the code through, and it's using the right file paths throughout, i.e. checking the right path for the image, which is a valid URL, but 404 is still returned in httpStatusCode.
Is there likely to be some server specific limitations/settings preventing it from working?
Many thanks Chris
Re: [rjbathgate] Search for Broken Images
By Chris - August 4, 2010
Glad to be of help! :)
A couple things to check:
1. What do you get when you do a check404() for a URL you know should work? e.g. check404('/')
2. What are the URLs it's trying to check? Can you post one?
I had the same problem until I added the str_replace for spaces.
Chris
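The space fix mentioned above covers one character; other characters such as # or ? in a file name would also break the request line. A minimal sketch of a more general approach, assuming the stored urlPath is not already percent-encoded (encoding it twice would turn %20 into %2520). encodePath is a hypothetical helper name.

```php
<?php
// Hypothetical helper: percent-encode each segment of a URL path
// while leaving the "/" separators untouched.
function encodePath(string $path): string
{
    return implode('/', array_map('rawurlencode', explode('/', $path)));
}
```

File names made only of letters, digits, hyphens, underscores and dots, like the example later in the thread, pass through unchanged.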
Re: [chris] Search for Broken Images
By rjbathgate - August 4, 2010
On checking '/' $httpStatusCode = 403 (forbidden)
On ../index.php I get 400 (bad request) so presume it needs to be root.
On http://www.domain.com/index.php I get 404
and on full path to root I get 404 too
So it doesn't seem to successfully get anything.
Re which URLs I'm checking: above I've checked basic ones (/, index.php, etc.)
And for the images, the urls being checked are in format:
/cmsAdmin/uploads/ist2_11299731-young-serious-man.jpg
for example.
Thanks heaps!
Rob
Re: [rjbathgate] Search for Broken Images
By Chris - August 4, 2010
Hmm, getting a 403 for / is troubling...
I wonder if you should be sending the Host header?
Try changing this:
$request = "GET " . $url . " HTTP/1.0\r\n\r\n";
...to this:
$request = "GET " . $url . " HTTP/1.0\r\nHost: www.mywebsite.com\r\n\r\n";
If that doesn't help, can you tell me what your host is and I can try some things from here?
Chris
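Some background on the Host suggestion: when a server hosts several sites on one IP address, the Host header is what selects the site, so a request without it can land on a default site and return 403 or 404 for every path. Whether that is the cause here is not established by the thread. The sketch below shows how such a request could be built and how the status code could be read back. buildRequest and parseStatusCode are hypothetical names, and it swaps in HEAD for the GET used in the script so that image bodies are not downloaded.

```php
<?php
// Hypothetical helper: build a minimal HEAD request that names the target site.
function buildRequest(string $path, string $host): string
{
    return "HEAD {$path} HTTP/1.1\r\n"
         . "Host: {$host}\r\n"
         . "Connection: close\r\n\r\n";
}

// Hypothetical helper: extract the numeric status code from a raw response,
// or null when the response does not start with an HTTP status line.
function parseStatusCode(string $response): ?int
{
    if (preg_match('/^HTTP\/\S+ (\d{3})/', $response, $matches)) {
        return (int) $matches[1];
    }
    return null;
}
```

The string from buildRequest would be passed to fwrite() on the socket, and the text read back would go to parseStatusCode().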
Re: [chris] Search for Broken Images
By rjbathgate - August 4, 2010
Same problem I'm afraid.
Will send u host details in email.
Thanks heaps,
Rob
Re: [rjbathgate] Search for Broken Images
By rjbathgate - August 5, 2010
if (@fopen($url, "r")) // @ hides the warning fopen raises when the URL cannot be opened
{
echo "this image is here";
}
else
{
echo "This image isn't here";
}
Will give this a go later on... might be barking up the wrong tree, but I've just used it successfully in a different instance, same principle though...
Cheers
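A slightly fuller sketch of the fopen idea, for reference. It assumes allow_url_fopen is enabled and that $target is a full http:// URL; a bare urlPath such as /cmsAdmin/uploads/... would be treated as a local filesystem path instead. With the default stream settings, fopen returns false when the server answers with an error status such as 404. canOpen is a hypothetical helper name.

```php
<?php
// Hypothetical helper: true if $target (a URL or a local path) can be opened for reading.
function canOpen(string $target): bool
{
    $handle = @fopen($target, 'r'); // @ suppresses the warning raised on failure
    if ($handle === false) {
        return false;
    }
    fclose($handle); // release the connection or file handle straight away
    return true;
}
```

Closing the handle matters when checking many records in one run, since each open handle otherwise stays allocated until the script ends.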