archive of pdf uploads safe from bots
6 posts by 4 authors in: Forums > CMS Builder
Last Post: March 5, 2012
By markr - February 28, 2012
If you use a custom upload directory to hide the PDFs, aren't they still just waiting for a clever bot or patient crawler to find them?
Re: [markr] archive of pdf uploads safe from bots
By Damon - February 28, 2012
You can prevent the directory from being browsable with an .htaccess edit, or simply by dropping an index.php file into it.
If anyone (including bots) goes to the directory, they will just see the index.php file.
The only other way bots would pick up the links to PDF files is from links within your site content.
If your content is protected by username/password, then the bots wouldn't be able to find any links to PDFs.
Hope that helps!
Damon Edis - interactivetools.com
Re: [Damon] archive of pdf uploads safe from bots
By markr - February 28, 2012
And even if I drop a blank index.php file in the directory, a browser can still load the PDF if it lands on the file's URL. Say the client names the PDF something simple like a.pdf: a bot randomly generating names would hit that one pretty quickly. A longer name would only delay the beast, no?
When you say "protected by username/password", are you referring to the htaccess edit?
I was wondering if maybe the PDF could be stored in a non-public directory and displayed on a secure (members only) HTML page using an embed. In that hypothetical, can CMSB upload to a non-public area of the server?
Re: [markr] archive of pdf uploads safe from bots
By sublmnl - March 4, 2012 - edited: March 4, 2012
Or if you can't move it above/outside the www folder, then put a robots.txt file in your root and disallow the PDF directory you are talking about. Also put an index.php file in the root of the PDF folder that makes the listing die or redirect. Also put a .htaccess file in the PDF folder that doesn't allow listing of that folder's contents.
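A rough sketch of those two files, assuming an Apache server that allows .htaccess overrides and an upload folder at the hypothetical path /uploads/pdfs/ (adjust to your own layout). Keep in mind robots.txt is only a request that well-behaved crawlers honor; it does not block access by itself.

```
# robots.txt in the site root (the path below is a placeholder)
User-agent: *
Disallow: /uploads/pdfs/
```

```
# .htaccess inside the PDF folder: turn off automatic directory listings
Options -Indexes
```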
Or do like he said and put a login on the pdf folder.
You could also apply document security to the PDFs so that you would have to put in a password if you tried to open them. You can do all of the above as good practice if needed.
We have done this with a client and moved an entire section of their site 'behind the wall' - it was all learning module content and run from the CMS. So was the content on the outside of the login wall. Win win for us. It may be a win win for your client as well.
Re: [markr] archive of pdf uploads safe from bots
By Dave - March 4, 2012
>can cmsb upload to a non-public area of the server?
Yes, you can set custom upload dirs in the fields editor for upload fields.
We've dealt with document security a number of times and there are a few common issues that come up.
- Bot Security, Generally bots won't find your upload directory unless it's linked from somewhere, and if they do, it's not usually a problem as long as the directory doesn't list all the files (a blank index.html/php will hide directory listings). It's true they could guess at filenames, but this is as secure as passwords, which can also be guessed. Assuming a-z is 26 chars, and 0-9 is another 10, each filename char has 36 possibilities, so a 3 char filename could take over 46 thousand guesses (36*36*36 = 46,656). It's usually not a problem unless your filenames follow a pattern (e.g. 1001.pdf, 1002.pdf) or match something else on your site (product SKUs, etc).
- User Security, the next concern is limiting download links to logged-in users, since once someone has the link they could just share it and anyone could access it. The easiest way to do this is to create a custom wrapper script that requires users to be logged in before it displays the PDF. A link such as memberPdfDownload.php?table=products&num=123 could let them download the PDF, but only if they were logged in, so sending that link to others wouldn't help.
- Home PC Security, of course, nothing prevents a user from saving the file to their computer and emailing it around as an attachment. And even complicated systems that don't let a user download a file are still susceptible to someone taking a picture of their screen with their camera. Basically, there's no way to prevent a user from copying the data once they have it, just lots of ways of making it more difficult.
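To put numbers on the filename-guessing point above, here is a small Python sketch of the arithmetic. It assumes filenames drawn only from a-z and 0-9; the helper names `keyspace` and `random_basename` are made up for illustration and are not part of CMS Builder.

```python
import secrets
import string

# a-z (26 chars) plus 0-9 (10 chars) = 36 possible symbols per position
ALPHABET = string.ascii_lowercase + string.digits


def keyspace(length: int, alphabet_size: int = len(ALPHABET)) -> int:
    """Number of distinct names that are exactly `length` characters long."""
    return alphabet_size ** length


def random_basename(length: int = 16) -> str:
    """Hypothetical helper: build an unpredictable basename for an upload."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    print(keyspace(3))               # 36*36*36 = 46656
    print(keyspace(16))              # far larger keyspace for a 16 char name
    print(random_basename() + ".pdf")
```

Each extra character multiplies the number of guesses by 36, so a long random name is much harder to hit than a short or patterned one. It still only protects the file for as long as the link itself stays private.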
Hope that helps, let me know if you have any questions. Thanks!