Spider Trap - Detects and blocks bad bots [message #13] Wed, 17 August 2005 08:37
nicholak (Site Admin)
The following addition to your site ensures that any robots (AKA spiders) that ignore the instructions in your robots.txt file, which agreed web standards ask them to follow, are blocked, and get 5000 dud email addresses for their trouble (for the spam email harvesters out there).

Good robots will read your robots.txt file, and will do as you ask and ignore the trap. When a bad robot follows your hidden 1 pixel link, it lands in the trap.

The trap fills the requested page with random email addresses and adds a block on the spider's source IP address to your .htaccess file. Each block is added automatically at the top of your .htaccess file, and you can delete it at any time if an address was blocked in error. You are emailed the IP address, user agent and referrer for each IP that the trap blocks.

Add the following to your robots.txt file. If your site does not yet have a robots.txt file, simply create one containing the lines below and upload it to the root directory of your website. Then allow a few days before adding the trap, to avoid trapping nice spiders.

Basically, the instruction below tells all robots/spiders to stay away from this file, which is what the good bots (Google, Yahoo, etc.) will do.

User-agent: *
Disallow: /getout.php


/getout.php is the path to your own trap file. You may wish to change the name, or put the file in a directory, e.g. /welcome/index.php, or wherever you have decided to put your trap. Whatever you choose, use the same path in the Disallow line above. You must ensure that the file name ends in .php, though.
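Before moving on, it may be worth confirming that your robots.txt really does disallow the trap path, since a typo here would trap the good bots too. The following is only a rough sketch: trap_is_disallowed() is a hypothetical helper (not part of the trap itself), and it handles only plain prefix rules under "User-agent: *", with no wildcards and no Allow lines.

```php
<?php
// Hypothetical helper, not part of the original trap: returns true when the
// "User-agent: *" group of a robots.txt body disallows $path by prefix match.
function trap_is_disallowed($robotsTxt, $path)
{
    $inWildcardGroup = false;
    $lastField = '';
    foreach (preg_split('/\r\n|\r|\n/', $robotsTxt) as $line) {
        // Drop comments and surrounding whitespace.
        $line = trim(preg_replace('/#.*$/', '', $line));
        if ($line === '' || strpos($line, ':') === false) {
            continue;
        }
        list($field, $value) = array_map('trim', explode(':', $line, 2));
        $field = strtolower($field);
        if ($field === 'user-agent') {
            // Consecutive User-agent lines share one group of rules.
            $matches = ($value === '*');
            $inWildcardGroup = ($lastField === 'user-agent')
                ? ($inWildcardGroup || $matches)
                : $matches;
        } elseif ($field === 'disallow' && $inWildcardGroup && $value !== '') {
            if (strpos($path, $value) === 0) {
                return true;
            }
        }
        $lastField = $field;
    }
    return false;
}

// Sample robots.txt body taken from this post.
$robots = "User-agent: *\nDisallow: /getout.php\n";
var_dump(trap_is_disallowed($robots, '/getout.php'));
var_dump(trap_is_disallowed($robots, '/index.php'));
```

With the robots.txt shown in this post, the first call reports the trap path as disallowed and the second reports an ordinary page as allowed. To check your live file you could pass in the result of file_get_contents() on your own robots.txt.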

Once you are confident good bots have read this file and are abiding by it (allow at least 2-3 days), make the following additions:

Add this to the very top of your .htaccess file in site root:

SetEnvIf Request_URI "^(/403.*\.htm|/robots\.txt)$" allowsome
<Files *>
order deny,allow
deny from env=getout
allow from env=allowsome
</Files>


* Note: the first line above contains a pipe "|" after "htm", not a lowercase letter "L".

For your trap file, in this case getout.php, the contents are:

<?php
//////* CONFIGURATION START */////

$filename = '/home/username/public_html/.htaccess';// Change username to your hosting account username to suit the path to your .htaccess file
$emailalert = 'alertaddress@yourdomain.com.au';// Change to your email address
$emailfrom = 'as_above';// Change to alternative email address that you want the alert to appear from, or leave as 'as_above'
$qtyemails = 5000;// How many dud emails do you want to generate?

/////* CONFIGURATION END */////
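// Do not adjust below here! //

if ($emailfrom == 'as_above') $emailfrom = $emailalert;

// These request headers are optional, so fall back to an empty string when missing.
$ip        = $_SERVER['REMOTE_ADDR'];
$useragent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
$referer   = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';

// Apache only treats "#" as a comment at the start of a line, so the user agent
// goes on its own comment line. Control characters and backslashes are stripped
// (a trailing backslash would join the comment to the next line).
$content  = "# " . preg_replace('/[\x00-\x1F\x7F\\\\]/', '', $useragent) . "\n";
$content .= "SetEnvIf Remote_Addr ^" . str_replace(".", "\\.", $ip) . "$ getout\n";

// Prepend the new rule to the existing file, locking the file while writing.
// If the file cannot be read, leave it alone rather than overwrite it.
$existing = file_get_contents($filename);
if ($existing !== false) {
    file_put_contents($filename, $content . $existing, LOCK_EX);
}

mail($emailalert,
"Spider Alert!",
"The following ip just got banned because it accessed the spider trap.\r\n\r\n" . $ip . "\r\n" . $useragent . "\r\n" . $referer,
"From: $emailfrom");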

// start free emails for spider
$page = '';
for ( $i = 0; $i < $qtyemails; $i++ )
{
    $page .= new_email();
}

$page .= "Goodbye!";
echo $page;

function new_email()
{
    $email = '';
    $letters_array = array('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r',
        's', 't', 'u', 'v', 'w', 'x', 'y', 'z');
    // 10 random letters, then "@", then 6 more random letters
    for ( $i = 0; $i < 17; $i++ )
    {
        $email .= ( $i !== 10 ) ? $letters_array[ mt_rand( 0, 25 ) ] : '@';
    }
    $email .= '.com.au';
    $email = '<a href="mailto:' . $email . '">' . $email . "</a>\n";
    return $email;
}
?>


* Note: You MUST configure the above file for your installation - adjust the values for $filename & $emailalert. You may also adjust the other values, though this is not required.
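To see what the trap actually writes for each visitor, here is a small sketch of the pattern it builds from the IP address. ip_to_pattern() is a hypothetical name used only for illustration, the 10.1.2.x addresses are made up, and preg_match() is just standing in for Apache's own matching, so treat it as a rough check rather than proof of how your server behaves.

```php
<?php
// Hypothetical helper mirroring how the trap builds its pattern:
// escape the dots and anchor the address at both ends.
function ip_to_pattern($ip)
{
    return '^' . str_replace('.', '\\.', $ip) . '$';
}

$pattern = ip_to_pattern('10.1.2.3');
echo $pattern, "\n";

// preg_match stands in for Apache's matching here (an assumption for illustration).
var_dump(preg_match('/' . $pattern . '/', '10.1.2.3') === 1);   // the blocked address itself
var_dump(preg_match('/' . $pattern . '/', '10.1.2.30') === 1);  // longer address, stopped by "$"
var_dump(preg_match('/' . $pattern . '/', '110.1.2.3') === 1);  // longer address, stopped by "^"
```

The "^" and "$" anchors are what keep a block on 10.1.2.3 from also catching neighbours such as 10.1.2.30, so only the first check should come back true.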

Finally, you need to add a tiny link to your site for the bots to recklessly follow when they ignore your robots.txt file. In the following example, I use a 1 pixel transparent image (an invisible dot to your users) on the website:

<a href="http://www.mydomain.com.au/getout.php"><img src="http://www.mydomain.com.au/images/pixel_trans.gif" border="0" alt=""></a>


If you want mine, you can get it here. Just right-click and choose "save as": 1 Pixel transparent image

That's it! You will now be alerted when a robot follows the link you have told it not to follow, and it will be banned from your site from then on. Bad Bot!
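If an address ever gets banned in error, you can simply delete its line from the top of .htaccess by hand. For anyone who would rather script that, here is a minimal sketch. remove_block() is a hypothetical helper, not part of the trap, and it works on the text of the file only, so reading and writing .htaccess is left to you. It also rewrites line endings as "\n" and leaves any comment lines untouched.

```php
<?php
// Hypothetical helper, not part of the original trap: removes the SetEnvIf
// line for $ip from the text of an .htaccess file and returns the result.
function remove_block($htaccess, $ip)
{
    $needle = 'SetEnvIf Remote_Addr ^' . str_replace('.', '\\.', $ip) . '$ getout';
    $kept = array();
    foreach (preg_split('/\r\n|\r|\n/', $htaccess) as $line) {
        // Compare by prefix so anything after "getout" on the line is ignored.
        if (strpos($line, $needle) !== 0) {
            $kept[] = $line;
        }
    }
    return implode("\n", $kept);
}

// Made-up sample content for illustration.
$before = implode("\n", array(
    'SetEnvIf Remote_Addr ^10\.1\.2\.3$ getout # BadBot/1.0',
    'SetEnvIf Remote_Addr ^10\.1\.2\.30$ getout # OtherBot',
    'SetEnvIf Request_URI "^(/403.*\.htm|/robots\.txt)$" allowsome',
));
echo remove_block($before, '10.1.2.3'), "\n";
```

In the sample, only the line for 10.1.2.3 is dropped. The block on 10.1.2.30 and the allowsome rule are kept, because the "$" in the rule stops one address from matching as a prefix of another.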

[Updated on: Wed, 21 December 2005 11:31]


Nicholas Keown
Forum Admin
Re: Spider Trap - Detects and blocks bad bots [message #20 is a reply to message #13] Wed, 19 October 2011 20:29
CarolSmith123
This is really amazing: the robots get a taste of their own medicine.

[Updated on: Wed, 19 October 2011 20:31]


Re: Spider Trap - Detects and blocks bad bots [message #21 is a reply to message #20] Mon, 24 October 2011 19:20
alexnikol
I think the bots will find another way around this program and go back to doing the work they have been ordered to do.


Re: Spider Trap - Detects and blocks bad bots [message #22 is a reply to message #20] Wed, 02 November 2011 16:05
SammyGreen
I am looking for reviews of the spider trap and how well it works.


Re: Spider Trap - Detects and blocks bad bots [message #23 is a reply to message #22] Sat, 24 December 2011 18:17
alstonsammy
I have been using this for some time and it is working great!


Re: Spider Trap - Detects and blocks bad bots [message #28 is a reply to message #23] Mon, 16 January 2012 23:47
harliniJshenoy
That's great! Finally we can breathe easy now that it works. :P


Re: Spider Trap - Detects and blocks bad bots [message #32 is a reply to message #23] Wed, 25 January 2012 16:08
harryKshenoy
True! I am also impressed with this and would suggest that everyone use it for security purposes.

