.htaccess: block a bot only on URLs with query parameters

Googlebot is crawling my pages with query parameters, and I need to block it.

I want to return a 404 for every URL that has a parameter, such as site.com?q=text or site.com/?q=text,

but not block the bare URL site.com.

I wrote these rules for .htaccess:

ErrorDocument 403 "Your connection was rejected"
ErrorDocument 404 /404.shtml


RewriteEngine On
#RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{HTTP_USER_AGENT} (Googlebot) [NC]
RewriteCond %{REQUEST_URI} ^/q= [NC]
RewriteRule ^ - [F,L]

But I have two problems.
First: how do I match the parameters?

Second: when the request is blocked, my 404 page is not shown. Instead I get:

Not Found
The requested URL was not found on this server.

Additionally, a 404 Not Found error was encountered while trying to use an ErrorDocument to handle the request.

But I have set ErrorDocument 404 /404.shtml.
Why can't Apache find 404.shtml?
If I request a page that does not exist, 404.shtml is displayed normally.

>Solution :

First, you need to match the query string with QUERY_STRING, not REQUEST_URI. For a request to /?q=text, REQUEST_URI is just / and QUERY_STRING is q=text, so the pattern ^/q= never matches.

Moreover, you are getting this error because the query string is carried over to the error-document request, i.e. /404.shtml?q=text, so your rule matches again on that request and Apache fails while trying to serve the ErrorDocument.

Ideally you should return 403 Forbidden, like this:

RewriteEngine On

RewriteCond %{HTTP_USER_AGENT} (Googlebot) [NC]
RewriteCond %{QUERY_STRING} ^q= [NC]
RewriteRule ^ - [F]

However, if you must return a 404, use this instead:

RewriteEngine On

RewriteCond %{HTTP_USER_AGENT} (Googlebot) [NC]
RewriteCond %{QUERY_STRING} ^q= [NC]
RewriteRule !^404\.shtml$ - [R=404,NC,L]

This executes the rule for all URLs except /404.shtml.

Alternatively, you can check REDIRECT_STATUS, like this:

RewriteEngine On

RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{HTTP_USER_AGENT} (Googlebot) [NC]
RewriteCond %{QUERY_STRING} ^q= [NC]
RewriteRule ^ - [R=404,L]

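This executes the rule for the original request only, because REDIRECT_STATUS is empty until Apache performs an internal redirect, such as the one to the ErrorDocument.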
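
The rules above only match a query string that starts with q=. Since the question asks about every URL that has a parameter, a variant along the following lines may be closer to what is wanted. This is only a sketch: it assumes that "has a parameter" means "the query string is non-empty", and the single-character pattern . is a substitution for ^q= that is not part of the original answer.

```apache
RewriteEngine On

# Skip internal redirects (e.g. to the ErrorDocument), as in the previous rule
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{HTTP_USER_AGENT} Googlebot [NC]
# Assumption: "has params" means a non-empty query string;
# "." matches as soon as the query string contains at least one character
RewriteCond %{QUERY_STRING} .
RewriteRule ^ - [R=404,L]
```

With this variant, site.com and site.com/ should be left alone because their query string is empty, while site.com/?q=text and site.com/?anything=1 would get the 404.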
