Search Robots

Printed From: Web Wiz Forums
Category: Web Wiz Web App Support Forums
Forum Name: Web Wiz Forums
Forum Description: Support forum for Web Wiz Forums application.
URL: https://forums.webwiz.net/forum_posts.asp?TID=25491
Printed Date: 28 March 2026 at 7:35am
Software Version: Web Wiz Forums 12.08 - https://www.webwizforums.com


Topic: Search Robots
Posted By: manxboy
Subject: Search Robots
Date Posted: 26 March 2008 at 8:52am
In my active users list I am seeing a lot of search robots: Yahoo, Google, etc.

Is there any way I can stop these from being seen as active users?



Replies:
Posted By: WebWiz-Bruce
Date Posted: 26 March 2008 at 12:19pm
Not at the present time, but it is a good idea for future releases.

-------------
https://www.webwiz.net/web-wiz-forums/forum-hosting.htm - Web Wiz Forums Hosting
https://www.webwiz.net/web-hosting/windows-web-hosting.htm - ASP.NET Web Hosting


Posted By: manxboy
Date Posted: 26 March 2008 at 12:28pm
I have read somewhere about using a robots.txt file, or even putting lines in the common.asp file to say that if the OSTYPE is a search robot, divert it to an unknown page?

Does this mean anything?


Posted By: billd3
Date Posted: 26 March 2008 at 12:34pm
There must be something weird going on as the past few days we've seen DOZENS AND DOZENS of Yahoo and others. I mean it might show 80 users and I look and 70 of them are search bots!
What's up with that sudden surge? It's like an attack, literally.
It really skews the numbers, but we sure don't show up any better in the search results.
The worst seems to be yahoo (which IMO is a bunch of yahoos anyway)


-------------
BillD
http://theamcpages.com
http://theamcforum.com


Posted By: manxboy
Date Posted: 26 March 2008 at 12:45pm
I would also say I have seen a surge, but nothing like 70-odd. More like 20 for me.

Anyone got any ideas on this?


Posted By: ctscott
Date Posted: 26 March 2008 at 1:06pm
It may be Yahoo Slurp doing that to you. I had to disallow it a while back for chewing up too much bandwidth.

-------------
______________________
http://www.cfbtrivia.com - College Football Trivia


Posted By: manxboy
Date Posted: 26 March 2008 at 1:16pm
How do you disallow it?


Posted By: Scotty32
Date Posted: 26 March 2008 at 10:44pm
You only really want to block crawlers with robots.txt if you do not want traffic from search engines, which may mean you get no new visitors.

There are different ways you can stop search engines:

1) Stop guests from viewing all or most forums. This will then include the search robots.

2) Use a robots.txt file in the root of your site (e.g. so it is at www.site.com/robots.txt).

You then add the following to the robots.txt file:

User-agent: *
Disallow: /forum/


As I said, with both of these methods you will stop search engines from crawling your website, which will then mean you do not appear in search results, which will then mean you get no new members. If, however, your site relies on word of mouth or other methods of advertising, it may not affect you.
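
Editor's note: the rules can be checked locally before waiting for crawlers to pick them up. The following is a minimal sketch using Python's standard urllib.robotparser module; the helper name can_crawl and the example URL are assumptions for illustration, not anything from Web Wiz Forums. It also shows a variant that blocks only Yahoo's Slurp, and that a misspelled field name such as "User-Agents" is simply ignored by this parser, so nothing ends up blocked.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical forum URL, following the www.site.com example above.
FORUM_URL = "http://www.site.com/forum/default.asp"

BLOCK_ALL = """\
User-agent: *
Disallow: /forum/
"""

# Variant: keep other search engines, block only Yahoo's crawler.
BLOCK_SLURP_ONLY = """\
User-agent: Slurp
Disallow: /forum/
"""

# Misspelled field name: the group is never opened, so the rule is dropped.
MISSPELLED = """\
User-Agents: *
Disallow: /forum/
"""


def can_crawl(robots_txt: str, agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


print(can_crawl(BLOCK_ALL, "Googlebot", FORUM_URL))         # False
print(can_crawl(BLOCK_SLURP_ONLY, "Slurp", FORUM_URL))      # False
print(can_crawl(BLOCK_SLURP_ONLY, "Googlebot", FORUM_URL))  # True
print(can_crawl(MISSPELLED, "Googlebot", FORUM_URL))        # True
```

Real crawlers may interpret edge cases differently from Python's parser, so treat this as a sanity check rather than a guarantee.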

For more information, I wrote a page about the http://www.s2h.co.uk/seo/misc/robots-file.asp - Robots.txt File on my website, but please note that when I wrote it I mistakenly added an "Allow:" directive, which is not part of the original robots.txt standard.

For Google, however, you can use the http://www.google.com/webmasters/ - Webmaster Center to tell Google how often you want it to crawl your website. It will only affect Google, but it has the advantage that you will not disappear from Google's search results.



-------------
S2H.co.uk - http://www.s2h.co.uk/wwf/ - WebWiz Mods and Skins

For support on my mods + skins, please use http://www.s2h.co.uk/forum/ - my forum.


Posted By: manxboy
Date Posted: 26 March 2008 at 10:58pm
Thanks for the info. I may add a robots.txt file in but it is good to read your document on this regarding the security concerns.

Is it intended in a later version of the forums to add an option to stop them being crawled?


Posted By: WebWiz-Bruce
Date Posted: 27 March 2008 at 9:37am
The problem with search engine robots is they don't always declare themselves as such.

The only real way to prevent your forum from being indexed is to disallow Guests from accessing the forums. This way search engines will also not index your forum.
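
Editor's note: for robots that do declare themselves, hiding them from an active users list comes down to matching the User-Agent header. The sketch below is in Python rather than the forum's ASP, and it is not how Web Wiz Forums works internally; the names BOT_MARKERS, is_declared_bot and visible_users, and the session dictionaries, are all hypothetical. The marker list uses crawler names mentioned in this thread. As noted above, a robot that sends a browser-like User-Agent would not be caught this way.

```python
# Substrings of crawler names mentioned in this thread (an assumed, partial list).
BOT_MARKERS = ("googlebot", "slurp", "msnbot", "teoma")


def is_declared_bot(user_agent: str) -> bool:
    """Return True if the User-Agent header names one of the known crawlers."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


def visible_users(sessions: list[dict]) -> list[dict]:
    """Drop sessions whose User-Agent identifies a crawler."""
    return [s for s in sessions if not is_declared_bot(s["user_agent"])]


# Hypothetical active-session records.
sessions = [
    {"name": "Guest", "user_agent": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"},
    {"name": "Guest", "user_agent": "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"},
    {"name": "manxboy", "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"},
]

print([s["name"] for s in visible_users(sessions)])  # ['manxboy']
```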




Posted By: manxboy
Date Posted: 27 March 2008 at 9:44am
Thanks Bruce, unfortunately I want to allow guests to read the forum so I may have to put up with the robots.


Posted By: manxboy
Date Posted: 01 April 2008 at 9:04am
There are now 53 robots browsing my forum.

I have now put a robots.txt file in the root of my site, containing the code listed previously in this thread.

How long does it take for this to come into effect or is it instant?


Posted By: WebWiz-Bruce
Date Posted: 01 April 2008 at 9:59am
It can take a while, as the search bots don't continually read the robots.txt file. But you should see it starting to take effect over the next few days.




Posted By: ctscott
Date Posted: 01 April 2008 at 11:40am
Originally posted by WebWiz-Bruce:

Not at the present time, but it is a good idea for future releases.

If this enhancement is implemented, please make it optional. I prefer to know they are crawling.



Posted By: WebWiz-Bruce
Date Posted: 01 April 2008 at 12:19pm
There is a new SEO optimising section in the admin area of version 10, it could be added as an option under that.



Posted By: WebCity
Date Posted: 06 April 2008 at 6:58pm

I tried the suggestions above but they didn't work. The Google bots kept coming in and eating more of my bandwidth.

So I tried blocking IP addresses. I blocked one, then two, then three, and saw a pattern: the IP addresses were the same except for the last two numbers, so I blocked everything that shared the common part of the address. That seems to have stopped the Google bots. This may cause problems later, but for now it seems to stop the bandwidth-eating monsters. The big rule on our board is: if you like to eat bandwidth, you will be Denied Access.
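
Editor's note: blocking "everything that shares the common part of the address" amounts to blocking a network range. A minimal sketch with Python's standard ipaddress module follows. The post does not give the actual addresses, so the range used here (203.0.113.0/24, a block reserved for documentation) and the names BLOCKED_NETWORKS and is_blocked are assumptions; a /24 corresponds to only the last number of the address varying.

```python
import ipaddress

# Assumed example range; replace with the prefix actually observed in the logs.
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def is_blocked(ip: str) -> bool:
    """Return True if `ip` falls inside any blocked network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in network for network in BLOCKED_NETWORKS)


print(is_blocked("203.0.113.57"))  # True: inside the blocked range
print(is_blocked("198.51.100.7"))  # False: a different network
```

Blocking a search engine's address range also removes the site from that engine's index over time, which is the trade-off already discussed earlier in the thread.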


Posted By: billd3
Date Posted: 06 April 2008 at 7:07pm
Yesterday, we had over 200 guests, ALL YAHOO.
 
What the heck is up with the yahoos at yahoo lately? Why do they need to hit a single site with that much power? I mean 200, all YAHOO. Google sure doesn't hit us that hard, no one else does. I think YAHOO has some serious issues, and always has, actually. Someone needs to contact them and tell them what they are doing.
It's one thing to see that every few days, or even every week, but we see such nonsense almost daily now.




Posted By: WebWiz-Bruce
Date Posted: 07 April 2008 at 8:18am
Some of them do go over the top. The best thing to do is create a robots.txt file in the root of your web site and put in crawl delays to slow down some of these bots. They will then still index the site, but without going mad and sucking up all your bandwidth.

Below is the one used for Web Wiz Guide:

robots.txt
User-agent: *
Disallow: /images/
Disallow: /includes/

User-agent: googlebot
Crawl-delay: 60

User-agent: yahoo
Crawl-delay: 60

User-agent: Slurp
Crawl-delay: 60

User-agent: msnbot
Crawl-delay: 60

User-agent: Teoma
Crawl-delay: 60

User-agent: aipbot
Disallow: /

User-agent: BecomeBot
Disallow: /

User-agent: psbot
Disallow: /

User-agent: fast
Disallow: /
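
Editor's note: a file like this can be inspected with Python's standard urllib.robotparser, which reads Crawl-delay values (Python 3.6 or later). The sketch below uses a shortened copy of the file above; the variable names are illustrative. Two things are worth knowing. Crawl-delay is a non-standard extension, and Google's documentation says Googlebot does not support it, so the googlebot entry may have no effect. Also, a crawler that matches its own User-agent group follows only that group, so in this parser's reading Slurp is not bound by the Disallow lines under "User-agent: *".

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the robots.txt shown above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /images/
Disallow: /includes/

User-agent: Slurp
Crawl-delay: 60

User-agent: psbot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Slurp has its own group with a 60 second delay.
print(parser.crawl_delay("Slurp"))  # 60

# An agent with no group of its own falls back to "*", which sets no delay.
print(parser.crawl_delay("SomeOtherBot"))  # None

# psbot is shut out entirely.
print(parser.can_fetch("psbot", "/forum/"))  # False

# Slurp uses only its own group, so the "*" Disallow lines do not apply to it.
print(parser.can_fetch("Slurp", "/images/logo.gif"))  # True
print(parser.can_fetch("SomeOtherBot", "/images/logo.gif"))  # False
```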





Posted By: freakyfred
Date Posted: 07 April 2008 at 1:02pm
Yahoo had me with 260 the other day. Made the site look popular.


Posted By: 123Simples
Date Posted: 07 April 2008 at 3:15pm
Thanks for that, Bruce.
I have just uploaded my robots.txt file.


-------------
http://www.123simples.com/ - Visit 123 Simples Web Design


