
Robots.txt file project

Printed From: Web Wiz Forums
Category: Web Wiz Web App Support Forums
Forum Name: Web Wiz Forums
Forum Description: Support forum for Web Wiz Forums application.
URL: https://forums.webwiz.net/forum_posts.asp?TID=15698
Printed Date: 13 April 2026 at 8:55am
Software Version: Web Wiz Forums 12.08 - https://www.webwizforums.com


Topic: Robots.txt file project
Posted By: radic
Subject: Robots.txt file project
Date Posted: 02 July 2005 at 8:59pm
Hi,
 
I would like to add all of the forum's files, except for pages like default.asp, forum_topics.asp, forum_posts.asp etc., to my site's robots.txt file. Instead of doing this from scratch, I would like to see if anyone else already has one completed and could share it with the community.
 
If you want Googlebot etc. to index more files, and more often, then this is vital.
 
If you want traffic and high SEO listings then this is vital.



Replies:
Posted By: Duval
Date Posted: 03 July 2005 at 4:04pm
Radic, what makes you think that by excluding files you'll get spidered more thoroughly? Generally the robots.txt exclusion is for non-public areas of your site or for pages where you have concerns about duplicate content.

The single best thing that one can do with a forum to improve search engine spidering is to rewrite the URLs. http://www.isapirewrite.com/

Here's a link to the exclusion protocol http://www.robotstxt.org/wc/norobots.html

Please repost any specifics if you run into difficulty.
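
To make the exclusion protocol concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules and paths below (`/forum/admin/`, `/forum/includes/`, `header_inc.asp`) are hypothetical examples, not the actual Web Wiz Forums file list. It only shows how a conforming bot would read `Disallow` lines: anything not disallowed stays crawlable.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; adjust the paths to your own site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /forum/admin/
Disallow: /forum/includes/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network access needed)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, path: str, agent: str = "Googlebot") -> bool:
    """Return True if a conforming bot with this user agent may fetch the path."""
    return parser.can_fetch(agent, path)


parser = build_parser(ROBOTS_TXT)
for path in ("/forum/forum_posts.asp", "/forum/includes/header_inc.asp"):
    print(path, "allowed" if is_allowed(parser, path) else "disallowed")
```

Note that the content pages are never listed at all. The file only names what should be kept out, which is why it is normally short.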


-------------
http://www.oversite.us - OverSite free submission for quality sites. (no reciprocal link required)


Posted By: wistex
Date Posted: 03 July 2005 at 8:31pm
BTW, if you press the space bar after typing a URL, it makes it clickable. Here's a clickable version of the links provided above: http://www.isapirewrite.com/ and http://www.robotstxt.org/wc/norobots.html
 
The key to getting your site indexed is content, content, content. Yes, there are tricks to get Google and other search engines to understand your site better and therefore rank it higher, but content still comes first. On one of my sites, we have not used any SEO tricks, and yet we rank in the top 10 on Google, Yahoo! and MSN in some of the appropriate categories. Actually, we do the opposite of what some SEO people say, yet we rank in the top 10. Why? The content is good and people keep coming back, and search engines have ways to figure this out.
 
And excluding files won't help you get indexed; actually, it will probably hurt you. The larger your site, the more of an authority you are, and that affects your ranking. So excluding files would probably make you less of an authority, since your site will appear smaller to search engines.


-------------
http://www.wistex.com - WisTex Solutions
http://www.caribbeanchoice.com/forums - CaribbeanChoice Forums


Posted By: radic
Date Posted: 04 July 2005 at 9:45am
OK, thanks for the feedback. I'm quite surprised that you both think it's better not to use the Disallow command for files that you don't want shown on Google etc.
 
I'm well aware that content is king etc. and have been studying SEO quite a bit, but there is so much to know. I mean, what would be the point of having a file indexed that has no content, like an included header or one of the forum files that has no use or could produce an error if landed on?
 
I would much rather have the robot come in and index all the forum posts and boards than let it index all these included or function files. I think you need to make it easy and clear for them what they should do with your site and not throw hurdles in their path.

It's also a waste of bandwidth, although that's not the point. The robots will turn away and not index a site properly if they have to deal with too much junk. They just want the content files, not the includes and not the errors caused by landing on an include file etc. You also don't want these files for everyone to see on Google etc.; you want users to find the forum posts where the keywords are.
 
Anyway, that's just how I see it, but I'm very interested to hear more.
 
 


Posted By: dpyers
Date Posted: 04 July 2005 at 12:25pm
The SE bots follow links. They don't actually "walk" a directory tree. You don't link to an include file so the bots never see it.
The first time a bot visits, it does a "shallow" scan, usually only one or two levels deep from your home page. On subsequent visits, it scans deeper and deeper. It can take months to fully scan a deep site.

Pages that change frequently are visited more often by the bots. Once they get past your main forum page, they'll see pages that change frequently and keep coming back more often. I've seen forums that get scanned twice a day.

One way of getting your site fully indexed quicker is to include a site map on your front page. Another way to get a forum site indexed is to include a link to active topics on the home page.

The robots.txt exclude functions are used to prevent the bots from following links to those directories - not to prevent walking the directories which is something bots can't do if you've turned off directory views for your site.

One useful thing that excluding directories and files in robots.txt does is keep things you want hidden away from the SEs, like keeping your images out of Google.

Note that robots.txt is only useful for conforming search engines. Bad Bots either ignore it, or use it to identify areas where "good stuff" might be kept.
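
The link-following behaviour described above can be sketched with a toy crawler. This is only an illustration under stated assumptions: `SITE` is a hypothetical in-memory site, and `max_depth` stands in for the "shallow" first scan; real search engine bots are far more involved. The point is that a file no page links to is never discovered, whatever the depth.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical miniature site: path -> HTML. The include file exists on disk,
# but no page links to it.
SITE = {
    "/default.asp": '<a href="/forum_topics.asp">Topics</a>',
    "/forum_topics.asp": '<a href="/forum_posts.asp">Posts</a>',
    "/forum_posts.asp": '<a href="/default.asp">Home</a>',
    "/includes/header_inc.asp": "<p>never linked from any page</p>",
}


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(site, start, max_depth):
    """Breadth-first crawl that only follows hyperlinks, up to max_depth levels."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        path, depth = queue.popleft()
        if depth == max_depth:
            continue  # depth limit reached: do not follow links from this page
        collector = LinkCollector()
        collector.feed(site.get(path, ""))
        for link in collector.links:
            if link in site and link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return seen


print(sorted(crawl(SITE, "/default.asp", max_depth=1)))   # shallow first visit
print(sorted(crawl(SITE, "/default.asp", max_depth=10)))  # later, deeper visit
```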


-------------

Lead me not into temptation... I know the short cut, follow me.


Posted By: wistex
Date Posted: 04 July 2005 at 7:35pm
dpyers is right, the search engines only follow and index links. They don't index any file that no one links directly to. I've never had a problem with Google or any of the others linking to any header or footer files.
 
Remember, the bots can't read your directory structure, they can only follow hyperlinks in a webpage.


-------------
http://www.wistex.com - WisTex Solutions
http://www.caribbeanchoice.com/forums - CaribbeanChoice Forums


Posted By: dpyers
Date Posted: 05 July 2005 at 9:05pm
Something new from Google that you may be interested in:
https://www.google.com/webmasters/sitemaps/login


-------------

Lead me not into temptation... I know the short cut, follow me.


Posted By: radic
Date Posted: 06 July 2005 at 8:58am
Originally posted by dpyers:

Something new from Google that you may be interested in:
https://www.google.com/webmasters/sitemaps/login
 
 
Hahaha, I've been busy working on these for my sites for the last few days, since I heard of this. So the point of this post is not really an issue anymore if this Google Sitemap thing works. I got a Snitz forum site map indexed yesterday and I'm now working on ones for the rest of my sites... ;)
 
dpyers, OK, I see what you're saying now. Thanks for that information, it was explained well.


Posted By: radic
Date Posted: 08 July 2005 at 10:29pm
I wonder if someone could write a similar script for Web Wiz Forums for the Google Sitemap.
 
Here is a link to the Snitz code, and I would like to do the same thing with my Web Wiz forum: http://forum.snitz.com/forum/topic.asp?TOPIC_ID=58757
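
As a rough idea of what such a script would produce, here is a minimal sketch in Python rather than ASP. It is not the Snitz code and not anything shipped with Web Wiz Forums. The base URL, topic IDs and dates are hypothetical placeholders; a real script would read them from the forum database. The `forum_posts.asp?TID=` pattern follows the topic URLs this forum uses, and the namespace is the one from the current sitemaps.org protocol, which may differ from the schema Google accepted when this thread was written.

```python
import xml.etree.ElementTree as ET

# Namespace of the sitemaps.org protocol (assumption: current version, 0.9).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(base_url, topics):
    """Return sitemap XML listing one forum_posts.asp URL per topic.

    topics is an iterable of (topic_id, last_modified) pairs, with
    last_modified as a YYYY-MM-DD string.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for topic_id, last_modified in topics:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = f"{base_url}/forum_posts.asp?TID={topic_id}"
        ET.SubElement(url, "lastmod").text = last_modified
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical data standing in for a database query.
example_topics = [(15698, "2005-07-08"), (15700, "2005-07-09")]
sitemap_xml = build_sitemap("https://www.example.com/forum", example_topics)
print(sitemap_xml)
```

The output would be saved as a file on the site and then submitted through the Google Sitemaps page linked earlier in the thread.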


