| Author |
|
radic
Newbie
Joined: 24 February 2005
Status: Offline
Points: 7
|
Topic: Robots.txt file project Posted: 02 July 2005 at 8:59pm |
Hi,
I would like to add all of the files from the forum, except for pages like default.asp, forum_topics.asp & forum_posts.asp etc., to my site's robots.txt file. Instead of doing this from scratch, I would like to see if anyone else has one already completed that they could share with the community.
If you want Googlebot etc. to index more files, and more often, then this is vital.
If you want traffic and high SEO listings then this is vital.
|
 |
Duval
Newbie
Joined: 02 December 2004
Status: Offline
Points: 22
|
Posted: 03 July 2005 at 4:04pm |
|
Radic, what makes you think that by excluding files you'll get spidered
more thoroughly? Generally the robots.txt exclusion is for non-public
areas of your site or for pages where you have concerns about duplicate
content.
The single best thing that one can do with a forum to improve search
engine spidering is to rewrite the URLs. http://www.isapirewrite.com/
Here's a link to the exclusion protocol: http://www.robotstxt.org/wc/norobots.html
Please repost any specifics if you run into difficulty.
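To illustrate the URL rewriting idea, here is a minimal Python sketch of the mapping a rewrite rule performs: a static-looking path is translated back into the dynamic forum URL. The `/topic/<id>` scheme and the `TID` parameter name are assumptions made up for this example, not the syntax of any particular rewrite product.

```python
import re

# Hypothetical friendly-URL pattern; a real scheme depends on your own rewrite rules.
FRIENDLY = re.compile(r"^/topic/(\d+)/?$")

def rewrite(path: str) -> str:
    """Map a static-looking path to the dynamic forum URL, or return it unchanged."""
    match = FRIENDLY.match(path)
    if match:
        # The TID query parameter is assumed for illustration.
        return f"/forum_posts.asp?TID={match.group(1)}"
    return path

print(rewrite("/topic/123"))
print(rewrite("/default.asp"))
```

The point is only that spiders see a clean path while the server still runs the same dynamic page behind it.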
|
 |
wistex
Mod Builder Group
Joined: 30 August 2003
Location: United States
Status: Offline
Points: 877
|
Posted: 03 July 2005 at 8:31pm |
BTW, if you press the space bar after typing a URL, it makes it clickable. Here's a clickable version of the links provided above: http://www.isapirewrite.com/ and http://www.robotstxt.org/wc/norobots.html
The key to getting your site indexed is content, content, content. Yes, there are tricks to get Google and other search engines to understand your site better and therefore rank it higher, but content still comes first. On one of my sites, we have not used any SEO tricks, and yet we rank in the top 10 on Google, Yahoo! and MSN in some of the appropriate categories. Actually, we do the opposite of what some SEO people say, yet we rank in the top 10. Why? The content is good and people keep coming back, and search engines have ways to figure this out.
And excluding files won't help you get indexed; actually, it will probably hurt you. The larger your site, the more of an authority you are, and that affects your ranking. So excluding files would probably make you less of an authority, since your site will appear smaller to search engines.
|
|
|
 |
radic
Newbie
Joined: 24 February 2005
Status: Offline
Points: 7
|
Posted: 04 July 2005 at 9:45am |
OK, thanks for the feedback. I'm quite surprised that you both think it's better not to use the Disallow command for files that you don't want shown on Google etc.
I'm well aware that content is king and have been studying SEO quite a bit, but there is so much to know. I mean, what would be the point of indexing a file that has no content, like an included header, or one of the forum files that has no use or could produce an error if landed on?
I would much rather have the robot come in and index all the forum posts and boards than let it index all these included or function files. I think you need to make it easy and clear for robots what they should do with your site and not throw hurdles in their path.
It's also a waste of bandwidth, although that's not the point. The robots will turn away and not index a site properly if they have to deal with too much junk. They just want the content files, not the includes, and not the errors caused by landing on an include file. You also don't want these files for everyone to see on Google etc.; you want users to find the forum posts where the keywords are.
Anyway, that's just how I see it, but I'm very interested to hear more.
|
 |
dpyers
Senior Member
Joined: 12 May 2003
Status: Offline
Points: 3937
|
Posted: 04 July 2005 at 12:25pm |
|
The SE bots follow links. They don't actually "walk" a directory tree.
You don't link to an include file so the bots never see it.
The first time a bot visits, it does a "shallow" scan, usually only one
or two levels deep from your home page. On subsequent visits, it scans
deeper and deeper. It can take months to fully scan a deep site.
Pages that change frequently are visited more often by the bots. Once
they get past your main forum page, they'll see pages that change
frequently and keep coming back more often. I've seen forums that get
scanned twice a day.
One way of getting your site fully indexed quicker is to include a site
map on your front page. Another way to get a forum site indexed is to
include a link to active topics on the home page.
The robots.txt exclude functions are used to prevent the bots from
following links to those directories - not to prevent walking the
directories which is something bots can't do if you've turned off
directory views for your site.
One useful thing that excluding directories and files in robots.txt does
is keep things hidden that you want hidden from the SEs, like
keeping your images out of Google.
Note that robots.txt is only useful for conforming search engines. Bad
Bots either ignore it, or use it to identify areas where "good stuff"
might be kept.
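As a rough illustration of how a conforming bot treats exclusions, here is a minimal sketch using Python's standard `urllib.robotparser`. The directory names, the `header_inc.asp` file, the `TID` parameter, and the `ExampleBot` user agent are all made up for the example and are not taken from any real forum install.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block the include and admin directories, allow the rest.
ROBOTS_TXT = """\
User-agent: *
Disallow: /forum/includes/
Disallow: /forum/admin/
"""

def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt content held in a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser

parser = build_parser(ROBOTS_TXT)

# A conforming bot asks before following each link it has found.
post_url = "http://www.example.com/forum/forum_posts.asp?TID=1"
include_url = "http://www.example.com/forum/includes/header_inc.asp"
print(parser.can_fetch("ExampleBot", post_url))
print(parser.can_fetch("ExampleBot", include_url))
```

Nothing here stops a non-conforming bot: the check is something the crawler chooses to run, which is why the file is only advisory.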
|
Lead me not into temptation... I know the short cut, follow me.
|
 |
wistex
Mod Builder Group
Joined: 30 August 2003
Location: United States
Status: Offline
Points: 877
|
Posted: 04 July 2005 at 7:35pm |
dpyers is right: the search engines only follow and index links. They don't index any file that no one links directly to. I've never had a problem with Google or any of the others linking to any header or footer files.
Remember, the bots can't read your directory structure, they can only follow hyperlinks in a webpage.
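The link-following behaviour described here can be sketched as a small breadth-first crawl over an in-memory site. This is a simplified model under stated assumptions, not how any real search engine is implemented; the page names and the `header_inc.asp` include file are hypothetical.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical in-memory site: path -> HTML. Nothing links to header_inc.asp.
SITE = {
    "/default.asp": '<a href="/forum_topics.asp">Topics</a>',
    "/forum_topics.asp": '<a href="/forum_posts.asp">Posts</a> <a href="/default.asp">Home</a>',
    "/forum_posts.asp": '<a href="/default.asp">Home</a>',
    "/includes/header_inc.asp": "<!-- included by other pages, never linked -->",
}

class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(site, start):
    """Return every page reachable from start by following links only."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        collector = LinkCollector()
        collector.feed(site.get(page, ""))
        for link in collector.links:
            if link in site and link not in seen:
                seen.add(link)
                queue.append(link)
    return seen

reached = crawl(SITE, "/default.asp")
print(sorted(reached))
```

The include file exists on the "server" but never shows up in `reached`, because the crawl has no way to list the directory and no page links to it.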
|
|
|
 |
dpyers
Senior Member
Joined: 12 May 2003
Status: Offline
Points: 3937
|
Posted: 05 July 2005 at 9:05pm |
Something new from Google that you may be interested in:
https://www.google.com/webmasters/sitemaps/login
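For anyone curious what such a sitemap file looks like, here is a minimal Python sketch that builds one from a list of URLs. It uses the namespace of the current sitemaps.org protocol, which is an assumption on my part and may differ from what the Google Sitemaps beta expected at the time; the example URLs and the `TID` parameter are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of the current sitemaps.org protocol (assumed; the 2005 beta may have differed).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls):
    """Return sitemap XML listing each URL in a <url><loc> entry."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc in urls:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
    return ET.tostring(urlset, encoding="unicode")

# Hypothetical forum pages to list.
pages = [
    "http://www.example.com/forum/default.asp",
    "http://www.example.com/forum/forum_posts.asp?TID=1",
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

A sitemap lists what you do want crawled, which is the opposite job from robots.txt.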
|
|
 |
radic
Newbie
Joined: 24 February 2005
Status: Offline
Points: 7
|
Posted: 06 July 2005 at 8:58am |
Hahaha, I've been busy working on these for my sites for the last few days, since I heard of this. So the point of this post is not really an issue anymore if this Google Sitemap thing works. I got a Snitz forum site map indexed yesterday and I'm now working on ones for the rest of my sites...
dpyers, OK, I see what you're saying now. Thanks for that information; it was explained well.
|
 |