
Unwanted pages in search engines

Printed From: Web Wiz Forums
Category: Web Wiz Web App Support Forums
Forum Name: Web Wiz Forums
Forum Description: Support forum for Web Wiz Forums application.
URL: https://forums.webwiz.net/forum_posts.asp?TID=23466
Printed Date: 07 April 2026 at 11:19am
Software Version: Web Wiz Forums 12.08 - https://www.webwizforums.com


Topic: Unwanted pages in search engines
Posted By: cathal
Subject: Unwanted pages in search engines
Date Posted: 05 June 2007 at 1:46pm
How do I prevent pages like these:
 
-- http://www.gamer-park.com/calendar_week.asp?M=10&Y=2008&W=1 - Calendar pages
 
-- http://www.gamer-park.com/printer_friendly_posts.asp?TID=62 - Printer-friendly pages
 
from appearing in Google, Windows Live Search, and other search engines?



Replies:
Posted By: michael
Date Posted: 05 June 2007 at 2:12pm
Create a robots.txt file that excludes those pages. Most well-behaved search engines will honour that file.
To see a sample, request /robots.txt from any domain, e.g. microsoft.com/robots.txt or webwiz.net/robots.txt.


-------------
http://baumannphoto.com - Blog | http://mpgtracker.com - MPG Tracker


Posted By: cathal
Date Posted: 05 June 2007 at 4:13pm
Which files / folders will I need to exclude to stop these pages appearing?


Posted By: michael
Date Posted: 05 June 2007 at 7:03pm
Files like printer_friendly_posts.asp and calendar_week.asp, etc.

-------------
http://baumannphoto.com - Blog | http://mpgtracker.com - MPG Tracker


Posted By: _PDG_
Date Posted: 08 June 2007 at 8:45am
Originally posted by michael:

Like printer_friendly_posts.asp and calendar_week.asp etc.
 
 
Or, for the newbies, a simple way: just copy/paste the text below and save it as a plain-text file named robots.txt:
 
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /includes/
Disallow: /404error.asp
Disallow: /printer_friendly_posts.asp
Disallow: /calendar_week.asp
 

User-agent: googlebot
Crawl-delay: 20

User-agent: yahoo
Crawl-delay: 60

User-agent: Slurp
Crawl-delay: 60

User-agent: msnbot
Crawl-delay: 60

User-agent: Teoma
Crawl-delay: 20

User-agent: aipbot
Disallow: /

User-agent: BecomeBot
Disallow: /

User-agent: psbot
Disallow: /

User-agent: fast
Disallow: /


Posted By: cathal
Date Posted: 08 June 2007 at 10:14am
Why is the cgi-bin folder disallowed? I can't even see it in my forum. Is it hidden?


Posted By: WebWiz-Bruce
Date Posted: 08 June 2007 at 10:29am
This looks very much like the robots.txt file used for this site.

Parts of it shouldn't be copied. The crawl delays are there to make search robots index this site more slowly and stop them eating bandwidth, because this site is continually crawled by many search engines (over 100,000 sites link here). Most sites wouldn't want this, as it could prevent your site being indexed properly.

Your robots.txt file, which you place in the root of your site, should look like this:


User-agent: *
Disallow: /forum_images/
Disallow: /includes/
Disallow: /printer_friendly_posts.asp
Disallow: /calendar_week.asp
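
As a quick check, the rules above can be fed to Python's standard `urllib.robotparser` to confirm they block exactly the unwanted pages. This sketch was not part of the original reply; note that Disallow paths need the leading slash shown above, since most parsers silently ignore a rule without it.

```python
from urllib import robotparser

# The recommended robots.txt from the post above.
ROBOTS = """\
User-agent: *
Disallow: /forum_images/
Disallow: /includes/
Disallow: /printer_friendly_posts.asp
Disallow: /calendar_week.asp
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS.splitlines())

# The calendar and printer-friendly pages are blocked for every crawler...
print(rp.can_fetch("Googlebot", "/printer_friendly_posts.asp?TID=62"))  # False
print(rp.can_fetch("Googlebot", "/calendar_week.asp?M=10&Y=2008&W=1"))  # False
# ...while normal topic pages stay crawlable.
print(rp.can_fetch("Googlebot", "/forum_posts.asp?TID=62"))             # True
```

Disallow matching is a simple prefix test, so the query strings (`?TID=62`, `?M=10...`) don't stop the rule from applying.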



-------------
https://www.webwiz.net/web-wiz-forums/forum-hosting.htm - Web Wiz Forums Hosting
https://www.webwiz.net/web-hosting/windows-web-hosting.htm - ASP.NET Web Hosting




Forum Software by Web Wiz Forums® version 12.08 - https://www.webwizforums.com
Copyright ©2001-2026 Web Wiz Ltd. - https://www.webwiz.net