# directions: http://www.searchtools.com/robots/robots-txt.html
# format: <field>: <value>
#
# The first thing a robot does when it visits your site is to look for a file
# called robots.txt. If the file exists, the robot will follow the instructions
# contained within it. If there is no robots.txt file present, you are giving
# it free rein to index any page it wishes.
#
# By including a robots.txt file you can indicate exactly what is, and what is
# not, off-limits to all, or just some, robots. Use Notepad or whatever text
# editor you prefer and set it out like this:
#
# # robots.txt file for http://www.yoursite.com
#
# User-agent: webcrawler
# Disallow:
#
# User-agent: altavista
# Disallow: /
#
# User-agent: *
# Disallow: /forms
# Disallow: /logs
#
# Any line starting with '#' is a comment. Use it for your own information.
#
# In this instance the first paragraph after the comments is specific to the robot
# called 'webcrawler' and states that webcrawler has nothing disallowed, so it is
# free to go anywhere.
#
# The second paragraph indicates that the robot called 'altavista' is effectively
# barred from your entire site.
#
# The last paragraph indicates that all other visiting robots should not visit
# URLs starting with /forms or /logs. The '*' is not a wildcard but a special
# value meaning "any robot not named elsewhere in the file". You cannot use
# wildcard patterns or other expressions in the User-agent or Disallow fields.
#
# You also cannot string entries together on one line like this:
#
# User-agent: *
# Disallow: /forms /logs /errors /tmp
#
# You must create a new Disallow line for each entry, like this:
#
# User-agent: *
# Disallow: /forms/
# Disallow: /logs/
# Disallow: /errors/
# Disallow: /tmp/
#
# Once you are happy, save the file as "robots.txt" (no quotes) and move it to the
# root directory of your site, i.e. where your default page resides.
#
# Just follow these simple rules and you should have no problems.
#
# META Exclusion Tags
# If you are unable to create a robots.txt file because, for example, you share
# or don't administer the server your files are on, then you can utilise the
# following META tags:
#
# <meta name="robots" content="noindex">
#
# Including this line between your header tags in your HTML will mean that the
# page will not be indexed.
#
# If, instead, you use:
#
# <meta name="robots" content="nofollow">
#
# then the page will be indexed, but any links in that document will not be
# followed by the robot.
#

User-agent: *
Disallow: /.admin/
Disallow: /cgi-bin/
Disallow: /berdichev/
Disallow: /vizbook/
Disallow: /ftp/
Disallow: /misc/
Disallow: /phpBB2/
Disallow: /usage/
Disallow: /webalizer/