[SEO] How to Create a Perfect Robots.txt File
The robot exclusion standard, also known as the Robots Exclusion Protocol or robots.txt protocol, is a convention to prevent cooperating web spiders and other web robots from accessing all or part of a website which is otherwise publicly viewable. Robots are often used by search engines to categorize and archive web sites, or by webmasters to proofread source code. The standard complements Sitemaps, a robot inclusion standard for websites.
How to Build a Perfect Robots.txt File?
Many sites offer tools for creating a Robots.txt file for your blog or site. One of them is Mcanerin International, which provides the Robot Control Code Generation Tool for creating a Robots.txt file.
Create your own Robots.txt File by using Robot Control Code Generation Tool
Pre-Made Robots.txt Files:
Allow All Robots everywhere EXCEPT the cgi-bin and the images directory
Only Allow Known Major Search Engines
FAQs:
1.) Why should we use a Robots.txt File?
By itself, a robots.txt file is harmless and actually beneficial. However, its job is to tell a search engine to keep away from parts of your website. If you misconfigure it, you can accidentally prevent your site from being spidered and indexed.
This has happened to people both due to an error in the robots.txt file and also after a site redesign where the directory structure of the site has changed and the robots.txt has not been updated. Always check the robots.txt after a major site redesign.
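The post-redesign check can be automated. The following is a minimal sketch, not a definitive tool: it uses Python's standard `urllib.robotparser` module, and the helper name `blocked_paths`, the sample rules, and the sample paths are all hypothetical, chosen only for illustration.

```python
from urllib.robotparser import RobotFileParser


def blocked_paths(robots_txt, paths, user_agent="Googlebot"):
    """Return the paths that the given robots.txt text blocks for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(user_agent, p)]


# Hypothetical situation: /new/ was a staging directory before the redesign
# and is now where the live pages are, but the old rule was never removed.
STALE_ROBOTS_TXT = """User-agent: *
Disallow: /new/
"""

# Hypothetical list of pages that must stay crawlable.
IMPORTANT_PATHS = ["/", "/new/products.html", "/about.html"]

print(blocked_paths(STALE_ROBOTS_TXT, IMPORTANT_PATHS))
# prints ['/new/products.html']
```

Running a check like this against the pages you care about most makes a stale rule visible before the search engines notice it.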
A robots.txt file and, for that matter, the robots meta tag, has NO EFFECT on speeding up the spidering and indexing of a website, and no effect on the depth or breadth of the spidering of a site.
You cannot issue a search engine spider a command to do something – you can only tell it not to do something.
2.) Where should we place the Robots.txt file?
Place your robots.txt file in your site's root directory. Robots look for it only at that location.
Example: http://www.yourdomain.com/robots.txt
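Because the file always sits at the root, its address can be derived from any page URL on the same site. This is a small sketch using Python's standard `urllib.parse` module; the function name `robots_url` and the sample page URL are assumptions made for the example.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Build the robots.txt URL for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("http://www.yourdomain.com/blog/post.html?page=2"))
# prints http://www.yourdomain.com/robots.txt
```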
3.) What does a Robots.txt file look like?
At its simplest, a robots.txt file looks like this:
User-agent: *
Disallow:
This one tells all robots (user agents) to go anywhere they want (disallow nothing).
This one, on the other hand, keeps out all compliant robots:
User-agent: *
Disallow: /
As you can see, the only difference between them is a single slash ( “/” ). But if you accidentally use that slash when you didn’t mean to, you could find your search engine rankings disappear. Be very careful.
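To see how much that single slash matters, the two files can be fed to a parser and compared. This is a minimal sketch using Python's standard `urllib.robotparser` module; the helper `is_allowed` and the robot name `ExampleBot` are hypothetical, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, url, user_agent="ExampleBot"):
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


ALLOW_ALL = "User-agent: *\nDisallow:"    # empty value: nothing is disallowed
BLOCK_ALL = "User-agent: *\nDisallow: /"  # single slash: everything is disallowed

page = "http://www.yourdomain.com/index.html"
print(is_allowed(ALLOW_ALL, page))  # prints True
print(is_allowed(BLOCK_ALL, page))  # prints False
```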
4.) Why should we use a Sitemap?
A sitemap tells search engines which pages exist, but do not assume you can technically force Google to crawl and index them. Google (or rather its systems) decides when, how, and how often to crawl and index. If your site has been recognized as “very important”, you may see your sitemap and pages crawled as often as hourly; if not, you will have to wait for crawling and indexing.
5.) Why should we Restrict Directories?
If we do not want certain directories to be crawled and indexed by search engines, we should restrict those directories in the robots.txt file.
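As an illustration, restricting directories comes down to one `Disallow` line per directory. The sketch below generates such a file for the cgi-bin and images directories named in the pre-made files list; the function name `build_robots_txt` is hypothetical, and this is not the output of the generation tool mentioned earlier.

```python
def build_robots_txt(disallowed_dirs, user_agent="*"):
    """Return robots.txt text that keeps user_agent out of the given directories."""
    lines = [f"User-agent: {user_agent}"]
    for name in disallowed_dirs:
        name = name.strip("/")
        if not name:
            # Skip empty entries so we never emit an accidental "Disallow: /".
            continue
        lines.append(f"Disallow: /{name}/")
    return "\n".join(lines) + "\n"


print(build_robots_txt(["cgi-bin", "images"]), end="")
# prints:
# User-agent: *
# Disallow: /cgi-bin/
# Disallow: /images/
```

Skipping empty names is a deliberate choice here: a blank entry would otherwise turn into a rule that blocks the whole site.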
6.) Major Known Spiders / Crawlers?
Googlebot (Google), Googlebot-Image (Google Image Search), MSNBot (MSN), Slurp (Yahoo), Yahoo-Blogs, Mozilla/2.0 (compatible; Ask Jeeves/Teoma), Gigabot (Gigablast), Scrubby (Scrub The Web), Robozilla (DMOZ), Twiceler (Cuil)
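These names are what you put on the `User-agent` line when a rule should apply to one spider only. The sketch below builds a file that admits a few of the spiders listed above and keeps every other compliant robot out, then checks it with Python's standard `urllib.robotparser` module. The function name `allowlist_robots_txt`, the choice of three spiders, and the robot name `UnknownBot` are assumptions for the example.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical selection from the list of known spiders.
ALLOWED_BOTS = ["Googlebot", "MSNBot", "Slurp"]


def allowlist_robots_txt(allowed_bots):
    """Return robots.txt text that admits only the named robots."""
    # One group per allowed robot with an empty Disallow (nothing blocked).
    groups = [f"User-agent: {bot}\nDisallow:\n" for bot in allowed_bots]
    # Catch-all group: every other robot is kept out of the whole site.
    groups.append("User-agent: *\nDisallow: /\n")
    # Joining with a newline leaves a blank line between groups.
    return "\n".join(groups)


robots_txt = allowlist_robots_txt(ALLOWED_BOTS)
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

print(parser.can_fetch("Googlebot", "/index.html"))   # prints True
print(parser.can_fetch("UnknownBot", "/index.html"))  # prints False
```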

![create robots.txt - seo create robots.txt seo [SEO] How to Create Perfect Robots.txt File](http://www.rajeshpatel.net/wp-content/uploads/2009/06/create-robots.txt-seo.jpg)