What is Robots.txt?
The robots.txt file tells search engine robots which pages on your website they may crawl, which in turn affects what gets indexed. Most websites have files and folders that are not relevant to search engines (such as images or admin files), so creating a robots.txt file can improve how your website is indexed. It also gives you a way to keep content away from search engine crawlers.
Robots.txt Syntax
# comment
User-agent: [robot name] | [* (wildcard: all robots)]
Disallow: [/ (everything)] | [specific directory] | [specific file location]
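As a rough illustration of the syntax above, the following Python sketch assembles one record as text. The helper name build_record, the comment text and the sample paths are assumptions made for this example only.

```python
def build_record(user_agents, disallowed_paths, comment=None):
    """Return one robots.txt record as a string (hypothetical helper)."""
    lines = []
    if comment:
        lines.append(f"# {comment}")
    # One User-agent line per robot; "*" stands for every robot.
    for agent in user_agents:
        lines.append(f"User-agent: {agent}")
    # One Disallow line per path; "/" would block the whole site.
    for path in disallowed_paths:
        lines.append(f"Disallow: {path}")
    return "\n".join(lines) + "\n"


record = build_record(["*"], ["/admin/", "/images/"], comment="sample record")
print(record, end="")
```

Each instruction ends up on its own line, with the comment first, then the User-agent lines, then the Disallow lines.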
User-agent
The value of this field is the name of the robot whose access policy the record describes. If more than one User-agent field is present, the record describes an identical access policy for each of those robots. At least one User-agent field must be present per record, and you can list multiple User-agents in one record.
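To see a shared record in action, here is a small sketch using Python's standard urllib.robotparser module. The robot names (examplebot, otherbot, thirdbot) and the /private/ path are invented for the example, and this parser is only one possible interpretation of the rules; real crawlers may behave differently.

```python
from urllib.robotparser import RobotFileParser

# Sample record: two User-agent lines share a single Disallow rule.
# Robot names and the /private/ path are made up for this example.
ROBOTS_TXT = """\
User-agent: examplebot
User-agent: otherbot
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, path):
    """Ask the parser whether `agent` may fetch `path` on the sample site."""
    return parser.can_fetch(agent, "http://www.example.com" + path)


# Both named robots are blocked; a robot not named in the record is not.
shared = {
    agent: allowed(agent, "/private/page.html")
    for agent in ("examplebot", "otherbot", "thirdbot")
}
print(shared)
```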
Disallow
The value of this field specifies a partial URL that is not to be visited. This can be a full path or a partial path; any URL that starts with this value will not be retrieved. For example, Disallow: /help disallows both /help.html and /help/index.html, whereas Disallow: /help/ would disallow /help/index.html but allow /help.html. An empty value indicates that all URLs can be retrieved. At least one Disallow field needs to be present in a record.
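The prefix behaviour described above can be checked with a short sketch built on Python's standard urllib.robotparser module. The function name can_crawl, the robot name examplebot and the example.com host are placeholders chosen for this illustration.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(disallow_value, path):
    """Check `path` against a one-record robots.txt with the given Disallow value."""
    parser = RobotFileParser()
    parser.parse(["User-agent: *", f"Disallow: {disallow_value}"])
    return parser.can_fetch("examplebot", "http://www.example.com" + path)


# Try each Disallow value from the text against both sample paths.
for rule in ("/help", "/help/", ""):
    for path in ("/help.html", "/help/index.html"):
        print(repr(rule), path, can_crawl(rule, path))
```

With /help both paths are refused, with /help/ only /help/index.html is refused, and with the empty value both are permitted.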
Things to remember while writing robots.txt:
- Robots.txt should be written in a plain text editor like Notepad. Do not use MS Word or any other word processor to create robots.txt. The bottom line is that the file must be saved as plain text with the extension ".txt"; otherwise it will be useless.
- A robots.txt file is always stored in the root of your site, and is always named in lower case. Spiders will always look for it in the root directory (e.g. http://www.example.com/robots.txt).
- There can only be one instruction per line.
- You should avoid putting spaces before the instructions (recommended to avoid making mistakes).
- For security reasons, be careful when using robots.txt to keep spiders away from sensitive and private areas of your site, because anybody at all can view your robots.txt file and see the paths it lists.
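Because the file always lives in the root of the site, its address can be worked out from any page URL. The sketch below shows one way to do that in Python; the function name robots_txt_url and the sample page URL are assumptions for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Return the root-level robots.txt URL for the site hosting `page_url`."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("http://www.example.com/blog/2010/post.html?page=2"))
# prints http://www.example.com/robots.txt
```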