
A Definitive Guide To Robots.txt

Guide To Robots.txt For SEO

A website is not just a group of files stored on a remote disk. Instead, websites work as fully-fledged systems of their own, and the webmaster decides which parts of them search engines may visit. Well-behaved search engine crawlers respect these permissions, although the rules are a convention that crawlers follow voluntarily rather than something the server enforces. The rules are stored in the robots.txt file (the file name is all lowercase), which must be located at the root of the website in order to work.
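Because the file always sits at the root, its address can be worked out from any page address on the same site. The following is a minimal sketch in Python; `robots_url` is a hypothetical helper name and `www.example.com` is a placeholder domain, neither comes from a real site.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt address for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host (with port, if any); the path is
    # always /robots.txt, and any query string or fragment is dropped.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Placeholder URL used only for illustration.
print(robots_url("https://www.example.com/blog/post.html?id=7"))
# prints https://www.example.com/robots.txt
```

A robots.txt file placed in a subfolder such as /blog/robots.txt would not be found this way, which is why crawlers ignore it there.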

Robots.txt does not contain any kind of special code, though you do have to write the rules in a certain format. It is a plain text file with one directive per line, and each directive is written as a field name, a colon, and a value.

To someone new to it, writing rules in robots.txt can look hard. But once you get the hang of the format, you can easily write the rules on your own.


A robots.txt file is just a group of rule sets. In each set, the webmaster first states which search engine crawlers the set applies to and then defines the rules themselves. The text snippet below is a typical example of this.

User-agent: *
Disallow: /images/
Disallow: /js/

Here the first line addresses the search engine crawler(s); ‘*’ means all of them. The second and third lines tell the crawlers not to fetch anything under the “images” and “js” directories of the website, which keeps the contents of those directories out of the index.
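One way to see how a crawler reads these three lines is to feed them to Python's standard `urllib.robotparser` module. This is only a sketch of how one parser interprets the rules, not a statement of how any particular search engine behaves; `ExampleBot`, `is_allowed`, and `www.example.com` are made-up names for illustration.

```python
from urllib.robotparser import RobotFileParser

# The rule set from the snippet above, written one directive per line.
RULES = """\
User-agent: *
Disallow: /images/
Disallow: /js/
"""


def is_allowed(rules: str, agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


# "ExampleBot" is a hypothetical crawler name; "*" applies to it too.
for path in ("/index.html", "/images/logo.png", "/js/app.js"):
    print(path, is_allowed(RULES, "ExampleBot", "https://www.example.com" + path))
# prints:
# /index.html True
# /images/logo.png False
# /js/app.js False
```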

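Note: Paths in robots.txt are case sensitive, so “/Images/” and “/images/” are two different folders to a crawler. Be sure to enter folder and file names exactly as they appear on the server. Directive names and crawler names, by contrast, are matched without regard to case.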
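The effect of case can be checked with a short sketch using Python's standard `urllib.robotparser`. The crawler name `ExampleBot` and the domain are placeholders, and other parsers may differ in details, but path matching being case sensitive is the common behaviour.

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
# The crawler name is deliberately written in lowercase here.
parser.parse(["User-agent: examplebot", "Disallow: /images/"])

base = "https://www.example.com"  # placeholder domain

# The path is compared character for character, so only the lowercase
# folder name matches the Disallow rule.
lower_allowed = parser.can_fetch("ExampleBot", base + "/images/logo.png")
upper_allowed = parser.can_fetch("ExampleBot", base + "/Images/logo.png")

print(lower_allowed, upper_allowed)
# prints False True
```

The group still applies even though the file says `examplebot` and the crawler identifies itself as `ExampleBot`, because crawler names are compared without regard to case.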

Some examples of robots.txt files:

Example 1:

User-agent: *
Disallow:

(A blank Disallow value means search engines can index all files and folders of the website.)

Example 2:

User-agent: *
Disallow: /css/style.css

(This rule tells all search engines not to index one particular file on the server.)

Example 3:

User-agent: *
Disallow: /css/

(Here the rule is telling search engines to not index the whole “css” folder)

Example 4:

User-agent: *
Disallow: /

(Here the rule is telling search engines to not index the whole website)
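The four examples above can be compared side by side with a small sketch built on Python's standard `urllib.robotparser`. The three test paths, the crawler name `ExampleBot`, and the helper `allowed_paths` are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# The four rule sets from the examples above.
EXAMPLES = {
    "Example 1": "User-agent: *\nDisallow:",
    "Example 2": "User-agent: *\nDisallow: /css/style.css",
    "Example 3": "User-agent: *\nDisallow: /css/",
    "Example 4": "User-agent: *\nDisallow: /",
}

# Hypothetical paths on a placeholder site.
PATHS = ["/index.html", "/css/style.css", "/css/print.css"]


def allowed_paths(rules: str) -> list:
    """Return the test paths that the given rules leave open to crawlers."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return [
        path
        for path in PATHS
        if parser.can_fetch("ExampleBot", "https://www.example.com" + path)
    ]


for name, rules in EXAMPLES.items():
    print(name, allowed_paths(rules))
# prints:
# Example 1 ['/index.html', '/css/style.css', '/css/print.css']
# Example 2 ['/index.html', '/css/print.css']
# Example 3 ['/index.html']
# Example 4 []
```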

Example 5 (grouping different rules together):

User-agent: Googlebot
Disallow: /css/

User-agent: *
Disallow:

(Here the robots.txt file tells the Googlebot crawler not to index the ‘css’ folder. At the same time, the second group in the same file allows all other search engines to index the whole website. Note that the field name is written “User-agent”, with a hyphen, in every group.)
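How a file with two groups is resolved can be sketched with Python's standard `urllib.robotparser`. In this sketch `Googlebot` stands in for the specifically named crawler and `OtherBot` is a made-up name for any other crawler; the domain is a placeholder.

```python
from urllib.robotparser import RobotFileParser

# One group for a named crawler, one catch-all group for everyone else.
GROUPED_RULES = """\
User-agent: Googlebot
Disallow: /css/

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(GROUPED_RULES.splitlines())

css_url = "https://www.example.com/css/style.css"  # placeholder URL

# The named crawler gets its own group; everyone else falls back to "*".
named_allowed = parser.can_fetch("Googlebot", css_url)
other_allowed = parser.can_fetch("OtherBot", css_url)

print(named_allowed, other_allowed)
# prints False True
```

A crawler uses the group that names it and ignores the catch-all group, so the two sets of rules do not add up.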

What To Allow For Indexing:

  • All images
  • All JavaScript files
  • All CSS files
  • All HTML files
  • Anything else that is linked from or embedded in your website

What Not To Allow For Indexing:

  • Personal files that you do not want to be displayed in search results
  • Admin folder or admin pages

Hint: In robots.txt, you do not need to list the files you want to get indexed. The file only tells search engines what not to crawl, and anything it does not mention is allowed by default.
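Putting the advice above together, a robots.txt file only needs to name the folders to keep out. The sketch below builds such a file in Python and checks it with the standard `urllib.robotparser`; the helper `build_robots_txt` and the folder names `/admin/` and `/private/` are hypothetical and should be replaced with the real paths on your site.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(disallowed, agent="*"):
    """Build robots.txt text with one group that disallows the given paths."""
    lines = [f"User-agent: {agent}"]
    # With nothing to block, a blank Disallow value keeps the whole site open.
    lines += [f"Disallow: {path}" for path in disallowed] or ["Disallow:"]
    return "\n".join(lines) + "\n"


# Hypothetical folder names; everything not listed stays crawlable.
robots_txt = build_robots_txt(["/admin/", "/private/"])
print(robots_txt)

# Check the generated rules before uploading the file to the site root.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
print(parser.can_fetch("ExampleBot", "https://www.example.com/admin/login.html"))
print(parser.can_fetch("ExampleBot", "https://www.example.com/images/logo.png"))
# the last two lines print False, then True
```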

About the author

Rajesh Chauhan

