
How to create a robots.txt | What is robots.txt | Robots exclusion standard | ShoutMakersDream

Robots.txt is a plain text (not HTML) file you put on your site to tell search robots which pages you would like them not to visit. Search engines are under no obligation to honor robots.txt, but the major ones generally obey what they are asked not to do. It is important to be clear that robots.txt does not prevent search engines from crawling your site (it is not a firewall or a kind of password protection). Putting up a robots.txt file is like hanging a "Please do not enter" note on an unlocked door: you cannot stop thieves from coming in, but the good guys will not open the door and enter. That is why, if you have really sensitive data, it is naive to rely on robots.txt to keep it from being indexed and displayed in search results.

The location of robots.txt is very important. It must be in the main directory, because otherwise user agents (search engines) will not be able to find it. They do not search the whole site for a file named robots.txt. Instead, they look only in the main directory (e.g. http://mydomain.com/robots.txt), and if they do not find it there, they simply assume that the site has no robots.txt file and crawl everything they find along the way. So if you do not put robots.txt in the right place, do not be surprised when search engines index your whole site.

The concept and structure of robots.txt were developed more than a decade ago. If you are interested in learning more, visit http://www.robotstxt.org/ or go straight to the Standard for Robot Exclusion, because this article deals only with the most important aspects of a robots.txt file. Next, we continue with the structure of a robots.txt file.


Presentation Transcript


  1. ROBOTS.TXT PRESENTED BY: SAURAV DEO Shout Makers Dream www.shoutmakersdream.com admin@shoutmakersdream.com

  2. ROBOTS.TXT • The robots exclusion standard is also known as the robots exclusion protocol, or simply robots.txt. • robots.txt is a standard that websites use to communicate with web crawlers and other web robots.

  3. ROBOTS.TXT
     • Block all web crawlers from all content:
         User-agent: *
         Disallow: /
     • Block a specific web crawler from a specific folder:
         User-agent: Googlebot
         Disallow: /no-google/
     • Block a specific web crawler from a specific web page:
         User-agent: Googlebot
         Disallow: /no-google/blocked-page.html
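The three rule sets on this slide can be checked with a short script. The following is a minimal sketch, not part of the original presentation, that feeds each rule set to Python's standard-library `urllib.robotparser` and asks whether a given crawler may fetch a given URL. The helper name `build_parser`, the constant names, the crawler name `OtherBot`, and the `www.example.com` URLs are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Rule sets copied from the slide; the constant names are illustrative.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

BLOCK_FOLDER = """\
User-agent: Googlebot
Disallow: /no-google/
"""

BLOCK_PAGE = """\
User-agent: Googlebot
Disallow: /no-google/blocked-page.html
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text directly, without fetching it over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


if __name__ == "__main__":
    site = "http://www.example.com"  # hypothetical site

    everyone_blocked = build_parser(BLOCK_ALL)
    print(everyone_blocked.can_fetch("OtherBot", site + "/index.html"))

    folder_blocked = build_parser(BLOCK_FOLDER)
    print(folder_blocked.can_fetch("Googlebot", site + "/no-google/a.html"))
    print(folder_blocked.can_fetch("OtherBot", site + "/no-google/a.html"))

    page_blocked = build_parser(BLOCK_PAGE)
    print(page_blocked.can_fetch("Googlebot", site + "/no-google/blocked-page.html"))
    print(page_blocked.can_fetch("Googlebot", site + "/no-google/other.html"))
```

A check like this only reports what a well-behaved crawler would do. It does not enforce anything, since obeying robots.txt is voluntary.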

  4. Where do you put robots.txt? • Save your robots.txt rules as a plain text file. • Place the file in the highest-level directory of your site (the root of your domain). • The file must be named robots.txt. • Crawlers look for robots.txt only in the top-level directory of a web server, so it is useful only there. Example: http://www.example.com/robots.txt
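Because the file always sits at the root of the host, its address can be derived from any page URL on the same site. The sketch below, which is an illustration rather than something from the slides, shows one way to do that with Python's standard `urllib.parse`. The function name `robots_txt_url` and the sample page URL are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the root-level robots.txt URL for the host serving page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host; replace the path and drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    # Hypothetical page deep inside the site.
    print(robots_txt_url("http://www.example.com/blog/2015/post.html?ref=home"))
```

Whatever the depth of the page, the result points at the top-level file, which matches the example address on the slide.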

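  5. What is the robots meta tag? The robots meta tag is an HTML tag in a page's head section that gives crawlers instructions for that single page. Here is an example that instructs web crawlers not to index the page and not to follow any of the links on it:
         <meta name="robots" content="noindex, nofollow">
     A directive can also be given on its own, and a tag can address one crawler by name. For example:
         <meta name="robots" content="nofollow">
         <meta name="googlebot" content="noindex">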
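To see how a crawler might read these tags, here is a minimal sketch, not taken from the presentation, that extracts the directives from a page with Python's standard `html.parser`. The class name `RobotsMetaParser`, the function name `robots_directives`, and the choice to watch only the `robots` and `googlebot` names are assumptions for illustration.

```python
from html.parser import HTMLParser

# Meta names to collect; limited to the two that appear on the slide.
WATCHED_NAMES = ("robots", "googlebot")


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" ...> style tags."""

    def __init__(self):
        super().__init__()
        self.directives = {}  # meta name -> set of lowercase directives

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").strip().lower()
        if name not in WATCHED_NAMES:
            return
        content = attr.get("content") or ""
        found = {d.strip().lower() for d in content.split(",") if d.strip()}
        self.directives.setdefault(name, set()).update(found)


def robots_directives(html: str) -> dict:
    """Return a mapping such as {"robots": {"noindex", "nofollow"}}."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    page = '<head><meta name="robots" content="noindex, nofollow"></head>'
    print(robots_directives(page))
```

Unlike robots.txt, these tags are only seen after a crawler has fetched the page, so a page blocked in robots.txt never gets its meta tags read.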

  6. THANK YOU SHOUT MAKERS DREAM www.shoutmakersdream.com admin@shoutmakersdream.com
