Guide to using the robots.txt file

Procedure

If you're new to SEO, you'll need to create a robots.txt file for your site.
This tutorial will show you the purpose of the robots.txt file and how you can use it to improve your SEO.

What is a robots.txt file?

Robots.txt is a simple text file that you create and place at the root of your website to tell search engine spiders which pages on your site they may access.

How a robots.txt file works:

When search engine robots scan your site in order to index it, they first look for a robots.txt file in the root directory. This file contains instructions on which pages they may crawl and index for the SERPs*, and which pages they may not.

*SERP: acronym for Search Engine Results Page.
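Because robots always look for the file in the root directory, its address can be derived from any page URL on the site. The sketch below illustrates this with Python's standard library; the helper name robots_txt_url is an assumption made for this example, not part of any tool mentioned in this guide.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt address for the site that hosts page_url.

    Hypothetical helper written for this example.
    """
    parts = urlsplit(page_url)
    # Keep only the scheme and host: robots.txt always sits at the root path.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("http://www.mon-domaine-lws.fr/prive/index.html"))
# prints http://www.mon-domaine-lws.fr/robots.txt
```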

You can use the robots.txt file to:

  • Make search robots ignore duplicate pages on your site
  • Keep certain internal pages of your website out of the index (e.g. your admin panel or pages containing sensitive information)
  • Restrict robots to indexing only certain parts of your site, or block the whole site
  • Prohibit search robots from indexing certain files on your site, such as images and PDFs

Example of robots.txt directives

If you want to prevent all robots from visiting your site, so that it is not listed by search engines, use the following code:

User-agent: *
Disallow: /
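To check what this rule does before publishing it, you can feed it to Python's standard urllib.robotparser module. This is only a minimal sketch: the robot name ExampleBot is a made-up placeholder, and the page URL reuses the sample domain from this guide.

```python
from urllib.robotparser import RobotFileParser

# The "block everything" rules, one directive per line.
RULES = """\
User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# "ExampleBot" is a hypothetical robot name used only for this test.
allowed = parser.can_fetch("ExampleBot", "http://www.mon-domaine-lws.fr/index.html")
print(allowed)  # prints False: every path starts with "/", so every page is blocked
```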

You can also prevent robots from analysing parts of your site, while allowing them to analyse other sections. The following example tells search engine spiders not to crawl the admin folder, the tmp folder, the prive folder, or anything inside these folders on your website.

User-agent: *
Disallow: /admin/
Disallow: /tmp/
Disallow: /prive/

In the example above, http://www.mon-domaine-lws.fr/prive/index.html is one of the blocked URLs, but http://www.mon-domaine-lws.fr/index.html and http://www.mon-domaine-lws.fr/folder/ will be crawlable.
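The blocked and crawlable URLs listed above can be verified with a short script. The sketch below uses Python's standard urllib.robotparser with the same three Disallow rules; ExampleBot is a hypothetical robot name chosen for the illustration.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: *
Disallow: /admin/
Disallow: /tmp/
Disallow: /prive/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

urls = [
    "http://www.mon-domaine-lws.fr/prive/index.html",
    "http://www.mon-domaine-lws.fr/index.html",
    "http://www.mon-domaine-lws.fr/folder/",
]

for url in urls:
    # "ExampleBot" is a placeholder; the "*" group applies to any robot name.
    status = "allowed" if parser.can_fetch("ExampleBot", url) else "blocked"
    print(f"{status}: {url}")
```

Only the first URL is reported as blocked, because its path starts with /prive/.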

User-agent: * means that the rules that follow apply to all robots. You can also target a particular robot. For the Google robot, for example, use User-agent: Googlebot. The complete list of robots is available here: http://www.robotstxt.org/db.html
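To see how a robot-specific group interacts with the general one, here is a small sketch using Python's standard urllib.robotparser. The rules and the OtherBot name are invented for this illustration. Note that a robot with its own group follows that group only and ignores the User-agent: * rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: one group for Googlebot, one for every other robot.
RULES = """\
User-agent: Googlebot
Disallow: /tmp/

User-agent: *
Disallow: /prive/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

base = "http://www.mon-domaine-lws.fr"

# Googlebot obeys its own group, so only /tmp/ is closed to it.
print(parser.can_fetch("Googlebot", base + "/tmp/cache.html"))    # False
print(parser.can_fetch("Googlebot", base + "/prive/index.html"))  # True

# Any other robot ("OtherBot" is a made-up name) falls back to the "*" group.
print(parser.can_fetch("OtherBot", base + "/tmp/cache.html"))     # True
print(parser.can_fetch("OtherBot", base + "/prive/index.html"))   # False
```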

robots.txt file for WordPress

User-agent: *
Disallow: /wp-admin/ # disallow access to admin section
Disallow: /wp-login.php # disallow access to dashboard login page
Disallow: /search/ # disallow access to internal search results page
Disallow: *?s=* # disallow access to internal search results page
Disallow: *?p=* # disallow access to pages with bad permalinks
Disallow: *&p=* # disallow access to pages with bad permalinks
Disallow: *&preview=* # disallow access to preview pages
Disallow: /tag/ # disallow access to tag pages
Disallow: /author/ # disallow access to author pages
Disallow: /404-error/ # disallow access to 404 pages
Sitemap: https://www.mon-domaine-lws.fr/sitemap_index.xml
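Several of the WordPress rules above rely on the * wildcard. The sketch below shows one way such patterns can be interpreted, assuming the semantics documented by the major search engines: * matches any sequence of characters, a trailing $ anchors the end of the URL, and everything else is a prefix match. The function rule_matches is a hand-written helper for this example, not part of any library.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path (query string included).

    Hypothetical helper: "*" matches any characters, a trailing "$" anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with "match anything".
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start, which gives the usual prefix behaviour.
    return re.match(regex, path) is not None


print(rule_matches("*?s=*", "/?s=hosting"))                   # True: internal search
print(rule_matches("*&preview=*", "/?p=12&preview=true"))     # True: preview page
print(rule_matches("/wp-admin/", "/wp-admin/options.php"))    # True: prefix match
print(rule_matches("/tag/", "/tags-explained/"))              # False: different folder
```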

You can test whether your robots.txt file is blocking access to certain pages on your live site using Google Webmaster Tools, by visiting this page and clicking on Open Robots.txt Tester.

You can also generate your robots.txt file online using this tool.

Conclusion:

The robots.txt file can play an important role in your site's ranking on search engines. That's why you need to manage this simple file with care.

Don't hesitate to share your comments and questions!
