This guide explains how to manage which internet bots crawl your site.
Why create a robots.txt file?
A web robot’s primary job is to scan websites and pages for information. These bots work tirelessly, collecting data on behalf of search engines and other applications.
Whether you want to fine-tune access to your site or keep a development site from showing up in Google results, a robots.txt file gives you that control. Once implemented, the robots.txt file tells web crawlers which parts of your site they may collect information from.
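A minimal robots.txt, placed at the root of your site, looks like this. It uses the standard robots exclusion syntax and blocks all crawlers from all pages (a common setup for a development site):

```
User-agent: *
Disallow: /
```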
Let’s break down the code above. “User-agent” refers to the web crawlers, and the asterisk (*) means all of them. In other words, the first line grabs attention by saying, “Listen up, all web crawlers!”
The second line gives the web crawler its directions. The forward slash (/) stops bots from crawling any of the pages on your site. You can also block information collection for one specific page by listing that page’s path instead.
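For instance, to keep crawlers away from a single page rather than the whole site, put that page’s path after Disallow (the path /thank-you.html here is just an illustration):

```
User-agent: *
Disallow: /thank-you.html
```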
There are many types of web crawlers (aka user-agents) that can be specified. Below is a chart of the most popular web crawlers and the services they belong to.
The top 10 bots you should know about.
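To give directions to one of these crawlers instead of all of them, name it in the User-agent line. As a sketch (Googlebot stands in for whichever bot you want to target, and /drafts/ is a hypothetical directory), the rules below block only Google’s crawler from that directory while leaving every other bot unrestricted — an empty Disallow line means nothing is blocked:

```
User-agent: Googlebot
Disallow: /drafts/

User-agent: *
Disallow:
```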