Generate a robots.txt file to control how search engines crawl your website. Choose from presets or customise rules for specific bots.
The robots.txt file is one of the most important technical SEO files on any website. It lives at the root of your domain and provides instructions to search engine crawlers about which parts of your site they should and shouldn't access. While it's a simple text file, getting it wrong can accidentally block your entire site from being crawled.
Every website should have a robots.txt file, even if it just allows everything. Search engines like Google check for this file before crawling your site. If it's missing, crawlers will assume they can access everything. If it contains errors, crawlers may be blocked from important pages or waste their crawl budget on unimportant ones.
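The simplest valid robots.txt allows every crawler to access everything. An empty Disallow line means "nothing is disallowed":

```
# Allow all crawlers to access the entire site
User-agent: *
Disallow:
```

A common mistake is writing `Disallow: /` here instead, which does the opposite and blocks the whole site.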
For African businesses and developers, proper robots.txt configuration is especially important for managing crawl budget efficiently. If your site is hosted in Africa, response times to Google's crawlers (which are primarily based in the US and Europe) may be higher, making crawl budget management even more critical. Block unnecessary paths like admin panels, search result pages, and staging areas to ensure Google focuses its crawl budget on your most important pages.
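A crawl-budget-focused file might look like the sketch below. The paths are examples only; substitute the admin, search, and staging paths your own site actually uses, and the sitemap URL for your domain:

```
User-agent: *
Disallow: /admin/
Disallow: /search
Disallow: /staging/
Allow: /

Sitemap: https://example.com/sitemap.xml
```

The optional Sitemap line helps crawlers find your important pages directly instead of discovering them link by link.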
In 2024-2025, blocking AI training bots has become a major concern. User agents like GPTBot (OpenAI), CCBot (Common Crawl, whose archives are widely used for AI training), and Google-Extended (Google's token controlling whether your content trains its AI models) collect website content for AI training. Many publishers and businesses now specifically block these while continuing to allow search engine indexing. Our generator makes it easy to add these rules.
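A typical opt-out file blocks the AI user agents by name while leaving ordinary search crawlers untouched:

```
# Block AI training bots
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /

# All other crawlers (including Googlebot) may crawl normally
User-agent: *
Allow: /
```

Because Google-Extended is a separate token from Googlebot, blocking it does not affect your normal search rankings.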
The robots.txt file must be placed at the root of your domain: https://example.com/robots.txt. It only works at the root level — placing it in a subdirectory (like /blog/robots.txt) will not work. Upload it via FTP, your hosting file manager, or your CMS settings.
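Once your file is live, you can sanity-check your rules with Python's standard library robots.txt parser. This sketch parses a sample rule set from a string; the paths and domain are placeholders:

```python
from urllib.robotparser import RobotFileParser

# Sample rules, exactly as they would appear in robots.txt
rules = """\
User-agent: *
Disallow: /admin/
Allow: /
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

# can_fetch(user_agent, url) answers: may this bot crawl this URL?
print(rp.can_fetch("Googlebot", "https://example.com/admin/login"))  # False
print(rp.can_fetch("Googlebot", "https://example.com/blog/post"))    # True
```

To test your deployed file instead, call `rp.set_url("https://example.com/robots.txt")` followed by `rp.read()` before checking paths.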
Not exactly. Blocking a page in robots.txt prevents Google from crawling it, but the URL may still appear in search results (without a description) if other pages link to it. To fully remove a page from Google, use the "noindex" meta tag instead, and make sure the page is NOT blocked in robots.txt (Google needs to crawl the page to see the noindex tag).
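To fully deindex a page, the tag goes in that page's HTML head, and the page must remain crawlable:

```html
<!-- In the <head> of the page you want removed from search results -->
<meta name="robots" content="noindex">
```

For non-HTML files such as PDFs, the equivalent is the `X-Robots-Tag: noindex` HTTP response header, configured on your server.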