##############################
# robots.txt for [YourSite]
# Purpose: Block bad bots, allow search engine indexing with limits
##############################

### SECTION: Major Search Bots ###

User-agent: Googlebot
Disallow: /expensive-endpoint/
Disallow: /cgi-bin/
Disallow: /tmp/
Allow: /

User-agent: bingbot
Crawl-delay: 10
Disallow: /expensive-endpoint/

User-agent: Yandex
Disallow: /

User-agent: Mediapartners-Google
Disallow: / # Only block if you're not using AdSense

### SECTION: Known Site Scrapers and Bad Bots ###

User-agent: UbiCrawler
Disallow: /

User-agent: DOC
Disallow: /

User-agent: Zao
Disallow: /

User-agent: sitecheck.internetseer.com
Disallow: /

User-agent: Zealbot
Disallow: /

User-agent: MSIECrawler
Disallow: /

User-agent: SiteSnagger
Disallow: /

User-agent: WebStripper
Disallow: /

User-agent: WebCopier
Disallow: /

User-agent: Fetch
Disallow: /

User-agent: Offline Explorer
Disallow: /

User-agent: Teleport
Disallow: /

User-agent: TeleportPro
Disallow: /

User-agent: WebZIP
Disallow: /

User-agent: linko
Disallow: /

User-agent: HTTrack
Disallow: /

User-agent: Microsoft.URL.Control
Disallow: /

User-agent: Xenu
Disallow: /

User-agent: larbin
Disallow: /

User-agent: libwww
Disallow: /

User-agent: ZyBORG
Disallow: /

User-agent: Download Ninja
Disallow: /

User-agent: grub-client
Disallow: /

User-agent: k2spider
Disallow: /

User-agent: NPBot
Disallow: /

User-agent: WebReaper
Disallow: /

User-agent: DataForSeoBot
Disallow: /

### SECTION: AI Bots & Crawlers (Resource Intensive) ###

User-agent: ChatGPT-User
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: Claude-Web
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: cohere-ai
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Diffbot
Disallow: /

User-agent: SemrushBot
Disallow: /

User-agent: omgili
Disallow: /

User-agent: FacebookBot
Disallow: /

User-agent: GoogleOther
Disallow: /

User-agent: Google-Extended
Disallow: /

### SECTION: Default Catch-All ###

User-agent: *
Crawl-delay: 10
Disallow: /expensive-endpoint/

### (Optional) Sitemap ###
# Sitemap: https://www.yoursite.com/sitemap.xml