# ═══════════════════════════════════════════════════════════════
# Ghost in the Codex: robots.txt (Faro Configuration)
# Updated: 2026-06-27
# ═══════════════════════════════════════════════════════════════

# ── Section 1: Block Training Crawlers ──
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

# ── Section 2: Allow AI Search / RAG Indexers ──
User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: Claude-SearchBot
Allow: /
Crawl-delay: 2

User-agent: Claude-User
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Perplexity-User
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: Applebot-Extended
Allow: /

# ── Section 3: Allow Traditional Search Engines ──
User-agent: Googlebot
Allow: /
Crawl-delay: 1

User-agent: Bingbot
Allow: /
Crawl-delay: 1

User-agent: DuckDuckBot
Allow: /
Crawl-delay: 1

User-agent: Applebot
Allow: /
Crawl-delay: 1

User-agent: YandexBot
Allow: /
Crawl-delay: 1

User-agent: Baiduspider
Allow: /
Crawl-delay: 1

# ── Section 4: Allow Social / Reference Previews ──
User-agent: facebookexternalhit
Allow: /

User-agent: Meta-ExternalFetcher
Allow: /

User-agent: Google-NotebookLM
Allow: /

User-agent: MistralAI-User
Allow: /

User-agent: OAI-AdsBot
Allow: /

User-agent: Copilot
Allow: /

# ── Section 5: Catch-all: block everything else ──
# Exception: llms-full.txt (single-file RAG corpus) stays open to every
# crawler matched by this group. The two "User-agent: *" groups are merged
# into one so parsers that read only the first wildcard group still see it.
User-agent: *
Allow: /llms-full.txt
Disallow: /

Sitemap: https://ghostinthecodex.com/sitemap.xml
Sitemap: https://ghostinthecodex.com/sitemap_index.xml
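
# ── Note: llms-full.txt and the blocked training crawlers ──
# A crawler that matches a named group follows only that group and ignores
# the wildcard (*) group, so GPTBot and ClaudeBot do not pick up the
# /llms-full.txt exception from the catch-all. The commented-out group below
# is a sketch only, assuming the intent is to open the corpus file to those
# two crawlers as well. It is not enabled here; if adopted, it would replace
# their existing groups in Section 1 rather than sit alongside them.
#
# User-agent: GPTBot
# User-agent: ClaudeBot
# Allow: /llms-full.txt
# Disallow: /
#
# The longer Allow path is the more specific match, so it takes precedence
# over "Disallow: /" for that one file while the rest of the site stays blocked.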