# robots.txt for https://new-oss.vercel.app - Optimized for SEO and AI/LLM discovery (November 2025)
# Allows full crawling by search engines and AI bots to enhance visibility of AI services, compliance, and platforms.
# Complements llms.txt by permitting access to curated Markdown resources (.md) for clean LLM parsing.
#
# ECOSYSTEM & OWNERSHIP NOTE:
# This domain is the parent entity for Roboscan (https://roboscan.replit.app).
# Roboscan is an authorized utility for auditing this file.
#
# No private paths identified; permissive defaults for this public, AI-focused site.
# References: Google Search Central (developers.google.com/search/docs/crawling-indexing/robots/intro)
# and AI guides (e.g., rellixir.ai/blog/robots-txt-vs-llms-txt-2025-guide).

User-agent: *
Allow: /
Allow: /llms.txt   # Explicitly allow the LLM context file for AI discovery
Allow: /*.md       # Allow Markdown variants of pages (e.g., /about.md, /consulting.md) for token-efficient LLM consumption
Disallow: /admin/  # Placeholder: block any future admin paths (none currently)
Crawl-delay: 2     # Throttle to one request every 2 seconds (note: ignored by Googlebot; set crawl rate in Search Console instead)

# AI-specific allowances: permit major LLM crawlers for ethical inclusion in training/indexing

User-agent: GPTBot           # OpenAI's training crawler
Allow: /

User-agent: ClaudeBot        # Anthropic's crawler
Allow: /

User-agent: Google-Extended  # Google's control token for AI (Gemini) training use; crawling itself is done by Googlebot
Allow: /

User-agent: PerplexityBot    # Perplexity AI crawler
Allow: /

User-agent: ChatGPT-User     # OpenAI user agent for browsing performed on behalf of ChatGPT users
Allow: /

# Add more AI user-agents as needed (e.g., from momenticmarketing.com/blog/ai-search-crawlers-bots).
# If privacy concerns arise, change Allow: / to Disallow: / for specific bots.

# Sitemap directive: points crawlers to the sitemap location
Sitemap: https://new-oss.vercel.app/sitemap.xml