Cloudflare has accused Perplexity AI of using stealth crawlers to evade website blocks and scrape content from sites that had explicitly disallowed its bots. Following customer complaints about persistent attempts to access restricted content, the company removed Perplexity from its verified bots program and rolled out new anti-scraping defenses. In Cloudflare's tests, Perplexity's crawlers kept requesting content from domains even after explicit blocking measures were in place, using undisclosed IP addresses and impersonating legitimate browsers to evade detection. Cloudflare contrasted this behavior with OpenAI’s practices, which it said respect website directives. In response, Cloudflare has updated its managed rules and is developing tools to further hinder deceptive scraping tactics. The controversy has highlighted broader concerns about AI companies’ data extraction practices and their impact on web traffic and resource access.
