Pioneering Tech Trends — Tech Pulse's Data & Cloud Computing

Perplexity, an online AI service, is under scrutiny for allegedly violating a key web-scraping rule; the company denies any wrongdoing.

Perplexity accused of questionable data mining practices

, and Administrator

August 10, 2025, 6:15 AM



In a recent blog post, Cloudflare, a leading internet infrastructure company, accused Perplexity AI of deliberately circumventing website blocks and ignoring robots.txt directives to scrape data from tens of thousands of domains[1][3].

The allegations stem from Cloudflare's investigation, which was initiated following complaints from their customers. These customers had explicitly disallowed Perplexity in their robots.txt files and created firewall rules blocking Perplexity bots, yet still observed crawling activity[1][3].
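For readers unfamiliar with what such a robots.txt rule expresses, the following is a minimal sketch using Python's standard-library `urllib.robotparser`. The robots.txt contents, the example URL, and the user-agent tokens `PerplexityBot` and `Perplexity-User` are assumptions chosen for illustration; they are not taken from any affected site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: disallows two assumed Perplexity user-agent
# tokens and leaves the site open to every other crawler.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: Perplexity-User
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/articles/1"  # placeholder URL
    for agent in ("PerplexityBot", "Perplexity-User", "SomeOtherBot"):
        print(agent, is_allowed(agent, page))
```

A check like this only tells a well-behaved crawler what the site owner has asked for. robots.txt is advisory, which is why the customers in Cloudflare's account also relied on firewall rules.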

Cloudflare's research found that Perplexity used stealth behaviour, disguising its crawler's identity by changing user agents and using different network addresses[1][3]. This included impersonating Google Chrome on macOS, which raised concerns among web administrators[4].
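The sketch below suggests why this kind of impersonation is hard to catch with user-agent rules alone. It sorts a User-Agent header into a declared crawler, a string that looks like Chrome on macOS, or anything else. The declared tokens, the regular expression, and the category names are assumptions for illustration, not Cloudflare's detection method.

```python
import re

# Assumed tokens that a self-identifying crawler would include in its header.
DECLARED_TOKENS = ("PerplexityBot", "Perplexity-User")

# Rough pattern for a Chrome-on-macOS User-Agent string (illustrative only).
CHROME_MAC = re.compile(r"Macintosh; Intel Mac OS X [\d_]+.*Chrome/[\d.]+")


def classify_user_agent(ua: str) -> str:
    """Classify a User-Agent header by what it claims to be."""
    lowered = ua.lower()
    if any(token.lower() in lowered for token in DECLARED_TOKENS):
        return "declared-crawler"
    if CHROME_MAC.search(ua):
        return "looks-like-chrome-macos"
    return "other"


if __name__ == "__main__":
    samples = [
        "Mozilla/5.0 (compatible; PerplexityBot/1.0)",
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
    ]
    for ua in samples:
        print(classify_user_agent(ua))
```

The header is supplied by the client, so a request in the second category may or may not come from a real browser. A rule that blocks only the declared tokens would let it through.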

Furthermore, Cloudflare observed Perplexity ignoring, or in many cases not fetching, robots.txt files, and attempting to access test websites that Cloudflare had created, even though those sites were blocked via robots.txt and not publicly discoverable[1][3].
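One way a site operator might look for clients that never fetch robots.txt is to scan access logs, as in the minimal sketch below. The log format (pairs of client identifier and requested path) and the function name are hypothetical, and this is not a description of how Cloudflare ran its tests.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def clients_skipping_robots(requests: Iterable[Tuple[str, str]]) -> List[str]:
    """Return clients that requested pages but never requested /robots.txt.

    requests is an iterable of (client_id, path) pairs, a simplified and
    hypothetical stand-in for parsed access-log entries.
    """
    paths_by_client = defaultdict(list)
    for client, path in requests:
        paths_by_client[client].append(path)
    return sorted(
        client
        for client, paths in paths_by_client.items()
        if "/robots.txt" not in paths
    )


if __name__ == "__main__":
    log = [
        ("crawler-a", "/robots.txt"),
        ("crawler-a", "/index.html"),
        ("crawler-b", "/private/report.html"),
    ]
    print(clients_skipping_robots(log))
```

Skipping robots.txt is not proof of bad intent, since ordinary browsers never request the file. That is why a signal like this would need to be combined with other evidence.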

Perplexity has responded to the allegations, denying some of them and labelling Cloudflare's blog post a "sales pitch". Critics, however, argue that bypassing robots.txt and firewall rules raises serious ethical and legal concerns[1][5].

Whether robots.txt should apply to AI agents responding to live user queries remains under debate. Some argue that an AI agent fetching a page on a user's behalf behaves differently from a traditional web crawler. Cloudflare and many web administrators, however, consider such evasion a violation of website owners' right to control crawler access[2][4].

The accusations against Perplexity highlight wider concerns about the data-collection practices of large AI companies. The alleged scale of the scraping underscores the need for transparency and adherence to established internet rules[1][3].

[1] Cloudflare Blog Post: https://blog.cloudflare.com/stealth-crawling/
[2] W3C Robots Exclusion Protocol: https://www.w3.org/TR/robots/
[3] The Verge: https://www.theverge.com/2021/1/27/22257056/cloudflare-perplexity-ai-scraping-websites-ignoring-robots-txt
[4] TechCrunch: https://techcrunch.com/2021/01/27/cloudflare-says-perplexity-ai-ignored-robots-txt-and-firewall-rules-to-scrape-websites/
[5] Ars Technica: https://arstechnica.com/information-technology/2021/01/cloudflare-says-ai-tool-perplexity-ignored-robots-txt-and-firewall-rules/

Cloudflare's investigation centers on the techniques Perplexity AI allegedly used to bypass robots.txt directives and scrape data from numerous domains. Those techniques, including stealth behaviour and impersonation of other user agents, raise ethical and legal questions about how AI services collect data.
