Google has filed a federal lawsuit against SerpApi LLC, alleging that the Texas-based company violated the Digital Millennium Copyright Act by bypassing SearchGuard protections to scrape copyrighted content from search results. This presents a paradox, as Google itself scrapes billions of webpages for AI training and search indexing while seeking legal protection against those who scrape its own results.
The lawsuit, comprising 13 pages filed in the United States District Court for the Northern District of California, seeks statutory damages between $200 and $2,500 for each instance of circumvention. SerpApi handles hundreds of millions of automated queries daily, so the potential liability could be substantial. Google’s own scraping activities, by contrast, are far more extensive: it accesses nearly every publicly available webpage to support its search services and AI models, which often keep users within its ecosystem.
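To give a sense of scale for the per-instance damages range, the arithmetic can be sketched as below. This is a rough illustration only: the count of 100 million acts is a hypothetical round number, and treating every automated query as one separate act of circumvention is an assumption, not something the complaint or a court has established.

```python
# Illustrative arithmetic only. The number of acts and the idea that each
# query counts as one act of circumvention are assumptions for this sketch.
MIN_DAMAGES_USD = 200    # lower bound per act, as stated in the article
MAX_DAMAGES_USD = 2_500  # upper bound per act, as stated in the article


def statutory_exposure(acts: int) -> tuple[int, int]:
    """Return the (low, high) statutory damages range in USD for `acts` acts."""
    if acts < 0:
        raise ValueError("acts must be non-negative")
    return acts * MIN_DAMAGES_USD, acts * MAX_DAMAGES_USD


hypothetical_acts = 100_000_000  # assumed round figure, not from the complaint
low, high = statutory_exposure(hypothetical_acts)
print(f"${low:,} to ${high:,}")
```

Under those assumptions the range runs from $20 billion to $250 billion, which is why even a small fraction of the alleged query volume would matter.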
This current lawsuit echoes events from February 2014, when then-Google head of web spam Matt Cutts asked for examples of scraper sites. Dan Barker, a search marketing professional, responded by pointing out how Google’s Knowledge Panels displayed content taken from Wikipedia with similar text formatting. Fast forward to December 2025, and Google is now portraying itself as the victim of large-scale scraping.
The complaint underscores Google’s investment in SearchGuard, a protective technology that the company introduced in January 2025 after substantial development effort. SearchGuard uses JavaScript challenges to determine whether search queries originate from human users rather than automated systems.
Despite constructing barriers around its results, Google continues to extract content from publishers globally for AI purposes. Penske Media Corporation initiated a comprehensive antitrust lawsuit against Google in September 2025, alleging that Google coerces online publishers into providing content without compensation while simultaneously decreasing website traffic, which publishers rely on for revenue.
The Penske Media complaint outlines an untenable situation for publishers, who must choose between allowing their content to be used for AI training without payment or facing exclusion from search results that generate significant revenue. This lawsuit asserts that Google misuses its monopoly power in general search services to compel publishers to provide content, which Google then reuses in AI-generated responses.
Google’s data scraping for AI training exceeds what SerpApi allegedly extracts from search results. On December 9, 2025, the European Commission launched an antitrust investigation into whether Google breached EU competition rules by using content from publishers and creators for AI purposes without sufficient compensation or viable opt-out options. The investigation will examine whether Google imposed unfair conditions on these entities while securing privileged access to training data that competitors cannot reach.
The Commission’s announcement suggested that Google might have used publisher content for AI features without payment. Google introduced its “Google-Extended” controls in September 2023 to let publishers limit the use of their content for AI training, but critics consider these controls inadequate. Attempting to block AI training through technical means such as robots.txt files can lead to decreased traffic, which makes such options impractical for publishers.
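For readers unfamiliar with how such a control is expressed, here is a minimal sketch of a robots.txt rule that targets the Google-Extended token, checked with Python's standard-library parser. The file contents and the example.com URL are hypothetical, and the sketch only shows how the rules parse; it says nothing about how any crawler actually behaves or about the traffic effects publishers describe.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration: opt out of the Google-Extended
# token while leaving every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://example.com/article"  # placeholder URL
extended_allowed = parser.can_fetch("Google-Extended", url)
googlebot_allowed = parser.can_fetch("Googlebot", url)

print("Google-Extended allowed:", extended_allowed)
print("Googlebot allowed:", googlebot_allowed)
```

In this sketch the Google-Extended token is disallowed while Googlebot remains allowed, which illustrates why the opt-out is narrow: the rule is scoped to one token and leaves ordinary search crawling untouched.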
The economic imbalance generated by Google’s scraping practices is stark. Cloudflare’s CEO Matthew Prince noted during a CNBC interview that ten years ago, Google sent one visitor for every two pages it crawled. That ratio has worsened, now reaching one visitor per 15 pages scraped. Publishers must bear costs for content creation, hosting, and bandwidth, while Google benefits from zero-click searches that keep users within its platform.
Research from Ahrefs examining 300,000 keywords found that AI Overviews reduce organic clicks by 34.5% when they appear in search results. Nevertheless, in a recent podcast, Google SVP of Knowledge and Information Nick Fox asserted that publishers need not change how they think about content for AI search, and he dismissed suggestions for standardized licensing agreements to ensure fair compensation.
Julien Khaleghy founded SerpApi in 2017 after realizing that “scraping images from Google was an intensive process.” The company’s business model revolves around taking content from services that have incurred significant expense to produce it and delivering that content to others through paid subscriptions. It offers a “Google Search API” primarily for scraping Google results, targeting specific content blocks and listings.
Google estimates that SerpApi generates hundreds of millions of artificial search requests daily, with volume surging by as much as 25,000% in two years. These automated queries consume considerable computing resources without generating any offsetting revenue. Google’s Terms of Service prohibit automated access to its search content in violation of the machine-readable instructions on its webpages.
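Percentage increases of this size are easy to misread, so a short conversion may help. The sketch below turns a percentage increase into a growth multiplier; the function name is illustrative, and the baseline volume is not given in the article, so only the ratio is computed.

```python
def growth_multiplier(percent_increase: float) -> float:
    """Convert a percentage increase into the ratio of final to initial value."""
    return 1 + percent_increase / 100


# A 25,000% increase means the final volume is 251 times the starting volume.
multiplier = growth_multiplier(25_000)
print(multiplier)
```

In other words, an increase of up to 25,000% corresponds to request volume growing to as much as roughly 251 times its level two years earlier.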
The lawsuit claims that SerpApi developed methods to bypass the SearchGuard restrictions put in place in January 2025. Khaleghy described the process as creating fake browsers using numerous IP addresses that Google would accept as genuine users. According to the complaint, these techniques involve falsifying device information and syndicated authorizations to bypass security measures.
SerpApi promotes these circumvention capabilities to customers, assuring them they need not concern themselves with detection issues. A recent SerpApi blog post stated that the introduction of SearchGuard made web scraping more challenging, but claimed the company had been “minimally impacted” due to their existing solutions.
Google’s lawsuit is the second significant legal action taken against SerpApi in 2025. In October, Reddit sued SerpApi and other companies for circumventing both its own anti-scraping measures and Google’s SearchGuard to scrape Reddit content from search pages.
This scraping issue unfolds within broader industry tensions regarding access to content and AI training data. Over 80 media executives met in New York under the IAB Tech Lab banner in late July 2025 to discuss the perceived threats to digital publishing posed by AI companies. However, key AI firms were notably absent from this gathering.
Publishers increasingly see the unauthorized collection of AI training data as a major risk. Research showed that over 35% of top websites now block certain AI bots in response to rising scraping activity. Research identifying heavy scraping of publisher content has also highlighted AI’s reliance on fresh, human-generated content to maintain accuracy.
In one instance, a publishing company disclosed earning only $174 from AI crawlers over an extended period, illustrating the disparity: while AI firms, including Google, scrape vast amounts of content, publishers receive minimal compensation despite covering costs for content creation, hosting, and bandwidth.
Google’s lawsuit against SerpApi asserts violations of the DMCA’s anti-circumvention provisions and details how each act of circumvention may incur statutory damages. The complaint argues that SearchGuard constitutes a technological measure that protects copyrighted works. It cites instances where Google licenses content to enhance search results, including Knowledge Panels and listings featuring merchant-supplied information.
Nonetheless, publishers argue that Google applies significantly different standards to its own scraping practices. Recipe creators confronted Google in December 2025 over AI features that displayed complete recipes, some of them incorrect, along with photos lacking proper credit.
The lawsuit seeks to compel SerpApi to stop bypassing technological measures and to destroy any related technology. Additionally, Google requests damages for each violation or alternative damages based on its losses and SerpApi’s profits.
Google’s dual role as both a prolific scraper of web content and a litigious defender of its search results generates frustration across the digital publishing landscape. While Google invests in protections for the content it licenses from third parties, its scraping operations for AI training vastly overshadow those of SerpApi.
This lawsuit is significant for the marketing community, revealing power disparities in content distribution. Google holds a dominant position in general search services and can dictate terms to publishers while extracting their content without adequate compensation.
The Penske Media complaint details how Google bypasses publisher preferences, scrutinizing several product features that rely on content scraped from publisher websites. Publishers face a difficult decision: permit Google to use their content for AI or risk losing search visibility.
Overall, the trajectory of this lawsuit reflects Google’s efforts to safeguard its interests through legal frameworks while continuing extensive extraction of publisher content, raising critical issues about fairness and compensation within the ecosystem.

