Thunderbit Launches High-Fidelity Web Data API, MCP Server, and CLI

Thunderbit, an AI web data platform with over 100,000 users, launched its developer API, Model Context Protocol (MCP) server, and CLI, giving developers new ways to turn complex, long-tail websites into clean Markdown or structured data for AI agents, RAG pipelines, and automation workflows.

At the center of the launch is Thunderbit Distill, an adaptive HTML-to-Markdown engine designed for high-fidelity conversion of complex web pages. In the company's internal HTML-to-Markdown evaluations, Distill scored 0.87 ROUGE-L. Thunderbit says the engine produces cleaner, more complete Markdown across product pages, pricing tables, directories, search results, reviews, and other page types, without requiring site-specific rules.
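For readers unfamiliar with the metric, ROUGE-L scores a candidate text against a reference using the longest common subsequence (LCS) of their tokens. The sketch below is a generic illustration only: it assumes whitespace tokenization and a balanced F1 score, and the function names are made up for this example. The release does not say which ROUGE-L variant, tokenizer, or reference Markdown Thunderbit used.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """Balanced ROUGE-L F1 over whitespace tokens (an assumed variant)."""
    ref = reference.split()
    cand = candidate.split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)


# Three of four tokens line up in order, so precision = recall = 0.75.
score = rouge_l_f1("a b c d", "a x c d")
```

A score of 1.0 means the candidate reproduces the reference token sequence exactly. Lower scores mean tokens are missing, extra, or out of order.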

Thunderbit uses AI models rather than fixed parsing rules to identify meaningful page content, then strips out navigation, scripts, ads, and boilerplate so that LLMs and databases receive less noisy input.

Thunderbit also introduced Extract, which returns structured JSON or CSV from a URL using a developer-defined schema. Together, the two products cover both output types: Distill produces Markdown for AI agents, RAG, knowledge bases, and content ingestion, while Extract produces structured data for databases, spreadsheets, enrichment jobs, and internal tools.
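To make the idea of schema-driven output concrete, here is a minimal sketch of how a downstream consumer might check JSON records against a schema and flatten them to CSV. Everything in it is an assumption for illustration: the schema format, the field names, and the sample payload are hypothetical, and none of it reflects Thunderbit's actual API or response format, which the release does not describe.

```python
import csv
import io
import json

# Hypothetical schema: field name -> expected Python type.
PRODUCT_SCHEMA = {"name": str, "price": float, "in_stock": bool}


def conforms(record, schema):
    """True if the record has exactly the schema's fields with matching types."""
    return set(record) == set(schema) and all(
        isinstance(record[field], expected) for field, expected in schema.items()
    )


def to_csv(records, schema):
    """Serialize records to CSV text, with columns in schema order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(schema), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Made-up payload standing in for structured output from some extraction step.
payload = '[{"name": "Widget", "price": 19.99, "in_stock": true}]'
records = json.loads(payload)
valid = [r for r in records if conforms(r, PRODUCT_SCHEMA)]
csv_text = to_csv(valid, PRODUCT_SCHEMA)
```

Validating against the schema before loading keeps malformed records out of a database or spreadsheet, whichever tool produced them.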

“AI agents are only as useful as the web data they can actually reach,” said Shuai Guan, Co-founder and CEO of Thunderbit. “We built Thunderbit to turn changing web pages into data that software can use reliably.”

Traditional scraping pipelines often rely on CSS selectors, XPath, or site-specific parsing rules that can break when layouts change. Thunderbit is built to understand page semantics and adapt to changing structure, helping developers get cleaner, more complete output without maintaining custom scrapers for every site.
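The brittleness described above is easy to reproduce. The following toy extractor, written with Python's standard-library `html.parser`, finds a price by matching a fixed class name, which is roughly what a CSS selector such as `.price` does. The class names and HTML snippets are invented for this example, and the parser ignores complications such as unclosed void tags.

```python
from html.parser import HTMLParser


class TextByClass(HTMLParser):
    """Collects the text inside any tag whose class attribute equals target_class."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self._depth = 0      # nesting depth inside a matched element
        self.matches = []

    def handle_starttag(self, tag, attrs):
        if self._depth:
            self._depth += 1
        elif dict(attrs).get("class") == self.target_class:
            self._depth = 1
            self.matches.append("")

    def handle_endtag(self, tag):
        if self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth:
            self.matches[-1] += data


def extract_price(html, target_class="price"):
    """Return the first matched text, or None when the fixed rule finds nothing."""
    parser = TextByClass(target_class)
    parser.feed(html)
    parser.close()
    return parser.matches[0].strip() if parser.matches else None


old_layout = '<div class="product"><span class="price">$19.99</span></div>'
new_layout = '<div class="product"><span class="amount">$19.99</span></div>'

before = extract_price(old_layout)  # the rule matches the original markup
after = extract_price(new_layout)   # same content, renamed class: no match
```

The page still shows the same price after the redesign, but the fixed rule returns nothing. That is the maintenance burden the release says semantic extraction is meant to avoid.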

The launch extends Thunderbit beyond its no-code Chrome extension and web app, which sales, ecommerce, research, and operations teams use to extract data from tens of millions of pages every month. Developers can now bring the same adaptive extraction engine into AI applications, automated workflows, and internal systems.

The post Thunderbit Launches High-Fidelity Web Data API, MCP Server, and CLI first appeared on PressReleaseCC.
