TikTok parent company ByteDance has a tool that's scraping the web 25 times faster than OpenAI

What we know about ByteDance's web crawler and its massive appetite.
By Cecily Mauran
What is ByteDance and its aggressive web crawler up to? Credit: Jakub Porzycki / NurPhoto / Getty Images

TikTok parent company ByteDance is amassing huge volumes of web data far faster than the other major web crawlers.

ByteDance may be planning to release its own LLM, and is aggressively using its web crawler, "Bytespider," to scrape up data to train its models, Fortune reported.

Bytespider showed up on the scene in April, and since then, its rate of consumption has put web scrapers from OpenAI, Google, Meta, and Anthropic to shame.

Sam Crowther, CEO of Kasada, a company that specializes in bot management, told the outlet that Bytespider's scraping rate is 25 times that of OpenAI's GPTBot and 3,000 times that of ClaudeBot, Anthropic's web crawler for its Claude LLM. Crowther also said that Kasada's data has shown "huge spikes in scraping activity" from Bytespider in the last six weeks.

As Bytespider voraciously consumes the web, the U.S. government is trying to limit the Chinese government's potential access to American user data. In April, President Biden signed a bill that would ban TikTok unless ByteDance sells it within the year. Given the ticking clock ByteDance faces to sell TikTok, a sense of urgency would fit the massive rate of its web crawling activity. Whether that crawling is for an LLM, a better algorithm, or something else, we don't know.

What ByteDance plans to do with all of its newly mined data remains to be seen. But TikTok has launched several AI-powered features for the platform. In May, it announced a suite of tools for advertisers to create AI-generated ads, along with AI-generated avatars for brands and creators. TikTok is also rumored to be working on an internal search engine with results powered by AI, possibly using ChatGPT.

Cecily Mauran

Cecily is a tech reporter at Mashable who covers AI, Apple, and emerging tech trends. Before getting her master's degree at Columbia Journalism School, she spent several years working with startups and social impact businesses for Unreasonable Group and B Lab. Before that, she co-founded a startup consulting business for emerging entrepreneurial hubs in South America, Europe, and Asia. You can find her on Twitter at @cecily_mauran.

