Игорь Градов

Startup Subquadratic claims its LLMs are 12 times faster: what this means for the market

Miami-based AI startup Subquadratic came out of stealth mode last month with a huge claim. It announced that it had solved a mathematical bottleneck that had been holding back large language models for almost a decade.

The details were thin, and many people were unconvinced. But Subquadratic has started to bring the receipts, sharing the results of an independent evaluation of its new tech. The results suggest that the company's claims might be worth paying attention to.

According to Subquadratic, it has developed a new kind of LLM, called SubQ, that is faster and cheaper and uses a lot less energy than any other model on the market. The company also claims that SubQ can process up to 12 times as much text at once as most other models, allowing it to carry out a range of data-heavy tasks, such as analyzing hundreds of documents or entire code bases.

What's more, Subquadratic says, SubQ does this while more or less matching the performance of the best models put out by Google DeepMind, OpenAI, and Anthropic on key tasks like coding.

The problem was that the company at first provided little evidence for its claims beyond a handful of self-published test scores. And it has yet to make SubQ widely available for people to try out themselves.

So it's no surprise that Subquadratic's claims were met with skepticism. Dan McAteer, an artificial intelligence engineer, captured the overall response on X: "SubQ is either the biggest breakthrough since the Transformer ... or it's AI Theranos."

A month on, the company has published more information about its model, including the results of additional independent tests run by third-party firm Appen.

"We expected healthy skepticism," says Subquadratic cofounder and chief technology officer Alex Whedon. "In hindsight, releasing the third-party benchmarks alongside the initial announcement would have preempted much of the skepticism, which is why we're taking the time to make sure any future results are fully verified before putting them out."

Subquadratic asked Appen, which evaluates other companies' models, to run its tests on SubQ. The results seem to back up a lot of Subquadratic's claims. "That was really exciting to me, it validated their architecture," says Jeanine Sinanan-Singh, Appen's director of generative AI research.

"I was like, 'Wow, this could be a game changer,' because models struggle with speed and inefficiency," she adds. "But when you have kind of shocking results, it's really not as credible when you say it yourself."

SubQ won't replace existing top models across the board, but it could offer huge increases in speed at a fraction of the typical cost for certain tasks. Subquadratic insists that in the long run, though, its breakthrough could change how LLMs are built. "We hope we're kicking off a new age of efficiency," says Justin Dangel, the firm's cofounder and CEO. "We don't think anybody will be building on transformers in a few years."

Attention!

To understand why Subquadratic's claims are a big deal, let's dig into how most LLMs work. The key mechanism inside an LLM is a type of neural network called a transformer, which runs a process known as dense attention. Today's LLMs typically chain together multiple transformers. (The foundational paper of the LLM era, published by researchers at Google in 2017, was titled "Attention Is All You Need.")

Dense attention works like this: When a transformer processes a chunk of text, it first encodes each word (or part of a word, known as a token) with a number. To capture the meaning of the full text, it then multiplies each of those numbers with every other number for that text. For example, a piece of text 10,000 words long would kick off almost 50 million individual multiplications. That's a lot of computation and the main reason that LLMs are notorious power hogs.
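To make the every-token-with-every-token step concrete, here is a minimal sketch in plain Python. It is a simplification under stated assumptions, not how any production model is written: the function name `dense_attention_weights` and the toy encodings are made up for illustration, each token is given a short list of numbers rather than a single one, and the learned projections that real transformers apply before comparing tokens are left out.

```python
import math

def dense_attention_weights(vectors):
    """Return an n x n table of attention weights for n token vectors.

    Illustrative only: real transformers first pass each token through
    learned query and key projections, which this sketch skips.
    """
    n = len(vectors)
    dim = len(vectors[0])
    weights = []
    for i in range(n):
        # Score token i against every token j, itself included.
        scores = [
            sum(a * b for a, b in zip(vectors[i], vectors[j])) / math.sqrt(dim)
            for j in range(n)
        ]
        # Softmax turns the raw scores into weights that sum to 1.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# Made-up two-number encodings for three tokens (hypothetical values).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = dense_attention_weights(tokens)
for row in weights:
    print([round(w, 3) for w in row])
```

The inner loop runs once for every pairing of tokens, so the table has n rows and n columns. That full table is where the cost of dense attention comes from.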

"If you want to summarize The Great Gatsby, you have to look at the first word and the last word together, and then you have to look at every other combination," says Dangel.

As the length of the text increases, the number of computations skyrockets. That's because each additional number must be multiplied by all other previous numbers. Double the number of words, and you roughly quadruple the number of computations, a rate of increase known as a quadratic expansion.

(You can picture this yourself: Draw a circle and mark dots around its edge. Each dot is a token. Then draw lines between pairs of dots to represent the multiplication of those two tokens. A circle with five dots will have 10 lines crossing it. Make it 10 dots and you will have 45 lines, 20 dots and you will have 190 lines, and so on.)
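The dots-and-lines picture can be checked with a few lines of Python. The helper name `pair_count` is ours, chosen for illustration; it simply counts the distinct pairs among n tokens, which is n × (n − 1) / 2.

```python
def pair_count(n_tokens):
    """Number of distinct pairs among n_tokens tokens: n * (n - 1) / 2."""
    return n_tokens * (n_tokens - 1) // 2

# The circle examples, plus the 10,000-word text mentioned earlier.
for n in (5, 10, 20, 10_000):
    print(f"{n:>6} tokens -> {pair_count(n):>11,} pairs")

# Doubling the length roughly quadruples the work.
ratio = pair_count(20_000) / pair_count(10_000)
print(f"doubling 10,000 tokens multiplies the pairs by about {ratio:.2f}")
```

The counts match the figures in the text: 10, 45, and 190 lines for the circles, and 49,995,000 pairs (almost 50 million) for a 10,000-word text.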

Slashing costs

Subquadratic's solution is to ditch dense attention, the core operation of a transformer, in favor of what's known as sparse attention, which slashes the number of computations needed. Instead of multiplying the number assigned to each token by every other number, sparse attention selects just some of the numbers to multiply. The idea is that not all relationships between words in a piece of text matter.

"Sparse attention says not all of those relationships are important, because they're not," says Whedon. "If you're reading a book, you're not going to look at the first and second words, first and third—that's insane."

It's a simple approach, and Subquadratic is not the first to try it. "Pretty much everything under the sun has been attempted," says Will Depue, an independent AI researcher who previously worked at OpenAI. "It's not impossible, but it's akin to running a four-minute mile."

Previous techniques for selecting which numbers to multiply and which to ignore have not produced a mechanism that can capture the meaning of a document as well as dense attention can.
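As an illustration of the trade-off, here is a sketch of one simple, well-known selection rule: a sliding window, in which each token is compared only with its nearest neighbors. This is not Subquadratic's method, which the company has not published. The function names and the window size of 128 are assumptions chosen for the example.

```python
def dense_pairs(n_tokens):
    """Every token paired with every other token."""
    return n_tokens * (n_tokens - 1) // 2

def windowed_pairs(n_tokens, window):
    """Pairs kept when tokens are compared only if they sit at most
    `window` positions apart (a sliding-window rule, used here purely
    as an example of sparse attention)."""
    # Token i looks back at up to `window` earlier tokens.
    return sum(min(window, i) for i in range(n_tokens))

def is_linked(i, j, window):
    """True if tokens at positions i and j are compared under the rule."""
    return abs(i - j) <= window

n, window = 10_000, 128  # hypothetical text length and window size
print(f"dense:    {dense_pairs(n):,} pairs")
print(f"windowed: {windowed_pairs(n, window):,} pairs")
print("first and last token compared?", is_linked(0, n - 1, window))
```

The windowed rule keeps a small fraction of the pairs, and its cost grows in step with the length of the text instead of with its square. The price is visible in the last line: the first and last tokens are never compared directly, which is the kind of lost connection that can hurt quality.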

Subquadratic claims to have cracked the problem at last. It pitches SubQ as the first large language model to demonstrate that sparse attention can match the quality of dense attention while dramatically cutting costs.

"If the claims prove true, that's a very big deal," says Depue. "I want to see more, but I don't think we should write them off."

Cautious optimism

Subquadratic's early results are promising, but the picture is still incomplete. Many researchers want to see how SubQ performs on a wider range of tasks. The company says it plans to publish a technical paper soon and make SubQ available through an API.

"This is a moment that demands cautious optimism," says Sinanan-Singh. "We need more evidence, but what we've seen so far is encouraging."

Subquadratic has raised an undisclosed amount of seed funding. The company has not revealed its valuation or the names of its investors.

Source: MIT Technology Review
Link: https://www.technologyreview.com/2025/06/20/1115282/a-startup-claims-it-broke-through-a-bottleneck-thats-holding-back-llms/
