Microsoft has announced an innovative technology that can implement LLM with 1-bit operations.

In an era where you need a power plant's worth of electricity to run a single model and a GPU can cost as much as a car, the advancement of artificial intelligence has become a war of capital. It is not strange to speak of a kind of GPU mafia: a handful of companies that control the huge data centers and power grids.

Last week, however, Microsoft quietly released a framework called bitnet.cpp: an end-to-end 1-bit inference framework that lets large language models (LLMs) run quickly on CPUs, without GPUs.

The idea behind BitNet is simple. Do we really need 16-bit or 32-bit arithmetic? Couldn't we get by with 1-bit operations?

It is an idea that returns to the essence of the computer, which represented the world with 1s and 0s. This model does not just shrink the numbers; it asks again about the philosophy of computation: for what last percentage points of accuracy are we burning thousands of watts?

BitNet compresses the world into an odd-sounding 1.58 bits (log2 3 ≈ 1.58, the information content of a three-valued weight). Each weight is expressed as one of three values (-1, 0, +1), and activations are converted to 8-bit integers. Do this, and the model is no longer a two-gigabyte monster but a small, lean 400-megabyte engine. It runs on desktops and laptops, and power consumption drops to roughly a tenth.
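To make the 1.58-bit idea concrete, here is a minimal sketch of ternary weight quantization with an absmean scale (the scheme described for BitNet b1.58) plus int8 activation quantization. Function names and the per-tensor scaling choice are illustrative assumptions, not the bitnet.cpp implementation.

```python
import numpy as np

def absmean_ternary_quantize(W, eps=1e-6):
    """Quantize weights to {-1, 0, +1} (~1.58 bits per weight):
    scale by the mean absolute value, round, clip to [-1, 1]."""
    scale = np.mean(np.abs(W)) + eps           # per-tensor absmean scale
    Wq = np.clip(np.round(W / scale), -1, 1)   # ternary values
    return Wq.astype(np.int8), scale

def int8_activation_quantize(x, eps=1e-6):
    """Quantize activations to 8-bit integers via absmax scaling."""
    s = np.max(np.abs(x)) / 127.0 + eps
    return np.clip(np.round(x / s), -128, 127).astype(np.int8), s

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8)).astype(np.float32)
Wq, w_scale = absmean_ternary_quantize(W)
xq, x_scale = int8_activation_quantize(W[0])
```

Because each weight carries at most ~1.6 bits instead of 16, the packed weight memory shrinks by roughly a factor of ten, which is consistent with the two-gigabyte-to-400-megabyte figure above; the matrix multiply also reduces to additions and subtractions, since multiplying by -1, 0, or +1 needs no multiplier.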

The combination of bitnet.cpp and native 1-bit training is likely to be a turning point for CPU inference. The commercial threshold for small-model inference has been crossed, and the need for GPUs appears significantly reduced for edge, on-prem, and cost-sensitive workloads.

However, high-quality output, long-context processing, and retraining of large-scale models clearly remain GPU territory. A hybrid strategy that combines CPUs (lightweight inference) with GPUs (high-end inference and training) is therefore the most reasonable approach, and adopting it is especially important for optimizing the cost structure of multi-agent systems that depend on expensive large-scale LLM calls.
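The hybrid strategy can be sketched as a simple request router. Everything here is hypothetical: `cpu_infer` and `gpu_infer` stand in for a local 1-bit CPU model and a GPU-hosted model, and the context-length threshold is an arbitrary placeholder, not a recommendation.

```python
def route(prompt, context_tokens, cpu_infer, gpu_infer, max_cpu_context=2048):
    """Send short, lightweight requests to a 1-bit CPU model;
    send long-context or high-quality requests to a GPU model."""
    if context_tokens <= max_cpu_context:
        return cpu_infer(prompt)   # cheap path: local ternary model
    return gpu_infer(prompt)       # expensive path: full-precision model

# Illustrative usage with stub backends:
answer = route("summarize this note", 300,
               cpu_infer=lambda p: "cpu-answer",
               gpu_infer=lambda p: "gpu-answer")
```

In a multi-agent system the same idea applies per agent call: routine tool-use and classification steps go to the CPU model, while final long-context synthesis goes to the GPU model, which is where the cost savings accumulate.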

Source: https://arxiv.org/abs/2504.12285

Published by tslaaftermarket