Microsoft has announced an innovative technology that can run LLMs with 1-bit operations.

We live in an era where a power plant must be built to run a single model, and a GPU can cost as much as a car. The advancement of artificial intelligence has quickly become a war of capital. It would not be strange to call the few companies that control huge data centers and power grids a kind of GPU mafia.

However, Microsoft quietly announced a framework called bitnet.cpp last week. It released the entire 1-bit computation framework, which allows large language models (LLMs) to run quickly on CPUs without GPUs.

The idea of BitNet is simple. Do we really need 16-bit, 32-bit operations? Can’t we use 1-bit operations?

It is an idea that returns to the essence of the computer, which represents the world with 1 and 0. This model does not just shrink the numbers; it asks again about the philosophy of computation. How many percentage points of accuracy are we buying with a few thousand watts?

BitNet compresses the world into the odd figure of 1.58 bits. Each weight takes one of three values (-1, 0, +1), and three possible values carry log2(3) ≈ 1.58 bits of information. Activations are converted to 8-bit integers. Done this way, the model is no longer a two-gigabyte monster but a small, sturdy engine of about 400 megabytes. It runs on desktops and laptops. Power consumption drops to roughly one-tenth, and it is still fast.
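To make the three-value idea concrete, here is a minimal sketch in plain Python. It assumes the absmean scheme for weights and the absmax scheme for activations; the function names are made up for illustration and are not the bitnet.cpp API. The 2-billion parameter count is also an assumption, taken from the model in the linked report rather than from this post.

```python
import math

def quantize_weights_ternary(weights, eps=1e-8):
    """Absmean quantization: divide by mean |w|, round, clip to {-1, 0, +1}."""
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    ternary = [max(-1, min(1, round(w / scale))) for w in weights]
    return ternary, scale

def quantize_activations_int8(activations, eps=1e-8):
    """Absmax quantization: map the largest |x| to 127, round to integers."""
    scale = max(abs(x) for x in activations) + eps
    q = [max(-128, min(127, round(x * 127 / scale))) for x in activations]
    return q, scale / 127  # second value converts integers back to floats

def ternary_dot(ternary, q_acts):
    """Dot product with ternary weights: additions and subtractions only."""
    total = 0
    for w, x in zip(ternary, q_acts):
        if w == 1:
            total += x
        elif w == -1:
            total -= x
        # w == 0 contributes nothing, so it is skipped entirely
    return total

# Three possible values carry log2(3) bits of information per weight.
bits_per_weight = math.log2(3)

# Assumption: roughly 2 billion weights, as in the linked report.
n_params = 2_000_000_000
ideal_size_mb = n_params * bits_per_weight / 8 / 1e6

weights = [0.9, -0.05, -1.2, 0.4]
acts = [0.5, 1.0, -0.25, 2.0]
t, w_scale = quantize_weights_ternary(weights)
q, a_scale = quantize_activations_int8(acts)

# Rescale the integer result to compare it with the full-precision dot product.
approx = ternary_dot(t, q) * w_scale * a_scale
exact = sum(w * x for w, x in zip(weights, acts))
print(t, q, round(bits_per_weight, 3), round(ideal_size_mb))
print(approx, exact)
```

The inner loop never multiplies, which is why this kind of model suits CPUs. The gap between `approx` and `exact` on this toy vector shows the quantization error that training natively in this format is meant to absorb.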

The combination of bitnet.cpp and native 1-bit training is likely to be a turning point for CPU inference. The commercial threshold for small-model inference has been crossed, and the need for GPUs appears to have dropped significantly in edge, on-prem, and cost-sensitive workloads.

However, it is clear that high output quality, long-context processing, and retraining of large-scale models are still the domain of GPUs. A hybrid strategy that combines CPUs (lightweight inference) with GPUs (high-end inference and training) is therefore the most reasonable. Adopting it seems especially important for optimizing the cost structure of multi-agent systems that rely on expensive large-scale LLM calls.
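One way to picture such a hybrid setup is a small router that sends each request to a CPU or a GPU backend. The sketch below is purely illustrative: the `Request` fields, the `choose_backend` function, and the 4096-token cutoff are all hypothetical and do not come from bitnet.cpp or the linked report.

```python
from dataclasses import dataclass

# Hypothetical cutoff: prompts longer than this go to the GPU backend.
CPU_MAX_PROMPT_TOKENS = 4096

@dataclass
class Request:
    prompt_tokens: int
    needs_training: bool = False     # fine-tuning or retraining job
    needs_large_model: bool = False  # task judged to need a large model

def choose_backend(req: Request) -> str:
    """Return "gpu" for training, large-model, or long-context work, else "cpu"."""
    if req.needs_training or req.needs_large_model:
        return "gpu"
    if req.prompt_tokens > CPU_MAX_PROMPT_TOKENS:
        return "gpu"
    return "cpu"

print(choose_backend(Request(prompt_tokens=200)))
print(choose_backend(Request(prompt_tokens=20_000)))
print(choose_backend(Request(prompt_tokens=200, needs_training=True)))
```

In a multi-agent system, most short, routine agent calls would take the cheap CPU path under a rule like this, leaving GPU time for the few requests that actually need it.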

Source: https://arxiv.org/abs/2504.12285

Published by
tslaaftermarket