Microsoft has announced an innovative technology that can implement LLMs with 1-bit operations.

In an era when running a single model can require building a power plant, and a GPU can cost as much as a car, the advancement of artificial intelligence has become a war of capital. It is not strange to call the handful of companies that control the huge data centers and power grids a kind of GPU mafia.

However, Microsoft quietly announced a framework called bitnet.cpp last week. It released a complete 1-bit computational framework that lets large language models (LLMs) run fast on CPUs, without GPUs.

The idea behind BitNet is simple. Do we really need 16-bit or 32-bit operations? Couldn't we get by with 1-bit operations?

It is an idea that returns to the essence of the computer, which represented the world with 1s and 0s. This model does not just shrink numbers; it asks again about the philosophy of computation. For what fraction of a percentage point of accuracy are we burning thousands of watts?

BitNet compresses the world into the odd-sounding figure of 1.58 bits. Weights are restricted to three values (-1, 0, +1), which carry log2 3 ≈ 1.58 bits of information each, and activations are quantized to 8-bit integers. Do this, and the model is no longer a two-gigabyte monster but a small, hard engine of roughly 400 megabytes. It runs on desktops and laptops, and power consumption drops to about one tenth.
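The quantization described above can be sketched in a few lines. This is an illustrative NumPy sketch, not bitnet.cpp's actual kernel code: the ternary step follows the absmean scheme described in the BitNet b1.58 paper (scale by the mean absolute weight, then round and clip to -1/0/+1), and the activation step is a generic absmax 8-bit quantizer; the function names are my own.

```python
import numpy as np

def quantize_ternary(w: np.ndarray, eps: float = 1e-6):
    """Absmean ternary quantization: scale weights by their mean
    absolute value, then round each entry to -1, 0, or +1."""
    scale = np.abs(w).mean() + eps
    w_q = np.clip(np.round(w / scale), -1, 1).astype(np.int8)
    return w_q, scale  # keep the scale to dequantize outputs later

def quantize_activations_int8(x: np.ndarray, eps: float = 1e-6):
    """Per-tensor absmax quantization of activations to signed 8-bit."""
    scale = np.abs(x).max() + eps
    x_q = np.clip(np.round(x / scale * 127), -128, 127).astype(np.int8)
    return x_q, scale

# Tiny demo: a random weight matrix collapses to three symbols.
rng = np.random.default_rng(0)
w_q, w_scale = quantize_ternary(rng.normal(size=(4, 8)))
print(np.unique(w_q))
```

Since each weight now takes one of only three states, storage falls from 16 bits to under 2 bits per parameter, which is where the "2 GB monster to 400 MB engine" arithmetic comes from.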

The combination of bitnet.cpp and native 1-bit training is likely to be a turning point for CPU inference. The commercial threshold for small-model inference has been crossed, and the need for GPUs appears significantly lower for edge, on-prem, and cost-sensitive workloads.
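A minimal sketch of why ternary weights suit CPUs so well: when every weight is -1, 0, or +1, a matrix-vector product needs no multiplications at all, only additions and subtractions. The function below is my own illustration of that property, not code from bitnet.cpp (which uses optimized packed-integer kernels).

```python
import numpy as np

def ternary_matvec(w_q: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Matrix-vector product for weights restricted to {-1, 0, +1}.
    Each output element is just (sum of inputs where w is +1)
    minus (sum of inputs where w is -1): no multiplies needed."""
    out = np.zeros(w_q.shape[0], dtype=x.dtype)
    for i, row in enumerate(w_q):
        out[i] = x[row == 1].sum() - x[row == -1].sum()
    return out

# Sanity check against an ordinary matmul.
rng = np.random.default_rng(1)
w_q = rng.integers(-1, 2, size=(5, 7)).astype(np.int8)
x = rng.normal(size=7)
print(np.allclose(ternary_matvec(w_q, x), w_q @ x))
```

Replacing multiply-accumulate with pure add/subtract is what lets plain CPU integer units, without tensor cores, keep up on inference.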

However, high-quality output, long-context processing, and retraining of large models clearly remain GPU territory. The most reasonable approach is therefore a hybrid strategy that combines CPUs (lightweight inference) with GPUs (heavy inference and training), and adopting it to optimize the cost structure seems especially important in multi-agent systems that depend on expensive large-scale LLM calls.

Source: https://arxiv.org/abs/2504.12285
