Microsoft has announced an innovative technology that can implement LLM with 1-bit operations.

In an era where a power plant's worth of electricity is needed to run a single model and a GPU costs as much as a car, the advancement of artificial intelligence has become a war of capital. It is not strange to call the handful of companies that control huge data centers and power grids a kind of GPU mafia.

Last week, however, Microsoft quietly announced a framework called bitnet.cpp, releasing a complete 1-bit computational stack that lets large language models (LLMs) run quickly on CPUs, without GPUs.

The idea behind BitNet is simple. Do we really need 16-bit or 32-bit operations? Can't we get by with roughly 1-bit operations?

It is an idea that returns to the essence of the computer, which represents the world with 1s and 0s. This model does not merely shrink the numbers; it asks again about the philosophy of computation: for what last few percentage points of accuracy are we burning thousands of watts?

BitNet compresses the world into an odd-sounding 1.58 bits. Each weight is expressed with one of three values (-1, 0, +1), and activations are quantized to 8-bit integers. Done this way, the model is no longer a multi-gigabyte monster but a small, hard engine of roughly 400 megabytes that runs on desktops and laptops, while energy consumption drops to roughly a tenth.
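The numbers above are easy to verify. Three weight states carry log2(3) ≈ 1.58 bits of information each, so a 2-billion-parameter model needs about 2e9 × 1.58 / 8 ≈ 400 MB for its weights. Below is a minimal sketch of absmean ternary quantization in the spirit of the BitNet b1.58 paper (scale each weight by the mean absolute value, then round and clip to {-1, 0, +1}); the function name is my own, and this is an illustration, not the actual bitnet.cpp implementation:

```python
import math

# Three weight states {-1, 0, +1} carry log2(3) ~ 1.58 bits each,
# which is where the "1.58-bit" name comes from.
BITS_PER_WEIGHT = math.log2(3)

def absmean_ternary(weights):
    """Quantize float weights to {-1, 0, +1} via absmean scaling:
    divide by the mean absolute value, then round and clip to [-1, 1]."""
    gamma = sum(abs(w) for w in weights) / len(weights) or 1.0
    return [max(-1, min(1, round(w / gamma))) for w in weights]

print(f"{BITS_PER_WEIGHT:.2f} bits/weight")   # ~1.58
print(absmean_ternary([0.9, -0.05, -1.2, 0.4]))  # [1, 0, -1, 1]

# Back-of-envelope weight memory for a 2B-parameter ternary model:
print(f"{2e9 * BITS_PER_WEIGHT / 8 / 1e6:.0f} MB")  # ~396 MB
```

Large weights saturate to ±1 and small ones snap to 0, which is why matrix multiplication degenerates into additions and subtractions and becomes CPU-friendly.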

The combination of bitnet.cpp and native 1-bit training is likely to be a turning point for CPU inference. The commercial threshold for small-model inference has been crossed, and the need for GPUs appears significantly lowered for edge, on-prem, and cost-sensitive workloads.

However, it is also clear that high-quality generation, long-context processing, and the training or fine-tuning of large models remain the domain of GPUs. A hybrid strategy combining CPUs (lightweight inference) with GPUs (heavy inference and training) therefore seems most reasonable, and adopting it is especially important for optimizing the cost structure of multi-agent systems that depend on expensive large-scale LLM calls.

Source: https://arxiv.org/abs/2504.12285
