Microsoft has announced an innovative technology that can run LLMs with 1-bit operations.
We live in an era where running a single model can require building a power plant, and a GPU can cost as much as a car. The advancement of artificial intelligence quickly became a war of capital. It would not be strange to call the few companies that control huge data centers and power grids a kind of GPU mafia.
Last week, however, Microsoft quietly announced a framework called bitnet.cpp. It released the entire 1-bit computation framework, which allows large language models (LLMs) to run quickly on CPUs without GPUs.
The idea of BitNet is simple. Do we really need 16-bit or 32-bit operations? Can’t we use 1-bit operations?
It is an idea that returns to the essence of the computer, which represents the world with 1s and 0s. This model does not just shrink the numbers; it asks again about the philosophy of computation. For how many percentage points of accuracy are we spending a few thousand watts?
BitNet compresses the world into the odd figure of 1.58 bits, which is the information content of a three-valued weight (log2(3) ≈ 1.58). Each weight takes one of three values (-1, 0, +1), and activations are converted to 8-bit integers. As a result, the model is not a two-gigabyte monster but a small, tough engine of about 400 megabytes. It runs on desktops and laptops. Power consumption drops to about one-tenth, and it is still fast.
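The weight and activation scheme described above can be illustrated with a minimal sketch. It is not the bitnet.cpp implementation: the absmean scale for weights and the absmax scale for activations are assumptions, since the text only says that weights take three values and activations become 8-bit integers. All function names are hypothetical, and the 2-billion-parameter count in the size estimate is likewise an assumption that is not stated in the text.

```python
import math

def quantize_weights_ternary(weights):
    """Map float weights to {-1, 0, +1} using an absmean scale (assumed scheme)."""
    scale = sum(abs(w) for w in weights) / len(weights)
    scale = scale or 1.0  # avoid division by zero for an all-zero row
    ternary = [max(-1, min(1, round(w / scale))) for w in weights]
    return ternary, scale

def quantize_activations_int8(acts):
    """Map float activations to signed 8-bit integers using an absmax scale (assumed)."""
    absmax = max(abs(a) for a in acts) or 1.0
    scale = 127.0 / absmax
    q = [max(-128, min(127, round(a * scale))) for a in acts]
    return q, scale

def ternary_dot(ternary, q_acts):
    """Dot product with ternary weights: only additions and subtractions."""
    total = 0
    for w, a in zip(ternary, q_acts):
        if w == 1:
            total += a
        elif w == -1:
            total -= a
        # w == 0 contributes nothing, so the term is skipped entirely
    return total

def approx_dot(weights, acts):
    """Approximate the float dot product by rescaling the integer result."""
    ternary, w_scale = quantize_weights_ternary(weights)
    q_acts, a_scale = quantize_activations_int8(acts)
    return ternary_dot(ternary, q_acts) * w_scale / a_scale

# Three equally likely values carry log2(3) bits of information per weight.
BITS_PER_WEIGHT = math.log2(3)

def packed_size_mb(num_params):
    """Ideal packed weight size in megabytes at log2(3) bits per weight."""
    return num_params * BITS_PER_WEIGHT / 8 / 1e6

if __name__ == "__main__":
    print(quantize_weights_ternary([0.9, -0.05, -1.2, 0.4]))
    print(round(BITS_PER_WEIGHT, 2))
    # Assumed parameter count of 2 billion, for illustration only.
    print(round(packed_size_mb(2_000_000_000)))
```

Because every weight is -1, 0, or +1, the inner loop never multiplies, which is one reason this kind of model suits CPUs. Under the assumed parameter count, the ideal packed size lands just under 400 megabytes, consistent with the figure quoted above; real storage formats may use slightly more than log2(3) bits per weight.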
The combination of bitnet.cpp and natively trained 1-bit models is likely to be a turning point for CPU inference. Small-model inference has crossed the threshold of commercial viability, and the need for GPUs appears to have dropped significantly in edge, on-prem, and cost-sensitive workloads.
However, the highest quality, long-context processing, and retraining of large-scale models clearly remain the domain of GPUs. A hybrid strategy that combines CPUs (lightweight inference) with GPUs (high-end inference and training) is therefore the most reasonable approach. Adopting it seems especially important for optimizing the cost structure of multi-agent systems that rely on expensive large-scale LLM calls.
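One way to picture the hybrid strategy is a small routing function that sends each request to a CPU or a GPU backend. This is only a sketch under stated assumptions: the `Request` fields, the `choose_backend` name, and the token threshold are all hypothetical and would need tuning against a real workload.

```python
from dataclasses import dataclass

# Hypothetical cutoff: prompts longer than this count as "long context".
MAX_CPU_PROMPT_TOKENS = 2048

@dataclass
class Request:
    prompt_tokens: int
    needs_training: bool = False      # retraining or fine-tuning job
    needs_high_quality: bool = False  # task that needs the large model

def choose_backend(req: Request) -> str:
    """Return "gpu" for heavy work and "cpu" for lightweight inference."""
    if req.needs_training or req.needs_high_quality:
        return "gpu"
    if req.prompt_tokens > MAX_CPU_PROMPT_TOKENS:
        return "gpu"
    return "cpu"

if __name__ == "__main__":
    print(choose_backend(Request(prompt_tokens=300)))
    print(choose_backend(Request(prompt_tokens=300, needs_training=True)))
```

In a multi-agent system, most short routine agent calls would fall through to the CPU path, leaving the expensive GPU path for the few calls that need it.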
Source: https://arxiv.org/abs/2504.12285