Huawei has introduced an open-source method called SINQ that lets developers shrink large language models, cutting their memory consumption by as much as 60-70%. This makes it possible to run advanced artificial intelligence models on inexpensive hardware.
One of the biggest obstacles to the widespread use of large language models is their sheer size and the memory and computational power they demand. Deploying these models usually requires very expensive graphics processors (such as the NVIDIA A100 or H100, which cost tens of thousands of dollars) or costly cloud servers. This puts powerful artificial intelligence out of reach for many researchers, startups, and smaller companies.
One way to attack this problem is a process called quantization. In this method, the numerical precision of the model's weights is reduced, much as lowering the quality of an image reduces its file size. This cuts memory consumption and speeds up inference, but the big risk is a decline in the quality and accuracy of the model's output.
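To make the idea concrete, here is a minimal sketch of generic uniform 8-bit quantization (an illustration of the general technique, not SINQ's actual algorithm): each float32 weight is mapped to an int8 value plus a shared scale, shrinking storage per value from 4 bytes to 1 byte at the cost of rounding error.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    # Per-tensor scale: map the largest absolute value to 127.
    scale = np.abs(weights).max() / 127.0
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    # Recover approximate float weights from the int8 codes.
    return q.astype(np.float32) * scale

weights = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# The restored values are close to, but not exactly, the originals.
# That rounding error is the quality loss the article describes.
print(np.abs(weights - restored).max())
```

The rounding error of this scheme is bounded by half the scale per weight; methods like SINQ are designed to keep such errors from noticeably degrading the model's output.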
Huawei’s new open-source technique, called SINQ, is designed precisely to address this trade-off. The method reduces memory consumption by roughly 60% to 70% without a significant drop in output quality.
Huawei’s new way to run artificial intelligence on cheap systems
This reduction in memory consumption means that a model that previously needed more than 60 GB of memory can now run on a system with about 20 GB. In practice, it means you can use a consumer graphics card like the NVIDIA GeForce RTX 4090 (priced at around $2,000) instead of an H100 that costs tens of thousands of dollars. The reduction in cloud-server costs is just as significant.
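A quick back-of-the-envelope check shows where numbers like these come from (the parameter count and bit widths below are hypothetical illustrations, not Huawei's benchmarks): a model's weight memory is roughly the parameter count times the bytes per parameter.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    # parameters x (bits / 8) bytes, converted to gigabytes
    return num_params * bits_per_param / 8 / 1e9

params = 32e9  # a hypothetical 32-billion-parameter model
fp16 = weight_memory_gb(params, 16)  # 16-bit floats: 64 GB of weights
int4 = weight_memory_gb(params, 4)   # 4-bit quantized: 16 GB of weights
print(f"fp16: {fp16:.0f} GB, 4-bit: {int4:.0f} GB, saved: {1 - int4 / fp16:.0%}")
```

Note that activations, the KV cache, and runtime overhead add to the total, so real deployments need somewhat more memory than the weights alone.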

The point is that Huawei has made this method completely open source. SINQ is published under the Apache 2.0 license on GitHub and Hugging Face, which means that any individual or company in the world can use the code for free, modify it, and even build it into commercial products.
By dramatically lowering hardware and financial barriers, Huawei gives developers around the world the power to work with larger and more capable models and to create a new wave of innovation in smart apps and services.
