Huawei has introduced an open-source method called SINQ that lets developers shrink large language models and cut their memory consumption by roughly 60 to 70 percent. This makes it possible to run advanced artificial intelligence models on much cheaper hardware.
One of the biggest obstacles to the widespread use of large language models is their sheer size and their heavy demand for memory and computing power. Deploying these models usually requires very expensive graphics processors (such as NVIDIA's A100 or H100, which cost tens of thousands of dollars) or costly cloud servers. This puts powerful artificial intelligence out of reach for researchers, startups, and smaller companies.
One way to address this problem is a process called quantization. In this method, the numerical precision of the model's weights is reduced (much like lowering the quality of an image to shrink its file size). This cuts memory consumption and speeds up inference, but the big risk is a drop in the quality and accuracy of the model's output.
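To make the idea concrete, here is a minimal sketch of uniform quantization, not Huawei's SINQ algorithm itself: float32 weights are mapped to 4-bit signed integers with a single scale factor, then mapped back, and the rounding error this introduces is measured. The function names and tensor sizes are illustrative.

```python
import numpy as np

def quantize(weights: np.ndarray, bits: int = 4):
    """Symmetric uniform quantization to `bits`-bit signed integers.

    A single per-tensor scale maps the float range onto the integer grid.
    (Illustrative only; SINQ uses a more sophisticated scaling scheme.)
    """
    qmax = 2 ** (bits - 1) - 1               # e.g. 7 for 4-bit
    scale = np.abs(weights).max() / qmax     # one scale for the whole tensor
    q = np.clip(np.round(weights / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the integer codes."""
    return q.astype(np.float32) * scale

# A toy weight matrix standing in for one layer of a model.
rng = np.random.default_rng(0)
w = rng.normal(0, 0.02, size=(256, 256)).astype(np.float32)

q, scale = quantize(w, bits=4)
w_hat = dequantize(q, scale)

# The rounding error is the "quality loss" the article describes.
error = np.abs(w - w_hat).mean()
print(f"mean absolute error after 4-bit round trip: {error:.6f}")
```

Storing the int8 codes (or packing two 4-bit codes per byte) instead of float32 is where the memory saving comes from; the challenge, which SINQ targets, is keeping that rounding error from degrading the model's answers.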
Huawei’s new open-source technique, called SINQ, is designed precisely to solve this trade-off. The method reduces memory consumption by 60 to 70 percent without a significant drop in output quality.
Huawei’s new way to run artificial intelligence on cheap systems
In practice, this reduction means that a model that previously needed more than 60 GB of memory can now run on a system with about 20 GB. That allows a consumer graphics card like the NVIDIA GeForce RTX 4090 (priced at around $1,600) to stand in for an enterprise GPU such as the H100 that costs tens of thousands of dollars. The savings on cloud server rental are just as significant.
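The arithmetic behind these savings is straightforward: weight memory scales linearly with bit-width. A short back-of-the-envelope calculation, with an illustrative 70-billion-parameter model rather than any figure from Huawei's announcement, shows how halving or quartering the precision shrinks the footprint:

```python
def weight_memory_gb(num_params: float, bits: int) -> float:
    """Memory needed to store model weights alone, in GiB.

    num_params * bits gives total bits; divide by 8 for bytes,
    then by 1024**3 for GiB. Ignores activations and KV cache.
    """
    return num_params * bits / 8 / 1024**3

params = 70e9  # a 70B-parameter model, chosen for illustration

for bits in (16, 8, 4):
    print(f"{bits:>2}-bit weights: {weight_memory_gb(params, bits):6.1f} GiB")
```

At 16-bit precision such a model needs roughly 130 GiB for weights alone, which is why multi-GPU servers are normally required; at 4-bit it drops to about 33 GiB, within reach of one or two consumer cards.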
Notably, Huawei has made the method completely open source. SINQ is released under the Apache 2.0 license on GitHub and Hugging Face, which means any individual or company in the world can use the code for free, modify it, and even build it into commercial products.
By dramatically lowering hardware and financial barriers, Huawei is giving developers around the world the ability to work with larger, more powerful models and to drive a new wave of innovation in intelligent apps and services.