The Japanese startup Sakana AI has unveiled a new technology called the “AI CUDA Engineer,” which automatically discovers and optimizes CUDA kernels using large language models and evolutionary optimization techniques. According to the company, the resulting kernels can run up to 5 times faster than native PyTorch implementations and can also outperform existing hand-written CUDA kernels.
What is CUDA, and why is optimizing it hard?
CUDA is a low-level programming interface that gives direct access to NVIDIA GPUs and is designed for parallel processing. Manually optimizing CUDA kernels, however, requires deep expertise in GPU architecture and is a complex, time-consuming process.
Sakana AI’s new technology addresses this challenge by automating the development and optimization of CUDA kernels with artificial intelligence.


How the AI CUDA Engineer works
The framework converts standard PyTorch code into optimized CUDA kernels through a multi-stage pipeline. The process includes the following steps:
- Translation of PyTorch operations to CUDA kernels: this step alone often reduces run time without any manual tuning.
- Evolutionary optimization: kernel performance is improved using methods such as crossover operations and an innovation archive.
- Composition of kernel operations: the system can combine multiple operations into a single efficient kernel, achieving better performance than many existing accelerated methods.
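The evolutionary step described above can be illustrated with a toy search loop. This is only a sketch: the cost model, parameters, and operators below are invented for the example, whereas a real system would compile each candidate kernel, benchmark it on a GPU, and use an LLM to propose code-level mutations rather than numeric tweaks.

```python
import random

def simulated_runtime(block_size, unroll):
    # Invented stand-in cost model: pretend the kernel is fastest at
    # block_size=256 and unroll=4. A real system would compile the
    # candidate CUDA kernel and time it on the GPU.
    return 1.0 + abs(block_size - 256) * 0.01 + abs(unroll - 4) * 0.5

def crossover(a, b):
    # Combine the block size of one parent with the unroll factor of the other.
    return (a[0], b[1])

def mutate(candidate):
    block_size, unroll = candidate
    return (max(32, block_size + random.choice([-32, 32])),
            max(1, unroll + random.choice([-1, 1])))

def evolve(generations=30, pop_size=8, seed=0):
    random.seed(seed)
    population = [(random.choice([64, 128, 512, 1024]), random.randint(1, 8))
                  for _ in range(pop_size)]
    archive = []  # "innovation archive": best candidate of every generation
    for _ in range(generations):
        ranked = sorted(population, key=lambda c: simulated_runtime(*c))
        archive.append(ranked[0])
        parents = ranked[:pop_size // 2]
        children = [mutate(crossover(random.choice(parents),
                                     random.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    best = min(archive, key=lambda c: simulated_runtime(*c))
    return best, simulated_runtime(*best)

best, runtime = evolve()
```

Because the best candidates are carried over each generation and the archive keeps every generation's winner, the reported runtime never gets worse as the search proceeds.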
Results and achievements
According to Sakana AI, the AI CUDA Engineer has been able to:
- Successfully translate more than half of the PyTorch operations it was given.
- Produce a large archive of CUDA kernels, a portion of which have been verified for correctness and performance.
- Generate kernels of which nearly 1% outperform their native PyTorch equivalents.
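“Verified” here means a generated kernel is checked for numerical correctness against the reference implementation before its speedup is measured. A minimal sketch of such a check, using plain Python functions in place of real kernels (the softmax example and all names are invented for illustration):

```python
import math
import time

def allclose(a, b, atol=1e-6):
    # Element-wise comparison within an absolute tolerance.
    return len(a) == len(b) and all(abs(x - y) <= atol for x, y in zip(a, b))

def verify_and_profile(candidate, reference, inputs, repeats=200):
    # A kernel counts as "verified" only if its output matches the
    # reference within tolerance; only then is a speedup measured.
    if not allclose(candidate(inputs), reference(inputs)):
        return {"verified": False, "speedup": None}
    start = time.perf_counter()
    for _ in range(repeats):
        reference(inputs)
    t_ref = time.perf_counter() - start
    start = time.perf_counter()
    for _ in range(repeats):
        candidate(inputs)
    t_cand = time.perf_counter() - start
    return {"verified": True, "speedup": t_ref / t_cand}

def reference_softmax(xs):
    # Stands in for the native PyTorch operation.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def candidate_softmax(xs):
    # A differently written implementation, standing in for a generated kernel.
    m = max(xs)
    exps, total = [], 0.0
    for x in xs:
        e = math.exp(x - m)
        exps.append(e)
        total += e
    inv = 1.0 / total
    return [e * inv for e in exps]

report = verify_and_profile(candidate_softmax, reference_softmax,
                            [0.1 * i for i in range(128)])
```

A candidate that produces wrong outputs is rejected before any timing is recorded, which is what keeps the archive's speedup claims meaningful.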
Reactions and outlook
Jim Fan, a senior AI researcher at NVIDIA, said of the technology:
“This is one of the most interesting automatic coding tools I have seen recently. Using artificial intelligence to write better CUDA kernels and speed up AI processing will change the future of computational resources.”
He believes the technology could significantly increase the productivity of computational resources.
Access to the data and interactive tools
For transparency and public access, Sakana AI has published a dataset from the project under the CC-BY-4.0 license on the Hugging Face platform. The release includes:
- Reference implementations
- Profiling data
- Performance comparisons against native PyTorch versions
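As a sketch of how such a release might be explored, the snippet below filters kernel records for verified entries that beat the PyTorch baseline. The field names (`op`, `verified`, `speedup_vs_pytorch`) and the sample records are hypothetical, invented for this example; the actual schema of the published dataset may differ.

```python
# Hypothetical records mimicking entries of a kernel archive.
records = [
    {"op": "softmax", "verified": True, "speedup_vs_pytorch": 1.9},
    {"op": "layernorm", "verified": True, "speedup_vs_pytorch": 0.8},
    {"op": "conv2d", "verified": False, "speedup_vs_pytorch": None},
    {"op": "matmul_bias_relu", "verified": True, "speedup_vs_pytorch": 2.4},
]

def top_kernels(records, min_speedup=1.0):
    # Keep only verified kernels that are at least `min_speedup` times
    # faster than the PyTorch baseline, ordered fastest first.
    winners = [r for r in records
               if r["verified"] and r["speedup_vs_pytorch"] >= min_speedup]
    return sorted(winners, key=lambda r: r["speedup_vs_pytorch"], reverse=True)

leaderboard = top_kernels(records)
```

Filtering on the verification flag first mirrors the release's distinction between kernels that were merely generated and those that were checked for correctness.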
The company has also set up an interactive website that allows users to:
- Browse the dataset.
- View a leaderboard of optimized kernels.
- Access kernel code, performance metrics, and optimization experiments.
Conclusion
With the AI CUDA Engineer, Sakana AI has taken a major step toward faster GPU processing by automating the development and optimization of CUDA kernels. The approach not only improves computational performance significantly, but also lets developers and researchers tap the power of CUDA in a simpler, more efficient way.
The technology could become a key tool for optimizing computational resources and accelerating progress in artificial intelligence.



