In a keynote today, Bill Dally, chief scientist and senior vice president of research at Nvidia, highlighted significant gains in hardware performance that have paved the way for advances in artificial intelligence and offer exciting opportunities for further progress in machine learning.
In a talk at Hot Chips, an annual event for processor and system architects, Dally showcased a range of techniques currently in use, some of which are already showing significant results.
“Advances in artificial intelligence have been remarkable thanks to improved hardware, and further advances are still gated by deep learning hardware,” said Dally, one of the world’s leading computer scientists and former chair of Stanford University’s computer science department.
For example, he showed how ChatGPT, a large language model (LLM) used by millions of people, can draft an outline for a talk. Such capabilities owe much to GPU gains in AI inference performance over the past decade, he said.
Researchers are preparing the next wave of advances. Dally shared details of a test chip that demonstrated impressive performance of nearly 100 tera-operations per watt while running LLM inference.
In a recent experiment, researchers found a low-power way to speed up the transformer models used in generative artificial intelligence. The technique relies on four-bit arithmetic, a simplified numerical format, and it shows promise for further gains in performance and efficiency.
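To make the idea concrete, here is a minimal sketch of what quantizing values to four-bit integers can look like. This is an illustration only, not Nvidia's actual method: the function names and the single shared scale factor are assumptions for the example.

```python
import numpy as np

def quantize_int4(vec):
    """Map floats to 4-bit signed integers (-8..7) plus one scale factor."""
    max_abs = np.max(np.abs(vec))
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(vec / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize_int4(q, scale):
    """Recover approximate floats from the 4-bit codes."""
    return q.astype(np.float32) * scale

# A toy weight vector: after round-tripping, each value is off by
# at most half a quantization step (scale / 2).
vec = np.array([0.12, -0.5, 0.33, 0.9], dtype=np.float32)
q, scale = quantize_int4(vec)
approx = dequantize_int4(q, scale)
```

The energy win comes from doing multiply-accumulate work on the tiny integer codes instead of full-width floats; the scale factor is applied once at the end.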
Dally also discussed logarithmic mathematics as a way to speed up calculations while saving energy, an approach detailed in a patent Nvidia filed in 2021.
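The core appeal of log-domain arithmetic is that multiplication, which is expensive in hardware, becomes simple addition. The sketch below illustrates that identity; it is a textbook example, not the specific circuit design in Nvidia's patent.

```python
import math

def to_log(x):
    """Represent a positive number by its base-2 logarithm."""
    return math.log2(x)

def log_mul(lx, ly):
    # Multiplication in the linear domain is addition in the log domain:
    # log2(a * b) == log2(a) + log2(b)
    return lx + ly

a, b = 6.0, 7.0
product = 2.0 ** log_mul(to_log(a), to_log(b))  # ~42.0
```

In hardware, the adder replacing a multiplier is the saving; the hard part, which such designs must approximate, is addition in the log domain.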
He also explored dozens of other techniques for tailoring hardware to specific AI tasks, including defining new data types and operations to optimize performance.
Dally noted that researchers must develop hardware and software together and make thoughtful choices about where to spend energy. Minimizing data movement in memory and communication circuits, for example, should be a priority.
“Being a computer engineer in this day and age is exciting because we are playing an influential role in the important revolution happening in artificial intelligence,” Dally said. “The true extent of this revolution is not yet fully understood, and that makes it all the more exciting.”