Now a team of MIT and Google researchers, inspired by a delicious pasta dish (!), has come up with a new method that can roughly double the response speed of these models without a significant drop in quality.
This new method, called PASTA (short for Parallel Structure Annotation), teaches models to identify parts of an answer that are semantically independent of one another and to write them out in parallel while producing text, instead of generating everything strictly one word after the next, as before.
Artificial brains learn to divide up their own work
The difference between PASTA and previous methods is that it no longer relies on hand-made rules or predefined structures (such as bullet points or paragraphs). Instead, the models learn for themselves where text can be written at the same time, just like a skilled chef who realizes that not every ingredient has to be prepared in sequence and that some steps can run in parallel.
According to Tian Jin, the lead author of the research and a doctoral student at MIT: “Previous language models were like a chef preparing lasagna one step at a time. With PASTA, they now know they can work on several things at once; for example, stirring the sauce while the oven warms up.”
How does it work?
There are two key components at the heart of this method: PASTA-LANG, a markup language the model uses to label independent sections in its responses, and an interpreter that reads these labels and uses them to produce those sections of text in parallel.
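The paper defines the exact PASTA-LANG tags and decoding procedure; the sketch below is only a rough illustration of the idea, using made-up `<async>` tags and a stand-in `decode_chunk` function in place of a real language model call. In the actual system the interpreter works during generation, launching parallel decoding as soon as the model emits an annotation; here, for simplicity, an already-annotated string is processed after the fact.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Hypothetical annotated output: the model wraps semantically
# independent spans in made-up <async>...</async> tags. The real
# PASTA-LANG tags may differ; this is only an illustration.
annotated = (
    "Here are three pasta tips. "
    "<async>Tip 1: salt the water generously.</async> "
    "<async>Tip 2: save some pasta water for the sauce.</async> "
    "<async>Tip 3: finish the pasta in the pan with the sauce.</async>"
)

def decode_chunk(prompt: str) -> str:
    """Stand-in for decoding one independent chunk with the model.
    A real interpreter would launch a model decoding call here."""
    return prompt.upper()  # placeholder "work"

def interpret(text: str) -> str:
    # Split the response into surrounding text and tagged chunks;
    # odd indices of the split hold the independent chunks.
    parts = re.split(r"<async>(.*?)</async>", text)
    chunks = parts[1::2]
    # Decode all independent chunks at the same time instead of
    # one after another -- this is where the speedup comes from.
    with ThreadPoolExecutor() as pool:
        decoded = list(pool.map(decode_chunk, chunks))
    # Stitch the decoded chunks back into their original positions.
    out = parts[:]
    out[1::2] = decoded
    return "".join(out)

print(interpret(annotated))
```

The key property the annotations must guarantee is that the tagged chunks do not depend on each other's content, so decoding them concurrently cannot change the meaning of the final answer.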
The models are first trained in two stages: they learn how to produce these labels, and then how to use them to generate text faster without reducing the accuracy of their answers. According to the report, in most cases response quality either improved or declined by no more than about 5 percent, an acceptable trade-off for a speedup of up to two times.