Apple researchers have unveiled a new artificial intelligence model called Ferret-UI Lite: a lightweight AI agent that runs directly on the device and can interact with the user interface of applications based on user requests. Notably, despite having only 3 billion parameters, the model performs on par with, or even better than, some GUI agent models that are up to 24 times larger.
The story of Ferret goes back to December 2023, when a team of nine Apple researchers published a paper titled “FERRET: Refer and Ground Anything Anywhere at Any Granularity”. That research introduced a multimodal large language model (MLLM) that could respond to natural-language references to specific parts of an image.
After that, Apple published further versions, including Ferret-v2, Ferret-UI and Ferret-UI 2.

While the original Ferret-UI was built on a model with 13 billion parameters, and Ferret-UI 2 added support for more platforms and higher resolutions, the Lite version takes a different approach. The model was designed from the beginning to run directly on the device, has a light and energy-efficient structure, and despite its smaller size appears competitive against much larger models.
The researchers emphasize that most existing GUI agents rely on massive server-side models, because those models have strong reasoning and planning abilities. Such models, however, are usually too heavy and expensive to run on a device.
Ferret-UI Lite is trained on a mix of real and synthetic data using supervised fine-tuning and reinforcement learning, and at run time it uses cropping and zooming techniques. In this method, after an initial prediction, the model crops the region around that prediction and analyzes it again in more detail, which compensates for its limited capacity to process fine image details.
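
The crop-and-zoom idea can be illustrated with a minimal sketch. This is not Apple's implementation: the `locate` callback is a hypothetical stand-in for the model, the 50% crop size is an arbitrary assumption, and the toy locator simply snaps to a 10x10 grid over whatever region it is shown, to mimic a model with limited visual resolution.

```python
from typing import Callable, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom) in screen pixels
Point = Tuple[float, float]              # (x, y) in screen pixels


def crop_around(point: Point, screen: Box, crop_w: float, crop_h: float) -> Box:
    """Return a crop_w x crop_h box centred on point, shifted to stay on screen."""
    left0, top0, right0, bottom0 = screen
    crop_w = min(crop_w, right0 - left0)
    crop_h = min(crop_h, bottom0 - top0)
    left = min(max(point[0] - crop_w / 2, left0), right0 - crop_w)
    top = min(max(point[1] - crop_h / 2, top0), bottom0 - crop_h)
    return (left, top, left + crop_w, top + crop_h)


def predict_with_zoom(locate: Callable[[Box], Point], screen: Box,
                      crop_frac: float = 0.5) -> Point:
    """Two passes: a coarse guess on the full screen, then a refined guess on a crop."""
    coarse = locate(screen)
    crop_w = (screen[2] - screen[0]) * crop_frac  # crop_frac is an assumed setting
    crop_h = (screen[3] - screen[1]) * crop_frac
    crop = crop_around(coarse, screen, crop_w, crop_h)
    return locate(crop)


def make_toy_locator(target: Point, grid: int = 10) -> Callable[[Box], Point]:
    """Hypothetical model stand-in: answers with the centre of the grid cell
    that contains the target, so its error shrinks as the region shrinks."""
    def locate(region: Box) -> Point:
        left, top, right, bottom = region
        cell_w = (right - left) / grid
        cell_h = (bottom - top) / grid
        col = max(0, min(int((target[0] - left) // cell_w), grid - 1))
        row = max(0, min(int((target[1] - top) // cell_h), grid - 1))
        return (left + (col + 0.5) * cell_w, top + (row + 0.5) * cell_h)
    return locate


screen: Box = (0.0, 0.0, 1000.0, 1000.0)
target: Point = (613.0, 287.0)
locator = make_toy_locator(target)

coarse_guess = locator(screen)                      # single pass on the full screen
refined_guess = predict_with_zoom(locator, screen)  # second pass on the zoomed crop
```

In this toy setup the second pass sees the same area at twice the effective resolution, so the refined guess lands closer to the target than the coarse one. A real model would receive actual pixels for the cropped region rather than a coordinate box.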

One of Ferret-UI Lite's main innovations is the use of a multi-agent system to generate synthetic training data: a system that designs tasks, breaks them into execution steps, executes them, and finally evaluates the result, so that realistic interactions, including errors and unforeseen conditions, are recorded in the data.
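
As a rough illustration of how such a design, plan, execute, evaluate loop might be wired together, here is a minimal sketch. All four roles (`propose_task`, `plan_steps`, `execute_step`, `judge`) are hypothetical placeholders rather than anything described in Apple's work, and the toy stand-ins below are hard-coded where a real system would call models and drive a real app.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Step:
    instruction: str
    succeeded: bool


@dataclass
class Trajectory:
    task: str
    steps: List[Step] = field(default_factory=list)
    completed: bool = False


def generate_trajectory(
    propose_task: Callable[[], str],
    plan_steps: Callable[[str], List[str]],
    execute_step: Callable[[str], bool],
    judge: Callable[[Trajectory], bool],
) -> Trajectory:
    """Run one design -> plan -> execute -> evaluate cycle and record everything."""
    task = propose_task()
    trajectory = Trajectory(task=task)
    for instruction in plan_steps(task):
        ok = execute_step(instruction)
        # Failed steps are recorded as well, so errors end up in the data.
        trajectory.steps.append(Step(instruction, ok))
    trajectory.completed = judge(trajectory)
    return trajectory


# Toy stand-ins for the four roles (assumptions for illustration only).
def toy_propose_task() -> str:
    return "Turn on dark mode"


def toy_plan_steps(task: str) -> List[str]:
    return ["Open Settings", "Tap Display", "Toggle Dark Mode"]


def toy_execute_step(instruction: str) -> bool:
    # Pretend an unexpected popup blocks the second step.
    return instruction != "Tap Display"


def toy_judge(trajectory: Trajectory) -> bool:
    return all(step.succeeded for step in trajectory.steps)


traj = generate_trajectory(toy_propose_task, toy_plan_steps,
                           toy_execute_step, toy_judge)
```

Keeping the failed step in the trajectory, instead of discarding the whole run, is the point of the sketch: the recorded data then reflects imperfect interactions and not only clean successes.
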
Strengths and limitations of the model
The results show that Ferret-UI Lite performs very well on short, low-level tasks but is weaker than larger models on complex, multi-step interactions, which is to be expected given the constraints of a small, on-device model.
On the other hand, its most important advantage is local execution and privacy protection, because no data is sent to cloud servers for processing.
All in all, Ferret-UI Lite could be an important step towards personal AI agents that run directly on a phone or laptop and automatically interact with apps.



