Microsoft has unveiled new Phi-4 artificial intelligence models that can process text, images, and speech simultaneously while requiring less processing power than comparable models.
Phi-4-multimodal, with 5.6 billion parameters, and Phi-4-mini, with 3.8 billion parameters, deliver performance on par with models twice their size. According to Microsoft’s technical report, these models even outperform their competitors on some tasks.
Simultaneous processing of text, images, and speech with Microsoft’s new artificial intelligence models

The defining feature of Phi-4-multimodal is its multimodal processing capability, made possible by the new “mixture of LoRAs” technique. This approach allows the model to process text, image, and audio inputs at the same time without a drop in performance.
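The general idea behind combining LoRA adapters can be sketched in a few lines. This is only an illustration of the concept, assuming a single shared, frozen weight matrix plus one small low-rank adapter per modality; it is not Microsoft's implementation, and the class name, the modality labels, and the toy matrices are all hypothetical.

```python
from typing import Dict, List, Tuple

Matrix = List[List[float]]
Vector = List[float]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


class MixtureOfLoRALayer:
    """Hypothetical layer: one frozen base weight, one LoRA adapter per modality."""

    def __init__(self, base_weight: Matrix,
                 adapters: Dict[str, Tuple[Matrix, Matrix]]):
        self.base_weight = base_weight  # shared by all modalities, never updated
        self.adapters = adapters        # modality -> (down-projection A, up-projection B)

    def forward(self, x: Vector, modality: str) -> Vector:
        out = matvec(self.base_weight, x)
        if modality in self.adapters:
            down, up = self.adapters[modality]
            # Low-rank update: project down to rank r, then back up.
            delta = matvec(up, matvec(down, x))
            out = [o + d for o, d in zip(out, delta)]
        return out


# Toy example: 2x2 identity base weight with rank-1 adapters (assumed values).
layer = MixtureOfLoRALayer(
    base_weight=[[1.0, 0.0], [0.0, 1.0]],
    adapters={
        "vision": ([[1.0, 0.0]], [[0.0], [1.0]]),
        "speech": ([[0.0, 1.0]], [[1.0], [0.0]]),
    },
)
x = [2.0, 3.0]
text_out = layer.forward(x, "text")      # no adapter: base weight only
vision_out = layer.forward(x, "vision")  # base weight plus vision adapter
speech_out = layer.forward(x, "speech")  # base weight plus speech adapter
```

Because the base weight is left untouched, text inputs in this sketch behave exactly as they would without any adapters, which is one plausible reading of how added modalities can avoid degrading existing performance.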
According to Microsoft’s deputy director of artificial intelligence, these models help developers create innovative and smarter applications. He emphasized that Phi-4-multimodal offers advanced capabilities for processing speech, images, and text simultaneously and opens new horizons in the development of AI-based applications.
The model achieved a word error rate of 6.14% on the Hugging Face OpenASR leaderboard for speech recognition, outperforming even specialized systems such as WhisperV3.
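For readers unfamiliar with the metric, the sketch below shows how a word error rate (WER) is commonly computed, assuming the 6.14% figure is a WER, the usual metric on speech recognition leaderboards. The function name and the sample sentences are illustrative, and real leaderboards apply additional text normalization that is omitted here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One substituted word and one dropped word out of nine reference words.
wer = word_error_rate(
    "the quick brown fox jumps over the lazy dog",
    "the quick brown fox jumped over lazy dog",
)
print(f"WER: {wer:.2%}")
```

Lower is better: a WER of 6.14% corresponds to roughly six word-level errors per hundred reference words.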
Overall, Phi-4-multimodal shows improved performance in speech recognition, translation, summarization, audio understanding, and image analysis.
Phi-4-mini: A small but powerful model for mathematics and programming

Despite its small size, Phi-4-mini is highly capable at text-based tasks and, according to Microsoft, matches or outperforms models twice its size on many artificial intelligence benchmarks.
The model scored 88.6% on GSM-8K (a benchmark that measures a model's ability to solve mathematical problems), higher than many 8-billion-parameter models. It also scored 64% on the MATH benchmark, more than 5 points ahead of models in its own size class.
Phi-4-mini is designed for situations where speed and efficiency are needed, and developers can run both models on smartphones, PCs, and cars.
With the Phi-4 models, Microsoft has shown that in artificial intelligence, capability depends not only on model size but also on optimization and efficiency. These models are designed to run on conventional hardware without the need for a constant connection to the cloud.
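A back-of-the-envelope calculation shows why parameter counts of this size matter for on-device use. The sketch below estimates the memory needed for the weights alone, using the parameter counts quoted in this article; the 16-, 8-, and 4-bit precisions are assumptions chosen for illustration, not formats Microsoft is stated to ship, and activations and runtime overhead are ignored.

```python
def weight_memory_gb(num_parameters: float, bits_per_parameter: int) -> float:
    """Approximate memory for model weights alone, in gigabytes (10**9 bytes)."""
    return num_parameters * bits_per_parameter / 8 / 1e9


# Parameter counts as reported in the article.
MODELS = {"Phi-4-mini": 3.8e9, "Phi-4-multimodal": 5.6e9}

for name, params in MODELS.items():
    for bits in (16, 8, 4):  # assumed storage precisions
        print(f"{name} at {bits}-bit: ~{weight_memory_gb(params, bits):.1f} GB")
```

Under these assumptions, Phi-4-mini needs about 7.6 GB for its weights at 16-bit precision and about 1.9 GB at 4-bit, which is within reach of many consumer devices.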
Microsoft has made these models available through Azure AI Foundry, Hugging Face, and the NVIDIA API Catalog so that developers can easily use them in their projects.



