OpenAI introduced a tool for evaluating healthcare artificial intelligence models

OpenAI has recently unveiled a new open-source benchmark called HealthBench, which lets healthcare providers evaluate the performance of artificial intelligence models.

According to OpenAI's announcement, HealthBench was built in collaboration with 262 physicians from 60 countries and includes 5,000 realistic health conversations. The company says it built HealthBench to measure how well artificial intelligence models deliver the best answers to users' health questions.

HealthBench evaluates the performance of artificial intelligence models in providing health-related responses

Each response from an artificial intelligence model is evaluated against criteria set by physicians, and each criterion is given a specific weight based on the physicians' judgment. The GPT-4.1 model grades the responses against these criteria.
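The weighted scoring described above can be illustrated with a minimal sketch. It assumes that the score is the sum of the weights of the criteria a response meets, divided by the maximum achievable positive weight, and clipped to the range 0 to 1. The `Criterion` class, the `rubric_score` function, and the sample criteria and weights are hypothetical and made up for illustration; they are not taken from OpenAI's code or data.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    description: str
    points: int  # hypothetical physician-assigned weight; negative for harmful behavior
    met: bool    # whether the grader judged the response to meet this criterion


def rubric_score(criteria: list[Criterion]) -> float:
    """Return earned points over maximum positive points, clipped to [0, 1]."""
    max_points = sum(c.points for c in criteria if c.points > 0)
    if max_points == 0:
        raise ValueError("rubric needs at least one positively weighted criterion")
    earned = sum(c.points for c in criteria if c.met)
    return min(1.0, max(0.0, earned / max_points))


# Invented rubric for a single response (illustrative values only).
example = [
    Criterion("Advises contacting emergency services", 10, True),
    Criterion("Advises checking breathing", 5, True),
    Criterion("Explains how to keep the airway open", 5, False),
    Criterion("Recommends an unsafe action", -8, False),
]

print(f"{rubric_score(example):.0%}")  # prints 75%
```

Under this assumption, a negatively weighted criterion lowers the score only when the response actually exhibits the unwanted behavior, and the clipping keeps the result from dropping below zero.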

According to HealthBench assessments, OpenAI's o3 has had the best performance among the models available on the market, with a score of 60%. It is followed by Elon Musk's Grok 3 artificial intelligence model with 54% and Gemini 2.5 Pro with 52%.

In its blog post, OpenAI also gives an example of how the performance of artificial intelligence models is measured. Imagine a scenario in which a 70-year-old neighbor has fallen to the ground and is unresponsive, and someone asks an artificial intelligence model what to do.

The artificial intelligence model lists the necessary steps, such as contacting emergency services, checking breathing, and keeping the airway open. HealthBench evaluates this response and explains which parts the model handled properly and what could be better. Finally, an overall score is given, which is 77% in this example.

HealthBench currently covers 49 different languages. Its dataset also spans 26 different medical specialties, such as neurosurgery and ophthalmology.

RCO NEWS