Microsoft has unveiled a new feature called Copilot Vision, which is integrated into the Edge browser and allows users to analyze web pages with the help of artificial intelligence in real time. This feature is currently available in preview for some Copilot Pro users in the US through Copilot Labs.
Mustafa Suleyman, CEO of Microsoft AI, tweeted about the feature: “Perhaps one shouldn’t play favorites with features, but Copilot Vision has excited me since day one. This is the first time such an AI experience has been offered. Now, while you’re browsing, shopping, or even finding places, you can show the AI what you need help with in real time instead of having to explain it.” Microsoft is pitching the feature as a fundamental change in how users interact with the web.
With the user’s permission, Copilot Vision reads and analyzes web pages, simplifying information and helping with tasks such as shopping or entertainment planning. In the announcement, the Copilot team noted: “Browsing the internet no longer has to be a solitary activity; now you can browse web pages and get your work done alongside an AI.”
By scanning and interpreting page content, Copilot Vision helps users make decisions or learn from the information presented. For example, it can guide users through learning a new game or finding products that match their online shopping preferences.
Microsoft has emphasized privacy and security in developing Vision, saying that user data is not stored after each session ends and is handled in accordance with the company’s privacy policies. “Only Copilot responses are saved to improve our security systems,” the company said.
Currently, Vision only interacts with a handful of websites. Microsoft plans to gradually expand its reach and collect user feedback to improve the user experience. The company is also working with third-party publishers to improve how Vision interacts with web pages.
“Vision does not capture, store, or use any data from publishers to train our models,” the company added in its blog post. Much like Copilot Vision, OpenAI plans to launch its own AI agent, called Operator, in January.
Meanwhile, Google is working on an experimental AI assistant called “Jarvis,” reportedly powered by Gemini 2.0. Jarvis works inside the Chrome browser and interacts with on-screen elements such as form fields and buttons. The assistant can handle complex tasks, such as booking flights and helping with online shopping, making digital activities easier for users.
Similarly, Anthropic has introduced a feature called “Computer Use” for Claude 3.5 Sonnet, which allows the AI to perform actions autonomously, such as moving the mouse, clicking, and typing. Designed for software developers, the feature can handle complex tasks such as coding a simple website or planning entertainment across various applications.
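In practice, Computer Use is exposed through Anthropic’s API as a beta tool that the developer includes in a request; the model then replies with tool-use actions (mouse moves, clicks, keystrokes) that the developer’s own automation layer must execute. As a rough, hedged sketch of the request shape (the tool type, model name, and beta flag below are taken from Anthropic’s launch-time documentation and may change in later API versions):

```python
# Sketch of a "Computer Use" request payload, based on Anthropic's public
# beta docs at launch (identifiers below are assumptions that may change).
computer_tool = {
    "type": "computer_20241022",   # beta tool type for screen control
    "name": "computer",
    "display_width_px": 1024,      # resolution of the screen the model "sees"
    "display_height_px": 768,
}

request = {
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 1024,
    "tools": [computer_tool],
    "messages": [
        {"role": "user", "content": "Open the browser and search for flights."}
    ],
}

# Sending this payload (with the "computer-use-2024-10-22" beta header set)
# returns tool_use blocks describing the actions to perform; the caller is
# responsible for actually moving the mouse and typing, then reporting the
# resulting screenshot back to the model.
print(request["tools"][0]["type"])
```

The key design point is that the model never touches the machine directly: it only proposes actions, and the developer’s harness decides whether and how to carry them out.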
RCO NEWS