Microsoft’s AI is now better at image captioning than humans
Describing an image accurately, and not just like a clueless robot, has long been a goal of AI. In 2016, Google said its artificial intelligence could caption images almost as well as humans, with 94 percent accuracy. Now Microsoft says it’s gone further: Its researchers have built an AI system that’s more accurate than humans, so much so that it now sits at the top of the leaderboard for the nocaps image captioning benchmark.
And while that’s a notable milestone on its own, Microsoft isn’t keeping this tech to itself. It’s now offering the new captioning model as part of Azure’s Cognitive Services, so any developer can bring it into their apps. It’s also available today in Seeing AI, Microsoft’s app for blind and visually impaired users, which can narrate the world around them. And later this year, the captioning model will also improve your presentations in PowerPoint for the web, Windows and Mac. It’ll also pop up in Word and Outlook on desktop platforms.
“[Image captioning] is one of the hardest problems in AI,” said Eric Boyd, CVP of Azure AI, in an interview with Engadget. “It represents not only understanding the objects in a scene, but how they’re interacting, and how to describe them.” Refining captioning techniques can help every user: It makes it easier to find the images you’re looking for in search engines. And for visually impaired users, it can make navigating the web and software dramatically better.
It’s not unusual to see companies tout their AI research innovations, but it’s far rarer for those discoveries to be quickly deployed to shipping products. According to Boyd, it took Microsoft a matter of months to weave the new model into Azure and its apps.