At I/O 2019, Google is going public with a bold AI-driven accessibility initiative it calls Project Euphonia. The goal is to transcribe the words spoken by people who have non-standard speech patterns — those who live with ALS, hearing impairment and other conditions — across messaging platforms to enable better communication between those people, their friends and family, and others.
Speech-to-text is something we take for granted a lot of the time. It's a complicated process, though: so much so that the heavy lifting is usually done remotely on Google's servers, with the end result sent back to our devices. But Google has worked out a way to shrink the process to the point that it can run entirely on-device, and the fruits of that labor are coming to Gboard.
Speech recognition is one of the most powerful aspects of many Google products, particularly the Google app, where Voice Search relies on being able to understand what we're saying. The same is true of Gboard, which can type up entire messages based on what you dictate to it. We may take it for granted somewhat these days, but it truly is a marvel. Now this feature is available to many more people around the globe, as Google has added support for 30 more languages.
Google's speech recognition error rate keeps falling - yesterday, the company said it's now under 5%, down from 8.5% this time last year. I find that to be more and more the case in my own use: Google seems to recognize almost everything I throw at it now, even the Lebanese/Arabic names from my contacts list that I wouldn't expect it to get right.
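For context, the figure quoted here is typically word error rate (WER): the number of word-level substitutions, insertions, and deletions needed to turn the system's transcript into the reference transcript, divided by the reference length. A minimal sketch of the standard computation, via word-level Levenshtein distance (the example sentences are illustrative, not from Google's test set):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    # d[i][j] = edit distance between the first i reference words
    # and the first j hypothesis words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)

# One wrong word out of six: WER ≈ 0.167, i.e. about a 17% error rate
print(wer("turn on the living room lights", "turn on the living room light"))
```

So "under 5%" means roughly one word in twenty comes back wrong, averaged over a test set.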
But if you're wondering how Google's speech recognition fares in comparison to other voice assistants, Wired has made a video in conjunction with Andy Wood and Matt Kirshen (from Probably Science) to show you just that.
It's no secret that one of Google's strengths in recent years has been voice recognition. In my own experience, my Google Home picks up what I am trying to say almost every time, even in a low voice. Obviously the success rate varies by language and accent, but it is still pretty darn impressive.
As the years go by, I get lazier, it seems. So the prospect of talking to my computing devices gets better and better as time passes (even if I have to yell across my house to my Google Home, which is amusing in its own right). However, always-on speech recognition comes at a price for battery-operated devices. Researchers at MIT claim to have come up with a solution: a dedicated speech recognition chip that they say can cut power consumption by 90-99% in real-world devices.
We know, we know - you're tired of hearing about Siri and its many knockoffs. But, we assure you, this one is different. Very different. In fact, it's beyond anything we've ever seen before.
The app is called Utter! and while it isn't yet available for download, it's already doing things that we could previously only imagine. Instead of just giving you a generic answer the way Siri and the like do, it actually utilizes the apps you already have installed. Want to add a calendar appointment? Tell Utter, and it'll take care of it. Get travel details, find out the weather, and launch applications - all child's play for Utter, and all done using native applications instead of just simple searches.
Earlier today, popular voice-recognition software company Nuance launched Dragon Go! on the Android Market, bringing voice recognition that "just works" to the Android platform. Dragon Go! answers users' queries by pulling data from a variety of sources, including Spotify, Wolfram|Alpha, Yelp, YouTube, AccuWeather, Ask.com, Dictionary.com, ESPN, Facebook, Fandango, Last.fm, LiveNation, Milo.com, OpenTable, Pandora, Rotten Tomatoes, Twitter, Wikipedia, Yahoo!, Bing, and hundreds of others. Additionally, Dragon Go!'s "Dragon Carousel" lets users compare complementary results across the sites most relevant to their query.