
Voice recognition is a standard part of the smartphone package these days, and a corresponding part is the delay while you wait for Siri, Alexa or Google to return your query, either correctly interpreted or horribly mangled. Google’s latest speech recognition works entirely offline, eliminating that delay altogether - though of course mangling is still an option.

The delay occurs because your voice, or some data derived from it anyway, has to travel from your phone to the servers of whoever operates the service, where it is analyzed and sent back a short time later. This can take anywhere from a handful of milliseconds to multiple entire seconds (what a nightmare!), or longer if your packets get lost in the ether.

Why not just do the voice recognition on the device? There’s nothing these companies would like more, but turning voice into text on the order of milliseconds takes quite a bit of computing power. It’s not just about hearing a sound and writing a word - understanding what someone is saying word by word involves a whole lot of context about language and intention. Your phone could do it, for sure, but it wouldn’t be much faster than sending it off to the cloud, and it would eat up your battery. But steady advancements in the field have made it plausible to do so, and Google’s latest product makes it available to anyone with a Pixel.

Google’s work on the topic, documented in a paper, built on previous advances to create a model small and efficient enough to fit on a phone (it’s 80 megabytes, if you’re curious), but capable of hearing and transcribing speech as you say it. No need to wait until you’ve finished a sentence to think whether you meant “their” or “there” - it figures it out on the fly.

So what’s the catch? Well, it only works in Gboard, Google’s keyboard app, and it only works on Pixels, and it only works in American English. So in a way this is just kind of a stress test for the real thing.

“Given the trends in the industry, with the convergence of specialized hardware and algorithmic improvements, we are hopeful that the techniques presented here can soon be adopted in more languages and across broader domains of application,” writes Google, as if it is the trends that need to do the hard work of localization.

Making speech recognition more responsive, and having it work offline, is a nice development. But it’s sort of funny considering hardly any of Google’s other products work offline. Are you going to dictate into a shared document while you’re offline? Write an email? Ask for a conversion between liters and cups? You’re going to need a connection for that! Of course this will also be better on slow and spotty connections, but you have to admit it’s a little ironic.

Google Docs, for instance, added a dictation feature a few years back in the free Google Docs online application. That feature is currently available only if you use the online app in the Chrome browser. Dictation there also supports numerous voice commands that let you select and edit text and move the cursor.
