I've had my current Android smartphone for several years but have never tried its voice recognition features. I did try the feature on my prior phone, but after a few frustrating attempts to have it call home when I said my wife's name, I deemed it useless to me; it was too aggravating to have the phone say something like "Did you say..." followed by something entirely unrelated. Though I might eventually get it to dial our home number, the process took longer than just typing in the phone number myself. While driving home today, though, I heard an NPR report on a recent study, conducted jointly by researchers at Stanford University, the University of Washington, and the Chinese search engine company Baidu, that pitted humans typing on Apple iOS keyboards against Baidu's speech recognition software. After hearing it, I'll see how well the speech dictation software works on my current phone. The results of the study can be found on a Stanford University site at Speech Is 3x Faster than Typing for English and Mandarin Text Entry on Mobile Devices.
For the speech transcription method, the researchers used Baidu's Deep Speech 2 deep learning speech recognition system. The software converted the spoken input to text, and the study participants could then correct recognition errors either by speech or with the smartphone's keyboard. That method was found to be three times as fast as relying solely on the keyboard for English and 2.8 times as fast for Mandarin Chinese. And, strikingly, the English error rate was 20.4% lower, and the Mandarin error rate 63.4% lower, than with the keyboard method. I don't know Mandarin, but a 20.4% lower error rate for English is significant.
I took a typing class in high school, back when typewriters were still common, after the typing teacher said it would be useful for typing papers for those of us who hoped to go on to college. I bought a cheap typewriter in college but didn't use it much. Instead, I had most of the papers that needed to be typewritten typed by a local high school teacher who, as a side business, typed papers for the nearby university's students at a nominal cost. She also proofread the papers, correcting spelling and grammar errors, which I felt was worth the cost, since those papers would be an important part of my grade. But though I didn't use the skills I learned in that class much for typing papers, I found them invaluable for the many computer courses I took later.
So I'm a fairly fast typist on a full-size keyboard, but I'm very slow on the tiny keyboard on my phone and am akin to the sloth, Flash, in the movie Zootopia when compared to some of my nieces. One of my nephews bought a phone for his younger sister, but had to quickly change the text plan when she sent 3,000 text messages in one month. For her, typing text messages might be faster than using the voice recognition feature on her phone. Even though the voice recognition software on my phone is doubtless far less powerful than Baidu's Deep Speech 2, I'll try that feature, since I would also expect it has improved in the years since I first tried it on a prior phone. As Baidu chief scientist Andrew Ng noted, "Humanity was never designed to communicate by using our fingers to poke at a tiny little keyboard on a mobile phone. Speech has always been a much more natural way for humans to communicate with each other."
Ng also said he looks forward to the day when his future grandchild comes home and asks, "Is it really true that when you were young, if you came home and you said something to your microwave oven — did it really just sit there and ignore you? That's just so rude of the microwave." As we move further into the Internet of Things (IoT), I expect that people talking to their microwaves and other household appliances will become common.
A text version of the NPR report, which was broadcast on All Things Considered, is available at Voice Recognition Software Finally Beats Humans At Typing, Study Finds.