The first Sci-Fi I ever remember reading was The Hitchhiker’s Guide to the Galaxy. It made a huge impression on me, to the extent that I bought a version of Ford Prefect’s bag to take to school, and made sure I always had a towel in it. Luckily, since I was only about ten, and there were only fifty kids in my whole school, this wasn’t an issue.
In the book, technology is accessible. Eddie, the shipboard computer, has a personality and responds to voice commands. He doesn’t always respond by carrying out those commands, but he hears and understands them. Around the same time, we got a Cheetah Sweet Talker module for our BBC Micro B, and we could make it talk!
If you watch the above video, you’ll see that the Sweet Talker was very basic tech, and we had to tell it what to say. Getting the computer to respond to a query was possible, but it would only be a response that you had programmed in. And to a query that you typed.
Many years later, I had a PC running Windows. Somewhere in there was the Windows Speech Recognition system. I still wanted a voice-responsive computer, but the most common thing I wanted to ask for was for it to play my music, to skip a song I didn’t want, or to pause when the phone rang. I couldn’t get the system to START the music player, and once the music was playing, the computer didn’t seem to be able to hear me asking it to pause.
All this was brought to mind last week, when I had an issue at work and went to ask Terry to sort it out. (Because Terry CAN sort things out, that’s why. Every office has a Terry.) He had to send an email, and was irritated by the fact that he had to type it out by hand.
“Wish this computer had Siri,” he muttered. “Then I could just dictate this.”
I was a bit amazed. As a writer, naturally I have tried dictation as a method for getting the stupid words out of my head and onto the screen. Sadly, as soon as the little microphone icon goes red to show it’s listening, my head goes blank, and the screen starts filling up with “er…once…er. I mean… Hang on… No. No. Stop. Delete. Delete. How do you stop this?” I once spent ten minutes yelling at my PC because I kept saying “it” and the software kept writing “eat” or “at”. It sounded like a Monty Python sketch by the end, with me trying every phonetic variation to get the computer to understand me.
You used to have to train the computer to understand your voice. There were some paragraphs you had to read that contained the most common phonemes. I remember an episode of “The Archers” where someone is trying to raise some money, and naturally they decide to become an author (because that’s where the big money is, right?). They’re no typist, so they get dictation software and spend ages reading it “Winnie the Pooh” stories to train it. Trouble is, when they’re reading, they use a BBC “Alexandra Palace” voice, but they dictate in their regular voice.
Direct voice dictation didn’t work for me. The myriad mistakes meant that any time saved on typing was more than used up in editing. For a time, the only thing that I could operate by voice was the function to switch off the computer (but only if I said “switch off”, not “turn off” or “shut down”). Last month, I found this had been removed by Microsoft.
Despite the rude things I said about streaming services last year, we went ahead and signed up for Spotify. (Spock’s rule from Wrath of Khan applies here.) You would think that a system designed to be used with stuff like the Google Home Spying Device would actually work well with voice activation, wouldn’t you? Well, you’d be wrong. I can’t ask for specific songs by artists. Well, I can, but I might get anything from completely different songs by other artists, to recipes. Seriously – I ask the thing to play me a specific song, and it gives me a recipe instead. Is it my English accent? Since I traveled south at age 7, I have had a boring RP accent, and yet the Google Always Listening In Case You Want To Buy Something can’t tell Taylor Swift from Ed Sheeran.
Incidentally, if you say “OK Google, I forbid you to play any Ed Sheeran song ever again” it will reply “Ok. Playing Ed Sheeran on repeat.” Or that’s what it says to ME, anyway.
I’ve whinged before about Microsoft deciding that the PC is their tool for monitoring you, not your tool for doing work, and this is another symptom. Voice recognition is something people expect – you see it depicted in movies, TV shows and comics all the time. We want to talk to our devices in a naturalistic way, and have them respond. But the companies behind them don’t want that. If they did, you’d be able to rename your Alexa or your Google Home, and have it respond to a name of your choosing. It would learn how you speak. It would get the bloody song right, and never play Ed Bloody Sheeran when I’m in the room.
Maybe there’s a brighter future ahead, where I can ask the TV to just find the movie I want, or tell the coffee pot I want coffee at 7am. Right now, I doubt it.