Voice Controlled Computing - New Opportunities and a Big Challenge
The Ambility team has long thought, and written, that voice-enabled computing interactions are best when paired with a display screen where content of all types can be served up.
We even derided the Echo and the fictional HAL 9000 as wholly inadequate replacements for a tablet-based voice-controlled assistant. Then we spent some time getting to know the Echo.
We’ve argued more than once that well-designed voice-controlled digital assistants will multiply the use cases that tablets help satisfy. But we learned over the course of a long weekend that ‘Alexa,’ the name you call to summon responses from the Echo, is capable of addressing a lot of common, albeit simple, computing use cases as well as a variety of household tasks that we hadn’t turned to a computer to solve before.
The capabilities – and limits – of the screen-less Echo clearly showed a couple of things: that voice-controlled computing interactions are introducing a wealth of new ways of automating our lives (and providing new data for analysts, marketers, and solutions developers), and that the challenge of linking a voice-controlled digital assistant to our screen-based online interactions has not yet been addressed.
Several times in the past we have discussed the work being done by companies large and small, all with deep pockets, to usher in the age of voice-controlled computing. For the most part we have focused on the advantages consumers would realize from being able to leverage computing power with simple voice commands and the additional use cases connected devices would help satisfy as a result. As we continue to experience voice-controlled devices weaving into our lives, two areas we haven’t discussed have started to come into focus – one an opportunity for insights and the other a challenge for product developers and a barrier to widespread adoption.
Lessons from the Echo
We have written several times about the advantages tablet devices offer over screen-less voice-controlled computers like the Amazon Echo: a screen multiplies the richness of a response, because information, imagery, and videos can be offered in addition to audio. The Ambility team still believes this to be a significant advantage, but we have learned a great lesson from the Echo that we didn’t fully anticipate – that a wealth of use cases can be addressed very well through audio-only responses, and that these audio-based interactions offer brand new areas of insight for service and marketing providers to learn about their audiences.
Another lesson we learned from interacting with the Echo is that Siri and ‘OK Google’ are not really voice-controlled computing platforms. They offer great doorways into web content and experiences, but once you get there you have to rely on tapping and swiping to get what you want.
New Use Cases, New Opportunities for Data
Over the course of a long weekend the Ambility leadership team found themselves turning to the Echo, by summoning ‘Alexa,’ more and more to satisfy simple queries and to help with tasks around the house. Our computers and mobile devices have long helped us settle debates by getting that easy answer, but how many of us turn to those devices to set a timer for the bread we’re baking or to dim the lights before dinner? With the Echo these tasks were easily completed, so by the end of the weekend we had forgotten where the light switches were and never cared to check for a timer in the kitchen.
Beyond those tasks we also turned to the Echo to play music, create a shopping list, and check traffic, but it was the mundane uses of the device to help with dinner and manage the room’s heat and lighting that stood out (Tom’s Guide also identified tuning your guitar and having Alexa act as your exercise coach as good uses of the product). These are tasks that for most people are not completed using connected devices, and therefore have been unobserved by marketers and analysts. As voice controlled devices increase in their application and penetration into modern households, the opportunity (and burden) of harnessing this new data for insights will be vast.
So overall the Ambility team liked the Echo and quickly adopted it for certain needs around the house – to a degree that we never have with Siri or OK Google. Why is that?
The obvious answer is that Alexa was always available. We didn’t need to grab a phone or tablet, hold a button, and then ask for what we wanted; we only had to hail ‘Alexa’ and then make a request. The not-so-obvious answer is that the Echo “interface” is built for an audio-only interaction and does not default to older, tactile mechanisms of interactive experience.
Voice-Screen Interactions Require New UX Standards
Building “always on” capabilities is straightforward enough (Siri allows it when your iPad is plugged in), but enabling audio-only commands that interact with a screen display is a far trickier challenge. Touch screen technology, historians tend to agree, was first developed in 1965 by E.A. Johnson at the Royal Radar Establishment in Malvern, UK, but it would be over forty years before mass audiences would have the chance to adopt it for anything other than highly specific interactions. Apple’s release of the iPhone in 2007 introduced intuitive standards of interaction that developers could then apply to web and application design.
Siri, OK Google, and SoundHound’s new Hound product continue to enhance the ability of our devices to recognize voice commands and provide base-level responses. There’s now even a program for making your laptop respond with J.A.R.V.I.S.-type displays like those Iron Man relies on. For now, though, all of these offerings assume some level of touch- or mouse-based interaction. For example, Siri and OK Google respond to most queries with a standard search results page (SRP) with no way to select a result by voice command. Siri’s voice-controlled messaging functionality works well, but correcting or editing a message can be frustrating unless you resort to tapping and typing.
Big Challenge, Big Opportunity
The Amazon Echo has so far at least demonstrated that voice-controlled interactions have some real usefulness and appeal. Even without a display screen, the provision of always-on audio computing is valuable. But the Echo hasn’t provided a way of navigating the rich and varied offerings the internet is so good at delivering. And a display screen would be a good start.
Tackling that interactive challenge is far more complicated than programming a voice-controlled timer, but the Echo showed us that intuitive, voice-controlled computing solutions will be a welcome addition to consumers’ connected worlds. And the payoff for the company that establishes those standards, the solutions designers who leverage them, and the analysts looking for more insights into their target audiences will be massive.