Published on: July 29, 2019
Content Guy’s Note: In a competitive environment where Amazon’s Alexa-based system seems to have created for itself a distinct competitive advantage, how do you compete? I recently heard one eye-opening answer.
In the class that Tom Gillpatrick and I team-teach at Portland State University during the summer, our students had the good fortune of spending time with Jon Stine, who has had a unique 30+ year career working in marketing (for companies like Pendleton Woolen Mills), academia (at MIT) and technology (for companies that include Cisco and Intel). Stine was in class to talk to us about his newest venture, called the Open Voice Network, which struck me when he described it as being a little audacious and a lot appropriate and timely - especially if you are in the business of competing against Amazon, which, let’s face it, pretty much everybody is.
And so, I asked Stine to engage in an email interview for MNB to describe the Open Voice Network, its ambitions and implications, and he agreed. FYI … this e-interview has been lightly edited for clarity.
KC: There seem to be two basic premises behind your development of the Open Voice Network. One is that voice technology is both in its early days and in a time of enormous growth, and the other is that, in terms of commerce, Amazon (and I say this with some admiration) has achieved a rather successful land grab with its Alexa-based system. Is this a fair description, and how did you come to the conclusion that what was required was a different approach - not a series of competitive offerings, but one broad-scale competitor that has a very different operating/business model?
Jon Stine: The originating premises (back in 2016-2017) were these: 1) this AI-voice assistant thing (notably, at that time, Alexa and Siri) is no mere tech toy – in fact, we can see that it will become a primary interface to the internet (and perhaps, THE primary interface to the internet); 2) as that happens, it will affect all consumer-facing industries, and may have a huge, re-shaping impact on commerce, and 3) hmm . . . there’s a lack of interoperability – are these like the browser war days of the early internet?
Then, this question: given the benefits brought by the introduction of standards through the years, and the value of “open” technologies . . . might AI-voice be a place where an open, standards-based approach might provide the greatest economic and social benefit? Might the future of AI-voice be bigger and brighter if AI-voice assistance was interoperable, accessible, and data protected?
A “standards-based” approach suggests specifications that, for instance, would enable interoperability across platforms. Such specifications could be adopted by proprietary platforms, or could be the basis for the development of a completely “open” platform.
KC: Explain the structure of the Open Voice Network - exactly how will it work, how will it be funded, and what kind of access will retailers large and small have to it? And exactly what devices will be employed? A new technology entrant? Google’s existing system? Apple’s?
JS: The Open Voice Network is planned as a not-for-profit association of consumer-facing enterprises, technology companies, marketing firms, and university researchers, dedicated to AI-voice that is “open”: standards-based, interoperable, accessible, and data-protected. It will be funded by its members, governed by a Steering Committee of member representatives, and led day-to-day by an Executive Director.
In addition, it will operate as a Directed Fund of The Linux Foundation; as such, it will have access to The Linux Foundation’s shared administrative, legal, and marketing services, and the collective wisdom of The Linux Foundation’s many successful standards-setting efforts.
KC: Okay, hang on a minute. I’m not the most tech-savvy person in the world, so I want to be absolutely clear about this - exactly what are you building? Is it software? Hardware? Just a loose coalition of companies creating standards? I’m having a little trouble envisioning what the “final” product will look like.
JS: At present, we can envision three outcomes of the Open Voice work.
First, a set of technical specifications (software and hardware) that would be proposed as standards for the AI-voice industry – to be adopted by manufacturers of smart speakers, and more importantly, the developers of voice assistant interfaces and platforms. It’s envisioned that these technical specifications would enable such things as platform interoperability, data privacy, and the global registration of an entity’s name – so that requests for “retailer A” would always go to retailer A.
Second, a reference architecture or reference design which would enable any third party to build an operative, standards-based device, interface, or platform. This is the essence of “open” technology – the core capabilities are known to all, and used by all . . . and commercial offerings are then created by third-party developers, who differentiate their offerings by adding various features and services.
And third, an open AI-voice platform, such as the “Almond” platform recently created by Professor Monica Lam and team at Stanford University. This is being made available to all, and is an alternative to proprietary platforms.
All three outcomes are aimed at the big goal: AI-voice that is interoperable, accessible (i.e., you can be confident of finding the other party you want), data protected, and secure . . . regardless of whether you’re communicating through a proprietary platform or a newly-available “open” platform. If we do our work well, every AI-voice user will have a multitude of choices, and confidence that the system will work as promised – much as it is today with the internet.
KC: You talk about a “zero interface” system … but I’ve always been under the impression from Amazon that you need an interface - with wake words - in order to assure some level of privacy (and, as we now know, the Amazon system isn’t as private as we may have thought). How do you address privacy concerns? Who/what will be the arbiter of standards used in the system?
JS: The “zero interface” is a reference to the fact that, with voice, we no longer type, tap, or swipe – we go back, if you will, to the interface of birth: voice. With all voice interfaces to date, you will – as you point out – need a “wake word” that alerts the assistant to forthcoming commands or requests.
The topic of privacy in AI-voice is a multi-faceted challenge. First, voice is a biometric identifier – I can determine who you are by your voiceprint. Second, AI-voice provides much more than words or phrases; intonation, pitch, cadence, and patterns of hesitation all point to meaning and emotional states. Third, AI-voice in commerce will contain all the information of a T-log, along with sequence of ordering and any questions that may accompany the purchase. We are just beginning to identify and address the ethical, legal, commercial, and technological questions that emerge in the world of AI-voice.
KC: Who are you working with at this point? Are there technology companies, retailers and/or suppliers who have committed to at least testing it?
JS: We have received an encouraging number of sponsorship commitments to date -- from retailers, consumer goods firms, technology firms, marketing agencies, and start-ups in the AI-voice space. Thanks to this show of support, The Open Voice Network expects to officially stand up as an organization this Fall.
KC: Can you imagine a time when Amazon might be accessible through the Open Voice Network?
JS: We can certainly imagine a world in which interoperability, accessibility, and data-protection standards for AI-voice have been adopted worldwide. As part of that vision, we can certainly imagine an Alexa user connecting with assurance (and data protection) to a retail brand of choice that has built its AI-voice assistance system on another platform.
KC: How long will it take from conception to implementation? And what do you need - in terms of money, participation and whatever else may be required - to achieve your timeline goals?
JS: The development and adoption of global technology standards is not measured in weeks or months, but in years. That being said, we must act now – and with speed. The most recent forecasts suggest a world of more than 8 billion operative AI-voice assistants by 2023. Will that world be open or closed? Good for the many or the few?
What do we need? Two things: members and their active participation. Membership brings the financial resources that enable us to invest in the academic and industry research from which standards will emerge. Active participation results in brain-storming, best practice sharing, and the intellectual give-and-take that makes us all better.
Another note from the Content Guy: If indeed it is reasonable to argue that Amazon has successfully managed, through speed of innovation and implementation, to dominate the early e-commerce days of the AI-voice assistants business, then it may also make sense for other companies - on the supplier and retailer sides, as well as in the tech sector - to find ways to collaborate that may give everybody a better chance to compete in this arena. I’m fascinated by what Jon Stine is proposing, and will look forward to tracking the Open Voice Network’s development as it moves forward.
If you want to know more about the Open Voice Network, go to its website, email Jon Stine at email@example.com, or call him at 503.449.4628.