
From the Lab to Ubiquity: Speech Technology’s Road to Mainstream

From the Lab to Ubiquity: Speech Technology’s Road to Mainstream. Eric Chang, Ph.D., Assistant Managing Director, MSR Asia Advanced Technology Center. Impact of Disruptive Technology. 9/3/2004, 6:03 AM, Pacific Green Bay. Human Language Capability. Technology Adoption Lifecycle.


Presentation Transcript


  1. From the Lab to Ubiquity: Speech Technology’s Road to Mainstream • Eric Chang, Ph.D., Assistant Managing Director, MSR Asia Advanced Technology Center

  2. Impact of Disruptive Technology 9/3/2004, 6:03 AM, Pacific Green Bay

  3. Human Language Capability

  4. Technology Adoption Lifecycle • Successful technology adoptions increase exponentially • But the process is not smooth: there is a “chasm”* (*Geoffrey Moore, Crossing the Chasm)

  5. Technology Adoption Lifecycle • Visionaries • Early Adopter • Early Majority • Late Majority • Laggards

  6. Technology Adoption Lifecycle • Visionaries • Early Adopters • Early Majority • Late Majority • Laggards

  7. Visionaries • Technology for technology’s sake • Technology fans • Won’t be a major market

  8. Early Adopter • Adopts technology for its benefits • Not afraid to try something new

  9. Early Majority • Will not be the first to adopt a new technology • Practical and utilitarian • Heavily influenced by what other people are doing • Strong “viral” effect

  10. Late Majority • Will adopt a technology only when necessary • By this stage, a few dominant technology providers have emerged.

  11. Laggards • Suspicious of new technology • Can get along fine without new technology

  12. Discontinuous Technology • Not a simple extension or repackaging of existing technology • Automobiles replacing horses and carts • Telephones replacing telegraphs

  13. Technology Adoption Lifecycle • Visionaries • Chasm • Early Adopter • Early Majority • Late Majority • Laggards

  14. Into the Tornado Visionaries Tornado Early Majority Late Majority Early Adopter Laggards

  15. Technology Adoption Lifecycle • Visionaries: Speech Search & Transcription • Early Adopters: Dictation • Early Majority: Call Center Automation • Late Majority: Leapfrog • Laggards

  16. Technology Adoption Lifecycle • Visionaries: Speech Search & Transcription • Early Adopters: Dictation • Early Majority: Call Center Automation • Late Majority: Leapfrog • Laggards

  17. Example: Visual Audio Notebook

  18. Example: Visual Audio Notebook

  19. Example: Automatic Processing of Voicemail

  20. Technology Adoption Lifecycle • Visionaries: Speech Search & Transcription • Early Adopters: Dictation • Early Majority: Call Center Automation • Late Majority: Leapfrog • Laggards

  21. Early Adopter: Dictation • Continuous dictation first sold in 1996 • Shelfware • English, Chinese, and Japanese dictation in Office XP in 2001 • Current status • Vital for people who need it • No viral effect yet • Difficulties: handling speaker accents, wearing a microphone, individual vocabulary

  22. Language is a live medium • Dalian: 腕狭子 • Japan: WaiShaTsu • England: White Shirt • US: Suit • Japan: SeBiRoo • England: Savile Row

  23. Technology Adoption Lifecycle • Visionaries: Speech Search & Transcription • Early Adopters: Dictation • Early Majority: Call Center Automation • Late Majority: Leapfrog • Laggards

  24. Technology Adoption Lifecycle

  25. The Business Value of Speech for Call Centers • Cost: $5/call to $.20/call; Reduced Call Time; Fewer Agents • Satisfaction: Less Time in Queue; Increased System Usage; Customer Retention • Productivity: Customer Focus; Less Time/Call; Efficient Agents • Revenue: New Revenue Opportunities; Up-Sell/Cross-Sell
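
The cost figure on this slide ($5 per call falling to $.20 per call) can be turned into a quick back-of-the-envelope calculation. The sketch below is only an illustration: the function names and the call volume of one million automated calls are hypothetical and do not come from the presentation.

```python
def per_call_savings(agent_cost: float, automated_cost: float) -> tuple[float, float]:
    """Return (absolute saving per call, fractional cost reduction)."""
    saving = agent_cost - automated_cost
    return saving, saving / agent_cost


def total_savings(calls_automated: int,
                  agent_cost: float = 5.00,
                  automated_cost: float = 0.20) -> float:
    """Total saving for a given number of automated calls.

    The default costs are the slide's $5/call and $.20/call figures;
    the call volume is an assumption supplied by the caller.
    """
    saving, _ = per_call_savings(agent_cost, automated_cost)
    return calls_automated * saving


saving, reduction = per_call_savings(5.00, 0.20)
print(f"${saving:.2f} saved per call ({reduction:.0%} reduction)")

# Hypothetical volume: one million automated calls.
print(f"${total_savings(1_000_000):,.0f} saved on 1,000,000 calls")
```

Under these assumptions each automated call saves $4.80, a 96% reduction, which is why even modest automation rates add up quickly at call-center volumes.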

  26. Call Center Examples (Cost, Satisfaction, Productivity, Revenue) • Merrill Lynch • Automation rates from 82% to 90% • First-year savings of $6.3M • Amtrak • 61% increase in satisfaction • 75% increase in automation rate • 90% increase in ticket sales • Thrifty Car Rental • 40% increase in CSR productivity • $1 million first-year savings

  27. The Business Value of Speech for Operators • The mobile operators need to make money from value-added services! [Chart: revenue in US$M]

  28. Speech Makes Value-Added Services Usable [Diagram labels: Calendaring / Email; Location Based; E-Commerce / Alerts; Places: Auto; Services; Speech; SMS / MMS; Voice Dialing; Search/Browsing]

  29. Microsoft Speech Server & SDK [Diagram labels: ASP.NET Speech; Speech SDK] • Extends ASP.NET and Visual Studio • Call center + multimodal solution • Unifies web & call center • Reduces TCO • Introduced in March 2004 • Strong partner ecosystem • Strategic partnerships with Intel, Intervoice, ScanSoft • 200+ Beta Program Applications • 300+ Partner Applications • 30,000 Speech SDK Users

  30. Speech Recognition: Approaching Human Error Rate • Microsoft licensed CMU Sphinx-II • Whisper in MSR • Speech in Office XP • Speech in Tablet/Office 11 • Speech in Longhorn [Chart reference line: Human Error Rate]

  31. Text to Speech: Approaching Human Naturalness [Chart: axis “Naturalness”; reference line “Human Naturalness”]

  32. Technology Adoption Lifecycle • Visionaries: Meeting Transcription • Early Adopters: Dictation • Early Majority: Call Center Automation • Late Majority: Leapfrog • Laggards

  33. What is Leapfrog? $120 $50 $15

  34. Leapfrog’s Technology • Crucial technology • Speech compression technology • Uses a simple touch-sensitive screen with new books and a speech coding chip • Sound business model • >80% “highly recommend” rate on Amazon • Invented by a lawyer trying to teach his 3-year-old child to read!

  35. Technology Adoption Lifecycle: Late Majority, Case of Leapfrog

  36. Summary • Disruptive technology adoption follows the technology adoption lifecycle • Natural language understanding is hard • Domain-free reasoning & common sense hardest • Truly human-level understanding likely elusive • Adoption of speech technology will increase • 2-3 years: telephony, multimodal, accessibility. • 7-10 years: intelligent assistance, meeting search/transcription, speech everywhere.

  37. Acknowledgement • Faculty Forum Speech, 2003, Kai-Fu Lee • Crossing the Chasm, Geoffrey Moore • Inside the Tornado, Geoffrey Moore • Thanks! echang@microsoft.com http://Research.microsoft.com/users/echang
