Pandora team collaborates on voice mode
May 01, 2020
8 MIN READ

How Pandora Used Voice AI to Create a Frictionless Music Experience

By Karen Scates

Pandora’s reputation for providing exceptional and highly personalized music experiences began with their Music Genome Project in 1999. The project, built on what Pandora describes as the most sophisticated taxonomy of musical information ever collected, classifies Pandora’s music library and powers a recommendation engine that enables music discovery based on a subscriber’s preferences and listening history.

By using this incredible wealth of musicological data, Pandora understands and responds to each individual user’s unique taste in music. In effect, they created a truly personalized listening experience many years before other brands jumped on the customization bandwagon. 

To this day, the company focuses on the same core value: providing effortless, personalized, lean-back listening experiences. With this in mind, about a year ago Pandora launched Voice Mode, a voice-enabled user interface for their app. Powered by our Houndify™ Voice AI platform, Voice Mode features the custom wake phrase “Hey, Pandora…,” which lets users control and continually refine their listening experience just by speaking naturally to the app.

“With voice mode, we were looking to bring intuitive hands-free access to Pandora. Some of the primary use cases we were trying to solve were the situations where you’ve got your hands full—you’re driving or you’re in the kitchen.” 

Chris Beale
Director of Engineering, Pandora Search and Voice

Over the last 10 months, Pandora reports seeing “really high engagement,” a testament both to the growing adoption of voice and to the experience the team built. As results have come in and the team has learned more about best practices and what users like, they have continued to enhance Voice Mode.

Recently, the team’s efforts were recognized when Pandora’s Voice Mode, powered by Houndify, was chosen as the winner of the 2020 Webbys in the Best Branded Voice Experience category. The award was an honor for both Pandora and SoundHound Inc.

The challenges of voice-enabling a music discovery app

For the team at Pandora, adding Voice Mode to their music discovery app was about more than the ease and convenience of a hands-free experience. They wanted to solve some of the greatest challenges their listeners face, such as figuring out what music they want to listen to and deciding which genre will suit their current mood. The team focused on the context of the listening experience and tried to understand how people request music in various listening situations.

“We always keep Pandora’s core principles as our North Star, which is to keep it friction-free and effortless.”

Ananya Sharan
Product Manager, Pandora Voice Mode

Leveraging Pandora’s personalization and recommendation skills, thinking about the situations where people would use a voice interface, and focusing on bringing a great experience to users helped the team prioritize which features and functionality they would build into the voice assistant.

Keeping the context of the user in mind, the team created a voice interface that offers a natural, conversational way of discovering and listening to music in a variety of settings. They wanted listeners to be able to ask for music to suit their mood or context just by speaking naturally. Users can make verbal requests like:

  • “Play me some focus music” 
  • “Play me something for getting ready in the morning” 
  • “Play me something for cooking”
  • “Play some workout music”
  • “Play me something relaxing” 

or even

  • “Play me something different”

Designing the product to address the most likely scenarios for their listeners helped Pandora prioritize what they wanted to build. 

“One of the goals of voice mode was to take some of that burden away and leverage our strengths and recommendations, to find the perfect thing to play based on your mood.”

Chris Beale

Director of Engineering, Pandora Search and Voice

As users began testing Voice Mode, the team faced new challenges. People were asking for things in ways the team had not anticipated, like asking for music to listen to while mowing the lawn. Based on user data, they quickly adapted and improved their models to generate better results for even broader requests.

Voice AI is the new frontier for Pandora user experiences

Before Voice Mode, millions of listeners were already enjoying Pandora on their smart speakers, but the skill was somewhat limited in the functionality it provided. Because playing music is by far the number one use case for smart speakers, Pandora wanted to bring this capability into the native app and capitalize on the rich personalization experience, wealth of listener data, and world-class algorithms already in place. The decision to implement a voice user interface was a natural extension of these efforts.

“We wanted to make the voice mode experience as easy as possible to use. As users are becoming more accustomed to voice assistants, they are shifting from using plain commands to being more conversational and speaking as if they were talking to a friend. And this was probably the biggest challenge: being able to meet user expectations when they’re talking to a voice assistant.”

Vito Ostuni

Manager of Science and Principal Scientist, Pandora Search and Voice

The team recognized that, with a voice interface, they could deliver personalized playlists to their listeners at any time and anywhere. Listening to music is a natural use case for a voice interface because it’s something people do while they’re doing other things: commuting, working, studying, cooking or working out. Using Voice Mode, Pandora listeners can simply ask their phones for music on the go instead of tapping on a screen.

Voice AI is also helping Pandora innovate in other ways. The company has recently started serving some of its users voice-enabled ads from brands like Doritos, Wendy’s, Nestle and Comcast. Voice recognition software analyzes the user’s verbal response to the ad and then generates the right follow-up message, or additional ads, based on the user’s interest and intent.

“Voice is going to be the new frontier for user interfaces. And it has to be conversational. It has to be easy. It has to be personalized, and that’s really what will drive the adoption.” 

Ananya Sharan

Product Manager Pandora Voice Mode

Voice ads are still very new, but Pandora already envisions a world where listeners and brands will be able to have “conversations” via this new ad format.

Just say “Hey Pandora” for a better user experience

To make sure the experience remained truly hands-free, the team at Pandora created the custom wake word, “Hey Pandora.” Users can say the branded wake word to invoke Voice Mode, eliminating the need to look at the phone and tap the mic to start a search.

The team envisioned the voice interface as a friend and companion for their users, an effortless way for listeners to discover and enjoy music and spoken content. The more a listener uses the app, the more Pandora learns about their listening habits. Machine learning and AI then automate the process, so the app can deliver the right result at the right time.

“Personalization is our secret sauce. We use it everywhere in the app as much as we can. You can just go to search and start a station and we’ll use the Music Genome Project and listener feedback data to power that experience and personalize it for you. Basically, we are tuning that station experience to your taste.”

Vito Ostuni

Manager of Science and Principal Scientist, Pandora Search and Voice

While designing for personalized experiences, the team at Pandora realized that there are many different ways a user could ask for something. They had to design the interface to work with different users, different speech patterns, and the variety of ways people can ask for the same thing. 

The right voice AI technology partner

Natural language is full of ambiguity, which makes understanding it one of the hardest challenges in AI. Houndify’s Speech-to-Meaning helped the Pandora team resolve many of the challenges of understanding natural language and delivering fast and accurate results, enabling the frictionless experience Pandora listeners have come to expect.

Combining Houndify’s advanced voice AI technology and Pandora’s Music Genome Project data created a unique and seamless listening experience.

Within the Music Genome Project, Pandora has identified more than 400 attributes for each song. The resulting vocabulary is large and rich, which adds to the challenges of a voice interface. For example, the artists Fish (with an “F”) and Phish (with a “PH”) sound the same when spoken, so it’s very hard for a voice assistant to tell the two apart on its own.

Coupling the power of Pandora’s existing personalization architecture and Houndify’s natural language understanding allowed the app to deliver the right artist and songs based on the listener’s previous sessions.
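
As a rough illustration only (this is not Pandora’s or Houndify’s actual implementation), the idea of breaking a tie between sound-alike artists with a listener’s history might be sketched in Python as follows. The function name, the inputs, and the sample data are all hypothetical.

```python
from collections import Counter


def pick_artist(candidates, play_history):
    """Return the sound-alike candidate the listener has played most often.

    candidates: artist names that sound the same when spoken
        (hypothetical output of a speech recognizer).
    play_history: artist names from the listener's previous sessions
        (hypothetical data).
    Ties, including the no-history case, fall back to the first candidate.
    """
    plays = Counter(play_history)
    # max() returns the first maximal item, so candidate order breaks ties.
    return max(candidates, key=lambda name: plays[name])


# Hypothetical listener who mostly plays Phish.
history = ["Phish", "Grateful Dead", "Phish", "Fish"]
print(pick_artist(["Fish", "Phish"], history))
```

A production system would presumably weigh many more signals than raw play counts; the sketch only shows why previous sessions can settle a request that the audio alone cannot.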

“Music’s a really complex domain full of uncommon spellings, names which people find difficult to pronounce, and people can ask for the same thing in very, very different ways…. SoundHound’s capabilities in automatic speech recognition and natural language understanding made it really easy for us to build a solution that we could fit into our existing architecture, enabling us to build a platform that we could leverage across multiple different device implementations.”

Chris Beale

Director of Engineering, Pandora Search and Voice

The team at Pandora hasn’t stopped innovating. From the time they first began building a consumer-facing voice interface, they’ve kept a close eye on the data to find insights about how their listeners use voice and to keep improving the customer experience. They continue to iterate and build features based on user data and feedback, and as a result they’ve kept delivering the easy, effortless, lean-back listening experience that made Pandora famous.

Karen Scates is a storyteller with a passion for helping others through content. Argentine tango, good books and great wine round out Karen’s interests.
