
Guidelines For Designing With Audio


  

As we’ve seen, audio is used as a feedback mechanism when users interact with many of their everyday devices, such as mobile phones, cars, toys and robots. There are many subtleties to designing with audio in order to create useful, non-intrusive experiences. Here, we’ll explore some guidelines and principles to consider when designing with audio.

While I won’t cover this here, audio is a powerful tool for designing experiences for accessibility, and many of the guidelines discussed here apply. Both Android phones and iPhones already have accessibility options that enable richer experiences with gestural and audio input and audio output.

First, who designs audio? Certainly, the audio producers and game designers who bring gaming to life. There’s also the world of voice user interface designers — those who design interactive voice response telephone systems for banks, airlines, etc. Then there are mobile, toy and interaction designers who have some of this expertise or who work closely with audio engineers and producers to create the right experience for their devices.

If audio might play a part in your design, here are some considerations to make once you have determined that the user’s device has a speaker and can play audio, and is either network-connected or has enough memory to store audio on the device.

Audio Design Guidelines

Choose the Right Type of Audio

Audio can be non-verbal sounds, sometimes called “earcons,” or words, sometimes called prompts, and choosing the right type is important. Meaning can be embedded in an earcon in such a way that a short, non-intrusive sound represents something much larger. Think of the sound that confirms that a text message has been sent on an iPhone: the sound effectively represents the action by suggesting motion and movement away from the user. Another example is the parking-assist system in a car; the intensity and pitch of the sounds create a sense of urgency to let the driver know their distance from the nearest car.

Embedding meaning in a single sound allows for quick and efficient feedback; sounds are shorter than verbal prompts and can be less intrusive. The AOL email notification “You’ve got mail” is a great example of the opposite — an incredibly annoying notification that makes most of us want to throw a hammer at the computer. (But if the AOL sound has made you nostalgic, check out “13 Tech Sounds You Don’t Hear Anymore.”)

But only so much information can be embedded in a sound. Sometimes words are the best way to communicate an idea. If that is the case with your product (say you are delivering instructions, alerts or dynamic information such as turn-by-turn navigation), then there are ways to design these smartly. You’ll also need to consider whether to localize the experience, with all of the implications that entails. A talking toy sold in multiple countries will probably need to have audio feedback in the language of each country, and this will require some thought on the scalability of audio feedback.

Embed Meaning in Audio Earcons

So, how can sounds be designed in such a way that the user intuitively knows what they mean? Some research is out there to guide novice earcon designers, such as the work done by Blattner et al. in “Earcons and Icons: Their Structure and Common Design Principles” (PDF). Blattner comments on W.W. Gaver’s mappings of earcons into symbolic, nomic and metaphorical sounds:

Symbolic mappings rely on social convention such as applause for approval, nomic representations are physical such as a door slam, and metaphorical mappings are similarities such as a falling pitch for a falling object.

Blattner goes on to say that if a good mapping can be found, then the earcon will be more easily learned and remembered. Earcons that take advantage of pre-existing relationships enable users to associate sounds with meaning with minimal or no training.

Designing sound is complex, and audio designers will want to consider pitch, timbre, loudness, duration and direction to create the right sound. For details on how these should be considered in earcon design, consult “Auditory Interfaces: A Design Platform” (PDF).

Design in Context

Whether you are designing earcons or prompts, consider the particular context of the user, both physically and emotionally. If you are designing audio instructions or information, consider these factors:

  • Is there a way to differentiate between a novice user (i.e. someone who needs more hand-holding) and an expert user? This could be done by keeping track of the number of interactions that the user has with the device, and tailoring an audio experience for first-time users, while playing shortened prompts to expert users.
  • If the device has a screen, do you know whether the user will rely on visual feedback to complete their task? If so, audio might be a secondary feedback mechanism or might not be needed at all. Audio could be tailored specifically for these situations by playing less or different audio. Knowing where the device is in relation to the user could be done with certain sensors or accelerometers or derived from how the interaction was initiated. For example, if an interaction with Siri on the iPhone 4S was initiated from a Bluetooth headset, then the user’s phone is likely not available for visual feedback, so providing rich audio feedback becomes essential.
  • Many other contexts warrant tailoring the audio experience. With GPS, for example, you can determine whether the user is driving (using their speed). Sometimes the current state of the device is relevant and can indicate the proximity of the user or their level of engagement: Is the user listening to music? Have they recently interacted with the device? Have they swiped their credit card? Etc.
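
The novice-versus-expert idea in the first point above can be sketched in a few lines. This is a minimal, hypothetical example: the prompt texts, the `set_alarm` action and the threshold of three interactions are all illustrative, not taken from any real product.

```python
# Tailor prompt verbosity by counting how often the user has interacted.
PROMPTS = {
    "set_alarm": {
        "novice": ("To set an alarm, say the time you'd like to wake up, "
                   "for example 'seven thirty A M'."),
        "expert": "Alarm time?",
    },
}

NOVICE_THRESHOLD = 3  # interactions before the user is treated as an expert


def choose_prompt(action: str, interaction_count: int) -> str:
    """Play the verbose prompt for first-time users, the terse one for experts."""
    level = "novice" if interaction_count < NOVICE_THRESHOLD else "expert"
    return PROMPTS[action][level]
```

In practice the interaction count would be persisted per user, and you might also let expert users barge in over the shortened prompt.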

Consider the “Non-Use Cases”

Designers always talk of use cases, but for devices that “talk,” it is also important to be aware of the non-use cases: situations in which playing audio wouldn’t make sense. Alerts or information shouted out from a device with no warning or context can be alarming. The example below shows a moving walkway that repeats its warning over and over, even when no one is nearby.

You will often want to give the user control over whether to play audio at all, through the settings. For example, on a Windows Phone, a user can set whether an incoming text message is read aloud automatically only when connected to a Bluetooth headset, when connected to any headset, always or never.
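
A setting like the Windows Phone one described above boils down to a small decision function. The sketch below is an assumption about how such a policy could be modeled — the enum names and function signature are invented for illustration, not taken from the Windows Phone API.

```python
from enum import Enum


class ReadAloudSetting(Enum):
    """When an incoming text message should be read aloud."""
    BLUETOOTH_HEADSET_ONLY = "bluetooth_only"
    ANY_HEADSET = "any_headset"
    ALWAYS = "always"
    NEVER = "never"


def should_read_aloud(setting: ReadAloudSetting,
                      bluetooth_headset: bool,
                      wired_headset: bool) -> bool:
    """Decide whether to speak a message, given the user's setting and
    which (if any) headset is currently connected."""
    if setting is ReadAloudSetting.ALWAYS:
        return True
    if setting is ReadAloudSetting.NEVER:
        return False
    if setting is ReadAloudSetting.BLUETOOTH_HEADSET_ONLY:
        return bluetooth_headset
    # ANY_HEADSET: either connection type counts
    return bluetooth_headset or wired_headset
```

Defaulting such a setting to the most conservative option (never, or headset-only) avoids the embarrassing private-audio-in-public scenarios discussed later.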

It’s Not Just What You Say But How You Say It

Designing prompts is part art and part science. Many good speech-recognition and voice user interface design books are out there with details. We’ll look at one example here and some of the problems with the design. Taken from an early version of the Ford Sync’s in-car speech recognition, this audio clip instructs the driver on how to ask for a particular music artist, but it does it very poorly; the pace, voice and grouping of words are just not clear enough.

Some design guidelines:

  • Use language that users understand. Stay away from lingo, jargon and technical terms that would make sense to the company but not to the end user.
  • Do not overload the user with too much information at once.
  • Limit the number of audio menu options. Audio is linear, time-sensitive and transient, unlike the Web and other visual feedback media in which users can take time to read, process and select. Research has shown that remembering more than five options from an audio menu is hard. Users will often listen to all choices before picking one, so a long list will limit their ability to remember them all.
  • When writing prompts that require users to make a choice, structure them so that the menu option comes before the action; for example, “For y, press x,” instead of “Press x for y.” The user will more easily identify the option they want and then listen attentively for the action.
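
Two of these guidelines — option-before-action wording and the five-option limit — can be encoded directly in the code that generates menu prompts. A minimal sketch, with an invented `menu_prompt` helper:

```python
MAX_AUDIO_OPTIONS = 5  # research suggests longer audio menus are hard to remember


def menu_prompt(options: list[tuple[int, str]]) -> str:
    """Render an audio menu with the choice named before the action,
    e.g. 'For departures, press 1.' rather than 'Press 1 for departures.'"""
    if len(options) > MAX_AUDIO_OPTIONS:
        raise ValueError("audio menus become hard to remember past five options")
    return " ".join(f"For {label}, press {key}." for key, label in options)
```

For example, `menu_prompt([(1, "departures"), (2, "arrivals")])` yields “For departures, press 1. For arrivals, press 2.”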

Decide Between Recorded Prompts and Text-to-Speech

Another decision to make is whether to prerecord the audio with a voice actor or use text-to-speech (TTS). Prerecorded audio provides the most natural reading of text in most cases, but there are many considerations to make before implementing it. How many things must be recorded? Will the audio content change? How much storage is available?

Over the years, TTS has improved dramatically and in some cases does a great job of reading text aloud. TTS engines should be evaluated based on the task at hand: Are multiple languages needed? Multiple voices? Is the type of information to be read back specialized? Evaluating various implementations is also important: Is the device connected, in which case the TTS engine could be cloud-based, or will the TTS engine need to be embedded in the device? Reactions to TTS vary; some users say that TTS impairs the experience so much that they avoid using it, while others barely notice it.
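
The two approaches are not mutually exclusive: a common pattern is to play a studio recording when one exists and fall back to TTS for dynamic or unrecorded content. A hypothetical sketch (the prompt catalog, file paths and `tts` callable are all assumptions for illustration):

```python
# Map of prompt IDs to studio-recorded audio files, where available.
RECORDED_PROMPTS = {
    "welcome": "audio/welcome.wav",
}


def render_prompt(prompt_id: str, text: str, tts) -> tuple[str, str]:
    """Prefer the studio recording when one exists; otherwise synthesize
    the text with the supplied TTS function."""
    path = RECORDED_PROMPTS.get(prompt_id)
    if path is not None:
        return ("recorded", path)
    return ("tts", tts(text))
```

This keeps the most frequently heard prompts natural-sounding while still handling unbounded dynamic content such as names or email bodies.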

Here are two examples:

TTS Email

Recording Prompts

If you are able to record all prompts with an actor, choose a voice and personality that fit your brand and the experience. It’s best to recruit talent with a personality in mind and to have them record a representative script to evaluate how they would come across in the device.

There are many subtleties to be aware of when recording prompts. Voice user interface designers spend time directing voice actors to make sure that the prompts elicit the right spoken response from users. The following prompt can mean different things depending on how it’s read: “Would you like departures <pause> or arrivals?” would steer users to say “departures” or “arrivals.” A slightly different reading, “Would you like departures or arrivals?”, could be misinterpreted by users as requiring a yes or no response.

Prompts can be recorded even when some of the prompts need to change dynamically, such as when reading back the time or a phone number. In these cases, you would record shorter prompts and then concatenate them together during playback. To make these readings sound natural instead of robotic, record as large a chunk of the prompt as possible.
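
Concatenation of recorded chunks can be sketched for the time-of-day example. The chunking scheme and file names below are hypothetical; the point is that recording one large clip per minute value (10–59) sounds more natural than stitching together individual digits.

```python
def time_clips(hour: int, minute: int) -> list[str]:
    """Return the ordered list of recorded clips to concatenate for a
    time reading, preferring the largest recorded chunks available."""
    clips = [f"hour_{hour}.wav"]
    if minute == 0:
        clips.append("oclock.wav")                  # "seven o'clock"
    elif minute < 10:
        clips += ["oh.wav", f"num_{minute}.wav"]    # "seven oh five"
    else:
        clips.append(f"num_{minute}.wav")           # one recording per value 10-59
    return clips
```

Here `time_clips(7, 45)` plays just two recordings (“seven” then “forty-five”), so the talent’s intonation within each chunk is preserved and the result avoids the robotic digit-by-digit effect.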

Summary

The most important consideration when designing with audio is to ensure that it enhances the experience and does not interfere or distract. If you are considering designing with audio, hopefully you are now armed with some helpful information to get you started on designing a great experience.



© Karen Kaushansky for Smashing Magazine, 2012.


Designing With Audio: What Is Sound Good For?


  

Our world is getting louder. Consider all the beeps and bops from your smartphone that alert you that something is happening, and all the feedback from your appliances when your toast is ready or your oven is heated, and when Siri responds to a question you’ve posed. Today our technology is expressing itself with sound, and, as interaction designers, we need to consider how to deliberately design with audio to create harmony rather than cacophony. The cacophony is beautifully captured in Chris Crutchfield’s video, in which he interprets the experience of receiving email, SMS texts, phone calls, Facebook messages and tweets all at the same time:

In this article, we’ll explore some of the uses of audio, where we might find it and when it is useful. This is meant not as a tutorial but rather as a discussion of some basics on using audio feedback.

Audio is a form of feedback that can be used either in combination with other forms, such as haptics, visual displays and LEDs, or on its own. We have to weigh several factors when designing feedback mechanisms: the scenario, the device and the interaction, where and how the device will be used, whether the user has a screen or display, whether the device has physical buttons or a touchscreen, where the user is relative to the device, and so on.

Ear buds
(Image credit: Fey Ilyas)

For every action of the user, a good experience will include feedback that the action has been registered; for example, pressing a number key on a mobile phone would play a sound and show the number being pressed. Audio is particularly useful when there is no screen or when looking at the screen is not possible or not desirable (such as when users want to multitask). It’s interactive, creating a dialog with the user. It is also particularly good at providing feedback as “shared audio,” a form of feedback that reaches multiple people at once, such as a PA system or a citywide emergency warning system.

Audio is not always warranted. Something that makes noise repeatedly when other feedback would suffice is annoying. Audio that is private and intended for you but is heard by others is embarrassing, such as when your phone rings and announces a “Call from Sexy Neighbor.” Audio design has many ins and outs, but let’s start with some common uses of audio feedback.

Where We Find Audio

Many of us who work on interaction, mobile, device or game design have already discovered the importance of designing audio — audio is everywhere.

Mobile

Much of the Web is moving to mobile, which of course entails smaller screens and people on the go. But besides creating mobile-specific websites, there are ways to augment the mobile experience with audio when people aren’t looking at or can’t interact with the screen. A great example is GPS and turn-by-turn navigation systems that speak directions (either as part of a dedicated device or from a smartphone app). While audio isn’t yet native to mobile websites and apps, it is native to smartphones to indicate new email, incoming text messages and calendar events.

Gaming

For those who play video games, audio is integral to setting the mood, environment and situation, and it engages the user tremendously. First-person shooter games such as Halo and Call of Duty rely on audio feedback to show cause and effect — for example, the sound of a gun shooting and the moment of impact on the enemy. Or consider Wii Sports: the smash of the ball in tennis, the crack of the bat in baseball, and the cheer of fans all help to blur the line between the very physical game and the digital world.

Consumer Devices

As more appliances become smarter and connected, they might have more to say. Today a set of beeps tells you that the refrigerator door is open, but in the future you can expect notifications that the milk has gone bad or that you need to pick up eggs if you want to make that cake for your spouse’s birthday on Tuesday.

More and more of our everyday devices use audio feedback: a Bluetooth headset tells you who is calling, Nike+ tells you your current distance travelled and pace, and cars beep to help you park.

Speech Recognition and Robots

Voice interaction such as Siri’s is revolutionizing the way people interact with their iPhone and will help to change future interactions with all devices and information sources. People are beginning to talk to their devices and expect some audio feedback in return. Siri is just the start; we’re starting to see speech recognition in Xbox Kinect, Samsung TVs and more. Audio feedback is a natural way to let the user know that the system or device has heard them, is processing their request and so on.

Think of your favorite robots — HAL, Wall-E or any of the personal robotic devices that are emerging. These robots are developing human characteristics, with sounds being one of the strongest ways to deliver emotion. Leila Takayama of Willow Garage has talked about the “design challenge in communicating internal robot states and requests to effectively reach the robot’s assigned goals.” Willow Garage has created a set of sound libraries for communication between people and robots that might help make robots “more appealing.” Then there are other robots that speak English and other languages, such as the new Autom weight-loss coach. Studies have shown that people who use Autom stick with their diet and exercise routines for twice as long as people who use traditional weight-loss methods, perhaps partly because of its human-like interactions.

Why Use Audio?

There are numerous principles to determine why and when to use audio in designing interactions for devices. Being conscious about adding sound to a device is the first step in designing it right. The point is to do it deliberately, not as an afterthought, so that the audio means something and is not annoying. Here are some of the many scenarios in which you should consider using audio.

Instructions and Information

Audio is used to give instructions, especially where there is no screen or where looking at a screen would be difficult, unsafe or impossible. Again, think turn-by-turn directions. Or it can be used to augment visuals. The parking machine below obviously has visual instructions for entering a credit card, but they weren’t sufficient to get people to enter it correctly.

Audio can be used to offer information, either when no screen is available or when certain details would be better captured as audio. The Jambox by Jawbone tells the user when they need to recharge the battery. The Leapfrog LeapPad takes this one step further by specifying the type of batteries it needs!

Feedback and Interaction

As mentioned, audio is used as a feedback mechanism when the user takes action. This could be feedback for when the user pushes a button, such as when turning on a Jambox speaker, or to tell a driver that they are getting close to a parked car.

It’s also used to allow for interaction and conversation with our devices. We’re used to interacting with speech-recognition systems when we call an airline or a bank, and now sending a text message with your voice from a Windows phone is just as easy. The audio from these services and devices creates a dialog that enables users to get things done.

Personalization and Customization

Audio allows for personalization of a device, helping to engage users and create an emotional attachment. Siri learns its user’s name and uses it in its replies, adding a personal connection to the interaction. Garmin and TomTom let users download all kinds of voices to their GPS devices, from Bert and Ernie to Star Wars characters to Kitt from Knight Rider, with the goal of creating more engaging experiences. Jawbone device owners can download different languages and characters to their Bluetooth headset and speaker, with the device volunteering such responses as, “A bombshell is whispering in my ear… And yes, I’m blushing.”

Audio also helps to establish personality and to humanize a device. Ford and other electric vehicle manufacturers are dealing with a proposed bill that would require electric vehicles to make some sound to ensure pedestrian safety. Ford has asked the public to vote on four different sounds that would essentially shape the personality of its cars. Here’s one of them:

In another example, to show the power of talking devices, Radio Lab reported on an experiment that pitted a hamster against a Barbie doll and a Furby (the popular furry electronic talking robot) to see how long kids could hold each of them upside down. While all five kids in the experiment could hold Barbie upside down “almost forever,” they treated Furby much more like the living hamster than the Barbie. Why? Well, when you hold Furby upside down, he says, “Me scared,” giving human-like characteristics to the toy. The kids said afterwards that they “didn’t want him to be scared.”

Conclusion

Audio is everywhere, and there are good reasons to use it: to instruct, enhance and engage and to personalize experiences. But if poorly designed or used inappropriately, it can detract from the experience and be annoying. We’ve covered the why and the where of audio. Next time, we’ll review some guidelines and principles on the ins and outs of designing with audio.




