This article was prompted by an excellent blog post by Jack Ward, on how an audio drama can lose an audience. I’ve expanded on those thoughts here to help us look at the slightly different topic of why some audio dramas are so easily forgotten and how this can be avoided by carefully employing Audio Drama’s three tools (dialog, sound effects, and music) and its spotlight (volume) to effectively control the focus of the audience.

microphone by Miyukiko © 2013

“Why can’t I remember what was going on?”

That is a question we don’t want the members of our audience to be asking after they have listened to or read one of our plays.

One of the core benefits, one of the real selling points, of audio drama is that it is so immersive. Audio dramatists have ALWAYS been able to achieve scenes that film-makers, until very recently, have only dreamed of showing to an audience. A million battleships floating in the inky blackness of space, exploding stars, monsters that defy description. Audio drama has been able to bring these things to life. And in ways that are, often, superior even to novels.

Sound is an amazingly powerful medium. Sounds enter the brain directly by a path that bypasses most of the pre-processing that the senses of touch and sight require. Our instinctive responses to sound (such as the jump that follows the sudden “boo” scare) are hard-wired into us for our own survival.

Sound and story together, teaming up as they do with the human imagination, result in a massively immersive experience. So why is it, then, that sometimes we are unable to recall the stories we have heard?

The Michael Bay Effect

For lots of reasons I call this the Michael Bay effect. Take a moment, right now, if you have seen any of the Transformers films, to try and remember the plot of any one of them. Even if you are a huge fan of the franchise in terms of its spectacle and special effects wizardry, you are going to be hard pressed to accomplish this.

Lindsay Ellis, who was recommended to me by one of the great folks over at the Audio Drama Production Podcast Facebook page, has put together a great little video (see the end of this article) that makes some of the points I wanted to cover really clear.

Another great unpacking of this has been done by Every Frame a Painting (also linked at the end of this article).

The short version is that our brains use a variety of cues to identify and sort between the relative importance of what we need to pay attention to. Attention is the key to memory. If we aren’t able to pay attention, we won’t remember. Memory (particularly short-term memory) is also limited. Experiences and events pass through short-term memory in rapid succession and most are discarded. Those we deem important are moved into medium and long-term memory. Of course, this is only successful where we can genuinely distinguish important from unimportant information. In a visual medium like film this is achieved through focus. Whatever has the focus of the camera is important, and cinema uses a variety of techniques to emphasize the relative importance of elements that have the focus.

The thing about Michael Bay’s Transformers movies is this – in his films everything is treated as if it is EQUALLY important and so nothing is separated from the mass as being RELATIVELY more important than anything else. Short-term memory is unable to accommodate the great mass of material it is being presented with and, because information passing through short-term memory must be “flagged” as relatively more important than the passing stream to enter medium and long-term memory, the majority of the material is forgotten.

The Limits of the Brain

How does this apply to audio drama? The cognitive limits of our brains place far more constraints on what we can do in audio drama than any of the limits created by lack of resources and technology in film.

Listen carefully to a wide selection of audio drama and you are going to come across some of the masters of the form. In the modern era some of the big names are, in no particular order, Gregg Taylor (Decoder Ring Theatre), Dirk Maggs (The Hitchhiker’s Guide to the Galaxy v.3-5, Good Omens and many others), John Finnemore (Cabin Pressure), and K.C. Wayland (We’re Alive). From the past, names like Carlton E. Morse (I Love a Mystery), Arch Oboler (The Chickenheart, Johnny Got His Gun, The Devil and Mr O), Lucille Fletcher (Sorry, Wrong Number; The Hitchhiker), Norman Corwin (We Hold These Truths; On a Note of Triumph), Marian Clark, John Meston, and Les Crutchfield (collectively Gunsmoke), are bound to resonate.

All of these have an intimate understanding of the three tools of audio drama: dialog, sound, and music. Dirk Maggs is a much-imitated designer of deep, immersive, and layered soundscapes, the likes of which haven’t been equalled since the days of Gunsmoke. Gregg Taylor has achieved extraordinary things with a relatively minimalist sound palette. Arch Oboler and Norman Corwin were innovators and have a great modern equivalent in the equally innovative sound work of K.C. Wayland. Among many pieces, Lucille Fletcher wrote some of the most famous intimate little tales of suspense ever broadcast. John Finnemore’s gift for situation and sketch comedy, meanwhile, harkens back to the golden age of Fibber McGee and Molly, Our Miss Brooks, and The Great Gildersleeve.


The imitators of these greats, however, have often failed to grasp the structure that underlies their genius. Firstly, these masters chose and wrote great scripts. But beyond that they made good technical choices. Casts were kept to around eight regulars with well-differentiated voices. Our limited capacity to distinguish voices combined with the limits of our short-term memory made it essential that casts be kept small. Scriptwriters and producers who require large casts are putting themselves into direct conflict with the capacity of listeners to maintain a clear conception of each of the characters. Without signature voices, confusion becomes inevitable where the cast extends beyond roughly five adults. The Old Time Radio formula was to cast according to seven easily distinguished vocal types:

Bass            Heavy/Elderly male
Contralto       Elderly female
Baritone        Leading man
Mezzo-soprano   Leading woman
Tenor           Juvenile
Soprano         Ingénue
Treble          Child

This they would then change up with the use of accents and contextual cues if necessary.

I’ve written elsewhere about the dangers and techniques of writing for a large cast so I won’t repeat myself further here. Suffice to say that the more voices we have to distinguish, the greater the strain we place on our memories.

It has become fashionable recently to overlay dialog in order to increase the realism of a scene. But again focus and contrast matter. The way the dialog is constructed is critical. Important information should never be obscured, so the overlaps must occur on top of unimportant words and phrases (and the danger in creating redundant dialog should be obvious). The voices on either side of the overlap must contrast well for this technique to work. Two overly similar voices delivering dialog over the top of each other do not “read” well. If they are too similar the result will be noise, indistinguishable to the listener, and, as a result, more of the story will fail to enter memory.

Sound Effects

Likewise when building soundscapes, many imitators of the greats fail to pay attention to how a good soundscape is actually constructed. They layer sound on top of sound in the mistaken belief that a wall of effects will increase the sense of immersion experienced by the listener. But not every sound is equally important and the auditory focus must shift between the most important elements of the soundscape we are building. This focusing of attention must be purposeful, establishing a sense of time or place for the listener. It must contribute meaningfully to the narrative or it is little more than distracting noise. A soundscape is not immersive simply because it is full of sounds. Gregg Taylor has demonstrated just how effective a minimalist soundscape can be. Sounds are immersive because we can “read” the sounds easily and because they provide cues for our imaginations to build the world.

The capacity of our minds to build entire worlds around minimal cues demonstrates that crowded soundscapes are not necessary for world-building to occur in the imagination of the audience. After all, if I mention a detective’s office, you can build the entire space in your imagination right down to the furniture, outside traffic noises, doors, walls and ceiling, without me saying a further word.

Another important constraint comes from the Reticular Activating System (RAS) in the brain. The reticular activating system is a built-in part of the brain that functions to filter stimuli in order to limit the amount of sensory input we pay attention to. This is a very important element of human perception for audio-dramatists to understand (even if it does seem a little counter-intuitive the first time you encounter it). Again, it’s a matter of focus. The brain shifts our focus naturally, scanning the environment for new and novel stimuli and filtering them on the basis of how important it deems them to be. When we put on our clothes first thing in the morning we are aware of the sensation of cloth against our skin, but our brains quickly filter that out as an unnecessary, unimportant stimulus to be paying attention to. Thereafter the sensation of clothing against our skin becomes, to all intents and purposes, invisible to our conscious selves.

Good audio dramatists filter the environmental noise for their listeners so that only the important things receive focus. In this very real sense, less is more.


Music is another element that must be deliberately focused. Music can establish mood, contribute to the sense of time and place, set the scene, and emphasise both the emotion and drama of a scene. But again, the focus must be controlled so that it doesn’t compete unnecessarily with other important elements of the play. At any given moment, the audience must be aware of what is important. There should be no confusion about whether the audience should be paying primary attention to the swelling music, the dialog, or the sound of approaching shoes on the tiles.

One Important Reason Good Dramas Fail to be Memorable

In an environment where there is too much sound competing for the audience’s attention we create confusion. The audience must work to understand the action and separate the important from the unimportant. The mind doesn’t know which sound elements (dialog, music, sound effects) to pay attention to without the focus very deliberately provided by the audio production. As a result the majority of the audience members’ experience of the play is going to pass through short term memory without being retained.

Regardless of how clever, dramatic, and moving our script is, if the production fails to focus audience attention properly, the play will be confusing and fail to be memorable.

Exceptions to the Rules

I should add that there are times when we actually want to create confusion in the minds of our listeners. We want to create moments where the audience experiences and shares the disorientation of our characters, moments when the story becomes surreal, and moments where the sounds and events are ambiguous and difficult to “read”. When we do this deliberately in order to achieve a specific dramatic effect we are actively giving the focus to the confusing in order to make a specific point. The confusion is, in such circumstances, important – but only for a time. We never want to be confusing when it is not our deliberate intention to be so.

The Techniques that Achieve Focus

By this point in the proceedings, I hope it’s clear why controlling the focus of the audience is so important. Without clear focus, the story is buried under a wall of auditory stimulation that results in it failing to enter our memories.

Three suggestions

Here are three suggestions…

Firstly, dialog, sound and music must be carefully planned and crafted so that each element contributes to the telling of the story rather than obscuring it. One of the early playwrights of the radio age tells a story of being reduced to tears because the orchestra conductor played more and more loudly over her play when it was being performed. When she asked him what he was doing he responded with “why should my orchestra’s beautiful rendition of that piece be missed by the audience just because some people are talking over it?” It can be easy to forget that the brilliant soundscape or score we’ve constructed will actually damage the production if it overwhelms and distracts from the story. We must plan and construct each element of our audio plays so that they receive audience attention only when they are actively supporting the story telling.

Secondly, we also need to use the elements of the drama to constantly keep our audience oriented to what is happening. This means that a good script needs to, when necessary, provide additional dialog to keep characters alive in a scene who might otherwise fade from the audience’s memory, regularly use and reinforce character names, provide exposition to identify and explain ambiguous sounds, emphasise moments of drama with music, draw attention to key sounds at key times, and so on.

That sounds easy enough to say, but how is this done in practice? So long as the fundamentals of a well-planned script are all in place, the control of focus can be achieved very easily (as the next and final point reveals).

Finally, focus is controlled by controlling volume. The spotlight of audio drama, if you will, is volume. We give focus to the most important element of the script in any given moment by our control of volume. We fade the background soundscape under the dialog to give the dialog focus, we fade the dialog under the sound of approaching footsteps to emphasise the importance of the new arrival, we punctuate the discovery of a dead body with a jarring chord of music, establishing and fading sound from foreground to background as required. Volume is the master key to controlling what the audience will focus on. It is essential, therefore, that we are not timid in distinguishing foreground sounds (whether dialog, sound effects or music) from the background. To read clearly, every important sound element must receive clear focus through control of volume and all others must diminish in importance (by fading or halting). At any given moment, only one thing should have the spotlight. Where the focus moves continually, and unambiguously, without competition, from sound effect to dialog to music (even where overlap is used as a means of transition) the play will “read” as a continuous narrative and the human brain will be able to sort and retain it in our memories.
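
For readers who like to see an idea in concrete terms, the fading of a background soundscape under dialog can be sketched in a few lines of Python. This is only an illustration of the principle, not a production tool: the function name duck_background, its parameters, and the gain values are all made up for this example, and in practice the same thing is done with volume automation in whatever audio editor you use.

```python
def duck_background(background, dialog_active, ducked_gain=0.25, fade_samples=4):
    """Scale a background bed so it sits under the dialog (hypothetical helper).

    background    -- list of float samples for the soundscape bed
    dialog_active -- list of bools, True wherever dialog is speaking
    ducked_gain   -- gain applied to the bed while dialog has the spotlight
    fade_samples  -- number of samples a full fade down (or back up) takes
    """
    step = (1.0 - ducked_gain) / fade_samples
    gain = 1.0  # the bed starts at full volume
    out = []
    for sample, speaking in zip(background, dialog_active):
        target = ducked_gain if speaking else 1.0
        # Move the gain toward its target by at most one step per sample,
        # so the bed fades smoothly instead of cutting abruptly.
        if gain > target:
            gain = max(target, gain - step)
        elif gain < target:
            gain = min(target, gain + step)
        out.append(sample * gain)
    return out


# A constant bed with dialog in the middle: the bed fades down, holds, and returns.
bed = [1.0] * 10
speaking = [False, False, True, True, True, True, True, False, False, False]
print(duck_background(bed, speaking, ducked_gain=0.5, fade_samples=2))
```

The point of the sketch is simply that the background never competes with the dialog: while someone is speaking the bed is held at a clearly lower level, and it only comes back up once the dialog has given up the spotlight.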

Do you have other tricks and tips to help the audience to “read” and remember the audio-drama they listen to? If so, post them in the comments below. I’d love to hear what you think.

This article is © 2017 by Philip Craig Robotham – all rights reserved.

Lindsay Ellis on Michael Bay

Every Frame a Painting on Michael Bay