The Deaf Writer’s Guide to Sound and Silence in Audio Drama

Image: microphone by Miyukiko © 2013

As a person with hearing loss (deaf in one ear), I always feel a little awkward sharing thoughts on sound. After all, there are whole bands of sound frequencies available to everyone else that I can't detect, so who am I to express an opinion? However, my inability to detect certain pitches does not, I think, prevent me from making some basic observations regarding sound from an audio-script-writer's point of view – and in my imagination, and as I write on the page, my hearing is as good as anyone else's. 😀

Audio engineers and sound effects artists have expertise that I, as a writer, do not.  But my script should aim to provide these artists with appropriate guidance as to the way sound should be used in the telling of my stories.  Some of this guidance will be ignored (in favour of that expertise) but my own expertise as a storyteller is only enhanced when I understand and can use the spatial, psychological, and auditory qualities of human experience (hearing in particular) to more effectively tell my story.

When thinking about audio drama, it's common for a writer to focus on speech (dialogue and narration), sound effects, and music. After all, these are where the storytelling happens.

However, that is not all there is. Speech, SFX, and music might form the canvas upon which we paint our stories, but a fourth (if somewhat abstract) element is essential to our understanding and our writing: volume.

Volume

Volume is audio drama’s spotlight.  That which is loudest is most important.  Therefore, by controlling the volume, the audio dramatist controls and directs the attention of the audience.

A common mistake, even among experienced dramatists, is failing, in production, to effectively separate background from foreground. An immersive soundscape that is not sufficiently distinguishable from the dialogue becomes an inaccessible and confusing jumble. Where everything is equally loud, our brains attempt to treat everything as equally important. And where everything is equally important, nothing is important, and the result is confusion.

From a writer’s point of view, the spotlight is controlled through the use of sound directions such as ESTABLISH, FADE UP, CROSS FADE, FADE OUT, FINISH, SOFT, LOUD, AT A DISTANCE, DEPARTING, APPROACHING etc.  Newcomers to audio drama sometimes fail to realize that the most effective use of sound cannot be accomplished without a little understanding of how our brains work when hearing and listening.
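To give a concrete (and entirely invented) sketch of how a few of these directions might sit on the page, here is one possible layout; formatting conventions vary from production to production, so treat this as illustrative rather than standard:

SOUND: ESTABLISH BUSY STREET - TRAFFIC, FOOTSTEPS - THEN FADE UNDER

TOM: (APPROACHING) Sorry I'm late, Mary.

MARY: (SOFT) Keep your voice down. He's watching us from across the road.

SOUND: CAR APPROACHING, THEN DEPARTING

SOUND: CROSS FADE TO QUIET CAFE INTERIOR, THEN FADE OUT UNDER DIALOGUE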

Selective Attention and Fading

Not all sound is equal.  The world around us is full of sound.  True silence is very rare.  Yet we don’t consciously experience all (or even most) of the sounds that are occurring near us at any given time.

As I write this, I can hear a fan running, the sound of my kids in the background getting breakfast, and the sound of some birds outside (it’s early morning).  There’s also some traffic in the distance as well as a dog barking.  But until I specifically stopped and listened, those sounds did not register consciously, or command my attention.  I was experiencing silence where there was none.  This experience of silence in the midst of sound is a function of the human brain.

There is, in our brains, an area known as the Reticular Activating System (RAS). It acts as a filter, designed to select what we need to pay attention to from all the stimuli we experience. It selects what it thinks is most important and weeds out (or silences) the rest.

When I get up in the morning and get dressed I might, for a few moments, feel the cloth of my shirt against my skin, but the RAS quickly decides this is not important and silences that sensation so that it won’t distract me throughout the day.  Can I still feel the texture of the cloth?  Absolutely.  But my brain has filtered the sensation out – effectively silencing it.  This is also true of the sounds we hear. 

The sounds I mentioned before were all around me (and still are), but my brain was filtering them out (silencing them) because they were not important.  This is something the brain does automatically.  Registering new sounds, deciding whether they are important or not, and bringing them to our conscious attention (or suppressing them) according to its own inner logic is an automatic and unconscious feature of human experience.

In audio drama, the writer and the producer of sound seek to mimic this through control of volume.

But if the brain is so clever, and does this naturally, why should it be necessary for us to try to take control of this process when writing audio-drama?

In part it is due to the technology of audio drama delivery: the speakers, headphones, etc. through which the sound reaches us. When the brain focuses on an audio drama, it treats the speaker(s) as a single source for the purpose of filtering and doesn't attempt to filter out any of the background sound the drama supplies.

In the real world, we automatically filter out the background sounds around us. But when listening to an audio drama, the producers must do this for us, mimicking what our brains would otherwise do naturally and effectively tricking the listener's brain into believing the drama is a facsimile of reality.

How is this achieved?  Through the use of the fade.

When we enter a crowded restaurant, we hear the background soundscape (cutlery clinking, the murmur of many voices, perhaps some ambient music). These sounds set the scene, but, in the real world, they are of little importance; they quickly fade from conscious notice while we focus on what we do feel is important: the voices of our dinner companions.

Audio dramas which continuously play realistic background sound under the action and dialogue, ironically, fail to create realism.  We instinctively know that we should not be continuously aware of the background, and so the scene feels off and slightly unreal.

However, when the audio drama plays the background sound at the start of the scene and then fades it to nothing as the dialogue begins, we notice nothing. It simulates our ordinary experience and therefore both feels real and passes unnoticed.

Likewise, footsteps.  Footsteps that play continuously under a scene are distracting.  Footsteps that are established and fade out, are not.

Note the importance of FADE rather than STOP.  If the background sound simply stops, we notice it and wonder why.  If the footsteps stop suddenly, we assume the individuals have stopped walking.  But by fading the noise out, the audience assumes the sound is still present (even though it is gone).
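As an invented illustration, compare two ways of cueing the same moment:

SOUND: RESTAURANT AMBIENCE STOPS DEAD

SOUND: FADE RESTAURANT AMBIENCE OUT UNDER DIALOGUE

The first draws attention to the disappearance of the background and invites the audience to wonder why; the second lets the ambience slip from notice, just as it would in life.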

When we wish to reintroduce background sound, however, we usually bring it back sharply, mimicking the abrupt shift of our attention back to the soundscape (turning what was unnoticed, and therefore silent, background into foreground).

In our restaurant scene, we might have our characters enter the restaurant and establish the ambient sounds, quickly fading them out as their conversation gets under way. But when two gunmen enter brandishing weapons we bring the background noise back into focus as the patrons scream and drop their cutlery etc. in response to the changed conditions.
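On the page, that restaurant scene might look something like this (the characters and wording here are mine, purely for illustration):

SOUND: ESTABLISH RESTAURANT - CUTLERY, MURMUR OF VOICES, SOFT MUSIC - THEN FADE OUT UNDER DIALOGUE

SALLY: I still don't see why you invited him, George.

GEORGE: Give the man a chance, Sally. He's not so bad once you...

SOUND: DOOR BURSTS OPEN. RESTAURANT NOISE UP SHARPLY - SCREAMS, DROPPED CUTLERY, CHAIRS SCRAPING

GUNMAN: Nobody move!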

Likewise, we might fade out the sound of footsteps while our characters are walking down the street, but when they stop, we might introduce the sound of their shoes scuffing to a standstill, signalling the end of the walk (even though the sounds of walking were not present for a significant period).
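Again, a possible (invented) way to cue it:

SOUND: ESTABLISH FOOTSTEPS ON PAVEMENT, THEN FADE OUT UNDER DIALOGUE

GEORGE: (WALKING) ...and that's the last I ever heard of him.

SOUND: SHOES SCUFF TO A STANDSTILL

SALLY: Well, here we are. This is the place.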

Microphone Position

The position of the microphone is another key element of sound design. The microphone stands in for the audience member. Each audience member is the unseen eavesdropper listening in on the action. Part of the immersion created by audio drama comes from this realisation. When a character moves away from the microphone, they are moving away from the audience member. When they approach the microphone, they are approaching the audience member. And when the microphone follows a character, the audience member is, in fact, doing the following. This allows us to create various levels of involvement and distance in the story. A different emotional effect is created when actors approach and move away from the microphone than when the microphone follows them around (moving with them from one room to another, for example).

Spatial Quality

I also want to make a point with regard to reverb.  One of the unconscious cues we use to orient ourselves in space is the amount of reverb we experience.  We can tell whether we are outdoors, or inside a cavernous room, or inside a small box, based on the echo (or its lack) around us.  When the spatial characteristics of the locations we choose to present to our audience are ignored in the production, we create a sense of unreality.  Some writers argue that the inclusion of INT and EXT (interior and exterior) is unnecessary in audio drama scripts, but this is not the case.  The location has a distinct impact on the design and production of the sound.
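In practice, this can be as simple as carrying the location (and the acoustic it implies) into the scene heading or the establishing cue. The headings below are invented examples; use whatever scene-heading format your production prefers:

SCENE 3. INT. CATHEDRAL - NIGHT

SOUND: FOOTSTEPS ECHOING ON STONE, HEAVY REVERB

SCENE 4. EXT. OPEN MOORLAND - DAY

SOUND: WIND, DISTANT BIRDS, VOICES CLOSE AND FLAT, NO REVERB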

Immersive versus Minimalist Sound Design

One area I have deliberately avoided commenting on, at least until this point, is that of immersive soundscapes versus minimalist sound design. Here my hearing disability has a bearing and I'm fairly certain my perspective is skewed. I am personally biased towards minimalism. Like many folks with hearing loss, I find certain auditory experiences very unpleasant. I dislike being in large crowds or crowded restaurants where there is a lot of ambient noise. The sounds tend to bleed together into a white-noise haze that I cannot separate. As a result, I go from being partially deaf to completely deaf in such environments (with all the sense of social isolation that involves).

I can still enjoy a multi-layered, realistic, and immersive soundscape – Gunsmoke comes to mind as the ultimate example – but beyond a certain point and volume, the soundscape becomes auditory mud.

That said, the selective nature of the psychological experience of hearing means that a minimalist soundscape can seem just as "real" to a listener as something more immersive. It is also cheaper and easier to take a minimalist approach, requiring fewer people and resources. It is also, it should be said, more accessible. Some folks, particularly audio engineers, aren't aware that their own proficiency in distinguishing sounds is not shared by the wider population; it is, in fact, a fairly rare gift. As a result, they produce soundscapes that are not readily accessible to anyone who does not share their finely tuned auditory sense. It may come as a surprise, but many deaf people, like myself, enjoy audio drama; we do, however, find simple soundscapes more accessible.

That is not to say that complex soundscapes should be avoided.  Build the show you want to build. Here I am merely stating a preference.

Stereo Sound

This leads me to the final, and least qualified, point I wish to make.  A final feature of realistic listening experience is stereo.  We are stereo creatures; we have two ears positioned one on either side of our heads.  We experience sound and detect where it is coming from based on the relative loudness of sound in each ear.  As a deaf person, I am an outsider to this experience.  I have only one working ear and so I cannot detect where a sound is coming from except by turning my good ear towards it and detecting the change in volume.  However, even though I am deaf in one ear, my good ear can still hear all the sounds around me.  Some will simply be softer because they are occurring on the other side of my head.

So, if by chance an audio engineer happens to be reading this article, I'd like to end this section with a plea regarding stereo recording (and panning). Please, please, please make sure you never place a sound exclusively in one or the other stereo channel. In the real world, sound can be heard by both ears all the time; it is merely louder in one than the other depending on the orientation of the listener to the sound. If you place a sound exclusively in one channel you, firstly, create a circumstance in which the soundscape noticeably ceases to match what we experience in the real world, and, secondly, you exclude those of us who are deaf in one ear from detecting the presence of the sound at all. Sixty to eighty percent of every sound should be centered, while only twenty to forty percent should favour a particular stereo channel (and from a deaf person's point of view, twenty percent or less is better).

Of course, given my own circumstances I would never provide guidance in my own scripts regarding panning and stereo, but your mileage may differ.  😀

I want to look at sound-effects and music before moving on to how this all interacts with the writer’s bread and butter (speech).

Types of Sound

There are several broad categories (or axes) of sound that writers should be aware of. 

Firstly, there is diegetic and non-diegetic sound. Diegetic sound is sound that is natural to the scene, that a character in the scene would normally be able to hear (footsteps, the fan, a gramophone playing, etc.). Non-diegetic sound is sound added to the scene that would not normally be heard by the characters within it (the ambient music of a soundtrack, the loud beating of a heart, etc.). Sounds are generally placed on a spectrum between these two poles, but tend to cluster at the non-diegetic end in larger numbers.

Secondly, there is self-identifying and ambiguous sound. Self-identifying sounds are readily recognizable sounds (such as a car engine starting, or the clickety-clack of a steam train hurtling along the tracks) while ambiguous sounds require context to be identified (the sound of crumpled cellophane can be read as fire, twigs snapping, a note being crumpled up, etc.). Again, this is a spectrum in which the majority of sounds cluster at the ambiguous end.

Wherever possible the writer should try to identify the appropriate context for a sound before it occurs.

For example:

EDDIE: Hey Joe, can you get a fire started?  You can borrow my flint.

JOE: Sure, Eddie.

SOUND: SPARKING OF FLINT, CRACKLE OF FIRE TAKING HOLD.

Because most sounds are ambiguous, we want to prevent the listener from having to revise their interpretation after hearing the sound as this tends to break their immersion.

If we provide the interpretation for an ambiguous sound immediately following its introduction, it will usually be okay, but it’s far better to provide it ahead of its introduction wherever possible.
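For contrast, here is the same fire-lighting moment with the context arriving immediately after the sound; it still works, but the listener spends a beat unsure of what they heard:

SOUND: SPARKING OF FLINT, CRACKLE OF FIRE TAKING HOLD.

JOE: There we go, Eddie. The fire's caught.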

Musical Soundtracks

Non-diegetic music (background music) is intended to support or highlight the emotion of a scene. Not everyone uses music in their productions, but there are some general guidelines for those of us who do. Sometimes, as writers, we are aware of the exact piece of music that would support the scene we are writing. If you have a precise vision of what you want to use, then name the piece in your script. However, copyright, licensing, expense, and/or the preferences of the producers may render the piece unsuitable or unavailable.

More importantly, you should identify the mood or emotion you want the music to express (and then add the title of a suggested piece that captures that mood). As a non-musician, even though I am fully aware of how much value a good musical score adds, I rarely bother with musical cues (leaving that to those who are more gifted than I), but when I do, I make use of Hevner's adjective circle. The system may not hold much water as a means of classifying music, but I find the adjectives helpful when I am trying to describe the mood I wish to communicate.

Here they are…

1. awe-inspiring, dignified, lofty, sacred, serious, sober, solemn, spiritual

2. dark, depressing, doleful, frustrated, gloomy, heavy, melancholy, mournful, pathetic, sad, tragic

3. dreamy, longing, plaintive, pleading, sentimental, tender, yearning, yielding

4. calm, leisurely, lyrical, quiet, satisfying, serene, soothing, tranquil

5. delicate, fanciful, graceful, humorous, light, playful, quaint, sprightly, whimsical

6. bright, cheerful, gay, happy, joyous, merry

7. agitated, dramatic, exciting, exhilarated, impetuous, passionate, restless, sensational, soaring, triumphant

8. emphatic, exalting, majestic, martial, ponderous, robust, vigorous
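When I do write a music cue, it tends to look something like this; the suggested piece is purely illustrative, and a producer may well substitute something else that captures the same mood:

MUSIC: SLOW, MELANCHOLY, MOURNFUL THEME (SUGGEST: CHOPIN, PRELUDE IN E MINOR) - ESTABLISH, THEN FADE UNDER NARRATION.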

Tips for Integrating Dialogue and Narration with Sound

Audio drama is an auditory medium. The audience constructs the visuals in their minds in response to what they hear. But, as already noted, not every sound clearly identifies for the audience what is happening. The context, by which I mean the speech that frames the scene, must help the audience understand the sound. We often intuitively believe that the sound supports the storytelling (and it does), but the majority of sound is not readily identifiable (and therefore cannot do its job) without the context provided by speech.

Speech, in the form of dialogue and narration, is essential to conveying a large amount of what cannot be seen, and interpreting the sounds that add to the realism of the play.

In a film, when two gunmen enter a room brandishing pistols, it is obvious to all. But in audio, the dialogue must assist the listener to "see" what can only be heard.

Sometimes this results in dialogue that would, in a stage play or cinematic setting, be a little on the nose…

SALLY: Look out, George, those men are carrying guns.

Or…

GEORGE:  Take it easy, fellas.  Put the guns away.

Another feature of audio that is important BECAUSE we only hear the action is the repeated use of names. Characters in audio refer to each other by name far more often than they do in other media. This is because voices are not nearly as distinctive as images. We can see that George has come through the door in a film, but in audio it helps if someone says "Oh, hi, George" and identifies the character by name, even if he speaks on entry.

And while we're on the topic of names, try to avoid giving characters names that are difficult to distinguish from other words. For example, "Hugh" and "you" are too easily confused for "Hugh" to be a good name for an audio-drama character.

Characters are easily forgotten without a visual reminder of their presence, so when a scene contains more than two people, it is important to regularly have the extra characters contribute something to the conversation.  Otherwise they will not remain alive and present in the imaginations of the audience members.

Voices

And at last we come to the final topic it is helpful for a writer to understand. Audio drama is a more intimate dramatic form than many others. We do not have the advantage of visuals to help us distinguish characters and therefore rely on vocal quality. In the days of old-time radio, casts were deliberately selected on the basis of vocal characteristics.

Bass: Heavy/Elderly male

Contralto: Elderly female

Baritone: Leading man

Mezzo-soprano: Leading woman

Tenor:  Juvenile

Soprano: Ingenue

Treble: Child

Along with identifying accents, these seven vocal types, e.g. GEORGE (elderly male shopkeeper with a BASS voice and a Dixie accent), can be very useful (and provide a great deal of variation) when describing and defining your characters. Characters who sound too similar to one another will easily be confused. And limits on human memory tend to make confusion even more likely when a cast exceeds five to seven main characters in total.

As advice to writers goes, I'm aware that this has been fairly eclectic. What other advice regarding sound would you add for those who are engaged in writing audio scripts?

Copyright © Philip Craig Robotham 2020.
