Once again, I am, in my overlong and wordy way, wading into an area of audio drama where, as a person who has significant hearing loss, I probably don’t belong. As with my last article on the topic of sound and silence, these aren’t hills I’m particularly committed to dying on – merely thoughts and observations on sound from my own point of view (as a scriptwriter). 😀
STORY FIRST
What’s the most important part of an audio drama? The dialog? The sound effects? The music? None of the above.
The most important part of an audio drama is the story. Every element of the drama must serve (and not compete with) the story. Anything that gets in the way of the story (whether the dialog, the sound, or the music) must be jettisoned. That is not to say that any of these elements are unimportant. They all matter, but the thing that decides the relative value of each is its contribution to telling the story.
In a previous article, I talked about how the relative importance of each is highlighted by audio drama’s spotlight: volume. That which is most important is loudest, while that which is of lesser importance becomes part of the background. And we determine what should be loudest by its relative value in conveying the story. Sometimes the dialog is front and center. Sometimes the sound effects necessarily overwhelm everything else, and sometimes the music needs to soar in order to convey the emotion of the scene. The importance of the element is conveyed by its volume from moment to moment, and that volume is determined by its importance to the needs of the story from moment to moment.
When it comes to sound, it is easy to fall into the trap of over-layering – a trap that is peculiar to sound engineers. People who work with sound tend to have a highly tuned “ear”. Typically, they detect more sounds than the average person (across a wider range of pitch and loudness), and they tend to be able to distinguish (or identify) more of them too. The downside of this gift is that they can lose sight of how a sound “reads” to an ordinary person.
TYPES OF SOUND
There are two general types of sound effect: self-identifying and ambiguous.
Self-identifying sounds are sounds that most people can be relied upon to recognise without any help from the context or external cues. A police siren, a heart monitor, a tin whistle, and a bell tend to be sounds that are self-identifying.
Ambiguous sounds are sounds we rely on the context to decode. Is that the hiss of a pneumatic pump, or a steam kettle, or a snake? Is that the sound of rain on a rooftop, or the sound of a fire burning in the hearth? When it comes to ambiguous sounds (sounds that are similar enough to be difficult to distinguish) we let the context tell us what they represent.
And herein lies one of the great freedoms of audio drama. You can build a soundscape that is “readable” by an audience from sounds that are not exact matches to the real thing (crumpled cellophane can stand in for flames, etc.). Sometimes sound engineers forget that their trained “ear” is significantly more sensitive than that of the typical listener and that, in the midst of a story, a close, but not perfect, sound will be read as the story requires so long as the context and dialog support it. Conversely, a sound that is an exact representation of something in the real world will be misread if the context does not support it. Fire really does sound like rain (and vice versa) if the context isn’t clear, which goes a long way towards explaining the dreadful cliché of having a thunderclap sound before a downpour is heard.
Commitment to realism is great but, if it is not paired with a recognition of the need for context to establish its “readability”, it will only confuse an audience. “Deep” soundscapes quickly become confusing noise when the sounds layered into them lack context or are hard to pick out. The skilled sound engineer is in danger of letting their own capacity to identify sounds carry them away when constructing an immersive soundscape.
So how much is enough? And how accurate must we aim to be?
REALISM VS. REPRESENTATION
Sound does not need to be “realistic” to be effective. The imagination of the audience can carry a production a long way.
Let’s imagine a sound effect or two that we would like to build. Start with a body drop, for example.
What are the chief characteristics of a body drop? For a start, it’s a soft thud, to be differentiated from the harder thud of a block of wood falling to the floor, or the metallic clang of a kitchen knife falling. It also sounds “heavy”. It has a resonance and weight to it. A duffle bag full of cloth scraps makes a great body drop sound, but in a pinch so will allowing your arm to fall heavily on a wooden desk. The brevity of the sound means that there are plenty of substitutions that could be made here. The context (a gunshot or a character crying “Look out, she’s going to faint”) will be enough to establish any reasonable facsimile of a body drop, and the quickness of the sound effect will cover any slight lack of realism in its use.
Approximation is fine when it comes to ambiguous sounds, but there is a limit. A duffle bag full of sharp-edged blocks of wood would make a clatter that would contradict the context – falling people don’t clatter. The incongruity or dissonance would break the audience’s sense of immersion and their willing suspension of disbelief.
What about a more complex sound, for example, a horse and cart? Well, you could go out and record the sound of a horse and cart, of course, but you could also build the “impression” by layering together a bunch of approximations. Coconuts can be used for the hooves on a hard surface, as can bags of cloth thumped against the chest for hooves on a soft surface. The rhythm is ultimately what sells the sound. What else is needed? The rumble of the wheels of the cart is a key sound. A friend of mine put a steel drum on a crank handle and made a passable wagon’s rumble by turning it. Together this might be enough, but the jangle of some car keys to suggest tack and harness could be added as well. Ultimately, the context needs to sell the combination of sounds, so a character saying “It’s been hard work gettin’ this wagon down the trail” or something less on-the-nose would clinch the deal.
Do you need more? More could certainly be added. You could simulate the thump and clatter of the cart hitting bumps with a couple of blocks of heavy wood. Layering in some horse snorts and some ambient outdoor noises (birds, wind, etc.) could also help. But to “read” the scene, the audience only needs enough sound to register the context and support what is taking place in their imaginations. And, as mentioned in a previous article, sound fades from our attention once it has done the job of establishing itself – so it needn’t stay for long and should never be allowed to stay in the foreground long enough to distract us from the story.
Some sounds can’t be faked, of course. Animal vocalisations are particularly difficult unless you have access to a skilled impressionist. As a result, a recording of a real horse’s neigh or lion’s roar may be required. But many sounds can be approximated with a bit of creativity. A friend of mine simulated a cart crashing into a wall by slowly crushing a balsa wood box (that he built out of some balsa wood planes that were being thrown away) against a table. The result was great.
Near-enough is often good enough, as far as our brains are concerned, so long as there is context (in the form of other sounds or dialog) to help identify the action – but selectivity is essential. Just like the mind, a good audio drama will ignore sounds in which it is not immediately interested and will aim to employ sounds primarily for their clarity and effective communication. By this selection, audio scripts control and direct the listener’s attention. It is important to only introduce sounds that serve a dramatic purpose. If this does not happen the story is rendered confusing. Listeners are forced to pay attention to the irrelevant and lose track of the important.
MINIMALISM – SELECTING AND PRESENTING ONLY WHAT IS NEEDED
I don’t want to rehash the points made in the previous article in this series except to say that an immersive and realistic soundscape is a great thing, if you have the budget and technical expertise, but a minimalist approach to audio drama can be just as effective (if not more so) because of its selectivity. An approach where only sounds that contribute to the story are selected is not to be underestimated. Most listeners won’t notice that they haven’t been hearing footsteps in the story, even if we introduce the sound of someone limping at a later point. Our brains are quite selective regarding what we pay attention to and we rarely notice the absence of a particular sound if it isn’t important to the action. Audio drama is quite “realistic” in this sense – it doesn’t have to give us a layered and detailed soundscape to be satisfying. In fact, a minimalist approach that carefully selects sounds for their contribution to the drama can feel more immersive simply because of the absence of distracting “noise”.
THE WRITER’S ROLE
But what significance does this have for the writer? After all, isn’t all this the preserve of the sound engineer? You don’t want to step on another person’s creative expression, right?
The design of the sounds themselves is definitely the preserve of the sound engineer. However, the ambiguity of sound makes identifying the sounds for the audience a writer’s problem. Without the sense of sight to tell us what an ambiguous sound refers to, we need cues to help in their identification. Ambiguous sounds require “stage-setting” or guidance from the writer. This stage-setting should generally occur before the sound is introduced to prevent confusion. If the listener is thinking conveyor-belt before the sound is identified as a waterfall, then confusion will, inevitably, result.
Generally, the identification of a sound is made through dialog or narration.
IDENTIFICATION THROUGH DIALOG
When identifying sounds through dialog, the identification doesn’t need to be explicit. Indirect reference works equally well.
SOUND: RUMBLE OF CAR – FADE IN.
BOB: Take the back roads, Jim. We can’t afford to get stopped again this trip.
JIM: Sure. It might slow us down a bit, though.
BOB: It’s better than trying to explain why we’ve got Ted Wilson’s body in the trunk.
IDENTIFICATION THROUGH NARRATION
Narration has a bad reputation these days, but a few words of narration can render an otherwise confusing soundscape intelligible.
NARRATOR: Ted and George make their way through a busy building site.
SOUND: WHINE OF A METAL SAW, CLANG OF HAMMERS AND PIPES ETC. – ESTABLISH AND UNDER
TED: Keep your head down. If Moretti’s guys spot us, they aren’t gonna play nice. A lot of innocent workers are gonna get hurt.
GEORGE: Then why did you lead us here?
TED: They’d have spotted us on the street in a minute. Our only chance of getting out of this alive is to lose ’em in here, so stop your whining.
IDENTIFICATION THROUGH OTHER SOUNDS
Occasionally sounds can be used to identify other sounds. Again, it doesn’t need to be explicit; an implied identification is often all that is required. The sound of thunder prior to rain, the sound of a car horn to introduce a car’s motor, or the sound of a train whistle to introduce a train are common to the point of being a cliché. The sound of a heart monitor automatically establishes that we are in a hospital and will identify an otherwise ambiguous whooshing sound as a ventilator.
In many cases, sounds become clear through context (through the plot itself), via perfectly natural references in dialog and narration.
BACKGROUND SOUND
Sounds can serve a purpose in setting the scene: establishing the presence of a babbling brook, or busy roadway. Scene setting is essential but many (perhaps most) scene setting sounds are not self-identifying. Bird song suggests a rural setting, outdoors during daytime, and crickets suggest night. Horns honking suggests traffic, but running water? Scene setting narration or exposition may be required.
Compare…
SOUND: RUNNING WATER – FADE IN, ESTABLISH, FADE UNDER
JENNY: I don’t see any way across the stream, do you? Should we wade through?
SAMANTHA: It depends how deep it is in the middle. I can’t swim, remember?
With…
1. SOUND: TINNY RADIO – UNDER – AT A DISTANCE
2. SOUND: RUNNING WATER – FADE IN, ESTABLISH, FADE UNDER AND CONTINUE TO 6.
3. SOUND: OCCASIONAL SPLASHING – CONTINUE UNDER
4. JENNY: Don’t let the sink overflow.
5. GEORGE: I won’t.
6. JENNY: What are you doing, anyway? It’s not like you to clean the dishes.
7. SOUND: FAUCET TURNS OFF – LET IT FINISH.
8. GEORGE: I’m doing my socks.
9. JENNY: At the office? I’m sorry I asked.
The same running water sound effect could be used in both, but the context identifies the sound differently for the listener in each.
In long scenes, we may need to bring the background sound back to the audience’s attention from time to time in order to keep the setting alive.
Later…
1. SOUND: PLUNK (WATCH FALLS INTO BASIN) – LET IT FINISH.
2. GEORGE: Damnit. That was my watch.
3. JENNY: Aren’t you finished yet?
4. SOUND: SPLASHING IN SEARCH OF IT – CONTINUE UNTIL 6.
5. GEORGE: I was just about to pull the plug.
6. JENNY: Was it worth it?
7. GEORGE: Huh? (BEAT) Wait, there it is. If I clean off the suds it’ll be good as new.
8. JENNY: You know, worth it? I mean your socks have to cost less than your watch.
9. GEORGE: Not as much as you might think, and besides, nothing beats wash-basin fresh. Anyway, it’s not the first time. The watch will dry out… eventually.
But if you go to the trouble to introduce setting and events with sound, make sure they contribute something to the story. If we must listen to George drop his watch in the basin, then the story had better include a moment later on where George’s watch is discovered to have stopped and where that stoppage has a significant impact on George’s life.
UNREALISTIC SOUND
I’ve already mentioned that sounds intended to represent real-world experiences can break immersion where they are not close enough to the real thing, are not identified for the audience, or contradict an audience’s expectations. But it is also possible to use sound deliberately in an unrealistic manner.
Comedic scripts in particular make use of unrealistic sound cues. A door might open and close in an unrealistically fast manner or a whoosh might accompany the suggestion that someone is traveling across town at blinding speed. A slide whistle might accompany a fall or a comedic crash might accompany someone tripping over. Bird whistles might accompany someone receiving a bump on the head. These unrealistic sounds will fit such a context – if they appear to be a deliberate choice by the script’s writer and producer. The danger, of course, is that the audience might think the sound is merely a poor choice or accident.
Here contrast and exaggeration help. Unrealistic sounds require a certain amount of exaggeration to read as deliberate. If you want to make a comedic body drop, the sound should not be similar to a realistic body drop. Instead, the sound should be exaggerated in its difference. A standard body drop is a heavy thud, so the comedic body drop could be presented as a metallic crash. It would be advisable to lean into the difference by extending it a little and exaggerating it (so that the sound includes numerous metallic pieces falling and perhaps a trailing item ringing as it comes to rest).
Another unrealistic use of sound arises when you want to communicate symbolically. It’s a little cliché, but the ticking metronome is easily read by an audience to indicate that time is running out. Music, such as the “dun-dun-dah” sting used to announce the discovery of a body, is also a well-recognized (cliché) audio symbol.
WRITING SOUND CUES IN THE SCRIPT
Remember, the written sound cue puts a spotlight on a particular sound. That sound will be given prominence in the story for a moment through volume. As such, only the most important sounds tend to make it into the script… and when they do it is because they fulfill an important role in the story.
A number of formats are currently in circulation for audio drama (American Audio Drama, BBC format, Screenplay format). I have my own preference (American Audio Drama), but writers need to use the format preferred by the production company they are working with – and I have used each, as required, at different times. My preference for American Audio Drama format arises from its use at the height of the Golden Age of Radio Drama. It was developed specifically for radio in the context of the high-volume production environment of the ’30s through ’50s when audio drama was engaged in constant innovation and refinement. As a result, it is highly efficient and radio-specific. The BBC format (again designed for radio) is my second choice. Screenplay format was designed for a visual medium and isn’t, in my view, as good a fit for audio production, but it does have the advantage of being readily automated and available in tools like Final Draft, Celtx, Fade In, and StudioBinder – lots of companies prefer it.
Regardless of the script format, sound cues are generally written in underlined ALL-CAPS and in the present tense. The golden age of radio taught writers to make them as simple as possible. Many shows were produced live and depended on the effects being quickly read and implemented by the sound effects staff. The habit of concision is still worth cultivating. The sound engineer will thank you for keeping the cues short, to the point, and concrete.
Bad cues are wordy, needlessly abstract, and don’t serve the story. For example:
SOUND: GEORGE ENTERS THE ROOM WITH A SOUND LIKE THE SLOW LEAKING OF A BROKEN HEART
Good cues are short, to the point, concrete, and serve the story.
SOUND: DOOR OPENS
The cue-words and descriptions listed below can help to capture your intention for a sound effect.
SOUND: (WALLA) CROWDED CITY STREET – FADE IN, ESTABLISH, FADE UNDER AND OUT.
(BRIDGE) – Music played between scenes: the audio drama equivalent of raising and lowering the curtain on a scene.
CONTINUE UNTIL – Let the sound or music play until a particular line number or symbol (such as *) is reached.
[CUE] – The actor or sound should wait for the director to indicate it is time to begin delivering the line or playing.
ESTABLISH – Let the sound or music play for a moment before any other sound or dialog is added.
FADE IN – Start the sound or music softly and then gradually increase its volume.
FADE OUT – Gradually lower the volume of the sound or music until it can no longer be heard.
FADE UNDER (or simply UNDER) – Lower the volume of the sound effect or music until the actors’ voices are clearly audible over it.
LET IT FINISH – Play the sound or music until it is complete without fading it.
(STING) – Music used as punctuation to emphasize the emotion of a moment. The “dum-de-dum-dum” that plays when a body is discovered or the “bada-bing” cymbal crash of a joke being delivered etc.
UNDER – Continue a sound effect or music at low volume under the dialog or action taking place.
(WALLA) – Background sound belonging to the environment (for example, the sounds of a busy street).
While it is generally the sound engineer’s responsibility to source, construct, record, and edit sound effects, the inclusion of sound cues in the script remains an essential skill for the writer. Sounds that require identification far outnumber those that are self-identifying. As such, it is essential for the audio scriptwriter to identify those sounds for the audience in the script. The scriptwriter (even a hearing-impaired writer like me) must know at least a little about how sounds are constructed and how they can be identified for and decoded by the audience.
Copyright Philip Craig Robotham © 2020.