36 Seconds

It never surprises me when a mix takes longer than expected. This post will look at part of the process that resulted in the following piece of music.

 

Due to time constraints, Danielle and I have abandoned the acoustic-midi hybrid orchestra, as this would have meant scoring, rehearsals, performance to picture, conducting and so on. We have decided to opt for ‘Plan B’, which is a midi-midi hybrid orchestra. Allow me to explain…

Due to its rock-solid efficiency, I find myself leaning towards Xpand2 for writing music. If you search long and hard enough, you can find some of the more realistic sounds. However, Xpand2 lacks instrumental expression such as vibrato, attack and release shaping, crescendo and diminuendo, resulting in that midi sound we’re all used to.


Xpand2

Kontakt 3 Player is the opposite of Xpand2; it has nicer samples as well as the above-mentioned expression, sounding closer to human players. However, it is also quite possibly the most unreliable, malfunctioning piece of trash plugin I have ever dared to use, and it cost a severe amount of time in the process. In combination with Xpand2, however (and with a delicate balance of EQ, reverb and compression), the midi-midi orchestra sounds better than either plugin does alone.


Kontakt 3

So why did the mix take so long? Every instrument originally tracked in Xpand2 needed to be duplicated with a suitable instrument from Kontakt 3 so the two could be blended. The addition of Kontakt 3 added two hours: constant workarounds to avoid crashing, the crashes themselves, plus tweaking velocities to smooth over the velocity differences between the two plugins. The rest of the mix took a further five hours, which is quite fast for me (albeit for a thirty-six-second piece). See the screenshots below and notice the duplicated tracks making up the hybrid:

[Screenshots: duplicated Xpand2 and Kontakt 3 tracks making up the hybrid]
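
For anyone curious about the velocity-matching step, here is a rough sketch of the kind of batch adjustment it implies. The actual tweaking was done inside the session rather than with a script; the mido usage, file names and 0.85 scale factor below are purely illustrative assumptions.

```python
# A minimal sketch of scaling note velocities on a duplicated MIDI track so a
# Kontakt 3 patch responds at roughly the same loudness as its Xpand2 twin.
# File names and the 0.85 scale factor are hypothetical.
import mido

def scale_velocities(in_path, out_path, scale=0.85, floor=1):
    midi = mido.MidiFile(in_path)
    for track in midi.tracks:
        for i, msg in enumerate(track):
            # Only note-on messages with non-zero velocity carry loudness info.
            if msg.type == "note_on" and msg.velocity > 0:
                new_vel = max(floor, min(127, round(msg.velocity * scale)))
                track[i] = msg.copy(velocity=new_vel)
    midi.save(out_path)

scale_velocities("strings_xpand_copy.mid", "strings_for_kontakt.mid")
```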

So that’s seven hours for a thirty-six-second piece of music, which has now been inserted into a section of the film that you can view shortly.

The final requirement to allow the music to fit amongst the sounds of the forest (backed up by the article written by Jay Rose in the previous post) was a couple of EQ adjustments to the stereo music track itself (below).

[Screenshot: EQ adjustments on the stereo music track]

Cutting some of the midrange allowed the foley sounds to reside in their own sonic hole, whilst boosting “the *sizzle* region”, where Rose boosts “both the music and voice”, gave the music more presence to make up for the scooped-out mids.

Finally, here is the finished section.

The Magical Foley Curve

Experimentation during the mixing of the ‘Banquet Scene’ led me towards a more or less universal EQ curve, or at least a good starting point that can be applied to future scenes, and… here it is.

[Screenshot: the foley EQ curve]

A common phrase thrown around the audio-for-visual fraternity is “Dialogue is King”, of which Hollywood post-production and dialogue mixer Stephen Tibbo asserts that “everything else wraps around it” (iZotope, 2014). Therefore, when EQ’ing foley, the motivation is to avoid masking the human voice by carving a frequency hole in the foley for the voice to sit in.

The concept of the curve is:

  • To remove harshness around the 3 kHz range and carve out space for the human voice to sit.
  • To boost the lower mids to add perceived warmth, bottom and midrange. This is also an area of less importance for the voice (see the excerpt below).
  • To sweeten the individual character of the intricate foley sounds by shelf-boosting the upper mids to highs, adding presence so they stand out from the mix.
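
As a rough illustration of those three moves outside the DAW, here is a sketch using RBJ-cookbook biquads: a low-mid boost, a cut around 3 kHz and a high-shelf boost. The centre frequencies, gains, Q values and file names are my own guesses at the curve pictured above, not the actual plugin settings.

```python
# A sketch of the foley curve as three RBJ-cookbook biquads: a low-mid boost,
# a cut around 3 kHz to leave room for the voice, and a high-shelf boost for
# presence. Frequencies, gains and Qs below are approximations only.
import numpy as np
import soundfile as sf
from scipy.signal import sosfilt

def peaking(fs, f0, gain_db, q):
    A = 10 ** (gain_db / 40)
    w0 = 2 * np.pi * f0 / fs
    alpha = np.sin(w0) / (2 * q)
    b = [1 + alpha * A, -2 * np.cos(w0), 1 - alpha * A]
    a = [1 + alpha / A, -2 * np.cos(w0), 1 - alpha / A]
    return np.hstack([np.array(b) / a[0], [1.0], np.array(a[1:]) / a[0]])

def high_shelf(fs, f0, gain_db):
    A = 10 ** (gain_db / 40)
    w0 = 2 * np.pi * f0 / fs
    alpha = np.sin(w0) / 2 * np.sqrt(2)          # shelf slope S = 1
    cosw, sqA = np.cos(w0), np.sqrt(A)
    b = [A * ((A + 1) + (A - 1) * cosw + 2 * sqA * alpha),
         -2 * A * ((A - 1) + (A + 1) * cosw),
         A * ((A + 1) + (A - 1) * cosw - 2 * sqA * alpha)]
    a = [(A + 1) - (A - 1) * cosw + 2 * sqA * alpha,
         2 * ((A - 1) - (A + 1) * cosw),
         (A + 1) - (A - 1) * cosw - 2 * sqA * alpha]
    return np.hstack([np.array(b) / a[0], [1.0], np.array(a[1:]) / a[0]])

audio, fs = sf.read("foley.wav")                 # hypothetical foley stem
sos = np.vstack([peaking(fs, 250, +3.0, 1.0),    # warmth in the lower mids
                 peaking(fs, 3000, -4.0, 1.4),   # hole for the voice
                 high_shelf(fs, 6000, +2.5)])    # presence / air
sf.write("foley_eq.wav", sosfilt(sos, audio, axis=0), fs)
```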

Jay Rose, author of Audio Postproduction for Film and Video, offers an insightful breakdown of the general rules the human voice adheres to, and it is something I find myself referencing a lot when making mix decisions concerning the voice. The following excerpt informed the rationale behind the EQ curve.

Jay Rose

 

Under 150 Hz
————-
In this range you can pull down the dialog. It will help reduce plosives, handling noise and echo in large rooms. If the voices get too thin, lower the cutoff frequency.

For music, you can cut things like bass drums a bit at the lowest frequencies (say under 60Hz) to ensure that you have no sub-sonics, and to give you more headroom.

Low cuts are nice for safety, but if you’re too aggressive, things get thin. Find the right balance.

150Hz – 300Hz
————–
This is where the fundamentals of voice and many important instruments exist. I like to give the voices a slight boost and the music/effects/foley a slight cut here. If your voices are boomy, back it off. If thin, boost away.

One trick is to send your music to two sub-busses. When you’re underscoring dialog, use the cut version. When standing alone, use the flat version.

300Hz – 600Hz
————–
This region is less critical for voice. You can boost your music/etc here and get away with it. Cut the voices here to make room.

600Hz – 1200Hz
—————
This region is critical for consonants. Boost the voices and cut the music. You lose some fast attacks on your music, but it’s more important to understand the talent.

1200Hz – 2400Hz
—————-
This area isn’t critical for voice. Cut the dialog, boost the music a bit.

2400Hz – 4800Hz
—————-
This region is important for distinguishing voices and instruments. Boost the voices and cut the music – especially if there are multiple voices. The downside is that your oboe will start to sound like a clarinet. You can push this range back up for the music when it stands alone.

4800Hz – 9600Hz
—————-
This is the *sizzle* region. I boost both the music and voice heavily here.

9600Hz and up
————–
Don’t worry about higher frequencies too much. It’s mostly noise. If your tracks have HF noise, feel free to cut with a heavy hand.
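
Purely as a reference aid, Rose’s bands can be jotted down as a small lookup table. The dB offsets below are my own loose interpretation of “slight boost” and “slight cut”, not figures from the excerpt.

```python
# Rose's voice/music bands as a rough starting-point table.
# Offsets in dB are my own interpretation of "slight boost" / "slight cut".
ROSE_BANDS = [
    # (low Hz, high Hz, dialog offset, music/FX offset, note)
    (0,     150,  -6.0,  0.0, "high-pass dialog; trim music under ~60 Hz"),
    (150,   300,  +1.5, -1.5, "voice and instrument fundamentals"),
    (300,   600,  -1.5, +1.5, "less critical for voice"),
    (600,  1200,  +2.0, -2.0, "consonants - intelligibility first"),
    (1200, 2400,  -1.5, +1.5, "not critical for voice"),
    (2400, 4800,  +2.0, -2.0, "distinguishing voices; restore for solo music"),
    (4800, 9600,  +3.0, +3.0, "the sizzle region - boost both"),
    (9600, None,  -3.0, -3.0, "mostly noise; cut if tracks are hissy"),
]
```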

Artistic vs. Technical

It came to me as an afterthought, following the mixing of the Banquet scene, that one sometimes has to compromise between artistic and technical decisions. In an interview, producer and engineer Erik Zobler (George Duke, Stanley Clarke, Dianne Reeves, Al Jarreau, Natalie Cole, Anita Baker) talks about sound mixing and mentions working with engineer Bruce Houdini, who offered him a piece of solid mixing advice: always “Fill the Meter, Fill the Speaker”. The latter, Houdini suggests, means bringing balance not only to channel levels but to the entire frequency spectrum, to deliver that lush-sounding piece of audio. Zobler asserts:

[Screenshot: Erik Zobler interview]

 

“Make sure you’re putting all the frequencies in there, whatever you do, no matter what you do. Maybe not each instrument, but when you’re done with your mix, make sure your mix has all the frequencies, way down, way high.”

The full interview can be viewed here.

 

(Zobler, 2004, Waves)
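
One way to sanity-check the “fill the speaker” idea would be to glance at the octave-band energy of a bounced mix. This was not part of the actual workflow (it was judged by ear and by meter); the sketch and file name below are illustrative only.

```python
# Rough octave-band energy readout of a bounced mix, to see whether the
# "way down, way high" ends of the spectrum are actually represented.
import numpy as np
import soundfile as sf

audio, fs = sf.read("banquet_mix.wav")           # hypothetical bounce
mono = audio.mean(axis=1) if audio.ndim > 1 else audio

spectrum = np.abs(np.fft.rfft(mono)) ** 2        # power spectrum
freqs = np.fft.rfftfreq(len(mono), d=1 / fs)

edges = [31, 63, 125, 250, 500, 1000, 2000, 4000, 8000, 16000]
for lo, hi in zip(edges[:-1], edges[1:]):
    band = spectrum[(freqs >= lo) & (freqs < hi)]
    level = 10 * np.log10(band.sum() + 1e-12)
    print(f"{lo:>5}-{hi:<5} Hz : {level:6.1f} dB (relative)")
```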

This piece of advice particularly resonated with me and, after folding it into common practice, I started reaping the benefits. As a side effect, however, this self-imposed rule led to an artistic-versus-technical battle when tracking and mixing scenes for film. For example, the Banquet scene of 5150 takes place inside a living room in the countryside; there is a lit fireplace, it’s a fairytale setting and it appears to be a clear day outside. The components chosen to make up the soundscape and serve the picture were the following:

  • Fire-1
  • Fire-2
  • Clock
  • Blackbird
  • Rural birds (assorted)
  • Wind
  • Horror Drone

Here lies the balancing act of making your scene not only aesthetically pleasing, with full use of the frequency spectrum, but also colouring it with strokes of realism that complement the imagery, plus further sweeps of non-diegetic elements such as music and SFX. The above sounds were chosen as a result of this balancing act. For example, the wind was not something we felt would be aesthetically pleasing to the ear, but it was low-passed, pulled down and used to fill out the sub and low end of the spectrum. The fire tracks were initially going to be used to fill the entire void; however, allowing their subs to filter through gave the perceptual impression of proximity to the listener and clashed with the perceived realism. The same was true of the high end of the fire crackles, so frequencies above 8-10 kHz were shelved off via equalisation. I then retrieved the airiness and presence by boosting the high end of the intricate foley sounds, the upper harmonics of the dialogue and the music itself, by way of a careful balancing act.
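
A loose offline equivalent of the wind and fire treatment might look like the sketch below: a gentle low-pass keeping only the wind’s sub and low end, and a steep cut above roughly 8 kHz standing in for the shelf applied to the fire. The file names and corner frequencies are assumptions; the session itself used plugin EQ rather than a script.

```python
# Approximate the wind/fire treatment: keep only the low end of the wind,
# and roll the fire off above ~8 kHz (a low-pass standing in for the shelf).
import soundfile as sf
from scipy.signal import butter, sosfilt

def low_pass(path_in, path_out, cutoff_hz, order=4):
    audio, fs = sf.read(path_in)
    sos = butter(order, cutoff_hz, btype="low", fs=fs, output="sos")
    sf.write(path_out, sosfilt(sos, audio, axis=0), fs)

low_pass("wind.wav", "wind_low_only.wav", 200)   # wind fills the sub/low end
low_pass("fire.wav", "fire_no_top.wav", 8000)    # tame the crackle above 8 kHz
```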

This notion of balancing the artistic and the technical closely parallels musical composition. Sounds take on the air of instruments: to create a full-sounding orchestration, they need to be collated and organised across the entire frequency range. An orchestra of silken sound, filling the spectrum, filling the meter and, therefore, filling the speaker.

Title Sequence Music

After receiving the title sequence graphics, it was time to begin assessing the type of music required. When surveying the visuals, I was immediately reminded of the film Seven, whose title sequence similarly flashes with crime-styled, disturbing imagery. Therefore I began by listening to its music to establish an influence for my own writing. The music is a remix of the Nine Inch Nails song “Closer” and conceptually adds elements of sound design to the imagery.

Title sequence of Seven: https://www.youtube.com/watch?v=-k2gsEI34CE

Elements of the music include: 

  • A pulsating bass kick
  • Distorted instruments
  • A crescendo in the music towards the final climax
  • Musical triggers to coincide with flashing imagery

Therefore I decided to follow these conceptual ideas, but using the main lullaby melody as the drive, with its chord progression as accompaniment.

Bass Elements

  • A cinematic sub-boom placed at the start of every phrase.
  • A synthesised, sustained electronic bass playing throughout the piece in B flat.
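
As a very rough sketch of those two layers, the snippet below renders a sustained B-flat sine-based bass and drops a short decaying boom at the start of each phrase. The phrase length, octave, envelope and file name are invented for the example; the real parts were sequenced with sampled and synthesised instruments.

```python
# A toy render of the two bass layers: a sustained Bb bass tone plus a short
# decaying sub-boom at the start of every phrase. All numbers are invented.
import numpy as np
import soundfile as sf

fs = 44100
length_s = 36.0
phrase_s = 4.0                                   # hypothetical phrase length
bb1 = 58.27                                      # B flat, roughly Bb1

t = np.arange(int(length_s * fs)) / fs
# Sustained electronic bass: fundamental plus a quieter octave for body.
sustained = 0.25 * np.sin(2 * np.pi * bb1 * t) + 0.1 * np.sin(2 * np.pi * 2 * bb1 * t)

mix = sustained.copy()
boom_t = np.arange(int(1.5 * fs)) / fs           # 1.5 s decaying boom
boom = 0.6 * np.sin(2 * np.pi * 40 * boom_t) * np.exp(-3 * boom_t)
for start in np.arange(0, length_s, phrase_s):
    i = int(start * fs)
    mix[i:i + len(boom)] += boom[: len(mix) - i]

sf.write("bass_layers_sketch.wav", mix / np.max(np.abs(mix)), fs)
```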

Distortion & Crescendoing

To conceptually achieve this element, I implemented the following: 

  1. Chose a small section of the piano and looped it through a delay with infinite repeats and printed the results to audio. 
  2. Bussed the printed audio through a guitar amp plugin and altered the tone to a desirable effect.
  3. Placed the loop at the start of the piece and applied volume automation, steadily increasing the volume until the climax of the piece.

Below is an example of the finished loop:

[Audio: piano loop delay]

(EchoBoy delay on full feedback to create the desired effect)
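
For the curious, a crude offline approximation of those three steps is sketched below: a feedback delay just shy of full feedback, a tanh soft-clip standing in for the guitar amp plugin, and a linear gain ramp as the volume automation. It is not the EchoBoy and amp chain used in the session; the file name, delay time and drive amount are placeholders.

```python
# Crude approximation of the loop treatment: feedback delay -> soft-clip
# "amp" -> rising volume automation. Parameters are placeholders, not the
# actual EchoBoy / amp settings from the session.
import numpy as np
import soundfile as sf

audio, fs = sf.read("piano_section.wav")         # hypothetical piano excerpt
mono = audio.mean(axis=1) if audio.ndim > 1 else audio

out_len = len(mono) + 30 * fs                    # render 30 extra seconds of repeats
delay = int(0.375 * fs)                          # hypothetical delay time
feedback = 0.98                                  # "infinite" repeats, just shy of 1.0

buf = np.zeros(out_len)
buf[: len(mono)] = mono
for i in range(delay, out_len):                  # simple feedback delay line (slow but clear)
    buf[i] += feedback * buf[i - delay]

driven = np.tanh(3.0 * buf)                      # soft-clip in place of the amp plugin
ramp = np.linspace(0.0, 1.0, out_len)            # volume automation towards the climax
sf.write("piano_loop_delay_sketch.wav", driven * ramp / np.max(np.abs(driven) + 1e-12), fs)
```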

Musical Triggers

I achieved this element by selecting orchestral instruments that fit the brand of the film score and performing sweeps at key visual-sync moments. The instruments used for this included clarinet, piano and kick.

A Visual Leap

When I first received the imagery that follows the fantasy section of the film, I noticed a potential issue: the style of the visuals (from the title sequence onwards) is radically different to that of the preceding scenes. This was overcome by sonically branding the film with the main lullaby, as the audience would already have heard different forms of the piece, which helps to audibly connect the scenes. The choice of orchestral instruments morphed the electronic style of the piece into an electro-orchestral hybrid, connecting the preceding musical style of the film to the scenes following the sequence. I believe this piece of music played a vital role in smoothing over this leap in the visuals.

Here is the completed sequence: