De-essing

De-essing (also desibilizing) is any technique intended to reduce or eliminate the excessive prominence of sibilant consonants, such as the sounds normally represented in English by "s", "z", "ch", "j" and "sh", in recordings of the human voice. Sibilance lies anywhere between 2 and 10 kHz, depending on the individual voice.

Excess sibilance can be caused by compression, microphone choice and technique, or simply the shape of a speaker's mouth. Ess sounds can be irritating to the ear, especially through earbuds or headphones, and can interfere with an otherwise well-modulated and pleasant audio stream.

Process of de-essing

De-essing is a dynamic audio editing process: it acts only when the level of the signal in the sibilant range (the ess sound) exceeds a set threshold. While a sibilant ess sound is present, de-essing temporarily reduces the level of high-frequency content in the signal. This differs from equalization, which is a static change in level among many frequencies. However, a static equalization cut applied only to the ess frequencies can also be used to reduce the level of sibilance.
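The threshold behavior can be illustrated with a small gain computer. This is a minimal sketch, not taken from any particular de-esser: the function name `gain_reduction_db`, the default threshold of -30 dB, and the compressor-style `ratio` parameter are all assumptions chosen for illustration.

```python
def gain_reduction_db(sibilant_level_db, threshold_db=-30.0, ratio=4.0):
    """Return the gain change in dB (zero or negative) for one level reading.

    sibilant_level_db is the measured level of the sibilant band only.
    The threshold and ratio defaults are illustrative assumptions.
    """
    overshoot = sibilant_level_db - threshold_db
    if overshoot <= 0.0:
        # Below the threshold the de-esser does nothing.
        return 0.0
    # Above the threshold only 1/ratio of the overshoot is kept,
    # so the rest of the overshoot becomes gain reduction.
    return -(overshoot - overshoot / ratio)
```

A reading of -40 dB is below the assumed threshold and leaves the signal untouched, while a reading of -22 dB exceeds it by 8 dB and yields 6 dB of reduction at a ratio of 4. A static equalizer, by contrast, would apply the same cut to both readings.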

Side-chain compression or broadband de-essing

With this technique, the signal feeding the side-chain of a dynamic range compressor is equalized or filtered so that the sibilant frequencies are most prominent. As a result, the compressor only reduces the level of the signal when there is a high level of sibilance. The gain reduction, however, applies over the entire frequency range. Because of this, attack and release times are extremely important, and the threshold cannot be set as low as with other de-essing techniques without producing more obvious audible artifacts.
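A minimal sketch of this idea in plain Python follows. It assumes a first-order high-pass as the side-chain filter, a simple attack/release envelope follower, and a linear-amplitude threshold; the function names and every default value are illustrative choices, not settings from any specific product.

```python
import math

def one_pole_highpass(x, cutoff_hz, sample_rate):
    """First-order high-pass, used here only for the side-chain signal."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    y, prev_x, prev_y = [], 0.0, 0.0
    for s in x:
        prev_y = alpha * (prev_y + s - prev_x)
        prev_x = s
        y.append(prev_y)
    return y

def broadband_deess(x, sample_rate, cutoff_hz=5000.0, threshold=0.1,
                    ratio=4.0, attack_ms=1.0, release_ms=50.0):
    """Reduce the whole signal while the filtered side-chain is loud."""
    side = one_pole_highpass(x, cutoff_hz, sample_rate)
    att = math.exp(-1.0 / (attack_ms * 0.001 * sample_rate))
    rel = math.exp(-1.0 / (release_ms * 0.001 * sample_rate))
    env = 0.0
    out = []
    for s, d in zip(x, side):
        level = abs(d)
        # Rise with the attack time, fall with the release time.
        coeff = att if level > env else rel
        env = coeff * env + (1.0 - coeff) * level
        if env > threshold:
            gain = (threshold / env) ** (1.0 - 1.0 / ratio)
        else:
            gain = 1.0
        # The gain is applied to the full-band input, not the side-chain.
        out.append(s * gain)
    return out
```

In this sketch the detector never hears the low frequencies, but the gain reduction still affects them, which is why the timing constants matter so much in the broadband approach.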

Split-band compression

Here, the signal is split into two frequency ranges: a range that contains the sibilant frequencies, and a range that does not. The range containing the sibilant frequencies is sent to a compressor. The other range is not processed. Finally, the two ranges are combined back into one signal.

The original signal can either be split into high (sibilant) and low frequencies, or split so that the frequencies both below and above the sibilance are untouched. This technique is similar to multi-band compression.
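The two-band (high/low) variant can be sketched as follows. This is an illustrative simplification: it assumes a one-pole low-pass as the crossover, derives the high band by subtraction so that the two bands sum back to the input, and uses a peak follower with instant attack. The name `split_band_deess` and all defaults are hypothetical.

```python
import math

def split_band_deess(x, sample_rate, crossover_hz=5000.0, threshold=0.1,
                     ratio=4.0, release_ms=50.0):
    """Compress only the band above crossover_hz, then recombine."""
    a = 1.0 - math.exp(-2.0 * math.pi * crossover_hz / sample_rate)
    rel = math.exp(-1.0 / (release_ms * 0.001 * sample_rate))
    low = 0.0
    env = 0.0
    out = []
    for s in x:
        low += a * (s - low)   # band without the sibilant frequencies
        high = s - low         # complementary band with the sibilance
        # Peak follower: instant attack, exponential release.
        env = max(abs(high), rel * env)
        if env > threshold:
            gain = (threshold / env) ** (1.0 - 1.0 / ratio)
        else:
            gain = 1.0
        # Only the high band is scaled; the low band passes unprocessed.
        out.append(low + high * gain)
    return out
```

Unlike the broadband approach, the low band here is never attenuated, so a loud ess does not pull down the body of the voice.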

Dynamic equalization

The gain of a parametric equalizer is reduced as the level of the sibilance increases. The frequency range of the equalizer is centered on the sibilant frequencies.
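One way to sketch a dynamic equalizer is to build the bell cut as "input minus a scaled band-pass", where the band-pass has unity gain at the center frequency. The code below assumes that structure, a standard biquad band-pass, and a cut depth equal to the number of decibels by which the band level exceeds the threshold, capped at a maximum. The function name and every default are assumptions for illustration, not a description of any particular plug-in.

```python
import math

def dynamic_eq_deess(x, sample_rate, center_hz=7000.0, q=2.0,
                     threshold=0.05, max_cut_db=12.0, release_ms=50.0):
    """Bell-shaped cut at center_hz whose depth follows the band level."""
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    # Band-pass biquad with 0 dB gain at center_hz (b1 = 0, b2 = -b0).
    b0 = alpha / (1.0 + alpha)
    a1 = -2.0 * math.cos(w0) / (1.0 + alpha)
    a2 = (1.0 - alpha) / (1.0 + alpha)
    rel = math.exp(-1.0 / (release_ms * 0.001 * sample_rate))
    x1 = x2 = y1 = y2 = env = 0.0
    out = []
    for s in x:
        band = b0 * (s - x2) - a1 * y1 - a2 * y2
        x2, x1 = x1, s
        y2, y1 = y1, band
        # Peak follower on the sibilant band only.
        env = max(abs(band), rel * env)
        if env > threshold:
            cut_db = min(max_cut_db, 20.0 * math.log10(env / threshold))
        else:
            cut_db = 0.0
        bell_gain = 10.0 ** (-cut_db / 20.0)
        # Subtracting part of the band lowers the level around center_hz.
        out.append(s - (1.0 - bell_gain) * band)
    return out
```

With no sibilance the cut depth is 0 dB and the signal passes unchanged; as the band level rises, the equalizer gain at the center frequency falls toward the assumed 12 dB limit.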

De-essing with automation

A more recent method of de-essing involves automation of the vocal level in a digital audio workstation (DAW). Wherever problematic sibilance occurs, the level is made to follow automation curves that the user draws in manually.

This method is made feasible by editing automation points directly, as opposed to programming by manipulating gain sliders in a write-mode. An audio engineer would not be able to react fast enough to precisely reduce and restore vocal levels for the brief duration of sibilants during real-time playback.
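Drawn automation amounts to a list of breakpoints that the DAW interpolates between. The sketch below assumes breakpoints given as `(time_in_seconds, gain_in_dB)` pairs with strictly increasing times and linear interpolation between them; real DAWs store and interpolate automation in their own ways, and the function names are hypothetical.

```python
def automation_gain(breakpoints, t):
    """Interpolate the gain in dB at time t from (time_s, gain_db) points.

    Assumes the points are sorted with strictly increasing times.
    """
    if t <= breakpoints[0][0]:
        return breakpoints[0][1]
    for (t0, g0), (t1, g1) in zip(breakpoints, breakpoints[1:]):
        if t0 <= t <= t1:
            return g0 + (g1 - g0) * (t - t0) / (t1 - t0)
    # Past the last point the final value is held.
    return breakpoints[-1][1]

def apply_automation(x, sample_rate, breakpoints):
    """Scale each sample by the automation curve evaluated at its time."""
    out = []
    for n, s in enumerate(x):
        gain_db = automation_gain(breakpoints, n / sample_rate)
        out.append(s * 10.0 ** (gain_db / 20.0))
    return out
```

For example, the points `[(0.0, 0.0), (1.0, 0.0), (1.01, -6.0), (1.09, -6.0), (1.10, 0.0)]` would dip the vocal by 6 dB for a sibilant lasting roughly 80 ms, with 10 ms ramps on either side, which is far quicker than a fader could be moved by hand.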

De-essing without automation or with manual equalization

Audio editing software, whether professional or amateur, such as Audacity, can reduce or eliminate the sibilant ess sounds that interfere with a recording by using its built-in equalization effects. A common method with Audacity is described here. The process has two phases: 1) analyze the frequency of the voice's ess sound by sampling several instances and calculating the range of ess frequencies, which most likely fall between 4,000 and 10,000 Hz depending on the speaker, then 2) apply an equalization effect to reduce the level of those frequencies by 4 dB to 11 dB during ess events.

In Audacity, the procedure for manual de-essing is as follows (it is analogous in many other audio editing interfaces):

1. Make 2-3 copies of an audio file containing the ess sounds, or make a file to use for testing purposes.
2. On the copy, locate one or more sibilant ess sounds, such as in the words "lasso," "necessary," or "essence."
3. Highlight only the ess frequency spike and choose Analyze > Plot Spectrum to visualize the frequency range at which the speaker's ess sound appears. Analyze at least 3 samples.
4. Highlight a few seconds containing the ess (or the entire file), choose Effect > Equalization, and click Flatten to set the base equalization to 0 dB (no changes).
5. Follow the software's equalization steps to reduce the target ess frequency range by 4 dB to 14 dB. The software will reduce (equalize) those frequencies by the number of decibels you select. Press OK to apply the changes.
6. Review the changes. Use more copies of the test file as required. Larger cuts (-7 to -14 dB) may make the word sound muffled or mushy. Too small a cut may not eliminate the ess irritant. Experiment to find the correct equalization for the ess.
7. After testing, save the equalization profile so it can be applied to future audio files.
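The analysis phase of this procedure (finding where a speaker's ess energy sits across several samples) can also be sketched in code. The example below is not how Audacity's Plot Spectrum works internally; it is a minimal illustration that assumes each ess has already been cut out as a short list of samples, uses a plain discrete Fourier transform, and searches only an assumed 2,000-10,000 Hz window. The function names are hypothetical.

```python
import cmath
import math

def peak_frequency(clip, sample_rate, lo_hz=2000.0, hi_hz=10000.0):
    """Return the frequency in Hz of the strongest DFT bin in the window."""
    n = len(clip)
    best_hz, best_mag = 0.0, -1.0
    for k in range(1, n // 2):
        hz = k * sample_rate / n
        if hz < lo_hz or hz > hi_hz:
            continue
        # Naive DFT of a single bin; fine for short clips.
        bin_value = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                        for i, s in enumerate(clip))
        if abs(bin_value) > best_mag:
            best_hz, best_mag = hz, abs(bin_value)
    return best_hz

def ess_range(clips, sample_rate):
    """Return (lowest, highest) peak frequency found across the clips."""
    peaks = [peak_frequency(c, sample_rate) for c in clips]
    return min(peaks), max(peaks)
```

Passing in at least three ess clips, as the procedure suggests, gives a lowest and highest peak frequency that can serve as a starting point for the band to cut in the equalizer. The frequency resolution is limited to `sample_rate / len(clip)`, so very short clips give only a coarse estimate.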