Amplitude Descriptors
Root Mean Square (RMS)
ID: rms
A measure of the amplitude (energy) of the current audio frame. It represents how loud the sound is within a short time window. Higher RMS values indicate louder sounds, while lower values indicate quieter sounds or silence.
OpenScofo implements the following equation:
\[ \mathrm{RMS} = \sqrt{\frac{1}{N} \sum_{n=0}^{N-1} x[n]^2} \]
where \(\mathrm{RMS}\) is the root mean square of the amplitude, representing the energy of the time-domain signal \(x[n]\) of length \(N\).
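As an illustration of the root mean square computation described above, here is a minimal Python sketch. It is not OpenScofo's code; the function name `rms` and the choice to return `0.0` for an empty frame are assumptions made for this example.

```python
import math

def rms(frame):
    """Root mean square of a sequence of time-domain samples."""
    if not frame:
        # Assumption for this sketch: an empty frame has zero energy.
        return 0.0
    return math.sqrt(sum(x * x for x in frame) / len(frame))

# A full-scale sine covering a whole number of cycles has an RMS of 1/sqrt(2).
sine = [math.sin(2 * math.pi * 8 * n / 64) for n in range(64)]
print(rms(sine))
```

Halving the amplitude of the input halves the RMS value, which is why RMS tracks loudness changes within a short window.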
dB
ID: db
A logarithmic measure of sound level derived from the signal amplitude. Unlike RMS, which measures the raw energy of the signal, dB expresses this energy on a logarithmic scale that better reflects how humans perceive changes in loudness.
The equation implemented is:
\[ \mathrm{dB} = 20 \log_{10}(\mathrm{RMS}) \]
Note that neither librosa nor essentia implements this dB descriptor directly, but because it is derived from RMS, converting their RMS output to dB yields values compatible with both.
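To make the conversion concrete, the sketch below turns an RMS value into decibels using the conventional amplitude formula \(20 \log_{10}(\mathrm{RMS})\), with full scale (RMS = 1.0) as the 0 dB reference. The function name `rms_to_db` and the `floor_db` clamp for silent frames are assumptions of this example, not values taken from OpenScofo.

```python
import math

def rms_to_db(rms_value, floor_db=-120.0):
    """Convert a linear RMS value to decibels relative to full scale."""
    if rms_value <= 0.0:
        # log10(0) is undefined, so silent frames are clamped to a floor.
        # The -120 dB floor is an assumption made for this sketch.
        return floor_db
    return max(20.0 * math.log10(rms_value), floor_db)

print(rms_to_db(1.0))  # full scale maps to 0 dB
print(rms_to_db(0.5))  # halving the amplitude lowers the level by about 6 dB
```

The same function can be applied to an RMS value produced by another library to obtain comparable dB figures.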
Max Amplitude
ID: maxamp
Maximum normalized spectral amplitude detected in the current audio frame. The equation used is \(\mathrm{MaxAmp} = \max_{k} |X[k]|\), where \(X[k]\) is the \(k\)-th FFT bin.
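The following sketch shows one way to compute the maximum bin magnitude of a frame. It uses a naive DFT from the standard library so that it stays self-contained; a real implementation would use an FFT. The text does not say how the spectrum is normalized, so the \(2/N\) scaling here (which maps a bin-centred sinusoid of amplitude \(A\) to \(A\)) is an assumption, as is the function name `max_amplitude`.

```python
import cmath
import math

def max_amplitude(frame):
    """Largest normalized DFT bin magnitude over bins 0..N/2."""
    n = len(frame)
    if n == 0:
        return 0.0
    best = 0.0
    for k in range(n // 2 + 1):
        # Naive DFT of bin k; O(N^2) overall, fine for a short illustration.
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        # Assumed normalization: 2/N, so a sinusoid of amplitude A reads A.
        best = max(best, abs(acc) * 2.0 / n)
    return best

# A sinusoid of amplitude 0.5 centred on bin 4 of a 32-sample frame.
tone = [0.5 * math.sin(2 * math.pi * 4 * i / 32) for i in range(32)]
print(max_amplitude(tone))
```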
Loudness
ID: loudness
An estimate of perceived sound intensity based on psychoacoustic models of human hearing. Unlike dB, it applies perceptual models and filters derived from psychoacoustic studies to approximate how humans actually perceive loudness.
The equation used is:
\[ L = -0.691 + 10 \log_{10}\left(\frac{1}{N} \sum_{n=0}^{N-1} y[n]^2\right) \]
where \(y[n]\) denotes the audio samples after applying the filtering stage defined in the ITU‑R BS.1770 recommendation, and \(N\) is the number of samples in the analyzed frame. This formulation corresponds to the energy-based loudness estimate used in the measurement procedure defined by the standard.
The loudness descriptor implemented in essentia is based on a simplified perceptual model derived from signal energy with a power-law compression. While computationally inexpensive, it does not incorporate perceptual frequency weighting or the measurement procedure defined in modern broadcast loudness standards.
OpenScofo instead implements loudness estimation following the methodology described in ITU‑R BS.1770. This approach applies perceptual filtering prior to the energy calculation and expresses the result in a logarithmic scale, which aligns with the methodology adopted in contemporary loudness measurement practices for audio production and broadcasting.
For reference, OpenScofo's implementation follows the code in klangfreund/LUFSMeter.
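The filtering-then-energy procedure described above can be sketched as follows for a single channel. This is not OpenScofo's or LUFSMeter's code: it hard-codes the two K-weighting biquad stages with the coefficients that ITU‑R BS.1770 publishes for 48 kHz audio, so it assumes a 48 kHz sample rate. It also resets the filter state on every call and applies no gating. The names `biquad` and `frame_loudness` and the `floor` value are hypothetical.

```python
import math

# K-weighting coefficients from ITU-R BS.1770 for a 48 kHz sample rate.
# Each tuple is (b0, b1, b2, a1, a2), with a0 normalized to 1.
SHELF_48K = (1.53512485958697, -2.69169618940638, 1.19839281085285,
             -1.69065929318241, 0.73248077421585)
HIGHPASS_48K = (1.0, -2.0, 1.0, -1.99004745483398, 0.99007225036621)

def biquad(samples, coeffs):
    """Apply one direct-form I biquad section, starting from zero state."""
    b0, b1, b2, a1, a2 = coeffs
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def frame_loudness(frame, floor=-120.0):
    """Ungated, single-channel loudness of one frame of 48 kHz audio."""
    if not frame:
        return floor
    # Stage 1: high-shelf pre-filter. Stage 2: high-pass filter.
    y = biquad(biquad(frame, SHELF_48K), HIGHPASS_48K)
    mean_square = sum(v * v for v in y) / len(y)
    if mean_square <= 0.0:
        # Assumed floor for digital silence, where the logarithm is undefined.
        return floor
    return max(-0.691 + 10.0 * math.log10(mean_square), floor)

# One second of a full-scale 997 Hz sine, the standard's reference signal.
tone = [math.sin(2 * math.pi * 997 * n / 48000) for n in range(48000)]
print(round(frame_loudness(tone), 2))
```

The standard specifies that a full-scale 997 Hz sine in one channel should read -3.01 LKFS, so the printed value should land close to that. A streaming implementation would carry the filter state across frames instead of restarting from zero each time.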
Silence Probability
ID: silence
Probability that the current frame corresponds to silence, derived from Loudness (\(L\)) via a logistic function where \(\alpha = 0.25\) and \(L_0 = -60.0\):
\[ P_{\text{silence}} = \frac{1}{1 + e^{\alpha (L - L_0)}} \]
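A minimal sketch of this mapping is given below, using the stated \(\alpha = 0.25\) and \(L_0 = -60.0\). The sign convention is an assumption: the logistic is oriented so that the probability rises as loudness falls below \(L_0\), which is the only orientation consistent with a silence probability. The function name `silence_probability` is hypothetical.

```python
import math

ALPHA = 0.25   # slope of the logistic, from the text
L0 = -60.0     # loudness at which the probability is 0.5, from the text

def silence_probability(loudness):
    """Map a loudness value to a probability of silence in [0, 1]."""
    z = ALPHA * (loudness - L0)
    if z > 700.0:
        # Guard against overflow in math.exp for very large arguments.
        return 0.0
    # Assumed orientation: quieter frames give probabilities closer to 1.
    return 1.0 / (1.0 + math.exp(z))

for level in (-80.0, -60.0, -40.0, -20.0):
    print(level, silence_probability(level))
```

With these constants the transition is fairly sharp: a frame 20 units below \(L_0\) is classified as silence with a probability above 0.99, and a frame 20 units above it falls below 0.01.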