Amplitude Descriptors
Root Mean Square (RMS)
ID: rms
A measure of the amplitude (energy) of the current audio frame. It represents how loud the sound is within a short time window. Higher RMS values indicate louder sounds, while lower values indicate quieter sounds or silence.
-
Equation
\(RMS = \sqrt{\frac{1}{N} \sum_{n=0}^{N-1} x[n]^2}\)
-
Notes
The root mean square of the amplitude, representing the energy of the time-domain signal \(x[n]\) of length \(N\).
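As an illustration of the equation above, here is a minimal Python sketch of the RMS computation over one frame. The function name `rms` and the choice to return `0.0` for an empty frame are assumptions made for this example, not part of any library API.

```python
import math


def rms(frame):
    """Root mean square of a sequence of time-domain samples x[n]."""
    n = len(frame)
    if n == 0:
        # Assumption for this sketch: an empty frame has zero energy.
        return 0.0
    # sqrt( (1/N) * sum of x[n]^2 )
    return math.sqrt(sum(x * x for x in frame) / n)


# A frame alternating between +0.5 and -0.5 has an RMS of 0.5.
frame = [0.5, -0.5, 0.5, -0.5]
print(rms(frame))
```

A full-scale sine wave analyzed over whole periods gives an RMS close to \(1/\sqrt{2} \approx 0.707\), which is a convenient sanity check.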
dB
ID: db
A logarithmic measure of sound level derived from the signal amplitude. Unlike RMS, which measures the raw energy of the signal, dB expresses this energy on a logarithmic scale that better reflects how humans perceive changes in loudness.
-
Equation
\[L_{dB} = 20 \log_{10}(RMS)\]
-
Notes
Neither librosa nor essentia implements dB as a descriptor. Because dB is derived from RMS, converting the RMS value from either library to dB gives a result that is compatible with both.
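The conversion from RMS to dB can be sketched as follows. The function name `rms_to_db` and the `floor_db` parameter are hypothetical: the logarithm is undefined for an RMS of zero, so this sketch assumes a lower bound of -120 dB for silent frames. The source does not say how silence is handled.

```python
import math


def rms_to_db(rms_value, floor_db=-120.0):
    """Convert a linear RMS value to dB using 20 * log10(RMS).

    floor_db is an assumed lower bound for this sketch, used because
    log10(0) is undefined for completely silent frames.
    """
    if rms_value <= 0.0:
        return floor_db
    return max(20.0 * math.log10(rms_value), floor_db)


print(rms_to_db(1.0))   # full-scale RMS maps to 0 dB
print(rms_to_db(0.5))   # halving the RMS lowers the level by about 6 dB
print(rms_to_db(0.0))   # silent frame is clamped to the assumed floor
```

Any RMS value, whatever library produced it, can be passed through the same conversion.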
Max Amplitude
ID: maxamp
Maximum normalized spectral amplitude detected in the current frame of audio.
-
Equation
\(MaxAmp = \max_{k} |X[k]|\)
where \(X[k]\) is an FFT bin.
-
Notes
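A minimal sketch of the equation above, using only the Python standard library. It evaluates a naive \(O(N^2)\) DFT rather than a real FFT, which is fine for illustration but too slow for production use. The function name `max_amplitude` is hypothetical. The normalization applied to the spectrum is not specified here, so this sketch returns the unscaled magnitude.

```python
import cmath
import math


def max_amplitude(frame):
    """Largest magnitude |X[k]| over the non-negative frequency bins.

    Uses a naive DFT for clarity. No normalization is applied because
    the scaling convention is an unknown in this sketch.
    """
    n = len(frame)
    best = 0.0
    for k in range(n // 2 + 1):
        acc = 0j
        for i, x in enumerate(frame):
            acc += x * cmath.exp(-2j * cmath.pi * k * i / n)
        best = max(best, abs(acc))
    return best


# A unit-amplitude sine with exactly 4 cycles in 64 samples peaks
# at bin 4 with magnitude N / 2 = 32.
frame = [math.sin(2 * math.pi * 4 * i / 64) for i in range(64)]
print(max_amplitude(frame))
```

Dividing the result by \(N/2\) would map a full-scale sine to 1.0, which is one common convention, though whether it matches the normalization meant here is an assumption.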
Loudness
ID: loudness
An estimate of perceived sound intensity based on psychoacoustic models of human hearing. Unlike dB, it applies perceptual models and filters derived from psychoacoustic studies to approximate how humans actually perceive loudness.
-
Equation
\(L = -0.691 + 10 \log_{10}\left(\frac{1}{N}\sum_{n=0}^{N-1} y[n]^2\right)\)
where \(y[n]\) denotes the audio samples after applying the filtering stage defined in the ITU‑R BS.1770 recommendation. The term \(N\) represents the number of samples in the analyzed frame.
This formulation corresponds to the energy-based loudness estimate used in the loudness measurement procedure defined by the standard.
-
Notes
The loudness descriptor implemented in essentia is based on a simplified perceptual model derived from signal energy with a power-law compression. While computationally inexpensive, it does not incorporate perceptual frequency weighting or the measurement procedure defined in modern broadcast loudness standards. OpenScofo instead implements loudness estimation following the methodology described in ITU‑R BS.1770. This approach applies perceptual filtering prior to the energy calculation and expresses the result on a logarithmic scale, in line with current loudness measurement practice in audio production and broadcasting.
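The following is a rough sketch of the loudness equation for a mono signal, not OpenScofo's actual implementation. It assumes a 48 kHz sample rate, because the two K-weighting biquad stages use the coefficients tabulated in ITU‑R BS.1770 for that rate. The names `biquad` and `loudness`, the `floor` parameter, and the decision to filter the whole buffer from a zero initial state are all choices made for this example. Gating and multi-channel summation from the standard are omitted.

```python
import math

# K-weighting coefficients from ITU-R BS.1770, valid at 48 kHz only.
# Stage 1: high-shelf pre-filter. Stage 2: RLB high-pass filter.
SHELF_B = (1.53512485958697, -2.69169618940638, 1.19839281085285)
SHELF_A = (-1.69065929318241, 0.73248077421585)   # a1, a2 (a0 = 1)
HP_B = (1.0, -2.0, 1.0)
HP_A = (-1.99004745483398, 0.99007225036621)      # a1, a2 (a0 = 1)


def biquad(samples, b, a):
    """Direct form I second-order IIR filter, starting from zero state."""
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        x2, x1 = x1, x0
        y2, y1 = y1, y0
        out.append(y0)
    return out


def loudness(frame, floor=1e-12):
    """L = -0.691 + 10 * log10(mean(y[n]^2)) on K-weighted samples y[n].

    floor is an assumed lower bound on the mean square so that silent
    input does not hit log10(0).
    """
    if not frame:
        return -0.691 + 10.0 * math.log10(floor)
    y = biquad(biquad(frame, SHELF_B, SHELF_A), HP_B, HP_A)
    mean_square = sum(v * v for v in y) / len(y)
    return -0.691 + 10.0 * math.log10(max(mean_square, floor))


# One second of a full-scale 997 Hz sine at 48 kHz.
sine = [math.sin(2 * math.pi * 997 * i / 48000) for i in range(48000)]
print(round(loudness(sine), 2))
```

For a full-scale sine near 1 kHz the result should be close to -3 dB, since the constant -0.691 offsets the gain of the K-weighting filter at that frequency. In frame-by-frame analysis, the filter state would normally be carried across frames rather than reset as it is here.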
Silence Probability
ID: silence
Probability that the current frame corresponds to silence, derived from Loudness (\(L\)) via a logistic function where \(\alpha = 0.25\) and \(L_0 = -60.0\):
-
Equation
\(P_{silence} = \frac{1}{1 + e^{\alpha (L - L_0)}}\)
-
Notes
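The logistic mapping above can be sketched directly in Python. The function name `silence_probability` is hypothetical. The defaults `alpha=0.25` and `l0=-60.0` are the values stated in the description.

```python
import math


def silence_probability(loudness, alpha=0.25, l0=-60.0):
    """P_silence = 1 / (1 + exp(alpha * (L - L0)))."""
    return 1.0 / (1.0 + math.exp(alpha * (loudness - l0)))


# At L = L0 the probability is exactly 0.5. Quieter frames move
# toward 1, louder frames toward 0.
for level in (-80.0, -60.0, -40.0):
    print(level, round(silence_probability(level), 4))
```

With \(\alpha = 0.25\), a frame 20 units of loudness above \(L_0\) already has a silence probability below 1%, and a frame 20 units below has one above 99%, so most of the transition happens within a fairly narrow band around \(L_0\).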