Theoretical basis for acoustics¶
Table of contents¶
What is sound (general description)
Mathematical reminders: Sine wave, trigo, complex, integral
Time domain:
Energy, Power… + Table of 5 different signals
Decibels
Time features
Frequency domain: FFT, PSD, PS…
Recording sounds: how a recording chain works
Three visualizations of signals: time pressure, spectrum, Spectrogram
Propagation
1) What is sound?¶
Sound is generated by the movement or vibration of any immersed object in an elastic/compressible medium (solid or fluid), such as air or water. Water density (1025) is more than 1000 times higher than air density (1.225). Thus, sound waves travel five times faster and further in water than in air (approximately 1500 m.s-1 vs 300 m.s-1) and have longer wavelengths. This velocity is the same for all frequencies, but in water is altered by environmental conditions which affect the density of water such as salinity, temperature and depth. Water is an efficient medium for sound propagation due to low absorption rates and thus low attenuation.
Sound can be detected as pressure fluctuations p(t) in the medium above and below the ambient hydrostatic pressure at any given depth in a static liquid (p0) (sound pressure). The sound wave is an example of compressional or longitudinal wave, that compresses (high pressure zones) and expands (low pressure zone) alternatively the medium as it passes (Figure 1). Its acts in all directions. The particles in a longitudinal wave move parallel to the direction in which the wave is traveling. This is a scalar quantity that can be described in terms of its magnitude and its temporal and frequency characteristics. The pressure corresponds to a quantity of force per surface unit measured in atmosphere units, the SI unit is the Pascal (1 Pa = 1 N.m-2).
Figure 1: Representation of the sound pressure through the back and forth movement of particles
How to characterize a sound?¶
A wave has a repeating pattern: one complete repetition of the pattern is a cycle and the time to complete a cycle is the period (P).
The frequency (or pitch) of sound waves (f) refers to the number of wave cycles (i.e. number of repetitions) per second, and it is expressed in ‘Hertz’ (Hz or cycles.s-1).
\begin{equation} f = \frac{1}{P} \end{equation}
For example, a frequency of 100 Hz means that particles vibrate 100 times (back to forth) in one second. If we talk about sound perception, then low frequencies correspond to deep tones (or low pitched sounds, low vibration’s speed of particles) and high values correspond to high-pitched sounds (high vibration’s speed of particles). For example, the human ear is able to detect (in average) frequencies between 20 Hz and 20 kHz. Below 20 Hz (infrasounds), particles vibrations are too slow to be detected by human hear. Above 20 kHz (ultrasounds), particles vibrations are too high. The peak sensitivity (the ability to hear quiet sounds) occurs between 100-4000 Hz within which speech falls.
The distance that a sound wave travels in one period is called the wavelength (λ). Because the wavelength depends on the sound speed (c), a sound underwater will have a longer wavelength compared to the same sound emitted in air.
\begin{equation} λ = {c}\times{P} = \frac{c}{f} \end{equation}
The amplitude (crest, cf Figure 1) of the sound pressure is related to the loudness of the sound. More the amplitude (in Pascal) will be elevated, more the sound will be loud.
Sound pressures can also be characterized with their duration (in seconds).
2) Sine waves and trigonometry¶
A sine wave, or sinusoid, is a continuous wave representing periodic oscillations of constant amplitude (periodic = repetitive). It is characterized by 3 different arguments previously described: frequency, duration, amplitude. Sine waves are useful in acoustics, because sounds can be represented as myriads of different sine waves. Notice that we do not mention cosine (or other) waves, because these trigonometric functions can be related to sine waves. Indeed, all of trigonometry can be proven from only the Pythagorean theorem, that the sum of the interior angles in a triangle are 180° degrees, and the law of sines. The cosine wave has the same shape as its sine wave counterpart that is it is a sinusoidal function, but is shifted by +90o or one full quarter (= pi/2 in trigonometric circle) of a period ahead of it. Thus, sine waves are directly related to trigonometry circles.
The sine function permits to build a wave, and expresses a length from an angle.
Then, we have the next formula for sine waves:
\begin{equation} y(t) = A sin(\frac{2π}{P} t) = A sin(2π f t) \end{equation}
Where y(t) = amplitude accrosse time, A = amplitude, sin(2π f t) = argument (an angle, in radian).
We can also change the phase of the sine wave by adding another radius ϕ to the argument. If ϕ > 0: the phase is moved forward; if ϕ < 0: the phase is moved backward. For example, if ϕ = + π/2, the sine wave has the same phase as the cosine wave.
If we add sinewaves of same frequency, their sum is a sine wave of same frequency but with a different amplitude and phase. If we add sinewaves of opposite frequency/phase, then their wavelengths canceled each other. If we add together sine waves of different frequencies, then the resultant waveform is no longer a sine wave.
3) Representation in complex numbers¶
The notation in exponential complex is generally preferred because it facilitates mathematical operations on signals. Let z = a + bi denote a complex number with real part a and imaginary part bi.
Then the absolute value or modulus or amplitude of z, denoted |z|, is the distance from z to 0 in the complex plane and is computed by the next formula, which is the Pythagorean Theorem.
\begin{equation} |z| = \sqrt{a²+b²} \end{equation}
It can also be calculated as:
\begin{equation} z = |z| e^{ix} \end{equation}
Where e^{ix} = cos(x)+ i sin(x) and t is the angle or argument of the complex number.
If z = a + bi is a complex number, then the following complex number is termed the conjugate of z. It is a very important notion in signal processing.
\begin{equation} \overline{z} = a - ib \end{equation}
The product of a complex with its conjugate is equal to the squared modulus as:
\begin{equation} z x \overline{z} = |z|^{2} \iff (a + ib)(a - ib) = a^{2} + b^{2} \end{equation}
Then, new sine wave formula can be represented as:
\begin{equation} y(t) = A e^{i(2π f t)} \end{equation}
with
\begin{equation} e^{i\theta} = cos (\theta) + i sin(\theta) \end{equation}
4) Explanation of the Fourier Transform¶
Frequency domain permits to estimate the frequency content of a complex signal. Indeed, sounds are not composed of “pure tones” or pure frequency sine waves, but are much more complex and composed of a myriad of sine waves. Thus, we need to find a way to decompose these types of complex signals. The Fourier transform states that a signal can be visualized as a linear combine of sine waves, all characterized with their own frequency, amplitude (i.e. contribution to the signal) and phase (advance, delay).
–> How to decompose a such complex signal? We need to pass through the frequency domain via the Fourrier Transform function.
A visual explanation of the effects of the Fourier transform
The discrete Fourier transform¶
Because sounds are recorded and analyzed as discrete signals, we need to compute their frequency domain using the discrete Fourier transform:
With: - N = number of time samples - n = current sample we are considering (from 0 to N-1) - X[n] = value of the time signal at time n - f = current frequency we are considering (0 Hz to N-1 Hz) - X[f] = amount of frequency k in the signal (amplitude and phase, a complex number) - n/N is the percent of time we are going through. 2pif is our speed in radians / sec. e^-x is our backwards-moving circular path. The combination is how far we have moved, for this speed and time. - The raw equations for the Fourier Transform just say “add the complex numbers”.
Another visual way to understand the dFT: