Course #50
HD Voice Communication - Coding and Enhancement
We recommend that you submit your
preliminary or firm registration at least 4 weeks before the course
starts to ensure a seat on the course.
TECHNOLOGY FOCUS
The focus of this course is on speech-audio coding and
advanced signal processing algorithms for GSM, UMTS or LTE mobile
phones, digital hearing aids, and human-machine interfaces. Within
the evolution of these systems, the improvement of the speech-audio
quality will remain one of the most important objectives to
mitigate physical constraints and technological limitations. A
comprehensive understanding of fundamental algorithms, standards,
applications, and trends is provided. Conditions and solutions are
presented with regard to audio-bandwidth limitation, bit-rate
restrictions, interference by acoustic background noise,
reverberation, acoustic echo signals, and residual transmission
errors.
COURSE CONTENT
The course covers theory, practice, standards and trends
of speech-audio communication, including state-of-the-art
solutions, the on-going conversion of networks and terminals to HD
voice as well as emerging concepts for spatial HD audio
communication:
- Speech-Audio Coding Standards: ETSI, 3GPP, ITU, and MPEG
- Coding concepts for spatial HD audio communication
- Error Concealment & Soft Decision Source Decoding
- Audio Bandwidth Extension: With and without side
information
- Noise Reduction & Dereverberation: Single- and
Multi-Microphone Techniques
- Acoustic Echo Cancellation: Time- and frequency-domain,
adaptive postfiltering
The coding and processing techniques are demonstrated by audio
examples.

Figure: Digital Speech Transmission and Enhancement
Monday
Speech-Audio CODING
The first part of the course deals with speech-audio
coding. State-of-the-art concepts are discussed. The signal
processing aspects of quantization, differential waveform coding,
linear prediction, Code Excited Linear Prediction (CELP) and
Transform Coding are explained.
Speech Production Model
- Speech Production
- Digital Filter Structures for Speech Production
- Psycho-Acoustics
Linear Prediction
- Vocal Tract Models and Short-Term Prediction
- Optimum Prediction
- Spectral Flatness Measure
- Block-Adaptive Linear Prediction
- Levinson/Durbin Algorithm
- Long-Term Prediction
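
As an illustration of the Levinson/Durbin topic listed above, the following is a minimal Python sketch of the recursion for short-term predictor coefficients. It is not taken from the course material: the function name, the argument layout, and the sign convention (prediction x_hat[n] = sum of a[i-1] * x[n-i]) are assumptions made for this example.

```python
def levinson_durbin(r, order):
    """Solve the normal equations of linear prediction recursively.

    r     : autocorrelation values r[0], ..., r[order]
    order : predictor order p
    Returns (a, k, err):
      a   : predictor coefficients, x_hat[n] = sum_{i=1..p} a[i-1] * x[n-i]
      k   : reflection coefficients, one per recursion stage
      err : power of the remaining prediction error
    """
    a = [0.0] * order
    k = [0.0] * order
    err = r[0]
    for m in range(order):
        # Part of r[m+1] not yet explained by the order-m predictor.
        acc = r[m + 1] - sum(a[i] * r[m - i] for i in range(m))
        k_m = acc / err
        k[m] = k_m
        # Order update of the coefficients found so far.
        prev = a[:m]
        for i in range(m):
            a[i] = prev[i] - k_m * prev[m - 1 - i]
        a[m] = k_m
        # Each stage reduces the error power by the factor (1 - k_m^2).
        err *= 1.0 - k_m * k_m
    return a, k, err
```

For an autocorrelation sequence of the form r[i] = 0.9**i (a first-order autoregressive model), calling `levinson_durbin([1.0, 0.9, 0.81], 2)` yields a first coefficient close to 0.9 and a second coefficient close to zero; the ratio r[0] / err is the prediction gain.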
Quantization
- Uniform and Non-uniform Quantization
- Optimal Quantization
- Adaptive Quantization
- Vector Quantization
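
To make the difference between uniform and non-uniform quantization concrete, the sketch below compares a uniform midrise quantizer with the same quantizer wrapped in a mu-law compander. This is an illustrative example only: the function names, the input range [-1, 1], and the choice of continuous mu-law companding with mu = 255 are assumptions, not a description of any particular codec standard.

```python
import math


def uniform_quantize(x, bits, x_max=1.0):
    """Midrise uniform quantizer with 2**bits levels on [-x_max, x_max]."""
    levels = 2 ** bits
    step = 2.0 * x_max / levels
    # Interval index, clipped to the valid range (overload protection).
    idx = math.floor(x / step)
    idx = max(-(levels // 2), min(levels // 2 - 1, idx))
    # Reconstruct at the centre of the interval.
    return (idx + 0.5) * step


def mu_law_compress(x, mu=255.0):
    """Logarithmic compression of a sample in [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y, mu=255.0):
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def companded_quantize(x, bits, mu=255.0):
    """Non-uniform quantization: compress, quantize uniformly, expand."""
    return mu_law_expand(uniform_quantize(mu_law_compress(x, mu), bits), mu)
```

With 8 bits, a small-amplitude sample such as 0.01 is reproduced more accurately by `companded_quantize` than by `uniform_quantize`, at the price of coarser steps for large amplitudes.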
Speech-Audio Coding
- Model-Based Predictive Coding
- Adaptive Differential Pulse Code Modulation (ADPCM)
- Noise Shaping, Open-Loop and Closed-Loop Prediction
- Code Excited Linear Prediction (CELP)
- Quantization and Line Spectral Frequencies (LSF)
- Transform- and Subband Coding
- Adaptive Bit Allocation
- Audio Examples: Quantization, Coding, Noise Shaping
Tuesday
Post Filtering
- Short-term Post Filter
- Long-term Post Filter
- Tilt Compensation
Quality Assessment
- Mean Opinion Score (MOS)
- Modulated Noise Reference Unit (MNRU, CCITT)
- Objective Quality Measures (PESQ)
Coding Standards
The most relevant speech-audio codec standards for
real-time communication are discussed and demonstrated by audio
examples:
- ITU: G.721/G.726, G.722, G.723, G.728, G.729, G.729.1,
G.711.1, G.718
- ETSI/3GPP: Full Rate, Half Rate, Enhanced Full Rate, AMR,
AMR-WB, AMR-WB+
- MPEG: AAC-LD, AAC-ELD
- Proprietary: SILK, CELT
ERROR CONCEALMENT and SOFT DECISION SOURCE
DECODING
Wireless speech transmission systems usually include
channel coding for error protection.
However, temporarily adverse channel conditions quite
frequently leave residual bit and frame errors. The negative
effects of these errors can be reduced by error concealment,
exploiting both residual source redundancy and information about
the instantaneous quality of the transmission channel:
- Frame Substitution and Standard Solutions (GSM)
- Soft Bits and Log-Likelihood Values
- Soft-Decision Speech Decoding by Parameter Estimation
- A Priori Knowledge and A Posteriori Probabilities
- Graceful Degradation by Soft Decoding
- Joint and Iterative Source-Channel (De-) Coding
- Audio Examples: PCM, ADPCM, GSM
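
The idea behind soft-decision decoding by parameter estimation can be sketched as follows. The example is a simplified illustration under stated assumptions, not the method of any standard: each received bit comes with a log-likelihood ratio LLR = ln(P(bit = 0) / P(bit = 1)), the bits are treated as independent, a priori knowledge is limited to the probabilities of the quantizer indices, and the function names, the codebook, and the MSB-first bit mapping are hypothetical.

```python
import math


def bit_prob_from_llr(llr, bit):
    """Probability that `bit` was sent, given LLR = ln(P(0) / P(1))."""
    llr = max(-50.0, min(50.0, llr))  # avoid overflow in exp()
    p0 = 1.0 / (1.0 + math.exp(-llr))
    return p0 if bit == 0 else 1.0 - p0


def soft_decode(llrs, codebook, prior):
    """Mean-square estimate of a quantized parameter from soft bits.

    llrs     : one log-likelihood ratio per bit, MSB first
    codebook : reconstruction value for each index 0 .. 2**len(llrs) - 1
    prior    : a priori probability of each index
    Returns (estimate, posterior probabilities).
    """
    n_bits = len(llrs)
    post = []
    for index in range(len(codebook)):
        p = prior[index]
        for pos in range(n_bits):
            bit = (index >> (n_bits - 1 - pos)) & 1
            p *= bit_prob_from_llr(llrs[pos], bit)
        post.append(p)
    total = sum(post)
    post = [p / total for p in post]
    # Weight every codebook entry by its a posteriori probability.
    estimate = sum(p * c for p, c in zip(post, codebook))
    return estimate, post
```

With very reliable bits the estimate coincides with the hard-decoded codebook entry. With completely unreliable bits (all LLRs equal to zero) it falls back to the a priori mean of the parameter, which is one way to obtain graceful degradation.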
Wednesday
BANDWIDTH EXTENSION
In the long run, the audio bandwidth of the telephone
networks and terminals will be extended to 7 kHz wideband
transmission. This will require new codecs at both sides of the
transmission link. During the probably very long transition period, many
terminals will not yet be equipped with the wideband capability.
In this situation, the quality of the received narrowband speech
(3.4 kHz) may be improved by means of artificial bandwidth
extension:
- Source Filter Model
- Extension of the Excitation Signal
- Extension of the Spectral Envelope
- Statistical Estimation Based on a Markov State Model
- Bandwidth Extension with Side Information
- Implementation and Performance Evaluation
NOISE REDUCTION, DEREVERBERATION AND BEAMFORMING
If the signal is degraded by acoustic background noise,
reverberation and a loudspeaker signal, various speech
enhancement methods can be applied prior to speech encoding and
transmission. We will discuss state-of-the-art algorithms for noise
reduction and dereverberation, and evaluate new proposals such as
super-Gaussian speech models and psycho-acoustic aspects. Noise
suppression schemes using a single microphone or several
microphones are presented.
Single and Dual Channel Noise Reduction
- Wiener Filter
- Speech Enhancement in the DFT Domain
- Noise Estimation Techniques: Minimum Statistics, MMSE
- More Sophisticated Suppression Rules
- Conditional MMSE- and MAP-Estimation
- Estimation of Complex DFT-Coefficients
- Estimation of Real Valued DFT-Amplitudes
- Estimation with Super-Gaussian Models
- Noise Suppression Exploiting Masking
- Soft Weighting
- Dual Channel Noise Cancellation
- Coherence Function and Theoretical Limits
- Dual Channel Noise Suppression
- Noise Suppression using Small Microphone Arrays
- Audio Examples
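
As a small illustration of speech enhancement in the DFT domain, the sketch below computes a Wiener gain per frequency bin. It is a minimal example under simplifying assumptions: the noise power in each bin is taken as known and positive (in practice it would come from a noise estimator such as minimum statistics), the a priori SNR is approximated by the a posteriori SNR minus one, and the function name and the gain floor value are choices made for this example.

```python
def wiener_gains(noisy_psd, noise_psd, gain_floor=0.1):
    """Per-bin Wiener gains G = xi / (1 + xi).

    noisy_psd  : power of the noisy signal in each DFT bin
    noise_psd  : estimated noise power in each DFT bin (must be > 0)
    gain_floor : lower limit of the gain, limits the maximum attenuation
    """
    gains = []
    for p_y, p_n in zip(noisy_psd, noise_psd):
        snr_post = p_y / p_n                 # a posteriori SNR
        snr_prio = max(snr_post - 1.0, 0.0)  # simple a priori SNR estimate
        gain = snr_prio / (1.0 + snr_prio)
        gains.append(max(gain, gain_floor))
    return gains
```

The enhanced spectrum is obtained by multiplying each noisy DFT coefficient by its gain while keeping the noisy phase. Bins dominated by speech are passed almost unchanged, while bins at or below the noise level are attenuated down to the gain floor.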
Thursday
Multi-channel Noise Reduction
- Spatial Sampling of Sound Fields
- Beamforming
- Performance Measures
- Fixed Beamformers
- Multi-channel Wiener Filter and Postfilter
- Adaptive Beamformers
- Generalized Side-lobe Canceller
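
The simplest fixed beamformer, delay-and-sum, can be sketched in a few lines. The example assumes that the steering delays are already known and are whole numbers of samples; fractional delays, microphone geometry, and the function name are outside the scope of this illustration or are assumptions made for it.

```python
def delay_and_sum(channels, delays):
    """Delay-and-sum beamformer with integer steering delays.

    channels : list of microphone signals (lists of equal length)
    delays   : non-negative delay in samples applied to each channel so
               that the desired source is time-aligned across channels
    Returns the average of the aligned channels.
    """
    n = len(channels[0])
    out = [0.0] * n
    for x, d in zip(channels, delays):
        # Shift channel x by d samples and accumulate.
        for i in range(d, n):
            out[i] += x[i - d]
    return [v / len(channels) for v in out]
```

If the second microphone receives the source one sample later than the first, delaying the first channel by one sample aligns the two signals. The desired signal then adds coherently, whereas sound from other directions is averaged with mismatched delays and is attenuated.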
ACOUSTIC ECHO CONTROL
The key algorithms for acoustic echo control used for
hands-free communication are explained, in particular the
combination of echo cancellation with adaptive post-filtering.
- LMS and NLMS Time Domain Cancellation
- Convergence Analysis and Control
- Echo Cancellation and Postfiltering
- Joint Residual Echo Cancellation and Noise Reduction
- Frequency Domain Method and Block Processing
- Additional Measures
- Stereophonic Acoustic Echo Control
- Audio Examples
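
The time-domain NLMS echo canceller listed above can be illustrated with the following minimal sketch. It assumes a noise-free, time-invariant echo path and no double talk, and it omits step-size control and postfiltering. The function names, the step size of 0.5, and the simulated three-tap echo path are assumptions made for this example.

```python
import random


def simulate_echo(far_end, echo_path):
    """Microphone signal consisting only of the echo (FIR convolution)."""
    return [
        sum(h * far_end[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
        for n in range(len(far_end))
    ]


def nlms_echo_canceller(far_end, mic, num_taps, step=0.5, eps=1e-8):
    """NLMS adaptive filter; returns (residual signal, final coefficients)."""
    w = [0.0] * num_taps    # estimate of the echo path
    buf = [0.0] * num_taps  # latest far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        buf = [x] + buf[:-1]
        echo_est = sum(wi * xi for wi, xi in zip(w, buf))
        e = d - echo_est    # echo-compensated output
        # Normalise the update by the far-end power in the buffer.
        norm = sum(xi * xi for xi in buf) + eps
        w = [wi + step * e * xi / norm for wi, xi in zip(w, buf)]
        residual.append(e)
    return residual, w


if __name__ == "__main__":
    rng = random.Random(0)
    x = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
    d = simulate_echo(x, [0.5, -0.3, 0.1])
    e, w = nlms_echo_canceller(x, d, num_taps=3)
    print("estimated echo path:", [round(wi, 4) for wi in w])
```

Under these idealised conditions the coefficients converge towards the simulated echo path and the residual echo decays. In a real hands-free system, background noise, double talk, and much longer room impulse responses make convergence control and postfiltering necessary.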
What previous participants have said about the course:
"Good overview of all topics."
"The course gave deep theoretical background."
"Complete overview of Speech Transmission, Coding and Noise
Reduction /EC."
"Competent teacher and a very industry oriented course."