Research Colloquium #6: Perceptual Audio Coders – What To Listen For
March 13 @ 17:00 – 18:30
Perceptual audio coding is ubiquitous, and perceptual audio codecs like MP3 and AAC are widely known. The technology has fundamentally changed the way we transmit, store and consume music, fueling applications like Internet streaming and music downloads. Perceptual audio coding combines elements from digital signal processing, coding theory, and psychoacoustics into one system. It is, however, mostly the use of psychoacoustic noise shaping techniques that prompts questions from non-experts (“Is this signal really intact?”) and is sometimes even perceived as a kind of “black magic” within the coder.
To educate users and listeners, the AES Technical Committee on Coding of Audio Signals (TC-CAS) has created educational material titled “Perceptual Audio Coders – What To Listen For”. It comprises tutorial information on the principles and limitations of perceptual audio coding technologies, as well as curated listening examples that train listeners to identify possible audio coding artifacts.
Originally published as a CD-ROM in 2001, the material was updated in content and functionality and released as “The Second / Web Edition” for its 20th Anniversary in 2021. Initially available exclusively as an AES membership benefit, the Web Edition has been available free of charge to the general public since February 2023, on the occasion of the 75th Anniversary of the AES, by courtesy of the AES and its Technical Committee on Coding of Audio Signals.
It is a useful resource for understanding the whys and hows of perceptual audio codecs. It contains a taxonomy of common types of codec artifacts, along with tutorial information on the background of each one. Example audio signals with different degrees of impairment illustrate the nature of the artifacts and help build listener expertise. With this support, you can become your own expert listener in audio coding!
In this research colloquium, the speakers Jürgen Herre (TC chair) and Sascha Dick (editor) guide the audience through this resource.
- Tutorial freely available at: https://aes2.org/resources/audio-topics/audio_coding/perceptual-audio-codecs/
- Promotional Video (YouTube): https://www.youtube.com/watch?v=QKBkBvV-HCw
- AES Paper: Dick: “Introducing the Free Web Edition of the ‘Perceptual Audio Coders – What To Listen For’ Educational Material”, 154th AES Convention, Helsinki, May 2023, Express Paper 87, https://www.aes.org/e-lib/browse.cfm?elib=22112
Prof. Dr.-Ing. Jürgen Herre
Prof. Dr.-Ing. Herre is a Fellow of the Audio Engineering Society (AES), co-chair of the AES Technical Committee on Coding of Audio Signals and vice chair of the AES Technical Council.
He received a degree in Electrical Engineering from Friedrich-Alexander-Universität in 1989 and a Ph.D. degree for his work on error concealment of coded audio in 1995. In 1989 he joined the Fraunhofer Institute for Integrated Circuits (IIS) in Erlangen, Germany, and became a co-developer of the popular mp3 (MPEG-1/2 Audio Layer 3) codec. In 1995, he joined Bell Laboratories for a postdoctoral term, working on the development of MPEG-2 Advanced Audio Coding (AAC). At the end of 1996 he returned to Fraunhofer to work on the development of more advanced multimedia technology, including MPEG-4, MPEG-7, MPEG-D, MPEG-H and MPEG-I. He currently serves as the Chief Executive Scientist for the Audio/Multimedia activities at Fraunhofer IIS, Erlangen. In September 2010, Prof. Dr. Herre was appointed professor at the University of Erlangen and the International Audio Laboratories Erlangen. He is an expert in low bit-rate audio coding/perceptual audio coding, spatial audio coding, parametric audio object coding, perceptual signal processing and audio for Virtual and Augmented Reality (VR/AR).
Dr.-Ing. Sascha Dick
Sascha Dick received his Dipl.-Ing. degree in Information and Communication Technologies from the Friedrich Alexander University (FAU) of Erlangen-Nuremberg, Germany, in 2011 with a thesis on an improved psychoacoustic model for spatial audio coding. In the same year, he joined the Fraunhofer Institute for Integrated Circuits (IIS) in Erlangen, Germany, as a Research Engineer. He has contributed to the development and standardization of audio codecs such as MPEG-H 3D Audio. In 2017 he joined the International Audio Laboratories at FAU to research psychoacoustic effects for three-dimensional (3D) audio, and in 2023 he received a PhD degree for his thesis on “Psychoacoustic Effects and Models for Processing and Coding of 3-Dimensional Audio”. He has since rejoined Fraunhofer IIS to further pursue research and development of 3D audio applications. His research interests include psychoacoustics, multichannel signal processing and audio coding.