On-Demand
Thursday, October 28
 

9:00pm EDT

Acoustic Decoupling Device in Coaxial Compression Driver
Coaxial loudspeakers are designed to reproduce a broad frequency range while keeping a compact form factor. Correct driver integration requires properly managing, at the design phase, the presence of multiple radiating units and the interference between their acoustic emissions; this is essential to obtain a smooth response and a wide crossover region suited to flexibly accommodate different filter designs. Because of the high frequencies they reproduce, recently introduced coaxial compression drivers require particular care to ensure excellent acoustic performance at short wavelengths. Adding an appropriate decoupling device to the structure makes it possible to effectively manage the acoustic emission of the two transducers with respect to each other, improving response regularity and increasing the bandwidth available to the crossover compared with historical approaches.
This Engineering Brief presents the results of the research carried out at B&C Speakers during the development of the patent-pending acoustic decoupling device integrated into the Company's latest two-way coaxial compression driver, the B&C Speakers DCX464.

Speakers
Filippo Bartolozzi, B&C Speakers s.p.a.


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

An Approach for Capturing Multi-Directional Radiation Characteristics of Sound Sources for 3D
In the general practice of immersive audio recording, there is a focus on how to capture the direct sound source. However, a sound source’s complex radiation properties help define that source in three-dimensional space. In this research, the idea of a “holographic sound recording” (HSR) is explored. We choose the term “holographic” due to the technique's uncanny ability to create a rendering of a real 3D sound source in space, similar to holographic visual experiences.

HSRs can be defined as the capture of the complex radiation characteristics of sound sources with the intention of playback either in a virtual environment with six degrees of freedom or in real life through a multi-directional coincident speaker array. To research techniques for creating sonic holographic reproductions, two recording sessions were conducted at NYU in the summer of 2021. Through documenting and reflecting on the miking, mixing, and spatialization of the audio objects in Unity with Google Resonance, experimental realizations were made in search of best practices to consider when creating HSRs.

Concepts such as acoustical points of interest, the adequate number of microphones, and pickup patterns and angles will be explored, as well as capturing room tone and the benefits of player isolation. These aspects are brought together in a holographic miking system called the Multi-Channel Pyramid Array (MPA), offered as a starting point for users who would like to create a holographic reproduction of any instrument. The MPA can theoretically be fine-tuned and customized based on the instrument and/or the user’s desired results.

Speakers
Michael Matsakis, New York University
Parichat Songmuang, Studio Manager/PhD Student, New York University
Parichat Songmuang graduated from New York University with her Master of Music degree in Music Technology and Advanced Certificate in Tonmeister Studies. As an undergraduate, she studied for her Bachelor of Science in Electronics Media and Film with a concentration...
Paul Geluso, New York University


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

An Open Source Turntable for Electro-Acoustical Devices Characterization
This work introduces an Open Source turntable for the measurement of electro-acoustical devices. The idea is to provide an inexpensive and highly customizable device that can be adjusted according to specific measurement needs. Developing such turntable devices in the past required significant investment: specific mechanical and motor-control design skills were needed, leading to both costly and time-consuming processes. Recent developments in mechatronics and 3D printing make it possible to design and build a cost-effective solution.

Speakers
Daniele Ponteggia, Audiomatica srl


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Audio Watermarking Technique Integrating Spread Spectrum and CNN-autoencoder
This e-Brief proposes a novel audio watermarking approach based on Spread Spectrum (SS) that involves a psychoacoustic model and a deep-learning Convolutional Neural Network (CNN) autoencoder. Moreover, logistic chaotic maps are employed to enhance the security of the method. First, a compressed image produced by the CNN autoencoder is fed to the image encryption stage to yield an encrypted image to be used as the watermark. To apply image encryption, the plain image is first 8-bit binary-coded and shuffled by an M-sequence. Next, each encoded image is diffused with a different chaotic sequence. In the embedding phase, the psychoacoustic model is employed to shape the amplitude of the watermark signal, which guarantees high inaudibility, whereas a logistic chaotic map is used to determine the positions for watermark embedding in a random manner. This scheme offers an efficient and practical method, as it utilizes RGB images and can therefore be used by institutions and companies to embed their logos or trademarks as watermarks in audio products. Experimental results show that the transparency and imperceptibility of the proposed algorithm are satisfactory, and that good image quality is preserved even under various attacks. The validity of the proposed audio watermarking method is demonstrated by simulation results.
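
As an illustration of the keying idea described above, here is a minimal sketch, not the authors' exact scheme, of using a logistic chaotic map to derive pseudo-random watermark embedding positions; the map parameters and the position-selection rule are assumptions:

```python
import numpy as np

def logistic_sequence(x0, r, n):
    """Iterate the logistic map x[k+1] = r * x[k] * (1 - x[k])."""
    x = np.empty(n)
    x[0] = x0
    for k in range(1, n):
        x[k] = r * x[k - 1] * (1 - x[k - 1])
    return x

def embedding_positions(key, n_bits, n_frames):
    """Key-dependent pseudo-random frame indices for the watermark bits."""
    x0, r = key                          # secret initial value and map parameter
    chaos = logistic_sequence(x0, r, n_frames)
    return np.sort(np.argsort(chaos)[:n_bits])

positions = embedding_positions(key=(0.3141, 3.99), n_bits=64, n_frames=1024)
```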

Speakers
Noha Korany, Electrical Engineering Department, Faculty of Engineering, Alexandria
Namat Elboghdadly, Electrical Engineering Department, Faculty of Engineering, Alexandria
Mohamed Elabdein, Instructor, Electrical Engineering Department, Alexandria Higher Institute of Engineering and Technology, Alexandria
Audio watermarking, Acoustics


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Automatic Classification of Enclosure-Types for Electrodynamic Loudspeakers
It is investigated whether an automated classification of loudspeaker enclosures can be realized. The acoustic load of the enclosure is reflected in the electrical impedance of the loudspeaker and is hence detectable from the point of view of the power amplifier. In order to classify the enclosures of passive one-way speakers, an artificial neural network is trained on synthetic impedance spectra based on equivalent electrical circuit models. The generalization capability is tested with measured test sets of closed, vented, band-pass and transmission-line enclosures. The resulting classification procedure works well within a synthetic test set. However, generalizing to the measured test data has been shown to require deeper investigation to achieve better separation between the different vented enclosure types.
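
A minimal sketch of this train-on-synthetic, test-on-measured setup, using scikit-learn's generic MLP in place of the authors' unspecified network; the spectra and labels below are random placeholders standing in for the synthetic impedance data:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

CLASSES = ["closed", "vented", "band-pass", "transmission-line"]

# Placeholders standing in for synthetic impedance magnitude spectra derived
# from equivalent electrical circuit models (rows: enclosures, cols: bins).
rng = np.random.default_rng(0)
X_train = rng.random((400, 128))
y_train = rng.integers(0, 4, 400)

clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)
clf.fit(X_train, y_train)

# At test time, X would hold measured impedance spectra of real enclosures.
print([CLASSES[p] for p in clf.predict(X_train[:5])])
```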

Speakers
Johannes Werner, Hochschule Mittweida
Tobias Fritsch, Research Engineer, Fraunhofer IDMT


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Binaural mixing of popular music
In this engineering brief, we present an initial experiment that was conducted to gain insights into what kinds of sound sources would benefit from binaural rendering for typical pop, rock and electronic dance music (EDM) tracks. Original multi-tracks for three different songs (Pop, Rock and EDM) were divided into four elements: drums, bass, guitar/synth and vocals/lead. Eight different mixes of these elements were created in 3rd-order Ambisonics using the RoomEncoder and BinauralDecoder of the IEM Plugin Suite, with different combinations of binauralised and non-binauralised (i.e., stereo) elements within the mixes, ranging from a full stereo mix to a full binaural mix.

A multiple comparison listening test was conducted online, with 21 subjects participating. Their task was to rate the eight mixes in terms of overall immersive experience as well as perceived spatial and timbral qualities. Results showed that mixes with non-binauralised drums were commonly rated higher than mixes with binauralised drums for all three attributes. The full binaural mixes were rated lowest in general, whereas the mixes closest to a full stereo mix tended to be rated highest for Pop and Rock, but less so for EDM. These results suggest that (i) simply panning all sources in binaural would not necessarily lead to a more immersive experience compared to a traditional stereo mix, (ii) a spatial contrast between binauralised and non-binauralised sources might help improve immersiveness (e.g., drums in stereo and guitars widely panned in binaural), and (iii) optimal binaural mixing techniques tend to depend on the genre of music.

Speakers
Hyunkook Lee, Professor, Applied Psychoacoustics Lab, University of Huddersfield
Pablo Abehsera Morell, University of Huddersfield


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Developing plugins for your ears
We present a new intuitive development platform that allows algorithm developers to put plugins in our ears. The growing number of advanced audio processing plugins developed for DAWs is enabling highly creative sound experiences. We explain how plugins for DAWs can be easily ported to the embedded platforms used in ear-worn products and other audio devices. This includes signal processing targeting low-latency, low-power, high-compute and large-memory plugins. We describe an open platform that brings machine-learning-based algorithms directly to the end user. It will also give plugin developers access to data streams from additional sensors and to multichannel audio data beyond stereo music streaming. The next generation of hearables for gaming, music, movies and AR/VR will require processing techniques currently only available to professionals in studios. This new platform allows end users to select, download and control plugins to unlock innovation that fits their individual needs and personal preferences.

Speakers
Gary A. Spittle, Audicus Inc.


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Impression evaluation of mixed acoustic signals with different spatial acoustic characteristics
Teleconference systems have come into wide use, and listeners routinely hear mixed acoustic signals whose spatial acoustic characteristics differ from place to place. However, the perceptual impression of mixed acoustic signals containing multiple different spatial acoustic characteristics has not been sufficiently investigated. In this study, a listening test was performed to survey the differences in impression between mixed acoustic signals with single and with different spatial acoustic characteristics. Three instrumental signals (guitar, bass and drums) were played through a loudspeaker and recorded in three rooms. For the listening test, four mixed acoustic signals were prepared:
1. All instruments were captured in the small reflection room.
2. All instruments were captured in the middle reflection room.
3. All instruments were captured in the large reflection room.
4. Bass was captured in the small reflection room, drums in the middle reflection room and guitar in the large reflection room.
Participants listened to No. 4, compared it with Nos. 1, 2 and 3, and indicated which of the seven evaluation items (resonant, pleasant, natural, coherent, clear, likeable, noisy) applied. Participants perceived No. 1 as a pleasant acoustic signal with little reverberation, and Nos. 2 and 3 as unpleasant acoustic signals with more reverberation, compared with No. 4. This suggests that mixed acoustic signals recorded in the small reflection room are considered the least reverberant and the most comfortable. Homogenizing the spatial acoustic characteristics by suppressing reverberation in the acoustic signals captured in multiple spaces is considered useful for giving a pleasant impression.

Speakers
Shota Okubo, KDDI Research, Inc.
Toshiharu Horiuchi, KDDI Research, Inc.


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Integrating Vibrato Into Artificial Reverberation
This paper explores methods to create vibrato effects with artificial reverberation. Feedback delay networks (FDNs) have been used for many reverb effects. Past work with time-varying feedback delay networks has focused primarily on small modulations of the delays and/or feedback matrices in order to create a more natural-sounding reverb. In this paper, we consider the possibility of using wider modulations of these reverbs for the purpose of sound effect generation. Specifically, amplitude modulation and frequency modulation can be obtained by varying the feedback matrices or the delay lines, respectively. The results showed a convincing vibrato effect with minor artifacts and promise for using FDNs in sound effect generation. Future work will include reducing artifacts and fine-tuning control parameters.
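
The frequency-modulation route described above (varying the delay lines) can be sketched as follows; the delay lengths, feedback matrix, and modulation settings are illustrative assumptions, not the authors' parameters:

```python
import numpy as np

fs = 48000
delays = np.array([1031, 1327, 1523, 1733])   # base delay lengths in samples
rng = np.random.default_rng(0)
A = 0.6 * np.linalg.qr(rng.standard_normal((4, 4)))[0]  # scaled orthogonal feedback matrix
depth, rate = 40.0, 5.0                       # modulation depth (samples) and rate (Hz)

def fdn_vibrato(x):
    """FDN whose delay lines are sinusoidally modulated, yielding frequency
    modulation (vibrato) of the reverberant output."""
    buf_len = int(delays.max() + depth + 2)
    bufs = np.zeros((4, buf_len))
    y = np.zeros(len(x))
    w = 0                                     # circular write index
    for n in range(len(x)):
        d = delays + depth * np.sin(2 * np.pi * rate * n / fs)
        i0 = (w - np.floor(d).astype(int)) % buf_len
        frac = d - np.floor(d)
        # linear interpolation for the fractional, time-varying delays
        taps = (1 - frac) * bufs[np.arange(4), i0] \
             + frac * bufs[np.arange(4), (i0 - 1) % buf_len]
        bufs[:, w] = x[n] + A @ taps          # write input plus feedback
        y[n] = taps.sum()
        w = (w + 1) % buf_len
    return y
```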

Speakers
Sarah R. Smith, University of Rochester
Senyuan Fan, University of Rochester


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Measuring Voice Coil Temperature using Ultrasonic Pilot Tones
Measuring the voice coil temperature of speakers during audio playback is useful for optimizing thermal design and preventing overheating. The established DC method uses a constant current to measure resistance and determine temperature. It therefore suffers from noise induced by low-frequency audio signals and creates an unwanted constant voice coil displacement. These problems are solved by the HF method, where the voice coil temperature is derived from a high-frequency (HF) impedance measurement with an ultrasonic pilot tone. This study extends the previous research by Gautama and Anazawa on the HF method, for example by showing that in addition to the voice coil temperature, the pole plate temperature can be measured.

In this study, impedance measurements of an example microspeaker at different temperatures are used to calibrate the HF method. A comparison with the DC method, using different test signals to heat the speaker, demonstrates that the HF method works well in this case. However, it is susceptible to errors from the skin effect in large-diameter voice coil wire, changes in cabling, nearby metallic objects, and drifting of the average voice coil position over time or at large amplitudes. The measured pole plate temperature rises with applied high-frequency audio signals, which may be explained by induction heating. Overall, the HF method seems especially suited to applications where impedance measurement and nonlinear excursion control are already part of the design.
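
For reference, both methods ultimately map an electrical quantity to temperature. A minimal sketch of the resistance-to-temperature relation underlying the DC method (the HF method calibrates an analogous impedance-versus-temperature curve); the example values are illustrative:

```python
ALPHA_CU = 0.00393                 # temperature coefficient of copper, 1/K

def coil_temperature(r_hot, r_cold, t_cold=25.0):
    """Invert R(T) = R_cold * (1 + alpha * (T - t_cold)) for T."""
    return t_cold + (r_hot / r_cold - 1.0) / ALPHA_CU

print(coil_temperature(r_hot=4.6, r_cold=3.8))   # ~78.6 degC
```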

Speakers
Tobias Fritsch, Research Engineer, Fraunhofer IDMT
Johannes Fried, TU Ilmenau
Johannes Fried received a Bachelor in Technical Physics from TU Ilmenau, Germany, in 2021. He has had an interest in loudspeaker technology since his teenage years and is now a Master's student in Media Technology at the same university.


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

MPEG-H Audio production workflows for a Next Generation Audio experience in broadcast, streaming and music
MPEG-H Audio is a Next Generation Audio (NGA) system offering a new audio experience for the audience in various applications: object-based immersive sound delivers a new degree of realism and artistic freedom for immersive music applications, such as the 360 Reality Audio music service. Advanced interactivity options enable improved personalization and accessibility, including solutions to create object-based features out of legacy material, e.g. deep-learning-based dialogue enhancement. 'Universal delivery' allows for optimal rendering of one production on all kinds of devices and over various distribution channels, e.g. broadcast or streaming. All of these new features are achieved by adding metadata to the audio, which is defined during production and offers content providers flexible control of interaction and rendering options. This introduces new possibilities and requirements in the production process. In this paper, examples of state-of-the-art NGA production workflows are detailed and discussed, with special focus on immersive music, broadcast, and accessibility.

Speakers
Yannik Grewe, Fraunhofer Institute for Integrated Circuits IIS
Philipp Eibl, Group Manager Media Production Tools, Fraunhofer Institute for Integrated Circuits IIS
Christian Simon, Fraunhofer Institute for Integrated Circuits IIS
Matteo Torcoli, Fraunhofer Institute for Integrated Circuits IIS
Daniela Rieger, Research Associate / Sound Engineer, Fraunhofer Institute for Integrated Circuits IIS
Ulli Scuda, Fraunhofer Institute for Integrated Circuits IIS


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Perceptual evaluation of a new, portable three-dimensional recording technique: "W-Ambisonics"
In order to exploit the strengths and avoid the weaknesses of the First Order Ambisonics (FOA) microphone technique, we devised a new, portable 3D microphone recording technique ("W-Ambisonics"). This new technique combines a stereo cardioid microphone pair (for frontal information) with two FOA microphone arrays (for lateral, rear, and height information). The design focus of this technique was "accessibility" in the recording stage and "scalability" in the reproduction stage. Our proposed portable 3D recording technique enables audio reproduction over multiple configurations, including immersive platforms.
First, we evaluated lateral localization of the proposed method compared with a conventional 5-channel surround microphone technique. Second, we devised a new binauralization method utilizing two interaurally spaced FOA microphone arrays for headphone-based reproduction. Each FOA microphone array renders precise spherical harmonics at each ear position. Lastly, we made a solo piano recording using a 5-channel surround technique, a 7-channel immersive technique (5-channel plus two height channels), and the "W-Ambisonics" array, and subsequently conducted subjective listening evaluations of the sound quality of the techniques. The results of our study show that (1) the "W-Ambisonics" method enables improved lateral localization over the conventional spaced array technique; (2) the binauralized headphone translation from the "W-Ambisonics" recording provided spacious yet precise sound images in listening evaluations; and (3) the "W-Ambisonics" recording produced sound quality comparable to the 7-channel recording technique for an immersive experience of the concert hall. The proposed "W-Ambisonics" microphone technique is practical, precise, and scalable across multiple reproduction scenarios, from binaural to multichannel systems.

Speakers
Doyuen Ko, Associate Professor, Belmont University
Dr. Doyuen Ko is an Associate Professor of Audio Engineering Technology at Belmont University in Nashville, Tennessee. He received his Ph.D. and Master of Music from the Sound Recording Department at McGill University, Canada. Before studying at McGill, he has worked as a sound designer...
Lu Xuan, Rochester Institute of Technology
Sungyoung Kim, Rochester Institute of Technology
Miriam Kolar, Department of Music, Amherst College, Amherst, MA 01002, USA


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Power Efficiency of Capacitive MEMS Speakers
Recently introduced MEMS micro speakers target in-ear audio applications such as true wireless earbuds, hearables and hearing aids. A key requirement in these ultra-mobile applications is high energy efficiency of the components. While wireless communication standards continue to improve and offer low-power profiles, audio components remain at the same level in absolute terms, so their share of the total energy budget increases. MEMS speakers, seeking a competitive advantage in the micro speaker market, claim to be more efficient than voice coil speakers, whose efficiency is limited by resistive and magnetization losses. The most advanced approaches to MEMS speakers are based on capacitive transducer technology. These speakers can therefore be modeled, to a first approximation, as capacitors: they offer negligible resistive losses but introduce a large reactive power component. This reactive power must be provided by an electronic circuit of limited efficiency, so it is desirable to minimize the reactive power required by the transducer. In this contribution, we discuss some fundamental energetic relations that such electroacoustic transducers are subject to, and draw implications for the requirements of a transducer that can operate at high efficiency.
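
The reactive-power burden described above follows from the ideal-capacitor relation Q = V_rms^2 * omega * C; a small sketch with purely illustrative numbers:

```python
import math

def reactive_power_var(v_rms, freq_hz, cap_farad):
    """Reactive power drawn by an ideal capacitor: Q = V_rms^2 * omega * C."""
    return v_rms ** 2 * 2.0 * math.pi * freq_hz * cap_farad

# Illustrative numbers: 10 V rms drive at 1 kHz into a 1 nF transducer.
print(reactive_power_var(10.0, 1e3, 1e-9))       # ~0.63 mvar
```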

Speakers
Lutz Ehrig, Arioso Systems GmbH
Hermann Schenk, Arioso Systems GmbH
Jorge Mario Monsalve Guaracao, Fraunhofer Institute for Photonic Microsystems
Anton Melnikov, Fraunhofer Institute for Photonic Microsystems


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Production Tools for the MPEG-H Audio System
Next Generation Audio Systems, such as MPEG-H Audio, rely on metadata to enable a wide variety of features. Information such as channel layouts, the position and properties of audio objects or user interactivity options are only some of the data that can be used to improve consumer experience.
Creating these metadata requires suitable tools, which are used in a process known as "authoring", where interactive features and the options for 3D immersive sound rendering are defined by the content creator.
Different types of productions each impose their own requirements on these authoring tools, which has led to a number of solutions appearing in the market. Using the example of MPEG-H Audio, this paper details some of the latest developments and authoring solutions designed to enable immersive and interactive live and post productions.
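
As a rough illustration of the kind of authoring metadata discussed here, the sketch below shows a simplified, hypothetical scene description; the field names are invented for illustration and do not reflect the actual MPEG-H Audio metadata syntax:

```python
# Hypothetical, simplified authoring structure. Field names are invented for
# illustration and do NOT reflect the actual MPEG-H Audio metadata syntax.
scene = {
    "channel_bed": {"layout": "5.1+4H"},
    "objects": [
        {
            "name": "dialogue",
            "position": {"azimuth_deg": 0.0, "elevation_deg": 0.0},
            "interactivity": {
                "gain_range_db": [-6.0, 9.0],    # user may attenuate or boost
                "on_off_allowed": False,
            },
        },
        {
            "name": "audio_description",
            "position": {"azimuth_deg": 30.0, "elevation_deg": 0.0},
            "interactivity": {"on_off_allowed": True},
        },
    ],
}
```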

Speakers
Yannik Grewe, Fraunhofer Institute for Integrated Circuits IIS
Philipp Eibl, Group Manager Media Production Tools, Fraunhofer Institute for Integrated Circuits IIS
Daniela Rieger, Research Associate / Sound Engineer, Fraunhofer Institute for Integrated Circuits IIS
Ulli Scuda, Fraunhofer Institute for Integrated Circuits IIS


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Quantifying HRTF Time and Spectral Precision using Interaural Transfer Function
Generating auditory localization cues is an important feature of spatial audio processing engines, helping to create a sense of plausibility of virtual sounds for the user, especially in XR applications (VR/AR/MR). Algorithmic approaches have been proposed to quantify an engine’s ability to reproduce interaural level difference (ILD) cues through regression and statistical methods, which provides a useful standardization and automation method to estimate the localization accuracy potential of a spatial audio engine. This engineering brief builds on that approach to include interaural time difference (ITD) cues in the analysis through the use of the interaural transfer function (ITF). The use of the ITF, comprising both ILD and ITD information, is demonstrated and discussed. Even though this approach can substitute for critical listening studies as an evaluation method, it has not been validated through comparisons with localization user studies. This brief concludes with a review of listening studies that could be used to gain confidence in this algorithmic approach to measuring localization accuracy potential in a spatial audio engine.
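
A minimal sketch of deriving per-bin ILD and ITD estimates from the ITF of a measured HRIR pair; the brief's exact estimators and sign conventions are not specified, so those below are assumptions:

```python
import numpy as np

def itf_cues(hrir_left, hrir_right, fs, n_fft=2048):
    """ILD and ITD estimates per frequency bin from the interaural transfer
    function ITF(f) = H_left(f) / H_right(f)."""
    H_l = np.fft.rfft(hrir_left, n_fft)
    H_r = np.fft.rfft(hrir_right, n_fft)
    itf = H_l / (H_r + 1e-12)                 # small offset avoids division by zero
    freqs = np.fft.rfftfreq(n_fft, 1.0 / fs)
    ild_db = 20.0 * np.log10(np.abs(itf) + 1e-12)
    phase = np.unwrap(np.angle(itf))
    itd_s = np.zeros_like(freqs)
    itd_s[1:] = -phase[1:] / (2.0 * np.pi * freqs[1:])   # phase delay; sign is a convention
    return freqs, ild_db, itd_s
```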

Speakers
Justin Mathew, Magic Leap
Lukasz Januszkiewicz, Magic Leap/SoftServe Inc.
Maria Pensko, Magic Leap/SoftServe Inc.
Remi Audray, Facebook Reality Labs


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Real-time Implementation of the Spectral Division Method for Binaural Personal Audio Delivery with Head Tracking
A framework for implementing the Spectral Division Method (SDM) in real time for delivering binaural personal audio to multiple listeners with head tracking is presented. The SDM, an analytical approach to sound field reproduction, has been applied to generating personal audio filters that create acoustically bright and dark zones. However, only the case of static listening positions has been investigated. In realistic situations, the performance of such personal audio delivery systems degrades significantly when listeners move out of the "sweet spots". In order to achieve dynamic personal audio delivery that compensates for listeners' head movements, the SDM-based filters are updated in real time through simple multiplications in the wavenumber domain, utilizing the shifting theorem of the spatial Fourier transform along the x-axis. Furthermore, by selecting two spatial window functions targeted at the two ears, the generated filters can deliver separate binaural personal audio to multiple listeners. The proposed framework offers an intuitive and efficient solution for binaural personal audio delivery with head tracking, at a moderate computational cost.
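
The wavenumber-domain update reduces to the Fourier shift theorem, F{f(x - dx)}(kx) = exp(-j*kx*dx) * F{f}(kx). A toy sketch for filter weights sampled along the loudspeaker axis, assuming uniform spacing (not the authors' implementation):

```python
import numpy as np

def shift_filters(weights_x, dx, spacing):
    """Translate personal-audio filter weights sampled along the array axis x
    by dx metres via a phase ramp in the wavenumber domain."""
    n = weights_x.shape[-1]
    kx = 2.0 * np.pi * np.fft.fftfreq(n, d=spacing)      # wavenumber axis (rad/m)
    spectrum = np.fft.fft(weights_x, axis=-1)
    shifted = np.fft.ifft(spectrum * np.exp(-1j * kx * dx), axis=-1)
    return np.real(shifted)                              # input assumed real-valued
```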

Speakers
Yue Qiao, PhD candidate, Princeton University
Yue is a fifth-year Ph.D. candidate at Princeton University’s 3D3A Lab, where he focused on personal sound zone rendering and spatial audio reproduction over loudspeakers. Since his undergraduate study, Yue has contributed to research projects on spatial audio, including Ambisonics...
Edgar Choueiri, Princeton University


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Recreating complex soundscapes for audio quality evaluation
A method for creating complex soundscapes for tasks related to audio quality evaluation is presented. This approach uses an ambisonics-derived basis for recreating dynamic noise that follows ETSI standard EG 202 396-1 for background noise reproduction. Recordings were captured with the mh acoustics Eigenmike spherical microphone array and processed to match the two-dimensional four-loudspeaker array by creating four directional beams, each feeding an individual channel. As a result, a spatial background noise ambience is recreated, preserving the transient characteristics of the original recording.
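
A first-order sketch of the beam-creation step (the actual Eigenmike processing is higher order, and the beam directions and B-format normalization assumed here are illustrative):

```python
import numpy as np

def four_beams(W, X, Y, az_deg=(45.0, 135.0, -135.0, -45.0)):
    """Steer four horizontal cardioid beams from first-order ambisonic
    signals, one per loudspeaker of the square playback array."""
    beams = []
    for az in np.deg2rad(np.asarray(az_deg)):
        beams.append(0.5 * (W + X * np.cos(az) + Y * np.sin(az)))
    return np.stack(beams)                 # shape: (4, n_samples)
```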

Speakers
Damian Koszewski, Intel Technology Poland
Jan Banas, Intel Technology Poland
Przemyslaw Maziewski, Intel Technology Poland
Pawel Trella, Intel Technology Poland
Pawel Pach, Intel Technology Poland
Piotr Klinke, Intel Technology Poland
Dominik Stanczak, Intel Technology Poland
Maciej Kuklinowski, Intel Technology Poland


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Subjective Assessment of Multichannel Audio on a Tablet Computer
Handheld electronic devices like tablet computers are commonly used for the playback and streaming of music. With the growing popularity of multichannel and immersive audio technologies, it is important to know whether they offer any improvement over traditional stereo and mono in terms of audio quality and user experience on such devices. This paper presents the results of four MUSHRA-based listening tests conducted for the subjective assessment of multichannel audio versus stereo and mono played back on a tablet computer with two different sets of headphones. BAQ (basic audio quality) and QoE (quality of experience) were the attributes measured. The results show that multichannel audio outperforms stereo and mono on both attributes, and a repeated-measures ANOVA (analysis of variance) confirms that the audio format has a large bearing on the results. Though the use of different headphones does change the user ratings, the consolidated results for each test follow a similar trend.

Speakers
Joshua Reiss, Queen Mary University of London
Josh Reiss is a Professor with the Centre for Digital Music at Queen Mary University of London. He has published more than 200 scientific papers, and co-authored the book Intelligent Music Production, and textbook Audio Effects: Theory, Implementation and Application. He is the President-Elect...
Fesal Toosy, University of Central Punjab
Muhammad Sarwar Ehsan, University of Central Punjab


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Transdisciplinarity in Sound Design and Music Composition for Film Post-Production: an Experiential Remote Learning Case Study
This e-brief presents a case study of a one-semester transdisciplinary experience involving three courses in the Herberger Institute for Design and the Arts at Arizona State University. Participating students came from the Film and Media Post-Production Capstone II and Advanced Sound Design courses in the Sidney Poitier New American Film School’s undergraduate program, and from a Music Composition course in the School of Music, Dance and Theatre. During the Spring semester of the 2020-2021 academic year they experienced, due to the Covid-19 pandemic, a completely remote post-production process for approximately twenty 10-minute short films, showcased at a virtual live event on April 30 and May 1, 2021. From a student-centric constructivist standpoint, inspired by Paulo Freire’s pedagogical approach of raising students to be subjects and not objects of the world, the experience explores remote learning tools to develop not only hands-on immersion in techniques and workflows but, most importantly, agency ecosystems where students can explore film departments’ power dynamics and develop interpersonal skills alongside a diverse set of cultural and socio-political mindfulness. Focused on sharing the strategies and findings of a work in progress, this initial study aims at further developments and at establishing hybrid remote and in-person experiential frameworks for the transdisciplinary engagement of sound and music in the visual arts.

Speakers
Rodrigo Meirelles, Professor - Sound Design, Arizona State University. The Sidney Poitier New American Film School.
Rodrigo Meirelles is an awarded multidisciplinary professional passionate about sound, music and education. He built his career combining his skills and passions, which gave him a unique background and an international reputation as an educator and audio engineer. He started...
Fernanda Navarro, Arizona State University. School of Music, Dance and Theatre.
Janaki Cedanna, Arizona State University. The Sidney Poitier New American Film School.


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Using Ambisonic Microphones as Spot Microphones in an Ensemble Recording
Ambisonic microphones provide a flexible and elegant method to capture surround sound with a compact microphone array. However, tetrahedral microphones are typically applied to free-field capture applications, paired with traditional mono or stereo spot microphones where enhanced control over individual sound source balance and timbre is required. This paper investigates the use of tetrahedral microphones as a versatile spot microphone technique that can render both direct sound and supporting room impressions for individual sound sources. These multiple perspectives can then be mixed together in a surround environment to great effect. Multiple recordings were made at NYU in the summer of 2021 to explore these techniques. A jazz quartet was recorded using three separate miking systems: an ambisonic-only system, an ambisonic and omnidirectional coincident spot system, and a “traditional” jazz recording system. In this paper, the techniques we used are explained and evaluated based on discrete Atmos mixes made with each system.

Speakers
Paul Geluso, New York University
Kerri Chandler, New York University | University of Trinidad & Tobago
Cale Michaels Bonderman, New York University


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

WaveBeat: End-to-end beat and downbeat tracking in the time domain
While deep learning approaches for beat and downbeat tracking have brought advancements, these approaches continue to rely on hand-crafted, subsampled spectral features as input, restricting the information available to the model. In this work, we propose WaveBeat, an end-to-end approach for joint beat and downbeat tracking operating directly in the time domain. This method forgoes engineered spectral features and instead produces beat and downbeat predictions directly from the waveform, the first of its kind for this task. Our model utilizes temporal convolutional networks (TCNs) operating on waveforms that achieve a very large receptive field (~30 s) at audio sample rates. This is achieved in a memory-efficient manner by employing rapidly growing dilation factors, which enable a relatively shallow network architecture. Combined with a straightforward data augmentation strategy, our method outperforms previous state-of-the-art methods on some datasets, while producing comparable results on others.
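
The receptive-field claim can be checked with simple arithmetic; the kernel size and dilation schedule below are illustrative assumptions chosen to land near the reported ~30 s at audio rates, not WaveBeat's published hyperparameters:

```python
def receptive_field(kernel_size, dilations):
    """Receptive field, in samples, of a stack of dilated convolutions."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf

dilations = [8 ** i for i in range(6)]     # rapidly growing: 1, 8, 64, ..., 32768
rf = receptive_field(15, dilations)        # kernel size 15 assumed
print(rf, "samples =", rf / 22050, "s at 22.05 kHz")   # ~524k samples, ~23.8 s
```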

Speakers
Christian Steinmetz, PhD Researcher, Centre for Digital Music, Queen Mary University of London
Joshua Reiss, Queen Mary University of London
Josh Reiss is a Professor with the Centre for Digital Music at Queen Mary University of London. He has published more than 200 scientific papers, and co-authored the book Intelligent Music Production, and textbook Audio Effects: Theory, Implementation and Application. He is the President-Elect...


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

3D Impulse Response Convolution with Multichannel Direct Sound: Assessing Perceptual Equivalency between Room- and Source-Impression for Music Production
A method for representing the three-dimensional radiation patterns of instruments/performers within artificial reverberation, using multichannel direct sound files convolved with channel-based spatial room impulse responses (SRIRs), is presented. Two reverb conditions are studied in a controlled listening test: a) all SRIR channel positions are convolved with a single monophonic direct sound file, and b) each SRIR channel position is convolved with a unique direct sound file taken from a microphone array surrounding the performer. Participants were asked to adjust the level of each reverberation condition (relative to a fixed direct sound stream) to three perceptual thresholds relating to source- and room-impression. Results of separate three-way within-subject ANOVAs and post-hoc analysis show significant interactions between instrument and room type, and between instrument and reverb condition, on each of the three thresholds. Most notably, reverb condition b) required less level than condition a) to yield perceptual equivalency between source- and room-impression, suggesting that the inclusion of multichannel direct sound in SRIR convolution may increase the salience of room impression in the immersive reproduction of acoustic music.
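
A minimal sketch of the two convolution conditions described above (scipy assumed; the array shapes are assumptions for illustration):

```python
import numpy as np
from scipy.signal import fftconvolve

def render(srir, direct):
    """Convolve each SRIR channel with its direct-sound feed.

    srir:   (n_channels, ir_len) spatial room impulse responses
    direct: (n_samples,) mono direct sound broadcast to all channels
            (condition a), or (n_channels, n_samples) with one capture per
            SRIR position from the surrounding array (condition b)
    """
    direct = np.atleast_2d(direct)
    if direct.shape[0] == 1:                        # condition (a)
        direct = np.repeat(direct, srir.shape[0], axis=0)
    return np.stack([fftconvolve(d, h) for d, h in zip(direct, srir)])
```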

Speakers
Jack Kelly, McGill University
Jack Kelly is a Ph.D. candidate at the Schulich School of Music, McGill University. His thesis research centers on the influence of spatial room impulse response convolution technologies (channel-based and HOA arrays) on the sensation of physical presence in immersive music production. He...
Richard King, McGill University
Richard King is an Educator, Researcher, and a Grammy Award winning recording engineer. Richard has garnered Grammy Awards in various fields including Best Engineered Album in both the Classical and Non-Classical categories. Richard is an Associate Professor at the Schulich School...
Wieslaw Woszczyk, McGill University


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

A Neural Beamforming Front-end for Distributed Microphone Arrays
Robust real-time audio signal enhancement increasingly relies on multichannel microphone arrays for signal acquisition. Sophisticated beamforming algorithms have been developed to maximize the benefit of multiple microphones. With the recent success of deep learning models created for audio signal processing, the task of Neural Beamforming remains an open research topic. This paper presents a Neural Beamformer architecture capable of performing spatial beamforming with microphones randomly distributed over very large areas, even in negative signal-to-noise ratio environments with multiple noise sources and reverberation. The proposed method combines adaptive, nonlinear filtering and the computation of spatial relations with state-of-the-art mask estimation networks. The resulting End-to-End network architecture is fully differentiable and provides excellent signal separation performance. Combining a small number of principal building blocks, the method is capable of low-latency, domain-specific signal enhancement even in challenging environments.

Speakers
Jonathan Ziegler, Stuttgart Media University
Leon Schröder, Stuttgart Media University
Andreas Koch, HdM Stuttgart
Andreas Schilling, Eberhard Karls University Tuebingen


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

AI 3D immersive audio codec based on content-adaptive dynamic down-mixing and up-mixing framework
Recently, more and more people prefer to consume media content via over-the-top (OTT) platforms, such as YouTube and Netflix, rather than conventional broadcasting. To deliver an immersive audio experience to them more effectively, we propose a unified framework for an AI-based 3D immersive audio codec. Within this framework, a content-adaptive dynamic down-mixing and up-mixing scheme is newly proposed to maximize the original immersiveness even in the down-mixed audio, while enabling precise reproduction of the original 3D audio from the down-mix. The experimental results show that the proposed framework renders improved down-mixed audio compared with the conventional method, as well as successfully reproducing the original 3D audio.

Speakers
Woo Hyun Nam, Principal Engineer, Samsung Research, Samsung Electronics
Woo Hyun Nam received the Ph.D. degree in electrical engineering from the Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Korea in 2013. Since 2013, he has been with Samsung Research, Samsung Electronics, where he is a Principal Engineer and is currently leading...
Tammy Lee, Samsung Research, Samsung Electronics
Sang Chul Ko, Samsung Research, Samsung Electronics
Yoonjae Son, Samsung Research, Samsung Electronics
Hyun Kwon Chung, Samsung Research, Samsung Electronics
Kyung-Rae Kim, Samsung Research, Samsung Electronics
Jungkyu Kim, Samsung Research, Samsung Electronics
Sunghee Hwang, Samsung Research, Samsung Electronics
Kyunggeun Lee, Samsung Research, Samsung Electronics


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Analysis of a Unique Pingable Circuit: The Gamelan Resonator
This paper offers a study of the circuits developed by artist Paul DeMarinis for the touring version of his work Pygmy Gamelan. Each of the six copies of the original circuit, developed June-July 1973, produces a carefully tuned and unique five-tone scale. These scales are obtained with five resonator circuits which pitch pings produced by a crude antenna fed into clocked bit-shift registers. While this resonator circuit may seem related to the classic Bridged-T and Twin-T designs common in analog drum machines, DeMarinis' work actually presents a unique and previously undocumented variation on those canonical circuits. We present an analysis of his third-order resonator (which we name the Gamelan Resonator), deriving its transfer function, time-domain response, poles, and zeros. This model enables us to do two things: first, based on recordings of one of the copies, we can deduce which standard resistor and capacitor values DeMarinis is likely to have used in that specific copy, since his schematic purposefully omits these details to reflect their variability. Second, we can better understand what makes this filter unique. We conclude by outlining future projects which build on the present findings for technical development.
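
A sketch of the kind of pole-zero analysis the paper performs, using scipy with placeholder third-order coefficients (DeMarinis' actual component values, and hence coefficients, are precisely what the paper sets out to deduce):

```python
import numpy as np
from scipy import signal

# Placeholder third-order transfer function: a lightly damped resonant pole
# pair plus a real pole. These are NOT DeMarinis' component values.
num = [1.0e4, 0.0]
den = np.polymul([1.0, 2.0e2, 4.0e6], [1.0, 1.0e3])
zeros, poles, gain = signal.tf2zpk(num, den)
print("zeros:", zeros)
print("poles:", poles)
print("resonance ~", max(abs(poles.imag)) / (2 * np.pi), "Hz")
```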

Speakers
Ezra J. Teboul, Paris
Historian of electronic music technology, its users, and its makers. CHSTM sound and technology group co-convener: https://www.chstm.org/content/sound-and-technology
Kurt James Werner, Research Engineer, iZotope, Inc.


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Application of AI techniques for nonlinear control of loudspeakers
To obtain high loudness with good bass extension while keeping distortion low, and to ensure mechanical protection, one needs to accurately control the motion of the loudspeaker diaphragm. Current solutions for nonlinear control of loudspeakers are complex, and difficult to implement and tune. They are limited in accuracy by insufficient physical models that do not completely capture the complexity of the loudspeaker. Furthermore, the physical model parameters are difficult to estimate.
We present here a novel approach that uses a neural network to map the diaphragm displacement directly to the input voltage, allowing us to invert the loudspeaker. This technique makes it possible to control and linearize the loudspeaker without theoretical assumptions and with better accuracy than a model-based approach. It is also simpler to implement.
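
A minimal sketch of the inverse-model idea, assuming a plain feed-forward network over a short displacement window; the authors' actual architecture, window length, and training details are not given in the abstract:

```python
import torch
import torch.nn as nn

class InverseLoudspeakerModel(nn.Module):
    """Map a window of desired diaphragm displacement samples to the drive
    voltage expected to produce it. Layer sizes are illustrative."""
    def __init__(self, window=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(window, 128), nn.Tanh(),
            nn.Linear(128, 128), nn.Tanh(),
            nn.Linear(128, 1),
        )

    def forward(self, x):
        return self.net(x)

# Training would fit the model on measured (displacement, voltage) pairs,
# e.g. displacement from a laser sensor while known voltages drive the coil.
model = InverseLoudspeakerModel()
voltage = model(torch.randn(8, 64))        # batch of 8 displacement windows
```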

Speakers
Yuan Li, Senior Engineer, Samsung Research America
Pascal Brunet, Dir. Research, Samsung Research America
Pascal Brunet obtained his Bachelor's in Sound Engineering from Ecole Louis Lumiere, Paris, in 1981, his Master's in Electrical Engineering from CNAM, Paris, in 1989 and a PhD degree in EE from Northeastern University, Boston, in 2014. His thesis was on nonlinear modeling of loudspeakers...
Glenn Kubota, Samsung Research America
Aaquila Mariajohn, Samsung Research America


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Audio-Source Rendering on Flat-Panel Loudspeakers with Non-Uniform Boundary Conditions
Devices from smartphones to televisions are beginning to employ dual purpose displays, where the display serves as both a video screen and a loudspeaker. In this paper we demonstrate a method to generate localized sound-radiating regions on a flat-panel display. An array of force actuators affixed to the back of the panel is driven by appropriately filtered audio signals so the total response of the panel due to the actuator array approximates a target spatial acceleration profile. The response of the panel to each actuator individually is initially measured via a laser vibrometer, and the required actuator filters for each source position are determined by an optimization procedure that minimizes the mean squared error between the reconstructed and targeted acceleration profiles. Since the single-actuator panel responses are determined empirically, the method does not require analytical or numerical models of the system’s modal response, and thus is well-suited to panels having the complex boundary conditions typical of television screens, mobile devices, and tablets. The method is demonstrated on two panels with differing boundary conditions. When integrated with display technology, the localized audio source rendering method may transform traditional displays into multimodal audio-visual interfaces by colocating localized audio sources and objects in the video stream.
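
The per-frequency filter optimization described above amounts to a least-squares solve at each bin; a minimal numpy sketch (the array shapes are assumptions):

```python
import numpy as np

def actuator_filters(H, target):
    """Least-squares actuator weights at every frequency bin.

    H:      (n_freqs, n_points, n_actuators) complex panel response at each
            measurement point when each actuator is driven alone
    target: (n_freqs, n_points) desired spatial acceleration profile
    Returns (n_freqs, n_actuators) weights minimizing ||H a - target||^2.
    """
    return np.stack([np.linalg.lstsq(H[f], target[f], rcond=None)[0]
                     for f in range(H.shape[0])])
```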

Speakers
Michael Heilemann, Assistant Professor, University of Rochester
Tre DiPassio, PhD Student, University of Rochester
Hello! My name is Tre, and I am in my final semester as a PhD student studying musical acoustics and signal processing under the supervision of Dr. Mark Bocko and Dr. Michael Heilemann. The research lab I am a part of has been developing an emerging type of speaker, called a flat...
Mark Bocko, University of Rochester


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Automatic Loudspeaker Room Equalization Based On Sound Field Estimation with Artificial Intelligence Models
In-room loudspeaker equalization requires a significant number of microphone positions in order to characterize the sound field in the room, which can be a cumbersome task for the user. This paper proposes the use of artificial intelligence to automatically estimate and equalize the in-room response, without user interaction. To learn the relationship between the loudspeaker near-field response and the total sound power, or the energy average over the listening area, a neural network was trained using room measurement data. Loudspeaker near-field SPL at discrete frequencies was the input to the neural network. The approach has been tested on a subwoofer, a full-range loudspeaker, and a TV. Results showed that the in-room sound field can be estimated to within a 1-2 dB average standard deviation.

Speakers
Adrian Celestinos, Samsung Research America
Yuan Li, Senior Engineer, Samsung Research America
Victor Manuel Chin Lopez, Samsung Research Tijuana


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Best Paper
The livestream can be viewed HERE. Please note you will need to log into aesshow.com using your AES credentials.

Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Binaural Audio Externalization Processing
Headphone or earbud listening scenarios span from the home or office to mobile and automotive environments, with audio source content formats including two-channel stereo, multi-channel surround, immersive or object-based material. Post-processing methods have been developed with the intent of restoring, during headphone playback, the spatial audio cues experienced in natural or loudspeaker listening, remediating known effects of headphone-mediated audio reproduction: the perceived localization of sounds in or near the head, accompanied by timbre or balance distortions and spatial image blurring or warping. The intended benefits include alleviating listening fatigue and cognitive load. In this E-Brief presentation, we review previously reported binaural audio post-processing methods and consider a strategy emphasizing minimal signal modification, applicable to enhancing conventionally produced stereo recordings.

This is a work-in-progress report on an investigation that we plan to report on in a future paper. The slides and audio demonstrations are posted at izotope.com/tech/aes_extern.

Speakers
Jean-Marc Jot, Founder and Principal, Virtuel Works LLC
Spatial audio and music technology expert and innovator. Virtuel Works provides audio technology strategy, IP creation and licensing services to help accelerate the development of audio and music spatial computing technology and interoperability solutions.
Alexey Lukin, Principal DSP Engineer, iZotope Inc
Alexey specializes in audio signal processing, with particular interest in similarities with image processing in spectral analysis, noise reduction, and multiresolution filter banks. He earned his M.S. (2003) and Ph.D. (2006) in computer science from Lomonosov Moscow State University...
Kurt James Werner, Research Engineer, iZotope, Inc.
Evan Allen, iZotope, Inc.


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Bit Rate Requirements for an Audio Codec for Stereo, Surround and Immersive Formats
This paper describes a comprehensive study on the sound quality of the Opus codec for stereo, surround and immersive audio formats for music and cinematic content. We conducted three listening tests on Opus encoded stereo, 5.1 and 7.1.4 test samples taken from music, cinematic and EBU files encoded at bit rates of 32, 48 and 64 kbps per channel. Preliminary results indicate that a bit rate of 64 kbps per channel or higher is required for stereo, but 48 kbps per channel may be sufficient for surround and immersive audio formats.
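
For reference, the tested per-channel rates translate into the following total bit rates (the channel counts are the standard ones for each layout):

```python
channels = {"stereo": 2, "5.1": 6, "7.1.4": 12}
for fmt, n in channels.items():
    for kbps in (32, 48, 64):
        print(f"{fmt}: {n} ch x {kbps} kbps/ch = {n * kbps} kbps total")
```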

Speakers
Sunil G. Bharitkar, Samsung Research America
Allan Devantier, Samsung Research America
Carlos Tejeda-Ocampo, Samsung Research Tijuana
Carren Zhongran Wang, Samsung Research America
Will Saba, Samsung Research America


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Comparison of different techniques for recording and postproduction using main-microphone arrays for binaural reproduction
We present a subjective evaluation of various 3D main-microphone techniques for three-dimensional binaural music production. Forty-seven subjects participated in the survey, listening on headphones. Results suggest that ESMA-3D, followed by a Decca tree with height, works best of the included 3D arrays. However, the dummy head and a stereo AB microphone pair performed as well as, if not better than, any of the arrays. Though not implemented for this study, our workflow allows the possibility of including individualized HRTFs and head tracking; their impact will be considered in a future study.

Speakers
Josua Dillier, Zürcher Hochschule der Künste ZHdK
Josua Dillier is a young audio engineer and producer living in Zurich, Switzerland. His work ranges from CD and video production to live mixing. He is specialized in the recording of acoustic instruments. Before his studies as a Tonmeister at University of the Arts Zurich he studied...
Hanna Järveläinen, Zürcher Hochschule der Künste ZHdK


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Deconvolution of Room Impulse Responses from Simultaneous Excitation of Loudspeakers
Traditional room equalization involves exciting one loudspeaker at a time and deconvolving the loudspeaker-room response from the recording. As the number of loudspeakers and positions increases, the time required to measure loudspeaker-room responses increases accordingly. In this paper, we present a technique to deconvolve impulse responses after exciting all loudspeakers at the same time. The stimuli are shifted relative to a base stimulus and are optionally pre-processed with arbitrary filters to create specific-sounding signals. The stimulus shift ensures capture of the low-frequency reverberation tail after deconvolution. Various deconvolution techniques, including correlation-based and adaptive-filter-based, are presented. The performance is characterized in terms of plots and objective metrics using responses from the Multichannel Acoustic Reverberation Dataset (MARDY).
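
A minimal sketch of the shifted-stimulus idea, using a spectral-division deconvolution variant (the paper also presents correlation-based and adaptive-filter techniques; the segment allocation and regularization below are assumptions):

```python
import numpy as np

def shifted_stimuli(base, n_speakers):
    """Give each loudspeaker a circularly shifted copy of the base stimulus,
    so each impulse response lands in its own segment after deconvolution."""
    seg = len(base) // n_speakers
    return np.stack([np.roll(base, k * seg) for k in range(n_speakers)])

def deconvolve(recording, base, n_speakers):
    """Recover per-speaker impulse responses from one simultaneous capture."""
    L = len(base)
    seg = L // n_speakers
    B = np.fft.fft(base, L)
    h_all = np.real(np.fft.ifft(np.fft.fft(recording, L) * np.conj(B)
                                / (np.abs(B) ** 2 + 1e-12)))
    return [np.roll(h_all, -k * seg)[:seg] for k in range(n_speakers)]
```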

Speakers
Sunil G. Bharitkar, Samsung Research America


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Defining reverberation plugin structure: A comparative exploration of system design and expert knowledge in an audio education context
Reverberation plugin designs differ significantly between manufacturers. The use of abstract terminology, individually stylised interfaces, and each manufacturer's preferred lexicon increases complexity and decreases skill transference for novice users. Two studies were undertaken to explore the degree of complexity within the reverberation domain. In study one, the lexical and functional aspects of 46 reverberation plugins were examined through in-vivo coding of manufacturer documentation. From this, parameter labels were identified and inducted into nine higher-level categories based on function. In study two, a free elicitation task was undertaken by seven experienced reverberation plugin users. This study identified the most salient parameters within their underlying knowledge structures, allowing the overlap between system and user to be viewed. The results from both studies establish the lexicon used within existing reverberation plugins, and the breadth of parameters discovered suggests that recognising and understanding parameters across designs may be challenging for novice users. The findings also provide an overview of the reverberation domain whilst highlighting the core parameters identified by expert users. This data could potentially act as the basis for a novice training system.

Speakers
Kevin Garland, PhD Researcher, TUS
Kevin Garland is a Postgraduate PhD Researcher at the Technological University of the Shannon: Midlands Midwest (TUS), Ireland. His primary research interests include human-computer interaction, user-centered design, and audio technology. Current research lies in user modelling and...


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Effect of Flicker Noise on Audio Signal Reproduction
The audible effect of multiplicative flicker noise introduced by audio equipment was considered. Variable resistors used as volume controls generate flicker noise, and measurements indicate that it acts multiplicatively on the signal flowing through them. Flicker noise measurements were made for several variable resistors. In addition, a listening test was conducted to investigate the magnitude at which multiplicative flicker noise on a signal becomes perceptible. As a result, it was concluded that untrained individuals can rarely discern the multiplicative effect of volume-control flicker noise.
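
A minimal sketch of the multiplicative noise model under study, with approximate 1/f noise generated by spectral shaping; the modulation depth is an arbitrary illustration, not a value from the study:

```python
import numpy as np

def pink_noise(n, rng):
    """Approximate 1/f (flicker) noise by spectral shaping of white noise."""
    spectrum = np.fft.rfft(rng.standard_normal(n))
    f = np.fft.rfftfreq(n)
    f[0] = f[1]                            # avoid division by zero at DC
    return np.fft.irfft(spectrum / np.sqrt(f), n)   # power ~ 1/f

rng = np.random.default_rng(0)
fs = 48000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 440.0 * t)          # test signal
n = pink_noise(fs, rng)
n *= 1e-3 / n.std()                        # arbitrary modulation depth
y = x * (1.0 + n)                          # flicker noise acting multiplicatively
```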

Speakers
Akihiko Yoneya, Nagoya Institute of Technology


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Effects of Near-field Sources on Ambisonics Recording and Playback
Ambisonic recording with spherical microphone arrays (SMAs) is based on a far-field assumption that determines how microphone signals are encoded into Ambisonic signals. In the presence of a near-field source, low-frequency distance-dependent boosts arise in SMAs, similar in nature to the proximity effect in far-field-equalized directional microphones. In this study, the effects of near-field sources on Ambisonic signals are modelled analytically, their interaction with regularization stages is observed, and the effects are then traced across two basic ambisonic processing operations: virtual microphones and binaural decoding.

Speakers
Raimundo Gonzalez, Post-Doctoral Researcher, Aalto University
Archontis Politis, Audio & Speech Processing Group, Tampere University of Technology
Tapio Lokki, Department of Signal Processing and Acoustics, Aalto University


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Evaluating the Relationship Between Kurtosis Loss and Spectral Insertion Loss for Musicians' Hearing Protection Devices
Hearing protection devices (HPDs) are essential for musicians during loud performances to avoid hearing damage, but the standard Noise Reduction Rating (NRR) performance metric for HPDs says little about their behavior in a musical setting. One analysis tool used to evaluate HPDs in the noise-exposure research community is the kurtosis measured in the ear and the reduction of noise kurtosis through an HPD. A musical signal, especially live music, will often have a high crest factor and kurtosis, so evaluating kurtosis loss is important for an objective evaluation of musicians' HPDs. In this paper, a background on kurtosis and on filters affecting kurtosis is given, and a setup for generating high-kurtosis signals and measuring in-ear kurtosis loss through an HPD is described. Measurement results on a variety of musicians' HPDs show that 83% of the devices measured strongly reduce kurtosis, and that kurtosis loss is likely an independent performance metric because it is not correlated with the mean or standard deviation of the spectral insertion loss.
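
A minimal sketch of computing kurtosis loss, using the Pearson definition E[(x-mu)^4]/sigma^4, which equals 3 for Gaussian noise; the test signals below are synthetic stand-ins, not the paper's measurement setup:

```python
import numpy as np
from scipy.stats import kurtosis

def kurtosis_loss(outside, in_ear):
    """Kurtosis reduction from the free field to under the HPD
    (Pearson definition via fisher=False)."""
    return kurtosis(outside, fisher=False) - kurtosis(in_ear, fisher=False)

rng = np.random.default_rng(0)
impulsive = rng.standard_normal(48000)
impulsive[::4800] += 20.0                          # clicks raise the kurtosis
smoothed = np.convolve(impulsive, np.ones(64) / 64, mode="same")
print(kurtosis_loss(impulsive, smoothed))
```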

Speakers
DA

David Anderson

Applied Research Associates
TA

Theodore Argo

Applied Research Associates


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Forensic Handling of User Generated Audio Recordings
User generated recordings (UGRs) are common in audio forensic examination. The prevalence of handheld private recording devices, stationary doorbell cameras, law enforcement body cameras, and other systems capable of creating UGRs at public incidents is only expected to increase with the development of new and less expensive recording technology. It is increasingly likely that an audio forensic examiner will have to deal with an ad hoc collection of unsynchronized UGRs from mobile and stationary audio recording devices. The examiner’s tasks will include proper time synchronization, deducing microphone positions, and reducing the presence of competing sound sources and noise. We propose a standard forensic methodology for handling UGRs, including best practices for assessing authenticity and timeline synchronization.
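One building block such a methodology would likely include is coarse time alignment by cross-correlation; the sketch below (ours, on synthetic signals; not the proposed standard itself) estimates the start-time offset between two recordings of the same transient:

```python
# Estimate the relative start time of two unsynchronized recordings
# of the same event via cross-correlation.
import numpy as np
from scipy.signal import correlate, correlation_lags

def estimate_offset(ref, other, fs):
    """Lag (s) such that ref[n] aligns with other[n + lag * fs];
    a negative value means `other` started recording later."""
    xc = correlate(other, ref, mode="full")
    lags = correlation_lags(len(other), len(ref), mode="full")
    return lags[np.argmax(np.abs(xc))] / fs

fs = 8000
rng = np.random.default_rng(2)
event = np.exp(-np.arange(400) / 40.0) * rng.standard_normal(400)
rec_a = np.concatenate([np.zeros(4000), event, np.zeros(4000)])
rec_b = np.concatenate([np.zeros(2000), event, np.zeros(6000)])  # starts 0.25 s later

print(f"offset: {estimate_offset(rec_a, rec_b, fs):+.3f} s")  # -> -0.250 s
```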

Speakers
avatar for Rob Maher

Rob Maher

Professor, Montana State University
Audio digital signal processing, audio forensics, music analysis and synthesis.
BM

Benjamin Miller

Montana State University
FR

Fraser Robertson

Montana State University


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Gunshot Detection Systems: Methods, Challenges, and Can They Be Trusted?
Many communities experiencing increased gun violence are turning to acoustic gunshot detection systems (GSDS) in the hope that their deployment will provide 24/7 monitoring and the potential for more rapid response by law enforcement to the scene. In addition to real-time monitoring, data collected by gunshot detection systems have been used alongside witness testimony in criminal prosecutions. Because of their potential benefit, it is appropriate to ask: How effective are GSDS in lab/controlled settings versus deployed real-world city scenarios? How reliable are the outputs produced by GSDS? What is the system performance trade-off between gunshot detection and source localization of the gunshot? Should they be used only for early alerts, or can they be relied upon in courtroom settings? Are resources spent on GSDS operational costs well utilized, or could they be better invested to improve community safety? This study does not attempt to address all of these questions, including the social and economic ones, but provides a reflective survey of the hardware and algorithmic operations of the technology to better understand its potential as well as its limitations. Specifically, challenges regarding environmental and other mismatch conditions are discussed, with emphasis on the validation procedures used and their expected reliability. Many concepts discussed in this paper are general and will likely apply to, or have impact on, any gunshot detection technology. For this study, we refer to the ShotSpotter system to provide specific examples of system infrastructure and validation procedures.

Speakers
JH

John Hansen

Center for Robust Speech Systems; The University of Texas at Dallas
HB

Hynek Boril

University of Wisconsin - Platteville


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Implementing and Evaluating a Higher-order Ambisonic Sound System in a Multi-purpose Facility: A Lab Report
Although Ambisonic sound reproduction has an extensive history, it has found more widespread use in the past decade due to advances in computer hardware that enable real-time encoding and decoding of Ambisonic sound fields, the availability of user-friendly software that facilitates the rendering of such sound fields, and recent developments in immersive media technologies, such as AR and VR systems, that prompt new research into spatial audio. In this paper, we discuss the design, implementation, and evaluation of a third-order Ambisonic system in an academic facility built to serve a range of functions including instruction, research, and artistic performances. Due to the multi-purpose nature of this space, there are numerous limitations to consider when designing an Ambisonic sound system that can operate efficiently without interfering with the variety of activities regularly carried out in it. We discuss our approach to working around such limitations and evaluating the resulting system. To that end, we present a user study conducted to assess the performance of the system in terms of perceived spatial accuracy. Given the growing number of such facilities around the world, we believe that the design and evaluation methods presented here can be of use in the implementation of spatial audio systems in similar multi-purpose environments.

Speakers
avatar for Anıl Çamcı

Anıl Çamcı

Associate Professor of Performing Arts Technology, University of Michigan
SS

Sam Smith

University of Michigan
SH

Seth Helman

University of Michigan


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Informed postprocessing for auditory roughness removal for low-bitrate audio coders
In perceptual audio coding at very low bitrates, modulation artifacts can be introduced onto tonal signal components and are often perceived as auditory roughness. These artifacts may occur due to quantization errors, or may be added by audio bandwidth extension, which sometimes causes an irregular harmonic structure at the borders of replicated bands. The roughness artifacts due to quantization errors are especially difficult to mitigate without investing considerably more bits in the encoding of tonal components. We propose a novel technique that removes these roughness artifacts at the decoder side, controlled by a small amount of guidance information transmitted by the encoder.

Speakers
SV

Steven Van De Par

Carl von Ossietzky University, Department of Medical Physics and Acoustics
SD

Sascha Disch

Fraunhofer IIS, Erlangen
AN

Andreas Niedermeier

Fraunhofer IIS, Erlangen
BE

Bernd Edler

Audiolabs Erlangen


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Interactive Application to Control and Rapid-prototype in a Collaborative Immersive Environment
Human-scale immersive environments offer rich, often interactive, experiences, and their potential has been demonstrated across research, teaching, and art. The variety of these spaces and their bespoke configurations leads to a requirement for content highly tailored to individual environments and/or interfaces requiring complicated installations. These hurdles burden users with tedious and difficult learning curves, leaving less time for project development and rapid prototyping. This project demonstrates an interactive application for control and rapid prototyping within the CRAIVE-Lab at Rensselaer. Application Programming Interfaces (APIs) make complex functions of the immersive environment, such as audio spatialization, accessible via the Internet. A front-end interface configured to communicate with these APIs gives users simple and intuitive control over these functions from their personal devices (e.g., laptops, smartphones). While bespoke systems will often require bespoke solutions, this interface allows users to create content on day one, from their own devices, without setup, content-tailoring, or training. Three examples utilizing some or all of these functions are discussed.
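To make the architecture concrete, here is a minimal sketch of the kind of web-API interaction described; the endpoint, host name, and parameter names are entirely hypothetical (the actual CRAIVE-Lab API is not documented here):

```python
# A personal device asks the rendering system, over HTTP, to place an
# audio source at a given azimuth. All names below are invented.
import json
from urllib.request import Request, urlopen

def place_source(host, source_id, azimuth_deg, gain_db=0.0):
    """POST a spatialization request to a (hypothetical) lab API."""
    payload = json.dumps({
        "source": source_id,
        "azimuth": azimuth_deg,   # degrees, 0 = front
        "gain": gain_db,
    }).encode()
    req = Request(f"http://{host}/api/spatialize", data=payload,
                  headers={"Content-Type": "application/json"})
    with urlopen(req) as resp:
        return json.load(resp)

# e.g. place_source("craive.local", "talker1", azimuth_deg=45.0)
```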

Speakers
JB

Jonas Braasch

Professor, Rensselaer Polytechnic Institute
SC

Samuel Chabot

Rensselaer Polytechnic Institute


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Material models in loudspeakers using frictional elements
The compliance of a moving coil loudspeaker is known to depend on the level of the input signal. This effect is visible as a drop in resonance frequency. A nonlinear frictional element with hysteresis, and thus a level dependent compliance and damping, is added to the standard lumped parameter model. A comparison of simulation results and measurements reveals that the frictional model is able to explain the nonlinear behavior seen in the measurements.
The paper presents a scheme for fitting the model parameters to measured data. Results suggest that strong interaction between the frictional elements and the linear parameters complicates this fitting, and strategies for solving this problem are presented and discussed.
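For intuition, the sketch below simulates a generic Dahl-type friction element added to the mass-spring-damper suspension model; the formulation and all parameter values are our own stand-ins, not the paper's exact model:

```python
# Lumped loudspeaker suspension with a friction force whose magnitude
# saturates at the Coulomb limit F_c, making the effective compliance
# and damping level-dependent. All parameter values are invented.
import numpy as np

def simulate(F_drive, fs=48000.0, m=0.01, k=2000.0, r=0.5,
             F_c=0.05, sigma=5e4):
    """Integrate m x'' = F - k x - r x' - f, where the friction force f
    follows the Dahl model df/dt = sigma * v * (1 - sign(v) * f / F_c)."""
    dt = 1.0 / fs
    x = v = f = 0.0
    out = np.empty(len(F_drive))
    for i, F in enumerate(F_drive):
        f += sigma * v * (1.0 - np.sign(v) * f / F_c) * dt
        a = (F - k * x - r * v - f) / m
        v += a * dt
        x += v * dt
        out[i] = x
    return out

# Driving with increasing amplitudes and tracking the displacement
# resonance reproduces a level-dependent drop in resonance frequency.
```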

Speakers
RB

Rasmus Bølge Sørensen

Technical University of Denmark
FA

Finn Agerkvist

Technical University of Denmark


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Mayflower & The Seven Seas: Sonification of The Ocean
Created in conjunction with the Marine Institute at the University of Plymouth, the intention of this project was to use data transmitted by the on-board sensors of the Mayflower Autonomous Ship (MAS), to manipulate specially created pieces of music, based on sea shanties and folk ballads. Technical issues and Covid delays forced a late change, and the project was switched to using data from the university’s weather stations. This paper will illustrate how the music was produced and recorded, and the software configured to make the musical pieces vary in real-time, according to the changing sea conditions, so that the public will be able to view the current conditions and listen to the music evolve in real-time.
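The core of any such sonification is a mapping from sensor readings to musical control values. The sketch below shows one possible mapping; it is entirely invented for illustration and is not the project's own:

```python
# Translate live weather readings into musical control values.
def weather_to_music(wind_speed_ms, wave_height_m):
    """Map sensor readings to tempo (BPM) and a 0..1 filter opening."""
    # Clamp inputs to plausible ranges before scaling.
    wind = min(max(wind_speed_ms, 0.0), 25.0)
    wave = min(max(wave_height_m, 0.0), 8.0)
    tempo = 60.0 + 4.0 * wind          # calm = 60 BPM, storm = 160 BPM
    brightness = wave / 8.0            # higher seas open the filter
    return tempo, brightness

print(weather_to_music(wind_speed_ms=12.0, wave_height_m=2.5))
# -> (108.0, 0.3125)
```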

Speakers
ER

Eduardo Reck Miranda

University Of Plymouth
CM

Clive Mead

University Of Plymouth
DH

Dieter Hearle

University Of Plymouth


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Object-oriented method for unification of various directivity representations
Over recent years, numerous attempts have been made to provide efficient methods of directivity representation, for either sound sources or head-related transfer functions. Because of the wide variety of programming tools and scripts used by different researchers, the resulting representations are inconvenient to reproduce and compare with each other, hampering development of the subject. In this paper, an object-oriented method is proposed to deal with this issue. The suggested approach is based on defining classes for different directivity models that share the general properties of directivity functions, allowing easy comparison between different representations. A basic Matlab toolbox utilizing this method is presented alongside exemplary implementations of directivity models based on spherical and hyperspherical harmonics.
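The paper's toolbox is in Matlab; purely for illustration, here is the same design idea sketched in Python, with all class and method names invented:

```python
# Each directivity representation subclasses a common interface, so
# different models can be evaluated and compared uniformly.
import numpy as np
from abc import ABC, abstractmethod

class Directivity(ABC):
    """Common interface shared by all directivity representations."""

    @abstractmethod
    def response(self, theta, phi, f):
        """Complex pressure response toward (theta, phi) at frequency f."""

    def compare(self, other, theta, phi, f):
        """Magnitude difference in dB between two representations."""
        a = np.abs(self.response(theta, phi, f))
        b = np.abs(other.response(theta, phi, f))
        return 20 * np.log10(a / b)

class SphericalHarmonicDirectivity(Directivity):
    def __init__(self, coeffs):
        self.coeffs = coeffs      # per-frequency SH coefficients

    def response(self, theta, phi, f):
        ...                        # evaluate the SH expansion here
```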

Speakers
AS

Adam Szwajcowski

AGH University of Science and Technology


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

On the comparison of flown and ground-stacked subwoofer configurations regarding noise pollution
In addition to audience experience and hearing-health concerns, noise pollution issues are increasingly considered in large-scale sound reinforcement for outdoor events. Among other factors, subwoofer positioning relative to the main system influences sound pressure levels at large distances, which may be considered noise pollution.
In this paper, free-field simulations are first performed, showing that subwoofer positioning affects rear and side rejection but has a limited impact on the noise level in front of the system. Then, the impact of wind on sound propagation at low frequencies is investigated. Simulation results show that wind affects ground-stacked subwoofers more than flown subwoofers, leading to higher downwind sound levels for ground-stacked configurations.

Speakers
avatar for Etienne Corteel

Etienne Corteel

Director of Education & Scientific Outreach, global, L-Acoustics
Governing the scientific outreach strategy, Etienne and his team are the interface between L-Acoustics and the scientific and education communities. Their mission is to develop and maintain an education program tailor-made for the professional sound industry. Etienne also contributes... Read More →
avatar for Thomas Mouterde

Thomas Mouterde

Field application research engineer, L-Acoustics
Thomas Mouterde is a field application research engineer at L-Acoustics, a French manufacturer of loudspeakers, amplifiers, and signal processing devices. He is a member of the “Education and Scientific Outreach” department that aims at developing the education program of the... Read More →


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Overview and Comparison of Acoustical Characteristics in Three Historically Significant Nashville Recording Studios
Several key studios in Nashville, TN served as the focus for the creation of the recorded music experience known as “the Nashville Sound.” Recordings were notable for their songwriting style, musical arrangement, and the nature of the technical processes employed, including the specific recording spaces themselves. Three historically significant studios were selected as representative of this era. This study reviewed the historical background of the studios and investigated whether there may be similarities in these studios’ acoustical properties that resulted in a particular recording approach within these environments. Standard acoustic measurements were obtained and analysed in each of these three recording spaces.

Speakers
avatar for Doyuen Ko

Doyuen Ko

Associate Professor, Belmont University
Dr. Doyuen Ko is an Associate Professor of Audio Engineering Technology at Belmont University in Nashville, Tennessee. He received his Ph.D. and Master of Music from the Sound Recording Department at McGill University, Canada. Before studying at McGill, he has worked as a sound designer... Read More →
avatar for Jim Kaiser

Jim Kaiser

Belmont University
Jim Kaiser is an Instructor of Audio Engineering Technology at Belmont University in Nashville, TN.  He serves on the AES Technical Council, the Recording Academy Producers & Engineers Wing, and the Nashville Engineer Relief Fund Board.  Jim is a Past President of the International... Read More →
WB

Wesley Bulla

Belmont University


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Parametric Array Using Amplitude Modulated Pulse Trains: Experimental Evaluation of Beamforming and Single Sideband Modulation
We present a parametric array system realized with a microcontroller and MOSFET drivers. Pulse-train signals with a fundamental frequency of 40 kHz are generated by the microcontroller. The pulse trains are amplitude-modulated by exploiting the switching mechanism of the MOSFETs. The higher-order harmonics are attenuated by the band-pass characteristic of the ultrasonic transducers, so that only the carrier frequency and the sideband components are emitted. The sound beam can be steered by applying phase shifts to the pulse signals, which can be implemented with relatively inexpensive hardware. A new single-sideband modulation is also introduced, in which the upper sidebands of two double-sideband modulation signals are acoustically cancelled. The proposed approaches for beamforming and single-sideband modulation are evaluated by anechoic measurements.
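The steering arithmetic behind such phase shifts is plain delay-and-sum; the sketch below (ours, not the authors' firmware, with hypothetical element spacing and timer resolution) computes per-element delays quantized to a timer tick:

```python
# Per-element delays for steering a 40 kHz pulse-train array.
import numpy as np

c = 343.0             # speed of sound, m/s
f0 = 40e3             # carrier frequency, Hz
d = 0.009             # hypothetical element spacing, m
tick = 1 / (64 * f0)  # hypothetical timer resolution (64 steps/period)

def element_delays(n_elements, steer_deg):
    """Delays (s) of each element, quantized to the timer resolution."""
    delays = np.arange(n_elements) * d * np.sin(np.radians(steer_deg)) / c
    delays -= delays.min()               # keep all delays non-negative
    return np.round(delays / tick) * tick

print(element_delays(8, steer_deg=20.0) * 1e6, "microseconds")
```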

Speakers
NH

Nara Hahn

Institute of Communications Engineering, University of Rostock
JA

Jens Ahrens

Division of Applied Acoustics, Chalmers University of Technology
CA

Carl Andersson

Division of Applied Acoustics, Chalmers University of Technology


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Phoneme Mappings for Online Vocal Percussion Transcription
Vocal Percussion Transcription (VPT) aims to detect vocal percussion sound events in a beatboxing performance and classify them into the correct drum instrument class (kick, snare, or hi-hat). In an online (real-time) setting, however, algorithms are forced to classify these events within just a few milliseconds of their detection. The purpose of this study was to investigate which phoneme-to-instrument mappings are the most robust for online transcription. We used three evaluation criteria: frequency of use of phonemes among different performers, spectral similarity to reference drum sounds, and classification separability. With these criteria applied, the recommended mappings should feel natural for performers to articulate while enabling classification algorithms to achieve the best possible performance. Given the final results, we provide a detailed discussion of which phonemes to choose in different contexts and applications.

Speakers
AL

Alejandro Luezas

Roli / Queen Mary University of London
CS

Charalampos Saitis

Queen Mary University of London
MS

Mark Sandler

Queen Mary University of London


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Response clustering in loudspeaker radiation balloons
Measuring the radiation balloon of a loudspeaker involves acquiring 2664 responses at 5° resolution in the theta and phi angles, each with magnitude and phase at a number of frequencies that depends on the measurement's spectral resolution. This large amount of information often means that analysis is limited to certain frequencies and certain planes (horizontal and vertical polar plots, or isobars). To help investigate radiation balloons, unsupervised machine-learning tools have been applied to automatically group the responses that make up a full balloon measurement according to their similarity, in order to extract meaningful patterns. Similar algorithms have also been applied to reduce the number of frequencies involved while retaining the same radiation information.
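A generic version of this approach (the authors' exact pipeline is not specified here) treats each direction's log-magnitude response as a feature vector and clusters the vectors, for example with k-means:

```python
# Cluster the 2664 directional responses of a balloon measurement.
import numpy as np
from sklearn.cluster import KMeans

n_dirs, n_freqs = 2664, 128                 # 5-degree balloon grid
rng = np.random.default_rng(3)
responses = rng.random((n_dirs, n_freqs))   # stand-in for measured |H|

features = 20 * np.log10(responses + 1e-9)  # log-magnitude features
labels = KMeans(n_clusters=6, n_init=10,
                random_state=0).fit_predict(features)

for c in range(6):
    print(f"cluster {c}: {np.count_nonzero(labels == c)} directions")
```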

Speakers

Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Spatial auditory masking between real sound signals and virtual sound images
In an augmented reality (AR) environment, audio signals from the real and virtual worlds are presented to a listener simultaneously. Ideally, a virtual sound content and a real sound source should not interfere with each other. To examine whether this is possible, we measured spatial auditory masking between maskers and maskees, where the maskers were real sound signals emitted from loudspeakers and the maskees were virtual sound images, generated using head-related transfer functions (HRTFs) and presented over headphones. Open-ear headphones were used for the experiment, allowing subjects to listen to the audio content while still hearing environmental sound. The results are very similar to those of a previous experiment [1, 2] in which both masker and maskee were real signals emitted from loudspeakers: for a given masker location, masking threshold levels as a function of maskee location are symmetric with respect to the frontal plane of the subject. The masking threshold level is, however, lower than in the previous experiment, perhaps because of the limitations of sound image localization with HRTFs. The results indicate that spatial auditory masking occurs with virtually localized sound images in the same way as with real sound signals.

Speakers
avatar for Masayuki Nishiguchi

Masayuki Nishiguchi

Professor, Akita Prefectural University
Masayuki Nishiguchi received his B.E., M.S., and Ph.D. degrees from Tokyo Institute of Technology, University of California Santa Barbara, and Tokyo Institute of Technology, in 1981, 1989, and 2006 respectively.  He was with Sony corporation from 1981 to 2015, where he was involved... Read More →
SI

Soma Ishihara

Akita Prefectural University
KW

Kanji Watanabe

Akita Prefectural University
KA

Koji Abe

Akita Prefectural University
ST

Shouichi Takane

Akita Prefectural University


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Synthesizing Reverberation Impulse Responses from Audio Signals: Auto-Reverberation and Interactive Environments
A method for creating reverberation impulse responses from a variety of audio source materials forms the basis of a family of novel reverberation effects. In auto-reverberation, segments of audio are selected and processed to form an evolving sequence of reverberation impulse responses that are applied to the original source material—that is, the audio is reverberating itself. In cross-reverberation, impulse responses derived from one audio track are applied to another audio track. The reverberation impulse responses are formed by summing randomly selected segments of the source audio, and imposing reverberation characteristics, including reverberation time and wet equalization. By controlling the number and timing of the selected source audio segments, the method produces an array of impulse responses that represent a trajectory through the source material. In so doing, the evolving impulse responses will have the character of room reverberation while also expressing the changing timbre and dynamics of the source audio. Processing architectures are described, and off-line and real-time virtual acoustic sound examples derived from the music of Bach and Dick Dale are presented.
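The sketch below is a plausible reading of the described construction, with invented parameter values; it sums randomly chosen source segments and imposes an exponential decay for a target reverberation time:

```python
# Form a reverberation impulse response from the source audio itself.
import numpy as np

def auto_reverb_ir(source, fs, rt60=2.0, n_segments=32, seg_len=4096,
                   seed=0):
    """Sum random source segments, then impose a -60 dB decay over
    rt60 seconds. Assumes len(source) > seg_len."""
    rng = np.random.default_rng(seed)
    ir = np.zeros(int(rt60 * fs))
    for _ in range(n_segments):
        start = rng.integers(0, len(source) - seg_len)
        offset = rng.integers(0, len(ir) - seg_len)
        ir[offset:offset + seg_len] += source[start:start + seg_len]
    t = np.arange(len(ir)) / fs
    ir *= 10.0 ** (-3.0 * t / rt60)     # -60 dB at t = rt60
    return ir / np.max(np.abs(ir))
```

Auto-reverberation then amounts to convolving the source with an impulse response drawn from itself, e.g. np.convolve(dry, auto_reverb_ir(dry, fs)); cross-reverberation simply takes the impulse response from a different track.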

Speakers
avatar for Eoin Callery

Eoin Callery

Irish World Academy of Music and Dance, University of Limerick
Eoin Callery is an Irish artist and researcher who develops electroacoustic systems relating to chamber music, performance space augmentation, and sound installation. This often involves exploring acoustic phenomena – especially feedback and virtual acoustics – in live situations... Read More →
JA

Jonathan Abel

CCRMA, Stanford University
KS

Kyle Spratt

Applied Research Laboratories, The University of Texas at Austin


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Teaching Modular Synth & Sound Design Online During COVID-19: Maximizing Learning Outcomes Through Open-source Software and Student-centered Pedagogy
This study introduces an inclusive and innovative online teaching pedagogy for sound design and modular synthesis using open-source software, aimed at strong student-centered learning outcomes during the COVID-19 pandemic. The pedagogy proved effective after the course was offered, human-subject research was conducted, and class evaluation data were analyzed. The teaching strategies include comprehensive analysis of sound synthesis theory using sample patches, an introduction to basic electronics, collaborative learning, hands-on lab experiments, student presentations, and alternative reading assignments in the form of educational videos. Online teaching software was used to track student engagement. From a transformative perspective, the authors aim to cultivate student-centered learning, inclusive education, and equal opportunity in an online higher-education classroom. The goal is to achieve the same level of engagement as in-person classes, inspire a diverse student body, offer ample technical and mental-health support, and open the possibility of learning sound design on Eurorack modular synthesizers without investing in expensive hardware. Students' assignments, midterms, and final projects demonstrated a thorough understanding of the course material, strong motivation, and vibrant creativity. Human-subject research was conducted during the course to improve the learning experience and further shape the pedagogy. Three surveys and one-on-one interviews were given to a class of 25 students; the qualitative and quantitative data indicate the satisfaction and effectiveness of this student-centered pedagogy. Promoting social interaction and student well-being while teaching challenging topics during challenging times was also achieved.

Speakers
avatar for Jiayue Cecilia Wu

Jiayue Cecilia Wu

Assistant Professor, Graduate Program Director (MSRA), University of Colorado Denver
Originally from Beijing, Dr. Jiayue Cecilia Wu (AKA: 武小慈) is a scholar, composer, audio engineer, and multimedia technologist. Her work focuses on how technology can augment the healing power of music. She earned her Bachelor of Science degree in Design and Engineering in 2000... Read More →
AF

Ashell Fox

University of Colorado Denver


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

The effect of user's hands on mobile device frequency response, Part 1
First results are presented from a study of the effects of the user's hands on the frequency response and channel balance of a mobile phone hands-free loudspeaker. The results show that the response variation caused by the user's hands is large (up to a 10 dB boost in narrow ranges) and highly user-dependent, although general trends can be observed. The variation between users is especially strong above 5 kHz. The acoustical causes of the observed response shape are studied using a FEM model, indicating that the shape of the palm in particular explains the observed features of the frequency responses. The results suggest that more realistic measurement methods are needed if a more natural tonal balance is to be achieved in handheld devices.

Speakers
JB

Juha Backman

AAC Technologies
LV

Lauri Veko

AAC Technologies Solutions Finland Oy
YJ

Yuheng Jiang

AAC Technologies Holdings Inc


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

The influence of stage acoustics on the singers' performance and perception: A pilot study
It is known that musicians tend to adjust their performance to the acoustical properties of a hall as they perceive them. In a large reverberant hall, for example, they may play staccato notes even shorter than they would in a less reverberant hall to keep the music clearly intelligible to the audience. In this study, four singers were invited to sing two (slow and fast) pieces of music in three venues whose reverberation times were 0.3, 1.8, and 3.4 seconds. The singers were surveyed with questions regarding the tempo, intonation, resonance, and diction of their performance in each venue. The singing voice was also recorded with a headset microphone and analyzed to relate audio features to the characteristics of the venues. The results showed that the singers' perception of vocal resonance was significantly related to the venue (p=0.024), as were the average sound level and the dynamic range of the sound level (p=0.040 for both dependent variables), which could partly be explained in relation to the reverberation time.

Speakers
KK

Kajornsak Kittimathaveenan

Institute of Music, Science and Engineering, King Mongkut's Institute of Technology Ladkrabang
MP

Munhum Park

Institute of Music, Science and Engineering, King Mongkut's Institute of Technology Ladkrabang


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Tools For Visual Thinking: Teaching Electronic Music
Teaching the history and compositional techniques of electronic music can be challenging because few practical resources are available for developing course curricula, and current music styles are constantly changing. Here we explain the benefits of a few assignments that help students connect the analysis of classic Electronic Dance Music (EDM) songs with creating their own compositions that “nail the style.” Timeline analyses of classic EDM songs form a visual representation of how the elements of an arrangement develop. Students later use these analyses as visual blueprints for EDM song arrangements that they compose. Critical listening plays a vital role in creating the detailed timeline analyses, encouraging self-discovery of each element's musical characteristics. This work positively influences the composer's ability to “nail the style.” Pedagogical experiences based on self-discovery offer greater permanence through structured learning.

Speakers
avatar for Graham Spice

Graham Spice

Associate Professor of Music Production and Recording Technology, Shenandoah University


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand

9:00pm EDT

Transducer design considerations for slim TV applications
With the development of new space-efficient display technologies over recent years has come a trend toward decreasing the overall thickness of consumer electronics such as televisions. This slim form factor creates a challenge for the design of integrated audio systems because it severely limits the physical performance possibilities of any acoustic transducer. Modern DSP and amplifier technologies have been able to drive the transducer up to its performance limits and have thus maintained audio quality; however, this will not be enough if further thickness reduction is desired. This paper discusses the physical limits of current designs and suggests a new layout of a moving-coil transducer for ultra-slim applications.

Speakers
avatar for Felix C. Kochendörfer

Felix C. Kochendörfer

Samsung Research America
Felix Kochendörfer was born in 1985 in Weimar, Germany. He received a M.Sc. Degree in Acoustics and Signal Processing from Aalborg University, Denmark in 2010 and a Diploma Degree in Electrical Engineering from Dresden University of Technology in 2011. After a short time at Klippel... Read More →


Thursday October 28, 2021 9:00pm - Friday December 3, 2021 6:00pm EST
On-Demand
 
Friday, October 29
 

9:00pm EDT

Deep Learning for Audio Signal Processing, with Python and Pytorch Examples Tutorial
In this tutorial, we will show some basic building blocks of deep learning, particularly for audio, from the perspective of signal processing. The idea is to show some similarities between familiar signal processing structures and deep learning architectures. For that, we use examples in Python and Pytorch.

We start with best practices for deep learning, then explore convolutional neural networks as filter banks (analysis and synthesis) and autoencoders as filter-bank-based audio coders, and finally discuss recurrent neural networks as IIR (infinite impulse response) filters. This is done using audio examples and Python/PyTorch program examples; a minimal sketch of the convolution-as-filter-bank idea is shown after the content list below.

Content:

  • Best Practices for machine learning in audio
  • Specific properties of audio signals and typical features
  • Convolutional layers as filter banks
  • Autoencoders as Filter bank with optimization
  • Variational autoencoder as audio coder with quantization
  • Recurrent Neural Networks as Infinite Impulse Response filters

The Jupyter notebook file for the tutorial slides can be found at github.com/TUIlmenauAMS/AES_Tutorial_2021.
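As promised above, here is a minimal sketch (ours, not taken from the linked tutorial notebook) of the "convolutional layer as analysis filter bank" correspondence: a Conv1d with stride N is exactly an N-fold downsampled FIR filter bank.

```python
# A strided 1-D convolution viewed as an analysis filter bank.
import torch
import torch.nn as nn

N = 4                                    # subbands / downsampling factor
analysis = nn.Conv1d(in_channels=1, out_channels=N,
                     kernel_size=2 * N, stride=N, bias=False)

x = torch.randn(1, 1, 1024)              # (batch, channels, time)
subbands = analysis(x)                   # N subband signals, downsampled
print(subbands.shape)                    # torch.Size([1, 4, 255])
```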

Speakers
avatar for Renato Profeta

Renato Profeta

Institut für Medientechnik, TU Ilmenau
Renato Profeta is a Ph.D. Candidate in Audio Signal Processing at the Ilmenau University of Technology.He received a Master of Engineering degree in Electrical Engineering from Kempten University of Applied Sciences and a Bachelor of Engineering in Electrical Engineering from Riga... Read More →
SS

Sascha Spors

University of Rostock
avatar for Gerald Schuller

Gerald Schuller

Professor, Ilmenau University of Technology
Audio Coding, Machine Learning for Audio. Short Bio: Gerald Schuller is a full professor at the Institute for Media Technology of the Technical University of Ilmenau, since 2008. He was head of the Audio Coding for Special Applications group of the Fraunhofer Institute for Digital... Read More →


Friday October 29, 2021 9:00pm - Friday December 3, 2021 5:45pm EST
On-Demand

9:00pm EDT

Archiving the '90s (the '80s Edition)
Archival practice often spotlights the challenges of working with magnetic tape and grooved media. This panel shifts focus to the formats frequently used in 1980s recording production: small-format analog multitrack such as 8-track 1/2” and 16-track 1”, dbx and Dolby noise reduction, DASH, ProDigi, and PCM F-1. Loads of great records were made on these formats, frequently in project studios with smaller budgets. Sadly, they are some of the most at-risk formats, both because the carriers are awful and because playback machines in working order are hard to find and maintain. The fact that most of the studios using these formats were smaller project studios with minimal budgets only heightens the urgency of preserving this content. Panelists will talk about playback and preservation of these formats, specific considerations in capturing audio, timecode, and other data, sourcing and maintaining playback machines, and noise reduction techniques with tapes that often have less-than-optimal notes and are degrading faster than they can be cataloged.

Moderators
avatar for Jason Bitner

Jason Bitner

Traffic Entertainment Group
Jason Bitner is a mastering engineer and production supervisor at an independent record distributor in the Boston area. His roots in the Boston music scene go back to repairing brasswinds for Rayburn music. Now, in addition to daily production operations, Jason can be found transferring... Read More →

Speakers
avatar for Eddie Ciletti

Eddie Ciletti

Audio Engineering Society
Eddie has been a self-employed audio tech (and occasional recording engineer) for most of his career, with stints at MCI, Bearsville, Atlantic, Record Plant, and R/Greenberg Associates along the way. He began his professional career in 1975 as a keyboard technician for Hall and Oates... Read More →
avatar for Dan Johnson

Dan Johnson

Owner, Audio Archiving Services, Inc
Dan Johnson is the owner of Audio Archiving Services in Burbank, CA. Over the course of his career, he has worked with priceless masters by high-profile legacy artists such as Led Zeppelin, John Lennon, KISS, The Doors, Eagles, Prince, The Ramones, Van Halen, De La Soul, WAR, and... Read More →
avatar for Kelly Pribble

Kelly Pribble

Director of Media Recovery Technology, Iron Mountain Entertainment Services (IMES)
Kelly Pribble, Director of Media Recovery Technology at Iron Mountain Entertainment Services (IMES), is a veteran studio engineer, studio builder, archivist and inventor. In March 2022, Kelly was issued a Patent for Media Recovery Technology. Before joining Iron Mountain, Kelly... Read More →
avatar for Catherine Vericolli

Catherine Vericolli

Infrasonic Transfers & Archival
Catherine Vericolli is a transfer and archival engineer based in Nashville, TN at Infrasonic Transfers & Archival. With over 15 years of experience with analog media, she specializes in the preservation of historic records and collections across a wide range of mediums, often in need... Read More →


Friday October 29, 2021 9:00pm - Friday December 3, 2021 5:45pm EST
On-Demand

9:00pm EDT

Smaller, louder, smarter - loudspeaker design in the 21st century
Electronic recording and reproduction of sound is roughly 100 years old. Historical milestones include magnetic tape recording, the transition from vacuum tubes to transistors, the introduction of the digital compact disc, and modern digital signal processing. The full impact of digital technology has taken hold since the turn of the century, and its influence on loudspeaker design is evident in at least two applications: directional control of loudspeaker arrays, and squeezing low frequency output from very small devices. Although the two technologies are quite different, they are combined in some smart speaker designs. This informal review highlights some notable achievements, common misconceptions, and practical guidelines.

Speakers
avatar for George Augspurger

George Augspurger

Perception Inc.
George L. Augspurger received his M.A. degree from UCLA and has done postgraduate work at Northwestern University. After working in sound contracting and television production he joined James B. Lansing Sound, Inc. in 1958 where he served as Technical Service Manager and later as... Read More →


Friday October 29, 2021 9:00pm - Friday December 3, 2021 5:45pm EST
On-Demand
 