26th EAAAI (EANN) 2025, 26 - 29 June 2025, Limassol, Cyprus

FusionNet: Leveraging Dual Speech Separation Networks for Enhanced Multi-Speaker Isolation

Ravindran Sumedh, Mallesh Shreyas, Bhatagunaki Raghavendra, Chitnis Srikrishna, Sharath Shylaja

Abstract:

  In the realm of audio signal processing, isolating human speech from background noise poses a significant challenge. In noisy environments, the signals of multiple speakers may overlap, and ambient noise further complicates the accurate extraction of each individual's speech. Recent advancements leverage Transformer-based architectures to efficiently extract human speech from mixtures of overlapping background sounds, addressing the shortcomings of traditional techniques. However, existing Transformer models face limitations, including restricted scalability to scenarios with a large number of overlapping speakers, increased computational complexity, and parameter inefficiency. In this paper, we propose an architectural framework that leverages the combined speech separation capabilities of two such novel Transformer architectures to enhance multi-speaker separation.
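
The abstract does not detail how the two separation networks are combined, so the following is only a minimal sketch of one way a "dual separator + fusion" framework could be wired up. The module names (DummySeparator, FusionWrapper), the placeholder separators, and the learned per-speaker convex weighting are illustrative assumptions, not the authors' actual method.

```python
# Hypothetical sketch of a dual-separator fusion wrapper (assumptions, not the paper's design).
import torch
import torch.nn as nn


class DummySeparator(nn.Module):
    """Stand-in for a Transformer-based separator: maps a mixture waveform
    of shape (batch, samples) to per-speaker estimates (batch, speakers, samples)."""

    def __init__(self, num_speakers: int = 2):
        super().__init__()
        self.proj = nn.Conv1d(1, num_speakers, kernel_size=3, padding=1)

    def forward(self, mixture: torch.Tensor) -> torch.Tensor:
        return self.proj(mixture.unsqueeze(1))


class FusionWrapper(nn.Module):
    """Blends per-speaker estimates from two separators with a learned
    per-speaker convex weight (one simple fusion choice among many)."""

    def __init__(self, sep_a: nn.Module, sep_b: nn.Module, num_speakers: int = 2):
        super().__init__()
        self.sep_a, self.sep_b = sep_a, sep_b
        self.logits = nn.Parameter(torch.zeros(num_speakers))  # fusion weights

    def forward(self, mixture: torch.Tensor) -> torch.Tensor:
        est_a = self.sep_a(mixture)                  # (B, S, T)
        est_b = self.sep_b(mixture)                  # (B, S, T)
        w = torch.sigmoid(self.logits).view(1, -1, 1)
        return w * est_a + (1.0 - w) * est_b         # fused estimates


if __name__ == "__main__":
    model = FusionWrapper(DummySeparator(), DummySeparator())
    mix = torch.randn(4, 16000)                      # four 1-second 16 kHz mixtures
    print(model(mix).shape)                          # torch.Size([4, 2, 16000])
```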

*** Title, author list, and abstract as submitted at camera-ready delivery; minor changes introduced during Springer's processing may not be reflected here.