What Do Neural Networks Learn for TDOA Estimation? A Cross-Architecture Probing Study

Year: 2026
Abstract & full text: arxiv.org/abs/2606.22020

Abstract

Neural networks outperform classical GCC-PHAT for Time-Difference-of-Arrival (TDOA) estimation in noise and reverberation, yet their internal strategy remains unexplored. To uncover it, we turn GCC-PHAT's mathematical steps into diagnostic targets, probing hidden layers of three architectures (MLP, CNN, Transformer) and complementing the probes with gradient attribution and causal frequency masking. We find that cross-power computation consistently emerges across all architectures and conditions, while PHAT whitening, the defining step of GCC-PHAT, fails to emerge. Instead, networks learn a magnitude-aware frequency weighting that preserves per-frequency reliability information discarded by PHAT. This makes PHAT an information bottleneck: removing it from both classical and neural GCC pipelines improves performance under additive noise. On real-world reverberant data, PHAT remains the best classical weighting, but end-to-end networks achieve lower error by learning data-adaptive weighting.
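
For readers unfamiliar with the classical pipeline the abstract refers to, the following is a minimal NumPy sketch of generalized cross-correlation with an optional PHAT whitening step. It is not the authors' code: the function name `gcc_tdoa`, its parameters, the zero-padding choice, and the sign convention (positive result means the second channel lags the first) are all assumptions made for illustration. Setting `phat=False` gives the plain cross-correlation variant, which keeps the per-frequency magnitude that PHAT discards.

```python
import numpy as np

def gcc_tdoa(x1, x2, fs, phat=True, eps=1e-12):
    """Estimate the delay of x2 relative to x1, in seconds (hypothetical helper)."""
    n = len(x1) + len(x2)            # zero-pad so the correlation is linear, not circular
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)

    # Step 1: cross-power spectrum. Its phase encodes the delay at each frequency.
    cross = X2 * np.conj(X1)

    # Step 2 (optional): PHAT whitening. Dividing by the magnitude keeps only the
    # phase, so every frequency bin gets equal weight regardless of its energy.
    if phat:
        cross = cross / (np.abs(cross) + eps)

    # Step 3: back to the lag domain and pick the peak.
    cc = np.fft.irfft(cross, n=n)
    max_lag = n // 2
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))  # lags -max_lag..+max_lag
    lag = int(np.argmax(cc)) - max_lag
    return lag / fs
```

A noise-free usage example under the same assumptions: if `s` is a white-noise signal and `x2` is `s` delayed by 5 samples, both `gcc_tdoa(s, x2, fs)` and `gcc_tdoa(s, x2, fs, phat=False)` should return `5 / fs`. The two variants differ only once noise or reverberation makes some frequency bins less reliable than others.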