MGDFIS: Multi-scale Global-detail Feature Integration Strategy for Small Object Detection

Small-object detection in Unmanned Aerial Vehicle (UAV) imagery requires preserving weak local evidence while using broader context to separate tiny foreground targets from cluttered backgrounds. Existing multi-scale fusion methods improve feature aggregation, but they often add computation or blur fine details during repeated cross-scale fusion. The central challenge is to balance low-SNR target preservation, clutter suppression, and efficient cross-scale context exchange. To address this challenge, we propose the Multi-scale Global-detail Feature Integration Strategy (MGDFIS), a neck-level feature-fusion strategy that couples global context exchange, local-detail recovery, and pixel-level foreground-background recalibration. MGDFIS integrates three coordinated modules: FusionLock-TSS Attention for stabilizing spectral-spatial responses, Global-detail Integration for combining long-range mixing with local detail capture, and Dynamic Pixel Attention for reweighting compact foreground regions. In the controlled VisDrone setting, YOLO26m + MGDFIS improves AP50:95 from 25.7 to 30.2 and AP50 from 37.2 to 44.2 over the YOLO26m baseline, at a computational cost of 96.1 GFLOPs. Additional dataset-specific evaluations report 38.9 AP50 and 21.9 AP50:95 on UAVDT and 97.4 AP50 on CARPK. The code is available at https://github.com/JackBaixue/MGDFIS.
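
For readers who want the VisDrone improvement as explicit deltas, the following minimal sketch computes absolute and relative gains from the four AP values quoted above. The helper name `gain` and the dictionary layout are illustrative assumptions and are not taken from the MGDFIS repository.

```python
def gain(baseline, improved):
    """Return (absolute gain in AP points, relative gain in percent),
    each rounded to one decimal place."""
    absolute = improved - baseline
    relative = 100.0 * absolute / baseline
    return round(absolute, 1), round(relative, 1)


# VisDrone values quoted in the abstract:
# metric -> (YOLO26m baseline, YOLO26m + MGDFIS)
visdrone = {
    "AP50:95": (25.7, 30.2),
    "AP50": (37.2, 44.2),
}

for metric, (base, ours) in visdrone.items():
    abs_gain, rel_gain = gain(base, ours)
    print(f"{metric}: +{abs_gain} points ({rel_gain}% relative)")
```

Under these numbers the gains are +4.5 points on AP50:95 (about 17.5% relative) and +7.0 points on AP50 (about 18.8% relative).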