Detecting machine-generated text has become a critical challenge amid the rapid advancement of LLMs, yet existing detectors degrade severely under domain shift. Through systematic pilot studies, we trace this vulnerability to two fundamental flaws in current generalization strategies, namely the incomplete preservation of domain-specific knowledge during multi-domain training and the misalignment between knowledge retrieval and the detection objective at inference. To address these gaps, we propose DEER, a Disentangled mixturE-of-ExpeRts framework that explicitly decouples domain-local and domain-invariant knowledge into specialized expert modules. Instead of static domain matching, DEER employs a reinforcement learning-driven router that selects expert pathways based on instance-level detection rewards. This task-aligned, domain-agnostic mechanism ensures robust adaptation to unseen distributions by prioritizing detection utility over stylistic resemblance. Extensive experiments demonstrate that DEER consistently outperforms state-of-the-art detectors, achieving average F1 improvements of 1.28% and 2.92%, and accuracy gains of 1.35% and 2.26% on in-domain and out-of-domain datasets, offering reliable generalization for open-world deployment.
DEER: Disentangled Mixture of Experts with Instance-Adaptive Routing for Generalizable Machine-Generated Text Detection
Detecting machine-generated text has become a critical challenge amid the rapid advancement of LLMs, yet existing detectors degrade severely under domain shift. Through systematic pilot studies, we trace this vulnerability to two fundamental flaws in current generalization…
- Year
- 2025
- Hosting
- Abstract onlyARXIV-DEFAULT
Cite
Notes
Only stored in your browser.
Attribution
- Abstract & full text
- arxiv.org/abs/2511.01192ARXIV-DEFAULT
- TL;DR
- Semantic Scholar