Agent skills are widely supported by major agentic frameworks and perform well with proprietary models, yet their effectiveness for small and medium-sized open-source language models (270M to 80B parameters) remains underexplored. We systematically study the Skill paradigm in resource-constrained industrial settings, where reliance on proprietary APIs is impractical due to data security and budget constraints. Across two open-source tasks and a real-world insurance claims classification task, we find that very small models struggle to select skills reliably, while models in the 30B-80B range benefit substantially. Thinking variants do not gain much from skills, particularly once the additional GPU usage caused by overthinking is taken into account. These findings reveal a trade-off between GPU cost and agent performance, and provide actionable insights for effective Skill configuration and small language model (SLM) deployment in real-world settings.
Agent Skill Framework: Perspectives on the Potential of Small to Medium Language Models in Industrial Environments
- Year: 2026
- Source: arxiv.org/abs/2602.16653 (CC-BY-NC-SA-4.0)