Jie Yang
- Papers: 28
Authored papers (28)
OmniPro: A Comprehensive Benchmark for Omni-Proactive Streaming Video Understanding
arXiv 2026
Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters
arXiv 2026
Stage-adaptive Token Selection for Efficient Omni-modal LLMs
arXiv 2026
LongCat-Next: Lexicalizing Modalities as Discrete Tokens
arXiv 2026
CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence
arXiv 2026
ABC-Bench: Benchmarking Agentic Backend Coding in Real-World Development
arXiv 2026
VIKI-R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning
arXiv 2025
WeThink: Toward General-purpose Vision-Language Reasoning via Reinforcement Learning
arXiv 2025
BRIDGE: Benchmarking Large Language Models for Understanding Real-world Clinical Practice Text
arXiv 2025
Nex-N1: Agentic Models Trained via a Unified Ecosystem for Large-Scale Environment Construction
arXiv 2025
Holistic Semantic Representation for Navigational Trajectory Generation
arXiv 2025
Step-Audio 2 Technical Report
arXiv 2025
Step-Audio: Unified Understanding and Generation in Intelligent Speech Interaction
arXiv 2025
Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model
arXiv 2025
Grounded SAM: Assembling Open-World Models for Diverse Visual Tasks
arXiv 2024
Higher Layers Need More LoRA Experts
arXiv 2024
CAD-GPT: Synthesising CAD Construction Sequence with Spatial Reasoning-Enhanced Multimodal LLMs
arXiv 2024
MambaMIM: Pre-training Mamba with State Space Token Interpolation and its Application to Medical Image Segmentation
arXiv 2024
Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding
arXiv 2024
Leveraging Open Knowledge for Advancing Task Expertise in Large Language Models
arXiv 2024
FinTruthQA: A Benchmark Dataset for Evaluating the Quality of Financial Information Disclosure
arXiv 2024
Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection
arXiv 2023
detrex: Benchmarking Detection Transformers
arXiv 2023
X-Pose: Detecting Any Keypoints
arXiv 2023
Neural Interactive Keypoint Detection
ICCV 2023
VSViG: Real-time Video-based Seizure Detection via Skeleton-based Spatiotemporal ViG
arXiv 2023
FoPro: Few-Shot Guided Robust Webly-Supervised Prototypical Learning
arXiv 2022
DFR: Deep Feature Reconstruction for Unsupervised Anomaly Segmentation
arXiv 2020
Affiliations
Frequent co-authors
10 from 28 papers