
Jie Yang

Papers
28


Attribution: affiliations & profile from Semantic Scholar

Authored papers

28

OmniPro: A Comprehensive Benchmark for Omni-Proactive Streaming Video Understanding

arXiv 2026

2026

Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters

arXiv 2026

2026

Stage-adaptive Token Selection for Efficient Omni-modal LLMs

arXiv 2026

2026

LongCat-Next: Lexicalizing Modalities as Discrete Tokens

arXiv 2026

2026

CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence

arXiv 2026

2026

ABC-Bench: Benchmarking Agentic Backend Coding in Real-World Development

arXiv 2026

2026

VIKI-R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning

arXiv 2025

2025

WeThink: Toward General-purpose Vision-Language Reasoning via Reinforcement Learning

arXiv 2025

2025

BRIDGE: Benchmarking Large Language Models for Understanding Real-world Clinical Practice Text

arXiv 2025

2025

Nex-N1: Agentic Models Trained via a Unified Ecosystem for Large-Scale Environment Construction

arXiv 2025

2025

Holistic Semantic Representation for Navigational Trajectory Generation

arXiv 2025

2025

Step-Audio 2 Technical Report

arXiv 2025

2025

Step-Audio: Unified Understanding and Generation in Intelligent Speech Interaction

arXiv 2025

2025

Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model

arXiv 2025

2025

Grounded SAM: Assembling Open-World Models for Diverse Visual Tasks

arXiv 2024

2024

Higher Layers Need More LoRA Experts

arXiv 2024

2024

CAD-GPT: Synthesising CAD Construction Sequence with Spatial Reasoning-Enhanced Multimodal LLMs

arXiv 2024

2024

MambaMIM: Pre-training Mamba with State Space Token Interpolation and its Application to Medical Image Segmentation

arXiv 2024

2024

Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding

arXiv 2024

2024

Leveraging Open Knowledge for Advancing Task Expertise in Large Language Models

arXiv 2024

2024

FinTruthQA: A Benchmark Dataset for Evaluating the Quality of Financial Information Disclosure

arXiv 2024

2024

Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection

arXiv 2023

2023

detrex: Benchmarking Detection Transformers

arXiv 2023

2023

X-Pose: Detecting Any Keypoints

arXiv 2023

2023

Neural Interactive Keypoint Detection

ICCV 2023 1

2023

VSViG: Real-time Video-based Seizure Detection via Skeleton-based Spatiotemporal ViG

arXiv 2023

2023

FoPro: Few-Shot Guided Robust Webly-Supervised Prototypical Learning

arXiv 2022

2022

DFR: Deep Feature Reconstruction for Unsupervised Anomaly Segmentation

arXiv 2020

2020

Affiliations

No known affiliations.

Frequent co-authors

10

from 28 papers