Wei Ji

Authored papers

20

Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters

arXiv 2026

UniM: A Unified Any-to-Any Interleaved Multimodal Benchmark

arXiv 2026

Interp3D: Correspondence-aware Interpolation for Generative Textured 3D Morphing

arXiv 2026

STEP3-VL-10B Technical Report

arXiv 2026

Step-DeepResearch Technical Report

arXiv 2025

MVU-Eval: Towards Multi-Video Understanding Evaluation for Multimodal LLMs

arXiv 2025

NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale

arXiv 2025

Step-Audio 2 Technical Report

arXiv 2025

Step-Audio: Unified Understanding and Generation in Intelligent Speech Interaction

arXiv 2025

Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model

arXiv 2025

Step-Video-TI2V Technical Report: A State-of-the-Art Text-Driven Image-to-Video Generation Model

arXiv 2025

NExT-GPT: Any-to-Any Multimodal LLM

arXiv 2023

Visually-Prompted Language Model for Fine-Grained Scene Graph Generation in an Open World

ICCV 2023 1

Towards Robust Multi-Modal Reasoning via Model Selection

arXiv 2023

Medical SAM Adapter: Adapting Segment Anything Model for Medical Image Segmentation

arXiv 2023

Fine-tuning Multimodal LLMs to Follow Zero-shot Demonstrative Instructions

arXiv 2023

VPGTrans: Transfer Visual Prompt Generator across LLMs

NeurIPS 2023 11

Generating Visual Spatial Description via Holistic 3D Scene Understanding

arXiv 2023

NExT-Chat: An LMM for Chat, Detection and Segmentation

arXiv 2023

Composed Image Retrieval with Text Feedback via Multi-grained Uncertainty Regularization

arXiv 2022

Affiliations

No known affiliations.

Frequent co-authors

10 (from 20 papers)