Chen Li

Papers
34

Authored papers

34

GLM-5: from Vibe Coding to Agentic Engineering

arXiv 2026

2026

FireRed-Image-Edit-1.0 Technical Report

arXiv 2026

2026

AndroTMem: From Interaction Trajectories to Anchored Memory in Long-Horizon GUI Agents

arXiv 2026

2026

NOVA: Sparse Control, Dense Synthesis for Pair-Free Video Editing

arXiv 2026

2026

What Is Wrong with Synthetic Data for Scene Text Recognition? A Strong Synthetic Engine with Diverse Simulations and Self-Evolution

arXiv 2026

2026

Audio-Omni: Extending Multi-modal Understanding to Versatile Audio Generation and Editing

arXiv 2026

2026

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

arXiv 2025

2025

Stand-In: A Lightweight and Plug-and-Play Identity Control for Video Generation

arXiv 2025

2025

ARC-Chapter: Structuring Hour-Long Videos into Navigable Chapters and Hierarchical Summaries

arXiv 2025

2025

V-Thinker: Interactive Thinking with Images

arXiv 2025

2025

UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning

arXiv 2025

2025

X-Omni: Reinforcement Learning Makes Discrete Autoregressive Image Generative Models Great Again

arXiv 2025

2025

We-Math 2.0: A Versatile MathBook System for Incentivizing Visual Mathematical Reasoning

arXiv 2025

2025

Patch-as-Decodable-Token: Towards Unified Multi-Modal Vision Tasks in MLLMs

arXiv 2025

2025

ARC-Hunyuan-Video-7B: Structured Video Comprehension of Real-World Shorts

arXiv 2025

2025

UnicEdit-10M: A Dataset and Benchmark Breaking the Scale-Quality Barrier via Unified Verification for Reasoning-Enriched Edits

arXiv 2025

2025

Unleashing High-Quality Image Generation in Diffusion Sampling Using Second-Order Levenberg-Marquardt-Langevin

ICCV 2025

2025

Mamba YOLO: A Simple Baseline for Object Detection with State Space Model

arXiv 2024

2024

DISC: Plug-and-Play Decoding Intervention with Similarity of Characters for Chinese Spelling Check

arXiv 2024

2024

SEED-Data-Edit Technical Report: A Hybrid Dataset for Instructional Image Editing

arXiv 2024

2024

Stacking Brick by Brick: Aligned Feature Isolation for Incremental Face Forgery Detection

CVPR 2025

2024

ST-LLM: Large Language Models Are Effective Temporal Learners

2024

PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance

arXiv 2024

2024

We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning?

arXiv 2024

2024

Making LLaMA SEE and Draw with SEED Tokenizer

arXiv 2023

2023

Unleashing the Potential of Spiking Neural Networks by Dynamic Confidence

arXiv 2023

2023

How to Determine the Most Powerful Pre-trained Language Model without Brute Force Fine-tuning? An Empirical Survey

arXiv 2023

2023

NaSGEC: a Multi-Domain Chinese Grammatical Error Correction Dataset from Native Speaker Texts

arXiv 2023

2023

Efficient Diffusion Training via Min-SNR Weighting Strategy

ICCV 2023

2023

DETR Doesn't Need Multi-Scale or Locality Design

arXiv 2023

2023

Vision-Language Instruction Tuning: A Review and Analysis

arXiv 2023

2023

All in Tokens: Unifying Output Space of Visual Tasks via Soft Token

ICCV 2023

2023

NeRF-DS: Neural Radiance Fields for Dynamic Specular Objects

CVPR 2023

2023

Weakly-supervised 3D Pose Transfer with Keypoints

ICCV 2023

2023

Affiliations

No known affiliations.

Frequent co-authors

10

from 34 papers