Yi Liu

Papers
26

Attribution

Affiliations & profile
Semantic Scholar

Authored papers

26

ACoT-VLA: Action Chain-of-Thought for Vision-Language-Action Models

arXiv 2026

2026

PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model

arXiv 2025

2025

PP-DocLayout: A Unified Document Layout Detection Model to Accelerate Large-Scale Data Construction

arXiv 2025

2025

SpaceR: Reinforcing MLLMs in Video Spatial Reasoning

arXiv 2025

2025

VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning?

arXiv 2025

2025

Step-Audio 2 Technical Report

arXiv 2025

2025

Enabling Versatile Controls for Video Diffusion Models

arXiv 2025

2025

MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs

arXiv 2025

2025

Unified Embodied VLM Reasoning with Robotic Action via Autoregressive Discretized Pre-training

arXiv 2025

2025

RT-DETRv2: Improved Baseline with Bag-of-Freebies for Real-Time Detection Transformer

arXiv 2024

2024

MooER: LLM-based Speech Recognition and Translation Models from Moore Threads

arXiv 2024

2024

TempCompass: Do Video LLMs Really Understand Videos?

arXiv 2024

2024

CAMixerSR: Only Details Need More "Attention"

CVPR 2024

2024

PrefixQuant: Eliminating Outliers by Prefixed Tokens for Large Language Models Quantization

arXiv 2024

2024

Alignment-Enhanced Decoding: Defending via Token-Level Adaptive Refining of Probability Distributions

arXiv 2024

2024

PunchBench: Benchmarking MLLMs in Multimodal Punchline Comprehension

arXiv 2024

2024

A Comprehensive Study of Jailbreak Attack versus Defense for Large Language Models

arXiv 2024

2024

Benchmarking Large Language Models on Controllable Generation under Diversified Instructions

arXiv 2024

2024

DETRs Beat YOLOs on Real-time Object Detection

CVPR 2024

2023

PP-MobileSeg: Explore the Fast and Accurate Semantic Segmentation Model on Mobile Devices

arXiv 2023

2023

MVBench: A Comprehensive Multi-modal Video Understanding Benchmark

CVPR 2024

2023

Semi-Offline Reinforcement Learning for Optimized Text Generation

arXiv 2023

2023

PentestGPT: An LLM-empowered Automatic Penetration Testing Tool

arXiv 2023

2023

Object-Aware Distillation Pyramid for Open-Vocabulary Object Detection

CVPR 2023

2023

VITATECS: A Diagnostic Dataset for Temporal Concept Understanding of Video-Language Models

arXiv 2023

2023

EvoMoE: An Evolutional Mixture-of-Experts Training Framework via Dense-To-Sparse Gate

2021

Affiliations

No known affiliations.

Frequent co-authors

10

from 26 papers