Li Zhang

Papers
40

Attribution

Affiliations & profile
Semantic Scholar

Authored papers

40

Disentangled Robot Learning via Separate Forward and Inverse Dynamics Pretraining

arXiv 2026

2026

SGDrive: Scene-to-Goal Hierarchical World Cognition for Autonomous Driving

arXiv 2026

2026

OSUM: Advancing Open Speech Understanding Models with Limited Resources in Academia

arXiv 2025

2025

Bridging Past and Future: End-to-End Autonomous Driving with Historical Prediction and Planning

CVPR 2025 1

2025

Reinforcing Action Policies by Prophesying

arXiv 2025

2025

TexVerse: A Universe of 3D Objects with High-Resolution Textures

arXiv 2025

2025

Reasoning in Space via Grounding in the World

arXiv 2025

2025

Cross from Left to Right Brain: Adaptive Text Dreamer for Vision-and-Language Navigation

arXiv 2025

2025

Q-Insight: Understanding Image Quality via Visual Reinforcement Learning

arXiv 2025

2025

Does Chain-of-Thought Reasoning Help Mobile GUI Agent? An Empirical Study

arXiv 2025

2025

CodeSwift: Accelerating LLM Inference for Efficient Code Generation

arXiv 2025

2025

UniScene: Unified Occupancy-centric Driving Scene Generation

CVPR 2025 1

2024

Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation

arXiv 2024

2024

ShortcutsBench: A Large-Scale Real-world Benchmark for API-based Agents

arXiv 2024

2024

LlamaTouch: A Faithful and Scalable Testbed for Mobile UI Task Automation

arXiv 2024

2024

Meta-DT: Offline Meta-RL as Conditional Sequence Modeling with World Model Disentanglement

arXiv 2024

2024

S-Agents: Self-organizing Agents in Open-ended Environments

arXiv 2024

2024

CAMixerSR: Only Details Need More "Attention"

CVPR 2024 1

2024

DeepInteraction++: Multi-Modality Interaction for Autonomous Driving

arXiv 2024

2024

A Survey of Resource-efficient LLM and Multimodal Foundation Models

arXiv 2024

2024

DroidCall: A Dataset for LLM-powered Android Intent Invocation

arXiv 2024

2024

Brain3D: Generating 3D Objects from fMRI

arXiv 2024

2024

On the Limit of Language Models as Planning Formalizers

arXiv 2024

2024

PDDLEGO: Iterative Planning in Textual Environments

arXiv 2024

2024

What Do You Want? User-centric Prompt Generation for Text-to-image Synthesis via Multi-turn Guidance

arXiv 2024

2024

Large Language Models Illuminate a Progressive Pathway to Artificial Healthcare Assistant: A Review

arXiv 2023

2023

CoachLM: Automatic Instruction Revisions Improve the Data Quality in LLM Instruction Tuning

arXiv 2023

2023

Faithful Chain-of-Thought Reasoning

arXiv 2023

2023

Enhancing High-Resolution 3D Generation through Pixel-wise Gradient Clipping

arXiv 2023

2023

FAME-ViL: Multi-Tasking Vision-Language Model for Heterogeneous Fashion Tasks

CVPR 2023 1

2023

PARTNER: Level up the Polar Representation for LiDAR 3D Object Detection

ICCV 2023 1

2023

Causal Reasoning of Entities and Events in Procedural Texts

arXiv 2023

2023

Exploring the Curious Case of Code Prompts

arXiv 2023

2023

Real-time Photorealistic Dynamic Scene Representation and Rendering with 4D Gaussian Splatting

arXiv 2023

2023

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

TMLR

2022

Softmax-free Linear Transformers

arXiv 2022

2022

Show Me More Details: Discovering Hierarchies of Procedures from Semi-structured Web Data

ACL 2022 5

2022

Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers

CVPR 2021 1

2020

Improving Text-to-SQL Evaluation Methodology

2018

Learning a Deep Embedding Model for Zero-Shot Learning

2016

Affiliations

No known affiliations.

Frequent co-authors

10

from 40 papers