
Song Wang

Papers
24



Authored papers

24

AI for Auto-Research: Roadmap & User Guide

arXiv 2026


Rethinking Memory Mechanisms of Foundation Agents in the Second Half: A Survey

arXiv 2026


VisionTrim: Unified Vision Token Compression for Training-Free MLLM Acceleration

arXiv 2026


How Much Reasoning Do Retrieval-Augmented Models Add beyond LLMs? A Benchmarking Framework for Multi-Hop Inference over Hybrid Knowledge

arXiv 2026


Inst3D-LMM: Instance-Aware 3D Scene Understanding with Multi-modal Instruction Tuning

CVPR 2025 1


Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps

arXiv 2025


Vision-Language-Action Models for Autonomous Driving: Past, Present, and Future

arXiv 2025


RewardMap: Tackling Sparse Rewards in Fine-grained Visual Reasoning via Multi-Stage Reinforcement Learning

arXiv 2025


Uncertainty-Instructed Structure Injection for Generalizable HD Map Construction

CVPR 2025 1


A Coarse-to-Fine Approach to Multi-Modality 3D Occupancy Grounding

arXiv 2025


Forging Spatial Intelligence: A Roadmap of Multi-Modal Data Pre-Training for Autonomous Systems

arXiv 2025


3D and 4D World Modeling: A Survey

arXiv 2025


SAM4D: Segment Anything in Camera and LiDAR Streams

ICCV 2025


PixelThink: Towards Efficient Chain-of-Pixel Reasoning

arXiv 2025


Complex Logical Instruction Generation

arXiv 2025


Reasoning of Large Language Models over Knowledge Graphs with Super-Relations

arXiv 2025


PianoMotion10M: Dataset and Benchmark for Hand Motion Generation in Piano Performance

arXiv 2024


TokenPacker: Efficient Visual Projector for Multimodal LLM

arXiv 2024


Large Language Models for Data Annotation and Synthesis: A Survey

arXiv 2024


Integrative Decoding: Improve Factuality via Implicit Self-consistency

arXiv 2024


Leveraging Inpainting for Single-Image Shadow Removal

ICCV 2023 1


Interpreting Pretrained Language Models via Concept Bottlenecks

arXiv 2023


Z-Code++: A Pre-trained Language Model Optimized for Abstractive Summarization

arXiv 2022


MISF: Multi-level Interactive Siamese Filtering for High-Fidelity Image Inpainting

CVPR 2022 1


Affiliations

No known affiliations.

Frequent co-authors

10

from 24 papers