LiWei Wang
- Papers: 22
Authored papers (22)
Understanding vs. Generation: Navigating Optimization Dilemma in Multimodal Models
arXiv 2026
Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors
arXiv 2025
Rethinking Chain-of-Thought Reasoning for Videos
arXiv 2025
Distribution Matching Variational AutoEncoder
arXiv 2025
UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Language Interface
arXiv 2025
AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning
ICCV 2025
Video-3D LLM: Learning Position-Aware Video Representation for 3D Scene Understanding
CVPR 2025
Making Long-Context Language Models Better Multi-Hop Reasoners
arXiv 2024
DPO Meets PPO: Reinforced Token Optimization for RLHF
arXiv 2024
GiT: Towards Generalist Vision Transformer through Universal Language Interface
arXiv 2024
Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation
arXiv 2024
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters
arXiv 2024
DSVT: Dynamic Sparse Voxel Transformer with Rotated Sets
CVPR 2023
Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling
arXiv 2023
UniTR: A Unified and Efficient Multi-Modal Transformer for Bird's-Eye-View Representation
ICCV 2023
Towards Learning a Generalist Model for Embodied Navigation
CVPR 2024
A Complete Expressiveness Hierarchy for Subgraph GNNs via Subgraph Weisfeiler-Lehman Tests
arXiv 2023
Offline Meta Reinforcement Learning with In-Distribution Online Adaptation
arXiv 2023
VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control
ICCV 2023
CAGroup3D: Class-Aware Grouping for 3D Object Detection on Point Clouds
arXiv 2022
RBGNet: Ray-based Grouping for 3D Object Detection
CVPR 2022
Your Transformer May Not be as Powerful as You Expect
arXiv 2022
Frequent co-authors
10 from 22 papers