Liang Wang

Papers
47

Authored papers

47

Cheers: Decoupling Patch Details from Semantic Representations Enables Unified Multimodal Comprehension and Generation

arXiv 2026

2026

RealChart2Code: Advancing Chart-to-Code Generation with Real Data and Multi-Task Evaluation

arXiv 2026

2026

How Well Do Models Follow Visual Instructions? VIBE: A Systematic Benchmark for Visual Instruction-Driven Image Editing

arXiv 2026

2026

FVG-PT: Adaptive Foreground View-Guided Prompt Tuning for Vision-Language Models

arXiv 2026

2026

Aligning Multimodal LLM with Human Preference: A Survey

arXiv 2025

2025

Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing

arXiv 2025

2025

BridgeVLA: Input-Output Alignment for Efficient 3D Manipulation Learning with Vision-Language Models

arXiv 2025

2025

R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning

arXiv 2025

2025

mmE5: Improving Multimodal Multilingual Embeddings via High-quality Synthetic Data

arXiv 2025

2025

Diffusion Models for Molecules: A Survey of Methods and Tasks

arXiv 2025

2025

SimScale: Learning to Drive via Real-World Simulation at Scale

arXiv 2025

2025

Embodied Co-Design for Rapidly Evolving Agents: Taxonomy, Frontiers, and Challenges

arXiv 2025

2025

PosterCopilot: Toward Layout Reasoning and Controllable Editing for Professional Graphic Design

arXiv 2025

2025

Thyme: Think Beyond Images

arXiv 2025

2025

MoCa: Modality-aware Continual Pre-training Makes Better Bidirectional Multimodal Embeddings

arXiv 2025

2025

OS Agents: A Survey on MLLM-based Agents for General Computing Devices Use

arXiv 2025

2025

Variational Reasoning for Language Models

arXiv 2025

2025

MME-Unify: A Comprehensive Benchmark for Unified Multimodal Understanding and Generation Models

arXiv 2025

2025

MolSpectra: Pre-training 3D Molecular Representation with Multi-modal Energy Spectra

arXiv 2025

2025

Latent Sketchpad: Sketching Visual Thoughts to Elicit Multimodal Reasoning in MLLMs

arXiv 2025

2025

AVoCaDO: An Audiovisual Video Captioner Driven by Temporal Orchestration

arXiv 2025

2025

OpenGPT-4o-Image: A Comprehensive Dataset for Advanced Image Generation and Editing

arXiv 2025

2025

Reinforcing General Reasoning without Verifiers

arXiv 2025

2025

PoisonArena: Uncovering Competing Poisoning Attacks in Retrieval-Augmented Generation

arXiv 2025

2025

BlendSQL: A Scalable Dialect for Unifying Hybrid Question Answering in Relational Algebra

arXiv 2024

2024

Generative Representational Instruction Tuning

arXiv 2024

2024

From Individual to Society: A Survey on Social Simulation Driven by Large Language Model-based Agents

arXiv 2024

2024

Debiasing Multimodal Large Language Models

arXiv 2024

2024

VLKEB: A Large Vision-Language Model Knowledge Editing Benchmark

arXiv 2024

2024

PromptIQA: Boosting the Performance and Generalization for No-Reference Image Quality Assessment via Prompts

arXiv 2024

2024

Logical Closed Loop: Uncovering Object Hallucinations in Large Vision-Language Models

arXiv 2024

2024

Little Giants: Synthesizing High-Quality Embedding Data at Scale

arXiv 2024

2024

Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models

arXiv 2024

2024

Beyond Filtering: Adaptive Image-Text Quality Enhancement for MLLM Pretraining

arXiv 2024

2024

LongEmbed: Extending Embedding Models for Long Context Retrieval

arXiv 2024

2024

Inference with Reference: Lossless Acceleration of Large Language Models

arXiv 2023

2023

GSLB: The Graph Structure Learning Benchmark

2023

AdaNPC: Exploring Non-Parametric Classifier for Test-Time Adaptation

arXiv 2023

2023

EX-FEVER: A Dataset for Multi-hop Explainable Fact Verification

arXiv 2023

2023

PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training

arXiv 2023

2023

OneNet: Enhancing Time Series Forecasting Models under Concept Drift by Online Ensembling

2023

Exploring Model Dynamics for Accumulative Poisoning Discovery

arXiv 2023

2023

Learning Diverse Document Representations with Deep Query Interactions for Dense Retrieval

arXiv 2022

2022

BEVBert: Multimodal Map Pre-training for Language-guided Navigation

ICCV 2023

2022

Relation-aware Heterogeneous Graph for User Profiling

arXiv 2021

2021

Deep Graph Contrastive Representation Learning

arXiv 2020

2020

Bifurcated backbone strategy for RGB-D salient object detection

arXiv 2020

2020

Affiliations

No known affiliations.

Frequent co-authors

10

from 47 papers