Huchuan Lu

Papers
39

Authored papers

39

Perception or Prejudice: Can MLLMs Go Beyond First Impressions of Personality?

arXiv 2026

2026

Think3D: Thinking with Space for Spatial Reasoning

arXiv 2026

2026

VISTA-Bench: Do Vision-Language Models Really Understand Visualized Text as Well as Pure Text?

arXiv 2026

2026

Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge

arXiv 2025

2025

MultiShotMaster: A Controllable Multi-Shot Video Generation Framework

arXiv 2025

2025

EVEv2: Improved Baselines for Encoder-Free Vision-Language Models

ICCV 2025

2025

EvMic: Event-based Non-contact sound recovery from effective spatial-temporal modeling

arXiv 2025

2025

How Far are VLMs from Visual Spatial Intelligence? A Benchmark-Driven Perspective

arXiv 2025

2025

VFXMaster: Unlocking Dynamic Visual Effect Generation via In-Context Learning

arXiv 2025

2025

The Devil is in Temporal Token: High Quality Video Reasoning Segmentation

CVPR 2025 1

2025

OASIS: Open Agent Social Interaction Simulations with One Million Agents

arXiv 2024

2024

PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation

arXiv 2024

2024

Autoregressive Video Generation without Vector Quantization

arXiv 2024

2024

Unity is Strength: Unifying Convolutional and Transformeral Features for Better Person Re-Identification

arXiv 2024

2024

High-Precision Dichotomous Image Segmentation via Probing Diffusion Capacity

2024

Towards Real-Time Open-Vocabulary Video Instance Segmentation

arXiv 2024

2024

StableIdentity: Inserting Anybody into Anywhere at First Sight

arXiv 2024

2024

CharacterFactory: Sampling Consistent Characters with GANs for Diffusion Models

arXiv 2024

2024

Multi-view Aggregation Network for Dichotomous Image Segmentation

CVPR 2024 1

2024

DreamMix: Decoupling Object Attributes for Enhanced Editability in Customized Image Inpainting

arXiv 2024

2024

ReNeg: Learning Negative Embedding with Reward Guidance

CVPR 2025 1

2024

Multi-modal Instruction Tuned LLMs with Fine-grained Visual Perception

CVPR 2024 1

2024

Deep Boosting Learning: A Brand-new Cooperative Approach for Image-Text Matching

arXiv 2024

2024

GSSF: Generalized Structural Sparse Function for Deep Cross-modal Metric Learning

arXiv 2024

2024

SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning

arXiv 2024

2024

Open-Vocabulary Camouflaged Object Segmentation

arXiv 2023

2023

CiteTracker: Correlating Image and Text for Visual Tracking

ICCV 2023 1

2023

Exploring Lightweight Hierarchical Vision Transformers for Efficient Visual Tracking

ICCV 2023 1

2023

Tracking Anything in High Quality

arXiv 2023

2023

M²SNet: Multi-scale in Multi-scale Subtraction Network for Medical Image Segmentation

arXiv 2023

2023

MetaBEV: Solving Sensor Failures for BEV Detection and Map Segmentation

arXiv 2023

2023

UniRef++: Segment Every Reference Object in Spatial and Temporal Spaces

arXiv 2023

2023

Plug-and-Play Regulators for Image-Text Matching

arXiv 2023

2023

Towards Deeply Unified Depth-aware Panoptic Segmentation with Bi-directional Guidance Learning

ICCV 2023 1

2023

UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory

CVPR 2024 1

2023

Isomer: Isomerous Transformer for Zero-shot Video Object Segmentation

ICCV 2023 1

2023

HS-Diffusion: Semantic-Mixing Diffusion for Head Swapping

arXiv 2022

2022

Similarity Reasoning and Filtration for Image-Text Matching

arXiv 2021

2021

LightTrack: Finding Lightweight Neural Networks for Object Tracking via One-Shot Architecture Search

CVPR 2021 1

2021

Affiliations

No known affiliations.

Frequent co-authors

10, from 39 papers