YaoWei Wang

Papers
26

Attribution

Affiliations & profile
Semantic Scholar

Authored papers

26

VideoVista-CulturalLingo: 360$^\circ$ Horizons-Bridging Cultures, Languages, and Domains in Video Comprehension

arXiv 2025

2025

A Unified Agentic Framework for Evaluating Conditional Image Generation

arXiv 2025

2025

DSPNet: Dual-vision Scene Perception for Robust 3D Question Answering

CVPR 2025 1

2025

PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models

arXiv 2025

2025

Benchmarking Post-Training Quantization in LLMs: Comprehensive Taxonomy, Unified Evaluation, and Comparative Analysis

arXiv 2025

2025

Picking the Cream of the Crop: Visual-Centric Data Selection with Collaborative Agents

arXiv 2025

2025

HLFormer: Enhancing Partially Relevant Video Retrieval with Hyperbolic Learning

ICCV 2025

2025

VMamba: Visual State Space Model

arXiv 2024

2024

vHeat: Building Vision Models upon Heat Conduction

arXiv 2024

2024

OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion

arXiv 2024

2024

Towards Seamless Adaptation of Pre-trained Models for Visual Place Recognition

arXiv 2024

2024

CricaVPR: Cross-image Correlation-aware Representation Learning for Visual Place Recognition

arXiv 2024

2024

HiVG: Hierarchical Multimodal Fine-grained Modulation for Visual Grounding

arXiv 2024

2024

Towards Visual Grounding: A Survey

arXiv 2024

2024

Tracking Meets LoRA: Faster Training, Larger Model, Stronger Performance

arXiv 2024

2024

OneRef: Unified One-tower Expression Grounding and Segmentation with Mask Referring Modeling

arXiv 2024

2024

M$^3$GPT: An Advanced Multimodal, Multitask Framework for Motion Comprehension and Generation

arXiv 2024

2024

Towards Robust and Efficient Cloud-Edge Elastic Model Adaptation via Selective Entropy Distillation

arXiv 2024

2024

CLIP-VG: Self-paced Curriculum Adapting of CLIP for Visual Grounding

arXiv 2023

2023

Strip-MLP: Efficient Token Interaction for Vision MLP

ICCV 2023 1

2023

CiteTracker: Correlating Image and Text for Visual Tracking

ICCV 2023 1

2023

MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning

arXiv 2023

2023

ZeroNLG: Aligning and Autoencoding Domains for Zero-Shot Multimodal and Multilingual Natural Language Generation

arXiv 2023

2023

HARDVS: Revisiting Human Activity Recognition with Dynamic Vision Sensors

arXiv 2022

2022

Unlearnable Clusters: Towards Label-agnostic Unlearnable Examples

CVPR 2023 1

2022

Fast-iTPN: Integrally Pre-Trained Transformer Pyramid Network with Token Migration

CVPR 2023 1

2022

Affiliations

No known affiliations.

Frequent co-authors

10

from 26 papers