Weidong Cai

Papers
17

Authored papers

17

DanQing: An Up-to-Date Large-Scale Chinese Vision-Language Pre-training Dataset

arXiv 2026

2026

RealSyn: An Effective and Scalable Multimodal Interleaved Document Transformation Paradigm

arXiv 2025

2025

Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs

arXiv 2025

2025

MIRROR: Multi-Modal Pathological Self-Supervised Representation Learning via Modality Alignment and Retention

arXiv 2025

2025

The Collapse of Patches

arXiv 2025

2025

UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning

arXiv 2025

2025

Controllable Contextualized Image Captioning: Directing the Visual Narrative through User-Defined Highlights

arXiv 2024

2024

RWKV-CLIP: A Robust Vision-Language Representation Learner

arXiv 2024

2024

Gotta Hear Them All: Sound Source Aware Vision to Audio Generation

arXiv 2024

2024

Multimodal Causal Reasoning Benchmark: Challenging Vision Large Language Models to Infer Causal Links Between Siamese Images

arXiv 2024

2024

V2A-Mapper: A Lightweight Solution for Vision-to-Audio Generation by Connecting Foundation Models

arXiv 2023

2023

CelebV-Text: A Large-Scale Facial Text-Video Dataset

CVPR 2023

2023

PaRot: Patch-Wise Rotation-Invariant Network via Feature Disentanglement and Pose Restoration

arXiv 2023

2023

Taxonomy Adaptive Cross-Domain Adaptation in Medical Imaging via Optimization Trajectory Distillation

ICCV 2023

2023

Spatiality-guided Transformer for 3D Dense Captioning on Point Clouds

arXiv 2022

2022

SQUID: Deep Feature In-Painting for Unsupervised Anomaly Detection

CVPR 2023

2021

Deep Clustering via Joint Convolutional Autoencoder Embedding and Relative Entropy Minimization

2017

Affiliations

No known affiliations.

Frequent co-authors

10

from 17 papers