Weiming Zhang

Papers
16

Authored papers

16

M3-AGIQA: Multimodal, Multi-Round, Multi-Aspect AI-Generated Image Quality Assessment

arXiv 2025

2025

LAMIC: Layout-Aware Multi-Image Composition via Scalability of Multimodal Diffusion Transformer

arXiv 2025

2025

OpenSubject: Leveraging Video-Derived Identity and Diversity Priors for Subject-driven Image Generation and Manipulation

arXiv 2025

2025

ColorAgent: Building A Robust, Personalized, and Interactive OS Agent

arXiv 2025

2025

De-AntiFake: Rethinking the Protective Perturbations Against Voice Cloning Attacks

arXiv 2025

2025

ScaleCap: Inference-Time Scalable Image Captioning via Dual-Modality Debiasing

arXiv 2025

2025

EraseAnything: Enabling Concept Erasure in Rectified Flow Transformers

arXiv 2024

2024

OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation

CVPR 2024

2023

Diversity-Aware Meta Visual Prompting

CVPR 2023

2023

Improving Adversarial Robustness of Masked Autoencoders via Test-time Frequency-domain Prompting

ICCV 2023

2023

CodeApex: A Bilingual Programming Evaluation Benchmark for Large Language Models

arXiv 2023

2023

Watermarking Text Generated by Black-Box Language Models

arXiv 2023

2023

CLIP Itself is a Strong Fine-tuner: Achieving 85.7% and 88.0% Top-1 Accuracy with ViT-B and ViT-L on ImageNet

arXiv 2022

2022

HairCLIP: Design Your Hair by Text and Reference Image

CVPR 2022

2021

CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows

2021

DeepFaceLab: Integrated, flexible and extensible face-swapping framework

arXiv 2020

2020

Affiliations

No known affiliations.

Frequent co-authors

10

from 16 papers