Ming-Ming Cheng

Papers
43

Authored papers

43

Infinite-World: Scaling Interactive World Models to 1000-Frame Horizons via Pose-Free Hierarchical Memory

arXiv 2026

2026

Mixture of Style Experts for Diverse Image Stylization

arXiv 2026

2026

Towards Universal Video MLLMs with Attribute-Structured and Quality-Verified Instructions

arXiv 2026

2026

Mutual Forcing: Dual-Mode Self-Evolution for Fast Autoregressive Audio-Video Character Generation

arXiv 2026

2026

Continuous-Time Distribution Matching for Few-Step Diffusion Distillation

arXiv 2026

2026

Rethinking Token-Level Policy Optimization for Multimodal Chain-of-Thought

arXiv 2026

2026

DFormerv2: Geometry Self-Attention for RGBD Semantic Segmentation

CVPR 2025 1

2025

VisualCloze: A Universal Image Generation Framework via Visual In-Context Learning

ICCV 2025

2025

InterLCM: Low-Quality Images as Intermediate States of Latent Consistency Models for Effective Blind Face Restoration

arXiv 2025

2025

RPCANet++: Deep Interpretable Robust PCA for Sparse Object Segmentation

arXiv 2025

2025

Revisiting End-to-End Learning with Slide-level Supervision in Computational Pathology

arXiv 2025

2025

TempSamp-R1: Effective Temporal Sampling with Reinforcement Fine-Tuning for Video LLMs

arXiv 2025

2025

The Consistency Critic: Correcting Inconsistencies in Generated Images via Reference-Guided Attentive Alignment

arXiv 2025

2025

A Glimpse to Compress: Dynamic Visual Token Pruning for Large Vision-Language Models

arXiv 2025

2025

StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation

arXiv 2024

2024

ATPrompt: Textual Prompt Learning with Embedded Attributes

ICCV 2025

2024

Towards RAW Object Detection in Diverse Conditions

CVPR 2025 1

2024

DenseVLM: A Retrieval and Decoupled Alignment Framework for Open-Vocabulary Dense Prediction

ICCV 2025

2024

Multimodality Helps Few-Shot 3D Point Cloud Semantic Segmentation

arXiv 2024

2024

SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection

arXiv 2024

2024

Token Merging for Training-Free Semantic Binding in Text-to-Image Synthesis

arXiv 2024

2024

Cascade-CLIP: Cascaded Vision-Language Embeddings Alignment for Zero-Shot Semantic Segmentation

arXiv 2024

2024

YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection

arXiv 2023

2023

MDTv2: Masked Diffusion Transformer is a Strong Image Synthesizer

ICCV 2023 1

2023

Faster Diffusion: Rethinking the Role of the Encoder for Diffusion Model Inference

arXiv 2023

2023

SRFormerV2: Taking a Closer Look at Permuted Self-Attention for Image Super-Resolution

ICCV 2023 1

2023

Multi-Space Neural Radiance Fields

CVPR 2023 1

2023

Referring Camouflaged Object Detection

arXiv 2023

2023

StyleDiffusion: Prompt-Embedding Inversion for Text-Based Editing

arXiv 2023

2023

MaTe3D: Mask-guided Text-based 3D-aware Portrait Editing

arXiv 2023

2023

PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding

CVPR 2024 1

2023

Large Selective Kernel Network for Remote Sensing Object Detection

ICCV 2023 1

2023

Make Explicit Calibration Implicit: Calibrate Denoiser Instead of the Noise Model

ICCV 2023 1

2023

AMT: All-Pairs Multi-Field Transforms for Efficient Frame Interpolation

CVPR 2023 1

2023

CrossKD: Cross-Head Knowledge Distillation for Object Detection

CVPR 2024 1

2023

CorrMatch: Label Propagation via Correlation Matching for Semi-Supervised Semantic Segmentation

CVPR 2024 1

2023

Looking Through the Glass: Neural Surface Reconstruction Against High Specular Reflections

CVPR 2023 1

2023

Masked Autoencoders are Efficient Class Incremental Learners

ICCV 2023 1

2023

Co-Salient Object Detection with Co-Representation Purification

arXiv 2023

2023

Towards An End-to-End Framework for Flow-Guided Video Inpainting

CVPR 2022 1

2022

Visual Attention Network

arXiv 2022

2022

Deep Hough Transform for Semantic Line Detection

ECCV 2020 8

2020

Image Inpainting with Learnable Bidirectional Attention Maps

2019

Affiliations

No known affiliations.

Frequent co-authors

10

from 43 papers