Pingchuan Ma

Papers: 12

Attribution

Affiliations & profile: Semantic Scholar

Authored papers

Omni-AVSR: Towards Unified Multimodal Speech Recognition with Large Language Models

arXiv 2025

2025

Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens

arXiv 2025

2025

DepthFM: Fast Monocular Depth Estimation with Flow Matching

arXiv 2024

2024

LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery

arXiv 2024

2024

ZigMa: A DiT-style Zigzag Mamba Diffusion Model

arXiv 2024

2024

ROICtrl: Boosting Instance Control for Visual Generation

CVPR 2025 1

2024

Diffusion Models and Representation Learning: A Survey

arXiv 2024

2024

Large Language Models are Strong Audio-Visual Speech Recognition Learners

arXiv 2024

2024

Does VLM Classification Benefit from LLM Description Semantics?

arXiv 2024

2024

Auto-AVSR: Audio-Visual Speech Recognition with Automatic Labels

arXiv 2023

2023

Boosting Latent Diffusion with Flow Matching

arXiv 2023

2023

Visual Speech Recognition for Multiple Languages in the Wild

arXiv 2022

2022

Affiliations

No known affiliations.

Frequent co-authors

from 12 papers

Björn Ommer

Vincent Tao Hu

Maja Pantic

Ming Gui

Stavros Petridis

Johannes Schusterbauer

Dmytro Kotovenko

Honglie Chen

Olga Grebenkova

Stefan Andreas Baumann

2 shared papers