Dinesh Manocha

Papers
26

Attribution

Affiliations & profile
Semantic Scholar

Authored papers

26

Do Audio-Visual Large Language Models Really See and Hear?

arXiv 2026

2026

EgoAVU: Egocentric Audio-Visual Understanding

arXiv 2026

2026

Co-Evolving LLM Decision and Skill Bank Agents for Long-Horizon Tasks

arXiv 2026

2026

Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities

arXiv 2025

2025

VideoHallu: Evaluating and Mitigating Multi-modal Hallucinations on Synthetic Video Understanding

arXiv 2025

2025

MMAU: A Massive Multi-Task Audio Understanding and Reasoning Benchmark

arXiv 2024

2024

Immune: Improving Safety Against Jailbreaks in Multi-modal LLMs via Inference-Time Alignment

CVPR 2025 1

2024

ReCLAP: Improving Zero Shot Audio Classification by Describing Sounds

arXiv 2024

2024

Visual Description Grounding Reduces Hallucinations and Boosts Reasoning in LVLMs

arXiv 2024

2024

HALLUCINOGEN: A Benchmark for Evaluating Object Hallucination in Large Visual-Language Models

arXiv 2024

2024

CoDa: Constrained Generation based Data Augmentation for Low-Resource NLP

arXiv 2024

2024

AutoHallusion: Automatic Generation of Hallucination Benchmarks for Vision-Language Models

arXiv 2024

2024

Synthio: Augmenting Small-Scale Audio Classification Datasets with Synthetic Data

arXiv 2024

2024

Prompt Mixing in Diffusion Models using the Black Scholes Algorithm

arXiv 2024

2024

Can an Embodied Agent Find Your "Cat-shaped Mug"? LLM-Guided Exploration for Zero-Shot Object Navigation

arXiv 2023

2023

CrossLoc3D: Aerial-Ground Cross-Source 3D Place Recognition

ICCV 2023 1

2023

iPLAN: Intent-Aware Planning in Heterogeneous Traffic via Distributed Multi-Agent Reinforcement Learning

arXiv 2023

2023

HallusionBench: An Advanced Diagnostic Suite for Entangled Language Hallucination and Visual Illusion in Large Vision-Language Models

CVPR 2024 1

2023

UNFUSED: UNsupervised Finetuning Using SElf supervised Distillation

arXiv 2023

2023

DALE: Generative Data Augmentation for Low-Resource Legal NLP

arXiv 2023

2023

ASPIRE: Language-Guided Data Augmentation for Improving Robustness Against Spurious Correlations

arXiv 2023

2023

FAR: Fourier Aerial Video Recognition

arXiv 2022

2022

GANav: Efficient Terrain Segmentation for Robot Navigation in Unstructured Outdoor Environments

arXiv 2021

2021

FAST-RIR: Fast neural diffuse room impulse response generator

arXiv 2021

2021

M3DeTR: Multi-representation, Multi-scale, Mutual-relation 3D Object Detection with Transformers

arXiv 2021

2021

MotionHint: Self-Supervised Monocular Visual Odometry with Motion Constraints

arXiv 2021

2021

Affiliations

No known affiliations.

Frequent co-authors

10

from 26 papers