Bhiksha Raj
- Papers: 19
Authored papers
Masked Autoencoders Are Effective Tokenizers for Diffusion Models
arXiv 2025
Robust Latent Matters: Boosting Image Generation with Sampling Error
arXiv 2025
Image Tokenizer Needs Post-Training
arXiv 2025
Mellow: a small audio language model for reasoning
arXiv 2025
ControlVAR: Exploring Controllable Visual Autoregressive Modeling
arXiv 2024
Audio Entailment: Assessing Deductive Reasoning for Audio Understanding
arXiv 2024
$\text{R}^2$-Bench: Benchmarking the Robustness of Referring Perception Models under Perturbations
arXiv 2024
Speech vs. Transcript: Does It Matter for Human Annotators in Speech Summarization?
arXiv 2024
uDistil-Whisper: Label-Free Data Filtering for Knowledge Distillation in Low-Data Regimes
arXiv 2024
Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks
arXiv 2023
Token Prediction as Implicit Classification to Identify LLM-Generated Text
arXiv 2023
FREDOM: Fairness Domain Adaptation Approach to Semantic Scene Understanding
CVPR 2023
Training Audio Captioning Models without Audio
arXiv 2023
GPT-Sentinel: Distinguishing Human and ChatGPT Generated Content
arXiv 2023
HEAR: Holistic Evaluation of Audio Representations
arXiv 2022
How many perturbations break this model? Evaluating robustness beyond adversarial accuracy
arXiv 2022
USB: A Unified Semi-supervised Learning Benchmark for Classification
arXiv 2022
Towards Robust Referring Video Object Segmentation with Cyclic Relational Consensus
arXiv 2022
VLTinT: Visual-Linguistic Transformer-in-Transformer for Coherent Video Paragraph Captioning
arXiv 2022
Frequent co-authors
10 from 19 papers