Mayank Mishra
- Papers: 14
Authored papers
PaTH Attention: Position Encoding via Accumulating Householder Transformations
arXiv 2025
Ladder-residual: parallelism-aware architecture for accelerating large model inference with communication overlapping
arXiv 2025
DeiT-LT: Distillation Strikes Back for Vision Transformer Training on Long-Tailed Datasets
arXiv 2024
The infrastructure powering IBM's Gen AI model development
arXiv 2024
Mitigating the Impact of Outlier Channels for Language Model Quantization with Activation Regularization
arXiv 2024
Granite Code Models: A Family of Open Foundation Models for Code Intelligence
arXiv 2024
SantaCoder: don't reach for the stars!
arXiv 2023
Prompting with Pseudo-Code Instructions
arXiv 2023
A Closer Look at Smoothness in Domain Adversarial Training
Escaping Saddle Points for Effective Generalization on Class-Imbalanced Data
arXiv 2022
Joint Reasoning on Hybrid-knowledge sources for Task-Oriented Dialog
arXiv 2022
Variational Learning for Unsupervised Knowledge Grounded Dialogs
Adversarial Approximate Inference for Speech to Electroglottograph Conversion
arXiv 2019
Variational Inference with Latent Space Quantization for Adversarial Resilience
arXiv 2019
Frequent co-authors
10 from 14 papers