Bhavya Kailkhura
- Papers: 16
Authored papers
The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text
arXiv 2025
Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence
arXiv 2025
TruthPrInt: Mitigating LVLM Object Hallucination Via Latent Truthful-Guided Pre-Intervention
arXiv 2025
Layer-Level Self-Exposure and Patch: Affirmative Token Mitigation for Jailbreak Attack Defense
arXiv 2025
EAIRA: Establishing a Methodology for Evaluating AI Models as Scientific Research Assistants
arXiv 2025
A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment
arXiv 2025
TrustLLM: Trustworthiness in Large Language Models
arXiv 2024
Introducing v0.5 of the AI Safety Benchmark from MLCommons
arXiv 2024
Transformers Can Do Arithmetic with the Right Embeddings
arXiv 2024
GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations
arXiv 2024
SOUL: Unlocking the Power of Second-Order Optimization for LLM Unlearning
arXiv 2024
Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression
arXiv 2024
NEFTune: Noisy Embeddings Improve Instruction Finetuning
arXiv 2023
DeepZero: Scaling up Zeroth-Order Optimization for Deep Model Training
arXiv 2023
Shifting Attention to Relevance: Towards the Predictive Uncertainty Quantification of Free-Form Large Language Models
arXiv 2023
Adversarial Mutual Information for Text Generation
ICML 2020 1
Affiliations
Frequent co-authors
10 from 16 papers