Soheil Feizi

Schoenfeld's Anatomy of Mathematical Reasoning by Language Models

arXiv 2025

Almost AI, Almost Human: The Challenge of Detecting AI-Polished Writing

arXiv 2025

DyePack: Provably Flagging Test Set Contamination in LLMs Using Backdoors

arXiv 2025

Gaming Tool Preferences in Agentic LLMs

arXiv 2025

Loki: Low-rank Keys for Efficient Sparse Attention

arXiv 2024

RESTOR: Knowledge Recovery through Machine Unlearning

arXiv 2024

Understanding and Mitigating Compositional Issues in Text-to-Image Generative Models

arXiv 2024

Fast Adversarial Attacks on Language Models In One GPU Minute

arXiv 2024

What do we learn from inverting CLIP models?

arXiv 2024

Certifying LLM Safety against Adversarial Prompting

arXiv 2023

Can AI-Generated Text be Reliably Detected?

arXiv 2023

CUDA: Convolution-based Unlearnable Datasets

CVPR 2023

DRSM: De-Randomized Smoothing on Malware Classifier Providing Certified Robustness

arXiv 2023

Exploring Geometry of Blind Spots in Vision Models

Robustness of AI-Image Detectors: Fundamental Limits and Practical Attacks

arXiv 2023

Text-To-Concept (and Back) via Cross-Model Alignment

arXiv 2023

Run-Off Election: Improved Provable Defense against Data Poisoning Attacks

arXiv 2023