Michael Backes
- Papers: 11
Authored papers (11)
On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective
arXiv 2025
TrustLLM: Trustworthiness in Large Language Models
arXiv 2024
Are We in the AI-Generated Text World Already? Quantifying and Monitoring AIGT on Social Media
arXiv 2024
Memorization in Self-Supervised Learning Improves Downstream Generalization
arXiv 2024
ModSCAN: Measuring Stereotypical Bias in Large Vision-Language Models from Vision and Language Modalities
arXiv 2024
MGTBench: Benchmarking Machine-Generated Text Detection
arXiv 2023
"Do Anything Now": Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models
arXiv 2023
Prompt Stealing Attacks Against Text-to-Image Generation Models
arXiv 2023
Unsafe Diffusion: On the Generation of Unsafe Images and Hateful Memes From Text-To-Image Models
arXiv 2023
Generated Graph Detection
arXiv 2023
Data Poisoning Attacks Against Multimodal Encoders
arXiv 2022