Patrick Schramowski
- Papers: 25
Authored papers
Judging Quality Across Languages: A Multilingual Approach to Pretraining Data Filtering with Language Models
arXiv 2025
MSTS: A Multimodal Safety Test Suite for Vision-Language Models
arXiv 2025
Introducing v0.5 of the AI Safety Benchmark from MLCommons
arXiv 2024
ALERT: A Comprehensive Benchmark for Assessing Large Language Models' Safety through Red Teaming
arXiv 2024
T-FREE: Subword Tokenizer-Free Generative LLMs via Sparse Representations for Memory-Efficient Embeddings
arXiv 2024
SCAR: Sparse Conditioned Autoencoders for Concept Detection and Steering in LLMs
arXiv 2024
Multilingual Text-to-Image Generation Magnifies Gender Stereotypes and Prompt Engineering May Not Help You
arXiv 2024
LlavaGuard: An Open VLM-based Framework for Safeguarding Vision Datasets and Models
arXiv 2024
AtMan: Understanding Transformer Predictions Through Memory Efficient Attention Manipulation
NeurIPS 2023
Class Attribute Inference Attacks: Inferring Sensitive Class Information by Diffusion-Based Attribute Manipulations
arXiv 2023
Fair Diffusion: Instructing Text-to-Image Generation Models on Fairness
arXiv 2023
Safe Latent Diffusion: Mitigating Inappropriate Degeneration in Diffusion Models
CVPR 2023
Exploiting Cultural Biases via Homoglyphs in Text-to-Image Synthesis
arXiv 2022
A Typology for Exploring the Mitigation of Shortcut Behavior
arXiv 2022
ILLUME: Rationalizing Vision-Language Models through Human Interactions
arXiv 2022
Revision Transformers: Instructing Language Models to Change their Values
arXiv 2022
Speaking Multiple Languages Affects the Moral Bias of Language Models
arXiv 2022
Can Machines Help Us Answering Question 16 in Datasheets, and In Turn Reflecting on Inappropriate Content?
arXiv 2022
Does CLIP Know My Face?
arXiv 2022
Inferring Offensiveness In Images From Natural Language Supervision
Adaptive Rational Activations to Boost Deep Reinforcement Learning
arXiv 2021
Large Pre-trained Language Models Contain Human-like Biases of What is Right and Wrong to Do
arXiv 2021
Interactively Providing Explanations for Transformer Language Models
Making deep neural networks right for the right scientific reasons by interacting with their explanations
arXiv 2020
Padé Activation Units: End-to-end Learning of Flexible Activation Functions in Deep Networks
ICLR 2020
Frequent co-authors
10 from 25 papers
Kristian Kersting
Felix Friedrich
Manuel Brack
Björn Deiseroth
Dominik Hintersdorf
Lukas Struppek
Wolfgang Stammer
Alejandro Molina
Alexander Fraser
Alicia Parrish