Víctor Gallego
- Papers: 12
Authored papers (12)
Metal-Sci: A Scientific Compute Benchmark for Evolutionary LLM Kernel Search on Apple Silicon
arXiv 2026
Beyond Scalar Rewards: Dense Feedback for LLM Policy Synthesis in Sequential Social Dilemmas
arXiv 2026
Distilling Feedback into Memory-as-a-Tool
arXiv 2026
Specification Self-Correction: Mitigating In-Context Reward Hacking Through Test-Time Refinement
arXiv 2025
MetaSC: Test-Time Safety Specification Optimization for Language Models
arXiv 2025
Configurable Preference Tuning with Rubric-Guided Synthetic Data
arXiv 2025
Refined Direct Preference Optimization with Synthetic Data for Behavioral Alignment of LLMs
arXiv 2024
Configurable Safety Tuning of Language Models with Synthetic Preference Data
arXiv 2024
Merging Improves Self-Critique Against Jailbreak Attacks
arXiv 2024
Distilled Self-Critique of LLMs with Synthetic Data: a Bayesian Perspective
arXiv 2023
ZYN: Zero-Shot Reward Models with Yes-No Questions for RLAIF
arXiv 2023
Personalizing Text-to-Image Generation via Aesthetic Gradients
arXiv 2022