Attribution
Reusing Embeddings: Reproducible Reward Model Research in Large Language Model Alignment without GPUs
arXiv 2025
Trustworthy LLMs: a Survey and Guideline for Evaluating Large Language Models' Alignment
arXiv 2023
from 2 papers
Hang Li
Hao Cheng
Hao Sun
Mihaela van der Schaar
Muhammad Faaiz Taufiq
Ruocheng Guo
Xiaoying Zhang
Yang Liu
Yegor Klochkov
Yuanshun Yao