Yue Huang
- Papers: 22
Authored papers (22)
Visual Aesthetic Benchmark: Can Frontier Models Judge Beauty?
arXiv 2026
Emergent Social Intelligence Risks in Generative Multi-Agent Systems
arXiv 2026
Digital Twin AI: Opportunities and Challenges from Large Language Models to World Models
arXiv 2026
Prompt-Activation Duality: Improving Activation Steering via Attention-Level Interventions
arXiv 2026
On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective
arXiv 2025
EfficientLLM: Efficiency in Large Language Models
arXiv 2025
DynamicVerse: A Physically-Aware Multimodal Framework for 4D World Modeling
arXiv 2025
Building a Foundational Guardrail for General Agentic Systems via Synthetic Data
arXiv 2025
Preference Leakage: A Contamination Problem in LLM-as-a-judge
arXiv 2025
Generative AI for Autonomous Driving: Frontiers and Opportunities
arXiv 2025
Interleaved Scene Graphs for Interleaved Text-and-Image Generation Assessment
arXiv 2024
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
arXiv 2024
GUI-World: A Video Benchmark and Dataset for Multimodal GUI-oriented Understanding
arXiv 2024
UniGen: A Unified Framework for Textual Dataset Generation Using Large Language Models
arXiv 2024
HonestLLM: Toward an Honest and Helpful Large Language Model
arXiv 2024
TrustLLM: Trustworthiness in Large Language Models
arXiv 2024
LLM-as-a-Coauthor: Can Mixed Human-Written and Machine-Generated Text Be Detected?
arXiv 2024
AutoBench-V: Can Large Vision-Language Models Benchmark Themselves?
arXiv 2024
AlignBench: Benchmarking Chinese Alignment of Large Language Models
arXiv 2023
DeID-GPT: Zero-shot Medical Text De-Identification by GPT-4
arXiv 2023
CSPRD: A Financial Policy Retrieval Dataset for Chinese Stock Market
arXiv 2023
MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use
arXiv 2023
Frequent co-authors
10 from 22 papers