Furong Huang
- Papers: 35
Authored papers
TSRBench: A Comprehensive Multi-task Multi-modal Time Series Reasoning Benchmark for Generalist Models
arXiv 2026
Parallel-Probe: Towards Efficient Parallel Thinking via 2D Probing
arXiv 2026
LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model
arXiv 2025
On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective
arXiv 2025
Zero-Shot Vision Encoder Grafting via LLM Surrogates
ICCV 2025
Zebra-CoT: A Dataset for Interleaved Vision Language Reasoning
arXiv 2025
MORSE-500: A Programmatically Controllable Video Benchmark to Stress-Test Multimodal Reasoning
arXiv 2025
ViCrit: A Verifiable Reinforcement Learning Proxy Task for Visual Perception in VLMs
arXiv 2025
SoTA with Less: MCTS-Guided Sample Selection for Data-Efficient Visual Reasoning Self-Improvement
arXiv 2025
PropensityBench: Evaluating Latent Safety Risks in Large Language Models via an Agentic Approach
arXiv 2025
TrustLLM: Trustworthiness in Large Language Models
arXiv 2024
GenARM: Reward Guided Generation with Autoregressive Reward Model for Test-time Alignment
arXiv 2024
Immune: Improving Safety Against Jailbreaks in Multi-modal LLMs via Inference-Time Alignment
CVPR 2025
ACE: Off-Policy Actor-Critic with Causality-Aware Entropy Regularization
arXiv 2024
Mementos: A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences
arXiv 2024
Benchmarking Vision Language Model Unlearning via Fictitious Facial Identity Dataset
arXiv 2024
Is poisoning a real threat to LLM alignment? Maybe more so than you think
arXiv 2024
PRISE: LLM-Style Sequence Compression for Learning Temporal Action Abstractions in Control
arXiv 2024
AutoHallusion: Automatic Generation of Hallucination Benchmarks for Vision-Language Models
arXiv 2024
Premier-TACO is a Few-Shot Policy Learner: Pretraining Multitask Representation via Temporal Action-Driven Contrastive Loss
arXiv 2024
Beyond Worst-case Attacks: Robust RL with Adaptive Defense via Non-dominated Policies
arXiv 2024
Automatic Pseudo-Harmful Prompt Generation for Evaluating False Refusals in Large Language Models
arXiv 2024
WAVES: Benchmarking the Robustness of Image Watermarks
arXiv 2024
Scaling Inference-Time Search with Vision Value Model for Improved Visual Comprehension
ICCV 2025
FACT or Fiction: Can Truthful Mechanisms Eliminate Federated Free Riding?
arXiv 2024
TACO: Temporal Latent Action-Driven Contrastive Loss for Visual Reinforcement Learning
arXiv 2023
Explore Spurious Correlations at the Concept Level in Language Models for Text Classification
arXiv 2023
Towards Realistic Mechanisms That Incentivize Federated Participation and Contribution
arXiv 2023
Decodable and Sample Invariant Continuous Object Encoder
arXiv 2023
HallusionBench: An Advanced Diagnostic Suite for Entangled Language Hallucination and Visual Illusion in Large Vision-Language Models
CVPR 2024
PerceptionCLIP: Visual Classification by Inferring and Conditioning on Contexts
arXiv 2023
DrM: Mastering Visual Reinforcement Learning through Dormant Ratio Minimization
arXiv 2023
Cold Diffusion: Inverting Arbitrary Image Transforms Without Noise
Live in the Moment: Learning Dynamics Model Adapted to Evolving Policy
arXiv 2022
Datasets for Studying Generalization from Easy to Hard Examples
arXiv 2021
Frequent co-authors
10 from 35 papers