Jindong Wang

Papers
31

Authored papers

31

AgentArk: Distilling Multi-Agent Intelligence into a Single LLM Agent

arXiv 2026

2026

TorchUMM: A Unified Multimodal Model Codebase for Evaluation, Analysis, and Post-training

arXiv 2026

2026

Masked Autoencoders Are Effective Tokenizers for Diffusion Models

arXiv 2025

2025

RewardAnything: Generalizable Principle-Following Reward Models

arXiv 2025

2025

UniGame: Turning a Unified Multimodal Model Into Its Own Adversary

arXiv 2025

2025

Prompt Candidates, then Distill: A Teacher-Student Framework for LLM-driven Data Annotation

arXiv 2025

2025

MELON: Provable Defense Against Indirect Prompt Injection Attacks in AI Agents

arXiv 2025

2025

HAROOD: A Benchmark for Out-of-distribution Generalization in Sensor-based Human Activity Recognition

arXiv 2025

2025

TrustLLM: Trustworthiness in Large Language Models

arXiv 2024

2024

AgentReview: Exploring Peer Review Dynamics with LLM Agents

arXiv 2024

2024

MM-Soc: Benchmarking Multimodal Large Language Models in Social Media Platforms

arXiv 2024

2024

SciEvo: A 2 Million, 30-Year Cross-disciplinary Dataset for Temporal Scientometric Analysis

arXiv 2024

2024

Diff-eRank: A Novel Rank-Based Metric for Evaluating Large Language Models

arXiv 2024

2024

FreeEval: A Modular Framework for Trustworthy and Efficient Evaluation of Large Language Models

arXiv 2024

2024

Reasoning Through Execution: Unifying Process and Outcome Rewards for Code Generation

arXiv 2024

2024

MentalArena: Self-play Training of Language Models for Diagnosis and Treatment of Mental Health Disorders

arXiv 2024

2024

NegativePrompt: Leveraging Psychology for Large Language Models Enhancement via Negative Emotional Stimuli

arXiv 2024

2024

Time Series Analysis for Education: Methods, Applications, and Future Directions

arXiv 2024

2024

Dynamic Evaluation of Large Language Models by Meta Probing Agents

arXiv 2024

2024

A Survey on Evaluation of Large Language Models

arXiv 2023

2023

Improving Generalization of Adversarial Training via Robust Critical Fine-Tuning

ICCV 2023

2023

PandaLM: An Automatic Evaluation Benchmark for LLM Instruction Tuning Optimization

arXiv 2023

2023

How Well Does GPT-4V(ision) Adapt to Distribution Shifts? A Preliminary Investigation

arXiv 2023

2023

Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks

arXiv 2023

2023

Supervised Knowledge Makes Large Language Models Better In-context Learners

arXiv 2023

2023

PromptBench: A Unified Library for Evaluation of Large Language Models

arXiv 2023

2023

Survey on Factuality in Large Language Models: Knowledge, Retrieval and Domain-Specificity

arXiv 2023

2023

Distilling Out-of-Distribution Robustness from Vision-Language Foundation Models

2023

GLUE-X: Evaluating Natural Language Understanding Models from an Out-of-distribution Generalization Perspective

arXiv 2022

2022

Memory-Guided Multi-View Multi-Domain Fake News Detection

arXiv 2022

2022

USB: A Unified Semi-supervised Learning Benchmark for Classification

arXiv 2022

2022

Affiliations

No known affiliations.

Frequent co-authors

10

from 31 papers