Jun Wang
- Papers: 55
Authored papers (55)
Memento-Skills: Let Agents Design Agents
arXiv 2026
CASCADE: Case-Based Continual Adaptation for Large Language Models During Deployment
arXiv 2026
From Skills to Talent: Organising Heterogeneous Agents as a Real-World Company
arXiv 2026
DanQing: An Up-to-Date Large-Scale Chinese Vision-Language Pre-training Dataset
arXiv 2026
DeepImageSearch: Benchmarking Multimodal Agents for Context-Aware Image Retrieval in Visual Histories
arXiv 2026
TIDE: Efficient and Lossless MoE Diffusion LLM Inference with I/O-aware Expert Offload
arXiv 2026
Web2BigTable: A Bi-Level Multi-Agent LLM System for Internet-Scale Information Search and Extraction
arXiv 2026
π-StepNFT: Wider Space Needs Finer Steps in Online RL for Flow-based VLAs
arXiv 2026
UniDoc-RL: Coarse-to-Fine Visual RAG with Hierarchical Actions and Dense Rewards
arXiv 2026
PhotoBench: Beyond Visual Matching Towards Personalized Intent-Driven Photo Retrieval
arXiv 2026
TaskCraft: Automated Generation of Agentic Tasks
arXiv 2025
MARFT: Multi-Agent Reinforcement Fine-Tuning
arXiv 2025
Decoupled Global-Local Alignment for Improving Compositional Understanding
arXiv 2025
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
arXiv 2025
Memento: Fine-tuning LLM Agents without Fine-tuning LLMs
arXiv 2025
Efficient Agents: Building Effective Agents While Reducing Cost
arXiv 2025
ReMA: Learning to Meta-think for LLMs with Multi-Agent Reinforcement Learning
arXiv 2025
Risk-aware Direct Preference Optimization under Nested Risk Measure
arXiv 2025
Ark: An Open-source Python-based Framework for Robot Learning
arXiv 2025
ColorAgent: Building A Robust, Personalized, and Interactive OS Agent
arXiv 2025
OThink-R1: Intrinsic Fast/Slow Thinking Mode Switching for Over-Reasoning Mitigation
arXiv 2025
CurES: From Gradient Analysis to Efficient Curriculum Learning for Reasoning LLMs
arXiv 2025
Human-like Episodic Memory for Infinite Context LLMs
arXiv 2024
OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models
arXiv 2024
SPA-Bench: A Comprehensive Benchmark for SmartPhone Agent Evaluation
arXiv 2024
Direct Preference Optimization Using Sparse Feature-Level Constraints
arXiv 2024
E2LLM: Encoder Elongated Large Language Models for Long-Context Understanding and Reasoning
arXiv 2024
Hammer: Robust Function-Calling for On-Device Language Models via Function Masking
arXiv 2024
Natural Language Reinforcement Learning
arXiv 2024
Elucidating the design space of language models for image generation
arXiv 2024
CNNSum: Exploring Long-Context Summarization with Large Language Models in Chinese Novels
arXiv 2024
Deep Learning for Multivariate Time Series Imputation: A Survey
arXiv 2024
Token-level Direct Preference Optimization
arXiv 2024
D2LLM: Decomposed and Distilled Large Language Models for Semantic Search
arXiv 2024
ProVision: Programmatically Scaling Vision-centric Instruction Data for Multimodal Language Models
arXiv 2024
Circuit Transformer: A Transformer That Preserves Logical Equivalence
arXiv 2024
CMAT: A Multi-Agent Collaboration Tuning Framework for Enhancing Small Language Models
arXiv 2024
Large Language Models Are Neurosymbolic Reasoners
arXiv 2024
Nyonic Technical Report
arXiv 2024
Synthesizing Realistic Data for Table Recognition
arXiv 2024
HammerBench: Fine-Grained Function-Calling Evaluation in Real Mobile Device Scenarios
arXiv 2024
User Behavior Simulation with Large Language Model based Agents
arXiv 2023
How Do Large Language Models Capture the Ever-changing World Knowledge? A Review of Recent Advances
arXiv 2023
ChessGPT: Bridging Policy Learning and Language Modeling
Align and Attend: Multimodal Summarization with Dual Contrastive Losses
CVPR 2023
Alphazero-like Tree-Search can Guide Large Language Model Decoding and Training
arXiv 2023
Large Language Models Play StarCraft II: Benchmarks and A Chain of Summarization Approach
arXiv 2023
A Review of Safe Reinforcement Learning: Methods, Theory and Applications
arXiv 2022
On Realization of Intelligent Decision-Making in the Real World: A Foundation Decision Model Perspective
arXiv 2022
FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis
arXiv 2022
M3DeTR: Multi-representation, Multi-scale, Mutual-relation 3D Object Detection with Transformers
arXiv 2021
Real-Time Bidding by Reinforcement Learning in Display Advertising
arXiv 2017
Long Text Generation via Adversarial Training with Leaked Information
arXiv 2017
Efficient Architecture Search by Network Transformation
arXiv 2017
Activation Maximization Generative Adversarial Nets
Frequent co-authors
10 from 55 papers