Nan Duan
- Papers (43)
Authored papers (43)
Flash-GRPO: Efficient Alignment for Video Diffusion via One-Step Policy Optimization
arXiv 2026
Astrolabe: Steering Forward-Process Reinforcement Learning for Distilled Autoregressive Video Models
arXiv 2026
OpenSpatial: A Principled Data Engine for Empowering Spatial Intelligence
arXiv 2026
Awaking Spatial Intelligence in Unified Multimodal Understanding and Generation
arXiv 2026
SpatialEdit: Benchmarking Fine-Grained Image Spatial Editing
arXiv 2026
EasyVideoR1: Easier RL for Video Understanding
arXiv 2026
OmniForcing: Unleashing Real-time Joint Audio-Visual Generation
arXiv 2026
Step-Video-TI2V Technical Report: A State-of-the-Art Text-Driven Image-to-Video Generation Model
arXiv 2025
Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model
arXiv 2025
Rho-1: Not All Tokens Are What You Need
arXiv 2024
Alchemy: Amplifying Theorem-Proving Capability through Symbolic Mutation
arXiv 2024
LVLM-Interpret: An Interpretability Tool for Large Vision-Language Models
arXiv 2024
ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving
arXiv 2023
LongCoder: A Long-Range Pre-trained Language Model for Code Completion
arXiv 2023
CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing
arXiv 2023
LayoutNUWA: Revealing the Hidden Layout Expertise of Large Language Models
arXiv 2023
Allies: Prompting Large Language Model with Beam Search
arXiv 2023
PPTC Benchmark: Evaluating Large Language Models for PowerPoint Task Completion
arXiv 2023
GameEval: Evaluating LLMs on Conversational Games
arXiv 2023
ORES: Open-vocabulary Responsible Visual Synthesis
arXiv 2023
AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotators
arXiv 2023
Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data
arXiv 2023
Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models
arXiv 2023
Low-code LLM: Graphical User Interface over Large Language Models
arXiv 2023
AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models
arXiv 2023
CMMLU: Measuring massive multitask language understanding in Chinese
arXiv 2023
Enhancing Chain-of-Thoughts Prompting with Iterative Bootstrapping in Large Language Models
arXiv 2023
ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning
arXiv 2023
Constructing Multilingual Code Search Dataset Using Neural Machine Translation
arXiv 2023
GENIUS: Sketch-based Language Model Pre-training via Extreme and Selective Masking for Text Generation and Augmentation
arXiv 2022
NUWA-Infinity: Autoregressive over Autoregressive Generation for Infinite Visual Synthesis
arXiv 2022
BridgeTower: Building Bridges Between Encoders in Vision-Language Representation Learning
arXiv 2022
ReACC: A Retrieval-Augmented Code Completion Framework
ACL 2022 5
Unsupervised Context Aware Sentence Representation Pretraining for Multi-lingual Dense Retrieval
arXiv 2022
LaPraDoR: Unsupervised Pretrained Dense Retriever for Zero-Shot Text Retrieval
Findings (ACL) 2022 5
CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation
arXiv 2021
CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval
arXiv 2021
EL-Attention: Memory Efficient Lossless Attention for Generation
arXiv 2021
CoSQA: 20,000+ Web Queries for Code Search and Question Answering
ACL 2021 5
Adversarial Retriever-Ranker for dense text retrieval
WhiteningBERT: An Easy Unsupervised Sentence Embedding Approach
Findings (EMNLP) 2021 11
AR-LSAT: Investigating Analytical Reasoning of Text
arXiv 2021
UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation
arXiv 2020
Frequent co-authors
10 from 43 papers