Nan Duan


Authored papers

43

Flash-GRPO: Efficient Alignment for Video Diffusion via One-Step Policy Optimization

arXiv 2026

2026

Astrolabe: Steering Forward-Process Reinforcement Learning for Distilled Autoregressive Video Models

arXiv 2026

2026

OpenSpatial: A Principled Data Engine for Empowering Spatial Intelligence

arXiv 2026

2026

Awaking Spatial Intelligence in Unified Multimodal Understanding and Generation

arXiv 2026

2026

SpatialEdit: Benchmarking Fine-Grained Image Spatial Editing

arXiv 2026

2026

EasyVideoR1: Easier RL for Video Understanding

arXiv 2026

2026

OmniForcing: Unleashing Real-time Joint Audio-Visual Generation

arXiv 2026

2026

Step-Video-TI2V Technical Report: A State-of-the-Art Text-Driven Image-to-Video Generation Model

arXiv 2025

2025

Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model

arXiv 2025

2025

Rho-1: Not All Tokens Are What You Need

arXiv 2024

2024

Alchemy: Amplifying Theorem-Proving Capability through Symbolic Mutation

arXiv 2024

2024

LVLM-Interpret: An Interpretability Tool for Large Vision-Language Models

arXiv 2024

2024

ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving

arXiv 2023

2023

LongCoder: A Long-Range Pre-trained Language Model for Code Completion

arXiv 2023

2023

CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing

arXiv 2023

2023

LayoutNUWA: Revealing the Hidden Layout Expertise of Large Language Models

arXiv 2023

2023

Allies: Prompting Large Language Model with Beam Search

arXiv 2023

2023

PPTC Benchmark: Evaluating Large Language Models for PowerPoint Task Completion

arXiv 2023

2023

GameEval: Evaluating LLMs on Conversational Games

arXiv 2023

2023

ORES: Open-vocabulary Responsible Visual Synthesis

arXiv 2023

2023

AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotators

arXiv 2023

2023

Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data

arXiv 2023

2023

Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models

arXiv 2023

2023

Low-code LLM: Graphical User Interface over Large Language Models

arXiv 2023

2023

AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models

arXiv 2023

2023

CMMLU: Measuring massive multitask language understanding in Chinese

arXiv 2023

2023

Enhancing Chain-of-Thoughts Prompting with Iterative Bootstrapping in Large Language Models

arXiv 2023

2023

ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning

arXiv 2023

2023

Constructing Multilingual Code Search Dataset Using Neural Machine Translation

arXiv 2023

2023

GENIUS: Sketch-based Language Model Pre-training via Extreme and Selective Masking for Text Generation and Augmentation

arXiv 2022

2022

NUWA-Infinity: Autoregressive over Autoregressive Generation for Infinite Visual Synthesis

arXiv 2022

2022

BridgeTower: Building Bridges Between Encoders in Vision-Language Representation Learning

arXiv 2022

2022

ReACC: A Retrieval-Augmented Code Completion Framework

ACL 2022 5

2022

Unsupervised Context Aware Sentence Representation Pretraining for Multi-lingual Dense Retrieval

arXiv 2022

2022

LaPraDoR: Unsupervised Pretrained Dense Retriever for Zero-Shot Text Retrieval

Findings (ACL) 2022 5

2022

CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation

arXiv 2021

2021

CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval

arXiv 2021

2021

EL-Attention: Memory Efficient Lossless Attention for Generation

arXiv 2021

2021

CoSQA: 20,000+ Web Queries for Code Search and Question Answering

ACL 2021 5

2021

Adversarial Retriever-Ranker for dense text retrieval

2021

WhiteningBERT: An Easy Unsupervised Sentence Embedding Approach

Findings (EMNLP) 2021 11

2021

AR-LSAT: Investigating Analytical Reasoning of Text

arXiv 2021

2021

UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation

arXiv 2020

2020

Affiliations

No known affiliations.
