Jie Tang
OpenAI engineer; co-author of Codex (distinct from Tang Jie of Tsinghua/Zhipu).
- Role: engineer
- Currently at: OpenAI
- Papers: 92
Authored papers (92)
GLM-5: from Vibe Coding to Agentic Engineering
arXiv 2026
IndexCache: Accelerating Sparse Attention via Cross-Layer Index Reuse
arXiv 2026
GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents
arXiv 2026
Beyond Literal Mapping: Benchmarking and Improving Non-Literal Translation Evaluation
arXiv 2026
Training-Free Vector Quantization via Gaussian VAEs
arXiv 2025
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
arXiv 2025
CATANet: Efficient Content-Aware Token Aggregation for Lightweight Image Super-Resolution
CVPR 2025 1
TreeRL: LLM Reinforcement Learning with On-Policy Tree Search
arXiv 2025
Dynamic Scaling of Unit Tests for Code Reward Modeling
arXiv 2025
AndroidGen: Building an Android Language Agent under Data Scarcity
arXiv 2025
SCAIL: Towards Studio-Grade Character Animation via In-Context Learning of 3D-Consistent Pose Representations
arXiv 2025
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
arXiv 2025
UI2Code^N: A Visual Language Model for Test-Time Scalable Interactive UI-to-Code Generation
arXiv 2025
WebVIA: A Web-based Vision-Language Agentic Framework for Interactive and Verifiable UI-to-Code Generation
arXiv 2025
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning
arXiv 2025
Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling
arXiv 2025
A Stronger Mixture of Low-Rank Experts for Fine-Tuning Foundation Models
arXiv 2025
VPO: Aligning Text-to-Video Generation Models with Prompt Optimization
ICCV 2025
LongSafety: Evaluating Long-Context Safety of Large Language Models
arXiv 2025
MathSE: Improving Multimodal Mathematical Reasoning via Self-Evolving Iterative Reflection and Reward-Guided Fine-Tuning
arXiv 2025
ObjFiller-3D: Consistent Multi-view 3D Inpainting via Video Diffusion Models
arXiv 2025
In-the-wild Audio Spatialization with Flexible Text-guided Localization
arXiv 2025
ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling under Long-Context Scenario
arXiv 2025
Parameter-Efficient Fine-Tuning for Foundation Models
arXiv 2025
AutoLUT: LUT-Based Image Super-Resolution with Automatic Sampling and Adaptive Residual Learning
CVPR 2025 1
CoT-based Synthesizer: Enhancing LLM Performance through Answer Synthesis
arXiv 2025
Data-Efficient RLVR via Off-Policy Influence Guidance
arXiv 2025
Small Language Model Makes an Effective Long Text Extractor
arXiv 2025
GLM-TTS Technical Report
arXiv 2025
LongRAG: A Dual-Perspective Retrieval-Augmented Generation Paradigm for Long-Context Question Answering
arXiv 2024
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer
arXiv 2024
Inf-DiT: Upsampling Any-Resolution Image with Memory-Efficient Diffusion Transformer
arXiv 2024
GLM-4-Voice: Towards Intelligent and Human-Like End-to-End Spoken Chatbot
arXiv 2024
AutoWebGLM: A Large Language Model-based Web Navigating Agent
arXiv 2024
ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search
arXiv 2024
VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation
arXiv 2024
TableLLM: Enabling Tabular Data Manipulation by LLMs in Real Office Usage Scenarios
arXiv 2024
VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents
arXiv 2024
LVBench: An Extreme Long Video Understanding Benchmark
ICCV 2025
Benchmarking Complex Instruction-Following with Multiple Constraints Composition
arXiv 2024
SciInstruct: a Self-Reflective Instruction Annotated Dataset for Training Scientific Language Models
arXiv 2024
ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline
arXiv 2024
Towards Efficient Exact Optimization of Language Model Alignment
arXiv 2024
SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models
arXiv 2024
AutoDetect: Towards a Unified Framework for Automated Weakness Detection in Large Language Models
arXiv 2024
Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments
arXiv 2024
MSAGPT: Neural Prompting Protein Structure Prediction via MSA Generative Pre-Training
arXiv 2024
A Solution-based LLM API-using Methodology for Academic Information Seeking
arXiv 2024
ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools
arXiv 2024
CogVLM2: Visual Language Models for Image and Video Understanding
arXiv 2024
WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning
arXiv 2024
LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
arXiv 2024
LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks
arXiv 2024
AndroidLab: Training and Systematic Benchmarking of Android Autonomous Agents
arXiv 2024
LongAlign: A Recipe for Long Context Alignment of Large Language Models
arXiv 2024
CogCoM: Train Large Vision-Language Models Diving into Details through Chain of Manipulations
arXiv 2024
LogicGame: Benchmarking Rule-Based Reasoning Abilities of Large Language Models
arXiv 2024
GPT-4 Technical Report
arXiv 2023
CharacterGLM: Customizing Chinese Conversational AI Characters with Large Language Models
arXiv 2023
AgentBench: Evaluating LLMs as Agents
arXiv 2023
AgentTuning: Enabling Generalized Agent Abilities for LLMs
arXiv 2023
WebGLM: Towards An Efficient Web-Enhanced Question Answering System with Human Preferences
arXiv 2023
Relay Diffusion: Unifying diffusion process across resolutions for image synthesis
arXiv 2023
SafetyBench: Evaluating the Safety of Large Language Models
arXiv 2023
CVPR 2023 Text Guided Video Editing Competition
arXiv 2023
GLM-Dialog: Noise-tolerant Pre-training for Knowledge-grounded Dialogue Generation
arXiv 2023
KoLA: Carefully Benchmarking World Knowledge of Large Language Models
arXiv 2023
Robust Object Modeling for Visual Tracking
ICCV 2023 1
CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Benchmarking on HumanEval-X
arXiv 2023
CogAgent: A Visual Language Model for GUI Agents
CVPR 2024 1
ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation
arXiv 2023
AlignBench: Benchmarking Chinese Alignment of Large Language Models
arXiv 2023
GPT Can Solve Mathematical Problems Without a Calculator
arXiv 2023
Black-Box Prompt Optimization: Aligning Large Language Models without Model Training
arXiv 2023
CritiqueLLM: Towards an Informative Critique Generation Model for Evaluation of Large Language Model Generation
arXiv 2023
GOAL: A Challenging Knowledge-grounded Video Captioning Benchmark for Real-time Soccer Commentary Generation
arXiv 2023
Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos
arXiv 2022
GLM-130B: An Open Bilingual Pre-trained Model
arXiv 2022
CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers
arXiv 2022
GraphMAE: Self-Supervised Masked Graph Autoencoders
arXiv 2022
GACT: Activation Compressed Training for Generic Network Architectures
arXiv 2022
Evaluating Large Language Models Trained on Code
preprint
Anchor-based Plain Net for Mobile Image Super-Resolution
arXiv 2021
FastMoE: A Fast Mixture-of-Expert Training System
arXiv 2021
P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks
arXiv 2021
GLM: General Language Model Pretraining with Autoregressive Blank Infilling
ACL 2022 5
CogView: Mastering Text-to-Image Generation via Transformers
NeurIPS 2021 12
GPT Understands, Too
arXiv 2021
GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training
arXiv 2020
Controllable Multi-Interest Framework for Recommendation
arXiv 2020
CPM: A Large-scale Generative Chinese Pre-trained Language Model
arXiv 2020
Blockwise Self-Attention for Long Document Understanding
Findings of the Association for Computational Linguistics 2020
Frequent co-authors
10 from 92 papers