Jie Tang

OpenAI engineer; co-author of Codex (distinct from Tang Jie of Tsinghua/Zhipu).

Role
engineer
Currently at
OpenAI
Papers
92

Attribution

Affiliations & profile
Semantic Scholar

Authored papers

92

GLM-5: from Vibe Coding to Agentic Engineering

arXiv 2026

2026

IndexCache: Accelerating Sparse Attention via Cross-Layer Index Reuse

arXiv 2026

2026

GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents

arXiv 2026

2026

Beyond Literal Mapping: Benchmarking and Improving Non-Literal Translation Evaluation

arXiv 2026

2026

Training-Free Vector Quantization via Gaussian VAEs

arXiv 2025

2026

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

arXiv 2025

2025

CATANet: Efficient Content-Aware Token Aggregation for Lightweight Image Super-Resolution

CVPR 2025 1

2025

TreeRL: LLM Reinforcement Learning with On-Policy Tree Search

arXiv 2025

2025

Dynamic Scaling of Unit Tests for Code Reward Modeling

arXiv 2025

2025

AndroidGen: Building an Android Language Agent under Data Scarcity

arXiv 2025

2025

SCAIL: Towards Studio-Grade Character Animation via In-Context Learning of 3D-Consistent Pose Representations

arXiv 2025

2025

GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

arXiv 2025

2025

UI2Code^N: A Visual Language Model for Test-Time Scalable Interactive UI-to-Code Generation

arXiv 2025

2025

WebVIA: A Web-based Vision-Language Agentic Framework for Interactive and Verifiable UI-to-Code Generation

arXiv 2025

2025

UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning

arXiv 2025

2025

Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling

arXiv 2025

2025

A Stronger Mixture of Low-Rank Experts for Fine-Tuning Foundation Models

arXiv 2025

2025

VPO: Aligning Text-to-Video Generation Models with Prompt Optimization

ICCV 2025

2025

LongSafety: Evaluating Long-Context Safety of Large Language Models

arXiv 2025

2025

MathSE: Improving Multimodal Mathematical Reasoning via Self-Evolving Iterative Reflection and Reward-Guided Fine-Tuning

arXiv 2025

2025

ObjFiller-3D: Consistent Multi-view 3D Inpainting via Video Diffusion Models

arXiv 2025

2025

In-the-wild Audio Spatialization with Flexible Text-guided Localization

arXiv 2025

2025

ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling under Long-Context Scenario

arXiv 2025

2025

Parameter-Efficient Fine-Tuning for Foundation Models

arXiv 2025

2025

AutoLUT: LUT-Based Image Super-Resolution with Automatic Sampling and Adaptive Residual Learning

CVPR 2025 1

2025

CoT-based Synthesizer: Enhancing LLM Performance through Answer Synthesis

arXiv 2025

2025

Data-Efficient RLVR via Off-Policy Influence Guidance

arXiv 2025

2025

Small Language Model Makes an Effective Long Text Extractor

arXiv 2025

2025

GLM-TTS Technical Report

arXiv 2025

2025

LongRAG: A Dual-Perspective Retrieval-Augmented Generation Paradigm for Long-Context Question Answering

arXiv 2024

2024

CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer

arXiv 2024

2024

Inf-DiT: Upsampling Any-Resolution Image with Memory-Efficient Diffusion Transformer

arXiv 2024

2024

GLM-4-Voice: Towards Intelligent and Human-Like End-to-End Spoken Chatbot

arXiv 2024

2024

AutoWebGLM: A Large Language Model-based Web Navigating Agent

arXiv 2024

2024

ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search

arXiv 2024

2024

VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation

arXiv 2024

2024

TableLLM: Enabling Tabular Data Manipulation by LLMs in Real Office Usage Scenarios

arXiv 2024

2024

VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents

arXiv 2024

2024

LVBench: An Extreme Long Video Understanding Benchmark

ICCV 2025

2024

Benchmarking Complex Instruction-Following with Multiple Constraints Composition

arXiv 2024

2024

SciInstruct: a Self-Reflective Instruction Annotated Dataset for Training Scientific Language Models

arXiv 2024

2024

ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline

arXiv 2024

2024

Towards Efficient Exact Optimization of Language Model Alignment

arXiv 2024

2024

SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models

arXiv 2024

2024

AutoDetect: Towards a Unified Framework for Automated Weakness Detection in Large Language Models

arXiv 2024

2024

Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments

arXiv 2024

2024

MSAGPT: Neural Prompting Protein Structure Prediction via MSA Generative Pre-Training

arXiv 2024

2024

A Solution-based LLM API-using Methodology for Academic Information Seeking

arXiv 2024

2024

ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools

arXiv 2024

2024

CogVLM2: Visual Language Models for Image and Video Understanding

arXiv 2024

2024

WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning

arXiv 2024

2024

LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs

arXiv 2024

2024

LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks

arXiv 2024

2024

AndroidLab: Training and Systematic Benchmarking of Android Autonomous Agents

arXiv 2024

2024

LongAlign: A Recipe for Long Context Alignment of Large Language Models

arXiv 2024

2024

CogCoM: Train Large Vision-Language Models Diving into Details through Chain of Manipulations

arXiv 2024

2024

LogicGame: Benchmarking Rule-Based Reasoning Abilities of Large Language Models

arXiv 2024

2024

GPT-4 Technical Report

arXiv 2023

2023

CharacterGLM: Customizing Chinese Conversational AI Characters with Large Language Models

arXiv 2023

2023

AgentBench: Evaluating LLMs as Agents

arXiv 2023

2023

AgentTuning: Enabling Generalized Agent Abilities for LLMs

arXiv 2023

2023

WebGLM: Towards An Efficient Web-Enhanced Question Answering System with Human Preferences

arXiv 2023

2023

Relay Diffusion: Unifying diffusion process across resolutions for image synthesis

arXiv 2023

2023

SafetyBench: Evaluating the Safety of Large Language Models

arXiv 2023

2023

CVPR 2023 Text Guided Video Editing Competition

arXiv 2023

2023

GLM-Dialog: Noise-tolerant Pre-training for Knowledge-grounded Dialogue Generation

arXiv 2023

2023

KoLA: Carefully Benchmarking World Knowledge of Large Language Models

arXiv 2023

2023

Robust Object Modeling for Visual Tracking

ICCV 2023 1

2023

CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Benchmarking on HumanEval-X

arXiv 2023

2023

CogAgent: A Visual Language Model for GUI Agents

CVPR 2024 1

2023

ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation

arXiv 2023

2023

AlignBench: Benchmarking Chinese Alignment of Large Language Models

arXiv 2023

2023

GPT Can Solve Mathematical Problems Without a Calculator

arXiv 2023

2023

Black-Box Prompt Optimization: Aligning Large Language Models without Model Training

arXiv 2023

2023

CritiqueLLM: Towards an Informative Critique Generation Model for Evaluation of Large Language Model Generation

arXiv 2023

2023

GOAL: A Challenging Knowledge-grounded Video Captioning Benchmark for Real-time Soccer Commentary Generation

arXiv 2023

2023

Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos

arXiv 2022

2022

GLM-130B: An Open Bilingual Pre-trained Model

arXiv 2022

2022

CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers

arXiv 2022

2022

GraphMAE: Self-Supervised Masked Graph Autoencoders

arXiv 2022

2022

GACT: Activation Compressed Training for Generic Network Architectures

arXiv 2022

2022

Evaluating Large Language Models Trained on Code

preprint

2021

Anchor-based Plain Net for Mobile Image Super-Resolution

arXiv 2021

2021

FastMoE: A Fast Mixture-of-Expert Training System

arXiv 2021

2021

P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks

arXiv 2021

2021

GLM: General Language Model Pretraining with Autoregressive Blank Infilling

ACL 2022 5

2021

CogView: Mastering Text-to-Image Generation via Transformers

NeurIPS 2021 12

2021

GPT Understands, Too

arXiv 2021

2021

GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training

arXiv 2020

2020

Controllable Multi-Interest Framework for Recommendation

arXiv 2020

2020

CPM: A Large-scale Generative Chinese Pre-trained Language Model

arXiv 2020

2020

Blockwise Self-Attention for Long Document Understanding

Findings of the Association for Computational Linguistics 2020

2019

Affiliations

Currently at

OpenAI

engineer · frontier lab

Frequent co-authors

10

from 92 papers