Zhen Yang
- Papers: 26
Authored papers (26)
GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents
arXiv 2026
Length Value Model: Scalable Value Pretraining for Token-Level Length Modeling
arXiv 2026
D^3R-DETR: DETR with Dual-Domain Density Refinement for Tiny Object Detection in Aerial Images
arXiv 2026
SiamGM: Siamese Geometry-Aware and Motion-Guided Network for Real-Time Satellite Video Object Tracking
arXiv 2026
Kimi K2.5: Visual Agentic Intelligence
arXiv 2026
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
arXiv 2025
RectifiedHR: Enable Efficient High-Resolution Image Generation via Energy Rectification
arXiv 2025
FlexPainter: Flexible and Multi-View Consistent Texture Generation
arXiv 2025
Hard Negative Contrastive Learning for Fine-Grained Geometric Understanding in Large Multimodal Models
arXiv 2025
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
arXiv 2025
StereoPilot: Learning Unified and Efficient Stereo Conversion via Generative Priors
arXiv 2025
UI2Code^N: A Visual Language Model for Test-Time Scalable Interactive UI-to-Code Generation
arXiv 2025
MathSE: Improving Multimodal Mathematical Reasoning via Self-Evolving Iterative Reflection and Reward-Guided Fine-Tuning
arXiv 2025
WebVIA: A Web-based Vision-Language Agentic Framework for Interactive and Verifiable UI-to-Code Generation
arXiv 2025
Less is More: Improving LLM Reasoning with Minimal Test-Time Intervention
arXiv 2025
MSPLoRA: A Multi-Scale Pyramid Low-Rank Adaptation for Efficient Model Fine-Tuning
arXiv 2025
FreeCustom: Tuning-Free Customized Image Generation for Multi-Concept Composition
CVPR 2024
FreeCompose: Generic Zero-Shot Image Composition with Diffusion Prior
arXiv 2024
Streaming Video Diffusion: Online Video Editing with Diffusion Models
arXiv 2024
Relevance Filtering for Embedding-based Retrieval
arXiv 2024
Thought-Path Contrastive Learning via Premise-Oriented Data Augmentation for Logical Reading Comprehension
arXiv 2024
ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools
arXiv 2024
Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent
arXiv 2024
GPT Can Solve Mathematical Problems Without a Calculator
arXiv 2023
Object-aware Inversion and Reassembly for Image Editing
arXiv 2023
LoRAPrune: Structured Pruning Meets Low-Rank Parameter-Efficient Fine-Tuning
arXiv 2023
Frequent co-authors
10 from 26 papers