Chao Zhang
- Papers: 68
Authored papers
OScaR: The Occam's Razor for Extreme KV Cache Quantization in LLMs and Beyond
arXiv 2026
HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds
arXiv 2026
LongCat-Flash-Thinking-2601 Technical Report
arXiv 2026
Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation
arXiv 2026
LongCat-Flash-Prover: Advancing Native Formal Reasoning via Agentic Tool-Integrated Reinforcement Learning
arXiv 2026
OpenDecoder: Open Large Language Model Decoding to Incorporate Document Quality in RAG
arXiv 2026
FORGE: Fine-grained Multimodal Evaluation for Manufacturing Scenarios
arXiv 2026
Hackers or Hallucinators? A Comprehensive Analysis of LLM-Based Automated Penetration Testing
arXiv 2026
VID-AD: A Dataset for Image-Level Logical Anomaly Detection under Vision-Induced Distraction
arXiv 2026
Revisiting the Reliability of Language Models in Instruction-Following
arXiv 2025
HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels
arXiv 2025
Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation
arXiv 2025
Hunyuan3D 2.1: From Images to High-Fidelity 3D Assets with Production-Ready PBR Material
arXiv 2025
YuE: Scaling Open Foundation Models for Long-Form Music Generation
arXiv 2025
EAGLE-3: Scaling up Inference Acceleration of Large Language Models via Training-Time Test
arXiv 2025
Adaptation of Agentic AI
arXiv 2025
video-SALMONN 2: Captioning-Enhanced Audio-Visual Large Language Models
arXiv 2025
MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering
arXiv 2025
HunyuanImage 3.0 Technical Report
arXiv 2025
Every Step Evolves: Scaling Reinforcement Learning for Trillion-Scale Thinking Model
arXiv 2025
Think-RM: Enabling Long-Horizon Reasoning in Generative Reward Models
arXiv 2025
GraphOmni: A Comprehensive and Extendable Benchmark Framework for Large Language Models on Graph-theoretic Tasks
arXiv 2025
video-SALMONN-o1: Reasoning-enhanced Audio-visual Large Language Model
arXiv 2025
MMGDreamer: Mixed-Modality Graph for Geometry-Controllable 3D Indoor Scene Generation
Language Model Uncertainty Quantification with Attention Chain
arXiv 2025
Tady: A Neural Disassembler without Structural Constraint Violations
arXiv 2025
Video2Roleplay: A Multimodal Dataset and Framework for Video-Guided Role-playing Agents
arXiv 2025
ACVUBench: Audio-Centric Video Understanding Benchmark
arXiv 2025
Streamlining the Collaborative Chain of Models into A Single Forward Pass in Generation-Based Tasks
arXiv 2025
WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning
arXiv 2025
Every Activation Boosted: Scaling General Reasoner to 1 Trillion Open Language Foundation
arXiv 2025
TeaRAG: A Token-Efficient Agentic Retrieval-Augmented Generation Framework
arXiv 2025
Unleashing High-Quality Image Generation in Diffusion Sampling Using Second-Order Levenberg-Marquardt-Langevin
ICCV 2025
LatentSync: Audio Conditioned Latent Diffusion Models for Lip Sync
arXiv 2024
Time-MMD: Multi-Domain Multimodal Dataset for Time Series Analysis
arXiv 2024
NoteLLM-2: Multimodal Large Representation Models for Recommendation
arXiv 2024
APISR: Anime Production Inspired Real-World Anime Super-Resolution
CVPR 2024
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
arXiv 2024
3DMIT: 3D Multi-modal Instruction Tuning for Scene Understanding
arXiv 2024
CLAP: Learning Transferable Binary Code Representations with Natural Language Supervision
arXiv 2024
Efficient Evolutionary Search Over Chemical Space with Large Language Models
arXiv 2024
Aligning Large Language Models with Representation Editing: A Control Perspective
arXiv 2024
Retrieve-Plan-Generation: An Iterative Planning and Answering Framework for Knowledge-Intensive LLM Generation
arXiv 2024
Matryoshka: Learning to Drive Black-Box LLMs with LLMs
arXiv 2024
Semantic Map-based Generation of Navigation Instructions
arXiv 2024
Enhancing Audio-Language Models through Self-Supervised Post-Training with Text-Audio Pairs
arXiv 2024
An Engorgio Prompt Makes Large Language Model Babble on
arXiv 2024
Large Language Models are Efficient Learners of Noise-Robust Speech Recognition
arXiv 2024
BMRetriever: Tuning Large Language Models as Better Biomedical Text Retrievers
arXiv 2024
SALMONN: Towards Generic Hearing Abilities for Large Language Models
arXiv 2023
ReGen: Zero-Shot Text Classification via Training Data Generation with Progressive Dense Retrieval
arXiv 2023
PolyIE: A Dataset of Information Extraction from Polymer Material Scientific Literature
arXiv 2023
Large Language Models for Generative Information Extraction: A Survey
arXiv 2023
ToolQA: A Dataset for LLM Question Answering with External Tools
Can Contextual Biasing Remain Effective with Whisper and GPT-2?
arXiv 2023
C3: Zero-shot Text-to-SQL with ChatGPT
arXiv 2023
Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias
AdaPlanner: Adaptive Planning from Feedback with Language Models
Rank-DETR for High Quality Object Detection
One-bit Flip is All You Need: When Bit-flip Attack Meets Model Training
ICCV 2023
Fine-grained Audio-Visual Joint Representations for Multimodal Large Language Models
arXiv 2023
COCO-DR: Combating Distribution Shifts in Zero-Shot Dense Retrieval with Contrastive and Distributionally Robust Learning
arXiv 2022
A general-purpose material property data extraction pipeline from large polymer corpora using Natural Language Processing
arXiv 2022
Expediting Large-Scale Vision Transformer for Dense Prediction without Fine-tuning
arXiv 2022
Model-Aware Contrastive Learning: Towards Escaping the Dilemmas
arXiv 2022
DETRs with Hybrid Matching
CVPR 2023
A Survey on Programmatic Weak Supervision
arXiv 2022
Fine-Tuning Pre-trained Language Model with Weak Supervision: A Contrastive-Regularized Self-Training Approach
NAACL 2021
Frequent co-authors
10 from 68 papers