Zhuo Chen
- Papers: 51
Authored papers (51)
GLM-5: from Vibe Coding to Agentic Engineering
arXiv 2026
Idea2Story: An Automated Pipeline for Transforming Research Concepts into Complete Scientific Narratives
arXiv 2026
PhysForge: Generating Physics-Grounded 3D Assets for Interactive Virtual World
arXiv 2026
Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation
arXiv 2025
MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix
arXiv 2025
MagiCodec: Simple Masked Gaussian-Injected Codec for High-Fidelity Reconstruction and Generation
arXiv 2025
A Comprehensive Survey on Long Context Language Modeling
arXiv 2025
Part-X-MLLM: Part-aware 3D Multimodal Large Language Model
arXiv 2025
Scaling Agents via Continual Pre-training
arXiv 2025
Advances in Speech Separation: Techniques, Challenges, and Future Trends
arXiv 2025
Emu3.5: Native Multimodal Models are World Learners
arXiv 2025
P3-SAM: Native 3D Part Segmentation
arXiv 2025
X-Part: high fidelity and structure coherent shape decomposition
arXiv 2025
L$^2$M: Mutual Information Scaling Law for Long-Context Language Modeling
arXiv 2025
AudioTrust: Benchmarking the Multifaceted Trustworthiness of Audio Large Language Models
arXiv 2025
Dens3R: A Foundation Model for 3D Geometry Prediction
arXiv 2025
Seed-TTS: A Family of High-Quality Versatile Speech Generation Models
arXiv 2024
QuanTA: Efficient High-Rank Fine-Tuning of LLMs with Quantum-Informed Tensor Adaptation
arXiv 2024
TENG: Time-Evolving Natural Gradient for Solving PDEs With Deep Neural Nets Toward Machine Precision
arXiv 2024
Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey
arXiv 2024
TacoLM: GaTed Attention Equipped Codec Language Model are Efficient Zero-Shot Text to Speech Synthesizers
arXiv 2024
Improving Retrieval Augmented Open-Domain Question-Answering with Vectorized Contexts
arXiv 2024
Scaling Mesh Generation via Compressive Tokenization
CVPR 2025
Have We Designed Generalizable Structural Knowledge Promptings? Systematic Evaluation and Rethinking
arXiv 2024
ChatCell: Facilitating Single-Cell Analysis with Natural Language
arXiv 2024
CREST: Cross-modal Resonance through Evidential Deep Learning for Enhanced Zero-Shot Learning
arXiv 2024
Mol-Instructions: A Large-Scale Biomolecular Instruction Dataset for Large Language Models
arXiv 2023
Making Large Language Models Perform Better in Knowledge Graph Completion
arXiv 2023
Knowledgeable Preference Alignment for LLMs in Domain-specific Question Answering
arXiv 2023
Domain-Agnostic Molecular Generation with Chemical Feedback
arXiv 2023
Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-modal Structured Representations
arXiv 2023
Multimodal Foundation Models for Material Property Prediction and Discovery
arXiv 2023
ANTN: Bridging Autoregressive Neural Networks and Tensor Networks for Quantum Many-Body Simulation
ELFNet: Evidential Local-global Fusion for Stereo Matching
ICCV 2023
Prompting Disentangled Embeddings for Knowledge Graph Completion with Pre-trained Language Model
arXiv 2023
Newton-Cotes Graph Neural Networks: On the Time Evolution of Dynamic Systems
arXiv 2023
MACO: A Modality Adversarial and Contrastive Framework for Modality-missing Multi-modal Knowledge Graph Completion
arXiv 2023
Universal Multi-modal Entity Alignment via Iteratively Fusing Modality Similarity Paths
arXiv 2023
BEATs: Audio Pre-Training with Acoustic Tokenizers
arXiv 2022
LaKo: Knowledge-driven Visual Question Answering via Late Knowledge-to-Text Injection
arXiv 2022
Tele-Knowledge Pre-training for Fault Analysis
arXiv 2022
MEAformer: Multi-modal Entity Alignment Transformer for Meta Modality Hybrid
arXiv 2022
DUET: Cross-modal Semantic Grounding for Contrastive Zero-shot Learning
arXiv 2022
Disentangled Ontology Embedding for Zero-shot Learning
arXiv 2022
Target-oriented Sentiment Classification with Sequential Cross-modal Semantic Graph
arXiv 2022
UniSpeech-SAT: Universal Speech Representation Learning with Speaker Aware Pre-Training
arXiv 2021
AISHELL-4: An Open Source Dataset for Speech Enhancement, Separation, Recognition and Speaker Diarization in Conference Scenario
arXiv 2021
OntoZSL: Ontology-enhanced Zero-shot Learning
arXiv 2021
Benchmarking Knowledge-driven Zero-shot Learning
arXiv 2021
Molecular Contrastive Learning with Chemical Element Knowledge Graph
arXiv 2021
Zero-shot Visual Question Answering using Knowledge Graph
arXiv 2021
Affiliations
Frequent co-authors
10 from 51 papers