Zhuo Chen

Papers
51

Authored papers

51

GLM-5: from Vibe Coding to Agentic Engineering

arXiv 2026

2026

Idea2Story: An Automated Pipeline for Transforming Research Concepts into Complete Scientific Narratives

arXiv 2026

2026

PhysForge: Generating Physics-Grounded 3D Assets for Interactive Virtual World

arXiv 2026

2026

Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation

arXiv 2025

2025

MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix

arXiv 2025

2025

MagiCodec: Simple Masked Gaussian-Injected Codec for High-Fidelity Reconstruction and Generation

arXiv 2025

2025

A Comprehensive Survey on Long Context Language Modeling

arXiv 2025

2025

Part-X-MLLM: Part-aware 3D Multimodal Large Language Model

arXiv 2025

2025

Scaling Agents via Continual Pre-training

arXiv 2025

2025

Advances in Speech Separation: Techniques, Challenges, and Future Trends

arXiv 2025

2025

Emu3.5: Native Multimodal Models are World Learners

arXiv 2025

2025

P3-SAM: Native 3D Part Segmentation

arXiv 2025

2025

X-Part: high fidelity and structure coherent shape decomposition

arXiv 2025

2025

L$^2$M: Mutual Information Scaling Law for Long-Context Language Modeling

arXiv 2025

2025

AudioTrust: Benchmarking the Multifaceted Trustworthiness of Audio Large Language Models

arXiv 2025

2025

Dens3R: A Foundation Model for 3D Geometry Prediction

arXiv 2025

2025

Seed-TTS: A Family of High-Quality Versatile Speech Generation Models

arXiv 2024

2024

QuanTA: Efficient High-Rank Fine-Tuning of LLMs with Quantum-Informed Tensor Adaptation

arXiv 2024

2024

TENG: Time-Evolving Natural Gradient for Solving PDEs With Deep Neural Nets Toward Machine Precision

arXiv 2024

2024

Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey

arXiv 2024

2024

TacoLM: GaTed Attention Equipped Codec Language Model are Efficient Zero-Shot Text to Speech Synthesizers

arXiv 2024

2024

Improving Retrieval Augmented Open-Domain Question-Answering with Vectorized Contexts

arXiv 2024

2024

Scaling Mesh Generation via Compressive Tokenization

CVPR 2025

2024

Have We Designed Generalizable Structural Knowledge Promptings? Systematic Evaluation and Rethinking

arXiv 2024

2024

ChatCell: Facilitating Single-Cell Analysis with Natural Language

arXiv 2024

2024

CREST: Cross-modal Resonance through Evidential Deep Learning for Enhanced Zero-Shot Learning

arXiv 2024

2024

Mol-Instructions: A Large-Scale Biomolecular Instruction Dataset for Large Language Models

arXiv 2023

2023

Making Large Language Models Perform Better in Knowledge Graph Completion

arXiv 2023

2023

Knowledgeable Preference Alignment for LLMs in Domain-specific Question Answering

arXiv 2023

2023

Domain-Agnostic Molecular Generation with Chemical Feedback

arXiv 2023

2023

Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-modal Structured Representations

arXiv 2023

2023

Multimodal Foundation Models for Material Property Prediction and Discovery

arXiv 2023

2023

ANTN: Bridging Autoregressive Neural Networks and Tensor Networks for Quantum Many-Body Simulation

2023

ELFNet: Evidential Local-global Fusion for Stereo Matching

ICCV 2023

2023

Prompting Disentangled Embeddings for Knowledge Graph Completion with Pre-trained Language Model

arXiv 2023

2023

Newton-Cotes Graph Neural Networks: On the Time Evolution of Dynamic Systems

arXiv 2023

2023

MACO: A Modality Adversarial and Contrastive Framework for Modality-missing Multi-modal Knowledge Graph Completion

arXiv 2023

2023

Universal Multi-modal Entity Alignment via Iteratively Fusing Modality Similarity Paths

arXiv 2023

2023

BEATs: Audio Pre-Training with Acoustic Tokenizers

arXiv 2022

2022

LaKo: Knowledge-driven Visual Question Answering via Late Knowledge-to-Text Injection

arXiv 2022

2022

Tele-Knowledge Pre-training for Fault Analysis

arXiv 2022

2022

MEAformer: Multi-modal Entity Alignment Transformer for Meta Modality Hybrid

arXiv 2022

2022

DUET: Cross-modal Semantic Grounding for Contrastive Zero-shot Learning

arXiv 2022

2022

Disentangled Ontology Embedding for Zero-shot Learning

arXiv 2022

2022

Target-oriented Sentiment Classification with Sequential Cross-modal Semantic Graph

arXiv 2022

2022

UniSpeech-SAT: Universal Speech Representation Learning with Speaker Aware Pre-Training

arXiv 2021

2021

AISHELL-4: An Open Source Dataset for Speech Enhancement, Separation, Recognition and Speaker Diarization in Conference Scenario

arXiv 2021

2021

OntoZSL: Ontology-enhanced Zero-shot Learning

arXiv 2021

2021

Benchmarking Knowledge-driven Zero-shot Learning

arXiv 2021

2021

Molecular Contrastive Learning with Chemical Element Knowledge Graph

arXiv 2021

2021

Zero-shot Visual Question Answering using Knowledge Graph

arXiv 2021

2021

Affiliations

No known affiliations.

Frequent co-authors

10

from 51 papers