Zili Wang
- Papers: 18
Authored papers (18)
Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters
arXiv 2026
OProver: A Unified Framework for Agentic Formal Theorem Proving
arXiv 2026
A Comprehensive Survey on Long Context Language Modeling
arXiv 2025
Farseer: A Refined Scaling Law in Large Language Models
arXiv 2025
Predictable Scale: Part I -- Optimal Hyperparameter Scaling Law in Large Language Model Pretraining
arXiv 2025
Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs
arXiv 2025
Diffusion Language Models are Super Data Learners
arXiv 2025
MVU-Eval: Towards Multi-Video Understanding Evaluation for Multimodal LLMs
arXiv 2025
OmniVideoBench: Towards Audio-Visual Understanding Evaluation for Omni MLLMs
arXiv 2025
ChatMusician: Understanding and Generating Music Intrinsically with LLM
arXiv 2024
m2mKD: Module-to-Module Knowledge Distillation for Modular Transformers
arXiv 2024
Continuous Speculative Decoding for Autoregressive Image Generation
arXiv 2024
Layerwise Recurrent Router for Mixture-of-Experts
arXiv 2024
GW-MoE: Resolving Uncertainty in MoE Router with Global Workspace Theory
arXiv 2024
AVESFormer: Efficient Transformer Design for Real-Time Audio-Visual Segmentation
arXiv 2024
MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training
arXiv 2023
RefGPT: Dialogue Generation of GPT, by GPT, and for GPT
arXiv 2023
Xiezhi: An Ever-Updating Benchmark for Holistic Domain Knowledge Evaluation
arXiv 2023
Frequent co-authors
10 from 18 papers