Shuming Ma

Authored papers

20

BitNet b1.58 2B4T Technical Report

arXiv 2025

Next Block Prediction: Video Generation via Semi-Auto-Regressive Modeling

arXiv 2025

Towards Thinking-Optimal Scaling of Test-Time Compute for LLM Reasoning

arXiv 2025

You Only Cache Once: Decoder-Decoder Architectures for Language Models

arXiv 2024

Kosmos-2: Grounding Multimodal Large Language Models to the World

arXiv 2023

Discourse Centric Evaluation of Machine Translation with a Densely Annotated Parallel Corpus

arXiv 2023

Are More Layers Beneficial to Graph Transformers?

arXiv 2023

On the Off-Target Problem of Zero-Shot Multilingual Neural Machine Translation

arXiv 2023

Auto-ICL: In-Context Learning without Human Supervision

arXiv 2023

A Length-Extrapolatable Transformer

arXiv 2022

Zero-shot Cross-lingual Transfer of Prompt-based Tuning with a Unified Multilingual Prompt

arXiv 2022

CROP: Zero-shot Cross-lingual Named Entity Recognition with Multilingual Labeled Sequence Translation

arXiv 2022

GanLM: Encoder-Decoder Pre-training with an Auxiliary Discriminator

arXiv 2022

StableMoE: Stable Routing Strategy for Mixture of Experts

ACL 2022 5

HLT-MT: High-resource Language-specific Training for Multilingual Neural Machine Translation

arXiv 2022

GTrans: Grouping and Fusing Transformer Layers for Neural Machine Translation

arXiv 2022

UM4: Unified Multilingual Multiple Teacher-Student Model for Zero-Resource Neural Machine Translation

arXiv 2022

DeltaLM: Encoder-Decoder Pre-training for Language Generation and Translation by Augmenting Pretrained Multilingual Encoders

arXiv 2021

Zero-shot Cross-lingual Transfer of Neural Machine Translation with Multilingual Pretrained Encoders

EMNLP 2021 11

Towards Making the Most of Multilingual Pretraining for Zero-Shot Neural Machine Translation

arXiv 2021

Affiliations

No known affiliations.

Frequent co-authors

10 (from 20 papers)