Shwai He

Papers: 15

Attribution

Affiliations & profile: Semantic Scholar

Authored papers

ThinkJEPA: Empowering Latent World Models with Large Vision-Language Reasoning Model

arXiv 2026

Demystifying When Pruning Works via Representation Hierarchies

arXiv 2026

CoIn: Counting the Invisible Reasoning Tokens in Commercial Opaque LLM APIs

arXiv 2025

Understanding and Harnessing Sparsity in Unified Multimodal Models

arXiv 2025

Making Large Language Models Efficient Dense Retrievers

arXiv 2025

Router-Tuning: A Simple and Effective Approach for Enabling Dynamic-Depth in Transformers

arXiv 2024

What Matters in Transformers? Not All Attention is Needed

arXiv 2024

Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning

arXiv 2024

Loki: Low-rank Keys for Efficient Sparse Attention

arXiv 2024

Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning

arXiv 2024

Reformatted Alignment

arXiv 2024

Reflection-Tuning: Data Recycling Improves LLM Instruction-Tuning

arXiv 2023

Merging Experts into One: Improving Computational Efficiency of Mixture of Experts

arXiv 2023

Vega-MT: The JD Explore Academy Translation System for WMT22

arXiv 2022

SparseAdapter: An Easy Approach for Improving the Parameter-Efficiency of Adapters

arXiv 2022

Affiliations

No known affiliations.

Frequent co-authors

from 15 papers

Ang Li

Guoheng Sun

Tianyi Zhou

Dacheng Tao

Liang Ding

Ming Li

Bowei Tian

Haichao Zhang

Jiuhai Chen

Jiuxiang Gu