Di He
- Papers: 12
Authored papers (12)
In-Place Test-Time Training
arXiv 2026
The AI Hippocampus: How Far are We From Human Memory?
arXiv 2026
Understanding vs. Generation: Navigating Optimization Dilemma in Multimodal Models
arXiv 2026
Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation
arXiv 2024
DPO Meets PPO: Reinforced Token Optimization for RLHF
arXiv 2024
Hebbian Learning based Orthogonal Projection for Continual Learning of Spiking Neural Networks
arXiv 2024
DSVT: Dynamic Sparse Voxel Transformer with Rotated Sets
CVPR 2023 1
REST: Retrieval-Based Speculative Decoding
arXiv 2023
A Complete Expressiveness Hierarchy for Subgraph GNNs via Subgraph Weisfeiler-Lehman Tests
arXiv 2023
Your Transformer May Not be as Powerful as You Expect
arXiv 2022
Less is More: Pre-train a Strong Text Encoder for Dense Retrieval Using a Weak Decoder
arXiv 2021
Understanding and Improving Transformer From a Multi-Particle Dynamic System Point of View
Frequent co-authors
10 from 12 papers