Tianlong Chen
- Papers: 40
Authored papers
Skill-Based Mixture-of-Experts: Adaptive Routing for Heterogeneous Reasoning via Inferred Skills
arXiv 2025
PETS: A Principled Framework Towards Optimal Trajectory Allocation for Efficient Test-Time Self-Consistency
arXiv 2026
GradientStabilizer: Fix the Norm, Not the Gradient
arXiv 2025
VLM-3R: Vision-Language Models Augmented with Instruction-Aligned 3D Reconstruction
arXiv 2025
MedHallu: A Comprehensive Benchmark for Detecting Medical Hallucinations in Large Language Models
arXiv 2025
Quantum Variational Activation Functions Empower Kolmogorov-Arnold Networks
arXiv 2025
Window Token Concatenation for Efficient Visual Large Language Models
arXiv 2025
A Space-Time Transformer for Precipitation Forecasting
arXiv 2025
Unlearning Sensitive Information in Multimodal LLMs: Benchmark and Attack-Defense Evaluation
arXiv 2025
Layer-Level Self-Exposure and Patch: Affirmative Token Mitigation for Jailbreak Attack Defense
arXiv 2025
A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment
arXiv 2025
Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark
arXiv 2024
Towards Stabilized and Efficient Diffusion Transformers through Long-Skip-Connections with Spectral Constraints
ICCV 2025
GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations
arXiv 2024
Contextualization Distillation from Large Language Model for Knowledge Graph Completion
arXiv 2024
Model-GLUE: Democratized LLM Scaling for A Large Model Zoo in the Wild
arXiv 2024
Adapt-$\infty$: Scalable Lifelong Multimodal Instruction Tuning via Dynamic Data Selection
arXiv 2024
Glider: Global and Local Instruction-Driven Expert Router
arXiv 2024
DLF: Disentangled-Language-Focused Multimodal Sentiment Analysis
arXiv 2024
TrustLLM: Trustworthiness in Large Language Models
arXiv 2024
MerRec: A Large-scale Multipurpose Mercari Dataset for Consumer-to-Consumer Recommendation Systems
arXiv 2024
Composable Interventions for Language Models
arXiv 2024
H$_2$O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models
arXiv 2023
Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy
arXiv 2023
Robust Mixture-of-Expert Training for Convolutional Neural Networks
ICCV 2023
The Emergence of Essential Sparsity in Large Pre-trained Models: The Weights that Matter
Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers
arXiv 2023
Enhancing NeRF akin to Enhancing LLMs: Generalizable NeRF Transformer with Mixture-of-View-Experts
ICCV 2023
TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models
CVPR 2024
M$^3$ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design
arXiv 2022
Advancing Model Pruning via Bi-level Optimization
arXiv 2022
The Unreasonable Effectiveness of Random Pruning: Return of the Most Naive Baseline for Sparse Training
APP: Anytime Progressive Pruning
arXiv 2022
Unified Visual Transformer Compression
Neural Implicit Dictionary via Mixture-of-Expert Training
arXiv 2022
Sparse Training via Boosting Pruning Plasticity with Neuroregeneration
DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models
You are caught stealing my winning lottery ticket! Making a lottery ticket claim its ownership
NeurIPS 2021
Graph Contrastive Learning with Augmentations
NeurIPS 2020
When Does Self-Supervision Help Graph Convolutional Networks?
ICML 2020
Frequent co-authors
10 from 40 papers