Tianyu Liu
- Papers: 37
Authored papers (37)
BabyVision: Visual Reasoning Beyond Language
arXiv 2026
AutoWeather4D: Autonomous Driving Video Weather Conversion via G-Buffer Dual-Pass Editing
arXiv 2026
Prism-Δ: Differential Subspace Steering for Prompt Highlighting in Large Language Models
arXiv 2026
Kimi K2.5: Visual Agentic Intelligence
arXiv 2026
Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving
arXiv 2025
G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning
arXiv 2025
A Comprehensive Survey on Long Context Language Modeling
arXiv 2025
CodeCriticBench: A Holistic Code Critique Benchmark for Large Language Models
arXiv 2025
A Survey on Latent Reasoning
arXiv 2025
A Survey of Vibe Coding with Large Language Models
arXiv 2025
OJBench: A Competition Level Code Benchmark For Large Language Models
arXiv 2025
Multilingual Multimodal Software Developer for Code Generation
arXiv 2025
Well Begun is Half Done: Low-resource Preference Alignment by Weak-to-Strong Decoding
arXiv 2025
RealCritic: Towards Effectiveness-Driven Evaluation of Language Model Critiques
arXiv 2025
Qwen2 Technical Report
arXiv 2024
TorchTitan: One-stop PyTorch native solution for production ready LLM pre-training
arXiv 2024
Unlocking Efficiency in Large Language Model Inference: A Comprehensive Survey of Speculative Decoding
arXiv 2024
An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models
arXiv 2024
Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey
arXiv 2024
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models
arXiv 2024
PyramidKV: Dynamic KV Cache Compression based on Pyramidal Information Funneling
arXiv 2024
MMEvalPro: Calibrating Multimodal Benchmarks Towards Trustworthy and Efficient Evaluation
arXiv 2024
LogicVista: Multimodal LLM Logical Reasoning Benchmark in Visual Contexts
arXiv 2024
Towards a Unified View of Preference Learning for Large Language Models: A Survey
arXiv 2024
Parallel Speculative Decoding with Adaptive Draft Length
arXiv 2024
PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain
arXiv 2024
A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation
arXiv 2024
LLM Critics Help Catch Bugs in Mathematics: Towards a Better Mathematical Verifier with Natural Language Feedback
arXiv 2024
Likelihood as a Performance Gauge for Retrieval-Augmented Generation
arXiv 2024
ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code
arXiv 2023
MARS: An Instance-aware, Modular and Realistic Simulator for Autonomous Driving
arXiv 2023
Large Language Models are not Fair Evaluators
arXiv 2023
Discourse Centric Evaluation of Machine Translation with a Densely Annotated Parallel Corpus
arXiv 2023
Enhancing Continual Relation Extraction via Classifier Decomposition
arXiv 2023
ImageNetVC: Zero- and Few-Shot Visual Commonsense Evaluation on 1000 ImageNet Categories
arXiv 2023
A Survey on In-context Learning
arXiv 2022
Autoregressive Structured Prediction with Language Models
arXiv 2022
Affiliations
Frequent co-authors
10, from 37 papers