Hao Liu
- Papers: 34
Authored papers
Multimodal OCR: Parse Anything from Documents
arXiv 2026
Agent-Omit: Training Efficient LLM Agents for Adaptive Thought and Observation Omission via Agentic Reinforcement Learning
arXiv 2026
NTIRE 2026 Challenge on Video Saliency Prediction: Methods and Results
arXiv 2026
Closing the Loop: Universal Repository Representation with RPG-Encoder
arXiv 2026
dots.ocr: Multilingual Document Layout Parsing in a Single Vision-Language Model
arXiv 2025
Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting
arXiv 2025
MM-Agent: LLM as Agents for Real-world Mathematical Modeling Problem
arXiv 2025
SongEval: A Benchmark Dataset for Song Aesthetics Evaluation
arXiv 2025
A Token is Worth over 1,000 Tokens: Efficient Knowledge Distillation through Low-Rank Clone
arXiv 2025
Not All Thoughts are Generated Equal: Efficient LLM Reasoning via Multi-Turn Reinforcement Learning
arXiv 2025
Stand-In: A Lightweight and Plug-and-Play Identity Control for Video Generation
arXiv 2025
Bag of Tricks for Inference-time Computation of LLM Reasoning
arXiv 2025
Can Understanding and Generation Truly Benefit Together -- or Just Coexist?
arXiv 2025
World Model on Million-Length Video And Language With Blockwise RingAttention
arXiv 2024
TrustLLM: Trustworthiness in Large Language Models
arXiv 2024
JAILJUDGE: A Comprehensive Jailbreak Judge Benchmark with Multi-Agent Enhanced Explanation Evaluation Framework
arXiv 2024
OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning
arXiv 2024
MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering
arXiv 2024
TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy
arXiv 2024
Reflections from the 2024 Large Language Model (LLM) Hackathon for Applications in Materials Science and Chemistry
arXiv 2024
Harmonizing Visual Text Comprehension and Generation
arXiv 2024
Margin Matching Preference Optimization: Enhanced Model Alignment with Granular Feedback
arXiv 2024
Blockwise Parallel Transformer for Large Context Models
arXiv 2023
Ring Attention with Blockwise Transformers for Near-Infinite Context
arXiv 2023
LLMLight: Large Language Models as Traffic Signal Control Agents
arXiv 2023
One for All: Towards Training One Graph Model for All Classification Tasks
arXiv 2023
Chain of Hindsight Aligns Language Models with Feedback
arXiv 2023
UUKG: Unified Urban Knowledge Graph Dataset for Urban Spatiotemporal Prediction
Masked Autoencoding for Scalable and Generalizable Decision Making
arXiv 2022
Knowledge Mining with Scene Text for Fine-Grained Recognition
CVPR 2022
Out-of-Town Recommendation with Travel Intention Modeling
arXiv 2021
UNIMO: Towards Unified-Modal Understanding and Generation via Cross-Modal Contrastive Learning
ACL 2021
SKEP: Sentiment Knowledge Enhanced Pre-training for Sentiment Analysis
Deep Learning for Case-Based Reasoning through Prototypes: A Neural Network that Explains Its Predictions
arXiv 2017