Hao Liu

Papers
34

Authored papers

34

Multimodal OCR: Parse Anything from Documents

arXiv 2026

2026

Agent-Omit: Training Efficient LLM Agents for Adaptive Thought and Observation Omission via Agentic Reinforcement Learning

arXiv 2026

2026

NTIRE 2026 Challenge on Video Saliency Prediction: Methods and Results

arXiv 2026

2026

Closing the Loop: Universal Repository Representation with RPG-Encoder

arXiv 2026

2026

dots.ocr: Multilingual Document Layout Parsing in a Single Vision-Language Model

arXiv 2025

2025

Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting

arXiv 2025

2025

MM-Agent: LLM as Agents for Real-world Mathematical Modeling Problem

arXiv 2025

2025

SongEval: A Benchmark Dataset for Song Aesthetics Evaluation

arXiv 2025

2025

A Token is Worth over 1,000 Tokens: Efficient Knowledge Distillation through Low-Rank Clone

arXiv 2025

2025

Not All Thoughts are Generated Equal: Efficient LLM Reasoning via Multi-Turn Reinforcement Learning

arXiv 2025

2025

Stand-In: A Lightweight and Plug-and-Play Identity Control for Video Generation

arXiv 2025

2025

Bag of Tricks for Inference-time Computation of LLM Reasoning

arXiv 2025

2025

Can Understanding and Generation Truly Benefit Together -- or Just Coexist?

arXiv 2025

2025

World Model on Million-Length Video And Language With Blockwise RingAttention

arXiv 2024

2024

TrustLLM: Trustworthiness in Large Language Models

arXiv 2024

2024

JAILJUDGE: A Comprehensive Jailbreak Judge Benchmark with Multi-Agent Enhanced Explanation Evaluation Framework

arXiv 2024

2024

OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning

arXiv 2024

2024

MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering

arXiv 2024

2024

TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy

arXiv 2024

2024

Reflections from the 2024 Large Language Model (LLM) Hackathon for Applications in Materials Science and Chemistry

arXiv 2024

2024

Harmonizing Visual Text Comprehension and Generation

arXiv 2024

2024

Margin Matching Preference Optimization: Enhanced Model Alignment with Granular Feedback

arXiv 2024

2024

Blockwise Parallel Transformer for Large Context Models

arXiv 2023

2023

Ring Attention with Blockwise Transformers for Near-Infinite Context

arXiv 2023

2023

LLMLight: Large Language Models as Traffic Signal Control Agents

arXiv 2023

2023

One for All: Towards Training One Graph Model for All Classification Tasks

arXiv 2023

2023

Chain of Hindsight Aligns Language Models with Feedback

arXiv 2023

2023

UUKG: Unified Urban Knowledge Graph Dataset for Urban Spatiotemporal Prediction

2023

Masked Autoencoding for Scalable and Generalizable Decision Making

arXiv 2022

2022

Knowledge Mining with Scene Text for Fine-Grained Recognition

CVPR 2022

2022

Out-of-Town Recommendation with Travel Intention Modeling

arXiv 2021

2021

UNIMO: Towards Unified-Modal Understanding and Generation via Cross-Modal Contrastive Learning

ACL 2021

2020

SKEP: Sentiment Knowledge Enhanced Pre-training for Sentiment Analysis

2020

Deep Learning for Case-Based Reasoning through Prototypes: A Neural Network that Explains Its Predictions

arXiv 2017

2017

Affiliations

No known affiliations.

Frequent co-authors

10

from 34 papers