Liqiang Nie
- Papers
- 30
Authored papers (30)
PersonalAlign: Hierarchical Implicit Intent Alignment for Personalized GUI Agent with Long-Term User-Centric Records
arXiv 2026
Omni-R1: Towards the Unified Generative Paradigm for Multimodal Reasoning
arXiv 2026
AR-Omni: A Unified Autoregressive Model for Any-to-Any Generation
arXiv 2026
NTIRE 2026 Challenge on Video Saliency Prediction: Methods and Results
arXiv 2026
A Comprehensive Survey on Composed Image Retrieval
arXiv 2025
$\text{R}^2\text{ec}$: Towards Large Recommender Models with Reasoning
arXiv 2025
Optimus-3: Towards Generalist Multimodal Minecraft Agents with Scalable Task Experts
arXiv 2025
Curriculum Coarse-to-Fine Selection for High-IPC Dataset Distillation
CVPR 2025 1
FALCON: Resolving Visual Redundancy and Fragmentation in High-resolution Multimodal Large Language Models via Visual Registers
ICCV 2025
LION-FS: Fast & Slow Video-Language Thinker as Online Video Assistant
CVPR 2025 1
CogVLA: Cognition-Aligned Vision-Language-Action Model via Instruction-Driven Routing & Sparsification
arXiv 2025
Parallel Test-Time Scaling for Latent Reasoning Models
arXiv 2025
Optimus-2: Multimodal Minecraft Agent with Goal-Observation-Action Conditioned Policy
CVPR 2025 1
MegaSR: Mining Customized Semantics and Expressive Guidance for Image Super-Resolution
arXiv 2025
FineCIR: Explicit Parsing of Fine-Grained Modification Semantics for Composed Image Retrieval
arXiv 2025
Benchmarking Post-Training Quantization in LLMs: Comprehensive Taxonomy, Unified Evaluation, and Comparative Analysis
arXiv 2025
Open Multimodal Retrieval-Augmented Factual Image Generation
arXiv 2025
Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks
arXiv 2024
SPA-Bench: A Comprehensive Benchmark for SmartPhone Agent Evaluation
arXiv 2024
CorDA: Context-Oriented Decomposition Adaptation of Large Language Models for Task-Aware Parameter-Efficient Fine-tuning
arXiv 2024
Mamba-FSCIL: Dynamic Adaptation with Selective State Space Model for Few-Shot Class-Incremental Learning
arXiv 2024
Token-level Correlation-guided Compression for Efficient Multimodal Document Understanding
arXiv 2024
Sentiment-enhanced Graph-based Sarcasm Explanation in Dialogue
arXiv 2024
GaussianAvatar: Towards Realistic Human Avatar Modeling from a Single Video via Animatable 3D Gaussians
CVPR 2024 1
GPS-Gaussian: Generalizable Pixel-wise 3D Gaussian Splatting for Real-time Human Novel View Synthesis
CVPR 2024 1
UNK-VQA: A Dataset and a Probe into the Abstention Ability of Multi-modal Large Models
arXiv 2023
Multi-source Semantic Graph-based Multimodal Sarcasm Explanation Generation
arXiv 2023
MERIt: Meta-Path Guided Contrastive Learning for Logical Reasoning
Findings (ACL) 2022 5
Multi-Modal Interaction Graph Convolutional Network for Temporal Language Localization in Videos
arXiv 2021
REPT: Bridging Language Models and Machine Reading Comprehension via Retrieval-Based Pre-training
Findings (ACL) 2021 8
Frequent co-authors
10 from 30 papers