Weijia Li

Papers
22

Authored papers

22

GenericAgent: A Token-Efficient Self-Evolving LLM Agent via Contextual Information Density Maximization (V1.0)

arXiv 2026

Mind-Brush: Integrating Agentic Cognitive Search and Reasoning into Image Generation

arXiv 2026

MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing

arXiv 2025

Shifting AI Efficiency From Model-Centric to Data-Centric Compression

arXiv 2025

Stop Looking for Important Tokens in Multimodal Language Models: Duplication Matters More

arXiv 2025

GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation

arXiv 2025

Scene4U: Hierarchical Layered 3D Scene Reconstruction from Single Panoramic Image for Your Immerse Exploration

CVPR 2025 1

RealGen: Photorealistic Text-to-Image Generation via Detector-Guided Rewards

arXiv 2025

MajutsuCity: Language-driven Aesthetic-adaptive City Generation with Controllable 3D Assets and Layouts

arXiv 2025

OmniAID: Decoupling Semantic and Artifacts for Universal AI-Generated Image Detection in the Wild

arXiv 2025

DraCo: Draft as CoT for Text-to-Image Preview and Rare Concept Generation

arXiv 2025

Echo-4o: Harnessing the Power of GPT-4o Synthetic Images for Improved Image Generation

arXiv 2025

The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs

arXiv 2025

Efficient Multi-modal Large Language Models via Progressive Consistency Distillation

arXiv 2025

Spot the Fake: Large Multimodal Model-Based Synthetic Image Detection with Artifact Explanation

arXiv 2025

LEGION: Learning to Ground and Explain for Synthetic Image Detection

ICCV 2025

PM4Bench: A Parallel Multilingual Multi-Modal Multi-task Benchmark for Large Vision Language Model

arXiv 2025

UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios

arXiv 2024

LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models

arXiv 2024

VIGC: Visual Instruction Generation and Correction

arXiv 2023

Parrot Captions Teach CLIP to Spot Text

arXiv 2023

Influence Selection for Active Learning

ICCV 2021 10

Affiliations

No known affiliations.

Frequent co-authors

10

from 22 papers