Xue Yang

Authored papers

41

SkillOpt: Executive Strategy for Self-Evolving Agent Skills

arXiv 2026

2026

From Raw Experience to Skill Consumption: A Systematic Study of Model-Generated Agent Skills

arXiv 2026

2026

SpaceDG: Benchmarking Spatial Intelligence under Visual Degradation

arXiv 2026

2026

MM-WebAgent: A Hierarchical Multimodal Web Agent for Webpage Generation

arXiv 2026

2026

Holi-Spatial: Evolving Video Streams into Holistic 3D Spatial Intelligence

arXiv 2026

2026

RISE-Video: Can Video Generators Decode Implicit World Rules?

arXiv 2026

2026

Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding

arXiv 2026

2026

PhotoFlow: Agentic 3D Virtual Photography Missions

arXiv 2026

2026

Modality Gap-Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models

arXiv 2026

2026

BizGenEval: A Systematic Benchmark for Commercial Visual Content Generation

arXiv 2026

2026

InternVL-U: Democratizing Unified Multimodal Models for Understanding, Reasoning, Generation and Editing

arXiv 2026

2026

Trust Your Critic: Robust Reward Modeling and Reinforcement Learning for Faithful Image Editing and Generation

arXiv 2026

2026

GRADE: Benchmarking Discipline-Informed Reasoning in Image Editing

arXiv 2026

2026

Quantum Kernel Advantage over Classical Collapse in Medical Foundation Model Embeddings

arXiv 2026

2026

DreamWorld: Unified World Modeling in Video Generation

arXiv 2026

2026

ComfyUI-R1: Exploring Reasoning Models for Workflow Generation

arXiv 2025

2025

FilmAgent: A Multi-Agent Framework for End-to-End Film Automation in Virtual 3D Spaces

arXiv 2025

2025

Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing

arXiv 2025

2025

Visionary: The World Model Carrier Built on WebGPU-Powered Gaussian Splatting Platform

arXiv 2025

2025

Can Generative Geospatial Diffusion Models Excel as Discriminative Geospatial Foundation Models?

ICCV 2025

2025

MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization

arXiv 2025

2025

Mono-InternVL-1.5: Towards Cheaper and Faster Monolithic Multimodal Large Language Models

arXiv 2025

2025

ProCLIP: Progressive Vision-Language Alignment via LLM-based Embedder

arXiv 2025

2025

Parameter-Inverted Image Pyramid Networks for Visual Perception and Multimodal Understanding

arXiv 2025

2025

A Simple Aerial Detection Baseline of Multimodal Language Models

arXiv 2025

2025

A Simple Aerial Detection Baseline of Multimodal Language Models

arXiv 2025

2025

Earth-Adapter: Bridge the Geospatial Domain Gaps with Mixture of Frequency Adaptation

arXiv 2025

2025

SA-Occ: Satellite-Assisted 3D Occupancy Prediction in Real World

arXiv 2025

2025

When Large Vision-Language Model Meets Large Remote Sensing Imagery: Coarse-to-Fine Text-Guided Token Pruning

ICCV 2025

2025

Co-Training Vision Language Models for Remote Sensing Multi-task Learning

arXiv 2025

2025

A Unified Agentic Framework for Evaluating Conditional Image Generation

arXiv 2025

2025

SpaCE-10: A Comprehensive Benchmark for Multimodal Large Language Models in Compositional Spatial Intelligence

arXiv 2025

2025

Breaking Bad Molecules: Are MLLMs Ready for Structure-Level Molecular Detoxification?

arXiv 2025

2025

Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows

arXiv 2025

2025

STAR: A First-Ever Dataset and A Large-Scale Benchmark for Scene Graph Generation in Large-Size Satellite Imagery

arXiv 2024

2024

GeoGround: A Unified Large Vision-Language Model for Remote Sensing Visual Grounding

arXiv 2024

2024

FLoRA: Low-Rank Core Space for N-dimension

arXiv 2024

2024

Point2RBox: Combine Knowledge from Synthetic Visual Patterns for End-to-end Oriented Object Detection with Single Point Supervision

CVPR 2024

2023

ARS-DETR: Aspect Ratio-Sensitive Detection Transformer for Aerial Oriented Object Detection

arXiv 2023

2023

PointOBB: Learning Oriented Object Detection via Single Point Supervision

CVPR 2024

2023

H2RBox: Horizontal Box Annotation is All You Need for Oriented Object Detection

arXiv 2022

2022

Affiliations

No known affiliations.

Frequent co-authors

10

from 41 papers