Boyuan Sun

Papers: 10

Attribution

Affiliations & profile: Semantic Scholar

Authored papers

See What I Mean: Aligning Vision and Language Representations for Video Fine-grained Object Understanding

arXiv 2026

2026

GeoAgent: Learning to Geolocate Everywhere with Reinforced Geographic Characteristics

arXiv 2026

2026

HumanOmni: A Large Vision-Speech Language Model for Human-Centric Video Understanding

arXiv 2025

2025

HumanOmniV2: From Understanding to Omni-Modal Reasoning with Context

arXiv 2025

2025

Depth Anything at Any Condition

arXiv 2025

2025

LLaVA-Scissor: Token Compression with Semantic Connected Components for Video LLMs

arXiv 2025

2025

LLaVA-Octopus: Unlocking Instruction-Driven Adaptive Projector Fusion for Video Understanding

arXiv 2025

2025

Facial Dynamics in Video: Instruction Tuning for Improved Facial Expression Perception and Contextual Awareness

arXiv 2025

2025

Towards RAW Object Detection in Diverse Conditions

CVPR 2025 1

2024

CorrMatch: Label Propagation via Correlation Matching for Semi-Supervised Semantic Segmentation

CVPR 2024 1

2023

Affiliations

No known affiliations.

Frequent co-authors

from 10 papers

Qibin Hou

Xihan Wei

Jiaxing Zhao

Bowen Yin

Xiang Chen

Detao Bai

Ming-Ming Cheng

Modi Jin

Qize Yang

Shenghao Fu