Junke Wang

Papers: 9

Attribution

Affiliations & profile: Semantic Scholar

Authored papers

VideoLoom: A Video Large Language Model for Joint Spatial-Temporal Understanding

arXiv 2026

Perception Encoder: The best visual embeddings are not at the output of the network

arXiv 2025

SimpleAR: Pushing the Frontier of Autoregressive Visual Generation through Pretraining, SFT, and RL

arXiv 2025

Pix2Cap-COCO: Advancing Visual Comprehension via Pixel-Level Captioning

arXiv 2025

OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation

arXiv 2024

MouSi: Poly-Visual-Expert Vision-Language Models

arXiv 2024

OmniVid: A Generative Framework for Universal Video Understanding

CVPR 2024

To See is to Believe: Prompting GPT-4V for Better Visual Instruction Tuning

arXiv 2023

M2TR: Multi-modal Multi-scale Transformers for Deepfake Detection

arXiv 2021

Affiliations

No known affiliations.

Frequent co-authors

from 9 papers

Zuxuan Wu

Yu-Gang Jiang

Bo He

Zuyao You

Andrea Madotto

Binyue Peng

Boyang Hong

Caishuang Huang

Changhao Jiang

Chen Wei