Peng Wang

Papers
47

Authored papers

47

AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security

arXiv 2026

2026

CodePercept: Code-Grounded Visual STEM Perception for MLLMs

arXiv 2026

2026

From Narrow to Panoramic Vision: Attention-Guided Cold-Start Reshapes Multimodal Reasoning

arXiv 2026

2026

HLE-Verified: A Systematic Verification and Structured Revision of Humanity's Last Exam

arXiv 2026

2026

Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability

arXiv 2026

2026

Qwen-Image Technical Report

arXiv 2025

2025

Qwen3-Omni Technical Report

arXiv 2025

2025

Qwen2.5-VL Technical Report

arXiv 2025

2025

Qwen3 Technical Report

preprint

2025

Qwen3-VL Technical Report

arXiv 2025

2025

MiMo: Unlocking the Reasoning Potential of Language Model -- From Pretraining to Posttraining

arXiv 2025

2025

Towards Reasoning Era: A Survey of Long Chain-of-Thought for Reasoning Large Language Models

arXiv 2025

2025

MiMo-VL Technical Report

arXiv 2025

2025

A Multi-Dimensional Constraint Framework for Evaluating and Improving Instruction Following in Large Language Models

arXiv 2025

2025

ByteMorph: Benchmarking Instruction-Guided Image Editing with Non-Rigid Motions

arXiv 2025

2025

GP-GS: Gaussian Processes for Enhanced Gaussian Splatting

arXiv 2025

2025

Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution

arXiv 2024

2024

Qwen2 Technical Report

arXiv 2024

2024

Building a Family of Data Augmentation Models for Low-cost LLM Fine-tuning on the Cloud

arXiv 2024

2024

Pruning All-Rounder: Rethinking and Improving Inference Efficiency for Large Vision Language Models

arXiv 2024

2024

Diffusion Models Learn Low-Dimensional Distributions via Subspace Clustering

arXiv 2024

2024

LiveIdeaBench: Evaluating LLMs' Scientific Creativity and Idea Generation with Minimal Context

arXiv 2024

2024

TL-Training: A Task-Feature-Based Framework for Training Large Language Models in Tool Use

arXiv 2024

2024

VSFormer: Mining Correlations in Flexible View Set for Multi-view 3D Shape Understanding

arXiv 2024

2024

A Plug-and-Play Method for Rare Human-Object Interactions Detection by Bridging Domain Gap

arXiv 2024

2024

Diffusion Models as Optimizers for Efficient Planning in Offline RL

arXiv 2024

2024

BAD-Gaussians: Bundle Adjusted Deblur Gaussian Splatting

arXiv 2024

2024

GraphTranslator: Aligning Graph Model to Large Language Model for Open-ended Tasks

arXiv 2024

2024

HQ-Edit: A High-Quality Dataset for Instruction-based Image Editing

arXiv 2024

2024

Autoregressive Pretraining with Mamba in Vision

arXiv 2024

2024

Qwen Technical Report

arXiv 2023

2023

Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond

arXiv 2023

2023

AerialVLN: Vision-and-Language Navigation for UAVs

ICCV 2023 1

2023

F²-NeRF: Fast Neural Radiance Field Training with Free Camera Trajectories

arXiv 2023

2023

MVDream: Multi-view Diffusion for 3D Generation

arXiv 2023

2023

MVDiffusion: Enabling Holistic Multi-view Image Generation with Correspondence-Aware Diffusion

2023

AirBirds: A Large-scale Challenging Dataset for Bird Strike Prevention in Real-world Airports

arXiv 2023

2023

PERF: Panoramic Neural Radiance Field from a Single Panorama

arXiv 2023

2023

TouchStone: Evaluating Vision-Language Models by Language Models

arXiv 2023

2023

ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities

arXiv 2023

2023

Multi-Granularity Prediction for Scene Text Recognition

arXiv 2022

2022

Transferring General Multimodal Pretrained Models to Text Recognition

arXiv 2022

2022

BAD-NeRF: Bundle Adjusted Deblur Neural Radiance Fields

CVPR 2023 1

2022

Dual Modality Prompt Tuning for Vision-Language Pre-Trained Model

arXiv 2022

2022

NeuS: Learning Neural Implicit Surfaces by Volume Rendering for Multi-view Reconstruction

NeurIPS 2021 12

2021

StyleNeRF: A Style-based 3D-Aware Generator for High-resolution Image Synthesis

2021

WikiAsp: A Dataset for Multi-domain Aspect-based Summarization

arXiv 2020

2020

Affiliations

No known affiliations.

Frequent co-authors

10

from 47 papers