Bo Du
- Papers: 38
Authored papers (38)
UniX: Unifying Autoregression and Diffusion for Chest X-Ray Understanding and Generation
arXiv 2026
Retrieval-Augmented Perception: High-Resolution Image Perception Meets Visual RAG
arXiv 2025
Depth Any Panoramas: A Foundation Model for Panoramic Depth Estimation
arXiv 2025
MIFNet: Learning Modality-Invariant Features for Generalizable Multimodal Image Matching
arXiv 2025
AirSim360: A Panoramic Simulation Platform within Drone View
arXiv 2025
DiT360: High-Fidelity Panoramic Image Generation via Hybrid Training
arXiv 2025
A Survey of Safety on Large Vision-Language Models: Attacks, Defenses and Evaluations
arXiv 2025
Keeping Yourself is Important in Downstream Tuning Multimodal Large Language Model
arXiv 2025
MAPO: Mixed Advantage Policy Optimization
arXiv 2025
Merging Models on the Fly Without Retraining: A Sequential Approach to Scalable Continual Model Merging
arXiv 2025
SafeGRPO: Self-Rewarded Multimodal Safety Alignment via Rule-Governed Policy Optimization
arXiv 2025
AnesBench: Multi-Dimensional Evaluation of LLM Reasoning in Anesthesiology
arXiv 2025
Backdoor Cleaning without External Guidance in MLLM Fine-tuning
arXiv 2025
HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model
arXiv 2024
What If the Input is Expanded in OOD Detection?
arXiv 2024
Bypass Back-propagation: Optimization-based Structural Pruning for Large Language Models via Policy Gradient
arXiv 2024
Revisiting Knowledge Distillation for Autoregressive Language Models
arXiv 2024
Multi-modal Auto-regressive Modeling via Visual Words
arXiv 2024
OOP: Object-Oriented Programming Evaluation Benchmark for Large Language Models
arXiv 2024
Online GNN Evaluation Under Test-time Graph Distribution Shifts
arXiv 2024
Achieving >97% on GSM8K: Deeply Understanding the Problems Makes LLMs Better Solvers for Math Word Problems
arXiv 2024
IML-ViT: Benchmarking Image Manipulation Localization by Vision Transformer
arXiv 2023
Concrete Subspace Learning based Interference Elimination for Multi-task Model Fusion
arXiv 2023
Enhancing Visually-Rich Document Understanding via Layout Structure Modeling
arXiv 2023
XAI for In-hospital Mortality Prediction via Multimodal ICU Data
arXiv 2023
PNT-Edge: Towards Robust Edge Detection with Noisy Labels by Learning Pixel-level Noise Transitions
arXiv 2023
FSUIE: A Novel Fuzzy Span Mechanism for Universal Information Extraction
arXiv 2023
Centroid-centered Modeling for Efficient Vision Transformer Pre-training
arXiv 2023
Revisiting Token Dropping Strategy in Efficient BERT Pretraining
arXiv 2023
Token Contrast for Weakly-Supervised Semantic Segmentation
CVPR 2023
Advancing Plain Vision Transformer Towards Remote Sensing Foundation Model
arXiv 2022
DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting
CVPR 2023
E2S2: Encoding-Enhanced Sequence-to-Sequence Pretraining for Language Understanding and Generation
arXiv 2022
Diff-Font: Diffusion Model for Robust One-Shot Font Generation
arXiv 2022
Knowledge Graph Augmented Network Towards Multiview Representation Learning for Aspect-based Sentiment Analysis
arXiv 2022
Improving Sharpness-Aware Minimization with Fisher Mask for Better Generalization on Language Models
arXiv 2022
Robust Weight Perturbation for Adversarial Training
PANDA: Prompt Transfer Meets Knowledge Distillation for Efficient Model Adaptation
arXiv 2022
Frequent co-authors
10 from 38 papers