DaCheng Tao

Papers
102

Authored papers

102

PixVerve: Advancing Native UHR Image Generation to 100MP with a Large-Scale High-Quality Dataset

arXiv 2026

2026

Understanding and Enforcing Weight Disentanglement in Task Arithmetic

arXiv 2026

2026

UniX: Unifying Autoregression and Diffusion for Chest X-Ray Understanding and Generation

arXiv 2026

2026

GlyphPrinter: Region-Grouped Direct Preference Optimization for Glyph-Accurate Visual Text Rendering

arXiv 2026

2026

VTC-R1: Vision-Text Compression for Efficient Long-Context Reasoning

arXiv 2026

2026

Language-based Trial and Error Falls Behind in the Era of Experience

arXiv 2026

2026

FORGE: Fine-grained Multimodal Evaluation for Manufacturing Scenarios

arXiv 2026

2026

Large Language Model Agent: A Survey on Methodology, Applications and Challenges

arXiv 2025

2025

VisuoThink: Empowering LVLM Reasoning with Multimodal Tree Search

arXiv 2025

2025

Retrieval-Augmented Perception: High-Resolution Image Perception Meets Visual RAG

arXiv 2025

2025

R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization

arXiv 2025

2025

Low-Precision Training of Large Language Models: Methods, Challenges, and Opportunities

arXiv 2025

2025

R1-ShareVL: Incentivizing Reasoning Capability of Multimodal Large Language Models via Share-GRPO

arXiv 2025

2025

SeRL: Self-Play Reinforcement Learning for Large Language Models with Limited Data

arXiv 2025

2025

Unifying Multimodal Large Language Model Capabilities and Modalities via Model Merging

arXiv 2025

2025

R1-Compress: Long Chain-of-Thought Compression via Chunk Compression and Search

arXiv 2025

2025

Vision-Language-Action Models for Autonomous Driving: Past, Present, and Future

arXiv 2025

2025

Ada-R1: Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization

arXiv 2025

2025

MAPO: Mixed Advantage Policy Optimization

arXiv 2025

2025

Reasoning with Reinforced Functional Token Tuning

arXiv 2025

2025

Merging Models on the Fly Without Retraining: A Sequential Approach to Scalable Continual Model Merging

arXiv 2025

2025

JustLogic: A Comprehensive Benchmark for Evaluating Deductive Reasoning in Large Language Models

arXiv 2025

2025

VeriGUI: Verifiable Long-Chain GUI Dataset

arXiv 2025

2025

A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment

arXiv 2025

2025

GeometryZero: Improving Geometry Solving for LLM with Group Contrastive Policy Optimization

arXiv 2025

2025

Consistent Paths Lead to Truth: Self-Rewarding Reinforcement Learning for LLM Reasoning

arXiv 2025

2025

Improving large language models with concept-aware fine-tuning

arXiv 2025

2025

Safety at Scale: A Comprehensive Survey of Large Model Safety

arXiv 2025

2025

A Survey of Safety on Large Vision-Language Models: Attacks, Defenses and Evaluations

arXiv 2025

2025

Towards Understanding the Safety Boundaries of DeepSeek Models: Evaluation and Findings

arXiv 2025

2025

Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities

arXiv 2024

2024

Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search

arXiv 2024

2024

Divide, Conquer and Combine: A Training-Free Framework for High-Resolution Image Perception in Multimodal Large Language Models

arXiv 2024

2024

Communication Learning in Multi-Agent Systems from Graph Modeling Perspective

arXiv 2024

2024

AsymRnR: Video Diffusion Transformers Acceleration with Asymmetric Reduction and Restoration

arXiv 2024

2024

EMOv2: Pushing 5M Vision Model Frontier

arXiv 2024

2024

Merging Multi-Task Models via Weight-Ensembling Mixture of Experts

arXiv 2024

2024

Representation Surgery for Multi-Task Model Merging

arXiv 2024

2024

OOP: Object-Oriented Programming Evaluation Benchmark for Large Language Models

arXiv 2024

2024

HarmoDT: Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning

arXiv 2024

2024

Intention Analysis Makes LLMs A Good Jailbreak Defender

arXiv 2024

2024

Object Detectors in the Open Environment: Challenges, Solutions, and Outlook

arXiv 2024

2024

A Survey on Knowledge Distillation of Large Language Models

arXiv 2024

2024

HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model

arXiv 2024

2024

Trajectory Consistency Distillation: Improved Latent Consistency Distillation by Semi-Linear Consistency Function with Trajectory Mapping

arXiv 2024

2024

Diffusion Model-Based Video Editing: A Survey

arXiv 2024

2024

Aligning Large Language Models from Self-Reference AI Feedback with one General Principle

arXiv 2024

2024

Revisiting Knowledge Distillation for Autoregressive Language Models

arXiv 2024

2024

Deep Learning for Camera Calibration and Beyond: A Survey

arXiv 2023

2023

AdaMerging: Adaptive Model Merging for Multi-Task Learning

arXiv 2023

2023

Upcycling Models under Domain and Category Shift

CVPR 2023 1

2023

Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning

ICCV 2023 1

2023

FedSpeed: Larger Local Interval, Less Communication Round, and Higher Generalization Accuracy

arXiv 2023

2023

Merging Experts into One: Improving Computational Efficiency of Mixture of Experts

arXiv 2023

2023

Decentralized SGD and Average-direction SAM are Asymptotically Equivalent

arXiv 2023

2023

Concrete Subspace Learning based Interference Elimination for Multi-task Model Fusion

arXiv 2023

2023

Revisiting Plasticity in Visual Reinforcement Learning: Data, Modules and Training Stages

arXiv 2023

2023

ConDaFormer: Disassembled Transformer with Local Structure Enhancement for 3D Point Cloud Understanding

2023

Good Questions Help Zero-Shot Image Reasoning

arXiv 2023

2023

Learning to Learn from APIs: Black-Box Data-Free Meta-Learning

arXiv 2023

2023

TriDet: Temporal Action Detection with Relative Boundary Modeling

CVPR 2023 1

2023

VanillaNet: the Power of Minimalism in Deep Learning

2023

HandRefiner: Refining Malformed Hands in Generated Images by Diffusion-based Conditional Inpainting

arXiv 2023

2023

One for All: Towards Training One Graph Model for All Classification Tasks

arXiv 2023

2023

Vision Transformer with Quadrangle Attention

arXiv 2023

2023

PNT-Edge: Towards Robust Edge Detection with Noisy Labels by Learning Pixel-level Noise Transitions

arXiv 2023

2023

Structured Cooperative Learning with Graphical Model Priors

arXiv 2023

2023

Centroid-centered Modeling for Efficient Vision Transformer Pre-training

arXiv 2023

2023

Revisiting Token Dropping Strategy in Efficient BERT Pretraining

arXiv 2023

2023

Unifying Flow, Stereo and Depth Estimation

arXiv 2022

2022

Unified Discrete Diffusion for Simultaneous Vision-Language Generation

arXiv 2022

2022

Fine-tuning Global Model via Data-Free Knowledge Distillation for Non-IID Federated Learning

CVPR 2022 1

2022

Knowledge Graph Augmented Network Towards Multiview Representation Learning for Aspect-based Sentiment Analysis

arXiv 2022

2022

Where Does the Performance Improvement Come From? -- A Reproducibility Concern about Image-Text Retrieval

arXiv 2022

2022

Vega-MT: The JD Explore Academy Translation System for WMT22

arXiv 2022

2022

CLAMP: Prompt-based Contrastive Learning for Connecting Language and Animal Pose

CVPR 2023 1

2022

ReAct: Temporal Action Detection with Relational Queries

arXiv 2022

2022

Improving Sharpness-Aware Minimization with Fisher Mask for Better Generalization on Language Models

arXiv 2022

2022

A Contrastive Cross-Channel Data Augmentation Framework for Aspect-based Sentiment Analysis

COLING 2022 10

2022

SparseAdapter: An Easy Approach for Improving the Parameter-Efficiency of Adapters

arXiv 2022

2022

PANDA: Prompt Transfer Meets Knowledge Distillation for Efficient Model Adaptation

arXiv 2022

2022

Improving Simultaneous Machine Translation with Monolingual Data

arXiv 2022

2022

Knowledge-Aware Federated Active Learning with Non-IID Data

ICCV 2023 1

2022

On the Complementarity between Pre-Training and Random-Initialization for Resource-Rich Machine Translation

COLING 2022 10

2022

Advancing Plain Vision Transformer Towards Remote Sensing Foundation Model

arXiv 2022

2022

ViTPose++: Vision Transformer for Generic Body Pose Estimation

arXiv 2022

2022

Generating Holistic 3D Human Motion from Speech

CVPR 2023 1

2022

DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting

CVPR 2023 1

2022

BatchFormer: Learning to Explore Sample Relationships for Robust Representation Learning

CVPR 2022 1

2022

VSA: Learning Varied-Size Window Attention in Vision Transformers

arXiv 2022

2022

Diff-Font: Diffusion Model for Robust One-Shot Font Generation

arXiv 2022

2022

Fashionformer: A Simple, Effective and Unified Baseline for Human Fashion Segmentation and Recognition

arXiv 2022

2022

Bridging Cross-Lingual Gaps During Leveraging the Multilingual Sequence-to-Sequence Pretraining for Text Generation and Understanding

arXiv 2022

2022

E2S2: Encoding-Enhanced Sequence-to-Sequence Pretraining for Language Understanding and Generation

arXiv 2022

2022

TASA: Deceiving Question Answering Models by Twin Answer Sentences Attack

arXiv 2022

2022

GMFlow: Learning Optical Flow via Global Matching

CVPR 2022 1

2021

One-Shot Object Affordance Detection in the Wild

arXiv 2021

2021

CPP-Net: Context-aware Polygon Proposal Network for Nucleus Segmentation

arXiv 2021

2021

Rejuvenating Low-Frequency Words: Making the Most of Parallel Data in Non-Autoregressive Translation

ACL 2021 5

2021

Neural networks behave as hash encoders: An empirical study

2021

SnapMix: Semantically Proportional Mixing for Augmenting Fine-grained Data

arXiv 2020

2020

ActivityNet-QA: A Dataset for Understanding Complex Web Videos via Question Answering

arXiv 2019

2019

Affiliations

No known affiliations.

Frequent co-authors

10

from 102 papers