Lijun Wu

Papers
34

Attribution

Affiliations & profile
Semantic Scholar

Authored papers

34

ChartVerse: Scaling Chart Reasoning via Reliable Programmatic Synthesis from Scratch

arXiv 2026

Scientific Image Synthesis: Benchmarking, Methodologies, and Downstream Utility

arXiv 2026

MMFineReason: Closing the Multimodal Reasoning Gap via Open Data-Centric Methods

arXiv 2026

Tracing the Roots: A Multi-Agent Framework for Uncovering Data Lineage in Post-Training LLMs

arXiv 2026

MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing

arXiv 2025

A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers

arXiv 2025

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

arXiv 2025

NatureLM: Deciphering the Language of Nature for Scientific Discovery

arXiv 2025

CipherBank: Exploring the Boundary of LLM Reasoning Capabilities through Cryptography Challenges

arXiv 2025

Efficient Reasoning for LLMs through Speculative Chain-of-Thought

arXiv 2025

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

arXiv 2025

OpenDataArena: A Fair and Open Arena for Benchmarking Post-Training Dataset Value

arXiv 2025

MathFusion: Enhancing Mathematic Problem-solving of LLM through Instruction Fusion

arXiv 2025

A Strategic Coordination Framework of Small LLMs Matches Large LLMs in Data Synthesis

arXiv 2025

GGBench: A Geometric Generative Reasoning Benchmark for Unified Multimodal Models

arXiv 2025

PM4Bench: A Parallel Multilingual Multi-Modal Multi-task Benchmark for Large Vision Language Model

arXiv 2025

MetaLadder: Ascending Mathematical Solution Quality via Analogical-Problem Reasoning Transfer

arXiv 2025

Envision: Benchmarking Unified Understanding & Generation for Causal World Process Insights

arXiv 2025

Sequential Diffusion Language Models

arXiv 2025

Revisiting Long-context Modeling from Context Denoising Perspective

arXiv 2025

ScaleDiff: Scaling Difficult Problems for Advanced Mathematical Reasoning

arXiv 2025

Unleashing the Reasoning Potential of Pre-trained LLMs by Critique Fine-Tuning on One Problem

arXiv 2025

REST: Stress Testing Large Reasoning Models by Asking Multiple Problems at Once

arXiv 2025

Can One Domain Help Others? A Data-Centric Study on Multi-Domain Reasoning via Reinforcement Learning

arXiv 2025

Scaling Code-Assisted Chain-of-Thoughts and Instructions for Model Reasoning

arXiv 2025

LEMMA: Learning from Errors for MatheMatical Advancement in LLMs

arXiv 2025

GTR-CoT: Graph Traversal as Visual Chain of Thought for Molecular Structure Recognition

arXiv 2025

3D-MolT5: Leveraging Discrete Structural Information for Molecule-Text Modeling

arXiv 2024

Leveraging Biomolecule and Natural Language through Multi-Modal Learning: A Survey

arXiv 2024

Multi-Agent Collaborative Data Selection for Efficient LLM Pretraining

arXiv 2024

FABind: Fast and Accurate Protein-Ligand Binding

NeurIPS 2023 11

SSM-DTA: Breaking the Barriers of Data Scarcity in Drug-Target Affinity Prediction

arXiv 2022

Improving Temporal Generalization of Pre-trained Language Models with Lexical Semantic Change

arXiv 2022

A Survey on Non-Autoregressive Generation for Neural Machine Translation and Beyond

arXiv 2022

Affiliations

No known affiliations.

Frequent co-authors

10

from 34 papers