Wei Zhang
- Papers: 42
Authored papers (42)
IQuest-Coder-V1 Technical Report
arXiv 2026
Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters
arXiv 2026
InCoder-32B: Code Foundation Model for Industrial Scenarios
arXiv 2026
Large-Scale Terminal Agentic Trajectory Generation from Dockerized Environments
arXiv 2026
A Self-Evolving Framework for Efficient Terminal Agents via Observational Context Compression
arXiv 2026
UniFinEval: Towards Unified Evaluation of Financial Multimodal Models across Text, Images and Videos
arXiv 2026
L2P: Unlocking Latent Potential for Pixel Generation
arXiv 2026
SEAD: Self-Evolving Agent for Multi-Turn Service Dialogue
arXiv 2026
Reasoning Beyond Language: A Comprehensive Survey on Latent Chain-of-Thought Reasoning
arXiv 2025
NatureLM: Deciphering the Language of Nature for Scientific Discovery
arXiv 2025
CodeCriticBench: A Holistic Code Critique Benchmark for Large Language Models
arXiv 2025
JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation
arXiv 2025
LLaSO: A Foundational Framework for Reproducible Research in Large Language and Speech Model
arXiv 2025
KORGym: A Dynamic Game Platform for LLM Reasoning Evaluation
arXiv 2025
Every Sample Matters: Leveraging Mixture-of-Experts and High-Quality Data for Efficient and Accurate Code LLM
arXiv 2025
UniFit: Towards Universal Virtual Try-on with MLLM-Guided Semantic Alignment
arXiv 2025
Multilingual Multimodal Software Developer for Code Generation
arXiv 2025
ILLUME+: Illuminating Unified MLLM with Dual Visual Tokenization and Diffusion Refinement
arXiv 2025
InSight-o3: Empowering Multimodal Foundation Models with Generalized Visual Search
arXiv 2025
FoldAct: Efficient and Stable Context Folding for Long-Horizon Search Agents
arXiv 2025
AIM 2024 Challenge on Video Saliency Prediction: Methods and Results
arXiv 2024
SVIPTR: Fast and Efficient Scene Text Recognition with Vision Permutable Extractor
arXiv 2024
E2LLM: Encoder Elongated Large Language Models for Long-Context Understanding and Reasoning
arXiv 2024
HI-SLAM2: Geometry-Aware Gaussian SLAM for Fast Monocular Scene Reconstruction
arXiv 2024
Rethinking Remote Sensing Change Detection With A Mask View
arXiv 2024
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions
CVPR 2025
The Accuracy Paradox in RLHF: When Better Reward Models Don't Yield Better Language Models
arXiv 2024
Alignment-Enhanced Decoding: Defending via Token-Level Adaptive Refining of Probability Distributions
arXiv 2024
D2LLM: Decomposed and Distilled Large Language Models for Semantic Search
arXiv 2024
CNNSum: Exploring Long-Context Summarization with Large Language Models in Chinese Novels
arXiv 2024
OpenLane-V2: A Topology Reasoning Benchmark for Unified 3D HD Mapping
Everything of Thoughts: Defying the Law of Penrose Triangle for Thought Generation
arXiv 2023
LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model
arXiv 2023
MRN: Multiplexed Routing Network for Incremental Multilingual Text Recognition
ICCV 2023
PanoVOS: Bridging Non-panoramic and Panoramic Views with Transformer for Video Segmentation
arXiv 2023
E2E-LOAD: End-to-End Long-form Online Action Detection
ICCV 2023
SoccerNet 2023 Challenges Results
arXiv 2023
Examining User-Friendly and Open-Sourced Large GPT Models: A Survey on Language, Multimodal, and Scientific GPT Models
arXiv 2023
SoccerNet 2022 Challenges Results
arXiv 2022
LVOS: A Benchmark for Long-term Video Object Segmentation
ICCV 2023
CodeNet: A Large-Scale AI for Code Dataset for Learning a Diversity of Coding Tasks
arXiv 2021
OMPQ: Orthogonal Mixed Precision Quantization
arXiv 2021
Frequent co-authors
10 from 42 papers