Lei LI

Papers
79

Authored papers

79

GLM-5: from Vibe Coding to Agentic Engineering

arXiv 2026

2026

MiMo-V2-Flash Technical Report

arXiv 2026

2026

Video2GUI: Synthesizing Large-Scale Interaction Trajectories for Generalized GUI Agent Pretraining

arXiv 2026

2026

TimeChat-Captioner: Scripting Multi-Scene Videos with Time-Aware and Structural Audio-Visual Captions

arXiv 2026

2026

Claw-Eval: Toward Trustworthy Evaluation of Autonomous Agents

arXiv 2026

2026

PatchAlign3D: Local Feature Alignment for Dense 3D Shape understanding

arXiv 2026

2026

Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-World Workflows

arXiv 2026

2026

Mending the Holes: Mitigating Reward Hacking in Reinforcement Learning for Multilingual Translation

arXiv 2026

2026

Efficient Document Parsing via Parallel Token Prediction

arXiv 2026

2026

MiMo: Unlocking the Reasoning Potential of Language Model -- From Pretraining to Posttraining

arXiv 2025

2025

MiMo-VL Technical Report

arXiv 2025

2025

Seed1.5-VL Technical Report

arXiv 2025

2025

R2MED: A Benchmark for Reasoning-Driven Medical Retrieval

arXiv 2025

2025

Medal S: Spatio-Textual Prompt Model for Medical Segmentation

arXiv 2025

2025

JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence

arXiv 2025

2025

GroundingME: Exposing the Visual Grounding Gap in MLLMs through Multi-Dimensional Evaluation

arXiv 2025

2025

Could Thinking Multilingually Empower LLM Reasoning?

arXiv 2025

2025

DIS-CO: Discovering Copyrighted Content in VLMs Training Data

arXiv 2025

2025

RARE: Retrieval-Aware Robustness Evaluation for Retrieval-Augmented Generation Systems

arXiv 2025

2025

TEMPLE: Temporal Preference Learning of Video LLMs via Difficulty Scheduling and Pre-SFT Alignment

arXiv 2025

2025

TimeChat-Online: 80% Visual Tokens are Naturally Redundant in Streaming Videos

arXiv 2025

2025

InfiniSST: Simultaneous Translation of Unbounded Speech with Large Language Model

arXiv 2025

2025

BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language Models

arXiv 2025

2025

EasyInstruct: An Easy-to-use Instruction Processing Framework for Large Language Models

arXiv 2024

2024

Revealing the Barriers of Language Agents in Planning

arXiv 2024

2024

GIRAFFE: Design Choices for Extending the Context Length of Visual Language Models

arXiv 2024

2024

Scaling LLM Inference with Optimized Sample Compute Allocation

arXiv 2024

2024

Ada-Retrieval: An Adaptive Multi-Round Retrieval Paradigm for Sequential Recommendations

arXiv 2024

2024

DE-COP: Detecting Copyrighted Content in Language Models Training Data

arXiv 2024

2024

Temporal Reasoning Transfer from Text to Video

arXiv 2024

2024

TMGBench: A Systematic Game Benchmark for Evaluating Strategic Reasoning Abilities of LLMs

arXiv 2024

2024

A Practical Examination of AI-Generated Text Detectors for Large Language Models

arXiv 2024

2024

BPO: Staying Close to the Behavior LLM Creates Better Online LLM Alignment

arXiv 2024

2024

AutoMIR: Effective Zero-Shot Medical Information Retrieval without Relevance Labels

arXiv 2024

2024

Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey

arXiv 2024

2024

Vibe-Eval: A hard evaluation suite for measuring progress of multimodal language models

arXiv 2024

2024

TempCompass: Do Video LLMs Really Understand Videos?

arXiv 2024

2024

Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models

arXiv 2024

2024

LumberChunker: Long-Form Narrative Document Segmentation

arXiv 2024

2024

Weak-to-Strong Jailbreaking on Large Language Models

arXiv 2024

2024

CAS-ViT: Convolutional Additive Self-attention Vision Transformers for Efficient Mobile Applications

arXiv 2024

2024

LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages

arXiv 2024

2024

MindMerger: Efficient Boosting LLM Reasoning in non-English Languages

arXiv 2024

2024

Jailbreaking as a Reward Misspecification Problem

arXiv 2024

2024

ImgTrojan: Jailbreaking Vision-Language Models with ONE Image

arXiv 2024

2024

Protecting Language Generation Models via Invisible Watermarking

arXiv 2023

2023

ImageNetVC: Zero- and Few-Shot Visual Commonsense Evaluation on 1000 ImageNet Categories

arXiv 2023

2023

How Vocabulary Sharing Facilitates Multilingualism in LLaMA?

arXiv 2023

2023

Large Language Models are not Fair Evaluators

arXiv 2023

2023

Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning

arXiv 2023

2023

INSTRUCTSCORE: Explainable Text Generation Evaluation with Finegrained Feedback

arXiv 2023

2023

Tool-Augmented Reward Modeling

arXiv 2023

2023

Can We Edit Factual Knowledge by In-Context Learning?

arXiv 2023

2023

Back to Optimization: Diffusion-based Zero-Shot 3D Human Pose Estimation

arXiv 2023

2023

Provable Robust Watermarking for AI-Generated Text

arXiv 2023

2023

Multilingual Machine Translation with Large Language Models: Empirical Results and Analysis

arXiv 2023

2023

Don't Fine-Tune, Decode: Syntax Error-Free Tool Use via Constrained Decoding

arXiv 2023

2023

Extrapolating Large Language Models to Non-English by Aligning Languages

arXiv 2023

2023

Can Language Models Understand Physical Concepts?

arXiv 2023

2023

VITATECS: A Diagnostic Dataset for Temporal Concept Understanding of Video-Language Models

arXiv 2023

2023

ReDi: Efficient Learning-Free Diffusion Inference via Trajectory Retrieval

arXiv 2023

2023

A Survey on In-context Learning

arXiv 2022

2022

Not All Errors are Equal: Learning Text Generation Metrics using Stratified Error Synthesis

arXiv 2022

2022

Lego-MT: Learning Detachable Models for Massively Multilingual Machine Translation

arXiv 2022

2022

Compressing Sentence Representation for Semantic Retrieval via Homomorphic Projective Distillation

Findings (ACL) 2022 5

2022

Multimodal Analogical Reasoning over Knowledge Graphs

arXiv 2022

2022

Cross-modal Contrastive Learning for Speech Translation

NAACL 2022 7

2022

$\textit{latent}$-GLAT: Glancing at Latent Variables for Parallel Text Generation

arXiv 2022

2022

Calibrating Factual Knowledge in Pretrained Language Models

arXiv 2022

2022

Delving into the Openness of CLIP

arXiv 2022

2022

LightNER: A Lightweight Tuning Paradigm for Low-resource NER via Pluggable Prompting

COLING 2022 10

2021

Contrastive Learning for Many-to-many Multilingual Neural Machine Translation

ACL 2021 5

2021

Well-classified Examples are Underestimated in Classification with Deep Neural Networks

arXiv 2021

2021

Be Careful about Poisoned Word Embeddings: Exploring the Vulnerability of the Embedding Layers in NLP Models

NAACL 2021 4

2021

UniKeyphrase: A Unified Extraction and Generation Framework for Keyphrase Prediction

Findings (ACL) 2021 8

2021

Sparse R-CNN: End-to-End Object Detection with Learnable Proposals

CVPR 2021 1

2020

Glancing Transformer for Non-Autoregressive Neural Machine Translation

ACL 2021 5

2020

LOREN: Logic-Regularized Reasoning for Interpretable Fact Verification

arXiv 2020

2020

In Conclusion Not Repetition: Comprehensive Abstractive Summarization With Diversified Attention Based On Determinantal Point Processes

2019

Affiliations

No known affiliations.

Frequent co-authors

10

from 79 papers