Ser-Nam Lim
- Papers: 20
Authored papers
When Semantics Mislead Vision: Mitigating Large Multimodal Models Hallucinations in Scene Text Spotting and Understanding
arXiv 2025
Niagara: Normal-Integrated Geometric Affine Field for Scene Reconstruction from a Single View
arXiv 2025
TalkVid: A Large-Scale Diversified Dataset for Audio-Driven Talking Head Synthesis
arXiv 2025
Model Reveals What to Cache: Profiling-Based Feature Reuse for Video Diffusion Models
ICCV 2025
LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization
arXiv 2025
Delta Activations: A Representation for Finetuned Large Language Models
arXiv 2025
VideoGen-of-Thought: A Collaborative Framework for Multi-Shot Video Generation
arXiv 2024
MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding
CVPR 2024
Next Patch Prediction for Autoregressive Visual Generation
arXiv 2024
HNeRV: A Hybrid Neural Representation for Videos
CVPR 2023
Object Recognition as Next Token Prediction
CVPR 2024
Graph Inductive Biases in Transformers without Message Passing
arXiv 2023
Rapid Adaptation in Online Continual Learning: Are We Evaluating It Right?
ICCV 2023
HorNet: Efficient High-Order Spatial Interactions with Recursive Gated Convolutions
arXiv 2022
Visual Prompt Tuning
arXiv 2022
RegMixup: Mixup as a Regularizer Can Surprisingly Improve Accuracy and Out Distribution Robustness
arXiv 2022
$BT^2$: Backward-compatible Training with Basis Transformation
arXiv 2022
M2TR: Multi-modal Multi-scale Transformers for Deepfake Detection
arXiv 2021
Rethinking Nearest Neighbors for Visual Classification
arXiv 2021
On Feature Normalization and Data Augmentation
CVPR 2021
Frequent co-authors
10 from 20 papers