Nicu Sebe
- Papers: 51
Authored papers (51)
DVD: Deterministic Video Depth Estimation with Generative Priors
arXiv 2026
Panoramic Affordance Prediction
arXiv 2026
Generalizable Knowledge Distillation from Vision Foundation Models for Semantic Segmentation
arXiv 2026
Token Reduction via Local and Global Contexts Optimization for Efficient Video Large Language Models
arXiv 2026
TerraScope: Pixel-Grounded Visual Reasoning for Earth Observation
arXiv 2026
EarthMind: Towards Multi-Granular and Multi-Sensor Earth Observation with Large Multimodal Models
arXiv 2025
A Survey on Efficient Vision-Language-Action Models
arXiv 2025
Retrieval Augmented Generation and Understanding in Vision: A Survey and New Outlook
arXiv 2025
SceneSplat: Gaussian Splatting-based Scene Understanding with Vision-Language Pretraining
ICCV 2025
Reverse Personalization
arXiv 2025
Fully-Geometric Cross-Attention for Point Cloud Registration
arXiv 2025
Loomis Painter: Reconstructing the Painting Process
arXiv 2025
Visual Text Processing: A Comprehensive Review and Unified Evaluation
arXiv 2025
When Semantics Mislead Vision: Mitigating Large Multimodal Models Hallucinations in Scene Text Spotting and Understanding
arXiv 2025
Inverse Virtual Try-On: Generating Multi-Category Product-Style Images from Clothed Individuals
arXiv 2025
Diff9D: Diffusion-Based Domain-Generalized Category-Level 9-DoF Object Pose Estimation
arXiv 2025
VidText: Towards Comprehensive Evaluation for Video Text Understanding
arXiv 2025
Video-BrowseComp: Benchmarking Agentic Video Research on Open Web
arXiv 2025
Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding
CVPR 2025
Multi-focal Conditioned Latent Diffusion for Person Image Synthesis
CVPR 2025
NullFace: Training-Free Localized Face Anonymization
arXiv 2025
Superpowering Open-Vocabulary Object Detectors for X-ray Vision
arXiv 2025
Deep Learning-Based Object Pose Estimation: A Comprehensive Survey
arXiv 2024
Bilateral Reference for High-Resolution Dichotomous Image Segmentation
arXiv 2024
Face Anonymization Made Simple
arXiv 2024
Democratizing Fine-grained Visual Recognition with Large Language Models
arXiv 2024
Bringing Masked Autoencoders Explicit Contrastive Properties for Point Cloud Self-Supervised Learning
arXiv 2024
Curriculum Direct Preference Optimization for Diffusion and Consistency Models
CVPR 2025
Towards Localized Fine-Grained Control for Facial Expression Generation
arXiv 2024
Uncertainty-Aware Testing-Time Optimization for 3D Human Pose Estimation
arXiv 2024
OpenBias: Open-set Bias Detection in Text-to-Image Generative Models
CVPR 2024
A Lie Group Approach to Riemannian Batch Normalization
arXiv 2024
UVMap-ID: A Controllable and Personalized UV Map Generative Model
arXiv 2024
3D Weakly Supervised Semantic Segmentation with 2D Vision-Language Guidance
arXiv 2024
LESS: Label-Efficient and Single-Stage Referring 3D Segmentation
arXiv 2024
Hourglass Tokenizer for Efficient Transformer-Based 3D Human Pose Estimation
CVPR 2024
PAIR-Diffusion: A Comprehensive Multimodal Object-Level Image Editor
arXiv 2023
Improving Fairness using Vision-Language Driven Image Augmentation
arXiv 2023
Large-scale Pre-trained Models are Surprisingly Strong in Incremental Novel Class Discovery
arXiv 2023
Latent Traversals in Generative Models as Potential Flows
arXiv 2023
Diversified in-domain synthesis with efficient fine-tuning for few-shot classification
arXiv 2023
StylerDALLE: Language-Guided Style Transfer Using a Vector-Quantized Tokenizer of a Large-Scale Generative Model
ICCV 2023
Householder Projector for Unsupervised Latent Semantics Discovery
ICCV 2023
Class-incremental Novel Class Discovery
arXiv 2022
Controllable Person Image Synthesis with Spatially-Adaptive Warped Normalization
arXiv 2021
Bi-Mix: Bidirectional Mixing for Domain Adaptive Nighttime Semantic Segmentation
arXiv 2021
First Order Motion Model for Image Animation
Whitening for Self-Supervised Representation Learning
arXiv 2020
Unified Generative Adversarial Networks for Controllable Image-to-Image Translation
arXiv 2019
Fast and Robust Dynamic Hand Gesture Recognition via Key Frames Extraction and Feature Fusion
arXiv 2019
Deformable GANs for Pose-based Human Image Generation
Frequent co-authors
10 from 51 papers