Rita Cucchiara
- Papers: 30
Authored papers (30)
ReAG: Reasoning-Augmented Generation for Knowledge-based Visual Question Answering
arXiv 2026
Recurrence-Enhanced Vision-and-Language Transformers for Robust Multimodal Document Retrieval
CVPR 2025 1
Image Captioning Evaluation in the Age of Multimodal LLMs: Challenges and Future Perspectives
arXiv 2025
LLaVA-MORE: A Comparative Study of LLMs and Visual Backbones for Enhanced Visual Instruction Tuning
arXiv 2025
Inverse Virtual Try-On: Generating Multi-Category Product-Style Images from Clothed Individuals
arXiv 2025
RAID: A Dataset for Testing the Adversarial Robustness of AI-Generated Image Detectors
arXiv 2025
Hyperbolic Safety-Aware Vision-Language Models
CVPR 2025 1
Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabulary Segmentation
ICCV 2025
Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities
arXiv 2024
Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering
CVPR 2025 1
Trends, Applications, and Challenges in Human Attention Modelling
arXiv 2024
Multimodal-Conditioned Latent Diffusion Models for Fashion Image Editing
arXiv 2024
Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization
arXiv 2024
Binarizing Documents by Leveraging both Space and Frequency
arXiv 2024
Merging and Splitting Diffusion Paths for Semantically Coherent Panoramas
arXiv 2024
BRIDGE: Bridging Gaps in Image Captioning Evaluation with Stronger Visual Cues
arXiv 2024
μgat: Improving Single-Page Document Parsing by Providing Multi-Page Context
arXiv 2024
Bringing Masked Autoencoders Explicit Contrastive Properties for Point Cloud Self-Supervised Learning
arXiv 2024
Monocular Per-Object Distance Estimation with Masked Object Modeling
arXiv 2024
Unveiling the Truth: Exploring Human Gaze Patterns in Fake Images
arXiv 2024
LaDI-VTON: Latent Diffusion Textual-Inversion Enhanced Virtual Try-On
arXiv 2023
Handwritten Text Generation from Visual Archetypes
CVPR 2023 1
Safe-CLIP: Removing NSFW Concepts from Vision-and-Language Models
arXiv 2023
Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image Editing
ICCV 2023 1
HWD: A Novel Evaluation Score for Styled Handwritten Text Generation
arXiv 2023
With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning
ICCV 2023 1
Input Perturbation Reduces Exposure Bias in Diffusion Models
arXiv 2023
Evaluating Synthetic Pre-Training for Handwriting Processing Tasks
arXiv 2023
Volumetric Fast Fourier Convolution for Detecting Ink on the Carbonized Herculaneum Papyri
arXiv 2023
Dress Code: High-Resolution Multi-Category Virtual Try-On
arXiv 2022
Frequent co-authors
10 from 30 papers
Marcella Cornia
Lorenzo Baraldi
Silvia Cascianelli
Federico Cocchi
Sara Sarto
Vittorio Pippi
Fabio Quattrini
Davide Caffagni
Davide Morelli
Giuseppe Cartella