Rita Cucchiara

Papers
30

Attribution

Affiliations & profile
Semantic Scholar

Authored papers

30

ReAG: Reasoning-Augmented Generation for Knowledge-based Visual Question Answering

arXiv 2026

2026

Recurrence-Enhanced Vision-and-Language Transformers for Robust Multimodal Document Retrieval

CVPR 2025

2025

Image Captioning Evaluation in the Age of Multimodal LLMs: Challenges and Future Perspectives

arXiv 2025

2025

LLaVA-MORE: A Comparative Study of LLMs and Visual Backbones for Enhanced Visual Instruction Tuning

arXiv 2025

2025

Inverse Virtual Try-On: Generating Multi-Category Product-Style Images from Clothed Individuals

arXiv 2025

2025

RAID: A Dataset for Testing the Adversarial Robustness of AI-Generated Image Detectors

arXiv 2025

2025

Hyperbolic Safety-Aware Vision-Language Models

CVPR 2025

2025

Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabulary Segmentation

ICCV 2025

2024

Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities

arXiv 2024

2024

Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering

CVPR 2025

2024

Trends, Applications, and Challenges in Human Attention Modelling

arXiv 2024

2024

Multimodal-Conditioned Latent Diffusion Models for Fashion Image Editing

arXiv 2024

2024

Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization

arXiv 2024

2024

Binarizing Documents by Leveraging both Space and Frequency

arXiv 2024

2024

Merging and Splitting Diffusion Paths for Semantically Coherent Panoramas

arXiv 2024

2024

BRIDGE: Bridging Gaps in Image Captioning Evaluation with Stronger Visual Cues

arXiv 2024

2024

μgat: Improving Single-Page Document Parsing by Providing Multi-Page Context

arXiv 2024

2024

Bringing Masked Autoencoders Explicit Contrastive Properties for Point Cloud Self-Supervised Learning

arXiv 2024

2024

Monocular Per-Object Distance Estimation with Masked Object Modeling

arXiv 2024

2024

Unveiling the Truth: Exploring Human Gaze Patterns in Fake Images

arXiv 2024

2024

LaDI-VTON: Latent Diffusion Textual-Inversion Enhanced Virtual Try-On

arXiv 2023

2023

Handwritten Text Generation from Visual Archetypes

CVPR 2023

2023

Safe-CLIP: Removing NSFW Concepts from Vision-and-Language Models

arXiv 2023

2023

Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image Editing

ICCV 2023

2023

HWD: A Novel Evaluation Score for Styled Handwritten Text Generation

arXiv 2023

2023

With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning

ICCV 2023

2023

Input Perturbation Reduces Exposure Bias in Diffusion Models

arXiv 2023

2023

Evaluating Synthetic Pre-Training for Handwriting Processing Tasks

arXiv 2023

2023

Volumetric Fast Fourier Convolution for Detecting Ink on the Carbonized Herculaneum Papyri

arXiv 2023

2023

Dress Code: High-Resolution Multi-Category Virtual Try-On

arXiv 2022

2022

Affiliations

No known affiliations.

Frequent co-authors

10

from 30 papers