VCR: Video representation for Contextual Retrieval

The proposed system uses semantic embeddings and a fusion of visual, audio, and textual features to index, categorize, and recommend video content, enhancing user interaction with a media archive's topics ontology map.

Year
2024
Venue
arXiv 2024
Authors
5
Hosting
Abstract only (ARXIV-DEFAULT)

Attribution

Abstract & full text
arxiv.org/abs/2402.07466 (ARXIV-DEFAULT)
TL;DR
Semantic Scholar

Abstract

Streamlining content discovery within media archives requires integrating advanced data representations and effective visualization techniques for clear communication of video topics to users. The proposed system addresses the challenge of efficiently navigating large video collections by exploiting a fusion of visual, audio, and textual features to accurately index and categorize video content through a text-based method. Additionally, semantic embeddings are employed to provide contextually relevant information and recommendations to users, resulting in an intuitive and engaging exploratory experience over our topics ontology map using OpenAI GPT-4.
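
The retrieval idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`fuse_modalities`, `embed`, `recommend`), the sample archive, and the bag-of-words vectors are all assumptions made for illustration. A real system would replace `embed` with a learned semantic embedding model, and the abstract does not specify how the modalities are fused beyond saying the method is text-based.

```python
import math
from collections import Counter


def fuse_modalities(visual_caption, audio_transcript, text_metadata):
    """Hypothetical fusion step: join per-modality text into one document."""
    return " ".join([visual_caption, audio_transcript, text_metadata])


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real semantic embedding."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(query, index, top_k=2):
    """Return the ids of the top_k indexed videos most similar to the query."""
    q = embed(query)
    ranked = sorted(index.items(), key=lambda kv: cosine(q, kv[1]), reverse=True)
    return [video_id for video_id, _ in ranked[:top_k]]


# Illustrative archive: each entry fuses text derived from three modalities.
archive = {
    "vid_news": fuse_modalities(
        "anchor at news desk", "election results tonight", "politics news"
    ),
    "vid_sport": fuse_modalities(
        "players on football pitch", "goal scored in final minute", "sports football"
    ),
    "vid_cook": fuse_modalities(
        "chef chopping vegetables", "add salt and simmer", "cooking recipe"
    ),
}
index = {video_id: embed(doc) for video_id, doc in archive.items()}
```

Calling `recommend("football goal", index, top_k=1)` ranks the indexed videos by cosine similarity to the query. Because every modality is reduced to text first, the same index serves both categorization and recommendation.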
