CLEVR Parser: A Graph Parser Library for Geometric Learning on Language Grounded Image Scenes

A graph parser library for the CLEVR dataset facilitates the extraction and representation of attributes and relationships for dual-modal learning, aiding in language grounding and other tasks through integration with deep graph neural networks.

Year
2020
Venue
EMNLP (NLPOSS) 2020 11
Authors
2

Abstract & full text
arxiv.org/abs/2009.09154v2

Abstract

The CLEVR dataset has been used extensively for language-grounded visual reasoning in the Machine Learning (ML) and Natural Language Processing (NLP) domains. We present a graph parser library for CLEVR that extracts object-centric attributes and relationships and constructs structural graph representations for both modalities. Structural, order-invariant representations enable geometric learning and can aid downstream tasks such as grounding language to vision, robotics, compositionality, interpretability, and computational grammar construction. We provide three extensible main components (parser, embedder, and visualizer) that can be tailored to suit specific learning setups. We also provide out-of-the-box functionality for seamless integration with popular deep graph neural network (GNN) libraries. Additionally, we discuss downstream usage and applications of the library, and how it accelerates research for the NLP community.
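To make the idea of a structural, order-invariant scene representation concrete, here is a minimal sketch in plain Python. It does not use the library's own API; the names `scene_to_graph` and `canonical_edges` are hypothetical and exist only for this illustration. The sketch assumes the public CLEVR scene layout, in which `objects` is a list of attribute dictionaries and `relationships[r][i]` lists the indices of the objects that stand in relation `r` to object `i`. The toy scene is hand-written, not taken from the dataset.

```python
from typing import Any, Dict, FrozenSet, List, Tuple

# Object attributes annotated in CLEVR scenes.
ATTRIBUTES = ("size", "color", "material", "shape")

Node = Tuple[str, ...]
Edge = Tuple[int, str, int]  # (source index, relation, target index)


def scene_to_graph(scene: Dict[str, Any]) -> Tuple[List[Node], List[Edge]]:
    """Hypothetical helper: turn a CLEVR-style scene dict into nodes and edges.

    An edge (s, r, t) reads "object s is <r> of object t".
    """
    nodes = [tuple(obj[a] for a in ATTRIBUTES) for obj in scene["objects"]]
    edges: List[Edge] = []
    for relation, per_object in sorted(scene["relationships"].items()):
        for target, sources in enumerate(per_object):
            for source in sources:
                edges.append((source, relation, target))
    return nodes, edges


def canonical_edges(nodes: List[Node], edges: List[Edge]) -> FrozenSet[Tuple[Node, str, Node]]:
    """Replace indices by attribute tuples so the result ignores object order.

    Assumption: every object has a distinct attribute tuple in the scene.
    """
    return frozenset((nodes[s], r, nodes[t]) for s, r, t in edges)


# Hand-written toy scene: object 1 (the sphere) is left of and behind object 0.
scene = {
    "objects": [
        {"size": "large", "color": "red", "material": "metal", "shape": "cube"},
        {"size": "small", "color": "blue", "material": "rubber", "shape": "sphere"},
    ],
    "relationships": {
        "left": [[1], []],
        "right": [[], [0]],
        "front": [[], [0]],
        "behind": [[1], []],
    },
}

nodes, edges = scene_to_graph(scene)
for s, r, t in edges:
    print(nodes[s][-1], "is", r, "of", nodes[t][-1])
```

Listing the same two objects in the opposite order, with the relationship indices updated to match, yields the same `canonical_edges` set. That is the sense in which a graph representation is order-invariant, whereas a flat list of objects is not.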
