Semantic Understanding of Scenes through the ADE20K Dataset

The Cascade Segmentation Module lets semantic segmentation networks parse a scene into stuff, objects, and object parts in a cascade, improving scene parsing performance on the ADE20K dataset.

Year: 2016
Venue: arXiv 2016
Authors: 7

Attribution

Abstract & full text: arxiv.org/abs/1608.05442v2
TL;DR: Semantic Scholar

Abstract

Scene parsing, or recognizing and segmenting objects and stuff in an image, is one of the key problems in computer vision. Despite the community's efforts in data collection, there are still few image datasets covering a wide range of scenes and object categories with dense and detailed annotations for scene parsing. In this paper, we introduce and analyze the ADE20K dataset, spanning diverse annotations of scenes, objects, parts of objects, and in some cases even parts of parts. A generic network design called Cascade Segmentation Module is then proposed to enable the segmentation networks to parse a scene into stuff, objects, and object parts in a cascade. We evaluate the proposed module integrated within two existing semantic segmentation networks, yielding significant improvements for scene parsing. We further show that the scene parsing networks trained on ADE20K can be applied to a wide variety of scenes and objects.
