SupLID: Geometrical Guidance for Out-of-Distribution Detection in Semantic Segmentation

Abstract

This paper introduces SupLID, a method for out-of-distribution (OOD) detection in semantic segmentation that builds on supervised Local Intrinsic Dimensionality (LID) estimation. SupLID captures information about the intrinsic data manifold by enforcing LID consistency through a multi-scale consistency loss and by integrating a semantic-aware LID prior. Experiments on benchmarks including Cityscapes→ACDC and PASCAL VOC→ADE20K show that SupLID achieves state-of-the-art performance and distinguishes OOD samples at the pixel level. Robust pixel-level OOD detection of this kind is crucial for safety-critical applications.

1. Introduction

Semantic segmentation models often produce confident but erroneous predictions when they encounter out-of-distribution (OOD) data, which poses significant risks in real-world applications. Reliable OOD detection is essential for the safety and robustness of these models, particularly in autonomous systems and medical imaging. Current methods struggle with fine-grained, pixel-level OOD detection because they rely on high-level features rather than on intrinsic data geometry. The models considered in this paper are DeepLabV3+, HRNetV2, Swin-Unet, and the proposed SupLID framework.

2. Related Work

Existing OOD detection methods fall into three broad categories: reconstruction-based, density-estimation, and uncertainty-based approaches. Each has limitations in semantic segmentation contexts. Prior works that use Local Intrinsic Dimensionality (LID) typically rely on unsupervised LID estimation, which can be noisy and less effective for fine-grained OOD detection. This section highlights the lack of methods that use supervised geometric guidance for OOD detection in segmentation tasks.

3. Methodology

SupLID takes a supervised approach to Local Intrinsic Dimensionality (LID) estimation, explicitly guiding the network to learn geometry-aware features. Its core component is a multi-scale consistency loss that keeps LID estimates stable and consistent across feature resolutions, which improves robustness. In addition, a semantic-aware LID prior uses in-distribution semantic information to regularize LID estimation, so that OOD pixels can be separated more precisely. Together, these components produce OOD scores by detecting deviations from the learned geometric properties of in-distribution data.
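For readers unfamiliar with LID, the sketch below shows the classical unsupervised maximum-likelihood LID estimator, which works from k-nearest-neighbor distances. It is background only and is not SupLID's supervised estimator, whose exact form is not given in this summary. The function name `lid_mle`, the default `k=20`, and the use of plain Euclidean distance on feature vectors are illustrative assumptions.

```python
import numpy as np


def lid_mle(query, reference, k=20):
    """Classical MLE estimate of Local Intrinsic Dimensionality.

    query:     (n_query, dim) feature vectors to score
    reference: (n_ref, dim) in-distribution feature vectors
    Assumes query points are not themselves contained in `reference`
    (otherwise the zero self-distance should be dropped first).
    """
    query = np.asarray(query, dtype=float)
    reference = np.asarray(reference, dtype=float)
    if k > reference.shape[0]:
        raise ValueError("k must not exceed the number of reference points")

    # Pairwise Euclidean distances, shape (n_query, n_ref).
    diffs = query[:, None, :] - reference[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))

    # k smallest distances per query point, in ascending order.
    knn = np.sort(dists, axis=1)[:, :k]
    knn = np.maximum(knn, 1e-12)  # guard against log(0)

    # LID = -1 / mean_i log(r_i / r_k), where r_k is the k-th neighbor distance.
    log_ratio = np.log(knn / knn[:, -1:])
    mean_log_ratio = np.minimum(log_ratio.mean(axis=1), -1e-12)
    return -1.0 / mean_log_ratio
```

Applied to features that lie on a low-dimensional surface inside a higher-dimensional space, the estimate tracks the dimension of the surface rather than the ambient dimension. The estimate is noisy for small `k`, which is consistent with the motivation for adding supervision.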

4. Experimental Results

SupLID outperforms state-of-the-art OOD detection methods across several benchmarks, including Cityscapes→ACDC and PASCAL VOC→ADE20K. The key metrics FPR95 and AUPRC improve consistently: on Cityscapes→ACDC, for example, FPR95 falls from 68.9% for the strongest baseline (SynthOOD) to 55.2%, and AUPRC rises from 29.8% to 45.6%. Ablation studies confirm that the multi-scale consistency loss and the semantic-aware LID prior each contribute to the model's overall robustness. These findings indicate that supervised, geometrically guided LID estimation provides a more reliable foundation for OOD detection in semantic segmentation.
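The two metrics can be computed from per-pixel OOD scores and a binary OOD mask roughly as follows. This is a minimal sketch under stated assumptions: OOD pixels are treated as the positive class, higher scores mean "more OOD", AUPRC is approximated by average precision, and tied scores are not treated specially. The function names are illustrative and are not taken from the paper's evaluation code.

```python
import numpy as np


def fpr_at_tpr(scores, is_ood, tpr=0.95):
    """False-positive rate at the threshold that keeps at least `tpr` of OOD pixels."""
    scores = np.asarray(scores, dtype=float).ravel()
    is_ood = np.asarray(is_ood, dtype=bool).ravel()

    pos = np.sort(scores[is_ood])          # OOD scores, ascending
    n_keep = int(np.ceil(tpr * pos.size))  # OOD pixels that must be detected
    threshold = pos[pos.size - n_keep]     # largest threshold that still detects them

    # Fraction of in-distribution pixels wrongly flagged at that threshold.
    return float((scores[~is_ood] >= threshold).mean())


def average_precision(scores, is_ood):
    """Average precision: mean of the precision values at each OOD pixel's rank."""
    scores = np.asarray(scores, dtype=float).ravel()
    is_ood = np.asarray(is_ood, dtype=bool).ravel()

    order = np.argsort(-scores, kind="stable")  # highest score first
    hits = is_ood[order]
    precision = np.cumsum(hits) / np.arange(1, hits.size + 1)
    return float(precision[hits].mean())
```

Both functions return fractions in [0, 1]; multiply by 100 to compare with values reported in percent.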

Table: Main OOD detection performance on the Cityscapes→ACDC benchmark for DeepLabV3+ with an HRNetV2-W48 backbone. Lower FPR95 and higher AUPRC indicate better performance.

Method	FPR95 (%) ↓	AUPRC (%) ↑
Energy	80.5	15.2
Max-logit	78.1	18.5
ReAct	75.3	22.1
SynthOOD	68.9	29.8
SupLID (ours)	55.2	45.6
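For context on the two simplest baselines in the table above, the sketch below computes per-pixel Energy and Max-logit scores from raw segmentation logits, using their commonly cited definitions. The `(classes, height, width)` layout, the sign convention (higher score means "more OOD"), and the function names are assumptions made for illustration; the exact baseline configurations used in the experiments are not given here.

```python
import numpy as np


def energy_score(logits, temperature=1.0):
    """Per-pixel energy: -T * logsumexp(logits / T) over the class axis.

    logits: array of shape (num_classes, height, width).
    Confident pixels have low (very negative) energy.
    """
    z = np.asarray(logits, dtype=float) / temperature
    z_max = z.max(axis=0)
    # Numerically stable log-sum-exp over classes.
    lse = z_max + np.log(np.exp(z - z_max).sum(axis=0))
    return -temperature * lse


def max_logit_score(logits):
    """Per-pixel negative maximum logit; a low max logit gives a high OOD score."""
    return -np.asarray(logits, dtype=float).max(axis=0)
```

Both scores depend only on the classifier output, which is the kind of high-level signal that the geometric approach is contrasted with.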

5. Discussion

SupLID's results underscore the benefit of combining supervised Local Intrinsic Dimensionality estimation with geometrical guidance for robust OOD detection. By focusing on the intrinsic manifold properties of the data, SupLID offers a principled alternative to conventional feature-based approaches for distinguishing novel samples. This matters for deploying semantic segmentation models in real-world, open-world settings where reliability and safety are paramount. Future work could extend the methodology to other computer vision tasks or explore how well it adapts to different types of OOD challenges.