Weakly-Supervised Semantic Segmentation via Depth-guided Class Activation Map

Abstract

Weakly-supervised semantic segmentation (WSSS) trains a segmentation model using only class-level annotations, without requiring pixel-level labels. In WSSS, pseudo-labels generated from class activation maps (CAMs) are commonly used as the training targets, so prior research has focused primarily on improving CAM quality to improve segmentation performance.
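As a minimal sketch of how CAM-based pseudo-labels are typically produced (not necessarily the exact pipeline used in this project), the snippet below computes a CAM as a classifier-weighted sum of feature maps and then converts per-class CAMs into a label map. The function names, the array layouts, and the background threshold of 0.3 are illustrative assumptions.

```python
import numpy as np

def compute_cam(features, fc_weights, class_idx):
    """Compute a normalized CAM for one class.

    features:   (K, H, W) feature maps from the last conv layer (assumed layout).
    fc_weights: (C, K) weights of the classification layer (assumed layout).
    """
    # Weighted sum over the K channels -> (H, W) activation map.
    cam = np.tensordot(fc_weights[class_idx], features, axes=([0], [0]))
    cam = np.maximum(cam, 0.0)            # keep positive evidence only
    peak = cam.max()
    return cam / peak if peak > 0 else cam

def cams_to_pseudo_label(cams, class_ids, bg_threshold=0.3):
    """Turn normalized CAMs into a pseudo-label map.

    cams:      (N, H, W) CAMs for the N classes present in the image.
    class_ids: length-N class labels (0 is reserved for background).
    bg_threshold is a hypothetical value, not taken from the project.
    """
    cams = np.asarray(cams)
    class_ids = np.asarray(class_ids)
    label = class_ids[cams.argmax(axis=0)]        # most activated class per pixel
    label[cams.max(axis=0) < bg_threshold] = 0    # weak activation -> background
    return label
```

Small or occluded objects tend to receive weak activations, so under a fixed threshold like this they can fall into the background class, which is the failure mode this project targets.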

In this project, I found that CAM-based segmentation struggles in two key scenarios, (1) small objects and (2) occluded objects, which degrades segmentation performance. To address these issues, I propose a depth-guided cropping method that refines CAMs and improves segmentation accuracy.


Problem Statement

1. Small objects

2. Occluded objects


Method

1. Depth-guided Cropping Method for CAM Refinement
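The exact procedure is not spelled out in this summary, so the following is only one plausible reading of depth-guided cropping, written as a hedged sketch: pick the depth of the most activated pixels, build a crop box around pixels at a similar depth that still show some activation, recompute the CAM on that crop, and merge it back. All names, thresholds, and the `cam_fn` callable are hypothetical and are not taken from the project's code.

```python
import numpy as np

def depth_guided_box(cam, depth, cam_threshold=0.5, low_threshold=0.1,
                     depth_tol=0.1, margin=2):
    """Return (top, bottom, left, right) for a crop around the object, or None.

    cam and depth are (H, W) arrays; all thresholds are assumed values.
    """
    seed = cam >= cam_threshold
    if not seed.any():
        return None
    ref_depth = np.median(depth[seed])             # depth of the confident region
    same_depth = np.abs(depth - ref_depth) <= depth_tol
    support = same_depth & (cam >= low_threshold)  # weakly activated, same depth
    if not support.any():
        return None
    rows, cols = np.nonzero(support)
    top = max(int(rows.min()) - margin, 0)
    bottom = min(int(rows.max()) + 1 + margin, cam.shape[0])
    left = max(int(cols.min()) - margin, 0)
    right = min(int(cols.max()) + 1 + margin, cam.shape[1])
    return top, bottom, left, right

def refine_cam(image, cam, depth, cam_fn, **box_kwargs):
    """Recompute the CAM on the depth-guided crop and merge it into the original.

    cam_fn is a hypothetical callable mapping an image crop (h, w, 3) to a
    normalized CAM of shape (h, w).
    """
    box = depth_guided_box(cam, depth, **box_kwargs)
    if box is None:
        return cam
    top, bottom, left, right = box
    crop_cam = cam_fn(image[top:bottom, left:right])
    refined = cam.copy()
    # Keep the stronger of the original and the crop-level activation.
    refined[top:bottom, left:right] = np.maximum(
        refined[top:bottom, left:right], crop_cam)
    return refined
```

The intuition behind this design is that cropping enlarges a small object relative to the network's input, while the depth band helps separate an occluded object from the occluder in front of it.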


Results

1. Qualitative Results

2. Quantitative Results

Conclusion

In this project, I proposed a depth-guided cropping method to refine CAMs for WSSS. The method segments small and occluded objects more effectively than the baseline, which suggests that depth-guided cropping can improve segmentation accuracy in WSSS and is a promising direction for future research. The accompanying code provides a detailed implementation of the proposed method.