Weakly-Supervised Semantic Segmentation via Depth-guided Class Activation Map
Abstract
Weakly-supervised semantic segmentation (WSSS) performs semantic segmentation using only class-level annotations, without requiring pixel-level labels. In WSSS, pseudo-labels generated from class activation maps (CAMs) are commonly used to train the model. Therefore, prior research has primarily focused on improving CAM quality to enhance segmentation performance.
In this project, I identify two scenarios in which CAM-based segmentation performs poorly: (1) small objects and (2) occluded objects. To address them, I propose a depth map-guided cropping method that refines CAMs and improves segmentation accuracy.
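To make the pseudo-label step concrete, here is a minimal sketch of one common way to turn per-class CAMs into a label map: zero out classes that are absent from the image-level label, add a constant background score, and take the per-pixel argmax. The function name `cams_to_pseudo_label`, the `bg_threshold` value of 0.3, and the label layout (0 for background) are assumptions for illustration and are not taken from this project's code.

```python
import numpy as np

def cams_to_pseudo_label(cams, present_classes, bg_threshold=0.3):
    """Convert CAMs into a pseudo-label map (illustrative sketch).

    cams:            (K, H, W) array, each class map normalised to [0, 1].
    present_classes: (K,) boolean array from the image-level annotation.
    bg_threshold:    assumed constant score assigned to the background.
    Returns an (H, W) integer map: 0 = background, k + 1 = foreground class k.
    """
    # Suppress classes that the image-level label says are absent.
    cams = cams * present_classes[:, None, None]
    # Background wins wherever no class activation exceeds the threshold.
    background = np.full((1,) + cams.shape[1:], bg_threshold)
    scores = np.concatenate([background, cams], axis=0)
    return scores.argmax(axis=0)

# Toy input: two foreground classes on a 2x2 image.
cams = np.array([[[0.9, 0.1], [0.2, 0.8]],
                 [[0.5, 0.6], [0.1, 0.95]]])
label = cams_to_pseudo_label(cams, np.array([True, True]))
```

Pixels whose strongest activation stays below `bg_threshold` fall back to background, which is why weak activations on small or occluded objects translate directly into missing regions in the pseudo-labels.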
Problem Statement
1. Small objects
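- WSSS struggles to accurately segment small objects because the global average pooling (GAP) operation in the CAM generation process averages features over the whole image, so the response of a small object is diluted by the surrounding background.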
2. Occluded objects
- WSSS struggles to segment occluded objects because CAMs primarily rely on visible features, causing occluded regions to receive weak activation.
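The small-object issue can be illustrated with the standard CAM formulation, in which the class score is a linear layer applied to globally average-pooled features and the CAM is the same weighted sum taken before pooling. The sketch below is a toy NumPy version, not this project's implementation; `compute_cam`, `class_score`, and `toy_score` are hypothetical names, and the single-channel feature map is a deliberately simplified assumption.

```python
import numpy as np

def compute_cam(features, fc_weights, class_idx):
    """CAM for one class: weighted sum of feature channels, ReLU, max-normalised.

    features:   (C, H, W) feature map from the last convolutional layer.
    fc_weights: (num_classes, C) weights of the classifier that follows GAP.
    """
    cam = np.tensordot(fc_weights[class_idx], features, axes=([0], [0]))  # (H, W)
    cam = np.maximum(cam, 0.0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam

def class_score(features, fc_weights, class_idx):
    """Classifier logit: global average pooling followed by a linear layer."""
    pooled = features.mean(axis=(1, 2))  # (C,)
    return float(fc_weights[class_idx] @ pooled)

def toy_score(object_side, map_side=32):
    """Score for a square object whose feature response is 1 on a zero background."""
    features = np.zeros((1, map_side, map_side))
    features[0, :object_side, :object_side] = 1.0
    return class_score(features, np.ones((1, 1)), 0)

small = toy_score(2)   # 4 / 1024 = 0.00390625
large = toy_score(16)  # 256 / 1024 = 0.25
```

With identical per-pixel responses, the 2x2 object contributes 64 times less to the class score than the 16x16 object, so the classifier receives little training signal for it. Cropping the input so that the object fills more of the feature map raises that ratio.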
Method
1. Depth-guided Cropping Method for CAM Refinement
- I propose a depth-guided cropping method that refines CAMs by focusing on informative regions and excluding irrelevant areas. The depth map is used to crop the CAMs, which lets the model concentrate on small or occluded objects and improves segmentation accuracy.
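The description above does not spell out how the depth map selects the crop, so the following is only one plausible sketch under stated assumptions: take the depth at the CAM peak, keep pixels whose depth lies within a tolerance of it, crop to the bounding box of those pixels, and merge a CAM recomputed on the crop back into the full map with an element-wise maximum. The function names, the `depth_tol` and `margin` parameters, and the max-merge rule are all hypothetical and may differ from the actual implementation.

```python
import numpy as np

def depth_guided_crop_box(cam, depth, depth_tol=0.1, margin=2):
    """Bounding box (top, bottom, left, right) around the CAM peak's depth layer.

    cam, depth: (H, W) arrays; depth is assumed to be normalised to [0, 1].
    depth_tol:  assumed tolerance around the depth at the CAM peak.
    margin:     assumed number of context pixels added on each side.
    """
    h, w = cam.shape
    peak_y, peak_x = np.unravel_index(np.argmax(cam), cam.shape)
    # Keep every pixel at roughly the same depth as the most activated one.
    mask = np.abs(depth - depth[peak_y, peak_x]) <= depth_tol
    ys, xs = np.nonzero(mask)  # never empty: the peak pixel is always kept
    top = max(int(ys.min()) - margin, 0)
    bottom = min(int(ys.max()) + 1 + margin, h)
    left = max(int(xs.min()) - margin, 0)
    right = min(int(xs.max()) + 1 + margin, w)
    return top, bottom, left, right

def paste_refined_cam(cam, crop_cam, box):
    """Merge a CAM recomputed on the crop into the full map (element-wise max)."""
    top, bottom, left, right = box
    refined = cam.copy()
    region = refined[top:bottom, left:right]
    refined[top:bottom, left:right] = np.maximum(region, crop_cam)
    return refined

# Toy scene: a near object (depth 0.3) in front of a far background (depth 1.0).
depth = np.ones((8, 8))
depth[2:4, 5:7] = 0.3
cam = np.zeros((8, 8))
cam[2, 5] = 1.0
box = depth_guided_crop_box(cam, depth, depth_tol=0.1, margin=1)  # (1, 5, 4, 8)
```

In a full pipeline, the image region given by `box` would be resized, passed through the classifier again to obtain `crop_cam`, and resized back to the box shape before calling `paste_refined_cam`. Note that this sketch keeps all pixels in the depth band, not only those connected to the peak, so a connected-component step may be needed when several objects share a depth.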
Results
1. Qualitative Results
- The proposed method segments small and occluded objects more effectively than the baseline method.
2. Quantitative Results
- The proposed method outperforms the baseline method in mean Intersection over Union (mIoU), which supports the effectiveness of the depth-guided cropping method.
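For reference, mIoU averages the per-class intersection over union between predicted and ground-truth label maps. The sketch below computes it from a confusion matrix; the function name `mean_iou` and the ignore label of 255 are assumptions borrowed from common segmentation benchmarks, not details confirmed by this project.

```python
import numpy as np

def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over classes that appear in the prediction or the ground truth.

    pred, target: integer label maps of the same shape, values in [0, num_classes).
    ignore_index: assumed label for pixels excluded from evaluation.
    """
    valid = target != ignore_index
    pred = pred[valid].astype(np.int64)
    target = target[valid].astype(np.int64)
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = np.bincount(target * num_classes + pred,
                       minlength=num_classes ** 2).reshape(num_classes, num_classes)
    intersection = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection
    present = union > 0
    if not present.any():
        return float("nan")
    return float((intersection[present] / union[present]).mean())

target = np.array([[0, 0], [1, 1]])
pred = np.array([[0, 1], [1, 1]])
score = mean_iou(pred, target, num_classes=2)  # (1/2 + 2/3) / 2 = 7/12
```

Because every class counts equally regardless of its pixel area, mIoU is sensitive to classes dominated by small objects, which is the case the cropping method targets.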
Conclusion
In this project, I proposed a depth-guided cropping method to refine CAMs for WSSS. The method segments small and occluded objects more effectively than the baseline method, which suggests that depth-guided cropping can enhance segmentation accuracy in WSSS and is a promising direction for future research. My code provides a detailed implementation of the proposed method.