We propose EWSegNet, a novel end-to-end effective waste segmentation network that improves computational efficiency without compromising segmentation performance.
Our frequency context module (FCM) captures global context by applying data-dependent kernels in the frequency domain, improving segmentation results.
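The paper does not specify the FCM's internal design, so the following numpy sketch only illustrates the general idea of a data-dependent kernel in the frequency domain: a per-channel spectral filter whose bandwidth is derived from the features themselves. The pooling-based bandwidth rule is a hypothetical choice, not the authors' architecture.

```python
import numpy as np

def frequency_context(x):
    """Sketch of frequency-domain global context with a data-dependent
    kernel. x: feature map of shape (C, H, W)."""
    C, H, W = x.shape
    # Data-dependent filter strength per channel (hypothetical choice):
    # channels with larger average activation are smoothed more strongly.
    pooled = np.abs(x).mean(axis=(1, 2))                  # (C,)
    sigma = 1.0 + pooled / (pooled.max() + 1e-6) * H      # (C,)
    # Radial frequency grid.
    fy = np.fft.fftfreq(H)[:, None]
    fx = np.fft.fftfreq(W)[None, :]
    r2 = fy ** 2 + fx ** 2                                # (H, W)
    # Per-channel Gaussian kernel in the frequency domain; multiplying
    # the spectrum couples every spatial position (global context).
    kernel = np.exp(-r2[None] * (sigma ** 2)[:, None, None])
    X = np.fft.fft2(x, axes=(1, 2))
    return np.fft.ifft2(X * kernel, axes=(1, 2)).real
```

Because the kernel leaves the DC component untouched (it equals 1 at zero frequency), the mean activation of each channel is preserved while high-frequency content is reweighted.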
The proposed spatial context module (SCM) performs feature excitation and weighting in the spatial domain, complementing the FCM for rich feature extraction.
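The SCM's exact layers are not given here; the sketch below shows only the generic squeeze-excite-weight pattern in the spatial domain. The channel-mean squeeze and the fixed sigmoid gate are illustrative stand-ins for whatever learned layers the actual module uses.

```python
import numpy as np

def spatial_context(x):
    """Sketch of spatial excitation and weighting (SCM-style).
    x: feature map of shape (C, H, W)."""
    # Squeeze: collapse channels to one value per spatial position.
    squeezed = x.mean(axis=0)                     # (H, W)
    # Excite: map to (0, 1) weights; a real module would use learned
    # convolutions here rather than a bare sigmoid.
    weights = 1.0 / (1.0 + np.exp(-squeezed))     # (H, W)
    # Weight: emphasize informative positions, suppress the rest.
    return x * weights[None, :, :]
```

Since the weights lie strictly in (0, 1), the module can only attenuate features position-wise, acting as a soft spatial mask.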
To emphasize the boundaries of waste items and blob regions, we design an auxiliary feature enhancement module (AFEM) that uses difference-of-Gaussians filtering and pooled attention, improving waste segmentation in cluttered scenes.
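AFEM's two branches can be illustrated with a minimal numpy sketch: a difference-of-Gaussians band-pass for boundary enhancement, and a pooled sigmoid gate for blob amplification. The blur implementation, pooling window, and gating function are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def _gauss1d(sigma):
    radius = int(3 * sigma)
    t = np.arange(-radius, radius + 1)
    k = np.exp(-t ** 2 / (2.0 * sigma ** 2))
    return k / k.sum()

def _blur(x, sigma):
    """Separable Gaussian blur with edge padding (numpy-only stand-in)."""
    k = _gauss1d(sigma)
    r = len(k) // 2
    p = np.pad(x, r, mode="edge")
    p = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, p)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, p)

def boundary_enhance(x, sigma1=1.0, sigma2=2.0):
    """Difference-of-Gaussians band-pass: large response at boundaries,
    near zero in flat regions."""
    return _blur(x, sigma1) - _blur(x, sigma2)

def blob_amplify(x, pool=4):
    """Pooled-attention sketch: average-pool, gate with a sigmoid,
    upsample, and reweight to emphasize coherent blob regions."""
    H, W = x.shape
    p = x.reshape(H // pool, pool, W // pool, pool).mean(axis=(1, 3))
    gate = 1.0 / (1.0 + np.exp(-p))
    gate = np.repeat(np.repeat(gate, pool, axis=0), pool, axis=1)
    return x * gate
```

On a flat region the two blurred copies coincide, so the boundary branch responds only where intensity changes, while the blob branch keeps whole regions whose pooled activation is high.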
Overall framework of the proposed effective waste segmentation network (EWSegNet). The encoder consists of four stages that provide multiscale feature representations (F1, F2, F3, F4). Stage i contains Ni EWFE layers, where i ∈ {1, 2, 3, 4}. Before each stage, a convolution layer downsamples the feature maps. The stage-three features are fed to the auxiliary feature enhancement module (AFEM) to emphasize boundaries and blob regions, yielding feature maps F5. Finally, these multiscale features are fed to the decoder to obtain the segmentation map.
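The multiscale layout described in the caption can be traced as a shape walk-through. The channel widths and the stride-2 downsampling before every stage are assumptions for illustration; the caption only fixes the four stages producing F1-F4 and the AFEM branch producing F5 from stage-three features.

```python
def encoder_shapes(h=512, w=512, cs=(32, 64, 160, 256)):
    """Shape walk-through of the four-stage encoder.
    Returns (name, channels, height, width) for F1..F5."""
    shapes = []
    for i, c in enumerate(cs, start=1):
        h, w = h // 2, w // 2        # conv downsampling before stage i
        shapes.append((f"F{i}", c, h, w))
    # F5: AFEM applied to the stage-three output keeps its resolution.
    _, c3, h3, w3 = shapes[2]
    shapes.append(("F5", c3, h3, w3))
    return shapes
```

With a 512×512 input (the resolution used for the FLOPs comparison), this gives F1 at 256×256 down to F4 at 32×32, with F5 sharing F3's 64×64 resolution.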
a) The efficient waste feature extraction (EWFE) layer. b) The spatial context module (SCM), used for feature excitation and weighting in the spatial domain. c) The frequency context module (FCM), which captures global contextual relationships between pixels in the frequency domain.
The auxiliary feature enhancement module (AFEM) performs two functions: boundary enhancement (BE) and blob amplification (BA). BE emphasizes fine details using difference-of-Gaussians filtering, while BA uses pooled attention to focus on semantic regions.
Performance comparison of EWSegNet with state-of-the-art waste segmentation methods. Encoder FLOPs are reported for a 512×512 RGB input.
| Method | Encoder Params (M) ↓ | GFLOPs ↓ | Latency (ms) ↓ | mIoU (%) ↑ | Pix. Acc. (%) ↑ |
|---|---|---|---|---|---|
| ReCo | — | — | — | 52.28 | 89.33 |
| DeepLabv3+ | — | — | — | 52.13 | 91.38 |
| FANet | 36.0 | 30.3 | 74.5 | 54.89 | 91.41 |
| FocalNet-B | 88.7 | 80.6 | — | 54.26 | 91.28 |
| COSNet | 27.3 | 24.4 | 73.6 | 56.67 | 91.91 |
| EWSegNet (Ours) | 23.3 | 20.5 | 64.8 | 56.44 | 91.75 |
Class-wise IoU (%) comparison on ZeroWaste-f dataset.
| Method | Background | Cardboard | Soft Plastic | Rigid Plastic | Metal |
|---|---|---|---|---|---|
| DeepLabv3+ | 91.02 | 54.47 | 63.18 | 24.82 | 27.14 |
| COSNet | 91.44 | 59.13 | 65.92 | 37.24 | 29.61 |
| EWSegNet (Ours) | 91.45 | 59.24 | 63.17 | 33.28 | 35.05 |
mIoU (%) and mF1 (%) comparison on ZeroWaste-aug dataset.
| Method | mIoU (%) ↑ | mF1 (%) ↑ |
|---|---|---|
| TopFormer-S | 52.53 | 66.87 |
| SeaFormer-S | 54.47 | 68.75 |
| AFFormer-B | 54.76 | 69.03 |
| PIDNet-S | 57.74 | 71.13 |
| FeedFormer-B0 | 59.18 | 72.58 |
| DDRNet-slim | 61.12 | 74.35 |
| DeepLabv3+ | 52.50 | — |
| LWCHNet | 63.16 | 76.03 |
| EWSegNet (Ours) | 74.10 | 84.31 |
Performance comparison in terms of mIoU (%) and class-wise IoU (%) on the SpectralWaste dataset.
| Method | mIoU (%) ↑ | Film | Basket | Cardboard | Video Tape | Filament | Trash Bag |
|---|---|---|---|---|---|---|---|
| MiniNet-v2 | 44.5 | 63.1 | 58.9 | 55.4 | 30.6 | 10.0 | 49.2 |
| SegFormer-B0 | 48.4 | 66.9 | 71.3 | 48.9 | 33.6 | 15.2 | 54.6 |
| InternImage-T | 47.99 | 42.38 | 82.80 | 69.10 | 41.39 | 16.50 | 35.77 |
| FANet | 67.83 | 72.47 | 82.98 | 75.26 | 41.28 | 67.65 | 67.36 |
| COSNet | 69.96 | 77.61 | 83.65 | 75.14 | 42.95 | 69.06 | 71.38 |
| EWSegNet (Ours) | 71.03 | 77.88 | 84.16 | 79.77 | 42.05 | 73.39 | 68.96 |
Qualitative comparison of EWSegNet with the recent waste segmentation methods FANet and COSNet on ZeroWaste-f. The proposed EWSegNet produces visibly better segmentations, as highlighted by the yellow boxes.
Visual comparison of EWSegNet with the recent waste segmentation methods FANet and COSNet on the SpectralWaste dataset. As highlighted by the yellow boxes, the proposed EWSegNet segments waste objects more accurately in cluttered scenes.
@inproceedings{javaid2026ewsegnet,
title = {Towards Effective Waste Segmentation for Automated
Waste Recycling in Cluttered Background},
author = {Javaid, Mamoona and Noman, Mubashir and Hannan, Abdul
and Nawaz, Shah and Fiaz, Mustansar and Ghuffar, Sajid},
booktitle = {International Conference on Machine Learning (ICML)},
year = {2026}
}