Abstract

Digital imaging techniques have advanced significantly since the 1960s (Gonzalez & Woods, 2007), with computer algorithms used to enhance contrast, encode intensity levels, and enable efficient object recognition. These advances have transformed fields such as X-ray interpretation, medical image analysis, and satellite imaging. Image segmentation is a critical preprocessing step for tasks ranging from precise disease diagnosis (e.g., tumor localization in CT scans) to environmental monitoring (e.g., land cover classification in satellite imagery). State-of-the-art segmentation models often leverage information from multiple scales, with the U-Net architecture being one of the most prominent examples (Szegedy et al., 2015). U-Net’s distinctive U-shaped architecture uses skip connections to merge high-level semantic feature maps from the decoder with the corresponding low-level, detailed feature maps from the encoder (Smith & Doe, 2022). Combined with strong data augmentation, U-Net makes the most of limited annotated samples. However, traditional segmentation methods often struggle with complex feature extraction and computational efficiency, especially when annotated data are scarce or computing resources are constrained. To address these challenges, attention mechanisms have emerged as a powerful tool for increasing model sensitivity to task-relevant features. Among them, the Efficient Channel Attention (ECA) mechanism stands out for its ability to adaptively recalibrate channel-wise feature responses without dimensionality reduction, significantly reducing computational overhead while maintaining performance. Prior studies have demonstrated the effectiveness of ECA-integrated architectures in medical imaging.
For instance, ECAU-Net improved fetal ultrasound cerebellum segmentation (Brahmankar et al., 2022) and enhanced performance in coronary artery segmentation and three-dimensional reconstruction (Brahmankar et al., 2022). Yet these applications remain confined to the medical domain, with limited exploration in non-medical contexts. Building upon the success of U-Net in brain tumor image segmentation (Doe & Smith, 2024) and cerebellum segmentation for clinical diagnosis (Murugan & Karuppiah, 2022), this study is among the first to extend the ECA-enhanced U-Net architecture to general image segmentation tasks beyond healthcare. We term this approach ECAU-Net: it integrates the ECA mechanism into U-Net’s skip connections to dynamically prioritize informative channels across scales, enabling robust segmentation in diverse scenarios such as industrial defect inspection and agricultural crop monitoring while preserving computational efficiency. Its encoder-decoder architecture lets ECAU-Net localize segmented regions efficiently, making it a strong backbone for a variety of segmentation applications. In the experimental evaluation, the improved model was trained and systematically assessed. It showed a significant gain in segmentation accuracy: a consistent 2% improvement over the traditional U-Net in key metrics, namely mean intersection over union (mIoU), mean pixel accuracy (mPA), precision, and recall, along with faster convergence of the training loss. These results indicate that adding the ECA attention mechanism leads to more precise and reliable segmentation.
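
To make the channel recalibration described above concrete, the following is a minimal pure-Python sketch of an ECA-style gate, not the authors' implementation. It assumes the usual formulation: global average pooling per channel, a one-dimensional convolution across neighbouring channels with no dimensionality reduction, and a sigmoid gate that rescales each channel. The function names `eca_kernel_size` and `eca_gate`, the nested-list tensor layout, and the defaults `gamma=2` and `b=1` are illustrative assumptions. In a trained network the kernel weights would be learned, and the abstract places the gate on U-Net's skip connections, which this sketch does not model.

```python
import math


def eca_kernel_size(channels, gamma=2, b=1):
    """Adaptive kernel size heuristic commonly used with ECA (assumed defaults).

    Maps the channel count to an odd kernel size k.
    """
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 == 1 else t + 1


def eca_gate(feature_map, weights):
    """Apply an ECA-style channel gate.

    feature_map: list of C channels, each an H x W nested list of floats.
    weights: 1D kernel of odd length k, shared across all channels
             (hypothetical values here; learned in a real model).
    Returns (rescaled feature map, per-channel gates).
    """
    k = len(weights)
    if k % 2 == 0:
        raise ValueError("kernel length must be odd")
    pad = k // 2

    # Squeeze: global average pooling gives one descriptor per channel.
    pooled = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature_map]

    # Local cross-channel interaction: zero-padded 1D convolution over the
    # channel axis, so the number of channels is never reduced.
    padded = [0.0] * pad + pooled + [0.0] * pad
    gates = []
    for c in range(len(pooled)):
        z = sum(w * padded[c + j] for j, w in enumerate(weights))
        gates.append(1.0 / (1.0 + math.exp(-z)))  # sigmoid gate in (0, 1)

    # Excite: rescale every value in a channel by that channel's gate.
    scaled = [[[v * g for v in row] for row in ch]
              for ch, g in zip(feature_map, gates)]
    return scaled, gates


if __name__ == "__main__":
    # Four constant 2 x 2 channels with values 1, 2, 3, 4.
    fmap = [[[float(c + 1)] * 2 for _ in range(2)] for c in range(4)]
    _, gates = eca_gate(fmap, [0.25, 0.5, 0.25])
    print([round(g, 3) for g in gates])
```

Because the only parameters are the k kernel weights shared by all channels, the added cost is small relative to the convolutional layers around it, which is consistent with the efficiency claim made above.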

Comments

tpp1384
