In recent years, deep learning techniques have been introduced to autonomous driving and now cover nearly every aspect of the field. Researchers have begun to explore lightweight image semantic segmentation networks for road traffic scenarios. However, existing semantic segmentation networks are usually developed for high-resolution images, and their large parameter counts make them difficult to deploy on devices with limited hardware resources. To address this problem, this paper proposes GPIDNet, a lightweight model based on the global attention mechanism that aims to improve computational efficiency while maintaining performance. Notably, with a low-resolution input size of 256×256, GPIDNet achieves 72% mIoU and a forward inference speed of 57.9 FPS on the Cityscapes dataset.
Esophageal cancer is a common digestive-tract tumor with a high mortality rate. Currently, the identification of esophageal carcinoma relies predominantly on thoracic computed tomography (CT) scanning. Because the esophagus occupies only a small region of a CT image, traditional image segmentation networks struggle to segment esophageal tumor areas accurately. Although the traditional U-Net architecture and its Transformer-integrated variants perform well in medical segmentation tasks, they lack the ability to extract image position and channel features. To address these issues, this paper proposes a novel medical image segmentation model, DAA-TransUNet. We introduce a Dual Attention Area (DAA) module that extracts image position and channel features, and combine it with the Efficient Channel Attention (ECA) module to form a new module, the Efficient Channel Dual Attention (ECDA) module. DAA-TransUNet integrates Transformer and ECDA modules into a traditional U-shaped architecture, enabling it to extract not only global and local information but also image position and channel features. Extensive experiments on our collected esophageal dataset verify the superior performance of the network: DAA-TransUNet achieves an average DSC of 75.26% and an average HD95 of 3.54 mm.
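For readers unfamiliar with the ECA module that DAA-TransUNet builds on, the following is a minimal PyTorch sketch of an ECA-style channel attention block, following the published ECA-Net formulation (global average pooling followed by a 1-D convolution across channels). The hyperparameters `gamma` and `b` are the defaults from that paper, not values specified in the abstract, and this sketch is illustrative rather than the exact implementation used in DAA-TransUNet.

```python
import math

import torch
import torch.nn as nn


class ECA(nn.Module):
    """Efficient Channel Attention block (ECA-Net style).

    Pools each channel to a scalar, runs a 1-D convolution across the
    channel axis to capture local cross-channel interaction, and
    rescales the input feature map with the resulting weights.
    """

    def __init__(self, channels: int, gamma: int = 2, b: int = 1):
        super().__init__()
        # Kernel size adapts to the channel dimension and must be odd.
        t = int(abs((math.log2(channels) + b) / gamma))
        k = t if t % 2 else t + 1
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C, H, W) -> per-channel descriptor (N, C, 1, 1)
        y = self.pool(x)
        # Reshape to (N, 1, C) so the 1-D conv slides over channels.
        y = self.conv(y.squeeze(-1).transpose(-1, -2))
        # Back to (N, C, 1, 1) and rescale the input.
        y = y.transpose(-1, -2).unsqueeze(-1)
        return x * self.sigmoid(y)
```

The block is drop-in: it preserves the input shape, so it can be inserted after any convolutional stage, which is presumably how it is composed with the DAA module to form ECDA.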