Journal articles on the topic 'YOLO-based attention network segmentation'

1

Huang, Zhihao, Jiajun Wu, Lumei Su, Yitao Xie, Tianyou Li, and Xinyu Huang. "SP-YOLO-Lite: A Lightweight Violation Detection Algorithm Based on SP Attention Mechanism." Electronics 12, no. 14 (2023): 3176. http://dx.doi.org/10.3390/electronics12143176.

Abstract:
In the operation site of power grid construction, it is crucial to comprehensively and efficiently detect violations of regulations for the personal safety of the workers with a safety monitoring system based on object detection technology. However, common general-purpose object detection algorithms are difficult to deploy on low-computational-power embedded platforms situated at the edge due to their high model complexity. These algorithms suffer from drawbacks such as low operational efficiency, slow detection speed, and high energy consumption. To address this issue, a lightweight violation detection algorithm based on the SP (Segmentation-and-Product) attention mechanism, named SP-YOLO-Lite, is proposed to improve the YOLOv5s detection algorithm and achieve low-cost deployment and efficient operation of object detection algorithms on low-computational-power monitoring platforms. First, to address the issue of excessive complexity in backbone networks built with conventional convolutional modules, a Lightweight Convolutional Block was employed to construct the backbone network, significantly reducing computational and parameter costs while maintaining high detection model accuracy. Second, in response to the problem of existing attention mechanisms overlooking spatial local information, we introduced an image segmentation operation and proposed a novel attention mechanism called Segmentation-and-Product (SP) attention. It enables the model to effectively capture local informative features of the image, thereby enhancing model accuracy. Furthermore, a Neck network that is both lightweight and feature-rich is proposed by introducing Depthwise Separable Convolution and Segmentation-and-Product attention module to Path Aggregation Network, thus addressing the issue of high computation and parameter volume in the Neck network of YOLOv5s. Experimental results show that compared with the baseline network YOLOv5s, the proposed SP-YOLO-Lite model reduces the computation and parameter volume by approximately 70%, achieving similar detection accuracy on both the VOC dataset and our self-built SMPC dataset.
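The abstract above credits part of the parameter savings to Depthwise Separable Convolution. As a rough illustration of why that substitution shrinks a layer, the sketch below counts weights for a standard convolution and for a depthwise-plus-pointwise pair. The function names and layer sizes are hypothetical and are not taken from the SP-YOLO-Lite paper; biases are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k convolution plus a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer sizes, chosen only for illustration.
    c_in, c_out, k = 128, 256, 3
    std = standard_conv_params(c_in, c_out, k)
    dws = depthwise_separable_params(c_in, c_out, k)
    print(f"standard: {std}, depthwise separable: {dws}, ratio: {dws / std:.3f}")
```

The ratio of the two counts works out to 1/c_out + 1/k², so for 3 x 3 kernels and wide layers the separable form needs a little over one ninth of the weights. How much of the reported 70% reduction comes from this substitution alone is not stated in the abstract.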
2

Jiang, Yifan, Ziyin Wu, Fanlin Yang, et al. "YOLO-SG: Seafloor Topography Unit Recognition and Segmentation Algorithm Based on Lightweight Upsampling Operator and Attention Mechanisms." Journal of Marine Science and Engineering 13, no. 3 (2025): 583. https://doi.org/10.3390/jmse13030583.

Abstract:
The recognition and segmentation of seafloor topography play a crucial role in marine science research and engineering applications. However, traditional methods for seafloor topography recognition and segmentation face several issues, such as poor capability in analyzing complex terrains and limited generalization ability. To address these challenges, this study introduces the SG-MKD dataset (Submarine Geomorphology Dataset—Seamounts, Sea Knolls, Submarine Depressions) and proposes YOLO-SG (You Only Look Once—Submarine Geomorphology), an algorithm for seafloor topographic unit recognition and segmentation that leverages a lightweight upsampling operator and attention mechanisms. The SG-MKD dataset provides instance segmentation annotations for three types of seafloor topographic units—seamounts, sea knolls, and submarine depressions—across a total of 419 images. YOLO-SG is an optimized version of the YOLOv8l-Segment model, incorporating a convolutional block attention module in the backbone network to enhance feature extraction. Additionally, it integrates a lightweight, general upsampling operator to create a new feature fusion network, thereby improving the model’s ability to fuse and represent features. Experimental results demonstrate that YOLO-SG significantly outperforms the original YOLOv8l-Segment, with a 14.7% increase in mean average precision. Furthermore, inference experiments conducted across various research areas highlight the model’s strong generalization capability.
3

Guo, Jun, Tiancheng Li, and Baigang Du. "Segmentation Head Networks with Harnessing Self-Attention and Transformer for Insulator Surface Defect Detection." Applied Sciences 13, no. 16 (2023): 9109. http://dx.doi.org/10.3390/app13169109.

Abstract:
Current methodologies for insulator defect detection are hindered by limitations in real-world applicability, spatial constraints, high computational demand, and segmentation challenges. To address these shortcomings, this paper presents a robust and fast detection algorithm that combines segmentation head networks with self-attention and a transformer (HST-Net). It is based on You Only Look Once (YOLO) v5 and recognizes and assesses the extent and types of damage on the insulator surface. Firstly, the original backbone network is replaced by transformer cross-stage partial (Transformer-CSP) networks to enrich the network’s ability to capture information across different depths of network feature maps. Secondly, an insulator defect segmentation head network is presented to handle the segmentation of defect areas such as insulator losses and flashovers. It facilitates instance-level mask prediction for each insulator object, significantly reducing the influence of intricate backgrounds. Finally, comparative experiment results show that the positioning accuracy and defect segmentation accuracy of the proposed model both surpass those of other popular models. It can be concluded that the proposed model not only satisfies the requirements for balance between accuracy and speed in power facility inspection, but also provides fresh perspectives for research in other defect detection domains.
4

Xie, Yufei, and Liping Chen. "CBLN-YOLO: An Improved YOLO11n-Seg Network for Cotton Topping in Fields." Agronomy 15, no. 4 (2025): 996. https://doi.org/10.3390/agronomy15040996.

Abstract:
The positioning of the top bud by the topping machine in the cotton topping operation depends on the recognition algorithm. The detection results of the traditional target detection algorithm contain a lot of useless information, which is not conducive to the positioning of the top bud. In order to obtain a more efficient recognition algorithm, we propose a top bud segmentation algorithm CBLN-YOLO based on the YOLO11n-seg model. Firstly, the standard convolution and multihead self-attention (MHSA) mechanisms in YOLO11n-seg are replaced by linear deformable convolution (LDConv) and coordinate attention (CA) mechanisms to reduce the parameter growth rate of the original model and better mine detailed features of the top buds. In the neck, the feature pyramid network (FPN) is reconstructed using an enhanced interlayer feature correlation (EFC) module, and regression loss is calculated using the Inner CIoU loss function. When tested on a self-built dataset, the mAP@0.5 values of CBLN-YOLO for detection and segmentation are 98.3% and 95.8%, respectively, which are higher than traditional segmentation models. At the same time, CBLN-YOLO also shows strong robustness under different weather and time periods, and its recognition speed reaches 135 frames per second, which provides strong support for cotton top bud positioning in the field environment.
5

Xiong, Mengying, Aiping Wu, Yue Yang, and Qingqing Fu. "Efficient Brain Tumor Segmentation for MRI Images Using YOLO-BT." Sensors 25, no. 12 (2025): 3645. https://doi.org/10.3390/s25123645.

Abstract:
Aiming at the problems of inaccurate segmentation and low detection efficiency caused by irregular tumor shape and large size differences in brain MRI images, this study proposes a brain tumor segmentation algorithm, YOLO-BT, based on YOLOv11. YOLO-BT uses UNetV2 as the backbone network to enhance the feature extraction ability of key regions through the attention mechanism. The BiFPN structure is introduced into the neck network to replace the traditional feature splicing method, realize the two-way fusion of cross-scale features, improve detection accuracy, and reduce the amount of calculations required. The D-LKA mechanism is introduced into the C3k2 structure, and the large convolution kernel is used to process complex image information to enhance the model’s ability to characterize different scales and irregular tumors. In this study, multiple sets of experiments were performed on the Figshare Brain Tumor dataset to test the performance of YOLO-BT. The data results show that YOLO-BT improves Precision by 2.7%, Recall, mAP50 by 0.9%, and mAP50-95 by 0.3% in the candidate box-based evaluation compared to YOLOv11. In mask-based evaluations, Precision improved by 2.5%, Recall by 2.8%, mAP50 by 1.1%, and mAP50-95 by 0.5%. At the same time, the mIOU increased by 6.1%, and the Dice coefficient increased by 3.6%. It can be seen that the YOLO-BT algorithm is suitable for brain tumor detection and segmentation.
6

Yu, Caili, Yanheng Mai, Caijuan Yang, Jiaqi Zheng, Yongxin Liu, and Chaoran Yu. "IA-YOLO: A Vatica Segmentation Model Based on an Inverted Attention Block for Drone Cameras." Agriculture 14, no. 12 (2024): 2252. https://doi.org/10.3390/agriculture14122252.

Abstract:
The growing use of drones in precision agriculture highlights the need for enhanced operational efficiency, especially in detection tasks and even in segmentation. Although computer vision based on deep learning has made remarkable progress in the past ten years, the segmentation of images captured by Unmanned Aerial Vehicle (UAV) cameras, an exact detection task, still faces a conflict between high precision and low inference latency. To resolve this dilemma, we propose IA-YOLO (Inverted Attention You Only Look Once), an efficient model based on IA-Block (Inverted Attention Block), with the aim of providing constructive strategies for real-time detection tasks using UAV cameras. The work in this paper is outlined as follows: (1) We construct a component named IA-Block, which is integrated into the YOLOv8-seg structure as IA-YOLO. It specializes in pixel-level classification of UAV camera images, facilitating the creation of exact maps to guide agricultural strategies. (2) In experiments on the Vatica dataset, compared with other lightweight segmentation models, IA-YOLO achieves at least a 3.3% increase in mAP (mean Average Precision). Further validation on diverse species datasets confirms its robust generalization. (3) Without relying on heavy attention mechanisms or ever-deeper networks, a stem that incorporates an efficient feature extraction component, the IA-Block, still possesses credible modeling capabilities.
7

Hua, Yue, Rui Chen, and Hang Qin. "YOLO-DentSeg: A Lightweight Real-Time Model for Accurate Detection and Segmentation of Oral Diseases in Panoramic Radiographs." Electronics 14, no. 4 (2025): 805. https://doi.org/10.3390/electronics14040805.

Abstract:
Panoramic radiography is vital in dentistry, where accurate detection and segmentation of diseased regions aid clinicians in fast, precise diagnosis. However, the current methods struggle with accuracy, speed, feature extraction, and suitability for low-resource devices. To overcome these challenges, this research introduces the YOLO-DentSeg model, a lightweight architecture designed for real-time detection and segmentation of oral dental diseases, which is based on an enhanced version of the YOLOv8n-seg framework. First, the C2f (Channel to Feature Map)-Faster structure is introduced in the backbone network, achieving a lightweight design while improving the model accuracy. Next, the BiFPN (Bidirectional Feature Pyramid Network) structure is employed to enhance its multi-scale feature extraction capabilities. Then, the EMCA (Enhanced Efficient Multi-Channel Attention) mechanism is introduced to improve the model’s focus on key disease features. Finally, the Powerful-IOU (Intersection over Union) loss function is used to optimize the detection box localization accuracy. Experiments show that YOLO-DentSeg achieves a detection precision (mAP50(Box)) of 87%, segmentation precision (mAP50(Seg)) of 85.5%, and a speed of 90.3 FPS. Compared to YOLOv8n-seg, it achieves superior precision and faster inference while decreasing the model size, computational load, and parameter count by 44.9%, 17.5%, and 44.5%, respectively. YOLO-DentSeg enables fast, accurate disease detection and segmentation, making it practical for devices with limited computing power and ideal for real-world dental applications.
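The abstract names BiFPN as the multi-scale fusion structure without describing how inputs are combined. One common formulation, from the EfficientDet paper that introduced BiFPN, is fast normalized fusion: each input gets a learnable non-negative weight, and the weighted sum is divided by the total weight. The sketch below shows that rule on plain Python lists. Whether YOLO-DentSeg uses exactly this variant is an assumption, and the function name and values are hypothetical.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-length feature vectors with non-negative, normalized weights.

    features: list of equally long lists of floats (one per input branch)
    weights:  one scalar per branch; negatives are clipped to zero (ReLU)
    """
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature input")
    clipped = [max(w, 0.0) for w in weights]   # keep weights non-negative
    total = sum(clipped) + eps                 # eps avoids division by zero
    length = len(features[0])
    return [
        sum(w * f[i] for w, f in zip(clipped, features)) / total
        for i in range(length)
    ]


if __name__ == "__main__":
    # Two hypothetical feature vectors from different pyramid levels.
    top_down = [1.0, 2.0]
    lateral = [3.0, 4.0]
    print(fast_normalized_fusion([top_down, lateral], [1.0, 1.0]))
```

In a trained network the weights are learned per fusion node and the inputs are feature maps that have already been resized to a common resolution; the lists here stand in for those tensors.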
8

Rathinam, Vinoth, Sasireka Rajendran, and Valarmathi Krishnasamy. "A unique YOLO-based gated attention deep convolution network-Lichtenberg optimization algorithm model for a precise breast cancer segmentation and classification." International Journal of Electrical and Computer Engineering (IJECE) 15, no. 2 (2025): 1670–85. https://doi.org/10.11591/ijece.v15i2.pp1670-1685.

Abstract:
A novel you only look once (YOLO)-based gated attention deep convolution network (GADCN) classification algorithm is developed and utilized in the present study for the detection of breast cancer. In this framework, contrast enhancement-based histogram equalization is applied initially to produce the normalized breast image with reduced noise artifacts. Then, the breast region is accurately segmented from the preprocessed images with low complexity and segmentation error using the YOLO-based attention network model. To diagnose breast cancer with better accuracy, the GADCN model is used to predict the exact class of image (i.e., benign or malignant). During classification, the activation function is optimally computed with the use of the Lichtenberg optimization algorithm (LOA). It aids in achieving improved classification performance with little complexity in training and assessment. The significance of the present study includes the use of a unique, YOLO-based GADCN-LOA model that helps in the prediction of breast cancer with higher accuracy. It was observed that the model exhibited 99% accuracy for the datasets utilized. In addition, the selected model performs well in terms of sensitivity, specificity, precision, and F1-score. Hence the proposed model could be exploited for the diagnosis of breast cancer at an early stage to enable preventive care.
9

Cao, Lianjun, Xinyu Zheng, and Luming Fang. "The Semantic Segmentation of Standing Tree Images Based on the Yolo V7 Deep Learning Algorithm." Electronics 12, no. 4 (2023): 929. http://dx.doi.org/10.3390/electronics12040929.

Abstract:
The existence of humans and the preservation of the natural ecological equilibrium depend greatly on trees. The semantic segmentation of trees is very important. It is crucial to learn how to properly and automatically extract a tree’s elements from photographic images. Problems with traditional tree image segmentation include low accuracy, a sluggish learning rate, and a large amount of manual intervention. This research suggests the use of a well-known network segmentation technique based on deep learning called Yolo v7 to successfully accomplish the accurate segmentation of tree images. Due to class imbalance in the dataset, we use the weighted loss function and apply various types of weights to each class to enhance the segmentation of the trees. Additionally, we use an attention method to efficiently gather feature data while reducing the production of irrelevant feature data. According to the experimental findings, the revised model algorithm’s evaluation index outperforms other widely used semantic segmentation techniques. In addition, the detection speed of the Yolo v7 model is much faster than other algorithms and performs well in tree segmentation in a variety of environments, demonstrating the effectiveness of this method in improving the segmentation performance of the model for trees in complex environments and providing a more effective solution to the tree segmentation issue.
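The abstract mentions a weighted loss function that applies different weights to each class to counter class imbalance, but gives no formula. A minimal sketch of one standard choice, class-weighted cross-entropy averaged over pixels and normalized by the total weight, is shown below. The specific loss, the weights, and the function name are assumptions for illustration and are not taken from the paper.

```python
import math


def weighted_cross_entropy(probs, labels, class_weights):
    """Class-weighted cross-entropy over a set of pixels.

    probs:         per-pixel lists of predicted class probabilities
    labels:        per-pixel integer class indices (ground truth)
    class_weights: one weight per class; rarer classes get larger weights
    """
    total, weight_sum = 0.0, 0.0
    for p, y in zip(probs, labels):
        w = class_weights[y]
        # Clamp the probability so log() never receives zero.
        total += -w * math.log(max(p[y], 1e-12))
        weight_sum += w
    return total / weight_sum   # normalize by the total weight


if __name__ == "__main__":
    # Hypothetical two-class case: class 0 = background, class 1 = tree.
    probs = [[0.9, 0.1], [0.6, 0.4], [0.3, 0.7]]
    labels = [0, 0, 1]
    print(weighted_cross_entropy(probs, labels, class_weights=[1.0, 3.0]))
```

Raising the weight of the under-represented class makes its pixels contribute more to the loss, so errors on that class are penalized more heavily during training.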
11

Zhao, Ziyu, Zhedong Ge, Mengying Jia, Xiaoxia Yang, Ruicheng Ding, and Yucheng Zhou. "A Particleboard Surface Defect Detection Method Research Based on the Deep Learning Algorithm." Sensors 22, no. 20 (2022): 7733. http://dx.doi.org/10.3390/s22207733.

Abstract:
Particleboard surface defects have a significant impact on product quality. A surface defect detection method is essential to enhancing the quality of particleboard because the conventional defect detection method has low accuracy and efficiency. This paper proposes a YOLO v5-Seg-Lab-4 (You Only Look Once v5 Segmentation-Lab-4) model based on deep learning. The model integrates object detection and semantic segmentation, which ensures real-time performance and improves the detection accuracy of the model. Firstly, YOLO v5s is used as the object detection network, and the SELayer module is added to it to improve the adaptability of the model to the receptive field. Then, the Seg-Lab v3+ model is designed on the basis of DeepLab v3+. In this model, the object detection network is utilized as the backbone network for feature extraction, and the expansion rate of the atrous convolution is reduced to lower the computational complexity of the model. The channel attention mechanism is added to the feature fusion module to enhance the feature characterization capabilities of the network and to realize rapid and accurate detection of small objects with a lightweight network. Experimental results indicate that the proposed YOLO v5-Seg-Lab-4 model has mAP (Mean Average Precision) and mIoU (Mean Intersection over Union) of 93.20% and 76.63%, respectively, with a recognition efficiency of 56.02 fps. Finally, a case study of the Huizhou particleboard factory inspection is carried out to demonstrate the small-defect detection accuracy and real-time performance of the proposed method; the missed detection rate of particleboard surface defects is less than 1.8%.
12

Fouladi, Saman, Luca Di Palma, Fatemeh Darvizeh, et al. "Neural Network Models for Prostate Zones Segmentation in Magnetic Resonance Imaging." Information 16, no. 3 (2025): 186. https://doi.org/10.3390/info16030186.

Abstract:
Prostate cancer (PCa) is one of the most common tumors diagnosed in men worldwide, with approximately 1.7 million new cases expected by 2030. Most cancerous lesions in PCa are located in the peripheral zone (PZ); therefore, accurate identification of the location of the lesion is essential for effective diagnosis and treatment. Zonal segmentation in magnetic resonance imaging (MRI) scans is critical and plays a key role in pinpointing cancerous regions and treatment strategies. In this work, we report on the development of three advanced neural network-based models: one based on ensemble learning, one on Meta-Net, and one on YOLO-V8. They were tailored for the segmentation of the central gland (CG) and PZ using a small dataset of 90 MRI scans for training, 25 MRIs for validation, and 24 scans for testing. The ensemble learning method, combining U-Net-based models (Attention-Res-U-Net, Vanilla-Net, and V-Net), achieved an IoU of 79.3% and DSC of 88.4% for CG and an IoU of 54.5% and DSC of 70.5% for PZ on the test set. Meta-Net, used for the first time in segmentation, demonstrated an IoU of 78% and DSC of 88% for CG, while YOLO-V8 outperformed both models with an IoU of 80% and DSC of 89% for CG and an IoU of 58% and DSC of 73% for PZ.
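The results above are reported as IoU and DSC (Dice similarity coefficient). For readers comparing the two, the sketch below computes both for a pair of binary masks given as flat 0/1 sequences. The function name and the convention for two empty masks are assumptions made for illustration.

```python
def iou_and_dice(pred, target):
    """Return (IoU, Dice) for two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_area = sum(1 for p in pred if p)
    target_area = sum(1 for t in target if t)
    union = pred_area + target_area - intersection
    if union == 0:
        # Both masks empty: treated here as perfect agreement (a convention).
        return 1.0, 1.0
    iou = intersection / union
    dice = 2 * intersection / (pred_area + target_area)
    return iou, dice


if __name__ == "__main__":
    # Hypothetical 4-pixel masks.
    iou, dice = iou_and_dice([1, 1, 0, 0], [1, 0, 1, 0])
    print(f"IoU = {iou:.3f}, Dice = {dice:.3f}")
```

For a single pair of masks the two metrics are tied by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU. The identity does not have to hold exactly for values averaged over a test set.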
13

Zheng, Shuhe, Yang Liu, Wuxiong Weng, Xuexin Jia, Shilong Yu, and Zuoxun Wu. "Tomato Recognition and Localization Method Based on Improved YOLOv5n-seg Model and Binocular Stereo Vision." Agronomy 13, no. 9 (2023): 2339. http://dx.doi.org/10.3390/agronomy13092339.

Abstract:
Recognition and localization of fruits are key components to achieve automated fruit picking. However, current neural-network-based fruit recognition algorithms have disadvantages such as high complexity. Traditional stereo matching algorithms also have low accuracy. To solve these problems, this study targeting greenhouse tomatoes proposed an algorithm framework based on YOLO-TomatoSeg, a lightweight tomato instance segmentation model improved from YOLOv5n-seg, and an accurate tomato localization approach using RAFT-Stereo disparity estimation and least squares point cloud fitting. First, binocular tomato images were captured using a binocular camera system. The left image was processed by YOLO-TomatoSeg to segment tomato instances and generate masks. Concurrently, RAFT-Stereo estimated image disparity for computing the original depth point cloud. Then, the point cloud was clipped by tomato masks to isolate tomato point clouds, which were further preprocessed. Finally, a least squares sphere fitting method estimated the 3D centroid co-ordinates and radii of tomatoes by fitting the tomato point clouds to spherical models. The experimental results showed that, in the tomato instance segmentation stage, the YOLO-TomatoSeg model replaced the Backbone network of YOLOv5n-seg with the building blocks of ShuffleNetV2 and incorporated an SE attention module, which reduced model complexity while improving model segmentation accuracy. Ultimately, the YOLO-TomatoSeg model achieved an AP of 99.01% with a size of only 2.52 MB, significantly outperforming mainstream instance segmentation models such as Mask R-CNN (98.30% AP) and YOLACT (96.49% AP). The model size was reduced by 68.3% compared to the original YOLOv5n-seg model. In the tomato localization stage, at the range of 280 mm to 480 mm, the average error of the tomato centroid localization was affected by occlusion and sunlight conditions. The maximum average localization error was ±5.0 mm, meeting the localization accuracy requirements of the tomato-picking robots. This study developed a lightweight tomato instance segmentation model and achieved accurate localization of tomato, which can facilitate research, development, and application of fruit-picking robots.
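The abstract describes estimating each tomato's 3D centroid and radius by least squares sphere fitting but does not say which formulation was used. The sketch below uses the common algebraic (linearized) fit, which rewrites the sphere equation as 2ax + 2by + 2cz + d = x² + y² + z² with d = r² − a² − b² − c² and solves the normal equations. This choice, the helper names, and the sample points are assumptions for illustration.

```python
import math


def solve_linear_system(matrix, rhs):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("points are degenerate (for example, coplanar)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in range(n - 1, -1, -1):   # back substitution
        tail = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution


def fit_sphere(points):
    """Algebraic least squares sphere fit; returns ((a, b, c), radius)."""
    rows = [[2 * x, 2 * y, 2 * z, 1.0] for x, y, z in points]
    targets = [x * x + y * y + z * z for x, y, z in points]
    # Normal equations: (A^T A) u = A^T b with u = (a, b, c, d).
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(4)]
    a, b, c, d = solve_linear_system(ata, atb)
    radius = math.sqrt(d + a * a + b * b + c * c)
    return (a, b, c), radius


if __name__ == "__main__":
    # Hypothetical points (in mm) on a sphere centered at (1, 2, 3) with radius 5.
    pts = [(6, 2, 3), (-4, 2, 3), (1, 7, 3), (1, -3, 3), (1, 2, 8), (1, 2, -2)]
    center, radius = fit_sphere(pts)
    print(center, radius)
```

A stereo point cloud covers only the visible side of the fruit and contains noise, so a practical pipeline would typically filter outliers before fitting; the abstract mentions preprocessing of the tomato point clouds without giving details.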
14

Liu, Yanan, Ai Zhang, and Peng Gao. "From Crown Detection to Boundary Segmentation: Advancing Forest Analytics with Enhanced YOLO Model and Airborne LiDAR Point Clouds." Forests 16, no. 2 (2025): 248. https://doi.org/10.3390/f16020248.

Abstract:
Individual tree segmentation is crucial to extract forest structural parameters, which is vital for forest resource management and ecological monitoring. Airborne LiDAR (ALS), with its ability to rapidly and accurately acquire three-dimensional forest structural information, has become an essential tool for large-scale forest monitoring. However, accurately locating individual trees and mapping canopy boundaries continues to be hindered by the overlapping nature of the tree canopies, especially in dense forests. To address these issues, this study introduces CCD-YOLO, a novel deep learning-based network for individual tree segmentation from the ALS point cloud. The proposed approach introduces key architectural enhancements to the YOLO framework, including (1) the integration of a cross residual transformer network extended (CReToNeXt) backbone for feature extraction and multi-scale feature fusion, (2) the application of the convolutional block attention module (CBAM) to emphasize tree crown features while suppressing noise, and (3) a dynamic head for adaptive multi-layer feature fusion, enhancing boundary delineation accuracy. The proposed network was trained using a newly generated individual tree segmentation (ITS) dataset collected from a dense forest. A comprehensive evaluation of the experimental results was conducted across varying forest densities, encompassing a variety of both internal and external consistency assessments. The model outperforms the commonly used watershed algorithm and commercial LiDAR 360 software, achieving the highest indices (precision, F1, and recall) in both tree crown detection and boundary segmentation stages. This study highlights the potential of CCD-YOLO as an efficient and scalable solution for addressing the critical challenges of accurate segmentation in complex forests. In the future, we will focus on enhancing the model’s performance and application.
15

Yan, Pengcheng, Jiarui Liang, Xiaolin Tian, and Yikui Zhai. "A New Lunar Lineament Extraction Method Based on Improved UNet++ and YOLOv5." Sensors 24, no. 7 (2024): 2256. http://dx.doi.org/10.3390/s24072256.

Abstract:
Lineament is a unique geological structure. The study of Lunar lineament structure is of great significance for understanding the history and evolution of the Lunar surface. However, the existing geographic feature extraction methods are not suitable for the extraction of Lunar lineament structure. In this paper, a new lineament extraction method is proposed based on improved-UNet++ and YOLOv5. Firstly, a new dataset containing lineament structures is created based on CCD data from LROC. At the same time, the residual blocks are replaced with the VGG blocks in the down-sampling part of UNet++, and an attention block is added between each layer. Secondly, the improved-UNet++ and YOLO networks are trained to perform the semantic segmentation and object detection of lineament structure, respectively. Finally, a polygon-match strategy is proposed to combine the results of object detection and semantic segmentation. The experimental results indicate that this new method has relatively better and more stable performance compared with current mainstream networks and the original UNet++ network in the instance segmentation of lineament structure. Additionally, the polygon-match strategy yields more precise edge detail in the instance segmentation results for lineament structure.
16

Jiang, Long, Weitao Chen, Hongtai Shi, Hongwen Zhang, and Lei Wang. "Cotton-YOLO-Seg: An Enhanced YOLOV8 Model for Impurity Rate Detection in Machine-Picked Seed Cotton." Agriculture 14, no. 9 (2024): 1499. http://dx.doi.org/10.3390/agriculture14091499.

Abstract:
The detection of the impurity rate in machine-picked seed cotton is crucial for precision agriculture. This study proposes a novel Cotton-YOLO-Seg cotton-impurity instance segmentation algorithm based on the you only look once version 8 small segmentation model (Yolov8s-Seg). The algorithm achieves precise pixel-level segmentation of cotton and impurities in seed cotton images and establishes a detection model for the impurity rate, enabling accurate detection of the impurity rate in machine-picked cotton. The proposed algorithm removes the Pyramid 4 (P4) feature layer and incorporates Multi-Scale Convolutional Block Attention (MSCBCA) that integrates the Convolutional Block Attention Module (CBAM) and Multi-Scale Convolutional Attention (MSCA) into the Faster Implementation of Cross Stage Partial Bottleneck with 2 Convolutions (C2f) module of the feature extraction network, forming a novel C2f_MSCBCA module. The SlimNeck structure is introduced in the feature fusion network by replacing the P4 feature layer with the small-target detection layer Pyramid 2 (P2). Additionally, transfer learning is employed using the Common Objects in Context (COCO) instance segmentation dataset. The analysis of 100 groups of cotton image samples shows that the Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and Mean Absolute Percentage Error (MAPE) for impurity rate detection are 0.29%, 0.33%, and 3.70%, respectively, which are reduced by 52.46%, 48.44%, and 53.75% compared to the Yolov8s-seg model. The Precision (P), Recall (R), and mean Average Precision at an intersection over union of 0.5 (mAP@0.5) are 85.4%, 78.4%, and 80.8%, respectively, which are improved by 4.2%, 6.2%, and 6.4% compared to Yolov8s-seg model, significantly enhancing the segmentation performance of minor impurities. The Cotton-YOLO-Seg model demonstrates practical significance for precisely detecting the impurity rate in machine-picked seed cotton.
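The impurity-rate results above are reported as MAE, RMSE, and MAPE. As a quick reference for how these three error measures are computed from paired predictions and reference values, a minimal sketch follows. The function name and sample numbers are hypothetical and are not data from the paper.

```python
import math


def error_metrics(predicted, actual):
    """Return (MAE, RMSE, MAPE in percent) for paired predictions and references."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - a for p, a in zip(predicted, actual)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # MAPE divides by each reference value, so references must be non-zero.
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return mae, rmse, mape


if __name__ == "__main__":
    # Hypothetical impurity rates in percent: model estimate vs. measured value.
    estimated = [7.9, 8.4, 6.1]
    measured = [8.2, 8.0, 6.3]
    mae, rmse, mape = error_metrics(estimated, measured)
    print(f"MAE = {mae:.3f}, RMSE = {rmse:.3f}, MAPE = {mape:.2f}%")
```

MAE and RMSE carry the unit of the measured quantity (here, percentage points of impurity rate), while MAPE is relative to each reference value, which is why the abstract can report a MAPE much larger than the MAE.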
17

Son, Chang-Hwan. "Leaf Spot Attention Networks Based on Spot Feature Encoding for Leaf Disease Identification and Detection." Applied Sciences 11, no. 17 (2021): 7960. http://dx.doi.org/10.3390/app11177960.

Abstract:
This study proposes a new attention-enhanced YOLO model that incorporates a leaf spot attention mechanism based on regions-of-interest (ROI) feature extraction into the YOLO framework for leaf disease detection. Inspired by a previous study, which revealed that leaf spot attention based on the ROI-aware feature extraction can improve leaf disease recognition accuracy significantly and outperform state-of-the-art deep learning models, this study extends the leaf spot attention model to leaf disease detection. The primary idea is that spot areas indicating leaf diseases appear only in leaves, whereas the background area does not contain useful information regarding leaf diseases. To increase the discriminative power of the feature extractor that is required in the object detection framework, it is essential to extract informative and discriminative features from the spot and leaf areas. To realize this, a new ROI-aware feature extractor, that is, a spot feature extractor was designed. To divide the leaf image into spot, leaf, and background areas, the leaf segmentation module was first pretrained, and then spot feature encoding was applied to encode spot information. Next, the ROI-aware feature extractor was connected to an ROI-aware feature fusion layer to model the leaf spot attention mechanism, and to be joined with the YOLO detection subnetwork. The experimental results confirm that the proposed ROI-aware feature extractor can improve leaf disease detection by boosting the discriminative power of the spot features. In addition, the proposed attention-enhanced YOLO model outperforms conventional state-of-the-art object detection models.
18

Wang, Xun, Hanlin Li, and Pan Zheng. "Automatic Detection and Segmentation of Ovarian Cancer Using a Multitask Model in Pelvic CT Images." Oxidative Medicine and Cellular Longevity 2022 (October 11, 2022): 1–13. http://dx.doi.org/10.1155/2022/6009107.

Abstract:
Ovarian cancer is one of the most common malignant tumours of female reproductive organs in the world. The pelvic CT scan is a common examination method used for the screening of ovarian cancer, which shows the advantages in safety, efficiency, and providing high-resolution images. Recently, deep learning applications in medical imaging attract more and more attention in the research field of tumour diagnostics. However, due to the limited number of relevant datasets and reliable deep learning models, it remains a challenging problem to detect ovarian tumours on CT images. In this work, we first collected CT images of 223 ovarian cancer patients in the Affiliated Hospital of Qingdao University. A new end-to-end network based on YOLOv5 is proposed, namely, YOLO-OCv2 (ovarian cancer). We improved the previous work YOLO-OC firstly, including balanced mosaic data enhancement and decoupled detection head. Then, based on the detection model, a multitask model is proposed, which can simultaneously complete the detection and segmentation tasks.
19

Sun, Jinsheng, Xiaojuan Ban, Bing Han, Xueyuan Yang, and Chao Yao. "Interactive Image Segmentation Based on Feature-Aware Attention." Symmetry 14, no. 11 (2022): 2396. http://dx.doi.org/10.3390/sym14112396.

Abstract:
Interactive segmentation is a technique for picking objects of interest in images according to users’ input interactions. Some recent works take the users’ interactive input to guide the deep neural network training, where the users’ click information is utilized as weak-supervised information. However, limited by the learning capability of the model, this structure does not accurately represent the user’s interaction intention. In this work, we propose a multi-click interactive segmentation solution for employing human intention to refine the segmentation results. We propose a coarse segmentation network to extract semantic information and generate rough results. Then, we designed a feature-aware attention module according to the symmetry of user intention and image semantic information. Finally, we establish a refinement module to combine the feature-aware results with coarse masks to generate precise intentional segmentation. Furthermore, the feature-aware module is trained as a plug-and-play tool, which can be embedded into most deep image segmentation models for exploiting users’ click information in the training process. We conduct experiments on five common datasets (SBD, GrabCut, DAVIS, Berkeley, MS COCO) and the results prove our attention module can improve the performance of image segmentation networks.
20

Fan, Zhiyong, Jianmin Hou, Qiang Zang, Yunjie Chen, and Fei Yan. "River Segmentation of Remote Sensing Images Based on Composite Attention Network." Complexity 2022 (January 5, 2022): 1–13. http://dx.doi.org/10.1155/2022/7750281.

Abstract:
River segmentation of remote sensing images is of important research significance and application value for environmental monitoring, disaster warning, and agricultural planning in an area. In this study, we propose a river segmentation model in remote sensing images based on composite attention network to solve the problems of abundant river details in images and the interference of non-river information including bridges, shadows, and roads. To improve the segmentation efficiency, a composite attention mechanism is firstly introduced in the central region of the network to obtain the global feature dependence of river information. Next, in this study, we dynamically combine binary cross-entropy loss that is designed for pixel-wise segmentation and the Dice coefficient loss that measures the similarity of two segmentation objects into a weighted one to optimize the training process of the proposed segmentation network. The experimental results show that compared with other semantic segmentation networks, the evaluation indexes of the proposed method are higher than those of others, and the river segmentation effect of CoANet model is significantly improved. This method can segment rivers in remote sensing images more accurately and coherently, which can meet the needs of subsequent research.
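
A weighted combination of binary cross-entropy and Dice loss, as described in this abstract, might look like the sketch below. The abstract says the two terms are combined dynamically but does not give the weighting rule, so a fixed mixing weight `alpha` is assumed here; all function names, `alpha`, and the `smooth` term are illustrative choices.

```python
import math


def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over flattened pixel probabilities."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """One minus the soft Dice coefficient of prediction and ground truth."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


def combined_loss(probs, targets, alpha=0.5):
    """Weighted sum of the two losses; alpha is an assumed fixed weight."""
    return alpha * bce_loss(probs, targets) + (1.0 - alpha) * dice_loss(probs, targets)


# Hypothetical four-pixel example: 1 marks river, 0 marks background.
targets = [1, 0, 1, 0]
good = combined_loss([0.9, 0.1, 0.8, 0.2], targets)
bad = combined_loss([0.1, 0.9, 0.2, 0.8], targets)
```

The cross-entropy term scores each pixel independently, whereas the Dice term measures region overlap, so the combination is often used when the foreground class covers only a small part of the image.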
21

Sevugan, Prabu, Venkatesan Rudhrakoti, Tai-hoon Kim, et al. "Class-aware feature attention-based semantic segmentation on hyperspectral images." PLOS ONE 20, no. 2 (2025): e0309997. https://doi.org/10.1371/journal.pone.0309997.

Abstract:
This research explores an innovative approach to segmenting hyperspectral images. A class-aware feature-based attention approach is combined with an enhanced attention-based network to form FAttNet, which is proposed to segment hyperspectral images semantically. It is introduced to address challenges associated with inaccurate edge segmentation, diverse forms of target inconsistency, and suboptimal predictive efficacy encountered in traditional segmentation networks when applied to semantic segmentation tasks in hyperspectral images. First, the class-aware feature attention procedure is used to improve the extraction and processing of distinct types of semantic information. Subsequently, the spatial attention pyramid is employed in a parallel fashion to improve the correlation between spaces and extract context information from images at different scales. Finally, the segmentation results are refined using the encoder-decoder structure, which enhances precision in delineating distinct land cover patterns. The experimental findings demonstrate that FAttNet outperforms commonly used semantic segmentation networks. Specifically, on the GaoFen image dataset, FAttNet achieves a higher mean intersection over union (MIoU) of 77.03% and a segmentation accuracy of 87.26%, surpassing the performance of the existing network.
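
The mean intersection over union reported here can be computed from flattened label maps as in the following sketch. It assumes integer class labels and skips any class that appears in neither map; the function name and sample labels are hypothetical.

```python
def mean_iou(true_labels, pred_labels, num_classes):
    """Average the per-class IoU over classes present in either label map."""
    ious = []
    for c in range(num_classes):
        pairs = list(zip(true_labels, pred_labels))
        intersection = sum(1 for t, p in pairs if t == c and p == c)
        union = sum(1 for t, p in pairs if t == c or p == c)
        if union > 0:  # ignore classes absent from both maps
            ious.append(intersection / union)
    if not ious:
        raise ValueError("no class is present in either label map")
    return sum(ious) / len(ious)


# Hypothetical six-pixel example with three land-cover classes.
true_labels = [0, 0, 1, 1, 2, 2]
pred_labels = [0, 1, 1, 1, 2, 0]
score = mean_iou(true_labels, pred_labels, num_classes=3)
```

For these sample labels the per-class IoU values are 1/3, 2/3, and 1/2, so the mean is 0.5.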
22

Han, Yaosheng, Yunpeng Jin, Chunmei Li, and Xiangjie Huang. "Improvement of YOLO v8 Segmentation Algorithm and Its Study in the Identification of Hazards in Plateau Pika." Applied Sciences 14, no. 23 (2024): 11088. http://dx.doi.org/10.3390/app142311088.

Abstract:
Rodent infestation has become one of the important factors in grassland degradation on the Qinghai–Tibet Plateau, one of the hindrances to ecological and environmental protection, and a threat to the balance and development of the ecosystem in the Sanjiangyuan region. Based on the need for scientific planning for ecological protection, this paper designs a method for detecting rodent infestation in plateau scenarios. Firstly, data were collected and annotated, and a dataset of plateau rodent distribution in the Qinghai region was constructed. The collected data include videos captured through drone-based field surveys, which were processed using OpenCV and annotated with LabelMe. The dataset is categorized into four specific types: ungobbled rat holes, gobbled rat holes, rocks, and cow dung. This categorization allows the model to effectively differentiate between rodent-related features and other environmental elements, which is crucial for the segmentation task. Secondly, the latest segmentation algorithm provided by YOLO v8 is improved to accurately detect the distribution of rodent infestation in the plateau scene. The specific improvements are as follows: firstly, the Contextual Transformer module is introduced in YOLO v8 to improve the global modeling capability; secondly, the DRConv dynamic region-aware convolution is introduced in YOLO v8 to improve the convolutional representation capability; thirdly, the attention mechanism is incorporated in the backbone of YOLO v8 to enhance the feature extraction capability of the network. A comparison test with the original algorithm on the plateau rodent distribution dataset showed that the new algorithm improved the detection accuracy from 77.9% to 82.74% and MIoU from 67.65% to 72.69%. The accuracy of the evaluation of plateau rodent damage levels has been greatly improved.
23

Jin, Houxin, Le Cao, Xiu Kan, Weizhou Sun, Wei Yao, and Xialin Wang. "Coal petrography extraction approach based on multiscale mixed-attention-based residual U-net." Measurement Science and Technology 33, no. 7 (2022): 075402. http://dx.doi.org/10.1088/1361-6501/ac5439.

Abstract:
Coal petrography extraction is crucial for the accurate analysis of coal reaction characteristics in coal gasification, coal coking, and metal smelting. Nevertheless, automatic extraction remains a challenging task because of the grayscale overlap between exinite and background regions in coal photomicrographs. Inspired by the excellent performance of neural networks in the image segmentation field, this study proposes a reliable coal petrography extraction method that achieves precise segmentation of coal petrography from the background regions. This method uses a novel semantic segmentation model based on Unet, referred to as M2AR-Unet. To improve the efficiency of network learning, the proposed M2AR-Unet framework takes Unet as a baseline and further optimizes the network structure in four ways, namely, an improved residual block composed of four units, a mixed attention module containing multiple attention mechanisms, an edge feature enhancement strategy, and a multiscale feature extraction module composed of a feature pyramid and atrous spatial pyramid pooling module. Compared to current state-of-the-art segmentation network models, the proposed M2AR-Unet offers improved coal petrography extraction integrity and edge extraction.
24

Li, Huachang, Jing Zhong, Liyan Lin, Yanping Chen, and Peng Shi. "Semi-supervised nuclei segmentation based on multi-edge features fusion attention network." PLOS ONE 18, no. 5 (2023): e0286161. http://dx.doi.org/10.1371/journal.pone.0286161.

Abstract:
The morphology of the nuclei represents most of the clinical pathological information, and nuclei segmentation is a vital step in current automated histopathological image analysis. Supervised machine learning-based segmentation models have already achieved outstanding performance with sufficiently precise human annotations. Nevertheless, outlining such labels on numerous nuclei demands considerable expertise and is time consuming. Automatic nuclei segmentation with minimal manual intervention is highly needed to promote the effectiveness of clinical pathological research. Semi-supervised learning greatly reduces the dependence on labeled samples while ensuring sufficient accuracy. In this paper, we propose a Multi-Edge Feature Fusion Attention Network (MEFFA-Net) with three feature inputs including image, pseudo-mask and edge, which enhances its learning ability by considering multiple features. Only a few labeled nuclei boundaries are used to train annotations on the remaining mostly unlabeled data. The MEFFA-Net creates more precise boundary masks for nucleus segmentation based on pseudo-masks, which greatly reduces the dependence on manual labeling. The MEFFA-Block focuses on the nuclei outline and selects features conducive to segmentation, making full use of the multiple features in segmentation. Experimental results on public multi-organ databases including MoNuSeg, CPM-17 and CoNSeP show that the proposed model has mean IoU segmentation evaluations of 0.706, 0.751, and 0.722, respectively. The model also achieves better results than some cutting-edge methods while the labeling work is reduced to 1/8 of common supervised strategies. Our method provides a more efficient and accurate basis for nuclei segmentation and further quantification in pathological research.
25

Jiang, Yun, Tongtong Cheng, Jinkun Dong, et al. "Dermoscopic image segmentation based on Pyramid Residual Attention Module." PLOS ONE 17, no. 9 (2022): e0267380. http://dx.doi.org/10.1371/journal.pone.0267380.

Abstract:
We propose a stacked convolutional neural network incorporating a novel and efficient pyramid residual attention (PRA) module for the task of automatic segmentation of dermoscopic images. Precise segmentation is a significant and challenging step for computer-aided diagnosis technology in skin lesion diagnosis and treatment. The proposed PRA has the following characteristics. First, we concentrate on three widely used modules in the PRA. The purpose of the pyramid structure is to extract the feature information of the lesion area at different scales, the residual structure aims to ensure the efficiency of model training, and the attention mechanism is used to screen effective feature maps. Thanks to the PRA, our network can still obtain precise boundary information that distinguishes healthy skin from diseased areas for the blurred lesion areas. Second, the proposed PRA can increase the segmentation ability of a single module for lesion regions through efficient stacking. Third, we incorporate the idea of encoder-decoder into the architecture of the overall network. Compared with the traditional networks, we divide the segmentation procedure into three levels and construct the pyramid residual attention network (PRAN). The shallow layer mainly processes spatial information, the middle layer refines both spatial and semantic information, and the deep layer intensively learns semantic information. The basic module of PRAN is PRA, which is enough to ensure the efficiency of the three-layer architecture network. We extensively evaluate our method on ISIC2017 and ISIC2018 datasets. The experimental results demonstrate that PRAN can obtain better segmentation performance comparable to state-of-the-art deep learning models under the same experiment environment conditions.
26

Ji, Naihua, Huiqian Dong, Fanyun Meng, and Liping Pang. "Semantic Segmentation and Depth Estimation Based on Residual Attention Mechanism." Sensors 23, no. 17 (2023): 7466. http://dx.doi.org/10.3390/s23177466.

Abstract:
Semantic segmentation and depth estimation are crucial components in the field of autonomous driving for scene understanding. Jointly learning these tasks can lead to a better understanding of scenarios. However, using task-specific networks to extract global features from task-shared networks can be inadequate. To address this issue, we propose a multi-task residual attention network (MTRAN) that consists of a global shared network and two attention networks dedicated to semantic segmentation and depth estimation. The convolutional block attention module is used to highlight the global feature map, and residual connections are added to prevent network degradation problems. To ensure manageable task loss and prevent specific tasks from dominating the training process, we introduce a random-weighted strategy into the impartial multi-task learning method. We conduct experiments to demonstrate the effectiveness of the proposed method.
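
The random-weighted strategy mentioned in this abstract is not specified in detail. One plausible reading, assumed here, is that each training step draws one raw weight per task from a standard normal distribution and normalizes the weights with a softmax. The function names and the loss values below are hypothetical.

```python
import math
import random


def random_task_weights(num_tasks, rng):
    """Sample one weight per task and normalize with a softmax (assumed scheme)."""
    raw = [rng.gauss(0.0, 1.0) for _ in range(num_tasks)]
    peak = max(raw)  # subtract the maximum for numerical stability
    exps = [math.exp(r - peak) for r in raw]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_total_loss(task_losses, weights):
    """Combine per-task losses into the scalar that would be backpropagated."""
    return sum(w * loss for w, loss in zip(weights, task_losses))


# Hypothetical losses for the segmentation and depth-estimation tasks.
rng = random.Random(0)
weights = random_task_weights(2, rng)
total_loss = weighted_total_loss([0.8, 0.2], weights)
```

Because the weights are resampled at every step, no single task keeps a permanently dominant weight, which matches the stated goal of preventing one task from dominating training.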
27

Fang, Jie, QingBiao Zhou, and Shuxia Wang. "Segmentation Technology of Nucleus Image Based on U-Net Network." Scientific Programming 2021 (June 10, 2021): 1–10. http://dx.doi.org/10.1155/2021/1892497.

Abstract:
To solve the problems of rough edge and poor segmentation accuracy of traditional neural networks in small nucleus image segmentation, a nucleus image segmentation technology based on U-Net network is proposed. First, the U-Net network is used to segment the nucleus image, which stitches the feature images in the channel dimension to achieve feature fusion, and the skip structure is used to combine the low- and high-level features. Then, the subregional average pooling is proposed to improve the global average pooling in the attention module, and an attention channel expansion module is designed to improve the accuracy of image segmentation. Finally, the improved attention module is integrated into the U-Net network to achieve accurate segmentation of the nuclear image. Based on the Python platform, the experimental results show that the proposed segmentation technology can achieve fast convergence, and the mean intersection over union (MIoU) is 85.02%, which is better than other comparison technologies and has a good application prospect.
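
The contrast between global average pooling and a subregional variant can be illustrated as follows. The abstract does not define the subregions, so this sketch assumes a regular `grid` x `grid` partition of a single-channel feature map; the function names and the sample map are hypothetical.

```python
def global_average_pool(feature_map):
    """Collapse a 2D feature map into a single mean value."""
    values = [v for row in feature_map for v in row]
    return sum(values) / len(values)


def subregional_average_pool(feature_map, grid):
    """Average each cell of a grid x grid partition (assumed subregion layout)."""
    height, width = len(feature_map), len(feature_map[0])
    if height % grid or width % grid:
        raise ValueError("feature map size must be divisible by grid")
    block_h, block_w = height // grid, width // grid
    pooled = []
    for gi in range(grid):
        row = []
        for gj in range(grid):
            block = [
                feature_map[i][j]
                for i in range(gi * block_h, (gi + 1) * block_h)
                for j in range(gj * block_w, (gj + 1) * block_w)
            ]
            row.append(sum(block) / len(block))
        pooled.append(row)
    return pooled


# Hypothetical 4 x 4 map with one weak and one strong response region.
feature_map = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 4, 4],
    [0, 0, 4, 4],
]
global_value = global_average_pool(feature_map)
regional_values = subregional_average_pool(feature_map, grid=2)
```

Global pooling reduces this map to the single value 1.25, while the 2 x 2 subregional version keeps the two response regions separate as `[[1.0, 0.0], [0.0, 4.0]]`, retaining coarse spatial information for the attention module.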
28

Li, Zhou, and Hanan Aljuaid. "Integrated Design of Building Environment Based on Image Segmentation and Retrieval Technology." International Journal of Information Technologies and Systems Approach 17, no. 1 (2024): 1–14. http://dx.doi.org/10.4018/ijitsa.340774.

Abstract:
Existing models still exhibit a deficiency in capturing more detailed contextual information when processing architectural images. This paper introduces a model for architectural image segmentation and retrieval based on an image segmentation network. Primarily, spatial attention is incorporated into the U-Net segmentation network to enhance the extraction of image features. Subsequently, a dual-path attention mechanism is integrated into the U-Net backbone network, facilitating the seamless integration of information across different spaces and scales. Experimental results showcase the superior performance of the proposed model on the test set, with average dice coefficient, accuracy, and recall reaching 94.67%, 95.61%, and 97.88%, respectively, outperforming comparative models. The proposed model can enhance the U-Net network's capability to identify targets within feature maps. The amalgamation of image segmentation networks and attention mechanisms in artificial intelligence technology enables precise segmentation and retrieval of architectural images.
29

Qu, SiCong, YiLin Dong, ChangMing Zhu, Lei Cao, and KeZhu Zuo. "Attention-based credible evidential segmentation network for remote sensing ship segmentation." International Journal of Approximate Reasoning 184 (September 2025): 109456. https://doi.org/10.1016/j.ijar.2025.109456.

30

Xia, Min, Junhao Qian, Xiaodong Zhang, Jia Liu, and Yiqing Xu. "River segmentation based on separable attention residual network." Journal of Applied Remote Sensing 14, no. 03 (2019): 1. http://dx.doi.org/10.1117/1.jrs.14.032602.

31

Liu, Changhua, Tao Qin, and Liangjin Liu. "Evaluation of Ischemic Penumbra in Stroke Patients Based on Deep Learning and Multimodal CT." Journal of Healthcare Engineering 2021 (November 30, 2021): 1–12. http://dx.doi.org/10.1155/2021/3215107.

Abstract:
This study investigates the value of multimodal CT for the quantitative assessment of collateral circulation, the ischemic semidark zone, and core infarct volume in patients with acute ischemic stroke (AIS), and for prognosis assessment in intravenous thrombolytic therapy. A segmentation model based on the self-attention mechanism is prone to generating attention coefficient maps with incorrect regions of interest. Moreover, the stroke lesion is not clearly characterized, and the lesion boundary is poorly differentiated from normal brain tissue, thus affecting the segmentation performance. To address this problem, a primary and secondary path attention compensation network structure is proposed, which is based on the improved global attention upsampling U-Net model. The main path network is responsible for performing accurate lesion segmentation and outputting segmentation results. Likewise, the auxiliary path network generates loose auxiliary attention compensation coefficients, which compensate for possible attention coefficient errors in the main path network. Two hybrid loss functions are proposed to realize the respective functions of the main and auxiliary path networks. It is experimentally demonstrated that both the improved global attention upsampling U-Net and the proposed primary and secondary path attention compensation networks show significant improvement in segmentation performance. Moreover, patients with good collateral circulation have a small final infarct area volume and a good clinical prognosis after intravenous thrombolysis. Quantitative assessment of collateral circulation and ischemic semidark zone by multimodal CT can better predict the clinical prognosis of intravenous thrombolysis.
32

Xia, Shuang, Qian Sun, Yiheng Zhou, et al. "A Lightweight Neural Network for Cell Segmentation Based on Attention Enhancement." Information 16, no. 4 (2025): 295. https://doi.org/10.3390/info16040295.

Abstract:
Deep neural networks have made significant strides in medical image segmentation tasks, but their large-scale parameters and high computational complexity limit their applicability on resource-constrained edge devices. To address this challenge, this paper introduces a lightweight nuclear segmentation network called Attention-Enhanced U-Net (AttE-Unet) for cell segmentation. AttE-Unet enhances the network’s feature extraction capabilities through an attention mechanism and combines the strengths of deep learning with traditional image filtering algorithms, while substantially reducing computational and storage demands. Experimental results on the PanNuke dataset demonstrate that AttE-Unet, despite its significant reduction in model size—with the number of parameters and floating-point operations per second reduced to 1.57% and 0.1% of the original model, respectively—still maintains a high level of segmentation performance. Specifically, the F1 score and Intersection over Union (IoU) score are 91.7% and 89.3% of the original model’s scores. Furthermore, deployment on an MCU consumes only 2.09 MB of Flash and 1.38 MB of RAM, highlighting the model’s lightweight nature and its potential for practical deployment as a medical image segmentation solution on edge devices.
33

Zhu, Zhiliang, Leiningxin Qiu, Jiaxin Wang, Jinquan Xiong, and Hua Peng. "Video Object Segmentation Using Multi-Scale Attention-Based Siamese Network." Electronics 12, no. 13 (2023): 2890. http://dx.doi.org/10.3390/electronics12132890.

Abstract:
Video target segmentation is a fundamental problem in computer vision that aims to segment targets from a background by learning their appearance information and movement information. In this study, a video target segmentation network based on the Siamese structure was proposed. This network has two inputs: the current video frame, used as the main input, and the adjacent frame, used as the auxiliary input. The processing modules for the inputs use the same structure, optimization strategy, and encoder weights. The input is encoded to obtain features with different resolutions, from which good target appearance features can be obtained. After processing using the encoding layer, the motion features of the target are learned using a multi-scale feature fusion decoder based on an attention mechanism. The final predicted segmentation results were calculated from a layer of decoded features. The video object segmentation framework proposed in this study achieved optimal results on CDNet2014 and FBMS-3D, with scores of 78.36 and 86.71, respectively. It outperformed the second-ranked method by 4.3 on the CDNet2014 dataset and by 0.77 on the FBMS-3D dataset. Suboptimal results were achieved on the video primary target segmentation datasets SegTrackV2 and DAVIS2016, with scores of 60.57 and 81.08, respectively.
34

Hui, Haisheng, Xueying Zhang, Zelin Wu, and Fenlian Li. "Dual-Path Attention Compensation U-Net for Stroke Lesion Segmentation." Computational Intelligence and Neuroscience 2021 (August 31, 2021): 1–16. http://dx.doi.org/10.1155/2021/7552185.

Abstract:
For the segmentation task of stroke lesions, using the attention U-Net model based on the self-attention mechanism can suppress irrelevant regions in an input image while highlighting salient features useful for specific tasks. However, when the lesion is small and the lesion contour is blurred, attention U-Net may generate wrong attention coefficient maps, leading to incorrect segmentation results. To cope with this issue, we propose a dual-path attention compensation U-Net (DPAC-UNet) network, which consists of a primary network and auxiliary path network. Both networks are attention U-Net models and identical in structure. The primary path network is the core network that performs accurate lesion segmentation and outputting of the final segmentation result. The auxiliary path network generates auxiliary attention compensation coefficients and sends them to the primary path network to compensate for and correct possible attention coefficient errors. To realize the compensation mechanism of DPAC-UNet, we propose a weighted binary cross-entropy Tversky (WBCE-Tversky) loss to train the primary path network to achieve accurate segmentation and propose another compound loss function called tolerance loss to train the auxiliary path network to generate auxiliary compensation attention coefficient maps with expanded coverage area to perform compensate operations. We conducted segmentation experiments using the 239 MRI scans of the anatomical tracings of lesions after stroke (ATLAS) dataset to evaluate the performance and effectiveness of our method. The experimental results show that the DSC score of the proposed DPAC-UNet network is 6% higher than the single-path attention U-Net. It is also higher than the existing segmentation methods of the related literature. Therefore, our method demonstrates powerful abilities in the application of stroke lesion segmentation.
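
A compound loss in the spirit of the WBCE-Tversky loss named in this abstract can be sketched as a class-weighted cross-entropy plus a Tversky term. The exact formulation and hyperparameters used by the authors are not given in the abstract, so `pos_weight`, `alpha`, `beta`, `smooth`, and the plain sum of the two terms are all assumptions.

```python
import math


def weighted_bce(probs, targets, pos_weight=2.0, eps=1e-7):
    """Binary cross-entropy with lesion (positive) pixels weighted by pos_weight."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(pos_weight * t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)


def tversky_loss(probs, targets, alpha=0.3, beta=0.7, smooth=1.0):
    """One minus the Tversky index; beta > alpha penalizes missed lesion pixels more."""
    tp = sum(p * t for p, t in zip(probs, targets))
    fp = sum(p * (1 - t) for p, t in zip(probs, targets))
    fn = sum((1 - p) * t for p, t in zip(probs, targets))
    return 1.0 - (tp + smooth) / (tp + alpha * fp + beta * fn + smooth)


def wbce_tversky(probs, targets):
    """Assumed compound form: the unweighted sum of the two terms."""
    return weighted_bce(probs, targets) + tversky_loss(probs, targets)


# Hypothetical four-pixel lesion mask and two imperfect predictions.
targets = [1, 1, 0, 0]
missed_pixel = tversky_loss([1.0, 0.0, 0.0, 0.0], targets)  # one false negative
false_alarm = tversky_loss([1.0, 1.0, 1.0, 0.0], targets)   # one false positive
```

With `beta` larger than `alpha`, the missed lesion pixel costs more than the false alarm, which suits small lesions where under-segmentation is the main risk.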
35

Wang, Bowen. "Brain Tumor Image Segmentation Method Based on Multi-scale and Attention." BIO Web of Conferences 111 (2024): 03014. http://dx.doi.org/10.1051/bioconf/202411103014.

Abstract:
Brain tumor, as a high-risk disease of the brain, has been a threat to human life and health. In order to help doctors diagnose some parts of brain tumor accurately in hospitals, multi-scale fusion brain tumor image segmentation network has shown strong feature extraction ability and image segmentation accuracy improvement. In the original Unet network, only the feature information of the current layer is used in the jump connection layer, and the relevant feature information of the shallow network is ignored, so the segmentation accuracy will be affected accordingly. We use an improved segmentation network to solve this problem. Firstly, the multi-scale feature fusion module MFF is added to the encoder to fuse the features of different scales to improve the segmentation ability of the network. Secondly, the attention module ResCBAM is added to the jump connection layer of the encoder and decoder to guide the encoder to adaptively learn the important feature information in the jump connection. The BraTS2020 dataset in MICCAI competition was used for ablation experiments and contrast experiments, and Dice coefficient and HD95 were used as evaluation indicators. Through the experimental results, it can be seen that the improved network can extract more features in the whole tumor, tumor core and enhanced tumor region, and the segmentation effect of brain tumors is good. At the same time, the model parameters and the number of iterations are reduced.
36

Yao, Zhaochen, Yanjuan Li, Hao Fu, et al. "Research on Concrete Crack and Depression Detection Method Based on Multi-Level Defect Fusion Segmentation Network." Buildings 15, no. 10 (2025): 1657. https://doi.org/10.3390/buildings15101657.

Abstract:
Cracks and dents in concrete structures are core defects that threaten building safety, but the existing YOLO series algorithms face a huge bottleneck in complex engineering scenarios. Tiny cracks are susceptible to background texture interference, leading to misjudgment. The traditional detection frame has difficulty in accurately characterizing the dent geometry, which affects the quantitative damage assessment. In this paper, we propose a Multi-level Defect Fusion Segmentation Network (MDFNet) to break through the single-task limitation through the detection segmentation synergy framework. We improve the anchor frame strategy of YOLOv11 and enhance the recall of small targets by combining Copy–Pasting, and then enhance the pixel-level characterization of crack edges and dent contours by embedding the Head Attention-Expanded Convolutional Fusion Module (HAEConv) in U-Net with squeeze-and-excitation (SE) channel attention. Joint detection loss and segmentation loss are used for task co-optimization. On our self-constructed concrete defect dataset, MDFNet significantly outperforms the baseline model. In terms of accuracy, the MDFNet Dice coefficient is 92.4%, an improvement of 4.1 percentage points compared to YOLOv11-Seg. Our mean Intersection over Union (mIoU) reaches 81.6%, with strong generalization ability under complex background interference. In terms of engineering efficacy, the model achieves a processing speed of 45 frames per second (FPS) for 640 × 640 images, which is able to meet real-time monitoring requirements. The experimental results verify the feasibility of the model in the research field of crack and dent detection in concrete structures.
37

Wu, Nengkai, Dongyao Jia, Ziqi Li, and Zihao He. "Weak Edge Target Segmentation Network Based on Dual Attention Mechanism." Applied Sciences 14, no. 19 (2024): 8963. http://dx.doi.org/10.3390/app14198963.

Abstract:
Segmentation of weak edge targets such as glass and plastic poses a challenge in the field of target segmentation. The detection process is susceptible to background interference and various external factors due to the transparent nature of these materials. To address this issue, this paper introduces a segmentation network for weak edge target objects (WETS-Net). To effectively extract edge information of such objects and eliminate redundant information during feature extraction, a dual-attention mechanism is employed, including the Edge Attention Extraction Module (EAEM) and the Multi-Scale Information Fusion Module (MIFM). Specifically, the EAEM combines improved edge feature extraction kernels to selectively enhance the importance of edge features, aiding in more precise target region extraction. The MIFM utilizes spatial attention mechanisms to fuse multi-scale features, reducing background and external interference. These innovations enhance the performance of WETS-Net, offering a new direction for weak edge target segmentation research. Finally, through ablation experiments, the effectiveness of each module is effectively validated. Moreover, the proposed algorithm achieves an average detection accuracy of 95.83% and 96.13% on the dataset and a self-made dataset, respectively, outperforming similar U-Net-improved networks.
38

Li, Jinglun, Jiapeng Xiu, Zhengqiu Yang, and Chen Liu. "Dual Path Attention Net for Remote Sensing Semantic Image Segmentation." ISPRS International Journal of Geo-Information 9, no. 10 (2020): 571. http://dx.doi.org/10.3390/ijgi9100571.

Abstract:
Semantic segmentation plays an important role in understanding the content of remote sensing images. In recent years, deep learning methods based on Fully Convolutional Networks (FCNs) have proved to be effective for the semantic segmentation of remote sensing images. However, the rich information and complex content make the training of networks for segmentation challenging, and the datasets are necessarily constrained. In this paper, we propose a Convolutional Neural Network (CNN) model called Dual Path Attention Network (DPA-Net) that has a simple modular structure and can be added to any segmentation model to enhance its ability to learn features. Two types of attention module are appended to the segmentation model, one focusing on spatial information and the other on channel information. Then, the outputs of these two attention modules are fused to further improve the network’s ability to extract features, thus contributing to more precise segmentation results. Finally, data pre-processing and augmentation strategies are used to compensate for the small number of datasets and uneven distribution. The proposed network was tested on the Gaofen Image Dataset (GID). The results show that the network outperformed U-Net, PSP-Net, and DeepLab V3+ in terms of the mean IoU by 0.84%, 2.54%, and 1.32%, respectively.
39

Liu, Wenhuan, Yun Jiang, Jingyao Zhang, and Zeqi Ma. "RFARN: Retinal vessel segmentation based on reverse fusion attention residual network." PLOS ONE 16, no. 12 (2021): e0257256. http://dx.doi.org/10.1371/journal.pone.0257256.

Abstract:
Accurate segmentation of retinal vessels is critical to understanding the mechanism, diagnosis, and treatment of many ocular pathologies. Because of the poor contrast and inhomogeneous background of fundus imaging and the complex structure of retinal fundus images, accurate segmentation of blood vessels from retinal images remains challenging. In this paper, we propose an effective framework for retinal vascular segmentation, which is innovative mainly in the retinal image pre-processing stage and the segmentation stage. First, we perform image enhancement on three publicly available fundus datasets based on the multiscale retinex with color restoration (MSRCR) method, which effectively suppresses noise and highlights the vessel structure, creating a good basis for the segmentation phase. The processed fundus images are then fed into a Reverse Fusion Attention Residual Network (RFARN) for training to achieve more accurate retinal vessel segmentation. In the RFARN, we use the Reverse Channel Attention Module (RCAM) and the Reverse Spatial Attention Module (RSAM) to highlight the shallow details of the channel and spatial dimensions. RCAM and RSAM are also used to achieve effective fusion of deep local features with shallow global features to ensure the continuity and integrity of the segmented vessels. In the experimental results for the DRIVE, STARE and CHASE datasets, the evaluation metrics were 0.9712, 0.9822 and 0.9780 for accuracy (Acc), 0.8788, 0.8874 and 0.8352 for sensitivity (Se), 0.9803, 0.9891 and 0.9890 for specificity (Sp), 0.9910, 0.9952 and 0.9904 for area under the ROC curve (AUC), and 0.8453, 0.8707 and 0.8185 for the F1-Score. In comparison with existing retinal image segmentation methods, e.g. UNet, R2UNet, DUNet, HAnet, Sine-Net, FANet, etc., our method achieved better vessel segmentation performance on the three fundus datasets.
40

Liu, Jie, Songren Mao, and Liangrui Pan. "Attention-Based Two-Branch Hybrid Fusion Network for Medical Image Segmentation." Applied Sciences 14, no. 10 (2024): 4073. http://dx.doi.org/10.3390/app14104073.

Abstract:
Accurate segmentation of medical images is vital for disease detection and treatment. Convolutional Neural Networks (CNN) and Transformer models are widely used in medical image segmentation due to their exceptional capabilities in image recognition and segmentation. However, CNNs often lack an understanding of the global context and may lose spatial details of the target, while Transformers struggle with local information processing, leading to reduced geometric detail of the target. To address these issues, this research presents a Global-Local Fusion network model (GLFUnet) based on the U-Net framework and attention mechanisms. The model employs a dual-branch network that utilizes ConvNeXt and Swin Transformer to simultaneously extract multi-level features from pathological images. It enhances ConvNeXt’s local feature extraction with spatial and global attention up-sampling modules, while improving Swin Transformer’s global context dependency with channel attention. The Attention Feature Fusion module and skip connections efficiently merge local detailed and global coarse features from CNN and Transformer branches at various scales. The fused features are then progressively restored to the original image resolution for pixel-level prediction. Comprehensive experiments on datasets of stomach and liver cancer demonstrate GLFUnet’s superior performance and adaptability in medical image segmentation, holding promise for clinical analysis and disease diagnosis.
41

Zhang, Jinfeng, Tianzhong Zhang, and Yihua Lou. "Angle Steel Tower Bolt Installation Detection Based on Simplified Attention Mechanism." Journal of Physics: Conference Series 2218, no. 1 (2022): 012074. http://dx.doi.org/10.1088/1742-6596/2218/1/012074.

Abstract:
Incorporating position and channel attention into an image semantic segmentation network is an important strategy for improving the segmentation effect. However, computing the correlation between any two positions or channels incurs a huge computational overhead. To address this issue, this paper proposes a simplified attention mechanism network integrating position attention and channel attention. In addition, to further optimize the segmentation results, this paper adopts a new data-dependent upsampling method to expand the high-level feature map to the input size. Compared with the traditional attention network, our method reduces the parameter count and the amount of computation by 32.6%, and obtains an average cross-comparison result of 95.42% on the angle steel tower bolt dataset.
42

Wang, Wenjun, and Chao Su. "Convolutional Neural Network-Based Pavement Crack Segmentation Using Pyramid Attention Network." IEEE Access 8 (2020): 206548–58. http://dx.doi.org/10.1109/access.2020.3037667.

43

Qi, Chunkai, and Jian Di. "CSA-Net: A Transformer-based Polyp Segmentation Network." Academic Journal of Science and Technology 10, no. 2 (2024): 191–95. http://dx.doi.org/10.54097/y0dv2v19.

Abstract:
It is well known that colorectal polyps are a precursor to colorectal cancer. Accurate segmentation of polyp images from colonoscopy can assist clinicians in accurately localizing polyp regions and reduce the occurrence of misdiagnosis. Many existing methods achieve good results in the polyp segmentation task, but their extraction of global and local features is often insufficient. In this paper, we propose a transformer-based polyp segmentation network (CSA-Net) that utilizes two types of attention modules, spatial attention and channel attention, to adaptively fuse local features with their global features. The proposed network is validated on five polyp datasets. Experimental results show that our model outperforms previously proposed models.
44

Huang, Caiyun, and Changhua Yin. "A coronary artery CTA segmentation approach based on deep learning." Journal of X-Ray Science and Technology 30, no. 2 (2022): 245–59. http://dx.doi.org/10.3233/xst-211063.

Abstract:
The presence of plaque and coronary artery stenosis are the main causes of coronary heart disease. Detection of plaque and coronary artery segmentation have become the first choice in detecting coronary artery disease. The purpose of this study is to investigate a new method for plaque detection and automatic segmentation and diagnosis of coronary arteries and to test the feasibility of applying it to clinical medical image diagnosis. A multi-model fusion coronary CT angiography (CTA) vessel segmentation method based on deep learning is proposed. The method includes three network models, namely an original 3-dimensional fully convolutional network (3D FCN) and two networks that embed the attention gating (AG) model in the original 3D FCN. The prediction results of the three networks are then merged using the majority voting algorithm to obtain the final prediction result. In the post-processing stage, the level set function is used to further iteratively optimize the results of the network fusion prediction. The JI (Jaccard index) and DSC (Dice similarity coefficient) scores are calculated to evaluate the accuracy of the blood vessel segmentations. On a CTA dataset of 20 patients, the accuracy of coronary blood vessel segmentation using the FCN, FCN-AG1 and FCN-AG2 networks and the fusion method is tested. The average values of JI and DSC for the first three networks are (0.7962, 0.8843), (0.8154, 0.8966) and (0.8119, 0.8936), respectively. With the new fusion method, the average JI and DSC of the segmentation results increase to (0.8214, 0.9005), which is better than the best result obtained using the FCN, FCN-AG1 and FCN-AG2 models independently.
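The fusion step and the two overlap metrics named in this abstract can be illustrated with a minimal sketch. It assumes each network's prediction is a flattened binary mask; the function names and the toy masks are hypothetical and not taken from the paper, and the level set post-processing is omitted.

```python
def majority_vote(masks):
    """Fuse binary masks (flat lists of 0/1) by per-element majority."""
    n = len(masks)
    # An element is foreground when more than half of the models say so.
    return [1 if sum(votes) * 2 > n else 0 for votes in zip(*masks)]


def jaccard(pred, truth):
    """Jaccard index (JI): |A and B| / |A or B| for binary masks."""
    inter = sum(p and t for p, t in zip(pred, truth))
    union = sum(p or t for p, t in zip(pred, truth))
    return inter / union if union else 1.0


def dice(pred, truth):
    """Dice similarity coefficient (DSC): 2|A and B| / (|A| + |B|)."""
    inter = sum(p and t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    return 2 * inter / total if total else 1.0


# Toy example: three hypothetical model outputs and a reference mask.
model_masks = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
]
reference = [1, 1, 0, 0]

fused = majority_vote(model_masks)
print(fused, jaccard(fused, reference), dice(fused, reference))
```

With an odd number of models, as here, the vote can never tie, so no tie-breaking rule is needed.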
45

Wang, Liang, and Kan Ren. "Attention-Based Mask R-CNN Enhancement for Infrared Image Target Segmentation." Symmetry 17, no. 7 (2025): 1099. https://doi.org/10.3390/sym17071099.

Abstract:
Image segmentation is an important method in the field of image processing, while infrared (IR) image segmentation is one of the challenges in this field due to the unique characteristics of IR data. Infrared imaging utilizes the infrared radiation emitted by objects to produce images, which can supplement the performance of visible-light images under adverse lighting conditions to some extent. However, the low spatial resolution and limited texture details in IR images hinder the achievement of high-precision segmentation. To address these issues, an attention mechanism based on symmetrical cross-channel interaction—motivated by symmetry principles in computer vision—was integrated into a Mask Region-Based Convolutional Neural Network (Mask R-CNN) framework. A Bottleneck-enhanced Squeeze-and-Attention (BNSA) module was incorporated into the backbone network, and novel loss functions were designed for both the bounding box (Bbox) regression and mask prediction branches to enhance segmentation performance. Furthermore, a dedicated infrared image dataset was constructed to validate the proposed method. The experimental results demonstrate that the optimized model achieves higher segmentation accuracy and better segmentation performance compared to the original network and other mainstream segmentation models on our dataset, demonstrating how symmetrical design principles can effectively improve complex vision tasks.
46

Jin, Baixin, Pingping Liu, Peng Wang, Lida Shi, and Jing Zhao. "Optic Disc Segmentation Using Attention-Based U-Net and the Improved Cross-Entropy Convolutional Neural Network." Entropy 22, no. 8 (2020): 844. http://dx.doi.org/10.3390/e22080844.

Abstract:
Medical image segmentation is an important part of medical image analysis. With the rapid development of convolutional neural networks in image processing, deep learning methods have achieved great success in the field of medical image processing. Deep learning is also used in the auxiliary diagnosis of glaucoma, where effective segmentation of the optic disc area plays an important supporting role in clinical diagnosis. Previously, many U-Net-based optic disc segmentation methods have been proposed. However, the channel dependence of different levels of features is ignored, and the performance of fundus image segmentation in small areas is not satisfactory. In this paper, we propose a new aggregation channel attention network to make full use of the influence of context information on semantic segmentation. Different from the existing attention mechanism, we exploit channel dependencies and integrate information of different scales into the attention mechanism. At the same time, we improved the basic classification framework based on cross entropy, combined the dice coefficient and cross entropy, and balanced the contribution of dice coefficients and cross entropy loss to the segmentation task, which enhanced the performance of the network in small area segmentation. The network retains more image features, restores the significant features more accurately, and further improves the segmentation performance of medical images. We apply it to the fundus optic disc segmentation task. We demonstrate the segmentation performance of the model on the Messidor dataset and the RIM-ONE dataset, and evaluate the proposed architecture. Experimental results show that our network architecture improves the prediction performance of the base architectures on different datasets while maintaining computational efficiency. The results show that the proposed techniques improve segmentation, with an overlapping error of 0.0469 on Messidor.
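The idea of balancing a Dice term against a cross-entropy term can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' loss: the weighting parameter `alpha`, the smoothing constant, and all function names are hypothetical, and the inputs are assumed to be per-pixel foreground probabilities with binary targets.

```python
import math


def binary_cross_entropy(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over all pixels."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(probs)


def soft_dice_loss(probs, targets, smooth=1.0):
    """1 - soft Dice coefficient; the smooth term guards empty masks."""
    inter = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    return 1 - (2 * inter + smooth) / (denom + smooth)


def combined_loss(probs, targets, alpha=0.5):
    """Weighted sum of the two terms; alpha is an assumed balance factor."""
    ce = binary_cross_entropy(probs, targets)
    dl = soft_dice_loss(probs, targets)
    return alpha * ce + (1 - alpha) * dl


targets = [1, 0, 1, 0]
print(combined_loss([0.9, 0.1, 0.8, 0.2], targets))
```

The Dice term depends on the overlap ratio rather than on the pixel count, which is one common reason for pairing it with cross entropy when the foreground region is small.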
47

Zhou, Yiheng, Kainan Ma, Qian Sun, Zhaoyuxuan Wang, and Ming Liu. "Edge-Guided Cell Segmentation on Small Datasets Using an Attention-Enhanced U-Net Architecture." Information 15, no. 4 (2024): 198. http://dx.doi.org/10.3390/info15040198.

Abstract:
Over the past several decades, deep neural networks have been extensively applied to medical image segmentation tasks, achieving significant success. However, the effectiveness of traditional deep segmentation networks is substantially limited by the small scale of medical datasets, a limitation directly stemming from current medical data acquisition capabilities. To this end, we introduce AttEUnet, a medical cell segmentation network enhanced by edge attention, based on the Attention U-Net architecture. It incorporates a detection branch enhanced with edge attention and a learnable fusion gate unit to improve segmentation accuracy and convergence speed on small medical datasets. The AttEUnet allows for the integration of various types of prior information into the backbone network according to different tasks, offering notable flexibility and generalization ability. This method was trained and validated on two public datasets, MoNuSeg and PanNuke. The results show that AttEUnet significantly improves segmentation performance on small medical datasets, especially in capturing edge details, with F1 scores of 0.859 and 0.888 and Intersection over Union (IoU) scores of 0.758 and 0.794 on the respective datasets, outperforming both convolutional neural networks (CNNs) and transformer-based baseline networks. Furthermore, the proposed method demonstrated a convergence speed over 10.6 times faster than that of the baseline networks. The edge attention branch proposed in this study can also be added as an independent module to other classic network structures and can integrate more attention priors based on the task at hand, offering considerable scalability.
48

Li, Mingjiang, Pan Zhang, and Tao Hai. "Pore extraction method of rock thin section based on Attention U-Net." Journal of Physics: Conference Series 2467, no. 1 (2023): 012016. http://dx.doi.org/10.1088/1742-6596/2467/1/012016.

Abstract:
This paper proposes a solution to the shortcomings of traditional segmentation methods. The labeling method uses the incomplete labeling method in weakly supervised labeling to simplify labeling and combines transfer learning to initialize the weight of the network in advance. According to the above ideas, an end-to-end deep learning model is trained. The fine rock particles have a greater segmentation impact, and in addition to that, when compared with the popular deep learning semantic segmentation approaches, they also have a significant improvement. The next phase is to continue improving the network by optimizing the parameters, with the number of network layers and the total number of parameters remaining unaltered. This requirement must be satisfied before moving on to the next stage. The capability of generalization enhances the impact of segmentation on particles as well as their accuracy. Experiments show that this method is significantly better than the traditional method for segmenting rock flakes with manual operation and has better results in the segmentation and extraction of fine particles compared with the mainstream convolutional neural network.
49

Dong, Yuying, Liejun Wang, Shuli Cheng, and Yongming Li. "FAC-Net: Feedback Attention Network Based on Context Encoder Network for Skin Lesion Segmentation." Sensors 21, no. 15 (2021): 5172. http://dx.doi.org/10.3390/s21155172.

Abstract:
Considerable research and surveys indicate that skin lesions are an early symptom of skin cancer. Segmentation of skin lesions is still a hot research topic. Dermatological datasets in skin lesion segmentation tasks generate a large number of parameters when the data are augmented, limiting the application of smart assisted medicine in real life. Hence, this paper proposes an effective feedback attention network (FAC-Net). The network is equipped with the feedback fusion block (FFB) and the attention mechanism block (AMB). Through the combination of these two modules, we can obtain richer and more specific feature mapping without data enhancement. We conducted numerous experiments on public datasets (ISIC2018, ISBI2017, ISBI2016), and used metrics such as the Jaccard index (JA) and Dice coefficient (DC) to evaluate the segmentation results. On the ISIC2018 dataset, we obtained a DC of 91.19% and a JA of 83.99%; compared with the base network, both metrics improved by more than 1%. In addition, the metrics were also improved on the other two datasets. The experiments demonstrate that, without any enhancement of the datasets, our lightweight model can achieve better segmentation performance than most deep learning architectures.
50

Tao, Sha, and Zhenfeng Wang. "Tooth CT Image Segmentation Method Based on the U-Net Network and Attention Module." Computational and Mathematical Methods in Medicine 2022 (August 19, 2022): 1–8. http://dx.doi.org/10.1155/2022/3289663.

Abstract:
Traditional image segmentation methods often suffer from low segmentation accuracy and long processing times when handling complex tooth Computed Tomography (CT) images. This paper proposes an improved segmentation method for tooth CT images. Firstly, the U-Net network is used to construct a tooth image segmentation model. A large number of feature maps in downsampling are supplemented to downsampling to reduce information loss. At the same time, the problem of inaccurate image segmentation and positioning is solved. Then, the attention module is introduced into the U-Net network to increase the weight of important information and improve the accuracy of network segmentation. In this module, subregion average pooling is used instead of global average pooling to obtain spatial features. Finally, the U-Net network combined with the improved attention module is used to realize the segmentation of tooth CT images. In experiments on the image collection provided by West China Hospital, our method shows better segmentation performance and efficiency than other algorithms. The tooth contours obtained are clearer, which helps to assist the doctor in diagnosis.
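The contrast between global and subregion average pooling mentioned in this abstract can be sketched in a few lines. The abstract does not specify how the subregions are defined, so this sketch assumes a non-overlapping `grid` x `grid` partition of a single-channel feature map whose height and width are divisible by `grid`; the function names and the toy feature map are hypothetical.

```python
def global_avg_pool(fmap):
    """Collapse a 2D feature map (list of rows) to one mean value."""
    h, w = len(fmap), len(fmap[0])
    return sum(sum(row) for row in fmap) / (h * w)


def subregion_avg_pool(fmap, grid=2):
    """Average each of grid x grid equal blocks, keeping coarse layout.

    Assumes the height and width are divisible by `grid`.
    """
    h, w = len(fmap), len(fmap[0])
    bh, bw = h // grid, w // grid
    pooled = []
    for i in range(grid):
        row = []
        for j in range(grid):
            block = [
                fmap[y][x]
                for y in range(i * bh, (i + 1) * bh)
                for x in range(j * bw, (j + 1) * bw)
            ]
            row.append(sum(block) / len(block))
        pooled.append(row)
    return pooled


# Toy 4 x 4 map with all activation in the top-left quadrant.
feature_map = [
    [4, 4, 0, 0],
    [4, 4, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
print(global_avg_pool(feature_map))
print(subregion_avg_pool(feature_map, grid=2))
```

In the toy map, the global average reduces everything to a single number, while the subregion version still shows which quadrant holds the activation, which is the kind of spatial information a single global statistic discards.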