Abstract: Image semantic segmentation is one of the key technologies for obtaining phenotypic information of maize plants. Traditional fully supervised semantic segmentation methods typically rely on large numbers of pixel-level labels. However, maize exhibits significant morphological variability across growth stages, which makes image annotation costly and limits the practical application of such models in real-world production. To eliminate the need for manual annotation during model training, a self-supervised few-shot semantic segmentation network for maize plant images (MSDANet) was proposed, aiming to improve semantic segmentation accuracy and model generalization for maize plant images across different growth stages. MSDANet utilized a superpixel-based self-supervised learning method to generate pseudo labels, enabling the construction of preliminary supervision signals for the support-set images without manual annotation. A mixed masking mechanism (MM) was designed that applied pseudo-label-based semantic masking to construct diverse masked samples in the feature space, encouraging the model to learn more robust feature representations and thereby improving segmentation accuracy against complex backgrounds. To address the complex morphologies of maize plants in images, such as bending, overlapping, and occlusion, a multi-scale deformable large kernel attention mechanism (MS-DLKA) was designed for the model; by integrating multi-scale receptive fields and deformable convolutions, it flexibly perceives important structural information of maize plants at different scales, effectively improving semantic segmentation accuracy.
When validated on a small-sample dataset, MSDANet achieved mIoU and FB-IoU of 75.63% and 87.12%, respectively, in the 1-shot setting; in the 5-shot setting, mIoU and FB-IoU reached 76.04% and 87.21%, respectively, outperforming the other models of the same type compared in this study. Moreover, compared with current mainstream fully supervised few-shot semantic segmentation models, mIoU improved by 2.9 and 2.93 percentage points under the 1-shot and 5-shot settings, respectively. The results demonstrated that MSDANet can achieve high-precision semantic segmentation of maize plant images without manual labels and with few samples, providing technical support for maize image analysis and plant phenotyping across different growth stages.
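The abstract does not specify how the superpixel-based pseudo labels are produced, so the following is only a minimal, hypothetical sketch of the general idea: partition the image into superpixel-like regions, then mark a region as "plant" when a vegetation index suggests it is green-dominant. The fixed grid used here is a stand-in for real superpixels (e.g., SLIC), and the excess-green thresholding is an assumption, not the paper's method.

```python
import numpy as np

def superpixel_pseudo_labels(image, cell=4, green_thresh=0.0):
    """Toy pseudo-label generator for plant segmentation.

    Assumptions (not from the paper): a fixed grid of `cell` x `cell`
    blocks plays the role of superpixels, and a block is labeled
    'plant' (1) when its mean excess-green index 2G - R - B exceeds
    `green_thresh`; otherwise it is labeled background (0).
    """
    h, w, _ = image.shape
    img = image.astype(np.float32)
    # Excess-green vegetation index, computed per pixel.
    exg = 2.0 * img[..., 1] - img[..., 0] - img[..., 2]
    labels = np.zeros((h, w), dtype=np.uint8)
    for i in range(0, h, cell):
        for j in range(0, w, cell):
            # Label the whole block by its mean vegetation response.
            if exg[i:i + cell, j:j + cell].mean() > green_thresh:
                labels[i:i + cell, j:j + cell] = 1
    return labels
```

In a full pipeline, such pseudo labels would then serve as the preliminary supervision signal for the support-set images in place of manual annotation.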