Performance Optimization of Lightweight Transformer Architecture for Cherry Tomato Picking

Fund Project:

Shandong Province Key Research and Development Program (2023CXGC010715) and China National Machinery Industry Corporation Ltd. Science and Technology Special Project (ZDZX2023-2)

    Abstract:

    To further improve the recognition accuracy and speed for truss-harvested cherry tomatoes and to enable automated tomato harvesting in facility environments, a lightweight cherry tomato truss posture recognition model based on an improved Transformer was proposed. Firstly, a cherry tomato dataset covering different lighting conditions and picking postures was constructed, and the postures of cherry tomato trusses were categorized. Then, a lightweight truss-harvested cherry tomato recognition model based on an improved RT-DETR was proposed. The lightweight backbone network EfficientViT was introduced to replace the original backbone of RT-DETR, which significantly reduced the number of model parameters and the computational cost. In addition, an adaptive detail fusion module was designed to efficiently process and fuse feature maps of different scales while further lowering computational complexity. Finally, a weighted function sliding mechanism and the exponential moving average concept were introduced to optimize the loss function and handle uncertainty in sample classification. Experimental results showed that the lightweight model maintained high recognition accuracy (90%) while achieving fast detection (41.2 frames/s) and low computational cost (8.7×10⁹ FLOPs). Compared with the original network model, Faster R-CNN, and Swin Transformer, the average recognition accuracy was improved by 1.24%~15.38%, the frames processed per second (FPS) increased by 25.61%~255.17%, and floating-point operations were reduced by 69.37%~92.37%. The model showed strong robustness in overall performance, balanced accuracy and speed, and can provide technical support for the vision tasks of tomato harvesting robots.
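    The percentage ranges in the abstract are relative to unnamed baselines, so it can help to see what baseline values they imply. The following is a minimal sketch, assuming that each percentage is an ordinary relative change, (new - old) / old for FPS and (old - new) / old for FLOPs. The abstract does not state this definition, and the function names are hypothetical. The accuracy range is left out because the abstract does not say whether it is relative or in percentage points.

```python
def baseline_from_increase(new_value: float, increase_pct: float) -> float:
    """Return baseline b such that new_value = b * (1 + increase_pct / 100)."""
    return new_value / (1.0 + increase_pct / 100.0)


def baseline_from_reduction(new_value: float, reduction_pct: float) -> float:
    """Return baseline b such that new_value = b * (1 - reduction_pct / 100)."""
    if not 0.0 <= reduction_pct < 100.0:
        raise ValueError("reduction_pct must be in [0, 100)")
    return new_value / (1.0 - reduction_pct / 100.0)


# Figures quoted in the abstract for the proposed model.
FPS_PROPOSED = 41.2                    # frames/s
GFLOPS_PROPOSED = 8.7                  # 8.7e9 FLOPs
FPS_GAIN_RANGE = (25.61, 255.17)       # percent increase over baselines
FLOPS_CUT_RANGE = (69.37, 92.37)       # percent reduction from baselines

# Baselines implied by the assumed definition (not values reported in the abstract).
implied_fps = [baseline_from_increase(FPS_PROPOSED, p) for p in FPS_GAIN_RANGE]
implied_gflops = [baseline_from_reduction(GFLOPS_PROPOSED, p) for p in FLOPS_CUT_RANGE]

if __name__ == "__main__":
    print("implied baseline FPS:", [round(v, 2) for v in implied_fps])
    print("implied baseline GFLOPs:", [round(v, 2) for v in implied_gflops])
```

    Under this assumption the compared models would run at roughly 11.6 to 32.8 frames/s and cost roughly 28.4 to 114.0 GFLOPs. These are back-computed estimates, not figures from the paper, and the abstract does not say which baseline corresponds to which end of each range.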

Cite this article

ZHAO Bo, LIU Suchun, ZHANG Weipeng, ZHU Licheng, HAN Zhenhao, FENG Xuguang, WANG Ruixue. Performance Optimization of Lightweight Transformer Architecture for Cherry Tomato Picking[J]. Transactions of the Chinese Society for Agricultural Machinery, 2024, 55(10): 62-71, 105.

History
  • Received: 2024-06-03
  • Published online: 2024-10-10