Classification Method of Sea Bass Feeding Intensity Based on Audio-Video Information Fusion and the Self-Attention-DSC-CNN6 Network

doi:10.6041/j.issn.1000-1298.2025.01.002

2025, Vol. 56, No. 1, pp. 16-24

Funding: National Key Research and Development Program of China (2022YFD2001703)

Classification Method of Feeding Intensity of Sea Bass Based on Self-Attention-DSC-CNN6 and Multi-modal Fusion



Abstract:

Recognizing and classifying feeding intensity is a key step toward precision feeding in aquaculture. Existing feeding practices rely heavily on manual judgment, dose feed imprecisely, and waste large amounts of feed. Classifying fish feeding intensity with multi-modal fusion can combine different types of data (e.g., video, sound, and water quality parameters) and so provide a more comprehensive and accurate basis for feeding decisions. A multi-modal fusion framework that integrates video and audio data was therefore proposed to improve the classification of sea bass feeding intensity. Preprocessed Mel spectrograms and video frame images were each fed into an optimized Self-Attention-DSC-CNN6 (self-attention, depthwise separable convolution, CNN6) model for high-level feature extraction; the extracted features were concatenated, and the concatenated features were passed to a classifier. The Self-Attention-DSC-CNN6 model was derived from CNN6 by replacing the conventional convolutional layers with depthwise separable convolution (DSC) to reduce computational complexity and by introducing a self-attention mechanism to strengthen feature extraction. Experiments showed that the proposed multi-modal fusion framework achieved a classification accuracy of 90.24% for sea bass feeding intensity. The model made effective use of information from the different data sources, which improved its understanding of fish behavior in complex environments, strengthened its decision-making ability, and supported timely and accurate feeding strategies, thereby reducing feed waste. The work provides technical support for the intelligent management of aquaculture and lays a foundation for developing intelligent feeding systems.
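The abstract attributes the reduced computational complexity to depthwise separable convolution and describes fusion as concatenation of the audio and video features. The following is a minimal sketch of both ideas, not the authors' implementation. The layer sizes (64 input channels, 128 output channels, 3 x 3 kernel) and the function names are hypothetical and chosen only for illustration; the page does not give the real CNN6 layer dimensions.

```python
def standard_conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def dsc_params(c_in, c_out, k):
    """Weight count of a depthwise separable convolution (bias ignored).

    Depthwise step: one k x k filter per input channel.
    Pointwise step: a 1 x 1 convolution that mixes channels.
    """
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


def fuse_features(audio_features, video_features):
    """Feature-level fusion by concatenation.

    Both inputs are flat feature vectors; the result is what a
    classifier head would receive.
    """
    return list(audio_features) + list(video_features)


if __name__ == "__main__":
    # Hypothetical layer sizes, not taken from the paper.
    c_in, c_out, k = 64, 128, 3

    std = standard_conv_params(c_in, c_out, k)
    dsc = dsc_params(c_in, c_out, k)

    print("standard conv weights:", std)
    print("depthwise separable weights:", dsc)
    # The ratio equals 1/c_out + 1/k**2 for any layer size.
    print("ratio:", round(dsc / std, 4))

    # Toy feature vectors standing in for the two branch outputs.
    print("fused:", fuse_features([0.1, 0.2], [0.3, 0.4, 0.5]))
```

For these assumed sizes the separable layer needs roughly 12% of the weights of the standard layer, and the saving grows with the number of output channels. The multiply-accumulate count per output position shrinks by the same factor.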

Cite this article

LI Daoliang, LI Wanchao, DU Zhuangzhuang. Classification Method of Feeding Intensity of Sea Bass Based on Self-Attention-DSC-CNN6 and Multi-modal Fusion[J]. Transactions of the Chinese Society for Agricultural Machinery (农业机械学报), 2025, 56(1): 16-24.

History

Received: 2024-11-25
Published online: 2025-01-10