Abstract: To address problems in the secondary stem-cutting operation for cherry tomatoes, such as occlusion of cutting points by terminal fruits, low positioning accuracy, and the risk of fruit damage, a solution based on an improved YOLO v8-Pose network was proposed. An active recognition strategy was introduced in which the tomato was rotated to find the optimal viewing angle, using a four-category classification together with a spatial plane normal vector method, while keypoints were used to locate the secondary cutting positions. In terms of network architecture, the PSA partial self-attention mechanism was integrated to enhance global context modeling. A Dyn_GSConv module with multi-scale convolutional kernels was designed, and a Dyn_VoVGSCSP module was constructed on the basis of the Slim-Neck structure to replace the original C2f module in the neck, reducing model complexity while preserving local details and global features. In addition, a VCM module was adopted to replace standard convolution in the feature extraction layer, making the network even lighter. Experiments showed that the improved network achieved an mAP50 of 97.4% in both detection and keypoint recognition, an improvement of 1.4 percentage points over the original model. The precision rates for the two tasks reached 93.3% and 90.9%, increases of 5.3 and 5.0 percentage points, respectively. Furthermore, the number of floating-point operations (FLOPs) was reduced by 9×10⁸, and the number of parameters was reduced by 1.9×10⁵. For optimal viewing angle selection, the network's category output was used to adjust randomly posed tomatoes to a front view, and depth information was used to construct the spatial plane normal of the fruit cluster. When the angle between the normal vector and the camera's yz-plane was less than 10°, the current view was taken as the optimal viewing angle, providing a clear view for keypoint recognition and secondary cutting operations.
The results provide an effective visual perception solution for automated, high-precision, and low-damage secondary stem-cutting in cherry tomato production.
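The viewing-angle criterion described in the abstract can be illustrated with a minimal sketch. It assumes that three non-collinear 3D points on the fruit cluster are available in the camera coordinate frame (for example, keypoints back-projected with depth); the abstract does not state how the plane is actually constructed, so the three-point construction and the function names `plane_normal`, `angle_to_yz_plane_deg`, and `is_optimal_view` are hypothetical. Only the 10° threshold comes from the text.

```python
import math

def plane_normal(p0, p1, p2):
    """Unit normal of the plane through three 3D points (camera frame).

    Assumption: three non-collinear points are enough to define the
    fruit-cluster plane; the source does not specify the construction.
    """
    u = [b - a for a, b in zip(p0, p1)]
    v = [b - a for a, b in zip(p0, p2)]
    # Cross product u x v gives a vector perpendicular to the plane.
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    length = math.sqrt(sum(c * c for c in n))
    if length == 0.0:
        raise ValueError("points are collinear; no unique plane")
    return [c / length for c in n]

def angle_to_yz_plane_deg(normal):
    """Angle in degrees between a vector and the camera's yz-plane.

    The yz-plane is perpendicular to the x-axis, so the angle between
    the vector and the plane is asin(|n_x| / |n|).
    """
    length = math.sqrt(sum(c * c for c in normal))
    sin_angle = min(1.0, abs(normal[0]) / length)
    return math.degrees(math.asin(sin_angle))

def is_optimal_view(p0, p1, p2, threshold_deg=10.0):
    """True when the plane normal lies within threshold_deg of the yz-plane."""
    return angle_to_yz_plane_deg(plane_normal(p0, p1, p2)) < threshold_deg
```

Under these assumptions, a cluster plane that faces the camera directly gives an angle of 0° and passes the check, while a plane rotated 30° about the camera's y-axis gives 30° and would call for further rotation of the tomato.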