Abstract: To address critical technical bottlenecks in precision orchard spraying, specifically insufficient real-time performance, poor generalization in complex scenarios, and the difficulty of building lightweight models for citrus tree canopy instance segmentation, a lightweight instance segmentation model was proposed based on an enhanced YOLO11n-seg architecture. The key technical innovations were threefold: (1) depthwise separable convolution (DSConv) was employed to compress the parameters of critical layers to only 11.3% of the original structure, coupled with a channel separation strategy that significantly reduced computational redundancy; (2) a global attention mechanism (GAM) was introduced that achieved fused three-dimensional channel-spatial weighting through dimensional permutation operations, suppressing 42.7% of misdetections in overexposed regions while enhancing edge feature representation; and (3) a lightweight segmentation detection head (LSDH) was designed that integrated multi-scale feature fusion with dynamic channel pruning, reducing computational load by 31.4% while maintaining segmentation accuracy. To address data scarcity, a specialized RGB-D citrus canopy dataset containing 2,500 annotated samples captured with Kinect DK depth cameras was constructed. This dataset was expanded through depth threshold filtering and five-dimensional adversarial augmentation (incorporating geometric transformations, photometric variations, and synthetic noise injection) to represent complex orchard environments comprehensively. Experimental validation under realistic conditions of 35% foliage occlusion, on a mobile spray platform operating at 0.5 m/s, demonstrated the model's performance: segmentation accuracy (AP50, Seg) reached 92.6% (2.4 percentage points higher than the baseline), inference time was 0.178 s (12.7% faster than YOLACT), and the parameter count was reduced to only 2.53×10^6 (24% of that of Mask R-CNN).
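As a rough illustration of why DSConv yields compression of this order, the sketch below compares the parameter count of a standard convolution with that of a depthwise separable one. The layer shape (3×3 kernel, 256 input channels, 512 output channels) and the function names are assumptions chosen for illustration; the abstract does not state which layers were replaced, and the reported 11.3% may have been obtained differently. Bias terms are ignored.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def dsconv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def dsconv_ratio(c_in: int, c_out: int, k: int) -> float:
    """Parameter ratio DSConv / standard conv, which equals 1/c_out + 1/k**2."""
    return dsconv_params(c_in, c_out, k) / conv_params(c_in, c_out, k)


# Hypothetical layer shape, not taken from the paper.
print(f"{dsconv_ratio(256, 512, 3):.1%}")
```

For this assumed shape the ratio is 1/512 + 1/9, or about 11.3%. This matches the reported figure but does not confirm that the paper used such a layer.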
Field deployment results confirmed the system's practical viability. With keyframe point cloud fusion, image processing latency was held to 320 ms; given a total system delay of 404.93 ms, the displacement error at a vehicle speed of 0.5 m/s was only 0.2025 m, which enabled precise spraying control. Variable-rate spraying validation showed a 45.75% reduction in pesticide use, a spray distribution uniformity (coefficient of variation) of 10.87%, and a 17.53% reduction in overspray within overlapping canopy regions, demonstrating improvements over conventional spraying methods.
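The reported displacement error is consistent with multiplying the vehicle speed by the total system delay, and the uniformity metric is a coefficient of variation (standard deviation divided by mean). The sketch below reproduces the first figure and shows one way the second could be computed. The function names and the deposit values are hypothetical, and the use of the population standard deviation is an assumption, since the abstract does not say which estimator was used.

```python
import statistics


def displacement_error(speed_m_s: float, delay_ms: float) -> float:
    """Distance travelled (m) by the platform during the system delay."""
    return speed_m_s * delay_ms / 1000.0


def coefficient_of_variation(samples: list[float]) -> float:
    """Population standard deviation divided by the mean, as a fraction."""
    return statistics.pstdev(samples) / statistics.mean(samples)


# Values from the abstract: 0.5 m/s vehicle speed, 404.93 ms total delay.
print(round(displacement_error(0.5, 404.93), 4))

# Hypothetical spray deposit readings, for illustration only.
deposits = [1.9, 2.1, 2.0, 2.3, 1.8]
print(f"{coefficient_of_variation(deposits):.2%}")
```

The first value printed is 0.2025, matching the displacement error stated above.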