Abstract: The ripeness of tomatoes is closely related to their quality and serves as a crucial basis for key production processes such as harvesting and sorting. To address the limited functionality of existing crop ripeness grading and detection systems and the high cost of manual system upgrades, this study took tomatoes as an example. A tomato image dataset was collected and constructed under natural scenarios, and a semi-automatic image annotation algorithm, built on a tomato fruit ripeness grading algorithm, was designed to annotate the collected data. Starting from the YOLO v8 model, the FPN structure was replaced with a BiFPN structure to achieve more efficient multi-scale feature fusion; the SE attention mechanism was employed to extract fused features across the spatial and channel dimensions; and the Focal SIoU loss function was introduced to measure the angular difference between the predicted bounding box and the ground-truth box. The result is YOLO v8_BFS, a tomato ripeness grading and detection model based on color-feature quantization and the improved YOLO v8, which can identify five ripeness stages during tomato growth. Experimental results showed that the proposed model effectively alleviated false and missed detections in tomato ripeness grading under complex natural scenarios. Although the model's computational complexity (FLOPs), parameter count (Params), and memory footprint increased slightly, its detection accuracy reached 94.10%, 3.0 percentage points higher than that of the original YOLO v8 model. Compared with detection models such as Faster R-CNN-ResNet50, YOLO v5, YOLO v7-tiny, YOLO v8, YOLO v10, and YOLO 11, the proposed model demonstrated significant advantages in detection accuracy, providing a reliable method for tomato ripeness detection.