Abstract: Rapid acquisition of the grain number per rice spike is important for screening high-yielding, high-quality varieties. Counting after threshing destroys the topology of the rice spike, so the sample cannot be used to measure other phenotypic parameters; to address this problem, a method for counting rice grains in situ on the spike was proposed. In-situ grain counting was treated as a density prediction problem, and a backbone network based on deformable convolution was designed to extract features from rice spike images. A small number of selected exemplar grains were correlated with the spike image features through feature correlation layers to generate feature correlation maps. Based on these maps, the image features were reused and cascaded to predict the density distribution of the rice grains, and the density map was summed to obtain the count. Test results showed that the method had high counting accuracy. The mean absolute error (MAE), root mean square error (RMSE), and mean relative error (MRE) of the grain counts on the test samples were 4.71, 6.92, and 2.9%, respectively. The MRE was only 0.7 percentage points higher than that of manual counting, and 9.9, 8.6, and 11.6 percentage points lower than those of the existing benchmark methods FamNet, CSRNet, and ICACount, respectively. The rice spike image feature extraction network designed with deformable convolution effectively improved counting accuracy. With a comparable number of parameters, the model based on this network had an MAE and RMSE 19.3% and 12.9% lower, respectively, than those of the model based on ResNet-50, and it fitted the data well, with a coefficient of determination (R²) of 0.9405. For the same network architecture, deformable convolution reduced the MAE and RMSE of the grain counts by 28.9% and 22.0%, respectively, and the MRE by 1.6 percentage points, compared with conventional convolution.
Image feature reuse played an important role in improving counting accuracy: it decreased the MAE and RMSE of the model on the test set by 27.6% and 22.1%, respectively, and the MRE by 2.2 percentage points. The method took 0.92 s to process a single rice spike image, which effectively improved work efficiency. This research can provide a technical reference for rice spike phenotype detection and platform design.
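As an illustration of how a count is read from a predicted density map and how the three reported error measures are typically computed, a minimal Python sketch follows. It is not the authors' implementation. The function names `count_from_density` and `counting_errors` are hypothetical, the sample values are invented for demonstration only, and MRE is assumed here to be the mean of the per-sample absolute errors divided by the true counts, which may differ from the definition used in the paper.

```python
import math


def count_from_density(density_map):
    """Sum every pixel of a 2-D density map (a list of rows) to get a count."""
    return sum(sum(row) for row in density_map)


def counting_errors(predicted, actual):
    """Return (MAE, RMSE, MRE) for paired predicted and true counts.

    MRE is returned as a fraction (multiply by 100 for a percentage).
    Assumption: every true count is non-zero.
    """
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and equal in length")
    errors = [p - a for p, a in zip(predicted, actual)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mre = sum(abs(e) / a for e, a in zip(errors, actual)) / n
    return mae, rmse, mre


# Toy density map (invented values): the pixel sum is the predicted count.
toy_density = [[0.0, 0.5, 0.5],
               [0.25, 0.25, 0.5]]
print(count_from_density(toy_density))

# Invented predicted and true counts for three spike images.
mae, rmse, mre = counting_errors([98, 105, 150], [100, 100, 160])
print(round(mae, 2), round(rmse, 2), round(100 * mre, 2))
```

Note that the abstract reports changes in MAE and RMSE as relative percentages, whereas changes in MRE are given in percentage points, that is, as absolute differences between two percentages.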