Abstract: Owing to geographical and cultural differences, the same agricultural entity is often referred to by different names in agricultural texts. This inconsistency complicates automatic information identification and extraction and limits the development of agricultural informatization. To address this problem, an agricultural entity normalization method based on mBART was proposed. Firstly, drawing on the knowledge and experience of experts in the agricultural field, a crop-oriented agricultural text dataset was constructed. It covered three major crop categories, namely "legumes", "cereals" and "oil crops", and contained 22,440 high-quality annotated samples. Secondly, because agricultural entity normalization involves both the detection and the recognition of non-normalized agricultural entities, a unified generative framework based on mBART was proposed to detect and recognize non-normalized entities jointly and to output the normalized agricultural named entities directly. Furthermore, to improve normalization performance, two auxiliary tasks, non-normalized entity detection and non-normalized entity recognition, were introduced into the model. Finally, extensive experiments were conducted on the constructed crop dataset. The results showed that the proposed method achieved precision (P), recall (R), and F1 score all above 0.99 on the agricultural entity normalization task and outperformed the compared methods on every metric. The proposed method also showed significant advantages over large language models.
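The P, R, and F1 values reported above can be illustrated with a minimal sketch. The abstract does not state how the metrics were computed, so the sketch assumes micro-averaged scoring over (mention, normalized name) pairs with exact matching. The function name `prf1` and the example entity pairs are hypothetical and are not taken from the dataset.

```python
from typing import List, Set, Tuple

# Assumption: each sentence is scored as a set of
# (surface mention, normalized name) pairs; a prediction counts as
# correct only if both elements match the gold pair exactly.
Pair = Tuple[str, str]


def prf1(gold: List[Set[Pair]], pred: List[Set[Pair]]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1 over all sentences."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must cover the same sentences")

    # True positives: predicted pairs that also appear in the gold set.
    tp = sum(len(g & p) for g, p in zip(gold, pred))
    n_pred = sum(len(p) for p in pred)
    n_gold = sum(len(g) for g in gold)

    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: the second sentence has one entity left unnormalized.
gold = [{("maize", "corn")}, {("soya", "soybean"), ("groundnut", "peanut")}]
pred = [{("maize", "corn")}, {("soya", "soybean"), ("groundnut", "groundnut")}]
p, r, f = prf1(gold, pred)
print(f"P={p:.3f} R={r:.3f} F1={f:.3f}")
```

Under this assumed scoring, an entity that is detected but mapped to the wrong standard name lowers both precision and recall, which is why joint detection and recognition matter for the final score.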