Abstract: To address the problems of underutilized pretrained language models, inefficient injection of external knowledge, and low recognition rates for nested named entities in named entity recognition for agricultural diseases, a named entity recognition model, continuous prompts for machine reading comprehension (CP-MRC), was proposed based on continuous prompt injection and a pointer network. The model introduced the bidirectional encoder representations from transformers (BERT) pretrained model, froze its original parameters, and retained the text representation ability acquired during pretraining. To improve the model's fit to domain data, continuous trainable prompt vectors were inserted into each Transformer layer. To improve the accuracy of nested named entity recognition, a pointer network was used to extract entity spans. Comparative experiments were conducted on a self-built agricultural disease dataset containing 2,933 text corpora, 8 entity types, and 10,414 entities in total. The experimental results showed that the precision, recall, and F1-score of the CP-MRC model reached 83.55%, 81.4%, and 82.4%, respectively, outperforming the other models. The F1-scores for nested entities of pathogens and crops increased by 3 and 13 percentage points, respectively, over the other models, significantly improving nested entity recognition. The model maintained good recognition performance with only a small number of trainable parameters, offering a route for applying large-scale pretrained models to information extraction tasks.
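The pointer-network extraction step described above can be illustrated with a minimal decoding sketch. This is not the authors' implementation: the function name, the 0.5 threshold, and the nearest-end pairing heuristic are all illustrative assumptions. In MRC-style NER, each entity type is posed as a separate query, and the model predicts per-token probabilities of a token starting or ending a span; because each query is decoded independently, spans from different queries may overlap, which is what allows nested entities to be recovered.

```python
# Hedged sketch of pointer-network span decoding for MRC-style NER.
# All names and thresholds are illustrative, not from the paper.

def decode_spans(start_probs, end_probs, threshold=0.5, max_span_len=10):
    """Pair each predicted span start with the nearest valid span end.

    start_probs / end_probs: per-token probabilities (one query, i.e.
    one entity type) that a token begins / ends an entity of that type.
    Decoding each entity-type query separately lets spans from different
    queries overlap, so nested entities are naturally representable.
    """
    starts = [i for i, p in enumerate(start_probs) if p >= threshold]
    ends = [i for i, p in enumerate(end_probs) if p >= threshold]
    spans = []
    for s in starts:
        for e in ends:
            # Accept the nearest end at or after the start,
            # within a maximum span length.
            if s <= e < s + max_span_len:
                spans.append((s, e))
                break
    return spans


# Example: tokens 0-1 and 2-3 are both predicted as entity spans.
print(decode_spans([0.9, 0.1, 0.8, 0.1], [0.1, 0.9, 0.1, 0.9]))
# → [(0, 1), (2, 3)]
```

In the full model, a real implementation would produce `start_probs` and `end_probs` from two classification heads over the frozen BERT encoder's token representations, one pass per entity-type query.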