Research on PengKGPT, a Knowledge-Enhanced Large Language Model for Assisting Protected Horticulture Production

doi:10.6041/j.issn.1000-1298.2026.03.026

2026, Vol. 57, No. 3, pp. 270-283

Fund project: Shaanxi Province Key Research and Development Program (S2024-YF-ZDCXL-ZDLNY-0159) and the 2024 Provincial Fiscal Special Fund for Agriculture

PengKGPT: A Multi-source Knowledge-enhanced Large Language Model for Assisting in Protected Horticulture Production



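Abstract (translated from the Chinese):

The rapid growth of China's protected horticulture industry has sharply increased demand for intelligent knowledge services. However, the current protected horticulture knowledge system is fragmented and weakly interlinked, and existing knowledge service methods are imprecise and inefficient, which leaves considerable shortcomings in guiding production. In addition, practitioners often describe their problems incompletely, which makes those problems harder to solve. Combining the strengths of knowledge graphs (KG) and large language models (LLM), this paper proposes a question-answering model enhanced with multi-source protected horticulture knowledge to analyze and solve problems in protected horticulture production. First, a protected horticulture knowledge dataset of nearly 1.5 million characters, covering more than 60 commonly grown crop categories, was constructed. Semantic segmentation yielded 26,349 text blocks, which were stored in a vector database, and text knowledge related to production techniques was extracted from the dataset to build a KG. A semantic information enhancement model based on KG entity matching was also proposed; it mines latent associations among KG entities and uses entity matching to enrich the semantic information of user input. Second, a retrieval-augmented generation method with dual guidance prompts from the KG and the vector database was designed, in which KG information and related text are fed together into the prompt template to strengthen the LLM's problem-analysis ability. In addition, to improve adaptability to the protected horticulture domain, the LLM was fine-tuned on relevant question-answering corpora using low-rank adaptation (LoRA). On this basis, a multi-source knowledge-enhanced LLM, named PengKGPT, was developed to reason about and respond to problems in protected horticulture production. It takes production-related natural-language descriptions as input and uses the multi-source knowledge as an additional corpus. Finally, case studies showed that PengKGPT achieved a score rate of 91.2% and an accuracy of 82.10%, which are 36.6 and 32.53 percentage points higher than the base model, strengthening the LLM's ability to analyze vertical-domain problems. Compared with the commercial models ERNIE 4.0 Turbo and GPT-4o, the score rate was higher by 10.2 and 14 percentage points and the accuracy by 10.04 and 12.69 percentage points, respectively, indicating that PengKGPT is more professional and reliable in solving protected horticulture production problems. The results show that the model can provide auxiliary support for protected horticulture production.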
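The comparison figures above are stated as percentage-point gains, so the other models' own scores are not given directly. As a rough check, and assuming each gain is a simple difference between PengKGPT's value and the other model's value, the implied baselines can be back-calculated. The variable and function names below are illustrative, and the derived numbers are inferences from the abstract, not values reported in it.

```python
from decimal import Decimal

# PengKGPT results reported in the abstract (percent).
pengkgpt = {"score_rate": Decimal("91.2"), "accuracy": Decimal("82.10")}

# Reported gains of PengKGPT over each model (percentage points).
gains = {
    "base model": {"score_rate": Decimal("36.6"), "accuracy": Decimal("32.53")},
    "ERNIE 4.0 Turbo": {"score_rate": Decimal("10.2"), "accuracy": Decimal("10.04")},
    "GPT-4o": {"score_rate": Decimal("14"), "accuracy": Decimal("12.69")},
}


def implied_baselines(model, reported_gains):
    """Subtract each reported gain from the model's value (assumes gain = model - baseline)."""
    return {
        name: {metric: model[metric] - gain[metric] for metric in model}
        for name, gain in reported_gains.items()
    }


baselines = implied_baselines(pengkgpt, gains)
for name, values in baselines.items():
    print(f"{name}: score rate {values['score_rate']}%, accuracy {values['accuracy']}%")
```

Under that assumption, the implied values are about 54.6% and 49.57% for the base model, 81.0% and 72.06% for ERNIE 4.0 Turbo, and 77.2% and 69.41% for GPT-4o (score rate and accuracy, respectively). `Decimal` is used instead of floats so the two-decimal figures subtract exactly.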

Abstract:

The rapid development of China's protected horticulture industry has led to a surge in demand for intelligent knowledge services. However, current knowledge systems are fragmented and loosely connected, and knowledge service methods are imprecise and inefficient, which limits their value in guiding production. Moreover, practitioners' descriptions of problems are often incomplete, which further complicates problem solving. To address these issues, a multi-source knowledge-enhanced question-answering model that integrates a knowledge graph (KG) with a large language model (LLM) was proposed. First, a knowledge dataset for protected horticulture was constructed, covering more than 60 commonly cultivated crop categories and containing nearly 1.5 million characters. Through semantic segmentation, a total of 26,349 text blocks were obtained and stored in a vector database. In addition, textual knowledge related to production techniques was extracted from the dataset to construct a KG. A semantic information enhancement model based on KG entity matching was also proposed. Next, a retrieval-augmented generation method was designed, in which KG information and related text were fed into the prompt template to improve the LLM's problem-analysis capabilities. Furthermore, to improve its adaptability to the protected horticulture domain, the LLM was fine-tuned on relevant question-answering corpora using low-rank adaptation (LoRA). On this basis, a multi-source knowledge-enhanced LLM, named PengKGPT, was developed to reason about and respond to problems in protected horticulture production. Finally, case studies showed that PengKGPT attained a score rate of 91.2% and an accuracy rate of 82.10%, improvements of 36.6 and 32.53 percentage points over the base model, which strengthened the LLM's analytical capabilities for vertical-domain questions. Compared with the commercial models ERNIE 4.0 Turbo and GPT-4o, PengKGPT achieved score rates higher by 10.2 and 14 percentage points and accuracy rates higher by 10.04 and 12.69 percentage points, respectively. These results indicate that PengKGPT is more professional and reliable in addressing problems in protected horticulture production and can provide auxiliary support for it.
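The retrieval flow described in the abstract (match KG entities in the user's question, follow KG links to enrich the query, retrieve related text blocks, then place both in a prompt template) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the triples and text blocks are toy placeholders, all function names are hypothetical, and a bag-of-words cosine similarity stands in for the embedding model and vector database.

```python
import math
import re
from collections import Counter

# Toy placeholders (hypothetical; not taken from the paper's dataset or KG).
KG_TRIPLES = [
    ("tomato", "susceptible_to", "gray mold"),
    ("gray mold", "favored_by", "high humidity"),
    ("cucumber", "susceptible_to", "downy mildew"),
]
TEXT_BLOCKS = [
    "Ventilate the greenhouse to lower humidity when gray mold appears on tomato.",
    "Cucumber seedlings need warm nights during early growth.",
    "Drip irrigation reduces leaf wetness in protected cultivation.",
]


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def match_entities(question):
    """Return KG entities whose names appear verbatim in the question."""
    entities = {e for head, _, tail in KG_TRIPLES for e in (head, tail)}
    lowered = question.lower()
    return sorted(e for e in entities if e in lowered)


def expand_with_kg(entities, hops=2):
    """Collect triples reachable from the matched entities within `hops` steps."""
    frontier, seen, facts = set(entities), set(entities), []
    for _ in range(hops):
        next_frontier = set()
        for head, rel, tail in KG_TRIPLES:
            if head in frontier and (head, rel, tail) not in facts:
                facts.append((head, rel, tail))
                if tail not in seen:
                    next_frontier.add(tail)
        seen |= next_frontier
        frontier = next_frontier
    return facts


def cosine(a, b):
    """Bag-of-words cosine similarity (stand-in for embedding similarity)."""
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, k=2):
    """Return the k text blocks most similar to the query."""
    return sorted(TEXT_BLOCKS, key=lambda b: cosine(query, b), reverse=True)[:k]


def build_prompt(question):
    """Assemble a prompt guided by both KG facts and retrieved text."""
    facts = expand_with_kg(match_entities(question))
    # Enrich the query with KG tail entities before text retrieval.
    enhanced = question + " " + " ".join(tail for _, _, tail in facts)
    kg_part = "\n".join(f"- {h} {r} {t}" for h, r, t in facts)
    text_part = "\n".join(f"- {b}" for b in retrieve(enhanced))
    return (
        f"Knowledge graph facts:\n{kg_part}\n\n"
        f"Retrieved text:\n{text_part}\n\n"
        f"Question: {question}\nAnswer:"
    )


question = "My tomato leaves have fuzzy spots, what should I do?"
print(build_prompt(question))
```

In this toy run the question names only "tomato", yet the KG expansion adds the disease and humidity terms to the query, which is the kind of enrichment of an incomplete user description that the abstract attributes to entity matching. A real system would replace `cosine` and `retrieve` with an embedding model and a vector store, and pass the prompt to the fine-tuned LLM.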


Cite this article

SUN Xianpeng, XIANG Yingfeng, FU Ying, ZHANG Chenyang, WU Weijun. PengKGPT: A Multi-source Knowledge-enhanced Large Language Model for Assisting in Protected Horticulture Production[J]. Transactions of the Chinese Society for Agricultural Machinery, 2026, 57(3): 270-283.


History

Received: 2025-08-25
Published online: 2026-02-01
