Spatial Interpolation of Soil Nutrients Content Based on Environmental Variables Screening and Machine Learning
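Funding:

National Natural Science Foundation of China (42161042), the Xinjiang Production and Construction Corps Science and Technology Innovation Leading Talent Project (2023CB008-10), and the Xinjiang Production and Construction Corps Key Agricultural Research Project (2023AA601)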


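    Abstract (translated from the Chinese):

    To improve the accuracy of spatial interpolation of farmland soil nutrient content and to characterize the spatial distribution of soil nutrients accurately, the oasis of the Manas River Basin in Xinjiang was taken as the study area. Soil organic matter, total nitrogen, available phosphorus, available potassium, pH, and salt content were measured and combined with longitude, latitude, terrain, meteorological, and vegetation index factors as environmental variables. Variables were screened with the Pearson correlation coefficient (PCC), the variance inflation factor (VIF), and the extreme gradient boosting (XGBoost) algorithm. Four machine learning models, namely decision tree (DT), random forest (RF), radial basis function neural network (RBF), and long short-term memory network (LSTM), were used alongside ordinary Kriging (OK) to spatially interpolate soil organic matter, total nitrogen, available phosphorus, and available potassium content in the farmland of the study area. The results showed that soil organic matter, total nitrogen, available phosphorus, and available potassium contents in the study area ranged from 0.226 to 32.275 g/kg, 0.117 to 1.272 g/kg, 3.159 to 53.884 mg/kg, and 81.510 to 488.422 mg/kg, respectively, with coefficients of variation of 30.636% to 43.648%, all indicating moderate variability. PCC, VIF, and XGBoost screening all showed that soil organic matter, total nitrogen, available phosphorus, and available potassium were correlated with one another to some degree and could serve as environmental variables for interpolating the target property, but the three methods differed in which longitude, latitude, terrain, meteorological, and vegetation index factors they selected. XGBoost identified the environmental variables important to the interpolation results more effectively, and models built on XGBoost-screened variables were clearly more accurate than models built on PCC- or VIF-screened variables. Machine learning models that incorporated environmental variables were generally more accurate than the OK model without them; for a given soil nutrient, model accuracy ranked from highest to lowest as RF, LSTM, RBF, DT, OK. Compared with the OK model without environmental variables, the RF models built on XGBoost-screened variables for soil organic matter, total nitrogen, available phosphorus, and available potassium increased the coefficient of determination by 43.02%, 101.00%, 86.04%, and 137.89%, reduced the root mean square error by 27.39%, 42.78%, 13.12%, and 28.39%, and reduced the mean absolute error by 29.01%, 43.84%, 11.20%, and 29.62%, respectively. Mapping the farmland soil nutrients of the study area with the RF model showed that soil organic matter and total nitrogen had strongly consistent spatial distributions, with higher contents concentrated mainly in the southern and eastern parts of the study area; available phosphorus and available potassium had broadly similar spatial patterns, with lower contents in the southeastern and north-central parts. In summary, XGBoost variable screening combined with the RF model achieves better spatial interpolation of soil nutrients and can serve as an effective method for this task.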
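    The variability classification and the correlation-based screening step mentioned above can be illustrated with a minimal Python sketch. It is not the authors' code: the toy data, the 0.3 correlation threshold, and the function names are hypothetical, and the variability classes follow a commonly used convention (below 10% weak, 10% to 100% moderate, above 100% strong) that the abstract itself does not spell out.

```python
import math


def coefficient_of_variation(values):
    """Return the coefficient of variation (sample SD / mean) in percent."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return math.sqrt(variance) / mean * 100.0


def variability_class(cv_percent):
    """Classify variability; the cut-offs are an assumed convention."""
    if cv_percent < 10.0:
        return "weak"
    if cv_percent <= 100.0:
        return "moderate"
    return "strong"


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def screen_by_pcc(target, candidates, threshold=0.3):
    """Keep candidates whose |r| with the target reaches the threshold.

    The threshold value is hypothetical, chosen only for illustration.
    """
    selected = {}
    for name, values in candidates.items():
        r = pearson(target, values)
        if abs(r) >= threshold:
            selected[name] = r
    return selected


# Hypothetical toy samples (not data from the paper).
som = [8.0, 12.0, 15.0, 20.0, 25.0]  # soil organic matter, g/kg
covariates = {
    "total_nitrogen": [0.4, 0.6, 0.7, 1.0, 1.2],
    "elevation": [410.0, 395.0, 420.0, 400.0, 415.0],
}

print(variability_class(coefficient_of_variation(som)))
print(sorted(screen_by_pcc(som, covariates)))
```

    In this toy case the strongly correlated covariate is kept and the weakly correlated one is dropped. VIF and XGBoost screening would need a regression or boosting library and are not shown here.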

    Abstract:

    To improve the accuracy of spatial interpolation of soil nutrients in farmland and to characterize their spatial distribution accurately, variable screening was performed using the Pearson correlation coefficient (PCC), the variance inflation factor (VIF), and the extreme gradient boosting (XGBoost) algorithm. Decision tree (DT), random forest (RF), radial basis function (RBF), and long short-term memory (LSTM) models were then used alongside ordinary Kriging (OK) to interpolate soil nutrient content in the farmland. The results showed that soil organic matter, total nitrogen, available phosphorus, and available potassium contents in the study area ranged from 0.226 g/kg to 32.275 g/kg, 0.117 g/kg to 1.272 g/kg, 3.159 mg/kg to 53.884 mg/kg, and 81.510 mg/kg to 488.422 mg/kg, respectively, with moderate variability. PCC, VIF, and XGBoost screening all showed that soil organic matter, total nitrogen, available phosphorus, and available potassium were correlated with one another to some degree and could be used as environmental variables for spatial interpolation of the target property. XGBoost identified the environmental variables important to the interpolation results more effectively, and models built on XGBoost-screened variables were clearly more accurate than models built on PCC- or VIF-screened variables. Moreover, machine learning models that incorporated environmental variables were generally more accurate than the OK model without environmental variables, and for a given soil nutrient the accuracy of the interpolation models followed the order RF>LSTM>RBF>DT>OK. Mapping soil nutrients in the study area with the RF model showed that higher contents of soil organic matter and total nitrogen were concentrated mainly in the southern and eastern regions of the study area, whereas lower contents of available phosphorus and available potassium occurred in the southeastern and north-central regions. In summary, XGBoost variable screening combined with the RF model achieves better spatial interpolation of soil nutrients and can serve as an effective method for this task.
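    The model ranking above rests on accuracy metrics compared across models. As a hedged sketch (not the authors' evaluation code), the snippet below computes the coefficient of determination, root mean square error, and mean absolute error for two sets of predictions and expresses the change relative to a baseline in percent. The observations, the predictions, and the choice of R² as 1 - SS_res/SS_tot are assumptions made for illustration.

```python
import math


def r2(observed, predicted):
    """Coefficient of determination, assumed here as 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


def percent_change(baseline, model):
    """Relative change of a metric versus the baseline, in percent."""
    return (model - baseline) / baseline * 100.0


# Hypothetical validation data (not values from the paper).
observed = [10.0, 12.0, 14.0, 16.0, 18.0]
ok_pred = [12.0, 12.0, 14.0, 14.0, 18.0]  # stand-in for a kriging baseline
rf_pred = [11.0, 12.0, 14.0, 16.0, 17.0]  # stand-in for a random forest

for name, metric in (("R2", r2), ("RMSE", rmse), ("MAE", mae)):
    base = metric(observed, ok_pred)
    model = metric(observed, rf_pred)
    print(f"{name}: baseline={base:.3f} model={model:.3f} "
          f"change={percent_change(base, model):+.2f}%")
```

    A positive change in R² and negative changes in RMSE and MAE indicate that the model improves on the baseline, which is the direction of the comparison reported for RF against OK.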

Cite this article

XIAN Yang, SONG Jianghui, WANG Jin’gang, LI Weidi, ZHANG Wenxu, WANG Haijiang. Spatial Interpolation of Soil Nutrients Content Based on Environmental Variables Screening and Machine Learning[J]. Transactions of the Chinese Society for Agricultural Machinery, 2024, 55(10): 379-391.

History
  • Received: 2023-12-20
  • Published online: 2024-10-10