Abstract: Against the backdrop of intensifying global climate change and escalating food security challenges, accurate and timely crop yield estimation is important. Traditional vegetation indices based on reflectance spectra struggle to capture the photosynthetic physiological state of crops in real time, while single-model approaches such as the Transformer and the bidirectional long short-term memory network (BiLSTM) are also limited in extracting yield-related temporal features. Therefore, a hybrid deep learning yield estimation model was proposed that integrates data such as solar-induced chlorophyll fluorescence (SIF), actual evapotranspiration (Aet), precipitation (Ppt), and the Palmer drought severity index (PDSI). By combining the strength of the Transformer in extracting global dependencies with that of the BiLSTM in capturing local detail changes, a Transformer-BiLSTM wheat yield estimation model was constructed, and its generalization ability and feature contributions were evaluated. Results indicated that the Transformer-BiLSTM hybrid model achieved the best fit on the 2013–2019 county-level test dataset from Henan Province (R² = 0.89, NRMSE = 8.18%, RPD = 2.90). All metrics outperformed those of the single Transformer and BiLSTM models: R² increased by 0.04, NRMSE decreased by 1.46 and 1.22 percentage points, respectively, and RPD improved from 2.46 and 2.53 to 2.90. In the 2020–2022 cross-temporal experiment on county-level data from Henan Province, the Transformer-BiLSTM hybrid model maintained high accuracy (R² = 0.89, NRMSE = 8.44%, RPD = 2.77). Compared with the single Transformer and BiLSTM models, R² improved by 0.05 and 0.07, respectively, NRMSE decreased by 1.92 and 1.56 percentage points, and RPD rose from 2.25 and 2.33 to 2.77, demonstrating the model's robust temporal generalization capability.
When further applied to Anhui Province, where yield distributions were more complex, the hybrid model still performed robustly (R² = 0.87, NRMSE = 11.07%, RPD = 2.73). Relative to the single Transformer and BiLSTM models, R² increased by 0.07 and 0.08, respectively, NRMSE decreased by 2.33 and 3.10 percentage points, and RPD improved from 2.25 and 2.13 to 2.73, confirming the strong regional generalization capability of the Transformer-BiLSTM hybrid model. Furthermore, when the high-resolution yield distribution maps generated at the pixel scale were aggregated to the county level for validation, they showed high consistency with statistical yields (R² > 0.8). Shapley additive explanations (SHAP) feature importance analysis showed that the minimum temperatures from January to February and the SIF from March to June contributed most to the model outputs, with SIF maintaining consistently high importance throughout the entire time series. Meteorological factors such as PDSI, Ppt, and Aet during the winter wheat jointing to grain-filling stages also exerted a significant influence on yield prediction, indicating that the model can effectively capture synergistic interactions between crop growth processes and environmental factors.
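
For readers who wish to reproduce the three evaluation metrics reported above, a minimal sketch follows. The abstract does not give the exact formulas, so the definitions here are assumptions based on common usage: R² as one minus the ratio of residual to total sum of squares, NRMSE as RMSE divided by the mean observed yield (expressed as a percentage), and RPD as the sample standard deviation of observed yields divided by RMSE. The function name `yield_metrics` and the toy values are hypothetical and are not taken from the study.

```python
import math
import statistics


def yield_metrics(observed, predicted):
    """Return R2, NRMSE (%) and RPD for paired observed/predicted yields.

    Assumed definitions (not confirmed by the source):
      R2    = 1 - SS_res / SS_tot
      NRMSE = 100 * RMSE / mean(observed)
      RPD   = sample_std(observed) / RMSE
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")

    n = len(observed)
    mean_obs = statistics.fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed yields are constant; R2 and RPD are undefined")

    rmse = math.sqrt(ss_res / n)
    # A perfect fit gives RMSE = 0, so RPD is reported as infinity.
    rpd = math.inf if rmse == 0 else statistics.stdev(observed) / rmse
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "nrmse_pct": 100.0 * rmse / mean_obs,
        "rpd": rpd,
    }


# Toy county-level yields in t/ha (illustrative values only).
observed = [4.0, 5.0, 6.0, 7.0]
predicted = [4.5, 5.0, 5.5, 7.0]
metrics = yield_metrics(observed, predicted)
print({name: round(value, 2) for name, value in metrics.items()})
```

Other conventions exist, for example normalizing RMSE by the observed range instead of the mean, so values computed this way are comparable with the reported ones only if the same definitions were used.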