Abstract: To overcome the technical bottlenecks of traditional cigarette formula design and maintenance, namely strong subjectivity and over-reliance on manual experience and sensory evaluation, an indirect correlation model of “near-infrared spectroscopy-chemical composition-sensory indicators” was constructed, and an end-to-end method for predicting tobacco sensory quality indicators was proposed based on near-infrared spectroscopy and the Transformer architecture. First, three spectral preprocessing techniques, Savitzky-Golay convolution smoothing (SG), the first derivative (D1), and multiplicative scatter correction (MSC), were used to remove baseline drift and scattering interference. A Transformer prediction model tailored to the features of spectral data was then designed to predict the three-dimensional evaluation system of tobacco sensory quality (style characteristics: fresh, sweet, and burnt; smoke characteristics: concentration and strength; quality characteristics: quality of aroma, volume of aroma, offensive taste, irritation, and pleasant aftertaste). The SHAP method was used to analyze the model and improve its interpretability. The results showed that the model's mean absolute error on the test set was no more than 0.56 for every sensory indicator, demonstrating good usability. For different sensory indicators, the model captured distinct spectral feature bands and revealed the synergistic effects among spectral features, indicating good interpretability. Furthermore, a method for assisting tobacco leaf substitution was designed by combining multidimensional similarity analysis, providing quantitative decision support for tobacco leaf substitution and blend optimization.
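The three preprocessing steps named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the 5-point quadratic Savitzky-Golay window, the central-difference derivative, the use of the mean spectrum as the MSC reference, and the function names `sg_smooth`, `first_derivative`, and `msc` are all assumptions made for illustration, since the abstract does not give these settings or the order in which the steps are applied.

```python
from statistics import mean

# Assumed window: 5-point quadratic Savitzky-Golay smoothing coefficients.
SG_COEFFS = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def sg_smooth(spectrum):
    """Smooth interior points with the 5-point SG filter; edge points are left unchanged."""
    half = len(SG_COEFFS) // 2
    out = list(spectrum)
    for i in range(half, len(spectrum) - half):
        out[i] = sum(c * spectrum[i + k - half] for k, c in enumerate(SG_COEFFS))
    return out


def first_derivative(spectrum, step=1.0):
    """Central-difference first derivative; the result is two points shorter than the input."""
    return [
        (spectrum[i + 1] - spectrum[i - 1]) / (2 * step)
        for i in range(1, len(spectrum) - 1)
    ]


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum (assumed reference)."""
    ref = [mean(col) for col in zip(*spectra)]
    ref_mean = mean(ref)
    sxx = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mean = mean(s)
        # Least-squares fit of s ≈ intercept + slope * ref.
        slope = sum((r - ref_mean) * (x - s_mean) for r, x in zip(ref, s)) / sxx
        intercept = s_mean - slope * ref_mean
        corrected.append([(x - intercept) / slope for x in s])
    return corrected


if __name__ == "__main__":
    # Toy spectra: the second is a scaled and offset copy of the first.
    base = [1.0, 2.0, 4.0, 3.0, 5.0, 4.0, 6.0]
    raw = [base, [2.0 * x + 1.0 for x in base]]
    processed = [first_derivative(sg_smooth(s)) for s in msc(raw)]
    print(processed)
```

In this toy case the two spectra differ only by a multiplicative and an additive term, so MSC maps both onto the same corrected curve before smoothing and differentiation.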