CN117809766A - Tobacco leaf near infrared spectrum chemical component model optimization method based on transfer learning - Google Patents

Info

Publication number
CN117809766A
Authority
CN
China
Prior art keywords
tobacco
model
near infrared
spectrum
infrared spectrum
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311709858.0A
Other languages
Chinese (zh)
Inventor
王丽丽
田雷
张志勇
陈秀斋
程云吉
高强
宗浩
谭效磊
刘勇
刘元德
高鹏程
卢溪文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Linyi Tobacco Co Ltd
Original Assignee
Shandong Linyi Tobacco Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Landscapes

  • Investigating Or Analysing Materials By Optical Means (AREA)
  • Manufacture Of Tobacco Products (AREA)

Abstract

The invention relates to a transfer-learning-based method for optimizing near infrared spectrum models of tobacco leaf chemical components, and belongs to the technical field of tobacco detection. Because changes in environment, climate and other factors degrade the accuracy of an existing near infrared tobacco detection model, the method builds on that model and uses a correlation-ratio transfer learning method to transfer the effective information in the broken leaf spectrum, which improves the stability of the whole leaf spectrum measurement model and establishes a new prediction model for total sugar, reducing sugar, total alkaloids, potassium and chlorine content. The optimization method reduces the workload of resampling and remodeling and yields a more stable prediction model. It can be used to analyze the chemical components of tobacco leaves during tobacco production, meets the need for rapid detection of the conventional chemical components of tobacco leaves, and provides data that meet the requirements of tobacco production enterprises.

Description

Tobacco leaf near infrared spectrum chemical component model optimization method based on transfer learning
Technical Field
The invention belongs to the technical field of tobacco component detection, and particularly relates to a tobacco near infrared spectrum chemical component model optimization method based on transfer learning.
Background
Near infrared (NIR) radiation is the electromagnetic radiation between the visible (VIS) and mid-infrared (MIR) regions, with wavelengths of 780-2526 nm; it is commonly divided into a short-wave region (780-1100 nm) and a long-wave region (1100-2526 nm). NIR spectra reflect the overtone and combination-band absorptions of hydrogen-containing groups such as C-H, O-H and N-H in organic molecules. Different chemical components absorb near infrared light differently, so the measured spectrum varies with composition. By measuring the near infrared spectrum of a sample, directly or indirectly, a mathematical model relating the content of a given chemical component to the sample's near infrared absorption spectrum can be established. The established model is then used to predict the content of that component in similar samples from their near infrared absorption data. The method requires only the near infrared spectrum of the sample, and is fast, timely and non-destructive. NIR analysis has been applied in many fields, including agriculture, petrochemicals, pharmaceuticals, food and clinical medicine.
Tobacco contains many chemical components. Total sugar, reducing sugar, total nitrogen, total alkaloids, potassium and chlorine are the most basic indexes for evaluating tobacco leaf quality, and play an important role in tobacco leaf purchasing and grading, cigarette formula design and quality monitoring. Near infrared detection of tobacco chemical components is becoming widespread in tobacco production. Many researchers have studied it and built effective mathematical models, and their results have promoted the development of the tobacco industry.
Chinese patent CN114088661A discloses an online method, based on transfer learning and near infrared spectroscopy, for predicting chemical components during the tobacco curing (baking) process. It collects tobacco samples multiple times, iteratively trains a tobacco chemical component prediction model on the analyzed data, and obtains the predicted chemical components from that model.
The chemical components of a whole tobacco leaf are unevenly distributed, and a portable spectrometer usually collects spectral information at individual points that are taken to represent the whole surface, so the resulting model is unstable. After the whole leaf is broken up and flattened, its chemical composition becomes more uniform. This problem can therefore be addressed by acquiring the near infrared spectrum of the broken leaves and applying a correlation-ratio transfer learning algorithm to establish a component analysis model for whole tobacco leaves.
Disclosure of Invention
To overcome the defect that a portable spectrometer cannot comprehensively collect the spectrum information of a whole tobacco leaf, the invention provides a tobacco leaf near infrared spectrum chemical component model optimization method based on transfer learning. The method transfers the effective information of the broken leaf spectrum and applies transfer learning to the existing component prediction model, which improves the accuracy of the whole leaf spectrum measurement model and better suits the rapid detection of the conventional chemical components of tobacco leaves. Its main purpose is to establish a more stable tobacco leaf chemical component prediction model that supports direct model transfer, to reduce the workload of collecting new samples and rebuilding the model, to achieve higher prediction accuracy while using only the spectrum information of whole leaf samples, and to provide a more stable prediction model for detecting the conventional chemical components of tobacco leaves.
To achieve the above purpose, the invention provides a method for optimizing a tobacco near infrared spectrum chemical component model. A tobacco chemical component analysis model is established from the near infrared spectrum data of whole tobacco leaves and broken tobacco leaves, the effective spectrum information of the broken leaves is transferred into the whole leaf spectrum model to establish a transfer-learning-based optimization model, and the optimization model is then used to detect the components of the tobacco leaves to be tested.
The technical scheme adopted by the invention is as follows: the tobacco near infrared spectrum chemical component model optimization method based on transfer learning comprises the following steps:
S1, obtaining a near infrared spectrum of the whole tobacco leaf;
S2, obtaining a near infrared spectrum of the broken tobacco leaves;
S3, drying and grinding the tobacco leaves, and obtaining reference chemical component values by assay on a continuous flow analyzer;
S4, preprocessing the near infrared spectrum data acquired in step S1 and step S2;
S5, establishing chemical component analysis models for whole tobacco leaves and broken tobacco leaves respectively from the near infrared spectrum data of step S1 and step S2;
S6, transferring the broken leaf spectrum information into the whole leaf spectrum model by a correlation-ratio transfer learning method to obtain an optimization model of the tobacco leaf near infrared spectrum chemical components; and detecting the chemical components of the tobacco leaves with the optimization model to obtain predicted values of the chemical components of the tobacco leaves.
Further, in step S6, the optimization model of the tobacco leaf near infrared spectrum chemical components is obtained by the following steps:
step S61, scanning a whole tobacco leaf sample with a portable tobacco chemical component analyzer, and collecting the whole leaf spectrum data;
step S62, scanning the broken tobacco leaf sample with the portable tobacco chemical component analyzer, and collecting the spectrum data of the broken leaves;
step S63, determining the total sugar, reducing sugar, total alkaloids, potassium and chlorine content of the flue-cured tobacco leaf sample by a continuous flow method;
step S64, preprocessing the spectrum data by sequentially applying smoothing, a derivative method and standard normal variate (SNV) transformation, to reduce the influence of noise and systematic errors on the spectrum;
step S65, modeling with partial least squares (PLS), which gives a smaller prediction residual sum of squares and therefore higher prediction stability;
and step S66, starting from the model obtained by PLS modeling in step S65, calculating the parameter β of the optimization model from the corresponding formula by the correlation-ratio transfer learning method, so as to obtain the required optimized prediction model.
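The preprocessing chain of step S64 (smoothing, derivative, SNV) can be sketched as follows. This is a minimal illustration and not the patented implementation: the source does not name the smoothing filter, window size or derivative order, so the moving-average filter, the odd window of 5 points, the first-order derivative and all function names are assumptions.

```python
import numpy as np

def moving_average(spectra, window=5):
    """Smooth each spectrum (one row per sample) with a centered moving average.

    Assumption: an odd window; edges are padded with the nearest value.
    """
    kernel = np.ones(window) / window
    pad = window // 2
    padded = np.pad(spectra, ((0, 0), (pad, pad)), mode="edge")
    return np.array([np.convolve(row, kernel, mode="valid") for row in padded])

def first_derivative(spectra):
    """Approximate the first derivative along the wavelength axis."""
    return np.gradient(spectra, axis=1)

def snv(spectra):
    """Standard normal variate: center and scale each spectrum by its own mean and std."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def preprocess(spectra, window=5):
    """Apply smoothing, derivative and SNV in sequence, as in step S64."""
    return snv(first_derivative(moving_average(spectra, window)))
```

Because SNV rescales every spectrum individually, a constant offset or a positive gain applied to a raw spectrum does not change the preprocessed result, which is the kind of systematic error the step is meant to suppress.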
Further, the predicted values of the chemical components of the tobacco leaves are obtained by the following steps:
step S67, collecting the spectrum information of the whole tobacco leaf;
step S68, preprocessing the sample spectrum data obtained in step S67;
and step S69, substituting the data obtained in step S68 into the optimization model of the tobacco leaf near infrared spectrum chemical components to obtain the predicted values of the chemical components of the tobacco leaves.
Further, in step S66, the transfer learning modeling method is as follows:
Starting from the model obtained by PLS modeling, a correlation-ratio transfer learning method is adopted. Suppose there are models P and Q, where $P = (X_P, Y_P)$ and $Q = ((X_Q, Z_Q), Y_Q)$, with response variables [formula missing in source] and covariates [formula missing in source].
If the models P and Q are both linear models, set:
$$Y_P = X_P \beta + \varepsilon_P,$$
$$Y_Q = X_Q \gamma + Z_Q \eta + \varepsilon_Q.$$
If [condition missing in source] and $(E[X_P (X_P)^T])^{-1}$ exists,
the model parameters have the following relationship:
$$\beta = (E[X_P (X_P)^T])^{-1} \Lambda_0 \left(E[X_Q (X_Q)^T]\gamma + E[X_Q (Z_Q)^T]\eta\right),$$
where $\Lambda_0$ is [definition missing in source]. Here Q is the model for the laboratory-acquired data, X is the spectrum information obtained by whole leaf scanning, Z is the spectrum information obtained by broken leaf scanning, and Y is the tobacco chemical component value measured by the continuous flow method;
PLS modeling yields the model $Y_Q = X_Q \gamma + Z_Q \eta + \varepsilon_Q$.
The parameter $\beta$ is then determined from the formula:
$$\beta = (E[X_P (X_P)^T])^{-1} \Lambda_0 \left(E[X_Q (X_Q)^T]\gamma + E[X_Q (Z_Q)^T]\eta\right).$$
The resulting model $Y_P = X_P \beta + \varepsilon_P$ is the required prediction model. When predicting the chemical components of tobacco leaves from the near infrared spectrum, only the spectrum information of the whole tobacco leaf needs to be collected and substituted for $X_P$ in model P to obtain the predicted value $Y_P$ of the chemical components of the tobacco leaves.
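The relationship for β above can be evaluated numerically as in the following sketch. It rests on several assumptions that are not stated in the source: each expectation is replaced by a sample average over rows (one row per sample), the function and argument names are hypothetical, and because the definition of Λ_0 did not survive in the text, it is passed in as an argument with the identity matrix used purely as a placeholder default.

```python
import numpy as np

def transfer_beta(X_p, X_q, Z_q, gamma, eta, lambda0=None):
    """Evaluate beta = (E[X_P X_P^T])^-1 Lambda_0 (E[X_Q X_Q^T] gamma + E[X_Q Z_Q^T] eta).

    X_p: whole-leaf spectra for model P, shape (n_p, p)
    X_q: whole-leaf spectra for model Q, shape (n_q, p)
    Z_q: broken-leaf spectra for model Q, shape (n_q, q)
    gamma, eta: coefficients of the fitted model Q, shapes (p,) and (q,)
    lambda0: (p, p) matrix; ASSUMPTION: identity when not supplied, because
             its definition is missing from the source text.
    """
    n_p, n_q = X_p.shape[0], X_q.shape[0]
    # Sample estimates of the second-moment matrices (assumption).
    s_pp = X_p.T @ X_p / n_p
    s_qq = X_q.T @ X_q / n_q
    s_qz = X_q.T @ Z_q / n_q
    if lambda0 is None:
        lambda0 = np.eye(X_p.shape[1])  # placeholder only
    rhs = lambda0 @ (s_qq @ gamma + s_qz @ eta)
    # Solve the linear system instead of forming the inverse explicitly.
    return np.linalg.solve(s_pp, rhs)
```

With the identity placeholder and the same samples used for P and Q, the result coincides with an ordinary least-squares fit of $X_Q \gamma + Z_Q \eta$ on the whole-leaf spectra alone, which illustrates how the broken-leaf information is folded into a model that needs only whole-leaf input.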
Further, the method for obtaining the predicted values of the chemical components of the tobacco leaves also comprises converting the predicted values obtained in step S69, according to a conversion rule, into a form that the user can understand, and outputting them.
Further, the accuracy of the predicted values obtained in step S69 is evaluated by the internal cross-validation coefficient of determination $R^2$ and the root mean square error of cross-validation (RMSECV).
Further, the preprocessing method in step S4 comprises smoothing, a derivative method and standard normal variate (SNV) transformation.
Further, the reference chemical component values comprise total sugar, reducing sugar, total alkaloids, total potassium and total chlorine values.
Further, the near infrared spectrometer used to obtain the near infrared spectra in step S1 and step S2 is a YKS-H1 portable tobacco chemical component analyzer.
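The two accuracy criteria named above, $R^2$ and RMSECV, can be computed from cross-validated predictions as sketched below. The source does not state the number of folds or how samples are split, so the 5-fold random split, the seed and the function names are assumptions; `fit` and `predict` are hypothetical callables standing in for whatever model is being validated.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

def rmse(y_true, y_pred):
    """Root mean square error; on cross-validated predictions this is the RMSECV."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def cross_val_predict(X, y, fit, predict, n_folds=5, seed=0):
    """Predict every sample from a model trained on the other folds."""
    idx = np.random.default_rng(seed).permutation(len(y))
    y_cv = np.empty(len(y), dtype=float)
    for fold in np.array_split(idx, n_folds):
        train = np.setdiff1d(idx, fold)
        model = fit(X[train], y[train])
        y_cv[fold] = predict(model, X[fold])
    return y_cv
```

A typical call passes the training and prediction routines of the calibration model, then reports `r2_score(y, y_cv)` and `rmse(y, y_cv)` for each chemical component.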
Compared with the prior art, the invention has the beneficial effects that:
The method first builds a model with partial least squares (PLS) and then optimizes that model with a correlation-ratio transfer learning method to obtain a new component prediction model, so that when the near infrared spectrum is used to predict the chemical components of tobacco leaves, the predicted values can be obtained by collecting only the whole-leaf spectrum information. The method addresses the problem that an existing component detection model no longer suits tobacco leaves from a new year because of changes in environment, climate and other factors, and it reduces the workload of collecting samples again and remodeling. It also addresses the poor model stability caused by the uneven distribution of chemical components in whole tobacco leaves combined with the point-based spectrum collection of a portable spectrometer, while preserving the operability of whole-leaf collection in an actual production environment.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2a is a full leaf near infrared spectrum;
FIG. 2b is a near infrared spectrum of broken leaves.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As shown in FIG. 1 to FIG. 2b, the invention provides an embodiment of a tobacco near infrared spectrum chemical component model optimization method based on transfer learning.
The optimization method comprises the following steps:
S1, obtaining a near infrared spectrum of the whole tobacco leaf;
S2, obtaining a near infrared spectrum of the broken tobacco leaves;
S3, drying and grinding the tobacco leaves, and obtaining reference chemical component values by assay on a continuous flow analyzer;
S4, preprocessing the near infrared spectrum data acquired in step S1 and step S2;
S5, establishing chemical component analysis models for whole tobacco leaves and broken tobacco leaves respectively from the near infrared spectrum data of step S1 and step S2;
S6, transferring the broken leaf spectrum information into the whole leaf spectrum model by a correlation-ratio transfer learning method to obtain an optimization model of the tobacco leaf near infrared spectrum chemical components; and detecting the chemical components of the tobacco leaves with the optimization model to obtain predicted values of the chemical components of the tobacco leaves.
In the tobacco near infrared spectrum chemical component model optimization method based on transfer learning provided by the invention, a tobacco chemical component analysis model is established from the near infrared spectrum data of whole tobacco leaves and broken tobacco leaves, the effective spectrum information of the broken leaves is transferred into the whole leaf spectrum model to establish a transfer-learning-based optimization model, and the optimization model is then used to detect the components of the tobacco leaves to be tested.
In the embodiment of the invention, the near infrared spectrometer used to obtain the near infrared spectra in step S1 and step S2 is a YKS-H1 portable tobacco chemical component analyzer. The analyzer uses world-leading MEMS micro-mirror technology and has a built-in high-performance, long-life light source; its working wavelength range is 1350-2500 nm and its working temperature range is -5 to 40 ℃. It can rapidly and non-destructively detect chemical indexes of tobacco leaves such as total sugar, reducing sugar, nicotine, potassium and chlorine. The continuous flow analyzer used is a Skalar San++ (Netherlands).
In the embodiment of the invention, in step S6, the optimization model of the tobacco leaf near infrared spectrum chemical components is obtained by the following steps:
step S61, scanning a whole tobacco leaf sample with a portable tobacco chemical component analyzer, and collecting the whole leaf spectrum data;
step S62, scanning the broken tobacco leaf sample with the portable tobacco chemical component analyzer, and collecting the spectrum data of the broken leaves;
step S63, determining the total sugar, reducing sugar, total alkaloids, potassium and chlorine content of the flue-cured tobacco leaf sample by a continuous flow method;
step S64, preprocessing the spectrum data by sequentially applying smoothing, a derivative method and standard normal variate (SNV) transformation, to reduce the influence of noise and systematic errors on the spectrum;
step S65, modeling with partial least squares (PLS), which gives a smaller prediction residual sum of squares and therefore higher prediction stability;
and step S66, starting from the model obtained by PLS modeling in step S65, calculating the parameter β of the optimization model from the corresponding formula by the correlation-ratio transfer learning method, so as to obtain the required optimized prediction model.
In the embodiment of the invention, the predicted values of the chemical components of the tobacco leaves are obtained by the following steps:
step S67, collecting the spectrum information of the whole tobacco leaf;
step S68, preprocessing the sample spectrum data obtained in step S67;
and step S69, substituting the data obtained in step S68 into the optimization model of the tobacco leaf near infrared spectrum chemical components to obtain the predicted values of the chemical components of the tobacco leaves.
The method for obtaining the predicted values of the chemical components of the tobacco leaves also comprises converting the predicted values obtained in step S69, according to a conversion rule, into a form that the user can understand, and outputting them.
In the embodiment of the invention: (1) tobacco leaf spectra collected with the MEMS micro-mirror near infrared spectrometer are used as the source domain spectra; more than 300 spectra are collected, over a measured wavelength range of 1350-2500 nm;
(2) The chemical components of the tobacco samples, namely total sugar, reducing sugar, total alkaloids, potassium and chlorine content, are determined with a continuous flow analyzer according to the national industry standards: total sugar and reducing sugar (YC/T159-2019), total alkaloids (YC/T468-2013), total potassium (YC/T217-2007) and total chlorine (YC/T162-2011);
(3) Spectral preprocessing: smoothing, a derivative method and standard normal variate (SNV) transformation are applied to eliminate interference from irrelevant factors and reduce the influence of noise and systematic errors on the spectrum.
(4) Modeling method: modeling is first carried out with partial least squares (PLS). PLS combines multiple linear regression, canonical correlation analysis and principal component analysis, and the resulting model has a smaller prediction residual sum of squares and higher prediction stability.
(5) Transfer method: transfer learning is better suited than other techniques to exploiting the effective information in the broken leaf spectrum. The established model is optimized with the correlation-ratio transfer learning method to build a new prediction model, so that the chemical components of tobacco leaves can be predicted from the near infrared spectrum using only the whole-leaf spectrum information.
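The PLS modeling in item (4) can be illustrated with a compact single-response PLS (PLS1) fit based on the NIPALS algorithm. This is a sketch under assumptions: the source does not say which PLS variant or software was used, nor how many latent components were kept, so the algorithm choice, the `n_components` argument and the function names are all hypothetical.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model (NIPALS) and return (coef, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean          # work on centered data
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                        # direction of maximal covariance with y
        w = w / np.linalg.norm(w)
        t = Xk @ w                           # scores of the latent component
        tt = t @ t
        p = Xk.T @ t / tt                    # X loadings
        c = yk @ t / tt                      # y loading
        Xk = Xk - np.outer(t, p)             # deflate X
        yk = yk - c * t                      # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)   # regression vector in the original X space
    intercept = y_mean - x_mean @ coef
    return coef, intercept

def pls1_predict(X, coef, intercept):
    """Predict the response for new spectra."""
    return X @ coef + intercept
```

In practice the number of components is usually chosen by cross-validation; with as many components as the rank of the centered spectra, the fit reduces to ordinary least squares.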
As shown in FIG. 1, the tobacco near infrared spectrum chemical component optimization model based on transfer learning is established by the following steps:
1) Scanning a whole tobacco leaf sample with a portable tobacco chemical component analyzer, and collecting the whole leaf spectrum data;
2) Scanning the broken tobacco leaf sample with the portable tobacco chemical component analyzer, and collecting the spectrum data of the broken leaves;
3) Determining the total sugar, reducing sugar, total alkaloids, potassium and chlorine content of the flue-cured tobacco samples by a continuous flow method; the statistical information on the chemical components of the tobacco samples is shown in Table 1;
Table 1 Statistical information on the chemical components of the tobacco leaf samples
4) To suppress the high-frequency noise, baseline drift and similar effects that irrelevant factors introduce into the near infrared spectrum, the spectrum data are preprocessed by smoothing, a derivative method and standard normal variate (SNV) transformation, which reduces the influence of noise and systematic errors on the spectrum;
5) PLS modeling: modeling is carried out with partial least squares (PLS). PLS combines multiple linear regression, canonical correlation analysis and principal component analysis, and the resulting model has a small prediction residual sum of squares and high prediction stability. PLS models are built on the whole-leaf spectrum information of more than 300 samples, and the internal cross-validation results are shown in Table 2;
Table 2 Internal cross-validation results of the near infrared models
6) Transfer learning modeling: starting from the model obtained by PLS modeling, a correlation-ratio transfer learning method is adopted. Suppose there are models P and Q, where $P = (X_P, Y_P)$ and $Q = ((X_Q, Z_Q), Y_Q)$, with response variables [formula missing in source] and covariates [formula missing in source]. If the models P and Q are both linear models, set:
$$Y_P = X_P \beta + \varepsilon_P,$$
$$Y_Q = X_Q \gamma + Z_Q \eta + \varepsilon_Q.$$
If [condition missing in source] and $(E[X_P (X_P)^T])^{-1}$ exists, the model parameters have the following relationship:
$$\beta = (E[X_P (X_P)^T])^{-1} \Lambda_0 \left(E[X_Q (X_Q)^T]\gamma + E[X_Q (Z_Q)^T]\eta\right),$$
where $\Lambda_0$ is [definition missing in source].
In the invention, the laboratory-acquired data are taken as model Q, the spectrum information obtained by whole leaf scanning is X, the spectrum information obtained by broken leaf scanning is Z, and the tobacco chemical component value measured by the continuous flow method is Y. PLS modeling yields the model $Y_Q = X_Q \gamma + Z_Q \eta + \varepsilon_Q$; the parameter $\beta$ is then computed from the formula $\beta = (E[X_P (X_P)^T])^{-1} \Lambda_0 \left(E[X_Q (X_Q)^T]\gamma + E[X_Q (Z_Q)^T]\eta\right)$, which gives the model $Y_P = X_P \beta + \varepsilon_P$. This model is the required prediction model: when predicting the chemical components of tobacco leaves from the near infrared spectrum, only the spectrum information of the whole tobacco leaf needs to be collected and substituted for $X_P$ in model P to obtain the predicted value $Y_P$ of the chemical components of the tobacco leaves.
In the embodiment of the invention, the accuracy of the predicted values obtained in step S69 is evaluated by the internal cross-validation coefficient of determination $R^2$ and the root mean square error of cross-validation (RMSECV). The whole leaf spectra and broken leaf spectra of 300 samples were modeled, and post-transfer NIRS prediction models for total sugar, reducing sugar, total alkaloids, potassium and chlorine were established according to the formula. Internal cross-validation shows that, compared with the results in Table 2, the predicted values correlate better with the corresponding chemical values: the coefficient of determination $R^2$ and the RMSECV of every model are better than before transfer. The results are shown in Table 3.
Table 3 Internal cross-validation results of the near infrared models after transfer
7) Model test: internal cross-validation only shows how well the model's internal predictions agree with the chemical values; to check the model's suitability for unknown samples, external validation must also be performed. 87 tobacco leaf samples that did not take part in modeling were randomly selected, and their whole-leaf spectra were predicted with the post-transfer mathematical model. The results are shown in Table 4.
Table 4 External test results of the near infrared models
As shown in Table 4, in the external test of the component prediction models the average values of total sugar, reducing sugar, total alkaloids, potassium and chlorine were 21.48%, 19.56%, 2.50%, 2.22% and 0.33% respectively, so the samples had normal contents. The average relative errors were 2.68%, 2.77%, 4.39%, 8.95% and 14.42%, and the prediction standard deviations were 3.11, 2.36, 0.62, 0.41 and 0.15. The prediction models therefore have good prediction capability for these 5 chemical components.
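External-test statistics of the kind reported above can be computed as in the following sketch. The source does not give the formulas it used, so both definitions here are assumptions: the average relative error is taken as the mean of |predicted - reference| / |reference| in percent, and the prediction standard deviation is taken as the sample standard deviation of the prediction errors. The function names and the numbers in the usage note are hypothetical and are not data from the patent.

```python
import numpy as np

def mean_relative_error(reference, predicted):
    """Mean of |predicted - reference| / |reference|, expressed in percent (assumed definition)."""
    reference = np.asarray(reference, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(predicted - reference) / np.abs(reference)) * 100.0)

def prediction_error_std(reference, predicted):
    """Sample standard deviation of the prediction errors (assumed definition)."""
    errors = np.asarray(predicted, dtype=float) - np.asarray(reference, dtype=float)
    return float(np.std(errors, ddof=1))
```

For example, with hypothetical reference values of 20 and 25 and predictions of 21 and 24, the relative errors are 5% and 4%, so `mean_relative_error` returns 4.5.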
The test results show that the predictions of the mathematical model established by transfer learning agree closely with the results measured by the continuous flow method. The model is more stable than the previous model, supports direct model transfer without resampling, can be used to analyze the chemical components of tobacco leaves during tobacco production, and meets the need for rapid detection of the conventional chemical components of tobacco leaves.
Practice shows that the method is an effective way to predict tobacco chemical components. Transferring the effective information in the broken leaf spectrum by transfer learning improves the stability of the whole leaf spectrum measurement model and establishes a new prediction model for total sugar, reducing sugar, total alkaloids, potassium and chlorine content. The model predicts new samples with higher accuracy, and the resulting data meet the requirements of tobacco production enterprises and the need for rapid detection of the conventional chemical components of tobacco leaves.
The method provided by the invention differs from traditional model transfer methods. It aims to make full use of the fact that whole leaf collection is easier to carry out than broken leaf collection: the model obtained after transferring the effective information of the broken leaf spectrum improves the accuracy of the whole leaf spectrum measurement model, reduces the workload of collecting new samples and rebuilding the model, and performs well in application.
The foregoing describes only the preferred embodiments of the present application and is not intended to limit it; those skilled in the art may make various modifications and variations. Any modification, equivalent replacement or improvement made within the spirit and principles of the present application shall fall within its scope of protection.

Claims (9)

1. A tobacco near infrared spectrum chemical component model optimization method based on transfer learning, characterized by comprising the following steps:
S1, obtaining a near infrared spectrum of the whole tobacco leaf;
S2, obtaining a near infrared spectrum of the broken tobacco leaves;
S3, drying and grinding the tobacco leaves, and obtaining reference chemical component values by assay on a continuous flow analyzer;
S4, preprocessing the near infrared spectrum data acquired in step S1 and step S2;
S5, establishing chemical component analysis models for whole tobacco leaves and broken tobacco leaves respectively from the near infrared spectrum data of step S1 and step S2;
S6, transferring the broken leaf spectrum information into the whole leaf spectrum model by a correlation-ratio transfer learning method to obtain an optimization model of the tobacco leaf near infrared spectrum chemical components; and detecting the chemical components of the tobacco leaves with the optimization model to obtain predicted values of the chemical components of the tobacco leaves.
2. The optimization method of the tobacco near infrared spectrum chemical composition model based on transfer learning according to claim 1, wherein,
in step S6, the optimized model of the tobacco leaf near infrared spectrum chemical components is obtained by the following steps:
step S61, scanning a tobacco leaf sample with a portable tobacco chemical component analyzer and collecting the whole-leaf spectral data;
step S62, scanning the broken tobacco leaf sample with the portable tobacco chemical component analyzer and collecting the spectral data of the tobacco leaves after breaking;
step S63, determining the total sugar, reducing sugar, total plant alkaloid, potassium and chlorine contents of the flue-cured tobacco leaf samples by the continuous flow method;
step S64, preprocessing the spectral data by applying, in sequence, smoothing, a derivative method and standard normal variate (SNV) transformation, to reduce the influence of noise and systematic errors on the spectra;
step S65, modeling with partial least squares (PLS), whose smaller prediction residual sum of squares gives higher prediction stability;
and step S66, from the model obtained by PLS modeling in step S65, calculating the parameter β of the optimized model according to the corresponding formula of the correlation-ratio transfer learning method, thereby obtaining the required optimized prediction model.
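The preprocessing sequence of step S64 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used by the applicant: the claim names neither a specific smoothing filter nor a derivative order, so the 5-point moving average and the first-order finite-difference derivative are assumptions, as are all function names.

```python
import numpy as np

def smooth(spectra, window=5):
    """Moving-average smoothing along the wavelength axis.

    The odd window length of 5 is an assumed value, not taken from the patent.
    """
    pad = window // 2
    # Repeat the edge values so each smoothed spectrum keeps its length.
    padded = np.pad(spectra, ((0, 0), (pad, pad)), mode="edge")
    kernel = np.ones(window) / window
    return np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="valid"), 1, padded
    )

def first_derivative(spectra):
    """First-order finite-difference derivative along the wavelength axis."""
    return np.gradient(spectra, axis=1)

def snv(spectra):
    """Standard normal variate: center and scale each spectrum individually."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def preprocess(spectra):
    """Apply smoothing, derivative and SNV in the order given in step S64."""
    return snv(first_derivative(smooth(spectra)))

# Synthetic stand-in for measured spectra: 4 samples x 60 wavelength points.
rng = np.random.default_rng(0)
raw = np.cumsum(rng.normal(size=(4, 60)), axis=1)
processed = preprocess(raw)
```

Each row of `spectra` is one sample. After SNV every row has zero mean and unit standard deviation, which removes per-sample offset and scale differences.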
3. The optimization method of the tobacco near infrared spectrum chemical composition model based on transfer learning according to claim 1, wherein,
the predicted values of the chemical components of the tobacco leaves are obtained by the following steps:
step S67, collecting spectrum information of the whole tobacco leaves;
step S68, preprocessing the sample spectrum data obtained in step S67;
and step S69, substituting the data obtained in step S68 into the optimized model of the tobacco leaf near infrared spectrum chemical components to obtain the predicted values of the chemical components of the tobacco leaves.
4. The optimization method of the tobacco near infrared spectrum chemical composition model based on transfer learning according to claim 3, wherein,
in step S66, the transfer learning modeling method is as follows:
starting from the model obtained by PLS modeling, the correlation-ratio transfer learning method is applied. Suppose there are two models P and Q, where $P = (X_P, Y_P)$ and $Q = ((X_Q, Z_Q), Y_Q)$, with response variables [formula missing in source] and covariates [formula missing in source].
If the models P and Q are both linear models, set:
$$Y_P = X_P \beta + \varepsilon_P,$$
$$Y_Q = X_Q \gamma + Z_Q \eta + \varepsilon_Q.$$
If [condition missing in source] and $(E[X_P (X_P)^T])^{-1}$ exists,
the model parameters satisfy the following relationship:
$$\beta = (E[X_P (X_P)^T])^{-1} \Lambda_0 \left(E[X_Q (X_Q)^T]\,\gamma + E[X_Q (Z_Q)^T]\,\eta\right),$$
where [definition of $\Lambda_0$ missing in source]; Q is the model built from laboratory-acquired data, X is the spectrum information obtained by whole-leaf scanning, Z is the spectrum information obtained by broken-leaf scanning, and Y is the tobacco chemical component value measured by the continuous flow method;
the model $Y_Q = X_Q \gamma + Z_Q \eta + \varepsilon_Q$ is obtained by PLS modeling;
the parameter $\beta$ is then calculated from the formula
$$\beta = (E[X_P (X_P)^T])^{-1} \Lambda_0 \left(E[X_Q (X_Q)^T]\,\gamma + E[X_Q (Z_Q)^T]\,\eta\right);$$
the resulting model $Y_P = X_P \beta + \varepsilon_P$ is the required prediction model: when the chemical components of tobacco leaves are predicted by near infrared spectroscopy, only the spectrum information of the whole tobacco leaf needs to be collected and substituted for $X_P$ in model P to obtain the predicted value $Y_P$ of the chemical components of the tobacco leaves.
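The calculation of β in this claim can be illustrated numerically. The sketch below is hedged: it replaces each expectation with a sample second-moment matrix (rows are samples), and because the definition of Λ_0 is not recoverable from the text, it takes Λ_0 as an input and defaults to the identity purely as a placeholder assumption. The function name `transfer_beta` and the synthetic data are hypothetical.

```python
import numpy as np

def transfer_beta(X_P, X_Q, Z_Q, gamma, eta, Lambda0=None):
    """Sample estimate of
    beta = (E[X_P X_P^T])^{-1} Lambda0 (E[X_Q X_Q^T] gamma + E[X_Q Z_Q^T] eta).

    Rows of X_P, X_Q and Z_Q are samples; columns are spectral variables.
    """
    S_PP = X_P.T @ X_P / X_P.shape[0]   # estimates E[X_P X_P^T]
    S_QQ = X_Q.T @ X_Q / X_Q.shape[0]   # estimates E[X_Q X_Q^T]
    S_QZ = X_Q.T @ Z_Q / X_Q.shape[0]   # estimates E[X_Q Z_Q^T]
    if Lambda0 is None:
        # ASSUMPTION: the source does not define Lambda0; identity is a placeholder.
        Lambda0 = np.eye(S_PP.shape[0])
    rhs = Lambda0 @ (S_QQ @ gamma + S_QZ @ eta)
    # Solve S_PP beta = rhs instead of forming the inverse explicitly.
    return np.linalg.solve(S_PP, rhs)

# Synthetic stand-ins: 200 samples, 4 whole-leaf and 3 broken-leaf variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
Z = 0.8 * X[:, :3] + 0.2 * rng.normal(size=(200, 3))
gamma = np.array([1.0, -0.5, 0.3, 0.0])   # hypothetical Q-model coefficients on X
eta = np.array([0.4, 0.1, -0.2])          # hypothetical Q-model coefficients on Z

beta = transfer_beta(X, X, Z, gamma, eta)
y_pred = X @ beta                          # prediction from whole-leaf spectra only
```

When P and Q share the same whole-leaf spectra and Λ_0 is the identity, this estimate coincides with the least-squares regression of $X_Q\gamma + Z_Q\eta$ on the whole-leaf variables alone, which is one way to read how the broken-leaf information is folded into β. The sketch requires the sample version of $E[X_P (X_P)^T]$ to be invertible, which mirrors the existence condition in the claim.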
5. The optimization method of the tobacco near infrared spectrum chemical composition model based on transfer learning according to claim 3, wherein,
the method for obtaining the predicted values of the chemical components of the tobacco leaves further comprises converting the predicted values obtained in step S69, according to a conversion rule, into a form of expression that the user can understand, and outputting it.
6. The optimization method of the tobacco near infrared spectrum chemical composition model based on transfer learning according to claim 3, wherein,
the accuracy evaluation criteria for the predicted values obtained in step S69 include the internal cross-validation coefficient of determination $R^2$ and the internal root mean square error of cross-validation (RMSECV).
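The two criteria named in this claim can be computed as in the following sketch. It is an illustration only: the single-response PLS fit (NIPALS form), the choice of 3 latent components, the 5-fold split and the synthetic data are all assumptions, since the claims specify none of them.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit single-response PLS (PLS1, NIPALS form); returns (coef, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)          # weight vector
        t = Xk @ w                      # scores
        tt = t @ t
        p = Xk.T @ t / tt               # X loadings
        c = (yk @ t) / tt               # y loading
        Xk = Xk - np.outer(t, p)        # deflate X
        yk = yk - c * t                 # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, y_mean - x_mean @ coef

def cross_validate(X, y, n_components=3, n_folds=5, seed=0):
    """Return (R2, RMSECV) from k-fold cross-validated predictions."""
    idx = np.random.default_rng(seed).permutation(len(y))
    y_cv = np.empty(len(y), dtype=float)
    for fold in np.array_split(idx, n_folds):
        train = np.setdiff1d(idx, fold)
        coef, intercept = pls1_fit(X[train], y[train], n_components)
        y_cv[fold] = X[fold] @ coef + intercept
    residual = y - y_cv
    rmsecv = np.sqrt(np.mean(residual ** 2))
    r2 = 1.0 - np.sum(residual ** 2) / np.sum((y - y.mean()) ** 2)
    return r2, rmsecv

# Synthetic spectra driven by 3 latent factors (a stand-in for real data).
rng = np.random.default_rng(0)
T = rng.normal(size=(120, 3))
X = T @ rng.normal(size=(3, 30)) + 0.01 * rng.normal(size=(120, 30))
y = T @ np.array([1.0, -2.0, 0.5]) + 0.05 * rng.normal(size=120)
r2, rmsecv = cross_validate(X, y)
```

A value of $R^2$ close to 1 together with a small RMSECV, expressed in the units of the reference value, indicates that the model predicts held-out samples well.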
7. The optimization method of the tobacco near infrared spectrum chemical composition model based on transfer learning according to claim 1, wherein,
the preprocessing of the spectral data in step S4 comprises smoothing, a derivative method and standard normal variate (SNV) transformation.
8. The optimization method of the tobacco near infrared spectrum chemical composition model based on transfer learning according to claim 1, wherein,
the reference chemical component values comprise total sugar, reducing sugar, total plant alkaloid, total potassium and total chlorine values.
9. The optimization method of the tobacco near infrared spectrum chemical composition model based on transfer learning according to claim 1, wherein,
the near infrared spectrometers used to obtain the near infrared spectra in step S1 and step S2 are YKS-H1 portable tobacco chemical component analyzers.
CN202311709858.0A 2023-12-12 2023-12-12 Tobacco leaf near infrared spectrum chemical component model optimization method based on transfer learning Pending CN117809766A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311709858.0A CN117809766A (en) 2023-12-12 2023-12-12 Tobacco leaf near infrared spectrum chemical component model optimization method based on transfer learning

Publications (1)

Publication Number Publication Date
CN117809766A true CN117809766A (en) 2024-04-02

Family

ID=90424492

Country Status (1)

Country Link
CN (1) CN117809766A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination