CN115983692A - Tobacco leaf moistening and sweet flavor evaluation method and device based on random forest algorithm - Google Patents
- Publication number
- CN115983692A (application CN202211683374.9A)
- Authority
- CN
- China
- Prior art keywords
- sweet
- tobacco leaf
- tobacco
- random forest
- fragrant
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Abstract
The invention discloses a method and device for evaluating the moist-sweet aroma of tobacco leaves based on a random forest algorithm. The method comprises the following steps: acquiring historical tobacco leaf index information; constructing a moist-sweet aroma characteristic model from the historical tobacco leaf index information; and inputting the indexes of the tobacco leaves to be evaluated into the moist-sweet aroma characteristic model to evaluate the quality of the tobacco leaves. The technical scheme of the invention addresses the problem of incomplete tobacco leaf quality evaluation and analysis.
Description
Technical Field
The invention belongs to the technical field of tobacco leaf evaluation, and particularly relates to a method and device for evaluating the moist-sweet aroma of tobacco leaves based on a random forest algorithm.
Background
Tobacco leaves are the raw material of the cigarette industry, and their style characteristics determine the inherent quality of cigarettes. Because of objective constraints such as climate, ecology and geography, the style characteristics of tobacco leaves are often not distinctive enough, and such leaves are not favored by the cigarette industry. The unique geographical position and natural ecological conditions of the Liangshan Yi Autonomous Prefecture in southwestern Sichuan Province provide the preconditions for producing high-quality tobacco leaves. The Liangshan tobacco area is one of the most suitable tobacco-producing areas in China, and its annual tobacco output accounts for 70 percent of the annual output of Sichuan Province. At present, several practical problems restrict quality improvement in flue-cured tobacco production in the Liangshan tobacco area. One of them is that the quality and style characteristics of Liangshan tobacco leaves are not clearly defined and need to be clearly positioned. The 'moist-sweet aroma' quality style is a new style characteristic developed, against the background of the high-quality development of Sichuan cigarettes, to meet the needs of building 'moist-sweet aroma' cigarettes on the raw material base of the Liangshan tobacco area.
Conventional tobacco quality evaluation usually relies only on simple regression analysis, principal component analysis and cluster analysis. Because few indexes are analyzed, the evaluation and analysis of tobacco quality is incomplete.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a method and device for evaluating the moist-sweet aroma of tobacco leaves based on a random forest algorithm, so as to address the problem of incomplete tobacco leaf quality evaluation and analysis.
To achieve this purpose, the invention adopts the following technical scheme:
A tobacco leaf moist-sweet aroma evaluation method based on a random forest algorithm comprises the following steps:
S1, acquiring historical tobacco leaf index information;
S2, constructing a moist-sweet aroma characteristic model according to the historical tobacco leaf index information;
S3, inputting the indexes of the tobacco leaves to be evaluated into the moist-sweet aroma characteristic model, and evaluating the quality of the tobacco leaves.
Preferably, in step S2, the moist-sweet aroma characteristic model is constructed by using a random forest algorithm; the moist-sweet aroma characteristic model comprises: a sweet aroma characteristic regression model, a moist feeling characteristic regression model and a moist-sweet aroma total score regression model.
Preferably, in step S1, the historical tobacco leaf index information includes: tobacco leaf appearance information and tobacco leaf chemical composition.
Preferably, the tobacco leaf appearance information includes: color, maturity, oil content, chroma, leaf structure and body; the tobacco leaf chemical composition includes: total sugar, reducing sugar, total nitrogen, total nicotine, total potassium, water-soluble chlorine, potassium-chlorine ratio, sugar-to-nicotine ratio, nitrogen-to-nicotine ratio and starch.
The invention also provides a tobacco leaf moist-sweet aroma evaluation device based on the random forest algorithm, which comprises:
an acquisition module, used for acquiring historical tobacco leaf index information;
a building module, used for building a moist-sweet aroma characteristic model according to the historical tobacco leaf index information;
an evaluation module, used for inputting the indexes of the tobacco leaves to be evaluated into the moist-sweet aroma characteristic model to evaluate the quality of the tobacco leaves.
Preferably, the building module builds the moist-sweet aroma characteristic model by using a random forest algorithm; the moist-sweet aroma characteristic model comprises: a sweet aroma characteristic regression model, a moist feeling characteristic regression model and a moist-sweet aroma total score regression model.
Preferably, the historical tobacco leaf index information includes: tobacco leaf appearance information and tobacco leaf chemical composition.
Preferably, the tobacco leaf appearance information includes: color, maturity, oil content, chroma, leaf structure and body; the tobacco leaf chemical composition includes: total sugar, reducing sugar, total nitrogen, total nicotine, total potassium, water-soluble chlorine, potassium-chlorine ratio, sugar-to-nicotine ratio, nitrogen-to-nicotine ratio and starch.
The invention establishes the relation among the appearance, chemical composition and sensory quality of tobacco leaves, and builds a 'moist-sweet aroma' quality evaluation model by using a random forest algorithm. The model takes 16 indexes of the appearance quality and chemical composition of flue-cured tobacco leaves as predictor variables and builds random forest regressions for the sweet aroma characteristic, the moist feeling characteristic and the moist-sweet aroma total score, so as to address the problem of incomplete tobacco leaf quality evaluation and analysis.
Drawings
To illustrate the technical solution of the present invention more clearly, the drawings used in the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a flow chart of a tobacco leaf moist-sweet aroma evaluation method based on a random forest algorithm according to an embodiment of the invention;
FIG. 2 is a schematic diagram of a tobacco leaf moist-sweet aroma evaluation device based on a random forest algorithm according to an embodiment of the invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
Example 1:
As shown in FIG. 1, the invention provides a tobacco leaf moist-sweet aroma evaluation method based on a random forest algorithm, which comprises the following steps:
S1, acquiring historical tobacco leaf index information;
S2, constructing a moist-sweet aroma characteristic model according to the historical tobacco leaf index information;
S3, inputting the indexes of the tobacco leaves to be evaluated into the moist-sweet aroma characteristic model, and evaluating the quality of the tobacco leaves.
As an implementation of the embodiment of the present invention, in step S1, the historical tobacco leaf index information includes: tobacco leaf appearance information and tobacco leaf chemical composition. The tobacco leaf appearance information includes: color, maturity, oil content, chroma, leaf structure and body; the tobacco leaf chemical composition includes: total sugar, reducing sugar, total nitrogen, total nicotine, total potassium, water-soluble chlorine, potassium-chlorine ratio, sugar-to-nicotine ratio, nitrogen-to-nicotine ratio and starch. The chemical composition of the tobacco leaves is measured with a Fourier transform near-infrared spectrometer.
Further, in step S1, missing values in the historical tobacco leaf index information are handled by deletion, and a na.
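The deletion of missing values can be illustrated with a minimal Python sketch. It assumes that "deletion" means dropping every sample that has at least one missing index (listwise deletion); the function name `drop_incomplete` and the field names are hypothetical and do not come from the patent.

```python
import math

def drop_incomplete(records):
    """Return only the records in which no field is None or NaN."""
    def is_missing(value):
        return value is None or (isinstance(value, float) and math.isnan(value))
    return [r for r in records if not any(is_missing(v) for v in r.values())]

# Toy records; the keys and values are illustrative assumptions.
records = [
    {"total_sugar": 28.1, "starch": 4.2},
    {"total_sugar": None, "starch": 3.9},          # missing total sugar
    {"total_sugar": 30.5, "starch": float("nan")}, # missing starch
    {"total_sugar": 26.7, "starch": 5.0},
]
complete = drop_incomplete(records)  # keeps the first and last record
```

Dropping whole samples keeps the remaining data internally consistent, at the cost of discarding the other measurements in any incomplete sample.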
As an implementation of the embodiment of the present invention, in step S2, the moist-sweet aroma characteristic model is constructed by using a random forest algorithm. The moist-sweet aroma characteristic model comprises: a sweet aroma characteristic regression model, a moist feeling characteristic regression model and a moist-sweet aroma total score regression model.
The embodiment of the invention establishes the relation among the appearance, chemical composition and sensory quality of tobacco leaves, and builds a 'moist-sweet aroma' quality evaluation model by using a random forest algorithm. The specific steps are as follows. The moist-sweet aroma characteristic model is constructed with a random forest algorithm, taking the sweet aroma characteristic, the moist feeling characteristic and the moist-sweet aroma total score of the tobacco leaves in turn as the dependent variable, and taking the appearance indexes (color, maturity, leaf structure, body, oil content and chroma) and the chemical indexes (total sugar, reducing sugar, total nicotine, sugar-to-nicotine ratio, potassium, chlorine, potassium-chlorine ratio, total nitrogen, nitrogen-to-nicotine ratio and starch) as the independent variables. The moist-sweet aroma characteristics are defined from the sensory evaluation data of the tobacco leaves: the scores of the indexes that express the aroma and taste characteristics (aroma quality, aroma quantity, miscellaneous gas and aftertaste) are added to define the sweet aroma characteristic; the scores of the indexes that express the smoke and throat-moistening characteristics (irritation, softness, fineness, round moist feeling and dry feeling) are added to define the moist feeling characteristic; and the sweet aroma characteristic score and the moist feeling characteristic score are added to define the moist-sweet aroma total score. The random forest (RF) algorithm is an ensemble supervised learning method and is regarded as an extension of the decision tree.
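The three target scores defined above are plain sums of sensory index scores, which can be sketched as follows. This is a minimal illustration: the dictionary keys, the function name `sensory_targets` and the example scores are assumptions, not values from the patent.

```python
# Hypothetical key names for the sensory indexes listed in the text.
SWEET_AROMA_ITEMS = ("aroma_quality", "aroma_quantity", "miscellaneous_gas", "aftertaste")
MOIST_FEELING_ITEMS = ("irritation", "softness", "fineness",
                       "round_moist_feeling", "dry_feeling")

def sensory_targets(scores):
    """Build the three regression targets from one sample's sensory scores."""
    sweet = sum(scores[k] for k in SWEET_AROMA_ITEMS)
    moist = sum(scores[k] for k in MOIST_FEELING_ITEMS)
    return {"sweet_aroma": sweet, "moist_feeling": moist, "total": sweet + moist}

# Illustrative panel scores for a single tobacco sample.
sample = {
    "aroma_quality": 7, "aroma_quantity": 6, "miscellaneous_gas": 6, "aftertaste": 7,
    "irritation": 6, "softness": 7, "fineness": 6,
    "round_moist_feeling": 7, "dry_feeling": 6,
}
targets = sensory_targets(sample)
```

Each of the three values then serves as the dependent variable of its own regression model.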
In bagging-based ensemble learning for regression, the sample data set is randomly resampled by bootstrap: samples are drawn repeatedly at random with replacement to obtain multiple sample sets; a decision tree is grown on each sample set through node splitting with randomly drawn feature variables, and the trees together form a forest; the predictions of the individual decision trees are averaged to give the final regression prediction for a sample. The random forest automatically finds the optimal split point in each decision tree. In each round of sampling, about one third of the samples are not selected; these are the out-of-bag (OOB) data, and they are used to compute the out-of-bag error, which replaces cross-validation as the criterion for verifying the generalization ability of the model, so that the random forest can effectively avoid overfitting. In addition, the random forest model requires each node of a decision tree to consider only a subset of the predictor variables, so that more weakly correlated feature variables can take part in building the decision trees and the resulting trees are more reliable.
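The "about one third" figure for out-of-bag samples follows from sampling with replacement: a given sample is missed by all n draws with probability (1 - 1/n)^n, which approaches 1/e (roughly 0.368) as n grows. The sketch below checks this by simulation; the function names and the sample size are illustrative assumptions.

```python
import random

def bootstrap_indices(n, rng):
    """Draw n sample indices with replacement."""
    return [rng.randrange(n) for _ in range(n)]

def oob_fraction(n, n_trees, seed=0):
    """Average share of samples left out of a bootstrap sample (out-of-bag)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trees):
        in_bag = set(bootstrap_indices(n, rng))
        total += (n - len(in_bag)) / n
    return total / n_trees

frac = oob_fraction(n=1000, n_trees=200)  # simulated out-of-bag share
expected = (1 - 1 / 1000) ** 1000         # closed-form value for n = 1000
```

Because every tree leaves out a different set of samples, each sample can be predicted by the trees that never saw it, which is what makes the out-of-bag error usable as a validation estimate.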
Further, the random forest is trained on the 16 features in the historical tobacco leaf index information and comprises 500 decision trees, with the number of features used in each decision tree set to 16. Specifically: a sample subset is drawn from the training data set (the historical tobacco leaf index information) by bootstrap sampling; 16 features are randomly selected from all features in the subset, and a decision tree is trained on them with a decision tree algorithm; this yields a random forest model RF containing 500 decision tree models, and the average of the outputs of all decision trees is taken as the predicted value.
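The procedure of bootstrap sampling, random feature selection and averaging can be sketched in a deliberately simplified form. The sketch below is an assumption-laden toy, not the patent's implementation: each "tree" is a single-split regression stump rather than a full decision tree, and the data, the number of trees and the number of features per tree are made up for illustration.

```python
import random
from statistics import mean

def fit_stump(X, y, features):
    """Fit a one-split regression tree using only the given feature indices."""
    best = None  # (sse, feature, threshold, left_mean, right_mean)
    for f in features:
        for t in sorted({row[f] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[f] <= t]
            right = [yi for row, yi in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            lm, rm = mean(left), mean(right)
            sse = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, f, t, lm, rm)
    if best is None:  # no valid split: fall back to a constant prediction
        m = mean(y)
        return lambda row: m
    _, f, t, lm, rm = best
    return lambda row: lm if row[f] <= t else rm

def fit_forest(X, y, n_trees, n_features, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]    # bootstrap sample
        feats = rng.sample(range(p), n_features)      # random feature subset
        trees.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return trees

def predict(trees, row):
    """Average the predictions of all trees."""
    return mean(tree(row) for tree in trees)

# Toy data: three features that all increase with the target.
X = [[1.0, 0.1, 10.0], [2.0, 0.2, 20.0], [3.0, 0.3, 30.0],
     [4.0, 0.4, 40.0], [5.0, 0.5, 50.0], [6.0, 0.6, 60.0]]
y = [10.0, 11.0, 12.0, 20.0, 21.0, 22.0]
trees = fit_forest(X, y, n_trees=50, n_features=2)
pred_low = predict(trees, X[0])
pred_high = predict(trees, X[-1])
```

In the setting described above, `n_trees` would be 500 and `n_features` would be 16; the small values here only keep the toy example fast.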
Further, the parameters of the moist-sweet aroma characteristic model are optimized: the tuneRF() function is used to search the mtry parameter separately for the sweet aroma characteristic, moist feeling characteristic and moist-sweet aroma total score regression models, to find the optimal modeling value of mtry, that is, the number of variables considered for the binary split at each node.
The parameters of the sweet aroma characteristic regression model are optimized as follows: to train a better model, an mtry parameter search is performed on the sweet aroma characteristic regression model with the tuneRF() function. As mtry increases, the OOB error first rises slowly and then rises sharply at mtry = 5; the OOB error is smallest at mtry = 4, which shows that a sweet aroma characteristic regression model with better prediction precision can be built with mtry = 4.
The parameters of the moist feeling characteristic regression model are optimized as follows: to train a better model, an mtry parameter search is performed on the moist feeling characteristic regression model with the tuneRF() function. As mtry increases, the OOB error first rises rapidly and then falls; the OOB error is smallest at mtry = 4, which shows that a moist feeling characteristic regression model with better prediction precision can be built with mtry = 4.
The parameters of the moist-sweet aroma total score regression model are optimized as follows: as mtry increases, the OOB error first falls and then rises; the OOB error is smallest at mtry = 5, which shows that a moist-sweet aroma total score regression model with better prediction precision can be built with mtry = 5.
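The selection rule used in all three searches is the same: among the candidate mtry values, keep the one with the smallest OOB error. A minimal sketch of that rule follows; it does not reproduce the tuneRF() search itself, and the error values are invented purely for illustration.

```python
def best_mtry(oob_error_by_mtry):
    """Return the candidate mtry whose OOB error is smallest."""
    return min(oob_error_by_mtry, key=oob_error_by_mtry.get)

# Hypothetical OOB errors per candidate mtry (not results from the patent).
oob_errors = {2: 0.52, 3: 0.47, 4: 0.41, 5: 0.58, 8: 0.63}
chosen = best_mtry(oob_errors)
```

Running the rule once per target gives each of the three regression models its own mtry value.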
Further, the importance of the modeling feature variables is evaluated: the varImpPlot() function is used to rank the importance of each feature variable in the 3 random forest regression models according to the contribution of the variable to the model. The sweet aroma characteristic, moist feeling characteristic and moist-sweet aroma total score regression models are verified by ten-fold cross-validation, the relation between model error and the number of fitted feature variables is analyzed, and predictor variables are kept or discarded accordingly.
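Ten-fold cross-validation splits the samples into ten folds and uses each fold once as the test set while training on the other nine. The index bookkeeping can be sketched as below; the function name `k_fold_indices` and the sample count are assumptions, and the model fitting inside each fold is left out.

```python
import random

def k_fold_indices(n, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k folds of nearly equal size
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test

splits = list(k_fold_indices(25, k=10))
```

Repeating the procedure with different numbers of predictor variables gives the error-versus-variable-count relation mentioned above.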
In the embodiment of the invention, the evaluation of the 'moist-sweet aroma' quality and style characteristics of flue-cured tobacco is based on random forest models, which take 16 indexes of the appearance quality and chemical composition of flue-cured tobacco leaves as predictor variables to build random forest regressions for the sweet aroma characteristic, the moist feeling characteristic and the moist-sweet aroma total score. The results show that the random forest models built for the sweet aroma characteristic, the moist feeling characteristic and the moist-sweet aroma total score have high precision, with regression R² between 0.91 and 0.93, which indicates that the 'moist-sweet aroma' style characteristics of tobacco leaves are closely related to the appearance quality and chemical composition indexes of the leaves. After parameter optimization of the sweet aroma characteristic and moist feeling characteristic regression models, the root mean square error is reduced. The test results show that the random forest regression models for the sweet aroma characteristic, the moist feeling characteristic and the moist-sweet aroma total score exhibit good randomness and strong resistance to overfitting, thereby providing technical support for evaluating the quality and style characteristics of flue-cured tobacco in the Liangshan tobacco area.
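The two accuracy measures mentioned above can be computed as in the following sketch. It assumes that R² denotes the usual coefficient of determination, which the text does not state explicitly; the toy values are illustrative only.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Toy observed and predicted scores (not data from the patent).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]
r2 = r_squared(y_true, y_pred)
err = rmse(y_true, y_pred)
```

A higher R² and a lower RMSE both indicate that predictions track the sensory scores more closely.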
Example 2:
As shown in FIG. 2, the invention also provides a tobacco leaf moist-sweet aroma evaluation device based on a random forest algorithm, which comprises:
an acquisition module, used for acquiring historical tobacco leaf index information;
a building module, used for building a moist-sweet aroma characteristic model according to the historical tobacco leaf index information;
an evaluation module, used for inputting the indexes of the tobacco leaves to be evaluated into the moist-sweet aroma characteristic model to evaluate the quality of the tobacco leaves.
As an implementation of the embodiment of the present invention, the building module builds the moist-sweet aroma characteristic model by using a random forest algorithm; the moist-sweet aroma characteristic model comprises: a sweet aroma characteristic regression model, a moist feeling characteristic regression model and a moist-sweet aroma total score regression model.
As an implementation of the embodiment of the present invention, the historical tobacco leaf index information includes: tobacco leaf appearance information and tobacco leaf chemical composition.
As an implementation of the embodiment of the present invention, the tobacco leaf appearance information includes: color, maturity, oil content, chroma, leaf structure and body; the tobacco leaf chemical composition includes: total sugar, reducing sugar, total nitrogen, total nicotine, total potassium, water-soluble chlorine, potassium-chlorine ratio, sugar-to-nicotine ratio, nitrogen-to-nicotine ratio and starch.
The above-described embodiments are merely illustrative of the preferred embodiments of the present invention, and do not limit the scope of the present invention, and various modifications and improvements of the technical solutions of the present invention can be made by those skilled in the art without departing from the spirit of the present invention, and the technical solutions of the present invention are within the scope of the present invention defined by the claims.
Claims (8)
1. A tobacco leaf moist-sweet aroma evaluation method based on a random forest algorithm, characterized by comprising the following steps:
S1, acquiring historical tobacco leaf index information;
S2, constructing a moist-sweet aroma characteristic model according to the historical tobacco leaf index information;
S3, inputting the indexes of the tobacco leaves to be evaluated into the moist-sweet aroma characteristic model, and evaluating the quality of the tobacco leaves.
2. The tobacco leaf moist-sweet aroma evaluation method based on the random forest algorithm according to claim 1, wherein in step S2, the moist-sweet aroma characteristic model is constructed by using the random forest algorithm; the moist-sweet aroma characteristic model comprises: a sweet aroma characteristic regression model, a moist feeling characteristic regression model and a moist-sweet aroma total score regression model.
3. The tobacco leaf moist-sweet aroma evaluation method based on the random forest algorithm according to claim 2, wherein in step S1, the historical tobacco leaf index information includes: tobacco leaf appearance information and tobacco leaf chemical composition.
4. The tobacco leaf moist-sweet aroma evaluation method based on the random forest algorithm according to claim 3, wherein the tobacco leaf appearance information includes: color, maturity, oil content, chroma, leaf structure and body; the tobacco leaf chemical composition includes: total sugar, reducing sugar, total nitrogen, total nicotine, total potassium, water-soluble chlorine, potassium-chlorine ratio, sugar-to-nicotine ratio, nitrogen-to-nicotine ratio and starch.
5. A tobacco leaf moist-sweet aroma evaluation device based on a random forest algorithm, characterized by comprising:
an acquisition module, used for acquiring historical tobacco leaf index information;
a building module, used for building a moist-sweet aroma characteristic model according to the historical tobacco leaf index information;
an evaluation module, used for inputting the indexes of the tobacco leaves to be evaluated into the moist-sweet aroma characteristic model to evaluate the quality of the tobacco leaves.
6. The tobacco leaf moist-sweet aroma evaluation device based on the random forest algorithm according to claim 5, wherein the building module builds the moist-sweet aroma characteristic model by using a random forest algorithm; the moist-sweet aroma characteristic model comprises: a sweet aroma characteristic regression model, a moist feeling characteristic regression model and a moist-sweet aroma total score regression model.
7. The tobacco leaf moist-sweet aroma evaluation device based on the random forest algorithm according to claim 6, wherein the historical tobacco leaf index information includes: tobacco leaf appearance information and tobacco leaf chemical composition.
8. The tobacco leaf moist-sweet aroma evaluation device based on the random forest algorithm according to claim 7, wherein the tobacco leaf appearance information includes: color, maturity, oil content, chroma, leaf structure and body; the tobacco leaf chemical composition includes: total sugar, reducing sugar, total nitrogen, total nicotine, total potassium, water-soluble chlorine, potassium-chlorine ratio, sugar-to-nicotine ratio, nitrogen-to-nicotine ratio and starch.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211683374.9A CN115983692A (en) | 2022-12-27 | 2022-12-27 | Tobacco leaf moistening and sweet flavor evaluation method and device based on random forest algorithm |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211683374.9A CN115983692A (en) | 2022-12-27 | 2022-12-27 | Tobacco leaf moistening and sweet flavor evaluation method and device based on random forest algorithm |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115983692A true CN115983692A (en) | 2023-04-18 |
Family
ID=85969553
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211683374.9A Pending CN115983692A (en) | 2022-12-27 | 2022-12-27 | Tobacco leaf moistening and sweet flavor evaluation method and device based on random forest algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115983692A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117740727A (en) * | 2024-02-19 | 2024-03-22 | 南京信息工程大学 | Textile component quantitative inversion method based on infrared hyperspectrum |
CN117740727B (en) * | 2024-02-19 | 2024-05-14 | 南京信息工程大学 | Textile component quantitative inversion method based on infrared hyperspectrum |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104931430B (en) | A kind of redried natural alcoholization quality evaluation and model building method | |
CN101419207B (en) | Method for predicting main index of flue-cured tobacco flume | |
CN107796782B (en) | Redrying quality stability evaluation method based on tobacco leaf characteristic spectrum consistency measurement | |
CN101419209A (en) | Cigarette sensing appraise and flume index immune neural net prediction method | |
CN106529584A (en) | Flue-cured tobacco aroma type and quality judgment intelligent evaluation method | |
CN102692488A (en) | Jinhua ham grading and identifying method based on electronic nose technology | |
CN105445421A (en) | Method for predicting sensory quality in lamina alcoholization process via appearance indexes | |
CN108414471B (en) | Method for distinguishing sensory characterization information based on near infrared spectrum and sensory evaluation mutual information | |
CN1115112C (en) | Method for creating fuzzy-neural network expert system for evaluating sensing quality of cigarette | |
CN115983692A (en) | Tobacco leaf moistening and sweet flavor evaluation method and device based on random forest algorithm | |
CN101419454B (en) | Cigarette recipe maintenance method based on artificial immunity method | |
CN114689746B (en) | Method, device, electronic equipment and medium for screening tobacco extract characteristics | |
CN116519874A (en) | Heated cigarette style sensory evaluation method | |
CN115859784A (en) | Method for establishing production process parameter and cigarette sensory quality characteristic correlation model | |
CN108520276A (en) | A kind of interior characterizing method in aesthetic quality of raw tobacco material | |
CN113907407B (en) | Method for migrating style characteristics of tobacco extract | |
CN114595365A (en) | Method and device for constructing cigarette feature relevance, electronic equipment and medium | |
CN113762775B (en) | Tobacco leaf sweet feeling evaluation method based on total sugar content | |
CN112485372A (en) | Method for evaluating miscellaneous gas in flue gas | |
CN116660458A (en) | Cigar raw material sensory quality prediction method based on BP neural network | |
CN106290725B (en) | A kind of quantitative judgement method of cured tobacco leaf giving off a strong fragrance odor type | |
CN114965815B (en) | Method for classifying and identifying aroma-added cigarette paper based on chemometrics-sensory group | |
CN117495161A (en) | Construction method of sensory evaluation index prediction model of flue-cured tobacco leaves | |
CN115868656A (en) | Tobacco leaf group formula imitation design method based on tobacco leaf substitution | |
CN116380825A (en) | Method for detecting quality similarity of tobacco essence |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||