CN115983692A - Tobacco leaf moistening and sweet flavor evaluation method and device based on random forest algorithm - Google Patents

Tobacco leaf moistening and sweet flavor evaluation method and device based on random forest algorithm

Info

Publication number
CN115983692A
Authority
CN
China
Prior art keywords
sweet
tobacco leaf
tobacco
random forest
fragrant
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211683374.9A
Other languages
Chinese (zh)
Inventor
许嘉阳
陈征
段旺军
杨杰
梁郅哲
贾玮
许自成
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Tobacco Sichuan Industrial Co Ltd
Henan Agricultural University
Original Assignee
China Tobacco Sichuan Industrial Co Ltd
Henan Agricultural University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Tobacco Sichuan Industrial Co Ltd, Henan Agricultural University
Priority to CN202211683374.9A
Publication of CN115983692A

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30: Computing systems specially adapted for manufacturing

Abstract

The invention discloses a method and device for evaluating the moist-sweet aroma of tobacco leaves based on a random forest algorithm. The method comprises the following steps: acquiring historical tobacco leaf index information; constructing a moist-sweet aroma characteristic model from the historical tobacco leaf index information; and inputting the indexes of the tobacco leaves to be evaluated into the moist-sweet aroma characteristic model to evaluate the quality of the tobacco leaves. The technical scheme of the invention addresses the problem that tobacco leaf quality evaluation and analysis are incomplete.

Description

Tobacco leaf moistening and sweet flavor evaluation method and device based on random forest algorithm
Technical Field
The invention belongs to the technical field of tobacco leaf evaluation, and particularly relates to a method and device for evaluating the moist-sweet aroma of tobacco leaves based on a random forest algorithm.
Background
Tobacco leaves are the raw material of the cigarette industry, and their style and characteristics determine the inherent quality of cigarettes. Where limited by objective factors such as climate, ecology and geography, tobacco leaves lack distinctive style characteristics and are not favored by the cigarette industry. The unique geographical position and natural ecological conditions of the Liangshan Yi Autonomous Prefecture in southwestern Sichuan Province provide the prerequisites for producing high-quality tobacco leaves. The Liangshan tobacco area is one of the most suitable tobacco-producing areas in China, and its total annual tobacco output accounts for 70 percent of the annual output of Sichuan Province. At present, several practical problems restrict the improvement of tobacco leaf quality in flue-cured tobacco production in the Liangshan tobacco area. One of them is that the quality and style characteristics of the tobacco leaves in this area are not clearly defined and need to be clearly positioned. The 'moist-sweet aroma' quality style is a new style characteristic developed on the raw-material base of the Liangshan tobacco area to meet the needs of 'moist-sweet aroma' cigarette development against the background of high-quality cigarette development in Sichuan.
Conventional tobacco quality evaluation usually relies only on simple regression analysis, principal component analysis and cluster analysis. Because few indexes are analyzed, the resulting evaluation and analysis of tobacco quality is incomplete.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a method and device for evaluating the moist-sweet aroma of tobacco leaves based on a random forest algorithm, so as to address the incompleteness of tobacco leaf quality evaluation and analysis.
To achieve this purpose, the invention adopts the following technical scheme:
A tobacco leaf moist-sweet aroma evaluation method based on a random forest algorithm comprises the following steps:
S1, acquiring historical tobacco leaf index information;
S2, constructing a moist-sweet aroma characteristic model from the historical tobacco leaf index information;
S3, inputting the indexes of the tobacco leaves to be evaluated into the moist-sweet aroma characteristic model, and evaluating the quality of the tobacco leaves.
Preferably, in step S2, the moist-sweet aroma characteristic model is constructed by using a random forest algorithm; the moist-sweet aroma characteristic model comprises: a sweet aroma characteristic regression model, a moist feeling characteristic regression model and a moist-sweet aroma total score regression model.
Preferably, in step S1, the historical tobacco leaf index information includes: tobacco leaf appearance information and tobacco leaf chemical components.
Preferably, the tobacco leaf appearance information includes: color, maturity, oil content, chroma, leaf structure and body; the tobacco leaf chemical components include: total sugar, reducing sugar, total nitrogen, total nicotine, total potassium, water-soluble chlorine, potassium-to-chlorine ratio, sugar-to-nicotine ratio, nitrogen-to-nicotine ratio and starch.
The invention also provides a tobacco leaf moist-sweet aroma evaluation device based on a random forest algorithm, which comprises:
an acquisition module for acquiring historical tobacco leaf index information;
a building module for constructing a moist-sweet aroma characteristic model from the historical tobacco leaf index information;
an evaluation module for inputting the indexes of the tobacco leaves to be evaluated into the moist-sweet aroma characteristic model to evaluate the quality of the tobacco leaves.
Preferably, the building module constructs the moist-sweet aroma characteristic model by using a random forest algorithm; the moist-sweet aroma characteristic model comprises: a sweet aroma characteristic regression model, a moist feeling characteristic regression model and a moist-sweet aroma total score regression model.
Preferably, the historical tobacco leaf index information includes: tobacco leaf appearance information and tobacco leaf chemical components.
Preferably, the tobacco leaf appearance information includes: color, maturity, oil content, chroma, leaf structure and body; the tobacco leaf chemical components include: total sugar, reducing sugar, total nitrogen, total nicotine, total potassium, water-soluble chlorine, potassium-to-chlorine ratio, sugar-to-nicotine ratio, nitrogen-to-nicotine ratio and starch.
The invention establishes the relationship among the appearance, chemical components and sensory quality of tobacco leaves and builds a 'moist-sweet aroma' quality evaluation model with a random forest algorithm. The model takes 16 indexes of the appearance quality and chemical components of flue-cured tobacco leaves as predictor variables and fits random forest regressions for the sweet aroma characteristic, the moist feeling characteristic and the moist-sweet aroma total score, so as to address the incompleteness of tobacco leaf quality evaluation and analysis.
Drawings
To illustrate the technical solution of the invention more clearly, the drawings used in the embodiments are briefly introduced below. The drawings described below show only some embodiments of the invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of a tobacco leaf moist-sweet aroma evaluation method based on a random forest algorithm according to an embodiment of the invention;
FIG. 2 is a flow chart of a tobacco leaf moist-sweet aroma evaluation device based on a random forest algorithm according to an embodiment of the invention.
Detailed Description
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of the invention, not all of them. All other embodiments that a person skilled in the art can obtain from these embodiments without creative effort fall within the scope of protection of the invention.
To make the above objects, features and advantages of the invention easier to understand, the embodiments are described in further detail below with reference to the figures.
Example 1:
As shown in FIG. 1, the invention provides a tobacco leaf moist-sweet aroma evaluation method based on a random forest algorithm, which comprises the following steps:
S1, acquiring historical tobacco leaf index information;
S2, constructing a moist-sweet aroma characteristic model from the historical tobacco leaf index information;
S3, inputting the indexes of the tobacco leaves to be evaluated into the moist-sweet aroma characteristic model, and evaluating the quality of the tobacco leaves.
As an implementation of the embodiment of the invention, in step S1, the historical tobacco leaf index information includes: tobacco leaf appearance information and tobacco leaf chemical components. The tobacco leaf appearance information comprises: color, maturity, oil content, chroma, leaf structure and body; the tobacco leaf chemical components comprise: total sugar, reducing sugar, total nitrogen, total nicotine, total potassium, water-soluble chlorine, potassium-to-chlorine ratio, sugar-to-nicotine ratio, nitrogen-to-nicotine ratio and starch. The chemical components of the tobacco leaves are measured with a Fourier transform near-infrared spectrometer.
Further, in step S1, a deletion process is performed on the missing value of the historical tobacco leaf index information, and a na.
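The sentence above is truncated in the source, so the exact routine used is not known. As a minimal sketch, assuming that the deletion process means dropping every record that has at least one missing index value (listwise deletion), the step could look like the following Python. The function name, field names and values are hypothetical and serve only as illustration.

```python
import math

def drop_incomplete(records):
    """Keep only the records in which every index value is present."""
    def is_missing(value):
        # Treat both None and NaN as missing (an assumption of this sketch).
        return value is None or (isinstance(value, float) and math.isnan(value))
    return [rec for rec in records if not any(is_missing(v) for v in rec.values())]

# Hypothetical records; names and numbers are illustrative, not from the patent.
samples = [
    {"total_sugar": 28.1, "starch": 4.2, "maturity": 8.0},
    {"total_sugar": None, "starch": 3.9, "maturity": 7.5},
    {"total_sugar": 30.4, "starch": float("nan"), "maturity": 8.5},
]
clean = drop_incomplete(samples)
print(len(clean))  # 1
```

Dropping whole records keeps the 16 predictors aligned for every remaining sample, at the cost of a smaller training set.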
As an implementation of the embodiment of the invention, in step S2, the moist-sweet aroma characteristic model is constructed by using a random forest algorithm. The moist-sweet aroma characteristic model comprises: a sweet aroma characteristic regression model, a moist feeling characteristic regression model and a moist-sweet aroma total score regression model.
The embodiment of the invention establishes the relationship among the appearance, chemical components and sensory quality of tobacco leaves and builds a 'moist-sweet aroma' quality evaluation model with a random forest algorithm. The random forest (RF) algorithm is an ensemble supervised learning method and can be regarded as an extension of the decision tree. The specific steps are as follows. The moist-sweet aroma characteristic model is constructed with the random forest algorithm, taking the sweet aroma characteristic, the moist feeling characteristic and the moist-sweet aroma total score of the tobacco leaves in turn as the dependent variable, and taking the appearance indexes (color, maturity, leaf structure, body, oil content and chroma) and the chemical indexes (total sugar, reducing sugar, total nicotine, sugar-to-nicotine ratio, potassium, chlorine, potassium-to-chlorine ratio, total nitrogen, nitrogen-to-nicotine ratio and starch) as the independent variables. The moist-sweet aroma characteristics are defined from the sensory evaluation data of the tobacco leaves: the scores of the indexes expressing aroma and taste characteristics (aroma quality, aroma quantity, offensive odor and aftertaste) are added to give the sweet aroma characteristic; the scores of the indexes expressing the smoke and throat-moistening characteristics (irritation, softness, fineness, mellowness and dryness) are added to give the moist feeling characteristic; and the sweet aroma characteristic score and the moist feeling characteristic score are added to give the moist-sweet aroma total score.
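The three response variables defined in this embodiment are plain sums of sensory scores, which the following Python sketch makes explicit. The dictionary keys are English renderings chosen for this example, and the panel scores are hypothetical; the source does not state the scoring scale.

```python
# Sensory items that are summed into each characteristic (per the text above).
SWEET_AROMA_ITEMS = ("aroma_quality", "aroma_quantity", "offensive_odor", "aftertaste")
MOIST_FEELING_ITEMS = ("irritation", "softness", "fineness", "mellowness", "dryness")

def moist_sweet_scores(sensory):
    """Return the sweet aroma score, the moist feeling score and their total."""
    sweet = sum(sensory[item] for item in SWEET_AROMA_ITEMS)
    moist = sum(sensory[item] for item in MOIST_FEELING_ITEMS)
    return {"sweet_aroma": sweet, "moist_feeling": moist, "total": sweet + moist}

# Hypothetical panel scores for one tobacco sample (illustrative values only).
panel = {
    "aroma_quality": 7, "aroma_quantity": 6, "offensive_odor": 7, "aftertaste": 6,
    "irritation": 7, "softness": 6, "fineness": 7, "mellowness": 6, "dryness": 7,
}
scores = moist_sweet_scores(panel)
print(scores)  # {'sweet_aroma': 26, 'moist_feeling': 33, 'total': 59}
```

Each of the three values then serves as the dependent variable of its own regression model.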
In ensemble learning based on bagging regression, the sample data set is resampled by bootstrap: the sample data are drawn repeatedly at random with replacement to obtain multiple sample sets, a decision tree is grown on each set by node splitting with randomly drawn feature variables, and the trees together form a forest. The predictions of the individual decision trees are averaged to give the final regression prediction for a sample. The random forest automatically finds the optimal split point at each node of each decision tree. About one third of the samples are not selected in each round of sampling; these are the out-of-bag (OOB) data, which are used to calculate the out-of-bag error in place of cross-validation as a measure of the generalization ability of the model, and this helps the random forest avoid overfitting. In addition, the random forest model requires each node of a decision tree to consider only a subset of the predictor variables, so that more weakly correlated feature variables can take part in building the trees and the resulting trees are more reliable.
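The "about one third" figure follows from sampling with replacement: the chance that a given sample is never drawn in n draws is (1 - 1/n)^n, which approaches 1/e, roughly 0.368, as n grows. The short Python sketch below checks this empirically; the function name, sample count and seed are arbitrary choices for illustration.

```python
import random

def bootstrap_split(n_samples, rng):
    """Draw n_samples indices with replacement; return in-bag and out-of-bag indices."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in drawn]
    return in_bag, out_of_bag

rng = random.Random(42)   # fixed seed so the run is repeatable
n = 10_000                # arbitrary sample count for the demonstration
in_bag, oob = bootstrap_split(n, rng)
oob_fraction = len(oob) / n
print(f"out-of-bag fraction: {oob_fraction:.3f}")
```

Because each tree never sees its own out-of-bag samples, predicting them with that tree gives an error estimate without a separate hold-out set.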
Further, the random forest is trained on the 16 features in the historical tobacco leaf index information and comprises 500 decision trees, with 16 features used in each decision tree. Specifically: a sample subset is drawn from the training data set (the historical tobacco leaf index information) with the bootstrap sampling algorithm; 16 features are randomly selected from all the features in the subset, and a decision tree is trained on them with a decision tree algorithm; this yields a random forest model RF containing 500 decision trees, and the average of the predictions of all the decision trees is taken as the predicted value.
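The training procedure above can be sketched in pure Python. This is not the implementation used in the patent, only a simplified illustration under stated assumptions: the data are synthetic, 25 trees are grown instead of 500 to keep the run short, `mtry` candidate features are drawn at each node as in the bagging description, and `max_depth` and `min_node` are illustrative stopping rules that do not come from the source.

```python
import random

def avg(values):
    return sum(values) / len(values)

def best_split(rows, targets, features):
    """Return the (feature, threshold) pair with the lowest squared error, or None."""
    best, best_sse = None, float("inf")
    for f in features:
        for t in sorted({row[f] for row in rows})[1:]:
            left = [y for row, y in zip(rows, targets) if row[f] < t]
            right = [y for row, y in zip(rows, targets) if row[f] >= t]
            m_left, m_right = avg(left), avg(right)
            sse = (sum((y - m_left) ** 2 for y in left)
                   + sum((y - m_right) ** 2 for y in right))
            if sse < best_sse:
                best, best_sse = (f, t), sse
    return best

def grow_tree(rows, targets, mtry, rng, depth=0, max_depth=4, min_node=10):
    """Grow a regression tree; a leaf is a number, an internal node is a tuple."""
    if depth >= max_depth or len(rows) < min_node:
        return avg(targets)
    features = rng.sample(range(len(rows[0])), mtry)  # random feature subset per node
    split = best_split(rows, targets, features)
    if split is None:
        return avg(targets)
    f, t = split
    left = [i for i, row in enumerate(rows) if row[f] < t]
    right = [i for i, row in enumerate(rows) if row[f] >= t]
    branches = [grow_tree([rows[i] for i in side], [targets[i] for i in side],
                          mtry, rng, depth + 1, max_depth, min_node)
                for side in (left, right)]
    return (f, t, branches[0], branches[1])

def predict_tree(node, row):
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] < t else right
    return node

def grow_forest(rows, targets, n_trees, mtry, seed=0):
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        bag = [rng.randrange(n) for _ in range(n)]  # bootstrap sample of the rows
        forest.append(grow_tree([rows[i] for i in bag], [targets[i] for i in bag], mtry, rng))
    return forest

def predict_forest(forest, row):
    """Average the predictions of all trees."""
    return avg([predict_tree(tree, row) for tree in forest])

# Synthetic stand-in for 60 samples with 16 indexes (not data from the patent).
data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(16)] for _ in range(60)]
y = [10 * row[0] + 5 * row[1] + data_rng.gauss(0, 0.5) for row in X]
forest = grow_forest(X, y, n_trees=25, mtry=4)
prediction = predict_forest(forest, X[0])
print(round(prediction, 2), round(y[0], 2))
```

Setting `mtry=16` would make every node consider all 16 features, which corresponds to plain bagging of trees rather than a random forest with feature subsampling.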
Further, the parameters of the moist-sweet aroma characteristic model are optimized: the tuneRF() function is used to search the mtry parameter separately for the sweet aroma characteristic, moist feeling characteristic and moist-sweet aroma total score regression models and find the optimal modeling parameter mtry, that is, the optimal number of variables tried when splitting a node of the binary tree.
The parameters of the sweet aroma characteristic regression model are optimized as follows: to train a better model, an mtry parameter search is performed on the sweet aroma characteristic regression model with the tuneRF() function. As mtry increases, the OOB error first increases slowly and then increases rapidly at mtry = 5; the OOB error is smallest at mtry = 4, which indicates that a sweet aroma characteristic regression model with better prediction accuracy can be built with mtry = 4.
The parameters of the moist feeling characteristic regression model are optimized as follows: to train a better model, an mtry parameter search is performed on the moist feeling characteristic regression model with the tuneRF() function. As mtry increases, the OOB error first increases rapidly and then decreases; the OOB error is smallest at mtry = 4, which indicates that a moist feeling characteristic regression model with better prediction accuracy can be built with mtry = 4.
The parameters of the moist-sweet aroma total score regression model are optimized as follows: as mtry increases, the OOB error first decreases and then increases; the OOB error is smallest at mtry = 5, which indicates that a moist-sweet aroma total score regression model with better prediction accuracy can be built with mtry = 5.
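The selection rule used in these three searches is simply to keep the mtry value with the smallest OOB error. The Python sketch below shows that rule as a plain grid search; it does not reproduce the stepwise search of tuneRF(), and the error curve is a made-up placeholder rather than values from the patent.

```python
def select_mtry(candidates, oob_error):
    """Evaluate oob_error(mtry) for each candidate and return the best one."""
    errors = {m: oob_error(m) for m in candidates}
    best = min(errors, key=errors.get)
    return best, errors

# HYPOTHETICAL OOB error curve, for illustration only (not results from the source).
hypothetical_curve = {2: 0.52, 3: 0.47, 4: 0.41, 5: 0.44, 8: 0.49, 16: 0.55}

best_mtry, errors = select_mtry(sorted(hypothetical_curve), hypothetical_curve.__getitem__)
print(best_mtry)  # 4
```

In practice `oob_error` would train a forest with the given mtry and return its out-of-bag error, so each candidate costs one full model fit.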
Further, the importance of the modeling feature variables is evaluated: the varImpPlot() function is used to rank the feature variables in each of the 3 random forest regression models by their contribution to the model. The sweet aroma characteristic, moist feeling characteristic and moist-sweet aroma total score regression models are validated by ten-fold cross-validation, the relationship between the model error and the number of fitted feature variables is analyzed, and predictor variables are retained or rejected accordingly.
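Ten-fold cross-validation splits the samples into ten folds and uses each fold once as the test set while training on the other nine. The following Python sketch generates such splits; the function name, the sample count of 95 and the shuffling seed are assumptions for illustration, since the source does not give the data set size.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # fold sizes differ by at most one
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx

splits = list(k_fold_indices(95, k=10))
for train_idx, test_idx in splits[:2]:
    print(len(train_idx), len(test_idx))
```

Averaging the test error over the ten splits, for models fitted with different numbers of predictors, gives the error-versus-variable-count relationship described above.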
In the embodiment of the invention, the evaluation of the 'moist-sweet aroma' quality and style characteristics of flue-cured tobacco is based on random forest models that take 16 indexes of the appearance quality and chemical components of the flue-cured tobacco as predictor variables and fit random forest regressions for the sweet aroma characteristic, the moist feeling characteristic and the moist-sweet aroma total score. The results show that the three random forest models have high accuracy, with regression R² between 0.91 and 0.93, indicating that the 'moist-sweet aroma' style characteristics of the tobacco leaves are closely related to the appearance quality and chemical component indexes of the tobacco leaves. After parameter optimization of the sweet aroma characteristic and moist feeling characteristic regression models, the root mean square error decreased. The test results show that the random forest regression models of the sweet aroma characteristic, the moist feeling characteristic and the moist-sweet aroma total score exhibit good randomness and strong resistance to overfitting, thereby providing technical support for evaluating the quality and style characteristics of flue-cured tobacco in the Liangshan tobacco area.
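The two accuracy measures mentioned here can be computed as in the Python sketch below. It assumes R² means the coefficient of determination (one minus the ratio of residual to total sum of squares); the source does not say whether this or a squared correlation coefficient was reported. The input values are toy numbers, not results from the patent.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between observed and predicted scores."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Toy values: a perfect prediction and a prediction that is always the mean.
observed = [1.0, 2.0, 3.0, 4.0]
print(r_squared(observed, observed), rmse(observed, observed))  # 1.0 0.0
print(r_squared(observed, [2.5] * 4))                           # 0.0
```

A model that always predicts the mean score has R² of 0, so values of 0.91 to 0.93 indicate that most of the variance in the sensory scores is explained by the predictors.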
Example 2:
As shown in FIG. 2, the invention also provides a tobacco leaf moist-sweet aroma evaluation device based on a random forest algorithm, which comprises:
an acquisition module for acquiring historical tobacco leaf index information;
a building module for constructing a moist-sweet aroma characteristic model from the historical tobacco leaf index information;
an evaluation module for inputting the indexes of the tobacco leaves to be evaluated into the moist-sweet aroma characteristic model to evaluate the quality of the tobacco leaves.
As an implementation of the embodiment of the invention, the building module constructs the moist-sweet aroma characteristic model by using a random forest algorithm; the moist-sweet aroma characteristic model comprises: a sweet aroma characteristic regression model, a moist feeling characteristic regression model and a moist-sweet aroma total score regression model.
As an implementation of the embodiment of the invention, the historical tobacco leaf index information includes: tobacco leaf appearance information and tobacco leaf chemical components.
As an implementation of the embodiment of the invention, the tobacco leaf appearance information includes: color, maturity, oil content, chroma, leaf structure and body; the tobacco leaf chemical components include: total sugar, reducing sugar, total nitrogen, total nicotine, total potassium, water-soluble chlorine, potassium-to-chlorine ratio, sugar-to-nicotine ratio, nitrogen-to-nicotine ratio and starch.
The above embodiments merely describe preferred embodiments of the invention and do not limit its scope. Various modifications and improvements made to the technical solutions of the invention by those skilled in the art without departing from the spirit of the invention fall within the scope of protection defined by the claims.

Claims (8)

1. A tobacco leaf moist-sweet aroma evaluation method based on a random forest algorithm, characterized by comprising the following steps:
S1, acquiring historical tobacco leaf index information;
S2, constructing a moist-sweet aroma characteristic model from the historical tobacco leaf index information;
S3, inputting the indexes of the tobacco leaves to be evaluated into the moist-sweet aroma characteristic model, and evaluating the quality of the tobacco leaves.
2. The tobacco leaf moist-sweet aroma evaluation method based on a random forest algorithm according to claim 1, wherein in step S2, the moist-sweet aroma characteristic model is constructed by using a random forest algorithm; the moist-sweet aroma characteristic model comprises: a sweet aroma characteristic regression model, a moist feeling characteristic regression model and a moist-sweet aroma total score regression model.
3. The tobacco leaf moist-sweet aroma evaluation method based on a random forest algorithm according to claim 2, wherein in step S1, the historical tobacco leaf index information includes: tobacco leaf appearance information and tobacco leaf chemical components.
4. The tobacco leaf moist-sweet aroma evaluation method based on a random forest algorithm according to claim 3, wherein the tobacco leaf appearance information comprises: color, maturity, oil content, chroma, leaf structure and body; the tobacco leaf chemical components comprise: total sugar, reducing sugar, total nitrogen, total nicotine, total potassium, water-soluble chlorine, potassium-to-chlorine ratio, sugar-to-nicotine ratio, nitrogen-to-nicotine ratio and starch.
5. A tobacco leaf moist-sweet aroma evaluation device based on a random forest algorithm, characterized by comprising:
an acquisition module for acquiring historical tobacco leaf index information;
a building module for constructing a moist-sweet aroma characteristic model from the historical tobacco leaf index information;
an evaluation module for inputting the indexes of the tobacco leaves to be evaluated into the moist-sweet aroma characteristic model to evaluate the quality of the tobacco leaves.
6. The tobacco leaf moist-sweet aroma evaluation device based on a random forest algorithm according to claim 5, wherein the building module constructs the moist-sweet aroma characteristic model by using a random forest algorithm; the moist-sweet aroma characteristic model comprises: a sweet aroma characteristic regression model, a moist feeling characteristic regression model and a moist-sweet aroma total score regression model.
7. The tobacco leaf moist-sweet aroma evaluation device based on a random forest algorithm according to claim 6, wherein the historical tobacco leaf index information comprises: tobacco leaf appearance information and tobacco leaf chemical components.
8. The tobacco leaf moist-sweet aroma evaluation device based on a random forest algorithm according to claim 7, wherein the tobacco leaf appearance information comprises: color, maturity, oil content, chroma, leaf structure and body; the tobacco leaf chemical components comprise: total sugar, reducing sugar, total nitrogen, total nicotine, total potassium, water-soluble chlorine, potassium-to-chlorine ratio, sugar-to-nicotine ratio, nitrogen-to-nicotine ratio and starch.
CN202211683374.9A 2022-12-27 2022-12-27 Tobacco leaf moistening and sweet flavor evaluation method and device based on random forest algorithm Pending CN115983692A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211683374.9A CN115983692A (en) 2022-12-27 2022-12-27 Tobacco leaf moistening and sweet flavor evaluation method and device based on random forest algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211683374.9A CN115983692A (en) 2022-12-27 2022-12-27 Tobacco leaf moistening and sweet flavor evaluation method and device based on random forest algorithm

Publications (1)

Publication Number Publication Date
CN115983692A (en) 2023-04-18

Family

ID=85969553

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211683374.9A Pending CN115983692A (en) 2022-12-27 2022-12-27 Tobacco leaf moistening and sweet flavor evaluation method and device based on random forest algorithm

Country Status (1)

Country Link
CN (1) CN115983692A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117740727A (en) * 2024-02-19 2024-03-22 南京信息工程大学 Textile component quantitative inversion method based on infrared hyperspectrum
CN117740727B (en) * 2024-02-19 2024-05-14 南京信息工程大学 Textile component quantitative inversion method based on infrared hyperspectrum

Similar Documents

Publication Publication Date Title
CN104931430B (en) A kind of redried natural alcoholization quality evaluation and model building method
CN101419207B (en) Method for predicting main index of flue-cured tobacco flume
CN107796782B (en) Redrying quality stability evaluation method based on tobacco leaf characteristic spectrum consistency measurement
CN101419209A (en) Cigarette sensing appraise and flume index immune neural net prediction method
CN106529584A (en) Flue-cured tobacco aroma type and quality judgment intelligent evaluation method
CN102692488A (en) Jinhua ham grading and identifying method based on electronic nose technology
CN105445421A (en) Method for predicting sensory quality in lamina alcoholization process via appearance indexes
CN108414471B (en) Method for distinguishing sensory characterization information based on near infrared spectrum and sensory evaluation mutual information
CN1115112C (en) Method for creating fuzzy-neural network expert system for evaluating sensing quality of cigarette
CN115983692A (en) Tobacco leaf moistening and sweet flavor evaluation method and device based on random forest algorithm
CN101419454B (en) Cigarette recipe maintenance method based on artificial immunity method
CN114689746B (en) Method, device, electronic equipment and medium for screening tobacco extract characteristics
CN116519874A (en) Heated cigarette style sensory evaluation method
CN115859784A (en) Method for establishing production process parameter and cigarette sensory quality characteristic correlation model
CN108520276A (en) A kind of interior characterizing method in aesthetic quality of raw tobacco material
CN113907407B (en) Method for migrating style characteristics of tobacco extract
CN114595365A (en) Method and device for constructing cigarette feature relevance, electronic equipment and medium
CN113762775B (en) Tobacco leaf sweet feeling evaluation method based on total sugar content
CN112485372A (en) Method for evaluating miscellaneous gas in flue gas
CN116660458A (en) Cigar raw material sensory quality prediction method based on BP neural network
CN106290725B (en) A kind of quantitative judgement method of cured tobacco leaf giving off a strong fragrance odor type
CN114965815B (en) Method for classifying and identifying aroma-added cigarette paper based on chemometrics-sensory group
CN117495161A (en) Construction method of sensory evaluation index prediction model of flue-cured tobacco leaves
CN115868656A (en) Tobacco leaf group formula imitation design method based on tobacco leaf substitution
CN116380825A (en) Method for detecting quality similarity of tobacco essence

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination