CN114764682B - Rice safety risk assessment method based on multi-machine learning algorithm fusion - Google Patents
Rice safety risk assessment method based on multi-machine learning algorithm fusion
- Publication number
- CN114764682B (application CN202210306564.2A)
- Authority
- CN
- China
- Prior art keywords
- expert
- weight
- rice
- index
- hazard
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0635—Risk analysis of enterprise or organisation activities
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/20—Ensemble learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/088—Non-supervised learning, e.g. competitive learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/02—Agriculture; Fishing; Forestry; Mining
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Abstract
The invention provides a rice safety risk assessment method based on the fusion of multiple machine learning algorithms. The method comprises the following steps: acquiring rice hazard detection data and preprocessing them; for the hazard indexes, classifying the experts with the AHP algorithm and the SC algorithm, solving the weights within expert categories and the weights between expert categories from the consistency weight differences of the expert evaluation results, constructing a rice safety risk assessment index system, and computing the weighted sum of the preprocessed hazard detection data and the comprehensive weights to obtain a rice hazard risk value; and fusing multiple machine learning algorithms to construct a rice safety risk assessment model that enables rapid risk assessment. The method takes the opinions of all experts into account in a more objective way and avoids both the amplification of invalid information and the reduction of valid information. The invention can effectively reduce supervision cost, improve the efficiency of risk discovery and response, and provide an accurate and efficient decision basis for supervision departments.
Description
Technical Field
The invention belongs to the technical field of food quality detection and food safety risk assessment, relates to technologies such as big data processing and machine learning, and particularly relates to a rice safety risk assessment method based on multi-machine learning algorithm fusion.
Background
In recent years, food safety incidents have occurred frequently, placing higher demands on food safety supervision, and countries around the world have successively introduced a series of strict food safety supervision policies. To further strengthen risk monitoring, risk assessment and supply chain management, and to improve the efficiency of risk discovery and response, government departments at all levels are vigorously promoting digitalization in the food safety field, strengthening 'big data + food' supervision, and bringing the advantages of technologies such as big data and artificial intelligence to bear on food safety risk assessment and supervision.
At present, food safety risk assessment methods fall into three major categories: qualitative assessment methods, quantitative assessment methods and comprehensive risk assessment methods. A qualitative assessment method is strongly subjective: it analyzes and judges the risk indexes mainly from the knowledge and experience of the evaluator, and calculates the index risk value from the judgment result and a matrix model. Qualitative assessment methods based on a single expert are relatively mature and include the Delphi method, the analytic hierarchy process, the decision-making trial and evaluation laboratory (DEMATEL) method, the index scoring method and the like. Qualitative assessment methods based on multiple experts are divided into subjective weighting and objective weighting. Subjective weighting assigns the expert weights from prior information about the experts, such as prestige and knowledge level, and calculates the risk value from the resulting expert weights. Objective weighting assigns the expert weights from the consistency index values of the expert evaluation results and calculates the risk value from the resulting expert weights. In actual decision making, qualitative assessment based on multiple experts is highly credible, and in research on expert weighting the objective weighting method is more widely applied than the subjective one. A quantitative assessment method is strongly objective: it calculates the index risk values through a mathematical model. Examples include the Monte Carlo quantitative assessment method, grey relational analysis, fuzzy comprehensive evaluation models and machine learning models such as artificial neural networks.
A comprehensive risk assessment method combines the qualitative and quantitative approaches: an index system is established with a qualitative assessment method, and a risk assessment model is then built from the index system with a quantitative assessment method.
With the acceleration of digital transformation, food detection data are growing explosively. The difficulty of processing and analyzing these data has become the foremost problem restricting food safety risk supervision, and it directly affects the accuracy of data-driven risk assessment models. Among existing risk assessment methods, qualitative methods incur high labor costs and long assessment cycles, while quantitative methods suffer from low index precision or poor resistance to overfitting. As a result, risk assessment results have low accuracy and high time cost, and the risk value cannot be accurately located.
Disclosure of Invention
Aiming at the problems that in the prior art, food safety risk assessment time is long, assessment results are low in accuracy rate, and risks cannot be accurately located, the invention provides a rice safety risk assessment method based on multi-machine learning algorithm fusion.
The invention discloses a rice safety risk assessment method based on multi-machine learning algorithm fusion, which is realized by the following steps:
(1) Acquire rice hazard detection data and preprocess them.
The preprocessing applies noise filtering, data integration and normalization to the detection data, in that order.
With $k$ kinds of hazards, the preprocessed hazard detection data comprise the standardized detection values of all $k$ hazards.
(2) Construct a rice safety risk assessment index system.
The experts' evaluation results for the rice hazard indexes are obtained, and the following is then performed: (a) calculate the evaluation index weights of each expert with the Analytic Hierarchy Process (AHP), where the evaluation index weights are the evaluation weights of the individual rice hazard indexes; (b) divide the experts into categories with the spectral clustering (SC) method; (c) calculate the weights between expert categories and the weights within expert categories, where the more experts a category contains and the smaller its consistency difference, the greater the weight of that category; (d) finally determine the comprehensive weight of each hazard index.
For the $j$th index, let $w_{ij}$ denote the evaluation weight given by the $i$th expert. The SC algorithm groups the evaluation results of the $m$ experts into $H$ categories, with the $i$th expert assigned to category $h_i \in \{h_1, h_2, \ldots, h_H\}$. The weight of category $h_i$ and the weight of expert $i$ within category $h_i$ are then obtained by calculation, and the comprehensive weight of the $j$th index is determined from these weights and $w_{ij}$.
The hazard detection data preprocessed in step (1) are weighted by the comprehensive weights and summed to obtain the rice hazard risk value $Y$.
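The weighted summation that yields the risk value can be sketched as follows. This is a minimal illustration, assuming the standardized detection values and the comprehensive weights are already available as two sequences of length k; the function names and the sample numbers are hypothetical and do not come from the patent.

```python
def risk_value(standardized_values, comprehensive_weights):
    """Weighted sum of the standardized hazard values (one weight per hazard index)."""
    if len(standardized_values) != len(comprehensive_weights):
        raise ValueError("one comprehensive weight is required per hazard index")
    return sum(c * w for c, w in zip(standardized_values, comprehensive_weights))


def contributions(standardized_values, comprehensive_weights):
    """Per-hazard terms of the weighted sum; the largest term points at the main hazard."""
    return [c * w for c, w in zip(standardized_values, comprehensive_weights)]


# Illustrative numbers only (three hazards, weights summing to 1).
values = [0.5, 0.0, 1.0]
weights = [0.2, 0.3, 0.5]
Y = risk_value(values, weights)
terms = contributions(values, weights)
```

Inspecting the individual terms rather than only their sum is what allows the dominant hazard of a sample to be identified.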
(3) To provide intuitive rice safety risk assessment results more quickly and accurately, the method fuses multiple machine learning algorithms to construct a rice safety risk assessment model.
A rice hazard risk assessment model is constructed in which two machine learning algorithms, XGBoost and LightGBM, form the base learners and a long short-term memory (LSTM) network serves as the meta-learner. The preprocessed hazard detection data are input to the model; the outputs of the two base learners, together with the preprocessed hazard detection data, are input to the meta-learner; and the model finally outputs the rice hazard risk value $Y$.
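The data flow of this stacked model at prediction time can be sketched as below. The sketch only shows how the base-learner outputs are concatenated with the original features before reaching the meta-learner; the three `fake_*` functions are hypothetical stand-ins for trained XGBoost, LightGBM and LSTM models, and the ordering of the concatenated inputs is an assumption.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
Regressor = Callable[[Vector], float]


def stacked_predict(x: Vector, base_learners: List[Regressor], meta_learner: Regressor) -> float:
    """Stacking inference: the meta-learner sees the base outputs plus the original features."""
    base_outputs = [learner(x) for learner in base_learners]
    meta_input = base_outputs + list(x)  # assumed order: base outputs first, then features
    return meta_learner(meta_input)


# Hypothetical placeholders; real models would be trained on labelled risk values.
def fake_xgb(x: Vector) -> float:
    return sum(x) / len(x)


def fake_lgbm(x: Vector) -> float:
    return max(x)


def fake_lstm(z: Vector) -> float:
    # Uses only the two base-learner outputs here; a real meta-learner would use all inputs.
    return 0.5 * z[0] + 0.5 * z[1]


y_hat = stacked_predict([0.2, 0.4, 0.6], [fake_xgb, fake_lgbm], fake_lstm)
```

Passing the original features to the meta-learner alongside the base predictions mirrors the description above, where the meta-learner receives both.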
The method of the invention judges the rice quality safety condition according to the rice hazard risk value Y. According to the detection data of each hazard and the weighted value of the corresponding comprehensive weight, the influence of the hazard on the quality safety of the rice can be determined, and the main hazard can be positioned.
Compared with the prior art, the invention has the advantages that:
(1) The method screens the rice safety risk indexes with a group decision model and constructs a rice safety risk index evaluation system. Under the principle that the minority defers to the majority, it effectively avoids the amplification of 'invalid information' and the reduction of 'valid information' in group decisions, and takes the opinions of all experts into account in a more objective way. The method fully accounts for the differences among experts in knowledge level, experience and familiarity with the rice hazard indexes when constructing the index system.
(2) The method is built on a fusion of algorithms. It takes into account that each algorithm observes the data from a different angle and on a different principle, and uses a Stacking ensemble learning strategy so that the differing algorithms compensate for one another's weaknesses. The rice safety risk assessment model BXGB-BLGB-GLSTM can analyze the rice hazard risk value quickly and accurately, providing a scientific and effective basis for the assessment decisions of supervision departments.
(3) The method preprocesses the hazard detection data and extracts the effective information, which can improve the prediction accuracy of the rice hazard risk assessment model.
(4) The method addresses the long assessment time, low accuracy and inability to accurately locate risks found in prior-art food safety risk assessment. It can effectively reduce supervision cost, improve the efficiency of risk discovery and response, and provide an accurate and efficient decision basis for supervision departments.
Drawings
FIG. 1 is a schematic overall flow chart of the rice safety risk assessment method of the present invention;
FIG. 2 is a schematic diagram of the framework of the hybrid model BXGB-BLGB-GLSTM of the present invention;
FIG. 3 is a comparison graph of the evaluation results of an embodiment of the present invention using the BXGB-BLGB-GLSTM model;
FIG. 4 is a comparison graph of the evaluation results using the XGBoost model according to an embodiment of the present invention;
FIG. 5 is a comparison graph of the results of the LightGBM model evaluation according to the embodiment of the invention;
FIG. 6 is a comparison graph of the results of an evaluation using the LSTM model according to an embodiment of the present invention;
FIG. 7 is a comparison graph of the evaluation results of the embodiment of the present invention using the BP model;
FIG. 8 is a comparison graph of the results of an evaluation using an SVM model according to an embodiment of the present invention;
FIG. 9 is a comparison of the results of the evaluation using the KNN model in accordance with the present invention.
Detailed Description
The invention will be described in further detail below with reference to the drawings and examples.
The invention provides a rice safety risk assessment method based on multi-machine learning algorithm fusion, whose implementation process and effect verification comprise the following five steps. Each step is described below.
Step one: Preprocess the acquired rice hazard detection data.
The embodiment performs a case analysis on 2018 rice hazard spot-check data from 31 provinces, autonomous regions and municipalities directly under the central government (excluding Hong Kong and Macao). The data include the province of detection, the detection time, and the detection items and results. The detection items include chromium, benzo[a]pyrene, lead, inorganic arsenic, aflatoxin B and others. By type, the hazards are divided into heavy metals, mycotoxins and pollutants. By form, the detection results are recorded as a specific value, as less than a specific value, or as not detected. Each result is judged as either pass or fail. Sample rice hazard detection data are shown in Table 1.
TABLE 1 Rice hazard detection data sample
In order to extract effective information in the multivariate data, noise filtration, data integration and normalization processing are sequentially carried out on the detection data. The detection data are preprocessed, and effective information is extracted, so that the accuracy of the estimation model prediction is improved.
(1) Noise filtering. The detection result of a hazard, the detection unit and the result judgment are recorded separately from one another, so noise in the invention refers to statistical errors caused by incorrectly recorded units. Noise filtering deletes the samples whose detection result is inconsistent with their recorded result judgment.
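One possible form of this consistency check is sketched below. It assumes each record carries a numeric detection value, the applicable limit and the recorded judgment, and that a sample passes when its value does not exceed the limit. The field names `value`, `limit` and `judgment` and the sample records are hypothetical and not taken from the patent.

```python
def filter_noise(records):
    """Keep only records whose recorded pass/fail judgment agrees with value vs. limit."""
    kept = []
    for rec in records:
        # Assumed rule: a value at or below the limit should have been judged "pass".
        expected = "pass" if rec["value"] <= rec["limit"] else "fail"
        if rec["judgment"] == expected:
            kept.append(rec)
    return kept


# Illustrative records only; the second one looks like a unit-recording error.
samples = [
    {"item": "lead", "value": 0.05, "limit": 0.2, "judgment": "pass"},
    {"item": "lead", "value": 50.0, "limit": 0.2, "judgment": "pass"},
    {"item": "chromium", "value": 1.5, "limit": 1.0, "judgment": "fail"},
]
clean = filter_noise(samples)
```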
(2) Data integration and normalization. Because the detection results come in different formats, which hinders the subsequent construction of the risk assessment model, the detection data are converted to a uniform floating-point format, and the hazard detection results are then standardized with the trapezoidal membership function of formula (1).
Here, $x$ denotes the detection result of a given hazard; $x_{\max}$ is the national standard value for that hazard; the other threshold in formula (1) is the risk-free maximum value; and $c(x)$ is the standardized value of the detection result $x$.
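A trapezoidal (ramp-shaped) membership function of this kind can be sketched as follows. This is an assumption about the shape of formula (1), which is not reproduced in the text: values at or below the risk-free maximum map to 0, values at or above the national standard value map to 1, and values in between are interpolated linearly. The name `x_min` for the risk-free maximum is hypothetical.

```python
def trapezoid_normalize(x, x_min, x_max):
    """Map a detection result to [0, 1] with a linear ramp between x_min and x_max."""
    if x_max <= x_min:
        raise ValueError("x_max must be greater than x_min")
    if x <= x_min:
        return 0.0  # at or below the risk-free maximum
    if x >= x_max:
        return 1.0  # at or above the national standard value
    return (x - x_min) / (x_max - x_min)


# Illustrative thresholds only.
c_mid = trapezoid_normalize(1.0, 0.0, 2.0)
```

Clamping at both ends keeps every standardized value in [0, 1], so hazards measured in different units become comparable before weighting.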
Step two: and constructing a rice safety risk assessment index system.
To construct the rice safety index system, the expert scoring results for the indexes are tallied and summarized with the mature Analytic Hierarchy Process (AHP), a qualitative assessment method, based on rice hazard detection data and the evaluation data of authoritative industry experts. A group decision weighting model based on index weight distribution is then built with the Spectral Clustering (SC) algorithm, which is suited to high-dimensional clustering, adapts well to different data distributions and clusters effectively, so that the rice safety risk assessment index system is constructed in a more objective way.
Because experts differ in knowledge level, experience and familiarity with the rice hazard indexes, the index system is built on the expert scoring results so as to combine the scoring characteristics of different experts. The expert scoring results are first classified without supervision, and a group decision weighting model based on index weight distribution is then constructed with an unsupervised clustering algorithm suited to high-dimensional data. The specific flow is shown in FIG. 1. The scoring result of each expert for the rice hazard indexes is obtained first, and the following steps are then carried out.
(1) Calculate the evaluation index weights with the AHP algorithm. When calculating the index weights, the AHP algorithm stratifies the rice hazard detection items under analysis by hazard type, constructs the judgment matrix $A_{k \times k}$ of formula (2) from the expert scoring results, and assigns each hazard index a corresponding weight, where $k$ is the number of hazard indexes.
$$A_{k \times k} = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1k} \\ a_{21} & a_{22} & \cdots & a_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ a_{k1} & a_{k2} & \cdots & a_{kk} \end{pmatrix} \quad (2)$$
Here, the element $a_{ij}$ of the judgment matrix is the relative influence of the $i$th hazard index compared with the $j$th hazard index, judged on the 1-to-9 scale; $a_{ij}$ satisfies $a_{ij} > 0$, $a_{ii} = 1$ and $a_{ji} = 1/a_{ij}$. The scale and meaning of the elements in the judgment matrix are shown in Table 2.
Table 2. Scale and meaning of the elements in the judgment matrix
$a_{ij}$ scale | Meaning of the scale |
$a_{ij}=1$ | The ith hazard index has the same influence as the jth hazard index |
$a_{ij}=3$ | The ith hazard index has slightly stronger influence than the jth hazard index |
$a_{ij}=5$ | The ith hazard index has stronger influence than the jth hazard index |
$a_{ij}=7$ | The ith hazard index has much stronger influence than the jth hazard index |
$a_{ij}=9$ | The ith hazard index has extremely stronger influence than the jth hazard index |
The judgment matrix of each expert is obtained according to formula (2). For each expert's judgment matrix $A_{k \times k}$, the maximum eigenvalue $\lambda_{\max}$ and the expert evaluation index weight $W = \{w_1, w_2, \ldots, w_k\}$ are computed, where $w_i$ denotes the expert's evaluation weight for the $i$th rice hazard index, as shown in formulas (3) to (5).
$AW = \lambda_{\max} W \quad (3)$
The maximum eigenvalue can be used to check the consistency of the judgment matrix.
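The eigenvalue problem of formula (3) can be solved numerically as sketched below. Formulas (4) and (5) are not reproduced in the text, so power iteration is used here as a stand-in solver rather than as the patent's own computation. The consistency index CI = (λmax − k)/(k − 1) is the standard AHP definition; the function names and the example matrix are hypothetical.

```python
def ahp_weights(A, iterations=100):
    """Approximate the principal eigenvector (normalized to sum 1) and lambda_max."""
    k = len(A)
    w = [1.0 / k] * k
    for _ in range(iterations):
        Aw = [sum(A[i][j] * w[j] for j in range(k)) for i in range(k)]
        total = sum(Aw)
        w = [v / total for v in Aw]  # renormalize so the weights sum to 1
    Aw = [sum(A[i][j] * w[j] for j in range(k)) for i in range(k)]
    lambda_max = sum(Aw[i] / w[i] for i in range(k)) / k
    return w, lambda_max


def consistency_index(lambda_max, k):
    """Standard AHP consistency index; 0 means a perfectly consistent matrix."""
    return (lambda_max - k) / (k - 1)


# Illustrative, perfectly consistent reciprocal matrix built from weights 0.6 : 0.3 : 0.1.
A = [[1.0, 2.0, 6.0],
     [0.5, 1.0, 3.0],
     [1.0 / 6.0, 1.0 / 3.0, 1.0]]
W, lam = ahp_weights(A)
CI = consistency_index(lam, len(A))
```

For a perfectly consistent matrix λmax equals k, so the consistency index is zero; larger values indicate less consistent expert judgments.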
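If $m$ experts participate in the evaluation, the evaluation weights of all experts for each hazard index are recorded as $W$, as shown in formula (6).
$$W = [W_1, W_2, \ldots, W_m]^T = \begin{pmatrix} w_{11} & w_{12} & \cdots & w_{1k} \\ w_{21} & w_{22} & \cdots & w_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ w_{m1} & w_{m2} & \cdots & w_{mk} \end{pmatrix} \quad (6)$$
In formula (6), $w_{ij}$ is the evaluation weight of the $i$th expert for the $j$th index obtained with the AHP algorithm, and the superscript $T$ denotes transposition.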
(2) Divide the experts into categories with the SC algorithm. The SC algorithm is a clustering method based on graph theory. Its main idea is to treat each high-dimensional sample as a point in space and to connect all points by edges, where the edge between two nearby points has a high weight and the edge between two distant points has a low weight. The graph is then cut so that the sum of the edge weights within each subgraph is as large as possible and the sum of the edge weights between different subgraphs is as small as possible, which clusters the high-dimensional sample data.
To improve the objectivity of the index weights and reduce subjective error, the invention combines the scoring characteristics of different experts and uses the SC algorithm, which is suited to high-dimensional clustering, adapts well to different data distributions and clusters effectively, to classify the expert scoring results without supervision. The method calculates the compatibility between experts as the cosine similarity of their high-dimensional index weights, constructs a compatibility matrix, and uses the expert compatibility as the input of the SC (spectral clustering) algorithm. The cosine similarity $l$ is given by formula (7).
In formula (7), W x ,W y The evaluation index weights of the experts x and y are represented respectively, and k is the number of the hazard indexes.
According to the similarity calculation formula in the formula (7), an m-dimensional vector compatibility matrix L can be obtained, as shown in the formula (8).
Element l in the matrix xy (x, y =1,2, \ 8230; m) represents the degree of compatibility of experts x and y, calculated according to equation (7).
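As an illustration of the compatibility matrix, the following sketch computes the pairwise cosine similarity between expert weight vectors. The helper names and the three example weight vectors are hypothetical; the standard cosine similarity definition is assumed for formula (7).

```python
import math


def cosine_similarity(wx, wy):
    """Cosine similarity between two expert weight vectors of equal length k."""
    dot = sum(a * b for a, b in zip(wx, wy))
    norm_x = math.sqrt(sum(a * a for a in wx))
    norm_y = math.sqrt(sum(b * b for b in wy))
    return dot / (norm_x * norm_y)


def compatibility_matrix(W):
    """Build the m x m matrix whose element (x, y) is the compatibility of experts x and y."""
    m = len(W)
    return [[cosine_similarity(W[x], W[y]) for y in range(m)] for x in range(m)]


# Hypothetical AHP weights of three experts over k = 3 hazard indexes.
W = [[0.5, 0.3, 0.2],
     [0.5, 0.3, 0.2],
     [0.1, 0.2, 0.7]]
L = compatibility_matrix(W)
for row in L:
    print([round(v, 3) for v in row])
```

Experts 0 and 1 give identical weights, so their compatibility is 1, while expert 2 is less compatible with both. The matrix is symmetric and can be passed to a spectral clustering routine as a precomputed affinity.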
When the SC algorithm partitions the experts into $d$ classes, the invention uses the CH_score shown in formula (9) to evaluate the clustering effect and, by comparing the CH_score values, selects the clustering result with the largest value.
Here $B_D$ is the covariance matrix between expert classes, $W_D$ is the covariance matrix within expert classes, tr is the trace of a matrix, and $d$ is the number of classes. Let $C_q$ denote the set of all expert evaluation results in class $q$, $c_q$ the cluster center of class $q$, $c_e$ the center of all expert evaluation results, and $m_q$ the number of expert evaluation results in class $q$. The smaller the covariance of the data within classes and the larger the covariance between classes, the higher the Calinski-Harabasz score and the better the clustering result.
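A minimal sketch of the CH score follows, assuming the standard Calinski-Harabasz definition $\mathrm{CH} = \frac{\mathrm{tr}(B_D)/(d-1)}{\mathrm{tr}(W_D)/(m-d)}$ with $\mathrm{tr}(B_D) = \sum_q m_q \lVert c_q - c_e \rVert^2$ and $\mathrm{tr}(W_D) = \sum_q \sum_{x \in C_q} \lVert x - c_q \rVert^2$. Formulas (9) to (11) are not reproduced in this text, so this form is an assumption; the function name and data are hypothetical.

```python
def ch_score(points, labels):
    """Calinski-Harabasz score for a labelled set of points (standard definition)."""
    m = len(points)
    dim = len(points[0])
    classes = sorted(set(labels))
    d = len(classes)
    c_e = [sum(p[t] for p in points) / m for t in range(dim)]  # overall centre
    tr_b = 0.0  # trace of the between-class scatter
    tr_w = 0.0  # trace of the within-class scatter
    for q in classes:
        members = [p for p, lab in zip(points, labels) if lab == q]
        m_q = len(members)
        c_q = [sum(p[t] for p in members) / m_q for t in range(dim)]  # class centre
        tr_b += m_q * sum((c_q[t] - c_e[t]) ** 2 for t in range(dim))
        tr_w += sum((p[t] - c_q[t]) ** 2 for p in members for t in range(dim))
    return (tr_b / (d - 1)) / (tr_w / (m - d))


# Hypothetical expert results: two tight groups far apart along the first axis.
points = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
good = ch_score(points, [0, 0, 1, 1])  # grouping that matches the structure
bad = ch_score(points, [0, 1, 0, 1])   # grouping that ignores the structure
print(good, bad)
```

The grouping that matches the data scores far higher, which is the criterion used to choose among candidate clusterings.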
(3) The weights between expert categories are calculated. For the index weights between expert categories, the invention divides the evaluation results of the $m$ experts into $H$ categories through the SC algorithm, expressed as $\{h_1, h_2, \dots, h_H\}$. For a cluster $h_i$ ($i = 1, 2, \dots, H$), the larger the number of experts in the category and the smaller the consistency difference, the higher the weight assigned to $h_i$. The specific steps are as follows:
Step 3.1: construct the consistency weight difference values among expert categories. Let the evaluation index weight of the $i$-th expert obtained by the AHP algorithm be $W_i$ and its category be $h_i$, where class $h_i$ contains several expert evaluation results. The consistency weight difference value between $W_i$ and the evaluation index weights of the other experts is $D_i$, as shown in formula (12); the consistency weight difference value between the experts of class $h_i$ and the experts of the other classes is shown in formula (13);
Step 3.2: construct the weight constraints among expert categories. Taking both the number of experts and the consistency difference into account, a weight calculation model and its constraints among expert categories are obtained, satisfying formulas (14) and (15);
Step 3.3: calculate the weight coefficients among expert categories. The inter-category weight of cluster $h_i$ is obtained as shown in formula (16).
Based on the expert classification results, the weights among all expert categories are recorded as $\beta$, as shown in formula (17).
(4) The weights within the expert categories are calculated. Starting again from the expert index weights, the invention performs a consistency check on the expert evaluation results, removes the index weights that fail the check, determines a reasonable index interval, and constructs a weight optimization model within each expert category. The specific steps are as follows:
Step 4.1: determine the reasonable index interval. Suppose cluster $h_i$ contains several experts. Based on the weight information given by these experts, each hazard index has one weight value per expert, and the reasonable index interval is determined from the density distribution of these index weights.
For index $j$, the range of values acceptable to all experts satisfies:
The interval length of the values of index $j$ is $r_j$. Let $\delta = r_j / 2$ be the consistency check criterion; if the $\delta$-neighborhood of $w_{ij}$ contains no other weight value of index $j$, then $w_{ij}$ is a singular point.
By traversing all weight values of index $j$ and deleting all singular points, the reasonable interval of the $j$-th index is determined.
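The singular-point test can be sketched as follows. Because the formula that defines the interval length is not reproduced in this text, the sketch takes $\delta$ as an input rather than deriving it; the function name `reasonable_interval`, the example weights, and the value of `delta` are all hypothetical.

```python
def reasonable_interval(values, delta):
    """Drop singular points and return ((low, high), singular_points).

    A weight is singular when no other weight of the same index lies within
    its delta-neighbourhood. delta is supplied by the caller (an assumption).
    """
    kept, singular = [], []
    for i, v in enumerate(values):
        has_neighbour = any(abs(v - u) <= delta
                            for j, u in enumerate(values) if j != i)
        (kept if has_neighbour else singular).append(v)
    if not kept:
        raise ValueError("every weight is singular; delta is too small")
    return (min(kept), max(kept)), singular


# Hypothetical weights given by four experts of one category to index j.
weights_j = [0.20, 0.22, 0.25, 0.60]
interval, singular = reasonable_interval(weights_j, delta=0.05)
print(interval, singular)
```

The isolated weight 0.60 has no neighbour within the chosen neighbourhood, so it is removed, and the remaining weights define the reasonable interval.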
Step 4.2: construct the weight optimization model within the expert category. To integrate the expert opinions within the reasonable interval as fully as possible, the objective function Obj of the model, formula (18), minimizes the sum of the deviations between the weighted value within the expert category and $w_{ij}$. The constraints of the model are that $T$ lies within the reasonable index interval and that the expert weights within the category sum to 1, as shown in formula (19).
Based on the expert classification results, the weights within all expert categories are recorded as $T$, as shown in formula (20).
Here $t_{ij}$ is the weight of the $i$-th expert evaluation result in the $j$-th cluster.
Step 4.3: weight the results to obtain the comprehensive index weights. The result of the intra-category weight optimization model, the clustering result, and the inter-category weights are combined by weighting to obtain the comprehensive weight $S = \{s_1, s_2, \dots, s_k\}$ of each index, as shown in formulas (21) and (22).
Here $s_{ij}$ is the weighted weight of the $i$-th expert on the $j$-th index, and $s_i$ is the comprehensive weight of the $i$-th index after the group decision. The intra-category weight is the weight that the evaluation result of expert $i$ carries within class $h_i$.
Step 4.4: calculate the comprehensive risk value of rice hazards. The data C cleaned in step one are weighted by the comprehensive index weights S to obtain a low-dimensional comprehensive risk value, which is the output value Y of the rice hazard risk assessment model, as shown in formula (23).
Y=S×C(x) (23)
Here C(x) is the vector of the $k$ normalized hazard detection values. The detection value of each hazard is multiplied by its comprehensive weight, and the products are summed to give the final rice hazard risk value Y.
The current quality and safety condition of the rice can be judged from the output risk value Y, and the influence of each hazard on rice quality and safety can be determined from its normalized detection value and comprehensive weight, so that high-risk factors can be located. The supervisory body can act on the resulting risk value to supervise rice quality and safety and to carry out detection and treatment of the key hazards.
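Formula (23) and the localization of high-risk factors can be illustrated with a short sketch. The weights and detection values below are hypothetical, as is the function name `risk_value`.

```python
def risk_value(S, C):
    """Return (Y, contributions), where Y is the weighted sum of formula (23)."""
    contributions = [s * c for s, c in zip(S, C)]
    return sum(contributions), contributions


# Hypothetical comprehensive weights and normalized detection values for k = 3 hazards.
S = [0.5, 0.3, 0.2]
C = [0.2, 0.9, 0.1]
Y, contrib = risk_value(S, C)
# The hazard with the largest weighted contribution is the leading risk factor.
top = max(range(len(contrib)), key=contrib.__getitem__)
print(Y, contrib, top)
```

Ranking the per-hazard products shows which hazard drives the overall value, even when it does not carry the largest weight.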
The method effectively avoids the amplification of 'invalid information' and the reduction of 'valid information' in group decision, effectively considers the opinions of all experts in a more objective mode, and further constructs a more reasonable and accurate rice hazard index system.
Step three: construct the rice hazard risk assessment model.
Compared with traditional mathematical models, machine learning algorithms have a stronger risk identification capability, so the risk assessment model is built on machine learning. Because a single machine learning algorithm has relatively low evaluation accuracy, the invention moves beyond a risk assessment framework built on a single algorithm: it combines the advantages of ensemble, classification, and optimization algorithms and builds the rice safety risk assessment model through Stacking model fusion, so that an intuitive rice safety risk assessment result can be provided to consumers more quickly and accurately when massive and complex data are analyzed.
The Stacking model selected by the invention is an ensemble model that combines several different algorithms, which improves the overall prediction accuracy of the evaluation model. To ensure the accuracy of the fusion model, each selected learner should have good independent prediction capability, so the extreme gradient boosting (XGBoost) algorithm and the light gradient boosting machine (LightGBM) algorithm, both of which generalize well, are selected as base learners. To achieve effective complementarity of information among the algorithms, the long short-term memory network (LSTM), whose principle differs considerably from that of the base learners, is selected as the meta-learner to construct the fusion model.
To improve model accuracy and save manual tuning time, the Bayesian Optimization Algorithm (BOA) is used to tune the parameters of the XGBoost and LightGBM tree models, which have many hyperparameters. For the neural network, which trains slowly, the fast-converging Grey Wolf Optimizer (GWO) is used to automatically optimize the initial weights, the thresholds, and the number of hidden-layer neurons of the LSTM algorithm. After the model parameters are optimized, the framework of the fusion model BXGB-BLGB-GLSTM is formed, as shown in FIG. 2.
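To make the GWO step more concrete, the following is a minimal sketch of a grey wolf optimizer in pure Python. It is not the patent's implementation: the objective here is a toy sphere function standing in for the LSTM validation error, and the function name `gwo_minimize`, the population size, the iteration count, and the bounds are hypothetical choices.

```python
import random


def gwo_minimize(f, dim, bounds, n_wolves=10, n_iter=50, seed=0):
    """Minimize f over a box with a basic grey wolf optimizer."""
    rng = random.Random(seed)
    lo, hi = bounds
    wolves = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_wolves)]
    leaders = []  # best three (score, position) pairs found so far: alpha, beta, delta
    history = []  # best score after each iteration
    for it in range(n_iter):
        scored = sorted(((f(w), list(w)) for w in wolves), key=lambda t: t[0])
        leaders = sorted(leaders + scored[:3], key=lambda t: t[0])[:3]
        history.append(leaders[0][0])
        a = 2.0 * (1 - it / n_iter)  # exploration factor, decreasing from 2 towards 0
        for w in wolves:
            for d in range(dim):
                x = 0.0
                for _, leader in leaders:
                    A = 2 * a * rng.random() - a
                    C = 2 * rng.random()
                    D = abs(C * leader[d] - w[d])
                    x += leader[d] - A * D
                # Move towards the average of the three leader-guided positions.
                w[d] = min(hi, max(lo, x / 3.0))
    return leaders[0][1], leaders[0][0], history


def sphere(x):
    """Toy objective standing in for a model's validation error."""
    return sum(v * v for v in x)


best_pos, best_score, history = gwo_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
print(best_pos, best_score)
```

In a real tuning run, each wolf position would encode candidate LSTM settings and `f` would train and validate a model, which is far more expensive than this toy objective.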
As shown in fig. 2, the data preprocessed in the first step is input into a base learner, the XGBoost algorithm and the LightGBM algorithm are used for prediction in the base learner, the output prediction result and the data preprocessed in the first step are used as input data of a meta-learner together, and the LSTM algorithm of the meta-learner is used for predicting and outputting a rice hazard risk result Y. The fusion model BXGB-BLGB-GLSTM realizes the calculation process in the second step.
Step four: perform model experiments.
1) The data set is partitioned. Firstly, taking the data C (x) preprocessed in the first step as input data of the BXGB-BLGB-GLSTM model, and dividing a data set according to a training test ratio of 3.
2) Train the model. In model fusion, to avoid the overfitting caused by the base learners learning the same data repeatedly, the method performs K-fold cross validation on the training set. K-fold cross validation is a statistical method for assessing generalization performance. The data are divided equally into K parts, each part being one fold; during training, K-1 folds are used as the training set and the remaining fold is used as the validation set to verify the model. K-fold cross validation makes full use of the data and avoids the extreme case in which the training set and the validation set are distributed unevenly because of differences in the data.
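The fold construction described above can be sketched with a small generator. The function name `k_fold_indices` is hypothetical, and the folds here are contiguous and unshuffled for clarity; a practical run would usually shuffle the samples first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    for i in range(k):
        val = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val


splits = list(k_fold_indices(10, 5))
for train, val in splits:
    print("train:", train, "validation:", val)
```

Every sample is used for validation exactly once and for training in the other K-1 rounds.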
The training set is divided into K sub-training sets of equal size, and each sub-training set is traversed so that the base learners (the XGBoost model and the LightGBM model) complete K rounds of training. After each base learner is trained, it outputs the results $\{x_1, x_2, \dots, x_k\}$ on the training set and the test set respectively. For M base learners, M test-set predictions are output; these predictions are combined with C(x) to form a metadata set, which the meta-learner (the LSTM model) learns from to output the prediction result of the BXGB-BLGB-GLSTM model.
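The construction of the metadata set can be sketched as follows. To stay self-contained, the sketch replaces XGBoost and LightGBM with two trivial stand-in regressors and stops before the LSTM meta-learner; only the out-of-fold data flow is illustrated. All class and function names here are hypothetical.

```python
class MeanRegressor:
    """Stand-in base learner: always predicts the training mean."""
    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class NearestNeighbourRegressor:
    """Stand-in base learner: predicts the target of the closest training sample."""
    def fit(self, X, y):
        self.X_, self.y_ = list(X), list(y)
        return self

    def predict(self, X):
        def dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))
        return [self.y_[min(range(len(self.X_)), key=lambda i: dist(x, self.X_[i]))]
                for x in X]


def build_meta_features(X_train, y_train, X_test, learners, k):
    """Append out-of-fold base predictions to the original features."""
    n = len(X_train)
    folds = [list(range(i, n, k)) for i in range(k)]  # simple interleaved folds
    meta_train = [list(row) for row in X_train]
    meta_test = [list(row) for row in X_test]
    for make_learner in learners:
        oof = [0.0] * n
        test_avg = [0.0] * len(X_test)
        for val_idx in folds:
            held_out = set(val_idx)
            tr_idx = [i for i in range(n) if i not in held_out]
            model = make_learner().fit([X_train[i] for i in tr_idx],
                                       [y_train[i] for i in tr_idx])
            # Each training sample is predicted by a model that never saw it.
            for i, p in zip(val_idx, model.predict([X_train[i] for i in val_idx])):
                oof[i] = p
            # Test predictions are averaged over the k fold models.
            for t, p in enumerate(model.predict(X_test)):
                test_avg[t] += p / k
        for row, p in zip(meta_train, oof):
            row.append(p)
        for row, p in zip(meta_test, test_avg):
            row.append(p)
    return meta_train, meta_test


X_train = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y_train = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
X_test = [[2.4], [4.6]]
meta_train, meta_test = build_meta_features(
    X_train, y_train, X_test, [MeanRegressor, NearestNeighbourRegressor], k=3)
print(meta_train[0], meta_test[0])
```

Each metadata row holds the original features followed by one prediction per base learner, which is the input a meta-learner would then be trained on.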
Step five: evaluate, analyze, and compare the models.
To compare and illustrate the experimental results of the model more clearly, the invention evaluates the model with three indexes: the correlation coefficient $R^2$, the mean absolute error (MAE), and the mean squared error (MSE), calculated as in formulas (24) to (26).
In formulas (24) to (26), $N$ is the number of samples; $yo_i$ and $ym_i$ are the comprehensive hazard risk value and the predicted value of the $i$-th sample respectively; the two mean terms are the average comprehensive risk value and the average predicted value over all samples. $R^2$ is positively correlated with the goodness of fit of the curve; MAE and MSE measure the prediction error, so lower values indicate a more accurate model.
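The three indexes can be sketched as below. Formulas (24) to (26) are not reproduced in this text; because the description refers to the means of both the actual and the predicted values, the sketch assumes $R^2$ is the squared Pearson correlation, whereas the coefficient of determination would be a different but also common choice. The function name `evaluate` and the sample values are hypothetical.

```python
def evaluate(yo, ym):
    """Return (R2, MAE, MSE) for actual values yo and predicted values ym."""
    n = len(yo)
    mean_o = sum(yo) / n
    mean_m = sum(ym) / n
    cov = sum((a - mean_o) * (b - mean_m) for a, b in zip(yo, ym))
    var_o = sum((a - mean_o) ** 2 for a in yo)
    var_m = sum((b - mean_m) ** 2 for b in ym)
    r2 = cov ** 2 / (var_o * var_m)  # squared Pearson correlation (assumed form)
    mae = sum(abs(a - b) for a, b in zip(yo, ym)) / n
    mse = sum((a - b) ** 2 for a, b in zip(yo, ym)) / n
    return r2, mae, mse


# Hypothetical risk values and predictions that are offset by a constant 0.5.
yo = [1.0, 2.0, 3.0, 4.0]
ym = [1.5, 2.5, 3.5, 4.5]
r2, mae, mse = evaluate(yo, ym)
print(r2, mae, mse)
```

The example shows why the indexes are reported together: a constant offset leaves the correlation-based $R^2$ at 1 while MAE and MSE still expose the error.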
The embodiment of the invention carries out a case analysis based on rice hazard detection data collected in 2018 from 31 provinces of China (excluding Hong Kong and Macao). The rice hazard detection data are preprocessed according to the method of step one, and a rice hazard risk index system is constructed according to the screening process, as shown in Table 3.
TABLE 3 Rice hazard risk index system
Risk index class | Risk index |
---|---|
Heavy metal hazards | Lead, cadmium, chromium, total mercury, inorganic arsenic |
Mycotoxin hazards | Aflatoxin B1, ochratoxin A, deoxynivalenol, zearalenone |
Pollutant hazards | Benzo[a]pyrene, aluminum phosphide |
For the construction of the index weights, 50 valid expert scoring questionnaires were collected; part of the questionnaire results are shown in Table 4.
Table 4 Partial expert scoring questionnaire results
The expert comprehensive index weights are calculated from the scoring results in Table 4 following step two, as shown in Table 5.
TABLE 5 expert synthetic index weights
For the rice hazard risk assessment model experiments, the experimental environment is a Windows 10 operating system with an i5-6200U CPU and 8 GB of RAM, and the code is implemented in python3 on the Jupyter Notebook platform. With this configuration, the data C(x) cleaned in step one are used as the input of the risk assessment model and the comprehensive risk value Y as its output; the parameter configuration obtained through model training is shown in Table 6.
TABLE 6 optimal parameter configuration for each model algorithm
Here n_estimators controls the number of estimators and epochs the number of training passes; learning_rate is the learning rate; max_depth is the maximum depth of the tree model; seed is the random seed; max_features is the number of features considered when searching for the best split; min_samples_split is the minimum number of samples required to split an internal node; subsample is the sample sampling rate; batch_size is the number of samples used in one training step; optimizer is the model optimizer; and activation is the activation function.
With the parameter configuration of Table 6, C(x) is input into the BXGB-BLGB-GLSTM risk assessment model to obtain comparison curves between the comprehensive risk value and the predicted value of each risk index; the comparison for the BXGB-BLGB-GLSTM model is shown in FIG. 3. The X-axis represents the number of samples and the Y-axis represents the contamination degree (unit: %) of each type of hazard. A contamination degree greater than 1 (Y > 1) means that the hazard seriously exceeds the standard; when Y ∈ (0, 1), the Y value is positively correlated with the contamination degree of the hazard.
The evaluation result of the BXGB-BLGB-GLSTM model is compared with the predictions of single models that research has shown to predict well. The comparison of evaluation results is shown in FIG. 4 for the XGBoost model, FIG. 5 for the LightGBM model, FIG. 6 for the LSTM model, FIG. 7 for the BP model, FIG. 8 for the SVM model, and FIG. 9 for the KNN model.
As the comparative experiment curves in FIGS. 3 to 9 show, when Y ∈ (0.2, 0.35) the predicted value of every model agrees closely with the true value. When Y ∈ (0, 0.2) ∪ (0.35, ∞), that is, when the contamination degree of the hazards is relatively low or high, the average fitting effect of some models (such as KNN, SVM, and BP) is poorer, and the contamination degree tends to be overestimated (underestimated) when it is higher (lower).
To compare the experimental results of the models more clearly, the invention evaluates the models with the three indexes $R^2$, MAE, and MSE; the comparison of the evaluation index parameters of each algorithm is shown in Table 7.
TABLE 7 comparison of evaluation index parameters of each model algorithm
Model | R² | MAE | MSE |
---|---|---|---|
BXGB-BLGB-GLSTM | 0.937165550918625 | 0.010853262188760 | 0.000205881888677 |
XGBoost | 0.827113560494595 | 0.019379027245068 | 0.000566475670789 |
LightGBM | 0.759188224211823 | 0.022529495706231 | 0.000789038241599 |
LSTM | 0.746908638159939 | 0.021653373041345 | 0.000829273246528 |
BP | 0.729470385424174 | 0.024493740457837 | 0.000886411018260 |
SVM | 0.744607468363948 | 0.023041551003774 | 0.000836813205750 |
KNN | 0.739849107809394 | 0.022235136330000 | 0.000852404338835 |
Compared with a single model algorithm, the BXGB-BLGB-GLSTM mixed model provided by the invention has higher accuracy and stronger stability in the aspect of prediction, can intuitively and accurately analyze the risk value of food safety hazards, and can provide a scientific and effective basis for evaluation and decision making of a supervision department.
The above describes only preferred embodiments of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent replacement or change that a person skilled in the art makes to the technical solutions and inventive concept of the present invention, within the technical scope disclosed by the present invention, shall fall within the scope of protection of the present invention.
Claims (9)
1. A rice safety risk assessment method based on multi-machine learning algorithm fusion is characterized by comprising the following steps:
step one: acquiring rice hazard detection data and preprocessing the rice hazard detection data;
the preprocessing comprises noise filtering, data integration and normalization processing;
the preprocessed hazard detection data comprise standardized detection values of all hazards;
step two: constructing a rice safety risk assessment index system;
obtaining the evaluation result of the expert on the rice hazard indexes, and then executing: (1) Firstly, calculating the evaluation index weight of each expert based on an Analytic Hierarchy Process (AHP), wherein the evaluation index weight refers to the evaluation weight of each rice hazard index; (2) dividing the expert categories based on a spectral clustering method SC; (3) Calculating inter-expert-category weights and intra-expert-category weights; wherein, the more the number of experts in the category is, the smaller the consistency difference is, the larger the weight of the expert category is; (4) finally determining the comprehensive weight of each hazard index;
for the $j$-th index, the evaluation weight of the $i$-th expert on the $j$-th index obtained in step (1) is $w_{ij}$; the evaluation results of the $m$ experts are grouped into $H$ classes by step (2), wherein the $i$-th expert is assigned to class $h_i$, $h_i \in \{h_1, h_2, \dots, h_H\}$; the weight of class $h_i$ is obtained from step (3); the weight of the evaluation result of expert $i$ within class $h_i$ is obtained from step (4); weighting yields the weight $s_{ij}$ of the $i$-th expert evaluation result on the $j$-th index as follows:
the comprehensive weight $s_j$ of the $j$-th index after the group decision is obtained as follows:
weighting the hazard detection data preprocessed in the step one with the comprehensive weight to obtain a rice hazard risk value Y;
judging the quality safety condition of the rice according to the rice hazard risk value Y;
determining the influence of the hazards on the rice quality safety according to the detection data of the hazards and the weighted value of the corresponding comprehensive weight;
step three: constructing a rice hazard risk assessment model by adopting multi-machine learning algorithm fusion;
the rice hazard risk assessment model selects two machine learning algorithms of XGboost and LightGBM to form a base learner, and selects a long-short term memory network LSTM as a meta-learner; inputting the preprocessed hazard detection data into a rice hazard risk assessment model, inputting the output of two machine learning algorithms in the base learner and the preprocessed hazard detection data into the meta-learner, and finally outputting a rice hazard risk value Y by the model.
2. The method of claim 1, wherein in the first step, the noise filtering is to delete data that is determined to be inconsistent with the detection result; the data integration and normalization processing means that the unified detection data format is a floating point type, and the hazard detection result is standardized and unified by utilizing a trapezoid membership function.
3. The method according to claim 2, wherein in the first step, the hazard detection result after the unified data format is standardized by using the following function;
4. the method as claimed in claim 1, wherein in the second step, when the evaluation weight of the index is calculated based on the AHP, a judgment matrix for the rice hazard index is constructed according to the expert scoring result, the rows and columns of the judgment matrix correspond to the rice hazard index, and the matrix elements represent the relative influence of the two hazard indexes.
5. The method according to claim 1, wherein the step two, when classifying the expert categories based on the SC algorithm, comprises: firstly, calculating a compatibility matrix based on the evaluation weight of experts on rice hazardous material indexes, wherein elements in the matrix represent the compatibility of two experts, and the compatibility is obtained by calculating cosine similarity in the indexes; secondly, inputting the compatibility matrix into an SC algorithm, evaluating the clustering effect by using the CH index, and selecting the classification result with the best clustering effect.
6. The method according to claim 1, wherein in the second step, the method for calculating the weight between expert categories comprises:
step 3.1, constructing consistency weight difference values among expert categories;
let the evaluation index weight of the $i$-th expert be $W_i$ and the expert's category be $h_i$, wherein class $h_i$ contains several expert evaluation results; the consistency weight difference value $D_i$ between the evaluation index weight of the $i$-th expert and those of the other experts is calculated as follows:
wherein $W_j$ is the evaluation index weight of the $j$-th expert, $W_i = \{w_{i1}, w_{i2}, \dots, w_{ik}\}$, and $k$ is the number of hazard indexes;
then the consistency weight difference value between the experts of class $h_i$ and the other expert classes is calculated as follows:
step 3.2, constructing weight constraint conditions among expert classes as follows:
step 3.3, calculating the weight among the expert categories as follows:
7. the method according to claim 1, wherein in the second step, the weight in the expert category is calculated as follows:
step 4.1, determining an index reasonable interval;
the interval length of the values of index $j$ is $r_j$; let the consistency check criterion be $\delta = r_j / 2$; if the $\delta$-neighborhood of $w_{ij}$ contains no other weight value of index $j$, then $w_{ij}$ is a singular point;
all weight values of index $j$ are traversed, all singular points are deleted, and the reasonable interval of the $j$-th index is determined;
step 4.2, constructing a weight optimization model in the expert category;
let expert category $h_i$ contain the evaluation index weights of several experts; the objective function Obj of the model minimizes the sum of the deviations between the weighted value within the expert category and $w_{ij}$, as follows:
wherein $t_i$ represents the weight of the $i$-th expert evaluation result within category $h_i$.
8. The method according to claim 1, wherein in the third step, a Bayesian optimization algorithm BOA is selected to tune the XGBoost and LightGBM model parameters; a grey wolf optimization algorithm GWO is selected to automatically optimize the initial weights, the thresholds, and the number of hidden-layer neurons of the LSTM; finally, a fusion model BXGB-BLGB-GLSTM is obtained and used as the rice hazard risk assessment model.
9. The method according to claim 1 or 8, wherein in the third step, the rice hazard risk assessment model is trained, comprising:
(1) Dividing collected hazard detection data into a training set and a testing set according to the proportion of 3; to pair
(2) Dividing the training set into K sub-training sets with equal sizes, performing K-fold cross validation on the training set, and completing K times of training on the rice hazard risk assessment model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210306564.2A CN114764682B (en) | 2022-03-25 | 2022-03-25 | Rice safety risk assessment method based on multi-machine learning algorithm fusion |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210306564.2A CN114764682B (en) | 2022-03-25 | 2022-03-25 | Rice safety risk assessment method based on multi-machine learning algorithm fusion |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114764682A CN114764682A (en) | 2022-07-19 |
CN114764682B (en) | 2023-04-07 |
Family
ID=82364952
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210306564.2A Active CN114764682B (en) | 2022-03-25 | 2022-03-25 | Rice safety risk assessment method based on multi-machine learning algorithm fusion |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114764682B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115758888B (en) * | 2022-11-17 | 2024-04-23 | 厦门智康力奇数字科技有限公司 | Agricultural product security risk assessment method based on multi-machine learning algorithm fusion |
CN116739617A (en) * | 2023-06-08 | 2023-09-12 | 中国标准化研究院 | Food related product risk management system and method based on data analysis |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2014092230A1 (en) * | 2012-12-13 | 2014-06-19 | 대한민국 (식품의약품안전청장) | System and method for inspecting imported food-based harm prediction |
CN111461576A (en) * | 2020-04-27 | 2020-07-28 | 宁波市食品检验检测研究院 | Fuzzy comprehensive evaluation method for safety risk of chemical hazards in food |
CN111582718A (en) * | 2020-05-08 | 2020-08-25 | 国网安徽省电力有限公司电力科学研究院 | Cable channel fire risk assessment method and device based on network analytic hierarchy process |
Non-Patent Citations (1)
Title |
---|
程加迁等.蔬菜水果重金属膳食暴露评估中风险权重的确定方法.《食品科学》.2018,全文. * |
Also Published As
Publication number | Publication date |
---|---|
CN114764682A (en) | 2022-07-19 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |