CN105046362A - Real-time prediction method of food safety on the basis of association rule mining - Google Patents
- Publication number
- CN105046362A (application CN201510440249.9A)
- Authority
- CN
- China
- Prior art keywords
- food
- risk
- factors
- risk value
- safety
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
Technical field
The invention relates to the field of food safety, and in particular to a real-time food safety prediction method based on association rule mining.
Background art
Food safety traceability systems were first set up by the European Union in 1997 in response to the "mad cow disease" problem and were gradually refined afterwards. Such a system emphasizes unique product identification and process tracking: quality control methods such as ISO 9001 are applied to track and trace a product through production, transportation, storage, and sale. Once a food safety problem occurs, the destination of the food can be traced effectively and nonconforming products can be recalled promptly, keeping losses to a minimum.
Existing food safety traceability systems have a single function. Food traceability applications can only collect data about the food and its raw materials and support simple, direct tracing and statistics. For example, if a piece of pork sold on the market is found to be microbially contaminated or to carry a pathogen, an existing traceability system can find out where that piece of pork came from, but it cannot explain the cause of the contamination or the pathogen. Nor can it predict whether other pork from the same batch is affected, or whether other foods processed from pork also have safety problems. Microbial contamination of pork may occur at the farm, in the slaughterhouse, during transportation, or at the point of sale, and an incident at a different stage affects other foods to a different degree. Existing systems cannot associate the problem food with other related foods, let alone predict how safe those other foods are.
Association rule mining is an important direction in data mining, first proposed by R. Agrawal et al. It was originally intended to discover relationships between data attributes in transaction databases, and a typical application is market basket analysis. Association rule mining can extract, from large volumes of data, valuable knowledge that describes the relationships between attributes in the data.
Because food safety inspection data are related to one another in time sequence, causally, and in other ways, a data mining method based on association rule mining is adopted. Such a method can find the other foods that are related to a problem food and that satisfy the minimum support and minimum confidence.
Summary of the invention
To solve the problem that prior-art food safety traceability systems cannot predict the safety of related foods from the safety of a problem food, the invention provides a real-time food safety prediction method based on association rule mining. The method monitors, analyzes, and evaluates the food safety information in the system database in real time, continuously updates the degree of harm of each risk factor, then predicts the risk coefficient of a food in real time according to an index system for measuring food risk, and displays the result intuitively as graphs and numbers.
The technical solution adopted by the invention to solve the above technical problem is a real-time food safety prediction method based on association rule mining, comprising the following steps:
1) Collect safety information on the production and distribution of various foods and enter it into the system database;
2) Find the foods associated with the problem food through association rule mining;
3) Calculate the risk value of each related food;
4) Judge from the calculated risk value whether the food is safe, and enter the result into the system database.
In step 2), finding the foods associated with the problem food through association rule mining comprises three sub-steps: first, find all frequent itemsets that satisfy the minimum support; second, generate association rules from the frequent itemsets; third, use the association rules to find the other foods related to the problem food that satisfy the minimum support and minimum confidence. The specific operations are as follows:
The breadth-first algorithm Apriori searches the system database level by level, that is, the k-itemsets are used to explore the (k+1)-itemsets. The set of frequent 1-itemsets is found first and denoted $L_1$; $L_1$ is used to find the set of frequent 2-itemsets $L_2$, $L_2$ is used to find $L_3$, and so on, until no frequent k-itemset can be found:
Let $I = \{i_1, i_2, \ldots, i_m\}$ be a set of m distinct data items, whose elements are called items; a set of items is called an itemset;
Given a transaction database $D = \{T_1, T_2, \ldots, T_n\}$, each transaction T is a subset of the itemset I, i.e. $T \subseteq I$;
$|D|$ is the total number of transactions in D; X and Y are items or itemsets in T, with $X \cap Y = \emptyset$;
If a transaction T contains both X and Y, the following association rule is obtained:
$X \Rightarrow Y$ (1)
where the proportion of transactions T in the transaction database D that satisfy the rule is its support (Support), and the proportion of transactions containing X that also contain Y is its confidence (Confidence). They are calculated as follows:
$\mathrm{Support}(X \Rightarrow Y) = \dfrac{|\{T \in D : X \cup Y \subseteq T\}|}{|D|}$ (2)
$\mathrm{Confidence}(X \Rightarrow Y) = \dfrac{|\{T \in D : X \cup Y \subseteq T\}|}{|\{T \in D : X \subseteq T\}|}$ (3);
Using formulas (2) and (3), the other foods related to the problem food that satisfy the minimum support and minimum confidence are determined.
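The support and confidence measures can be sketched in a few lines of Python. This is a minimal illustration of the standard definitions, not the patent's implementation; the function names and the sample transactions are hypothetical.

```python
from typing import FrozenSet, Iterable, List


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    items = frozenset(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(x: Iterable[str], y: Iterable[str],
               transactions: List[FrozenSet[str]]) -> float:
    """Among transactions containing X, the fraction that also contain Y."""
    support_x = support(x, transactions)
    if support_x == 0:
        return 0.0
    return support(frozenset(x) | frozenset(y), transactions) / support_x


# Hypothetical transactions, invented for illustration only.
transactions = [
    frozenset({"pork", "ham sausage"}),
    frozenset({"pork", "sauced trotters"}),
    frozenset({"pork", "ham sausage", "salt"}),
    frozenset({"chicken", "salt"}),
]

rule_support = support({"pork", "ham sausage"}, transactions)
rule_confidence = confidence({"pork"}, {"ham sausage"}, transactions)
```

A rule is kept as a strong rule only when both values reach the chosen minimum thresholds.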
In step 3), the risk value of a related food is calculated as follows:
First, the risk factors that affect food safety are divided into internal factors, external factors, and additional factors, and their weights $w_1$, $w_2$, and $w_3$ are calculated, with $w_1 + w_2 + w_3 = 1$;
where the weight $w_1$ of the internal factors is
$w_1 = (1 - w_3)\,\dfrac{N_1}{N}$ (4)
and the weight $w_2$ of the external factors is
$w_2 = (1 - w_3)\,\dfrac{N_2}{N}$ (5)
Here $N_1$ and $N_2$ are the numbers of safety incidents in the historical records caused by internal and external factors respectively, $N = N_1 + N_2$, and $w_3$ is a constant set from historical experience.
Second, the risk coefficients $R_1$, $R_2$, and $R_3$ of the internal, external, and additional factors are calculated;
where the risk coefficient of the internal factors is
$R_1 = \sum_{i} \alpha_i r_i$ (6)
in which $i$ indexes the ingredients, $r_i$ is the risk value of the $i$-th ingredient, $\alpha_i$ is the weight of $r_i$, and $\sum_i \alpha_i = 1$;
The risk coefficient of the external factors is
$R_2 = \sum_{j=1}^{4} \lambda_j e_j$ (7)
where $e_1$, $e_2$, $e_3$, and $e_4$ are the risk values of the production, transportation, storage, and sales environments of the food, and $\lambda_j$ are their weights with $\sum_j \lambda_j = 1$. The risk values and weights of the external factors are calculated in the same way as those of the food ingredients, by a weighted sum. In the invention, the lowest-level risk values, that is, the risk values of the harmful substances in the food risk index system, are assessed by experts in the food field according to the degree of harm of each substance; they include the risk values represented by enterprise reputation and consumer feedback. The risk value at every other level is calculated from the risk values of the level below it and the weights of the risk factors, and the formulas for the weights at each level are given in the embodiment.
The risk coefficient of the additional factors is
$R_3 = \gamma_1 c_1 + \gamma_2 c_2$ (8)
where $c_1$ is the reputation risk value of the food production enterprise, $c_2$ is the risk value reflected by consumer feedback, $\gamma_1 + \gamma_2 = 1$, and the values of $c_1$, $c_2$, $\gamma_1$, and $\gamma_2$ are set by experts in the food field after comprehensive evaluation;
Finally, the risk value of the food is calculated by the formula $R = w_1 R_1 + w_2 R_2 + w_3 R_3$, where $R_1$, $R_2$, and $R_3$ are the risk coefficients of the internal, external, and additional factors and $w_1 + w_2 + w_3 = 1$.
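As a worked illustration of the final weighted sum, assume (hypothetically) factor weights of 0.6, 0.2, and 0.2 for the internal, external, and additional factors and risk coefficients of 3.5, 5.0, and 2.0. The symbols $w_k$ and $R_k$ are labels chosen here for the weights and risk coefficients, and the numbers are invented for illustration, not taken from the patent.

```latex
\begin{aligned}
R &= w_1 R_1 + w_2 R_2 + w_3 R_3 \\
  &= 0.6 \times 3.5 + 0.2 \times 5.0 + 0.2 \times 2.0 \\
  &= 2.1 + 1.0 + 0.4 = 3.5
\end{aligned}
```

Because the weights sum to 1, the result always lies between the smallest and largest of the three risk coefficients.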
The idea of the invention is as follows. When a food has a safety problem, a mining method based on association rules is first used to find the other foods related to the problem food and their degree of correlation. The weights and risk coefficients of the risk factors affecting these related foods are then measured according to the food risk index system, and finally the risk value of each related food is calculated from those weights and risk coefficients. If any risk indicator of a food is found to be outside its normal range when measured, the food is classified directly as high risk. Throughout the process, the weights and risk coefficients of the risk factors are re-measured before use and the latest values are written back to the database.
Because many factors affect food safety, there are also many food risk indicators, and most of them can be quantified as a specific value. Some indicators are harmful substances in the food ingredients; some are harmful substances in the environments the food contacts during production, transportation, storage, and sale. In addition, consumer feedback and the reputation of the producer indirectly reflect how safe the food is. Here, all factors that affect food safety are collectively called harmful substances.
Beneficial effects: the invention provides a real-time food safety prediction method based on association rule mining. It can find other foods related to the safety of a problem food. By counting, in the historical database, how often each risk factor caused a safety problem in a food, it automatically adjusts the weights of these risk factors and so calculates the combined influence of all risk factors on the food, that is, its degree of safety. This gives consumers a food safety reference and gives decision makers a basis for decisions. The invention can also monitor, analyze, and evaluate the food safety information in the system database in real time, continuously update the degree of harm of each risk factor, and predict the risk coefficient of a food in real time according to the index system for measuring food risk. The result is displayed as graphs and numbers, so that when one food has a problem it is immediately visible whether related foods carry a safety hazard.
Brief description of the drawings
Fig. 1 is a diagram of the food risk index system in an embodiment of the invention;
Fig. 2 is the calculation flow of the food risk value in the embodiment;
Fig. 3 is the food risk prediction curve in the embodiment.
Detailed description of the embodiments
A real-time food safety prediction method based on association rule mining comprises the following steps:
1) Collect safety information on the production and distribution of various foods and enter it into the system database;
2) Find the foods associated with the problem food through association rule mining;
This comprises three sub-steps: first, find all frequent itemsets that satisfy the minimum support; second, generate association rules from the frequent itemsets; third, use the association rules to find the other foods related to the problem food that satisfy the minimum support and minimum confidence. The specific operations are as follows:
The breadth-first algorithm Apriori searches the system database level by level, that is, the k-itemsets are used to explore the (k+1)-itemsets. The set of frequent 1-itemsets is found first and denoted $L_1$; $L_1$ is used to find the set of frequent 2-itemsets $L_2$, $L_2$ is used to find $L_3$, and so on, until no frequent k-itemset can be found:
Let $I = \{i_1, i_2, \ldots, i_m\}$ be a set of m distinct data items, whose elements are called items; a set of items is called an itemset;
Given a transaction database $D = \{T_1, T_2, \ldots, T_n\}$, each transaction T is a subset of the itemset I, i.e. $T \subseteq I$;
$|D|$ is the total number of transactions in D; X and Y are items or itemsets in T, with $X \cap Y = \emptyset$;
If a transaction T contains both X and Y, the following association rule is obtained:
$X \Rightarrow Y$ (1)
where the proportion of transactions T in the transaction database D that satisfy the rule is its support (Support), and the proportion of transactions containing X that also contain Y is its confidence (Confidence). They are calculated as follows:
$\mathrm{Support}(X \Rightarrow Y) = \dfrac{|\{T \in D : X \cup Y \subseteq T\}|}{|D|}$ (2)
$\mathrm{Confidence}(X \Rightarrow Y) = \dfrac{|\{T \in D : X \cup Y \subseteq T\}|}{|\{T \in D : X \subseteq T\}|}$ (3);
Using formulas (2) and (3), the other foods related to the problem food that satisfy the minimum support and minimum confidence are determined;
3) Calculate the risk value of each related food, as follows:
First, the risk factors that affect food safety are divided into internal factors, external factors, and additional factors, and their weights $w_1$, $w_2$, and $w_3$ are calculated, with $w_1 + w_2 + w_3 = 1$;
where the weight $w_1$ of the internal factors is
$w_1 = (1 - w_3)\,\dfrac{N_1}{N}$ (4)
and the weight $w_2$ of the external factors is
$w_2 = (1 - w_3)\,\dfrac{N_2}{N}$ (5)
Here $N_1$ and $N_2$ are the numbers of safety incidents in the historical records caused by internal and external factors respectively, $N = N_1 + N_2$, and $w_3$ is a constant set from historical experience.
Second, the risk coefficients $R_1$, $R_2$, and $R_3$ of the internal, external, and additional factors are calculated;
where the risk coefficient of the internal factors is
$R_1 = \sum_{i} \alpha_i r_i$ (6)
in which $i$ indexes the ingredients, $r_i$ is the risk value of the $i$-th ingredient, $\alpha_i$ is the weight of $r_i$, and $\sum_i \alpha_i = 1$;
The risk coefficient of the external factors is
$R_2 = \sum_{j=1}^{4} \lambda_j e_j$ (7)
where $e_1$, $e_2$, $e_3$, and $e_4$ are the risk values of the production, transportation, storage, and sales environments of the food, and $\lambda_j$ are their weights with $\sum_j \lambda_j = 1$. The risk values and weights of the external factors are calculated in the same way as those of the internal factors, by a weighted sum;
The risk coefficient of the additional factors is
$R_3 = \gamma_1 c_1 + \gamma_2 c_2$ (8)
where $c_1$ is the reputation risk value of the food production enterprise, $c_2$ is the risk value reflected by consumer feedback, $\gamma_1 + \gamma_2 = 1$, and the values of $c_1$, $c_2$, $\gamma_1$, and $\gamma_2$ are set by experts in the food field after comprehensive evaluation;
Finally, the risk value of the food is calculated by the formula $R = w_1 R_1 + w_2 R_2 + w_3 R_3$, where $R_1$, $R_2$, and $R_3$ are the risk coefficients of the internal, external, and additional factors and $w_1 + w_2 + w_3 = 1$;
4) Judge from the calculated risk value whether the food is safe, and enter the result into the system database.
The above is the basic embodiment of the invention. The invention is further described below with reference to a specific example.
When a food has a safety problem, a mining method based on association rules is first used to find the other foods related to the problem food and their degree of correlation. The weights and risk coefficients of the risk factors affecting these related foods are then measured according to the food risk index system, and finally the risk value of each related food is calculated from those weights and risk coefficients. If any risk indicator of a food is found to be outside its normal range when measured, the food is classified directly as high risk. Throughout the process, the weights and risk coefficients of the risk factors are re-measured before use and the latest values are written back to the database.
1 Association rule mining
There are many mining algorithms for association rule analysis. Because the data for food inspection items are unevenly distributed, the breadth-first algorithm Apriori is chosen. Its basic idea is that every non-empty subset of a frequent itemset must also be frequent. Apriori uses an iterative method called level-wise search, in which the k-itemsets are used to explore the (k+1)-itemsets. First, the set of frequent 1-itemsets is found and denoted $L_1$. $L_1$ is used to find the set of frequent 2-itemsets $L_2$, $L_2$ is used to find $L_3$, and so on, until no frequent k-itemset can be found. Finding each $L_k$ requires one scan of the database.
Let $I = \{i_1, i_2, \ldots, i_m\}$ be a set of m distinct data items, whose elements are called items; a set of items is called an itemset. Given a transaction database $D = \{T_1, T_2, \ldots, T_n\}$, each transaction T is a subset of the itemset I, i.e. $T \subseteq I$; $|D|$ is the total number of transactions in D. X and Y are items or itemsets in T, with $X \cap Y = \emptyset$. If a transaction T contains both X and Y, the following association rule is obtained:
$X \Rightarrow Y$ (1)
where the proportion of transactions T in the transaction database D that satisfy the rule is its support (Support), and the proportion of transactions containing X that also contain Y is its confidence (Confidence). They are calculated as follows:
$\mathrm{Support}(X \Rightarrow Y) = \dfrac{|\{T \in D : X \cup Y \subseteq T\}|}{|D|}$ (2)
$\mathrm{Confidence}(X \Rightarrow Y) = \dfrac{|\{T \in D : X \cup Y \subseteq T\}|}{|\{T \in D : X \subseteq T\}|}$ (3)
The goal of the mining is to select strong association rules, those whose support and confidence both exceed their thresholds. Mining proceeds in two steps: first, find all frequent itemsets that satisfy the minimum support; second, generate association rules from the frequent itemsets. Because the first step scans the transaction database many times, its time and space cost is the main constraint on mining efficiency.
Association rule mining is used to find other foods that stand in a given relationship to the problem food. For example, if pathogens are found in pork on the market, association rules can identify foods made from pork, such as ham sausage and sauced pig trotters, together with their degree of correlation. If bagged sauced chicken wings sold in a supermarket are found to be contaminated with pathogens, the related sauced chicken legs, marinated eggs, and sauced duck necks may also be affected, because they share many ingredients, such as food flavoring, salt, pickled peppers, and food additives. Each food is treated as a transaction and its ingredients as the items of that transaction; mining is carried out across pairs of transactions to compute their degree of correlation. In short, the association rule method can find the other foods related to the problem food that satisfy the minimum support and minimum confidence.
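The level-wise search can be sketched as follows, treating each food as a transaction and its ingredients as items, as the text describes. This is a compact textbook-style Apriori under stated assumptions, not the patent's code: the function name `apriori`, the ingredient lists, and the 0.75 threshold are all hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List


def apriori(transactions: List[FrozenSet[str]],
            min_support: float) -> Dict[FrozenSet[str], float]:
    """Return every frequent itemset with its support, found level by level."""
    n = len(transactions)
    items = {item for t in transactions for item in t}
    candidates = {frozenset([item]) for item in items}  # candidate 1-itemsets
    frequent: Dict[FrozenSet[str], float] = {}
    k = 1
    while candidates:
        # One database scan per level: count each candidate's occurrences.
        level = {}
        for c in candidates:
            s = sum(1 for t in transactions if c <= t) / n
            if s >= min_support:
                level[c] = s
        if not level:
            break
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        joined = {a | b for a, b in combinations(list(level), 2)
                  if len(a | b) == k + 1}
        # Prune step: every k-subset of a candidate must itself be frequent.
        candidates = {c for c in joined
                      if all(frozenset(s) in level for s in combinations(c, k))}
        k += 1
    return frequent


# Hypothetical ingredient lists, one transaction per food.
foods = {
    "sauced chicken wings": frozenset({"flavoring", "salt", "pickled pepper", "additive"}),
    "sauced chicken legs": frozenset({"flavoring", "salt", "pickled pepper"}),
    "marinated eggs": frozenset({"flavoring", "salt"}),
    "sauced duck necks": frozenset({"salt", "pickled pepper", "additive"}),
}
frequent_sets = apriori(list(foods.values()), min_support=0.75)
```

The prune step is where the Apriori property is applied: a candidate is discarded without a database scan as soon as one of its subsets is known to be infrequent.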
2 Setting the risk value
In the invention the food risk value is confined to a fixed numeric range, which is divided into three sub-ranges designated low risk, medium risk, and high risk [the numeric bounds are not reproduced in this text]. The range corresponding to each risk level is set by experts in the food field and can be adjusted. In the following cases the food is judged directly to be high risk with a risk value of 10, and no weighted calculation according to the food risk index system is needed:
① Some harmful substances in food are restricted and some are banned. A restricted substance is harmless to humans as long as it stays within the prescribed range; a banned substance must not be present at all. If a banned harmful substance is found in a food sampling inspection, the risk value of the food is set directly to 10;
② If a restricted harmful substance in a food exceeds a given value, the risk value of the food is set directly to 10;
③ If, within the past month, the ratio of the number of complaints about a food to the total number of complaints about foods in its category exceeds a given threshold [value not reproduced in this text], the risk value of the food is set directly to 10;
④ If the failure rate of a food for harmful substances exceeds a given value, the risk value of the food is set directly to 10, where failure rate = (number of failed harmful-substance tests / total number of tests).
These indicators are not fixed. Domain experts can adjust them to the actual situation and can add further indicators. Likewise, special cases that directly classify a food as medium risk can also be defined.
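The four override rules can be sketched as a single check that runs before any weighted calculation. This is only an illustration: the field names and all limit values are hypothetical stand-ins for the expert-set thresholds, and only the value 10 comes from the text.

```python
from dataclasses import dataclass

HIGH_RISK_VALUE = 10.0  # value assigned directly in the override cases


@dataclass
class Inspection:
    """Hypothetical record of one food's latest indicators and expert-set limits."""
    banned_substance_found: bool
    restricted_level: float
    restricted_limit: float
    complaint_ratio: float          # complaints for this food / complaints in its category
    complaint_ratio_limit: float
    failed_tests: int
    total_tests: int
    failure_rate_limit: float


def is_direct_high_risk(x: Inspection) -> bool:
    """True when any of the four override rules fires."""
    failure_rate = x.failed_tests / x.total_tests if x.total_tests else 0.0
    return (
        x.banned_substance_found                          # rule 1
        or x.restricted_level > x.restricted_limit        # rule 2
        or x.complaint_ratio > x.complaint_ratio_limit    # rule 3
        or failure_rate > x.failure_rate_limit            # rule 4
    )


def risk_value(x: Inspection, weighted_value: float) -> float:
    """Use the override value if a rule fires, otherwise the weighted result."""
    return HIGH_RISK_VALUE if is_direct_high_risk(x) else weighted_value


normal = Inspection(False, 0.1, 0.5, 0.01, 0.2, 1, 100, 0.05)
banned = Inspection(True, 0.1, 0.5, 0.01, 0.2, 1, 100, 0.05)
too_many_failures = Inspection(False, 0.1, 0.5, 0.01, 0.2, 10, 100, 0.05)
```

Keeping the limits as data rather than constants reflects the statement that experts can adjust or extend the indicators.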
3 Risk value assessment
The weights of the risk factors affecting food safety are obtained from statistics on the historical records of food safety incidents. A weight expresses how strongly a risk factor affects a food: the larger the weight, the more pronounced the factor's effect on the safety of the food. Take bagged sauced chicken legs sold in a supermarket as an example; many risk factors affect their safety. According to the food risk index system, these are divided into internal, external, and additional factors. Internal factors include ingredients such as food flavoring, salt, pickled peppers, and food additives. External factors include the pollution index of the environment in which the live chickens were raised, the colony count and temperature of the workshop where the sauced chicken legs are produced, and the air pollution index and ambient temperature to which the product is exposed during transportation, storage, and sale. Additional factors include the reputation of the producer and the number of consumer complaints. The risk value of the sauced chicken legs is $R = w_1 R_1 + w_2 R_2 + w_3 R_3$, where $R_1$, $R_2$, and $R_3$ are the risk coefficients of the internal, external, and additional factors and $w_1 + w_2 + w_3 = 1$. The weight of $R_1$ is $w_1 = (1 - w_3) N_1 / N$, the weight of $R_2$ is $w_2 = (1 - w_3) N_2 / N$, and the weight of $R_3$ is $w_3$ (a constant whose value is assessed from historical experience). $N_1$ and $N_2$ are the numbers of sauced chicken leg safety incidents in the historical records caused by internal and external factors respectively, and $N$ is their sum.
The internal factors themselves have many sub-factors. In the sauced chicken leg example these are food flavoring $p_1$, salt $p_2$, pickled peppers $p_3$, and food additives $p_4$, with weights $\alpha_1$, $\alpha_2$, $\alpha_3$, and $\alpha_4$ respectively, where $\alpha_1 + \alpha_2 + \alpha_3 + \alpha_4 = 1$. The external factors likewise have many sub-factors, whose weights are expressed in the same way as those of the internal sub-factors.
In summary, the general formula for the weight of the internal factors is:
$w_1 = (1 - w_3)\,\dfrac{N_1}{N}$ (4)
The general formula for the weight of the external factors is:
$w_2 = (1 - w_3)\,\dfrac{N_2}{N}$ (5)
The weight of the additional factors is $w_3$, a constant greater than 0 and less than 1. Its value is assessed by domain experts from historical experience.
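One way to derive the three factor weights from incident counts is sketched below. It rests on an assumption, since formulas (4) and (5) are not legible in this text: the additional factor keeps its expert-set constant share, and the remainder is split between internal and external factors in proportion to how many historical incidents each caused. The function name and the sample counts are hypothetical.

```python
from typing import Tuple


def factor_weights(n_internal: int, n_external: int,
                   w_additional: float) -> Tuple[float, float, float]:
    """Return (internal, external, additional) weights that sum to 1.

    Assumed rule: the share left after the additional factor's constant
    weight is divided in proportion to the historical incident counts.
    """
    if not 0.0 < w_additional < 1.0:
        raise ValueError("additional-factor weight must lie strictly between 0 and 1")
    total = n_internal + n_external
    if total <= 0:
        raise ValueError("at least one historical incident is required")
    remaining = 1.0 - w_additional
    return (remaining * n_internal / total,
            remaining * n_external / total,
            w_additional)


# Hypothetical history: 6 incidents from ingredients, 2 from the environment.
w_internal, w_external, w_extra = factor_weights(6, 2, 0.2)
```

Recomputing the weights whenever a new incident is recorded matches the statement that weights are re-measured before each use.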
食品配料的风险值,其中表示该食品的第种配料;表示配料中第种有害物的风险值,该值由领域专家设定;为第种有害物的权重,。食品由配料影响的风险值,即内部因子的风险值为: Value at Risk of Food Ingredients ,in Indicates that the food's ingredients; Indicates the No. The risk value of a harmful substance, which is set by domain experts; for the first the weight of a harmful substance, . The risk value of food affected by ingredients, that is, the risk value of internal factors:
(6) (6)
其中,表示第种配料,表示第种配料的风险值,为的权重,且。 in, Indicates the first ingredients, Indicates the first the risk value of an ingredient, for the weight of , and .
外部因子的风险值为: The risk value of the external factor is:
(7) (7)
其中,,分别表示食品的生产、运输、存储、销售环境的风险值,外部因子风险值和权重的计算方法与食品配料风险值和权重的计算方法类似,它们的计算方法与食品配料的风险值计算方法类似,采用加权和的方法。 in, , Represents the risk value of food production, transportation, storage, and sales environments, the calculation method of risk value and weight of external factors is similar to the calculation method of risk value and weight of food ingredients, and their calculation methods are similar to the calculation method of risk value of food ingredients , using a weighted sum method.
附加因子的风险值为: The risk value for the additional factor is:
(8) (8)
其中,表示食品生产企业的信誉风险值,表示消费者反馈反映出的风险值,,、、、的值由食品领域专家综合评估设定。 in, Indicates the credit risk value of food production enterprises, Indicates the risk value reflected by consumer feedback, , , , , The value of is set by the comprehensive evaluation of experts in the food field.
The risk value of a given food is:
(9)
where the three coefficients are the risk coefficients of the internal factor, the external factor, and the additional factor, respectively.
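The way the three factor risk values combine into a single food risk value can be sketched in Python as follows. This is only an illustration under stated assumptions: each step is treated as a weighted sum, the weights and risk coefficients are assumed to sum to 1, and the helper name `combine` and all numeric values are hypothetical rather than taken from the patent.

```python
import math


def combine(pairs):
    """Weighted sum over (risk_value, weight) pairs; weights assumed to sum to 1."""
    total_weight = sum(weight for _, weight in pairs)
    if not math.isclose(total_weight, 1.0, abs_tol=1e-9):
        raise ValueError(f"weights sum to {total_weight}, expected 1")
    return sum(value * weight for value, weight in pairs)


# Hypothetical expert-assigned risk values and weights.
external_risk = combine([
    (0.30, 0.4),  # production environment
    (0.20, 0.2),  # transportation environment
    (0.10, 0.2),  # storage environment
    (0.40, 0.2),  # sales environment
])

additional_risk = combine([
    (0.50, 0.6),  # reputation risk of the food manufacturer
    (0.25, 0.4),  # risk reflected by consumer feedback
])

internal_risk = 0.35  # assumed to come from the ingredient calculation

# Hypothetical risk coefficients for the internal, external and additional factors.
food_risk = combine([
    (internal_risk, 0.5),
    (external_risk, 0.3),
    (additional_risk, 0.2),
])
```

With these made-up inputs the external factor evaluates to 0.26, the additional factor to 0.40, and the food as a whole to about 0.333.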
4 Real-time prediction of food risk value
Because the risk value of a food is affected by many factors, it changes constantly. The system predicts food safety in real time: at a configured frequency it measures the risk value of every food and records the results in a database for later queries. An early-warning threshold can be set for each food, and the system raises an alarm automatically when the food's risk value exceeds that threshold. The food risk value prediction chart is shown in Figure 2. The prediction results can also be displayed graphically and numerically, so that the results and the food safety trend can be seen at a glance. By choosing viewing conditions, the user can inspect risk values and safety trends over the past week, month, or year, or for a particular category of food or a particular food. The food risk prediction curve is shown in Figure 3.
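The record, alert, and query cycle described above could look roughly like the following Python sketch. It is not the patent's implementation: the SQLite table `risk_log`, the function names, and the threshold values are all hypothetical, and the scheduler that would trigger a measurement at the configured frequency is left out.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def open_log():
    """Create an in-memory database holding one row per risk measurement."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE risk_log (food TEXT, measured_at TEXT, risk REAL)")
    return conn


def record_and_check(conn, food, risk, thresholds, now=None):
    """Store one measurement; return True if it exceeds the food's threshold."""
    now = now or datetime.now(timezone.utc)
    conn.execute(
        "INSERT INTO risk_log (food, measured_at, risk) VALUES (?, ?, ?)",
        (food, now.isoformat(timespec="seconds"), risk),
    )
    conn.commit()
    return risk > thresholds[food]


def history(conn, food, since):
    """Return (measured_at, risk) rows for one food from `since` onward."""
    rows = conn.execute(
        "SELECT measured_at, risk FROM risk_log "
        "WHERE food = ? AND measured_at >= ? ORDER BY measured_at",
        (food, since.isoformat(timespec="seconds")),
    )
    return rows.fetchall()


def last_week(conn, food, now=None):
    """Convenience query for the 'past week' viewing condition."""
    now = now or datetime.now(timezone.utc)
    return history(conn, food, now - timedelta(days=7))
```

Timestamps are stored as UTC ISO 8601 strings with second precision so that plain string comparison in the query matches chronological order. A caller would invoke `record_and_check` once per food at each measurement interval and raise the alarm when it returns `True`.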
Claims (3)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510440249.9A CN105046362A (en) | 2015-07-24 | 2015-07-24 | Real-time prediction method of food safety on the basis of association rule mining |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105046362A (en) | 2015-11-11 |
Family
ID=54452889
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510440249.9A Pending CN105046362A (en) | 2015-07-24 | 2015-07-24 | Real-time prediction method of food safety on the basis of association rule mining |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105046362A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20151111 |