CN107609582B - Multi-dimensional parameter identification method and device - Google Patents

Publication number
CN107609582B
Authority
CN
China
Prior art keywords
training sample
sample set
parameters
node
parameter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710772738.3A
Other languages
Chinese (zh)
Other versions
CN107609582A (en
Inventor
萧伟
刘雪松
凌娅
陈勇
王振中
姜晓红
毕宇安
李页瑞
包乐伟
章晨峰
王磊
陈永杰
杜定益
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Kanion Pharmaceutical Co Ltd
Zhejiang University ZJU
Original Assignee
Jiangsu Kanion Pharmaceutical Co Ltd
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Kanion Pharmaceutical Co Ltd, Zhejiang University ZJU filed Critical Jiangsu Kanion Pharmaceutical Co Ltd
Priority to CN201710772738.3A priority Critical patent/CN107609582B/en
Publication of CN107609582A publication Critical patent/CN107609582A/en
Publication of CN107609582B publication Critical patent/CN107609582B/en

Classifications

    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30Computing systems specially adapted for manufacturing

Landscapes

  • Complex Calculations (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a multi-dimensional parameter identification method and device. The identification method comprises the following steps: collecting a plurality of training samples to form a training sample set, wherein each training sample comprises a plurality of process parameters, each process parameter has corresponding attribute parameters and categories, and the attribute parameters and categories can be combined in multiple ways; acquiring a distribution transfer information value of the training sample set according to the categories in the training sample set; obtaining the information gain of each process parameter according to the distribution transfer information value; selecting the process parameter with the maximum information gain as a splitting node and establishing a decision tree; and carrying out category identification on new data according to the decision tree. The invention takes the information gain as the basis for establishing the decision tree model, which improves the accuracy of the model, identifies the data category effectively and accurately, and provides a reliable basis for intelligent parameter feedback.

Description

Multi-dimensional parameter identification method and device
Technical Field
The invention relates to the field of process knowledge systems, in particular to a multi-dimensional parameter identification method and device.
Background
In a process knowledge system (Process Knowledge System, PKS for short), intelligent feedback of parameters mainly includes real-time monitoring of parameter data, and if the parameters are out of range, warning is issued.
In the prior art, one-dimensional data is usually monitored and fed back, and for two-dimensional data or multi-dimensional data, the precondition of parameter feedback is to identify multi-dimensional parameters so as to conveniently judge and monitor the threshold value of the two-dimensional data.
In the prior art, PKS only monitors one-dimensional data; the feedback scope is narrow, and problems existing in a system are difficult to reflect accurately and comprehensively, so the early warning efficiency of the system is low and manufacturing efficiency and quality are affected.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a multi-dimensional parameter identification method and a device, and the technical scheme is as follows:
in one aspect, the present invention provides a multidimensional parameter identification method, including:
collecting a plurality of training samples to form a training sample set, wherein each training sample comprises a plurality of process parameters, each process parameter has corresponding attribute parameters and categories, and the combination of the attribute parameters and the categories is a plurality of types;
acquiring a distribution transfer information value of the training sample set according to the category in the training sample set;
according to the distribution transfer information value, obtaining the information gain of each process parameter;
selecting the process parameter with the maximum information gain as a splitting node, and establishing a decision tree;
and carrying out category identification on the new data according to the decision tree.
Further, the obtaining the distribution transfer information value of the training sample set includes: the distribution transfer information value is obtained by the following calculation formula:
$$\mathrm{info}(S) = -\sum_{i} P_i \log_2 P_i,$$
wherein $\mathrm{info}(S)$ is the distribution transfer information value of the training sample set, and $P_i$ is the probability that a training sample belongs to the i-th class.
Further, the obtaining the information gain of each process parameter includes: the information gain is obtained by the following calculation formula:
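$$\mathrm{gain}(S, A) = \mathrm{info}(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\,\mathrm{info}(S_v),$$
wherein $\mathrm{info}(S)$ is the distribution transfer information value of the training sample set, $\mathrm{info}(S_v)$ is the distribution transfer information value of the subset $S_v$ in which process parameter $A$ takes a certain attribute parameter $v$, $|S_v|/|S|$ is the probability that $A$ takes that attribute parameter, and $\mathrm{Values}(A)$ is the set of attribute parameters of $A$.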
Further, before composing the training sample set, the method further comprises: verifying the validity of each training sample, comprising: detecting whether the process parameters and the category in each training sample are complete, and if not, rejecting the training sample; and/or
after composing the training sample set, the method further comprises: verifying the validity of the training sample set, comprising: detecting whether the attribute parameters corresponding to the process parameters are complete, and if not, judging that the training sample set is invalid.
Further, the process parameter with the maximum information gain is selected as a split node, and the decision tree establishment comprises the following steps:
taking the process parameter with the maximum information gain as a root node and the corresponding attribute parameter as a first branch node;
judging whether the categories corresponding to the attribute parameters are consistent; if so, taking the category as a leaf node; otherwise, taking the process parameter with the next-largest information gain as a second branch node, and repeating the above steps until a category is obtained as a leaf node.
Further, if the root node and each branch node are the same but the leaf nodes differ, the number of occurrences of each leaf node under that root node and those branch nodes is counted in the training sample set; if the counts are equal, one of the leaf nodes is discarded at random; otherwise, the leaf node with the lower count is discarded.
Further, the process parameters in the same training sample are key process parameters obtained by screening according to key quality attributes, wherein the key quality attributes are attribute parameters selected according to working sections in a process knowledge system.
In another aspect, the present invention provides a multidimensional parameter identification apparatus, comprising:
the data acquisition module is used for acquiring a plurality of training samples to form a training sample set, each training sample comprises a plurality of process parameters, each process parameter has corresponding attribute parameters and categories, and the combination of the attribute parameters and the categories is a plurality of types;
the distribution transfer module is used for acquiring a distribution transfer information value of the training sample set according to the category in the training sample set;
the gain module is used for acquiring the information gain of each process parameter according to the distribution transfer information value;
the decision tree module is used for selecting the process parameter with the maximum information gain as a splitting node and establishing a decision tree;
and the identification module is used for carrying out category identification on the new data according to the decision tree.
Further, the apparatus further comprises:
a first verification module for verifying the validity of training samples prior to composing a training sample set, comprising: detecting whether the process parameters and the categories in each training sample are complete, and if not, rejecting the training samples; and/or
A second checking module, configured to check validity of the training sample set after composing the training sample set, includes: and detecting whether attribute parameters corresponding to the process parameters are complete, and if not, judging that the training sample set is invalid.
Further, the distribution transfer module obtains the distribution transfer information value through the following calculation formula:
$$\mathrm{info}(S) = -\sum_{i} P_i \log_2 P_i,$$
wherein $\mathrm{info}(S)$ is the distribution transfer information value of the training sample set, and $P_i$ is the probability that a training sample belongs to the i-th class;
the gain module obtains the information gain through the following calculation formula:
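$$\mathrm{gain}(S, A) = \mathrm{info}(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\,\mathrm{info}(S_v),$$
wherein $\mathrm{info}(S)$ is the distribution transfer information value of the training sample set, $\mathrm{info}(S_v)$ is the distribution transfer information value of the subset $S_v$ in which process parameter $A$ takes a certain attribute parameter $v$, $|S_v|/|S|$ is the probability that $A$ takes that attribute parameter, and $\mathrm{Values}(A)$ is the set of attribute parameters of $A$.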
The technical scheme provided by the invention has the following beneficial effects:
1) The information gain is used as a selection standard of the splitting nodes of the decision tree, so that the accuracy of the decision tree is improved;
2) And establishing a decision tree model to realize accurate classification of the multidimensional data.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a multi-dimensional parameter identification method provided by an embodiment of the present invention;
FIG. 2 is a flow chart of a method for creating a decision tree provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of a decision tree according to an embodiment of the present invention;
FIG. 4 is a flowchart of a training sample verification method provided by an embodiment of the present invention;
FIG. 5 is a block diagram of a multidimensional parameter identification apparatus according to an embodiment of the present invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, a technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, apparatus, article, or device that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or device.
Example 1
In one embodiment of the present invention, a multi-dimensional parameter identification method is provided, referring to fig. 1, the method includes the following steps:
S1, collecting a plurality of training samples to form a training sample set.
Specifically, each training sample includes a plurality of process parameters, each process parameter having a corresponding attribute parameter and category, wherein the attribute parameters and categories can be combined in multiple ways. For example, in a certain working section the production process involves three process parameters: temperature (T), pressure (P), and humidity (PH). The attribute parameters are divided into high (H), medium (N), and low (L), and the categories are divided into excellent, good, and poor. A training sample may then read: T (H), P (N), PH (N), category good. A plurality of training samples in the same or different states form the training sample set, as shown in Table 1 below.
TABLE 1
Sequence number   T   P   PH   Category
1                 H   N   H    Good
2                 L   H   N    Good
3                 N   L   L    Excellent
4                 H   N   N    Poor
5                 N   N   H    Good
6                 L   N   N    Excellent
7                 L   H   L    Excellent
8                 N   H   H    Poor
9                 H   L   N    Good
S2, acquiring a distribution transfer information value of the training sample set according to the categories in the training sample set.
Let S be a training sample set comprising samples of n classes, denoted C1, C2, …, Cn. If a given probability distribution P = (P1, P2, …, Pn) gives the probability Pi of each class Ci, the amount of information conveyed by this distribution is referred to as the information entropy of S.
The obtaining the distribution transfer information value of the training sample set includes: the distribution transfer information value is obtained by the following calculation formula:
$$\mathrm{info}(S) = -\sum_{i} P_i \log_2 P_i,$$
wherein $\mathrm{info}(S)$ is the distribution transfer information value of the training sample set, and $P_i$ is the probability that a training sample belongs to the i-th class.
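Taking Table 1 as an example, the training sample set includes 9 training samples, of which 3 are in the excellent category (sequence numbers 3, 6, and 7), 4 are in the good category, and 2 are in the poor category, so that $\mathrm{info}(S) = -\tfrac{3}{9}\log_2\tfrac{3}{9} - \tfrac{4}{9}\log_2\tfrac{4}{9} - \tfrac{2}{9}\log_2\tfrac{2}{9} \approx 1.53$.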
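As a quick numerical check of step S2, the following is a minimal Python sketch that evaluates the distribution transfer information value for the category column of Table 1. The function name `info` mirrors the notation in the text; the variable names and the English category labels are assumptions made for illustration only.

```python
import math
from collections import Counter

# Category column of Table 1 (sequence numbers 1-9), as transcribed in this document.
categories = ["Good", "Good", "Excellent", "Poor", "Good",
              "Excellent", "Excellent", "Poor", "Good"]


def info(labels):
    """Distribution transfer information value (entropy in bits) of a list of categories."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())


info_s = info(categories)
print(f"info(S) = {info_s:.4f}")
```

Because the logarithm is base 2, the result is expressed in bits; a set containing a single category would give a value of 0.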
S3, acquiring the information gain of each process parameter according to the distribution transfer information value.
In particular, the information gain is used to measure the expected reduction in information entropy.
The obtaining the information gain of each process parameter includes: the information gain is obtained by the following calculation formula:
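$$\mathrm{gain}(S, A) = \mathrm{info}(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\,\mathrm{info}(S_v),$$
wherein $\mathrm{info}(S)$ is the distribution transfer information value of the training sample set, $\mathrm{info}(S_v)$ is the distribution transfer information value of the subset $S_v$ in which process parameter $A$ takes a certain attribute parameter $v$, $|S_v|/|S|$ is the probability that $A$ takes that attribute parameter, and $\mathrm{Values}(A)$ is the set of attribute parameters of $A$.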
Taking the data in Table 1 as an example, $\mathrm{info}(T_H)$ is calculated first; $\mathrm{info}(T_N)$ and $\mathrm{info}(T_L)$ are obtained in the same way, which gives gain(S, T). gain(S, P) and gain(S, PH) can be obtained by the same method.
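Step S3 can be sketched in Python as follows, assuming the information gain is the usual entropy reduction, that is, info(S) minus the weighted sum of info over the subsets produced by each attribute parameter. The names `ROWS`, `COLUMNS`, and `gain` are hypothetical and chosen only for this illustration. The ordering of gains in the text is introduced with "if", so the values this sketch yields for the table as transcribed here need not reproduce that ordering.

```python
import math
from collections import Counter, defaultdict

# Table 1 rows as (T, P, PH, category), as transcribed in this document.
ROWS = [
    ("H", "N", "H", "Good"),
    ("L", "H", "N", "Good"),
    ("N", "L", "L", "Excellent"),
    ("H", "N", "N", "Poor"),
    ("N", "N", "H", "Good"),
    ("L", "N", "N", "Excellent"),
    ("L", "H", "L", "Excellent"),
    ("N", "H", "H", "Poor"),
    ("H", "L", "N", "Good"),
]
COLUMNS = {"T": 0, "P": 1, "PH": 2}  # column index of each process parameter


def info(labels):
    """Entropy in bits of a list of categories."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain(rows, param):
    """Information gain of splitting `rows` on process parameter `param`."""
    col = COLUMNS[param]
    subsets = defaultdict(list)
    for row in rows:
        subsets[row[col]].append(row[-1])  # group categories by attribute parameter
    weighted = sum(len(s) / len(rows) * info(s) for s in subsets.values())
    return info([row[-1] for row in rows]) - weighted


gains = {param: gain(ROWS, param) for param in COLUMNS}
for param, value in gains.items():
    print(f"gain(S, {param}) = {value:.4f}")
```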
S4, selecting the process parameter with the maximum information gain as a splitting node, and establishing a decision tree.
If the calculation gives gain (S, T) > gain (S, PH) > gain (S, P), the temperature T is used as the splitting node to build the model.
S5, carrying out category identification on the new data according to the decision tree.
In the embodiment of the invention, the information gain is taken as the basis for selecting the split nodes to establish the decision tree model, which is favorable for accurately identifying and classifying two-dimensional (multidimensional) data and provides reliable basis for intelligent feedback of parameters. The two-dimensional (multidimensional) data may be applied in the following scenarios: the process parameters in the same training sample are key process parameters obtained by screening according to key quality attributes, wherein the key quality attributes are attribute parameters selected according to working sections in a process knowledge system, and the parameter types can be identified by adopting the method in the embodiment of the invention under the condition that the same key quality attribute has two or more key process parameters.
Example 2
In one embodiment of the present invention, a method of building a decision tree is provided, see fig. 2, comprising the steps of:
S21, taking the process parameter with the maximum information gain as a root node;
S22, taking the corresponding attribute parameter as a first branch node;
S23, judging whether the categories corresponding to the attribute parameters are consistent; if so, executing S24; otherwise, executing S25;
S24, taking the category as a leaf node;
S25, taking the process parameter with the next-largest information gain as a second branch node, and repeating the above steps until a category is obtained as a leaf node.
Taking the data in Table 1 as an example, if, through calculation, gain (S, T) > gain (S, PH) > gain (S, P), the decision tree established according to steps S21-S25 is shown in FIG. 3.
According to the decision tree shown in FIG. 3, the category of new data can be judged. For example, if the new data is T (L) P (H) PH (L), then although this combination does not appear in Table 1, its category can be identified as excellent according to the decision tree shown in FIG. 3; if the new data is T (N) P (H) PH (H), the category is judged to be poor, and the result is fed back to the user in time or a warning is issued.
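Steps S21 to S25 and the identification of new data can be sketched as below. This is a minimal sketch under stated assumptions: the process parameters are split in a fixed order of decreasing information gain (here the order T, PH, P assumed in the text), FIG. 3 is not reproduced in this document so the resulting tree may differ from it in layout, and the names `build_tree` and `classify` are hypothetical.

```python
from collections import Counter

# Table 1 rows as (T, P, PH, category), as transcribed in this document.
ROWS = [
    ("H", "N", "H", "Good"), ("L", "H", "N", "Good"), ("N", "L", "L", "Excellent"),
    ("H", "N", "N", "Poor"), ("N", "N", "H", "Good"), ("L", "N", "N", "Excellent"),
    ("L", "H", "L", "Excellent"), ("N", "H", "H", "Poor"), ("H", "L", "N", "Good"),
]
COLUMNS = {"T": 0, "P": 1, "PH": 2}


def build_tree(rows, order):
    """Build a tree by splitting on the parameters in `order` (largest gain first).

    A leaf is a category string; an inner node is (parameter, {attribute: subtree}).
    """
    labels = [row[-1] for row in rows]
    if len(set(labels)) == 1 or not order:
        # S24: categories are consistent (or no parameter is left), so emit a leaf.
        return Counter(labels).most_common(1)[0][0]
    param, remaining = order[0], order[1:]
    col = COLUMNS[param]
    branches = {}
    for value in sorted({row[col] for row in rows}):
        subset = [row for row in rows if row[col] == value]
        branches[value] = build_tree(subset, remaining)  # S25: next-largest gain
    return (param, branches)


def classify(tree, sample):
    """Walk the tree for `sample`; returns None for an unseen attribute parameter."""
    while isinstance(tree, tuple):
        param, branches = tree
        tree = branches.get(sample[param])
    return tree


tree = build_tree(ROWS, ["T", "PH", "P"])
first = classify(tree, {"T": "L", "P": "H", "PH": "L"})
second = classify(tree, {"T": "N", "P": "H", "PH": "H"})
print(first, second)
```

A category of "Poor" returned by `classify` is the point at which a caller would feed back to the user or raise a warning.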
If the root node and each branch node are the same but the leaf nodes differ, the number of occurrences of each leaf node under that root node and those branch nodes is counted in the training sample set; if the counts are equal, one of the leaf nodes is discarded at random; otherwise, the leaf node with the lower count is discarded. As a specific example, suppose a training sample with sequence number 10 is added to Table 1: T (N) P (H) PH (H), category good. This clearly differs from the category of sample 8 in Table 1. In this case the numbers of samples supporting each category are counted and the category with more samples is kept; if the numbers are equal, either category is selected at random.
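The conflict rule above can be sketched as a small counting step. The sample counts used below are hypothetical and serve only to exercise both branches of the rule; the function name `resolve_conflicts` is likewise an assumption.

```python
import random
from collections import Counter, defaultdict


def resolve_conflicts(samples, rng=random):
    """Keep one category per identical (T, P, PH) path.

    The category with the higher count is kept; on a tie one is chosen at random.
    """
    counts_by_path = defaultdict(Counter)
    for *attributes, category in samples:
        counts_by_path[tuple(attributes)][category] += 1

    resolved = {}
    for path, counts in counts_by_path.items():
        top = max(counts.values())
        winners = sorted(c for c, n in counts.items() if n == top)
        resolved[path] = winners[0] if len(winners) == 1 else rng.choice(winners)
    return resolved


# Hypothetical counts: the path N/H/H is seen once as "Poor" and twice as "Good".
majority = resolve_conflicts([
    ("N", "H", "H", "Poor"),
    ("N", "H", "H", "Good"),
    ("N", "H", "H", "Good"),
])
# Hypothetical tie: one sample of each category on the same path.
tie = resolve_conflicts([
    ("N", "H", "H", "Poor"),
    ("N", "H", "H", "Good"),
])
print(majority, tie)
```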
Example 3
In one embodiment of the present invention, a training sample verification method is provided, see fig. 4, including the following procedures:
S31, detecting whether the process parameters and the category in each training sample are complete; if not, rejecting the training sample; if so, executing S32.
S32, adding the training sample to the training sample set, so that the number of samples in the training sample set increases by 1;
S33, judging whether the number of samples in the training sample set reaches the target number; if so, executing S34; otherwise, repeating S31-S32.
S34, detecting whether attribute parameters corresponding to the process parameters are complete, and if not, judging that the training sample set is invalid.
To ensure that the established decision tree model is complete, the validity of each training sample needs to be checked before the training sample set is formed, and the validity of the whole training sample set needs to be checked after it is formed; otherwise, the decision tree may be unable to yield a category leaf node in some cases.
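The verification flow S31 to S34 can be sketched as follows. Two points are assumptions made for this illustration and are not stated in the text: "complete" in S34 is read as every attribute parameter (H, N, L) occurring at least once for every process parameter, and the target sample count is passed in as `target_size`. All function and variable names are hypothetical.

```python
PROCESS_PARAMS = ("T", "P", "PH")
ATTRIBUTE_VALUES = {"H", "N", "L"}


def is_complete(sample):
    """S31: a sample is complete if every process parameter and the category are present."""
    return all(sample.get(key) for key in PROCESS_PARAMS + ("category",))


def collect(candidates, target_size):
    """S31-S33: reject incomplete samples and stop once the target count is reached."""
    accepted = []
    for sample in candidates:
        if is_complete(sample):
            accepted.append(sample)          # S32: the set grows by one sample
        if len(accepted) >= target_size:     # S33: target number reached
            break
    return accepted


def set_is_valid(samples):
    """S34 (assumed reading): each parameter shows every attribute parameter at least once."""
    return all({s[p] for s in samples} >= ATTRIBUTE_VALUES for p in PROCESS_PARAMS)


candidates = [
    {"T": "H", "P": "N", "PH": "H", "category": "Good"},
    {"T": "L", "P": "H", "PH": "N"},                       # category missing: rejected
    {"T": "N", "P": "L", "PH": "L", "category": "Excellent"},
    {"T": "L", "P": "H", "PH": "N", "category": "Good"},
]
sample_set = collect(candidates, target_size=3)
print(len(sample_set), set_is_valid(sample_set))
```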
Example 4
In an embodiment of the present invention, a multi-dimensional parameter identification apparatus is provided, referring to fig. 5, including the following modules:
the data collection module 510 is configured to collect a plurality of training samples to form a training sample set, where each training sample includes a plurality of process parameters, and each process parameter has a corresponding attribute parameter and category, where the attribute parameter and category are combined in a plurality of ways;
the distribution transfer module 520 is configured to obtain a distribution transfer information value of the training sample set according to the category in the training sample set;
a gain module 530, configured to acquire the information gain of each process parameter according to the distribution transfer information value;
the decision tree module 540 is configured to select a process parameter with the maximum information gain as a split node, and establish a decision tree;
and the identification module 550 is used for carrying out category identification on the new data according to the decision tree.
Further, the identifying means further comprises a first checking module 511 for checking the validity of the training samples before forming the training sample set, including: detecting whether the process parameters and the categories in each training sample are complete, and if not, rejecting the training samples; and/or
A second checking module 512, configured to check validity of the training sample set after composing the training sample set, includes: and detecting whether attribute parameters corresponding to the process parameters are complete, and if not, judging that the training sample set is invalid.
Wherein, the distribution transfer module 520 obtains the distribution transfer information value by the following calculation formula:
$$\mathrm{info}(S) = -\sum_{i} P_i \log_2 P_i,$$
wherein $\mathrm{info}(S)$ is the distribution transfer information value of the training sample set, and $P_i$ is the probability that a training sample belongs to the i-th class;
the gain module 530 obtains the information gain by the following calculation formula:
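$$\mathrm{gain}(S, A) = \mathrm{info}(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\,\mathrm{info}(S_v),$$
wherein $\mathrm{info}(S)$ is the distribution transfer information value of the training sample set, $\mathrm{info}(S_v)$ is the distribution transfer information value of the subset $S_v$ in which process parameter $A$ takes a certain attribute parameter $v$, $|S_v|/|S|$ is the probability that $A$ takes that attribute parameter, and $\mathrm{Values}(A)$ is the set of attribute parameters of $A$.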
It should be noted that: in the multi-dimensional parameter identification device provided in the above embodiment, when performing intelligent parameter identification, only the division of each functional module is used for illustration, in practical application, the above-mentioned function allocation can be completed by different functional modules according to needs, that is, the internal structure of the multi-dimensional parameter identification device is divided into different functional modules, so as to complete all or part of the functions described above. In addition, the embodiment of the multidimensional parameter identification apparatus provided in this embodiment and the multidimensional parameter identification method provided in the foregoing embodiments belong to the same concept, and detailed implementation processes of the multidimensional parameter identification apparatus are referred to as method embodiments, which are not described herein.
The foregoing description of the preferred embodiments of the invention is not intended to limit the invention to the precise form disclosed, and any such modifications, equivalents, and alternatives falling within the spirit and scope of the invention are intended to be included within the scope of the invention.

Claims (8)

1. A method for comprehensively detecting temperature, pressure and humidity of multidimensional parameters in a PKS production section, which is characterized by comprising the following steps:
collecting a plurality of training samples to form a training sample set, wherein each training sample comprises a plurality of process parameters, each process parameter has corresponding attribute parameters and categories, the process parameters in the same training sample are key process parameters obtained by screening according to key quality attributes, and the key quality attributes are attribute parameters selected according to working sections in a process knowledge system;
the process parameters comprise temperature, pressure and humidity; the attribute parameters and categories can be combined in multiple ways; the attribute parameters of the temperature, the pressure and the humidity comprise three levels, namely high, medium and low; and the categories comprise excellent, good, or poor, corresponding to the temperature, the pressure and the humidity under their respective attribute parameter conditions;
acquiring a distribution transfer information value of the training sample set according to the category in the training sample set;
according to the distribution transfer information value, obtaining the information gain of each process parameter;
selecting the process parameter with the maximum information gain as a splitting node, establishing a decision tree, wherein the process for establishing the decision tree comprises the following steps: selecting the process parameter with the maximum information gain in the training sample set as a root node, and taking attribute parameters of the process parameter of the root node in the training sample set as primary branch nodes respectively; selecting a process parameter with the second-highest information gain in a training sample set as a secondary branch node, and searching attribute parameters of the process parameter of the secondary branch node under the attribute parameters of the primary branch node in the training sample set to respectively serve as tertiary branch nodes; according to each training sample in the training sample set, selecting a process parameter with the smallest gain after part or all of the three-level branch nodes as a four-level branch node, and searching attribute parameters of the process parameter of the four-level branch node in the training sample set as five-level branch nodes respectively; if the categories from the root node to the tertiary branch node can be uniquely determined in a training sample library, setting the corresponding category as a leaf node behind the tertiary branch node, otherwise, setting the corresponding category as a leaf node behind the five-stage branch node according to the training sample library; the process of establishing the decision tree does not carry out discrete segmentation operation on any range of high, medium and low continuous values of each attribute parameter;
and identifying the new data according to the decision tree, wherein the new data comprises temperature, pressure and humidity parameters and corresponding new attribute parameters thereof, further identifying and obtaining the category corresponding to the attribute parameters of the new data, finding a first-stage branch node according to the corresponding attribute parameters of the process parameters of the root node in the new data, finding a third-stage branch node according to the corresponding attribute parameters of the second-stage branch node in the new data, identifying the category of the new data as the category corresponding to the directly connected leaf node if the third-stage branch node is directly connected with the leaf node in the decision tree, otherwise, finding a fifth-stage branch node and the corresponding leaf node according to the corresponding attribute parameters of the fourth-stage branch node in the new data, identifying the category of the new data as the category corresponding to the leaf node after the fifth-stage branch node, and feeding back or giving a warning to a user if the identified category is poor.
2. The method of claim 1, wherein the obtaining the distribution transfer information value of the training sample set includes: the distribution transfer information value is obtained by the following calculation formula:
$$\mathrm{info}(S) = -\sum_{i} P_i \log_2 P_i,$$
wherein $\mathrm{info}(S)$ is the distribution transfer information value of the training sample set, and $P_i$ is the probability that a training sample belongs to the i-th class.
3. The method of claim 1, wherein the obtaining the information gain for each process parameter comprises: the information gain is obtained by the following calculation formula:
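$$\mathrm{gain}(S, A) = \mathrm{info}(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\,\mathrm{info}(S_v),$$
wherein $\mathrm{info}(S)$ is the distribution transfer information value of the training sample set, $\mathrm{info}(S_v)$ is the distribution transfer information value of the subset $S_v$ in which process parameter $A$ takes a certain attribute parameter $v$, $|S_v|/|S|$ is the probability that $A$ takes that attribute parameter, and $\mathrm{Values}(A)$ is the set of attribute parameters of $A$.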
4. The method of claim 1, wherein the composing the training sample set is preceded by: verifying the validity of the training sample, comprising: detecting whether the process parameters and the categories in each training sample are complete, and if not, rejecting the training samples; and/or
the composing the training sample set is followed by: verifying the validity of the training sample set, comprising: detecting whether the attribute parameters corresponding to the process parameters are complete, and if not, judging that the training sample set is invalid.
5. The method of claim 1, wherein if the root node and each branch node are the same and the leaf nodes are different, counting the number of leaf nodes under the root node and each branch node in the training sample set, if the counted number is consistent, randomly discarding any one of the leaf nodes, otherwise discarding the leaf node with the lower number.
6. A comprehensive detection device for temperature, pressure and humidity of multidimensional parameters in a PKS production section, which is characterized by comprising the following modules:
the data acquisition module is used for acquiring a plurality of training samples to form a training sample set, each training sample comprises a plurality of process parameters, each process parameter has corresponding attribute parameters and categories, wherein the process parameters in the same training sample are key process parameters obtained by screening according to key quality attributes, and the key quality attributes are attribute parameters selected according to working sections in a process knowledge system;
the process parameters comprise temperature, pressure and humidity; there are multiple combinations of the attribute parameters and categories; the attribute parameters of the temperature, the pressure and the humidity comprise three levels, namely high, medium and low; and the categories indicate whether the temperature, the pressure and the humidity under the respective attribute parameter conditions are correspondingly good, bad or poor;
the distribution transfer module is used for acquiring a distribution transfer information value of the training sample set according to the category in the training sample set;
the gain module is used for acquiring the information gain of each process parameter according to the distribution transfer information value;
the decision tree module is used for selecting the process parameter with the maximum information gain as a splitting node and establishing a decision tree, wherein the process of establishing the decision tree comprises: selecting the process parameter with the maximum information gain in the training sample set as a root node, and taking the attribute parameters of the process parameter of the root node in the training sample set as first-level branch nodes respectively; selecting the process parameter with the second-highest information gain in the training sample set as a second-level branch node, and searching the training sample set for the attribute parameters of the process parameter of the second-level branch node under the attribute parameters of the first-level branch nodes to serve as third-level branch nodes respectively; according to each training sample in the training sample set, selecting the process parameter with the smallest information gain as a fourth-level branch node after some or all of the third-level branch nodes, and searching the training sample set for the attribute parameters of the process parameter of the fourth-level branch node to serve as fifth-level branch nodes respectively; if the category can be uniquely determined in the training sample library from the root node to a third-level branch node, setting the corresponding category as a leaf node after that third-level branch node, and otherwise setting the corresponding category as a leaf node after the fifth-level branch node according to the training sample library; the process of establishing the decision tree does not perform a discrete segmentation operation on any continuous value range of the high, medium and low levels of each attribute parameter;
the identification module is used for identifying the category of new data according to the decision tree, wherein the new data comprises temperature, pressure and humidity parameters and their corresponding new attribute parameters, and the identification obtains the category corresponding to the attribute parameters of the new data, comprising: finding a first-level branch node according to the attribute parameter, in the new data, of the process parameter of the root node; finding a third-level branch node according to the attribute parameter, in the new data, of the process parameter of the second-level branch node; if the third-level branch node is directly connected to a leaf node in the decision tree, identifying the category of the new data as the category corresponding to the directly connected leaf node; otherwise, finding a fifth-level branch node and its corresponding leaf node according to the attribute parameter, in the new data, of the process parameter of the fourth-level branch node, and identifying the category of the new data as the category corresponding to the leaf node after the fifth-level branch node; and if the identified category is poor, feeding back or issuing a warning to a user.
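The tree construction and the identification walk of claim 6 can be sketched in Python as follows. This is only an illustration under stated assumptions: the nested-dictionary tree layout, the function names and the sample field names are hypothetical, the process parameters are assumed to arrive already ordered by descending information gain, and conflicting leaves are resolved by a simple majority with first-occurrence tie-breaking.

```python
from collections import Counter

# A leaf may be attached only after the root parameter and the second
# parameter have both been applied (the "third-level branch node").
MIN_SPLITS = 2


def build_tree(samples, ordered_params, depth=0):
    """Build a nested-dict tree from categorical samples.

    samples: list of dicts such as
        {"temperature": "high", "pressure": "low", "humidity": "low",
         "category": "good"}
    ordered_params: process parameters sorted by descending information gain.
    """
    counts = Counter(s["category"] for s in samples)
    is_pure = len(counts) == 1
    if not ordered_params or (is_pure and depth >= MIN_SPLITS):
        # Leaf: the (majority) category; ties fall to the first one seen.
        return counts.most_common(1)[0][0]
    param, rest = ordered_params[0], ordered_params[1:]
    groups = {}
    for s in samples:
        groups.setdefault(s[param], []).append(s)
    return {
        "param": param,
        "branches": {
            level: build_tree(group, rest, depth + 1)
            for level, group in groups.items()
        },
    }


def identify(tree, record):
    """Walk the tree for new data; return (category, warn).

    category is None when the record follows a branch that never occurred
    in the training samples; warn is True when the category is "poor".
    """
    node = tree
    while isinstance(node, dict):
        node = node["branches"].get(record[node["param"]])
    return node, node == "poor"
```

Because the attribute parameters are already the discrete levels high, medium and low, the sketch only groups samples by level and never computes thresholds on continuous values, in line with the final clause of the decision tree module.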
7. The device of claim 6, further comprising:
a first verification module, configured to verify the validity of the training samples before the training sample set is composed, comprising: detecting whether the process parameters and the categories in each training sample are complete, and if not, rejecting the training sample; and/or
a second verification module, configured to verify the validity of the training sample set after the training sample set is composed, comprising: detecting whether the attribute parameters corresponding to the process parameters are complete, and if not, judging that the training sample set is invalid.
8. The device of claim 6, wherein the distribution transfer module obtains the distribution transfer information value by the following calculation formula:
info(S) = -∑ Pi * log2(Pi), wherein info(S) is the distribution transfer information value of the training sample set, and Pi is the probability that a training sample belongs to the i-th category;
the gain module obtains the information gain by the following calculation formula:
Gain(S, A) = info(S) - ∑_{v∈Values(A)} (|S_v|/|S|) * info(S_v), wherein info(S) is the distribution transfer information value of the training sample set, info(S_v) is the distribution transfer information value of a certain attribute parameter, |S_v|/|S| is the probability that the certain attribute parameter is in a certain category, and Values(A) is the set of attribute parameters.
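The two calculations of claim 8 can be sketched in Python as below. The function names and the sample layout are hypothetical, and the gain is computed in the standard ID3 form (the information value before the split minus the weighted information value after it), which is assumed to be what the claim intends.

```python
import math
from collections import Counter


def info(categories):
    """Distribution transfer information value: -sum(Pi * log2(Pi))."""
    total = len(categories)
    return -sum(
        (n / total) * math.log2(n / total)
        for n in Counter(categories).values()
    )


def gain(samples, param):
    """Information gain of one process parameter over the sample set."""
    total = len(samples)
    base = info([s["category"] for s in samples])
    groups = {}
    for s in samples:
        groups.setdefault(s[param], []).append(s["category"])
    # Weight each subset's information value by its share of the samples.
    weighted = sum(len(g) / total * info(g) for g in groups.values())
    return base - weighted


def rank_parameters(samples, params):
    """Order process parameters by descending information gain."""
    return sorted(params, key=lambda p: gain(samples, p), reverse=True)
```

A parameter whose levels separate the categories perfectly receives a gain equal to info(S), while a parameter that leaves the category mix unchanged in every subset receives a gain of zero, so the ranking places the former at the root.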
CN201710772738.3A 2017-08-31 2017-08-31 Multi-dimensional parameter identification method and device Active CN107609582B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710772738.3A CN107609582B (en) 2017-08-31 2017-08-31 Multi-dimensional parameter identification method and device

Publications (2)

Publication Number Publication Date
CN107609582A CN107609582A (en) 2018-01-19
CN107609582B CN107609582B (en) 2023-10-10

Family

ID=61056798

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710772738.3A Active CN107609582B (en) 2017-08-31 2017-08-31 Multi-dimensional parameter identification method and device

Country Status (1)

Country Link
CN (1) CN107609582B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102049420A (en) * 2009-11-05 2011-05-11 刘斌 Decision tree-based method for extracting key characteristic variables of finish rolling temperature control process
CN103207565A (en) * 2012-01-13 2013-07-17 通用电气公司 Automated incorporation of expert feedback into monitoring system

Also Published As

Publication number Publication date
CN107609582A (en) 2018-01-19

Similar Documents

Publication Publication Date Title
Sugiarto et al. Data classification for air quality on wireless sensor network monitoring system using decision tree algorithm
CN109842588B (en) Network data detection method and related equipment
CN109753499A (en) A kind of O&M monitoring data administering method
CN101561878A (en) Unsupervised anomaly detection method and system based on improved CURE clustering algorithm
CN115877806A (en) LCP film product production control method and system
CN115222303B (en) Industry risk data analysis method and system based on big data and storage medium
CN110662232A (en) Method for evaluating link quality by adopting multi-granularity cascade forest
CN108055227B (en) WAF unknown attack defense method based on site self-learning
CN106603538A (en) Invasion detection method and system
CN115996249A (en) Data transmission method and device based on grading
CN113657747B (en) Intelligent assessment system for enterprise safety production standardization level
CN111314910A (en) Novel wireless sensor network abnormal data detection method for mapping isolation forest
CN117118810B (en) Network communication abnormity early warning method and system
CN105093186A (en) Multi-target fusion detection method based on heterogeneous radar sensing network
Franchi et al. Statistical properties of the maximum Lyapunov exponent calculated via the divergence rate method
CN107609582B (en) Multi-dimensional parameter identification method and device
CN113112188B (en) Power dispatching monitoring data anomaly detection method based on pre-screening dynamic integration
CN113645182A (en) Random forest detection method for denial of service attack based on secondary feature screening
CN108646688B (en) A kind of process parameter optimizing analysis method based on recurrence learning
CN106304084A (en) Information processing method and device
CN104519511A (en) Method and device for detecting scene breaks of communication network cells
CN114936614A (en) Operation risk identification method and system based on neural network
CN112069037A (en) Method and device for detecting no threshold value of cloud platform
CN117579704B (en) Detection data acquisition method and system based on Internet of things
CN111314170B (en) Feature fuzzy P2P protocol identification method based on connection statistical rule analysis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant