CN113657441A - Classification algorithm based on weighted Pearson correlation coefficient and combined with feature screening - Google Patents


Info

Publication number
CN113657441A
CN113657441A (application CN202110774460.XA)
Authority
CN
China
Prior art keywords
feature
decision tree
list
pearson correlation
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110774460.XA
Other languages
Chinese (zh)
Inventor
周红芳
安蕾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian University of Technology
Original Assignee
Xian University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian University of Technology filed Critical Xian University of Technology
Priority to CN202110774460.XA priority Critical patent/CN113657441A/en
Publication of CN113657441A publication Critical patent/CN113657441A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/243 - Classification techniques relating to the number of classes
    • G06F18/24323 - Tree-organised classifiers
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211 - Selection of the most significant subset of features
    • G06F18/2113 - Selection of the most significant subset of features by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Complex Calculations (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a classification algorithm based on the weighted Pearson correlation coefficient combined with feature screening. The original data are first preprocessed, and the IMPROVE_FCBF algorithm is used to screen the features of the preprocessed data set; the feature-screened data are then divided into a training set and a test set by ten-fold cross validation, and a decision tree is constructed on the training set with a decision tree algorithm based on the weighted Pearson correlation coefficient; finally, the constructed decision tree model classifies the test data, and the classification model is evaluated with accuracy, recall, macro F1 value and decision tree construction time as evaluation indexes. On these indexes, the method achieves improvements of varying degrees over other decision tree classification algorithms.

Description

Classification algorithm based on weighted Pearson correlation coefficient and combined with feature screening
Technical Field
The invention belongs to the technical field of data mining, and relates to a classification algorithm based on weighted Pearson correlation coefficients and combined with feature screening.
Background
In the mobile internet era, traditional data analysis cannot cope with massive data; new methods are required, and data mining is one of the best tools for the task. Within data mining, the classification problem is particularly important and is widely applied in financial and commercial activities such as telecommunications, banking and retail. Classification proceeds in two steps: first, known sample data are analyzed to obtain a function/model; second, the obtained function/model is used to predict the class of unknown data. Many classification algorithms exist, such as decision tree, genetic, clustering and neural network algorithms. Among them, decision tree classification is one of the most widely used because of its strong interpretability, high speed and high accuracy. Common decision tree classification algorithms include ID3, C4.5, CART and PCC-Tree.
Traditional decision tree classification algorithms work well on small-scale data sets, but because of memory limits, time complexity and data complexity, their time cost on large-scale data sets is high. Increasing the speed of decision tree construction is therefore very important.
Disclosure of Invention
The invention aims to provide a classification algorithm based on the weighted Pearson correlation coefficient combined with feature screening, which effectively improves the classification accuracy of the decision tree model.
The technical scheme adopted by the invention is a classification algorithm based on the weighted Pearson correlation coefficient combined with feature screening, implemented according to the following steps:
step 1, for a data set with a category set C = {c_1, c_2, ..., c_m} containing m categories and a feature set F = {f_1, f_2, ..., f_n} containing n features, preprocess the data set;
step 2, perform feature screening on the preprocessed data set with the IMPROVE_FCBF algorithm;
step 3, divide the feature-screened data set into training data and test data;
step 4, construct a decision tree model on the training set with the decision tree classification method based on the weighted Pearson correlation coefficient;
step 5, test the test data with the established decision tree model, and evaluate the experimental results with accuracy, recall, macro F1 and decision tree construction time as evaluation indexes.
The invention is also characterized in that:
the preprocessing in the step 1 is specifically that firstly, discretization is carried out on continuous characteristic values in a data set by using an equal-width method; then converting the character string type characteristic value into a nominal numerical value type; then, complementing the missing characteristic value by using a mode; and finally converting the character string class values in the data set into a nominal numerical type.
The step 2 is implemented according to the following steps:
step 2.1, initialize S_list as an empty set;
step 2.2, calculate the symmetric uncertainty SU(f_i, C) between each feature f_i (i = 1, ..., n) and class C, and the symmetric uncertainty SU(f_i, f_j) between every two features (i, j = 1, ..., n and i ≠ j); the SU value of two variables X and Y is calculated as:
SU(X, Y) = 2 · I(X, Y) / (H(X) + H(Y))
step 2.3, form the features satisfying SU(f_i, C) > 0 into the S_list subset and sort it from large to small;
step 2.4, cyclically judge whether each feature f_j in the S_list subset is a strongly redundant feature of the dominant feature f_i, and if so, remove it from the S_list subset;
step 2.5, for each feature F_k (k = 1, ..., n) of S_list, cyclically judge whether the Merits value decreases, and reject the feature if it does; stop searching once all feature elements in S_list have been judged or the early-stop criterion is met;
step 2.6, return the final feature subset S_list.
In step 2.5, if the feature elements in S_list have not all been judged and the early-stop criterion has not been met, repeat the following steps:
step 2.5.1, for each feature F_k (k = 1, ..., n), let S_list[k] = F_k and calculate Merits according to the formula below, where k is the number of features, r_cf is the SU(f_i, C) value between feature f_i and class C, and r_ff is the average of the pairwise SU(f_i, f_j) values between features:
Merits_k = (k · r_cf) / √(k + k(k−1) · r_ff)
step 2.5.2, if k > 1 and
Merits_k < Merits_{k−1},
the k-th feature is deleted; otherwise it is added to the final feature subset S_list.
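For illustration, the Merits measure of step 2.5.1 can be coded directly from the r_cf and r_ff terms defined above; a minimal sketch (the helper name is ours, not the patent's):

```python
import math

def merits(k, r_cf, r_ff):
    """Merits of a k-feature subset: average feature-class SU (r_cf)
    traded off against average feature-feature SU (r_ff)."""
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)
```

Adding a feature that is redundant with those already selected raises r_ff and therefore lowers the score, which is exactly the decrease tested in step 2.5.2.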
Step 3 divides the data set with a ten-fold cross-validation method.
Step 4 is specifically implemented according to the following steps:
step 4.1, traverse each feature in the feature-screened subset; assuming S_list now contains n features, calculate the weighted Pearson correlation coefficient between each feature f_i ∈ S_list (i = 1, 2, ..., n) and class C; the weighted Pearson correlation coefficient WPCC of two variables X and Y is calculated as in equation (6) of the definitions below [the formula is an image in the original and is not reproduced];
step 4.2, sort the features from large to small by the calculated WPCC values;
step 4.3, when constructing each layer of the decision tree, select the feature with the largest WPCC value as the splitting node;
step 4.4, iteratively construct the decision tree until the termination condition is reached, completing the decision tree model.
The invention has the beneficial effects that:
1. Compared with four classical decision tree algorithms (ID3, CART, C4.5 and PCC-Tree), the FS-WPCCT algorithm is superior to the comparison algorithms on evaluation indexes such as accuracy, recall and macro F1 value;
2. Compared with the PCC-Tree algorithm and the WPCCT algorithm, the FS-WPCCT algorithm has an obvious advantage in decision tree construction time.
Drawings
FIG. 1 is a flow chart of the classification algorithm based on the weighted Pearson correlation coefficient combined with feature screening according to the present invention;
FIG. 2 compares the accuracy of the FS-WPCCT algorithm, the WPCCT algorithm and the classical PCC-Tree algorithm on 25 data sets;
FIG. 3 is a histogram comparing the average accuracy, recall and macro F1 values of the FS-WPCCT algorithm, the WPCCT algorithm and the classical PCC-Tree algorithm on 25 data sets;
FIG. 4 is a line graph comparing the decision tree construction time of the FS-WPCCT algorithm, the WPCCT algorithm and the classical PCC-Tree algorithm on 25 data sets;
FIG. 5 is a line graph comparing the accuracy of the FS-WPCCT algorithm with other classical decision tree algorithms (ID3, CART, C4.5, PCC-Tree) on 25 data sets;
FIG. 6 is a line graph comparing the recall of the FS-WPCCT algorithm with other classical decision tree algorithms (ID3, CART, C4.5, PCC-Tree) on 25 data sets;
FIG. 7 is a line graph comparing the macro F1 value of the FS-WPCCT algorithm with other classical decision tree algorithms (ID3, CART, C4.5, PCC-Tree) on 25 data sets;
FIG. 8 is a histogram comparing the average accuracy, average recall and average macro F1 value of the FS-WPCCT algorithm with other classical decision tree algorithms (ID3, CART, C4.5, PCC-Tree) on 25 data sets.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
The relevant definitions in the present invention are as follows:
definition 1 (mutual information): mutual information describes how much information is contained in one random variable about another random variable. For two random variables X, Y, their corresponding mutual information is defined as formula (1), where H (X) is the entropy of X and H (X | Y) is the conditional entropy.
I(X, Y) = H(X) − H(X|Y)  (1)
Definition 2 (symmetry uncertainty): for two variables X and Y, the symmetry uncertainty formula between them is shown in formula (2), where H (X) is the entropy of X, H (X | Y) is the conditional entropy for X given variable Y, and I (X, Y) is the mutual information of the two variables.
SU(X, Y) = 2 · I(X, Y) / (H(X) + H(Y))  (2)
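For discrete features, the entropy, mutual information and symmetric uncertainty of formulas (1) and (2) can be computed directly from value counts; a small sketch (function names are ours):

```python
import math
from collections import Counter

def entropy(xs):
    """Shannon entropy H(X) of a discrete sequence, in bits."""
    n = len(xs)
    return -sum(c / n * math.log2(c / n) for c in Counter(xs).values())

def mutual_information(xs, ys):
    """I(X, Y) = H(X) + H(Y) - H(X, Y), equivalent to H(X) - H(X|Y)."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

def symmetric_uncertainty(xs, ys):
    """SU(X, Y) = 2 * I(X, Y) / (H(X) + H(Y)); taken as 0 when both entropies vanish."""
    denom = entropy(xs) + entropy(ys)
    return 2 * mutual_information(xs, ys) / denom if denom else 0.0
```

SU is 1 for identical variables and 0 for independent ones, which is what makes the SU(f_i, C) > 0 filter of step 2.3 meaningful.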
Definition 3 (conditional mutual information): the conditional mutual information indicates how much information is between the variable X and the variable Y when the variable Z is introduced. Given the variable Z, the mutual conditional information of the two random variables X and Y can be defined as equation (3). Where p (x, y, z) is the joint distribution probability and p (x | z), p (y | z) and p (x, y | z) are the conditional distribution probabilities.
I(X; Y | Z) = Σ_{x,y,z} p(x, y, z) · log [ p(x, y | z) / ( p(x | z) · p(y | z) ) ]  (3)
Definition 4 (normalized interaction score NPIS): for two features F_i and F_j (i ≠ j), given class C, the normalized interaction score NPIS of F_i and F_j is defined as equation (4).
[Equation (4) is an image in the original and is not reproduced.]
Definition 5 (pearson correlation coefficient): the pearson correlation coefficient between two variables X and Y is calculated as in equation (5), where cov (X, Y) is the covariance between X and Y, var (X) is the variance of X, and var (Y) is the variance of Y.
PCC(X, Y) = cov(X, Y) / √(var(X) · var(Y))  (5)
Definition 6 (weighted pearson correlation coefficient): based on the pearson correlation coefficient, a weighted pearson correlation coefficient between two variables X and Y is calculated as in equation (6), where h (X) is the entropy of X and PCC (X, Y) is the pearson correlation coefficient for the two variables.
[Equation (6) is an image in the original and is not reproduced.]
Definition 7 (accuracy): the proportion of measured values satisfying a given condition among a set of measurements under given experimental conditions; it reflects both the systematic and the random error in the measurement result, i.e. how closely the mean of repeated measurements approaches the true value. The accuracy is calculated as:
Accuracy = number of correctly classified samples / total number of samples  (7)
Definition 8 (precision): in a data set with multiple categories, precision is calculated by treating the samples of one category as positive and the samples of all other categories as negative each time. It is defined as:
Precision = TP / (TP + FP)  (8)
TP: true positive sample number; FP: the number of samples tested as positive, and actually negative.
Definition 9 (recall): in a data set with multiple categories, recall is calculated by treating the samples of one category as positive and the samples of all other categories as negative each time. It is defined as:
Recall = TP / (TP + FN)  (9)
TP: true positive sample number; FN: the number of samples tested as negative, in fact positive.
Definition 10 (F1 value): the F1 value is the harmonic mean of recall and precision, where P denotes precision and R denotes recall. It is defined as:
F1 = 2 · P · R / (P + R)  (10)
Definition 11 (macro F1 value): the F1 value of equation (10) measures binary problems; when the number of classes n is greater than 2, the macro-average F1 is used instead: the n-class problem is treated as n one-vs-rest binary problems, and macro F1 is the average of their n F1 values. It is defined as:
MacroF1 = (1/n) · Σ_{i=1}^{n} F1_i  (11)
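Definitions 8-11 combine into a direct macro-F1 computation; a sketch of the one-vs-rest averaging (the function name is illustrative):

```python
def macro_f1(y_true, y_pred):
    """Average of one-vs-rest F1 values over all true classes (Definitions 8-11)."""
    classes = sorted(set(y_true))
    scores = []
    for c in classes:
        # Per-class counts, treating class c as positive and the rest as negative.
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores.append(f1)
    return sum(scores) / len(scores)
```

Because every class contributes equally regardless of its sample count, macro F1 penalizes poor performance on rare classes, which matters on the imbalanced data sets typical of classification benchmarks.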
The classification algorithm based on the weighted Pearson correlation coefficient and combined with feature screening is specifically implemented according to the following steps as shown in FIG. 1:
step 1, a data set is given, comprising a category set C = {c_1, c_2, ..., c_m} with m categories and a feature set F = {f_1, f_2, ..., f_n} with n features. First, discretize the continuous feature values in the data set with the equal-width method; then convert string-type feature values to the nominal numerical type; then complete missing feature values with the mode; and finally convert the string class values in the data set to the nominal numerical type.
Step 2, perform feature screening on the preprocessed data set with the IMPROVE_FCBF algorithm to obtain the feature set used for constructing the decision tree. The IMPROVE_FCBF algorithm comprises the following specific steps:
step 2.1, initialize SlistIs an empty set.
Step 2.2, calculate each feature fiSymmetry uncertainty SU (f) between (i ═ 1, …, n) and class CiC) value, and between each two featuresSymmetry uncertainty measure SU (f)i,fj) (i, j ≠ j) 1, …, n, and i ≠ j); the formula for calculating the SU values of the two variables X and Y is as follows:
Figure BDA0003154090000000081
step 2.3, will satisfy SU (fi, C)>Feature formation S of 0listSubsets and sorting from large to small;
step 2.4, judging S circularlylistEach feature f in the subsetjWhether or not it is the main feature fiIf the strong redundancy feature is the strong redundancy feature, the strong redundancy feature is selected from SlistRemoving from the subset;
step 2.5, for SlistEach feature F ofkAnd (k is 1, …, n) circularly judging whether the Merits value is reduced or not, and if the Merits value is reduced, rejecting the Merits value. If SlistAnd stopping searching when all the characteristic elements are judged to be finished or meet the early stop criterion. Otherwise, repeating the following steps. The specific steps are as follows:
step 2.5.1, for each feature F_k (k = 1, ..., n), let S_list[k] = F_k and calculate Merits according to the formula below, where k is the number of features, r_cf is the SU(f_i, C) value between feature f_i and class C, and r_ff is the average of the pairwise SU(f_i, f_j) values between features:
Merits_k = (k · r_cf) / √(k + k(k−1) · r_ff)
step 2.5.2, if k > 1 and
Merits_k < Merits_{k−1},
the k-th feature is deleted; otherwise it is added to the final feature subset S_list.
step 2.6, return the final feature subset S_list.
Step 3, divide the feature-screened data set into a training set and a test set with ten-fold cross validation.
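The ten-fold split of step 3 can be sketched as simple index partitioning; this interleaved assignment is one common choice (stratification, which the patent does not specify, is omitted):

```python
def ten_fold_splits(n_samples, n_folds=10):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation,
    assigning sample i to fold i % n_folds."""
    folds = [list(range(i, n_samples, n_folds)) for i in range(n_folds)]
    for k in range(n_folds):
        # Every fold serves as the test set exactly once.
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, folds[k]
```

Each of the ten rounds trains a decision tree on nine folds and evaluates it on the held-out fold; the reported metrics are then averaged over the rounds.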
Step 4, construct a decision tree model on the training set with the decision tree classification method based on the weighted Pearson correlation coefficient. The specific steps are as follows:
step 4.1, traverse each feature in the feature-screened subset; assuming S_list now contains n features, calculate the weighted Pearson correlation coefficient between each feature f_i ∈ S_list (i = 1, 2, ..., n) and class C. The weighted Pearson correlation coefficient WPCC of two variables X and Y is calculated as in equation (6) of the definitions above [the formula is an image in the original and is not reproduced].
and 4.2, sequencing the features from large to small according to the WPCC values obtained by calculation.
And 4.3, when each layer of the decision tree is constructed, selecting the characteristic with the maximum WPCC value as a split node to construct the decision tree each time.
And 4.4, iteratively constructing the decision tree until a decision tree termination condition is reached, and completing construction of the decision tree model.
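Steps 4.1-4.3 reduce to scoring every candidate feature against the class and splitting on the top scorer. Because equation (6) is only available as an image, the sketch below assumes WPCC to be the entropy-weighted Pearson coefficient, H(X) · PCC(X, C), which matches the ingredients named in Definition 6; treat that weighting, and the function names, as our assumptions:

```python
import math
from collections import Counter

def pcc(xs, ys):
    """Pearson correlation coefficient of two numeric sequences (Definition 5)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    vx = sum((x - mx) ** 2 for x in xs) / n
    vy = sum((y - my) ** 2 for y in ys) / n
    return cov / math.sqrt(vx * vy) if vx and vy else 0.0

def entropy(xs):
    """Shannon entropy in bits."""
    n = len(xs)
    return -sum(c / n * math.log2(c / n) for c in Counter(xs).values())

def best_split_feature(features, labels):
    """Score each feature column by an assumed WPCC = H(X) * |PCC(X, C)|
    and return the name of the highest-scoring feature (step 4.3)."""
    scores = {name: entropy(col) * abs(pcc(col, labels))
              for name, col in features.items()}
    return max(scores, key=scores.get)
```

At each tree layer the data would be partitioned on the returned feature's values and the scoring repeated on each partition until the termination condition of step 4.4 is met.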
Step 5, test the test data with the established decision tree model, and evaluate the experimental results with accuracy, recall, macro F1 and decision tree construction time as evaluation indexes.
The pseudo code of the IMPROVE_FCBF algorithm involved in the present invention is shown in Table 1:
Table 1 IMPROVE_FCBF algorithm pseudo code
[The Table 1 pseudo code is an image in the original and is not reproduced.]
The pseudo code of the decision tree classification algorithm based on the weighted Pearson correlation coefficient is shown in Table 2:
TABLE 2 WPCCT Algorithm pseudocode
[The Table 2 pseudo code is an image in the original and is not reproduced.]
Evaluation of the Performance of the present invention:
To verify the effectiveness of the FS-WPCCT decision tree classification algorithm of the invention, it is compared with a decision tree algorithm using only the weighted Pearson correlation coefficient (WPCCT) and with classical decision tree algorithms (ID3, CART, C4.5 and PCC-Tree).
Comparative experiments were run on 25 data sets. As FIG. 2 shows, the average accuracy of the FS-WPCCT algorithm on each data set is superior to the WPCCT and PCC-Tree algorithms in most cases. As FIG. 3 shows, the average accuracy, average recall and average macro F1 value of the FS-WPCCT algorithm are superior to the WPCCT and PCC-Tree algorithms over the 25 data sets. As FIG. 4 shows, FS-WPCCT builds decision trees faster on average than WPCCT and PCC-Tree. As FIGS. 5, 6 and 7 show, FS-WPCCT performs best on most data sets compared with the other classical algorithms in accuracy, recall and macro F1 value. As FIG. 8 shows, FS-WPCCT is superior to the other classical decision tree classification algorithms (ID3, CART, C4.5, PCC-Tree) in average accuracy, average recall and average macro F1 value over the 25 data sets.
Table 4 data set details
[Table 4 is an image in the original and is not reproduced.]

Claims (6)

1. The classification algorithm based on the weighted Pearson correlation coefficient and combined with feature screening is characterized by being implemented according to the following steps:
step 1, for a data set with a category set C = {c_1, c_2, ..., c_m} containing m categories and a feature set F = {f_1, f_2, ..., f_n} containing n features, preprocess the data set;
step 2, perform feature screening on the preprocessed data set with the IMPROVE_FCBF algorithm;
step 3, divide the feature-screened data set into training data and test data;
step 4, construct a decision tree model on the training set with the decision tree classification method based on the weighted Pearson correlation coefficient;
step 5, test the test data with the established decision tree model, and evaluate the experimental results with accuracy, recall, macro F1 and decision tree construction time as evaluation indexes.
2. The classification algorithm based on weighted pearson correlation coefficients and combined with feature screening as claimed in claim 1, wherein the preprocessing in step 1 is specifically to firstly discretize the continuous feature values in the data set by using an equal width method; then converting the character string type characteristic value into a nominal numerical value type; then, complementing the missing characteristic value by using a mode; and finally converting the character string class values in the data set into a nominal numerical type.
3. The classification algorithm based on weighted pearson correlation coefficients combined with feature screening according to claim 1, wherein the step 2 is implemented specifically according to the following steps:
step 2.1, initialize S_list as an empty set;
step 2.2, calculate the symmetric uncertainty SU(f_i, C) between each feature f_i (i = 1, ..., n) and class C, and the symmetric uncertainty SU(f_i, f_j) between every two features (i, j = 1, ..., n and i ≠ j); the SU value of two variables X and Y is calculated as:
SU(X, Y) = 2 · I(X, Y) / (H(X) + H(Y))
step 2.3, form the features satisfying SU(f_i, C) > 0 into the S_list subset and sort it from large to small;
step 2.4, cyclically judge whether each feature f_j in the S_list subset is a strongly redundant feature of the dominant feature f_i, and if so, remove it from the S_list subset;
step 2.5, for each feature F_k (k = 1, ..., n) of S_list, cyclically judge whether the Merits value decreases, and reject the feature if it does; stop searching once all feature elements in S_list have been judged or the early-stop criterion is met;
step 2.6, return the final feature subset S_list.
4. The classification algorithm based on weighted Pearson correlation coefficients combined with feature screening as claimed in claim 3, wherein in step 2.5, if the feature elements in S_list have not all been judged and the early-stop criterion has not been met, the following steps are repeated:
step 2.5.1, for each feature F_k (k = 1, ..., n), let S_list[k] = F_k and calculate Merits according to the formula below, where k is the number of features, r_cf is the SU(f_i, C) value between feature f_i and class C, and r_ff is the average of the pairwise SU(f_i, f_j) values between features:
Merits_k = (k · r_cf) / √(k + k(k−1) · r_ff)
step 2.5.2, if k > 1 and
Merits_k < Merits_{k−1},
the k-th feature is deleted; otherwise it is added to the final feature subset S_list.
5. The classification algorithm based on weighted Pearson correlation coefficients combined with feature screening according to claim 1, wherein in step 3 the data set is divided by a ten-fold cross-validation method.
6. The classification algorithm based on weighted pearson correlation coefficients combined with feature screening according to claim 1, wherein the step 4 is implemented specifically according to the following steps:
step 4.1, traverse each feature in the feature-screened subset; assuming S_list now contains n features, calculate the weighted Pearson correlation coefficient between each feature f_i ∈ S_list (i = 1, 2, ..., n) and class C; the weighted Pearson correlation coefficient WPCC of two variables X and Y is calculated as in equation (6) of the description [the formula is an image in the original and is not reproduced];
step 4.2, sort the features from large to small by the calculated WPCC values;
step 4.3, when constructing each layer of the decision tree, select the feature with the largest WPCC value as the splitting node;
step 4.4, iteratively construct the decision tree until the termination condition is reached, completing the decision tree model.
CN202110774460.XA 2021-07-08 2021-07-08 Classification algorithm based on weighted Pearson correlation coefficient and combined with feature screening Pending CN113657441A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110774460.XA CN113657441A (en) 2021-07-08 2021-07-08 Classification algorithm based on weighted Pearson correlation coefficient and combined with feature screening


Publications (1)

Publication Number Publication Date
CN113657441A true CN113657441A (en) 2021-11-16

Family

ID=78489271

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110774460.XA Pending CN113657441A (en) 2021-07-08 2021-07-08 Classification algorithm based on weighted Pearson correlation coefficient and combined with feature screening

Country Status (1)

Country Link
CN (1) CN113657441A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115343676A (en) * 2022-08-19 2022-11-15 黑龙江大学 Feature optimization method for technology for positioning excess inside sealed electronic equipment


Similar Documents

Publication Publication Date Title
CN109657947B (en) Enterprise industry classification-oriented anomaly detection method
CN111882446B (en) Abnormal account detection method based on graph convolution network
CN107391772B (en) Text classification method based on naive Bayes
CN111914090B (en) Method and device for enterprise industry classification identification and characteristic pollutant identification
WO2023279696A1 (en) Service risk customer group identification method, apparatus and device, and storage medium
CN109739844B (en) Data classification method based on attenuation weight
CN105373606A (en) Unbalanced data sampling method in improved C4.5 decision tree algorithm
CN107633444A (en) Commending system noise filtering methods based on comentropy and fuzzy C-means clustering
CN100557616C (en) Protein complex recognizing method based on range estimation
CN117478390A (en) Network intrusion detection method based on improved density peak clustering algorithm
Kumar et al. Comparative analysis of SOM neural network with K-means clustering algorithm
Shu et al. Performance assessment of kernel density clustering for gene expression profile data
CN113516189B (en) Website malicious user prediction method based on two-stage random forest algorithm
CN113657441A (en) Classification algorithm based on weighted Pearson correlation coefficient and combined with feature screening
CN115481841A (en) Material demand prediction method based on feature extraction and improved random forest
CN117349786A (en) Evidence fusion transformer fault diagnosis method based on data equalization
CN112258235A (en) Method and system for discovering new service of electric power marketing audit
CN115018007A (en) Sensitive data classification method based on improved ID3 decision tree
CN113792141B (en) Feature selection method based on covariance measurement factor
CN114722920A (en) Deep map convolution model phishing account identification method based on map classification
CN115186138A (en) Comparison method and terminal for power distribution network data
CN114519605A (en) Advertisement click fraud detection method, system, server and storage medium
CN113657106A (en) Feature selection method based on normalized word frequency weight
CN112733903A (en) Air quality monitoring and alarming method, system, device and medium based on SVM-RF-DT combination
CN113010673A (en) Vulnerability automatic classification method based on entropy optimization support vector machine

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination