CN109492682A - A kind of multi-branched random forest data classification method - Google Patents
- Publication number
- CN109492682A CN109492682A CN201811273813.2A CN201811273813A CN109492682A CN 109492682 A CN109492682 A CN 109492682A CN 201811273813 A CN201811273813 A CN 201811273813A CN 109492682 A CN109492682 A CN 109492682A
- Authority
- CN
- China
- Prior art keywords
- sample
- cluster
- center
- sample point
- random forest
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/2148—Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the process organisation or structure, e.g. boosting cascade
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/24323—Tree-organised classifiers
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Probability & Statistics with Applications (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a multi-branch random forest data classification method, relating to the technical field of random forest data classification. The technical problem addressed is to provide a classification method that improves the performance and accuracy of data classification. The method comprises the following steps: (1) providing an unclassified data set and applying the PCA algorithm for dimensionality reduction and denoising; (2) completing the clustering of the data using the K-means algorithm; (3) constructing a multi-branch random forest; (4) classifying the data with the multi-branch random forest model. The technical solution of the present invention can improve the performance and accuracy of data classification.
Description
Technical field
The present invention relates to the technical field of random forest data classification, and in particular to a multi-branch random forest data classification method.
Background technique
With the development of artificial intelligence, fields such as image research and information security all require the participation of artificial intelligence. Clustering and classification algorithms have important applications in the field of artificial intelligence; K-means and random forest are representative clustering and classification algorithms, respectively. The random forest, an ensemble learning algorithm based on decision trees, is among the best-performing classification algorithms. However, when the prior-art random forest data classification method performs classification, the sample set is overly redundant and disordered and the data purity is low, which adversely affects classification performance.
Summary of the invention
In view of the deficiencies of the prior art, the technical problem solved by the present invention is to provide a classification method that improves the performance and accuracy of data classification.
To solve the above technical problem, the technical solution adopted by the present invention is a multi-branch random forest data classification method comprising the following steps:
(1) Provide an unclassified data set and apply the PCA algorithm for dimensionality reduction and denoising, specifically in the following sub-steps:
(1) Express the sample set as an N × M matrix X;
(2) Zero-center each row: compute the average value R_i of every row of the matrix and subtract it from that row, N_i − R_i; compute the covariance matrix C = (1/M)XXᵀ, and find the eigenvalues λ_1, λ_2, …, λ_m of the covariance matrix C and the corresponding normalized eigenvectors x_1, x_2, …, x_m;
(3) Arrange the eigenvectors as rows from top to bottom in descending order of the corresponding eigenvalues, and take the first k rows to form the matrix P;
(4) Multiply the matrix P by the matrix X to obtain the dimensionality-reduced data, removing the redundant part of the data.
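The PCA sub-steps (1)–(4) above can be sketched in Python with NumPy. The covariance normalization by 1/M and the function name are assumptions of this sketch, not details taken from the patent:

```python
import numpy as np

def pca_denoise(X, k):
    """Sketch of step (1): rows of X are features, columns are samples.
    Zero-center each row, eigendecompose the covariance matrix,
    and keep the k eigenvectors with the largest eigenvalues as rows of P."""
    N, M = X.shape
    Xc = X - X.mean(axis=1, keepdims=True)   # subtract each row's mean R_i
    C = Xc @ Xc.T / M                        # covariance matrix C
    eigvals, eigvecs = np.linalg.eigh(C)     # eigh: C is symmetric
    order = np.argsort(eigvals)[::-1]        # descending eigenvalue order
    P = eigvecs[:, order[:k]].T              # first k eigenvectors as rows
    return P @ Xc                            # dimensionality-reduced data

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 100))    # 5 features, 100 samples
Y = pca_denoise(X, 2)
print(Y.shape)                   # (2, 100)
```

The first output row carries the largest projected variance, which is what makes discarding the remaining rows a denoising step.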
(2) Complete the clustering of the data set using the K-means algorithm and output the clusters C = {C_1, C_2, …, C_k}, specifically in the following sub-steps:
(1) Calculate the density value p_ij of each sample point, where d_ijk = ||x_ij − x_kj||; p_ij is the density of the i-th sample point in class j, n_j is the total number of sample points in class j, and d_ijk is the distance between sample points x_ij and x_kj in the vector space. Take the sample point with the largest density value p_ij as the first cluster center;
(2) The selection of the remaining cluster centers also takes distance into account: for a given sample y_n, its distance to each sample point y_l is normalized;
(3) Sum the density value of each sample point and its normalized distances to the already-selected cluster centers: w_ij = p_ij + Σ_t D_ijt, where p_ij denotes the density of the i-th sample point in class j and D_ijt denotes the normalized distance from sample point x_ij to the center y_t of the t-th selected class; the number of clusters K is determined by the elbow method;
(4) Sort the w_ij in descending order and select the first k − 1 sample points, together with the point with the largest p_ij value, as the initial cluster centers C_1, C_2, …, C_k;
(5) Take c_1, c_2, …, c_k as the initial cluster centers and relabel them μ_1, μ_2, …, μ_k; set the maximum number of iterations R;
(6) Calculate the distance between each sample and every cluster center: dist(x_i, μ_j) = ||x_i − μ_j||_2, where i = 1, 2, …, N and j = 1, 2, …, k;
(7) Determine the cluster label of x_i from its nearest cluster center: λ_i = arg min_{j ∈ {1, 2, …, k}} dist(x_i, μ_j);
(8) Place the sample x_i into the corresponding cluster: C_{λ_i} = C_{λ_i} ∪ {x_i};
(9) After all samples have been assigned, calculate the new mean center of each cluster: μ'_i = (1/|C_i|) Σ_{x ∈ C_i} x; if μ'_i and μ_i are not equal, update the cluster center to μ'_i; if they are equal, keep μ_i unchanged; then reassign each sample to its corresponding cluster;
(10) Repeat sub-step (9) until no cluster center changes or the maximum number of iterations is reached;
(11) Output the cluster partition C = {C_1, C_2, …, C_k}.
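Sub-steps (1)–(11) can be sketched as follows. The patent does not reproduce its exact density and normalization formulas in the text, so the Gaussian-kernel density and the column-wise distance normalization below are assumptions of this sketch:

```python
import numpy as np

def density(X, sigma=1.0):
    """Density of each point: sum of a Gaussian kernel over pairwise
    distances. The kernel form is an assumption, not the patent's formula."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    return np.exp(-(d / sigma) ** 2).sum(axis=1)

def init_centers(X, k):
    """Seeding per sub-steps (1)-(4): first center is the densest point;
    later centers favour points that are dense AND far from the centers
    already chosen (w = density + sum of normalized distances)."""
    p = density(X)
    centers = [int(np.argmax(p))]
    while len(centers) < k:
        d = np.linalg.norm(X[:, None, :] - X[centers][None, :, :], axis=2)
        D = d / d.sum(axis=0, keepdims=True)   # normalized distances
        w = p + D.sum(axis=1)                  # w_ij = p_ij + sum_t D_ijt
        w[centers] = -np.inf                   # never re-pick a chosen center
        centers.append(int(np.argmax(w)))
    return X[np.array(centers)]

def kmeans(X, k, R=100):
    """Sub-steps (5)-(11): Lloyd iterations from the seeded centers."""
    mu = init_centers(X, k)
    for _ in range(R):
        labels = np.argmin(
            np.linalg.norm(X[:, None, :] - mu[None, :, :], axis=2), axis=1)
        new_mu = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                           else mu[j] for j in range(k)])
        if np.allclose(new_mu, mu):   # stop when no center changes
            break
        mu = new_mu
    return labels, mu

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, size=(30, 2)),
               rng.normal(3, 0.3, size=(30, 2))])  # two toy blobs
labels, mu = kmeans(X, 2)
print(labels.shape, mu.shape)
```

The seeding trades off density against distance so that the initial centers are both representative and spread out, which is what the weighted sum w_ij is for.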
(3) Construct the multi-branch random forest, specifically in the following sub-steps:
(1) The construction is completed with a training set of known labels. Given the training set, preprocess it with the K-means algorithm to obtain the clusters C = {C_1, C_2, …, C_k}. The detailed process is as follows:
1) Calculate the density value p_ij of each sample point, where p_ij is the density of the i-th sample point in class j, n_j is the total number of sample points in class j, and d_ijk = ||x_ij − x_kj|| is the distance between sample points x_ij and x_kj in the vector space; take the sample point with the largest density value p_ij as the first cluster center;
2) The selection of the remaining cluster centers also takes distance into account: for a given sample y_n, its distance to each sample point y_l is normalized;
3) Sum the density value of each sample point and its normalized distances to the already-selected cluster centers: w_ij = p_ij + Σ_t D_ijt, where p_ij denotes the density of the i-th sample point in class j and D_ijt denotes the normalized distance from sample point x_ij to the center y_t of the t-th selected class; the number of clusters K is determined by the elbow method;
4) Sort the w_ij in descending order and select the first k − 1 sample points, together with the point with the largest p_ij value, as the initial cluster centers C_1, C_2, …, C_k;
5) Take c_1, c_2, …, c_k as the initial cluster centers and relabel them μ_1, μ_2, …, μ_k; set the maximum number of iterations R;
6) Calculate the distance between each sample and every cluster center: dist(x_i, μ_j) = ||x_i − μ_j||_2, where i = 1, 2, …, N and j = 1, 2, …, k;
7) Determine the cluster label of x_i from its nearest cluster center: λ_i = arg min_{j ∈ {1, 2, …, k}} dist(x_i, μ_j);
8) Place the sample x_i into the corresponding cluster: C_{λ_i} = C_{λ_i} ∪ {x_i};
9) After all samples have been assigned, calculate the new mean center of each cluster: μ'_i = (1/|C_i|) Σ_{x ∈ C_i} x; if μ'_i and μ_i are not equal, update the cluster center to μ'_i; if they are equal, keep μ_i unchanged; then reassign each sample to its corresponding cluster;
10) Repeat process 9) until no cluster center changes or the maximum number of iterations is reached;
11) Output the cluster partition C = {C_1, C_2, …, C_k}.
(2) Use the bootstrap sampling method to sample from the clusters C_i and construct the multi-branch random forest. The detailed process is as follows:
1) Using the bootstrap method, sample with replacement from cluster C_i to draw T training sets D_i, each containing m training samples;
2) Suppose each sample has M features. At each split of a base decision tree, randomly select m features (m < M), and for every such feature A and each of its values a, calculate the Gini index Gini(D, A);
The Gini index: for a given sample set D, if the set of samples belonging to class c_k is C_k, then the Gini index is
Gini(D) = 1 − Σ_k (|C_k| / |D|)².
The Gini index Gini(D, A) of the set D under the condition of feature A: according to whether feature A takes a possible value a, the sample set D is split into two subsets D_1 = {(x, y) ∈ D | A(x) = a} and D_2 = D − D_1; then
Gini(D, A) = (|D_1| / |D|) Gini(D_1) + (|D_2| / |D|) Gini(D_2).
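The two Gini formulas can be checked with a small pure-Python example (the helper names and the toy feature dictionaries are illustrative):

```python
from collections import Counter

def gini(labels):
    """Gini(D) = 1 - sum_k (|C_k| / |D|)^2 for a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gini_split(samples, labels, feature, value):
    """Gini(D, A): split D on whether feature A equals value a, then
    weight the Gini index of the two subsets by their relative sizes."""
    left = [y for x, y in zip(samples, labels) if x[feature] == value]
    right = [y for x, y in zip(samples, labels) if x[feature] != value]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

X = [{"color": "red"}, {"color": "red"}, {"color": "blue"}, {"color": "blue"}]
y = [1, 1, 0, 0]
print(gini(y))                           # 0.5
print(gini_split(X, y, "color", "red"))  # 0.0, a perfect split
```

A split that perfectly separates the classes drives Gini(D, A) to zero, which is why process 3) picks the (A, a) pair minimizing it.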
3) Choose the optimal feature and the optimal split point: among all features A and all split points a, the A and a with the smallest Gini index are the optimal feature and the optimal split point, which become a tree node; split the data set D_i into two child nodes according to the optimal feature and optimal split point;
4) Recursively apply processes 2) and 3) to the child nodes until the Gini index of a node's data set falls below a predetermined value, completing the construction of a base decision tree;
5) The base decision trees together form the multi-branch random forest.
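The bootstrap sampling of process 1) amounts to drawing with replacement; a minimal standard-library sketch (the function name and toy data are illustrative):

```python
import random

def bootstrap_sets(cluster, m, T, seed=42):
    """Process 1): from one cluster C_i, draw T training sets D_i of m
    samples each, sampling with replacement (bootstrap)."""
    rng = random.Random(seed)
    return [[rng.choice(cluster) for _ in range(m)] for _ in range(T)]

cluster = [("sample_%d" % i, i % 2) for i in range(20)]  # toy (x, y) pairs
sets = bootstrap_sets(cluster, m=10, T=5)
print(len(sets), len(sets[0]))  # 5 10
```

Because sampling is with replacement, a drawn set D_i may contain the same sample more than once; this variation between the T sets is what gives each base tree a different view of the cluster.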
(4) Use the constructed multi-branch random forest model to classify the data, specifically in the following sub-steps:
(1) Input the clusters C = {C_1, C_2, …, C_k} output by the clustering of step (2) into the multi-branch random forest in turn;
(2) Denote the output of the base classifier h_i on category label c_j as h_i^j(x);
(3) Determine the class of each sample by relative majority voting: H(x) = c_{arg max_j Σ_i h_i^j(x)}; repeat sub-steps (2) and (3) until all clusters have been classified;
(4) Output the classification result.
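The relative-majority vote of sub-step (3), in which each base tree casts one vote and the class with the most votes wins, can be sketched as:

```python
from collections import Counter

def majority_vote(tree_outputs):
    """Relative majority vote: tree_outputs is the list of class labels
    predicted by the base trees for one sample; the most common label
    wins (ties broken by first occurrence in the list)."""
    return Counter(tree_outputs).most_common(1)[0][0]

print(majority_vote(["a", "b", "a", "a", "c"]))  # a
```

Unlike absolute-majority voting, the winning class here does not need more than half of the votes, only more than any other class.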
The technical solution of the present invention can improve the performance and accuracy of data classification.
Description of the drawings
Fig. 1 is the flow chart of the present invention;
Fig. 2 is a flow diagram of constructing the multi-branch random forest.
Specific embodiments
The specific embodiments of the present invention are further described below with reference to the accompanying drawings, but they do not limit the present invention.
Fig. 1 shows a multi-branch random forest data classification method comprising the following steps:
(1) Provide an unclassified data set and apply the PCA algorithm for dimensionality reduction and denoising, specifically in the following sub-steps:
(1) Express the sample set as an N × M matrix X;
(2) Zero-center each row: compute the average value R_i of every row of the matrix and subtract it from that row, N_i − R_i; compute the covariance matrix C = (1/M)XXᵀ, and find the eigenvalues λ_1, λ_2, …, λ_m of the covariance matrix C and the corresponding normalized eigenvectors x_1, x_2, …, x_m;
(3) Arrange the eigenvectors as rows from top to bottom in descending order of the corresponding eigenvalues, and take the first k rows to form the matrix P;
(4) Multiply the matrix P by the matrix X to obtain the dimensionality-reduced data, removing the redundant part of the data.
(2) Complete the clustering of the data set using the K-means algorithm and output the clusters C = {C_1, C_2, …, C_k}, specifically in the following sub-steps:
(1) Calculate the density value p_ij of each sample point, where p_ij is the density of the i-th sample point in class j, n_j is the total number of sample points in class j, and d_ijk = ||x_ij − x_kj|| is the distance between sample points x_ij and x_kj in the vector space; take the sample point with the largest density value p_ij as the first cluster center;
(2) The selection of the remaining cluster centers also takes distance into account: for a given sample y_n, its distance to each sample point y_l is normalized;
(3) Sum the density value of each sample point and its normalized distances to the already-selected cluster centers: w_ij = p_ij + Σ_t D_ijt, where p_ij denotes the density of the i-th sample point in class j and D_ijt denotes the normalized distance from sample point x_ij to the center y_t of the t-th selected class; the number of clusters K is determined by the elbow method;
(4) Sort the w_ij in descending order and select the first k − 1 sample points, together with the point with the largest p_ij value, as the initial cluster centers C_1, C_2, …, C_k.
(5) Take c_1, c_2, …, c_k as the initial cluster centers and relabel them μ_1, μ_2, …, μ_k; set the maximum number of iterations R;
(6) Calculate the distance between each sample and every cluster center: dist(x_i, μ_j) = ||x_i − μ_j||_2, where i = 1, 2, …, N and j = 1, 2, …, k;
(7) Determine the cluster label of x_i from its nearest cluster center: λ_i = arg min_{j ∈ {1, 2, …, k}} dist(x_i, μ_j);
(8) Place the sample x_i into the corresponding cluster: C_{λ_i} = C_{λ_i} ∪ {x_i};
(9) After all samples have been assigned, calculate the new mean center of each cluster: μ'_i = (1/|C_i|) Σ_{x ∈ C_i} x; if μ'_i and μ_i are not equal, update the cluster center to μ'_i; if they are equal, keep μ_i unchanged; then reassign each sample to its corresponding cluster;
(10) Repeat sub-step (9) until no cluster center changes or the maximum number of iterations is reached;
(11) Output the cluster partition C = {C_1, C_2, …, C_k}.
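Sub-step (3) above determines K by the elbow method: run K-means for a range of K, compute the within-cluster sum of squares, and pick the K where the curve stops dropping sharply. A sketch using a plain random-initialization K-means (the patent's density-based seeding is omitted here for brevity):

```python
import numpy as np

def inertia(X, labels, centers):
    """Within-cluster sum of squared distances, the quantity plotted
    against K in an elbow chart."""
    return sum(float(np.sum((X[labels == j] - c) ** 2))
               for j, c in enumerate(centers))

def naive_kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's algorithm with random initialization, used only
    to sweep K for the elbow plot."""
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - mu[None]) ** 2).sum(-1), axis=1)
        mu = np.array([X[labels == j].mean(0) if np.any(labels == j)
                       else mu[j] for j in range(k)])
    return labels, mu

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(c, 0.2, size=(40, 2)) for c in (0, 4, 8)])
curve = []
for k in range(1, 7):
    labels, mu = naive_kmeans(X, k)
    curve.append(inertia(X, labels, mu))
# the inertia typically falls steeply up to the true cluster count
# and flattens afterwards; the "elbow" of the curve is the chosen K
print([round(v, 1) for v in curve])
```

In practice the curve is inspected visually or with a knee-detection heuristic; the patent does not specify which variant of the elbow test it uses.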
(3) Construct the multi-branch random forest; the detailed process is shown in Fig. 2, specifically in the following sub-steps:
(1) The construction is completed with a training set of known labels. Given the training set, preprocess it with the K-means algorithm to obtain the clusters C = {C_1, C_2, …, C_k}. The detailed process is as follows:
1) Calculate the density value p_ij of each sample point, where p_ij is the density of the i-th sample point in class j, n_j is the total number of sample points in class j, and d_ijk = ||x_ij − x_kj|| is the distance between sample points x_ij and x_kj in the vector space; take the sample point with the largest density value p_ij as the first cluster center;
2) The selection of the remaining cluster centers also takes distance into account: for a given sample y_n, its distance to each sample point y_l is normalized;
3) Sum the density value of each sample point and its normalized distances to the already-selected cluster centers: w_ij = p_ij + Σ_t D_ijt, where p_ij denotes the density of the i-th sample point in class j and D_ijt denotes the normalized distance from sample point x_ij to the center y_t of the t-th selected class; the number of clusters K is determined by the elbow method;
4) Sort the w_ij in descending order and select the first k − 1 sample points, together with the point with the largest p_ij value, as the initial cluster centers C_1, C_2, …, C_k.
5) Take c_1, c_2, …, c_k as the initial cluster centers and relabel them μ_1, μ_2, …, μ_k; set the maximum number of iterations R;
6) Calculate the distance between each sample and every cluster center: dist(x_i, μ_j) = ||x_i − μ_j||_2, where i = 1, 2, …, N and j = 1, 2, …, k;
7) Determine the cluster label of x_i from its nearest cluster center: λ_i = arg min_{j ∈ {1, 2, …, k}} dist(x_i, μ_j);
8) Place the sample x_i into the corresponding cluster: C_{λ_i} = C_{λ_i} ∪ {x_i};
9) After all samples have been assigned, calculate the new mean center of each cluster: μ'_i = (1/|C_i|) Σ_{x ∈ C_i} x; if μ'_i and μ_i are not equal, update the cluster center to μ'_i; if they are equal, keep μ_i unchanged; then reassign each sample to its corresponding cluster;
10) Repeat process 9) until no cluster center changes or the maximum number of iterations is reached;
11) Output the cluster partition C = {C_1, C_2, …, C_k}.
(2) Use the bootstrap sampling method to sample from the clusters C_i and construct the multi-branch random forest. The detailed process is as follows:
1) Using the bootstrap method, sample with replacement from cluster C_i to draw T training sets D_i, each containing m training samples;
2) Suppose each sample has M features. At each split of a base decision tree, randomly select m features (m < M), and for every such feature A and each of its values a, calculate the Gini index Gini(D, A);
The Gini index: for a given sample set D, if the set of samples belonging to class c_k is C_k, then the Gini index is
Gini(D) = 1 − Σ_k (|C_k| / |D|)².
The Gini index Gini(D, A) of the set D under the condition of feature A: according to whether feature A takes a possible value a, the sample set D is split into two subsets D_1 = {(x, y) ∈ D | A(x) = a} and D_2 = D − D_1; then
Gini(D, A) = (|D_1| / |D|) Gini(D_1) + (|D_2| / |D|) Gini(D_2).
3) Choose the optimal feature and the optimal split point: among all features A and all split points a, the A and a with the smallest Gini index are the optimal feature and the optimal split point, which become a tree node; split the data set D_i into two child nodes according to the optimal feature and optimal split point;
4) Recursively apply processes 2) and 3) to the child nodes until the Gini index of a node's data set falls below a predetermined value, completing the construction of a base decision tree;
5) The base decision trees together form the multi-branch random forest.
(4) Use the constructed multi-branch random forest model to classify the data, specifically in the following sub-steps:
(1) Input the clusters C = {C_1, C_2, …, C_k} output by the clustering of step (2) into the multi-branch random forest in turn;
(2) Denote the output of the base classifier h_i on category label c_j as h_i^j(x);
(3) Determine the class of each sample by relative majority voting: H(x) = c_{arg max_j Σ_i h_i^j(x)}; repeat sub-steps (2) and (3) until all clusters have been classified;
(4) Output the classification result.
The technical solution of the present invention can improve the performance and accuracy of data classification.
The embodiments of the present invention have been described in detail above with reference to the accompanying drawings, but the present invention is not limited to the described embodiments. To those skilled in the art, various changes, modifications, substitutions and variants made to these embodiments without departing from the principles and spirit of the present invention still fall within the protection scope of the present invention.
Claims (8)
1. A multi-branch random forest data classification method, characterized by comprising the following steps:
(1) providing an unclassified data set and applying the PCA algorithm for dimensionality reduction and denoising;
(2) completing the clustering of the data using the K-means algorithm;
(3) constructing a multi-branch random forest;
(4) classifying the data with the multi-branch random forest model.
2. The multi-branch random forest data classification method according to claim 1, characterized in that step (1) specifically comprises the following sub-steps:
(1) expressing the sample set as an N × M matrix X;
(2) zero-centering each row: computing the average value R_i of every row of the matrix and subtracting it from that row, N_i − R_i; computing the covariance matrix C = (1/M)XXᵀ, and finding the eigenvalues λ_1, λ_2, …, λ_m of the covariance matrix C and the corresponding normalized eigenvectors x_1, x_2, …, x_m;
(3) arranging the eigenvectors as rows from top to bottom in descending order of the corresponding eigenvalues, and taking the first k rows to form the matrix P;
(4) multiplying the matrix P by the matrix X to obtain the dimensionality-reduced data, removing the redundant part of the data.
3. The multi-branch random forest data classification method according to claim 1, characterized in that step (2) specifically comprises the following sub-steps:
(1) calculating the density value p_ij of each sample point, where d_ijk = ||x_ij − x_kj||; p_ij is the density of the i-th sample point in class j, n_j is the total number of sample points in class j, and d_ijk is the distance between sample points x_ij and x_kj in the vector space; and taking the sample point with the largest density value p_ij as the first cluster center;
(2) taking distance into account in the selection of the remaining cluster centers: for a given sample y_n, its distance to each sample point y_l is normalized;
(3) summing the density value of each sample point and its normalized distances to the already-selected cluster centers: w_ij = p_ij + Σ_t D_ijt, where p_ij denotes the density of the i-th sample point in class j and D_ijt denotes the normalized distance from sample point x_ij to the center y_t of the t-th selected class; the number of clusters K is determined by the elbow method;
(4) sorting the w_ij in descending order and selecting the first k − 1 sample points, together with the point with the largest p_ij value, as the initial cluster centers C_1, C_2, …, C_k;
(5) taking c_1, c_2, …, c_k as the initial cluster centers and relabeling them μ_1, μ_2, …, μ_k; setting the maximum number of iterations R;
(6) calculating the distance between each sample and every cluster center: dist(x_i, μ_j) = ||x_i − μ_j||_2, where i = 1, 2, …, N and j = 1, 2, …, k;
(7) determining the cluster label of x_i from its nearest cluster center: λ_i = arg min_{j ∈ {1, 2, …, k}} dist(x_i, μ_j);
(8) placing the sample x_i into the corresponding cluster: C_{λ_i} = C_{λ_i} ∪ {x_i};
(9) after all samples have been assigned, calculating the new mean center of each cluster: μ'_i = (1/|C_i|) Σ_{x ∈ C_i} x; if μ'_i and μ_i are not equal, the cluster center is updated to μ'_i; if they are equal, μ_i is kept unchanged; then reassigning each sample to its corresponding cluster;
(10) repeating sub-step (9) until no cluster center changes or the maximum number of iterations is reached;
(11) outputting the cluster partition C = {C_1, C_2, …, C_k}.
4. The multi-branch random forest data classification method according to claim 1, characterized in that step (3) specifically comprises the following sub-steps:
(1) completing the construction with a training set of known labels: given the training set, preprocessing it with the K-means algorithm to obtain the clusters C = {C_1, C_2, …, C_k};
(2) using the bootstrap sampling method to sample from the clusters C_i and construct the multi-branch random forest.
5. The multi-branch random forest data classification method according to claim 4, characterized in that the detailed process of sub-step (1) of step (3) is as follows:
1) calculating the density value p_ij of each sample point, where d_ijk = ||x_ij − x_kj||; p_ij is the density of the i-th sample point in class j, n_j is the total number of sample points in class j, and d_ijk is the distance between sample points x_ij and x_kj in the vector space; and taking the sample point with the largest density value p_ij as the first cluster center;
2) taking distance into account in the selection of the remaining cluster centers: for a given sample y_n, its distance to each sample point y_l is normalized;
3) summing the density value of each sample point and its normalized distances to the already-selected cluster centers: w_ij = p_ij + Σ_t D_ijt, where p_ij denotes the density of the i-th sample point in class j and D_ijt denotes the normalized distance from sample point x_ij to the center y_t of the t-th selected class; the number of clusters K is determined by the elbow method;
4) sorting the w_ij in descending order and selecting the first k − 1 sample points, together with the point with the largest p_ij value, as the initial cluster centers C_1, C_2, …, C_k;
5) taking c_1, c_2, …, c_k as the initial cluster centers and relabeling them μ_1, μ_2, …, μ_k; setting the maximum number of iterations R;
6) calculating the distance between each sample and every cluster center: dist(x_i, μ_j) = ||x_i − μ_j||_2, where i = 1, 2, …, N and j = 1, 2, …, k;
7) determining the cluster label of x_i from its nearest cluster center: λ_i = arg min_{j ∈ {1, 2, …, k}} dist(x_i, μ_j);
8) placing the sample x_i into the corresponding cluster: C_{λ_i} = C_{λ_i} ∪ {x_i};
9) after all samples have been assigned, calculating the new mean center of each cluster: μ'_i = (1/|C_i|) Σ_{x ∈ C_i} x; if μ'_i and μ_i are not equal, the cluster center is updated to μ'_i; if they are equal, μ_i is kept unchanged; then reassigning each sample to its corresponding cluster;
10) repeating process 9) until no cluster center changes or the maximum number of iterations is reached;
11) outputting the cluster partition C = {C_1, C_2, …, C_k}.
6. The multi-branch random forest data classification method according to claim 4, characterized in that the detailed process of sub-step (2) of step (3) is as follows:
1) using the bootstrap method, sampling with replacement from cluster C_i to draw T training sets D_i, each containing m training samples;
2) supposing each sample has M features: at each split of a base decision tree, randomly selecting m features (m < M) and, for every such feature A and each of its values a, calculating the Gini index Gini(D, A);
3) choosing the optimal feature and the optimal split point: among all features A and all split points a, the A and a with the smallest Gini index are the optimal feature and the optimal split point, which become a tree node; splitting the data set D_i into two child nodes according to the optimal feature and optimal split point;
4) recursively applying processes 2) and 3) to the child nodes until the Gini index of a node's data set falls below a predetermined value, completing the construction of a base decision tree;
5) forming the multi-branch random forest from the base decision trees.
7. The multi-branch random forest data classification method according to claim 6, characterized in that, for the Gini index Gini(D, A) in sub-step (2) of step (3): for a given sample set D, if the set of samples belonging to class c_k is C_k, then the Gini index is
Gini(D) = 1 − Σ_k (|C_k| / |D|)².
The Gini index Gini(D, A) of the set D under the condition of feature A: according to whether feature A takes a possible value a, the sample set D is split into two subsets D_1 = {(x, y) ∈ D | A(x) = a} and D_2 = D − D_1; then
Gini(D, A) = (|D_1| / |D|) Gini(D_1) + (|D_2| / |D|) Gini(D_2).
8. The multi-branch random forest data classification method according to claim 1, characterized in that step (4) specifically comprises the following sub-steps:
(1) inputting the clusters C = {C_1, C_2, …, C_k} output by the clustering of step (2) into the multi-branch random forest in turn;
(2) denoting the output of the base classifier h_i on category label c_j as h_i^j(x);
(3) determining the class of each sample by relative majority voting: H(x) = c_{arg max_j Σ_i h_i^j(x)}; repeating sub-steps (2) and (3) until all clusters have been classified;
(4) outputting the classification result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811273813.2A CN109492682A (en) | 2018-10-30 | 2018-10-30 | A kind of multi-branched random forest data classification method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109492682A true CN109492682A (en) | 2019-03-19 |
Family
ID=65691759
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811273813.2A Pending CN109492682A (en) | 2018-10-30 | 2018-10-30 | A kind of multi-branched random forest data classification method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109492682A (en) |
- 2018-10-30: CN CN201811273813.2A (publication CN109492682A), status: Pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105404887A (en) * | 2015-07-05 | 2016-03-16 | 中国计量学院 | White blood count five-classification method based on random forest |
CN105868773A (en) * | 2016-03-23 | 2016-08-17 | 华南理工大学 | Hierarchical random forest based multi-tag classification method |
CN106203508A (en) * | 2016-07-11 | 2016-12-07 | 天津大学 | A kind of image classification method based on Hadoop platform |
CN107395590A (en) * | 2017-07-19 | 2017-11-24 | 福州大学 | A kind of intrusion detection method classified based on PCA and random forest |
Non-Patent Citations (3)
Title |
---|
林伟宁等: "一种基于PCA和随机森林分类的入侵检测算法研究", 《技术研究》 * |
梁腾: "基于聚类分析的入侵检测技术研究", 《中国优秀硕士学位论文全文数据库 信息科技辑》 * |
荀港益: "基于聚类分析与随机森林的短期负荷滚动预测", 《智能城市》 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110344824A (en) * | 2019-06-25 | 2019-10-18 | 中国矿业大学(北京) | A kind of sound wave curve generation method returned based on random forest |
CN110705584A (en) * | 2019-08-21 | 2020-01-17 | 深圳壹账通智能科技有限公司 | Emotion recognition method, emotion recognition device, computer device and storage medium |
WO2021031817A1 (en) * | 2019-08-21 | 2021-02-25 | 深圳壹账通智能科技有限公司 | Emotion recognition method and device, computer device, and storage medium |
CN111797883A (en) * | 2019-09-30 | 2020-10-20 | 浙江浙能中煤舟山煤电有限责任公司 | Coal type identification method based on random forest |
CN113254494A (en) * | 2020-12-04 | 2021-08-13 | 南理工泰兴智能制造研究院有限公司 | New energy research and development classification recording method |
CN113254494B (en) * | 2020-12-04 | 2023-12-08 | 南理工泰兴智能制造研究院有限公司 | New energy research and development classification recording method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109492682A (en) | A kind of multi-branched random forest data classification method | |
Fränti et al. | Randomised local search algorithm for the clustering problem | |
CN110460605B (en) | Abnormal network flow detection method based on automatic coding | |
CN110046634B (en) | Interpretation method and device of clustering result | |
Shen et al. | Balanced binary neural networks with gated residual | |
CN111125469B (en) | User clustering method and device of social network and computer equipment | |
JP5754310B2 (en) | Identification information providing program and identification information providing apparatus | |
CN110942091A (en) | Semi-supervised few-sample image classification method for searching reliable abnormal data center | |
CN110598061A (en) | Multi-element graph fused heterogeneous information network embedding method | |
Tan | Rule learning and extraction with self-organizing neural networks | |
CN109902808A (en) | A method of convolutional neural networks are optimized based on floating-point numerical digit Mutation Genetic Algorithms Based | |
CN115512772A (en) | High-precision single cell clustering method and system based on marker genes and ensemble learning | |
da Silva et al. | Validity index-based vigilance test in adaptive resonance theory neural networks | |
CN109934286A (en) | Bug based on Text character extraction and uneven processing strategie reports severity recognition methods | |
Sadeghi et al. | Deep clustering with self-supervision using pairwise data similarities | |
CN117478390A (en) | Network intrusion detection method based on improved density peak clustering algorithm | |
CN116612307A (en) | Solanaceae disease grade identification method based on transfer learning | |
Fu et al. | Classification via subspace learning machine (slm): Methodology and performance evaluation | |
Fattore et al. | Optimal scoring of partially ordered data, with an application to the ranking of smart cities | |
Arifuzzaman et al. | An Advanced Decision Tree-Based Deep Neural Network in Nonlinear Data Classification. Technologies 2023, 11, 24 | |
CN115168602A (en) | Triple classification method based on improved concepts and examples | |
CN112463964B (en) | Text classification and model training method, device, equipment and storage medium | |
CN110347933B (en) | Ego network social circle recognition method | |
Satpute et al. | Machine Intelligence Techniques for Protein Classification | |
Liao | Graph neural networks: graph generation |
Legal Events
Date | Code | Title | Description |
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20190319 |