CN108520284A - An improved spectral clustering method and its parallelization - Google Patents

An improved spectral clustering method and its parallelization

Info

Publication number
CN108520284A
Authority
CN
China
Prior art keywords
cluster
algorithm
bird
nest
particle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810344423.3A
Other languages
Chinese (zh)
Inventor
强保华
孙颢宁
王玉峰
谢武
韦二龙
史喜娜
赵兴朝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guilin University of Electronic Technology
CETC 54 Research Institute
Original Assignee
Guilin University of Electronic Technology
CETC 54 Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guilin University of Electronic Technology and CETC 54 Research Institute
Priority to CN201810344423.3A
Publication of CN108520284A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2323 Non-hierarchical techniques based on graph theory, e.g. minimum spanning trees [MST] or graph cuts
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]

Abstract

The invention discloses an improved spectral clustering method based on a swarm intelligence algorithm. The eigenvectors corresponding to the first 2k largest eigenvalues of the Laplacian matrix are selected as the source data for clustering, and a swarm intelligence algorithm then chooses good initial centre points for the clustering operation, which improves both the highest clustering accuracy and the stability of repeated clustering results. The invention introduces the cuckoo search algorithm to find the initial centre points: the sum of squared errors is used as the fitness function of the cuckoo search, and, applied within the spectral clustering, the data points with the minimum sum of squared errors found by the search serve as the initial centre points. The invention further introduces the Lévy flight strategy of the cuckoo search into the particle swarm algorithm: when the convergence of the particle swarm algorithm slows down, the Lévy flight strategy generates frequent small steps with occasional large steps, and different velocity update formulas with different emphases are applied for the different step lengths.

Description

An improved spectral clustering method and its parallelization
Technical field
The invention belongs to the field of unsupervised learning in machine learning and relates to clustering methods and swarm intelligence algorithms.
Background technology
The purpose of clustering is to partition data so that similar data are grouped into the same cluster and dissimilar data are placed in different clusters. With the development of information technology, data have become more diverse and many of their dimensions are uncorrelated; traditional clustering algorithms have difficulty handling such data. Spectral clustering is a relatively new clustering algorithm that can cluster in sample spaces of arbitrary shape and converge to a globally optimal solution, and it is widely used in fields such as computer vision, text mining and bioinformatics. Current research on spectral clustering mainly concerns the construction of the similarity matrix, the selection of eigenvectors, the determination of the number of clusters and the application of the algorithm. Among these, the selection of eigenvectors is crucial to the partitioning of the clusters. The classical NJW algorithm clusters with the eigenvectors corresponding to the first k largest eigenvalues, where k is the number of clusters. Sun et al. proposed a feature selection based on principal component analysis (PCA): instead of using the eigenvectors with the larger eigenvalues, a genetic algorithm searches the PCA space for a subset of eigenvectors that reflects the target concept and uses it as the principal component directions. Zhao et al. proposed an eigenvector selection algorithm based on entropy ranking, which first computes the entropy of the eigenvectors, ranks them and selects the better ones; they point out that there exist groups of eigenvectors better than the k eigenvectors with the largest eigenvalues, so the number of selected eigenvectors is not necessarily k. Rebagliati et al. proposed that the gaps between eigenvalues can assist in choosing the number of eigenvectors, and that the eigenvectors of the larger eigenvalues are beneficial to clustering. The document "Machine-learning research: Four current directions" analyses that a single set of eigenvectors suitable for the whole data set does not necessarily exist, and that even if it does, it cannot necessarily be obtained from the small amount of prior information provided.
Invention content
The present invention provides an improved spectral clustering method based on a swarm intelligence algorithm. The eigenvectors corresponding to the first 2k largest eigenvalues of the Laplacian matrix are selected as the source data for clustering, and a swarm intelligence algorithm then chooses good initial centre points for the clustering operation. This improves the highest clustering accuracy and the stability of repeated clustering results, raising the clustering accuracy while guaranteeing the stability of repeated runs.
To keep the results stable, good initial centre points must be selected, so the invention introduces the cuckoo search (Cuckoo Search, CS) algorithm to find the initial centre points. The fitness function of the cuckoo search is the sum of squared errors; applied within the spectral clustering, the data points with the minimum sum of squared errors found by the search serve as the initial centre points, which reduces the instability in accuracy caused by randomly chosen centre points.
To make the search converge faster, the present invention also introduces the Lévy flight strategy of the cuckoo search algorithm into the particle swarm algorithm. When the convergence of the particle swarm algorithm slows down, the Lévy flight strategy generates frequent small steps with occasional large steps, and different velocity update formulas with different emphases are used for the different step lengths: for small steps the influence of the global best solution is weakened, and for large steps the influence of the particle's own historical best solution is weakened. Using this improved algorithm for the optimization yields a smaller convergence value and a faster convergence speed.
Description of the drawings
Fig. 1 is the flow chart of the spectral clustering based on the feature extension and the cuckoo search algorithm.
Fig. 2 is the flow chart of the particle swarm algorithm fused with the cuckoo search.
Fig. 3 is the DAG of the parallelized construction of the Laplacian matrix.
Fig. 4 is the DAG of the parallelized K-means algorithm.
Fig. 5 compares the improved spectral clustering with other clustering algorithms.
Fig. 6 compares the particle swarm algorithm fused with the cuckoo search with other improved algorithms.
Fig. 7 compares the single-machine and cluster times of the parallelized Laplacian matrix construction.
Fig. 8 compares the single-machine and cluster times of the parallelized K-means algorithm.
Specific implementation mode
Selecting the eigenvectors corresponding to the first 2k largest eigenvalues of the Laplacian matrix as the source data for clustering can produce results whose accuracy is higher than that of the ordinary NJW algorithm, but it can also produce results whose accuracy is lower, even though over repeated runs the clustering accuracy is on average higher than that of the ordinary NJW algorithm. The fluctuation of the accuracy over repeated runs arises because, in the feature sample space of spectral clustering, the samples are more compact in the low-dimensional space; when the space is expanded from k to 2k dimensions, dimensions with a poor ability to separate the clusters are introduced and the independence between the data increases, so the data points are more dispersed than in the low-dimensional space. Moreover, since the spectral clustering also uses randomly initialized centre points, the probability of falling into a local optimum during clustering increases, making the accuracy of repeated runs unstable. To keep the results stable, better initial centre points must be selected.
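The feature extension can be summarised in a short NumPy sketch; the function name, the default σ and the row normalisation follow the usual NJW construction and are illustrative assumptions rather than the reference implementation of the invention.

import numpy as np

def spectral_embedding_2k(X, k, sigma=1.0):
    # Gaussian similarity A_ij = exp(-|d_ij|^2 / (2*sigma^2)) with zero diagonal
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    A = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)

    # Normalised Laplacian L = D^{-1/2} A D^{-1/2}
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    L = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

    # Eigenvectors of the first 2k largest eigenvalues (instead of the usual k)
    eigvals, eigvecs = np.linalg.eigh(L)                 # ascending order
    V = eigvecs[:, np.argsort(eigvals)[::-1][:2 * k]]

    # Row-normalise to obtain the clustering source data Y
    return V / np.linalg.norm(V, axis=1, keepdims=True)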
The present invention introduces the cuckoo search (Cuckoo Search, CS) algorithm to find the initial centre points. The cuckoo search algorithm is a swarm intelligence algorithm based on the brood-parasitic hatching behaviour of the cuckoo and on the Lévy flight mechanism; it finds good positions by evaluating a fitness function. In the cuckoo search procedure, the fitness function is the sum of squared errors. By applying the cuckoo search within the spectral clustering and taking the data points with the minimum sum of squared errors found by the search as the initial centre points, the instability in accuracy caused by randomly chosen centre points can be reduced.
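The sum-of-squared-errors fitness that scores one candidate set of initial centres (one nest) can be written, for instance, as the following sketch; the helper name is illustrative.

import numpy as np

def sse_fitness(Y, centers):
    # Squared distance of every point to every candidate centre, shape (n, k)
    d2 = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    # Sum of squared errors to the nearest centre; smaller means fitter
    return d2.min(axis=1).sum()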
To make the search converge faster, the present invention introduces the Lévy flight strategy of the cuckoo search algorithm into the particle swarm algorithm. The particle swarm algorithm, built on observations of the collective behaviour of animals, lets the individuals in the swarm share information so that the movement of the whole swarm evolves from disorder to order in the solution space of the problem and an optimal solution is obtained. When the convergence of the particle swarm algorithm slows down, the Lévy flight strategy is used to generate frequent small steps with occasional large steps, and different velocity update formulas with different emphases are applied for the different step lengths: for small steps the influence of the global best solution is weakened, and for large steps the influence of the particle's own historical best solution is weakened. Using this improved algorithm for the optimization yields a smaller convergence value and a faster convergence speed.
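The Lévy(λ) step lengths mentioned above can be sampled, for example, with Mantegna's algorithm; this sketch is an assumption about one common way to draw such steps, not the exact formula used by the invention.

import math
import numpy as np

def levy_step(size, lam=1.5, rng=None):
    # Mantegna's algorithm: step = u / |v|^(1/lam), u ~ N(0, sigma_u^2), v ~ N(0, 1)
    rng = rng or np.random.default_rng()
    sigma_u = (math.gamma(1 + lam) * math.sin(math.pi * lam / 2)
               / (math.gamma((1 + lam) / 2) * lam * 2 ** ((lam - 1) / 2))) ** (1 / lam)
    u = rng.normal(0.0, sigma_u, size)
    v = rng.normal(0.0, 1.0, size)
    return u / np.abs(v) ** (1 / lam)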
To improve the efficiency of processing massive data, the invention introduces the Spark distributed computing framework. Using the Spark RDD programming model, the construction of the Laplacian matrix in the spectral clustering and the K-means algorithm are designed and implemented in parallel, so that the storage and processing of the data move from a single machine to a cluster, improving the processing speed and the ability to store and process massive data.
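A minimal PySpark setup for this parallelization might look as follows; the application name, master URL and input path are hypothetical, and the resulting RDD of (row index, vector) tuples feeds the parallel stages sketched later.

from pyspark import SparkConf, SparkContext

# Local Spark context for illustration; in a cluster the master URL would differ.
conf = SparkConf().setAppName("improved-spectral-clustering").setMaster("local[*]")
sc = SparkContext(conf=conf)

# Each line of the (hypothetical) input file holds one comma-separated sample.
points = (sc.textFile("hdfs:///data/samples.csv")
            .map(lambda line: [float(v) for v in line.split(",")])
            .zipWithIndex()
            .map(lambda t: (t[1], t[0])))          # (row index, vector) tuples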
Referring to Fig. 1, the spectral clustering based on the feature extension and the cuckoo search algorithm is implemented with the following steps (a minimal sketch follows the list):
(1) Given the original data X = [x_1, x_2, x_3, …, x_n] ∈ R^d and the number of clusters k.
(2) Compute the Laplacian matrix L according to the following formulas: A_ij = exp(−|d_ij|² / (2σ²)) for i ≠ j, A_ii = 0; D is the diagonal matrix with D_ii = Σ_j A_ij; L = D^{−1/2} A D^{−1/2}.
(3) Compute the normalized matrix Y formed by the eigenvectors of the first 2k largest eigenvalues of the Laplacian matrix L.
(4) Randomly initialize the positions of n bird's nests within the matrix Y.
(5) Taking the positions of the n nests as cluster centres, perform a clustering assignment for each, compute the fitness value F of each nest, and keep the nest position with the smallest F value: F_best = min{F_1, F_2, …, F_n}.
(6) Update the nest positions according to the Lévy flight strategy of the cuckoo search algorithm.
(7) Perform the clustering assignment again with the updated nests, compute the fitness of each nest, compare the old and new generations of nests by fitness and against the previous minimum F value, and keep the nest positions with the smaller F values.
(8) Compare a random number r ∈ [0, 1] with the detection probability P_a; if r < P_a, keep the nest position; if r > P_a, update the nest position by the replacement formula and keep whichever of the old and new positions has the smaller F value.
(9) If the maximum number of iterations or the preset stopping condition has not been reached, return to step (5) and continue; otherwise keep the solution with the minimum fitness value and proceed to the following step.
(10) Run K-means clustering from the best nest positions obtained, and output the final cluster centres and clustering result.
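Steps (4)-(10) can be sketched as follows, reusing sse_fitness and levy_step from the earlier sketches; the number of nests, the scale factor alpha, the detection probability p_a and the simplified replacement of abandoned nests (re-drawing them from the data points) are illustrative assumptions rather than the exact formulas of the invention.

import numpy as np

def cs_init_centers(Y, k, n_nests=20, max_iter=100, alpha=0.01, p_a=0.25, rng=None):
    rng = rng or np.random.default_rng()
    n, d = Y.shape
    # Step (4): each nest is one candidate set of k centres drawn from the rows of Y
    nests = np.stack([Y[rng.choice(n, k, replace=False)] for _ in range(n_nests)])
    fitness = np.array([sse_fitness(Y, c) for c in nests])

    for _ in range(max_iter):
        # Step (6): propose new nests by a Levy flight around the current ones;
        # steps (5)/(7): keep whichever generation has the smaller SSE fitness
        for i in range(n_nests):
            cand = nests[i] + alpha * levy_step((k, d))
            f = sse_fitness(Y, cand)
            if f < fitness[i]:
                nests[i], fitness[i] = cand, f
        # Step (8): nests kept with probability p_a, otherwise replaced by a fresh
        # random draw from the data points (a simplification of the replacement formula)
        for i in range(n_nests):
            if rng.random() < p_a:
                continue
            cand = Y[rng.choice(n, k, replace=False)]
            f = sse_fitness(Y, cand)
            if f < fitness[i]:
                nests[i], fitness[i] = cand, f

    # Step (9): the nest with the minimum fitness provides the initial centres
    return nests[np.argmin(fitness)]

The centres returned here would then be used to start the final K-means run of step (10).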
With reference to Fig. 2, the particle swarm algorithm fused with the cuckoo search is implemented with the following steps (a hedged sketch follows the list):
(1) Initialization: set the population size, randomly initialize each particle's position x_i = (x_i1, x_i2, …, x_iD) and set its initial velocity v_i = (v_i1, v_i2, …, v_iD).
(2) Compute the fitness value F of each particle, and record the best position p_i = (p_i1, p_i2, …, p_iD) visited by each particle and the best position p_g = (p_g1, p_g2, …, p_gD) visited by the swarm.
(3) Each particle updates its velocity and position according to the following formulas: v_id ← v_id + c1·r1·(p_id − x_id) + c2·r2·(p_gd − x_id), x_id ← x_id + v_id.
Here c1 and c2 are the learning factors (acceleration coefficients), normally positive constants and usually equal to 2, and r1 and r2 are pseudo-random numbers uniformly distributed in the interval [0, 1].
(4) During the iterations, judge how fast the fitness value has decreased over the last 10 iterations. When the decrease falls below a threshold, introduce the Lévy flight strategy: let step = Lévy(λ), so that the step lengths are frequently small with occasional large jumps, and weight the influence factors differently for small and large steps. For small steps, the influence of the global best solution is weakened through the global influence factor; for large steps, the influence of the particle's own historical best solution is weakened through the local influence factor.
(5) If the maximum number of iterations or the preset stopping condition has not been reached, return to step (2) and continue; otherwise keep the position with the minimum fitness value.
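A hedged sketch of the fused algorithm follows, reusing levy_step from the earlier sketch. The exact step-dependent velocity update formulas are not reproduced in the text above, so the inertia weight w, the initialization range, the stagnation test over the last 10 iterations and the 0.5 down-weighting of the global/local influence terms are illustrative assumptions.

import numpy as np

def cs_pso(fit, dim, n_particles=30, max_iter=1000, c1=2.0, c2=2.0,
           w=0.7, stall_window=10, stall_tol=1e-6, rng=None):
    rng = rng or np.random.default_rng()
    x = rng.uniform(-1.0, 1.0, (n_particles, dim))   # step (1): illustrative init range
    v = np.zeros_like(x)
    f = np.array([fit(p) for p in x])
    p_best, p_best_f = x.copy(), f.copy()            # step (2): personal bests
    g_best = x[np.argmin(f)].copy()                  # and the global best
    history = [p_best_f.min()]

    for _ in range(max_iter):
        r1, r2 = rng.random((2, n_particles, 1))
        # Step (4): if the best fitness has barely dropped over the last
        # stall_window iterations, switch to Levy-flight step lengths.
        stalled = (len(history) > stall_window and
                   history[-stall_window - 1] - history[-1] < stall_tol)
        if stalled:
            step = np.abs(levy_step((n_particles, 1)))
            small = step < 1.0
            # Small steps weaken the global-best term, large steps weaken the
            # personal-best term; the 0.5 factor is an assumption.
            g_w = np.where(small, 0.5, 1.0)
            p_w = np.where(small, 1.0, 0.5)
            v = w * v + p_w * c1 * r1 * (p_best - x) + g_w * c2 * r2 * (g_best - x)
            x = x + step * v
        else:
            v = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)  # step (3)
            x = x + v

        f = np.array([fit(p) for p in x])
        improved = f < p_best_f
        p_best[improved], p_best_f[improved] = x[improved], f[improved]
        g_best = p_best[np.argmin(p_best_f)].copy()
        history.append(p_best_f.min())               # step (5): loop bounded by max_iter

    return g_best, float(p_best_f.min())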
Fig. 3 shows the stage flow for building the Laplacian matrix in parallel with Spark RDDs; the steps are as follows (a PySpark sketch follows the list):
1) Create dataRDD from the input data in the format Tuple<index, data>, where index is the row number and data is the data vector;
2) Copy the created dataRDD as cloneRDD;
3) Form the Cartesian product of the two RDDs with the cartesian operator and, to reduce repeated computation, filter out the identical half of the pairs with the filter operator, obtaining groupRDD;
4) Use the flatMap operator to compute the similarity of each pair of data points in groupRDD;
5) Use reduceByKey to compute the sum of each matrix row, obtaining the diagonal matrix matrixD_RDD;
6) Apply the map operator to matrixD_RDD to obtain the matrix D^{-1/2};
7) Use the map operator to compute A·D^{-1/2}, obtaining AD_RDD;
8) Use the map operator to compute D^{-1/2}·A·D^{-1/2}, obtaining L_RDD.
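A PySpark sketch of steps 1)-8) is given below, reusing the (index, vector) RDD from the setup sketch above; the Gaussian-similarity helper and the use of a broadcast variable for D^{-1/2} (in place of the separate AD_RDD stage) are illustrative assumptions.

import math

def build_laplacian_rdd(sc, dataRDD, sigma=1.0):
    # 1)-2) dataRDD holds (index, vector) tuples; cache it since it is used twice
    data = dataRDD.cache()

    # 3) Cartesian product with itself, keeping only one copy of each pair (i < j)
    pairs = data.cartesian(data).filter(lambda p: p[0][0] < p[1][0])

    # 4) Gaussian similarity A_ij, emitted for both (i, j) and (j, i)
    def similarity(pair):
        (i, xi), (j, xj) = pair
        a = math.exp(-sum((u - v) ** 2 for u, v in zip(xi, xj)) / (2.0 * sigma ** 2))
        return [((i, j), a), ((j, i), a)]
    A = pairs.flatMap(similarity).cache()

    # 5)-6) row sums give the diagonal degree matrix D, then D^{-1/2}
    d_inv_sqrt = (A.map(lambda kv: (kv[0][0], kv[1]))
                   .reduceByKey(lambda a, b: a + b)
                   .mapValues(lambda s: 1.0 / math.sqrt(s))
                   .collectAsMap())
    bc = sc.broadcast(d_inv_sqrt)

    # 7)-8) L_ij = D_ii^{-1/2} * A_ij * D_jj^{-1/2} (diagonal entries stay zero)
    return A.map(lambda kv: (kv[0], bc.value[kv[0][0]] * kv[1] * bc.value[kv[0][1]]))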
Fig. 4 shows the main loop of the K-means algorithm parallelized with Spark RDDs; the steps are (a PySpark sketch follows the list):
(1) Initialize the k centre points kClusters according to k;
(2) Set kClusters as a broadcast variable;
(3) Use the mapToPair method to compute the cluster to which each data point belongs;
(4) Use the reduceByKey method to recompute the centre point from the data points of each cluster;
(5) Repeat steps (2)-(4) until the convergence condition is reached.
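The main loop might be sketched as follows in PySpark, where pointsRDD is an RDD of plain feature vectors (for example the rows of Y); the convergence test on centre movement and the tolerance are illustrative assumptions, and the Python API uses map where the Java API in the text uses mapToPair.

import numpy as np

def kmeans_rdd(sc, pointsRDD, init_centers, max_iter=50, tol=1e-4):
    centers = np.asarray(init_centers, dtype=float)
    for _ in range(max_iter):
        # (2) ship the current centres to every executor as a broadcast variable
        bc = sc.broadcast(centers)

        # (3) assign every point to its nearest centre: (cluster id, (vector, 1))
        def nearest(p):
            c = bc.value
            p = np.asarray(p, dtype=float)
            return int(np.argmin(((c - p) ** 2).sum(axis=1))), (p, 1)

        # (4) sum the vectors and counts per cluster, then recompute the centres
        sums = (pointsRDD.map(nearest)
                         .reduceByKey(lambda a, b: (a[0] + b[0], a[1] + b[1]))
                         .collectAsMap())
        new_centers = centers.copy()
        for cid, (vec_sum, count) in sums.items():
            new_centers[cid] = vec_sum / count

        # (5) repeat until the centres stop moving
        shift = float(np.abs(new_centers - centers).max())
        centers = new_centers
        if shift < tol:
            break
    return centers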
Fig. 5 compares the accuracy of the improved spectral clustering with other clustering algorithms. The five algorithms compared are the K-means algorithm, the ordinary NJW algorithm, the entropy-ranking-based ESBER_D algorithm, the feature-extended 2K_NJW algorithm, and the CS_2K_NJW algorithm that fuses the cuckoo search algorithm with the feature extension.
On the four data sets, the highest clustering accuracy of the 2K_NJW algorithm is higher than that of the NJW and ESBER_D algorithms, so the number of eigenvectors that yields the most accurate clustering is not necessarily k: every eigenvector dimension has some ability to separate the clusters. It is also observed that the 20 clustering runs of 2K_NJW on each data set fluctuate. This is because, among the additional k eigenvectors introduced, some eigenvectors interfere with the clustering even while providing useful information, increasing the dispersion of the data points in the eigenvector space; the clustering process therefore becomes more sensitive to the initial centre points and is more easily trapped in local optima, leading to poor overall results.
For the clustering results of the CS_2K_NJW algorithm, it can be clearly seen that good stability is maintained while the high accuracy of the 2K_NJW algorithm is preserved, so the clustering results are better than those of the other three algorithms. On top of clustering with 2k eigenvectors, the algorithm adds the cuckoo search, which seeks good initial cluster centres with the objective of minimizing the sum of within-cluster variances over all clusters; this exploits the cluster-separating ability of every eigenvector dimension and guarantees the stability of repeated clustering runs. Computing high-quality initial centres with the cuckoo search does, however, add computation to the original algorithm.
Fig. 6 compares the convergence time and the minimum convergence value of the improved search algorithm with other swarm intelligence algorithms. The algorithms compared include the cuckoo search (CS) algorithm, particle swarm optimization (PSO), the particle swarm algorithm with the cuckoo search introduced globally (CS_PSO(WL)), and the particle swarm algorithm with the cuckoo search introduced partially (CS_PSO(PL)).
As can be seen from Fig. 6, the basic PSO algorithm converges quickly and its convergence value is substantially better than that of the CS algorithm; 1000 iterations take about 1%-3% more time than the CS algorithm, so its overall efficiency is still better than CS. The CS_PSO(WL) algorithm converges more slowly than PSO, and on most data sets its 1000 iterations also take longer than PSO, but its convergence value is better than that of PSO. The convergence value of CS_PSO(PL) is on a par with CS_PSO(WL) and better than PSO; CS_PSO(PL) needs fewer iterations than PSO to reach the convergence level of PSO, and its time for 1000 iterations is less than that of CS_PSO(WL) and about 1%-4% more than that of PSO. Therefore, CS_PSO(PL) reaches the convergence level of PSO in roughly the same time as PSO, and shortly afterwards reaches a convergence quality better than PSO. Taken together, the CS_PSO(PL) algorithm is the more efficient.
Fig. 7 and Fig. 8 compare the running times of the parallel cluster implementations of the Laplacian matrix construction and of the K-means algorithm with the corresponding single-machine runs. As the figures show, the running time of both operations on the parallel cluster is less than the single-machine running time, and the larger the amount of data processed in parallel, the higher the efficiency of the cluster.
The experiments show that extending the eigenvector space of spectral clustering and determining the initial cluster centres with the cuckoo search algorithm improves both the accuracy and the stability of spectral clustering. Incorporating the Lévy flight strategy of the cuckoo search into the particle swarm algorithm and adjusting the step-length weights in the update formula yields a swarm intelligence algorithm with a low convergence value and fast convergence. Parallelizing the computation of the spectral clustering with the Spark RDD programming model improves the processing speed of the data and the ability to store and process massive data.

Claims (2)

1. An improved spectral clustering and parallelization method, the method comprising:
selecting the eigenvectors corresponding to the first 2k largest eigenvalues of the Laplacian matrix L as the source data for clustering, k being the number of clusters;
finding the initial centre points using the cuckoo search algorithm;
wherein finding the initial centre points using the cuckoo search algorithm comprises the following steps:
(1) giving the original data X = [x_1, x_2, x_3, …, x_n] ∈ R^d and the number of clusters k;
(2) computing the Laplacian matrix L according to the following formula:
A_ij = exp(−|d_ij|² / (2σ²)), i ≠ j; A_ii = 0;
(3) computing the normalized matrix Y of the eigenvectors of the first 2k largest eigenvalues of the Laplacian matrix L;
(4) randomly initializing the positions of n bird's nests within the matrix Y;
(5) taking the positions of the n nests as cluster centres, performing a clustering assignment for each, computing the fitness value F of each nest, and keeping the nest position with the smallest F value: F_best = min{F_1, F_2, …, F_n};
(6) updating the nest positions according to the Lévy flight strategy of the cuckoo search algorithm;
(7) performing the clustering assignment again with the updated nests, computing the fitness of each nest, comparing the old and new generations of nests by fitness and against the previous minimum F value, and keeping the nest positions with the smaller F values;
(8) comparing a random number r ∈ [0, 1] with the detection probability P_a; if r < P_a, keeping the nest position; if r > P_a, updating the nest position by the replacement formula and keeping whichever of the old and new positions has the smaller F value;
(9) if the maximum number of iterations or the preset stopping condition has not been reached, returning to step (5) and continuing; otherwise keeping the solution with the minimum fitness value and proceeding to the following step;
(10) running K-means clustering from the best nest positions obtained, and outputting the final cluster centres and clustering result.
2. The method according to claim 1, wherein the Lévy flight strategy of the cuckoo search algorithm is introduced into a particle swarm algorithm, the particle swarm algorithm comprising the following steps:
(1) initialization: setting the population size, randomly initializing each particle's position x_i = (x_i1, x_i2, …, x_iD) and setting its initial velocity v_i = (v_i1, v_i2, …, v_iD);
(2) computing the fitness value F of each particle, and recording the best position p_i = (p_i1, p_i2, …, p_iD) visited by each particle and the best position p_g = (p_g1, p_g2, …, p_gD) visited by the swarm;
(3) each particle updating its velocity and position according to the following formulas: v_id ← v_id + c1·r1·(p_id − x_id) + c2·r2·(p_gd − x_id), x_id ← x_id + v_id;
wherein c1 and c2 are learning factors (acceleration coefficients), and r1 and r2 are pseudo-random numbers uniformly distributed in the interval [0, 1];
(4) during the iterations, judging how fast the fitness value F has decreased over the last 10 iterations; when the decrease falls below a threshold, introducing the Lévy flight strategy by letting step = Lévy(λ), so that the step lengths are frequently small with occasional large jumps; for small steps, weakening the influence of the global best solution through the global influence factor; for large steps, weakening the influence of the particle's own historical best solution through the local influence factor;
(5) if the maximum number of iterations or the preset stopping condition has not been reached, returning to step (2) and continuing; otherwise keeping the position with the minimum fitness value F.
CN201810344423.3A 2018-04-17 2018-04-17 An improved spectral clustering method and its parallelization Pending CN108520284A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810344423.3A CN108520284A (en) 2018-04-17 2018-04-17 An improved spectral clustering method and its parallelization

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810344423.3A CN108520284A (en) 2018-04-17 2018-04-17 An improved spectral clustering method and its parallelization

Publications (1)

Publication Number Publication Date
CN108520284A true CN108520284A (en) 2018-09-11

Family

ID=63428816

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810344423.3A Pending CN108520284A (en) An improved spectral clustering method and its parallelization

Country Status (1)

Country Link
CN (1) CN108520284A (en)

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109739585B (en) * 2018-12-29 2022-02-18 广西交通科学研究院有限公司 Spark cluster parallelization calculation-based traffic congestion point discovery method
CN109739585A (en) * 2018-12-29 2019-05-10 广西交通科学研究院有限公司 The traffic congestion point discovery method calculated based on spark cluster parallelization
CN110334026A (en) * 2019-07-03 2019-10-15 浙江理工大学 Combined test case generation method based on CS-SPSO algorithm
CN110334026B (en) * 2019-07-03 2023-03-24 浙江理工大学 CS-SPSO algorithm-based combined test case generation method
CN110580077A (en) * 2019-08-20 2019-12-17 广东工业大学 maximum power extraction method of photovoltaic power generation system and related device
CN110516068A (en) * 2019-08-23 2019-11-29 贵州大学 A kind of various dimensions Text Clustering Method based on metric learning
CN110516068B (en) * 2019-08-23 2023-05-26 贵州大学 Multi-dimensional text clustering method based on metric learning
CN112149898A (en) * 2020-09-21 2020-12-29 广东电网有限责任公司清远供电局 Fault rate prediction model training method, fault rate prediction method and related device
CN112149898B (en) * 2020-09-21 2023-10-31 广东电网有限责任公司清远供电局 Training of failure rate prediction model, failure rate prediction method and related device
CN112836786A (en) * 2021-02-07 2021-05-25 中国科学院长春光学精密机械与物理研究所 Cuckoo search method
CN112836786B (en) * 2021-02-07 2024-03-12 中国科学院长春光学精密机械与物理研究所 Cuckoo searching method
CN113141317A (en) * 2021-03-05 2021-07-20 西安电子科技大学 Streaming media server load balancing method, system, computer equipment and terminal
CN112988693A (en) * 2021-03-26 2021-06-18 武汉大学 Spectral clustering algorithm parallelization method and system in abnormal data detection
WO2024074023A1 (en) * 2022-10-07 2024-04-11 南京邮电大学 Task scheduling method based on improved particle swarm optimization algorithm
CN116360505A (en) * 2023-06-02 2023-06-30 北京航空航天大学 Integrated automatic control method and system for stratospheric airship and electronic equipment
CN116360505B (en) * 2023-06-02 2023-08-22 北京航空航天大学 Integrated automatic control method and system for stratospheric airship and electronic equipment
CN118212001A (en) * 2024-03-19 2024-06-18 涅生科技(广州)股份有限公司 Electronic commerce market trend prediction system based on big data analysis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20180911)