CN107423764A - K-Means clustering method for processing big data based on NSS-AKmeans and MapReduce - Google Patents
- Publication number: CN107423764A
- Application number: CN201710619794.3A
- Authority: CN (China)
- Prior art keywords: subset, mapreduce, cluster, akmeans, nss
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Probability & Statistics with Applications (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a K-Means clustering method for processing big data based on NSS-AKmeans and MapReduce. With the number of clusters unknown, the method applies a MapReduce-based improvement of NSS-AKmeans to analyze a large data set block by block, obtaining the number of clusters and the cluster centers of each subset. The per-subset results are then merged to obtain the number of clusters of the data set and initial cluster centers close to the true values. Finally, the standard K-Means algorithm, starting from these initial cluster centers, completes the cluster analysis of the large data set. The invention removes the requirement of Hadoop-based K-Means algorithms that the number of clusters be known in advance, yields more accurate initial cluster centers, and reduces the number of K-Means iterations in the third MapReduce job.
Description
Technical field
The present invention relates to cluster analysis in machine learning, and more particularly to a K-Means clustering method for processing big data based on NSS-AKmeans and MapReduce.
Background technology
With the arrival of the big data era, the rapid growth of data poses great challenges for data analysis methods. Traditional machine learning methods run into various problems when applied directly to large data sets.
K-Means, one of the ten most commonly used machine learning algorithms, is widely applied. It can be used for data analysis on its own and also as a component of other learning tasks. K-Means requires choosing initial cluster centers, and the quality of the chosen centers strongly affects the clustering result. Hadoop is a distributed system architecture that uses a cluster of machines for high-speed computation and storage, which makes it highly significant for the analysis and processing of big data. The core of the Hadoop framework consists of HDFS and MapReduce: HDFS provides storage for massive data, and MapReduce provides computation over it. The parallel model of MapReduce can greatly improve the efficiency of K-Means and makes processing big data much more convenient.
Many research results have been proposed for parallelizing and improving K-Means on MapReduce. Existing MapReduce implementations of K-Means still retain the inherent shortcomings of K-Means. As the input of K-Means, the initial cluster centers strongly affect the final clustering result through their quality. However, existing MapReduce-based K-Means methods offer only limited improvement in choosing initial cluster centers, the number of K-Means iterations remains high, and the number of clusters must still be known in advance. The following documents make certain improvements to the implementation of the K-Means algorithm on MapReduce.
Document 1. Chaturbhuj, Kaustubh S., and Gauri Chaudhary. "Parallel clustering of large data set on Hadoop using data mining techniques." Futuristic Trends in Research and Innovation for Social Welfare (Startup Conclave), World Conference on. IEEE, 2016.
Document 2. Moertini, Veronica S., and Liptia Venica. "Enhancing parallel k-means using map reduce for discovering knowledge from big data." Cloud Computing and Big Data Analysis (ICCCBDA), 2016 IEEE International Conference on. IEEE, 2016.
Document 1 uses the PSO search algorithm to determine the initial cluster centers, thereby reducing the number of K-Means iterations. Although the PSO search finds good initial cluster centers, the PSO algorithm is implemented outside the Hadoop platform, and the number of clusters must be known.
Document 2 completes the cluster analysis with two MapReduce jobs. In the first job, the data set is sampled to obtain a subset, the subset is clustered with the K-Means algorithm, and the resulting cluster centers serve as the initial cluster centers. In the second job, K-Means completes the cluster analysis starting from these initial cluster centers. The drawback of this algorithm is that, although the initial cluster centers are closer to the true cluster centers than randomly selected ones, the gap is still fairly large, so the number of K-Means iterations in the second MapReduce job remains high. Likewise, the number of clusters must be known in this algorithm.
The algorithms proposed in the documents above have two main problems. First, the number of clusters of the data set must be known and cannot be obtained by the algorithm itself. Second, the initial cluster centers they obtain are far from the true cluster centers, so the number of K-Means iterations needed to compute the final cluster centers of the data set remains high.
Summary of the invention
The object of the invention is to provide a K-Means clustering method for processing big data based on NSS-AKmeans and MapReduce, in order to solve the problems described in the background: the number of clusters must be known, and the initial cluster centers are not accurate enough. Compared with existing methods, this method, implemented on MapReduce, automatically selects the number of clusters when clustering a large data set and obtains more accurate initial cluster centers.
To achieve the above object, the invention adopts the following technical solution:
A K-Means clustering method for processing big data based on NSS-AKmeans and MapReduce, comprising the following steps:
(1) In the first MapReduce job, the numerical data set is preprocessed, including data cleaning, normalization, and rearrangement;
(2) The output of the first MapReduce job is fed into the second MapReduce job. In the second MapReduce job, each block of the input data set is sampled to obtain a subset of the data, each subset is clustered with the NSS-AKmeans algorithm to obtain its cluster centers, and these cluster centers are then analyzed and merged to obtain the initial cluster centers;
(3) In the third MapReduce job, starting from the initial cluster centers, the parallel implementation of the standard K-Means algorithm on MapReduce completes the cluster analysis of the data set.
In a further improvement of the invention, in step (1), the data set is randomly shuffled so that the data entries are randomly distributed.
In a further improvement of the invention, in step (2), in the second MapReduce job, each block of the input data set is sampled to obtain a subset whose size is between 5,000 and 10,000.
In a further improvement of the invention, step (2) is implemented as follows:
1) Each subset is clustered with the NSS-AKmeans algorithm to obtain its number of clusters and its cluster centers. Let (n1, n2, ..., nn) be the numbers of clusters of the subsets and let K be their mode; K is taken as the number of clusters of the data set. Results from subsets whose number of clusters is not equal to K are deleted, and the remaining subset clustering results are as follows:
Subset | Number of clusters | Cluster centers |
---|---|---|
Subset 1 | K | p11, p12, …, p1K |
Subset 2 | K | p21, p22, …, p2K |
… | … | … |
Subset n | K | pn1, pn2, …, pnK |
2) Using the cluster centers of the remaining subsets, the initial cluster centers of the data set are computed as follows:
$$p_i = \frac{1}{n}\sum_{j=1}^{n} p_{ji}, \quad i = 1, 2, \ldots, K.$$
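The merging rule in items 1) and 2) can be sketched in Python as follows. This is a minimal single-process illustration, not the patented MapReduce implementation: the function name `merge_subset_results` is hypothetical, and the sketch assumes that the i-th center of every retained subset refers to the same cluster, which the text does not state explicitly.

```python
from collections import Counter


def merge_subset_results(subset_centers):
    """Merge per-subset cluster centers into initial centers for the data set.

    subset_centers: one entry per subset; each entry is a list of cluster
    centers, and each center is a list of floats.
    """
    # K is the mode of the per-subset cluster counts.
    counts = [len(centers) for centers in subset_centers]
    k = Counter(counts).most_common(1)[0][0]
    # Delete the results of subsets whose cluster count differs from K.
    kept = [centers for centers in subset_centers if len(centers) == k]
    n = len(kept)
    dim = len(kept[0][0])
    # p_i = (1/n) * sum_j p_ji: average the i-th center over the kept subsets.
    # Assumption: center i of each subset corresponds to the same cluster.
    initial = [
        [sum(kept[j][i][d] for j in range(n)) / n for d in range(dim)]
        for i in range(k)
    ]
    return k, initial
```

If the centers of different subsets are not already listed in a matching order, some alignment step (for example, matching each center to its nearest counterpart) would be needed before averaging; that step is outside this sketch.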
The invention has the following beneficial effects:
The K-Means clustering method of the invention for processing big data based on NSS-AKmeans and MapReduce can cluster big data when the number of clusters is unknown, obtain accurate initial cluster centers, and then complete a more accurate clustering with K-Means. The method removes the requirement of Hadoop-based K-Means algorithms that the number of clusters be known, yields more accurate initial cluster centers, and reduces the number of K-Means iterations in the third MapReduce job.
Further, the data set is randomly shuffled so that the data entries are randomly distributed, which reduces the chance of drawing unrepresentative subsets in the second MapReduce job. Specifically, data cleaning deletes entries with missing data from the data set, and normalization makes computations on the data more convenient and accurate.
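The three preprocessing operations can be illustrated with a short single-process Python sketch. The text does not say which normalization is used, so min-max scaling to [0, 1] is an assumption here, as are the function name `preprocess` and the use of `None` to mark a missing value; a real first MapReduce job would distribute this work.

```python
import random


def preprocess(rows, seed=None):
    """Clean, min-max normalize, and shuffle a numeric data set.

    rows: list of records, each a list of floats, with None for a missing value.
    """
    # Cleaning: drop every entry that has a missing value.
    clean = [list(r) for r in rows if all(v is not None for v in r)]
    if not clean:
        return []
    dim = len(clean[0])
    lo = [min(r[d] for r in clean) for d in range(dim)]
    hi = [max(r[d] for r in clean) for d in range(dim)]
    # Normalization (assumed min-max): scale each column to [0, 1];
    # a constant column maps to 0.0.
    normalized = [
        [(r[d] - lo[d]) / (hi[d] - lo[d]) if hi[d] > lo[d] else 0.0
         for d in range(dim)]
        for r in clean
    ]
    # Random rearrangement so that the entries are randomly distributed.
    random.Random(seed).shuffle(normalized)
    return normalized
```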
Further, in step (2), in the second MapReduce job, each block of the input data set is sampled to obtain a subset whose size is between 5,000 and 10,000. Sampling yields small subsets, which overcomes the inability of the NSS-AKmeans algorithm to handle large volumes of data.
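A per-block sampling step of this kind might look like the following sketch. The text does not specify the sampling scheme, so simple random sampling without replacement is assumed, and `sample_subset` and `target_size` are hypothetical names; the caller would pick `target_size` within the 5,000 to 10,000 range.

```python
import random


def sample_subset(block, target_size=5000, seed=None):
    """Draw a simple random sample of at most target_size entries from one block."""
    # A block smaller than the target is returned in full (in random order).
    size = min(target_size, len(block))
    return random.Random(seed).sample(block, size)
```

Because the data were shuffled in the first job, each block is already a random slice of the data set, so a sample drawn from one block is less likely to be skewed toward a single region of the data.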
In addition, in step (2), improving the existing NSS-AKmeans clustering method with MapReduce so that it can process big data yields the number of clusters and accurate initial cluster centers. This step obtains the number of clusters of the data set, removing the requirement of existing Hadoop-based K-Means algorithms that the number of clusters be known. The initial cluster centers obtained are also much more accurate than randomly selected ones.
Brief description of the drawings
Fig. 1 is a flowchart of the K-Means clustering method of the invention for processing big data based on NSS-AKmeans and MapReduce;
Fig. 2 compares the reduction in K-Means iterations achieved by the invention with that of other algorithms.
Embodiment
The invention is further described below with reference to the accompanying drawings and a specific embodiment.
As shown in Fig. 1, the K-Means clustering method provided by the invention for processing big data based on NSS-AKmeans and MapReduce comprises the following steps:
(1) In the first MapReduce job, the data are cleaned, normalized, and randomly shuffled. Data cleaning deletes entries with missing data from the data set. Random shuffling makes the data entries randomly distributed, which reduces the chance of drawing unrepresentative subsets later.
(2) In the second MapReduce job, the output of the first MapReduce job is used as input.
a) In the Map function, each block of the input data set is sampled to obtain a subset whose size is between 5,000 and 10,000. Each subset is clustered with the NSS-AKmeans algorithm to obtain its cluster centers. The specific steps are as follows:
i. For each point p, compute its t nearest neighbors and define a set tNN whose members are these t neighbors of p. Then compute the aggregation energy of each point according to the following formula,
where d_ip is the Euclidean distance between point p and a point of the set tNN.
ii. Take the tNN set of the point with the largest aggregation energy as a cluster, and add to this cluster the points in the tNN sets of the points already in it. If more than half of the points in the cluster occur more than t/2 times, keep the cluster and delete its points from the data set. Otherwise, do not keep the cluster, but still delete these points from the data set. Continue finding qualifying clusters in the remaining data in the same way until the data set is empty. This procedure partitions the subset roughly and identifies its high-density regions.
iii. The subset is then clustered more accurately with the agglomerative fuzzy K-Means algorithm, which yields an accurate number of clusters and accurate cluster centers for each subset. The algorithm takes the centers of the subset's high-density regions as its initial cluster centers. Let X = {X_1, X_2, …, X_n} be the points of the subset. Each X_i can be written as {x_i,1; x_i,2; …; x_i,m}, where m is the dimension of each point. The agglomerative fuzzy K-Means algorithm clusters by minimizing the following function:
where u_i,j represents the relation between X_i and the j-th cluster z_j, and D_i,j is the Euclidean distance between the j-th cluster center and the i-th point X_i.
To minimize the function P, first fix U and minimize P with Z as the variable, then fix Z and minimize P with U as the variable; repeat these two steps until P no longer changes. U and Z are updated according to the following formulas:
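The objective and update formulas referred to in step iii are not reproduced in this text. For orientation only, one entropy-regularized form that appears in the published agglomerative fuzzy K-Means literature is sketched below. Whether the patent uses exactly this form is an assumption, as are the penalty weight λ, the number of candidate clusters k (neither symbol appears in the text above), and the reading of D_{i,j} as a squared Euclidean distance, which is what makes the weighted-mean update for z_j exact.

```latex
% Assumed objective, subject to \sum_{j=1}^{k} u_{i,j} = 1 for every point X_i
P(U, Z) = \sum_{j=1}^{k} \sum_{i=1}^{n} u_{i,j} D_{i,j}
        + \lambda \sum_{j=1}^{k} \sum_{i=1}^{n} u_{i,j} \log u_{i,j}

% With U fixed, minimizing over Z gives a weighted mean
z_j = \frac{\sum_{i=1}^{n} u_{i,j} X_i}{\sum_{i=1}^{n} u_{i,j}}

% With Z fixed, minimizing over U gives a softmax over negative distances
u_{i,j} = \frac{\exp(-D_{i,j} / \lambda)}{\sum_{l=1}^{k} \exp(-D_{i,l} / \lambda)}
```

Under this assumed form, each of the two updates can only lower P or leave it unchanged, which is consistent with the stopping rule "until P no longer changes" described above.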
b) The cluster centers of the subsets obtained above are analyzed and merged. Let (n1, n2, ..., nn) be the numbers of clusters of the subsets and let K be their mode; K is taken as the number of clusters of the data set. Results from subsets whose number of clusters is not equal to K are deleted, and the remaining subset clustering results are as follows:
Subset | Number of clusters | Cluster centers |
---|---|---|
Subset 1 | K | p11, p12, …, p1K |
Subset 2 | K | p21, p22, …, p2K |
… | … | … |
Subset n | K | pn1, pn2, …, pnK |
c) Using the cluster centers of the remaining subsets, the initial cluster centers of the data set are computed as follows:
$$p_i = \frac{1}{n}\sum_{j=1}^{n} p_{ji}, \quad i = 1, 2, \ldots, K.$$
(3) In the third MapReduce job, starting from the initial cluster centers, the parallel implementation of the standard K-Means algorithm on MapReduce completes the cluster analysis of the original data, yielding the final cluster centers of the data set. The main steps of the K-Means algorithm are as follows:
a) Partition the data set initially according to the obtained initial cluster centers, giving K clusters;
b) Compute the distance from each point to each cluster center and assign the point to the nearest cluster;
c) Recompute the center of each cluster;
d) Repeat steps b) and c) until the center of each cluster no longer changes within a given tolerance or the maximum number of iterations is reached.
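Steps a) to d) can be sketched in Python with the per-iteration work split into a map phase (assign each point to its nearest center) and a reduce phase (average the points of each cluster). This is a single-process illustration of the structure, not Hadoop code; the function names, the tolerance `tol`, and the choice to leave the center of an empty cluster unchanged are assumptions.

```python
import math
from collections import defaultdict


def nearest(point, centers):
    """Index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda j: math.dist(point, centers[j]))


def kmeans_iteration(points, centers):
    """One K-Means pass written as a map phase followed by a reduce phase."""
    # Map: group each point under the index of its nearest center.
    groups = defaultdict(list)
    for p in points:
        groups[nearest(p, centers)].append(p)
    # Reduce: the new center of each cluster is the mean of its points.
    new_centers = []
    for j, old in enumerate(centers):
        members = groups.get(j)
        if not members:
            new_centers.append(list(old))  # assumed: empty cluster keeps its center
            continue
        new_centers.append(
            [sum(p[d] for p in members) / len(members) for d in range(len(old))]
        )
    return new_centers


def kmeans(points, initial_centers, tol=1e-6, max_iter=100):
    """Iterate until no center moves more than tol or max_iter (>= 1) is reached."""
    centers = [list(c) for c in initial_centers]
    for iteration in range(1, max_iter + 1):
        updated = kmeans_iteration(points, centers)
        shift = max(math.dist(a, b) for a, b in zip(centers, updated))
        centers = updated
        if shift <= tol:
            break
    return centers, iteration
```

In this sketch, initial centers that already coincide with the cluster means satisfy the stopping rule on the first pass, while less accurate starting points need additional passes.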
Experiment and effect analysis
Table 1 compares the initial cluster centers of the data set obtained by this algorithm, the final cluster centers, and the true cluster centers. The data show that this algorithm not only selects the number of clusters of the data set automatically, but that the initial cluster centers obtained by the MapReduce-based improved NSS-AKmeans algorithm are very close to the final cluster centers, and are closer to the true cluster centers than the final cluster centers are. The algorithm therefore overcomes the problem noted in the background that the number of clusters cannot be selected automatically and must be known in advance. At the same time, the initial cluster centers obtained by this algorithm are very close to the true cluster centers.
Fig. 2 compares the reduction in K-Means iterations achieved by the invention with that of other algorithms. Because this algorithm obtains more accurate initial cluster centers, it reduces the number of K-Means iterations more effectively than the other algorithms. As the figure shows, with this method the K-Means algorithm reaches the convergence condition after a single iteration, whereas the other methods have still not converged after several iterations.
Table 1 compares the cluster centers obtained by the clustering of the invention with the true cluster centers:
Claims (4)
1. A K-Means clustering method for processing big data based on NSS-AKmeans and MapReduce, characterized by comprising the following steps:
(1) In the first MapReduce job, the numerical data set is preprocessed, including data cleaning, normalization, and rearrangement;
(2) The output of the first MapReduce job is fed into the second MapReduce job. In the second MapReduce job, each block of the input data set is sampled to obtain a subset of the data, each subset is clustered with the NSS-AKmeans algorithm to obtain its cluster centers, and these cluster centers are then analyzed and merged to obtain the initial cluster centers;
(3) In the third MapReduce job, starting from the initial cluster centers, the parallel implementation of the standard K-Means algorithm on MapReduce completes the cluster analysis of the data set.
2. The K-Means clustering method for processing big data based on NSS-AKmeans and MapReduce according to claim 1, characterized in that, in step (1), the data set is randomly shuffled so that the data entries are randomly distributed.
3. The K-Means clustering method for processing big data based on NSS-AKmeans and MapReduce according to claim 1, characterized in that, in step (2), in the second MapReduce job, each block of the input data set is sampled to obtain a subset whose size is between 5,000 and 10,000.
4. The K-Means clustering method for processing big data based on NSS-AKmeans and MapReduce according to claim 3, characterized in that step (2) is implemented as follows:
1) Each subset is clustered with the NSS-AKmeans algorithm to obtain its number of clusters and its cluster centers. Let (n1, n2, ..., nn) be the numbers of clusters of the subsets and let K be their mode; K is taken as the number of clusters of the data set. Results from subsets whose number of clusters is not equal to K are deleted, leaving the clustering results of the subsets that have exactly K clusters;
2) Using the cluster centers of the remaining subsets, the initial cluster centers of the data set are computed as follows:
$$p_i = \frac{1}{n}\sum_{j=1}^{n} p_{ji}, \quad i = 1, 2, \ldots, K.$$
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710619794.3A CN107423764A (en) | 2017-07-26 | 2017-07-26 | K Means clustering methods based on NSS AKmeans and MapReduce processing big data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107423764A true CN107423764A (en) | 2017-12-01 |
Family
ID=60430357
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710619794.3A Pending CN107423764A (en) | 2017-07-26 | 2017-07-26 | K Means clustering methods based on NSS AKmeans and MapReduce processing big data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107423764A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11099790B2 (en) | 2019-01-10 | 2021-08-24 | Samsung Electronics Co., Ltd. | Parallel key value based multithread machine learning leveraging KV-SSDS |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102243641A (en) * | 2011-04-29 | 2011-11-16 | 西安交通大学 | Method for efficiently clustering massive data |
CN103077253A (en) * | 2013-01-25 | 2013-05-01 | 西安电子科技大学 | High-dimensional mass data GMM (Gaussian Mixture Model) clustering method under Hadoop framework |
CN103793438A (en) * | 2012-11-05 | 2014-05-14 | 山东省计算中心 | MapReduce based parallel clustering method |
CN104063518A (en) * | 2014-07-14 | 2014-09-24 | 南京弘数信息科技有限公司 | Big data clustering method based on decomposition and composition |
CN104156463A (en) * | 2014-08-21 | 2014-11-19 | 南京信息工程大学 | Big-data clustering ensemble method based on MapReduce |
CN105653615A (en) * | 2015-12-25 | 2016-06-08 | 石永丽 | Big data based computer data mining discovery method |
CN105844303A (en) * | 2016-04-08 | 2016-08-10 | 云南大学 | Sampling type clustering integration method based on local and global information |
CN106203507A (en) * | 2016-07-11 | 2016-12-07 | 上海凌科智能科技有限公司 | A kind of k means clustering method improved based on Distributed Computing Platform |
CN106295676A (en) * | 2016-07-26 | 2017-01-04 | 重庆邮电大学 | A kind of self adaptation RK means algorithm based on Hadoop |
CN106530132A (en) * | 2016-11-14 | 2017-03-22 | 国家电网公司 | Power load clustering method and device |
- 2017-07-26 CN CN201710619794.3A patent/CN107423764A/en active Pending
Non-Patent Citations (4)
Title |
---|
LI MA et al.: "An Improved K-means Algorithm based on Mapreduce and Grid", 《INTERNATIONAL JOURNAL OF GRID DISTRIBUTION COMPUTING》 * |
VERONICA S. MOERTINI et al.: "Enhancing Parallel k-Means Using Map Reduce for Discovering Knowledge from Big Data", 《2016 IEEE INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND BIG DATA ANALYSIS》 * |
YANFENG ZHANG et al.: "NSS-AKmeans: An Agglomerative Fuzzy K-Means Clustering Method with Automatic Selection of Cluster Number", 《2010 2ND INTERNATIONAL CONFERENCE ON ADVANCED COMPUTER CONTROL》 * |
ZHANG WENJUN (张文军): "《生态学研究方法》" (Research Methods in Ecology), 31 October 2007 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Wang et al. | Learned index for spatial queries | |
CN107291847A (en) | A kind of large-scale data Distributed Cluster processing method based on MapReduce | |
CN105930862A (en) | Density peak clustering algorithm based on density adaptive distance | |
CN104217015B (en) | Based on the hierarchy clustering method for sharing arest neighbors each other | |
CN110232434A (en) | A kind of neural network framework appraisal procedure based on attributed graph optimization | |
CN104750861A (en) | Method and system for cleaning mass data of energy storage power station | |
CN101324926B (en) | Method for selecting characteristic facing to complicated mode classification | |
CN110888859B (en) | Connection cardinality estimation method based on combined deep neural network | |
CN104765839A (en) | Data classifying method based on correlation coefficients between attributes | |
CN106570250A (en) | Power big data oriented microgrid short-period load prediction method | |
CN101697167B (en) | Clustering-decision tree based selection method of fine corn seeds | |
CN110765582B (en) | Self-organization center K-means microgrid scene division method based on Markov chain | |
CN109840551B (en) | Method for optimizing random forest parameters for machine learning model training | |
CN107145526A (en) | Geographical social activity keyword Reverse nearest neighbor inquiry processing method under a kind of road network | |
CN107516104A (en) | A kind of optimization CART decision tree generation methods and its device based on dichotomy | |
CN109726749A (en) | A kind of Optimal Clustering selection method and device based on multiple attribute decision making (MADM) | |
CN108921324A (en) | Platform area short-term load forecasting method based on distribution transforming cluster | |
CN111177410A (en) | Knowledge graph storage and similarity retrieval method based on evolution R-tree | |
CN109670037A (en) | K-means Text Clustering Method based on topic model and rough set | |
CN104504018A (en) | Top-down real-time big data query optimization method based on bushy tree | |
CN105046323A (en) | Regularization-based RBF network multi-label classification method | |
CN106874367A (en) | A kind of sampling distribution formula clustering method based on public sentiment platform | |
CN103310027B (en) | Rules extraction method for map template coupling | |
WO2020211466A1 (en) | Non-redundant gene clustering method and system, and electronic device | |
CN107423764A (en) | K Means clustering methods based on NSS AKmeans and MapReduce processing big data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20171201 |