CN113850281B - MEANSHIFT optimization-based data processing method and device - Google Patents
MEANSHIFT optimization-based data processing method and device
- Publication number
- CN113850281B
- Authority
- CN
- China
- Prior art keywords
- sample
- cluster
- class
- centers
- center
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Probability & Statistics with Applications (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Image Analysis (AREA)
Abstract
The invention provides a data processing method and device based on mean shift. The method comprises the following steps: collecting user behavior data in real time as an original sample set; initializing a class cluster center according to the number of class clusters and the original sample set; determining, for each sample in the original set of samples, whether there are two or more cluster centers closest to the sample, if so, calculating a local density gradient direction of the sample using mean shift, calculating a similarity between the local density gradient direction of the sample and a direction of the sample toward each of the two or more cluster centers, and classifying the sample into a cluster corresponding to a maximum similarity; otherwise, dividing the sample into a class cluster closest to the center of the class cluster; and pushing related data to each user group in real time according to the clustering result.
Description
Technical Field
The present invention relates to the field of data mining and machine learning, and more particularly to a data processing method and apparatus based on MEANSHIFT optimization.
Background
With the rapid development of modern information technology, the world has entered the era of Internet+ and big data. Big data is profoundly changing how people think, work, and live, and its deep integration with various industries has generated unprecedented social and commercial value. Many data processing methods based on data mining and machine learning have emerged in the course of big data development. Among them, the traditional K-means algorithm randomly selects K samples from a set of N samples as initial cluster centers and assigns each original sample to its nearest cluster according to a minimum distance rule. When the distances from a sample to the centers of one or more other clusters are close to the minimum distance, the clustering effect of K-means is not ideal. How to improve the clustering effect in this scenario has become an urgent problem to be solved.
Chinese patent application CN 201911127104.8 proposes a K-means clustering method based on density Canopy, which uses density Canopy clustering as a preprocessing step for the K-means algorithm. Compared with the traditional K-means algorithm, it improves clustering accuracy, but it does not consider the relation between an original sample and the other clusters, so it guarantees only a local optimum and cannot reach a global optimum.
Chinese patent application CN 201810570097.8 proposes a K-means clustering method based on a neural network. It addresses the problems that existing K-means iteratively optimizes the cluster centers and the label assignment in two separate steps, which makes inference slow, cannot handle new data, large-scale data, or online data, and is sensitive to initial values. However, that method does not consider the scenario in which a sample is nearly equidistant from several clusters, and it cannot partition such a sample reasonably.
Therefore, to partition samples more reasonably and further improve clustering accuracy when a sample is nearly equidistant from several class clusters, it is desirable to provide an improved data processing method.
Disclosure of Invention
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
The invention provides a data processing method and a data processing device based on mean shift optimization, which consider the relation between an original sample and other clusters, so that the edges of each cluster and the peripheral areas of the clusters are divided more reasonably, the clusters are compact, and the clustering precision and speed are greatly improved.
According to an aspect of the present invention, there is provided a data processing method, the method comprising:
collecting user behavior data in real time as an original sample set;
initializing a class cluster center according to the number of class clusters and the original sample set;
for each sample in the original sample set, determining whether there are two or more class cluster centers closest to the sample,
if so, then
calculating the local density gradient direction of the sample using mean shift,
calculating a similarity between the local density gradient direction of the sample and the direction from the sample toward each of the two or more class cluster centers, and
assigning the sample to the class cluster corresponding to the maximum similarity;
otherwise, assigning the sample to the class cluster whose center is closest to the sample; and
pushing relevant data to each user group in real time according to the clustering result.
According to one embodiment of the invention, determining whether there are two or more class cluster centers closest to the sample further comprises:
calculating the Euclidean distances from the sample to the K class cluster centers to obtain a distance set for the sample, wherein K is the number of class clusters;
calculating the ratio of the distance from the sample to each other class cluster center $c_q$ to the smallest distance in the distance set, to obtain a corresponding distance ratio set;
wherein if the distance ratio set contains ratios not greater than $\varepsilon$, the corresponding class cluster centers, together with the class cluster center at the smallest distance, are determined to be closest to the sample, where $\varepsilon$ is a threshold set empirically.
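By way of a non-limiting illustration, the near-tie test of this embodiment might be sketched in Python as follows. The function name `near_tie_centers`, the sample coordinates, and the value of `eps` are hypothetical. The sketch assumes that a ratio not greater than ε marks a near tie, which is the complement of the "all greater than ε" rule stated in Step 5 of the detailed description.

```python
import math

def near_tie_centers(sample, centers, eps):
    """Return (index of the nearest center, indices of rival centers).

    A rival is any other center whose distance ratio to the nearest
    center's distance is not greater than eps (assumed tie rule).
    """
    dists = [math.dist(sample, c) for c in centers]
    m_star = min(range(len(centers)), key=dists.__getitem__)
    d_min = dists[m_star]
    if d_min == 0.0:
        # Assumption: a sample lying exactly on a center is unambiguous.
        return m_star, []
    rivals = [q for q, d in enumerate(dists)
              if q != m_star and d / d_min <= eps]
    return m_star, rivals

centers = [(0.0, 0.0), (2.0, 0.0), (10.0, 10.0)]
# Distances 0.9 and 1.1 give a ratio of about 1.22, so center 1 is a rival.
ambiguous = near_tie_centers((0.9, 0.0), centers, eps=1.3)
# Distances 0.1 and 1.9 give a ratio of 19, so there is no rival.
clear_cut = near_tie_centers((0.1, 0.0), centers, eps=1.3)
```

When the returned rival list is empty, the sample would simply be assigned by the minimum distance rule.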
According to a further embodiment of the invention, calculating the local density gradient direction of the sample using mean shift further comprises:
calculating a local mean shift vector of the sample, wherein the vector points from the sample in the direction of maximum increase of the estimated density.
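As a hedged illustration only, a local mean shift vector could be computed as below. The sketch assumes a flat (uniform) kernel, so the vector is the average offset of the neighbors lying within radius `h` of the sample; it further assumes that the sample itself is not counted as its own neighbor and that a sample with no neighbors yields no vector. The function name and data are hypothetical.

```python
import math

def mean_shift_vector(i, samples, h):
    """Average offset from samples[i] to its neighbors within radius h.

    Returns None when the sample has no neighbors (assumed outlier case).
    """
    x = samples[i]
    neighbors = [s for j, s in enumerate(samples)
                 if j != i and math.dist(s, x) <= h]
    if not neighbors:
        return None
    z = len(neighbors)
    return tuple(sum(s[t] - x[t] for s in neighbors) / z
                 for t in range(len(x)))

samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
shift = mean_shift_vector(0, samples, h=1.5)     # points toward the two neighbors
isolated = mean_shift_vector(3, samples, h=1.5)  # no neighbors within h
```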
According to a further embodiment of the present invention, calculating the similarity further comprises:
calculating, using a cosine similarity algorithm, the similarity between the local density gradient direction of the sample and the direction from the sample toward each of the two or more class cluster centers, wherein the greater the cosine value, the higher the similarity.
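The cosine-based selection of this embodiment might look like the following minimal sketch. The helper names `cosine` and `pick_cluster` are hypothetical, and returning a similarity of 0.0 for a zero-length vector is an assumption made here to avoid division by zero.

```python
import math

def cosine(u, v):
    """Cosine of the angle between u and v (0.0 if either has zero length)."""
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)

def pick_cluster(sample, shift, centers, candidates):
    """Among candidate center indices, pick the one whose direction from
    the sample is most similar to the density gradient direction `shift`."""
    def similarity(k):
        direction = [c - x for c, x in zip(centers[k], sample)]
        return cosine(shift, direction)
    return max(candidates, key=similarity)

centers = [(0.0, 0.0), (2.0, 0.0)]
# The sample sits midway; the density gradient points toward center 1.
chosen = pick_cluster((1.0, 0.0), (1.0, 0.0), centers, candidates=[0, 1])
```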
According to a further embodiment of the present invention, the initialization of the class cluster centers is performed by a K-means++ clustering algorithm, wherein the distances between the class cluster centers are made as large as possible.
According to another aspect of the present invention, there is provided a data processing apparatus, the apparatus comprising:
a data acquisition module configured to acquire user behavior data in real time as an original sample set;
an initialization class cluster center module configured to initialize class cluster centers according to the number of class clusters and the original sample set;
a data clustering module configured to:
for each sample in the original sample set, determine whether there are two or more class cluster centers closest to the sample,
if so,
calculate a local density gradient direction of the sample using mean shift, calculate a similarity between the local density gradient direction of the sample and the direction from the sample toward each of the two or more class cluster centers, and
assign the sample to the class cluster corresponding to the maximum similarity;
otherwise, assign the sample to the class cluster whose center is closest to the sample; and
a data pushing module configured to push related data to each user group associated with each class cluster in real time based on the clustering result.
According to one embodiment of the invention, determining whether there are two or more class cluster centers closest to the sample further comprises:
calculating the Euclidean distances from the sample to the K class cluster centers to obtain a distance set for the sample, wherein K is the number of class clusters;
calculating the ratio of the distance from the sample to each other class cluster center $c_q$ to the smallest distance in the distance set, to obtain a corresponding distance ratio set;
wherein if the distance ratio set contains ratios not greater than $\varepsilon$, the corresponding class cluster centers, together with the class cluster center at the smallest distance, are determined to be closest to the sample, where $\varepsilon$ is a threshold set empirically.
According to a further embodiment of the invention, calculating the local density gradient direction of the sample using mean shift further comprises:
calculating a local mean shift vector of the sample, wherein the vector points from the sample in the direction of maximum increase of the estimated density.
According to a further embodiment of the present invention, calculating the similarity further comprises:
calculating, using a cosine similarity algorithm, the similarity between the local density gradient direction of the sample and the direction from the sample toward each of the two or more class cluster centers, wherein the greater the cosine value, the higher the similarity.
According to a further embodiment of the present invention, the initialization of the class cluster centers is performed by a K-means++ clustering algorithm, wherein the distances between the class cluster centers are made as large as possible.
Compared with the scheme in the prior art, the data processing method and device based on the meanshift optimization provided by the invention have at least the following advantages:
(1) By considering the relation between the original sample and other clusters, the edges of each cluster and the peripheral areas of the clusters are divided more reasonably, the clusters are compact, and the clustering effect is improved, so that the global optimum is achieved.
(2) Compared with the traditional K-means algorithm, the center positions of K class clusters can be estimated more accurately, so that the K class clusters can be converged rapidly, and the iteration times are reduced.
These and other features and advantages will become apparent upon reading the following detailed description and upon reference to the associated drawings. It is to be understood that both the foregoing general description and the following detailed description are explanatory only and are not restrictive of aspects as claimed.
Drawings
So that the manner in which the above recited features of the present invention can be understood in detail, a more particular description of the invention, briefly summarized above, may be had by reference to embodiments, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only certain typical aspects of this invention and are therefore not to be considered limiting of its scope, for the description may admit to other equally effective aspects.
FIG. 1 illustrates an exemplary architecture diagram of a data processing apparatus based on a meanshift optimization in accordance with one embodiment of the present invention.
FIG. 2 shows a flow chart of a method of data processing based on a meanshift optimization in accordance with one embodiment of the present invention.
FIG. 3 shows a flow chart of a mean shift based clustering algorithm according to one embodiment of the invention.
FIG. 4 illustrates an example of a central sample two-dimensional region according to one embodiment of the invention.
Detailed Description
The features of the present invention will become more apparent from the detailed description set forth below when taken in conjunction with the drawings.
FIG. 1 is an exemplary architecture diagram of a meanshift optimization-based data processing apparatus 100 in accordance with one embodiment of the present invention. As shown in FIG. 1, the apparatus 100 of the present invention includes a data acquisition module 101, an initialization class cluster center module 102, a data clustering module 103, and a data pushing module 104.
The data acquisition module 101 may acquire user data in real time as an original sample set and store it in a big data platform, classified according to data characteristics. As an example, the data acquisition module 101 may collect, in real time, behavior data of users watching television programs as the original sample set. Each day, the television programs watched by user i over the preceding 30 days are tallied, the viewing time is accumulated for each of T program types, and the result is normalized to a score, i.e., $\mathrm{time}_t / (\mathrm{time}_1 + \mathrm{time}_2 + \dots + \mathrm{time}_T)$. The scores of each user for the program types are stored as an original sample $x_i$.
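As a small worked example of this normalization, the sketch below turns accumulated viewing times per program type into the score vector stored as a sample. The function name and the figures are hypothetical, and returning all zeros for a user with no viewing time is an assumption, since the text does not address that case.

```python
def viewing_scores(time_by_type):
    """Normalize accumulated viewing times so that the scores sum to 1."""
    total = sum(time_by_type)
    if total == 0:
        # Assumption: a user with no viewing history gets an all-zero sample.
        return [0.0] * len(time_by_type)
    return [t / total for t in time_by_type]

# Hypothetical user: 120, 60 and 20 minutes over T = 3 program types.
x_i = viewing_scores([120.0, 60.0, 20.0])
```

Here the three scores come out at roughly 0.6, 0.3 and 0.1.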
The initialization class cluster center module 102 may initialize the class cluster centers based on the number of class clusters and the original sample set. As one example, the module 102 may initialize K class cluster centers with a K-means++ clustering algorithm so that the distances between them are as large as possible. The specific steps of the K-means++ algorithm are: (1) randomly select a sample point $x_i$ from the original sample set $X$ as the first initial class cluster center $c_1$; (2) for each sample point $x_i$, calculate the shortest distance $D(x)$ to the currently existing class cluster centers and the probability $P(x)$ of being selected as the next class cluster center, and select the sample point with the maximum probability value as the next class cluster center; and (3) repeat step (2) until K class cluster centers are selected.
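A minimal sketch of this initialization follows. It assumes that $P(x)$ is proportional to $D(x)^2$, the weighting used in standard K-means++, because the exact expression is not reproduced in the text above. Following the description, the sample with the maximum probability is taken as the next center, whereas standard K-means++ would instead draw the next center at random according to $P(x)$. The function name and data are hypothetical.

```python
import math
import random

def init_centers(samples, k, rng):
    """Pick k initial centers: the first at random, then repeatedly the
    sample with the largest selection probability P(x)."""
    centers = [samples[rng.randrange(len(samples))]]
    while len(centers) < k:
        # D(x): shortest distance from each sample to the existing centers.
        d = [min(math.dist(x, c) for c in centers) for x in samples]
        total = sum(v * v for v in d)
        if total == 0.0:
            raise ValueError("fewer than k distinct samples")
        probs = [v * v / total for v in d]  # assumed form of P(x)
        nxt = max(range(len(samples)), key=probs.__getitem__)
        centers.append(samples[nxt])
    return centers

samples = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0), (5.0, 9.0)]
centers = init_centers(samples, k=3, rng=random.Random(0))
```

Because an already chosen center has $D(x) = 0$, it cannot be chosen again, so the selected centers are distinct.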
For each sample in the original sample set that is nearly equidistant from two or more class cluster centers, the data clustering module 103 may calculate the local density gradient direction of the sample using a mean shift algorithm; calculate the similarity between that direction and the direction from the sample toward each of the two or more class cluster centers; and assign the sample to the class cluster with the maximum similarity. In particular, the data clustering module 103 may calculate the Euclidean distance from each sample $x_i$ in the original sample set $X$ to the K class cluster centers (see FIG. 4, in which each arrow points from the central sample point to a class cluster center), obtaining a distance set for each sample $x_i$. From the distance set it calculates the distance ratio set of sample $x_i$ and determines whether two or more class cluster centers are nearly equidistant from $x_i$. If so, it records those class cluster centers and calculates the local mean shift vector of $x_i$, which points from the sample in the direction of maximum increase of the estimated density (the density gradient direction for short). It then calculates the similarity between the local density gradient direction of $x_i$ and the direction from $x_i$ to each recorded class cluster center, and assigns $x_i$ to the class cluster with the maximum similarity.
The data pushing module 104 may push relevant data to each user group in real time according to the clustering result. In one example, television users may be automatically divided into K groups by the clustering algorithm; the T attributes (program types) of each group's class cluster center are then sorted, and the background system pushes the relevant programs to each group in a targeted manner according to its Top-N attributes (program types).
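The Top-N selection in this example could be sketched as follows. The program type names and center values are hypothetical, and the ordering among equal scores (here, original order) is an assumption.

```python
def top_n_types(center, type_names, n):
    """Return the n program types with the highest scores in a cluster center."""
    ranked = sorted(range(len(center)), key=lambda t: center[t], reverse=True)
    return [type_names[t] for t in ranked[:n]]

type_names = ["news", "sports", "drama"]   # hypothetical program types
center = [0.1, 0.5, 0.4]                   # hypothetical cluster center
push_list = top_n_types(center, type_names, n=2)
```

For this center, programs of the types sports and drama would be pushed to the group.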
For ease of illustration, embodiments of the present invention are described below using a mean shift-based K-means++ clustering algorithm as an example, but those skilled in the art will appreciate that the present invention is equally applicable to other clustering algorithms.
FIG. 2 is a flow chart of a method 200 of data processing based on a meanshift optimization in accordance with one embodiment of the present invention. The method starts in step 201 with the data acquisition module 101 acquiring user behavior data in real time as an original sample set X.
In step 202, the initialize class cluster center module 102 initializes the class cluster center based on the number of class clusters and the original sample set. Algorithms for initializing cluster-like centers include, but are not limited to, K-means++, K-means, canopy, and the like.
At step 203, the data clustering module 103 determines, for each sample in the original sample set, whether there are two or more class cluster centers that are nearly equidistant from the sample. If so, it calculates the local density gradient direction of the sample using the nonparametric mean shift algorithm, calculates the similarity between that direction and the direction from the sample toward each of the two or more class cluster centers, and assigns the sample to the class cluster corresponding to the maximum similarity; otherwise, it assigns the sample to the class cluster whose center is closest. The specific implementation steps of this algorithm are described in further detail below with reference to FIG. 3.
In step 204, the data pushing module 104 pushes relevant data to each user group in real time according to the clustering result.
FIG. 3 shows a flowchart of a clustering algorithm 300 based on mean shift, according to one embodiment of the invention. The detailed steps of the algorithm 300 are as follows:
Step 1: Input the number of class clusters K and the original sample set X, namely $X = \{x_1, x_2, \dots, x_N\}$.
Step 2: Initialize K class cluster centers using the K-means++ algorithm, i.e., $\{c_1, c_2, \dots, c_K\}$.
Step 3: Calculate the Euclidean distance from each original sample $x_i$ in the original sample set X to the K class cluster centers, denoted $d(x_i, c_k)$, where $k = 1, 2, 3, \dots, K$. The Euclidean distance is given by the following formula: for points x and y in n-dimensional space, $d(x, y) = \sqrt{\sum_{j=1}^{n} (x_j - y_j)^2}$. Thus, a distance set $\{d(x_i, c_1), d(x_i, c_2), \dots, d(x_i, c_K)\}$ is obtained for each sample $x_i$.
Step 4: Calculate the ratio of the distance from the original sample $x_i$ to each other class cluster center $c_q$ to its distance to the class cluster center $c_{m^*}$, i.e., $d(x_i, c_q) / d(x_i, c_{m^*})$, to obtain the corresponding distance ratio set, where the distance from $x_i$ to $c_{m^*}$ is the smallest.
Step 5: If the ratios in the distance ratio set are all greater than the threshold $\varepsilon$, divide by the minimum distance rule, i.e., assign sample $x_i$ to the class cluster whose center is closest, where $\varepsilon$ may be a threshold set empirically.
Step 6: If the distance ratio set contains ratios not greater than $\varepsilon$, the corresponding class cluster centers $\{c_v\}$, together with $c_{m^*}$, are nearly equidistant from sample $x_i$, and mean shift is used to determine which class cluster $x_i$ belongs to. The specific steps are:
a) Take the p-dimensional sphere centered at sample $x_i$ with radius h, denoted $S_h(x_i)$.
b) Find the mean shift vector of $x_i$, denoted $M_h(x_i)$.
Note that if Z is 0, $x_i$ is considered an outlier and is rejected.
c) Obtain the directions from sample $x_i$ to $c_{m^*}$ and to each center in $\{c_v\}$, i.e., $c_{m^*} - x_i$ and $c_v - x_i$.
d) Calculate the similarity between $M_h(x_i)$ and each of these directions using a cosine similarity algorithm, and assign $x_i$ to the class cluster with the maximum similarity. The cosine similarity algorithm evaluates the similarity of two vectors by the cosine of the angle between them; the larger the cosine value, the higher the similarity.
Step 7: After every sample $x_i$ in the original sample set X has been assigned, update the center of each class cluster and calculate the objective function of the overall clustering, denoted $E^{(1)}$.
Step 8: When $E^{(t+1)}$ approximates $E^{(t)}$, the clustering result has converged and is output; otherwise, continue to execute steps 3-7.
What has been described above includes examples of aspects of the claimed subject matter. It is, of course, not possible to describe every conceivable combination of components or methodologies for purposes of describing the claimed subject matter, but one of ordinary skill in the art may recognize that many further combinations and permutations of the claimed subject matter are possible. Accordingly, the disclosed subject matter is intended to embrace all such alterations, modifications and variations that fall within the spirit and scope of the appended claims.
Claims (10)
1. A method of data processing based on MEANSHIFT optimization, the method comprising:
collecting user behavior data in real time as an original sample set;
initializing a class cluster center according to the number of class clusters and the original sample set;
for each sample in the original sample set, determining whether there are two or more class cluster centers closest to the sample,
if so, calculating the local density gradient direction of the sample using mean shift,
calculating a similarity between the local density gradient direction of the sample and the direction from the sample toward each of the two or more class cluster centers, and
assigning the sample to the class cluster corresponding to the maximum similarity;
otherwise, assigning the sample to the class cluster whose center is closest to the sample; and
pushing relevant data to each user group in real time according to the clustering result.
2. The method of claim 1, wherein determining whether there are two or more class cluster centers closest to the sample further comprises:
calculating the Euclidean distances from the sample to the K class cluster centers to obtain a distance set for the sample, wherein K is the number of class clusters;
calculating the ratio of the distance from the sample to each other class cluster center $c_q$ to the distance, in the distance set, from the sample to its nearest class cluster center $c_{m^*}$, to obtain a corresponding distance ratio set;
wherein if the distance ratio set contains ratios not greater than $\varepsilon$, the corresponding class cluster centers, together with $c_{m^*}$, are determined to be closest to the sample, where $\varepsilon$ is a threshold set empirically.
3. The method of claim 1, wherein calculating the local density gradient direction of the sample using mean shift further comprises:
calculating a local mean shift vector of the sample, wherein the vector points from the sample in the direction of maximum increase of the estimated density.
4. The method of claim 1, wherein calculating a similarity further comprises:
calculating, using a cosine similarity algorithm, the similarity between the local density gradient direction of the sample and the direction from the sample toward each of the two or more class cluster centers.
5. The method of claim 1, wherein initializing the class cluster centers is performed by a K-means++ clustering algorithm.
6. A data processing device based on MEANSHIFT optimization, the device comprising:
a data acquisition module configured to acquire user behavior data in real time as an original sample set;
an initialization class cluster center module configured to initialize class cluster centers according to the number of class clusters and the original sample set;
a data clustering module configured to:
for each sample in the original sample set, determine whether there are two or more class cluster centers closest to the sample,
if so, calculate the local density gradient direction of the sample using mean shift,
calculate a similarity between the local density gradient direction of the sample and the direction from the sample toward each of the two or more class cluster centers, and
assign the sample to the class cluster corresponding to the maximum similarity;
otherwise, assign the sample to the class cluster whose center is closest to the sample; and
a data pushing module configured to push related data to each user group associated with each class cluster in real time based on the clustering result.
7. The apparatus of claim 6, wherein determining whether there are two or more class cluster centers closest to the sample further comprises:
calculating the Euclidean distances from the sample to the K class cluster centers to obtain a distance set for the sample, wherein K is the number of class clusters;
calculating the ratio of the distance from the sample to each other class cluster center $c_q$ to the distance, in the distance set, from the sample to its nearest class cluster center $c_{m^*}$, to obtain a corresponding distance ratio set;
wherein if the distance ratio set contains ratios not greater than $\varepsilon$, the corresponding class cluster centers, together with $c_{m^*}$, are determined to be closest to the sample, where $\varepsilon$ is a threshold set empirically.
8. The apparatus of claim 6, wherein calculating the local density gradient direction of the sample using mean shift further comprises:
calculating a local mean shift vector of the sample, wherein the vector points from the sample in the direction of maximum increase of the estimated density.
9. The apparatus of claim 6, wherein calculating a similarity further comprises:
calculating, using a cosine similarity algorithm, the similarity between the local density gradient direction of the sample and the direction from the sample toward each of the two or more class cluster centers.
10. The apparatus of claim 6, wherein initializing the class cluster centers is performed by a K-means++ clustering algorithm.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110161944.7A CN113850281B (en) | 2021-02-05 | 2021-02-05 | MEANSHIFT optimization-based data processing method and device |
PCT/CN2021/136291 WO2022166380A1 (en) | 2021-02-05 | 2021-12-08 | Data processing method and apparatus based on meanshift optimization |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110161944.7A CN113850281B (en) | 2021-02-05 | 2021-02-05 | MEANSHIFT optimization-based data processing method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113850281A (en) | 2021-12-28 |
CN113850281B (en) | 2024-03-12 |
Family
ID=78972859
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110161944.7A Active CN113850281B (en) | 2021-02-05 | 2021-02-05 | MEANSHIFT optimization-based data processing method and device |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN113850281B (en) |
WO (1) | WO2022166380A1 (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114913423A (en) * | 2022-05-25 | 2022-08-16 | 中国电建集团成都勘测设计研究院有限公司 | Model training method and extraction method for surrounding rock fracture information |
CN115563522B (en) * | 2022-12-02 | 2023-04-07 | 湖南工商大学 | Traffic data clustering method, device, equipment and medium |
CN116304776B (en) * | 2023-03-21 | 2023-11-21 | 宁波送变电建设有限公司运维分公司 | Power grid data value anomaly detection method and system based on k-Means algorithm |
CN116628289B (en) * | 2023-07-25 | 2023-12-01 | 泰能天然气有限公司 | Heating system operation data processing method and strategy optimization system |
CN117113118B (en) * | 2023-10-19 | 2024-01-26 | 张家港长三角生物安全研究中心 | Intelligent monitoring method and system for biological aerosol |
CN117217501B (en) * | 2023-11-09 | 2024-02-20 | 山东多科科技有限公司 | Digital production planning and scheduling method |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101777126A (en) * | 2010-02-10 | 2010-07-14 | 华中科技大学 | Clustering method for multidimensional characteristic vectors |
CN104008127A (en) * | 2014-04-21 | 2014-08-27 | 中国电子科技集团公司第二十八研究所 | Group identification method based on clustering algorithm |
CN106779073A (en) * | 2016-12-27 | 2017-05-31 | 西安石油大学 | Media information sorting technique and device based on deep neural network |
CN108985318A (en) * | 2018-05-28 | 2018-12-11 | 中国地质大学(武汉) | A kind of global optimization K mean cluster method and system based on sample rate |
CN110019563A (en) * | 2018-08-09 | 2019-07-16 | 北京首钢自动化信息技术有限公司 | A kind of portrait modeling method and device based on multidimensional data |
CN110134839A (en) * | 2019-03-27 | 2019-08-16 | 平安科技(深圳)有限公司 | Time series data characteristic processing method, apparatus and computer readable storage medium |
CN110852370A (en) * | 2019-11-06 | 2020-02-28 | 国网湖南省电力有限公司 | Clustering algorithm-based large-industry user segmentation method |
CN111967338A (en) * | 2020-07-27 | 2020-11-20 | 广东电网有限责任公司广州供电局 | Method and system for distinguishing partial discharge pulse interference signal based on mean shift clustering algorithm |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102222234A (en) * | 2011-07-14 | 2011-10-19 | 苏州两江科技有限公司 | Image object extraction method based on mean shift and K-means clustering technology |
US11977959B2 (en) * | 2019-05-15 | 2024-05-07 | EMC IP Holding Company LLC | Data compression using nearest neighbor cluster |
CN110441819B (en) * | 2019-08-06 | 2020-10-27 | 五季数据科技(北京)有限公司 | Earthquake first-motion wave automatic pickup method based on mean shift clustering analysis |
2021
- 2021-02-05 CN CN202110161944.7A patent/CN113850281B/en active Active
- 2021-12-08 WO PCT/CN2021/136291 patent/WO2022166380A1/en active Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN113850281A (en) | 2021-12-28 |
WO2022166380A1 (en) | 2022-08-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113850281B (en) | MEANSHIFT optimization-based data processing method and device | |
CN113378632B (en) | Pseudo-label optimization-based unsupervised domain adaptive pedestrian re-identification method | |
Nech et al. | Level playing field for million scale face recognition | |
CN108564129B (en) | Trajectory data classification method based on generation countermeasure network | |
Gao et al. | Less is more: Efficient 3-D object retrieval with query view selection | |
CN105760888B (en) | A kind of neighborhood rough set integrated learning approach based on hierarchical cluster attribute | |
Cao et al. | Deep priority hashing | |
CN111126396A (en) | Image recognition method and device, computer equipment and storage medium | |
CN110751027B (en) | Pedestrian re-identification method based on deep multi-instance learning | |
CN111125469B (en) | User clustering method and device of social network and computer equipment | |
JP6897749B2 (en) | Learning methods, learning systems, and learning programs | |
Lim et al. | Efficient-prototypicalnet with self knowledge distillation for few-shot learning | |
CN114283350B (en) | Visual model training and video processing method, device, equipment and storage medium | |
Lin et al. | Fairgrape: Fairness-aware gradient pruning method for face attribute classification | |
Mansourifar et al. | Virtual big data for GAN based data augmentation | |
CN114238329A (en) | Vector similarity calculation method, device, equipment and storage medium | |
WO2014118978A1 (en) | Learning method, image processing device and learning program | |
CN117312681A (en) | Meta universe oriented user preference product recommendation method and system | |
Amid et al. | A more globally accurate dimensionality reduction method using triplets | |
Zhang et al. | Dataset-driven unsupervised object discovery for region-based instance image retrieval | |
CN114781779A (en) | Unsupervised energy consumption abnormity detection method and device and storage medium | |
CN114332550A (en) | Model training method, system, storage medium and terminal equipment | |
CN110209895B (en) | Vector retrieval method, device and equipment | |
CN111488520B (en) | Crop planting type recommendation information processing device, method and storage medium | |
CN115587297A (en) | Method, apparatus, device and medium for constructing image recognition model and image recognition |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
TA01 | Transfer of patent application right | Effective date of registration: 2022-01-27. Address after: Room 1423, No. 1256 and 1258, Wanrong Road, Jing'an District, Shanghai 200072. Applicant after: Tianyi Digital Life Technology Co.,Ltd. Address before: 201702 3rd floor, 158 Shuanglian Road, Qingpu District, Shanghai. Applicant before: Tianyi Smart Family Technology Co.,Ltd. |
GR01 | Patent grant | ||