CN113850281B - MEANSHIFT optimization-based data processing method and device - Google Patents

Info

Publication number
CN113850281B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110161944.7A
Other languages
Chinese (zh)
Other versions
CN113850281A (en)
Inventor
吕超
张继东
沈志平
吴浩宇
吴风蛟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianyi Digital Life Technology Co Ltd
Original Assignee
Tianyi Digital Life Technology Co Ltd
Application filed by Tianyi Digital Life Technology Co Ltd
Priority to CN202110161944.7A
Priority to PCT/CN2021/136291 (WO2022166380A1)
Publication of CN113850281A
Application granted
Publication of CN113850281B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering

Abstract

The invention provides a data processing method and device based on mean shift. The method comprises the following steps: collecting user behavior data in real time as an original sample set; initializing a class cluster center according to the number of class clusters and the original sample set; determining, for each sample in the original set of samples, whether there are two or more cluster centers closest to the sample, if so, calculating a local density gradient direction of the sample using mean shift, calculating a similarity between the local density gradient direction of the sample and a direction of the sample toward each of the two or more cluster centers, and classifying the sample into a cluster corresponding to a maximum similarity; otherwise, dividing the sample into a class cluster closest to the center of the class cluster; and pushing related data to each user group in real time according to the clustering result.

Description

MEANSHIFT optimization-based data processing method and device
Technical Field
The present invention relates to the field of data mining and machine learning, and more particularly to a MEANSHIFT-based optimized data processing method and apparatus.
Background
With the rapid development of modern information technology, the world has entered the era of Internet Plus and big data. Big data is profoundly changing the way people think, produce, and live, and its deep integration with various industries is generating unprecedented social and commercial value. Many data processing methods based on data mining and machine learning have emerged in the course of big data development. Among them, the traditional K-means algorithm randomly selects K samples from N samples as the initial class cluster centers and assigns each original sample to the nearest class cluster based on a minimum distance rule. When the distances between a sample and several class cluster centers are all close to the minimum distance, the clustering effect of K-means is not ideal. How to improve the clustering effect in this scenario has become an urgent problem to be solved.
Chinese patent application CN 201911127104.8 proposes a K-means clustering method based on density Canopy, which uses density Canopy clustering as a preprocessing step of the K-means algorithm. Compared with the traditional K-means algorithm, the clustering accuracy is improved, but the method does not consider the relation between an original sample and the other class clusters, so it only ensures a local optimum and cannot obtain the global optimum.
Chinese patent application CN 201810570097.8 proposes a K-means clustering method based on a neural network. It addresses the problems that existing K-means iteratively optimizes the cluster centers and the label assignment in two independent steps, which makes inference slow, prevents the processing of new data, large-scale data, and online data, and makes the result sensitive to initial values. However, the method does not consider the scenario in which a sample is nearest to, and approximately equidistant from, a plurality of class clusters, and the sample cannot be reasonably assigned in that scenario.
Therefore, in order to make sample assignment more reasonable and further improve clustering accuracy when a sample is nearest to, and approximately equidistant from, a plurality of class clusters, it is desirable to provide an improved data processing method.
Disclosure of Invention
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
The invention provides a data processing method and a data processing device based on mean shift optimization, which consider the relation between an original sample and other clusters, so that the edges of each cluster and the peripheral areas of the clusters are divided more reasonably, the clusters are compact, and the clustering precision and speed are greatly improved.
According to an aspect of the present invention, there is provided a data processing method, the method comprising:
collecting user behavior data in real time as an original sample set;
initializing a class cluster center according to the number of class clusters and the original sample set;
for each sample in the original sample set, determining whether there are two or more class cluster centers closest to the sample,
if so,
calculating the local density gradient direction of the sample using mean shift,
calculating a similarity between the local density gradient direction of the sample and the direction from the sample toward each of the two or more class cluster centers, and
assigning the sample to the class cluster corresponding to the maximum similarity;
otherwise, assigning the sample to the class cluster whose center is closest to the sample; and
pushing relevant data to each user group in real time according to the clustering result.
According to one embodiment of the invention, determining whether there are two or more class cluster centers closest to the sample further comprises:
calculating Euclidean distances from the sample to the K class cluster centers to obtain a distance set for the sample, wherein K is the number of class clusters;
calculating the ratio of the distance from the sample to each other class cluster center $c_q$ to the smallest distance in the distance set, to obtain a corresponding distance ratio set;
wherein, if the distance ratio set contains ratios not greater than $\varepsilon$, the class cluster centers corresponding to those ratios, together with the nearest class cluster center, are determined to be closest to the sample, where $\varepsilon$ is a threshold set empirically.
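The distance-ratio test described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `nearest_candidates` and the example threshold values are assumptions. Because every ratio divides by the smallest distance, each ratio is at least 1, so a useful threshold is presumably slightly greater than 1.

```python
import math

def nearest_candidates(sample, centers, eps):
    """Return the index of the nearest center, followed by the indices of all
    other centers whose distance ratio to the nearest one is <= eps."""
    dists = [math.dist(sample, c) for c in centers]            # distance set
    m_star = min(range(len(centers)), key=dists.__getitem__)   # nearest center
    d_min = dists[m_star]
    if d_min == 0.0:
        return [m_star]  # sample coincides with a center; no ratio is defined
    near = [q for q in range(len(centers))
            if q != m_star and dists[q] / d_min <= eps]
    return [m_star] + near
```

A result with a single index means the plain minimum-distance rule applies; two or more indices mark the sample as ambiguous and hand it to the mean shift step.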
According to a further embodiment of the invention, calculating the local density gradient direction of the sample using mean shift further comprises:
calculating a local mean shift vector of the sample, wherein the vector points from the sample itself in the direction of maximum increase of the estimated density.
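A minimal sketch of the local mean shift vector follows. The text does not reproduce the formula, so the common flat-kernel form is assumed here: the vector is the average of the offsets from the sample to its neighbours inside a ball of radius `h`. The function name, the parameter `h`, and the choice to exclude the sample itself from its neighbourhood are all assumptions.

```python
import math

def mean_shift_vector(i, data, h):
    """Flat-kernel mean shift vector of data[i] over the ball of radius h.

    Returns None when no other sample lies inside the ball (assumed to be
    the case in which the sample is treated as an outlier)."""
    x = data[i]
    neighbours = [y for j, y in enumerate(data)
                  if j != i and math.dist(x, y) <= h]
    z = len(neighbours)
    if z == 0:
        return None
    # Average offset from the sample to its neighbours, per dimension.
    return tuple(sum(y[k] - x[k] for y in neighbours) / z
                 for k in range(len(x)))
```

The returned vector points toward the locally denser side of the sample, which is what the later similarity comparison relies on.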
According to a further embodiment of the present invention, calculating the similarity further comprises:
using a cosine similarity algorithm to calculate the similarity between the local density gradient direction of the sample and the direction from the sample toward each of the two or more class cluster centers, wherein the greater the cosine value, the higher the similarity.
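The cosine comparison can be illustrated with a short sketch. The helper names and the dictionary layout of `candidates` are hypothetical; the direction toward a center is taken as the center minus the sample, and a zero-length vector is given similarity 0 as an assumed convention.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (0.0 if either is zero)."""
    dot = sum(p * q for p, q in zip(a, b))
    norm = math.hypot(*a) * math.hypot(*b)
    return dot / norm if norm else 0.0

def most_aligned_center(sample, shift, candidates):
    """Pick the candidate cluster whose center lies most nearly in the
    direction of the shift vector. candidates maps cluster index -> center."""
    def score(k):
        direction = [c - s for c, s in zip(candidates[k], sample)]
        return cosine_similarity(shift, direction)
    return max(candidates, key=score)
```

For example, with a shift vector pointing along the positive x axis, a center at (3, 0.5) is preferred over one at (0, 3) even if the latter were slightly closer.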
According to a further embodiment of the present invention, initializing the class cluster centers is performed by a K-means++ clustering algorithm, such that the distances between the class cluster centers are as large as possible.
According to another aspect of the present invention, there is provided a data processing apparatus, the apparatus comprising:
a data acquisition module configured to acquire user behavior data in real time as an original sample set;
an initialization class cluster center module configured to initialize class cluster centers according to the number of class clusters and the original sample set;
a data clustering module configured to:
for each sample in the original sample set, determine whether there are two or more class cluster centers closest to the sample,
if so,
calculate the local density gradient direction of the sample using mean shift, calculate a similarity between the local density gradient direction of the sample and the direction from the sample toward each of the two or more class cluster centers, and
assign the sample to the class cluster corresponding to the maximum similarity;
otherwise, assign the sample to the class cluster whose center is closest to the sample; and
a data pushing module configured to push related data in real time to each user group associated with each class cluster based on the clustering result.
According to one embodiment of the invention, determining whether there are two or more class cluster centers closest to the sample further comprises:
calculating Euclidean distances from the sample to the K class cluster centers to obtain a distance set for the sample, wherein K is the number of class clusters;
calculating the ratio of the distance from the sample to each other class cluster center $c_q$ to the smallest distance in the distance set, to obtain a corresponding distance ratio set;
wherein, if the distance ratio set contains ratios not greater than $\varepsilon$, the class cluster centers corresponding to those ratios, together with the nearest class cluster center, are determined to be closest to the sample, where $\varepsilon$ is a threshold set empirically.
According to a further embodiment of the invention, calculating the local density gradient direction of the sample using mean shift further comprises:
calculating a local mean shift vector of the sample, wherein the vector points from the sample itself in the direction of maximum increase of the estimated density.
According to a further embodiment of the present invention, calculating the similarity further comprises:
using a cosine similarity algorithm to calculate the similarity between the local density gradient direction of the sample and the direction from the sample toward each of the two or more class cluster centers, wherein the greater the cosine value, the higher the similarity.
According to a further embodiment of the present invention, initializing the class cluster centers is performed by a K-means++ clustering algorithm, such that the distances between the class cluster centers are as large as possible.
Compared with the scheme in the prior art, the data processing method and device based on the meanshift optimization provided by the invention have at least the following advantages:
(1) By considering the relation between the original sample and other clusters, the edges of each cluster and the peripheral areas of the clusters are divided more reasonably, the clusters are compact, and the clustering effect is improved, so that the global optimum is achieved.
(2) Compared with the traditional K-means algorithm, the center positions of K class clusters can be estimated more accurately, so that the K class clusters can be converged rapidly, and the iteration times are reduced.
These and other features and advantages will become apparent upon reading the following detailed description and upon reference to the associated drawings. It is to be understood that both the foregoing general description and the following detailed description are explanatory only and are not restrictive of aspects as claimed.
Drawings
So that the manner in which the above recited features of the present invention can be understood in detail, a more particular description of the invention, briefly summarized above, may be had by reference to embodiments, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only certain typical aspects of this invention and are therefore not to be considered limiting of its scope, for the description may admit to other equally effective aspects.
FIG. 1 illustrates an exemplary architecture diagram of a data processing apparatus based on a meanshift optimization in accordance with one embodiment of the present invention.
FIG. 2 shows a flow chart of a method of data processing based on a meanshift optimization in accordance with one embodiment of the present invention.
FIG. 3 shows a flow chart of a mean shift based clustering algorithm according to one embodiment of the invention.
FIG. 4 illustrates an example of a central sample two-dimensional region according to one embodiment of the invention.
Detailed Description
The features of the present invention will become more apparent from the detailed description set forth below when taken in conjunction with the drawings.
FIG. 1 is an exemplary architecture diagram of a mean shift optimization-based data processing apparatus 100 in accordance with one embodiment of the present invention. As shown in FIG. 1, the apparatus 100 of the present invention includes a data acquisition module 101, an initialization class cluster center module 102, a data clustering module 103, and a data pushing module 104.
The data acquisition module 101 may acquire user data in real time as an original sample set and store it in a big data platform, classified according to data characteristics. As an example, the data acquisition module 101 may collect, in real time, behavior data of users watching television programs as the original sample set. Each day, the television programs watched by user i over the preceding 30 days are counted, the viewing time is accumulated for each of the T program types, and the result is normalized to a score, i.e., $\mathrm{time}_t / (\mathrm{time}_1 + \mathrm{time}_2 + \dots + \mathrm{time}_T)$. The scores of each user for the T program types are stored as an original sample $x_i$.
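As a small illustration of the normalization just described, the sketch below turns per-type viewing totals into the score vector of one user. The function name and the handling of a user with no viewing time (an all-zero vector) are assumptions, not details given in the text.

```python
def normalize_watch_time(watch_time):
    """Convert accumulated viewing time per program type into scores
    time_t / (time_1 + ... + time_T) that sum to 1."""
    total = sum(watch_time)
    if total == 0:
        # Assumed convention: a user with no viewing history gets all zeros.
        return [0.0] * len(watch_time)
    return [t / total for t in watch_time]
```

For instance, viewing totals of 1, 1, and 2 hours across three program types give the sample [0.25, 0.25, 0.5].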
The initialization class cluster center module 102 may initialize the class cluster centers based on the number of class clusters and the original sample set. As one example, the module 102 may initialize K class cluster centers with the K-means++ clustering algorithm so that the distances between them are as large as possible. The specific steps of the K-means++ algorithm are: (1) randomly select a sample point $x_i$ from the original sample set X as the first initial class cluster center $c_1$; (2) calculate, for each sample point $x_i$, the shortest distance D(x) to the class cluster centers selected so far, calculate the probability P(x) of each sample point being selected as the next class cluster center, and select the sample point with the maximum probability value as the next class cluster center; and (3) repeat step (2) until K class cluster centers have been selected.
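The initialization steps above can be sketched as follows. Note that the text selects the sample with the maximum probability value, whereas standard K-means++ samples the next center at random in proportion to that probability. The formula for P(x) is not reproduced in the text, so the usual form $D(x)^2 / \sum D(x)^2$ is assumed; the function name and the `seed` parameter are also assumptions.

```python
import math
import random

def init_centers(data, k, seed=0):
    """K-means++-style initialization that always takes the sample with the
    largest selection probability as the next center."""
    rng = random.Random(seed)
    centers = [rng.choice(data)]            # step (1): random first center
    while len(centers) < k:                 # step (3): repeat until K centers
        # Step (2): squared shortest distance to the centers chosen so far.
        d2 = [min(math.dist(x, c) for c in centers) ** 2 for x in data]
        total = sum(d2)
        if total == 0:
            break  # every sample coincides with a center; fewer than k result
        probs = [v / total for v in d2]     # assumed form of P(x)
        best = max(range(len(data)), key=probs.__getitem__)
        centers.append(data[best])
    return centers
```

Because the maximum of P(x) is always attained by the sample farthest from the existing centers, the choice after the first center is deterministic under this reading.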
The data clustering module 103 may, for each sample in the original sample set that is nearest to, and approximately equidistant from, two or more class cluster centers, calculate the local density gradient direction of the sample using a mean shift algorithm; calculate the similarity between the local density gradient direction of the sample and the direction from the sample toward each of the two or more class cluster centers; and assign the sample to the class cluster corresponding to the maximum similarity. In particular, the data clustering module 103 may calculate the Euclidean distance from each sample $x_i$ in the original sample set X to the K class cluster centers (see FIG. 4, in which each arrow points from the central sample point to a class cluster center), obtaining a distance set for each sample $x_i$. From the distance set, the module calculates the distance ratio set corresponding to $x_i$ and judges whether two or more class cluster centers are nearest to, and approximately equidistant from, $x_i$. If so, it records the corresponding class cluster centers and calculates the local mean shift vector of $x_i$, which points from the sample itself in the direction of maximum increase of the estimated density (the density gradient direction for short). It then calculates the similarity between the local density gradient direction of $x_i$ and the direction from $x_i$ to each recorded class cluster center, and assigns $x_i$ to the class cluster with the maximum similarity.
The data pushing module 104 may push relevant data to each user group in real time according to the clustering result. In one example, television users may be automatically grouped into K groups by the clustering algorithm; the T attributes (program types) of each group's class cluster center are then sorted, and the background pushes relevant programs to each group in a targeted manner according to that group's Top-N attributes (program types).
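The Top-N selection can be sketched as below. Sorting in descending order of the center's score is an assumption (the text only says the attributes are sorted), and the function name and program type labels are hypothetical.

```python
def top_n_types(center, type_names, n):
    """Return the names of the n program types with the highest scores
    in a class cluster center."""
    ranked = sorted(range(len(center)), key=lambda t: center[t], reverse=True)
    return [type_names[t] for t in ranked[:n]]
```

With a center of [0.1, 0.5, 0.4] over the hypothetical types news, sports, and drama, the Top-2 types are sports and drama, so programs of those types would be pushed to that group.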
For ease of illustration, embodiments of the present invention are described below using a mean shift-based K-means++ clustering algorithm as an example, but those skilled in the art will appreciate that the present invention is equally applicable to other clustering algorithms.
FIG. 2 is a flow chart of a method 200 of data processing based on a meanshift optimization in accordance with one embodiment of the present invention. The method starts in step 201 with the data acquisition module 101 acquiring user behavior data in real time as an original sample set X.
In step 202, the initialization class cluster center module 102 initializes the class cluster centers based on the number of class clusters and the original sample set. Algorithms for initializing class cluster centers include, but are not limited to, K-means++, K-means, Canopy, and the like.
At step 203, the data clustering module 103 determines, for each sample in the original sample set, whether there are two or more class cluster centers that are nearest to, and approximately equidistant from, the sample. If so, it calculates the local density gradient direction of the sample using the nonparametric mean shift estimation algorithm, calculates the similarity between the local density gradient direction of the sample and the direction from the sample toward each of the two or more class cluster centers, and assigns the sample to the class cluster corresponding to the maximum similarity; otherwise, it assigns the sample to the class cluster whose center is nearest. The specific implementation steps of this algorithm are described in further detail below with reference to FIG. 3.
In step 204, the data pushing module 104 pushes relevant data to each user group in real time according to the clustering result.
FIG. 3 shows a flowchart of a mean shift-based clustering algorithm 300 according to one embodiment of the invention. The detailed steps of the algorithm 300 are as follows:
Step 1: Input the number of class clusters K and the original sample set X, i.e., $X = \{x_1, x_2, \dots, x_N\}$.
Step 2: Initialize K class cluster centers using the K-means++ algorithm, i.e., $\{c_1, c_2, \dots, c_K\}$.
Step 3: Compute the Euclidean distance from each original sample $x_i$ in the original sample set X to the K class cluster centers, denoted $d(x_i, c_k)$, where $k = 1, 2, 3, \dots, K$. The Euclidean distance is found by the following formula: for points x and y in n-dimensional space, $d(x, y) = \sqrt{\sum_{j=1}^{n} (x_j - y_j)^2}$. Thus, for each sample $x_i$, a distance set $\{d(x_i, c_1), d(x_i, c_2), \dots, d(x_i, c_K)\}$ is obtained.
Step 4: Compute the ratio of the distance from the original sample $x_i$ to each other class cluster center $c_q$ to the distance from $x_i$ to the class cluster center $c_{m^*}$, i.e., $d(x_i, c_q) / d(x_i, c_{m^*})$ with $q \neq m^*$, to obtain the corresponding distance ratio set, wherein the distance from $x_i$ to $c_{m^*}$ is the smallest.
Step 5: If the ratios in the distance ratio set are all greater than the threshold $\varepsilon$, the minimum distance rule is used for assignment, i.e., sample $x_i$ is assigned to the class cluster whose center is closest, where $\varepsilon$ may be a threshold set empirically.
Step 6: If the distance ratio set contains ratios not greater than $\varepsilon$, then the class cluster centers $c_{m^*}$ and $\{c_v\}$ corresponding to those ratios are nearest to, and approximately equidistant from, sample $x_i$, and mean shift is used to determine which class cluster $x_i$ belongs to. The specific steps are as follows:
a) Take the p-dimensional sphere with center $x_i$ and radius h, denoted $S_h(x_i)$.
b) Find the mean shift vector of $x_i$, denoted $M_h(x_i)$.
Note that if Z is 0, $x_i$ is considered an outlier and is rejected.
c) Obtain the directions from sample $x_i$ to $c_{m^*}$ and to each $c_v$, i.e., the vectors $c_{m^*} - x_i$ and $c_v - x_i$.
d) Compute the similarity between $M_h(x_i)$ and each of these direction vectors with the cosine similarity algorithm, and assign $x_i$ to the class cluster with the maximum similarity. The cosine similarity algorithm evaluates the similarity of two vectors by calculating the cosine of the angle between them; the larger the cosine value, the higher the similarity.
Step 7: After each sample $x_i$ in the original sample set X has been assigned, update the center of each class cluster to obtain the updated class cluster centers, and compute the objective function of the whole clustering, denoted $E^{(1)}$.
Step 8: When $E^{(t+1)}$ approximates $E^{(t)}$, the clustering result has converged and is output; otherwise, steps 3 to 7 are executed again.
What has been described above includes examples of aspects of the claimed subject matter. It is, of course, not possible to describe every conceivable combination of components or methodologies for purposes of describing the claimed subject matter, but one of ordinary skill in the art may recognize that many further combinations and permutations of the claimed subject matter are possible. Accordingly, the disclosed subject matter is intended to embrace all such alterations, modifications and variations that fall within the spirit and scope of the appended claims.

Claims (10)

1. A method of data processing based on MEANSHIFT optimization, the method comprising:
collecting user behavior data in real time as an original sample set;
initializing a class cluster center according to the number of class clusters and the original sample set;
for each sample in the original sample set, determining whether there are two or more class cluster centers closest to the sample,
if so, calculating the local density gradient direction of the sample using mean shift,
calculating a similarity between the local density gradient direction of the sample and the direction from the sample toward each of the two or more class cluster centers, and
assigning the sample to the class cluster corresponding to the maximum similarity;
otherwise, assigning the sample to the class cluster whose center is closest to the sample; and
pushing relevant data to each user group in real time according to the clustering result.
2. The method of claim 1, wherein determining whether there are two or more class cluster centers closest to the sample further comprises:
calculating Euclidean distances from the sample to the K class cluster centers to obtain a distance set for the sample, wherein K is the number of class clusters;
calculating the ratio of the distance from the sample to each other class cluster center $c_q$ to the distance, in the distance set, from the sample to the class cluster center $c_{m^*}$ nearest to it, to obtain a corresponding distance ratio set;
wherein, if the distance ratio set contains ratios not greater than $\varepsilon$, the class cluster centers corresponding to those ratios, together with $c_{m^*}$, are determined to be closest to the sample, where $\varepsilon$ is a threshold set empirically.
3. The method of claim 1, wherein calculating the local density gradient direction of the sample using mean shift further comprises:
calculating a local mean shift vector of the sample, wherein the vector points from the sample itself in the direction of maximum increase of the estimated density.
4. The method of claim 1, wherein calculating a similarity further comprises:
using a cosine similarity algorithm to calculate the similarity between the local density gradient direction of the sample and the direction from the sample toward each of the two or more class cluster centers.
5. The method of claim 1, wherein initializing the class cluster centers is performed by a K-means++ clustering algorithm.
6. A MEANSHIFT optimized data processing device, the device comprising:
a data acquisition module configured to acquire user behavior data in real time as an original sample set;
an initialization class cluster center module configured to initialize class cluster centers according to the number of class clusters and the original sample set;
a data clustering module configured to:
for each sample in the original sample set, determine whether there are two or more class cluster centers closest to the sample,
if so, calculate the local density gradient direction of the sample using mean shift,
calculate a similarity between the local density gradient direction of the sample and the direction from the sample toward each of the two or more class cluster centers, and
assign the sample to the class cluster corresponding to the maximum similarity;
otherwise, assign the sample to the class cluster whose center is closest to the sample; and
a data pushing module configured to push related data in real time to each user group associated with each class cluster based on the clustering result.
7. The apparatus of claim 6, wherein determining whether there are two or more class cluster centers closest to the sample further comprises:
calculating Euclidean distances from the sample to the K class cluster centers to obtain a distance set for the sample, wherein K is the number of class clusters;
calculating the ratio of the distance from the sample to each other class cluster center $c_q$ to the distance, in the distance set, from the sample to the class cluster center $c_{m^*}$ nearest to it, to obtain a corresponding distance ratio set;
wherein, if the distance ratio set contains ratios not greater than $\varepsilon$, the class cluster centers corresponding to those ratios, together with $c_{m^*}$, are determined to be closest to the sample, where $\varepsilon$ is a threshold set empirically.
8. The apparatus of claim 6, wherein calculating the local density gradient direction of the sample using mean shift further comprises:
calculating a local mean shift vector of the sample, wherein the vector points from the sample itself in the direction of maximum increase of the estimated density.
9. The apparatus of claim 6, wherein calculating a similarity further comprises:
using a cosine similarity algorithm to calculate the similarity between the local density gradient direction of the sample and the direction from the sample toward each of the two or more class cluster centers.
10. The apparatus of claim 6, wherein initializing the class cluster centers is performed by a K-means++ clustering algorithm.
CN202110161944.7A 2021-02-05 2021-02-05 MEANSHIFT optimization-based data processing method and device Active CN113850281B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110161944.7A CN113850281B (en) 2021-02-05 2021-02-05 MEANSHIFT optimization-based data processing method and device
PCT/CN2021/136291 WO2022166380A1 (en) 2021-02-05 2021-12-08 Data processing method and apparatus based on meanshift optimization

Publications (2)

Publication Number Publication Date
CN113850281A CN113850281A (en) 2021-12-28
CN113850281B true CN113850281B (en) 2024-03-12

Family

ID=78972859

Country Status (2)

Country Link
CN (1) CN113850281B (en)
WO (1) WO2022166380A1 (en)


Also Published As

Publication number Publication date
CN113850281A (en) 2021-12-28
WO2022166380A1 (en) 2022-08-11

Similar Documents

Publication Publication Date Title
CN113850281B (en) MEANSHIFT optimization-based data processing method and device
CN113378632B (en) Pseudo-label optimization-based unsupervised domain adaptive pedestrian re-identification method
Nech et al. Level playing field for million scale face recognition
CN108564129B (en) Trajectory data classification method based on generative adversarial network
Gao et al. Less is more: Efficient 3-D object retrieval with query view selection
CN105760888B (en) Neighborhood rough set ensemble learning method based on hierarchical attribute clustering
Cao et al. Deep priority hashing
CN111126396A (en) Image recognition method and device, computer equipment and storage medium
CN110751027B (en) Pedestrian re-identification method based on deep multi-instance learning
CN111125469B (en) User clustering method and device of social network and computer equipment
JP6897749B2 (en) Learning methods, learning systems, and learning programs
Lim et al. Efficient-prototypicalnet with self knowledge distillation for few-shot learning
CN114283350B (en) Visual model training and video processing method, device, equipment and storage medium
Lin et al. Fairgrape: Fairness-aware gradient pruning method for face attribute classification
Mansourifar et al. Virtual big data for GAN based data augmentation
CN114238329A (en) Vector similarity calculation method, device, equipment and storage medium
WO2014118978A1 (en) Learning method, image processing device and learning program
CN117312681A (en) Metaverse-oriented user preference product recommendation method and system
Amid et al. A more globally accurate dimensionality reduction method using triplets
Zhang et al. Dataset-driven unsupervised object discovery for region-based instance image retrieval
CN114781779A (en) Unsupervised energy consumption anomaly detection method and device and storage medium
CN114332550A (en) Model training method, system, storage medium and terminal equipment
CN110209895B (en) Vector retrieval method, device and equipment
CN111488520B (en) Crop planting type recommendation information processing device, method and storage medium
CN115587297A (en) Method, apparatus, device and medium for constructing image recognition model and image recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 2022-01-27

Address after: Room 1423, No. 1256 and 1258, Wanrong Road, Jing'an District, Shanghai 200072

Applicant after: Tianyi Digital Life Technology Co.,Ltd.

Address before: 201702 3rd floor, 158 Shuanglian Road, Qingpu District, Shanghai

Applicant before: Tianyi Smart Family Technology Co.,Ltd.

GR01 Patent grant