JP2023502863A

JP2023502863A - Image incremental clustering method and apparatus, electronic device, storage medium and program product

Info

Publication number: JP2023502863A
Application number: JP2022524182A
Authority: JP
Inventors: ▲劉▼▲凱▼▲鑑▼; 余世杰; ▲陳▼浩彬; ▲陳▼大▲鵬▼; ▲趙▼瑞
Original assignee: Zhejiang Sensetime Technology Development Co Ltd
Current assignee: Zhejiang Sensetime Technology Development Co Ltd
Priority date: 2020-10-30
Filing date: 2020-12-04
Publication date: 2023-01-26
Also published as: TW202217597A; CN112257801A; WO2022088390A1; CN112257801B; KR20220070482A

Abstract

The present disclosure provides an incremental image clustering method and apparatus, an electronic device, a storage medium, and a program product. The method includes: obtaining a first cluster of a first image data set; dividing the first cluster into M first sub-clusters and obtaining a first cluster center corresponding to each of the M first sub-clusters, where M is an integer greater than or equal to 1; and obtaining a second image data set and merging the second image data set with the first cluster using the first cluster centers.

Description

(Cross-reference to related applications)
The present disclosure is based on, and claims priority to, Chinese patent application No. 202011185911.8, filed on October 30, 2020, the entire contents of which are incorporated into the present disclosure by reference.

Embodiments of the present disclosure relate to the field of computer vision, and in particular to an incremental image clustering method and apparatus, an electronic device, a storage medium, and a program product.

Image processing technology has advanced greatly with the development of deep learning. Taking face recognition as an example, face recognition models obtained through supervised learning have improved dramatically in recognition accuracy. However, how to classify the explosively growing volume of unlabeled image data accurately and quickly remains a problem worthy of discussion and research.

In view of the above problem, the present disclosure provides an incremental image clustering method and apparatus, an electronic device, a storage medium, and a program product, which help solve the problem that, in incremental clustering, the clustering effect is degraded by drift of the cluster centers.

To achieve the above objective, a first aspect of embodiments of the present disclosure provides an incremental image clustering method. The incremental image clustering method includes:
obtaining a first cluster of a first image data set; dividing the first cluster into M first sub-clusters and obtaining a first cluster center corresponding to each of the M first sub-clusters, where M is an integer greater than or equal to 1; and obtaining a second image data set and merging the second image data set with the first cluster using the first cluster centers.

According to the first aspect, in one possible embodiment, the first cluster includes a first cluster A, a first cluster B, and a first cluster C, and merging the second image data set with the first cluster using the first cluster centers includes:
when the second image data set contains a plurality of image data, clustering the plurality of image data to obtain isolated image data and a second cluster, merging the isolated image data into the first cluster A using the first cluster centers, and merging the second cluster with the first cluster B using the first cluster centers; and when the second image data set contains only a single item of image data, merging the single image data into the first cluster C using the first cluster centers.

In this way, the plurality of image data in the second image data set are clustered, and the resulting isolated image data and second cluster are each merged with the corresponding one of the first cluster A, first cluster B, and first cluster C included in the first cluster, so that both the incorporation of a single sample into a cluster and the merging of one cluster with another can be achieved.

According to the first aspect, in one possible embodiment, the first cluster has a corresponding second cluster center, and before merging the second image data set with the first cluster using the first cluster centers, the incremental clustering method further includes:
determining K first clusters from among the first clusters using the second cluster centers.

According to the first aspect, in one possible embodiment, the second cluster has a corresponding third cluster center, and determining K first clusters from among the first clusters using the second cluster centers includes:
obtaining a first similarity between the isolated image data and the second cluster centers, sorting the first clusters in descending order of the first similarity to obtain a first cluster sequence, and selecting the first K first clusters in the first cluster sequence; and obtaining a second similarity between the third cluster center and the second cluster centers, sorting the first clusters in descending order of the second similarity to obtain a second cluster sequence, and selecting the first K first clusters in the second cluster sequence; or, obtaining a third similarity between the single image data and the second cluster centers, sorting the first clusters in descending order of the third similarity to obtain a third cluster sequence, and selecting the first K first clusters in the third cluster sequence.

In this way, screening the first clusters by the computed similarities between the second cluster centers and the isolated image data, the third cluster center, and the single image data helps determine the first clusters that are closest to the cluster category of the image data in the second image data set.
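By way of illustration only, the top-K screening described above can be sketched as follows. The use of cosine similarity, the dictionary layout, and the names `cosine`, `top_k_clusters`, and `main_centers` are assumptions made for this sketch; the disclosure does not fix a particular similarity measure or data structure here.

```python
import math


def cosine(a, b):
    """Cosine similarity between two feature vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def top_k_clusters(query, main_centers, k):
    """Rank first clusters by the similarity between `query` and each cluster's
    main (second) cluster center, highest first, and keep the first K ids.

    `query` may be an isolated image feature, a third cluster center,
    or the feature of a single image.
    """
    ranked = sorted(
        main_centers,
        key=lambda cid: cosine(query, main_centers[cid]),
        reverse=True,  # descending similarity
    )
    return ranked[:k]


# Hypothetical main centers of three existing first clusters.
main_centers = {"c0": [1.0, 0.0], "c1": [0.0, 1.0], "c2": [0.7, 0.7]}
candidates = top_k_clusters([0.9, 0.1], main_centers, k=2)
print(candidates)
```

Only the K returned clusters need to be compared at the sub-cluster level afterwards, which is what keeps the later comparison from touching the whole first image data set.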

According to the first aspect, in one possible embodiment, merging the isolated image data with the first cluster A using the first cluster centers includes:
obtaining a fourth similarity between the isolated image data and first cluster centers D, where the first cluster centers D are the first cluster centers corresponding to each first sub-cluster of each of the K first clusters; for each of the K first clusters, determining a first quantity of the first cluster centers D in that first cluster whose fourth similarity is greater than a first threshold; determining the first cluster with the largest first quantity among the K first clusters as the first cluster A; and merging the isolated image data with the first cluster A.

In this way, the first cluster A contains the largest number of first sub-clusters that are close to the isolated image data, so merging the isolated image data into the first cluster A makes the clustering result more accurate.
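The counting step above can be read as a vote among sub-centers. The following is a minimal sketch under stated assumptions: cosine similarity is assumed as the similarity measure, ties are resolved in favor of the first cluster encountered, and `None` is returned when no sub-center exceeds the threshold. The names `pick_cluster_by_votes` and `sub_centers` are hypothetical, and none of these tie or no-match behaviors are specified in the passage above.

```python
import math


def cosine(a, b):
    """Cosine similarity between two feature vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def pick_cluster_by_votes(query, sub_centers, threshold):
    """Among the K candidate first clusters, return the id of the cluster with
    the most sub-centers whose similarity to `query` exceeds `threshold`.

    sub_centers maps cluster id -> list of first cluster centers (sub-centers).
    Returns None if no sub-center passes the threshold (an assumption).
    """
    best_id, best_votes = None, 0
    for cid, centers in sub_centers.items():
        votes = sum(1 for c in centers if cosine(query, c) > threshold)
        if votes > best_votes:
            best_id, best_votes = cid, votes
    return best_id


# Hypothetical sub-centers of two candidate clusters.
sub_centers = {
    "a": [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]],
    "b": [[0.8, 0.6], [0.0, 1.0]],
}
target = pick_cluster_by_votes([1.0, 0.05], sub_centers, threshold=0.9)
print(target)
```

Counting matching sub-centers rather than comparing against one main center means a single drifted center cannot decide the assignment on its own.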

According to the first aspect, in one possible embodiment, merging the second cluster with the first cluster B using the first cluster centers includes:
dividing the second cluster into N second sub-clusters and obtaining a fourth cluster center corresponding to each of the N second sub-clusters, where N is an integer greater than or equal to 1; obtaining a fifth similarity between the fourth cluster centers and first cluster centers E, where the first cluster centers E are the first cluster centers corresponding to each first sub-cluster of each of the K first clusters; for each of the K first clusters, determining a second quantity of the first cluster centers E in that first cluster whose fifth similarity is greater than a second threshold; determining the first cluster with the largest second quantity among the K first clusters as the first cluster B; and merging the second cluster with the first cluster B.

In this way, the first cluster with the largest second quantity among the K first clusters is determined as the first cluster B; that is, the first cluster B contains the largest number of first sub-clusters that are close to the second sub-clusters of the second cluster, so merging the second cluster into the first cluster B makes the clustering result more accurate.
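A sketch of the cluster-to-cluster case follows. The text does not say how the second quantity is counted when the second cluster has several sub-centers; this sketch assumes one possible reading, in which an existing sub-center is counted once if it is similar enough to at least one of the new sub-centers. Cosine similarity and the names `pick_cluster_for_new_cluster`, `new_sub_centers`, and `candidates` are likewise assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two feature vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def pick_cluster_for_new_cluster(new_sub_centers, candidates, threshold):
    """Return the candidate first cluster with the most sub-centers that match
    any sub-center of the new (second) cluster, or None if nothing matches.

    new_sub_centers: fourth cluster centers of the second cluster.
    candidates: cluster id -> list of first cluster centers (sub-centers).
    """
    best_id, best_count = None, 0
    for cid, centers in candidates.items():
        count = sum(
            1
            for c in centers
            if any(cosine(c, q) > threshold for q in new_sub_centers)
        )
        if count > best_count:
            best_id, best_count = cid, count
    return best_id


# Hypothetical data: the new cluster has two sub-centers.
new_sub_centers = [[1.0, 0.0], [0.0, 1.0]]
candidates = {
    "x": [[0.95, 0.05], [0.5, 0.5]],
    "y": [[0.99, 0.01], [0.02, 0.98], [-1.0, 0.0]],
}
chosen = pick_cluster_for_new_cluster(new_sub_centers, candidates, threshold=0.95)
print(chosen)
```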

According to the first aspect, in one possible embodiment, merging the single image data with the first cluster C using the first cluster centers includes:
obtaining a sixth similarity between the single image data and first cluster centers F, where the first cluster centers F are the first cluster centers corresponding to each first sub-cluster of each of the K first clusters; for each of the K first clusters, determining a third quantity of the first cluster centers F in that first cluster whose sixth similarity is greater than a third threshold; determining the first cluster with the largest third quantity among the K first clusters as the first cluster C; and merging the single image data with the first cluster C.

In this way, the first cluster C contains the largest number of first sub-clusters that are close to the single image data, so merging the single image data into the first cluster C makes the clustering result more accurate.

According to the first aspect, in one possible embodiment, M is less than or equal to a fourth threshold, and after merging the second image data set with the first cluster using the first cluster centers, the incremental clustering method further includes:
dividing the merged first cluster into R third sub-clusters and obtaining a fifth cluster center of each of the R third sub-clusters, where R is an integer greater than or equal to 1; when R is less than or equal to the fourth threshold, retaining the R third sub-clusters and updating the first cluster centers with the fifth cluster centers corresponding to the R third sub-clusters; and when R is greater than the fourth threshold, obtaining a fourth quantity of image data in each of the R third sub-clusters, sorting the R third sub-clusters in descending order of the fourth quantity to obtain a fourth cluster sequence, selecting the first P third sub-clusters in the fourth cluster sequence, and updating the first cluster centers with the fifth cluster centers corresponding to the P third sub-clusters, where P is less than or equal to the fourth threshold.

In this way, when there are many sub-clusters, retaining the sub-clusters that contain more image data limits the number of sub-centers and removes the influence of outlier image data. This not only makes maintenance easier but also allows a good clustering effect to be obtained even in long-running, large-scale incremental clustering scenarios.
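The capped update of sub-centers can be sketched as below. Two choices here are assumptions rather than statements of the disclosure: each center is taken to be the mean of its sub-cluster's feature vectors, and P is set equal to the fourth threshold (the text only requires P to be no greater than it). The names `mean_vector`, `updated_sub_centers`, and `max_centers` are hypothetical.

```python
def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def updated_sub_centers(sub_clusters, max_centers):
    """Recompute sub-centers after a merge, keeping at most `max_centers`.

    sub_clusters: the R third sub-clusters, each a list of feature vectors.
    If R exceeds the cap, only the largest sub-clusters are kept, so small
    sub-clusters (likely outliers) no longer contribute a center.
    """
    if len(sub_clusters) > max_centers:
        # Descending by number of members; sorted() is stable for equal sizes.
        sub_clusters = sorted(sub_clusters, key=len, reverse=True)[:max_centers]
    return [mean_vector(sc) for sc in sub_clusters]


# Hypothetical merged cluster split into R = 3 sub-clusters.
sub_clusters = [
    [[0.0, 0.0], [2.0, 2.0]],
    [[5.0, 5.0]],
    [[1.0, 0.0], [1.0, 2.0], [1.0, 4.0]],
]
capped = updated_sub_centers(sub_clusters, max_centers=2)
print(capped)
```

With a cap of 2, the single-member sub-cluster is dropped and the two remaining centers replace the previous first cluster centers.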

According to the first aspect, in one possible embodiment, the first cluster is obtained by clustering the image data in the first image data set, and dividing the first cluster into M first sub-clusters includes:
obtaining a seventh similarity between the image data in the first cluster to obtain a similarity matrix; and dividing the first cluster into the M first sub-clusters based on the similarity matrix.

In this way, the first cluster may be divided into the M first sub-clusters using the similarity matrix.

According to the first aspect, in one possible embodiment, dividing the first cluster into the M first sub-clusters based on the similarity matrix includes:
obtaining a connected graph whose vertices are the image data in the first cluster; looking up the seventh similarity between vertices of the connected graph in the similarity matrix; and grouping a plurality of vertices whose seventh similarity is greater than a fifth threshold into one first sub-cluster, to obtain the M first sub-clusters.

In this way, a plurality of vertices whose seventh similarity is greater than the fifth threshold may be grouped into one first sub-cluster using the connected graph.

A second aspect of embodiments of the present disclosure provides an incremental image clustering apparatus. The incremental image clustering apparatus includes:
a first acquisition module configured to obtain a first cluster of a first image data set; a first division module configured to divide the first cluster into M first sub-clusters and obtain a first cluster center corresponding to each of the M first sub-clusters, where M is an integer greater than or equal to 1; and a merge module configured to obtain a second image data set and merge the second image data set with the first cluster using the first cluster centers.

A third aspect of embodiments of the present disclosure provides an electronic device. The electronic device includes an input device and an output device, and further includes a processor adapted to implement one or more instructions, and a computer storage medium storing one or more instructions adapted to be loaded by the processor to perform the steps of any embodiment of the first aspect.

A fourth aspect of embodiments of the present disclosure provides a computer storage medium storing one or more instructions adapted to be loaded by a processor to perform the steps of any embodiment of the first aspect.

A fifth aspect of embodiments of the present disclosure provides a computer program product including one or more instructions adapted to be loaded by a processor to perform the steps of any embodiment of the first aspect.

As can be seen from the above, embodiments of the present disclosure obtain a first cluster of a first image data set; divide the first cluster into M first sub-clusters and obtain a first cluster center corresponding to each of the M first sub-clusters, where M is an integer greater than or equal to 1; and obtain a second image data set and merge the second image data set with the first cluster using the first cluster centers. In this way, the first cluster is divided into a plurality of first sub-clusters, and the first cluster is merged with the second image data set on the basis of the first cluster centers of the first sub-clusters. Maintaining a plurality of first cluster centers (i.e., sub-centers) solves the problem that, as image data accumulates, the cluster center (the cluster center of the first cluster, i.e., the main center) drifts under the influence of newly added image data; this helps make the clustering result more accurate and improves the clustering effect. In addition, during clustering, the similarity between the second image data set and the entire first image data set no longer needs to be computed, which helps reduce computational complexity.

A schematic diagram of an application environment provided by an embodiment of the present disclosure.
A flowchart of an incremental image clustering method provided by an embodiment of the present disclosure.
A schematic diagram of a connected graph of a first cluster provided by an embodiment of the present disclosure.
A schematic diagram of dividing a first cluster into first sub-clusters provided by an embodiment of the present disclosure.
A schematic diagram of a clustering result of a second image data set provided by an embodiment of the present disclosure.
A schematic diagram of merging isolated image data into a first cluster provided by an embodiment of the present disclosure.
A schematic diagram of merging a second cluster with a first cluster provided by an embodiment of the present disclosure.
A flowchart of updating the first cluster centers provided by an embodiment of the present disclosure.
A flowchart of another incremental image clustering method provided by an embodiment of the present disclosure.
A structural diagram of an incremental image clustering apparatus provided by an embodiment of the present disclosure.
A structural diagram of an electronic device provided by an embodiment of the present disclosure.

To enable those skilled in the art to better understand the solutions of the present disclosure, the technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present disclosure. All other embodiments obtained by those skilled in the art on the basis of the embodiments of the present disclosure without creative effort fall within the scope of protection of the present disclosure.

The terms "include" and "comprise" and any variations thereof in the specification, claims, and drawings of the present disclosure are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not limited to the listed steps or units; in some embodiments of the present disclosure it further includes steps or units that are not listed, or, in some embodiments, it further includes other steps or units inherent to the process, method, or device. In addition, terms such as "first", "second", and "third" are used to distinguish different objects, not to describe a particular order.

In real-world scenarios such as social media and security, images are often generated incrementally, so incremental clustering is widely used to solve classification problems. Conventional incremental clustering needs to maintain a number of first clusters. However, different clusters differ in sparsity, and as incremental clustering proceeds, the likelihood that cluster centers drift increases, so the clustering effect actually deteriorates.

An embodiment of the present disclosure provides an incremental clustering method for image data. The method can be implemented in the application environment shown in FIG. 1. As shown in FIG. 1, the application environment mainly includes an image processing center 101 and an image acquisition device 102; the image processing center 101 includes, but is not limited to, a server 1011, a terminal, and a database. In some scenarios, the image acquisition device 102 may be a camera or camera head installed at a gate passage, shopping mall, residential area, or similar location to capture images such as face images and video surveillance images, and the image processing center 101 may be a monitoring center. The image processing center 101 may deploy a Video Cloud Node (VCN) 1012 to manage video surveillance, for example displaying images on a display 1013 and storing images in a database 1014 after clustering them.
In some scenarios, the image acquisition device 102 may be a user terminal, the images it captures may be photos taken by a user, for example photos the user has posted on social media, and the image processing center may be the processing back end of the social media service. The image acquisition device 102 uploads the captured images to the image processing center 101, which may perform operations such as feature extraction, clustering and classification, and face recognition. Images are generated incrementally every day on the image acquisition device side, and incremental clustering needs to maintain a number of clusters; therefore, as image data accumulates and incremental clustering proceeds, the cluster centers of the originally maintained clusters are at risk of drifting, which gradually degrades the clustering effect. The server 1011 can therefore execute the incremental clustering method provided by the embodiments of the present disclosure to solve the problem that, in incremental clustering, the clustering effect is degraded by cluster center drift. The server 1011 may be an independent physical server, a server cluster, or a distributed system, or it may be a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, and big data and artificial intelligence platforms.

The incremental image clustering method provided by the embodiments of the present disclosure is described in detail below with reference to the relevant drawings.

FIG. 2 is a flowchart of an incremental image clustering method provided by an embodiment of the present disclosure. The method is applied to a server and, as shown in FIG. 2, includes steps S21 to S23.

S21: obtain a first cluster of a first image data set.

The first image data set refers to an image data set that was clustered into a plurality of clusters before the current batch of image data. For example, if the face image data (e.g., facial features) uploaded in bulk by an image acquisition device at a given time is the current batch, then the face image data uploaded to the server before that is the first image data set. The first clusters are the clusters obtained by clustering the image data in the first image data set; the clustering algorithm used may be the K-means clustering algorithm. It will be understood that each cluster has a corresponding cluster center, namely a second cluster center.

S22: divide the first cluster into M first sub-clusters and obtain a first cluster center corresponding to each of the M first sub-clusters, where M is an integer greater than or equal to 1.

FIG. 3A is a schematic diagram of a connected graph of a first cluster provided by an embodiment of the present disclosure. As shown in FIG. 3A, the connected graph of the first cluster includes a first cluster 301 and a second cluster center 302. The first cluster 301 is a cluster obtained by clustering the image data in the first image data set, and the second cluster center 302 is the cluster center that each such cluster has.

FIG. 3B is a schematic diagram of dividing a first cluster into first sub-clusters provided by an embodiment of the present disclosure. As shown in FIG. 3B, the first cluster is divided into first sub-clusters, and the figure includes the first cluster 301, the second cluster center 302, a first sub-cluster 303, and a first cluster center 304. The first sub-cluster 303 is a sub-cluster obtained by dividing the first cluster 301, and the first cluster center 304 is the cluster center of each first sub-cluster.

A first sub-cluster is a sub-cluster obtained by dividing a first cluster. For each first cluster of the first image data set, the similarity between the image data in the first cluster, namely the seventh similarity, is obtained to form a similarity matrix. A connected graph whose vertices are the image data in the first cluster is then obtained, as shown in FIG. 3A, and for every pair of vertices in the connected graph, their similarity is looked up in the similarity matrix. If the threshold used when clustering the first image data set is X, namely the fifth threshold, then, as shown in FIG. 3B, a plurality of image data whose similarity is greater than X are grouped into one tighter first sub-cluster, yielding M first sub-clusters; that is, the first cluster shown in FIG. 3A is divided into M first sub-clusters by analysis of the connected graph. After the M first sub-clusters are obtained, the cluster center of each of the M first sub-clusters, namely the first cluster center, is obtained, so that each first cluster can be represented by one main cluster center and M sub-cluster centers. Representing the first cluster with more compact sub-clusters helps solve the problem that the representational power of a single main cluster center declines as newly added image data is incorporated.
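The sub-cluster division of step S22 can be illustrated with a short sketch. It assumes one reading of the text: an edge joins two images whose similarity exceeds the threshold X, and each connected component of the resulting graph forms one first sub-cluster. Cosine similarity, the mean as the sub-cluster center, the union-find bookkeeping, and the names `split_into_sub_clusters` and `sub_center` are all assumptions of this sketch.

```python
import math


def cosine(a, b):
    """Cosine similarity between two feature vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def split_into_sub_clusters(features, threshold):
    """Group the images of one first cluster into sub-clusters.

    Builds the pairwise similarity matrix, links pairs whose similarity is
    greater than `threshold`, and returns the connected components as lists
    of indices into `features`.
    """
    n = len(features)
    sim = [[cosine(features[i], features[j]) for j in range(n)] for i in range(n)]

    parent = list(range(n))  # union-find forest over the vertices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] > threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def sub_center(features, members):
    """Mean feature vector of one sub-cluster (assumed definition of its center)."""
    dim = len(features[0])
    return [sum(features[i][d] for i in members) / len(members) for d in range(dim)]


# Hypothetical features of four images in one first cluster.
features = [[1.0, 0.0], [0.98, 0.2], [0.0, 1.0], [0.1, 0.99]]
sub_clusters = split_into_sub_clusters(features, threshold=0.9)
centers = [sub_center(features, members) for members in sub_clusters]
print(sub_clusters)
```

Here M is the number of components returned, and `centers` plays the role of the M first cluster centers kept alongside the main center.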

At S23, a second image data set is obtained, and the second image data set is merged with the first cluster using the first cluster centers.

FIG. 4A is a schematic diagram of the clustering result of the second image data set according to an embodiment of the present disclosure. As shown in FIG. 4A, the clustering result includes a second image data set 401, second clusters 402, isolated image data 403 and third cluster centers 404. The second image data set 401 is the data set of the current batch of images uploaded by the image acquisition device; a second cluster 402 is a cluster obtained by clustering the image data in the second image data set; the isolated image data 403 are image data that were not assigned to any cluster; and a third cluster center 404 is the cluster center of a second cluster.

FIG. 4B is a schematic diagram of merging isolated image data into a first cluster according to an embodiment of the present disclosure. As shown in FIG. 4B, the diagram includes a first cluster A 405 and isolated image data 403, where the first cluster A 405 is the first cluster A determined from among the first clusters.

FIG. 4C is a schematic diagram of merging a second cluster with a first cluster according to an embodiment of the present disclosure. As shown in FIG. 4C, the diagram includes a first cluster B 406 and a second cluster 407, where the first cluster B 406 and the second cluster 407 belong to the same cluster category.

The second image data set, namely the data set of the current batch of images uploaded by the image acquisition device, is obtained from the images uploaded by that device. The first clusters include a first cluster A, a first cluster B and a first cluster C. When the second image data set contains a plurality of image data, the plurality of image data are clustered to obtain a clustering result. As shown in FIG. 4A, the clustering result includes isolated image data that were not assigned to any cluster and several second clusters, each of which has a corresponding cluster center, namely a third cluster center. For the isolated image data, the first cluster A is determined from among the first clusters and the isolated image data are merged into it using the first cluster centers; that is, as shown in FIG. 4B, the isolated image data are incorporated into the first cluster A, and the first cluster A and the isolated image data belong to the same cluster category. For each second cluster, the first cluster B is determined from among the first clusters and the second cluster is merged into it using the first cluster centers; that is, as shown in FIG. 4C, a cluster-to-cluster merge is performed, and the first cluster B and the second cluster belong to the same cluster category. Similarly to the isolated image data, when the second image data set contains only a single piece of image data, i.e., when only one piece of image data is newly added, no clustering operation needs to be performed on the second image data set. The first cluster C is determined from among the first clusters and the single piece of image data is merged into it using the first cluster centers, and the first cluster C and the single piece of image data belong to the same cluster category.

In one possible embodiment, before the second image data set is merged with the first cluster using the first cluster centers, the incremental image clustering method further includes:
determining K first clusters from among the first clusters using the second cluster centers.

Before the second image data set is merged with the first cluster, all first clusters are initially screened using their second cluster centers, and K first clusters are determined from among all first clusters. The first cluster A and the first cluster B, or the first cluster C, are then selected from these K clusters. Note that the K first clusters may be the top K after all first clusters have been sorted using the second cluster centers, for example the first 20 of 100 sorted first clusters. The K first clusters may also be all of the sorted first clusters, for example all 100 after sorting. Initially screening the first clusters using the second cluster centers helps to determine the first clusters that are closer to the cluster category of the image data in the second image data set, such as the first cluster A, the first cluster B and the first cluster C described above.

In one possible embodiment, determining K first clusters from among the first clusters using the second cluster centers includes:
obtaining a first similarity between the isolated image data and each second cluster center, sorting the first clusters in descending order of the first similarity to obtain a first cluster sequence, and selecting the first K first clusters in the first cluster sequence; and obtaining a second similarity between each third cluster center and each second cluster center, sorting the first clusters in descending order of the second similarity to obtain a second cluster sequence, and selecting the first K first clusters in the second cluster sequence; or, obtaining a third similarity between the single piece of image data and each second cluster center, sorting the first clusters in descending order of the third similarity to obtain a third cluster sequence, and selecting the first K first clusters in the third cluster sequence.

When clustering the second image data set yields isolated image data and a plurality of second clusters, the first similarity between the isolated image data and the second cluster center of each first cluster is calculated, and, for each second cluster, the second similarity between its third cluster center and the second cluster center of each first cluster is calculated. All first clusters are sorted in descending order of the first similarity and of the second similarity to obtain the first cluster sequence and the second cluster sequence respectively, and the first K first clusters are then selected from each sequence. When the second image data set contains only a single piece of image data, the third similarity between the single piece of image data and the second cluster center of each first cluster is calculated, all first clusters are sorted in descending order of the third similarity to obtain the third cluster sequence, and the first K first clusters are then selected from the third cluster sequence.
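The coarse screening step above amounts to a top-K search against the main centers. A minimal sketch follows, assuming L2-normalised vectors and cosine similarity (the text does not fix the similarity measure). The name `top_k_clusters` is hypothetical, and `query` stands for the isolated image data, a third cluster center, or a single piece of image data.

```python
import numpy as np


def top_k_clusters(query, main_centers, k):
    """Return the indices of the k first clusters whose main centers are
    most similar to `query`, in descending order of similarity.

    query: (d,) feature vector.
    main_centers: (num_clusters, d) array, one main center per first cluster.
    """
    sims = main_centers @ query                # similarity to every main center
    order = np.argsort(-sims, kind="stable")   # descending order of similarity
    top = order[:k]                            # k >= num_clusters keeps them all
    return top, sims[top]
```

Passing a `k` at least as large as the number of first clusters reproduces the case in which all sorted first clusters are kept.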

In one possible embodiment, merging the isolated image data with the first cluster A using the first cluster centers includes:
obtaining a fourth similarity between the isolated image data and each first cluster center D, where the first cluster centers D are the first cluster centers corresponding to the first sub-clusters of each of the K first clusters; for each of the K first clusters, determining a first quantity, namely the number of first cluster centers D in that first cluster whose fourth similarity is greater than a first threshold; determining the first cluster with the largest first quantity among the K first clusters as the first cluster A; and merging the isolated image data with the first cluster A.

To merge the isolated image data, the first cluster A must be determined from the selected top K first clusters; note that the top K first clusters may be all of the sorted first clusters. First, the similarity between the isolated image data and the cluster center of each first sub-cluster of each of the K first clusters (i.e., each first cluster center D) is calculated and taken as the fourth similarity. The K first clusters are then analyzed: for each first cluster, the number of first cluster centers D whose fourth similarity is greater than the first threshold is determined and taken as the first quantity, and the first cluster with the largest first quantity is determined as the first cluster A. For example, among the K first clusters, first cluster 1 has 20 such first cluster centers D, first cluster 2 has 18, ..., and first cluster K has 15; because first cluster 1 has the most, it is determined as the first cluster A. In other words, the first cluster A contains the largest number of first sub-clusters that are close to the isolated image data, so merging the isolated image data into the first cluster A makes the clustering result more accurate.
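The counting rule above can be sketched as a vote over sub-centers. The sketch assumes cosine similarity on L2-normalised vectors. Returning `None` when no sub-center passes the threshold is an assumption of this example, as is keeping the first cluster encountered on a tie, since the text does not state either case. The name `select_target_cluster` is hypothetical.

```python
import numpy as np


def select_target_cluster(query, sub_centers_by_cluster, threshold):
    """Pick the candidate cluster with the most sub-centers close to `query`.

    query: (d,) feature vector of the isolated (or single) image.
    sub_centers_by_cluster: dict mapping a cluster id to an (m_i, d) array
        holding the sub-centers of that candidate (top-K) first cluster.
    threshold: similarity a sub-center must exceed to be counted.
    Returns (cluster_id, count); cluster_id is None when no sub-center
    of any candidate exceeds the threshold (assumed behaviour).
    """
    best_id, best_count = None, 0
    for cluster_id, sub_centers in sub_centers_by_cluster.items():
        # Number of sub-centers whose similarity to the query passes the threshold.
        count = int(np.sum(sub_centers @ query > threshold))
        if count > best_count:
            best_id, best_count = cluster_id, count
    return best_id, best_count
```

The same function covers the single-image case, with the third threshold passed in place of the first.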

In one possible embodiment, merging the second cluster with the first cluster B using the first cluster centers includes:
dividing the second cluster into N second sub-clusters and obtaining a fourth cluster center corresponding to each of the N second sub-clusters, where N is an integer greater than or equal to 1; obtaining a fifth similarity between each fourth cluster center and each first cluster center E, where the first cluster centers E are the first cluster centers corresponding to the first sub-clusters of each of the K first clusters; for each of the K first clusters, determining a second quantity, namely the number of first cluster centers E in that first cluster whose fifth similarity is greater than a second threshold; determining the first cluster with the largest second quantity among the K first clusters as the first cluster B; and merging the second cluster with the first cluster B.

For a cluster-to-cluster merge, the first cluster B must be determined from the selected top K first clusters; note that the top K first clusters may be all of the sorted first clusters. First, each second cluster is divided into N second sub-clusters in the same way as the first cluster is divided, and the cluster center of each second sub-cluster, namely the fourth cluster center, is calculated. Next, the similarity between each fourth cluster center and the cluster center of each first sub-cluster of each of the K first clusters (i.e., each first cluster center E) is calculated and taken as the fifth similarity. The K first clusters are then analyzed: for each first cluster, the number of first cluster centers E whose fifth similarity is greater than the second threshold is determined and taken as the second quantity, and the first cluster with the largest second quantity is determined as the first cluster B. For example, among the K first clusters, first cluster 1 has 30 such first cluster centers E, first cluster 2 has 15, ..., and first cluster K has 40; because first cluster K has the most, it is determined as the first cluster B. In other words, the first cluster B contains the largest number of first sub-clusters that are close to the second sub-clusters of the second cluster, so merging the second cluster into the first cluster B makes the clustering result more accurate.
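The cluster-to-cluster case can be sketched in the same way. The text does not say how the N similarities of one first cluster center E (one per fourth cluster center) are combined before the threshold test; this example assumes that a center E is counted when its similarity to at least one fourth cluster center exceeds the threshold. Cosine similarity on L2-normalised vectors and the name `select_cluster_for_merge` are likewise assumptions.

```python
import numpy as np


def select_cluster_for_merge(new_sub_centers, sub_centers_by_cluster, threshold):
    """Pick the candidate first cluster that a new cluster should merge into.

    new_sub_centers: (N, d) array, the sub-centers of the new (second) cluster.
    sub_centers_by_cluster: dict mapping a cluster id to an (m_i, d) array
        holding the sub-centers of that candidate (top-K) first cluster.
    Returns (cluster_id, count); cluster_id is None when nothing passes
    the threshold (assumed behaviour).
    """
    best_id, best_count = None, 0
    for cluster_id, sub_centers in sub_centers_by_cluster.items():
        sims = sub_centers @ new_sub_centers.T   # shape (m_i, N)
        # Assumption: an existing sub-center counts if it is close to
        # at least one sub-center of the new cluster.
        count = int(np.sum(sims.max(axis=1) > threshold))
        if count > best_count:
            best_id, best_count = cluster_id, count
    return best_id, best_count
```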

In one possible embodiment, merging the single piece of image data with the first cluster C using the first cluster centers includes:
obtaining a sixth similarity between the single piece of image data and each first cluster center F, where the first cluster centers F are the first cluster centers corresponding to the first sub-clusters of each of the K first clusters; for each of the K first clusters, determining a third quantity, namely the number of first cluster centers F in that first cluster whose sixth similarity is greater than a third threshold; determining the first cluster with the largest third quantity among the K first clusters as the first cluster C; and merging the single piece of image data with the first cluster C.

To merge a single piece of image data, the first cluster C must be determined from the selected top K first clusters; note that the top K first clusters may be all of the sorted first clusters. First, the similarity between the single piece of image data and the cluster center of each first sub-cluster of each of the K first clusters (i.e., each first cluster center F) is calculated and taken as the sixth similarity. The K first clusters are then analyzed: for each first cluster, the number of first cluster centers F whose sixth similarity is greater than the third threshold is determined and taken as the third quantity, and the first cluster with the largest third quantity is determined as the first cluster C. In other words, the first cluster C contains the largest number of first sub-clusters that are close to the single piece of image data, so merging the single piece of image data into the first cluster C makes the clustering result more accurate.

In one possible embodiment, M is less than or equal to a fourth threshold, and after the second image data set is merged with the first cluster using the first cluster centers, the incremental image clustering method further includes the following steps, as shown in FIG. 5.
At S51, the merged first cluster is divided into R third sub-clusters, and a fifth cluster center of each of the R third sub-clusters is obtained, where R is an integer greater than or equal to 1.
At S52, if R is less than or equal to the fourth threshold, the R third sub-clusters are retained, and the first cluster centers are updated using the fifth cluster centers corresponding to the R third sub-clusters.
At S53, if R is greater than the fourth threshold, a fourth quantity, namely the number of image data in each of the R third sub-clusters, is obtained.
At S54, the R third sub-clusters are sorted in descending order of the fourth quantity to obtain a fourth cluster sequence, the first P third sub-clusters in the fourth cluster sequence are selected, and the first cluster centers are updated using the fifth cluster centers corresponding to the P third sub-clusters, where P is less than or equal to the fourth threshold.

After isolated image data and second clusters have been merged into a first cluster, or after a single piece of image data has been merged into a first cluster, new image data have been clustered into the original first cluster, so the sub-centers of the original first cluster need to be updated. Specifically, the merged first cluster is divided into R third sub-clusters in the same way as the first cluster is divided, and the fifth cluster center of each third sub-cluster is calculated. The number of third sub-clusters is R. If this number is less than or equal to the fourth threshold (for example, 20), the R third sub-clusters are retained, their fifth cluster centers are taken as the new sub-centers of the merged first cluster in place of the original first cluster centers, and the merged first cluster is represented by the second cluster center and the R fifth cluster centers.

If the number of third sub-clusters is greater than the fourth threshold, the R third sub-clusters are sorted in descending order of the number of image data in each third sub-cluster (i.e., the fourth quantity) to obtain the fourth cluster sequence, and the first P third sub-clusters are selected and retained; for example, only the first 20 third sub-clusters are retained and the remaining third sub-clusters are discarded. The fifth cluster centers of these P third sub-clusters are taken as the new sub-centers of the merged first cluster in place of the original first cluster centers, and the merged first cluster is represented by the second cluster center and the P fifth cluster centers. It will be understood that, because only a preset number of sub-clusters are retained each time a cluster is divided into sub-clusters, both M and N are less than or equal to the fourth threshold. Thus, when there are many sub-clusters, retaining the sub-clusters that contain more image data limits the number of sub-centers and removes the influence of outlier image data, which not only simplifies maintenance but also allows good clustering quality to be obtained in long-running, large-scale incremental clustering scenarios.
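The retention rule of steps S51 to S54 can be sketched as follows. The sketch assumes that the merged cluster has already been re-split, so that `labels` gives a sub-cluster index for every image, and that each center is the mean of its members. The names `refresh_sub_centers` and `max_sub_centers` (standing in for the fourth threshold) are hypothetical.

```python
import numpy as np


def refresh_sub_centers(features, labels, max_sub_centers):
    """Recompute the sub-centers of a merged cluster, keeping at most
    `max_sub_centers` of them, largest sub-clusters first.

    features: (n, d) array with the features of all images in the merged cluster.
    labels: (n,) integer array, sub-cluster index of each image after re-splitting.
    Returns (kept_ids, centers, sizes).
    """
    ids, counts = np.unique(labels, return_counts=True)
    # Sort sub-clusters by member count, descending; the slice keeps all R
    # sub-clusters when R <= max_sub_centers and the P largest otherwise.
    order = np.argsort(-counts, kind="stable")[:max_sub_centers]
    kept_ids = ids[order]
    centers = np.stack([features[labels == k].mean(axis=0) for k in kept_ids])
    return kept_ids, centers, counts[order]
```

Because small sub-clusters are dropped first, images that form tiny groups of their own stop contributing sub-centers, which is the outlier-suppression effect the text describes.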

Referring to FIG. 6, FIG. 6 is a flowchart of another incremental image clustering method according to an embodiment of the present disclosure. As shown in FIG. 6, the method includes steps S61 to S66.
At S61, a first cluster of a first image data set is obtained.
At S62, the first cluster is divided into M first sub-clusters, and a first cluster center corresponding to each of the M first sub-clusters is obtained, where M is an integer greater than or equal to 1.
At S63, a second image data set is obtained.
At S64, if the second image data set contains a plurality of image data, the plurality of image data are clustered to obtain isolated image data and second clusters.
At S65, the isolated image data are merged into a first cluster A using the first cluster centers, and the second cluster is merged with a first cluster B using the first cluster centers.
At S66, if the second image data set contains only a single piece of image data, the single piece of image data is merged into a first cluster C using the first cluster centers.

The implementation of steps S61 to S66 has been described in the embodiments shown in FIGS. 2 to 5 and achieves the same or similar beneficial effects, so a detailed description is omitted here.

Face recognition technology has advanced steadily with breakthroughs in deep learning research, and face recognition models obtained through supervised learning have repeatedly achieved breakthroughs. However, when large amounts of unlabeled face data are encountered, how to classify them accurately and quickly remains a major challenge of economic and social value.

In real-world scenarios such as social media and security, the volume of image data is often very large and new data are generated incrementally every day, so incremental clustering has greater practical value. An incremental clustering scheme must maintain a number of clusters throughout the clustering process. Conventional clustering algorithms represent a cluster by a single cluster center, obtained for example by averaging the features of all samples in the cluster. Different clusters, however, differ in how sparse they are, and simply using a single averaged cluster center tends to lose the rich sample information inside the cluster, so that clustering quality gradually degrades as incremental clustering proceeds.

In practical applications of face clustering, the facial features of different people are distributed differently in the feature space: some clusters have relatively compact samples, while others have relatively sparse samples. When a cluster is represented by a single center, this internal information may be lost. As incremental clustering proceeds, the influence of existing samples gradually diminishes, and the risk that the cluster center drifts grows as new samples are added.

The incremental image clustering method provided by an embodiment of the present disclosure includes the following steps.

At S67, the similarities between the samples of a cluster are calculated, and the cluster is divided into several tighter sub-clusters.

Calculating the similarities between the samples of a cluster yields a similarity matrix. Given the threshold used for clustering, a higher threshold must be set, that is, a threshold greater than the clustering threshold, so that one cluster is divided into several tighter sub-clusters.

The multiple centers of a cluster may be obtained by analyzing the cluster with a method based on connected-graph analysis. By calculating the similarity matrix of the cluster and applying a similarity threshold higher than the one used for clustering, one cluster can be divided into several tighter sub-clusters. Multiple sub-cluster centers are obtained in this way, and together with the center of the whole cluster, which serves as the main center, they form a multi-center representation of the cluster.

Obtaining multiple sub-centers through the multi-center design based on connected-graph analysis proceeds as follows. First, a higher threshold (which must be higher than the clustering threshold) is set for each cluster, so that the cluster is divided into several more compact connected subgraphs. A sub-center is then calculated for each connected subgraph, which yields multiple sub-centers. The main center is still obtained by applying the conventional averaging method to the whole cluster.

At S68, during incremental clustering, each time a new batch of data is added, the new data are clustered, producing several clusters and isolated samples that were not assigned to any cluster.

At S69, the generated clusters and the unclustered isolated samples are cluster-merged with the existing clustering result obtained at step S67.

In the multi-center incremental clustering method based on a single main center and multiple sub-centers, once the main center and the sub-centers have been obtained, incremental clustering first performs a coarse screening by searching for the top K using the main centers and the newly added data, and then decides, based on the sub-centers, whether to incorporate a new sample or another cluster.

This cluster-merging process involves merging between clusters and incorporating single isolated samples into clusters. To incorporate an isolated sample point under the multi-center design, a low threshold is first set and the top K are searched using the main centers; it is then determined whether the sub-centers and the sample point satisfy the clustering threshold. More than one cluster may satisfy the requirement for a given isolated sample point, in which case the cluster with the largest number of sub-centers satisfying the requirement is taken as the target cluster. For merging between clusters, the top K are likewise screened and searched with a low threshold, and it is then determined whether there are sub-centers between the clusters that satisfy the threshold requirement. If several clusters satisfy the requirement, the cluster with the largest number of sub-centers satisfying the threshold requirement is taken as the target cluster.

The multi-center incremental clustering architecture uses the single main center and the multiple sub-centers of the multi-center mechanism together. When searching the top-K neighbors, similarity is calculated using the main centers; similarity is then calculated between the sub-centers and the single sample or cluster awaiting clustering, in order to decide whether the single sample is incorporated or the clusters are merged. The architecture exploits the advantages of the multi-center representation as a whole and improves clustering quality without greatly increasing computational complexity.

When clusters are merged or new samples are added, the sub-centers need to be updated. To simplify the computation, this may be modeled as clustering of the sub-centers, which achieves both the merging and the updating of sub-centers. In addition, to prevent the number of sub-centers from becoming too large, the sub-centers may be sorted in descending order of the number of sample points they represent, and, for example, at most the top 20 sub-centers may be selected.
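One way to read "clustering of the sub-centers" is sketched below. The example is illustrative only: it assumes that sub-centers are unit vectors compared by cosine similarity, that sub-centers whose similarity exceeds a threshold are linked and merged into a sample-count-weighted mean, and that the merged centers are then re-normalised. The text prescribes none of these details. The names `merge_sub_centers`, `threshold` and `max_sub_centers` are hypothetical; the default of 20 follows the example in the text.

```python
import numpy as np


def merge_sub_centers(centers, counts, threshold, max_sub_centers=20):
    """Merge similar sub-centers and keep the ones representing most samples.

    centers: (m, d) array of sub-centers (assumed to be unit vectors).
    counts: (m,) number of sample points each sub-center represents.
    threshold: sub-centers with similarity above it are merged (assumption).
    Returns (new_centers, new_counts), sorted by count in descending order.
    """
    centers = np.asarray(centers, dtype=float)
    counts = np.asarray(counts)
    m = len(centers)
    sim = centers @ centers.T
    parent = list(range(m))            # union-find forest over sub-centers

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    for i in range(m):
        for j in range(i + 1, m):
            if sim[i, j] > threshold:
                parent[find(i)] = find(j)   # link similar sub-centers

    groups = {}
    for i in range(m):
        groups.setdefault(find(i), []).append(i)

    merged = []
    for members in groups.values():
        w = counts[members].astype(float)
        center = (centers[members] * w[:, None]).sum(axis=0) / w.sum()
        center /= np.linalg.norm(center)    # back to unit length
        merged.append((int(w.sum()), center))

    merged.sort(key=lambda item: -item[0])  # most-represented first
    merged = merged[:max_sub_centers]
    new_counts = np.array([c for c, _ in merged])
    new_centers = np.stack([v for _, v in merged])
    return new_centers, new_counts
```

Working on sub-centers rather than on the raw samples keeps the cost of an update proportional to the number of sub-centers, which the cap keeps small.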

A multi-center incremental update scheme is used for clusters. In real scenarios, as the amount of data keeps growing, merging and updating sub-centers and limiting their number prevents the excessive computation and storage burden that a continually growing number of sub-centers would cause, and also reduces the influence of outlier interference points.

The embodiments of the present disclosure fully take into account the complexities of face clustering on large-scale data.
First, a multi-center construction scheme for face clusters is provided, which yields a description of a face cluster by a single main center and multiple sub-centers. This addresses two problems of a cluster description that maintains only one cluster center: the information on compact sub-clusters inside the cluster is omitted, and, as data keeps growing, the single center is continuously influenced by new samples, so that the center risks drifting, the influence of existing samples in the cluster gradually weakens, and the representational power of the center declines. It also addresses the corresponding problem in incremental clustering, where a single cluster center is usually maintained for each cluster, clusters are merged and updated by computing the similarity between the cluster center and new samples or clusters as data is added, and the center itself is continuously updated: with a single center, the rich sample information inside the cluster is lost and drift occurs easily, so the clustering result degrades over time.

Next, a multi-center-based incremental clustering architecture is provided. It strikes a good balance between computational complexity and clustering accuracy when incremental clustering uses a multi-center representation, and it supports both incorporating a single sample into a cluster and merging clusters. This addresses the prior-art problem that maintaining multiple centers greatly affects the computation speed and storage of clustering in large-scale data scenarios.

Finally, a multi-center incremental update scheme is provided. By merging and updating sub-centers and limiting their number, it allows a good clustering result even in long-running, large-scale incremental clustering scenarios. The scheme limits growth in the number of centers and removes the influence of outliers. This addresses the prior-art problems that, because face image features are generally high-dimensional, maintaining multiple centers multiplies the memory load during clustering and multiplies the computation during the TopK nearest-neighbor search.

According to the description of the method embodiments shown in FIG. 2 or FIG. 6, an embodiment of the present disclosure further provides an apparatus for incremental clustering of images. Referring to FIG. 7, which is a block diagram of the apparatus for incremental clustering of images provided by an embodiment of the present disclosure, the apparatus includes:
a first acquisition module 71 configured to acquire a first cluster of a first image data set;
a first division module 72 configured to divide the first cluster into M first sub-clusters and obtain a first cluster center corresponding to each of the M first sub-clusters, where M is an integer greater than or equal to 1; and
a merge module 73 configured to obtain a second image data set and merge the second image data set with the first cluster using the first cluster centers.

In one possible embodiment, the first cluster includes a first cluster A, a first cluster B, and a first cluster C. To merge the second image data set with the first cluster using the first cluster centers, the merge module 73 is configured to: when the second image data set contains multiple pieces of image data, cluster them to obtain isolated image data and a second cluster, merge the isolated image data into the first cluster A using the first cluster centers, and merge the second cluster with the first cluster B using the first cluster centers; and, when the second image data set contains only a single piece of image data, merge that single piece of image data into the first cluster C using the first cluster centers.

In one possible embodiment, the first cluster has a corresponding second cluster center, and before merging the second image data set with the first cluster using the first cluster centers, the merge module 73 is further configured to determine K first clusters from among the first clusters using the second cluster centers.

In one possible embodiment, the second cluster has a corresponding third cluster center. To determine K first clusters from among the first clusters using the second cluster centers, the merge module 73 is configured to: obtain a first similarity between the isolated image data and the second cluster centers, sort the first clusters in descending order of the first similarity to obtain a first cluster sequence, and select the first K first clusters in that sequence; and obtain a second similarity between the third cluster center and the second cluster centers, sort the first clusters in descending order of the second similarity to obtain a second cluster sequence, and select the first K first clusters in that sequence; or obtain a third similarity between the single piece of image data and the second cluster centers, sort the first clusters in descending order of the third similarity to obtain a third cluster sequence, and select the first K first clusters in that sequence.
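As a non-limiting illustration, the selection of K candidate first clusters by similarity to their main (second) cluster centers might be sketched as follows. The use of cosine similarity and the names `top_k_clusters`, `query` and `main_centers` are assumptions made for this sketch; the disclosure does not fix a particular similarity measure here.

```python
import heapq
import math


def cosine(u, v):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k_clusters(query, main_centers, k):
    """Return the ids of the K clusters whose main center best matches query.

    query        -- feature of an isolated image, of a single image, or the
                    main center of a newly formed cluster
    main_centers -- dict mapping cluster id -> main-center feature vector
    """
    scored = ((cosine(query, c), cid) for cid, c in main_centers.items())
    # nlargest orders by similarity, highest first, and keeps the first K.
    return [cid for _, cid in heapq.nlargest(k, scored)]
```

Only one comparison per existing cluster is needed at this stage, so the more expensive sub-center comparison can be restricted to the K clusters returned.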

In one possible embodiment, to merge the isolated image data with the first cluster A using the first cluster centers, the merge module 73 is configured to: obtain a fourth similarity between the isolated image data and first cluster centers D, where the first cluster centers D are the first cluster centers corresponding to each first sub-cluster of each of the K first clusters; determine, for each of the K first clusters, a first quantity, namely the number of first cluster centers D in that first cluster whose fourth similarity is greater than a first threshold; determine the first cluster with the largest first quantity among the K first clusters as the first cluster A; and merge the isolated image data with the first cluster A.
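As a non-limiting illustration, the choice of the target cluster for an isolated image by counting sub-centers above a threshold might be sketched as follows. Cosine similarity, the names `pick_cluster_for_sample` and `candidates`, and the behavior of returning `None` when no sub-center exceeds the threshold are assumptions made for this sketch.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pick_cluster_for_sample(sample, candidates, threshold):
    """Pick the candidate cluster with the most sub-centers similar to sample.

    sample     -- feature vector of the isolated (or single) image
    candidates -- dict mapping cluster id -> list of sub-center vectors,
                  restricted to the K clusters found by the coarse search
    threshold  -- similarity a sub-center must exceed to be counted
    """
    best_id, best_count = None, 0
    for cid, sub_centers in candidates.items():
        # Count the sub-centers of this cluster that exceed the threshold.
        count = sum(1 for c in sub_centers if cosine(sample, c) > threshold)
        if count > best_count:
            best_id, best_count = cid, count
    # Assumption: None means no candidate qualified, so no merge takes place.
    return best_id
```

Counting matching sub-centers, rather than relying on the single best match, makes the decision less sensitive to one outlying sub-center.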

In one possible embodiment, to merge the second cluster with the first cluster B using the first cluster centers, the merge module 73 is configured to: divide the second cluster into N second sub-clusters and obtain a fourth cluster center corresponding to each of the N second sub-clusters, where N is an integer greater than or equal to 1; obtain a fifth similarity between the fourth cluster centers and first cluster centers E, where the first cluster centers E are the first cluster centers corresponding to each first sub-cluster of each of the K first clusters; determine, for each of the K first clusters, a second quantity, namely the number of first cluster centers E in that first cluster whose fifth similarity is greater than a second threshold; determine the first cluster with the largest second quantity among the K first clusters as the first cluster B; and merge the second cluster with the first cluster B.
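As a non-limiting illustration, the choice of the target cluster for a newly formed cluster might be sketched as follows. The text does not spell out how the pairwise similarities between the two sets of sub-centers are aggregated. This sketch assumes one reading: a sub-center of an existing cluster is counted if it exceeds the threshold against at least one sub-center of the new cluster. Cosine similarity and the names `pick_cluster_for_cluster`, `new_sub_centers` and `candidates` are also assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pick_cluster_for_cluster(new_sub_centers, candidates, threshold):
    """Pick the candidate cluster that best matches a newly formed cluster.

    new_sub_centers -- list of sub-center vectors of the new cluster
    candidates      -- dict mapping cluster id -> list of sub-center vectors
                       of the K clusters found by the coarse search
    threshold       -- similarity a pair of sub-centers must exceed
    """
    best_id, best_count = None, 0
    for cid, sub_centers in candidates.items():
        # Assumed counting rule: an existing sub-center counts once if any
        # sub-center of the new cluster is similar enough to it.
        count = sum(
            1
            for e in sub_centers
            if any(cosine(q, e) > threshold for q in new_sub_centers)
        )
        if count > best_count:
            best_id, best_count = cid, count
    return best_id
```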

In one possible embodiment, to merge the single piece of image data with the first cluster C using the first cluster centers, the merge module 73 is configured to: obtain a sixth similarity between the single piece of image data and first cluster centers F, where the first cluster centers F are the first cluster centers corresponding to each first sub-cluster of each of the K first clusters; determine, for each of the K first clusters, a third quantity, namely the number of first cluster centers F in that first cluster whose sixth similarity is greater than a third threshold; determine the first cluster with the largest third quantity among the K first clusters as the first cluster C; and merge the single piece of image data with the first cluster C.

In one possible embodiment, M is less than or equal to a fourth threshold, and the first division module 72 is further configured to: divide the merged first cluster into R third sub-clusters and obtain a fifth cluster center of each of the R third sub-clusters, where R is an integer greater than or equal to 1; when R is less than or equal to the fourth threshold, retain the R third sub-clusters and update the first cluster centers with the fifth cluster centers corresponding to the R third sub-clusters; and when R is greater than the fourth threshold, obtain a fourth quantity, namely the number of pieces of image data in each of the R third sub-clusters, sort the R third sub-clusters in descending order of the fourth quantity to obtain a fourth cluster sequence, select the first P third sub-clusters in that sequence, and update the first cluster centers with the fifth cluster centers corresponding to the P third sub-clusters, where P is less than or equal to the fourth threshold.
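As a non-limiting illustration, limiting the number of sub-centers after a merge might be sketched as follows. The text only requires P to be less than or equal to the fourth threshold; this sketch assumes P equals the threshold. The names `cap_sub_centers`, `sub_clusters` and `max_centers` are hypothetical.

```python
def cap_sub_centers(sub_clusters, max_centers):
    """Return the sub-centers to keep after re-dividing a merged cluster.

    sub_clusters -- list of (center_vector, image_count) pairs, one per
                    sub-cluster of the merged cluster (R entries)
    max_centers  -- upper bound on the number of sub-centers kept
                    (the "fourth threshold" in the text)
    """
    if len(sub_clusters) <= max_centers:
        # R is within the limit: keep every sub-center unchanged.
        return [center for center, _ in sub_clusters]
    # R exceeds the limit: rank by image count, largest first, and keep
    # the first P = max_centers entries (assumed choice of P).
    ranked = sorted(sub_clusters, key=lambda sc: sc[1], reverse=True)
    return [center for center, _ in ranked[:max_centers]]
```

Sub-clusters that represent very few images are the ones dropped, which is consistent with the stated aim of reducing the influence of outliers.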

In one possible embodiment, to divide the first cluster into M first sub-clusters, the first division module 72 is configured to obtain a seventh similarity between the pieces of image data in the first cluster to form a similarity matrix, and to divide the first cluster into the M first sub-clusters based on the similarity matrix.

In one possible embodiment, to divide the first cluster into the M first sub-clusters based on the similarity matrix, the first division module 72 is configured to: obtain a connected graph whose vertices are the pieces of image data in the first cluster; look up the seventh similarity between vertices of the connected graph in the similarity matrix; and group vertices whose seventh similarity is greater than a fifth threshold into one first sub-cluster, thereby obtaining the M first sub-clusters.
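As a non-limiting illustration, the division of a cluster into sub-clusters might be sketched as follows. This sketch assumes one reading of the text: two images are linked when their similarity exceeds the threshold, and each connected component of the resulting graph forms one sub-cluster. Cosine similarity, taking the mean feature as the sub-center, and the name `split_into_sub_clusters` are also assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def split_into_sub_clusters(features, threshold):
    """Split one cluster into sub-clusters and compute their centers.

    features  -- list of feature vectors, one per image in the cluster
    threshold -- similarity above which two images are linked
    Returns (sub_clusters, centers), where sub_clusters is a list of lists
    of image indices and centers holds the mean feature of each sub-cluster.
    """
    n = len(features)
    # Pairwise similarity matrix over the images of the cluster.
    sim = [[cosine(features[i], features[j]) for j in range(n)] for i in range(n)]

    parent = list(range(n))  # union-find forest over the graph vertices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] > threshold:
                parent[find(i)] = find(j)  # link the two components

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    sub_clusters = list(groups.values())
    centers = [
        [sum(features[i][d] for i in g) / len(g) for d in range(len(features[0]))]
        for g in sub_clusters
    ]
    return sub_clusters, centers
```

The number of groups returned plays the role of M, so M is determined by the data and the threshold and is not fixed in advance.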

According to one embodiment of the present disclosure, some or all of the units of the apparatus for incremental clustering of images shown in FIG. 7 may be combined into one or more other units, or one or more of them may be further split into multiple functionally smaller units; either arrangement realizes the same operations without affecting the technical effects of the embodiments of the present disclosure. The units are divided according to logical function. In practice, the function of one unit may be realized by multiple units, or the functions of multiple units may be realized by one unit. In other embodiments of the present disclosure, the apparatus for incremental clustering of images may also include other units, and in practice these functions may be realized with the assistance of other units or by multiple units working together.

According to another embodiment of the present disclosure, the apparatus for incremental clustering of images shown in FIG. 7 may be constructed, and the method for incremental clustering of images of the embodiments of the present disclosure implemented, by running a computer program (including program code) capable of executing the steps of the corresponding method shown in FIG. 2 or FIG. 6 on a general-purpose computing device, such as a computer, that includes processing and storage elements such as a central processing unit (CPU), a random access storage medium (RAM), and a read-only storage medium (ROM). The computer program may, for example, be recorded on a computer-readable recording medium, loaded into the computing device via that medium, and run there.

According to the description of the above method embodiments and apparatus embodiments, an embodiment of the present disclosure further provides an electronic device. Referring to FIG. 8, the electronic device includes at least a processor 81, an input device 82, an output device 83, and a computer storage medium 84. The processor 81, input device 82, output device 83, and computer storage medium 84 within the electronic device may be connected via a bus or in another manner.

The computer storage medium 84 may reside in the memory of the electronic device and is configured to store a computer program comprising program instructions; the processor 81 is configured to execute the program instructions stored in the computer storage medium 84. The processor 81 (also called a CPU, Central Processing Unit) is the computing core and control core of the electronic device. It is suited to implementing one or more instructions, and in particular to loading and executing one or more instructions to realize the corresponding method flow or function.

In one embodiment, the processor 81 of the electronic device provided by an embodiment of the present disclosure may be configured to perform the following series of incremental image clustering operations:

obtaining a first cluster of a first image data set;
dividing the first cluster into M first sub-clusters and obtaining a first cluster center corresponding to each of the M first sub-clusters, where M is an integer greater than or equal to 1; and
obtaining a second image data set and merging the second image data set with the first cluster using the first cluster centers.

In yet another embodiment, the first cluster includes a first cluster A, a first cluster B, and a first cluster C, and the processor 81 performs the step of merging the second image data set with the first cluster using the first cluster centers, which includes: when the second image data set contains multiple pieces of image data, clustering them to obtain isolated image data and a second cluster; merging the isolated image data into the first cluster A using the first cluster centers, and merging the second cluster with the first cluster B using the first cluster centers; and, when the second image data set contains only a single piece of image data, merging that single piece of image data into the first cluster C using the first cluster centers.

In yet another embodiment, the first cluster has a corresponding second cluster center, and before merging the second image data set with the first cluster using the first cluster centers, the processor 81 is further configured to perform the step of determining K first clusters from among the first clusters using the second cluster centers.

In yet another embodiment, the second cluster has a corresponding third cluster center, and the processor 81 performs the step of determining K first clusters from among the first clusters using the second cluster centers, which includes: obtaining a first similarity between the isolated image data and the second cluster centers, sorting the first clusters in descending order of the first similarity to obtain a first cluster sequence, and selecting the first K first clusters in that sequence; and obtaining a second similarity between the third cluster center and the second cluster centers, sorting the first clusters in descending order of the second similarity to obtain a second cluster sequence, and selecting the first K first clusters in that sequence; or obtaining a third similarity between the single piece of image data and the second cluster centers, sorting the first clusters in descending order of the third similarity to obtain a third cluster sequence, and selecting the first K first clusters in that sequence.

In yet another embodiment, the processor 81 performs the step of merging the isolated image data with the first cluster A using the first cluster centers, which includes: obtaining a fourth similarity between the isolated image data and first cluster centers D, where the first cluster centers D are the first cluster centers corresponding to each first sub-cluster of each of the K first clusters; determining, for each of the K first clusters, a first quantity, namely the number of first cluster centers D in that first cluster whose fourth similarity is greater than a first threshold; determining the first cluster with the largest first quantity among the K first clusters as the first cluster A; and merging the isolated image data with the first cluster A.

In yet another embodiment, the processor 81 performs the step of merging the second cluster with the first cluster B using the first cluster centers, which includes: dividing the second cluster into N second sub-clusters and obtaining a fourth cluster center corresponding to each of the N second sub-clusters, where N is an integer greater than or equal to 1; obtaining a fifth similarity between the fourth cluster centers and first cluster centers E, where the first cluster centers E are the first cluster centers corresponding to each first sub-cluster of each of the K first clusters; determining, for each of the K first clusters, a second quantity, namely the number of first cluster centers E in that first cluster whose fifth similarity is greater than a second threshold; determining the first cluster with the largest second quantity among the K first clusters as the first cluster B; and merging the second cluster with the first cluster B.

In yet another embodiment, the processor 81 performs the step of merging the single piece of image data with the first cluster C using the first cluster centers, which includes: obtaining a sixth similarity between the single piece of image data and first cluster centers F, where the first cluster centers F are the first cluster centers corresponding to each first sub-cluster of each of the K first clusters; determining, for each of the K first clusters, a third quantity, namely the number of first cluster centers F in that first cluster whose sixth similarity is greater than a third threshold; determining the first cluster with the largest third quantity among the K first clusters as the first cluster C; and merging the single piece of image data with the first cluster C.

In yet another embodiment, M is less than or equal to a fourth threshold, and after merging the second image data set with the first cluster using the first cluster centers, the processor 81 is further configured to perform the steps of: dividing the merged first cluster into R third sub-clusters and obtaining a fifth cluster center of each of the R third sub-clusters, where R is an integer greater than or equal to 1; when R is less than or equal to the fourth threshold, retaining the R third sub-clusters and updating the first cluster centers with the fifth cluster centers corresponding to the R third sub-clusters; when R is greater than the fourth threshold, obtaining a fourth quantity, namely the number of pieces of image data in each of the R third sub-clusters; and sorting the R third sub-clusters in descending order of the fourth quantity to obtain a fourth cluster sequence, selecting the first P third sub-clusters in that sequence, and updating the first cluster centers with the fifth cluster centers corresponding to the P third sub-clusters, where P is less than or equal to the fourth threshold.

In yet another embodiment, the first cluster is obtained by clustering the image data in the first image data set, and the processor 81 performs the step of dividing the first cluster into M first sub-clusters, which includes: obtaining a seventh similarity between the pieces of image data in the first cluster to form a similarity matrix; and dividing the first cluster into the M first sub-clusters based on the similarity matrix.

In yet another embodiment, the processor 81 performs the step of dividing the first cluster into the M first sub-clusters based on the similarity matrix, which includes: obtaining a connected graph whose vertices are the pieces of image data in the first cluster; looking up the seventh similarity between vertices of the connected graph in the similarity matrix; and grouping vertices whose seventh similarity is greater than a fifth threshold into one first sub-cluster, thereby obtaining the M first sub-clusters.

By way of example, the electronic device may be a computer, a computer host, a server, a cloud server, a server cluster, or the like. The electronic device includes, but is not limited to, the processor 81, the input device 82, the output device 83, and the computer storage medium 84. The input device 82 may be a keyboard, a touch screen, or the like, and the output device 83 may be a speaker, a display, a radio frequency transmitter, or the like. Those skilled in the art will understand that the schematic diagram is merely an example of an electronic device and does not limit it; the electronic device may include more or fewer components than shown, combine certain components, or use different components.

It should be noted that the steps of the above method for incremental clustering of images are realized when the processor 81 of the electronic device executes the computer program. Therefore, all embodiments of the above method for incremental clustering of images apply to the electronic device, and all can achieve the same or similar beneficial effects.

An embodiment of the present disclosure further provides a computer program product which, when executed by a processor, implements the method of any one of the preceding embodiments. The computer program product may be implemented in hardware, software, or a combination thereof. In some embodiments of the present disclosure, the computer program product is embodied as a computer storage medium; in other embodiments, it is embodied as a software product such as a software development kit (SDK).

An embodiment of the present disclosure further provides a computer storage medium (memory), which is a storage device within the electronic device and is configured to store programs and data. It will be understood that the computer storage medium here may include a built-in storage medium of the terminal and may, of course, also include an extended storage medium supported by the terminal. The computer storage medium provides a storage space in which the operating system of the terminal is stored. The storage space also stores one or more instructions suitable for being loaded and executed by the processor 81; these instructions may be one or more computer programs (including program code). It should be noted that the computer storage medium here may be a high-speed RAM memory or a non-volatile memory, such as at least one magnetic disk memory; in some embodiments of the present disclosure, it may also be at least one computer storage medium located remotely from the processor 81. In one embodiment, the processor 81 may load and execute one or more instructions stored in the computer storage medium to implement the corresponding steps of the above method for incremental clustering of images.

By way of example, the computer program on the computer storage medium includes computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash disk, a mobile hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like.

It should be noted that, because the steps of the incremental image clustering method described above are implemented when the computer program on the computer storage medium is executed by a processor, all embodiments of that method apply to the computer storage medium and can achieve the same or similar beneficial effects.

The embodiments of the present disclosure have been described above in detail. Examples have been used herein to explain the principles and implementations of the present disclosure, but the above embodiments are provided only to aid understanding of the method of the present disclosure and its core idea. A person skilled in the art may, based on the idea of the present disclosure, vary both the specific implementations and the scope of application. Accordingly, the content of this specification should not be understood as limiting the present disclosure.

In this embodiment, the first cluster is divided into a plurality of first sub-clusters, and the first cluster is merged with the second image data set on the basis of the first cluster centers of the first sub-clusters. Maintaining a plurality of first cluster centers solves the problem that, as image data accumulates, the cluster center drifts under the influence of newly added image data; this helps make the clustering result more accurate and improves the clustering effect.
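
As a non-limiting illustration of the drift described above, the following Python sketch compares a single mean center with per-sub-cluster centers when new image data arrives in only one appearance mode. The two-dimensional feature vectors, the helper `mean`, and the use of an arithmetic mean as the cluster center are assumptions made for this example; they are not specified by the present disclosure.

```python
from typing import List, Sequence

Vector = Sequence[float]


def mean(points: List[Vector]) -> List[float]:
    """Component-wise arithmetic mean of equally sized vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


# Two appearance modes of the same cluster (hypothetical 2-D features).
mode_a = [[1.0, 0.0], [0.9, 0.1]]
mode_b = [[0.0, 1.0], [0.1, 0.9]]

# A single center over the whole cluster versus one center per sub-cluster.
main_before = mean(mode_a + mode_b)
sub_centers = [mean(mode_a), mean(mode_b)]

# New image data arrives in mode A only.
new_a = [[1.0, 0.0]] * 6
main_after = mean(mode_a + mode_b + new_a)
sub_centers_after = [mean(mode_a + new_a), mean(mode_b)]
```

In this example the single center moves from about (0.5, 0.5) to about (0.8, 0.2), whereas the sub-center of the mode that received no new data is unchanged.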

As can be seen from the above, the embodiments of the present disclosure obtain a first cluster of a first image data set, divide the first cluster into M first sub-clusters, obtain a first cluster center corresponding to each of the M first sub-clusters, where M is an integer greater than or equal to 1, obtain a second image data set, and merge the second image data set with the first cluster using the first cluster centers. In this way, the first cluster is divided into a plurality of first sub-clusters, and the first cluster is merged with the second image data set on the basis of the first cluster centers of the first sub-clusters. Maintaining a plurality of first cluster centers (that is, sub-centers) solves the problem that, as image data accumulates, the cluster center (the cluster center of the first cluster, that is, the main center) drifts under the influence of newly added image data. This helps make the clustering result more accurate and improves the clustering effect. In addition, during clustering the second image data set no longer needs to be compared for similarity against the entire first image data set, which helps reduce computational complexity.
For example, the present application provides the following items.
(Item 1)
A method for incremental clustering of images, comprising:
obtaining a first cluster of a first image data set;
dividing the first cluster into M first sub-clusters and obtaining a first cluster center corresponding to each of the M first sub-clusters, wherein M is an integer greater than or equal to 1; and
obtaining a second image data set and merging the second image data set with the first cluster using the first cluster center.
(Item 2)
The method for incremental clustering of images according to item 1, wherein the first cluster comprises a first cluster A, a first cluster B, and a first cluster C, and merging the second image data set with the first cluster using the first cluster center comprises:
if the second image data set contains a plurality of image data, clustering the plurality of image data to obtain isolated image data and a second cluster;
merging the isolated image data into the first cluster A using the first cluster center, and merging the second cluster with the first cluster B using the first cluster center; and
if the second image data set contains only a single piece of image data, merging the single piece of image data into the first cluster C using the first cluster center.
(Item 3)
The method for incremental clustering of images according to item 2, wherein the first cluster has a corresponding second cluster center, and the method further comprises, before merging the second image data set with the first cluster using the first cluster center:
determining K first clusters from among the first clusters using the second cluster center.
(Item 4)
The method for incremental clustering of images according to item 3, wherein the second cluster has a corresponding third cluster center, and determining K first clusters from among the first clusters using the second cluster center comprises:
obtaining a first similarity between the isolated image data and the second cluster center, sorting the first clusters in descending order of the first similarity to obtain a first cluster sequence, and selecting the first K first clusters in the first cluster sequence; and
obtaining a second similarity between the third cluster center and the second cluster center, sorting the first clusters in descending order of the second similarity to obtain a second cluster sequence, and selecting the first K first clusters in the second cluster sequence;
or comprises:
obtaining a third similarity between the single piece of image data and the second cluster center, sorting the first clusters in descending order of the third similarity to obtain a third cluster sequence, and selecting the first K first clusters in the third cluster sequence.
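
As a non-limiting illustration, the coarse selection of K candidate first clusters by their second cluster centers (main centers) might be sketched as follows. Cosine similarity as the similarity measure, the dictionary keyed by hypothetical cluster identifiers, and the names `cosine` and `top_k_clusters` are assumptions made for this example.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b)


def top_k_clusters(query: Vector, main_centers: Dict[str, Vector], k: int) -> List[str]:
    """Sort first clusters by similarity of their main center to the query
    (highest first) and keep the first K identifiers."""
    ranked = sorted(
        main_centers,
        key=lambda cid: cosine(query, main_centers[cid]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical main centers of three first clusters.
main_centers = {"c0": [1.0, 0.0], "c1": [0.0, 1.0], "c2": [0.7, 0.7]}
candidates = top_k_clusters([0.9, 0.1], main_centers, k=2)
```

The query may be an isolated image feature, a third cluster center, or a single image feature, since all three cases rank the first clusters in the same way.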
(Item 5)
The method for incremental clustering of images according to item 3, wherein merging the isolated image data with the first cluster A using the first cluster center comprises:
obtaining a fourth similarity between the isolated image data and a first cluster center D, wherein the first cluster center D is the first cluster center corresponding to each first sub-cluster of each of the K first clusters;
for each of the K first clusters, determining a first quantity of first cluster centers D within that first cluster whose fourth similarity is greater than a first threshold;
determining, as the first cluster A, the first cluster among the K first clusters that has the largest first quantity; and
merging the isolated image data with the first cluster A.
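
As a non-limiting illustration, the selection of the merge target by counting sub-centers above a threshold might be sketched as follows. Cosine similarity, the function name `pick_cluster_by_votes`, and the handling of ties and of the case in which no sub-center exceeds the threshold (here `None` is returned) are assumptions made for this example and are not specified in this item.

```python
from typing import Dict, List, Optional, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b)


def pick_cluster_by_votes(
    query: Vector,
    candidates: Dict[str, List[Vector]],
    threshold: float,
) -> Optional[str]:
    """For each candidate cluster, count its sub-centers whose similarity to
    the query exceeds the threshold, and return the cluster with the most."""
    votes = {
        cid: sum(1 for center in centers if cosine(query, center) > threshold)
        for cid, centers in candidates.items()
    }
    if not votes:
        return None
    best = max(votes, key=lambda cid: votes[cid])
    # Assumption: when no sub-center exceeds the threshold, nothing is merged.
    return best if votes[best] > 0 else None


# Hypothetical sub-centers of the K = 2 candidate first clusters.
candidates = {
    "a": [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]],
    "b": [[0.8, 0.6], [0.0, 1.0]],
}
target = pick_cluster_by_votes([1.0, 0.05], candidates, threshold=0.9)
```

Counting sub-centers rather than comparing against one main center means that a cluster is chosen only when several of its appearance modes agree with the new data.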
(Item 6)
The method for incremental clustering of images according to item 3, wherein merging the second cluster with the first cluster B using the first cluster center comprises:
dividing the second cluster into N second sub-clusters and obtaining a fourth cluster center corresponding to each of the N second sub-clusters, wherein N is an integer greater than or equal to 1;
obtaining a fifth similarity between the fourth cluster center and a first cluster center E, wherein the first cluster center E is the first cluster center corresponding to each first sub-cluster of each of the K first clusters;
for each of the K first clusters, determining a second quantity of first cluster centers E within that first cluster whose fifth similarity is greater than a second threshold;
determining, as the first cluster B, the first cluster among the K first clusters that has the largest second quantity; and
merging the second cluster with the first cluster B.
(Item 7)
The method for incremental clustering of images according to item 3, wherein merging the single piece of image data with the first cluster C using the first cluster center comprises:
obtaining a sixth similarity between the single piece of image data and a first cluster center F, wherein the first cluster center F is the first cluster center corresponding to each first sub-cluster of each of the K first clusters;
for each of the K first clusters, determining a third quantity of first cluster centers F within that first cluster whose sixth similarity is greater than a third threshold;
determining, as the first cluster C, the first cluster among the K first clusters that has the largest third quantity; and
merging the single piece of image data with the first cluster C.
(Item 8)
The method for incremental clustering of images according to any one of items 1 to 7, wherein M is less than or equal to a fourth threshold, and the method further comprises, after merging the second image data set with the first cluster using the first cluster center:
dividing the merged first cluster into R third sub-clusters and obtaining a fifth cluster center of each of the R third sub-clusters, wherein R is an integer greater than or equal to 1;
if R is less than or equal to the fourth threshold, retaining the R third sub-clusters and updating the first cluster center using the fifth cluster centers corresponding to the R third sub-clusters;
if R is greater than the fourth threshold, obtaining a fourth quantity of image data in each of the R third sub-clusters; and
sorting the R third sub-clusters in descending order of the fourth quantity to obtain a fourth cluster sequence, selecting the first P third sub-clusters in the fourth cluster sequence, and updating the first cluster center using the fifth cluster centers corresponding to the P third sub-clusters, wherein P is less than or equal to the fourth threshold.
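
As a non-limiting illustration, the update of the first cluster centers after a merge might be sketched as follows. Taking P equal to the fourth threshold (`max_centers`), using the arithmetic mean as the fifth cluster center, and the names `mean` and `updated_sub_centers` are assumptions made for this example.

```python
from typing import List, Sequence

Vector = Sequence[float]


def mean(points: List[Vector]) -> List[float]:
    """Component-wise arithmetic mean of equally sized vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def updated_sub_centers(sub_clusters: List[List[Vector]], max_centers: int) -> List[List[float]]:
    """Keep every sub-cluster when R <= max_centers; otherwise keep only the
    max_centers sub-clusters holding the most image data. Return one center
    per kept sub-cluster."""
    if len(sub_clusters) > max_centers:
        # Largest sub-clusters first, then truncate to the allowed number.
        sub_clusters = sorted(sub_clusters, key=len, reverse=True)[:max_centers]
    return [mean(points) for points in sub_clusters]


# Hypothetical third sub-clusters (R = 3) of a merged first cluster.
third_sub_clusters = [
    [[1.0, 0.0], [0.8, 0.2], [0.9, 0.1]],  # 3 images
    [[0.0, 1.0]],                          # 1 image
    [[0.5, 0.5], [0.7, 0.3]],              # 2 images
]
new_centers = updated_sub_centers(third_sub_clusters, max_centers=2)
```

Bounding the number of retained centers keeps the cost of later similarity comparisons per cluster from growing with the amount of merged data.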
(Item 9)
The method for incremental clustering of images according to any one of items 1 to 7, wherein the first cluster is obtained by clustering the image data in the first image data set, and dividing the first cluster into M first sub-clusters comprises:
obtaining a seventh similarity between the image data in the first cluster to obtain a similarity matrix; and
dividing the first cluster into the M first sub-clusters based on the similarity matrix.
(Item 10)
The method for incremental clustering of images according to item 9, wherein dividing the first cluster into the M first sub-clusters based on the similarity matrix comprises:
obtaining a connected graph whose vertices are the image data in the first cluster;
looking up, in the similarity matrix, the seventh similarity between vertices of the connected graph; and
grouping a plurality of vertices whose seventh similarity is greater than a fifth threshold into one first sub-cluster, thereby obtaining the M first sub-clusters.
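
As a non-limiting illustration, the division of a first cluster into sub-clusters from a similarity matrix might be sketched as follows. Interpreting the grouping as the connected components of the graph that keeps only edges whose similarity exceeds the threshold is an assumption made for this example, as is the name `split_into_sub_clusters`.

```python
from typing import List


def split_into_sub_clusters(sim: List[List[float]], threshold: float) -> List[List[int]]:
    """Group vertex indices into connected components of the graph whose
    edges are the pairs with similarity greater than the threshold."""
    n = len(sim)
    seen = [False] * n
    groups: List[List[int]] = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        stack = [start]
        group: List[int] = []
        # Depth-first traversal over edges above the threshold.
        while stack:
            v = stack.pop()
            group.append(v)
            for u in range(n):
                if not seen[u] and sim[v][u] > threshold:
                    seen[u] = True
                    stack.append(u)
        groups.append(sorted(group))
    return groups


# Hypothetical symmetric similarity matrix for four images in one first cluster.
sim = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
sub_clusters = split_into_sub_clusters(sim, threshold=0.7)
```

Each resulting index group corresponds to one first sub-cluster, and M is the number of groups.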
(Item 11)
An apparatus for incremental clustering of images, comprising:
a first acquisition module configured to obtain a first cluster of a first image data set;
a first division module configured to divide the first cluster into M first sub-clusters and obtain a first cluster center corresponding to each of the M first sub-clusters, wherein M is an integer greater than or equal to 1; and
a merge module configured to obtain a second image data set and merge the second image data set with the first cluster using the first cluster center.
(Item 12)
The apparatus for incremental clustering of images according to item 11, wherein the first cluster comprises a first cluster A, a first cluster B, and a first cluster C, and the merge module comprises:
a clustering sub-module configured to, if the second image data set contains a plurality of image data, cluster the plurality of image data to obtain isolated image data and a second cluster;
a first merge sub-module configured to merge the isolated image data into the first cluster A using the first cluster center;
a second merge sub-module configured to merge the second cluster with the first cluster B using the first cluster center; and
a third merge sub-module configured to, if the second image data set contains only a single piece of image data, merge the single piece of image data into the first cluster C using the first cluster center.
(Item 13)
The apparatus for incremental clustering of images according to item 12, wherein the first cluster has a corresponding second cluster center, and the merge module further comprises:
a first determining sub-module configured to determine K first clusters from among the first clusters using the second cluster center.
(Item 14)
The apparatus for incremental clustering of images according to item 13, wherein the second cluster has a corresponding third cluster center, and the first determining sub-module comprises:
a first obtaining unit configured to obtain a first similarity between the isolated image data and the second cluster center;
a first sorting unit configured to sort the first clusters in descending order of the first similarity to obtain a first cluster sequence, and to select the first K first clusters in the first cluster sequence;
a second obtaining unit configured to obtain a second similarity between the third cluster center and the second cluster center; and
a second sorting unit configured to sort the first clusters in descending order of the second similarity to obtain a second cluster sequence, and to select the first K first clusters in the second cluster sequence;
or comprises:
a third obtaining unit configured to obtain a third similarity between the single piece of image data and the second cluster center; and
a third sorting unit configured to sort the first clusters in descending order of the third similarity to obtain a third cluster sequence, and to select the first K first clusters in the third cluster sequence.
(Item 15)
The apparatus for incremental clustering of images according to item 13, wherein the first merge sub-module comprises:
a fourth obtaining unit configured to obtain a fourth similarity between the isolated image data and a first cluster center D, wherein the first cluster center D is the first cluster center corresponding to each first sub-cluster of each of the K first clusters;
a first determining unit configured to determine, for each of the K first clusters, a first quantity of first cluster centers D within that first cluster whose fourth similarity is greater than a first threshold;
a second determining unit configured to determine, as the first cluster A, the first cluster among the K first clusters that has the largest first quantity; and
a first merging unit configured to merge the isolated image data with the first cluster A.
(Item 16)
The apparatus for incremental clustering of images according to item 13, wherein the second merge sub-module comprises:
a first division unit configured to divide the second cluster into N second sub-clusters and obtain a fourth cluster center corresponding to each of the N second sub-clusters, wherein N is an integer greater than or equal to 1;
a fifth obtaining unit configured to obtain a fifth similarity between the fourth cluster center and a first cluster center E, wherein the first cluster center E is the first cluster center corresponding to each first sub-cluster of each of the K first clusters;
a third determining unit configured to determine, for each of the K first clusters, a second quantity of first cluster centers E within that first cluster whose fifth similarity is greater than a second threshold;
a fourth determining unit configured to determine, as the first cluster B, the first cluster among the K first clusters that has the largest second quantity; and
a second merging unit configured to merge the second cluster with the first cluster B.
(Item 17)
The apparatus for incremental clustering of images according to item 13, wherein the third merge sub-module comprises:
a sixth obtaining unit configured to obtain a sixth similarity between the single piece of image data and a first cluster center F, wherein the first cluster center F is the first cluster center corresponding to each first sub-cluster of each of the K first clusters;
a fifth determining unit configured to determine, for each of the K first clusters, a third quantity of first cluster centers F within that first cluster whose sixth similarity is greater than a third threshold;
a sixth determining unit configured to determine, as the first cluster C, the first cluster among the K first clusters that has the largest third quantity; and
a third merging unit configured to merge the single piece of image data with the first cluster C.
(Item 18)
The apparatus for incremental clustering of images according to any one of items 11 to 17, wherein M is less than or equal to a fourth threshold, and the apparatus further comprises:
a second division module configured to divide the merged first cluster into R third sub-clusters and obtain a fifth cluster center of each of the R third sub-clusters, wherein R is an integer greater than or equal to 1;
a first update module configured to, if R is less than or equal to the fourth threshold, retain the R third sub-clusters and update the first cluster center using the fifth cluster centers corresponding to the R third sub-clusters;
a second acquisition module configured to, if R is greater than the fourth threshold, obtain a fourth quantity of image data in each of the R third sub-clusters; and
a second update module configured to sort the R third sub-clusters in descending order of the fourth quantity to obtain a fourth cluster sequence, select the first P third sub-clusters in the fourth cluster sequence, and update the first cluster center using the fifth cluster centers corresponding to the P third sub-clusters, wherein P is less than or equal to the fourth threshold.
(Item 19)
The apparatus for incremental clustering of images according to any one of items 11 to 17, wherein the first cluster is obtained by clustering the image data in the first image data set, and the first division module comprises:
an acquisition sub-module configured to obtain a seventh similarity between the image data in the first cluster to obtain a similarity matrix; and
a division sub-module configured to divide the first cluster into the M first sub-clusters based on the similarity matrix.
(Item 20)
The apparatus for incremental clustering of images according to item 19, wherein the division sub-module comprises:
a seventh obtaining unit configured to obtain a connected graph whose vertices are the image data in the first cluster;
a search unit configured to look up, in the similarity matrix, the seventh similarity between vertices of the connected graph; and
a second division unit configured to group a plurality of vertices whose seventh similarity is greater than a fifth threshold into one first sub-cluster, thereby obtaining the M first sub-clusters.
(Item 21)
An electronic device comprising an input device and an output device, and further comprising:
a processor suitable for implementing one or more instructions; and
a computer storage medium storing one or more instructions suitable for being loaded by the processor to perform the method of any one of items 1 to 10.
(Item 22)
A computer storage medium storing one or more instructions suitable for being loaded by a processor to perform the method of any one of items 1 to 10.
(Item 23)
A computer program product comprising one or more instructions suitable to be loaded by a processor to carry out the method of any one of items 1 to 10.

Claims

A method for incremental clustering of images, comprising:
obtaining a first cluster of a first image data set;
dividing the first cluster into M first sub-clusters and obtaining a first cluster center corresponding to each of the M first sub-clusters, wherein M is an integer greater than or equal to 1; and
obtaining a second image data set and merging the second image data set with the first cluster using the first cluster center.

The method for incremental clustering of images according to claim 1, wherein the first cluster comprises a first cluster A, a first cluster B, and a first cluster C, and merging the second image data set with the first cluster using the first cluster center comprises:
if the second image data set contains a plurality of image data, clustering the plurality of image data to obtain isolated image data and a second cluster;
merging the isolated image data into the first cluster A using the first cluster center, and merging the second cluster with the first cluster B using the first cluster center; and
if the second image data set contains only a single piece of image data, merging the single piece of image data into the first cluster C using the first cluster center.

The method for incremental clustering of images according to claim 2, wherein the first cluster has a corresponding second cluster center, and the method further comprises, before merging the second image data set with the first cluster using the first cluster center:
determining K first clusters from among the first clusters using the second cluster center.

The method for incremental clustering of images according to claim 3, wherein the second cluster has a corresponding third cluster center, and determining K first clusters from among the first clusters using the second cluster center comprises:
obtaining a first similarity between the isolated image data and the second cluster center, sorting the first clusters in descending order of the first similarity to obtain a first cluster sequence, and selecting the first K first clusters in the first cluster sequence; and
obtaining a second similarity between the third cluster center and the second cluster center, sorting the first clusters in descending order of the second similarity to obtain a second cluster sequence, and selecting the first K first clusters in the second cluster sequence;
or comprises:
obtaining a third similarity between the single piece of image data and the second cluster center, sorting the first clusters in descending order of the third similarity to obtain a third cluster sequence, and selecting the first K first clusters in the third cluster sequence.

merging the isolated image data and the first cluster A using the first cluster center,
obtaining a fourth similarity between the isolated image data and a first cluster center D, wherein the first cluster center D is the first being the first cluster center corresponding to a sub-cluster;
For each first cluster of the K first clusters, determine a first quantity of the first cluster centers D within each first cluster for which the fourth similarity is greater than a first threshold. and
determining a first cluster among the K first clusters having the largest first quantity as the first cluster A;
4. The method of incremental clustering of images of claim 3, comprising merging the isolated image data and the first cluster A.

The method for incremental clustering of images according to claim 3, wherein merging the second cluster and the first cluster B using the first cluster center comprises:
dividing the second cluster into N second sub-clusters and obtaining a fourth cluster center corresponding to each second sub-cluster of the N second sub-clusters, wherein N is an integer greater than or equal to 1;
obtaining a fifth similarity between the fourth cluster center and a first cluster center E, wherein the first cluster center E is the first cluster center corresponding to each first sub-cluster of each first cluster among the K first clusters;
determining, for each first cluster of the K first clusters, a second quantity of the first cluster centers E within that first cluster for which the fifth similarity is greater than a second threshold;
determining the first cluster among the K first clusters having the largest second quantity as the first cluster B; and
merging the second cluster and the first cluster B.
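
When a whole second cluster is merged, the comparison becomes many-to-many: N fourth cluster centers against the first cluster centers E of each candidate. The sketch below shows one possible reading, in which a center E is counted once if it exceeds the threshold against at least one fourth cluster center. That counting rule, the cosine metric, the `None` result when nothing matches and all names are assumptions; the claim text does not settle them.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def choose_cluster_for_new_cluster(fourth_centers, candidates, threshold):
    """Pick the first cluster B for a second cluster.

    fourth_centers: centers of the N second sub-clusters of the second cluster.
    candidates: {cluster_id: [first cluster centers E]} for the K first clusters.
    """
    best_id, best_count = None, 0
    for cluster_id, centers_e in candidates.items():
        # "Second quantity" (assumed reading): centers E matched by any fourth center.
        count = sum(
            1 for e in centers_e
            if any(cosine_similarity(f, e) > threshold for f in fourth_centers)
        )
        if count > best_count:
            best_id, best_count = cluster_id, count
    return best_id


new_cluster_centers = [[1.0, 0.0], [0.0, 1.0]]
candidates = {
    "x": [[0.95, 0.05], [0.05, 0.95]],
    "y": [[0.99, 0.01], [-1.0, 0.0]],
}
print(choose_cluster_for_new_cluster(new_cluster_centers, candidates, threshold=0.9))
```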

The method for incremental clustering of images according to claim 3, wherein merging the single image data and the first cluster C using the first cluster center comprises:
obtaining a sixth similarity between the single image data and a first cluster center F, wherein the first cluster center F is the first cluster center corresponding to each first sub-cluster of each first cluster among the K first clusters;
determining, for each first cluster of the K first clusters, a third quantity of the first cluster centers F within that first cluster for which the sixth similarity is greater than a third threshold;
determining the first cluster among the K first clusters having the largest third quantity as the first cluster C; and
merging the single image data and the first cluster C.

The method for incremental clustering of images according to any one of claims 1 to 7, wherein M is less than or equal to a fourth threshold, and after merging the second image data set and the first cluster using the first cluster center, the method further comprises:
dividing the merged first cluster into R third sub-clusters and obtaining a fifth cluster center of each third sub-cluster of the R third sub-clusters, wherein R is an integer greater than or equal to 1;
if R is less than or equal to the fourth threshold, retaining the R third sub-clusters and updating the first cluster center using the fifth cluster centers corresponding to the R third sub-clusters; and
if R is greater than the fourth threshold, obtaining a fourth quantity of image data in each third sub-cluster of the R third sub-clusters, sorting the R third sub-clusters in ascending order according to the fourth quantity to obtain a fourth cluster sequence, selecting P third sub-clusters from the top of the fourth cluster sequence, and updating the first cluster center using the fifth cluster centers corresponding to the P third sub-clusters, wherein P is less than or equal to the fourth threshold.
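
The center update after a merge, which caps the number of stored centers per first cluster at the fourth threshold, can be sketched as follows. The function name, the `(center, members)` pair layout and the default of P equal to the fourth threshold are assumptions made for illustration. The sort direction follows the claim wording as translated (ascending by image count).

```python
def update_first_cluster_centers(third_sub_clusters, fourth_threshold, p=None):
    """Return the centers to keep as the updated first cluster centers.

    third_sub_clusters: list of (fifth_cluster_center, member_images) pairs for
        the R third sub-clusters of the merged first cluster (hypothetical layout).
    fourth_threshold: upper bound on the number of centers kept per first cluster.
    p: number of sub-clusters to keep when R exceeds the bound; defaults to the
        bound itself (assumption; the claim only requires p <= fourth_threshold).
    """
    if p is None:
        p = fourth_threshold
    if p > fourth_threshold:
        raise ValueError("p must be less than or equal to the fourth threshold")

    r = len(third_sub_clusters)
    if r <= fourth_threshold:
        # Few enough sub-clusters: keep all R fifth cluster centers.
        return [center for center, _ in third_sub_clusters]

    # Too many sub-clusters: sort by the "fourth quantity" (image count per
    # sub-cluster) in ascending order, as the claim text reads, and keep P.
    ordered = sorted(third_sub_clusters, key=lambda pair: len(pair[1]))
    return [center for center, _ in ordered[:p]]


subs = [([1.0], ["i1", "i2", "i3"]), ([2.0], ["i4"]), ([3.0], ["i5", "i6"])]
print(update_first_cluster_centers(subs, fourth_threshold=2))
print(update_first_cluster_centers(subs, fourth_threshold=3))
```

Bounding the number of centers keeps the cost of later comparisons against this cluster fixed, regardless of how many images the cluster has absorbed.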

The first cluster is obtained by clustering the image data in the first image data set, and the step of dividing the first cluster into M first sub-clusters comprises:
obtaining a seventh similarity between image data in the first cluster to obtain a similarity matrix;
dividing said first cluster into said M first sub-clusters based on said similarity matrix.

Dividing the first cluster into the M first sub-clusters based on the similarity matrix comprises:
obtaining a connected graph constructed with the image data in the first cluster as vertices;
searching the similarity matrix to obtain the seventh similarity between vertices of the connected graph; and
grouping a plurality of vertices whose seventh similarity is greater than a fifth threshold into one first sub-cluster, to obtain the M first sub-clusters.
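
The graph-based division recited above can be sketched as a connected-components search over the similarity matrix. Reading "vertices whose similarity is greater than the threshold form one sub-cluster" as transitive grouping (connected components) is an interpretation made here, and the function name and matrix layout are hypothetical.

```python
def split_into_sub_clusters(similarity, fifth_threshold):
    """Group image indices into sub-clusters via thresholded connectivity.

    similarity: square matrix (list of lists) where similarity[i][j] is the
        seventh similarity between image i and image j of one first cluster.
    Returns a list of sub-clusters, each a sorted list of image indices.
    """
    n = len(similarity)
    visited = [False] * n
    sub_clusters = []
    for start in range(n):
        if visited[start]:
            continue
        # Depth-first search over edges whose similarity exceeds the threshold.
        visited[start] = True
        stack = [start]
        component = []
        while stack:
            v = stack.pop()
            component.append(v)
            for u in range(n):
                if not visited[u] and similarity[v][u] > fifth_threshold:
                    visited[u] = True
                    stack.append(u)
        sub_clusters.append(sorted(component))
    return sub_clusters


sim = [
    [1.00, 0.90, 0.20, 0.10],
    [0.90, 1.00, 0.30, 0.20],
    [0.20, 0.30, 1.00, 0.85],
    [0.10, 0.20, 0.85, 1.00],
]
print(split_into_sub_clusters(sim, fifth_threshold=0.8))
```

Here M is simply the number of components found, so an image with no sufficiently similar neighbour forms a sub-cluster of its own.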

An apparatus for incremental clustering of images, comprising:
a first acquisition module configured to acquire a first cluster of a first image data set;
a first division module configured to divide the first cluster into M first sub-clusters and obtain a first cluster center corresponding to each first sub-cluster of the M first sub-clusters, wherein M is an integer greater than or equal to 1; and
a merge module configured to obtain a second image data set and merge the second image data set and the first cluster using the first cluster center.

The apparatus for incremental clustering of images according to claim 11, wherein the first cluster includes a first cluster A, a first cluster B and a first cluster C, and the merge module comprises:
a clustering sub-module configured to, if the second image data set includes a plurality of image data, cluster the plurality of image data to obtain isolated image data and a second cluster;
a first merging sub-module configured to merge the isolated image data into the first cluster A using the first cluster center;
a second merging sub-module configured to merge the second cluster and the first cluster B using the first cluster center; and
a third merging sub-module configured to, if only a single image data exists in the second image data set, merge the single image data into the first cluster C using the first cluster center.

The apparatus for incremental clustering of images according to claim 12, wherein the first cluster has a corresponding second cluster center, and the merge module further comprises a first determination sub-module configured to determine K first clusters from among the first clusters using the second cluster center.

The apparatus for incremental clustering of images according to claim 13, wherein the second cluster has a corresponding third cluster center, and the first determination sub-module comprises:
a first obtaining unit configured to obtain a first similarity between the isolated image data and the second cluster center;
a first sorting unit configured to sort the first clusters in descending order based on the first similarity to obtain a first cluster sequence, and select K first clusters from the top of the first cluster sequence;
a second obtaining unit configured to obtain a second similarity between the third cluster center and the second cluster center;
a second sorting unit configured to sort the first clusters in descending order based on the second similarity to obtain a second cluster sequence, and select K first clusters from the top of the second cluster sequence; or
a third obtaining unit configured to obtain a third similarity between the single image data and the second cluster center;
a third sorting unit configured to sort the first clusters in descending order based on the third similarity to obtain a third cluster sequence, and select K first clusters from the top of the third cluster sequence.

The apparatus for incremental clustering of images according to claim 13, wherein the first merging sub-module comprises:
a fourth obtaining unit configured to obtain a fourth similarity between the isolated image data and a first cluster center D, wherein the first cluster center D is the first cluster center corresponding to each first sub-cluster of each first cluster among the K first clusters;
a first determining unit configured to determine, for each first cluster of the K first clusters, a first quantity of the first cluster centers D within that first cluster for which the fourth similarity is greater than a first threshold;
a second determining unit configured to determine the first cluster among the K first clusters having the largest first quantity as the first cluster A; and
a first merging unit configured to merge the isolated image data and the first cluster A.

The apparatus for incremental clustering of images according to claim 13, wherein the second merging sub-module comprises:
a first division unit configured to divide the second cluster into N second sub-clusters and obtain a fourth cluster center corresponding to each second sub-cluster of the N second sub-clusters, wherein N is an integer greater than or equal to 1;
a fifth obtaining unit configured to obtain a fifth similarity between the fourth cluster center and a first cluster center E, wherein the first cluster center E is the first cluster center corresponding to each first sub-cluster of each first cluster among the K first clusters;
a third determining unit configured to determine, for each first cluster of the K first clusters, a second quantity of the first cluster centers E within that first cluster for which the fifth similarity is greater than a second threshold;
a fourth determining unit configured to determine the first cluster among the K first clusters having the largest second quantity as the first cluster B; and
a second merging unit configured to merge the second cluster and the first cluster B.

The apparatus for incremental clustering of images according to claim 13, wherein the third merging sub-module comprises:
a sixth obtaining unit configured to obtain a sixth similarity between the single image data and a first cluster center F, wherein the first cluster center F is the first cluster center corresponding to each first sub-cluster of each first cluster among the K first clusters;
a fifth determining unit configured to determine, for each first cluster of the K first clusters, a third quantity of the first cluster centers F within that first cluster for which the sixth similarity is greater than a third threshold;
a sixth determining unit configured to determine the first cluster among the K first clusters having the largest third quantity as the first cluster C; and
a third merging unit configured to merge the single image data and the first cluster C.

The apparatus for incremental clustering of images according to any one of claims 11 to 17, wherein M is less than or equal to a fourth threshold, and the apparatus further comprises:
a second division module configured to divide the merged first cluster into R third sub-clusters and obtain a fifth cluster center of each third sub-cluster of the R third sub-clusters, wherein R is an integer greater than or equal to 1;
a first update module configured to, if R is less than or equal to the fourth threshold, retain the R third sub-clusters and update the first cluster center using the fifth cluster centers corresponding to the R third sub-clusters;
a second acquisition module configured to, if R is greater than the fourth threshold, acquire a fourth quantity of image data in each third sub-cluster of the R third sub-clusters; and
a second update module configured to sort the R third sub-clusters in ascending order according to the fourth quantity to obtain a fourth cluster sequence, select P third sub-clusters from the top of the fourth cluster sequence, and update the first cluster center using the fifth cluster centers corresponding to the P third sub-clusters, wherein P is less than or equal to the fourth threshold.

The first cluster is obtained by clustering the image data in the first image data set, and the first division module includes:
an acquisition sub-module configured to acquire a seventh similarity between image data in the first cluster to obtain a similarity matrix; and
a division sub-module configured to divide the first cluster into the M first sub-clusters based on the similarity matrix.

The apparatus for incremental clustering of images according to claim 19, wherein the division sub-module comprises:
a seventh obtaining unit configured to obtain a connected graph constructed with the image data in the first cluster as vertices;
a search unit configured to search the similarity matrix to obtain the seventh similarity between vertices of the connected graph; and
a second division unit configured to group a plurality of vertices whose seventh similarity is greater than a fifth threshold into one first sub-cluster, to obtain the M first sub-clusters.

An electronic device comprising an input device and an output device, and further comprising:
a processor suitable for implementing one or more instructions; and
a computer storage medium storing one or more instructions suitable for being loaded by the processor to carry out the method of any one of claims 1 to 10.

A computer storage medium storing one or more instructions suitable for being loaded by a processor to carry out the method of any one of claims 1 to 10.

A computer program product containing one or more instructions suitable to be loaded by a processor to carry out the method of any one of claims 1 to 10.