KR20220038593A

KR20220038593A - Clustering method and apparatus, electronic device and storage medium

Info

Publication number: KR20220038593A
Application number: KR1020217037805A
Authority: KR
Inventors: 캉 왕; 시아오 진; 추이비 후앙; 유안쿠이 피아오; 유헹 첸
Original assignee: 저지앙 센스타임 테크놀로지 디벨롭먼트 컴퍼니 리미티드
Priority date: 2020-09-17
Filing date: 2021-05-25
Publication date: 2022-03-29
Also published as: WO2022057302A1; CN112101238A; JP2022552034A

Abstract

Embodiments of the present invention relate to a clustering method and apparatus, an electronic device, and a storage medium. The clustering method includes: performing quantitative analysis on a first cluster of facial features to determine whether the first cluster meets a preset clustering condition, where the first cluster is obtained by clustering based on a first threshold, and the preset clustering condition indicates that the plurality of facial features included in the first cluster correspond to the same identity; releasing the plurality of facial features included in the first cluster when the first cluster does not meet the preset clustering condition; and clustering the plurality of facial features based on a second threshold to determine a second cluster, where the second threshold is greater than the first threshold. By releasing clusters that do not satisfy the preset clustering condition and re-clustering the released facial features with a higher threshold, embodiments of the present invention can effectively improve cluster accuracy.

Description

Clustering method and apparatus, electronic device and storage medium

Cross-Reference to Related Applications

The present invention is filed on the basis of, and claims priority to, Chinese patent application No. 202010981204.3, filed on September 17, 2020, the entire contents of which are incorporated herein by reference.

The present invention relates to the field of computer technology, and more particularly to a clustering method and apparatus, an electronic device, and a storage medium.

Face clustering is an important research direction in the field of smart video analysis. Face snapshots in smart video carry spatiotemporal information, and performing face clustering on such snapshots to form a file makes it possible to analyze a person's trajectory well. City-level video sources, however, suffer from complex environments, poor lighting conditions, and low resolution, so the accuracy of clustering results cannot be guaranteed.

The present invention provides technical solutions for a clustering method and apparatus, an electronic device, and a storage medium.

According to a first aspect of the present invention, a clustering method is provided, including: performing quantitative analysis on a first cluster of facial features to determine whether the first cluster meets a preset clustering condition, where the first cluster is obtained by clustering based on a first threshold, and the preset clustering condition indicates that the plurality of facial features included in the first cluster all correspond to the same identity; releasing the plurality of facial features included in the first cluster when the first cluster does not meet the preset clustering condition; and clustering the plurality of facial features based on a second threshold to determine a second cluster, where the second threshold is greater than the first threshold.

In one possible implementation, the plurality of facial features includes one class-center facial feature corresponding to the first cluster. Performing quantitative analysis on the first cluster to determine whether the first cluster meets the preset clustering condition includes: determining the cosine distances between the class-center facial feature and the other facial features in the first cluster; determining a mean distance and a standard deviation distance from those cosine distances; and determining that the first cluster does not meet the preset clustering condition when at least one of the following is satisfied: the mean distance is greater than a third threshold, or the standard deviation distance is less than a fourth threshold.

In one possible implementation, after clustering the plurality of facial features based on the second threshold to determine the second cluster, the clustering method further includes: determining N known clusters according to at least one of the second cluster and the first cluster that meets the preset clustering condition, where N≥1; and classifying a facial feature to be clustered according to the N known clusters.

In one possible implementation, after determining the N known clusters according to at least one of the second cluster and the first cluster that meets the preset clustering condition, the clustering method further includes: clustering the class-center facial features corresponding to the N known clusters to obtain a class-center cluster; and merging the known clusters corresponding to the class-center facial features included in the class-center cluster to obtain a merged cluster corresponding to the same identity.
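The merging step described above can be illustrated with a minimal Python sketch. The text does not specify how the class-center facial features are clustered, so the greedy grouping by cosine similarity, the function name `merge_by_centers`, the `known` dictionary layout, and the `threshold` parameter are all assumptions made for illustration only.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def merge_by_centers(known, threshold):
    """Merge known clusters whose class centers group together.

    known: dict mapping a cluster id to (center_vector, member_list).
    threshold: hypothetical minimum cosine similarity between centers.
    Returns a list of merged member lists, one per group of centers.
    """
    groups = []  # each group is a list of cluster ids; the first id is its representative
    for cid, (center, _members) in known.items():
        for group in groups:
            representative_center = known[group[0]][0]
            if cosine_similarity(center, representative_center) >= threshold:
                group.append(cid)
                break
        else:
            groups.append([cid])
    # Concatenate the members of every known cluster in the same group.
    return [[m for cid in group for m in known[cid][1]] for group in groups]
```

In this sketch, two known clusters are merged only when their centers are at least `threshold`-similar to the first center of a group; a real system might instead use the same clustering routine that produced the first and second clusters.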

In one possible implementation, classifying the facial feature to be clustered according to the N known clusters includes: determining whether a target known cluster exists among the N known clusters, where the cosine distance between the facial feature to be clustered and the class-center facial feature corresponding to the target known cluster is less than a fifth threshold; and, when the target known cluster exists among the N known clusters, classifying the facial feature to be clustered into the target known cluster.

In one possible implementation, determining whether a target known cluster exists among the N known clusters includes: using a k-nearest neighbor algorithm to determine, among the class-center facial features corresponding to the N known clusters, the k class-center facial features with the smallest Euclidean distance to the facial feature to be clustered, where N≥k≥1; determining the cosine distance between the facial feature to be clustered and each of the k class-center facial features; and determining, as the target known cluster, the known cluster corresponding to a class-center facial feature whose cosine distance is less than the fifth threshold.
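The two-stage lookup described above (k nearest class centers by Euclidean distance, then a cosine-distance check against the fifth threshold) can be sketched as follows. This is an illustrative sketch, not the claimed implementation: cosine distance is assumed to mean one minus cosine similarity, the names `find_target_cluster` and `centers` are hypothetical, and choosing the closest candidate when several pass the threshold is an assumption the text does not state. A brute-force sort stands in for a real k-nearest-neighbor index.

```python
import math

def cosine_distance(a, b):
    """Assumed definition: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def find_target_cluster(feature, centers, k, fifth_threshold):
    """Return the id of the target known cluster, or None if none qualifies.

    centers: dict mapping a known-cluster id to its class-center vector.
    """
    # Stage 1: k nearest class centers by Euclidean distance (brute force).
    nearest = sorted(centers, key=lambda cid: math.dist(feature, centers[cid]))[:k]
    # Stage 2: keep only the candidates whose cosine distance is below the threshold.
    candidates = [(cosine_distance(feature, centers[cid]), cid) for cid in nearest]
    candidates = [c for c in candidates if c[0] < fifth_threshold]
    if not candidates:
        return None  # the feature stays unclassified
    # Assumption: when several centers qualify, pick the closest one.
    return min(candidates)[1]
```

A `None` result corresponds to the case in which no target known cluster exists and the feature is treated as unclassified.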

In one possible implementation, after classifying the facial feature to be clustered according to the N known clusters, the clustering method further includes: updating the plurality of facial features included in the N known clusters based on the result of classifying the facial feature to be clustered; and, for any one of the known clusters, updating the class-center facial feature corresponding to that known cluster according to the updated plurality of facial features corresponding to that known cluster.

In one possible implementation, after determining whether a target known cluster exists among the N known clusters, the clustering method further includes: when the target known cluster does not exist among the N known clusters, determining the facial feature to be clustered to be an unclassified facial feature; and clustering a plurality of unclassified facial features within a preset duration, using a k-nearest neighbor algorithm and a graph algorithm, to determine a newly added cluster.
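One common way to combine a k-nearest neighbor search with a graph algorithm is to link each feature to its k nearest neighbors and take connected components as clusters. The sketch below follows that pattern; the text does not name the graph algorithm, so connected components (via union-find), the `max_distance` cutoff, and the function name `cluster_unclassified` are assumptions for illustration.

```python
import math

def cosine_distance(a, b):
    """Assumed definition: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def cluster_unclassified(features, k, max_distance):
    """Group feature indices into clusters via a k-NN graph.

    An edge joins i and j when j is among the k nearest neighbors of i
    and their cosine distance is below max_distance (hypothetical cutoff).
    Clusters are the connected components of that graph.
    """
    n = len(features)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        neighbors = sorted(
            (cosine_distance(features[i], features[j]), j)
            for j in range(n) if j != i
        )
        for dist, j in neighbors[:k]:
            if dist < max_distance:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

Each returned list holds the indices of the unclassified features that would form one newly added cluster under these assumptions.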

In one possible implementation, after clustering the plurality of unclassified facial features within the preset duration, using the k-nearest neighbor algorithm and the graph algorithm, to determine the newly added cluster, the clustering method further includes: updating the N known clusters according to the newly added cluster.

According to one aspect of the present invention, a clustering apparatus is provided, including: a quantitative analysis module, configured to perform quantitative analysis on a first cluster of facial features to determine whether the first cluster meets a preset clustering condition, where the first cluster is obtained by clustering based on a first threshold, and the preset clustering condition indicates that the plurality of facial features included in the first cluster correspond to the same identity; a release module, configured to release the plurality of facial features included in the first cluster when the first cluster does not meet the preset clustering condition; and a clustering module, configured to cluster the plurality of facial features based on a second threshold to determine a second cluster, where the second threshold is greater than the first threshold.

In one possible implementation, the plurality of facial features includes one class-center facial feature corresponding to the first cluster. The quantitative analysis module includes: a first determining submodule, configured to determine the cosine distances between the class-center facial feature and the other facial features in the first cluster; a second determining submodule, configured to determine a mean distance and a standard deviation distance from those cosine distances; and a third determining submodule, configured to determine that the first cluster does not meet the preset clustering condition when at least one of the following is satisfied: the mean distance is greater than a third threshold, or the standard deviation distance is less than a fourth threshold.

In one possible implementation, the clustering apparatus further includes: a first determining module, configured to determine N known clusters according to at least one of the second cluster and the first cluster that meets the preset clustering condition, where N≥1; and a classification module, configured to classify a facial feature to be clustered according to the N known clusters.

In one possible implementation, the clustering module is further configured to cluster the class-center facial features corresponding to the N known clusters to obtain a class-center cluster; and the clustering apparatus further includes a merging module, configured to merge the known clusters corresponding to the class-center facial features included in the class-center cluster to obtain a merged cluster corresponding to the same identity.

In one possible implementation, the classification module includes: a fourth determining submodule, configured to determine whether a target known cluster exists among the N known clusters, where the cosine distance between the facial feature to be clustered and the class-center facial feature corresponding to the target known cluster is less than a fifth threshold; and a classification submodule, configured to classify the facial feature to be clustered into the target known cluster when the target known cluster exists among the N known clusters.

In one possible implementation, the fourth determining submodule includes: a first determining unit, configured to use a k-nearest neighbor algorithm to determine, among the class-center facial features corresponding to the N known clusters, the k class-center facial features with the smallest Euclidean distance to the facial feature to be clustered, where N≥k≥1; a second determining unit, configured to determine the cosine distance between the facial feature to be clustered and each of the k class-center facial features; and a third determining unit, configured to determine, as the target known cluster, the known cluster corresponding to a class-center facial feature whose cosine distance is less than the fifth threshold.

In one possible implementation, the clustering apparatus further includes: a first update module, configured to update the plurality of facial features included in the N known clusters based on the result of classifying the facial feature to be clustered; and a second update module, configured to update, for any one of the known clusters, the class-center facial feature corresponding to that known cluster according to the updated plurality of facial features corresponding to that known cluster.

In one possible implementation, the clustering apparatus further includes a second determining module, configured to determine the facial feature to be clustered to be an unclassified facial feature when the target known cluster does not exist among the N known clusters; and the clustering module is further configured to cluster a plurality of unclassified facial features within a preset duration, using a k-nearest neighbor algorithm and a graph algorithm, to determine a newly added cluster.

In one possible implementation, the clustering apparatus further includes a third update module, configured to update the N known clusters according to the newly added cluster.

According to one aspect of the present invention, an electronic device is provided, including a processor and a memory configured to store instructions executable by the processor, where the processor is configured to call the instructions stored in the memory to perform the clustering method.

According to one aspect of the present invention, a computer-readable storage medium storing computer program instructions is provided, where the computer program instructions, when executed by a processor, cause the processor to implement the clustering method.

The present invention provides a computer program including computer-readable code, where the computer-readable code, when run on an electronic device, causes a processor of the electronic device to implement the clustering method according to any one of the above.

In embodiments of the present invention, quantitative analysis is performed on a first cluster of facial features obtained by clustering based on a first threshold, to determine whether the first cluster meets a preset clustering condition, where the preset clustering condition indicates that the plurality of facial features included in the first cluster all correspond to the same identity. When the first cluster does not meet the preset clustering condition, the plurality of facial features included in the first cluster are released, and the released facial features are clustered based on a second threshold greater than the first threshold to determine a second cluster. Releasing clusters that do not satisfy the preset clustering condition and re-clustering the released facial features with a higher threshold can effectively improve cluster accuracy.

The general description above and the detailed description below are exemplary and explanatory only and do not limit the present invention. Other features and aspects of the present invention will become apparent from the following detailed description of exemplary embodiments with reference to the drawings.

The drawings herein are incorporated in and constitute a part of this specification; they show embodiments consistent with the present invention and, together with the specification, serve to explain the technical solutions of the present invention.
FIG. 1A is a schematic diagram of a network architecture for a clustering method according to an embodiment of the present invention.
FIG. 1B is a flowchart of a clustering method according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of a clustering method according to an embodiment of the present invention.
FIG. 3 is a block diagram of a clustering apparatus according to an embodiment of the present invention.
FIG. 4 is a schematic structural diagram of a computer device according to an embodiment of the present invention.

Various exemplary embodiments, features, and aspects of the present invention are described in detail below with reference to the drawings. In the drawings, the same reference numerals denote elements with the same or similar functions. Although the drawings show various aspects of the embodiments, they are not necessarily drawn to scale unless otherwise indicated.

The word "exemplary" as used herein means "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred over or better than other embodiments.

The term "and/or" herein describes an association between associated objects and indicates that three relationships are possible. For example, "A and/or B" may mean that A exists alone, that A and B both exist, or that B exists alone. In addition, the term "at least one" herein means any one of several items or any combination of at least two of them. For example, "at least one of A, B, and C" may mean any one or more elements selected from the set consisting of A, B, and C.

In addition, numerous specific details are given in the following detailed description in order to better explain the embodiments of the present invention. Those skilled in the art should understand that the embodiments of the present invention can also be practiced without some of these specific details. In some instances, methods, means, elements, and circuits well known to those skilled in the art are not described in detail, so as to keep the gist of the embodiments of the present invention clear.

In embodiments of the present invention, clustering of a plurality of facial features can be implemented through the network architecture shown in FIG. 1A, which is a schematic diagram of a network architecture for a clustering method according to an embodiment of the present invention. The network architecture includes a user terminal 101, a network 102, and a facial feature clustering terminal 103. To support one exemplary application, a communication connection is established between the user terminal 101 and the facial feature clustering terminal 103 through the network 102. When the user terminal 101 needs to cluster a plurality of acquired facial features, it first sends a first cluster of facial features to the facial feature clustering terminal 103 through the network 102. The facial feature clustering terminal 103 then performs quantitative analysis on the first cluster to determine whether the first cluster meets the preset clustering condition. If the first cluster does not meet the preset clustering condition, the plurality of facial features included in the first cluster are released. Finally, the plurality of facial features are re-clustered using a second threshold, which is a higher threshold, to obtain a second cluster. In this way, releasing clusters that do not satisfy the preset clustering condition and re-clustering the released facial features with a higher threshold can effectively improve cluster accuracy.

FIG. 1B is a flowchart of a clustering method according to an embodiment of the present invention. The clustering method may be performed by an electronic device such as a terminal device or a server. The terminal device may be user equipment (UE), a mobile device, a user terminal, a terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, an in-vehicle device, a wearable device, or the like. The clustering method may be implemented by a processor calling computer-readable instructions stored in a memory. Alternatively, the clustering method may be performed by a server. As shown in FIG. 1B, the clustering method may include the following steps.

In step S11, quantitative analysis is performed on a first cluster of facial features to determine whether the first cluster meets a preset clustering condition. The first cluster is obtained by clustering based on a first threshold, and the preset clustering condition indicates that the plurality of facial features included in the first cluster correspond to the same identity.

In step S12, when the first cluster does not meet the preset clustering condition, the plurality of facial features included in the first cluster are released.

In step S13, the plurality of facial features are clustered based on a second threshold to determine a second cluster, where the second threshold is greater than the first threshold.

Releasing clusters that do not satisfy the preset clustering condition and re-clustering the released facial features with a higher threshold can effectively improve cluster accuracy.
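Steps S11 to S13 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the text does not specify the clustering algorithm, so a greedy single-pass scheme is used here, and the thresholds are assumed to be minimum cosine similarities (which is consistent with a larger second threshold being stricter). The names `threshold_cluster`, `refine`, and the caller-supplied predicate `meets_condition` are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def threshold_cluster(features, threshold):
    """Greedy clustering: a feature joins the first cluster whose first
    member is at least `threshold`-similar; otherwise it starts a new cluster."""
    clusters = []
    for feature in features:
        for cluster in clusters:
            if cosine_similarity(feature, cluster[0]) >= threshold:
                cluster.append(feature)
                break
        else:
            clusters.append([feature])
    return clusters

def refine(first_clusters, meets_condition, second_threshold):
    """S11: test each first cluster; S12: release the failing ones;
    S13: re-cluster the released features with the stricter threshold."""
    kept, second_clusters = [], []
    for cluster in first_clusters:
        if meets_condition(cluster):          # S11
            kept.append(cluster)
        else:
            released = list(cluster)          # S12
            second_clusters.extend(
                threshold_cluster(released, second_threshold)  # S13
            )
    return kept, second_clusters
```

For example, two features with cosine similarity 0.8 end up in one first cluster at a first threshold of 0.7 and are split into two second clusters at a second threshold of 0.9, provided `meets_condition` rejects the first cluster.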

Because video sources suffer from complex environments, poor lighting conditions, and low resolution, facial features obtained by image processing of face snapshots from such sources may be highly similar even when the snapshots belong to different identities. As a result, after clustering by facial feature similarity, facial features of different identities may be placed in the same cluster, which degrades clustering accuracy. Quantitative analysis therefore needs to be performed on the clusters that already exist in the cluster database in order to ensure clustering accuracy.

In one possible implementation, the plurality of facial features includes one class-center facial feature corresponding to the first cluster. Performing quantitative analysis on the first cluster to determine whether the first cluster meets the preset clustering condition includes: determining the cosine distances between the class-center facial feature and the other facial features in the first cluster; determining a mean distance and a standard deviation distance from those cosine distances; and determining that the first cluster does not meet the preset clustering condition when at least one of the following is satisfied: the mean distance is greater than a third threshold, or the standard deviation distance is less than a fourth threshold.

The first cluster is a cluster that already exists in the cluster database. Performing quantitative analysis on the first cluster to determine whether it meets the preset clustering condition amounts to determining whether the plurality of facial features included in the first cluster correspond to the same identity (for example, the same person).

The cosine distance between the class-center facial feature of the first cluster and each of the other facial features in the first cluster is determined and recorded as S_i. The average distance is then calculated as the mean of these cosine distances over i = 1, ..., n, where n is the number of facial features in the first cluster other than the class-center facial feature, and the standard deviation distance is calculated as the standard deviation of these cosine distances.
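Written out, the two statistics could take the following form. This is a sketch under stated assumptions: the symbols \mu and \sigma are chosen here for illustration, and the population (1/n) form of the standard deviation is assumed, since the text only names the quantities.

```latex
\mu = \frac{1}{n}\sum_{i=1}^{n} S_i,
\qquad
\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(S_i - \mu\right)^{2}}
```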

When the first cluster contains a large number of facial features, computational efficiency can be improved by first sampling the facial features in the first cluster other than the class-center facial feature, and then calculating the average distance and the standard deviation distance from the cosine distances between the class-center facial feature and the sampled facial features. The sampling condition may be chosen according to the actual situation, and the present invention does not specifically limit it. For example, because the facial features are obtained from face snapshots collected from a video source, the face snapshots carry spatiotemporal information, and so do the facial features. The facial features in the first cluster other than the class-center facial feature can therefore be sampled according to their spatiotemporal information.
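One way such spatiotemporal sampling might look is sketched below. The text leaves the sampling condition open, so everything specific here is an assumption made for illustration: the `(vector, camera_id, timestamp_s)` record layout, the one-hour bucket size, and the per-bucket cap are all hypothetical.

```python
import random

def sample_by_spacetime(features, max_per_bucket, bucket_seconds=3600, seed=0):
    """Keep at most `max_per_bucket` features per (camera, time bucket).

    `features` is a list of (vector, camera_id, timestamp_s) tuples; this
    record layout and the bucketing rule are illustrative assumptions.
    """
    rng = random.Random(seed)
    buckets = {}
    for item in features:
        _, camera_id, timestamp_s = item
        key = (camera_id, int(timestamp_s // bucket_seconds))
        buckets.setdefault(key, []).append(item)

    sampled = []
    for key in sorted(buckets):
        members = buckets[key]
        if len(members) > max_per_bucket:
            # Randomly thin out over-represented camera/time buckets.
            members = rng.sample(members, max_per_bucket)
        sampled.extend(members)
    return sampled
```

The sampled features would then be used in place of the full cluster when computing the distances to the class-center facial feature.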

After quantitative analysis has been performed on the first cluster and the average distance and standard deviation distance have been obtained, whether the first cluster meets the preset clustering condition is judged from these two values. The smaller the average distance and the larger the standard deviation distance, the higher the similarity among the facial features in the first cluster; that is, the facial features included in the first cluster all correspond to the same identity, the first cluster meets the preset clustering condition, and its clustering accuracy is high. Conversely, the larger the average distance and the smaller the standard deviation distance, the lower the clustering accuracy of the first cluster.

Accordingly, when at least one of the following holds, namely the average distance is greater than the third threshold or the standard deviation distance is less than the fourth threshold, the first cluster is determined not to meet the preset clustering condition, and its clustering accuracy is considered low.
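The check described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, cosine distance is assumed to mean 1 minus cosine similarity, and the population standard deviation is assumed.

```python
import math
import statistics

def cosine_distance(a, b):
    """Cosine distance, assumed here to be 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def meets_clustering_condition(center, others, third_threshold, fourth_threshold):
    """Return False when the cluster should be released.

    `center` is the class-center facial feature and `others` are the
    remaining (or sampled) facial features of the cluster.
    """
    distances = [cosine_distance(center, feature) for feature in others]
    average_distance = statistics.fmean(distances)
    std_distance = statistics.pstdev(distances)  # population form (assumption)
    # Failing either test is enough to reject the cluster.
    fails = average_distance > third_threshold or std_distance < fourth_threshold
    return not fails
```

For example, with `center = [1, 0]` and `others = [[1, 0], [0, 1]]` the distances are 0 and 1, so both the average distance and the standard deviation distance are 0.5.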

To improve clustering accuracy, the plurality of facial features in a first cluster that does not meet the preset clustering condition is released. If the first cluster was obtained by clustering with the first threshold (that is, the similarity between the facial features included in the first cluster is greater than the first threshold), the released facial features are clustered again with a second threshold that is higher than the first threshold to obtain the second cluster (that is, the similarity between the facial features included in the second cluster is greater than the second threshold), which effectively improves clustering accuracy.
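The effect of re-clustering released features with a higher threshold can be illustrated with a small sketch. It assumes a simple rule that is not taken from the source: two features are linked when their cosine similarity exceeds the threshold, and each connected group of linked features forms a cluster. The function names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def cluster_by_threshold(features, threshold):
    """Group feature indices whose pairwise similarity links exceed `threshold`."""
    parent = list(range(len(features)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            if cosine_similarity(features[i], features[j]) > threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(features)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Features released from one first cluster (illustrative values).
released = [[1.0, 0.0], [0.9, 0.1], [0.6, 0.8], [0.5, 0.85]]
first = cluster_by_threshold(released, 0.5)   # lower first threshold: one cluster
second = cluster_by_threshold(released, 0.9)  # higher second threshold: two clusters
```

With the lower threshold all four features fall into one cluster; with the higher threshold they split into two, which is the behavior the re-clustering step relies on.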

FIG. 2 shows a schematic diagram of a clustering method according to an embodiment of the present invention. Cluster release can be run as a scheduled task that periodically performs the quantitative analysis, release, and re-clustering operations described above on the first clusters already existing in the cluster database. As shown in FIG. 2, the task scheduler (201) performs the cluster merge/release task (202): it reads a first cluster from the cluster database (203), performs the quantitative analysis, release, and re-clustering operations on it, and then returns the newly obtained second cluster to the cluster database (203). When performing the cluster merge task, the task scheduler (201) reads the class-center facial features corresponding to the N known clusters in the cluster database (203) and performs the clustering task (204) to obtain class-center clusters. In addition, the task scheduler (201) performs the facial feature classification task (205), reading the facial features to be clustered from the spatiotemporal database (206) and the class-center facial features corresponding to the N known clusters from the cluster database (203).

In one possible implementation, the method further includes: determining N known clusters according to at least one of the second cluster and a first cluster that meets the preset clustering condition, where N≥1; and classifying the facial features to be clustered according to the N known clusters.

After the quantitative analysis has been completed for the first clusters in the cluster database, the cluster database contains at least one of the following: first clusters that meet the preset clustering condition (that is, first clusters that were not released after quantitative analysis) and newly clustered second clusters. These clusters are collectively referred to as the N known clusters in the cluster database. Classifying the facial features to be clustered by means of the N known clusters in the cluster database improves face clustering efficiency.

In one possible implementation, the method further includes: clustering the class-center facial features corresponding to the N known clusters to obtain a class-center cluster; and merging the known clusters corresponding to the class-center facial features included in the class-center cluster.

Video sources suffer from drawbacks such as complex environments, poor lighting, and low resolution. When image processing is performed on face snapshots from such a video source to obtain facial features, the facial features obtained from snapshots of the same identity can therefore turn out to have very low similarity. After clustering by facial feature similarity, facial features of the same identity may then end up in different clusters, which degrades clustering accuracy. Clusters that already exist in the cluster database and correspond to the same identity therefore need to be merged in order to improve clustering accuracy.

The Euclidean distances between the class-center facial features corresponding to the N known clusters are determined, and a k-nearest-neighbor algorithm is used to obtain the k class-center facial features that are closest in Euclidean distance. The cosine distances between these k class-center facial features are then determined. Using these cosine distances, a graph algorithm of a distributed graph processing framework (Spark GraphX) constructs a vertex set and an edge set and obtains the connected subsets, which yields the class-center clusters. The class-center facial features included in one class-center cluster can be determined to correspond to the same identity, so the known clusters corresponding to those class-center facial features are merged. In other words, different clusters holding facial features of the same identity are merged, which effectively improves clustering accuracy.
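The merge step can be sketched on a single machine as follows. This is an illustrative stand-in rather than the Spark GraphX job the text refers to: the connected subsets are found with a plain depth-first search, and the edge rule (link two class centers when their cosine distance is below `max_cosine_distance`) is an assumption, as the text does not state the linking criterion. All names are hypothetical.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """Assumed to be 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def merge_candidate_groups(centers, k, max_cosine_distance):
    """Return groups of class-center indices whose known clusters would be merged."""
    n = len(centers)
    adjacency = {i: set() for i in range(n)}  # vertex set with its edge set
    for i in range(n):
        # k nearest class centers by Euclidean distance (coarse filter).
        nearest = sorted((j for j in range(n) if j != i),
                         key=lambda j: euclidean(centers[i], centers[j]))[:k]
        for j in nearest:
            # Cosine distance decides whether an edge is added (assumed rule).
            if cosine_distance(centers[i], centers[j]) < max_cosine_distance:
                adjacency[i].add(j)
                adjacency[j].add(i)

    seen, groups = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, group = [start], []
        while stack:  # depth-first search over one connected subset
            node = stack.pop()
            group.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        groups.append(sorted(group))
    return groups
```

Each returned group with more than one index corresponds to a class-center cluster whose known clusters would be merged.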

Cluster merging can likewise be run as a scheduled task that periodically performs the class-center facial feature clustering and cluster merging operations described above on the N known clusters in the cluster database. Taking FIG. 2 again as an example, the task scheduler (201) performs the cluster merge task: it reads the class-center facial features corresponding to the N known clusters in the cluster database (203) and performs the clustering task (204) on these class-center facial features to obtain class-center clusters. Then, according to each class-center cluster, the known clusters in the cluster database that correspond to the class-center facial features included in that class-center cluster are merged.

In one possible implementation, classifying the facial features to be clustered according to the N known clusters includes: determining whether a target known cluster exists among the N known clusters, where the cosine distance between the facial feature to be clustered and the class-center facial feature corresponding to the target known cluster is less than a fifth threshold; and, when a target known cluster exists among the N known clusters, classifying the facial feature to be clustered into the target known cluster.

The resident population of a city is fixed, which means that once a smart video snapshot system has been running for some time, the overall set of clusters in the cluster database is relatively stable. In other words, most of the facial features corresponding to the face snapshots generated each day can be classified directly into the N known clusters that already exist in the cluster database. Therefore, instead of performing a clustering operation directly, the facial features to be clustered are first classified according to the N known clusters in the cluster database, which improves the timeliness of clustering.

Taking FIG. 2 again as an example, the spatiotemporal database contains the newly added face snapshots obtained from the video source and the facial features to be clustered that correspond to those snapshots. The task scheduler (201) performs the classification task (205): it reads the facial features to be clustered from the spatiotemporal database (206) and reads the class-center facial features corresponding to the N known clusters from the cluster database (203). If the class-center facial features corresponding to the N known clusters are already present in the video memory of a graphics processing unit (GPU), they do not need to be read from the cluster database; the task scheduler schedules the GPU to read them directly from video memory. Whether a target known cluster exists among the N known clusters is then judged according to the class-center facial features corresponding to the N known clusters.

In one possible implementation, determining whether a target known cluster exists among the N known clusters includes: using a k-nearest-neighbor algorithm to determine, among the class-center facial features corresponding to the N known clusters, the k class-center facial features that are closest in Euclidean distance to the facial feature to be clustered, where N≥k≥1; determining the cosine distance between the facial feature to be clustered and each of the k class-center facial features; and determining the known cluster corresponding to a class-center facial feature whose cosine distance is less than the fifth threshold as the target known cluster.

The Euclidean distance between the facial feature to be clustered and each of the class-center facial features corresponding to the N known clusters is determined, and a k-nearest-neighbor algorithm is used to obtain the k class-center facial features that are closest in Euclidean distance to the facial feature to be clustered. To make the distance result more accurate, the cosine distance between the facial feature to be clustered and each of the k class-center facial features is then determined, and it is judged whether any class-center facial feature has a cosine distance to the facial feature to be clustered that is less than the fifth threshold. If so, the known cluster corresponding to that class-center facial feature is determined as the target known cluster. A cosine distance below the fifth threshold indicates that the target known cluster and the facial feature to be clustered are highly similar and can be placed in the same cluster. Accordingly, once the target known cluster has been determined, the facial feature to be clustered is classified into the target known cluster in the cluster database.
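The two-stage lookup described above (a coarse Euclidean k-nearest-neighbor search followed by a cosine distance check) can be sketched as below. The names are hypothetical, cosine distance is assumed to be 1 minus cosine similarity, and choosing the closest candidate when several pass the fifth threshold is an assumption added here, since the text does not say how ties are resolved.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """Assumed to be 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def find_target_cluster(feature, centers, k, fifth_threshold):
    """Return the id of the target known cluster, or None if there is none.

    `centers` maps a known-cluster id to its class-center facial feature.
    """
    # Stage 1: k nearest class centers by Euclidean distance.
    nearest = sorted(centers, key=lambda cid: euclidean(feature, centers[cid]))[:k]
    # Stage 2: keep only candidates whose cosine distance is below the threshold.
    candidates = [(cosine_distance(feature, centers[cid]), cid) for cid in nearest]
    candidates = [c for c in candidates if c[0] < fifth_threshold]
    # Picking the smallest distance among several matches is an assumption.
    return min(candidates)[1] if candidates else None
```

A return value of `None` corresponds to the case where the feature cannot be classified into any known cluster.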

In one possible implementation, the method further includes: updating the plurality of facial features included in the N known clusters based on the result of classifying the facial features to be clustered; and, for any one known cluster, updating the class-center facial feature corresponding to that known cluster according to its updated plurality of facial features.

Because new facial features to be clustered are continuously added to the spatiotemporal database, the task scheduler performs the classification task continuously, so the N known clusters in the cluster database keep changing. The plurality of facial features included in the N known clusters is therefore updated according to the results of the classification task. For example, a given known cluster gains new facial features after one run of the classification task.

Since the known clusters are continuously updated, the class-center facial feature corresponding to each known cluster is periodically retrained from the updated plurality of facial features of that cluster, so that the class-center facial feature remains accurate.
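The text does not specify how the class-center facial feature is retrained. One hypothetical choice, consistent with the class center being one of the cluster's own facial features, is to pick the medoid: the member with the smallest total cosine distance to all other members. The sketch below illustrates only that assumption.

```python
import math

def cosine_distance(a, b):
    """Assumed to be 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def recompute_center_index(features):
    """Return the index of the medoid of a non-empty list of features."""
    best_index, best_total = 0, float("inf")
    for i, candidate in enumerate(features):
        # Sum of distances from this candidate to every other member.
        total = sum(cosine_distance(candidate, other)
                    for j, other in enumerate(features) if j != i)
        if total < best_total:
            best_index, best_total = i, total
    return best_index
```

This costs a quadratic number of distance computations per cluster, so in practice it would probably be combined with sampling for large clusters.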

In one possible implementation, the method further includes: when no target known cluster exists among the N known clusters, determining the facial feature to be clustered as an unclassified facial feature; and clustering the plurality of unclassified facial features accumulated over a preset duration, using a k-nearest-neighbor algorithm and a graph algorithm, to determine newly added clusters.

When the classification task is performed and no target known cluster exists among the N known clusters, the facial feature to be clustered cannot be classified into any of the N known clusters already in the cluster database, and it is determined as an unclassified facial feature. After a number of unclassified facial features have accumulated within the preset duration, the task scheduler performs a clustering operation on them.
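The accumulate-then-cluster behavior can be sketched with a small buffer. The class name, the use of timestamps in seconds, and the decision to trigger on the arrival of a feature (rather than on a timer owned by the task scheduler) are all assumptions made for illustration.

```python
class UnclassifiedBuffer:
    """Collects unclassified facial features until a preset duration has elapsed."""

    def __init__(self, preset_duration_s):
        self.preset_duration_s = preset_duration_s
        self.window_start = None
        self.features = []

    def add(self, feature, timestamp_s):
        """Store a feature; return the batch to cluster once the window is over."""
        if self.window_start is None:
            self.window_start = timestamp_s
        self.features.append(feature)
        if timestamp_s - self.window_start >= self.preset_duration_s:
            # Hand the accumulated features to the clustering operation and reset.
            batch, self.features, self.window_start = self.features, [], None
            return batch
        return None
```

The returned batch would then be passed to the clustering operation that produces the newly added clusters.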

Taking FIG. 2 again as an example, the task scheduler performs the clustering operation, clusters the plurality of unclassified facial features accumulated over the preset duration to determine newly added clusters, and sends the newly added clusters, together with the facial features they contain, to the cluster database.

The Euclidean distances between the unclassified facial features are determined, and a k-nearest-neighbor algorithm is used to determine the k unclassified facial features that are closest in Euclidean distance. The cosine distances between these k unclassified facial features are then calculated. Using these cosine distances, a graph algorithm constructs a vertex set and an edge set and obtains the connected subsets, which yields the newly added clusters.

In one possible implementation, the method further includes updating the N known clusters according to the newly added clusters.

As the clustering task is performed, newly added clusters are written to the cluster database, and the N known clusters can be updated accordingly. For example, suppose there are 6 known clusters before the clustering task is performed, and the clustering task produces 2 newly added clusters that are sent to the cluster database. The updated N known clusters then number 8, and these 8 known clusters are used for subsequent classification tasks.

In one possible implementation, after the spatiotemporal database has obtained face snapshots from the video source and the corresponding facial features from those snapshots, it builds a feature index for each facial feature so that facial features can later be looked up conveniently by feature index. Because the facial features included in the N known clusters in the cluster database all originate from the spatiotemporal database, a cluster index can be built for each known cluster from the feature indexes of the facial features it contains. A known cluster can then be looked up conveniently by its cluster index, and the facial features belonging to that known cluster can be retrieved. After the N known clusters in the cluster database have been updated, the cluster index of each known cluster can be updated according to the feature indexes of the facial features included in the updated cluster.
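The relationship between the feature index and the cluster index can be sketched with plain dictionaries. The in-memory layout and the function name are assumptions for illustration; the text does not describe how the indexes are stored.

```python
def build_cluster_index(assignments):
    """Build a cluster index from feature-to-cluster assignments.

    `assignments` maps a feature index to the id of its known cluster; the
    result maps each cluster id to the sorted feature indexes it contains.
    """
    cluster_index = {}
    for feature_id, cluster_id in assignments.items():
        cluster_index.setdefault(cluster_id, []).append(feature_id)
    for feature_ids in cluster_index.values():
        feature_ids.sort()
    return cluster_index

# Feature index: feature id -> facial feature vector (illustrative values).
feature_index = {101: [1.0, 0.0], 102: [0.9, 0.1], 103: [0.0, 1.0]}
cluster_index = build_cluster_index({101: "c1", 102: "c1", 103: "c2"})
# Looking up a known cluster, then its facial features, via the two indexes.
features_of_c1 = [feature_index[fid] for fid in cluster_index["c1"]]
```

Rebuilding the cluster index after a classification or merge run amounts to calling the same function on the updated assignments.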

In practical application scenarios, when a user's trajectory is analyzed using the clusters corresponding to facial features, the trajectory is expected to appear in that person's file promptly; that is, there are requirements on the timeliness of clustering. Combining the classification task with the clustering task as described above improves the timeliness with which the facial features to be clustered are clustered, so that trajectories can be studied and assessed more comprehensively.

When assessing certain cases, it is undesirable for one person's file to be mixed with another person's trajectory, or for multiple files to correspond to the trajectory of the same person; that is, there are requirements on clustering accuracy. Performing the cluster release task and the cluster merge task as described above improves clustering accuracy, so that such cases can be assessed more accurately.

In the embodiments of the present invention, quantitative analysis is performed on a first cluster of facial features obtained by clustering based on a first threshold, in order to determine whether the first cluster meets a preset clustering condition, where the preset clustering condition indicates that the plurality of facial features included in the first cluster all correspond to the same identity. When the first cluster does not meet the preset clustering condition, the plurality of facial features included in the first cluster is released, and the released facial features are clustered based on a second threshold that is greater than the first threshold to determine a second cluster. Releasing clusters that do not satisfy the preset clustering condition and re-clustering the released facial features with a higher threshold effectively improves clustering accuracy.

It can be understood that the method embodiments mentioned in the present invention can be combined with one another to form combined embodiments, provided that doing so does not violate their principles and logic; due to space limitations, these combinations are not described in detail here. A person skilled in the art will understand that, in the clustering methods of the specific embodiments, the specific order in which the steps are performed should be determined by their functions and possible internal logic.

In addition, the embodiments of the present invention further provide a clustering apparatus, an electronic device, a computer-readable storage medium, and a program, each of which can implement any one of the clustering methods provided in the embodiments of the present invention. For the corresponding technical solutions and descriptions, refer to the corresponding passages in the method section; they are not repeated here.

FIG. 3 shows a block diagram of a clustering apparatus according to an embodiment of the present invention. As shown in FIG. 3, the apparatus (30) includes:

a quantitative analysis module (31) configured to perform quantitative analysis on a first cluster of facial features to determine whether the first cluster meets a preset clustering condition - the first cluster is obtained by clustering based on a first threshold, and the preset clustering condition indicates that the plurality of facial features included in the first cluster correspond to the same identity - ;

a release module (32) configured to release the plurality of facial features included in the first cluster when the first cluster does not meet the preset clustering condition; and

a clustering module (33) configured to cluster the plurality of facial features based on a second threshold to determine a second cluster - the second threshold is greater than the first threshold - .

In one possible implementation, the plurality of facial features includes one class-center facial feature corresponding to the first cluster;

the quantitative analysis module (31) includes:

a first determining submodule configured to determine the cosine distance between the class-center facial feature and each of the other facial features in the first cluster;

a second determining submodule configured to determine an average distance and a standard deviation distance according to the cosine distances between the class-center facial feature and the other facial features in the first cluster; and

a third determining submodule configured to determine that the first cluster does not meet the preset clustering condition when at least one of the following holds: the average distance is greater than a third threshold, or the standard deviation distance is less than a fourth threshold.

In one possible implementation, the apparatus (30) includes:

a first determining module configured to determine N known clusters according to at least one of the second cluster and a first cluster that meets the preset clustering condition, where N≥1; and

a classification module configured to classify the facial features to be clustered according to the N known clusters.

In one possible implementation,

the clustering module 33 is further configured to cluster the class-centric facial features corresponding to the N known clusters to obtain a class-centric cluster;

the apparatus 30 further includes:

a merging module, configured to merge the known clusters corresponding to the class-centric facial features included in the class-centric cluster to obtain a merged cluster corresponding to the same identity.
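The merging step can be illustrated with the sketch below: class-centric facial features are clustered, and the known clusters whose centers land in the same class-centric cluster are merged. The grouping rule (cosine similarity at or above a threshold, connected components) is an assumption made for the example, as are the names `group_centers` and `merge_known_clusters` and the sample data.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def group_centers(centers, threshold):
    """Cluster class centers (dict: cluster id -> center vector).
    Assumed rule: link centers whose similarity is >= threshold."""
    ids = list(centers)
    parent = {cid: cid for cid in ids}

    def find(c):
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if cosine_similarity(centers[a], centers[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for cid in ids:
        groups.setdefault(find(cid), []).append(cid)
    return list(groups.values())


def merge_known_clusters(members, centers, threshold):
    """Merge the member lists of known clusters whose centers were grouped."""
    merged = []
    for group in group_centers(centers, threshold):
        merged.append(sorted(m for cid in group for m in members[cid]))
    return merged


# Hypothetical known clusters: A and B are assumed to be the same person.
centers = {"A": (1.0, 0.0), "B": (0.96, 0.28), "C": (0.0, 1.0)}
members = {"A": [0, 1], "B": [2], "C": [3, 4]}
merged = merge_known_clusters(members, centers, 0.9)
print(merged)
```

Clustering only the class centers keeps the merge check at N items rather than every stored facial feature.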

In one possible implementation, the classification module includes:

a fourth determining submodule, configured to determine whether a target known cluster exists among the N known clusters, wherein the cosine distance between the facial feature to be clustered and the class-centric facial feature corresponding to the target known cluster is less than a fifth threshold; and

a classification submodule, configured to classify the facial feature to be clustered into the target known cluster when the target known cluster exists among the N known clusters.

In one possible implementation, the fourth determining submodule includes:

a first determining unit, configured to use a k-nearest neighbor algorithm to determine, among the class-centric facial features corresponding to the N known clusters, the k class-centric facial features with the smallest Euclidean distance to the facial feature to be clustered, where N≥k≥1;

a second determining unit, configured to determine the cosine distance between the facial feature to be clustered and each of the k class-centric facial features; and

a third determining unit, configured to determine, as the target known cluster, the known cluster corresponding to a class-centric facial feature whose cosine distance is less than the fifth threshold.
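The two-stage lookup performed by the three determining units can be sketched as follows: a Euclidean k-nearest-neighbor shortlist over the class centers, followed by a cosine-distance check against the fifth threshold. The brute-force sort stands in for whatever k-nearest-neighbor index a real system would use, and choosing the closest center when several pass the threshold is an assumption the text does not state. All names and values are hypothetical.

```python
import math


def cosine_distance(a, b):
    """Cosine distance, assumed here to be 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def find_target_cluster(feature, centers, k, fifth_threshold):
    """Return the id of the target known cluster, or None if there is none.

    centers: dict mapping known-cluster id -> class-centric facial feature.
    """
    # Stage 1: shortlist the k centers nearest in Euclidean distance.
    shortlist = sorted(centers, key=lambda cid: math.dist(feature, centers[cid]))[:k]

    # Stage 2: keep a shortlisted center only if its cosine distance is
    # below the fifth threshold (assumption: prefer the closest one).
    best = None
    for cid in shortlist:
        d = cosine_distance(feature, centers[cid])
        if d < fifth_threshold and (best is None or d < best[1]):
            best = (cid, d)
    return best[0] if best else None


centers = {"A": (1.0, 0.0), "B": (0.0, 1.0), "C": (-1.0, 0.0)}
hit = find_target_cluster((0.96, 0.28), centers, k=2, fifth_threshold=0.2)
miss = find_target_cluster((-0.6, -0.8), centers, k=2, fifth_threshold=0.2)
print(hit, miss)
```

A `None` result corresponds to the case where no target known cluster exists among the N known clusters.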

The apparatus further includes: a first update module, configured to update the plurality of facial features included in the N known clusters based on the result of classifying the facial feature to be clustered; and

a second update module, configured to update, for any one known cluster, the class-centric facial feature corresponding to that known cluster according to the updated plurality of facial features corresponding to that known cluster.
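The two update modules can be pictured with the sketch below. How the class-centric facial feature is derived from a cluster's members is not stated in this passage; the sketch assumes it is the L2-normalized mean of the member features. The dictionary layout and the names `class_center` and `add_feature_and_update` are hypothetical.

```python
import math


def class_center(features):
    """Assumed definition: L2-normalized mean of the member features."""
    dim = len(features[0])
    mean = [sum(f[i] for f in features) / len(features) for i in range(dim)]
    norm = math.sqrt(sum(x * x for x in mean))
    return [x / norm for x in mean] if norm else mean


def add_feature_and_update(cluster, feature):
    """First update: add the classified feature to the known cluster.
    Second update: recompute the cluster's class center."""
    cluster["features"].append(feature)
    cluster["center"] = class_center(cluster["features"])
    return cluster


cluster = {"features": [(1.0, 0.0)], "center": [1.0, 0.0]}
add_feature_and_update(cluster, (0.0, 1.0))
print(cluster["center"])
```

Recomputing the center after each classification keeps later cosine-distance checks aligned with the cluster's current members.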

The apparatus further includes a second determining module, configured to determine the facial feature to be clustered as an unclassified facial feature when no target known cluster exists among the N known clusters;

the clustering module 33 is further configured to use a k-nearest neighbor algorithm and a graph algorithm to cluster a plurality of unclassified facial features within a preset duration, so as to determine a newly added cluster.

The apparatus further includes a third update module, configured to update the N known clusters according to the newly added cluster.
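One way to picture clustering the unclassified facial features with a k-nearest-neighbor algorithm and a graph algorithm is sketched below. The passage names neither the graph algorithm nor the edge rule, so the sketch assumes a k-nearest-neighbor graph whose edges are kept only when cosine similarity reaches a threshold, with connected components as clusters. The input list is assumed to hold the features buffered over the preset duration; all names and values are hypothetical.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def knn_graph_clusters(features, k, threshold):
    """Build a kNN graph and return its connected components."""
    n = len(features)
    adjacency = {i: set() for i in range(n)}
    for i in range(n):
        sims = sorted(
            ((cosine_similarity(features[i], features[j]), j) for j in range(n) if j != i),
            reverse=True,
        )
        for sim, j in sims[:k]:
            if sim >= threshold:  # assumed edge rule
                adjacency[i].add(j)
                adjacency[j].add(i)

    seen, clusters = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            component.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        clusters.append(sorted(component))
    return clusters


# Unclassified features assumed to be collected within one time window.
pending = [(1.0, 0.0), (0.96, 0.28), (0.0, 1.0), (0.28, 0.96), (-1.0, 0.0)]
components = knn_graph_clusters(pending, k=1, threshold=0.9)
# Assumption: only components with at least two members become new clusters.
new_clusters = [c for c in components if len(c) >= 2]
print(components, new_clusters)
```

The resulting new clusters would then be appended to the N known clusters, which is the role of the third update module.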

It should be noted that, in the embodiments of the present invention, when the clustering method is implemented in the form of a software function module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present invention, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions that cause a computer device (which may be a terminal, a server, or the like) to perform all or part of the method according to each embodiment of the present invention. The storage medium includes various media capable of storing program code, such as a USB memory, a removable storage device, a read-only memory (ROM), a magnetic disk, or an optical disk. Accordingly, the embodiments of the present invention are not limited to any specific combination of hardware and software.

In some embodiments, the functions provided by, or the modules included in, the apparatus of the embodiments of the present invention may be used to perform the methods described in the method embodiments above. For their specific implementation, refer to the description of those method embodiments; the details are omitted here for brevity.

An embodiment of the present invention further provides a computer-readable storage medium storing computer program instructions; when the computer program instructions are executed by a processor, the clustering method described above is implemented. The computer-readable storage medium may be a non-volatile computer-readable storage medium.

An embodiment of the present invention further provides a computer program product including computer-readable code; when the computer-readable code runs on a device, a processor of the device executes instructions that implement the clustering method provided in any one of the embodiments above.

An embodiment of the present invention further provides another computer program product, configured to store computer-readable instructions that, when executed, cause a computer to perform the operations of the clustering method provided in any one of the embodiments above.

Correspondingly, an embodiment of the present invention provides a computer device. FIG. 4 is a schematic structural diagram of the computer device according to an embodiment of the present invention. As shown in FIG. 4, the electronic device 400 includes one processor 401, at least one communication bus, a communication interface 402, at least one external communication interface, and a memory 403. The communication interface 402 is configured to implement connection and communication between these components. The communication interface 402 may include a display device, and the external communication interface may include a standard wired interface and a wireless interface. The processor 401 is configured to execute an image processing program in the memory so as to implement the steps of the clustering method provided in the above embodiments. The electronic device may be provided as a terminal, a server, or another type of device. For example, the electronic device may be a terminal such as a mobile phone, a computer, a digital broadcasting terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, or a personal digital assistant.

The descriptions of the clustering apparatus, computer device, and storage medium embodiments are similar to the description of the clustering method embodiments and have similar technical details and beneficial effects. For reasons of space they are not repeated here; refer to the description of the clustering method embodiments. Technical details not disclosed in the clustering apparatus, computer device, and storage medium embodiments of the present invention can be understood with reference to the description of the method embodiments.

The present invention may be a system, a method, and/or a computer program product. The computer program product may include a computer-readable storage medium carrying computer-readable program instructions that cause a processor to implement each aspect of the present invention.

The computer-readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer-readable storage medium may be, for example but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination thereof. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a static random access memory (SRAM), a compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as a punch card or a raised structure in a groove on which instructions are stored, and any suitable combination thereof. A computer-readable storage medium, as used herein, is not to be construed as a transitory signal itself, such as a radio wave or other freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or other transmission medium (for example, a light pulse passing through a fiber-optic cable), or an electrical signal transmitted through a wire.

The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to the respective computing/processing devices, or downloaded to an external computer or external storage device via a network such as the Internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter or network interface in each computing/processing device receives the computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium within that computing/processing device.

The computer program instructions for carrying out the operations of the present invention may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network or a wide area network, or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, an electronic circuit, such as a programmable logic circuit, a field-programmable gate array, or a programmable logic array, is personalized using state information of the computer-readable program instructions, and the electronic circuit may execute the computer-readable program instructions to implement each aspect of the embodiments of the present invention.

Aspects of the embodiments of the present invention are described herein with reference to flowcharts and/or block diagrams of the methods, apparatuses (systems), and computer program products of the embodiments. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.

These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create an apparatus that implements the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium and cause a computer, a programmable data processing apparatus, and/or other devices to operate in a particular manner, such that the computer-readable medium storing the instructions constitutes an article of manufacture that includes instructions implementing aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.

The computer-readable program instructions may also be loaded onto a computer, another programmable data processing apparatus, or another device, causing a series of operational steps to be performed on the computer, other programmable data processing apparatus, or other device to produce a computer-implemented process, such that the instructions executed on the computer, other programmable data processing apparatus, or other device implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.

The flowcharts and block diagrams in the drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to multiple embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of instructions that contains one or more executable instructions for implementing the specified logical function. It should be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that shown in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or acts, or by a combination of dedicated hardware and computer instructions.

The computer program product may be implemented in hardware, software, or a combination thereof. In one optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (SDK).

The embodiments of the present invention have been described above. The description is illustrative rather than exhaustive, and it is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, their practical application, or their technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Industrial Applicability

The embodiments of the present invention provide a clustering method and apparatus, an electronic device, and a storage medium. The clustering method includes: performing quantitative analysis on a first cluster of facial features to determine whether the first cluster meets a preset clustering condition, wherein the first cluster is obtained by clustering based on a first threshold, and the preset clustering condition indicates that the plurality of facial features included in the first cluster correspond to the same identity; releasing the plurality of facial features included in the first cluster when the first cluster does not meet the preset clustering condition; and clustering the plurality of facial features based on a second threshold to determine a second cluster, wherein the second threshold is greater than the first threshold.

Claims

1. A clustering method, comprising:
performing quantitative analysis on a first cluster of facial features to determine whether the first cluster meets a preset clustering condition, wherein the first cluster is obtained by clustering based on a first threshold, and the preset clustering condition indicates that a plurality of facial features included in the first cluster correspond to the same identity;
releasing the plurality of facial features included in the first cluster when the first cluster does not meet the preset clustering condition; and
determining a second cluster by clustering the plurality of facial features based on a second threshold, wherein the second threshold is greater than the first threshold.

2. The method of claim 1, wherein
the plurality of facial features includes one class-centric facial feature corresponding to the first cluster; and
performing quantitative analysis on the first cluster to determine whether the first cluster meets the preset clustering condition comprises:
determining a cosine distance between the class-centric facial feature and each other facial feature in the first cluster;
determining a mean distance and a standard deviation distance according to the cosine distances; and
determining that the first cluster does not meet the preset clustering condition when at least one of the following holds: the mean distance is greater than a third threshold, or the standard deviation distance is less than a fourth threshold.

3. The method of claim 1 or 2, wherein,
after determining the second cluster by clustering the plurality of facial features based on the second threshold, the clustering method comprises:
determining N known clusters according to at least one of the second cluster and the first cluster that meets the preset clustering condition, wherein N≥1; and
classifying a facial feature to be clustered according to the N known clusters.

4. The method of claim 3, wherein,
after determining the N known clusters according to at least one of the second cluster and the first cluster that meets the preset clustering condition, the clustering method comprises:
clustering the class-centric facial features corresponding to the N known clusters to obtain a class-centric cluster; and
merging the known clusters corresponding to the class-centric facial features included in the class-centric cluster to obtain a merged cluster corresponding to the same identity.

5. The method of claim 3 or 4, wherein
classifying the facial feature to be clustered according to the N known clusters comprises:
determining whether a target known cluster exists among the N known clusters, wherein a cosine distance between the facial feature to be clustered and the class-centric facial feature corresponding to the target known cluster is less than a fifth threshold; and
classifying the facial feature to be clustered into the target known cluster when the target known cluster exists among the N known clusters.

6. The method of claim 5, wherein
determining whether a target known cluster exists among the N known clusters comprises:
determining, using a k-nearest neighbor algorithm, among the class-centric facial features corresponding to the N known clusters, the k class-centric facial features with the smallest Euclidean distance to the facial feature to be clustered, wherein N≥k≥1;
determining a cosine distance between the facial feature to be clustered and each of the k class-centric facial features; and
determining, as the target known cluster, the known cluster corresponding to a class-centric facial feature whose cosine distance is less than the fifth threshold.

7. The method of any one of claims 3 to 6, wherein,
after classifying the facial feature to be clustered according to the N known clusters, the clustering method further comprises:
updating the plurality of facial features included in the N known clusters based on a result of classifying the facial feature to be clustered; and
for any one of the known clusters, updating the class-centric facial feature corresponding to that known cluster according to the updated plurality of facial features corresponding to that known cluster.

8. The method of any one of claims 5 to 7, wherein,
after determining whether a target known cluster exists among the N known clusters, the clustering method comprises:
determining the facial feature to be clustered as an unclassified facial feature when the target known cluster does not exist among the N known clusters; and
clustering a plurality of unclassified facial features within a preset duration using a k-nearest neighbor algorithm and a graph algorithm to determine a newly added cluster.

9. The method of claim 8, wherein,
after determining the newly added cluster by clustering the plurality of unclassified facial features within the preset duration using the k-nearest neighbor algorithm and the graph algorithm, the clustering method further comprises:
updating the N known clusters according to the newly added cluster.

10. A clustering device, comprising:
a quantitative analysis module, configured to perform quantitative analysis on a first cluster of facial features to determine whether the first cluster meets a preset clustering condition, wherein the first cluster is obtained by clustering based on a first threshold, and the preset clustering condition indicates that a plurality of facial features included in the first cluster correspond to the same identity;
a release module, configured to release the plurality of facial features included in the first cluster when the first cluster does not meet the preset clustering condition; and
a clustering module, configured to cluster the plurality of facial features based on a second threshold to determine a second cluster, wherein the second threshold is greater than the first threshold.

11. The clustering device of claim 10, wherein
the plurality of facial features includes one class-centric facial feature corresponding to the first cluster; and the quantitative analysis module comprises:
a first determining submodule, configured to determine a cosine distance between the class-centric facial feature and each other facial feature in the first cluster;
a second determining submodule, configured to determine a mean distance and a standard deviation distance according to the cosine distances between the class-centric facial feature and the other facial features in the first cluster; and
a third determining submodule, configured to determine that the first cluster does not meet the preset clustering condition when at least one of the following holds: the mean distance is greater than a third threshold, or the standard deviation distance is less than a fourth threshold.

12. The clustering device of claim 10 or 11, wherein
the clustering device comprises:
a first determining module, configured to determine N known clusters according to at least one of the second cluster and the first cluster that meets the preset clustering condition, wherein N≥1; and
a classification module, configured to classify a facial feature to be clustered according to the N known clusters.

13. The clustering device of claim 12, wherein
the clustering module is further configured to cluster the class-centric facial features corresponding to the N known clusters to obtain a class-centric cluster; and
the clustering device further comprises:
a merging module, configured to merge the known clusters corresponding to the class-centric facial features included in the class-centric cluster to obtain a merged cluster corresponding to the same identity.

14. The clustering device of claim 12 or 13, wherein
the classification module comprises:
a fourth determining submodule, configured to determine whether a target known cluster exists among the N known clusters, wherein a cosine distance between the facial feature to be clustered and the class-centric facial feature corresponding to the target known cluster is less than a fifth threshold; and
a classification submodule, configured to classify the facial feature to be clustered into the target known cluster when the target known cluster exists among the N known clusters.

15. The clustering device of claim 14, wherein
the fourth determining submodule comprises:
a first determining unit, configured to determine, using a k-nearest neighbor algorithm, among the class-centric facial features corresponding to the N known clusters, the k class-centric facial features with the smallest Euclidean distance to the facial feature to be clustered, wherein N≥k≥1;
a second determining unit, configured to determine a cosine distance between the facial feature to be clustered and each of the k class-centric facial features; and
a third determining unit, configured to determine, as the target known cluster, the known cluster corresponding to a class-centric facial feature whose cosine distance is less than the fifth threshold.

16. The clustering device of any one of claims 12 to 15, wherein
the clustering device further comprises:
a first update module, configured to update the plurality of facial features included in the N known clusters based on a result of classifying the facial feature to be clustered; and
a second update module, configured to update, for any one of the known clusters, the class-centric facial feature corresponding to that known cluster according to the updated plurality of facial features corresponding to that known cluster.

17. The clustering device of any one of claims 14 to 16, wherein
the clustering device comprises:
a second determining module, configured to determine the facial feature to be clustered as an unclassified facial feature when the target known cluster does not exist among the N known clusters; and
the clustering module is further configured to cluster a plurality of unclassified facial features within a preset duration using a k-nearest neighbor algorithm and a graph algorithm to determine a newly added cluster.

18. The clustering device of claim 17, wherein
the clustering device comprises:
a third update module, configured to update the N known clusters according to the newly added cluster.

19. An electronic device, comprising:
a memory storing instructions executable by a processor; and
the processor, configured to call the instructions stored in the memory to execute the method according to any one of claims 1 to 9.

20. A computer-readable storage medium having computer program instructions stored thereon, wherein
the computer program instructions, when executed by a processor, cause the processor to implement the method according to any one of claims 1 to 9.