CN112668482B - Face recognition training method, device, computer equipment and storage medium

Face recognition training method, device, computer equipment and storage medium

Info

Publication number
CN112668482B
CN112668482B (application CN202011594406.9A)
Authority
CN
China
Prior art keywords
feature
class
image
users
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011594406.9A
Other languages
Chinese (zh)
Other versions
CN112668482A (en)
Inventor
刘钊 (Liu Zhao)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Life Insurance Company of China Ltd
Original Assignee
Ping An Life Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Life Insurance Company of China Ltd filed Critical Ping An Life Insurance Company of China Ltd
Priority to CN202011594406.9A
Publication of CN112668482A
Application granted
Publication of CN112668482B
Legal status: Active

Landscapes

  • Image Analysis (AREA)

Abstract

The embodiments of the application belong to the field of artificial intelligence and relate to a face recognition training method. The method comprises: counting the number of face images of each user in a training data set; acquiring the N users with the largest numbers of face images; transmitting the face images corresponding to the N users to a preset feature extraction model for feature extraction to obtain image feature data; performing feature clustering on the image feature data to obtain different feature classes; screening out the pairwise most similar users according to the feature classes to form quadruples; inputting all quadruples into a feature expansion learning model for training; and performing feature mapping expansion on target users with the trained feature expansion learning model. The application also provides a face recognition training device, a computer device and a storage medium. In addition, the application relates to blockchain technology: the image feature data may be stored in a blockchain. The application increases the supporting area of each class in the feature space, thereby improving the accuracy of face recognition.

Description

Face recognition training method, device, computer equipment and storage medium
Technical Field
The application relates to the technical field of artificial intelligence, in particular to a face recognition training method, a face recognition training device, computer equipment and a storage medium.
Background
At present, face recognition technology plays a major role in many areas of social life, such as video surveillance, identity verification, and criminal investigation. However, the data sets used by current face recognition algorithms suffer from a long-tail problem, which is especially severe when training on self-collected data.
Typically, in self-collected data, each ID has only 20 to 30 valid pictures on average. When such data are used for training, the class feature space of each ID is small; that is, the observed distribution cannot fully represent the true feature-space distribution of the class. As a result, the feature center estimated for these classes is inaccurate, which lowers the accuracy of feature extraction.
One current approach to this problem is to collect more data manually, but this is inefficient and costly. Other methods, such as data augmentation with a GAN, easily learn features that are irrelevant to face recognition, introducing noise into the training data and reducing recognition accuracy.
Disclosure of Invention
The embodiments of the application aim to provide a face recognition training method, apparatus, computer device and storage medium, to solve the problem of inaccurate face recognition caused by the long-tail problem of training data in the related art.
In order to solve the above technical problems, the embodiment of the present application provides a face recognition training method, which adopts the following technical scheme:
counting the number of face images of each user in the training data set, and sorting according to the number of the face images;
acquiring the top-ranked N users, and transmitting face images corresponding to the N users as sample data to a preset feature extraction model for feature extraction to obtain image feature data;
performing feature clustering on the image feature data to obtain different feature classes;
screening out the pairwise most similar users according to the feature classes to form quadruples;
inputting all quadruples into a feature expansion learning model for training; and
performing feature mapping expansion on the target user by using the trained feature expansion learning model.
Further, the step of clustering the image feature data to obtain different feature classes includes:
Determining the number K of clusters, and selecting K image features from the image feature data as cluster centers;
evaluating first distances between image features except the K image features and each clustering center, and classifying the image features according to the first distances to obtain an initial classification result;
calculating the average position of all image features in each class according to the initial classification result, and determining the average position as a new clustering center;
and calculating a second distance between the image features outside the new cluster centers and each new cluster center until the cluster centers are converged, and determining K final feature classes.
Further, the step of screening the four-element group formed by the users with the most similar pairwise according to the feature class includes:
determining the nearest feature class of each user;
and calculating the class center distance between every two users according to the class center of the feature class closest to the users, and screening out the two users with the smallest class center distance difference to form a quadruple.
Further, the step of determining the closest feature class for each user includes:
acquiring a first image feature quantity of each user in each feature class;
And determining the feature class with the largest number of the first image features as the feature class closest to the user.
Further, after the step of determining the feature class including the first image feature with the largest number as the feature class closest to the user, the method further includes:
when two or more users exist and the nearest feature class is the same class, respectively calculating the average positions of all the image features in each user;
calculating the distance between the average position and K final clustering centers;
and determining the nearest feature class of the user according to the distance.
Further, after the step of performing feature mapping expansion on the target user by using the trained feature expansion learning model, the method further comprises:
training the classifier by using the expanded image characteristic data, iterating the weight of the classifier according to the training result, and fixing the weight of the classifier;
and retraining a preset feature extraction model by using the expanded image feature data.
Further, the step of training the classifier using the expanded image feature data includes:
reshaping the expanded image features so that they have the same dimensions as the original features;
and taking the reshaped features as the input of the classifier training process to train the classifier.
In order to solve the above technical problems, the embodiment of the present application further provides a face recognition training device, which adopts the following technical scheme:
the statistics module is used for counting the number of face images of each user in the training data set and sequencing the face images according to the number of the face images;
the feature extraction module is used for acquiring N users ranked in front, and transmitting face images corresponding to the N users as sample data to a preset feature extraction model for feature extraction to obtain image feature data;
the clustering module is used for carrying out feature clustering on the image feature data to obtain different feature classes;
the screening module is used for screening out the pairwise most similar users according to the feature classes to form quadruples;
the training module is used for inputting all quadruples into the feature expansion learning model for training; and
the expansion module is used for performing feature mapping expansion on the target user by using the trained feature expansion learning model.
In order to solve the above technical problems, the embodiment of the present application further provides a computer device, which adopts the following technical schemes:
The computer device comprises a memory and a processor, the memory having stored therein computer readable instructions which, when executed by the processor, implement the steps of the face recognition training method as described above.
In order to solve the above technical problems, an embodiment of the present application further provides a computer readable storage medium, which adopts the following technical schemes:
the computer readable storage medium has stored thereon computer readable instructions which when executed by a processor implement the steps of the face recognition training method as described above.
Compared with the prior art, the embodiment of the application has the following main beneficial effects:
According to the application, the number of face images of each user in a training data set is counted, the top-ranked N users are obtained according to the number of face images, the face images corresponding to the N users are transmitted as sample data to a preset feature extraction model for feature extraction to obtain image feature data, feature clustering is performed on the image feature data to obtain different feature classes, the pairwise most similar users are screened out according to the feature classes to form quadruples, all quadruples are input into a feature expansion learning model for training, and finally feature mapping expansion is performed on a target user with the trained feature expansion learning model. By clustering the extracted image feature data into different feature classes and selecting the two most similar users to form a quadruple for training, features can be expanded for target users with insufficient samples, which increases the support area in the feature space and thereby improves the accuracy of face recognition.
Drawings
In order to illustrate the solution of the present application more clearly, the drawings required for describing the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application, and a person of ordinary skill in the art may obtain other drawings from them without inventive effort.
FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;
FIG. 2 is a flow chart of one embodiment of a face recognition training method in accordance with the present application;
FIG. 3 is a flow chart of one embodiment of step S203 of FIG. 2;
fig. 4 is a flow chart of another embodiment of a face recognition training method according to the present application;
fig. 5 is a schematic structural view of an embodiment of a face recognition training device according to the present application;
FIG. 6 is a schematic structural diagram of one embodiment of a computer device in accordance with the present application.
Detailed Description
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used in the description of the applications herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "comprising" and "having" and any variations thereof in the description of the application and the claims and the description of the drawings above are intended to cover a non-exclusive inclusion. The terms first, second and the like in the description and in the claims or in the above-described figures, are used for distinguishing between different objects and not necessarily for describing a sequential or chronological order.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
In order to make the person skilled in the art better understand the solution of the present application, the technical solution of the embodiment of the present application will be clearly and completely described below with reference to the accompanying drawings.
In order to solve the problem of inaccurate face recognition caused by the long-tail problem of training data in the related art, the application provides a face recognition training method, which relates to deep learning in artificial intelligence and can be applied to the system architecture 100 shown in fig. 1. The system architecture 100 may comprise terminal devices 101, 102 and 103, a network 104, and a server 105. The network 104 serves as a medium providing communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables.
The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like. Various communication client applications, such as a web browser application, a shopping class application, a search class application, an instant messaging tool, a mailbox client, social platform software, etc., may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, desktop computers, and the like.
The server 105 may be a server providing various services, such as a background server providing support for pages displayed on the terminal devices 101, 102, 103.
It should be noted that, the face recognition training method provided by the embodiment of the present application is generally executed by a server or a terminal device, and accordingly, the face recognition training device is generally set in the server or the terminal device.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to fig. 2, a flow chart of one embodiment of a face recognition training method according to the present application is shown. The face recognition training method comprises the following steps:
step S201, counting the number of face images of each user in the training data set, and sorting according to the number of face images.
The training dataset contains images of a number of different persons, and each person may have one or more images.
It should be noted that when several users have the same number of face images, they may be ranked by the quality of their face images. Specifically, the quality of each face image is evaluated according to its degree of blur and each image is scored accordingly, so that each user obtains a score; users with the same number of face images are then ranked by score. The degree of blur of a face image is measured by the gradient values at its feature points, where the feature points include eye, nose and mouth feature points, and the gradient value is the average gradient. The average gradient reflects how sharply the gray level changes near the boundaries or on both sides of shadow lines at the feature points: a large rate of change in gray level indicates a sharp image. It therefore characterizes the contrast change of fine details at the feature points, i.e., the rate of density change in the multi-dimensional directions of the feature points, and represents the relative sharpness of the face image.
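As a concrete illustration of this ranking step, the following Python sketch scores an image by the average gradient magnitude around its facial feature points and ranks tied users by their mean score. The landmark input format (integer pixel coordinates), the window size, and the aggregation by mean are illustrative assumptions; the patent does not prescribe an implementation.

```python
import cv2
import numpy as np

def sharpness_score(image_path, landmarks):
    """Score one face image by the average gradient magnitude at its
    feature points (eyes, nose, mouth); higher means sharper.
    landmarks: [(x, y), ...] integer pixel coordinates, assumed given."""
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE).astype(np.float32)
    gy, gx = np.gradient(gray)                      # per-pixel gray-level gradients
    magnitude = np.sqrt(gx ** 2 + gy ** 2)
    scores = []
    for (x, y) in landmarks:
        # Average the gradient magnitude in a small window around each landmark.
        window = magnitude[max(y - 3, 0):y + 4, max(x - 3, 0):x + 4]
        scores.append(window.mean())
    return float(np.mean(scores))

def rank_tied_users(users):
    """users: {user_id: [(image_path, landmarks), ...]} with equal image counts.
    Rank tied users by their mean image-quality score, best first."""
    user_scores = {
        uid: np.mean([sharpness_score(p, lm) for p, lm in images])
        for uid, images in users.items()
    }
    return sorted(user_scores, key=user_scores.get, reverse=True)
```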
In this embodiment, the electronic device (for example, the server/terminal device shown in fig. 1) on which the face recognition training method operates may acquire the number of face images of all users in the training dataset through a wired connection manner or a wireless connection manner. It should be noted that the wireless connection may include, but is not limited to, 3G/4G connections, wiFi connections, bluetooth connections, wiMAX connections, zigbee connections, UWB (ultra wideband) connections, and other now known or later developed wireless connection means.
Step S202, the top-ranked N users are obtained, and the face images corresponding to the N users are transmitted as sample data to a preset feature extraction model for feature extraction to obtain image feature data.
In this embodiment, each user is identified by a user identifier (ID). A feature extraction model is obtained through pre-training, the top-ranked N IDs are obtained, and all face images corresponding to the N IDs are transmitted as sample data to the feature extraction model for feature extraction, where N is a positive integer greater than 0. For example, the first 50 IDs may be obtained, and the 10000 face images belonging to those 50 IDs used as training sample data: the 10000 face images are input into the preset feature extraction model for feature extraction, which may cover different face sizes, illumination conditions, occluded regions, face angles, face ages, and so on. Each extracted image feature carries its user identifier, so that feature expansion learning can later be performed within the same user class according to the user identifier.
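A minimal sketch of steps S201-S202 follows; the dataset layout, the `extractor` callable, and N = 50 are illustrative assumptions rather than requirements of the method.

```python
from collections import Counter

import numpy as np

def top_n_features(dataset, extractor, n=50):
    """dataset: [(user_id, image), ...]; extractor: model mapping an image
    to a feature vector. Returns {user_id: [feature, ...]} for the N users
    with the most face images (steps S201-S202)."""
    counts = Counter(uid for uid, _ in dataset)          # S201: count images per user
    top_ids = {uid for uid, _ in counts.most_common(n)}  # keep the top-ranked N users
    features = {}
    for uid, image in dataset:
        if uid in top_ids:
            # Extracted features carry the user identifier so that feature
            # expansion learning can later be done within the same user class.
            features.setdefault(uid, []).append(np.asarray(extractor(image)))
    return features
```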
It should be emphasized that, to further ensure the privacy and security of the image feature data, the image feature data may also be stored in a node of a blockchain.
The blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanism, encryption algorithm and the like. The Blockchain (Blockchain), which is essentially a decentralised database, is a string of data blocks that are generated by cryptographic means in association, each data block containing a batch of information of network transactions for verifying the validity of the information (anti-counterfeiting) and generating the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.
Step S203, carrying out feature clustering on the image feature data to obtain different feature classes.
Feature clustering is the process of mapping images of the same category into the feature space and then determining the feature center points of that category; using feature clustering reduces the amount of computation. It should be understood that in this embodiment, feature clustering is performed on all the extracted image feature data.
Algorithms for feature clustering include K-Means clustering algorithms, mean shift clustering algorithms, DBSCAN (Density-Based Spatial Clustering of Applications with Noise) clustering algorithms, expectation maximization clustering using Gaussian mixture models, and hierarchical clustering algorithms.
In some optional implementations of the present embodiment, the method for clustering features of image feature data to obtain different feature classes by using a K-Means clustering algorithm specifically includes the following steps:
step S301, determining the number K of the clustered categories, and selecting K image features from the image feature data as a clustering center.
In this embodiment, K is the number of categories to be obtained, and may be selected and set as required, where K is a positive integer greater than 0.
Step S302, evaluating a first distance between the image features other than the K selected image features and each cluster center, and classifying the image features according to the first distances to obtain an initial classification result.
In this embodiment, the K selected image features are used as cluster centers. For each of the remaining image features, the first distance to each cluster center is calculated, and the feature is assigned to the set of the cluster center it is closest to.
The first distance between each image feature and each cluster center can be calculated using the Euclidean distance, the Manhattan distance, the cosine distance, and so on. The image features are then classified according to the calculated distance values: after the distances between an image feature and all cluster centers are calculated, they are compared, and the feature is assigned to the set of the cluster center with the smallest distance.
Step S303, calculating the average position of all the image features in each class according to the initial classification result, and determining the average positions as the new cluster centers.
The position of each image feature is expressed in coordinates, and the average of the coordinates of all image features in each class is calculated; this average is the average position.
Step S304, calculating a second distance between the image features outside the new cluster centers and each new cluster center until the cluster centers are converged, and determining K final feature classes.
In this embodiment, steps S303 to S304 are repeated until the cluster centers converge. The cluster centers are considered converged when the distance between a newly calculated cluster center and the center obtained in the previous iteration is smaller than a preset threshold. Feature clustering aggregates highly similar image features, and updating the cluster centers until convergence improves the accuracy of the clustering result.
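A minimal sketch of this K-Means procedure (steps S301-S304) follows, assuming the Euclidean distance and a preset movement threshold `tol` as the convergence test; both choices are illustrative.

```python
import numpy as np

def kmeans(features, k, tol=1e-4, max_iter=100):
    """features: (n, d) array of image features. Returns (centers, labels)
    once every cluster center moves less than tol (steps S301-S304)."""
    rng = np.random.default_rng(0)
    centers = features[rng.choice(len(features), size=k, replace=False)]  # S301
    for _ in range(max_iter):
        # S302/S304: distances to every center; assign each feature to the nearest.
        dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # S303: the average position of each class becomes the new center.
        new_centers = np.array([
            features[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        # Converged when no center moved more than the preset threshold.
        if np.linalg.norm(new_centers - centers, axis=1).max() < tol:
            centers = new_centers
            break
        centers = new_centers
    return centers, labels
```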
Step S204, screening out the pairwise most similar users according to the feature classes to form quadruples.
In some optional implementations of this embodiment, screening out the pairwise most similar users to form quadruples according to the feature classes specifically includes the following steps:
Determining the nearest feature class of each user;
and calculating the class center distance between every two users according to the class center of the feature class closest to the users, and screening out two users with the smallest class center distance difference to form a quadruple.
After feature extraction, the face images of one user may fall into different feature classes when feature clustering is performed. The feature class closest to each user is therefore determined, and the class center of that feature class is used as the user's class center when calculating the class-center distance between every two users.
In a specific implementation manner of this embodiment, determining the feature class closest to each user specifically includes the following steps:
acquiring the image feature quantity of each user in each feature class;
and determining the feature class with the largest number of image features as the feature class closest to the user.
In this embodiment, the filtering out the quadruple according to the feature clustering result may reduce the calculation amount and improve the efficiency.
In this embodiment, the quadruple is represented as (ID1, Class1, ID2, Class2), where ID1 and ID2 are the two users with the smallest class-center distance, and Class1 and Class2 are the nearest feature class centers of ID1 and ID2, respectively.
If the nearest feature class of two or more users is the same class, the average position of all image features of each such user is calculated, the distances between that average position and all cluster centers are then calculated, and the feature class closest to each user is determined from those distances.
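A sketch of step S204 under these definitions follows; the interface (per-user cluster labels from the clustering step and a `centers` array) is an assumption, and the tie-breaking via mean positions described above is omitted for brevity.

```python
from collections import Counter

import numpy as np

def nearest_class(user_labels):
    """Nearest feature class of a user: the class containing the largest
    number of that user's image features (the first-image-feature count)."""
    return Counter(user_labels).most_common(1)[0][0]

def form_quadruples(users, centers):
    """users: {user_id: labels} with labels from the clustering step;
    centers: (k, d) array of final cluster centers. Returns quadruples
    (ID1, Class1, ID2, Class2) pairing users whose nearest class centers
    are closest to each other."""
    nearest = {uid: nearest_class(labels) for uid, labels in users.items()}
    ids = list(users)
    pairs = []
    for i, id1 in enumerate(ids):
        for id2 in ids[i + 1:]:
            d = np.linalg.norm(centers[nearest[id1]] - centers[nearest[id2]])
            pairs.append((d, id1, id2))
    pairs.sort()                      # smallest class-center distance first
    quadruples, used = [], set()
    for d, id1, id2 in pairs:         # greedily pair each user at most once
        if id1 not in used and id2 not in used:
            quadruples.append((id1, nearest[id1], id2, nearest[id2]))
            used.update({id1, id2})
    return quadruples
```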
Step S205, inputting all quadruples into the feature expansion learning model for training.
The feature expansion learning model is implemented with a multi-layer perceptron (MLP). In this embodiment, a three-layer MLP model is used to learn the pattern of image feature variation; it consists of an input layer, a hidden layer, and an output layer. Specifically, the quadruples are input into the three-layer MLP model for training, and the model learns the pattern of feature variation within the same user class, so that the pattern can be transferred to users with few samples for data expansion.
For example, the two users selected to form a quadruple are regarded as one user class; that is, ID1 and ID2 belong to the same user class. The quadruple (ID1, Class1, ID2, Class2) is input into the three-layer MLP model for training, and ID1 and ID2 can then learn all the image features in the feature classes corresponding to Class1 and Class2, achieving the purpose of augmenting the sample features.
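A minimal sketch of such a three-layer MLP in PyTorch follows; the feature dimension, the hidden width, and the regression-style objective that pulls one user's features toward the paired user's are illustrative assumptions, since the text does not fix them.

```python
import torch
from torch import nn

class FeatureExpansionMLP(nn.Module):
    """Three-layer perceptron (input, hidden, output layers) mapping a
    feature of one user in a quadruple toward the paired user's features."""
    def __init__(self, feat_dim=512, hidden_dim=1024):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim),  # input layer -> hidden layer
            nn.ReLU(),
            nn.Linear(hidden_dim, feat_dim),  # hidden layer -> output layer
        )

    def forward(self, x):
        return self.net(x)

# Illustrative training step on one quadruple's feature pairs
# (src drawn from Class1/ID1, tgt from Class2/ID2).
model = FeatureExpansionMLP()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
src, tgt = torch.randn(32, 512), torch.randn(32, 512)  # stand-in feature batch
loss = nn.functional.mse_loss(model(src), tgt)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```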
Step S206, performing feature mapping expansion on the target user by using the trained feature expansion learning model.
Since different feature classes may capture different face variations, such as facial pose, expression, and occlusion, the feature expansion learning model, after learning all the image features of the same user class, outputs image features to the corresponding users for image feature expansion. For example, assume user 1 and user 2 form the same user class, with sample feature set U1 = {S11, S12, S13} for user 1 and U2 = {S21, S22, S23, S24, S25, S26} for user 2. Through the series of steps above, the nearest feature class S1 of user 1 is {S11, S12, S22, S24} and the nearest feature class S2 of user 2 is {S13, S21, S25, S26}. Performing feature mapping expansion on user 1 and user 2 with the trained feature expansion learning model gives the expanded sample feature set U1 = {S11, S12, S13, S21, S22, S24, S25, S26} for user 1 and U2 = {S21, S22, S23, S24, S25, S26, S11, S12, S13} for user 2.
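Continuing the example, a sketch of the mapping step is shown below; treating expansion as passing the partner's nearest-class features through the trained model (`model` from the previous sketch) and appending the outputs is one plausible reading, not the only one.

```python
import torch

def expand_user(model, own_features, partner_class_features):
    """Map the paired user's nearest-class features through the trained
    expansion model and append them to this user's sample feature set."""
    with torch.no_grad():
        mapped = model(torch.as_tensor(partner_class_features, dtype=torch.float32))
    return list(own_features) + [v.numpy() for v in mapped]
```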
It should be noted that the target user is not limited to the N users in the sample data: after the feature expansion learning model has learned the diversity pattern of the image features, image feature expansion can also be performed for users outside the N users who have only a small number of samples.
In this embodiment, sample expansion is performed in the feature space, and single sample set features are expanded by using the intra-class variance of the general data set features in the feature space.
After the quadruples are screened out, the two classes of features in each quadruple serve as the feature subsets selected for each class of single-sample features. All image feature data of one user constitute a single-sample-set feature, denoted f, which contains a different image features; f_i denotes the i-th image feature. S_i denotes the m classes of features selected for f_i, i.e., the feature subsets selected for each class of single-sample features; S_ij denotes the j-th class of features in S_i, and S̄_ij denotes the center feature of the j-th class of features in S_i.
Specifically, S_i is used to expand the single-sample feature f_i, where S_i contains m classes of features (here m is 2) and each class contains n image features, and the intra-class variation of S_ij is used to expand f_i. Regarding image features as vectors in a high-dimensional space, the idea of the expansion is as follows: the center S̄_ij of the set S_ij is taken as a reference feature, the features in S_ij are rotated toward the direction of f_i so that after rotation they correspond to f_i, and the rotated features are taken as the expansion of f_i. Because rotation in a high-dimensional space is highly complex, it is accomplished here by vector addition: for f_i and S_ij, a compensation vector V_ij is first solved such that
f_i = β(S̄_ij + V_ij), with ||V_ij||_2 = 1,
where the compensation vector and the image feature vectors lie on the same hypersphere in feature space, and β is a scaling factor from which the unique solution for V_ij can be found.
The single-sample feature f_i is then augmented as
Ef_ijh = β(S_ijh + V_ij),
where i ∈ [1, a], j ∈ [1, m], h ∈ [1, n], S_ijh is the h-th feature in S_ij, and Ef_ijh denotes the h-th feature sample obtained by expanding the i-th single-sample feature with the j-th class of S_i.
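A numpy sketch of this vector-addition expansion follows; solving f_i = β(S̄_ij + V_ij) under ||V_ij||_2 = 1 reduces to a quadratic in t = 1/β, and the choice of the positive root (and the assumption that a real root exists) is illustrative.

```python
import numpy as np

def compensation_vector(f_i, s_bar):
    """Solve f_i = beta * (s_bar + V) subject to ||V||_2 = 1.
    Substituting t = 1/beta gives ||t*f_i - s_bar||^2 = 1, a quadratic in t."""
    a = f_i @ f_i
    b = -2.0 * (f_i @ s_bar)
    c = s_bar @ s_bar - 1.0
    disc = b * b - 4.0 * a * c            # assumed non-negative for valid inputs
    t = (-b + np.sqrt(disc)) / (2.0 * a)  # take the positive root
    beta = 1.0 / t
    v = f_i / beta - s_bar                # unit-norm compensation vector
    return beta, v

def expand_single_sample(f_i, S_ij):
    """S_ij: (n, d) array, the j-th class of features in S_i.
    Returns the n expanded samples Ef_ijh = beta * (S_ijh + V_ij),
    transferring the intra-class variation of S_ij onto f_i."""
    s_bar = S_ij.mean(axis=0)
    beta, v = compensation_vector(f_i, s_bar)
    return beta * (S_ij + v)
```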
In this embodiment, features of the same class of users can be transferred and learned to users with smaller sample numbers through feature mapping expansion, so as to perform feature expansion.
According to the application, different feature classes are obtained by clustering the extracted image feature data, and two most similar users are selected according to the feature classes to form the quadruple for training, so that the feature expansion is performed on target users with insufficient sample numbers, the effect of increasing the support area in the feature space is achieved, and the accuracy of face recognition can be further improved.
In some optional implementations of this embodiment, after step S206, the electronic device may further perform the following steps:
Step S401, training the classifier with the expanded image feature data, iterating the classifier weights according to the training result, and fixing the classifier weights.
In this embodiment, the expanded image features are reshaped to have the same dimensions as the original features, and the reshaped features are used as input in the classifier training process to train the classifier.
The classifier classifies the image feature data obtained from the feature extraction model. Training the classifier with the expanded image features improves its recognition accuracy and thereby further improves the accuracy of face recognition.
Step S402, retraining a preset feature extraction model by using the expanded image feature data.
And training a preset feature extraction model by using the expanded image feature data, wherein the training process is a deep learning process.
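A sketch of steps S401-S402 in PyTorch follows; freezing the classifier by disabling gradients and using a cross-entropy objective are illustrative choices, and retraining the extractor against the frozen classifier is one plausible reading of step S402.

```python
import torch
from torch import nn

feat_dim, num_classes = 512, 50           # illustrative sizes
classifier = nn.Linear(feat_dim, num_classes)
opt_c = torch.optim.SGD(classifier.parameters(), lr=0.01)

def train_classifier(expanded_feats, labels, epochs=10):
    """S401: train the classifier on expanded features reshaped to the
    original feature dimension, then fix (freeze) its weights."""
    x = torch.as_tensor(expanded_feats, dtype=torch.float32).reshape(-1, feat_dim)
    y = torch.as_tensor(labels, dtype=torch.long)
    for _ in range(epochs):                # iterate the classifier weights
        loss = nn.functional.cross_entropy(classifier(x), y)
        opt_c.zero_grad()
        loss.backward()
        opt_c.step()
    for p in classifier.parameters():      # fix the classifier weights
        p.requires_grad = False

def retrain_extractor(extractor, images, labels, lr=1e-4, epochs=10):
    """S402: retrain the preset feature extractor against the frozen classifier."""
    opt_e = torch.optim.Adam(extractor.parameters(), lr=lr)
    y = torch.as_tensor(labels, dtype=torch.long)
    for _ in range(epochs):
        feats = extractor(images)
        loss = nn.functional.cross_entropy(classifier(feats), y)
        opt_e.zero_grad()
        loss.backward()
        opt_e.step()
```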
According to the application, the classifier is trained with the expanded image feature data, its weights are iterated according to the training result and then fixed, and the preset feature extraction model is retrained with the expanded image feature data. Image features can thus be classified by user identifier, and during subsequent face recognition the corresponding user can be located from the image features, which improves face recognition efficiency.
The application is operational with numerous general purpose or special purpose computer system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
Those skilled in the art will appreciate that all or part of the methods described above may be implemented by computer readable instructions stored in a computer readable storage medium; when executed, the instructions may perform the steps of the method embodiments described above. The storage medium may be a nonvolatile storage medium such as a magnetic disk, an optical disk, or a read-only memory (ROM), or a volatile storage medium such as a random access memory (RAM).
It should be understood that although the steps in the flowcharts of the figures are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the order of execution is not strictly limited, and the steps may be performed in other orders. Moreover, at least some of the steps in the flowcharts may comprise multiple sub-steps or stages that are not necessarily performed at the same moment but may be performed at different moments; their order of execution is not necessarily sequential, and they may be performed in turn or alternately with other steps or with at least a part of the sub-steps or stages of other steps.
With further reference to fig. 5, as an implementation of the method shown in fig. 2, the present application provides an embodiment of a face recognition training apparatus, where the embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 2, and the apparatus may be specifically applied to various electronic devices.
As shown in fig. 5, the face recognition training device according to the present embodiment includes: statistics module 501, feature extraction module 502, clustering module 503, screening module 504, training module 505, and augmentation module 506. Wherein:
The statistics module 501 is used for counting the number of face images of each user in the training data set, and sorting according to the number of the face images;
the feature extraction module 502 is configured to obtain N users ranked in front, and transmit face images corresponding to the N users as sample data to a preset feature extraction model for feature extraction, so as to obtain image feature data;
the clustering module 503 is configured to perform feature clustering on the image feature data to obtain different feature classes;
the screening module 504 is configured to screen out the users that are the most similar to each other to form a quadruple according to the feature class;
the training module 505 is configured to input all quadruples into the feature expansion learning model for training;
the expansion module 506 is configured to perform feature map expansion on the target user using the trained feature expansion learning model.
It should be emphasized that, to further ensure the privacy and security of the image feature data, the image feature data may also be stored in a node of a blockchain.
In some optional implementations of the present embodiment, the clustering module 503 includes a selecting unit, a clustering unit, and a calculating unit;
the selecting unit is used for determining the number K of clusters, and selecting K image features from the image feature data as cluster centers;
The clustering unit is used for evaluating first distances between the image features except the K image features and each clustering center, and classifying the image features according to the first distances to obtain an initial classification result;
the computing unit is used for computing the average position of all image features in each class according to the initial classification result, and determining the average position as a new clustering center; and calculating a second distance between the image features outside the new cluster centers and each new cluster center until the cluster centers are converged, and determining K final feature classes.
In one implementation of this embodiment, the screening module 504 includes a determination sub-module and a screening sub-module;
the determination sub-module is used for determining the feature class closest to each user;
the screening sub-module is used for calculating the class center distance between every two users according to the class center of the feature class closest to the users, and screening out the two users with the smallest class center distance difference to form a quadruple.
In some alternative implementations of the present embodiment, the determination sub-module includes an acquisition unit and a determination unit. The acquisition unit is used for acquiring the first image feature quantity of each user in each feature class; the determination unit is used for determining the feature class with the largest number of first image features as the feature class closest to the user.
In some optional implementations of this embodiment, the apparatus further includes: a matrix transformation module and a classifier training module,
the matrix transformation module is used for carrying out reshape on the expanded image characteristics so that the expanded image characteristics have the same dimension as the original characteristics;
the classifier training module is used for taking the characteristics after reshape as the input of the classifier training process and training the classifier.
With the face recognition training device of this embodiment, the classifier is trained with the expanded image feature data, its weights are iterated according to the training result and then fixed, and the preset feature extraction model is retrained with the expanded image feature data. Image features can thus be classified by user identifier, and during subsequent face recognition the corresponding user can be located from the image features, which improves face recognition efficiency.
In order to solve the technical problems, the embodiment of the application also provides computer equipment. Referring specifically to fig. 6, fig. 6 is a basic structural block diagram of a computer device according to the present embodiment.
The computer device 6 comprises a memory 61, a processor 62, and a network interface 63, communicatively connected to each other via a system bus. It is noted that only a computer device 6 having components 61-63 is shown in the figure, but it should be understood that not all of the illustrated components must be implemented, and more or fewer components may be implemented instead. It will be appreciated by those skilled in the art that the computer device here is a device capable of automatically performing numerical calculation and/or information processing according to predetermined or stored instructions; its hardware includes, but is not limited to, microprocessors, application-specific integrated circuits (ASIC), field-programmable gate arrays (FPGA), digital signal processors (DSP), embedded devices, and so on.
The computer equipment can be a desktop computer, a notebook computer, a palm computer, a cloud server and other computing equipment. The computer equipment can perform man-machine interaction with a user through a keyboard, a mouse, a remote controller, a touch pad or voice control equipment and the like.
The memory 61 includes at least one type of readable storage medium, including flash memory, hard disk, multimedia card, card-type memory (e.g., SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disk, and so on. In some embodiments, the memory 61 may be an internal storage unit of the computer device 6, such as a hard disk or memory of the computer device 6. In other embodiments, the memory 61 may also be an external storage device of the computer device 6, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card provided on the computer device 6. Of course, the memory 61 may also comprise both an internal storage unit and an external storage device of the computer device 6. In this embodiment, the memory 61 is generally used to store the operating system and the various application software installed on the computer device 6, such as the computer readable instructions of the face recognition training method. The memory 61 may also be used to temporarily store various types of data that have been output or are to be output.
The processor 62 may be a central processing unit (Central Processing Unit, CPU), controller, microcontroller, microprocessor, or other data processing chip in some embodiments. The processor 62 is typically used to control the overall operation of the computer device 6. In this embodiment, the processor 62 is configured to execute computer readable instructions stored in the memory 61 or process data, such as computer readable instructions for executing the face recognition training method.
The network interface 63 may comprise a wireless network interface or a wired network interface, which network interface 63 is typically used for establishing a communication connection between the computer device 6 and other electronic devices.
In this embodiment, the steps of the face recognition training method described above are implemented when the processor executes the computer readable instructions stored in the memory: the classifier is trained with the expanded image feature data, its weights are iterated according to the training result and then fixed, and the preset feature extraction model is retrained with the expanded image feature data, so that image features can be classified by user identifier and the corresponding user can be located from the image features during subsequent face recognition, improving face recognition efficiency.
The application also provides another embodiment, namely a computer readable storage medium storing computer readable instructions executable by at least one processor, causing the at least one processor to perform the steps of the face recognition training method described above: the classifier is trained with the expanded image feature data, its weights are iterated according to the training result and then fixed, and the preset feature extraction model is retrained with the expanded image feature data, so that image features can be classified by user identifier and the corresponding user can be located from the image features during subsequent face recognition, improving face recognition efficiency.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to perform the method according to the embodiments of the present application.
It is apparent that the embodiments described above are only some embodiments of the present application, not all of them, and the preferred embodiments are shown in the drawings, which do not limit the scope of the claims. This application may be embodied in many different forms; these embodiments are provided so that the disclosure will be thorough and complete. Although the application has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in the foregoing embodiments or substitute equivalents for some of their features. All equivalent structures made using the content of the specification and drawings of the application, applied directly or indirectly in other related technical fields, likewise fall within the scope of protection of the application.

Claims (7)

1. The face recognition training method is characterized by comprising the following steps of:
counting the number of face images of each user in the training data set, and sorting according to the number of the face images;
acquiring the top-ranked N users, and transmitting face images corresponding to the N users as sample data to a preset feature extraction model for feature extraction to obtain image feature data;
Performing feature clustering on the image feature data to obtain different feature classes;
screening out the pairwise most similar users according to the feature classes to form quadruples;
inputting all quadruples into a feature expansion learning model for training; and
performing feature mapping expansion on the target user by using the trained feature expansion learning model;
the step of clustering the image feature data to obtain different feature classes comprises the following steps:
determining the number K of clusters, and selecting K image features from the image feature data as initial cluster centers;
evaluating first distances between image features except the K image features and each clustering center, and classifying the image features according to the first distances to obtain an initial classification result;
calculating the average position of all image features in each class according to the initial classification result, and determining the average position as a new cluster center;
calculating a second distance between the image features outside the new cluster centers and each new cluster center until the cluster centers are converged, and determining K final feature classes;
the step of screening the users which are the most similar in pairs according to the feature class to form the quadruple comprises the following steps:
Determining the nearest feature class of each user;
calculating the class center distance between every two users according to the class center of the feature class closest to the users, and screening out the two users with the smallest class center distance difference to form a quadruple; the quadruple is expressed in the form (ID1, Class1, ID2, Class2), wherein ID1 and ID2 are the two users with the smallest class-center distance, and Class1 and Class2 are the nearest feature class centers of ID1 and ID2, respectively;
wherein the step of determining the closest feature class for each user comprises:
acquiring a first image feature quantity of each user in each feature class;
and determining the feature class with the largest number of the first image features as the feature class closest to the user.
2. The face recognition training method of claim 1, further comprising, after the step of determining the feature class containing the largest number of first image features as the feature class closest to the user:
when two or more users exist and the nearest feature class is the same class, respectively calculating the average positions of all the image features in each user;
Calculating the distance between the average position and K final clustering centers;
and determining the nearest feature class of the user according to the distance.
3. The face recognition training method of claim 1, further comprising, after the step of feature map augmenting the target user using the trained feature augmentation learning model:
training the classifier by using the expanded image characteristic data, iterating the weight of the classifier according to the training result, and fixing the weight of the classifier;
and retraining a preset feature extraction model by using the expanded image feature data.
4. A face recognition training method according to claim 3, wherein the step of training the classifier using the extended image feature data comprises:
carrying out reshape on the expanded image characteristics to enable the expanded image characteristics to have the same dimension as the original characteristics;
and taking the characteristics after reshape as the input of a classifier training process, and training the classifier.
5. A face recognition training device, comprising:
the statistics module is used for counting the number of face images of each user in the training data set and sequencing the face images according to the number of the face images;
The feature extraction module is used for acquiring N users ranked in front, and transmitting face images corresponding to the N users as sample data to a preset feature extraction model for feature extraction to obtain image feature data;
the clustering module is used for carrying out feature clustering on the image feature data to obtain different feature classes;
the screening module is used for screening out the pairwise most similar users according to the feature classes to form quadruples;
the training module is used for inputting all quadruples into the feature expansion learning model for training; and
The expansion module is used for carrying out feature mapping expansion on the target user by using the trained feature expansion learning model;
the clustering module comprises a selection unit, a clustering unit and a calculation unit:
the selecting unit is used for determining the number K of clusters, and selecting K image features from the image feature data as cluster centers;
the clustering unit is used for evaluating first distances between the image features except the K image features and each clustering center, and classifying the image features according to the first distances to obtain an initial classification result;
the computing unit is used for computing the average position of all image features in each class according to the initial classification result, and determining the average position as a new clustering center; calculating a second distance between the image features outside the new cluster centers and each new cluster center until the cluster centers are converged, and determining K final feature classes;
The screening module includes a determination sub-module and a screening sub-module:
the determining submodule is used for determining the nearest characteristic class of each user;
the screening sub-module is used for calculating the class center distance between every two users according to the class center of the feature class closest to the users, and screening out the two users with the smallest class center distance difference to form a quadruple; the quadruple is expressed in the form (ID1, Class1, ID2, Class2), wherein ID1 and ID2 are the two users with the smallest class-center distance, and Class1 and Class2 are the nearest feature class centers of ID1 and ID2, respectively;
the determination submodule comprises an acquisition unit and a determination unit:
the acquisition unit is used for acquiring the first image feature quantity of each user in each feature class;
the determining unit is used for determining the feature class with the largest number of the first image features as the feature class closest to the user.
6. A computer device comprising a memory having stored therein computer readable instructions which when executed by the processor implement the steps of the face recognition training method of any one of claims 1 to 4.
7. A computer readable storage medium having stored thereon computer readable instructions which when executed by a processor implement the steps of the face recognition training method of any of claims 1 to 4.
CN202011594406.9A 2020-12-29 2020-12-29 Face recognition training method, device, computer equipment and storage medium Active CN112668482B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011594406.9A CN112668482B (en) 2020-12-29 2020-12-29 Face recognition training method, device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011594406.9A CN112668482B (en) 2020-12-29 2020-12-29 Face recognition training method, device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112668482A CN112668482A (en) 2021-04-16
CN112668482B (en) 2023-11-21

Family

ID=75411932

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011594406.9A Active CN112668482B (en) 2020-12-29 2020-12-29 Face recognition training method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112668482B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113177597A (en) * 2021-04-30 2021-07-27 平安国际融资租赁有限公司 Model training data determination method, detection model training method, device and equipment
CN113792679A (en) * 2021-09-17 2021-12-14 深信服科技股份有限公司 Blacklist person identification method and device, electronic equipment and storage medium
CN114155589B (en) * 2021-11-30 2023-08-08 北京百度网讯科技有限公司 Image processing method, device, equipment and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10025950B1 (en) * 2017-09-17 2018-07-17 Everalbum, Inc Systems and methods for image recognition
CN108805216A (en) * 2018-06-19 2018-11-13 合肥工业大学 Face image processing process based on depth Fusion Features
CN110097033A (en) * 2019-05-15 2019-08-06 成都电科智达科技有限公司 A kind of single sample face recognition method expanded based on feature
CN110175555A (en) * 2019-05-23 2019-08-27 厦门市美亚柏科信息股份有限公司 Facial image clustering method and device
CN110516586A (en) * 2019-08-23 2019-11-29 深圳力维智联技术有限公司 A kind of facial image clustering method, system, product and medium
CN111242040A (en) * 2020-01-15 2020-06-05 佳都新太科技股份有限公司 Dynamic face clustering method, device, equipment and storage medium
CN111241992A (en) * 2020-01-08 2020-06-05 科大讯飞股份有限公司 Face recognition model construction method, recognition method, device, equipment and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7208480B2 (en) * 2018-10-12 2023-01-19 富士通株式会社 Learning program, detection program, learning device, detection device, learning method and detection method

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10025950B1 (en) * 2017-09-17 2018-07-17 Everalbum, Inc Systems and methods for image recognition
CN108805216A (en) * 2018-06-19 2018-11-13 合肥工业大学 Face image processing process based on depth Fusion Features
CN110097033A (en) * 2019-05-15 2019-08-06 成都电科智达科技有限公司 A kind of single sample face recognition method expanded based on feature
CN110175555A (en) * 2019-05-23 2019-08-27 厦门市美亚柏科信息股份有限公司 Facial image clustering method and device
CN110516586A (en) * 2019-08-23 2019-11-29 深圳力维智联技术有限公司 A kind of facial image clustering method, system, product and medium
CN111241992A (en) * 2020-01-08 2020-06-05 科大讯飞股份有限公司 Face recognition model construction method, recognition method, device, equipment and storage medium
CN111242040A (en) * 2020-01-15 2020-06-05 佳都新太科技股份有限公司 Dynamic face clustering method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN112668482A (en) 2021-04-16

Similar Documents

Publication Publication Date Title
US12079696B2 (en) Machine learning model training method and device, and expression image classification method and device
Ma et al. Variational Bayesian learning for Dirichlet process mixture of inverted Dirichlet distributions in non-Gaussian image feature modeling
CN112668482B (en) Face recognition training method, device, computer equipment and storage medium
CN108564129B (en) Trajectory data classification method based on generation countermeasure network
CN113435583B (en) Federal learning-based countermeasure generation network model training method and related equipment thereof
CN105469034B (en) Face identification method based on Weighting type distinctive sparse constraint Non-negative Matrix Factorization
CN112528025A (en) Text clustering method, device and equipment based on density and storage medium
CN103605972B (en) Non-restricted environment face verification method based on block depth neural network
WO2022105117A1 (en) Method and device for image quality assessment, computer device, and storage medium
CN112395979B (en) Image-based health state identification method, device, equipment and storage medium
CN111898703B (en) Multi-label video classification method, model training method, device and medium
CN111898550B (en) Expression recognition model building method and device, computer equipment and storage medium
CN111695458A (en) Video image frame processing method and device
WO2022142032A1 (en) Handwritten signature verification method and apparatus, computer device, and storage medium
CN112995414B (en) Behavior quality inspection method, device, equipment and storage medium based on voice call
WO2023231753A1 (en) Neural network training method, data processing method, and device
CN114241459B (en) Driver identity verification method and device, computer equipment and storage medium
CN109726918A (en) The personal credit for fighting network and semi-supervised learning based on production determines method
CN114298122A (en) Data classification method, device, equipment, storage medium and computer program product
CN114282059A (en) Video retrieval method, device, equipment and storage medium
Du et al. Age factor removal network based on transfer learning and adversarial learning for cross-age face recognition
Du et al. Block dictionary learning-driven convolutional neural networks for fewshot face recognition
CN116957036A (en) Training method, training device and computing equipment for fake multimedia detection model
CN115661472A (en) Image duplicate checking method and device, computer equipment and storage medium
CN115392361A (en) Intelligent sorting method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant