CN113657178A - Face recognition method, electronic device and computer-readable storage medium - Google Patents


Info

Publication number
CN113657178A
Authority
CN
China
Prior art keywords
face, sequence, feature, resolution, value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110833405.3A
Other languages
Chinese (zh)
Inventor
王飒
葛主贝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Dahua Technology Co Ltd
Original Assignee
Zhejiang Dahua Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Dahua Technology Co Ltd
Priority to CN202110833405.3A
Publication of CN113657178A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/22 - Matching criteria, e.g. proximity measures
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/25 - Fusion techniques
    • G06F 18/253 - Fusion techniques of extracted features
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/12 - Computing arrangements based on biological models using genetic models
    • G06N 3/126 - Evolutionary algorithms, e.g. genetic algorithms or genetic programming

Abstract

The application discloses a face recognition method, an electronic device, and a computer-readable storage medium. The face recognition method includes: obtaining, from a video stream to be processed, a first face sequence whose resolution is smaller than a first resolution threshold, the first face sequence comprising a plurality of face images of a target to be detected; selecting a subset of face images from the first face sequence based on a genetic algorithm, such that the pairwise similarity between the selected face images is lower than a first similarity threshold; fusing the face features of the selected face images to obtain a fused face feature; and determining, based on the similarity between the fused face feature and a preset face feature, a face recognition result for the face images in the video stream whose resolution is smaller than the first resolution threshold. In this way, the accuracy of face recognition on blurred face images can be improved.

Description

Face recognition method, electronic device and computer-readable storage medium
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a face recognition method, an electronic device, and a computer-readable storage medium.
Background
With the continuous development of image processing technology, higher requirements are placed on the accuracy of face recognition: if the recognition result is not accurate enough, data analysis or personnel deployment based on that result will be misdirected. In the prior art, the accuracy of recognizing a blurred face whose resolution is below a certain threshold is low, which has become a pain point of image processing. In view of this, improving the accuracy of face recognition on blurred face images is an urgent problem to be solved.
Disclosure of Invention
The technical problem mainly solved by the application is to provide a face recognition method, an electronic device, and a computer-readable storage medium that can improve the accuracy of face recognition on blurred face images.
To solve the above technical problem, a first aspect of the present application provides a face recognition method, including: obtaining, from a video stream to be processed, a first face sequence whose resolution is smaller than a first resolution threshold, the first face sequence comprising a plurality of face images of a target to be detected; selecting a subset of face images from the first face sequence based on a genetic algorithm, where the pairwise similarity between the selected face images is lower than a first similarity threshold; fusing the face features of the selected face images to obtain a fused face feature; and determining, based on the similarity between the fused face feature and a preset face feature, a face recognition result for the face images in the video stream whose resolution is smaller than the first resolution threshold.
In order to solve the above technical problem, a second aspect of the present application provides an electronic device, including: a memory and a processor coupled to each other, wherein the memory stores program data, and the processor calls the program data to execute the method of the first aspect.
To solve the above technical problem, a third aspect of the present application provides a computer-readable storage medium having stored thereon program data, which when executed by a processor, implements the method of the first aspect.
The beneficial effects of this application are as follows: a first face sequence whose resolution is smaller than a first resolution threshold is obtained from the video stream to be processed, and a subset of face images is extracted from the first face sequence by a genetic algorithm such that the pairwise similarity between the selected images is lower than a first similarity threshold, thereby obtaining a small yet representative subset of face images. The face features of this subset are fused, so the fused face feature contains the features of the most representative images. When this fused feature is compared against the preset face feature by similarity, the recognition result for face images whose resolution is below the first resolution threshold becomes more accurate, improving the accuracy of face recognition on blurred face images.
Drawings
To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort. Wherein:
FIG. 1 is a schematic flow chart diagram illustrating an embodiment of a face recognition method according to the present application;
FIG. 2 is a schematic flow chart diagram illustrating another embodiment of the face recognition method of the present application;
FIG. 3 is a flowchart illustrating an embodiment corresponding to step S206 in FIG. 2;
FIG. 4 is a schematic structural diagram of an embodiment of an electronic device of the present application;
FIG. 5 is a schematic structural diagram of an embodiment of a computer-readable storage medium according to the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described below clearly and completely with reference to the drawings. The described embodiments are only a part of the embodiments of the present application, not all of them. All other embodiments obtained by a person skilled in the art from these embodiments without creative effort shall fall within the protection scope of the present application.
The terms "system" and "network" are often used interchangeably herein. The term "and/or" herein merely describes an association between objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein generally indicates that the former and latter objects are in an "or" relationship. Further, the term "plurality" herein means two or more.
Referring to fig. 1, fig. 1 is a schematic flow chart of an embodiment of a face recognition method according to the present application, the method including:
S101: Obtain, from a video stream to be processed, a first face sequence whose resolution is smaller than a first resolution threshold, the first face sequence comprising a plurality of face images of a target to be detected.
Specifically, a video stream to be processed, captured by a camera device, is obtained, and the face images belonging to each target to be detected are extracted from it to form face sequences, where face images corresponding to the same target are placed in the same face sequence.
Further, the face image with the highest resolution in each face sequence is obtained; when that highest resolution in a face sequence is smaller than a first resolution threshold, the sequence is taken as the first face sequence. The first resolution threshold may be, for example, 50 PPI, 70 PPI, or 100 PPI (pixels per inch), or may be set by the user; this application does not specifically limit it.
In one application mode, a video stream to be processed is received; the camera device tracks and labels the different targets in the video frames; the video frames are decoded into RGB images; RGB images belonging to the same target to be detected are stored in the same image sequence; a face calibration algorithm locates the faces in each image sequence to obtain a face sequence composed of the face images of one target; and the face sequences whose resolution is smaller than 50 PPI are screened out as first face sequences.
In another application mode, a video stream to be processed is received; video frames are extracted from it; a face detection technique locates the faces in the frames; a face tracking algorithm obtains the trajectory of each target to be detected across the frame sequence; the regions where faces appear are cropped from the frames and decoded into RGB images, yielding a face sequence composed of the face images of one target; and the face sequences whose resolution is smaller than 100 PPI are screened out as first face sequences.
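The screening of step S101 can be sketched as follows. This is an illustrative sketch, not the patent's implementation: the dictionary layout and the `ppi` field are assumptions, and the resolution measure simply stands in for whatever metric the deployment uses.

```python
# Illustrative sketch of S101: keep only the face sequences whose BEST
# image is still below the resolution threshold. Data layout is assumed:
# each sequence is a list of face crops for one tracked subject, and
# "ppi" is a stand-in for the patent's resolution measure.

def resolution_reference(sequence):
    """Highest resolution among the face images in one sequence."""
    return max(face["ppi"] for face in sequence)

def select_low_res_sequences(face_sequences, first_resolution_threshold=50):
    """A sequence qualifies as a 'first face sequence' only if even its
    sharpest image falls below the threshold."""
    return [seq for seq in face_sequences
            if resolution_reference(seq) < first_resolution_threshold]

sequences = [
    [{"ppi": 30}, {"ppi": 42}],   # blurred track: best image is 42 PPI
    [{"ppi": 45}, {"ppi": 80}],   # sharp track: best image is 80 PPI
]
low_res = select_low_res_sequences(sequences, 50)
```

Using the best image as the reference value (as in step S203 below) guarantees that every image in a retained sequence is below the threshold.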
S102: Select a subset of face images from the first face sequence based on a genetic algorithm, where the pairwise similarity between the selected face images is lower than a first similarity threshold.
Specifically, a genetic algorithm screens the face images in the first face sequence and selects a subset of them. The genetic algorithm converts a complex problem-solving process, through mathematical modeling and simulation, into a process resembling biological evolution, so as to obtain a good solution with a high match value. Here the match value is related to the pairwise similarity between face images, so that as few face images as possible are selected from the consecutive face images while the pairwise similarity between the selected images stays below the first similarity threshold. The selected, more representative subset then undergoes feature fusion, which improves both the fusion rate and the accuracy of face recognition.
In one application mode, the face images in the first face sequence are encoded by the genetic algorithm: initial feature values are randomly generated for the face images to obtain multiple groups of candidate feature sequences, each feature sequence holding, per face image, a feature value that marks whether the corresponding image is selected. The match value of each candidate feature sequence is computed with a match-value function; at least some candidates are retained based on their match values; crossover and/or mutation operations on the candidates generate new candidate feature sequences; and after several iterations the feature values in the candidates tend toward the optimal solution. The feature sequence with the highest match value is then obtained, and the subset of face images is selected from the first face sequence according to its feature values, yielding the more representative face images.
S103: Fuse the face features of the selected face images to obtain a fused face feature.
Specifically, the face features corresponding to the selected face images are obtained and fused into a fused face feature. The selected face images were captured at different times, and their pairwise similarity is below the first similarity threshold, so the fused face feature contains the features of representative face images across different time steps.
In one application mode, a long short-term memory (LSTM) network fuses the face features of the selected images into the fused face feature. An LSTM network can retain features across long time spans and fuse features from different time steps.
Specifically, the LSTM network is trained in advance: a plurality of face images whose resolution is smaller than the first resolution threshold correspond to one target, and the preset face feature extracted from a face image of that same target whose resolution is larger than the first resolution threshold also corresponds to that target. The LSTM is trained on the low-resolution face images so that, after iterative updates, the fused face feature it outputs approaches the preset face feature of the same target, with the similarity gap between the two smaller than a preset value. The trained LSTM then fuses the face features of the selected images into the fused face feature, which improves recognition accuracy and reduces the difficulty of enrolling blurred face images into a database.
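To make the fusion step concrete, the sketch below runs per-image features through a minimal LSTM cell written in NumPy and takes the final hidden state as the fused feature. This is illustrative only: the cell is untrained with randomly initialized weights, whereas the patent's network would be trained as described above; all names and sizes are assumptions.

```python
import numpy as np

# Minimal untrained LSTM cell showing how features extracted from the
# selected face images, consumed in capture order, collapse into one
# fused feature (the final hidden state). Standard sigmoid/tanh gating;
# weights are random here, learned in the patent's scheme.

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_fuse(features, hidden_size=8, seed=0):
    """features: (T, D) array, one D-dim face feature per time step."""
    T, D = features.shape
    rng = np.random.default_rng(seed)
    # One weight matrix and bias per gate: input, forget, cell, output.
    W = {g: rng.normal(scale=0.1, size=(hidden_size, D + hidden_size))
         for g in "ifco"}
    b = {g: np.zeros(hidden_size) for g in "ifco"}
    h = np.zeros(hidden_size)
    c = np.zeros(hidden_size)
    for t in range(T):
        z = np.concatenate([features[t], h])      # input + previous state
        i = sigmoid(W["i"] @ z + b["i"])          # input gate
        f = sigmoid(W["f"] @ z + b["f"])          # forget gate
        g = np.tanh(W["c"] @ z + b["c"])          # candidate cell state
        o = sigmoid(W["o"] @ z + b["o"])          # output gate
        c = f * c + i * g
        h = o * np.tanh(c)
    return h  # fused face feature

# Five 16-dim face features fused into one 8-dim vector.
fused = lstm_fuse(np.random.default_rng(1).normal(size=(5, 16)))
```

In practice a deep-learning framework's LSTM layer would be used; the point here is only that the recurrence naturally accumulates information across the time-ordered face images.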
S104: Determine, based on the similarity between the fused face feature and the preset face feature, the face recognition result for the face images in the video stream whose resolution is smaller than the first resolution threshold.
Specifically, the similarity between the fused face feature and each preset face feature is computed. If the similarity exceeds a preset threshold, the target corresponding to the fused face feature is determined to be the same as the target corresponding to that preset face feature. If no preset face feature with similarity above the threshold is found, the fused face feature is discarded, or stored for a period and compared again later for confirmation. The face recognition result for the face images in the video stream whose resolution is below the first resolution threshold is thereby determined.
In one application mode, the preset face feature is extracted by a convolutional neural network from a face image whose resolution is larger than the first resolution threshold: such face images are extracted from a video stream, and the face feature of the highest-resolution image in the face sequence of each target is taken as that target's preset face feature.
In another application mode, the preset face feature is extracted by a convolutional neural network from an ID photo whose resolution is larger than the first resolution threshold, where the ID photo's resolution is at least twice the first resolution threshold; the face feature of the ID photo is taken as the preset face feature.
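The matching decision of S104 can be sketched as below. The patent does not name a similarity measure; cosine similarity is used here as a common stand-in, and the gallery layout, names, and threshold value are all illustrative assumptions.

```python
import numpy as np

# Illustrative sketch of S104: compare the fused feature against preset
# (gallery) features; accept the best match above a preset threshold,
# otherwise report no match (the patent then discards the feature or
# stores it for a later retry). Cosine similarity is an assumption.

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def recognize(fused_feature, preset_features, preset_threshold=0.8):
    """Return the identity of the best match above the threshold, else None."""
    best_id, best_sim = None, preset_threshold
    for identity, feat in preset_features.items():
        sim = cosine_similarity(fused_feature, feat)
        if sim > best_sim:
            best_id, best_sim = identity, sim
    return best_id

gallery = {"person_a": np.array([1.0, 0.0, 0.0]),
           "person_b": np.array([0.0, 1.0, 0.0])}
match = recognize(np.array([0.9, 0.1, 0.0]), gallery)      # close to person_a
no_match = recognize(np.array([0.5, 0.5, 0.7]), gallery)   # close to nobody
```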
In the above scheme, a first face sequence whose resolution is smaller than the first resolution threshold is obtained from the video stream to be processed, and a subset of face images whose pairwise similarity is below the first similarity threshold is extracted from it with a genetic algorithm, yielding a small yet representative subset. The face features of this subset are fused, so the fused face feature contains the features of the most representative images; when it is compared against the preset face features by similarity, the recognition result for the face images whose resolution is below the first resolution threshold becomes more accurate, improving the accuracy of face recognition on blurred face images.
Referring to fig. 2, fig. 2 is a schematic flow chart of another embodiment of the face recognition method of the present application, including:
S201: Parse, from the video stream to be processed, an image sequence composed of a plurality of images.
Specifically, a video stream to be processed is received, a plurality of video frames are extracted from it, and the frames are decoded to obtain an image sequence composed of a plurality of images.
In one application mode, key frame data is extracted from the video stream to be processed and used as the image sequence, reducing the data volume of the image sequence.
In another application mode, key frames are extracted from the video stream to be processed, at least some of the difference frames are decoded with reference to the key frames, and all obtained video frames are combined into the image sequence, yielding more detailed data in the temporal dimension.
Optionally, before the step of selecting a subset of face images from the first face sequence with a genetic algorithm, the method further includes: obtaining the similarity between face images in the first face sequence, dividing the face images into a plurality of image groups based on that similarity, extracting a representative face image from each group, and updating the first face sequence. The similarity between face images within the same group is greater than a second similarity threshold, the similarity between face images in different groups is less than a third similarity threshold, and the third similarity threshold is less than or equal to the second similarity threshold.
Specifically, the similarity between the face images in the first face sequence is obtained, and the face images are divided into groups by a clustering algorithm: images within a group have similarity above the second similarity threshold (strong mutual similarity), while images in different groups have similarity below the third similarity threshold (a degree of discrimination). A representative face image is selected from each group to form a simplified first face sequence, which removes redundant information to some extent and improves subsequent processing efficiency.
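The patent leaves the clustering algorithm unspecified; as a simple stand-in, the sketch below groups faces greedily against existing representatives. All names, the scalar "faces", and the toy similarity function are illustrative assumptions, not the patent's method.

```python
# Greedy stand-in for the optional grouping step: a face joins an existing
# group when its similarity to that group's representative exceeds the
# second similarity threshold; otherwise it opens a new group. One
# representative per group survives into the simplified first face sequence.

def deduplicate_sequence(faces, similarity, second_threshold=0.9):
    representatives = []
    for face in faces:
        if not any(similarity(face, rep) > second_threshold
                   for rep in representatives):
            representatives.append(face)
    return representatives

# Toy example: faces are scalars, similarity = 1 - |difference|.
faces = [0.10, 0.12, 0.50, 0.52, 0.90]
sim = lambda a, b: 1.0 - abs(a - b)
reps = deduplicate_sequence(faces, sim, second_threshold=0.9)
```

A real implementation would use a proper clustering algorithm over feature vectors, but the effect is the same: near-duplicate frames collapse to one representative before the genetic search runs.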
S202: Extract, with a face tracking algorithm, the face images in the image sequence that belong to the same target to be detected, obtaining a plurality of face sequences corresponding to different targets to be detected.
Specifically, a face tracking algorithm labels the faces in the image sequence that belong to the same target to be detected, and the corresponding face images are extracted. Each target corresponds to a face sequence of its face images in temporal order, so different targets have their own face sequences.
S203: Determine the resolution reference value of each of the plurality of face sequences, and judge whether it is smaller than the first resolution threshold.
Specifically, the resolution reference value is the highest resolution among the face images in the corresponding face sequence. The highest-resolution face image in each sequence is extracted, and whether each sequence's resolution reference value is smaller than the first resolution threshold is judged; if so, proceed to step S204, otherwise proceed to step S210.
S204: Determine the face sequences whose resolution reference value is smaller than the first resolution threshold as first face sequences.
Specifically, when the resolution reference value is smaller than the first resolution threshold, the resolutions of all face images in that sequence are smaller than the first resolution threshold, and such a sequence is taken as a first face sequence.
S205: and randomly generating initial characteristic values for the face images in the first face sequence for multiple times, sequencing the characteristic values generated each time respectively to be used as a characteristic sequence, and obtaining a characteristic sequence set consisting of a plurality of incompletely identical characteristic sequences.
Specifically, the feature values include positive values and negative values, and the first numerical value of the positive values in each feature sequence is the same. The positive values indicate that the corresponding face images are selected, the negative values indicate that the corresponding face images are not selected, initial characteristic values are randomly generated for the face images in the first face sequence, after all the face images in the first face sequence generate the corresponding characteristic values each time, the characteristic values are sequenced to obtain one characteristic sequence, and a characteristic sequence set consisting of a plurality of incompletely identical characteristic sequences is obtained.
In an application mode, all the face images in the first face sequence are sequentially described as a character string composed of N0/1 characters, 0 indicates that the face image is not selected, 1 indicates that the face image is selected, and the character string is taken as a feature sequence. A plurality of characteristic sequences are randomly generated based on the method, and each characteristic sequence is not identical, so that corresponding characteristic values of the same face image in different characteristic sequences are not identical.
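A minimal sketch of this encoding, with illustrative names: each chromosome is an N-character 0/1 string over the first face sequence, and every chromosome carries the same number of 1s, matching the fixed-count constraint of S205.

```python
import random

# Illustrative encoding for S205: one 0/1 string per candidate solution,
# with a fixed number of selected images (1-bits) in every chromosome.

def random_chromosome(n_faces, n_selected, rng):
    """Random selection of n_selected images out of n_faces, as a string."""
    bits = ["1"] * n_selected + ["0"] * (n_faces - n_selected)
    rng.shuffle(bits)
    return "".join(bits)

rng = random.Random(42)
# Initial population: 6 candidate selections of 3 images out of 10.
population = [random_chromosome(10, 3, rng) for _ in range(6)]
```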
S206: Obtain the match value of each feature sequence in the feature sequence set, and perform multiple crossover and mutation operations on the feature values in the sequences based on the match values to obtain a target feature sequence set.
Specifically, a match-value function is obtained and used to determine the match value of each feature sequence in the current set; feature sequences are selected according to their match values for crossover and mutation, updating the set, and the match values are recomputed after each update. After multiple iterations, the feature values in the sequences tend toward the optimal solution of the match-value function, so that the subset of face images selected from the resulting target feature sequence set has a higher match value and is more representative.
In one application manner, referring to fig. 3, which is a flowchart of an embodiment corresponding to step S206 in fig. 2, step S206 specifically includes:
S301: Obtain the average similarity between the face images corresponding to the positive values in each feature sequence, determine the feature sequence's match value from that average, and generate each feature sequence's selection probability from its match value.
Specifically, the face images corresponding to positive values in a feature sequence are extracted, and the pairwise similarity between the selected images in the same feature sequence is obtained, from which the average similarity of that sequence is determined. The similarity is a value between 0 and 1: the more similar the features of two face images, the higher it is. The similarity between face images may be computed, for example, from the Euclidean distance between pixels or the distance after PCA dimensionality reduction. The higher a feature sequence's average similarity, the closer the features of its selected images are, and the lower their discrimination.
Further, the match value of a feature sequence is 1 minus its average similarity. The goal of the genetic algorithm is to obtain more representative face images, and the pairwise similarity between the finally selected images must be below the first similarity threshold; therefore, the larger the average similarity of a feature sequence, the smaller its match value, and vice versa. A larger match value means higher discrimination between the images in the sequence and a result closer to the ideal. The selection probability of each feature sequence is the ratio of its match value to the sum of the match values of all sequences in the current set; a larger match value thus yields a larger selection probability, so that after multiple iterations the sampled feature sequences tend toward the optimal solution.
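The match value and selection probabilities described above can be sketched directly (fitness-proportionate, or "roulette-wheel", selection). Names and the toy similarity matrix are illustrative; chromosomes use the 0/1 string encoding of S205.

```python
from itertools import combinations

# Sketch of S301: match value = 1 - mean pairwise similarity of the
# selected images; selection probability = this chromosome's match value
# over the sum of match values in the population.

def match_value(chromosome, sim_matrix):
    chosen = [i for i, bit in enumerate(chromosome) if bit == "1"]
    pairs = list(combinations(chosen, 2))
    if not pairs:                      # fewer than two images selected
        return 1.0
    mean_sim = sum(sim_matrix[i][j] for i, j in pairs) / len(pairs)
    return 1.0 - mean_sim

def selection_probabilities(population, sim_matrix):
    values = [match_value(c, sim_matrix) for c in population]
    total = sum(values)
    return [v / total for v in values]

# Images 0 and 1 are near-duplicates; image 2 is distinct.
sim_matrix = [[1.0, 0.9, 0.2],
              [0.9, 1.0, 0.3],
              [0.2, 0.3, 1.0]]
population = ["110", "101"]            # select {0,1} vs {0,2}
probs = selection_probabilities(population, sim_matrix)
```

The diverse selection {0,2} gets a much higher probability than the redundant {0,1}, which is exactly the pressure that drives the search toward representative subsets.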
S302: and extracting at least one characteristic sequence as a sampling characteristic sequence based on the selection probability, and performing cross and variation operation on the characteristic values of the sampling characteristic sequence to obtain a new characteristic sequence set.
Specifically, based on the selection probability corresponding to each feature sequence, at least one feature sequence is selected from the current feature sequence set as a sampling feature sequence, the feature values of the sampling feature sequence are subjected to crossover and mutation operations, and the updated feature sequence is added into the original feature sequence set to obtain a new feature sequence set.
In an application scene, averagely dividing a sampling characteristic sequence into two first characteristic sequence sets, and crossing at least partial characteristic sequence segments in the two first characteristic sequence sets to enable characteristic values on the corresponding characteristic sequence segments to be interchanged; after the cross operation is carried out, the number of positive values and the number of negative values in the two first characteristic sequence sets are kept unchanged; randomly selecting a first characteristic value at a position from the sampling characteristic sequence based on the variation probability for inversion, and randomly selecting a characteristic value with a value opposite to that of the first characteristic value for inversion after the first characteristic value is inverted.
Specifically, the selected sampling feature sequence is averagely divided into two parts, the two parts are used as first feature sequence sets, the two first feature sequence sets are randomly disturbed, then one feature sequence is respectively selected from the two first feature sequence sets to be crossed, a feature sequence segment (with the length of k) is randomly selected from the feature sequence of one of the first feature sequence sets, the number of positive values in the segment is counted, then the feature sequence segments with the same length and the same number of positive values are searched from the other first feature sequence set to be crossed, so that the feature values corresponding to the two feature sequence segments are exchanged, the two first feature sequence sets are combined after the crossing operation is completed, the number of the positive values and the number of the negative values in the two first feature sequence sets are kept unchanged after the crossing operation is performed, and the optimization is performed on the basis of the existing positive values in each sampling feature sequence, the feature value at a position that is more representative and has a higher matching value is set to a positive value.
Further, mutation operation is to change the characteristic value at a certain position in the sampling characteristic sequence. And selecting whether to perform mutation operation according to the mutation probability, if so, randomly selecting a sampling characteristic sequence from the current characteristic sequence set, then randomly selecting a first characteristic value at a certain position of the sampling characteristic sequence to invert, if the first characteristic value is changed from a positive value to a negative value, then randomly selecting a characteristic value at a position to change from a negative value to a positive value, and otherwise, executing the same operation until all the characteristic sequences in the current characteristic sequence set are traversed, and keeping the number of the positive values and the number of the negative values in the current characteristic sequence set unchanged after the mutation operation, so that each sampling characteristic sequence is optimized on the basis of the existing positive values, and setting the characteristic value at a position which is more representative and has a higher matching value as a positive value.
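The count-preserving crossover and mutation described above can be sketched as follows. This is a simplified illustration (single-pair operations, first matching segment), with assumed names; the invariant it demonstrates is the one the patent requires: the number of 1s in each chromosome never changes.

```python
import random

# Sketch of the count-preserving operators of S302, on 0/1 string
# chromosomes (1 = face image selected).

def mutate(chromosome, rng):
    """Invert one 1-bit and one 0-bit, so the number of selected images
    stays fixed."""
    bits = list(chromosome)
    ones = [i for i, b in enumerate(bits) if b == "1"]
    zeros = [i for i, b in enumerate(bits) if b == "0"]
    i, j = rng.choice(ones), rng.choice(zeros)
    bits[i], bits[j] = "0", "1"
    return "".join(bits)

def crossover(a, b, k, rng):
    """Swap length-k segments that carry the same number of 1s; if no
    such segment exists in b, return the parents unchanged."""
    start = rng.randrange(len(a) - k + 1)
    seg = a[start:start + k]
    target_ones = seg.count("1")
    for s in range(len(b) - k + 1):
        if b[s:s + k].count("1") == target_ones:
            new_a = a[:start] + b[s:s + k] + a[start + k:]
            new_b = b[:s] + seg + b[s + k:]
            return new_a, new_b
    return a, b

rng = random.Random(0)
child = mutate("11000", rng)
a2, b2 = crossover("11000", "00110", 2, rng)
```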
S303: judging whether the number of returns exceeds a first number threshold.
Specifically, it is judged whether the current number of returns exceeds the first number threshold. If so, the process proceeds to step S304. If not, the loop iterates: the process returns to the step of obtaining the average value of the similarity between the face images corresponding to the positive values in each feature sequence, determining the matching value of each feature sequence based on the average value, and generating the selection probability of each feature sequence based on the matching value, and the return count is incremented.
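The matching-value and selection-probability computation driving this loop can be sketched as follows. Treating feature sequences as 0/1 lists and mapping a lower average pairwise similarity to a higher matching value are assumptions here, since the text leaves the exact mapping from average similarity to matching value open:

```python
import itertools

def matching_value(seq, sim):
    """Matching value of one feature sequence: taken here as 1 minus the
    average pairwise similarity of the face images at positive positions,
    so that more diverse selections score higher (an assumption)."""
    pos = [i for i, v in enumerate(seq) if v == 1]
    pairs = list(itertools.combinations(pos, 2))
    if not pairs:
        return 1.0                       # a single image has no redundancy
    avg = sum(sim[i][j] for i, j in pairs) / len(pairs)
    return 1.0 - avg

def selection_probabilities(population, sim):
    """Fitness-proportional (roulette-wheel) selection probabilities."""
    values = [matching_value(seq, sim) for seq in population]
    total = sum(values)
    return [v / total for v in values]
```

A sequence selecting near-duplicate images gets a low matching value and is correspondingly less likely to be sampled for crossover and mutation.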
S304: obtaining a target feature sequence set.
Specifically, the target feature sequence set is obtained after the feature sequence set has been updated and optimized: the feature sequence set output once the number of returns exceeds the first number threshold is taken as the target feature sequence set.
S207: obtaining the matching value corresponding to each feature sequence in the target feature sequence set, and, in the feature sequence with the highest matching value, selecting the face images corresponding to positive values as the partial face images.
Specifically, the feature value at each position in the feature sequence with the highest matching value in the target feature sequence set is obtained, and at least part of the face images are extracted from the first face sequence based on the positive feature values, so that after the face sequence is simplified, the face images with higher matching values are obtained.
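Step S207 can be illustrated with a small helper (hypothetical Python; the names and the 0/1 encoding of feature values are assumptions). Matching values are assumed to have been computed already for each sequence in the target set:

```python
def select_partial_faces(face_sequence, target_set, match_values):
    """Given matching values for each feature sequence in the target set,
    keep only the face images at the positive (1) positions of the
    sequence with the highest matching value."""
    best = max(zip(target_set, match_values), key=lambda t: t[1])[0]
    return [img for img, v in zip(face_sequence, best) if v == 1]
```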
S208: fusing the face features of the partial face images to obtain a fused face feature.
Optionally, before the step of fusing the face features of the partial face images to obtain the fused face feature, the method further includes: extracting, by using a feature extraction model, the face features respectively corresponding to the partial face images.
Specifically, the face features respectively corresponding to the partial face images are obtained through the feature extraction model. When the partial face images comprise N face images, an N-dimensional face feature sequence is obtained after they pass through the feature extraction model, so that the face feature corresponding to each face image in the partial face images is fully mined.
Further, the step of fusing the face features of the partial face images to obtain the fused face feature includes: fusing the face features respectively corresponding to the partial face images by using a long short-term memory (LSTM) network to obtain the fused face feature.
Specifically, the long short-term memory network is provided with a memory cell, a forget gate, an input gate and an output gate, and can remember input features over long spans. When the LSTM network is used to fuse the face feature sequence corresponding to the partial face images, the face features at different time steps in that sequence can be fused, so that the resulting fused face feature contains face features from different time steps. This strengthens the features of low-resolution, blurred face images and improves the accuracy of the recognition result.
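As an illustration only, a minimal LSTM cell with the gates described above is sketched below in pure Python. A real implementation would use a trained network in a deep-learning framework; the random weights here are merely a stand-in, and all names are assumptions:

```python
import math
import random

def _sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class TinyLSTM:
    """Minimal single-layer LSTM (forget/input/output gates plus a memory
    cell) showing how per-frame face features can be fused over the time
    dimension; the final hidden state serves as the fused feature."""
    def __init__(self, dim, hidden, seed=0):
        rng = random.Random(seed)
        def w():
            return [[rng.uniform(-0.1, 0.1) for _ in range(dim + hidden)]
                    for _ in range(hidden)]
        self.Wf, self.Wi, self.Wo, self.Wc = w(), w(), w(), w()
        self.hidden = hidden

    def fuse(self, features):
        h = [0.0] * self.hidden          # hidden state
        c = [0.0] * self.hidden          # memory cell
        for x in features:               # one time step per face image
            z = x + h                    # concatenated input
            dot = lambda W, k: sum(W[k][m] * z[m] for m in range(len(z)))
            for k in range(self.hidden):
                f = _sigmoid(dot(self.Wf, k))    # forget gate
                i = _sigmoid(dot(self.Wi, k))    # input gate
                o = _sigmoid(dot(self.Wo, k))    # output gate
                g = math.tanh(dot(self.Wc, k))   # candidate memory
                c[k] = f * c[k] + i * g
                h[k] = o * math.tanh(c[k])
        return h                         # fused face feature
```

The forget gate decides how much of the previous memory to keep, which is what lets features from earlier frames survive into the final fused representation.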
S209: determining the face recognition result corresponding to the face images whose resolution is smaller than the first resolution threshold in the video stream to be processed based on the similarity between the fused face feature and the preset face feature.
Specifically, the fused face feature is compared with the preset face feature based on their similarity to obtain the face recognition result corresponding to the face images whose resolution is smaller than the first resolution threshold. If they match, the identity of the corresponding target is output; if not, a matching-failure prompt is output.
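The comparison step can be sketched as follows (Python; cosine similarity and the gallery/threshold names are assumptions, since the patent does not fix the similarity measure):

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def recognize(fused_feature, gallery, threshold):
    """Compare the fused feature against preset features and return the
    best-matching identity, or None when no similarity clears the
    threshold (i.e. matching failed)."""
    best_id, best_sim = None, threshold
    for identity, preset in gallery.items():
        sim = cosine_similarity(fused_feature, preset)
        if sim >= best_sim:
            best_id, best_sim = identity, sim
    return best_id
```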
It should be noted that the feature extraction model is obtained by pre-training on face images of different resolutions; the long short-term memory network is obtained by pre-training on face images whose number is a first numerical value and whose resolution is smaller than the first resolution threshold; the preset face feature is obtained by passing a face image whose resolution is greater than the first resolution threshold through the feature extraction model; and the loss value between the first fused face feature output by the trained long short-term memory network and the preset face feature derived from the same target is smaller than a first loss threshold.
Specifically, a large number of face images with different resolutions are used to train a convolutional neural network, so that the network parameters are adjusted under the supervision of a loss function; training ends when the loss converges below a preset value, yielding the trained feature extraction model. The face features corresponding to face images whose resolution is greater than the first resolution threshold are then extracted with the trained feature extraction model to obtain the preset face features.
Further, a large number of blurred face images with resolutions below the first resolution threshold are prepared. For the image sequence of each target, multiple groups of image sequences are randomly selected to form the original image sequence data IMG for long short-term memory network training, and the trained feature extraction model is used to extract the face feature sequences corresponding to IMG as training data. The prepared training data, whose number is the first numerical value, are fed into the LSTM network for training; the loss function is the loss value of the similarity between the first fused face feature output by the LSTM network and the preset face feature derived from the same target. Through iterative loops, the network parameters are adjusted under the supervision of the loss function, and training ends when the loss converges below a preset value, yielding the trained LSTM network. The first fused face feature output by the trained network thus fits the preset face feature, extracted by the feature extraction model, that derives from the same target, and the loss value between the two is smaller than the first loss threshold. As a result, after feature fusion is performed with the LSTM network on blurred face images whose resolution is below the first resolution threshold, the corresponding fused face features are more accurate.
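The similarity-based training loss can be illustrated as follows. Using 1 minus cosine similarity as the loss is one common choice and an assumption here; the patent only states that the loss measures the similarity between the fused feature and the same-target preset feature:

```python
import math

def fusion_loss(fused, preset):
    """Training loss for the fusion network: 1 minus the cosine
    similarity between the fused feature and the preset feature from
    the same target, so the loss shrinks as the two features fit."""
    dot = sum(a * b for a, b in zip(fused, preset))
    norm = (math.sqrt(sum(a * a for a in fused))
            * math.sqrt(sum(b * b for b in preset)))
    return 1.0 - dot / norm

def converged(loss_history, preset_value):
    """Training stops once the loss has converged below a preset value."""
    return bool(loss_history) and loss_history[-1] < preset_value
```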
It should be noted that the number of training samples used by the long short-term memory network during training is the first numerical value, and the number of positive values in each feature sequence is also the first numerical value. Because the number of positive values is kept unchanged during the subsequent crossover and mutation operations, the number of positive values in the feature sequences of the target feature sequence set is always the first numerical value. The number of selected partial face images is therefore the first numerical value, matching the input size used by the LSTM network in the training phase, so that the resulting fused face feature achieves the best fitting effect and the loss value between the fused face feature and the preset face feature derived from the same target is smaller than the first loss threshold.
S210: determining a face sequence whose resolution reference value is greater than or equal to the first resolution threshold as a second face sequence.
Specifically, a face sequence whose highest-resolution face image has a resolution greater than or equal to the first resolution threshold is obtained and taken as the second face sequence.
S211: determining the face recognition result corresponding to the face images whose resolution is greater than or equal to the first resolution threshold in the video stream to be processed based on the similarity between the face feature corresponding to the face image with the highest resolution in the second face sequence and the preset face feature.
Specifically, feature extraction is performed on the face image with the highest resolution in the second face sequence by using the trained feature extraction model to obtain the corresponding face feature. The face recognition result corresponding to the highest-resolution face image in the video stream to be processed is then determined based on the similarity between this face feature and the preset face feature: if they match, the identity of the corresponding target is output; if not, the face feature corresponding to the highest-resolution face image is stored in a database for subsequent comparison.
In this embodiment, a resolution reference value is determined for each face sequence extracted from the video stream, the resolution reference value being the highest resolution among the face images in the corresponding face sequence. When the resolution reference value is greater than or equal to the first resolution threshold, the feature extraction model is used directly to extract the face feature of the highest-resolution face image in the face sequence for comparison with the preset face feature. When the resolution reference value is smaller than the first resolution threshold, the corresponding first face sequence is preliminarily simplified, a genetic algorithm is used to extract the partial face images with higher matching values, and the long short-term memory network then fuses the face features corresponding to the partial face images into a fused face feature that is compared with the preset face feature. Face images of different resolutions in the video stream are thus handled compatibly, and the accuracy of face recognition for blurred face images with resolutions below the first resolution threshold is improved.
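The resolution-based branching of this embodiment can be summarized in a small sketch (Python; the function and branch names are hypothetical):

```python
def route_face_sequence(face_resolutions, first_resolution_threshold):
    """Route a face sequence by its resolution reference value (the
    highest resolution among its face images): below the threshold the
    sequence goes to the genetic-algorithm selection plus LSTM fusion
    branch; otherwise the single sharpest image is compared directly."""
    ref = max(face_resolutions)          # resolution reference value
    if ref < first_resolution_threshold:
        return "fuse_low_resolution"     # GA selection + LSTM fusion
    return "use_highest_resolution"      # direct feature comparison
```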
Referring to fig. 4, fig. 4 is a schematic structural diagram of an embodiment of an electronic device 40 of the present application, where the electronic device includes a memory 401 and a processor 402 coupled to each other, where the memory 401 stores program data (not shown), and the processor 402 calls the program data to implement the method in any of the embodiments described above, and the description of the related contents refers to the detailed description of the embodiments of the method described above, which is not repeated herein.
Referring to fig. 5, fig. 5 is a schematic structural diagram of an embodiment of a computer-readable storage medium 50 of the present application, the computer-readable storage medium 50 stores program data 500, and the program data 500 is executed by a processor to implement the method in any of the above embodiments, and the related contents are described in detail with reference to the above method embodiments and will not be described in detail herein.
It should be noted that, units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be substantially implemented or contributed to by the prior art, or all or part of the technical solution may be embodied in a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor (processor) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above description is only for the purpose of illustrating embodiments of the present application and is not intended to limit the scope of the present application, and all modifications of equivalent structures and equivalent processes, which are made by the contents of the specification and the drawings of the present application or are directly or indirectly applied to other related technical fields, are also included in the scope of the present application.

Claims (10)

1. A face recognition method, comprising:
obtaining a first face sequence with the resolution smaller than a first resolution threshold from a video stream to be processed, wherein the first face sequence comprises a plurality of face images of a target to be detected;
selecting a part of face images from the first face sequence based on a genetic algorithm; the similarity between every two of the partial face images is lower than a first similarity threshold value;
fusing the face features of the partial face images to obtain fused face features;
and determining a face recognition result corresponding to the face image with the resolution smaller than the first resolution threshold value in the video stream to be processed based on the similarity between the fused face feature and the preset face feature.
2. The method of claim 1, wherein the step of selecting a partial face image from the first face sequence based on a genetic algorithm comprises:
randomly generating initial feature values for the face images in the first face sequence multiple times, sorting the feature values generated each time as a feature sequence, and obtaining a feature sequence set consisting of a plurality of not-completely-identical feature sequences; wherein the feature values comprise positive values and negative values, and the number of the positive values in each of the feature sequences is a same first numerical value;
obtaining a matching value corresponding to each feature sequence in the feature sequence set, and performing crossover and mutation operations multiple times on the feature values in the feature sequences in the feature sequence set based on the matching values to obtain a target feature sequence set;
and obtaining a matching value corresponding to the feature sequence in the target feature sequence set, and selecting the face image corresponding to the positive value as the partial face image in the feature sequence with the highest matching value.
3. The face recognition method according to claim 2, wherein the step of obtaining a matching value corresponding to each of the feature sequences in the feature sequence set, and performing crossover and mutation operations multiple times on the feature values in the feature sequences in the feature sequence set based on the matching values to obtain a target feature sequence set comprises:
obtaining an average value of the similarity between the face images corresponding to the positive values in each feature sequence, determining the matching value of each feature sequence based on the average value, and generating the selection probability of each feature sequence based on the matching value;
extracting at least one feature sequence as a sampling feature sequence based on the selection probability, and performing crossover and mutation operations on the feature values of the sampling feature sequences to obtain a new feature sequence set; and
returning to the step of obtaining the average value of the similarity between the face images corresponding to the positive values in each feature sequence, determining the matching value of each feature sequence based on the average value, and generating the selection probability of each feature sequence based on the matching value, until the number of returns exceeds a first number threshold, thereby obtaining the target feature sequence set.
4. The face recognition method according to claim 3, wherein the step of performing crossover and mutation operations on the feature values of the sampling feature sequences to obtain a new feature sequence set comprises:
averagely dividing the sampling feature sequences into two first feature sequence sets, and crossing at least part of the feature sequence segments in the two first feature sequence sets so as to interchange the feature values on the corresponding feature sequence segments; wherein after the crossover operation is performed, the number of the positive values and the number of the negative values in the two first feature sequence sets remain unchanged; and
randomly selecting, based on the mutation probability, a first feature value at a position in a sampling feature sequence for inversion, and, after the first feature value is inverted, randomly selecting a feature value with a value opposite to that of the first feature value for inversion.
5. The face recognition method according to claim 2, wherein before the step of fusing the face features of the partial face image to obtain fused face features, the method further comprises:
extracting the face features respectively corresponding to the partial face images by using a feature extraction model;
the step of fusing the face features of the partial face images to obtain fused face features comprises the following steps:
and fusing the face features respectively corresponding to the partial face images by using a long short-term memory network to obtain fused face features.
6. The face recognition method of claim 5,
the feature extraction model is obtained by pre-training face images with different resolutions;
the long short-term memory network is obtained by pre-training on face images whose number is the first numerical value and whose resolution is smaller than the first resolution threshold;
the preset human face features are obtained by passing the human face image with the resolution ratio larger than the first resolution ratio threshold value through the feature extraction model;
and a loss value between a first fused face feature, output by the trained long short-term memory network when the face images whose number is the first numerical value and whose resolution is smaller than the first resolution threshold are input into it, and the preset face feature derived from the same target is smaller than a first loss threshold.
7. The method of claim 1, wherein the step of selecting a partial face image from the first face sequence based on a genetic algorithm further comprises:
acquiring similarity among the face images in the first face sequence, dividing the face images in the first face sequence into a plurality of image groups based on the similarity, extracting a representative face image in each image group, and updating the first face sequence;
the similarity between the face images in the same image group is greater than a second similarity threshold, the similarity between the face images in different image groups is less than a third similarity threshold, and the third similarity threshold is less than or equal to the second similarity threshold.
8. The face recognition method according to claim 1, wherein the step of obtaining the first face sequence with the resolution less than the first resolution threshold from the video stream to be processed comprises:
analyzing an image sequence consisting of a plurality of images from the video stream to be processed;
extracting face images belonging to the same target to be detected in the image sequence by using a face tracking algorithm to obtain a plurality of face sequences corresponding to different targets to be detected respectively;
determining a resolution reference value corresponding to each face sequence in the plurality of face sequences, and determining a face sequence with a resolution reference value smaller than the first resolution threshold as the first face sequence; the resolution reference value is the highest value of the resolution of the face image in the corresponding face sequence;
after the step of obtaining the first face sequence with the resolution smaller than the first resolution threshold from the video stream to be processed, the method further includes:
determining the face sequence with the resolution reference value larger than or equal to the first resolution threshold value as a second face sequence;
and determining a face recognition result corresponding to the face image with the resolution greater than or equal to the first resolution threshold value in the video stream to be processed based on the similarity between the face feature corresponding to the face image with the highest resolution in the second face sequence and the preset face feature.
9. An electronic device, comprising: a memory and a processor coupled to each other, wherein the memory stores program data that the processor calls to perform the method of any of claims 1-8.
10. A computer-readable storage medium, on which program data are stored, which program data, when being executed by a processor, carry out the method of any one of claims 1-8.
CN202110833405.3A 2021-07-22 2021-07-22 Face recognition method, electronic device and computer-readable storage medium Pending CN113657178A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110833405.3A CN113657178A (en) 2021-07-22 2021-07-22 Face recognition method, electronic device and computer-readable storage medium


Publications (1)

Publication Number Publication Date
CN113657178A true CN113657178A (en) 2021-11-16

Family

ID=78477756

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110833405.3A Pending CN113657178A (en) 2021-07-22 2021-07-22 Face recognition method, electronic device and computer-readable storage medium

Country Status (1)

Country Link
CN (1) CN113657178A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120257798A1 (en) * 2011-04-11 2012-10-11 King Fahd University Of Petroleum And Minerals Method of performing facial recognition using genetically modified fuzzy linear discriminant analysis
CN107273818A (en) * 2017-05-25 2017-10-20 北京工业大学 The selective ensemble face identification method of Genetic Algorithm Fusion differential evolution
CN108229330A (en) * 2017-12-07 2018-06-29 深圳市商汤科技有限公司 Face fusion recognition methods and device, electronic equipment and storage medium
CN111242097A (en) * 2020-02-27 2020-06-05 腾讯科技(深圳)有限公司 Face recognition method and device, computer readable medium and electronic equipment
CN111709851A (en) * 2020-06-19 2020-09-25 河南牧业经济学院 Hotel safety check-in method, device and equipment based on RFID and facial recognition
CN111814620A (en) * 2020-06-28 2020-10-23 浙江大华技术股份有限公司 Face image quality evaluation model establishing method, optimization method, medium and device
CN112070022A (en) * 2020-09-09 2020-12-11 北京字节跳动网络技术有限公司 Face image recognition method and device, electronic equipment and computer readable medium
WO2020252917A1 (en) * 2019-06-20 2020-12-24 平安科技(深圳)有限公司 Fuzzy face image recognition method and apparatus, terminal device, and medium


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
KUMAR, RMS: "Robust multi-view videos face recognition based on particle filter with immune genetic algorithm", IET IMAGE PROCESSING, vol. 13, no. 4, pages 600 - 606, XP006081387, DOI: 10.1049/iet-ipr.2018.5268 *
WU Jianlong; LUO Haibing: "Application research of genetic algorithms in face recognition", Computer Simulation (计算机仿真), no. 12, pages 282 - 292 *
DU Junqiang: "Research on face recognition algorithms based on feature fusion", China Master's Theses Full-text Database, Information Science and Technology (中国优秀硕士学位论文全文数据库 信息科技辑), vol. 2016, pages 138 - 734 *

Similar Documents

Publication Publication Date Title
Mukhoti et al. Evaluating bayesian deep learning methods for semantic segmentation
CN108875676B (en) Living body detection method, device and system
CN112734775B (en) Image labeling, image semantic segmentation and model training methods and devices
EP1934941B1 (en) Bi-directional tracking using trajectory segment analysis
CN111814902A (en) Target detection model training method, target identification method, device and medium
CN107145862B (en) Multi-feature matching multi-target tracking method based on Hough forest
CN107633226B (en) Human body motion tracking feature processing method
CN108460427B (en) Classification model training method and device and classification method and device
Xiang et al. Online multi-object tracking based on feature representation and Bayesian filtering within a deep learning architecture
CN111401521B (en) Neural network model training method and device, and image recognition method and device
CN112200081A (en) Abnormal behavior identification method and device, electronic equipment and storage medium
CN110263733B (en) Image processing method, nomination evaluation method and related device
CN110188780B (en) Method and device for constructing deep learning model for positioning multi-target feature points
CN110555439A (en) identification recognition method, training method and device of model thereof and electronic system
CN114283350A (en) Visual model training and video processing method, device, equipment and storage medium
CN113920148B (en) Building boundary extraction method and equipment based on polygon and storage medium
Qiu et al. Optimization planning for 3d convnets
CN107948721B (en) Method and device for pushing information
CN111428589B (en) Gradual transition identification method and system
Wang et al. Text detection algorithm based on improved YOLOv3
CN115910217B (en) Base determination method, device, computer equipment and storage medium
CN113657178A (en) Face recognition method, electronic device and computer-readable storage medium
CN110705554A (en) Image processing method and device
CN114913602A (en) Behavior recognition method, behavior recognition device, behavior recognition equipment and storage medium
CN105184275B (en) Infrared local face key point acquisition method based on binary decision tree

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination