CN111507138A - Image recognition method and device, computer equipment and storage medium

Image recognition method and device, computer equipment and storage medium

Info

Publication number
CN111507138A
CN111507138A (application CN201910100448.3A)
Authority
CN
China
Prior art keywords
image
face image
face
recognition
key point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910100448.3A
Other languages
Chinese (zh)
Inventor
赵鑫
刘洛麒
肖胜涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qihoo Technology Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qihoo Technology Co Ltd filed Critical Beijing Qihoo Technology Co Ltd
Priority to CN201910100448.3A
Publication of CN111507138A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification

Abstract

The embodiment of the invention discloses an image recognition method and device, computer equipment and a storage medium. The image recognition method comprises the following steps: acquiring a first face image to be recognized; extracting the coordinates of key points in the first face image; performing image rectification on the first face image according to the key point coordinates to generate a second face image, so as to complete the first face image; and performing face recognition according to the second face image to obtain the user information represented by the first face image. Because the first face image is rectified according to the key point coordinates to generate the second face image, the face in the rectified second face image is more complete than the face in the first face image. Therefore, when the second face image is used for image recognition, more image features expressing the facial characteristics can be extracted, which improves the accuracy of face recognition and greatly improves recognition accuracy in a monitoring environment.

Description

Image recognition method and device, computer equipment and storage medium
Technical Field
The embodiment of the invention relates to the field of image recognition, in particular to an image recognition method, an image recognition device, computer equipment and a storage medium.
Background
Face recognition is a biometric technology that performs identity recognition based on a person's facial feature information. In this series of related technologies, also commonly referred to as portrait recognition or facial recognition, a camera or video camera collects an image or video stream containing a human face, the face is automatically detected and tracked in the image, and face recognition is then performed on the detected face.
In the prior art, face recognition technology is widely applied to video surveillance. When face recognition is performed in a video surveillance mode, the faces within the monitoring range must first be monitored and captured, the different faces are then segmented from the captured images to obtain a plurality of distinct face images, and finally each face image is recognized. Because the pose and angle of each face in the monitored image differ, the difficulty of face recognition increases, resulting in lower face recognition accuracy in the surveillance mode.
Disclosure of Invention
The embodiment of the invention provides an image recognition method and device, computer equipment and a storage medium, which can improve the recognition accuracy of the verification result by rectifying face images captured in a monitoring mode.
In order to solve the above technical problem, the embodiment of the present invention adopts the following technical solution: an image recognition method is provided, including:
acquiring a first face image to be recognized, wherein the first face image is an image to be completed;
extracting the coordinates of key points in the first face image;
performing image correction on the first face image according to the key point coordinates to generate a second face image so as to complement the first face image;
and carrying out face recognition according to the second face image so as to obtain the user information represented by the first face image.
Optionally, the extracting the coordinates of the key points in the first face image includes:
inputting the first face image into a preset image conversion model, wherein the image conversion model is a neural network model which is trained to a convergence state in advance and extracts key point coordinates in the face image;
and reading the key point coordinates of the first face image output by the image conversion model, wherein the key point coordinates are intermediate data output by the image conversion model.
Optionally, the image rectification of the first face image according to the key point coordinates to generate a second face image so as to complement the first face image includes:
calculating a correction parameter of the first face image according to the key point coordinate;
and according to the correction parameters and a preset affine transformation matrix, correcting the first face image to generate a second face image so as to complete the first face image.
Optionally, before performing face recognition according to the second face image to obtain the user information represented by the first face image, the method includes:
acquiring image parameters of the second face image;
comparing the image parameters with a preset threshold condition to judge whether the second face image reaches the standard or not;
and when the second face image reaches the standard, confirming the face recognition of the second face image.
Optionally, the image parameter includes a confidence level of the second face image, the threshold condition is a preset confidence level threshold, and the comparing the image parameter with the preset threshold condition to determine whether the second face image meets the standard includes:
comparing the confidence of the second face image with the confidence threshold;
when the confidence of the second face image is greater than or equal to the confidence threshold, determining that the second face image reaches the standard; otherwise, confirming that the second face image does not reach the standard.
Optionally, when the second face image reaches the standard, after confirming that the face recognition is performed on the second face image, the method includes:
selecting an image enhancement strategy corresponding to the second face image according to the image parameters;
and enhancing the second face image according to the image enhancement strategy so as to enhance the attribute of pixel points representing the face image in the second face image.
Optionally, the performing face recognition according to the second face image to obtain the user information represented by the first face image includes:
inputting the second face image into a preset face recognition model, wherein the face recognition model is a neural network model which is trained to a convergence state in advance and used for carrying out feature extraction on the face image;
reading a feature vector of the second face image output by the face recognition model;
retrieving in a preset information database by taking the feature vector as a retrieval condition, wherein the information database comprises user information, the user information is provided with an index tag, and the index tag is the feature vector of the certificate image of the user;
and calling the user information represented by the first face image according to the retrieval result.
In order to solve the above technical problem, an embodiment of the present invention further provides an image recognition apparatus, including:
the system comprises an acquisition module, an extraction module, a processing module and an execution module, wherein the acquisition module is used for acquiring a first face image to be recognized, and the first face image is an image to be completed;
the extraction module is used for extracting the key point coordinates in the first face image;
the processing module is used for carrying out image rectification on the first face image according to the key point coordinates to generate a second face image so as to complement the first face image;
and the execution module is used for carrying out face recognition according to the second face image so as to obtain the user information represented by the first face image.
Optionally, the image recognition apparatus further includes:
the first processing submodule is used for inputting the first face image into a preset image conversion model, wherein the image conversion model is a neural network model which is trained to a convergence state in advance and used for extracting key point coordinates in the face image;
and the first execution submodule is used for reading the key point coordinates of the first face image output by the image conversion model, wherein the key point coordinates are intermediate data output by the image conversion model.
Optionally, the image recognition apparatus further includes:
the second processing submodule is used for calculating the correction parameters of the first face image according to the key point coordinates;
and the second execution submodule is used for correcting the first face image according to the correction parameters and a preset affine transformation matrix to generate a second face image so as to complement the first face image.
Optionally, the image recognition apparatus further includes:
the first acquisition submodule is used for acquiring the image parameters of the second face image;
the third processing submodule is used for comparing the image parameters with a preset threshold value condition so as to judge whether the second face image reaches the standard or not;
and the third execution sub-module is used for confirming the face recognition of the second face image when the second face image reaches the standard.
Optionally, the image parameter includes a confidence level of the second face image, the threshold condition is a preset confidence level threshold, and the image recognition apparatus further includes:
the fourth processing submodule is used for comparing the confidence of the second face image with the confidence threshold;
the fourth execution sub-module is used for confirming that the second face image reaches the standard when the confidence of the second face image is greater than or equal to the confidence threshold; otherwise, confirming that the second face image does not reach the standard.
Optionally, the image recognition apparatus further includes:
the fifth processing submodule is used for selecting an image enhancement strategy corresponding to the second face image according to the image parameters;
and the fifth execution sub-module is used for performing enhancement processing on the second face image according to the image enhancement strategy so as to enhance the attribute of the pixel points representing the face image in the second face image.
Optionally, the image recognition apparatus further includes:
a sixth processing submodule, configured to input the second face image into a preset face recognition model, where the face recognition model is a neural network model trained in advance to a convergence state and used for performing feature extraction on the face image;
the first reading submodule is used for reading the feature vector of the second face image output by the face recognition model;
the first retrieval submodule is used for retrieving in a preset information database by taking the characteristic vector as a retrieval condition, wherein the information database comprises user information, the user information is provided with an index tag, and the index tag is the characteristic vector of the certificate image of a user;
and the sixth execution submodule is used for calling the user information represented by the first face image according to the retrieval result.
In order to solve the above technical problem, an embodiment of the present invention further provides a computer device, including a memory and a processor, where the memory stores computer-readable instructions, and the computer-readable instructions, when executed by the processor, cause the processor to execute the steps of the image recognition method.
To solve the above technical problem, an embodiment of the present invention further provides a storage medium storing computer-readable instructions, which, when executed by one or more processors, cause the one or more processors to perform the steps of the image recognition method.
The embodiment of the invention has the following beneficial effects: when monitoring face recognition is performed, the first face image to be recognized is read, its key point coordinates are extracted, and the first face image is rectified according to the key point coordinates to generate the second face image; the face in the rectified second face image is more complete than the face in the first face image. Therefore, when the second face image is used for image recognition, more image features expressing the facial characteristics can be extracted, which improves the accuracy of face recognition and greatly improves recognition accuracy in a monitoring environment.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 is a schematic diagram of a basic flow chart of an image recognition method according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart illustrating the extraction of the key point coordinates according to an embodiment of the present invention;
FIG. 3 is a schematic flow chart illustrating the transformation of a face image according to an affine transformation matrix in an embodiment of the present invention;
FIG. 4 is a schematic flow chart illustrating the determination of whether the image parameters of the second face image reach the standard according to an embodiment of the present invention;
FIG. 5 is a schematic flow chart illustrating the determination, according to the confidence, of whether the second face image reaches the standard in an embodiment of the present invention;
FIG. 6 is a schematic flow chart illustrating image enhancement of the second face image according to an embodiment of the present invention;
FIG. 7 is a schematic flow chart illustrating face recognition on the second face image according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of a basic structure of an image recognition apparatus according to an embodiment of the present invention;
FIG. 9 is a block diagram of the basic structure of a computer device according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention.
In some of the flows described in the specification, the claims, and the figures above, a number of operations appear in a particular order, but it should be clearly understood that these operations may be executed out of the order in which they appear herein, or in parallel. Operation numbers such as 101 and 102 are merely used to distinguish different operations, and the numbers themselves do not imply any execution order. In addition, the flows may include more or fewer operations, and the operations may be executed sequentially or in parallel. It should be noted that the descriptions of "first", "second", and the like herein are used to distinguish different messages, devices, modules, etc.; they do not represent a sequential order, nor do they require "first" and "second" to be of different types.
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As will be appreciated by those skilled in the art, "terminal" as used herein includes devices having only a wireless signal receiver without transmit capability as well as devices having receive and transmit hardware capable of two-way communication over a two-way communication link. Such a device may include: a cellular or other communication device with or without a multi-line display; a PCS (Personal Communications Service) terminal, which may combine voice, data processing, facsimile, and/or data communication capabilities; a PDA (Personal Digital Assistant), which may include a radio frequency receiver, a pager, internet/intranet access, a web browser, a notepad, a calendar, and/or a GPS (Global Positioning System) receiver; or a conventional laptop and/or palmtop computer or other device having and/or including a radio frequency receiver. As used herein, a "terminal" or "terminal device" may be portable, transportable, installed in a vehicle (aeronautical, maritime, and/or land-based), or situated and/or configured to operate locally and/or in a distributed fashion at any other location on earth and/or in space. As used herein, a "terminal device" may also be a communication terminal, a web terminal, or a music/video playing terminal, such as a PDA, an MID (Mobile Internet Device), and/or a mobile phone with a music/video playing function, or a smart TV, set-top box, etc.
Specifically, referring to fig. 1, fig. 1 is a basic flow chart of the image recognition method according to the embodiment.
As shown in fig. 1, an image recognition method includes:
s1100, acquiring a first face image to be recognized, wherein the first face image is an image to be completed;
since the face recognition in the present embodiment is performed in the monitoring scene, the image to be subjected to the image recognition is also the monitoring image captured by the monitoring apparatus.
The monitoring image comprises one or more face images. When image recognition is performed, each face image in the monitoring image is segmented out through image segmentation. Among the segmented face images, those in which the face is complete are applied directly to face recognition, while a segmented image that needs to be completed is defined as a first face image.

In a first face image, the face is occluded or tilted. Whether a face image needs to be completed can be determined by a neural network model trained to a convergence state.
S1200, extracting the coordinates of key points in the first face image;
After the first face image is obtained, the coordinates of the key points in the first face image are extracted by a preset image conversion model.
The image conversion model can be a convolutional neural network model (CNN) trained to a convergence state, but is not limited thereto; it can also be a deep neural network model (DNN), a recurrent neural network model (RNN), or a variant of the three network models described above.
The image conversion model is a neural network model trained to a convergence state for extracting the key points of the five sense organs in a face image. When the image conversion model is trained, a large number of face images are used for key point extraction training; after training to a convergence state, the model can accurately extract the key point coordinates of a face image.
In this embodiment, the key point coordinates refer to the positions, in the face image, of the pixel points representing the five sense organs of the human face. The origin of the coordinates is the first pixel point at the lower left corner of the first face image, but this is not limiting: depending on the specific application scenario, the origin can be placed at any point in the first face image.
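Because most image libraries place the origin at the top-left pixel while the key point coordinates here are expressed relative to the lower left corner, a conversion is often needed before the coordinates are consumed downstream. A minimal Python sketch; the function name and the (N, 2) array layout are illustrative assumptions, not part of the embodiment:

import numpy as np

def to_top_left_origin(keypoints: np.ndarray, image_height: int) -> np.ndarray:
    """Convert (x, y) key points expressed with a bottom-left origin, as in
    this embodiment, into the top-left-origin convention most libraries use."""
    converted = keypoints.astype(np.float32).copy()
    converted[:, 1] = (image_height - 1) - converted[:, 1]  # flip the y axis
    return converted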
S1300, carrying out image rectification on the first face image according to the key point coordinates to generate a second face image so as to complement the first face image;
and carrying out image correction on the first face image according to the key point coordinates, wherein the final purpose of the image correction is to complement the blocked position in the first face image or solve the problem of partial face image deletion caused by side faces.
In some embodiments, the first face image is rectified using the image conversion model. The offset $\Delta S_t$ is obtained by comparing the offset or rotation angle between the key point coordinates in the first face image and a preset set of standard key point coordinates; $\Delta S_t$ is the offset in the new feature space, and after it is applied, an inverse transformation $T_t^{-1}$ restores the image to the original space, completing the rectification of the first face image.
In some embodiments, an affine transformation layer is provided in the image conversion model. The affine transformation layer performs a linear transformation on the first face image followed by a translation, transforming the first face image into the second face image, where the linear transformation refers to rotating the first face image or compensating for its missing regions. The linear transformation is implemented as a multiplication of the image matrix of the first face image by an affine transformation matrix.
The image conversion model corrects the first face image to obtain a second face image, and the second face image is more complete than the first face image.
And S1400, carrying out face recognition according to the second face image so as to obtain the user information represented by the first face image.
And carrying out face recognition according to the second face image, wherein the face recognition mode is as follows: and inputting the second face image into an image search engine as a retrieval condition, extracting a feature vector of the second face image by the image search engine, retrieving in an established user information base according to the feature vector, and obtaining a retrieval result, namely the user information represented by the first face image. Wherein the user information can be: identity document information of the user, various card numbers of the user or criminal information of the user.
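Read as a whole, steps S1100 through S1400 form a short pipeline. A high-level Python sketch in which every stage is passed in as a callable, since the concrete models and database are the ones described above rather than anything defined here:

from typing import Any, Callable

import numpy as np

def recognize(first_face: np.ndarray,
              extract_keypoints: Callable[[np.ndarray], np.ndarray],
              rectify: Callable[[np.ndarray, np.ndarray], np.ndarray],
              embed: Callable[[np.ndarray], np.ndarray],
              lookup: Callable[[np.ndarray], Any]) -> Any:
    """End-to-end flow of fig. 1: keypoints, rectification, embedding, retrieval."""
    keypoints = extract_keypoints(first_face)      # S1200: key point coordinates
    second_face = rectify(first_face, keypoints)   # S1300: generate second face image
    feature_vector = embed(second_face)            # S1400: feature extraction
    return lookup(feature_vector)                  # S1400: user information retrieval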
When monitoring face recognition is performed, the first face image to be recognized is read, its key point coordinates are extracted, and the first face image is rectified according to the key point coordinates to generate the second face image; the face in the rectified second face image is more complete than the face in the first face image. Therefore, when the second face image is used for image recognition, more image features expressing the facial characteristics can be extracted, which improves the accuracy of face recognition and greatly improves recognition accuracy in a monitoring environment.
In some embodiments, the coordinates of the key points in the first face image are extracted by an image conversion model. Referring to fig. 2, fig. 2 is a schematic flow chart illustrating the process of extracting the coordinates of the key points according to the present embodiment.
As shown in fig. 2, the S1200 step shown in fig. 1 includes:
s1211, inputting the first face image into a preset image conversion model, wherein the image conversion model is a neural network model which is trained to be in a convergence state in advance and used for extracting key point coordinates in the face image;
the first face image is input into a preset image conversion model, which can be a convolutional neural network model (CNN) trained to a convergence state, but, not limited thereto, the image conversion model can also be: a deep neural network model (DNN), a recurrent neural network model (RNN), or a variant of the three network models described above.
The image conversion model is a neural network model which is trained to a convergence state and used for extracting key points of five sense organs in the face image.
During training of the initial neural network model of the image conversion model, a large number of face images are collected as training samples, and each training sample is labeled manually (labeling means manually marking the coordinates of the key points of the five sense organs in the face image). A training sample is then input into the initial neural network model, and the classification result output by the model is obtained (the classification result being the key point coordinates the model derives for the training sample). The loss function of the neural network model computes the distance (for example the Euclidean distance, Mahalanobis distance, or cosine distance) between the classification result and the labeled result, and the computed value is compared with a set distance threshold. If the value is less than or equal to the distance threshold (for example 0.01), the sample passes verification and training continues with the next training sample; if it is greater than the distance threshold, the difference between the two is computed through the loss function and the weights in the neural network model are corrected through back propagation, so that the model raises the weight of the pixel points at the positions of the five sense organs in the face image and the accuracy of its judgment increases. After this scheme is repeated over a large number of training samples, the accuracy of the trained neural network model in extracting face image key point coordinates exceeds a certain value, for example 98%; the neural network model is then trained to a convergence state, and the network trained to the convergence state is the image conversion model.
The image conversion model trained to the convergence state can accurately extract the coordinates of the key points in the face image.
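The training scheme just described can be stated compactly. A hedged PyTorch sketch, assuming `model` maps an image tensor to a flat vector of key point coordinates laid out like the manual labels; the 0.01 threshold is the example value given above:

import torch

def train_step(model: torch.nn.Module, optimizer: torch.optim.Optimizer,
               image: torch.Tensor, labeled_keypoints: torch.Tensor,
               dist_threshold: float = 0.01) -> float:
    """One step of the described scheme: predict key points, measure the
    Euclidean distance to the manual calibration, and correct the weights
    by back propagation only when the distance exceeds the threshold."""
    predicted = model(image)
    distance = torch.dist(predicted, labeled_keypoints)  # Euclidean distance
    if distance.item() > dist_threshold:
        optimizer.zero_grad()
        distance.backward()  # propagate the difference back through the model
        optimizer.step()
    return distance.item()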
And S1212, reading a key point coordinate of the first face image output by the image conversion model, wherein the key point coordinate is intermediate data output by the image conversion model.
In this embodiment, the image conversion model outputs the position of the key point coordinate on the last convolution layer of the convolution channel, that is, the key point coordinate is the intermediate data of the image conversion model. Therefore, after the first face image is input to the image conversion model, the keypoint coordinates of the first face image are read at the output interface of the last convolution layer of the image conversion model.
The key point coordinates of the first face image can be extracted quickly and accurately through the neural network model, improving the efficiency of key point extraction.
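Since the key point coordinates are intermediate data rather than the model's final output, they can be captured with a forward hook on the last convolution layer. A PyTorch sketch; treating the hooked output as a flat sequence of (x, y) pairs is an assumption that mirrors the description above:

import torch

def read_keypoints(image_model: torch.nn.Module,
                   last_conv: torch.nn.Module,
                   face_tensor: torch.Tensor) -> torch.Tensor:
    """Capture the last convolution layer's output during a normal forward
    pass and reshape it into one (x, y) row per key point."""
    captured = {}

    def hook(module, inputs, output):
        captured["keypoints"] = output.detach()

    handle = last_conv.register_forward_hook(hook)
    try:
        image_model(face_tensor.unsqueeze(0))  # forward pass with a batch dim
    finally:
        handle.remove()
    return captured["keypoints"].reshape(-1, 2)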
In some embodiments, the image conversion model requires correction of the first face image in addition to the keypoint extraction of the first face image. Referring to fig. 3, fig. 3 is a schematic flow chart illustrating a process of transforming a face image according to an affine transformation matrix according to the present embodiment.
As shown in fig. 3, the step S1300 shown in fig. 1 includes:
s1311, calculating correction parameters of the first face image according to the key point coordinates;
After the key point coordinates of the first face image are extracted, the correction parameters of the first face image are calculated from them. The correction parameters are the offsets of the key point coordinates of the first face image relative to the key point coordinates of a standard face image. For example, suppose the target person is standing sideways when the first face image is captured. In the captured image of the face, the distance along the line connecting the key point coordinates representing the two eyes is correspondingly shortened. By converting this shortening into a spatial offset, the offset angle of the face in space can be obtained; the corrected face image is then obtained by compensating for the offset angle and performing pixel filling on the compensated first face image.
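The eye-distance reasoning in this example can be written down directly. A simplified sketch that assumes the projected inter-eye distance shrinks roughly with the cosine of the sideways angle; `standard_eye_distance` is the distance measured on the standard frontal key point layout and is an assumed input:

import numpy as np

def estimate_yaw_degrees(left_eye: np.ndarray, right_eye: np.ndarray,
                         standard_eye_distance: float) -> float:
    """Estimate the sideways offset angle of the face from the shortened
    distance between the two eye key points."""
    observed = float(np.linalg.norm(np.asarray(right_eye, dtype=np.float64)
                                    - np.asarray(left_eye, dtype=np.float64)))
    ratio = np.clip(observed / standard_eye_distance, 0.0, 1.0)
    return float(np.degrees(np.arccos(ratio)))  # 0 when the eye line is full length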
And S1312, according to the correction parameters and a preset affine transformation matrix, correcting the first face image to generate a second face image so as to complete the first face image.
An affine transformation layer is arranged in the image conversion model. The affine transformation layer performs a linear transformation on the first face image followed by a final translation, transforming it into the second face image, where the linear transformation refers to rotating the first face image or compensating for its missing regions. The linear transformation is implemented as a multiplication of the image matrix of the first face image by an affine transformation matrix.
For example:

$$S' = A\,S + T$$

where $S'$ is the pixel matrix of the second face image, $S$ is the pixel matrix of the first face image, $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$ is the affine transformation matrix with offset entries $a_{ij}$, and $T = \begin{bmatrix} t_x \\ t_y \end{bmatrix}$ is the translation amount.
After the affine transformation matrix, which carries the weight features of a standard face, is multiplied by the matrix of the first face image, the missing parts of the first face image can be completed.
The rapid correction of the first face image is realized through the image conversion model, and the face image correction efficiency is improved.
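With the correction parameters in hand, the affine step itself reduces to a single warp. A minimal OpenCV sketch; rotating about the image center and replicating border pixels is only a crude stand-in for the learned completion performed by the affine transformation layer:

import cv2
import numpy as np

def rectify(face: np.ndarray, angle_degrees: float) -> np.ndarray:
    """Build a 2x3 rotation-plus-translation matrix about the image center
    and warp the first face image with it to obtain the second face image."""
    h, w = face.shape[:2]
    matrix = cv2.getRotationMatrix2D((w / 2.0, h / 2.0), angle_degrees, 1.0)
    return cv2.warpAffine(face, matrix, (w, h), borderMode=cv2.BORDER_REPLICATE)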
In some embodiments, after the second face image is generated, parameters of the second face image need to be recognized to distinguish whether the second face image can be used for face recognition. Referring to fig. 4, fig. 4 is a schematic flow chart illustrating the process of determining whether the image parameter of the second face image meets the standard according to the present embodiment.
As shown in fig. 4, before the step S1400 shown in fig. 1, the method includes:
s1321, acquiring image parameters of the second face image;
After the second face image is generated, its image parameters are acquired. The image parameters include, but are not limited to, attribute information such as the deflection angle, the image sharpness, and the face confidence of the second face image.
The deflection angle refers to the deflection angle of the second face image relative to the first face image; the image sharpness refers to the sharpness of the second face image; and the face confidence is the confidence that the second face image is similar to the first face image.
The acquired image parameters may be any one of the three parameters described above, or a combination of several of them.
S1322, comparing the image parameters with a preset threshold value condition to judge whether the second face image reaches the standard or not;
In this embodiment, a threshold condition is set for detecting whether an image parameter of the second face image reaches the standard. When the image parameter is the deflection angle, the threshold condition is an angle threshold: if the deflection angle is smaller than or equal to the angle threshold, the second face image reaches the standard; otherwise, it is judged not to reach the standard. When the image parameter is the image sharpness, the threshold condition is a sharpness threshold: if the image sharpness is greater than or equal to the sharpness threshold, the second face image reaches the standard; otherwise, it is judged not to reach the standard. When the image parameter is the confidence, the threshold condition is a confidence threshold: if the confidence is greater than or equal to the confidence threshold, the second face image reaches the standard; otherwise, it is judged not to reach the standard.
When several image parameters are acquired, they are compared in a set order: each subsequent image parameter is compared only after the previous one has passed, and the second face image is confirmed to reach the standard only after all the image parameters have passed comparison; otherwise, the second face image is determined not to reach the standard.
And when the second face image is determined not to meet the standard, discarding the second face image, and re-acquiring the first face image of the target face.
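The sequential comparison can be sketched as an ordered list of checks. The pass directions (small deflection, high sharpness, high confidence) follow the rules above, while the parameter names and dictionary layout are illustrative assumptions:

def meets_standard(params: dict, thresholds: dict) -> bool:
    """Compare each acquired image parameter against its threshold in a
    fixed order, stopping at the first failure."""
    checks = [
        ("deflection_angle", lambda value, limit: value <= limit),
        ("sharpness",        lambda value, limit: value >= limit),
        ("confidence",       lambda value, limit: value >= limit),
    ]
    for name, passes in checks:
        if name in params and not passes(params[name], thresholds[name]):
            return False  # discard and re-acquire the first face image
    return True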
S1323, when the second face image reaches the standard, confirming that the face recognition is carried out on the second face image.
And when the second face image reaches the standard, confirming that the second face image is adopted for face recognition.
In some embodiments, the image parameter is a confidence level of the second face image, and the second face image is subjected to the compliance test according to the confidence level. Referring to fig. 5, fig. 5 is a schematic flow chart illustrating the second face image reaching the standard according to the confidence level in the embodiment.
As shown in fig. 5, the step S1322 shown in fig. 4 includes:
s1331, comparing the confidence of the second face image with the confidence threshold;
In this embodiment, the confidence is a probability value measuring whether the first face image and the second face image are similar, and the similarity is judged by an image comparison model. For example, if the image comparison model outputs a confidence of 0.95 for the pair, the model judges with 95% probability that the first face image and the second face image depict the same face.
The image comparison model is a neural network model trained to a convergence state for judging whether two images are similar. It can be a convolutional neural network model (CNN) trained to a convergence state, but is not limited thereto; it can also be a deep neural network model (DNN), a recurrent neural network model (RNN), or a variant of the three network models described above. When the image comparison model is trained, a large number of face images are used for similarity-judgment training; after training to a convergence state, the model can accurately judge whether different images are similar.
S1332, when the confidence of the second face image is greater than or equal to the confidence threshold, determining that the second face image reaches the standard; otherwise, confirming that the second face image does not reach the standard.
When the confidence of the second face image is greater than or equal to the confidence threshold, the comparison confirms that the second face image reaches the standard; otherwise, the second face image is confirmed not to reach the standard. In this embodiment, the confidence threshold is set to 0.8, but its value is not limited thereto and can be customized for different application scenarios.
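As a rough stand-in for the trained image comparison model, the confidence can be approximated by the cosine similarity between embeddings of the two images, mapped into [0, 1]. This substitution is an assumption for illustration only, not the model the embodiment trains:

import numpy as np

def pair_confidence(embedding_a: np.ndarray, embedding_b: np.ndarray) -> float:
    """Map the cosine similarity of two face embeddings into [0, 1] as a
    proxy for the similarity confidence; at least 0.8 passes here."""
    cosine = float(np.dot(embedding_a, embedding_b)
                   / (np.linalg.norm(embedding_a) * np.linalg.norm(embedding_b)))
    return (cosine + 1.0) / 2.0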
In some embodiments, because the first face image is a face image of a target user captured from surveillance video, the sharpness of the first face image, or the contrast between the face and the surrounding image, may be poor, which lowers the accuracy of image recognition on the second face image generated by converting the first face image. Referring to fig. 6, fig. 6 is a schematic flow chart illustrating image enhancement performed on a second face image according to the present embodiment.
As shown in fig. 6, after step S1323 shown in fig. 4, the method includes:
S1341, selecting an image enhancement strategy corresponding to the second face image according to the image parameters;
When an image enhancement strategy corresponding to the second face image is selected according to the image parameters, the image parameters of the second face image are first acquired. When the image parameter is the deflection angle and the deflection angle meets the threshold condition, homomorphic filtering is performed on the second face image; removing multiplicative noise while increasing the contrast and normalizing the brightness achieves the purpose of image enhancement. When the image parameter is the image sharpness and the sharpness does not reach the standard, the image pixel values are too low, and the image is processed by pixel filling: the color difference between each pixel point and its surrounding pixel points is obtained, and a transitional color whose difference is smaller than that between the adjacent pixel points is selected for filling, so as to enhance the sharpness of the second face image. When the image parameter is the confidence, the second face image is converted from the first face image, so if it does not reach the standard, the contrast of the portion completed relative to the first face image is too large, and the image enhancement processing should reduce the contrast of the pixels of the completed portion.
When there are several image parameters, the corresponding image enhancement processes are applied to the second face image in the order used during parameter verification.
S1342, enhancing the second face image according to the image enhancement strategy so as to enhance the attribute of the pixel points representing the face image in the second face image.
Because different image parameters correspond to different image enhancement strategies, after the image parameters are obtained, the second face image is processed according to the processing strategies corresponding to the image parameters, so that the attribute of the pixel points representing the face image in the second face image is enhanced.
Image enhancement processing on the second face image makes the pixels of the user's face clearer and more distinct, enhancing the accuracy of subsequent image recognition.
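A sketch of the strategy selection in OpenCV. Histogram equalization stands in for homomorphic filtering, edge-preserving smoothing for transitional-color pixel filling, and a gray blend for reducing the contrast of the completed portion; all three substitutions are assumptions for illustration:

import cv2
import numpy as np

def enhance(face_bgr: np.ndarray, failing_parameter: str) -> np.ndarray:
    """Select an enhancement strategy from the image parameter that failed
    and apply it to the second face image (uint8 BGR assumed)."""
    if failing_parameter == "deflection_angle":
        # Normalize brightness and raise contrast on the luma channel.
        ycrcb = cv2.cvtColor(face_bgr, cv2.COLOR_BGR2YCrCb)
        ycrcb[:, :, 0] = cv2.equalizeHist(ycrcb[:, :, 0])
        return cv2.cvtColor(ycrcb, cv2.COLOR_YCrCb2BGR)
    if failing_parameter == "sharpness":
        # Smooth each pixel toward its neighbors while keeping edges.
        return cv2.bilateralFilter(face_bgr, 5, 30, 30)
    # Low confidence: soften the contrast of the completed region.
    gray = np.full_like(face_bgr, 128)
    return cv2.addWeighted(face_bgr, 0.7, gray, 0.3, 0)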
In some embodiments, after the second face image is generated or enhanced, the second face image continues to be identified and retrieved to obtain the user information represented by the first face image. Referring to fig. 7, fig. 7 is a schematic flow chart illustrating the process of performing face recognition on a second face image according to the present embodiment.
As shown in fig. 7, the step S1400 shown in fig. 1 includes:
s1411, inputting the second face image into a preset face recognition model, wherein the face recognition model is a neural network model which is trained to a convergence state in advance and used for carrying out feature extraction on the face image;
and inputting the second face image into a face recognition model, wherein the face recognition model is a core module in an image search engine and is used for extracting a neural network model of the feature vector of the input image.
The face recognition model in the present embodiment may be a convolutional neural network model (CNN) that has been trained to a convergent state, but is not limited thereto, and the face recognition model may be: a deep neural network model (DNN), a recurrent neural network model (RNN), or a variant of the three network models described above.
The face recognition model is a neural network model which is trained to a convergence state and used for extracting feature vectors of the face image. When the face recognition model is trained, a large number of face images are adopted to perform the training of face image feature vector extraction, and after the face recognition model is trained to a convergence state, the feature vectors of the face images can be accurately extracted.
S1412, reading a feature vector of the second face image output by the face recognition model;
The feature vector of the second face image is read at the output of the last convolutional layer of the face recognition model.
S1413, retrieving in a preset information database by taking the feature vector as a retrieval condition, wherein the information database comprises user information, the user information is provided with an index tag, and the index tag is the feature vector of the certificate image of the user;
The extracted feature vector is used as the retrieval condition to search a preset information database. The information database stores users' identity information, including photos of the users' certificate images or the users' face images. When a user's identity information is entered into the information database, index tags are created for it so that the identity information can be retrieved quickly. One way of building such an index tag is to input the photo of the user's certificate image, or the user's face image, into the face recognition model or another similar model, and to use the extracted feature vector of that photo or face image as the index tag of the user's identity information.
When the feature vector of the second face image is used as the retrieval condition, the search engine calculates the Hamming distances between the index tags in the information database and the feature vector of the second face image, and recalls the identity information whose Hamming distance is smaller than a first threshold. For example, if the first threshold is 20, the identity information whose index tag lies within Hamming distance 20 of the feature vector of the second face image is recalled. Then, within the recalled content, the confidence between the feature vector of the second face image and each recalled index tag is calculated, and the entry with the highest confidence is taken as the retrieval result.
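A sketch of the recall step, assuming the feature vectors have been binarized into 0/1 codes so that the Hamming distance applies, and using the smallest distance as a proxy for the final confidence ranking:

import numpy as np

def retrieve_user(query_code: np.ndarray, index_tags: dict,
                  first_threshold: int = 20):
    """Recall every identity whose index tag lies within the Hamming
    distance threshold of the query code, then return the closest match."""
    recalled = []
    for user_id, tag_code in index_tags.items():
        hamming = int(np.count_nonzero(query_code != tag_code))
        if hamming < first_threshold:
            recalled.append((hamming, user_id))
    if not recalled:
        return None  # nothing close enough in the information database
    return min(recalled, key=lambda pair: pair[0])[1]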
And S1414, calling the user information represented by the first face image according to the retrieval result.
According to the retrieval result, the identity information corresponding to the index tag with the highest confidence relative to the feature vector of the second face image is called up; this identity information is the user information represented by the first face image.
Through the neural network model, the user information represented by a face image can be retrieved rapidly, improving the efficiency of image recognition and image search.
In order to solve the above technical problem, an embodiment of the present invention further provides an image recognition apparatus.
Referring to fig. 8, fig. 8 is a schematic view of a basic structure of the image recognition apparatus according to the present embodiment.
As shown in fig. 8, an image recognition apparatus includes: an acquisition module 2100, an extraction module 2200, a processing module 2300, and an execution module 2400. The acquisition module 2100 is configured to acquire a first face image to be recognized, where the first face image is an image to be completed; the extraction module 2200 is configured to extract the key point coordinates in the first face image; the processing module 2300 is configured to perform image rectification on the first face image according to the key point coordinates to generate a second face image, so as to complete the first face image; and the execution module 2400 is configured to perform face recognition according to the second face image, so as to obtain the user information represented by the first face image.
When the image recognition apparatus performs monitoring face recognition, the first face image to be recognized is read, its key point coordinates are extracted, and the first face image is rectified according to the key point coordinates to generate the second face image; the face in the rectified second face image is more complete than the face in the first face image. Therefore, when the second face image is used for image recognition, more image features expressing the facial characteristics can be extracted, which improves the accuracy of face recognition and greatly improves recognition accuracy in a monitoring environment.
In some embodiments, the image recognition apparatus further comprises: a first processing submodule and a first execution submodule. The first processing submodule is used for inputting a first face image into a preset image conversion model, wherein the image conversion model is a neural network model which is trained to a convergence state in advance and extracts key point coordinates in the face image; the first execution submodule is used for reading the key point coordinates of the first face image output by the image conversion model, wherein the key point coordinates are intermediate data output by the image conversion model.
In some embodiments, the image recognition apparatus further comprises: a second processing submodule and a second execution submodule. The second processing submodule is used for calculating the correction parameters of the first face image according to the key point coordinates; and the second execution submodule is used for correcting the first face image according to the correction parameters and the preset affine transformation matrix to generate a second face image so as to complement the first face image.
In some embodiments, the image recognition apparatus further comprises: the device comprises a first obtaining submodule, a third processing submodule and a third executing submodule. The first acquisition submodule is used for acquiring image parameters of the second face image; the third processing submodule is used for comparing the image parameters with a preset threshold value condition so as to judge whether the second face image reaches the standard or not; and the third execution sub-module is used for confirming the face recognition of the second face image when the second face image reaches the standard.
In some embodiments, the image parameter includes a confidence of the second face image, the threshold condition is a preset confidence threshold, and the image recognition apparatus further includes: a fourth processing submodule and a fourth execution submodule. The fourth processing submodule is used for comparing the confidence of the second face image with the confidence threshold; the fourth execution submodule is used for confirming that the second face image reaches the standard when the confidence of the second face image is greater than or equal to the confidence threshold, and otherwise confirming that the second face image does not reach the standard.
In some embodiments, the image recognition apparatus further comprises: a fifth processing submodule and a fifth execution submodule. The fifth processing submodule is used for selecting an image enhancement strategy corresponding to the second face image according to the image parameters; and the fifth execution submodule is used for performing enhancement processing on the second face image according to the image enhancement strategy so as to enhance the attribute of the pixel points which are used for representing the face image in the second face image.
In some embodiments, the image recognition apparatus further comprises: a sixth processing submodule, a first reading submodule, a first retrieval submodule, and a sixth execution submodule. The sixth processing submodule is used for inputting the second face image into a preset face recognition model, wherein the face recognition model is a neural network model trained in advance to a convergence state and used for performing feature extraction on the face image; the first reading submodule is used for reading the feature vector of the second face image output by the face recognition model; the first retrieval submodule is used for retrieving in a preset information database by taking the feature vector as a retrieval condition, wherein the information database comprises user information, the user information is provided with an index tag, and the index tag is the feature vector of the certificate image of a user; and the sixth execution submodule is used for calling the user information represented by the first face image according to the retrieval result.
In order to solve the above technical problem, an embodiment of the present invention further provides a computer device. Referring to fig. 9, fig. 9 is a block diagram of a basic structure of a computer device according to the present embodiment.
As shown in fig. 9, the internal structure of the computer device is schematically illustrated. The computer device includes a processor, a non-volatile storage medium, a memory, and a network interface connected by a system bus. The non-volatile storage medium of the computer device stores an operating system, a database, and computer-readable instructions; the database can store control information sequences, and the computer-readable instructions, when executed by the processor, can cause the processor to implement an image recognition method. The processor of the computer device provides computation and control capability and supports the operation of the whole computer device. The memory of the computer device may store computer-readable instructions that, when executed by the processor, cause the processor to perform the image recognition method. The network interface of the computer device is used for connecting and communicating with a terminal. Those skilled in the art will appreciate that the architecture shown in fig. 9 is merely a block diagram of some of the structures associated with the disclosed aspects and does not limit the computer devices to which the disclosed aspects apply; a particular computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In this embodiment, the processor is configured to execute the specific functions of the acquisition module 2100, the extraction module 2200, the processing module 2300 and the execution module 2400 in fig. 8, and the memory stores the program codes and various data required for executing these modules. The network interface is used for data transmission to and from a user terminal or a server. The memory in this embodiment stores the program codes and data required for executing all the submodules of the image recognition apparatus, and the server can call them to execute the functions of each submodule.
When the computer device performs monitoring face recognition, the first face image to be recognized is read, its key point coordinates are extracted, and the first face image is rectified according to the key point coordinates to generate the second face image; the face in the rectified second face image is more complete than the face in the first face image. Therefore, when the second face image is used for image recognition, more image features expressing the facial characteristics can be extracted, which improves the accuracy of face recognition and greatly improves recognition accuracy in a monitoring environment.
The present invention also provides a storage medium storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the image recognition method of any of the above embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and can include the processes of the embodiments of the methods described above when the computer program is executed. The storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disk, a Read-Only Memory (ROM), or a Random Access Memory (RAM).
It should be understood that, although the steps in the flowcharts of the figures are shown in an order indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, the execution of these steps is not strictly limited to the order shown, and they may be performed in other orders. Moreover, at least a portion of the steps in the flowcharts may include multiple sub-steps or multiple stages, which are not necessarily performed at the same moment but may be performed at different moments, and are not necessarily performed in sequence but may be performed in turn or alternately with other steps or with at least a portion of the sub-steps or stages of other steps.

Claims (10)

1. An image recognition method, comprising:
acquiring a first face image to be recognized, wherein the first face image is an image to be complemented;
extracting the coordinates of key points in the first face image;
performing image correction on the first face image according to the key point coordinates to generate a second face image so as to complement the first face image;
and carrying out face recognition according to the second face image so as to obtain the user information represented by the first face image.
2. The image recognition method according to claim 1, wherein the extracting the coordinates of key points in the first face image comprises:
inputting the first face image into a preset image conversion model, wherein the image conversion model is a neural network model trained in advance to a converged state for extracting key point coordinates from a face image;
and reading the key point coordinates of the first face image output by the image conversion model, wherein the key point coordinates are intermediate data output by the image conversion model.
3. The image recognition method according to claim 2, wherein the performing image correction on the first face image according to the key point coordinates to generate a second face image so as to complement the first face image comprises:
calculating correction parameters of the first face image according to the key point coordinates;
and correcting the first face image according to the correction parameters and a preset affine transformation matrix to generate the second face image so as to complement the first face image.
4. The image recognition method according to claim 1, wherein before the performing face recognition according to the second face image to obtain the user information represented by the first face image, the method further comprises:
acquiring image parameters of the second face image;
comparing the image parameters with a preset threshold condition to judge whether the second face image reaches the standard;
and when the second face image reaches the standard, confirming face recognition of the second face image.
5. The image recognition method according to claim 4, wherein the image parameters comprise a confidence level of the second face image, the threshold condition is a preset confidence threshold, and the comparing the image parameters with the preset threshold condition to judge whether the second face image reaches the standard comprises:
comparing the confidence level of the second face image with the confidence threshold;
and when the confidence level of the second face image is smaller than the confidence threshold, confirming that the second face image reaches the standard; otherwise, confirming that the second face image does not reach the standard.
6. The image recognition method according to claim 4, wherein after the confirming face recognition of the second face image when the second face image reaches the standard, the method further comprises:
selecting an image enhancement strategy corresponding to the second face image according to the image parameters;
and enhancing the second face image according to the image enhancement strategy so as to enhance the attributes of the pixel points representing the face in the second face image.
7. The image recognition method according to claim 1, wherein the performing face recognition according to the second face image to obtain the user information represented by the first face image comprises:
inputting the second face image into a preset face recognition model, wherein the face recognition model is a neural network model trained in advance to a converged state for performing feature extraction on a face image;
reading a feature vector of the second face image output by the face recognition model;
performing retrieval in a preset information database by taking the feature vector as a retrieval condition, wherein the information database comprises user information, the user information is provided with an index tag, and the index tag is the feature vector of the certificate image of the corresponding user;
and calling the user information represented by the first face image according to the retrieval result.
8. An image recognition apparatus, comprising:
an acquisition module, used for acquiring a first face image to be recognized, wherein the first face image is an image to be complemented;
the extraction module is used for extracting the key point coordinates in the first face image;
the processing module is used for performing image correction on the first face image according to the key point coordinates to generate a second face image so as to complement the first face image;
and the execution module is used for carrying out face recognition according to the second face image so as to obtain the user information represented by the first face image.
9. A computer device comprising a memory and a processor, the memory having stored therein computer readable instructions which, when executed by the processor, cause the processor to carry out the steps of the image recognition method according to any one of claims 1 to 7.
10. A storage medium having stored thereon computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the image recognition method of any one of claims 1 to 7.
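To make the retrieval recited in claim 7 above concrete, here is a hedged sketch of matching the feature vector of the second face image against an information database whose index tags are feature vectors of users' certificate images. Cosine similarity as the metric, the in-memory array layout, and the min_score floor are illustrative assumptions; the patent does not fix a distance measure or a database design.

```python
# Illustrative sketch of feature-vector retrieval against index tags
# (certificate-image feature vectors); metric and layout are assumptions.
import numpy as np

def retrieve_user(feature_vec: np.ndarray,
                  index_tags: np.ndarray,   # shape (N, D), one row per user
                  user_records: list,
                  min_score: float = 0.5):
    """Return the user record whose index tag best matches the query
    feature vector, or None if no match clears the similarity floor."""
    # L2-normalize both sides so a dot product equals cosine similarity.
    q = feature_vec / np.linalg.norm(feature_vec)
    tags = index_tags / np.linalg.norm(index_tags, axis=1, keepdims=True)
    scores = tags @ q                       # (N,) similarity per user
    best = int(np.argmax(scores))
    return user_records[best] if scores[best] >= min_score else None
```

The min_score floor loosely mirrors the threshold comparison of claims 4 and 5: a candidate that does not reach the preset standard is rejected rather than returned as user information.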
CN201910100448.3A 2019-01-31 2019-01-31 Image recognition method and device, computer equipment and storage medium Pending CN111507138A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910100448.3A CN111507138A (en) 2019-01-31 2019-01-31 Image recognition method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910100448.3A CN111507138A (en) 2019-01-31 2019-01-31 Image recognition method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN111507138A (en) 2020-08-07

Family

ID=71868891

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910100448.3A Pending CN111507138A (en) 2019-01-31 2019-01-31 Image recognition method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111507138A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112883777A (en) * 2021-01-04 2021-06-01 北京地平线信息技术有限公司 Method, device, equipment and medium for generating face key point template and correcting face
CN112883777B (en) * 2021-01-04 2024-03-29 北京地平线信息技术有限公司 Face key point template generation and face correction method, device, equipment and medium
CN113034393A (en) * 2021-03-25 2021-06-25 北京百度网讯科技有限公司 Photo repairing method, device, equipment and storage medium
CN115376196A (en) * 2022-10-25 2022-11-22 上海联息生物科技有限公司 Image processing method, and financial privacy data security processing method and device
CN115909508A (en) * 2023-01-06 2023-04-04 浙江大学计算机创新技术研究院 Image key point enhancement detection method under single-person sports scene
CN116563400A (en) * 2023-07-12 2023-08-08 南通原力云信息技术有限公司 Small program image information compression processing method
CN116563400B (en) * 2023-07-12 2023-09-05 南通原力云信息技术有限公司 Small program image information compression processing method

Similar Documents

Publication Title
CN111507138A (en) Image recognition method and device, computer equipment and storage medium
CN108960211B (en) Multi-target human body posture detection method and system
CN111460931B (en) Face spoofing detection method and system based on color channel difference image characteristics
US11017215B2 (en) Two-stage person searching method combining face and appearance features
CN111950424B (en) Video data processing method and device, computer and readable storage medium
CN111222513B (en) License plate number recognition method and device, electronic equipment and storage medium
CN112541448B (en) Pedestrian re-identification method and device, electronic equipment and storage medium
CN111709313B (en) Pedestrian re-identification method based on local and channel combination characteristics
CN107766864B (en) Method and device for extracting features and method and device for object recognition
CN110334622B (en) Pedestrian retrieval method based on adaptive feature pyramid
CN115171165A (en) Pedestrian re-identification method and device with global features and step-type local features fused
CN115690542A (en) Improved yolov 5-based aerial insulator directional identification method
CN111539456B (en) Target identification method and device
CN112464775A (en) Video target re-identification method based on multi-branch network
CN111582155A (en) Living body detection method, living body detection device, computer equipment and storage medium
CN111259792A (en) Face living body detection method based on DWT-LBP-DCT characteristics
CN110572369A (en) picture verification method and device, computer equipment and storage medium
CN111507467A (en) Neural network model training method and device, computer equipment and storage medium
CN114005019A (en) Method for identifying copied image and related equipment thereof
CN112488072A (en) Method, system and equipment for acquiring face sample set
CN110503059B (en) Face recognition method and system
CN115661692A (en) Unmanned aerial vehicle detection method and system based on improved CenterNet detection network
CN116229528A (en) Living body palm vein detection method, device, equipment and storage medium
CN115273202A (en) Face comparison method, system, equipment and storage medium
CN116434100A (en) Liquid leakage detection method, device, system and machine-readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination