CN112861589A: Portrait extraction, quality evaluation, identity verification and model training method and device


Info

Publication number
CN112861589A
CN112861589A (application CN201911189725.9A)
Authority
CN
China
Prior art keywords
image
portrait
quality evaluation
card
processed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911189725.9A
Other languages
Chinese (zh)
Inventor
沈程隆
蒋宁
赵立军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mashang Xiaofei Finance Co Ltd
Mashang Consumer Finance Co Ltd
Original Assignee
Mashang Xiaofei Finance Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mashang Xiaofei Finance Co Ltd filed Critical Mashang Xiaofei Finance Co Ltd
Priority to CN201911189725.9A
Publication of CN112861589A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/168 Feature extraction; Face representation
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/08 Learning methods
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G06V 30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/10 Character recognition
    • G06V 30/14 Image acquisition
    • G06V 30/146 Aligning or centring of the image pick-up or image-field
    • G06V 30/1475 Inclination or skew detection or correction of characters or of the image to be recognised
    • G06V 30/1478 Inclination or skew detection or correction of characters or character lines

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Human Computer Interaction (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a portrait extraction, quality evaluation, identity verification and model training method and device, wherein the portrait extraction method comprises the following steps: extracting a card area image from an image to be processed; and extracting a portrait area image from the card area image according to a target relative position, wherein the target relative position is a preset relative position between the card area image and the portrait area image. The portrait extraction method provided by the invention can accurately extract the portrait area image from a card image.

Description

Portrait extraction, quality evaluation, identity verification and model training method and device
Technical Field
The invention relates to the technical field of information processing, and in particular to methods and devices for portrait extraction, quality evaluation, identity verification and model training.
Background
With the continuous development of Internet technology, a large number of Internet applications have emerged, such as online shopping and Internet finance (e.g., credit, wealth management, and payment). To ensure security, many Internet applications require verification of the user's identity. Some schemes automatically perform face recognition on the user based on a card image uploaded by the user (e.g., an identification card image, a social security card image, or a passport image) to verify the user's identity.
In general, the face recognition effect is affected by image quality. Most current quality evaluation methods for card images first detect the portrait area image in the card image with a face detection algorithm, then estimate the blurriness of the portrait area image by computing a gradient function or a similar measure, or judge whether the portrait area reflects light from the distribution of its histogram. However, such methods fluctuate strongly with image scale, and card images uploaded by users often differ in size, so good recognition accuracy cannot be achieved across scales. In addition, owing to factors such as illumination, shooting angle, and shooting equipment, card images uploaded by users often lose image features in the portrait area, so detecting the portrait area image with a face detection algorithm easily fails or produces an inaccurate result, which in turn affects the subsequent judgment of blurriness or reflection.
Therefore, the prior art suffers from poor accuracy in detecting the portrait area image of a card image.
Disclosure of Invention
The embodiments of the invention provide methods and devices for portrait extraction, quality evaluation, identity verification and model training, aiming to solve the prior-art problem that the portrait area image of a card image is detected with poor accuracy.
To solve this technical problem, the invention is realized as follows:
in a first aspect, an embodiment of the present invention provides a portrait extracting method, where the method includes:
extracting a card area image in an image to be processed;
and extracting a portrait area image from the card area image according to a target relative position, wherein the target relative position is a preset relative position between the card area image and the portrait area image.
In a second aspect, an embodiment of the present invention provides an image quality evaluation method. The method comprises the following steps:
acquiring an image to be processed, wherein the image to be processed is an image to be evaluated;
processing the image to be evaluated by using the above portrait extraction method to obtain a portrait area image;
and inputting the portrait area image into an image quality evaluation model to obtain an image quality evaluation result of the portrait area image, wherein the image quality evaluation model is obtained based on target neural network training.
In a third aspect, an embodiment of the present invention provides an identity authentication method. The method comprises the following steps:
receiving an image to be processed uploaded by a user, wherein the image to be processed comprises a card area image;
performing image quality evaluation on the image to be processed by using the image quality evaluation method to obtain an image quality evaluation result;
if the image quality evaluation result indicates that the image to be processed is a qualified image, performing identity verification according to the image to be processed;
and if the image quality evaluation result indicates that the image to be processed is an unqualified image, outputting prompt information, wherein the prompt information is used for prompting a user to upload the image again.
In a fourth aspect, an embodiment of the present invention provides a model training method. The method comprises the following steps:
the method comprises the steps of obtaining S image samples and label data of the S image samples, wherein each image sample comprises a card area image, the label data are used for indicating the image quality category of a portrait area image in the card area image, and S is an integer greater than 1;
respectively extracting the card area images in each image sample in the S image samples to obtain S card area images;
respectively extracting portrait area images from the S card area images according to a target relative position, wherein the target relative position is a preset relative position between the card area image and the portrait area image;
and training a target neural network according to the portrait area images in the S card area images and the label data to obtain an image quality evaluation model.
In a fifth aspect, the embodiment of the present invention further provides a portrait extracting apparatus. The portrait extraction device includes:
the first extraction module is used for extracting a card area image in the image to be processed;
and the second extraction module is used for extracting the portrait area image in the card area image according to a target relative position, wherein the target relative position is a preset relative position of the card area image and the portrait area image.
In a sixth aspect, an embodiment of the present invention further provides an image quality evaluation apparatus. The image quality evaluation apparatus includes:
an acquisition module, configured to acquire an image to be processed, wherein the image to be processed is an image to be evaluated;
a portrait extracting module, configured to process the image to be evaluated by using the above portrait extraction method to obtain a portrait area image;
and the evaluation module is used for inputting the portrait area image into an image quality evaluation model to obtain an image quality evaluation result of the portrait area image, wherein the image quality evaluation model is a model obtained based on target neural network training.
In a seventh aspect, an embodiment of the present invention further provides an identity authentication apparatus. The identity authentication device includes:
the receiving module is used for receiving an image to be processed uploaded by a user, wherein the image to be processed comprises a card area image;
the quality evaluation module is used for evaluating the image quality of the image to be processed by using the image quality evaluation method to obtain an image quality evaluation result;
the verification module is used for performing identity verification according to the image to be processed if the image quality evaluation result indicates that the image to be processed is a qualified image;
and the output module is used for outputting prompt information if the image quality evaluation result indicates that the image to be processed is an unqualified image, wherein the prompt information is used for prompting a user to upload the image again.
In an eighth aspect, an embodiment of the present invention further provides a model training apparatus. The model training device includes:
an acquisition module, configured to acquire S image samples and label data of the S image samples, wherein each image sample comprises a card area image, the label data are used for indicating the image quality category of a portrait area image in the card area image, and S is an integer greater than 1;
the first extraction module is used for respectively extracting the card area images in each image sample in the S image samples to obtain S card area images;
a second extraction module, configured to extract portrait area images from the S card area images according to a target relative position, wherein the target relative position is a preset relative position between the card area image and the portrait area image;
and the training module is used for training a target neural network according to the portrait area images in the S card area images and the label data to obtain an image quality evaluation model.
In a ninth aspect, an embodiment of the present invention further provides an electronic device, which includes a processor, a memory, and a computer program stored in the memory and executable on the processor, where the computer program, when executed by the processor, implements the steps of the portrait extraction method provided in the first aspect, or implements the steps of the image quality evaluation method provided in the second aspect, or implements the steps of the identity verification method provided in the third aspect, or implements the steps of the model training method provided in the fourth aspect.
In a tenth aspect, an embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when executed by a processor, the computer program implements the steps of the portrait extraction method provided in the first aspect, or the steps of the image quality evaluation method provided in the second aspect, or the steps of the identity verification method provided in the third aspect, or the steps of the model training method provided in the fourth aspect.
In the embodiment of the invention, the card area image is extracted from the image to be processed, and the portrait area image is then extracted from the card area image according to the target relative position. Because the portrait area image is extracted from the card image based on the relative position of the card area image and the portrait area image, the portrait area image can be extracted accurately whether or not portrait features are missing from the card image.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a flowchart of a portrait extracting method according to an embodiment of the present invention;
FIG. 2 is a schematic illustration of a card image provided by an embodiment of the present invention;
FIG. 3a is a schematic diagram of a card area image with tilt according to an embodiment of the present invention;
FIG. 3b is a schematic diagram of an image of a card area after tilt correction according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the relative positions of a card region image and a portrait region image provided by an embodiment of the present invention;
FIG. 5 is a flow chart of an image quality evaluation method provided by an embodiment of the invention;
fig. 6 is a flowchart of an authentication method provided in an embodiment of the present invention;
FIG. 7 is a flow chart of a model training method provided by an embodiment of the invention;
FIG. 8 is a schematic structural diagram of a target neural network provided by an embodiment of the present invention;
fig. 9 is a structural diagram of a portrait extracting apparatus according to an embodiment of the present invention;
fig. 10 is a structural diagram of an image quality evaluation apparatus provided by an embodiment of the present invention;
fig. 11 is a block diagram of an authentication apparatus according to an embodiment of the present invention;
FIG. 12 is a block diagram of a model training apparatus according to an embodiment of the present invention;
fig. 13 is a block diagram of a portrait extracting apparatus according to yet another embodiment of the present invention;
fig. 14 is a structural view of an image quality evaluation apparatus provided in another embodiment of the present invention;
fig. 15 is a block diagram of an authentication apparatus according to still another embodiment of the present invention;
fig. 16 is a block diagram of a model training apparatus according to still another embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention provides a portrait extracting method. Referring to fig. 1, fig. 1 is a flowchart of a portrait extraction method provided by an embodiment of the present invention, as shown in fig. 1, including the following steps:
step 101, extracting a card area image in an image to be processed.
In this embodiment, the image to be processed may be any image including a card area, which may also be referred to as a card image, such as an identification card image, a driving license image, a passport image, a social security card image, a student card image, an employee card image, and the like. In general, a card image usually includes some background area in addition to the card area image. For example, as shown in fig. 2, the card image 10 includes a card area image 11 and a background area image 12.
Optionally, the card area image in the image to be processed may be located and cropped by an image edge detection method, or by a pre-trained card detection model, which is not limited in this embodiment. The card detection model may include, but is not limited to, a model obtained by training a generative adversarial network or a model obtained by training a YoloV3 network, etc.
Optionally, if the extracted card area image is tilted, it may be transformed to correct the tilt; for example, the card area image before correction is shown in fig. 3a, and the card area image after correction is shown in fig. 3b.
Optionally, before extracting the card area image from the image to be processed, this embodiment may further preprocess the image to be processed, for example, by image enhancement, image filtering, and normalization, where normalization may refer to scaling the image to a preset size; the card area image is then extracted from the preprocessed image.
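As a concrete illustration, the tilt correction mentioned above can be done with a perspective transform once the four card corners are known. The following is a minimal sketch assuming OpenCV and NumPy, which the patent itself does not name; how the corners are obtained (edge detection or a detection model) is left open.

```python
import cv2
import numpy as np

def correct_tilt(image, corners):
    """Warp a tilted card region to an upright rectangle.
    corners: 4 x 2 float array ordered (top-left, top-right,
    bottom-right, bottom-left), e.g. from edge detection."""
    tl, tr, br, bl = corners
    w = int(max(np.linalg.norm(tr - tl), np.linalg.norm(br - bl)))
    h = int(max(np.linalg.norm(bl - tl), np.linalg.norm(br - tr)))
    dst = np.array([[0, 0], [w - 1, 0], [w - 1, h - 1], [0, h - 1]],
                   dtype=np.float32)
    M = cv2.getPerspectiveTransform(corners.astype(np.float32), dst)
    return cv2.warpPerspective(image, M, (w, h))
```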
And 102, extracting a portrait area image in the card area image according to a target relative position, wherein the target relative position is a preset relative position of the card area image and the portrait area image.
In practice, the portrait is usually located at a fixed position on the card, and the sizes of the portrait and the card are fixed, so the relative position of the card area image and the portrait area image tends to be fixed. Taking the identification card image as an example, the portrait area image typically spans 0.638 to 0.927 of the card area image along the length (horizontal) direction and 0.175 to 0.703 along the width (vertical) direction, as shown in fig. 4.
In this embodiment, the portrait area image is extracted from the card area image according to the relative position of the card area image and the portrait area image, so the portrait area image can be extracted accurately from the card image whether or not facial features are missing from it.
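To make the extraction in step 102 concrete, a minimal sketch follows. The relative coordinates are the illustrative ID-card values quoted above; in practice they would be the statistically determined target relative position described next.

```python
# Example relative position (x0, x1 along the length; y0, y1 along the width),
# taken from the ID-card figures quoted above; the values are illustrative.
TARGET_REL_POS = (0.638, 0.927, 0.175, 0.703)

def crop_portrait(card_image, rel=TARGET_REL_POS):
    """Crop the portrait area from a corrected card area image
    (an H x W x C array) using the preset relative position."""
    h, w = card_image.shape[:2]
    x0, x1, y0, y1 = rel
    return card_image[int(y0 * h):int(y1 * h), int(x0 * w):int(x1 * w)]
```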
Optionally, the method may further include:
acquiring N image samples, wherein N is an integer greater than 1, and each image sample comprises a card area image;
extracting card area images of each image sample in the N image samples to obtain N card area images;
respectively carrying out face detection on each card area image in the N card area images to obtain a human image area image in each card area image in the N card area images;
counting the relative position of each card area image and the portrait area image in the N card area images to obtain N relative positions;
and determining the relative position of the target according to the N relative positions.
In this embodiment, the value of N may be set reasonably according to actual requirements, for example, 1000, 2000, 10000, 100000, or the like.
In practice, extracted card area images often exhibit some positional deviation. By counting the relative positions of the card area image and the portrait area image over a sufficient number of image samples and determining the target relative position from these statistics, this embodiment improves the accuracy of the preset target relative position.
Specifically, the card area image of each of the N image samples may be extracted; the coordinate position of the face area in each card area image (for example, the coordinates of the face frame enclosing the face area) may be obtained with a face detection algorithm; the obtained face-area coordinates may then be enlarged (for example, by a factor of 1.2 or 1.3) according to the relative size relationship between the face area and the portrait area image; and the relative position of each card area image and its portrait area image is recorded, so that the target relative position can be determined from the N relative positions.
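A sketch of the face-box enlargement step, under the assumption that the box is expanded about its center and clamped to the card image bounds (the text only specifies the scale factor):

```python
def enlarge_face_box(x, y, w, h, card_w, card_h, scale=1.2):
    """Expand a detected face box about its center by `scale`
    (e.g. 1.2 or 1.3) to approximate the full portrait area,
    clamped to the card image boundaries."""
    cx, cy = x + w / 2.0, y + h / 2.0
    nw, nh = w * scale, h * scale
    x0 = max(0.0, cx - nw / 2.0)
    y0 = max(0.0, cy - nh / 2.0)
    x1 = min(float(card_w), cx + nw / 2.0)
    y1 = min(float(card_h), cy + nh / 2.0)
    return x0, y0, x1 - x0, y1 - y0
```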
The relative position between the card area image and the portrait area image may include a sub-relative position in the length direction, for example, the position of the first column of pixels of the portrait area image relative to the length of the card area image and the position of the last column of pixels relative to that length, and a sub-relative position in the width direction, for example, the position of the first row of pixels of the portrait area image relative to the width of the card area image and the position of the last row of pixels relative to that width.
Optionally, determining the target relative position based on the N relative positions may include determining the average of each corresponding sub-relative position across the N relative positions as the corresponding sub-relative position of the target relative position. For example, the N relative positions comprise N first sub-relative positions (the position of the first column of pixels of the portrait area image relative to the length of the card area image), N second sub-relative positions (the position of the last column of pixels relative to that length), N third sub-relative positions (the position of the first row of pixels relative to the width of the card area image), and N fourth sub-relative positions (the position of the last row of pixels relative to that width); the target relative position may then include the average of the N first sub-relative positions, the average of the N second sub-relative positions, the average of the N third sub-relative positions, and the average of the N fourth sub-relative positions.
Optionally, determining the target relative position based on the N relative positions may also include determining a maximum value or a minimum value of corresponding sub-relative positions of the sub-relative positions included in the N relative positions as corresponding sub-relative positions of the target relative position. For example, a minimum value of the N first sub relative positions is determined as a first sub relative position in the target relative position, a maximum value of the N second sub relative positions is determined as a second sub relative position in the target relative position, a minimum value of the N third sub relative positions is determined as a third sub relative position in the target relative position, and a maximum value of the N fourth sub relative positions is determined as a fourth sub relative position in the target relative position.
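Both aggregation strategies can be written compactly. Below is a sketch assuming NumPy, where each row of rel_positions holds the four sub-relative positions (first/last column along the length, first/last row along the width) measured on one sample:

```python
import numpy as np

def target_relative_position(rel_positions, mode="mean"):
    """rel_positions: N x 4 array of (x0, x1, y0, y1) fractions.
    'mean' averages each sub-relative position; 'cover' takes the
    min of each start edge and the max of each end edge, so the
    target box covers every observed portrait box."""
    rel = np.asarray(rel_positions, dtype=float)
    if mode == "mean":
        return tuple(rel.mean(axis=0))
    return (rel[:, 0].min(), rel[:, 1].max(),
            rel[:, 2].min(), rel[:, 3].max())
```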
It should be noted that the face detection algorithm may include, but is not limited to, Cascade CNN, Face R-CNN, Faceness-Net, MTCNN, or DenseBox, and the like, which is not limited in this embodiment.
Optionally, the extracting the card area image in the image to be processed may include:
inputting the image to be processed into a card detection model to obtain position information of a card area image in the image to be processed, wherein the card detection model is obtained based on YoloV3 network training;
and extracting the card area image in the image to be processed according to the position information of the card area image in the image to be processed.
In this embodiment, the card area image in the image to be processed may be located by the card detection model obtained by training a YoloV3 network; that is, the position information of the card area image in the image to be processed is obtained, and the card area image is then extracted from the image to be processed based on that position information. In practical application, the YoloV3 network can be trained on a number of image samples including card area images and their label data to obtain the card detection model.
Extracting the card area image from the image to be processed with the card detection model improves both the speed and the accuracy of card area image extraction.
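The interface between the detector and the cropping step is simple; the sketch below assumes some trained card_detector callable (for example a YoloV3-style model) that returns a single pixel-space box, which is an assumption rather than an API defined by the patent:

```python
def extract_card(image, card_detector):
    """Locate and crop the card area. `card_detector` stands for any
    trained detection model returning the card box as (x, y, w, h)."""
    x, y, w, h = card_detector(image)
    return image[y:y + h, x:x + w]
```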
The embodiment of the invention also provides an image quality evaluation method. Referring to fig. 5, fig. 5 is a flowchart of an image quality evaluation method according to an embodiment of the present invention, and as shown in fig. 5, the method includes the following steps:
and 501, acquiring an image to be processed, wherein the image to be processed is an image to be evaluated.
In this embodiment, the image to be processed is the same as the image to be processed in the above embodiment; the image to be processed may be any image containing a card area, and may also be referred to as a card image, such as an identification card image, a driver's license image, a passport image, a social security card image, a student's license image, a work card image, and the like. In general, the card image usually includes some background area images in addition to the card area image. For example, as shown in fig. 2, the card image 10 includes a card area image 11 and a background area image 12.
Step 502, processing the image to be evaluated by using the portrait extraction method to obtain a portrait area image.
In this embodiment, the image to be evaluated may be processed with the portrait extraction method provided in any of the above embodiments to obtain the portrait area image of the image to be evaluated. For details of the portrait extraction method, reference may be made to the foregoing discussion, which is not repeated here.
In this embodiment, the portrait area image is extracted from the card area image according to the relative position of the card area image and the portrait area image, so the portrait area image can be extracted accurately from the card image whether or not facial features are missing from it.
Step 503, inputting the portrait area image into an image quality evaluation model to obtain an image quality evaluation result of the portrait area image, wherein the image quality evaluation model is obtained based on target neural network training.
In this embodiment, the target neural network may include, but is not limited to, an Inception network, a ResNet network, a DenseNet network, or a custom neural network. In practical application, the target neural network can be trained on a number of image samples and their label data to obtain the image quality evaluation model. The label data indicate the image quality category of the portrait area image of each image sample, such as sharp, blurred, or reflective, and the image samples may include images of different quality categories.
In this step, the portrait area image is input into the image quality evaluation model, which performs image quality evaluation on it and outputs the image quality evaluation result. The result may include every image quality category together with its probability, for example, sharp: 0.90, blurred: 0.095, reflective: 0.005; alternatively, it may include only the most probable image quality category, e.g., sharp.
In this embodiment, image quality evaluation is performed on the extracted portrait area image with a trained image quality evaluation model, so a relatively accurate evaluation result can be obtained across different image scales. The approach is also broadly applicable: it can evaluate image quality problems caused by a variety of factors, such as motion blur, Gaussian blur, reflection, and the like.
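A minimal inference sketch, assuming a PyTorch model that takes a 112 × 112 RGB tensor (the input size used in the network example later in this description) and returns class logits; the class names are the three-category example above and are illustrative:

```python
import torch
import torch.nn.functional as F

CLASSES = ("sharp", "blurred", "reflective")  # example three-way split

def evaluate_quality(model, portrait_img):
    """portrait_img: H x W x 3 uint8 array of the cropped portrait area.
    Returns a dict mapping each image quality category to its probability."""
    x = torch.from_numpy(portrait_img).permute(2, 0, 1).float() / 255.0
    x = F.interpolate(x.unsqueeze(0), size=(112, 112),
                      mode="bilinear", align_corners=False)
    with torch.no_grad():
        probs = F.softmax(model(x), dim=1)[0]
    return dict(zip(CLASSES, probs.tolist()))
```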
Optionally, the target neural network includes an input sub-network, a separable convolution sub-network, and an output sub-network, which are connected in sequence, where the input sub-network is configured to receive an input image, the output sub-network is configured to output probabilities of M image quality classes corresponding to the input image, and M is a positive integer.
The M image quality categories may be reasonably set according to actual needs, for example, the M image quality categories may include three image quality categories of sharpness, blur, and reflection, may also include four image quality categories of sharpness, blur, reflection, and low illuminance, and may also include five image quality categories of sharpness, motion blur, gaussian blur, reflection, and low illuminance, and the like.
The input sub-network may include, but is not limited to, standard convolutional layers for receiving the input image and may extract relatively rich low-level abstract features of the input image. The separable convolution sub-network can carry out separable convolution on the input feature graph, greatly reduces network parameters, reduces the size of a model and can keep the strong learning capacity of the network. The above-mentioned output sub-networks may include, but are not limited to, a max-pooling layer, a full-connectivity layer, and a softmax layer to output probabilities of M image quality classes.
Optionally, the separable convolution sub-network includes R separable convolution units, where the separable convolution units include a separable convolution layer, a point convolution layer, a bulk normalization layer, and an activation layer, and R is an integer greater than 1.
The value of R can be set reasonably according to actual requirements, for example, 5, 6, or 7. Preferably, R is 5.
The separable convolution layer may include a depthwise convolution layer and a point convolution layer (i.e., a 1 × 1 convolution layer) for performing separable convolution on the input feature map. The point convolution layer can integrate the relationships between channels in the feature map output by the depthwise convolution.
The batch normalization layer (also called a BatchNorm layer) can be used to accelerate network training and make the loss function converge quickly. The activation layer may also be referred to as an excitation layer, and its activation function may include, but is not limited to, the relu function. Optionally, in this embodiment, a batch normalization layer and an activation layer may be provided after each separable convolution layer and each point convolution layer, so as to improve the training speed and the classification capability of the network.
Optionally, the input sub-network includes P standard convolution layers, an output end of at least one of the P standard convolution layers is connected to a batch normalization layer and an activation layer, and P is a positive integer.
The value of P can be set reasonably according to actual conditions, for example, 1, 2, etc. Preferably, P has a value of 1. The activation function of the activation layer may include, but is not limited to, a relu function.
In this embodiment, the input sub-network includes a P-layer standard convolution layer, a batch normalization layer and an activation layer, which can improve the training speed and classification capability of the network.
Optionally, the output sub-network includes a standard convolution layer, a batch normalization layer, a maximum pooling layer, a full connection layer, and a softmax layer, which are connected in sequence.
Optionally, before the image quality evaluation result of the portrait area image is obtained by inputting the portrait area image into an image quality evaluation model, the method may further include:
and training the image quality evaluation model.
For example, a plurality of image samples and label data of the plurality of image samples may be acquired, where each image sample includes a card area image, the label data is used to indicate an image quality category of a portrait area image in the card area image, and a target neural network is trained according to the plurality of image samples and the label data to obtain an image quality evaluation model.
Optionally, the training of the image quality evaluation model may include: the method comprises the steps of obtaining S image samples and label data of the S image samples, wherein each image sample comprises a card area image, the label data are used for indicating the image quality category of a portrait area image in the card area image, and S is an integer larger than 1; respectively extracting the card area images in each image sample in the S image samples to obtain S card area images; respectively extracting portrait area images in the S card area images according to a target relative position, wherein the target relative position is the relative position of the card area images and the portrait area images; and training a target neural network according to the portrait area images in the S card area images and the label data to obtain an image quality evaluation model.
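The training described above can be sketched as a standard supervised loop; PyTorch, the Adam optimizer, and the batch size are assumptions, while the cross-entropy loss matches the loss function named later in this description:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

def train_quality_model(model, dataset, epochs=10, lr=1e-3):
    """dataset is assumed to yield (portrait_tensor, quality_label) pairs,
    where the portrait tensors are crops extracted as described above."""
    loader = DataLoader(dataset, batch_size=64, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for images, labels in loader:
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
    return model
```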
The embodiment of the invention also provides an identity authentication method. Referring to fig. 6, fig. 6 is a flowchart of an authentication method provided in an embodiment of the present invention, as shown in fig. 6, including the following steps:
step 601, receiving an image to be processed uploaded by a user, wherein the image to be processed comprises a card area image.
In this embodiment, the image to be processed may be any image including an image of a card area, which may also be referred to as a card image, such as an identification card image, a driving license image, a passport image, a social security card image, a student's license image, and a work card image.
Step 602, performing image quality evaluation on the image to be processed by using the image quality evaluation method to obtain an image quality evaluation result.
In this embodiment, the image quality evaluation method provided in any one of the above embodiments may be used to perform image quality evaluation on the portrait area image of the card area image of the image to be processed, so as to obtain an image quality evaluation result. The relevant content of the image quality evaluation method can be referred to the foregoing discussion, and is not described herein again.
Step 603, if the image quality evaluation result indicates that the image to be processed is a qualified image, performing identity verification according to the image to be processed.
In this step, if the image quality evaluation result indicates that the image to be processed is a qualified image, for example, the image quality category with the highest probability is sharp, identity verification may be performed based on the image to be processed, for example, face recognition may be performed on it.
Step 604, outputting prompt information if the image quality evaluation result indicates that the image to be processed is an unqualified image, wherein the prompt information is used for prompting the user to upload the image again.
In this step, if the image quality evaluation result indicates that the image to be processed is an unqualified image, for example, the image quality category with the highest probability is blurred or reflective, then performing identity verification on the image would be likely to fail face recognition owing to missing portrait features; therefore, prompt information can be output to prompt the user to upload a qualified image again.
In practice, to ensure security, many Internet applications need to authenticate the user. In this embodiment, when identity verification is required, the card image uploaded by the user is received and its quality is evaluated. If the card image is a qualified image, identity verification (for example, face recognition) can be performed directly on it; if the card image is unqualified, the user can be prompted to upload the card image again, so as to reduce verification failures caused by poor image quality, and the re-uploaded card image is evaluated in the same way.
Optionally, the prompt message may include reason information that the card image is not qualified, for example, at least one of motion blur, gaussian blur, light reflection, low illumination, and the like, so that the user may upload the card image again with reference to the reason information.
In this embodiment, the image quality evaluation method is used to perform image quality evaluation on the to-be-processed image uploaded by the user to obtain an image quality evaluation result, perform identity verification according to the to-be-processed image when the image quality evaluation result indicates that the to-be-processed image is a qualified image, and output prompt information to prompt the user to upload a qualified image again when the image quality evaluation result indicates that the to-be-processed image is an unqualified image, so that accuracy of identity verification can be improved.
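Putting the steps together, the verification flow reduces to a small branch on the evaluation result. In the sketch below, quality_fn and face_verify_fn are placeholders for the quality model and the face recognition step, and the "sharp" gate is the qualified-image example used above:

```python
def verify_identity(image, quality_fn, face_verify_fn):
    """Run quality gating before face recognition, per steps 601-604.
    quality_fn returns a dict of image quality category -> probability."""
    result = quality_fn(image)
    best = max(result, key=result.get)
    if best == "sharp":                      # qualified image: verify identity
        return {"ok": face_verify_fn(image)}
    # unqualified image: prompt a re-upload, citing the failure reason
    return {"ok": False, "prompt": f"please upload the image again ({best})"}
```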
The embodiment of the invention also provides a model training method. Referring to fig. 7, fig. 7 is a flowchart of a model training method according to an embodiment of the present invention, as shown in fig. 7, including the following steps:
step 701, obtaining S image samples and label data of the S image samples, where each image sample includes a card area image, the label data is used to indicate an image quality category of a portrait area image in the card area image, and S is an integer greater than 1.
The value of S can be set reasonably according to actual requirements, for example, 5000, 20000, or 100000. The S image samples may include image samples of different image quality categories; for example, they may include blurred, reflective, and sharp image samples. Optionally, the blurred image samples may cover different blur types, for example, Gaussian blur, motion blur, and a mixture of Gaussian and motion blur. Optionally, the reflective image samples may cover different reflection intensities.
Step 702, extracting the card area image from each of the S image samples to obtain S card area images.
In this embodiment, the card area image in an image sample may be located and cropped by image edge detection, or by a pre-trained card detection model, which is not limited in this embodiment. The card detection model may include, but is not limited to, a model obtained by training a generative adversarial network or a model obtained by training a YoloV3 network, etc.
Step 703, extracting the portrait area images from the S card area images according to a target relative position, wherein the target relative position is a preset relative position between the card area image and the portrait area image.
The relevant content of the relative position of the target can be referred to the foregoing discussion, and is not described herein.
Step 704, training a target neural network according to the portrait area images in the S card area images and the label data to obtain an image quality evaluation model.
In this step, the target neural network may be trained based on the portrait area images in the S card area images and the label data corresponding to each portrait area image to obtain an image quality evaluation model, and then image quality detection may be performed based on the image quality evaluation model.
Optionally, the target neural network may include an input sub-network, a separable convolution sub-network, and an output sub-network connected in sequence, where the input sub-network is configured to receive an input image, the output sub-network is configured to output probabilities of M image quality classes corresponding to the input image, and M is a positive integer.
The input sub-network may include, but is not limited to, standard convolutional layers for receiving the input image and may extract relatively rich low-level abstract features of the input image. The separable convolution sub-network can carry out separable convolution on the input feature graph, greatly reduces network parameters, reduces the size of a model and can keep the strong learning capacity of the network. The above-mentioned output sub-networks may include, but are not limited to, a max-pooling layer, a full-connectivity layer, and a softmax layer to output probabilities of M image quality classes.
Optionally, the separable convolution sub-network includes R separable convolution units, where the separable convolution units include a separable convolution layer, a point convolution layer, a bulk normalization layer, and an activation layer, and R is an integer greater than 1.
The value of the R can be reasonably set according to actual requirements, for example, 5, 6, 7 and the like. Preferably, R is 5.
The separable convolution layer may include a depthwise convolution layer and a point convolution layer (i.e., a 1 × 1 convolution layer) for performing separable convolution on the input feature map. The point convolution layer can integrate the relationships between channels in the feature map output by the depthwise convolution.
The batch normalization layer (also called a BatchNorm layer) can be used to accelerate network training and make the loss function converge quickly. The activation layer may also be referred to as an excitation layer, and its activation function may include, but is not limited to, the relu function. Optionally, in this embodiment, a batch normalization layer and an activation layer may be provided after each separable convolution layer and each point convolution layer, so as to improve the training speed and the classification capability of the network.
Optionally, the input sub-network includes P standard convolution layers, an output end of at least one of the P standard convolution layers is connected to a batch normalization layer and an activation layer, and P is a positive integer.
The value of P can be set reasonably according to actual conditions, for example, 1, 2, etc. Preferably, P has a value of 1. The activation function of the activation layer may include, but is not limited to, a relu function.
In this embodiment, the input sub-network includes a P-layer standard convolution layer, a batch normalization layer and an activation layer, which can improve the training speed and classification capability of the network.
Optionally, the output sub-network includes a standard convolution layer, a batch normalization layer, a maximum pooling layer, a full connection layer, and a softmax layer, which are connected in sequence.
In the model training stage, the portrait area images are extracted from the card images based on the relative positions of the card area images and the portrait area images, so that the portrait area image extraction efficiency can be improved while the portrait area image extraction accuracy is ensured, and further, the model training efficiency and accuracy are improved. In addition, the built target neural network is provided with a batch normalization layer and an activation layer, so that the model training speed can be increased, and the classification capability of the trained model can be improved.
The following description is made with reference to the target neural network shown in fig. 8 as an example:
referring to fig. 8, the target neural network provided in this embodiment may include:
a first layer: the standard convolution layer, which may also be referred to as the normal convolution layer, has a convolution kernel of 3 × 3, a step size of 2, a batcnorm, an activation function of relu, a number of output channels of 32, an input image of 112 × 3, and an output feature map (i.e., feature map) of 56 × 32.
A second layer: convolution layers with convolution kernel 3 x 3, step 2, using batcnorm, activation function relu, output channel number 64, and feature map of its output may be 28 x 64.
And a third layer: the dot convolution layer, which may also be referred to as a 1 × 1 convolution layer, has a convolution kernel of 1 × 1, a step size of 1, a batcnorm, an activation function of relu, a number of output channels of 64, and a feature map of its output of 28 × 64. The point convolution layer may be used to integrate the relationship between channels.
A fourth layer: convolution layers with convolution kernel 3 x 3, step 2, activation function relu using batcnorm, number of output channels 128, and feature map of its output may be 14 x 128.
And a fifth layer: 1 × 1 convolution layer with convolution kernel 1 × 1, step size 1, using batcnorm, activation function relu, output channel number 128, and feature map output 14 × 128. The 1 x 1 convolutional layer can also be used to integrate the relationship between channels.
A sixth layer: convolution layers with convolution kernel 3 x 3, step size 2, using batcnorm, activation function relu, output channel number 256, and feature map of output 7 x 256 can be separated.
A seventh layer: 1 × 1 convolution layer with convolution kernel 1 × 1, step size 1, using batcnorm, activation function relu, output channel number 256, and feature map output 7 × 256. The 1 x 1 convolutional layer can also be used to integrate the relationship between channels.
An eighth layer: convolution layers with convolution kernel 3 x 3, step size 2, using batcnorm, activation function relu, output channel number 512, and feature map of output 4 x 512 can be separated.
A ninth layer: 1 × 1 convolution layer, convolution kernel 1 × 1, step size 1, using batcnorm, activation function relu, output channel number 512, and feature map output 4 × 512. The 1 x 1 convolutional layer can also be used to integrate the relationship between channels.
A tenth layer: the convolution kernel of the normal convolution layer is 3 × 3, the step size is 2, the Batchnorm is used, the activation function is relu, the number of output channels is 512, and the feature map of the output is 2 × 512.
The eleventh layer: the largest pooling layer (i.e., maxpoling) is 2 x 2 in size, 1 in step size, does not fill the boundary, and outputs a feature map of 1 x 512.
A twelfth layer: and the full link layer and the softmax layer can determine the number of outputs according to the number of categories.
It should be noted that the loss function used in the embodiment to train the target neural network may be a cross entropy loss function.
In the target neural network provided by this embodiment, the number of channels doubles each time the spatial size of the feature map is halved, and adding BatchNorm to each layer accelerates network training so that the loss function converges quickly.
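For concreteness, the twelve-layer network above can be written out directly. The sketch below assumes PyTorch and implements the depthwise separable layers as grouped convolutions with a channel multiplier of 2, which is one reading of the layer descriptions rather than the only possible one:

```python
import torch
import torch.nn as nn

def sep_unit(in_ch, out_ch):
    """One separable unit: depthwise 3x3 conv (stride 2) + BN + relu,
    then a 1x1 point conv + BN + relu (layers 2-9 above)."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1, groups=in_ch),
        nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 1),
        nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
    )

class QualityNet(nn.Module):
    def __init__(self, num_classes=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1),    # layer 1: 112 -> 56
            nn.BatchNorm2d(32), nn.ReLU(inplace=True),
            sep_unit(32, 64),                            # layers 2-3: 56 -> 28
            sep_unit(64, 128),                           # layers 4-5: 28 -> 14
            sep_unit(128, 256),                          # layers 6-7: 14 -> 7
            sep_unit(256, 512),                          # layers 8-9: 7 -> 4
            nn.Conv2d(512, 512, 3, stride=2, padding=1), # layer 10: 4 -> 2
            nn.BatchNorm2d(512), nn.ReLU(inplace=True),
            nn.MaxPool2d(2, stride=1),                   # layer 11: 2 -> 1
        )
        self.fc = nn.Linear(512, num_classes)            # layer 12

    def forward(self, x):                                # x: N x 3 x 112 x 112
        x = self.features(x).flatten(1)
        return self.fc(x)  # logits; softmax is applied at inference or in the loss
```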
Referring to fig. 9, fig. 9 is a structural diagram of a portrait extracting apparatus according to an embodiment of the present invention. As shown in fig. 9, the portrait extracting apparatus 900 includes:
a first extracting module 901, configured to extract a card area image in an image to be processed;
a second extracting module 902, configured to extract a portrait area image from the card area image according to a target relative position, where the target relative position is a preset relative position between the card area image and the portrait area image.
Optionally, the apparatus further comprises:
the first acquisition module is used for acquiring N image samples before extracting a portrait area image in the card area image according to the target relative position, wherein N is an integer greater than 1, and each image sample comprises the card area image;
the third extraction module is used for extracting the card area image of each image sample in the N image samples to obtain N card area images;
the detection module is used for respectively carrying out face detection on each card area image in the N card area images to obtain a portrait area image in each card area image in the N card area images;
the statistical module is used for counting the relative position of each card area image and the portrait area image in the N card area images to obtain N relative positions;
and the determining module is used for determining the relative position of the target according to the N relative positions.
Optionally, the first extraction module is specifically configured to:
inputting the image to be processed into a pre-trained card detection model to obtain position information of a card area image in the image to be processed, wherein the card detection model is obtained based on YoloV3 network training;
and extracting the card area image in the image to be processed according to the position information of the card area image in the image to be processed.
The portrait extraction apparatus 900 according to the embodiment of the present invention can implement each process in the portrait extraction method embodiment, and is not described herein again to avoid repetition.
With the portrait extraction apparatus 900 provided by the embodiment of the present invention, the portrait area image is extracted from the card image based on the relative position of the card area image and the portrait area image, so the portrait area image can be extracted accurately whether or not portrait features are missing from the card image.
Referring to fig. 10, fig. 10 is a structural diagram of an image quality evaluation apparatus according to an embodiment of the present invention. As shown in fig. 10, the image quality evaluation apparatus 1000 includes:
an obtaining module 1001, configured to obtain an image to be processed, where the image to be processed is an image to be evaluated;
a portrait extracting module 1002, configured to process the image to be evaluated with the above portrait extraction method to obtain a portrait area image;
an evaluation module 1003, configured to input the portrait area image into an image quality evaluation model to obtain an image quality evaluation result of the portrait area image, where the image quality evaluation model is a model obtained based on target neural network training.
Optionally, the target neural network includes an input sub-network, a separable convolution sub-network, and an output sub-network, which are connected in sequence, where the input sub-network is configured to receive an input image, the output sub-network is configured to output probabilities of M image quality classes corresponding to the input image, and M is a positive integer.
Optionally, the separable convolution sub-network includes R separable convolution units, where the separable convolution units include a separable convolution layer, a point convolution layer, a bulk normalization layer, and an activation layer, and R is an integer greater than 1.
Optionally, the input sub-network includes P standard convolution layers, an output end of at least one of the P standard convolution layers is connected to a batch normalization layer and an activation layer, and P is a positive integer.
Optionally, the output sub-network includes a standard convolution layer, a batch normalization layer, a maximum pooling layer, a fully connected layer, and a softmax layer, which are connected in sequence.
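The described target neural network can be sketched directly. The PyTorch snippet below is an assumed instantiation, not the disclosed model: the disclosure leaves M, P, R, channel widths, kernel sizes, and the activation function open, so the values here (ReLU, one standard input convolution, three separable convolution units) are illustrative, and `QualityNet` and `SeparableConvUnit` are hypothetical names.

```python
import torch
import torch.nn as nn

class SeparableConvUnit(nn.Module):
    """Separable conv layer -> point (1x1) conv -> batch norm -> activation."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

class QualityNet(nn.Module):
    """Input sub-network -> R separable conv units -> output sub-network."""
    def __init__(self, num_classes=2, widths=(32, 64, 128, 256)):
        super().__init__()
        # Input sub-network: a standard conv layer followed by BN + activation.
        self.input_net = nn.Sequential(
            nn.Conv2d(3, widths[0], 3, stride=2, padding=1),
            nn.BatchNorm2d(widths[0]),
            nn.ReLU(inplace=True),
        )
        # Separable convolution sub-network: R separable convolution units.
        self.sep_net = nn.Sequential(
            *[SeparableConvUnit(i, o) for i, o in zip(widths[:-1], widths[1:])]
        )
        # Output sub-network: conv -> BN -> max pooling -> FC -> softmax.
        self.output_net = nn.Sequential(
            nn.Conv2d(widths[-1], widths[-1], 3, padding=1),
            nn.BatchNorm2d(widths[-1]),
            nn.AdaptiveMaxPool2d(1),
            nn.Flatten(),
            nn.Linear(widths[-1], num_classes),
            nn.Softmax(dim=1),  # probabilities of the M image quality classes
        )

    def forward(self, x):
        return self.output_net(self.sep_net(self.input_net(x)))
```

The adaptive maximum pooling before the fully connected layer collapses any spatial size to 1x1, which is one plausible way to obtain consistent evaluation results for portrait crops of different scales; for training with `nn.CrossEntropyLoss`, the softmax layer would normally be dropped in favor of raw logits.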
Optionally, the apparatus further comprises:
and the training module is used for training the image quality evaluation model before the portrait area image is input into the image quality evaluation model to obtain the image quality evaluation result of the portrait area image.
The image quality evaluation device 1000 according to the embodiment of the present invention can implement each process in the above-described image quality evaluation method embodiment, and is not described here again to avoid repetition.
According to the image quality evaluation apparatus 1000 of the embodiment of the present invention, the portrait area image is extracted from the card image based on the relative position of the card area image and the portrait area image, so the portrait area image can be extracted accurately even when facial features in the card image are degraded or missing. In addition, because image quality evaluation is performed on the extracted portrait area image by a trained image quality evaluation model, an accurate image quality evaluation result can be obtained for portrait images of different scales.
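As a usage illustration of the evaluation step, the sketch below reuses the hypothetical `QualityNet` above and assumes M = 2 quality categories named "qualified" and "unqualified":

```python
import torch

CLASS_NAMES = ("qualified", "unqualified")  # assumed M = 2 quality categories

def evaluate_portrait_quality(portrait_bgr, model):
    """Run the quality model on one portrait crop (H x W x 3 uint8, BGR order)."""
    rgb = portrait_bgr[:, :, ::-1].copy()                    # BGR -> RGB
    x = torch.from_numpy(rgb).float().permute(2, 0, 1) / 255.0
    x = x.unsqueeze(0)                                       # add batch dimension
    model.eval()
    with torch.no_grad():
        probs = model(x)[0]                                  # softmax probabilities
    idx = int(probs.argmax())
    return CLASS_NAMES[idx], float(probs[idx])
```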
Referring to fig. 11, fig. 11 is a structural diagram of an authentication apparatus according to an embodiment of the present invention. As shown in fig. 11, the authentication apparatus 1100 includes:
the receiving module 1101 is configured to receive an image to be processed uploaded by a user, where the image to be processed includes a card area image;
a quality evaluation module 1102, configured to perform image quality evaluation on the image to be processed by using the image quality evaluation method described above, so as to obtain an image quality evaluation result;
the verification module 1103 is configured to perform identity verification according to the image to be processed if the image quality evaluation result indicates that the image to be processed is a qualified image;
an output module 1104, configured to output a prompt message if the image quality evaluation result indicates that the image to be processed is an unqualified image, where the prompt message is used to prompt a user to upload an image again.
The identity authentication apparatus 1100 provided in the embodiment of the present invention can implement each process in the above-described identity authentication method embodiment, and is not described here again to avoid repetition.
The identity authentication apparatus 1100 of the embodiment of the present invention receives the image to be processed uploaded by a user, evaluates its image quality, performs identity verification only when the image quality evaluation result indicates a qualified image, and otherwise prompts the user to upload an image again. By filtering out low-quality images before verification, the accuracy of the identity authentication can be improved.
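The gating behaviour of the apparatus reduces to a few lines. The sketch below chains the earlier hypothetical helpers (`extract_card_region`, `extract_portrait`, `evaluate_portrait_quality`); `TARGET_REL_POS`, `verify_identity`, and `prompt_reupload` are assumed placeholders for the precomputed target relative position and the deployment's own verification and messaging services:

```python
def handle_upload(image_bgr, model, verify_identity, prompt_reupload):
    """Gate identity verification on image quality (sketch, hypothetical helpers)."""
    card = extract_card_region(image_bgr)          # from the YoloV3 sketch above
    if card is None:
        prompt_reupload("No card detected; please upload the image again.")
        return None
    portrait = extract_portrait(card, TARGET_REL_POS)  # assumed precomputed position
    label, confidence = evaluate_portrait_quality(portrait, model)
    if label == "qualified":
        return verify_identity(image_bgr)          # proceed with identity verification
    prompt_reupload("Image quality is insufficient; please upload the image again.")
    return None
```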
Referring to fig. 12, fig. 12 is a block diagram of a model training apparatus according to an embodiment of the present invention. As shown in fig. 12, the model training apparatus 1200 includes:
an obtaining module 1201, configured to obtain S image samples and tag data of the S image samples, where each image sample includes a card area image, the tag data is used to indicate an image quality category of a portrait area image in the card area image, and S is an integer greater than 1;
a first extraction module 1202, configured to extract the card area images in each image sample of the S image samples respectively to obtain S card area images;
a second extraction module 1203, configured to extract portrait area images from the S card area images respectively according to a target relative position, where the target relative position is a relative position of a preset card area image and the portrait area image;
a training module 1204, configured to train a target neural network according to the portrait area images in the S card area images and the tag data, to obtain an image quality evaluation model.
The model training device 1200 provided in the embodiment of the present invention can implement each process in the above-described model training method embodiment, and is not described here again to avoid repetition.
The model training apparatus 1200 of the embodiment of the present invention extracts the portrait area images from the card images based on the relative position of the card area image and the portrait area image. This ensures the extraction accuracy of the portrait area images and improves extraction efficiency, which in turn improves the training efficiency and accuracy of the image quality evaluation model.
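A minimal training-loop sketch under stated assumptions: portrait crops and integer quality labels are already prepared (for instance via the extraction sketches above), all crops share one size so they can be stacked into a tensor, and the hypothetical `QualityNet` above is reused. Because that sketch ends in a softmax layer, the loss here is negative log-likelihood on log-probabilities rather than `CrossEntropyLoss` on logits:

```python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset

def train_quality_model(portraits, labels, num_classes=2, epochs=10, lr=1e-3):
    """portraits: float tensor (S, 3, H, W) in [0, 1]; labels: long tensor (S,)."""
    model = QualityNet(num_classes=num_classes)   # hypothetical net sketched earlier
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loader = DataLoader(TensorDataset(portraits, labels),
                        batch_size=32, shuffle=True)
    model.train()
    for epoch in range(epochs):
        total = 0.0
        for x, y in loader:
            probs = model(x)                      # softmax probabilities
            loss = F.nll_loss(torch.log(probs + 1e-8), y)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            total += loss.item() * x.size(0)
        print(f"epoch {epoch}: loss {total / len(loader.dataset):.4f}")
    return model
```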
Referring to fig. 13, fig. 13 is a block diagram of a human image extracting apparatus according to still another embodiment of the present invention, and as shown in fig. 13, a human image extracting apparatus 1300 includes: a processor 1301, a memory 1302 and a computer program stored on the memory 1302 and operable on the processor, the various components in the portrait extraction apparatus 1300 being coupled together by a bus interface 1303, the computer program when executed by the processor 1301 implementing the steps of:
extracting a card area image in an image to be processed;
and extracting a portrait area image in the card area image according to a target relative position, wherein the target relative position is a relative position of a preset card area image and the portrait area image.
It should be understood that, in this embodiment, the processor 1301 can implement the processes of the embodiment of the portrait extracting method, and details are not described here to avoid repetition.
Referring to fig. 14, fig. 14 is a block diagram of an image quality evaluation apparatus according to still another embodiment of the present invention, and as shown in fig. 14, an image quality evaluation apparatus 1400 includes: a processor 1401, a memory 1402 and a computer program stored on said memory 1402 and executable on said processor, the various components of the image quality evaluation apparatus 1400 being coupled together by means of a bus interface 1403, said computer program realizing the following steps when executed by said processor 1401:
acquiring an image to be processed, wherein the image to be processed is an image to be evaluated;
processing the image to be evaluated by using the portrait extraction method described above to obtain a portrait area image;
and inputting the portrait area image into an image quality evaluation model to obtain an image quality evaluation result of the portrait area image, wherein the image quality evaluation model is obtained based on target neural network training.
It should be understood that, in this embodiment, the processor 1401 can implement the processes of the above-described embodiment of the image quality evaluation method, and details are not described here to avoid repetition.
Referring to fig. 15, fig. 15 is a structural diagram of an authentication apparatus according to another embodiment of the present invention, and as shown in fig. 15, an authentication apparatus 1500 includes: a processor 1501, a memory 1502 and a computer program stored on the memory 1502 and executable on the processor, the various components of the authentication apparatus 1500 being coupled together by a bus interface 1503, the computer program when executed by the processor 1501 implementing the steps of:
receiving an image to be processed uploaded by a user, wherein the image to be processed comprises a card area image;
performing image quality evaluation on the image to be processed by using the image quality evaluation method to obtain an image quality evaluation result;
if the image quality evaluation result indicates that the image to be processed is a qualified image, performing identity verification according to the image to be processed;
and if the image quality evaluation result indicates that the image to be processed is an unqualified image, outputting prompt information, wherein the prompt information is used for prompting a user to upload the image again.
Referring to fig. 16, fig. 16 is a block diagram of a model training apparatus according to still another embodiment of the present invention, and as shown in fig. 16, a model training apparatus 1600 includes: a processor 1601, a memory 1602, and a computer program stored on the memory 1602 and operable on the processor, the various components of the model training apparatus 1600 being coupled together by a bus interface 1603, the computer program when executed by the processor 1601 performing the steps of:
the method comprises the steps of obtaining S image samples and label data of the S image samples, wherein each image sample comprises a card area image, the label data are used for indicating the image quality category of a portrait area image in the card area image, and S is an integer larger than 1;
respectively extracting the card area images in each image sample in the S image samples to obtain S card area images;
respectively extracting portrait area images in the S card area images according to a target relative position, wherein the target relative position is a relative position of a preset card area image and the portrait area image;
and training a target neural network according to the portrait area images in the S card area images and the label data to obtain an image quality evaluation model.
An embodiment of the present invention further provides an electronic device, which includes a processor, a memory, and a computer program stored in the memory and executable on the processor. When executed by the processor, the computer program implements each process of the portrait extraction method embodiment, the image quality evaluation method embodiment, the identity verification method embodiment, or the model training method embodiment, and can achieve the same technical effects; to avoid repetition, details are not described here again.
An embodiment of the present invention further provides a computer-readable storage medium storing a computer program. When executed by a processor, the computer program implements each process of the portrait extraction method embodiment, the image quality evaluation method embodiment, the identity verification method embodiment, or the model training method embodiment, and can achieve the same technical effects; to avoid repetition, details are not described here again. The computer-readable storage medium may be a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
It should be noted that, in this document, the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such a process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
Through the above description of the embodiments, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and certainly also by hardware, although in many cases the former is the better implementation. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product stored in a storage medium (such as a ROM/RAM, magnetic disk, or optical disk), which includes instructions for enabling a terminal (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the methods according to the embodiments of the present invention.
While the present invention has been described with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, which are illustrative and not restrictive, and it will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (15)

1. A portrait extraction method is characterized by comprising the following steps:
extracting a card area image in an image to be processed;
and extracting a portrait area image in the card area image according to a target relative position, wherein the target relative position is a relative position of a preset card area image and the portrait area image.
2. The method of claim 1, wherein determining the preset target relative position comprises:
acquiring N image samples, wherein N is an integer greater than 1, and each image sample comprises a card area image;
extracting card area images of each image sample in the N image samples to obtain N card area images;
respectively carrying out face detection on each card area image in the N card area images to obtain a human image area image in each card area image in the N card area images;
counting the relative position of each card area image and the portrait area image in the N card area images to obtain N relative positions;
and determining the relative position of the target according to the N relative positions.
3. The method according to claim 1, wherein the extracting of the card area image in the image to be processed comprises:
inputting the image to be processed into a card detection model to obtain position information of a card area image in the image to be processed, wherein the card detection model is obtained based on YoloV3 network training;
and extracting the card area image in the image to be processed according to the position information of the card area image in the image to be processed.
4. An image quality evaluation method is characterized by comprising:
acquiring an image to be processed, wherein the image to be processed is an image to be evaluated;
processing the image to be evaluated by using the portrait extraction method according to any one of claims 1 to 3 to obtain a portrait area image;
and inputting the portrait area image into an image quality evaluation model to obtain an image quality evaluation result of the portrait area image, wherein the image quality evaluation model is obtained based on target neural network training.
5. The method of claim 4, wherein the target neural network comprises an input sub-network, a separable convolution sub-network, and an output sub-network connected in sequence, the input sub-network being configured to receive an input image, the output sub-network being configured to output probabilities of M image quality classes to which the input image corresponds, M being a positive integer.
6. The method of claim 5, wherein the separable convolution sub-network comprises R separable convolution units, each of which includes a separable convolution layer, a point convolution layer, a batch normalization layer, and an activation layer, R being an integer greater than 1.
7. The method according to claim 5 or 6, wherein the input sub-network comprises P standard convolution layers, an output end of at least one of the P standard convolution layers being connected to a batch normalization layer and an activation layer, P being a positive integer.
8. The method of claim 5, wherein the output sub-network comprises a standard convolutional layer, a batch normalization layer, a max pooling layer, a full connectivity layer, and a softmax layer connected in sequence.
9. An identity verification method, comprising:
receiving an image to be processed uploaded by a user, wherein the image to be processed comprises a card area image;
performing image quality evaluation on the image to be processed by using the image quality evaluation method according to any one of claims 4 to 8 to obtain an image quality evaluation result;
if the image quality evaluation result indicates that the image to be processed is a qualified image, performing identity verification according to the image to be processed;
and if the image quality evaluation result indicates that the image to be processed is an unqualified image, outputting prompt information, wherein the prompt information is used for prompting a user to upload the image again.
10. A method of model training, comprising:
the method comprises the steps of obtaining S image samples and label data of the S image samples, wherein each image sample comprises a card area image, the label data are used for indicating the image quality category of a portrait area image in the card area image, and S is an integer larger than 1;
respectively extracting the card area images in each image sample in the S image samples to obtain S card area images;
respectively extracting portrait area images in the S card area images according to a target relative position, wherein the target relative position is a relative position of a preset card area image and the portrait area image;
and training a target neural network according to the portrait area images in the S card area images and the label data to obtain an image quality evaluation model.
11. A portrait extracting apparatus, comprising:
the first extraction module is used for extracting a card area image in the image to be processed;
and the second extraction module is used for extracting the portrait area image in the card area image according to a target relative position, wherein the target relative position is a preset relative position of the card area image and the portrait area image.
12. An image quality evaluation apparatus, comprising:
the system comprises an acquisition module, a processing module and a processing module, wherein the acquisition module is used for acquiring an image to be processed, and the image to be processed is an image to be evaluated;
a portrait extraction module, configured to process the image to be evaluated by using the portrait extraction method according to any one of claims 1 to 3, so as to obtain a portrait area image;
and the evaluation module is used for inputting the portrait area image into an image quality evaluation model to obtain an image quality evaluation result of the portrait area image, wherein the image quality evaluation model is a model obtained based on target neural network training.
13. An authentication apparatus, comprising:
the receiving module is used for receiving an image to be processed uploaded by a user, wherein the image to be processed comprises a card area image;
a quality evaluation module, configured to perform image quality evaluation on the image to be processed by using the image quality evaluation method according to any one of claims 4 to 8, so as to obtain an image quality evaluation result;
the verification module is used for performing identity verification according to the image to be processed if the image quality evaluation result indicates that the image to be processed is a qualified image;
and the output module is used for outputting prompt information if the image quality evaluation result indicates that the image to be processed is an unqualified image, wherein the prompt information is used for prompting a user to upload the image again.
14. An electronic device comprising a processor, a memory, and a computer program stored on the memory and executable on the processor, the computer program, when executed by the processor, implementing the steps of the portrait extraction method according to any one of claims 1 to 3, or the steps of the image quality evaluation method according to any one of claims 4 to 8, or the steps of the identity verification method according to claim 9, or the steps of the model training method according to claim 10.
15. A computer-readable storage medium, characterized in that a computer program is stored thereon, which computer program, when being executed by a processor, carries out the steps of the portrait extraction method according to any one of claims 1 to 3, or the steps of the image quality evaluation method according to any one of claims 4 to 8, or the steps of the identity verification method according to claim 9, or the steps of the model training method according to claim 10.
CN201911189725.9A 2019-11-28 2019-11-28 Portrait extraction, quality evaluation, identity verification and model training method and device Pending CN112861589A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911189725.9A CN112861589A (en) 2019-11-28 2019-11-28 Portrait extraction, quality evaluation, identity verification and model training method and device

Publications (1)

Publication Number Publication Date
CN112861589A true CN112861589A (en) 2021-05-28

Family

ID=75995321



Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108171256A (en) * 2017-11-27 2018-06-15 Shenzhen SenseNets Technology Co., Ltd. Face image quality evaluation model construction, screening and recognition methods, device and medium
CN108288027A (en) * 2017-12-28 2018-07-17 Xinzhi Digital Technology Co., Ltd. Image quality detection method, apparatus and device
CN108875731A (en) * 2017-12-28 2018-11-23 Beijing Megvii Technology Co., Ltd. Target identification method, apparatus, system and storage medium
CN109118470A (en) * 2018-06-26 2019-01-01 Tencent Technology (Shenzhen) Co., Ltd. Image quality evaluation method, apparatus, terminal and server
CN110047071A (en) * 2019-04-26 2019-07-23 Hangzhou Zhiqu Intelligent Information Technology Co., Ltd. Image quality assessment method, apparatus and medium
CN110472602A (en) * 2019-08-20 2019-11-19 Tencent Technology (Shenzhen) Co., Ltd. Card recognition method, apparatus, terminal and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Wang Hao et al., "Identity Recognition System Based on Face Recognition", Computer Knowledge and Technology *
Chen Zhenghao et al., "Face Image Quality Assessment Algorithm Based on Texture Feature Fusion", Journal of Jimei University (Natural Science Edition) *

Similar Documents

Publication Publication Date Title
WO2020098250A1 (en) Character recognition method, server, and computer readable storage medium
CN110378235B (en) Fuzzy face image recognition method and device and terminal equipment
CN110414507B (en) License plate recognition method and device, computer equipment and storage medium
CN111353497B (en) Identification method and device for identity card information
CN111881707B (en) Image reproduction detection method, identity verification method, model training method and device
CN111680690B (en) Character recognition method and device
CN111368758A (en) Face ambiguity detection method and device, computer equipment and storage medium
CN111274957A (en) Webpage verification code identification method, device, terminal and computer storage medium
CN112381092B (en) Tracking method, tracking device and computer readable storage medium
CN111178290A (en) Signature verification method and device
CN113221897B (en) Image correction method, image text recognition method, identity verification method and device
CN110942067A (en) Text recognition method and device, computer equipment and storage medium
CN116311214B (en) License plate recognition method and device
CN115731422A (en) Training method, classification method and device of multi-label classification model
CN114241463A (en) Signature verification method and device, computer equipment and storage medium
CN111179245B (en) Image quality detection method, device, electronic equipment and storage medium
CN112861836B (en) Text image processing method, text and card image quality evaluation method and device
CN116206373A (en) Living body detection method, electronic device and storage medium
CN113239738B (en) Image blurring detection method and blurring detection device
CN112861589A (en) Portrait extraction, quality evaluation, identity verification and model training method and device
CN113014914B (en) Neural network-based single face-changing short video identification method and system
CN111310528B (en) Image detection method, identity verification method, payment method and payment device
CN112766272A (en) Target detection method, device and electronic system
CN115424250A (en) License plate recognition method and device
CN112733670A (en) Fingerprint feature extraction method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20210528)