CN113221897B - Image correction method, image text recognition method, identity verification method and device - Google Patents

Image correction method, image text recognition method, identity verification method and device

Info

Publication number
CN113221897B
CN113221897B CN202010081389.2A
Authority
CN
China
Prior art keywords
image
area
target
processed
target image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010081389.2A
Other languages
Chinese (zh)
Other versions
CN113221897A (en)
Inventor
唐东凯
曾定衡
蒋宁
赵立军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mashang Xiaofei Finance Co Ltd
Original Assignee
Mashang Xiaofei Finance Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mashang Xiaofei Finance Co Ltd filed Critical Mashang Xiaofei Finance Co Ltd
Priority to CN202010081389.2A priority Critical patent/CN113221897B/en
Publication of CN113221897A publication Critical patent/CN113221897A/en
Application granted granted Critical
Publication of CN113221897B publication Critical patent/CN113221897B/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/146 Aligning or centring of the image pick-up or image-field
    • G06V30/1475 Inclination or skew detection or correction of characters or of image to be recognised
    • G06V30/1478 Inclination or skew detection or correction of characters or of image to be recognised of characters or characters lines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Abstract

The invention provides an image correction method, an image text recognition method, an identity verification method and an identity verification device, wherein the image correction method comprises the following steps: inputting an image to be processed into a pre-trained image segmentation model to segment a target image area of the image to be processed, so as to obtain a segmented image of the image to be processed, wherein the target image area is an image area of a target object in the image to be processed; determining the position of the target image area according to the segmented image; and correcting the target image area according to the position of the target image area. With the image correction method provided by the invention, the target image area of the image to be processed can be segmented accurately for image backgrounds of different degrees of complexity, and the target image area can be located accurately for correction, so that the correction effect on the target image area of the image is improved.

Description

Image correction method, image text recognition method, identity verification method and device
Technical Field
The invention relates to the technical field of information processing, in particular to an image correction method, an image text recognition method, an identity verification method and an identity verification device.
Background
With the continuous development of internet technology, the internet is applied more and more widely in various industries, such as internet finance, banking, telecommunication, education and the like. In order to improve the security of internet applications, authentication of users is often required. At present, the user is usually authenticated by automatically recognizing the identity information on the card image uploaded by the user, for example, by recognizing the information (such as name, address, identification number, etc.) on the user identity card image, so as to improve the efficiency of authentication.
In practice, owing to the shooting angle, captured card images often exhibit tilt and perspective distortion, which greatly affects the accuracy of character positioning and character recognition, so the image needs to be corrected before the characters on the card image are recognized. In the prior art, taking an identity card image as an example, the input color identity card image is usually converted into a grayscale image, binary segmentation is performed on the grayscale image using a threshold to obtain a binary image, and the identity card area in the identity card image is then located based on the binary image for image correction. However, when the background of the image is complex, it is difficult to choose an appropriate threshold, and the background area and the identity card area in the identity card image cannot be distinguished effectively, so the image correction effect is poor.
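The prior-art pipeline described above can be sketched as follows. The fixed threshold of 128 and the toy image are illustrative assumptions, not values from the application; the sketch only shows why a single global threshold struggles with complex backgrounds.

```python
import numpy as np

def binarize(gray, threshold=128):
    """Prior-art style global thresholding: pixels brighter than the
    threshold are treated as the card, the rest as background."""
    return np.where(gray > threshold, 255, 0).astype(np.uint8)

# A uniformly dark background separates cleanly...
gray = np.full((6, 6), 40, dtype=np.uint8)
gray[2:4, 2:4] = 200          # bright "card" region
mask = binarize(gray)

# ...but a background patch brighter than the threshold is misclassified
gray[0, 0] = 220
bad_mask = binarize(gray)
```

When the background contains intensities on both sides of any single threshold, no threshold can separate the card from the background, which is the failure mode that motivates the learned segmentation model of the invention.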
Therefore, the problem that the image correction effect is poor under the condition that the image background is complex in the prior art exists.
Disclosure of Invention
The embodiment of the invention provides an image correction method and device, and aims to solve the problem of poor image correction effect under the condition of complex image background.
In order to solve the technical problem, the invention is realized as follows:
in a first aspect, an embodiment of the present invention provides an image rectification method. The method comprises the following steps:
inputting an image to be processed into a pre-trained image segmentation model to segment a target image area of the image to be processed to obtain a segmented image of the image to be processed; the target image area is an image area of a target object in the image to be processed;
determining the position of the target image area according to the segmentation image;
and correcting the target image area according to the position of the target image area.
In a second aspect, the embodiment of the invention further provides an image text recognition method. The method comprises the following steps:
correcting a target image area of an image to be processed by using the image correction method; the target image area is an image area of a target object in the image to be processed;
positioning a text area in the corrected target image area;
and identifying the text in the text region.
In a third aspect, an embodiment of the present invention further provides an identity authentication method. The method comprises the following steps:
acquiring an image to be processed, wherein the image to be processed comprises a card area, and the card area comprises a text area;
recognizing the text of the text area of the card area of the image to be processed by using the image text recognition method;
and performing identity authentication according to the text.
In a fourth aspect, the embodiment of the invention further provides a model training method. The method comprises the following steps:
obtaining S image samples and a label image corresponding to each image sample in the S image samples, wherein the label image is an image in which the position of the target image area of the image sample is marked, S is an integer larger than 1, and the target image area is an image area of a target object in the image sample;
and performing image segmentation model training according to the S image samples and the label image corresponding to each image sample in the S image samples.
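As a sketch of how a label image of the kind described above might be produced from annotated corner points, the helper below rasterises a quadrilateral card region into a binary label. The half-plane test and the clockwise corner ordering are assumptions of this sketch, not steps stated in the application.

```python
import numpy as np

def quad_label(height, width, corners):
    """Rasterise a quadrilateral given by four (x, y) corners in clockwise
    order (image coordinates, y pointing down) into a label image:
    255 inside the target image area, 0 in the background."""
    yy, xx = np.mgrid[0:height, 0:width]
    inside = np.ones((height, width), dtype=bool)
    pts = list(corners) + [corners[0]]
    for (x1, y1), (x2, y2) in zip(pts, pts[1:]):
        # keep pixels on the inner side of each directed edge
        cross = (x2 - x1) * (yy - y1) - (y2 - y1) * (xx - x1)
        inside &= cross >= 0
    return np.where(inside, 255, 0).astype(np.uint8)

label = quad_label(20, 20, [(5, 5), (15, 5), (15, 15), (5, 15)])
```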
In a fifth aspect, an embodiment of the present invention further provides an image rectification apparatus. The image rectification device includes:
the input module is used for inputting an image to be processed into a pre-trained image segmentation model so as to segment a target image area of the image to be processed, and a segmented image of the image to be processed is obtained; wherein the target image area is an image area of a target object in the image to be processed;
a determining module for determining a position of the target image region from the segmented image;
and the correction module is used for correcting the target image area according to the position of the target image area.
In a sixth aspect, an embodiment of the present invention further provides an image text recognition apparatus. The image text recognition apparatus includes:
the correction module is used for correcting a target image area of the image to be processed by using the image correction method; wherein the target image area is an image area of a target object in the image to be processed;
the positioning module is used for positioning a text region in the corrected target image region;
and the identification module is used for identifying the text in the text area.
In a seventh aspect, an embodiment of the present invention further provides an identity authentication apparatus. The identity authentication device includes:
the acquisition module is used for acquiring an image to be processed, wherein the image to be processed comprises a card area, and the card area comprises a text area;
the identification module is used for identifying the text of the text area of the card area of the image to be processed by using the image text identification method;
and the verification module is used for performing identity verification according to the text.
In an eighth aspect, an embodiment of the present invention further provides a model training apparatus. The model training device includes:
the acquisition module is used for acquiring S image samples and a label image corresponding to each image sample in the S image samples, wherein the label image is an image in which the position of the target image area of the image sample is marked, S is an integer larger than 1, and the target image area is an image area of a target object in the image sample;
and the training module is used for carrying out image segmentation model training according to the S image samples and the label images corresponding to each image sample in the S image samples.
In a ninth aspect, an embodiment of the present invention further provides an image rectification apparatus, including a processor, a memory, and a computer program stored on the memory and executable on the processor, where the computer program, when executed by the processor, implements the steps of the image rectification method, or implements the steps of the image text recognition method, or implements the steps of the identity verification method, or implements the steps of the model training method.
In a tenth aspect, an embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program implements the steps of the image rectification method, or implements the steps of the image text recognition method, or implements the steps of the identity verification method, or implements the steps of the model training method.
In the embodiment of the invention, a target image area of an image to be processed is segmented by a pre-trained image segmentation model to obtain a segmented image, and the position of the target image area is determined according to the segmented image so as to correct the target image area. The image segmentation model has a good segmentation effect on image backgrounds with different complexity degrees, so that a target image area of an image to be processed can be segmented accurately, the target image area can be positioned accurately for correction, and the correction effect of the target image area of the image is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments of the present invention will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without inventive exercise.
FIG. 1 is a flowchart of an image rectification method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an identification card image provided by an embodiment of the invention;
FIG. 3 is a schematic diagram of a segmented image provided by an embodiment of the present invention;
fig. 4 is a schematic structural diagram of a Unet network provided by an embodiment of the present invention;
FIG. 5 is a schematic diagram of an embodiment of the present invention for tilt correction of an image of an identification card;
FIG. 6 is a flowchart of an image text recognition method according to an embodiment of the present invention;
FIG. 7 is a flowchart of a method for recognizing image texts according to still another embodiment of the present invention;
fig. 8 is a flowchart of an authentication method provided by an embodiment of the present invention;
FIG. 9 is a flow chart of a model training method provided by an embodiment of the present invention;
FIG. 10a is a schematic diagram of a sample image provided by an embodiment of the present invention;
FIG. 10b is a schematic illustration of a label image provided by an embodiment of the invention;
FIG. 11 is a block diagram of an image rectification device according to an embodiment of the present invention;
fig. 12 is a block diagram of an image text recognition apparatus according to an embodiment of the present invention;
fig. 13 is a block diagram of an authentication apparatus according to an embodiment of the present invention;
FIG. 14 is a block diagram of a model training apparatus according to an embodiment of the present invention;
FIG. 15 is a block diagram of an image rectification apparatus according to still another embodiment of the present invention;
fig. 16 is a block diagram of an image text recognition apparatus according to still another embodiment of the present invention;
fig. 17 is a block diagram of an authentication apparatus according to still another embodiment of the present invention;
fig. 18 is a block diagram of a model training apparatus according to still another embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention provides an image correction method which is applied to electronic equipment such as a computer, a server, a mobile phone, a tablet computer and the like. Referring to fig. 1, fig. 1 is a flowchart of an image rectification method according to an embodiment of the present invention, as shown in fig. 1, including the following steps:
step 101, inputting an image to be processed into a pre-trained image segmentation model to segment a target image area of the image to be processed, so as to obtain a segmented image of the image to be processed; and the target image area is an image area of a target object in the image to be processed.
In this embodiment, the target object may be an object having a specific format, such as a card, a bill, etc., for example, the target object may include, but is not limited to, an identity card, a bank card, a social security card, a student's certificate, a driver's license, a passport, a work permit, a business license, an invoice, etc. The image to be processed may be any image including an image area of the target object, for example, a captured image of an identification card. In practical cases, the identification card image usually includes some background area in addition to the image area of the identification card, for example, as shown in fig. 2, the identification card image 10 includes an image area 11 (i.e., an identification card area) of the identification card and a background area 12.
The image segmentation model may be a model obtained by training based on a convolutional neural network, and the neural network may include, but is not limited to, a Fully Convolutional Network (FCN), Mask RCNN, SegNet, Unet, DeepLab, and the like. Specifically, the convolutional neural network may be trained based on a plurality of image samples and their label images to obtain the image segmentation model. A label image is an image in which the position of the image area of the target object in the image sample is marked, and the image area of the target object in the image to be processed can be segmented based on the image segmentation model.
Taking the above target object being an identity card as an example, inputting the image to be processed into the image segmentation model yields a segmented image as shown in fig. 3. The segmented image can be a binary image, which may also be referred to as a mask image; for example, the pixel value of the image area of the identity card is 255, and the pixel value of the background area is 0.
Optionally, before the image to be processed is input into the image segmentation model, the image to be processed may be preprocessed, for example, one or more of normalization processing, enhancement processing, filtering processing, and the like, and then the preprocessed image to be processed is input into the image segmentation model. The normalization processing may be to convert the image to be processed into a preset size.
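A minimal normalization step of the kind mentioned above might look like the following. The nearest-neighbour resize and the 256 × 256 target size are illustrative (a production pipeline would typically use a library resize such as cv2.resize), and scaling pixel values to [0, 1] is one common convention, not one the application mandates.

```python
import numpy as np

def preprocess(image, size=256):
    """Nearest-neighbour resize to size x size and scale pixel values to
    [0, 1]; a stand-in for the normalization step described in the text."""
    h, w = image.shape[:2]
    rows = np.arange(size) * h // size   # source row for each output row
    cols = np.arange(size) * w // size   # source column for each output column
    resized = image[rows][:, cols]
    return resized.astype(np.float32) / 255.0

photo = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
net_input = preprocess(photo)
```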
And step 102, determining the position of the target image area according to the segmentation image.
In this embodiment, since the image area of the target object (i.e., the target image area) and the background area in the image to be processed are distinguished in the segmented image, the position of the target image area can be determined more conveniently based on the segmented image.
Taking the example shown in fig. 3, each pixel point with a pixel value of 255 in the segmented image is a pixel point of the image area of the identity card, edge pixel points of the image area of the identity card can be obtained by performing edge detection on the segmented image, and the position of the contour line of the image area of the identity card can be determined based on the edge pixel points of the image area of the identity card, so as to locate the position of the target image area.
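The edge-pixel extraction described above can be done on the binary segmented image directly: a foreground pixel is an edge pixel if at least one of its 4-neighbours is background. The helper below is an illustrative stand-in for a library edge detector, not the application's prescribed method.

```python
import numpy as np

def edge_pixels(mask):
    """Coordinates (row, col) of foreground pixels in the binary segmented
    image that have at least one background 4-neighbour."""
    fg = mask == 255
    padded = np.pad(fg, 1, constant_values=False)
    up, down = padded[:-2, 1:-1], padded[2:, 1:-1]
    left, right = padded[1:-1, :-2], padded[1:-1, 2:]
    edges = fg & ~(up & down & left & right)
    return np.argwhere(edges)

# toy segmented image: a 6x6 "card" block inside a 10x10 background
seg = np.zeros((10, 10), dtype=np.uint8)
seg[2:8, 2:8] = 255
contour = edge_pixels(seg)
```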
And 103, correcting the target image area according to the position of the target image area.
In this embodiment, after the position of the target image area is determined, the target image area may be corrected, for example, by tilt correction or sharpness correction. Since the position of the target image area can be accurately positioned through the steps 101 to 102, the target image area can be accurately corrected, and the problem of poor correction effect of the target image area caused by inaccurate position positioning of the target image area due to a complex background is solved.
According to the image correction method provided by the embodiment of the invention, the target image area of the image to be processed is segmented through the pre-trained image segmentation model to obtain the segmented image, and the position of the target image area is determined according to the segmented image so as to correct the target image area. The image segmentation model has a good segmentation effect on image backgrounds with different complexity degrees, so that a target image area of an image to be processed can be segmented accurately, the target image area can be positioned accurately for correction, and the correction effect of the target image area of the image is improved.
Optionally, the image segmentation model may be a model obtained based on Unet network training. The Unet network includes N layers of convolution units and N layers of deconvolution units, where each convolution unit is connected to the deconvolution unit in the N layers of deconvolution units whose output feature map has the same size, the value range of N is [6, 10], and N is a positive integer.
In this embodiment, each convolution unit (which may also be referred to as an encoder) in the above N layers of convolution units may include a standard convolution layer, a separable convolution layer, a residual network layer, or the like, for performing a convolution operation on the input image. Each convolution unit may further include a pooling layer, which down-samples the input feature map to reduce the number of parameters to be learned. Each deconvolution unit (which may also be referred to as a decoder) in the N layers of deconvolution units may include a deconvolution layer for performing a deconvolution operation on the input. Optionally, each deconvolution unit may further include a batch normalization layer (i.e., a BatchNorm layer) and an activation layer, where the activation function of the activation layer may include, but is not limited to, the ReLU function. Each convolution unit in the N layers of convolution units is connected to the deconvolution unit whose output feature map has the same size as its own, so that important feature information lost during down-sampling is retained to the maximum extent.
In this embodiment, since the image region of the target object is used as the segmentation target, and the target is relatively single, the segmentation can be achieved by constructing a relatively simple Unet network, and the training speed can be increased and the model size can be reduced. Optionally, in this embodiment, the value range of N may be [6, 10], and is preferably 8.
For example, the Unet network shown in fig. 4 includes 8 encoders (i.e., Encoder_1 to Encoder_8) and 8 decoders (i.e., Decoder_1 to Decoder_8), and a skip-layer connection strategy (i.e., skip connections) is added between the encoders and the decoders, which protects the information of the original image from being lost. Specifically, the size of the input image may be 256 × 256 × 3; the feature map output by Encoder_1 may be 128 × 128 × 64, by Encoder_2 64 × 64 × 128, by Encoder_3 32 × 32 × 256, by Encoder_4 16 × 16 × 512, by Encoder_5 8 × 8 × 512, by Encoder_6 4 × 4 × 512, by Encoder_7 2 × 2 × 512, and by Encoder_8 1 × 1 × 512. Correspondingly, the feature map output by Decoder_8 may be 2 × 2 × 512, by Decoder_7 4 × 4 × 512, by Decoder_6 8 × 8 × 512, by Decoder_5 16 × 16 × 512, by Decoder_4 32 × 32 × 256, by Decoder_3 64 × 64 × 128, by Decoder_2 128 × 128 × 64, and by Decoder_1 256 × 256 × 1, i.e., the output image. Optionally, the size of the convolution kernel used by the encoders may be 3 × 3.
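Assuming the usual Unet convention that the spatial size halves at each encoder and doubles at each decoder, the feature-map bookkeeping of fig. 4 can be checked with a small helper. The channel tuple follows the description; the helper itself is purely illustrative.

```python
def unet_shapes(n=8, input_size=256):
    """Feature-map sizes for the fig. 4 Unet: the spatial size halves at
    each encoder, and each decoder's output matches the previous encoder's
    output, which is what lets the skip connections line up."""
    enc_channels = [64, 128, 256] + [512] * (n - 3)
    encoders, size = [], input_size
    for c in enc_channels:
        size //= 2
        encoders.append((size, size, c))
    # Decoder_k mirrors Encoder_(k-1); Decoder_1 restores the input
    # resolution with a single-channel output (the mask image)
    decoders = {k: encoders[k - 2] for k in range(2, n + 1)}
    decoders[1] = (input_size, input_size, 1)
    return encoders, decoders

encoders, decoders = unet_shapes()
```

Note that Encoder_1's 128 × 128 × 64 output matches Decoder_2's output, which is exactly the pairing the skip connections exploit.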
Optionally, each convolution unit in the N layers of convolution units includes a separable convolution layer, a batch normalization layer, and an activation layer.
In this embodiment, the separable convolution layer may include a depthwise convolution layer and a pointwise convolution layer (i.e., a 1 × 1 convolution layer) for performing a separable convolution on the input. The batch normalization layer can be used to accelerate network training and make the loss function converge quickly. The activation function of the activation layer may include, but is not limited to, the ReLU function.
In this embodiment, the separable convolution layer is used to perform separable convolution operation, so that the size of the image segmentation model can be reduced, the training speed of the image segmentation model can be increased, and the image segmentation model can be transplanted to a mobile terminal for use while ensuring the image segmentation effect of the trained image segmentation model.
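The size reduction from separable convolutions can be quantified by simple parameter counting; the 3 × 3 kernel and the 256-to-512 channel example below are illustrative numbers, not values from the application.

```python
def standard_conv_params(k, c_in, c_out):
    """Parameters in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out

def separable_conv_params(k, c_in, c_out):
    """Depthwise k x k filter per input channel, then a 1 x 1 pointwise
    convolution to mix channels (biases ignored)."""
    return k * k * c_in + c_in * c_out

dense = standard_conv_params(3, 256, 512)     # 1,179,648 parameters
light = separable_conv_params(3, 256, 512)    # 133,376 parameters
```

For this layer the separable variant needs roughly one eighth of the parameters, which is where the smaller model size and faster training mentioned above come from.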
Optionally, the target object is rectangular in shape; the step 102, namely, the determining the position of the target image region according to the segmented image, may include:
respectively acquiring contour points of each edge of the target image area in the segmented image;
respectively performing straight line fitting on the contour points of each edge of the target image area in the segmented image to obtain the contour line of each edge of the target image area;
determining the positions of the four corner points of the target image area according to the contour lines of the edges of the target image area;
the step 103, namely, the correcting the target image area according to the position of the target image area, may include:
and performing tilt correction on the target image area according to the positions of four corner points of the target image area.
In this embodiment, the target object may include, but is not limited to, an object having a rectangular shape and a specific format, such as an identity card, a bank card, a social security card, a student's certificate, a driver's license, a passport, a work permit, a business license, or an invoice.
Specifically, contour points of each side of the target image region in the segmented image may be detected respectively, a straight line fitting may be performed based on the contour points of each side of the target image region, to obtain a contour line of each side of the target image region, positions of four corner points of the target image region may be determined based on an intersection point of the contour lines of each side of the target image region, and then the target image region may be subjected to tilt correction based on the positions of the four corner points of the target image region, for example, the target image region may be subjected to perspective transformation based on the positions of the four corner points of the target image region, so as to obtain the target image region with a forward viewing angle.
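The straight-line fitting and corner computation described above can be sketched as below. The SVD-based fit is one reasonable choice (similar in spirit to cv2.fitLine) rather than the method the application prescribes, and the sample contour points are made up for illustration.

```python
import numpy as np

def fit_line(points):
    """Least-squares line a*x + b*y + c = 0 through contour points,
    using the principal direction of the centred point cloud."""
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    _, _, vt = np.linalg.svd(pts - centroid)
    dx, dy = vt[0]                 # direction along the edge
    a, b = -dy, dx                 # normal vector of the line
    c = -(a * centroid[0] + b * centroid[1])
    return a, b, c

def corner(line1, line2):
    """Intersection of two fitted edge lines = one card corner."""
    A = np.array([line1[:2], line2[:2]], dtype=float)
    rhs = -np.array([line1[2], line2[2]], dtype=float)
    return np.linalg.solve(A, rhs)

# noisy contour points along a roughly horizontal and a roughly vertical edge
top = fit_line([(0.0, 2.0), (1.0, 2.1), (2.0, 1.9), (3.0, 2.0)])
left = fit_line([(5.0, 0.0), (5.1, 1.0), (4.9, 2.0), (5.0, 3.0)])
p = corner(top, left)              # close to (5, 2)
```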
In the embodiment, contour points of each edge of a target image area in the segmentation image are respectively obtained; respectively performing linear fitting on contour points of each edge of a target image area in the segmentation image to obtain a contour line of each edge of the target image area; the positions of the four corner points of the target image area are determined according to the contour lines of all the edges of the target image area, so that the position accuracy of the obtained corner points can be improved, further perspective transformation is carried out based on the positions of the obtained corner points to obtain the target image area with a forward visual angle, and the effect of carrying out inclination correction on the target image area can be improved.
The following takes the image to be processed being an identity card image, and the image segmentation model being a model obtained based on Unet network training, as an example. Referring to fig. 5, an input image 21 (i.e., the image to be processed) is input into the image segmentation model to obtain a segmented image 22, i.e., a Mask image; edge detection and straight line fitting are performed on the segmented image 22 to obtain an image 23 in which the four corners of the identity card region are located; and perspective transformation is performed on the image 23 to obtain an output image 24, i.e., the identity card region after tilt correction.
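The perspective transformation from the four located corner points can be sketched with a direct linear solve for the 3 × 3 homography (the same linear system cv2.getPerspectiveTransform solves); the example corner coordinates are made up for illustration.

```python
import numpy as np

def perspective_matrix(src, dst):
    """Solve for the 3x3 homography H mapping the four source corners to
    the four destination corners."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), cross-multiplied
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.array(A, dtype=float), np.array(b, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)

def warp_point(H, point):
    """Apply the homography to one (x, y) point."""
    q = H @ np.array([point[0], point[1], 1.0])
    return q[:2] / q[2]

# made-up tilted card corners, mapped to an upright 400 x 250 rectangle
src = [(30, 40), (380, 70), (360, 300), (10, 260)]
dst = [(0, 0), (400, 0), (400, 250), (0, 250)]
H = perspective_matrix(src, dst)
```

Warping every pixel of the input through H (as cv2.warpPerspective does) then yields the forward-view card region.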
The embodiment of the invention also provides an image text recognition method. Referring to fig. 6, fig. 6 is a flowchart of an image text recognition method according to an embodiment of the present invention, and as shown in fig. 6, the method includes the following steps:
601, correcting a target image area of an image to be processed by using the image correction method; and the target image area is an image area of a target object in the image to be processed.
In this embodiment, the target object may include, but is not limited to, an identity card, a bank card, a social security card, a student's certificate, a driver's license, a passport, a work card, a business license, an invoice, and the like.
In this step, the target image region of the image to be processed may be corrected based on the image correction method provided in any of the embodiments described above, so as to obtain a corrected target image region. The related content of the image rectification method can be referred to the foregoing discussion, and is not described herein again.
And step 602, positioning a text area in the corrected target image area.
In this embodiment, the text region may be located based on a horizontal projection and a vertical projection of the text region in the target image region, or the text region in the target image region may be located by a pre-trained text location model.
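The projection-based localisation mentioned above can be sketched as follows; the run-finding helper is illustrative and assumes an already binarised text image.

```python
import numpy as np

def projection_runs(binary, axis):
    """Return [start, end) index runs where the projection profile is
    non-zero: axis=1 sums over columns and yields horizontal text-line
    bands, axis=0 yields vertical character-column bands."""
    profile = binary.sum(axis=axis)
    on = profile > 0
    edges = np.diff(on.astype(int))
    starts = list(np.where(edges == 1)[0] + 1)
    ends = list(np.where(edges == -1)[0] + 1)
    if on[0]:
        starts.insert(0, 0)
    if on[-1]:
        ends.append(len(on))
    return list(zip(starts, ends))

# two "text lines" occupying rows 2-3 and 6-8 of a toy binary page
page = np.zeros((12, 20), dtype=np.uint8)
page[2:4, 3:15] = 1
page[6:9, 2:18] = 1
lines = projection_runs(page, axis=1)
```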
For example, the text region in the target image region may be located based on a pre-trained generative adversarial network (GAN) model, where the GAN model may be a model trained based on image samples and corresponding label images; an image sample may be an image including the target image region, and the label image is an image obtained by labeling the text region of the target image region in the image sample.
Step 603, identifying the text in the text area.
For example, the text in the located text region may be recognized using an OCR (Optical Character Recognition) technique.
In this embodiment, the image correction method is used to correct the target image region of the image to be processed, so that the correction effect of the target image region of the image to be processed can be improved, and the accuracy of character recognition of the text region in the target image region can be further improved.
The following description takes identification card character recognition as an example:
referring to fig. 7, the identification card character recognition may include the following steps:
step 701, inputting an identity card image.
The identification card image input in this step includes an identification card area and a background area.
Step 702, preprocessing.
In this step, the identification card image may be adjusted to a size of 256 × 256.
Step 703, correcting the inclination.
In this step, the identity card region of the preprocessed identity card image may be segmented based on an image segmentation model obtained by the training of the Unet network, and the identity card region may be subjected to tilt correction based on the segmented image. For example, edge detection and straight line fitting are performed according to the segmented image to obtain four edges of the identity card region, intersection points of straight lines are obtained to obtain positions of four corner points of the identity card region, and perspective transformation is performed according to the positions of the four corner points of the identity card region to obtain the identity card region after inclination correction.
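The corner-from-lines step above can be expressed in homogeneous coordinates: each fitted edge is a line ax + by + c = 0, and the cross product of two such lines gives their intersection point. This is a sketch only; in practice each line's coefficients would come from fitting the edge's contour points (e.g. with cv2.fitLine), and the two defining points here are illustrative.

```python
import numpy as np

def line_through(p, q):
    """Homogeneous coefficients (a, b, c) of the line ax + by + c = 0
    through points p and q (standing in for a fitted edge line)."""
    (x1, y1), (x2, y2) = p, q
    return np.array([y1 - y2, x2 - x1, x1 * y2 - x2 * y1], dtype=float)

def intersect(l1, l2):
    """Corner point as the intersection of two edge lines."""
    x, y, w = np.cross(l1, l2)
    return np.array([x / w, y / w])

# Top and right edges of a skewed card; their intersection is the
# top-right corner, one of the four points used for the perspective
# transformation.
top = line_through((40, 60), (220, 40))
right = line_through((220, 40), (230, 200))
corner = intersect(top, right)
print(corner)  # [220.  40.]
```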
Step 704, positioning the characters.
In this step, the text area in the tilt-corrected identification card area may be located.
Step 705, character recognition.
In this step, character recognition may be performed on the located character area.
Step 706, outputting the recognition result.
The embodiment of the invention provides an identity authentication method. Referring to fig. 8, fig. 8 is a flowchart of an authentication method according to an embodiment of the present invention, as shown in fig. 8, including the following steps:
step 801, acquiring an image to be processed, wherein the image to be processed comprises a card area, and the card area comprises a text area.
In this embodiment, the image to be processed includes a card area, that is, an image area of the card, and the target object is the card at this time.
Step 802, recognizing the text of the text area of the card area of the image to be processed by using the image text recognition method.
In this step, the text of the text region of the card region of the image to be processed may be identified by using the above-mentioned image text identification method, so as to obtain the text of the text region of the card region. The relevant content of the image text recognition method can be referred to the foregoing discussion, and is not described herein again.
Step 803, performing identity authentication according to the text.
In practice, to ensure security, many internet applications need to verify a user's identity information. In this embodiment, when the user's identity information needs to be verified, the card image uploaded by the user can be received, the card area of the uploaded image corrected, and the text of the text area of the corrected card area (that is, the identity information) recognized; the user can then be authenticated based on that text, which improves the accuracy of the authentication result.
The embodiment of the invention provides a model training method, and the image segmentation model of the embodiment can be a model obtained by training based on the model training method provided by the embodiment of the invention. Referring to fig. 9, fig. 9 is a flowchart of a model training method according to an embodiment of the present invention, as shown in fig. 9, including the following steps:
Step 901, obtaining S image samples and a label image corresponding to each of the S image samples, where the label image is an image annotating the position of the target image area in the corresponding image sample, S is an integer greater than 1, and the target image area is an image area of a target object in the image sample.
The value of S can be set reasonably according to actual requirements, for example, 5000, 20000, 30000, 100000, or the like. The image sample may be any image that includes an image area of a target object, where the target object may include, but is not limited to, an identity card, a bank card, a social security card, a student card, a driver's license, a passport, a work permit, a business license, an invoice, and the like. The label image annotates the position of the target image area in the corresponding image sample; taking an identity card image as an example, the label image may mark the positions of the four corner points of the identity card area. For instance, the image sample shown in fig. 10a may have the corresponding label image shown in fig. 10b. In practical application, the image samples can be expanded through rotation, illumination changes, and other transformations, so as to improve the performance of the trained image segmentation model.
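The sample-expansion step can be sketched as below. The specific transforms and parameters are illustrative assumptions, and in a real pipeline the label image must be transformed together with its image sample so the annotations stay aligned.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for one labelled image sample.
sample = rng.integers(0, 256, size=(256, 256, 3), dtype=np.uint8)

def expand(img, rng):
    """Return the original sample plus rotated and relit variants."""
    variants = [img]
    variants.append(np.rot90(img))                 # 90-degree rotation
    gain = rng.uniform(0.7, 1.3)                   # illumination change
    relit = np.clip(img.astype(np.float32) * gain, 0, 255).astype(np.uint8)
    variants.append(relit)
    return variants

augmented = expand(sample, rng)   # 3 samples from 1
```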
And step 902, performing image segmentation model training according to the S image samples and the label images corresponding to each image sample in the S image samples.
In this step, the S image samples and the label image corresponding to each image sample in the S image samples may be input to a pre-constructed convolutional neural network for training, so as to obtain an image segmentation model.
Optionally, before the S image samples and the label image corresponding to each of the S image samples are input into the pre-constructed convolutional neural network, they may be pre-processed, for example, normalized to a size of 256 × 256.
It should be noted that, after the image segmentation model is obtained through training, the image segmentation model may be tested based on the test set to obtain the accuracy of the image segmentation model.
According to the embodiment of the invention, the image segmentation model training is carried out according to the S image samples and the label images corresponding to each image sample in the S image samples, and then the target image area of the image to be processed can be more accurately segmented based on the image segmentation model obtained by training in the image backgrounds with different complexity, so that the target image area can be more accurately positioned for correction, and the correction effect of the target image area of the image is improved.
Optionally, the image segmentation model may be a model obtained based on Unet network training. The Unet network includes N layers of convolution units and N layers of deconvolution units, each convolution unit being connected to the deconvolution unit whose output feature map has the same size, where N is an integer in the range [6, 10].
In this embodiment, each convolution unit (which may also be referred to as an encoder) in the above-mentioned N layers of convolution units may include a standard convolution layer, a separable convolution layer, or a residual network layer for performing a convolution operation on the input. Each convolution unit may further include a pooling layer, which down-samples the input feature map to reduce the number of parameters to be learned. Each of the above-mentioned N layers of deconvolution units (which may also be referred to as decoders) may include a deconvolution layer for performing a deconvolution operation on the input. Optionally, each deconvolution unit may further include a batch normalization layer (i.e., a BatchNorm layer) and an activation layer, where the activation function of the activation layer may include, but is not limited to, the ReLU function. Each convolution unit of the N layers of convolution units is connected to the deconvolution unit in the N layers of deconvolution units whose output feature map has the same size, so that important feature information lost during down-sampling is retained to the maximum extent.
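The encoder-decoder wiring described above, with each convolution unit skip-connected to the deconvolution unit of matching feature-map size, can be sketched in PyTorch. This is a simplified illustration using standard convolutions and N = 3; the patent's model uses N = 8 and, per the following paragraphs, separable convolutions.

```python
import torch
import torch.nn as nn

class TinyUnet(nn.Module):
    """Unet-style network: N convolution units, N deconvolution units,
    with skip connections between feature maps of equal size."""
    def __init__(self, n=3, base=8):
        super().__init__()
        ch = [1] + [base * 2 ** i for i in range(n)]   # e.g. [1, 8, 16, 32]
        self.enc = nn.ModuleList(
            nn.Sequential(nn.Conv2d(ch[i], ch[i + 1], 3, padding=1),
                          nn.BatchNorm2d(ch[i + 1]), nn.ReLU())
            for i in range(n))
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ModuleList(
            nn.ConvTranspose2d(ch[i + 1], ch[i + 1], 2, stride=2)
            for i in reversed(range(n)))
        self.dec = nn.ModuleList(
            nn.Sequential(nn.Conv2d(2 * ch[i + 1], ch[i], 3, padding=1),
                          nn.BatchNorm2d(ch[i]), nn.ReLU())
            for i in reversed(range(n)))

    def forward(self, x):
        skips = []
        for enc in self.enc:
            x = enc(x)
            skips.append(x)                        # kept for skip connection
            x = self.pool(x)                       # down-sample
        for up, dec, skip in zip(self.up, self.dec, reversed(skips)):
            x = torch.cat([up(x), skip], dim=1)    # concat equal-size maps
            x = dec(x)
        return x                                   # 1-channel mask map

mask_map = TinyUnet()(torch.zeros(1, 1, 64, 64))
```

The input side length must be divisible by 2^N for the up-sampled maps to line up with their skip counterparts, which is consistent with the 256 × 256 preprocessing described earlier.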
Optionally, each convolution unit of the N layers of convolution units includes a separable convolution layer, a batch normalization layer, and an activation layer.
In this embodiment, the separable convolution layer may include a depthwise convolution layer and a pointwise convolution layer (i.e., a 1 × 1 convolution layer) for performing a separable convolution on the input. The batch normalization layer can be used to accelerate network training and make the loss function converge quickly. The activation function of the activation layer may include, but is not limited to, the ReLU function.
In this embodiment, the separable convolution layer is used to perform separable convolution operation, so that the size of the image segmentation model can be reduced, the training speed of the image segmentation model can be increased, and the image segmentation model can be transplanted to a mobile terminal for use while ensuring the image segmentation effect of the trained image segmentation model.
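A sketch of such a convolution unit, and of the size reduction it buys, in PyTorch (the channel counts are illustrative):

```python
import torch
import torch.nn as nn

class SeparableConvUnit(nn.Module):
    """Separable convolution layer (depthwise + 1x1 pointwise),
    batch normalization layer, and ReLU activation layer."""
    def __init__(self, cin, cout):
        super().__init__()
        self.depthwise = nn.Conv2d(cin, cin, 3, padding=1, groups=cin)
        self.pointwise = nn.Conv2d(cin, cout, 1)   # the 1x1 convolution
        self.bn = nn.BatchNorm2d(cout)
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

# Parameter count versus a standard 3x3 convolution of the same shape,
# illustrating the reduction in model size.
std = nn.Conv2d(32, 64, 3, padding=1)
sep = SeparableConvUnit(32, 64)
n_std = sum(p.numel() for p in std.parameters())
n_sep = sum(p.numel() for p in sep.parameters())
print(n_std, n_sep)  # 18496 2560
```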
Referring to fig. 11, fig. 11 is a structural diagram of an image rectification apparatus according to an embodiment of the present invention. As shown in fig. 11, the image rectification apparatus 1100 includes:
an input module 1101, configured to input a to-be-processed image into a pre-trained image segmentation model to segment a target image region of the to-be-processed image, so as to obtain a segmented image of the to-be-processed image; wherein the target image area is an image area of a target object in the image to be processed;
a determining module 1102, configured to determine a position of the target image region according to the segmented image;
a correcting module 1103, configured to correct the target image area according to the position of the target image area.
Optionally, the image segmentation model is a model obtained based on training of a Unet network, the Unet network includes N layers of convolution units and N layers of deconvolution units, the N layers of convolution units are respectively connected with the deconvolution units with the same size as the output feature maps in the N layers of deconvolution units, and a value range of N is [6, 10].
Optionally, each convolution unit in the N layers of convolution units includes a separable convolution layer, a batch normalization layer, and an activation layer.
Optionally, the target object is rectangular in shape; the determining module is specifically configured to:
respectively acquiring contour points of all edges of a target image area in the segmentation image;
respectively performing linear fitting on contour points of each edge of a target image area in the segmentation image to obtain a contour line of each edge of the target image area;
determining the positions of four corner points of the target image area according to the contour lines of all edges of the target image area;
the correction module is specifically configured to:
and performing tilt correction on the target image area according to the positions of four corner points of the target image area.
The image correction apparatus 1100 according to the embodiment of the present invention can implement each process in the above-described image correction method embodiment, and is not described here again to avoid repetition.
The image correction device 1100 of the embodiment of the present invention includes an input module 1101, configured to input a to-be-processed image into a pre-trained image segmentation model to segment a target image region of the to-be-processed image, so as to obtain a segmented image of the to-be-processed image; the target image area is an image area of a target object in the image to be processed; a determining module 1102, configured to determine a position of the target image region according to the segmented image; a correcting module 1103, configured to correct the target image area according to the position of the target image area. The image segmentation model has a good segmentation effect on image backgrounds with different complexity degrees, so that a target image area of an image to be processed can be segmented accurately, the target image area can be positioned accurately for correction, and the correction effect of the target image area of the image is improved.
Referring to fig. 12, fig. 12 is a block diagram of an image text recognition apparatus according to an embodiment of the present invention. As shown in fig. 12, the image text recognition apparatus 1200 includes:
a rectification module 1201, configured to rectify a target image area of an image to be processed by using the image rectification method; wherein the target image area is an image area of a target object in the image to be processed;
a positioning module 1202, configured to position a text region in the corrected target image region;
an identifying module 1203 is configured to identify a text in the text region.
The image text recognition apparatus 1200 provided in the embodiment of the present invention can implement each process in the above-described image text recognition method embodiment, and is not described here again to avoid repetition.
The image text recognition device 1200 of the embodiment of the present invention includes a rectification module 1201, configured to rectify a target image region of an image to be processed by using the image rectification method, wherein the target image area is an image area of a target object in the image to be processed; a positioning module 1202, configured to position a text region in the corrected target image region; and an identifying module 1203, configured to identify the text in the text region. The correction effect on the target image region of the image to be processed may thus be improved, and further, the accuracy of character recognition in the text region of the target image region may be improved.
Referring to fig. 13, fig. 13 is a structural diagram of an authentication apparatus according to an embodiment of the present invention. As shown in fig. 13, the authentication apparatus 1300 includes:
an obtaining module 1301, configured to obtain an image to be processed, where the image to be processed includes a card region, and the card region includes a text region;
an identifying module 1302, configured to identify a text of a text region of the card region of the image to be processed by using the image text identifying method;
and the verification module 1303 is used for performing identity verification according to the text.
The identity authentication apparatus 1300 provided in the embodiment of the present invention can implement each process in the above-described identity authentication method embodiment, and is not described here again to avoid repetition.
The identity authentication apparatus 1300 of the embodiment of the present invention includes an obtaining module 1301, configured to obtain an image to be processed, where the image to be processed includes a card region, and the card region includes a text region; an identifying module 1302, configured to identify a text of a text region of the card region of the image to be processed by using the image text identifying method; and the verification module 1303 is used for performing identity verification according to the text, so that the accuracy of an identity verification result can be improved.
Referring to fig. 14, fig. 14 is a block diagram of a model training apparatus according to an embodiment of the present invention. As shown in fig. 14, the model training apparatus 1400 includes:
an obtaining module 1401, configured to obtain S image samples and a label image corresponding to each of the S image samples, where the label image is an image annotating the position of the target image area in the corresponding image sample, S is an integer greater than 1, and the target image area is an image area of a target object in the image sample;
the training module 1402 is configured to perform image segmentation model training according to the S image samples and the label image corresponding to each image sample in the S image samples.
The model training device 1400 provided by the embodiment of the present invention can implement each process in the above-described model training method embodiments, and is not described here again to avoid repetition.
The model training device 1400 of the embodiment of the present invention includes an obtaining module 1401, configured to obtain S image samples and a label image corresponding to each of the S image samples, where the label image is an image annotating the position of the target image area in the corresponding image sample, S is an integer greater than 1, and the target image area is an image area of a target object in the image sample; and a training module 1402, configured to perform image segmentation model training according to the S image samples and their corresponding label images. The trained image segmentation model can then segment the target image area of an image to be processed more accurately against image backgrounds of different complexity, so that the target image area can be located more accurately for correction, improving the correction effect on the target image area of the image.
Referring to fig. 15, fig. 15 is a block diagram of an image rectification apparatus according to another embodiment of the present invention, and as shown in fig. 15, an image rectification apparatus 1500 includes: a processor 1501, a memory 1502 and a computer program stored on the memory 1502 and executable on the processor, the various components in the image rectification device 1500 being coupled together by a bus interface 1503, the computer program when executed by the processor 1501 implementing the steps of:
inputting an image to be processed into a pre-trained image segmentation model to segment a target image area of the image to be processed to obtain a segmented image of the image to be processed; the target image area is an image area of a target object in the image to be processed;
determining the position of the target image area according to the segmentation image;
and correcting the target image area according to the position of the target image area.
It should be understood that, in this embodiment, the processor 1501 can implement the processes of the embodiment of the image rectification method, and details are not described herein for avoiding repetition.
Referring to fig. 16, fig. 16 is a block diagram of an image text recognition apparatus according to another embodiment of the present invention, and as shown in fig. 16, an image text recognition apparatus 1600 includes: a processor 1601, a memory 1602 and a computer program stored on the memory 1602 and operable on the processor, the various components of the image text recognition apparatus 1600 being coupled together by a bus interface 1603, the computer program when executed by the processor 1601 performing the steps of:
correcting a target image area of an image to be processed by using the image correction method; wherein the target image area is an image area of a target object in the image to be processed;
positioning a text area in the corrected target image area;
identifying text in the text region.
Referring to fig. 17, fig. 17 is a structural diagram of an authentication apparatus according to another embodiment of the present invention, and as shown in fig. 17, an authentication apparatus 1700 includes: a processor 1701, a memory 1702 and a computer program stored on the memory 1702 and executable on the processor, the various components in the authentication device 1700 being coupled together by a bus interface 1703, the computer program, when executed by the processor 1701, realizing the steps of:
acquiring an image to be processed, wherein the image to be processed comprises a card area, and the card area comprises a text area;
recognizing the text of the text area of the card area of the image to be processed by using the image text recognition method;
and performing identity authentication according to the text.
Referring to fig. 18, fig. 18 is a block diagram of a model training apparatus according to still another embodiment of the present invention, and as shown in fig. 18, the model training apparatus 1800 includes: a processor 1801, a memory 1802, and a computer program stored on the memory 1802 and executable on the processor, the various components in the model training apparatus 1800 being coupled together by a bus interface 1803, the computer program when executed by the processor 1801 implementing the steps of:
the method comprises the steps of obtaining S image samples and label images corresponding to each image sample in the S image samples, wherein the label images are images of positions of target image areas marked with the image samples, S is an integer larger than 1, and the target image areas are image areas of target objects in the image samples;
and performing image segmentation model training according to the S image samples and the label image corresponding to each image sample in the S image samples.
An embodiment of the present invention further provides an electronic device, which includes a processor, a memory, and a computer program stored in the memory and capable of running on the processor, where the computer program, when executed by the processor, implements each process of the embodiment of the image correction method, or implements each process of the embodiment of the image text recognition method, or implements each process of the embodiment of the identity verification method, or implements each process of the embodiment of the model training method, and can achieve the same technical effect, and the details are not repeated here to avoid repetition.
An embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when executed by a processor, the computer program implements each process of the embodiment of the image correction method, or implements each process of the embodiment of the image text recognition method, or implements each process of the embodiment of the identity verification method, or implements each process of the embodiment of the model training method, and can achieve the same technical effect, and is not described herein again to avoid repetition. The computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a … …" does not exclude the presence of another identical element in a process, method, article, or apparatus that comprises the element.
Through the description of the foregoing embodiments, it is clear to those skilled in the art that the method of the foregoing embodiments may be implemented by software plus a necessary general hardware platform, and certainly may also be implemented by hardware, but in many cases, the former is a better implementation. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
While the present invention has been described with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, which are illustrative and not restrictive, and it will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (8)

1. An image rectification method is applied to a mobile terminal and comprises the following steps:
inputting an image to be processed into a pre-trained image segmentation model to segment a target image area of the image to be processed to obtain a segmented image of the image to be processed; the target image area is an image area of a target object in the image to be processed, and the target object is a card;
determining the position of the target image area according to the segmentation image;
correcting the target image area according to the position of the target image area;
the image segmentation model is obtained based on Unet network training, the Unet network comprises N layers of convolution units and N layers of deconvolution units, the N layers of convolution units are respectively connected with deconvolution units with the same size of output characteristic graphs in the N layers of deconvolution units, and the value of N is 8;
each convolution unit in the N layers of convolution units comprises a separable convolution layer, and each deconvolution unit in the N layers of deconvolution units comprises a deconvolution layer;
the target object is rectangular in shape; the determining the location of the target image region from the segmented image comprises:
respectively acquiring contour points of each edge of the target image area in the segmented image, wherein the segmented image is a binary image, the pixel value of the target image area in the segmented image is 255, and the pixel value of the background area in the segmented image is 0;
respectively performing straight line fitting on contour points of each edge of a target image area in the segmentation image to obtain contour lines of each edge of the target image area;
determining the positions of four corner points of the target image area according to the contour lines of all edges of the target image area;
the correcting the target image area according to the position of the target image area includes:
and performing tilt correction on the target image area according to the positions of four corner points of the target image area.
2. An image text recognition method is applied to a mobile terminal and comprises the following steps:
correcting a target image area of an image to be processed by using the image correction method according to claim 1; the target image area is an image area of a target object in the image to be processed, and the target object is a card;
positioning a text area in the corrected target image area;
identifying the text in the text region.
3. An identity authentication method is applied to a mobile terminal, and comprises the following steps:
acquiring an image to be processed, wherein the image to be processed comprises a card area, and the card area comprises a text area;
recognizing the text of the text region of the card region of the image to be processed by using the image text recognition method according to claim 2;
and performing identity authentication according to the text.
4. An image rectification device, applied to a mobile terminal, includes:
the input module is used for inputting an image to be processed into a pre-trained image segmentation model so as to segment a target image area of the image to be processed, and a segmented image of the image to be processed is obtained; the target image area is an image area of a target object in the image to be processed, and the target object is a card;
a determining module for determining a position of the target image region from the segmented image;
the correction module is used for correcting the target image area according to the position of the target image area;
the image segmentation model is obtained based on Unet network training, the Unet network comprises N layers of convolution units and N layers of deconvolution units, the N layers of convolution units are respectively connected with deconvolution units with the same size of output characteristic graphs in the N layers of deconvolution units, and the value of N is 8;
each convolution unit in the N layers of convolution units comprises a separable convolution layer, and each deconvolution unit in the N layers of deconvolution units comprises a deconvolution layer;
the target object is rectangular in shape; the determining module is specifically configured to:
respectively acquiring contour points of all edges of a target image area in the segmentation image, wherein the segmentation image is a binary image, the pixel value of the target image area in the segmentation image is 255, and the pixel value of a background area in the segmentation image is 0;
respectively performing straight line fitting on contour points of each edge of a target image area in the segmentation image to obtain contour lines of each edge of the target image area;
determining the positions of four corner points of the target image area according to the contour lines of all edges of the target image area;
the correction module is specifically configured to:
and performing tilt correction on the target image area according to the positions of four corner points of the target image area.
5. An image text recognition device, applied to a mobile terminal, includes:
a correction module, configured to correct a target image area of an image to be processed by using the image correction method according to claim 1; the target image area is an image area of a target object in the image to be processed, and the target object is a card;
the positioning module is used for positioning a text area in the corrected target image area;
and the identification module is used for identifying the text in the text area.
6. An identity authentication device, which is applied to a mobile terminal, comprises:
the device comprises an acquisition module, a processing module and a display module, wherein the acquisition module is used for acquiring an image to be processed, the image to be processed comprises a card area, and the card area comprises a text area;
an identification module, configured to identify a text of a text region of a card region of the image to be processed by using the image text identification method according to claim 2;
and the verification module is used for performing identity verification according to the text.
7. An electronic device comprising a processor, a memory and a computer program stored on the memory and executable on the processor, the computer program, when executed by the processor, performing the steps of the image rectification method of claim 1, or performing the steps of the image text recognition method of claim 2, or performing the steps of the identity verification method of claim 3.
8. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of the image correction method according to claim 1, or the steps of the image text recognition method according to claim 2, or the steps of the identity verification method according to claim 3.
CN202010081389.2A 2020-02-06 2020-02-06 Image correction method, image text recognition method, identity verification method and device Active CN113221897B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010081389.2A CN113221897B (en) 2020-02-06 2020-02-06 Image correction method, image text recognition method, identity verification method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010081389.2A CN113221897B (en) 2020-02-06 2020-02-06 Image correction method, image text recognition method, identity verification method and device

Publications (2)

Publication Number Publication Date
CN113221897A CN113221897A (en) 2021-08-06
CN113221897B true CN113221897B (en) 2023-04-18

Family

ID=77085561

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010081389.2A Active CN113221897B (en) 2020-02-06 2020-02-06 Image correction method, image text recognition method, identity verification method and device

Country Status (1)

Country Link
CN (1) CN113221897B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113657384B (en) * 2021-09-02 2024-04-05 京东科技控股股份有限公司 Certificate image correction method and device, storage medium and electronic equipment
CN114092695B (en) * 2022-01-21 2022-05-13 武汉精立电子技术有限公司 ROI extraction method and device based on segmentation model

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109583449A (en) * 2018-10-29 2019-04-05 深圳市华尊科技股份有限公司 Character identifying method and Related product
CN110232696A (en) * 2019-06-20 2019-09-13 腾讯科技(深圳)有限公司 A kind of method of image region segmentation, the method and device of model training

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107492071B (en) * 2017-08-17 2021-04-27 京东方科技集团股份有限公司 Medical image processing method and equipment
CN108171320B (en) * 2017-12-06 2021-10-19 西安工业大学 Image domain conversion network and conversion method based on generative countermeasure network
CN108320290B (en) * 2017-12-29 2021-10-22 中国银联股份有限公司 Target picture extraction and correction method and device, computer equipment and recording medium
CN109472837A (en) * 2018-10-24 2019-03-15 西安电子科技大学 The photoelectric image conversion method of confrontation network is generated based on condition
CN109509178B (en) * 2018-10-24 2021-09-10 苏州大学 OCT image choroid segmentation method based on improved U-net network
CN110399873A (en) * 2019-07-11 2019-11-01 汉王科技股份有限公司 ID Card Image acquisition methods, device, electronic equipment and storage medium
CN110472602A (en) * 2019-08-20 2019-11-19 腾讯科技(深圳)有限公司 A kind of recognition methods of card card, device, terminal and storage medium
CN110738660B (en) * 2019-09-09 2023-06-16 五邑大学 Vertebra CT image segmentation method and device based on improved U-net
CN110738602B (en) * 2019-09-12 2021-01-01 北京三快在线科技有限公司 Image processing method and device, electronic equipment and readable storage medium
CN110675374B (en) * 2019-09-17 2022-05-03 电子科技大学 Two-dimensional image sewage flow detection method based on generation countermeasure network


Also Published As

Publication number Publication date
CN113221897A (en) 2021-08-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant