CN111507957B - Identity card picture conversion method and device, computer equipment and storage medium - Google Patents

Info

Publication number
CN111507957B
CN111507957B (application CN202010294589.6A)
Authority
CN
China
Prior art keywords
picture
initial picture
positioning information
initial
positioning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010294589.6A
Other languages
Chinese (zh)
Other versions
CN111507957A (en
Inventor
谭江龙
范有文
郑泽重
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Qianhai Huanrong Lianyi Information Technology Service Co Ltd
Original Assignee
Shenzhen Qianhai Huanrong Lianyi Information Technology Service Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Qianhai Huanrong Lianyi Information Technology Service Co Ltd filed Critical Shenzhen Qianhai Huanrong Lianyi Information Technology Service Co Ltd
Priority to CN202010294589.6A priority Critical patent/CN111507957B/en
Publication of CN111507957A publication Critical patent/CN111507957A/en
Application granted granted Critical
Publication of CN111507957B publication Critical patent/CN111507957B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/77Retouching; Inpainting; Scratch removal
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40Document-oriented image-based pattern recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30168Image quality inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/28Character recognition specially adapted to the type of the alphabet, e.g. Latin alphabet
    • G06V30/287Character recognition specially adapted to the type of the alphabet, e.g. Latin alphabet of Kanji, Hiragana or Katakana characters

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Quality & Reliability (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Character Input (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to an identity card picture conversion method and device, computer equipment and a storage medium. The method comprises: acquiring an identity card picture to be identified to obtain an initial picture; inputting the initial picture into an attribute identification model for attribute identification to obtain an attribute identification result; inputting the initial picture into a region positioning model for region positioning to obtain positioning information; judging from the positioning information whether the initial picture meets the quality requirement; if so, cutting the initial picture according to the positioning information to obtain a target area; calculating the occupation ratio and the text tilt from the positioning information and the target area to obtain index data; judging from the index data whether the initial picture can be optimized; if it can, optimizing the initial picture to generate a processing result; and feeding the processing result and the corresponding attribute identification result back to the terminal. The invention automatically distinguishes identity card picture attributes and performs conversion and optimization, so as to improve the accuracy of subsequent OCR recognition.

Description

Identity card picture conversion method and device, computer equipment and storage medium
Technical Field
The present invention relates to a method for converting pictures, and more particularly to a method, an apparatus, a computer device and a storage medium for converting pictures of an identification card.
Background
The user identity card pictures collected and stored in some older business projects come in various formats, such as photocopies, mobile-phone photographs and screenshots. These existing pictures cannot meet the requirements of subsequent projects and services, so the pictures need to be converted and optimized to satisfy the business requirements concerning identity cards.
At present, certificate pictures such as identity cards collected in business projects are processed manually in batches. The whole process includes manually distinguishing whether the current certificate picture is a copy or a terminal photograph, manually cutting the picture, manually optimizing it with a picture processing tool, processing attributes with code, and so on. This manual approach is inefficient, costly and unsuitable for popularization; it places technical demands on operators and is difficult to implement; and it greatly hinders subsequent OCR (Optical Character Recognition) recognition, page display and information verification, blocking the progress of business projects.
Therefore, a new method needs to be designed that automatically distinguishes identity card picture attributes, performs conversion and optimization, and increases the clarity of the whole picture so as to improve the accuracy of subsequent OCR recognition.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides an identity card picture conversion method, an identity card picture conversion device, computer equipment and a storage medium.
In order to achieve the above purpose, the present invention adopts the following technical scheme: the identity card picture conversion method comprises the following steps:
acquiring an identity card picture to be identified to obtain an initial picture;
inputting the initial picture into an attribute identification model to carry out attribute identification so as to obtain an attribute identification result;
inputting the initial picture into a region positioning model for region positioning to obtain positioning information;
judging whether the initial picture meets the quality requirement or not according to the positioning information;
if the initial picture meets the quality requirement, cutting the initial picture according to the positioning information to obtain a target area;
calculating the occupation ratio and the text tilt according to the positioning information and the target area to obtain index data;
judging whether the initial picture can be optimized according to the index data;
if the initial picture cannot be optimized, generating prompt information and feeding the prompt information back to the terminal so as to display the prompt information on the terminal;
if the initial picture can be optimized, optimizing the initial picture to generate a processing result;
feeding back the processing result and the corresponding attribute identification result to the terminal so as to display the processing result and the corresponding attribute identification result on the terminal;
the attribute identification model is obtained by training a neural network through sample picture data with picture attribute labels;
the area positioning model is obtained by training a YOLO target detection network by adopting sample picture data with a specified area coordinate label.
The further technical scheme is as follows: the picture attribute tags comprise terminal photographing picture category tags and copy picture category tags.
The further technical scheme is as follows: after judging whether the initial picture meets the quality requirement according to the positioning information, the method further comprises the following steps:
and if the initial picture does not meet the quality requirement, executing the generation of the prompt information, and feeding the prompt information back to the terminal so as to display the prompt information at the terminal.
The further technical scheme is as follows: the loss function of the YOLO target detection network comprises a loss function for calculating a center coordinate loss value, a loss function for calculating width and height loss values of a boundary box, a loss function for calculating a picture category loss value and a loss function for calculating a confidence loss value.
The further technical scheme is as follows: the step of judging whether the initial picture meets the quality requirement according to the positioning information comprises the following steps:
judging whether the confidence in the positioning information is not smaller than a confidence threshold;
if the confidence in the positioning information is not smaller than the confidence threshold, the initial picture meets the quality requirement;
if the confidence in the positioning information is smaller than the confidence threshold, the initial picture does not meet the quality requirement.
The further technical scheme is as follows: and calculating the occupation proportion and the character gradient according to the positioning information and the target area to obtain index data, wherein the method comprises the following steps:
calculating the pixel size according to the position coordinates of the target area in the positioning information, and calculating the proportion of the map according to the pixel size;
OCR recognition is carried out on the target area so as to obtain the position of the text area;
calculating the character gradient according to the position of the character area;
integrating the proportion of the map and the character gradient to obtain index data.
The further technical scheme is as follows: the optimizing the initial picture to generate a processing result includes:
and rotating the initial picture by a corresponding angle according to the character gradient, and removing the watermark of the initial picture to generate a processing result.
The invention also provides an identity card picture conversion device, which comprises:
the picture acquisition unit is used for acquiring an identity card picture to be identified so as to obtain an initial picture;
the attribute identification unit is used for inputting the initial picture into an attribute identification model to carry out attribute identification so as to obtain an attribute identification result;
the area positioning unit is used for inputting the initial picture into an area positioning model to perform area positioning so as to obtain positioning information;
the information judging unit is used for judging whether the initial picture meets the quality requirement or not according to the positioning information;
the cutting unit is used for cutting the initial picture according to the positioning information to obtain a target area if the initial picture meets the quality requirement;
the index calculation unit is used for calculating the occupation ratio and the text tilt according to the positioning information and the target area so as to obtain index data;
the optimization judging unit is used for judging whether the initial picture can be optimized according to the index data;
the information generation unit is used for generating prompt information if the initial picture cannot be optimized, and feeding the prompt information back to the terminal so as to display the prompt information at the terminal;
the optimizing unit is used for optimizing the initial picture to generate a processing result if the initial picture can be optimized;
and the information feedback unit is used for feeding back the processing result and the corresponding attribute identification result to the terminal so as to display the processing result and the corresponding attribute identification result on the terminal.
The invention also provides a computer device which comprises a memory and a processor, wherein the memory stores a computer program, and the processor realizes the method when executing the computer program.
The present invention also provides a storage medium storing a computer program which, when executed by a processor, performs the above-described method.
Compared with the prior art, the invention has the following beneficial effects. The invention performs attribute recognition with an attribute recognition model to determine whether an identity card picture is a terminal-photographed picture or a copy, and then positions the picture with a region positioning model built on a TensorFlow-based intelligent recognition framework. For pictures in which a target is detected and the position coordinates of the target region can be effectively recognized, the occupation ratio and the text tilt angle are calculated; when both indices meet the requirements, the picture is optimized with existing Python image-processing libraries. Identity card picture attributes are thus distinguished, and conversion and optimization are performed, automatically, and the clarity of the whole picture is increased so as to improve subsequent OCR recognition accuracy.
The invention is further described below with reference to the drawings and specific embodiments.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic diagram of an application scenario of an identification card picture conversion method provided by an embodiment of the present invention;
fig. 2 is a flow chart of an identification card picture conversion method according to an embodiment of the present invention;
FIG. 3 is a schematic sub-flowchart of an ID card image conversion method according to an embodiment of the present invention;
fig. 4 is a schematic sub-flowchart of an identification card picture conversion method according to an embodiment of the present invention;
FIG. 5 is a schematic block diagram of an ID card picture conversion device according to an embodiment of the present invention;
FIG. 6 is a schematic block diagram of an index calculation unit of the identity card picture conversion device provided by the embodiment of the invention;
fig. 7 is a schematic block diagram of a computer device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be understood that the terms "comprises" and "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in the present specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
Referring to fig. 1 and fig. 2, fig. 1 is a schematic view of an application scenario of the identity card picture conversion method according to an embodiment of the present invention, and fig. 2 is a schematic flow chart of the method. The identity card picture conversion method is applied to a server that exchanges data with a terminal: the server acquires the identity card picture to be identified, performs attribute identification and region positioning for quality detection, and optimizes optimizable pictures to improve the clarity of the whole picture and facilitate subsequent OCR recognition.
Fig. 2 is a flow chart of an identification card picture conversion method according to an embodiment of the present invention. As shown in fig. 2, the method includes the following steps S110 to S200.
S110, acquiring an identity card picture to be identified so as to obtain an initial picture.
In this embodiment, the initial picture may be an identity card picture collected by the service item, and of course, the initial picture may also be an identity card picture obtained by real-time transmission through the terminal.
S120, inputting the initial picture into an attribute identification model to carry out attribute identification so as to obtain an attribute identification result.
In this embodiment, the attribute identification result indicates whether the initial picture belongs to the terminal-photographed picture category or the copy picture category.
The attribute identification model is obtained by training a neural network through sample picture data with picture attribute labels.
The picture attribute tags comprise terminal photographing picture category tags and copy picture category tags.
A large number of sample pictures with picture attribute labels are used to train a neural network. A loss function calculates the difference between each training result and the actual label; when the difference falls within an allowable range, the current neural network has converged and can serve as the attribute identification model. Otherwise the network is considered unsatisfactory, and its weights are readjusted for retraining until both training and testing meet the requirements.
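The converge-or-retrain loop described above can be sketched with a toy stand-in. The following is not the patent's network: it trains a NumPy logistic-regression classifier on synthetic two-feature samples (stand-ins for terminal-photo versus copy pictures) and stops once the cross-entropy loss falls within an allowable range; the features, tolerance and learning rate are illustrative assumptions.

```python
import numpy as np

def train_classifier(X, y, lr=0.5, tol=0.05, max_epochs=5000):
    """Toy stand-in for the attribute model: logistic regression trained
    until the loss falls within an allowable range (tol), mirroring the
    converge-or-retrain loop described in the text."""
    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.01, size=X.shape[1])
    b = 0.0
    for _ in range(max_epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))          # sigmoid prediction
        loss = -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))
        if loss < tol:                                   # difference within allowable range
            return w, b, loss
        grad = p - y                                     # cross-entropy gradient w.r.t. logits
        w -= lr * X.T @ grad / len(y)
        b -= lr * grad.mean()
    return w, b, loss

# Synthetic "terminal photo" (label 1) vs "copy" (label 0) feature vectors.
X = np.array([[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]])
y = np.array([1.0, 1.0, 0.0, 0.0])
w, b, loss = train_classifier(X, y)
```

A real attribute model would be a convolutional network trained on labelled picture data, but the stopping criterion works the same way.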
S130, inputting the initial picture into a region positioning model to perform region positioning so as to obtain positioning information.
In this embodiment, the positioning information includes the position coordinates of the target area and the confidence.
The region positioning model is obtained by training a YOLO target detection network with sample picture data carrying specified-region coordinate labels, and the identity card picture is positioned with a TensorFlow-based pattern recognition technique. During training, the YOLO network resizes the picture to 448x448, passes it through the network, and applies non-maximum suppression to obtain the result. Unlike conventional detection algorithms that slide a window across the image to find targets, YOLO uses a single convolutional neural network to directly predict multiple regions and class probabilities.
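The non-maximum suppression step mentioned above can be sketched in NumPy as follows; the (x1, y1, x2, y2) box format and the 0.5 IoU threshold are common conventions assumed here, not values taken from the patent.

```python
import numpy as np

def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_thresh=0.5):
    """Keep the highest-scoring box, drop boxes overlapping it, repeat."""
    order = np.argsort(scores)[::-1]
    keep = []
    while len(order):
        i = order[0]
        keep.append(int(i))
        order = np.array([j for j in order[1:]
                          if iou(boxes[i], boxes[j]) < iou_thresh], dtype=int)
    return keep

boxes = np.array([[10, 10, 100, 60], [12, 12, 102, 62], [200, 200, 300, 260]], float)
scores = np.array([0.9, 0.8, 0.7])
kept = nms(boxes, scores)
```

Of the two heavily overlapping boxes, only the higher-scoring one survives; the distant third box is kept.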
After the identity card picture to be identified is obtained, positioning analysis of the picture's target pattern is performed first; it mainly covers the pixel size of the target area, the proportion of the target area within the whole picture, and the tilt of the picture's text. The results are compared against predefined thresholds to determine whether the upload quality requirement is met; the specific thresholds are set by analysing the picture set that gives the best results under actual recognition conditions.
In an embodiment, the loss function of the YOLO target detection network includes a loss function that calculates center coordinate loss values, a loss function that calculates width and height loss values of bounding boxes, a loss function that calculates picture category loss values, a loss function that calculates confidence loss values.
Wherein the loss function for calculating the center coordinate loss value is

$\lambda_{coord} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj} \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right]$

This loss function penalizes errors in the predicted bounding box position $(x, y)$. $\lambda_{coord}$ is a given constant. The function sums over each bounding box predictor $j$ of each grid cell $i$; the indicator $\mathbb{1}_{ij}^{obj}$ equals 1 if there is a target in grid cell $i$ and the $j$-th bounding box predictor is responsible for that prediction, and 0 otherwise.

YOLO predicts bounding boxes for each grid cell. During training, only one bounding box predictor is desired for each target; the predictor whose prediction has the highest IOU (Intersection over Union) with the ground truth is confirmed as responsible for predicting that target. $(x_i, y_i)$ is the position of the predicted bounding box and $(\hat{x}_i, \hat{y}_i)$ is the actual position obtained from the sample picture data.

The loss function for calculating the width and height loss values of the bounding box is

$\lambda_{coord} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj} \left[ (\sqrt{w_i} - \sqrt{\hat{w}_i})^2 + (\sqrt{h_i} - \sqrt{\hat{h}_i})^2 \right]$

where $w_i$ and $h_i$ are the predicted width and height of the bounding box, and $\hat{w}_i$ and $\hat{h}_i$ are the actual width and height of the target area derived from the sample picture data.

The loss function for calculating the picture category loss value is

$\sum_{i=0}^{S^2} \mathbb{1}_{i}^{obj} \sum_{c \in classes} \left( p_i(c) - \hat{p}_i(c) \right)^2$

where $p_i(c)$ is the predicted probability that the target area belongs to category $c$, and $\hat{p}_i(c)$ is the actual category indicator obtained from the sample picture data.

The loss function for calculating the confidence loss value is

$\sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj} \left( C_i - \hat{C}_i \right)^2 + \lambda_{noobj} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{noobj} \left( C_i - \hat{C}_i \right)^2$

where $C_i$ is the predicted confidence and $\hat{C}_i$ is the IOU of the predicted bounding box (the predicted target region) with the ground truth. The parameters $\lambda_{coord}$ and $\lambda_{noobj}$ weight the different parts of the loss function, which is critical to the stability of the model: the penalty for coordinate prediction is highest, $\lambda_{coord} = 5$, while the confidence prediction penalty when no target is detected is lowest, $\lambda_{noobj} = 0.5$.
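As a numerical illustration, the standard YOLO loss terms can be evaluated for a single responsible predictor (a simplified sketch that omits the sums over grid cells; the dictionary field names and test values are assumptions, not the patent's implementation):

```python
import numpy as np

LAMBDA_COORD, LAMBDA_NOOBJ = 5.0, 0.5  # weights from the text

def yolo_loss_terms(pred, truth, responsible, has_obj):
    """Simplified per-predictor YOLO loss.
    pred/truth: dicts with x, y, w, h, C (confidence) and p (class probs).
    responsible: 1 if this predictor is matched to a ground-truth object."""
    obj = responsible * has_obj
    coord = LAMBDA_COORD * obj * ((pred["x"] - truth["x"])**2 + (pred["y"] - truth["y"])**2)
    size = LAMBDA_COORD * obj * ((np.sqrt(pred["w"]) - np.sqrt(truth["w"]))**2
                                 + (np.sqrt(pred["h"]) - np.sqrt(truth["h"]))**2)
    cls = has_obj * np.sum((pred["p"] - truth["p"])**2)
    conf = obj * (pred["C"] - truth["C"])**2 \
         + LAMBDA_NOOBJ * (1 - obj) * (pred["C"] - truth["C"])**2
    return coord + size + cls + conf

pred = {"x": 0.5, "y": 0.5, "w": 0.4, "h": 0.2, "C": 0.9, "p": np.array([0.8, 0.2])}
truth = {"x": 0.5, "y": 0.5, "w": 0.4, "h": 0.2, "C": 1.0, "p": np.array([1.0, 0.0])}
total = yolo_loss_terms(pred, truth, responsible=1, has_obj=1)
```

With perfect coordinates and sizes, the remaining loss comes only from the class probabilities (0.08) and the confidence gap (0.01).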
And S140, judging whether the initial picture meets the quality requirement or not according to the positioning information.
In one embodiment, referring to fig. 3, the step S140 may include steps S141 to S143.
S141, judging whether the confidence in the positioning information is not smaller than a confidence threshold;
S142, if the confidence in the positioning information is not smaller than the confidence threshold, the initial picture meets the quality requirement;
S143, if the confidence in the positioning information is smaller than the confidence threshold, the initial picture does not meet the quality requirement.
In this embodiment, every uploaded identity card picture is expected to contain the target area, so only the confidence in the positioning information needs to be judged: when the confidence is low, the target area cannot be positioned, meaning the quality of the current initial picture does not meet the requirement and cannot be remedied by adjustment.
In an embodiment, the region positioning model can also be set to output a category result indicating whether a target region is present, and a double judgment is made by combining this with the confidence: if no target can be positioned, the current initial picture is not clear enough and cannot meet the requirement; if a target is detected, it is further judged whether the confidence of the target area meets the requirement, and when it does, the initial picture meets the picture quality requirement.
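A minimal sketch of this double judgment, assuming illustrative field names and a 0.6 confidence threshold (the patent does not fix the threshold value):

```python
def meets_quality(positioning, conf_threshold=0.6):
    """Double check described in the text: a target must be detected and its
    confidence must reach the threshold. Field names and the default
    threshold are illustrative assumptions."""
    if not positioning.get("has_target", True):
        return False                       # no target positioned: picture too unclear
    return positioning["confidence"] >= conf_threshold
```

Only when both conditions hold does the pipeline proceed to cutting out the target area.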
And S150, if the initial picture meets the quality requirement, cutting the initial picture according to the positioning information to obtain a target area.
In this embodiment, the target area refers to a picture area including only the identification card, and does not include any background.
And cutting according to the coordinate information in the positioning information to accurately obtain the target area.
And S160, calculating the occupation ratio and the text tilt according to the positioning information and the target area so as to obtain index data.
In this embodiment, the index data refers to the occupation ratio of the target area and the text tilt.
In one embodiment, referring to fig. 4, the step S160 may include steps S161 to S164.
And S161, calculating the pixel size according to the position coordinates of the target area in the positioning information, and calculating the occupation ratio according to the pixel size.
In this embodiment, the occupation ratio refers to the proportion of the whole identity card picture that the target area occupies.
The pixel size of the target area can be obtained from its position coordinates, which gives the area of the target region; dividing by the area of the whole identity card picture yields the occupation ratio.
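A sketch of the cropping and occupation-ratio computation with NumPy array slicing; the (x1, y1, x2, y2) pixel-coordinate format is an assumption:

```python
import numpy as np

def crop_and_ratio(image, box):
    """Crop the located ID-card region and compute its share of the whole
    picture. box = (x1, y1, x2, y2) in pixel coordinates (assumed format)."""
    x1, y1, x2, y2 = box
    target = image[y1:y2, x1:x2]                 # rows are y, columns are x
    ratio = (target.shape[0] * target.shape[1]) / (image.shape[0] * image.shape[1])
    return target, ratio

image = np.zeros((400, 600), dtype=np.uint8)     # stand-in 600x400 picture
target, ratio = crop_and_ratio(image, (100, 100, 500, 300))
```

Here a 400x200-pixel region of a 600x400 picture yields an occupation ratio of one third.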
S162, OCR is conducted on the target area so as to obtain the position of the text area.
And performing character recognition on the target picture by adopting OCR to obtain the position of the character area.
S163, calculating the character gradient according to the position of the character area.
In this embodiment, the text tilt refers to the tilt angle of the text area relative to the horizontal.
The tilt angle of the text area, i.e. the text tilt, can be obtained from the position of the text area reported by the OCR technique.
The cut-out target area undergoes one pass of OCR recognition, and the text tilt is calculated from the positions of the text areas marked by OCR, serving as the picture's text tilt index.
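Assuming the OCR engine reports each text line as a quadrilateral of four corner points in top-left, top-right, bottom-right, bottom-left order (an assumed format; real engines differ), the tilt can be derived from the bottom edge:

```python
import math

def text_tilt_degrees(box):
    """Tilt of a text line relative to the horizontal, from the two bottom
    corners of an OCR-reported quadrilateral (corner order is an assumption)."""
    (x1, y1), (x2, y2) = box[3], box[2]   # bottom-left, bottom-right
    return math.degrees(math.atan2(y2 - y1, x2 - x1))

# Quadrilateral corner order: top-left, top-right, bottom-right, bottom-left.
level = [(0, 0), (100, 0), (100, 20), (0, 20)]
tilted = [(0, 0), (100, 10), (98, 30), (-2, 20)]
angle = text_tilt_degrees(tilted)   # roughly 5.7 degrees
```

The resulting angle is later used to rotate the picture back to the horizontal during optimization.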
And S164, integrating the occupation ratio and the text tilt to obtain index data.
The obtained indices are compared with the configured thresholds and the result is reported, for example that the picture fully meets the quality requirement, or that some index is not met, such as the picture having insufficient pixels, in which case it is reported as blurred.
S170, judging whether the initial picture can be optimized according to the index data.
In this embodiment, when an index exceeds its configured threshold, the identity card picture cannot be optimized; when no index exceeds its threshold, the picture can be optimized.
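A minimal sketch of this threshold decision; the threshold values themselves are illustrative assumptions, since the patent leaves the configured values unspecified:

```python
# Illustrative thresholds; the configured values are not given in the text.
THRESHOLDS = {"min_ratio": 0.2, "max_tilt_deg": 30.0}

def is_optimizable(ratio, tilt_deg, thresholds=THRESHOLDS):
    """The picture is treated as optimizable only while both indices stay
    inside the configured limits; otherwise re-upload is requested."""
    return ratio >= thresholds["min_ratio"] and abs(tilt_deg) <= thresholds["max_tilt_deg"]
```

A too-small occupation ratio (target too distant or blurred) or an extreme tilt puts the picture beyond automatic optimization.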
And S180, if the initial picture cannot be optimized, generating prompt information and feeding the prompt information back to the terminal so as to display the prompt information on the terminal.
In this embodiment, the prompt information indicates that the picture does not meet the quality requirement and cannot be optimized, and that the picture needs to be uploaded again.
And S190, if the initial picture can be optimized, performing optimization processing on the initial picture to generate a processing result.
In this embodiment, the processing result refers to the optimized picture.
Specifically, the initial picture is rotated by the corresponding angle according to the text tilt, and the watermark of the initial picture is removed to generate the processing result.
The pictures are processed and optimized with Python image libraries; the operations include angle adjustment, effective-area extraction, contrast optimization, watermark removal and the like.
For contrast optimization, the picture contrast can be judged from the three-channel colour parameters; changing the colour channels of the picture to be optimized makes its blacks and whites visually more distinct, which improves OCR accuracy. For seal removal, the seal is first extracted through the HSV colour space, the needed information is acquired, the red-channel information is judged, and the red pigment is removed, visually removing the seal from the picture. Picture cutting is performed according to the recognition result, since the front and back of the identity card may appear in one picture.
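The red-seal removal can be sketched directly on RGB channels (a crude stand-in for the HSV extraction described above; the channel thresholds are assumptions):

```python
import numpy as np

def remove_red_seal(rgb, r_min=150, gb_max=110):
    """Repaint strongly red pixels white, approximating the seal-removal step.
    Thresholds are illustrative assumptions, not values from the text."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    mask = (r >= r_min) & (g <= gb_max) & (b <= gb_max)   # "red pigment" pixels
    out = rgb.copy()
    out[mask] = 255
    return out

img = np.full((4, 4, 3), 200, dtype=np.uint8)  # light-grey background
img[1, 1] = (220, 40, 40)                      # a red "seal" pixel
clean = remove_red_seal(img)
```

A production pipeline would convert to HSV and handle the hue wrap-around of red, but the masking idea is the same.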
And S200, feeding back the processing result and the corresponding attribute identification result to the terminal so as to display the processing result and the corresponding attribute identification result on the terminal.
If the picture is not processed and optimized, for example because the target cannot be identified by the picture detection model, the original picture is output with a prompt that it cannot be optimized. If picture detection shows that it can be processed and optimized, the output is produced as required, i.e. only the optimized target area, the corresponding attribute category and the like are output.
If the initial picture does not meet the quality requirement, step S180 is executed.
According to the identity card picture conversion method, attribute recognition is first carried out with the attribute recognition model to determine whether the identity card picture is a terminal-photographed picture or a copy. A region positioning model built on a TensorFlow-based intelligent recognition framework then positions the picture; for pictures in which a target is detected and the position coordinates of the target area can be effectively recognized, the picture occupation ratio and the text inclination angle are computed. When both indexes meet the requirements, a conventional Python image-processing tool library can be used to optimize the picture. Conversion and optimization such as distinguishing identity card picture attributes are thus carried out automatically, and the definition of the whole picture is increased so as to improve subsequent OCR recognition accuracy.
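The gating described above — positioning confidence, occupation ratio, and text inclination must all meet their requirements before optimization proceeds — can be sketched as a single predicate. All threshold values below are hypothetical; the patent does not disclose concrete numbers.

```python
def picture_passes_gates(confidence: float,
                         occupation_ratio: float,
                         inclination_deg: float,
                         conf_threshold: float = 0.8,
                         min_ratio: float = 0.2,
                         max_tilt_deg: float = 15.0) -> bool:
    """Gate a detected picture before optimization: the positioning
    confidence must reach the threshold, the target must occupy a
    sufficient share of the picture, and the text must not be tilted
    too far. Threshold values are illustrative only."""
    if confidence < conf_threshold:
        return False  # quality requirement not met; prompt is generated instead
    return occupation_ratio >= min_ratio and abs(inclination_deg) <= max_tilt_deg
```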
Fig. 5 is a schematic block diagram of an identification card picture conversion apparatus 300 according to an embodiment of the present invention. As shown in fig. 5, the present invention further provides an identification card picture conversion apparatus 300 corresponding to the above identification card picture conversion method. The identification card picture conversion apparatus 300 includes units for performing the above identification card picture conversion method and may be configured in a server. Specifically, referring to fig. 5, the identification card picture conversion apparatus 300 includes a picture acquisition unit 301, an attribute identification unit 302, a region positioning unit 303, an information determination unit 304, a cutting unit 305, an index calculation unit 306, an optimization determination unit 307, an information generation unit 308, an optimization unit 309, and an information feedback unit 310.
The picture obtaining unit 301 is configured to obtain an identity card picture to be identified, so as to obtain an initial picture; the attribute identifying unit 302 is configured to input the initial picture to an attribute identification model for attribute identification, so as to obtain an attribute identification result, the attribute identification model being obtained by training a neural network with sample picture data carrying picture attribute labels; the region positioning unit 303 is configured to input the initial picture to a region positioning model for region positioning, so as to obtain positioning information, the region positioning model being obtained by training a YOLO target detection network with sample picture data carrying designated-area coordinate labels; the information judging unit 304 is configured to judge whether the initial picture meets the quality requirement according to the positioning information; the cutting unit 305 is configured to cut the initial picture according to the positioning information to obtain a target area if the initial picture meets the quality requirement; the index calculation unit 306 is configured to calculate the occupation ratio and the text inclination according to the positioning information and the target area, so as to obtain index data; the optimization judging unit 307 is configured to judge whether the initial picture can be optimized according to the index data; the information generating unit 308 is configured to generate prompt information if the target area cannot be optimized, and feed the prompt information back to the terminal so as to display it at the terminal; the optimizing unit 309 is configured to optimize the initial picture to generate a processing result if the target area can be optimized; and the information feedback unit 310 is configured to feed back the processing result and the corresponding attribute identification result to the terminal, so as to display them on the terminal.
In an embodiment, the information determining unit 304 is configured to determine whether the confidence in the positioning information is not less than a confidence threshold; if the confidence in the positioning information is not less than the confidence threshold, the initial picture meets the quality requirement; if the confidence in the positioning information is less than the confidence threshold, the initial picture does not meet the quality requirement.
In one embodiment, as shown in fig. 6, the index calculating unit 306 includes an occupation-ratio calculating subunit 3061, an identifying subunit 3062, an inclination calculating subunit 3063, and an integrating subunit 3064.
The occupation-ratio calculating subunit 3061 is configured to calculate the pixel size according to the target area position coordinates in the positioning information, and calculate the occupation ratio according to the pixel size; the identifying subunit 3062 is configured to perform OCR recognition on the target area to obtain the position of the text area; the inclination calculating subunit 3063 is configured to calculate the text inclination according to the position of the text area; and the integrating subunit 3064 is configured to integrate the occupation ratio and the text inclination to obtain index data.
In an embodiment, the optimizing unit 309 is configured to perform rotation of the initial picture by a corresponding angle according to the text inclination, and remove a watermark of the initial picture to generate a processing result.
It should be noted that, as those skilled in the art can clearly understand, for the specific implementation process of the identification card picture conversion apparatus 300 and each of its units, reference may be made to the corresponding description in the foregoing method embodiment; for convenience and brevity of description, details are omitted here.
The identification card picture conversion apparatus 300 described above may be implemented in the form of a computer program that can be run on a computer device as shown in fig. 7.
Referring to fig. 7, fig. 7 is a schematic block diagram of a computer device according to an embodiment of the present application. The computer device 500 may be a server, where the server may be a stand-alone server or may be a server cluster formed by a plurality of servers.
With reference to FIG. 7, the computer device 500 includes a processor 502, memory, and a network interface 505 connected by a system bus 501, where the memory may include a non-volatile storage medium 503 and an internal memory 504.
The non-volatile storage medium 503 may store an operating system 5031 and a computer program 5032. The computer program 5032 includes program instructions that, when executed, cause the processor 502 to perform an identification card picture conversion method.
The processor 502 is used to provide computing and control capabilities to support the operation of the overall computer device 500.
The internal memory 504 provides an environment for the execution of a computer program 5032 in the non-volatile storage medium 503, which computer program 5032, when executed by the processor 502, causes the processor 502 to perform an identification card picture conversion method.
The network interface 505 is used for network communication with other devices. It will be appreciated by those skilled in the art that the architecture shown in fig. 7 is merely a block diagram of part of the architecture relevant to the present solution and does not limit the computer device 500 to which the present solution is applied; a particular computer device 500 may include more or fewer components than shown, combine some components, or have a different arrangement of components.
Wherein the processor 502 is configured to execute a computer program 5032 stored in a memory to implement the steps of:
acquiring an identity card picture to be identified to obtain an initial picture; inputting the initial picture into an attribute identification model for attribute identification to obtain an attribute identification result; inputting the initial picture into a region positioning model for region positioning to obtain positioning information; judging whether the initial picture meets the quality requirement according to the positioning information; if the initial picture meets the quality requirement, cutting the initial picture according to the positioning information to obtain a target area; calculating the occupation ratio and the text inclination according to the positioning information and the target area to obtain index data; judging whether the initial picture can be optimized according to the index data; if the target area cannot be optimized, generating prompt information and feeding the prompt information back to the terminal so as to display the prompt information on the terminal; if the target area can be optimized, optimizing the initial picture to generate a processing result; and feeding back the processing result and the corresponding attribute identification result to the terminal so as to display them on the terminal.
The attribute identification model is obtained by training a neural network through sample picture data with picture attribute labels. The picture attribute tags comprise terminal photographing picture category tags and copy picture category tags.
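The patent's attribute model is a trained neural network that separates terminal-photographed pictures from copies. Purely as an illustrative stand-in — not the patent's method — the sketch below uses a crude color heuristic: a photocopy is close to grayscale, so its per-pixel channel spread is small, while a photographed card usually carries noticeable color. The function name, rule, and threshold are all hypothetical.

```python
import numpy as np

def guess_picture_attribute(rgb: np.ndarray, spread_threshold: float = 15.0) -> str:
    """Rough stand-in for the trained attribute model: compare the mean
    per-pixel spread between color channels (a chroma proxy) against a
    threshold. Illustrative only; the patent trains a neural network
    on labeled sample pictures instead."""
    channels = rgb.astype(float)
    spread = channels.max(axis=-1) - channels.min(axis=-1)  # 0 for pure gray
    return "terminal_photo" if spread.mean() > spread_threshold else "copy"
```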
The area positioning model is obtained by training a YOLO target detection network by adopting sample picture data with a specified area coordinate label.
The loss function of the YOLO target detection network comprises a loss function for calculating a center coordinate loss value, a loss function for calculating width and height loss values of a boundary box, a loss function for calculating a picture category loss value and a loss function for calculating a confidence loss value.
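The four loss terms named above can be sketched for a single responsible bounding box. Squared error is used for every term, as in the original YOLO formulation, with the width/height term taken on square roots; the dict layout and the weighting factors are illustrative assumptions.

```python
import numpy as np

def yolo_style_loss(pred: dict, target: dict,
                    lambda_coord: float = 5.0, lambda_obj: float = 1.0) -> float:
    """Sum of the four YOLO loss components for one responsible box.
    `pred`/`target` hold 'xy' (2,), 'wh' (2,), 'cls' (C,), 'conf'
    (scalar). Layout and weights are illustrative sketches."""
    center = np.sum((pred["xy"] - target["xy"]) ** 2)          # center coordinate loss
    # width/height loss on square roots, damping the influence of large boxes
    size = np.sum((np.sqrt(pred["wh"]) - np.sqrt(target["wh"])) ** 2)
    category = np.sum((pred["cls"] - target["cls"]) ** 2)      # picture category loss
    confidence = (pred["conf"] - target["conf"]) ** 2          # confidence loss
    return lambda_coord * (center + size) + category + lambda_obj * confidence
```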
In an embodiment, after implementing the step of determining whether the initial picture meets the quality requirement according to the positioning information, the processor 502 further implements the following steps:
and if the initial picture does not meet the quality requirement, executing the generation of the prompt information, and feeding the prompt information back to the terminal so as to display the prompt information at the terminal.
In an embodiment, when implementing the step of determining whether the initial picture meets the quality requirement according to the positioning information, the processor 502 specifically implements the following steps:
judging whether the confidence in the positioning information is not less than a confidence threshold; if the confidence in the positioning information is not less than the confidence threshold, the initial picture meets the quality requirement; if the confidence in the positioning information is less than the confidence threshold, the initial picture does not meet the quality requirement.
In an embodiment, when the step of calculating the map occupation ratio and the text gradient according to the positioning information and the target area to obtain the index data is implemented by the processor 502, the following steps are specifically implemented:
calculating the pixel size according to the position coordinates of the target area in the positioning information, and calculating the occupation ratio according to the pixel size; performing OCR recognition on the target area to obtain the position of the text area; calculating the text inclination according to the position of the text area; and integrating the occupation ratio and the text inclination to obtain the index data.
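The two index computations can be sketched directly from the quantities named above: the occupation ratio is the target box's pixel area over the whole picture's, and the text inclination follows from the endpoints of a recognized text line's baseline. The coordinate conventions and function names below are assumptions for illustration.

```python
import math

def occupation_ratio(box, image_w: int, image_h: int) -> float:
    """Share of the whole picture covered by the target box
    (x1, y1, x2, y2) taken from the positioning information."""
    x1, y1, x2, y2 = box
    return ((x2 - x1) * (y2 - y1)) / (image_w * image_h)

def text_inclination_deg(p_left, p_right) -> float:
    """Tilt of a text line from the left/right endpoints of its
    baseline, as OCR might report them. Positive means the line
    rises to the right (image y grows downward)."""
    dx = p_right[0] - p_left[0]
    dy = p_left[1] - p_right[1]
    return math.degrees(math.atan2(dy, dx))
```

The rotation step of the optimization then simply rotates the picture by the negative of this angle to level the text.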
In an embodiment, when the step of optimizing the initial picture to generate the processing result is implemented by the processor 502, the following steps are specifically implemented:
rotating the initial picture by a corresponding angle according to the text inclination, and removing the watermark of the initial picture to generate the processing result.
It should be appreciated that, in an embodiment of the application, the processor 502 may be a central processing unit (Central Processing Unit, CPU); the processor 502 may also be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
Those skilled in the art will appreciate that all or part of the flow in the methods of the above embodiments may be accomplished by a computer program instructing the relevant hardware. The computer program comprises program instructions and may be stored in a storage medium, which is a computer-readable storage medium. The program instructions are executed by at least one processor in the computer system to implement the flow steps of the embodiments of the method described above.
Accordingly, the present application also provides a storage medium. The storage medium may be a computer readable storage medium. The storage medium stores a computer program which, when executed by a processor, causes the processor to perform the steps of:
acquiring an identity card picture to be identified to obtain an initial picture; inputting the initial picture into an attribute identification model for attribute identification to obtain an attribute identification result; inputting the initial picture into a region positioning model for region positioning to obtain positioning information; judging whether the initial picture meets the quality requirement according to the positioning information; if the initial picture meets the quality requirement, cutting the initial picture according to the positioning information to obtain a target area; calculating the occupation ratio and the text inclination according to the positioning information and the target area to obtain index data; judging whether the initial picture can be optimized according to the index data; if the target area cannot be optimized, generating prompt information and feeding the prompt information back to the terminal so as to display the prompt information on the terminal; if the target area can be optimized, optimizing the initial picture to generate a processing result; and feeding back the processing result and the corresponding attribute identification result to the terminal so as to display them on the terminal.
The attribute identification model is obtained by training a neural network through sample picture data with picture attribute labels. The picture attribute tags comprise terminal photographing picture category tags and copy picture category tags.
The area positioning model is obtained by training a YOLO target detection network by adopting sample picture data with a specified area coordinate label.
The loss function of the YOLO target detection network comprises a loss function for calculating a center coordinate loss value, a loss function for calculating width and height loss values of a boundary box, a loss function for calculating a picture category loss value and a loss function for calculating a confidence loss value.
In an embodiment, after executing the computer program to implement the step of determining whether the initial picture meets the quality requirement according to the positioning information, the processor further implements the following steps:
and if the initial picture does not meet the quality requirement, executing the generation of the prompt information, and feeding the prompt information back to the terminal so as to display the prompt information at the terminal.
In an embodiment, when the processor executes the computer program to implement the step of determining whether the initial picture meets the quality requirement according to the positioning information, the method specifically includes the following steps:
judging whether the confidence in the positioning information is not less than a confidence threshold; if the confidence in the positioning information is not less than the confidence threshold, the initial picture meets the quality requirement; if the confidence in the positioning information is less than the confidence threshold, the initial picture does not meet the quality requirement.
In one embodiment, when the processor executes the computer program to implement the step of calculating the graph occupation ratio and the text inclination according to the positioning information and the target area to obtain the index data, the method specifically includes the following steps:
calculating the pixel size according to the position coordinates of the target area in the positioning information, and calculating the occupation ratio according to the pixel size; performing OCR recognition on the target area to obtain the position of the text area; calculating the text inclination according to the position of the text area; and integrating the occupation ratio and the text inclination to obtain the index data.
In one embodiment, when the processor executes the computer program to perform the optimizing process on the initial picture to generate a processing result, the following steps are specifically implemented:
rotating the initial picture by a corresponding angle according to the text inclination, and removing the watermark of the initial picture to generate the processing result.
The storage medium may be a U-disk, a removable hard disk, a Read-Only Memory (ROM), a magnetic disk, or an optical disk, or other various computer-readable storage media that can store program codes.
Those of ordinary skill in the art will appreciate that the elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented in electronic hardware, in computer software, or in a combination of the two; the elements and steps of the examples have been described above generally in terms of function to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends on the particular application and the design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the device embodiments described above are merely illustrative. For example, the division of each unit is only one logic function division, and there may be another division manner in actual implementation. For example, multiple units or components may be combined or may be integrated into another system, or some features may be omitted, or not performed.
The steps in the method of the embodiment of the invention can be sequentially adjusted, combined and deleted according to actual needs. The units in the device of the embodiment of the invention can be combined, divided and deleted according to actual needs. In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The integrated unit may be stored in a storage medium if implemented in the form of a software functional unit and sold or used as a stand-alone product. Based on such understanding, the technical solution of the present invention is essentially or a part contributing to the prior art, or all or part of the technical solution may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a terminal, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention.
While the invention has been described with reference to certain preferred embodiments, it will be understood by those skilled in the art that various changes and equivalent substitutions may be made without departing from the scope of the invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.

Claims (10)

1. An identity card picture conversion method, characterized by comprising the following steps:
acquiring an identity card picture to be identified to obtain an initial picture;
inputting the initial picture into an attribute identification model to carry out attribute identification so as to obtain an attribute identification result;
inputting the initial picture into a region positioning model for region positioning to obtain positioning information;
judging whether the initial picture meets the quality requirement or not according to the positioning information;
if the initial picture meets the quality requirement, cutting the initial picture according to the positioning information to obtain a target area;
calculating the occupation ratio and the text inclination according to the positioning information and the target area to obtain index data;
judging whether the initial picture can be optimized according to the index data;
if the target area cannot be optimized, generating prompt information, and feeding the prompt information back to the terminal so as to display the prompt information on the terminal;
if the target area can be optimized, optimizing the initial picture to generate a processing result;
feeding back the processing result and the corresponding attribute identification result to the terminal so as to display the processing result and the corresponding attribute identification result on the terminal;
The attribute identification model is obtained by training a neural network through sample picture data with picture attribute labels;
the area positioning model is obtained by training a YOLO target detection network by adopting sample picture data with a specified area coordinate label.
2. The method for converting an id card picture according to claim 1, wherein the picture attribute tags include a terminal photographed picture category tag and a copy picture category tag.
3. The method for converting an id card picture according to claim 1, wherein after said determining whether the initial picture meets the quality requirement according to the positioning information, further comprises:
and if the initial picture does not meet the quality requirement, executing the generation of the prompt information, and feeding the prompt information back to the terminal so as to display the prompt information at the terminal.
4. The method for converting an identification card picture according to claim 1, wherein the loss function of the YOLO target detection network includes a loss function for calculating a center coordinate loss value, a loss function for calculating a width and height loss value of a bounding box, a loss function for calculating a picture category loss value, and a loss function for calculating a confidence loss value.
5. The method for converting an id card picture according to claim 1, wherein said determining whether the initial picture meets quality requirements according to the positioning information includes:
judging whether the confidence coefficient in the positioning information is not smaller than a confidence coefficient threshold value or not;
if the confidence coefficient in the positioning information is not smaller than a confidence coefficient threshold value, the initial picture meets the quality requirement;
if the confidence coefficient in the positioning information is smaller than a confidence coefficient threshold value, the initial picture does not meet the quality requirement.
6. The method for converting an identification card picture according to claim 1, wherein the calculating the occupation ratio and the text inclination according to the positioning information and the target area to obtain the index data comprises:
calculating the pixel size according to the position coordinates of the target area in the positioning information, and calculating the occupation ratio according to the pixel size;
performing OCR recognition on the target area to obtain the position of the text area;
calculating the text inclination according to the position of the text area;
and integrating the occupation ratio and the text inclination to obtain the index data.
7. The method for converting an id card picture according to claim 1, wherein said optimizing the initial picture to generate a processing result includes:
rotating the initial picture by a corresponding angle according to the text inclination, and removing the watermark of the initial picture to generate the processing result.
8. An identity card picture conversion apparatus, characterized by comprising:
the picture acquisition unit is used for acquiring an identity card picture to be identified so as to obtain an initial picture;
the attribute identification unit is used for inputting the initial picture into an attribute identification model to carry out attribute identification so as to obtain an attribute identification result;
the area positioning unit is used for inputting the initial picture into an area positioning model to perform area positioning so as to obtain positioning information;
the information judging unit is used for judging whether the initial picture meets the quality requirement or not according to the positioning information;
the cutting unit is used for cutting the initial picture according to the positioning information to obtain a target area if the initial picture meets the quality requirement;
the index calculation unit is used for calculating the occupation ratio and the text inclination according to the positioning information and the target area so as to obtain index data;
the optimization judging unit is used for judging whether the initial picture can be optimized according to the index data;
the information generation unit is used for generating prompt information if the target area cannot be optimized, and feeding the prompt information back to the terminal so as to display the prompt information at the terminal;
The optimizing unit is used for optimizing the initial picture to generate a processing result if the target area can be optimized;
and the information feedback unit is used for feeding back the processing result and the corresponding attribute identification result to the terminal so as to display the processing result and the corresponding attribute identification result on the terminal.
9. A computer device, characterized in that it comprises a memory on which a computer program is stored and a processor which, when executing the computer program, implements the method according to any of claims 1-7.
10. A storage medium storing a computer program which, when executed by a processor, performs the method of any one of claims 1 to 7.
CN202010294589.6A 2020-04-15 2020-04-15 Identity card picture conversion method and device, computer equipment and storage medium Active CN111507957B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010294589.6A CN111507957B (en) 2020-04-15 2020-04-15 Identity card picture conversion method and device, computer equipment and storage medium


Publications (2)

Publication Number Publication Date
CN111507957A CN111507957A (en) 2020-08-07
CN111507957B true CN111507957B (en) 2023-09-05

Family

ID=71877575

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010294589.6A Active CN111507957B (en) 2020-04-15 2020-04-15 Identity card picture conversion method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111507957B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112232336A (en) * 2020-09-02 2021-01-15 深圳前海微众银行股份有限公司 Certificate identification method, device, equipment and storage medium
CN112686847B (en) * 2020-12-23 2024-05-14 平安银行股份有限公司 Identification card image shooting quality evaluation method and device, computer equipment and medium
CN114494751A (en) * 2022-02-16 2022-05-13 国泰新点软件股份有限公司 License information identification method, device, equipment and medium
CN116363677A (en) * 2023-03-28 2023-06-30 浙江海规技术有限公司 Identification card identification method and device under complex background, computer equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10332245B1 (en) * 2018-12-11 2019-06-25 Capital One Services, Llc Systems and methods for quality assurance of image recognition model
CN109961040B (en) * 2019-03-20 2023-03-21 深圳市华付信息技术有限公司 Identity card area positioning method and device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN111507957A (en) 2020-08-07


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant