CA3135111A1 - Character positioning method and system for certificate document - Google Patents

Character positioning method and system for certificate document Download PDF

Info

Publication number
CA3135111A1
Authority
CA
Canada
Prior art keywords
feature map
image
coordinates
character
calculation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CA3135111A
Other languages
French (fr)
Inventor
Yuan Wang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
10353744 Canada Ltd
Original Assignee
10353744 Canada Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 10353744 Canada Ltd filed Critical 10353744 Canada Ltd
Publication of CA3135111A1 publication Critical patent/CA3135111A1/en
Pending legal-status Critical Current

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/28 Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Character Input (AREA)
  • Image Analysis (AREA)

Abstract

The present invention discloses a license document character locating method and system. The method comprises: inputting an image to be tested into a deep learning model and outputting a spliced and fused feature map; performing image difference calculation on the feature maps marked with different indexes to obtain a feature map difference value; performing binarization calculation on the difference value; marking image connected components according to the binarization result; and traversing all connected components, converting them into locating coordinates, and outputting a coordinate set. The invention achieves automatic recognition of license document characters with high positioning accuracy and supports character detection in any direction.

Description

CHARACTER POSITIONING METHOD AND SYSTEM FOR CERTIFICATE
DOCUMENT
Field
[0001] The present disclosure relates to the field of computer image processing and deep learning technology, and particularly to a license document character locating method and system.
Background
[0002] In financial business, the review and inspection of license documents are often involved. For example, when a company applies for a loan from a financial institution, the company needs to provide its business license and send it to the financial institution as an original or as a photocopied or scanned document. A credit approval officer of the financial institution verifies the authenticity, uniqueness, and legitimacy of the company's license based on its text information, and accurately enters that information into the business system of the financial institution for the subsequent management process.
[0003] In the industry, this type of license review and entry work can be carried out in two ways: one is a manual method, and the other is a machine automation method.
[0004] The manual method is the most common. A salesperson usually takes 5 minutes to review a license, and the work is highly repetitive, prone to human error, and carries operational risk. Another problem with the manual method is that as business volume grows, the required human resources grow with it; the work cannot be scaled effectively, and economic costs cannot achieve diminishing marginal utility.
[0005] The other way to handle this type of work is automation: using computer programs to automatically obtain the electronic version of the license, then using computer technologies such as image processing and character locating to automatically locate the characters, recognize the text information, and automatically extract the corresponding content, and finally reviewing it and entering it into the business system of the financial institution without human involvement in the entire process.
[0006] Among these technologies, character locating systems based on deep learning have become mainstream because of their robustness and accuracy, and are widely used in current image recognition. However, existing deep learning-based license document recognition systems usually cannot accurately locate the position of the characters in a license document; in particular, for characters that are not in a regular direction, the recognition rate is not high and the locating is inaccurate.
Invention Content
[0007] The purpose of the present invention is to provide a license document character locating method, so as to solve the low character locating accuracy of existing license document recognition systems.
[0008] The technical solution adopted by the present invention is as follows:
[0009] A license document character locating method, the method comprises:
[0010] after inputting the image to be tested into a deep learning model, outputting a spliced and fused feature map;
[0011] performing image difference calculation on the feature maps marked with different indexes to obtain a feature map difference value;
[0012] performing binarization calculation on the feature map difference value;
[0013] marking image connected components according to the binarization calculation result;
[0014] traversing all connected components, converting them into locating coordinates, and outputting a coordinate set.
[0015] Furthermore, converting the connected components into locating coordinates comprises:
[0016] calculating the relative value between the area of each element in the connected component set and the area of the feature map after image difference calculation;
[0017] cutting out the elements whose relative values are not greater than a pre-set threshold;
[0018] calculating the outer envelope contour of the cut-out elements;
[0019] performing pixel scaling error compensation on the cut-out elements to form a new outer envelope contour;
[0020] performing size transformation on the new outer envelope contour coordinates, converting the coordinate values into the coordinate system corresponding to the inputted image to be tested;
[0021] calculating the envelope coordinates of the minimum rotated rectangle and outputting the coordinates of each element in the connected component.
[0022] Furthermore, the deep learning model comprises a back-end model, a mid-section model, and a head model; the inputted image to be tested enters the head model after being processed by the back-end model and the mid-section model in turn, and the head model performs a 3-layer 1x1 convolution calculation on the entered feature map to form a 3-layer feature map with index marks.
[0023] Furthermore, the feature map output from the deep learning model is first sliced; the sigmoid function is calculated separately for the first index value and the second index value of the extracted sliced image; in the channel dimension, the difference between the first index value and the second index value is calculated; the difference is then zoomed in and adjusted to the pixel size of the previously zoomed image, and binarization calculation is performed on the pixels of the feature map.
[0024] Furthermore, after performing pixel scaling error compensation on a cut-out element, an outer envelope expansion is calculated to obtain a new outer envelope contour, wherein the new outer envelope contour can completely wrap the character edge.
[0025] Furthermore, according to the binarization calculation result, an 8-way connected component marking calculation is performed to obtain a connected component set, and all the connected components are sorted in descending order according to area.
[0026] Furthermore, before the inputted image to be tested enters the deep learning model, image scaling and preprocessing are performed first, the scaling being to the Nth power of 2.
[0027] In another aspect of the present invention, a license document character locating system is provided, comprising:
[0028] a feature map fusing module configured to splice and merge the feature map processed by the deep learning model;
[0029] an image difference calculation module configured to perform image difference calculation on the feature maps marked with different indexes to obtain a feature map difference value;
[0030] a binarization calculation module configured to perform binarization calculation on the feature map difference value;
[0031] a connected component marking module configured to mark image connected components according to the binarization calculation result to form a connected component set consisting of a plurality of connected components;
[0032] a locating coordinates conversion module configured to traverse all connected components, convert them into locating coordinates, and output a coordinate set.
[0033] Furthermore, the locating coordinates conversion module comprises:
[0034] a cutting out module configured to cut out the elements whose relative values are not greater than a pre-set threshold;
[0035] an outer envelope contour calculation module configured to calculate the outer envelope contour of the cut-out elements;
[0036] an error compensation module configured to perform pixel scaling error compensation on the cut-out elements to form a new outer envelope contour;
[0037] a size transformation module configured to perform size transformation on the new outer envelope contour coordinates, converting the coordinate values into the coordinate system corresponding to the inputted image to be tested;
[0038] a connected component coordinates calculation module configured to calculate the envelope coordinates of the minimum rotated rectangle and output the coordinates of each element in the connected component.
[0039] Compared with the prior art, the license document character locating method and system disclosed in the present invention, by outputting the fused feature map, calculating the feature map difference, compensating for image scaling error, expanding the character envelope, and taking the minimum rectangular envelope, achieves automatic license document character locating, supports character detection and recognition in any direction, and improves locating accuracy.
Drawing Description
[0040] Figure 1 is a process diagram of a license document character locating method in an implementation of the present invention.
[0041] Figure 2 is a process diagram of post-processing of the feature map in an implementation of the present invention.
[0042] Figure 3 is a structural diagram of deep learning model in an implementation of the present invention.
[0043] Figure 4 is an architecture diagram of a license document character locating system in an implementation of the present invention.
[0044] Figure 5 is an architecture diagram of a locating coordinates conversion module in an implementation of the present invention.
Specific Implementation Methods
[0045] The following describes the present invention in further detail with reference to the attached drawings, but it is not intended to limit the present invention.
[0046] In order to enable those skilled in the art to better understand the technical solutions of the present invention, the present invention will be described in further detail below with reference to the attached drawings and specific implementation methods.
[0047] Referring to Figure 1 to Figure 3, an implementation of the present invention discloses a license document character locating method, which comprises:

Date recue / Date received 2021-12-20
[0048] Step S1, scaling the image to be tested img0 so that its dimensions are a multiple of the Nth power of 2, preferably 32, and calculating the scaling rate scale of the image img0;
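As an illustrative sketch of Step S1 (the helper name `scaled_shape` and the upward rounding direction are assumptions; the disclosure only fixes the 2^N factor, preferably 32):

```python
import math

def scaled_shape(h, w, base=32):
    # Round each dimension up to the nearest multiple of base (2^5 = 32)
    # and return the new shape plus the per-axis scaling rates, which
    # are kept so that coordinates can be mapped back later (Step S107).
    new_h = math.ceil(h / base) * base
    new_w = math.ceil(w / base) * base
    return (new_h, new_w), (new_h / h, new_w / w)
```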
[0049] Step S2, inputting the scaled image to obtain the image img_scaled;
[0050] Step S3, performing image preprocessing on the inputted image, wherein the preprocessing includes edge detection, rotation and correction, quality evaluation, color processing, smoothing processing, etc. The main purpose of preprocessing the inputted image before it enters the model is to eliminate irrelevant information in the image, restore useful real information, enhance the detectability of relevant information, and simplify the data to the greatest extent, so as to improve the reliability of feature extraction, image segmentation, matching, and recognition.
[0051] Step S4, inputting the preprocessed image into the pretrained deep learning model, and the model outputs the spliced and fused feature map.
[0052] Referring to Figure 3, the deep learning model in the present implementation adopts a three-terminal structure, that is, a back-end model (Backbone), a mid-section model (Neck), and a head model (Head). The inputted preprocessed image passes through the back-end model, the mid-section model, and the head model in turn, and the corresponding feature map is then output.
[0053] Among them, the back-end model adopts a pretrained image classification model, which is mainly used to extract image features; the adopted model structure can be a VGG or ResNet structure, and the segmentation of image semantics adopts a Fully Convolutional Network (FCN) framework. Because a CNN (Convolutional Neural Network) is used, the scale of the finally extracted features becomes smaller: specifically, the length and width shrink while the number of channels grows. In order to bring the scale extracted by the CNN back to the size of the original image, the FCN uses up-sampling and deconvolution: the original image enters the network and passes through the VGG16 backbone to obtain the feature map, which is then up-sampled back to the original size; each pixel of the prediction result is then classified against the ground truth in a one-to-one correspondence. This pixel-level classification turns the segmentation problem into a classification problem, which is convenient for deep learning.
[0054] The mid-section model uses the deep learning segmentation network UNet. UNet includes two parts: the first part is feature extraction, where each pooling layer produces one of multiple scales; the second part is up-sampling, where each up-sampling step is fused (by splicing) with the feature map of the matching scale and channel count from the feature extraction part. Since the full-size original image cannot be input into the network during segmentation, it needs to be cut into small patches one by one; when cutting the image, the surrounding area needs to be included to provide texture and other information for the edge of the segmentation region.
[0055] The head model uses 32 layers of 3x3 convolutional layers and 3 layers of 1x1 convolutional layers. Index 0 represents the text-pixel binary classification feature map, index 1 represents the feature map of the text area envelope, and index 2 represents the feature map of the 2D Gaussian kernel distribution; the output feature map is the spliced and fused three-channel feature map of the above 3 layers of 1x1 convolutional layers. Using this head model structure can improve the model's learning ability and provide a detection basis for subsequent pixel-level character detection.
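A 1x1 convolution is simply a per-pixel linear map over channels, which is how each of the head model's three indexed output channels is produced. The following NumPy sketch illustrates this (the function name and shapes are assumptions for illustration, not the disclosed implementation):

```python
import numpy as np

def conv1x1(feature, weight, bias):
    # feature: (C_in, H, W); weight: (C_out, C_in); bias: (C_out,).
    # A 1x1 convolution mixes channels at each pixel independently,
    # so it reduces to a tensor contraction over the channel axis.
    out = np.tensordot(weight, feature, axes=([1], [0]))  # (C_out, H, W)
    return out + bias[:, None, None]
```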
[0056] After the three-stage inference of the detection model, the feature map enters the post-processing flow of the following steps.
[0057] Step S5, performing slicing processing on the feature map output from the deep learning model; since the feature maps of the first three channels are indexed, index 0 and index 1 are extracted respectively, the sigmoid function is calculated on index 0 of the feature map to obtain f_map_0, the sigmoid function is calculated on index 1 of the feature map to obtain f_map_1, and in the channel dimension, f_map_1 - f_map_0 is calculated to obtain the difference diff;
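Step S5 can be sketched as follows (NumPy; the (C, H, W) channel-first layout is an assumption consistent with the indexed channels described above):

```python
import numpy as np

def feature_diff(feature_map):
    # feature_map: (C, H, W) output of the head model; index 0 is the
    # text-pixel classification map, index 1 the text-region envelope map.
    # Apply the sigmoid to each slice, then subtract in the channel dimension.
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    f_map_0 = sigmoid(feature_map[0])
    f_map_1 = sigmoid(feature_map[1])
    return f_map_1 - f_map_0
```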

[0058] Step S6, adjusting the dimensions of the difference diff to the size of img_scaled, that is, enlarging the pixels of the image to form the adjusted image diff_scaled.
[0059] Step S7, performing a binarization calculation on diff_scaled; the threshold can be manually pre-set or set adaptively, and pixels larger than the threshold are set to 1, otherwise to 0. After the image is binarized, the gray value of each pixel is set to 0 or 255, so that the entire image presents an obvious visual effect of only black and white.
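A minimal sketch of the Step S7 binarization (the mean-value fallback is only an illustrative adaptive choice; the disclosure leaves the adaptive rule open):

```python
import numpy as np

def binarize(diff_scaled, threshold=None):
    # Pixels strictly above the threshold become 1, the rest 0.
    # The threshold may be pre-set manually or chosen adaptively;
    # here the mean value stands in for an adaptive rule.
    if threshold is None:
        threshold = float(diff_scaled.mean())
    return (diff_scaled > threshold).astype(np.uint8)
```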
[0060] Step S8, performing an 8-connectivity connected component marking calculation: if a pixel x has a same-valued pixel y in any of the 8 directions up, down, left, right, upper left, upper right, lower left, and lower right, then x and y are considered connected. In this way, the image is divided into a plurality of polygonal regions; the shapes of these regions may be the same or different, and their areas may be the same or different. Finally, the connected component region set region_list is obtained.
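The 8-connectivity marking of Step S8 can be sketched with a simple flood fill (production code would typically use an optimized labelling routine; this illustrative version shows the connectivity rule directly):

```python
import numpy as np
from collections import deque

def label_8_connected(binary):
    # Flood-fill labelling of 8-connected foreground regions.
    # Returns a label image (0 = background) and region_list, the list
    # of pixel coordinates belonging to each connected component.
    h, w = binary.shape
    labels = np.zeros((h, w), dtype=np.int32)
    region_list = []
    for sy in range(h):
        for sx in range(w):
            if binary[sy, sx] and labels[sy, sx] == 0:
                label = len(region_list) + 1
                labels[sy, sx] = label
                queue, pixels = deque([(sy, sx)]), [(sy, sx)]
                while queue:
                    y, x = queue.popleft()
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and binary[ny, nx] and labels[ny, nx] == 0):
                                labels[ny, nx] = label
                                queue.append((ny, nx))
                                pixels.append((ny, nx))
                region_list.append(pixels)
    return labels, region_list
```

The descending area sort of Step S9 is then simply `region_list.sort(key=len, reverse=True)`.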
[0061] Step S9, sorting the regions of all elements in region_list in descending order of area, so that the region with the largest area is ranked first and the region with the smallest area last; in this way, subsequent processes will asynchronously and preferentially process the coordinates of the region with the largest area, reducing system wait time and improving efficiency.
[0062] Step S10, for each element in the connected component region set region_list, performing the conversion from connected component to locating coordinates, wherein this step specifically includes the following:
[0063] Step S101, calculating the relative area value between each regional element and diff_scaled;

[0064] Step S102, ignoring elements whose relative area is greater than a preset threshold;
[0065] Step S103, for elements whose relative area is less than or equal to the preset threshold, cutting out the elements to obtain cut_img;
[0066] Step S104, calculating the outer envelope contour (convex hull) of the cut-out image cut_img;
[0067] Step S105, image scaling error compensation. In actual image processing, pixel positions will deviate, and if the subsequent coordinate conversion and magnification were entered directly without this calculation, the error would be magnified. In this step, error compensation for image scaling is performed first, so that the coordinate position of each pixel is closer to the actual pixel position; even if coordinate expansion or size enlargement is subsequently performed, the accuracy of each pixel's position is guaranteed.
[0068] Step S106, performing an envelope expansion calculation to obtain the envelope coordinates of the character position. Through the outer envelope expansion calculation, the outer envelope contour is expanded outwards to cover the entire character as far as possible.
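At pixel level, the envelope expansion of Step S106 can be approximated by binary dilation; this NumPy sketch grows the region by one pixel in every direction per pass (the iteration count is an assumption, and real implementations may instead offset the polygon contour directly):

```python
import numpy as np

def dilate(mask, iterations=1):
    # 8-neighbourhood binary dilation: a pixel becomes foreground if any
    # of its 8 neighbours (or itself) is foreground.
    out = mask.astype(bool)
    for _ in range(iterations):
        padded = np.pad(out, 1)
        grown = np.zeros_like(out)
        h, w = out.shape
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                grown |= padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
        out = grown
    return out.astype(np.uint8)
```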
[0069] Step S107, performing size transformation on the character envelope coordinates and converting the coordinate values into the coordinate system corresponding to img0; since the image to be tested was scaled before entering the deep learning model and its dimensions were changed, this step restores the coordinates to the original size.
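Step S107 reduces to dividing out the scaling rates recorded before inference (a sketch; the per-axis scale parameters are an assumption, since the disclosure only names a single scaling rate):

```python
def to_original_coords(coords, scale_y, scale_x):
    # Map (x, y) envelope coordinates from the img_scaled coordinate
    # system back to the original image img0 by dividing out the scale.
    return [(x / scale_x, y / scale_y) for (x, y) in coords]
```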
[0070] Step S108, calculating the coordinates of the smallest rotated rectangle envelope, which are used as the final output coordinates of the character region set represented by the connected component; since the previous character envelope is a polygon, in order to facilitate subsequent computer recognition processing, the coordinates of the smallest rectangular envelope are calculated; these rectangles can have different angles according to different envelope contours.
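A minimum rotated rectangle can be computed by taking the convex hull and, for each hull edge, the axis-aligned bounding box in a frame rotated to that edge (the rotating-calipers idea: the minimum-area enclosing rectangle always has one side collinear with a hull edge). This NumPy sketch uses illustrative names and is not the disclosed implementation:

```python
import numpy as np

def _cross(o, a, b):
    # z-component of (a - o) x (b - o)
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    # Andrew's monotone chain; returns hull vertices in counter-clockwise order.
    pts = sorted(set(map(tuple, points)))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for seq, out in ((pts, lower), (pts[::-1], upper)):
        for p in seq:
            while len(out) >= 2 and _cross(out[-2], out[-1], p) <= 0:
                out.pop()
            out.append(p)
    return lower[:-1] + upper[:-1]

def min_rotated_rect(points):
    # Test one candidate rotation per hull edge and keep the smallest box.
    hull = np.array(convex_hull(points), dtype=float)
    best_area, best_corners = None, None
    for i in range(len(hull)):
        edge = hull[(i + 1) % len(hull)] - hull[i]
        theta = -np.arctan2(edge[1], edge[0])
        rot = np.array([[np.cos(theta), -np.sin(theta)],
                        [np.sin(theta),  np.cos(theta)]])
        proj = hull @ rot.T                      # points in the edge-aligned frame
        lo, hi = proj.min(axis=0), proj.max(axis=0)
        area = (hi[0] - lo[0]) * (hi[1] - lo[1])
        if best_area is None or area < best_area:
            corners = np.array([[lo[0], lo[1]], [hi[0], lo[1]],
                                [hi[0], hi[1]], [lo[0], hi[1]]])
            best_area, best_corners = area, corners @ rot  # rotate back
    return best_corners
```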
[0071] Step S109, repeating the above steps of S101 to S108, until the coordinates of all elements in each connected component set are output.
[0072] Step S11, removing empty coordinates and returning the coordinate set of all elements representing character positions, completing the whole process of character position coordinate detection.
[0073] Step S12, after the character coordinate positions are detected, assigning them to different processes to recognize all characters in the frames.
[0074] Compared with the prior art, the license document character locating method and system disclosed in the present invention, by outputting the fused feature map, calculating the feature map difference, compensating for image scaling error, expanding the character envelope, and taking the minimum rectangular envelope, achieves automatic license document character locating, supports character detection and recognition in any direction, and improves locating accuracy.
[0075] Corresponding to the method in the above-mentioned implementations, and with reference to Figures 4 and 5, another implementation of the present invention also provides a license document character locating system, which includes:
[0076] a feature map fusing module configured to splice and merge the feature map processed by the deep learning model;
[0077] an image difference calculation module configured to perform image difference calculation on the feature maps marked with different indexes to obtain a feature map difference value;
[0078] a binarization calculation module configured to perform binarization calculation on the feature map difference value;
[0079] a connected component marking module configured to mark image connected components according to the binarization calculation result to form a connected component set consisting of a plurality of connected components;
[0080] a locating coordinates conversion module configured to traverse all connected components, convert them into locating coordinates, and output a coordinate set.
[0081] Wherein, the locating coordinates conversion module comprises:
[0082] a cutting out module configured to cut out the elements whose relative values are not greater than a pre-set threshold;
[0083] an outer envelope contour calculation module configured to calculate the outer envelope contour of the cut-out elements;
[0084] an error compensation module configured to perform pixel scaling error compensation on the cut-out elements to form a new outer envelope contour;
[0085] a size transformation module configured to perform size transformation on the new outer envelope contour coordinates, converting the coordinate values into the coordinate system corresponding to the inputted image to be tested;
[0086] a connected component coordinates calculation module configured to calculate the envelope coordinates of the minimum rotated rectangle and output the coordinates of each element in the connected component.

[0087] The implementation of the present invention discloses a license document character locating method and system which, by means of the feature map fusing module, image difference calculation module, binarization calculation module, connected component marking module, and locating coordinates conversion module, achieves automatic license document character locating, supports character detection and recognition in any direction, and improves locating accuracy.
[0088] The specific execution steps of the above-mentioned modules have been described in detail in the method implementations; for details not described in this implementation, please refer to the above-mentioned method implementations.
[0089] The above description shows and describes several preferred implementations of the present invention, but, as mentioned above, it should be understood that the present invention is not limited to the form disclosed herein and should not be regarded as excluding other implementations. It can be used in various other combinations, modifications, and environments, and can be modified through the above teachings or through technology or knowledge in related fields within the scope of the inventive concept described herein. Modifications and changes made by those skilled in the art that do not depart from the spirit and scope of the present invention shall fall within the protection scope of the appended claims of the present invention.


Claims (10)

Claims:
1. A license document character locating method, the method comprising:
after inputting an image to be tested into a deep learning model, outputting a spliced and fused feature map;
performing image difference calculation on feature maps marked with different indexes to obtain a feature map difference value;
performing binarization calculation on the feature map difference value;
marking image connected components according to the binarization calculation result; and traversing all connected components, converting them into locating coordinates, and outputting a coordinate set.
2. The character locating method according to claim 1, wherein converting the connected components into locating coordinates comprises:
calculating the relative value between the area of each element in the connected component set and the area of the feature map after image difference calculation;
cutting out the elements whose relative values are not greater than a pre-set threshold;
calculating the outer envelope contour of the cut-out elements;
performing pixel scaling error compensation on the cut-out elements to form a new outer envelope contour;
performing size transformation on the new outer envelope contour coordinates, converting the coordinate values into the coordinate system corresponding to the inputted image to be tested;
and calculating the envelope coordinates of the minimum rotated rectangle and outputting the coordinates of each element in the connected component.
3. The character locating method according to claim 1 or claim 2, wherein the deep learning model comprises a back-end model, a mid-section model, and a head model; the inputted image to be tested enters the head model after being processed by the back-end model and the mid-section model in turn, and the head model performs a 3-layer 1x1 convolution calculation on the entered feature map to form a 3-layer feature map with index marks.
4. The character locating method according to claim 3, wherein the feature map output from the deep learning model is first sliced; according to the first index value and the second index value of the extracted sliced image, the sigmoid function is calculated separately; in the channel dimension, the difference between the first index value and the second index value is calculated; the difference is then zoomed in and adjusted to the pixel size of the previously zoomed image, and binarization calculation is performed on the pixels of the feature map.
5. The character locating method according to claim 2, wherein after performing pixel scaling error compensation on a cut-out element, an outer envelope expansion is calculated to obtain a new outer envelope contour, wherein the new outer envelope contour can completely wrap the character edge.
6. The character locating method according to claim 4, wherein, according to the binarization calculation result, an 8-way connected component marking calculation is performed to obtain a connected component set, and all the connected components are sorted in descending order according to area.
7. The character locating method according to claim 3, wherein, before the inputted image to be tested enters the deep learning model, image scaling and preprocessing are performed first, the scaling being to the Nth power of 2.
8. The character locating method according to claim 4, wherein the first index value is marked as the feature map of the character pixels' binary classification, and the second index value is marked as the feature map of the character area's envelope.
9. A license document character locating system, comprising:
a feature map fusing module configured to splice and merge the feature map processed by the deep learning model;
an image difference calculation module configured to perform image difference calculation on feature maps marked with different indexes to obtain a feature map difference value;
a binarization calculation module configured to perform binarization calculation on the feature map difference value;
a connected component marking module configured to mark image connected components according to the binarization calculation result to form a connected component set consisting of a plurality of connected components; and a locating coordinates conversion module configured to traverse all connected components, convert them into locating coordinates, and output a coordinate set.
10. The character locating system according to claim 9, wherein the locating coordinates conversion module comprises:
a cutting out module configured to cut out the elements whose relative values are not greater than a pre-set threshold;
an outer envelope contour calculation module configured to calculate the outer envelope contour of the cut-out elements;
an error compensation module configured to perform pixel scaling error compensation on the cut-out elements to form a new outer envelope contour;
a size transformation module configured to perform size transformation on the new outer envelope contour coordinates, converting the coordinate values into the coordinate system corresponding to the inputted image to be tested; and a connected component coordinates calculation module configured to calculate the envelope coordinates of the minimum rotated rectangle and output the coordinates of each element in the connected component.

CA3135111A 2020-10-20 2021-10-20 Character positioning method and system for certificate document Pending CA3135111A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011127259.4 2020-10-20
CN202011127259.4A CN112364863B (en) 2020-10-20 2020-10-20 Character positioning method and system for license document

Publications (1)

Publication Number Publication Date
CA3135111A1 true CA3135111A1 (en) 2022-04-20

Family

ID=74510931

Family Applications (1)

Application Number Title Priority Date Filing Date
CA3135111A Pending CA3135111A1 (en) 2020-10-20 2021-10-20 Character positioning method and system for certificate document

Country Status (2)

Country Link
CN (1) CN112364863B (en)
CA (1) CA3135111A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113065561A (en) * 2021-03-15 2021-07-02 国网河北省电力有限公司 Scene text recognition method based on fine character segmentation
CN114463376B (en) * 2021-12-24 2023-04-25 北京达佳互联信息技术有限公司 Video text tracking method and device, electronic equipment and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8358827B2 (en) * 2010-02-23 2013-01-22 Rdm Corporation Optical waveform generation and use based on print characteristics for MICR data of paper documents
CN108596066B (en) * 2018-04-13 2020-05-26 武汉大学 Character recognition method based on convolutional neural network
CN111563505A (en) * 2019-02-14 2020-08-21 北京奇虎科技有限公司 Character detection method and device based on pixel segmentation and merging
CN110263610A (en) * 2019-02-28 2019-09-20 重庆大学 A kind of degeneration file and picture binary coding method and system based on deep learning

Also Published As

Publication number Publication date
CN112364863B (en) 2022-10-28
CN112364863A (en) 2021-02-12

Similar Documents

Publication Publication Date Title
CN108549893B (en) End-to-end identification method for scene text with any shape
Sun et al. Template matching-based method for intelligent invoice information identification
CN104751142B (en) A kind of natural scene Method for text detection based on stroke feature
CA3135111A1 (en) Character positioning method and system for certificate document
CN114529459B (en) Method, system and medium for enhancing image edge
Chen et al. Shadow-based Building Detection and Segmentation in High-resolution Remote Sensing Image.
CN110751154B (en) Complex environment multi-shape text detection method based on pixel-level segmentation
CN113158895B (en) Bill identification method and device, electronic equipment and storage medium
CN111223084A (en) Chromosome cutting data processing method, system and storage medium
CN111680690A (en) Character recognition method and device
CN113139543A (en) Training method of target object detection model, target object detection method and device
CN113657409A (en) Vehicle loss detection method, device, electronic device and storage medium
CN105184225A (en) Multinational paper money image identification method and apparatus
CN112800955A (en) Remote sensing image rotating target detection method and system based on weighted bidirectional feature pyramid
CN112364834A (en) Form identification restoration method based on deep learning and image processing
CN114417993A (en) Scratch detection method based on deep convolutional neural network and image segmentation
CN115331245A (en) Table structure identification method based on image instance segmentation
CN112883926A (en) Identification method and device for table medical images
CN113221895A (en) Small target detection method, device, equipment and medium
US11906441B2 (en) Inspection apparatus, control method, and program
CN114841974A (en) Nondestructive testing method and system for internal structure of fruit, electronic equipment and medium
CN114581928A (en) Form identification method and system
CN112200789A (en) Image identification method and device, electronic equipment and storage medium
CN112418210A (en) Intelligent classification method for tower inspection information
CN115631197A (en) Image processing method, device, medium, equipment and system