CN117710988A

CN117710988A - Seal bending text line correction method, device and system

Info

Publication number: CN117710988A
Application number: CN202410003543.2A
Authority: CN
Inventors: 沈达伟; 王勇; 朱军民; 王立刚; 孙朗
Original assignee: Beijing Yidao Boshi Technology Co ltd
Current assignee: Beijing Yidao Boshi Technology Co ltd
Priority date: 2024-01-02
Filing date: 2024-01-02
Publication date: 2024-03-15

Abstract

The invention discloses a method, a device and a system for correcting a curved text line of a seal, and relates to the field of computer vision. The method comprises the following steps: receiving a seal image, and performing text segmentation on the seal image to obtain a text line mask, a sub-character mask and a curved text line first character mask; dividing all text lines into straight text lines and curved text lines according to the three masks, and forming a curved text line example for the curved text lines; sorting all the sub-character boxes contained in the curved text line instance; and performing perspective transformation, cutting and splicing on all the sub-character boxes contained in the curved text line instance according to the sequencing result to obtain corrected curved text line content. According to the technical scheme, for any seal image, all the bent text lines can be segmented and corrected so as to be convenient for subsequent identification, and the method has the characteristics of universality, high robustness and high precision.

Description

Seal bending text line correction method, device and system

Technical Field

The invention relates to the field of computer vision, in particular to a method, a device and a system for correcting curved text lines of a seal.

Background

Seal recognition is of great value in many image information extraction services. A technical pain point of seal recognition is the correction and recognition of curved text lines in circular seals.

Current approaches to recognizing such curved text lines fall into two groups: end-to-end recognition schemes, and cascading schemes based on text line segmentation, correction and recognition. Because the recognition performance of the end-to-end scheme still lags well behind that of the cascading scheme, the cascading scheme is currently the mainstream in the industry. Within the cascading scheme, correction of the curved text line is the pain point of the whole pipeline.

For curved text line correction, the industry currently follows two routes. The first route straightens the curved text line using a TPS (thin plate spline) transformation; its disadvantage is that the TPS transformation deforms the characters, which degrades subsequent character recognition. The second route is character segmentation and combination: each character in the curved text line is cut out, and the characters are spliced into a straight text line, which corrects the curvature while avoiding character distortion. However, because of two fatal defects of the second route, the TPS transformation of the first route is still preferred in the industry for curved text line correction. The effect of the TPS transformation method, including the character deformation, is shown in Fig. 1.

Specifically, the two fatal defects of the mainstream method of the second route are as follows:

1. for seals containing two or more curved text lines, it cannot distinguish between different text line instances (i.e., it cannot determine to which text line a character belongs);

2. the curved text lines of all seals must be arranged in one uniform direction (all clockwise or all counter-clockwise).

Disclosure of Invention

To solve the above problems, the technical scheme of the invention provides a method, a device and a system for correcting curved text lines of a seal. For any seal image, all curved text lines can be segmented and corrected for subsequent recognition, and the method is universal, highly robust and highly accurate. The fatal defects of the mainstream method of the second route are thereby overcome, while the character deformation caused by the TPS transformation is avoided.

According to a first aspect of the present invention, there is provided a method for correcting a curved text line of a seal, the seal including a straight text line and/or a curved text line, wherein the method for correcting a curved text line of a seal includes:

s1, text segmentation: receiving a seal image, and performing text segmentation on the seal image to obtain a text line mask, a sub-character mask and a curved text line first character mask;

s2, an example construction step: dividing all text lines into straight text lines and curved text lines according to the three masks, and forming a curved text line example for the curved text lines;

s3, sequencing the characters: sorting all the sub-character boxes contained in the curved text line instance;

s4, cutting and splicing the characters: and performing perspective transformation, cutting and splicing on all the sub-character boxes contained in the curved text line instance according to the sequencing result to obtain corrected curved text line content.

Further, in the S1 text segmentation step, a real-time scene text detection algorithm model with differentiable binarization (Real-time Scene Text Detection with Differentiable Binarization, DBNet) is adopted to perform the text segmentation.

Further, in the step of S1 text segmentation, the real-time scene text detection algorithm model includes three prediction heads with the same structure, which are respectively used for outputting the text line mask, the sub-character mask and the curved text line first character mask.

Further, the S1 text segmentation step further includes: in post-processing, expanding the shrunken text line mask, sub-character mask and curved text line first character mask back to the original instance size.

Further, the step of constructing the S2 example specifically includes:

S21: determining text line coordinates (a coordinate representation of the text line outline) from the text line mask;

S22: mapping the text line coordinates onto the curved text line first character mask and determining whether a character exists at the corresponding first character position; if so, the text line is a curved text line; if not, the text line is a straight text line (for which no instance is constructed);

S23: for each curved text line, finding the outline of each sub-character from the sub-character mask and computing its minimum circumscribed rectangular frame, thereby obtaining the sub-character frame of each sub-character and its coordinates;

S24: taking the sub-character frame coordinates and the first character frame coordinates of the curved text line as the curved text line instance.

Further, the step of sorting the S3 characters specifically includes:

S31: sorting all the characters contained in each curved text line instance according to the information contained in the curved text line instance, to obtain character ordering information;

S32: sorting the vertexes of all the characters contained in each curved text line instance according to the information contained in the curved text line instance, to obtain character vertex ordering information.

Further, the step S31 specifically includes:

among the sub-character frame coordinates of each curved text line instance, finding the sub-character frame that overlaps the first character frame of the curved text line, and taking it as the first character; the character closest to the first character among the remaining characters is the second character; the character closest to the second character among the remaining characters is the third character; and so on until only one character remains, which is the last character;

thereby obtaining character ordering information.

Further, in the step S31, the character distance is obtained according to the straight line distance between the midpoints of the two character frames.

Further, the step S32 specifically includes:

s321: for each curved text line instance, calculating the midpoints of the sub-character boxes of all the characters according to the character ordering information;

S322: constructing a forward direction vector from the midpoint of the first sub-character frame to the midpoint of the second sub-character frame; for each of the 4 vertexes of the first sub-character frame, constructing a direction vector from the midpoint of the frame to that vertex;

S323: calculating the vector cross product and the vector dot product of the forward direction vector and the direction vector of each vertex, and determining from the result whether the vertex is the lower left vertex, the upper left vertex, the upper right vertex or the lower right vertex;

S324: arranging the vertexes of the sub-character frame in clockwise order with the lower left vertex as the starting point, thereby ordering the vertexes of the first sub-character frame;

S325: by analogy, using the second and third sub-character frames to order the vertexes of the second sub-character frame, and so on up to the last sub-character frame, for which the forward direction vector is formed from the preceding sub-character frame to the last sub-character frame, thereby obtaining the character vertex ordering information.

Further, in the step S323, whether a vertex is the lower left, upper left, upper right or lower right vertex is determined according to the following principles:

if the cross product is less than zero and the dot product is less than zero, the vertex is the upper left vertex;

if the cross product is less than zero and the dot product is greater than zero, the vertex is the upper right vertex;

if the cross product is greater than zero and the dot product is less than zero, the vertex is the lower left vertex;

if the cross product is greater than zero and the dot product is greater than zero, then the vertex is the lower right vertex.

Further, the step of cutting and splicing the S4 character specifically comprises the following steps:

setting a fixed text line target height, and scaling the target width of each character according to the aspect ratio of the character;

for each curved text line instance, cutting out each target character area on the seal image according to the character vertex ordering information by perspective transformation at the text line target height, and splicing the target character areas into text lines according to the character ordering information, thereby obtaining corrected curved text line content.

Further, in the step S4, the text line target height is 32 pixels or 48 pixels.

Further, in the step S4, before the perspective transformation, each sub-character frame is expanded to the left and right by 0.1 character width, so that the characters in the spliced text line are not so close together that they stick to one another.
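
As a non-limiting illustration of the perspective transformation in step S4, the mapping of one sub-character frame onto its target rectangle can be written as a planar homography. The symbols below ($H$ with entries $h_{ij}$, the target height $h_t$, the target width $w_t$, the measured character width $w_c$ and height $h_c$) and the choice of target corner coordinates are notation assumed here for illustration; they do not appear in the original text.

```latex
% Target size: fixed height h_t (for example 32 or 48 pixels),
% width scaled by the aspect ratio of the character frame
w_t = h_t \cdot \frac{w_c}{h_c}

% Ordered source vertexes (lower left, upper left, upper right, lower right)
% are mapped to the corners of the w_t x h_t target rectangle
% (image convention: origin at the top left, y axis pointing down)
(x_1, y_1) \mapsto (0, h_t), \quad
(x_2, y_2) \mapsto (0, 0), \quad
(x_3, y_3) \mapsto (w_t, 0), \quad
(x_4, y_4) \mapsto (w_t, h_t)

% Homography H (3x3, defined up to scale) applied to a source point (x, y)
u = \frac{h_{11} x + h_{12} y + h_{13}}{h_{31} x + h_{32} y + h_{33}}, \qquad
v = \frac{h_{21} x + h_{22} y + h_{23}}{h_{31} x + h_{32} y + h_{33}}
```

The four vertex correspondences give eight linear equations, which determine the eight degrees of freedom of $H$. Because the source vertexes are supplied in the order produced by step S32, the cut character comes out upright regardless of where it sits on the curve.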

According to a second aspect of the present invention, there is provided a seal curved text line correction apparatus, the apparatus operating based on the seal curved text line correction method according to any one of the above aspects and comprising:

the text segmentation unit is used for receiving the seal image, and performing text segmentation on the seal image to obtain a text line mask, a sub-character mask and a curved text line first character mask;

an example construction unit for dividing all text lines into straight text lines and curved text lines according to the above three masks, and constructing a curved text line example for the curved text lines;

a character ordering unit, configured to order all the sub-character boxes included in the curved text line instance;

and the character cutting and splicing unit is used for performing perspective transformation, cutting and splicing on all the sub-character frames contained in the curved text line example according to the sequencing result to obtain corrected curved text line content.

According to a third aspect of the present invention there is provided a seal curved text line correction system, the system comprising: a processor and a memory for storing executable instructions; wherein the processor is configured to execute the executable instructions to perform the seal-bending text line correction method as described in any of the above aspects.

According to a fourth aspect of the present invention, there is provided a computer-readable storage medium, characterized in that it has stored thereon a computer program which, when executed by a processor, implements the seal-bending text line correction method according to any one of the above aspects.

The invention has the beneficial effects that:

1. by adopting character-level segmentation, correction and splicing, each character keeps its original shape and structure, which effectively avoids the drop in subsequent character recognition rate caused by the obvious character deformation of the current mainstream TPS transformation;

2. by adopting a text segmentation module that outputs three different kinds of information, combined with an instance construction module, every curved text line in the seal can be constructed into an independent curved text line instance, so that correction of multiple curved text lines in one seal is supported and the inability of existing methods to distinguish different instances is overcome;

3. by adopting the character ordering module, combined with the first character information predicted by the text segmentation module, the first character and the character order of each curved text line instance can be determined adaptively, so that any arrangement direction of characters is supported, in particular the case where several curved text lines in the same seal run in inconsistent directions, which greatly broadens the range of application.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to the structures shown in these drawings without inventive effort for a person skilled in the art.

Figure 1 shows a prior art effect diagram of a TPS transformation method.

Fig. 2 shows a flow chart of a method according to an embodiment of the solution of the invention.

Fig. 3 shows a schematic program structure according to an embodiment of the present invention.

Fig. 4 shows a schematic diagram of a mask prediction result of a text segmentation module according to an embodiment of the present invention.

Fig. 5 shows a comparison chart of the identification effect of the technical scheme of the invention and the TPS transformation method.

The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.

Detailed Description

Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present disclosure as detailed in the accompanying claims.

The terms first, second and the like in the description and in the claims, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the disclosure described herein may be capable of operation in sequences other than those illustrated or described herein, for example.

Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

"A plurality" means two or more.

The term "and/or" used in this disclosure merely describes an association relationship between associated objects, and indicates that three relationships may exist. For example, A and/or B may represent: A exists alone, A and B exist together, or B exists alone.

The technical scheme of the invention firstly provides a method for correcting a bent text line of a seal, wherein the seal comprises a straight text line and a bent text line, and the method for correcting the bent text line of the seal comprises the following steps:

In the step of S1 text segmentation, a DBNet algorithm model is adopted for text segmentation.

In the S1 text segmentation step, the DBNet algorithm model comprises three prediction heads with the same structure, which are respectively used for outputting the text line mask, the sub-character mask and the curved text line first character mask.

The S1 text segmentation step further comprises: in post-processing, expanding the shrunken text line mask, sub-character mask and curved text line first character mask back to the original instance size.

The step of S2 example construction specifically comprises the following steps:

s21: determining text line coordinates according to the text line mask;

S22: mapping the text line coordinates onto the curved text line first character mask and determining whether a character exists at the corresponding first character position; if so, the text line is a curved text line; if not, the text line is a straight text line;

S23: for each curved text line, finding the outline of each sub-character from the sub-character mask and computing its minimum circumscribed rectangular frame, thereby obtaining the sub-character frame of each sub-character and its coordinates;

S24: taking the sub-character frame coordinates and the first character frame coordinates of the curved text line as the curved text line instance.

The step of S3 character sequencing specifically comprises the following steps:

S31: sorting all the characters contained in each curved text line instance according to the information contained in the curved text line instance, to obtain character ordering information;

The step S31 specifically includes:

thereby obtaining character ordering information.

In step S31, a character distance is obtained according to the straight line distance between the midpoints of the two character frames.

The step S32 specifically includes:

s321: for each curved text line instance, calculating midpoints of sub-character boxes of all characters according to the character ordering information;

S322: constructing a forward direction vector from the midpoint of the first sub-character frame to the midpoint of the second sub-character frame; for each of the 4 vertexes of the first sub-character frame, constructing a direction vector from the midpoint of the frame to that vertex;

S323: calculating the vector cross product and the vector dot product of the forward direction vector and the direction vector of each vertex, and determining from the result whether the vertex is the lower left vertex, the upper left vertex, the upper right vertex or the lower right vertex;

s324: sequentially arranging the vertexes of the sub-character frames in a clockwise direction, so that the vertex sequence ordering of the first sub-character frame is realized;

In step S323, whether a vertex is the lower left, upper left, upper right or lower right vertex is determined according to the following principles:

The S4 character cutting and splicing step specifically comprises the following steps:

for each curved text line instance, cutting out a target character area on the seal image by perspective transformation at the text line target height according to the character vertex ordering information, and splicing the text lines according to the character ordering information, thereby obtaining corrected curved text line content.

In step S4, the text line target height is 32 pixels or 48 pixels.

In step S4, before the perspective transformation, each sub-character frame is first expanded to the left and right by 0.1 character width.

The technical scheme of the invention also provides a seal curved text line correction device, which operates based on the seal curved text line correction method according to any one of the above aspects and comprises:

the character ordering unit is used for ordering all the sub-character boxes contained in the curved text line instance;

The technical scheme of the invention also provides a seal bending text line correction system, which comprises: a processor and a memory for storing executable instructions; wherein the processor is configured to execute executable instructions to perform the seal-bending text line correction method as in any of the above aspects.

The technical scheme of the invention further provides a computer readable storage medium, which is characterized in that a computer program is stored on the computer readable storage medium, and when the computer program is executed by a processor, the method for correcting the curved text line of the seal according to any aspect is realized.

Examples

According to the embodiment of the invention, a seal image is received and first input into the text segmentation module to obtain a text line mask, a sub-character mask and a curved text line first character mask. All text lines (both curved and straight) are then structured into respective text line instances according to these three masks, where each instance contains the sub-character frame coordinates and, for a curved text line, the first character frame coordinates. Whether a text line is straight or curved can then be judged by whether its instance contains a valid curved text line first character. Straight text lines need no correction; for each curved text line instance, the sub-character frames it contains are ordered using the present method. The sub-characters of the instance are then cut out one by one by perspective transformation according to the ordering result, and spliced to obtain the final corrected text line. Fig. 2 is a schematic flow chart of the method. Fig. 3 is a schematic diagram of a program structure according to an embodiment of the present invention.

Text segmentation module

The invention uses the DBNet algorithm model, which is mainstream in the industry, to perform the text segmentation task; other text segmentation algorithms, such as PixelLink, can achieve similar effects. The native DBNet algorithm has only one prediction head, which outputs a text line mask. On this basis, 2 additional prediction heads with an identical structure are added to output the sub-character mask and the curved text line first character mask. The text line mask predicts the instance pixels of each text line, the sub-character mask predicts the instance pixels of each character in the image, and the curved text line first character mask predicts the instance pixels of the starting character of each curved text line in the image. A schematic diagram is shown in Fig. 4. Because the segmentation output of DBNet is subject to a shrink ratio, the instance masks output by the network are thinner than the true instances; the masks can be expanded back to the original instance size in the post-processing stage of DBNet.
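
As a non-limiting illustration of the expansion step, the following sketch computes the dilation offset in the form used by the published DBNet post-processing, offset = area x ratio / perimeter of the shrunken polygon. The function names and the default `unclip_ratio` of 1.5 (the value reported in the DBNet paper) are assumptions for illustration; the present text does not state which ratio is used for the three masks.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_area(poly: List[Point]) -> float:
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(poly: List[Point]) -> float:
    """Sum of the edge lengths of the polygon."""
    n = len(poly)
    return sum(math.dist(poly[i], poly[(i + 1) % n]) for i in range(n))


def unclip_offset(shrunk_poly: List[Point], unclip_ratio: float = 1.5) -> float:
    """Distance by which a shrunken contour is dilated (assumed ratio, see lead-in)."""
    return polygon_area(shrunk_poly) * unclip_ratio / polygon_perimeter(shrunk_poly)


# A 10 x 4 shrunken region: area 40, perimeter 28.
shrunk = [(0.0, 0.0), (10.0, 0.0), (10.0, 4.0), (0.0, 4.0)]
offset = unclip_offset(shrunk)
```

The offset only says how far each contour is pushed outwards; the dilation itself would be carried out by a polygon offsetting routine applied to each contour of each of the three masks.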

Instance construction module

The three expanded masks obtained from the text segmentation module are sent to the instance construction module.

First, the contours of all text lines are found from the text line mask. The text lines found then need to be divided into two categories, curved text lines and straight text lines. Since the curved text line first character mask contains the first character information of all curved text lines, for each text line contour region it is checked whether a first character is present on the curved text line first character mask at the position corresponding to that region. If one is present, the text line is a curved text line; if not, the text line is a straight text line.
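
The curved/straight decision described above can be sketched as follows. This is a minimal illustration under stated assumptions: masks are binary 2D lists indexed as `mask[row][col]`, each text line region is given as a list of `(row, col)` pixels, and the `min_pixels` threshold and all function names are hypothetical rather than taken from the original text.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int]  # (row, col)


def has_first_char(region: Sequence[Pixel],
                   first_char_mask: Sequence[Sequence[int]],
                   min_pixels: int = 1) -> bool:
    """True if the first character mask is set on enough pixels of the region."""
    hits = sum(1 for r, c in region if first_char_mask[r][c])
    return hits >= min_pixels


def split_text_lines(line_regions: Sequence[Sequence[Pixel]],
                     first_char_mask: Sequence[Sequence[int]]
                     ) -> Tuple[List[int], List[int]]:
    """Return (indices of curved text lines, indices of straight text lines)."""
    curved, straight = [], []
    for idx, region in enumerate(line_regions):
        if has_first_char(region, first_char_mask):
            curved.append(idx)
        else:
            straight.append(idx)
    return curved, straight


# 4 x 4 first character mask with a single predicted first character pixel.
first_char_mask = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
line_regions = [
    [(0, 0), (0, 1), (1, 1)],  # covers the first character pixel
    [(3, 2), (3, 3)],          # no first character underneath
]
curved_ids, straight_ids = split_text_lines(line_regions, first_char_mask)
```

A threshold larger than one pixel may be preferable in practice to tolerate noise in the predicted mask; that choice is left open here.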

For each curved text line, all valid sub-characters on the sub-character mask at the position corresponding to the text line outline are obtained, and the outline of each sub-character is found and its minimum circumscribed rectangular frame computed. The curved text line outline, the first character frame of the curved text line, and all the sub-character frames of the curved text line are then assembled into a curved text line instance.

Straight text lines are outside the scope of the present discussion, since they can be recognized directly without additional correction.
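
One possible way to hold and assemble such an instance is sketched below. The class name `CurvedTextLineInstance`, its field names, and the membership rule (a sub-character frame belongs to the text line whose outline contains the frame's midpoint) are assumptions made for illustration; the original text only states that the sub-characters at the position corresponding to the text line outline are taken.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float]
Box = List[Point]  # 4 vertexes of a (possibly rotated) rectangular frame


@dataclass
class CurvedTextLineInstance:
    outline: List[Point]
    first_char_box: Box
    sub_char_boxes: List[Box] = field(default_factory=list)


def box_center(box: Box) -> Point:
    """Midpoint of a frame, taken as the mean of its vertexes."""
    return (sum(x for x, _ in box) / len(box), sum(y for _, y in box) / len(box))


def point_in_polygon(pt: Point, poly: List[Point]) -> bool:
    """Ray casting test: count edge crossings of a horizontal ray from pt."""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def build_instance(outline: List[Point], first_char_box: Box,
                   all_sub_char_boxes: List[Box]) -> CurvedTextLineInstance:
    """Collect the sub-character frames whose midpoint lies inside the outline."""
    members = [b for b in all_sub_char_boxes
               if point_in_polygon(box_center(b), outline)]
    return CurvedTextLineInstance(outline, first_char_box, members)


outline = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
inside_box = [(1.0, 3.0), (1.0, 1.0), (3.0, 1.0), (3.0, 3.0)]
outside_box = [(19.0, 21.0), (19.0, 19.0), (21.0, 19.0), (21.0, 21.0)]
instance = build_instance(outline, inside_box, [inside_box, outside_box])
```

Keeping the outline, the first character frame and the sub-character frames together in one object is what allows several curved text lines in the same seal to be processed independently in the later modules.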

Character ordering module

All the curved text line instances obtained by the instance construction module are sent to the character ordering module.

For each curved text line instance, the set of sub-character frames of the instance is taken, and the character whose frame overlaps the first character frame is found as the first character; the character closest to the first character among the remaining characters is the second character; the character closest to the second character among the remaining characters is the third character; and so on until only one character remains, which is the last character. This completes the character ordering of the text line. The distance between two characters can be calculated as the straight line distance between the midpoints of the two character frames.
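
The greedy nearest-neighbour ordering described above can be sketched as follows. The function names are hypothetical, and the overlap test is simplified to the overlap area of axis-aligned bounding extents, which is an assumption; the original text only requires that the first character's frame overlap the first character frame.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Box = List[Point]  # 4 vertexes of a character frame


def box_center(box: Box) -> Point:
    """Midpoint of a frame, taken as the mean of its vertexes."""
    return (sum(x for x, _ in box) / len(box), sum(y for _, y in box) / len(box))


def overlap_area(a: Box, b: Box) -> float:
    """Overlap of the axis-aligned extents of two frames (simplified overlap test)."""
    w = min(max(x for x, _ in a), max(x for x, _ in b)) - \
        max(min(x for x, _ in a), min(x for x, _ in b))
    h = min(max(y for _, y in a), max(y for _, y in b)) - \
        max(min(y for _, y in a), min(y for _, y in b))
    return max(w, 0.0) * max(h, 0.0)


def order_characters(sub_boxes: List[Box], first_char_box: Box) -> List[int]:
    """Return indices of sub_boxes in reading order along the curved text line."""
    remaining = list(range(len(sub_boxes)))
    # First character: the sub-character frame overlapping the first character frame.
    start = max(remaining, key=lambda i: overlap_area(sub_boxes[i], first_char_box))
    order = [start]
    remaining.remove(start)
    # Each next character is the remaining one closest to the current character.
    while remaining:
        current = box_center(sub_boxes[order[-1]])
        nearest = min(remaining,
                      key=lambda i: math.dist(current, box_center(sub_boxes[i])))
        order.append(nearest)
        remaining.remove(nearest)
    return order


def square(cx: float, cy: float, half: float = 1.0) -> Box:
    return [(cx - half, cy + half), (cx - half, cy - half),
            (cx + half, cy - half), (cx + half, cy + half)]


# Four characters along an arc, listed in arbitrary order.
boxes = [square(10, 0), square(0, 10), square(7, -7), square(7, 7)]
reading_order = order_characters(boxes, first_char_box=square(0, 10))
```

Because the walk starts from the predicted first character and always moves to the nearest unvisited character, the same procedure handles text lines running clockwise or counter-clockwise without any direction assumption.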

Character vertex ordering module

To ensure that each character subsequently cut from the original image is oriented with its top facing upward, the vertexes of every character frame in every curved text line instance need to be ordered clockwise; the desired vertex order for each character is: lower left, upper left, upper right, lower right.

For each curved text line instance, the midpoints of all the ordered sub-character frames are calculated. The midpoints of the first and second character frames form a vector representing the direction of travel from the first character to the second character, denoted the forward vector. Direction vectors from the midpoint of the first character frame to each of its 4 vertexes are then calculated; any one of them is denoted vector1. The vector cross product and vector dot product of the forward vector and vector1 are calculated. If the cross product is less than zero and the dot product is less than zero, the vertex is the upper left vertex (in computer vision the first pixel at the top left corner of the image is taken as the origin, so a cross product less than zero means the vertex lies above the forward vector, and a dot product less than zero means the angle between the two vectors is greater than 90 degrees); if the cross product is less than zero and the dot product is greater than zero, the vertex is the upper right vertex; if the cross product is greater than zero and the dot product is less than zero, the vertex is the lower left vertex; if the cross product is greater than zero and the dot product is greater than zero, the vertex is the lower right vertex. The vertexes of the character frame are then arranged clockwise in the order lower left, upper left, upper right, lower right, which orders the vertexes of the first character frame. By analogy, the second and third character frames are used to order the vertexes of the second character frame, and so on. As a special case, for the last character frame the forward vector is formed from the previous character frame to the last character frame.

In this way, the vertexes of all character frames of all text line instances are ordered.
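
The sign rules above can be sketched in a few lines. The function names are hypothetical, and the 2D cross product is taken as `fx * vy - fy * vx` in image coordinates (origin at the top left, y pointing down), which is the convention under which the stated signs hold. Degenerate cases where a product is exactly zero are not handled by the original rules and are not handled here either.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Box = List[Point]


def box_center(box: Box) -> Point:
    return (sum(x for x, _ in box) / len(box), sum(y for _, y in box) / len(box))


def forward_vectors(centers: List[Point]) -> List[Point]:
    """Forward vector per character; the last one reuses previous -> last."""
    n = len(centers)  # requires at least two characters
    vectors = []
    for i in range(n):
        a, b = (centers[i], centers[i + 1]) if i < n - 1 else (centers[i - 1], centers[i])
        vectors.append((b[0] - a[0], b[1] - a[1]))
    return vectors


def classify_vertex(forward: Point, center: Point, vertex: Point) -> Optional[str]:
    """Label one vertex from the signs of the cross and dot products."""
    fx, fy = forward
    vx, vy = vertex[0] - center[0], vertex[1] - center[1]
    cross = fx * vy - fy * vx
    dot = fx * vx + fy * vy
    if cross < 0 and dot < 0:
        return "upper_left"
    if cross < 0 and dot > 0:
        return "upper_right"
    if cross > 0 and dot < 0:
        return "lower_left"
    if cross > 0 and dot > 0:
        return "lower_right"
    return None  # degenerate: vertex lies on or perpendicular to the forward vector


def order_vertices(box: Box, forward: Point) -> Box:
    """Return vertexes as lower left, upper left, upper right, lower right."""
    center = box_center(box)
    labelled = {classify_vertex(forward, center, v): v for v in box}
    return [labelled[k] for k in ("lower_left", "upper_left", "upper_right", "lower_right")]


box = [(6.0, 6.0), (4.0, 4.0), (6.0, 4.0), (4.0, 6.0)]   # unordered vertexes
left_to_right = order_vertices(box, forward=(1.0, 0.0))
right_to_left = order_vertices(box, forward=(-1.0, 0.0))  # text running the other way
```

With the forward vector reversed, the same frame yields a vertex order rotated by 180 degrees, which is what makes a character at the bottom of a circular seal come out upright after cutting.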

Character cutting and splicing module

A fixed text line height is set, for example the commonly used 32 or 48 pixels, and the target width of each character is scaled according to its aspect ratio. For each curved text line instance, each character frame in the instance is used to cut the target character area out of the original image at the fixed height by perspective transformation, and the cut areas are spliced in order into one text line. Before the perspective transformation, it is recommended to expand each character frame to the left and right by 0.1 character width, so that the characters in the spliced text line are not too close together. The effect is shown in Fig. 5.

In this way, character cutting and splicing are carried out on all text line instances, and correction of all curved text lines is achieved.
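
The geometry of the cutting step can be sketched as follows, assuming each frame is already ordered as lower left, upper left, upper right, lower right. The function names are hypothetical, and expanding by 0.1 character width on each side (rather than 0.1 in total) is an interpretation of the text, not something it states explicitly.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Box = List[Point]  # ordered: lower left, upper left, upper right, lower right


def expand_box(box: Box, ratio: float = 0.1) -> Box:
    """Widen the frame along its own width direction by ratio * width per side."""
    ll, ul, ur, lr = box
    top = (ur[0] - ul[0], ur[1] - ul[1])
    bottom = (lr[0] - ll[0], lr[1] - ll[1])

    def shift(p: Point, v: Point, s: float) -> Point:
        return (p[0] + s * v[0], p[1] + s * v[1])

    return [shift(ll, bottom, -ratio), shift(ul, top, -ratio),
            shift(ur, top, ratio), shift(lr, bottom, ratio)]


def target_size(box: Box, target_height: int = 32) -> Tuple[int, int]:
    """Target (width, height): fixed height, width scaled by the aspect ratio."""
    ll, ul, ur, lr = box
    width = (math.dist(ul, ur) + math.dist(ll, lr)) / 2.0
    height = (math.dist(ll, ul) + math.dist(lr, ur)) / 2.0
    return max(1, round(target_height * width / height)), target_height


def target_corners(width: int, height: int) -> Box:
    """Destination corners in the same vertex order (image origin at top left)."""
    return [(0.0, float(height)), (0.0, 0.0),
            (float(width), 0.0), (float(width), float(height))]


char_box = [(0.0, 20.0), (0.0, 0.0), (10.0, 0.0), (10.0, 20.0)]  # 10 wide, 20 tall
expanded = expand_box(char_box)
size = target_size(expanded, target_height=32)
corners = target_corners(*size)
```

The ordered source vertexes and the target corners could then be handed to a perspective warp routine, for example OpenCV's `cv2.getPerspectiveTransform` and `cv2.warpPerspective`; that library choice is an assumption, since the original text names none. The resulting patches all share the same height and can be concatenated horizontally in the character order to form the corrected text line.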

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.

The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.

From the above description of the embodiments, it will be apparent to those skilled in the art that the above implementation may be implemented by means of software plus necessary general purpose hardware platform, or of course by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to perform the method according to the embodiments of the present invention.

The embodiments of the present invention have been described above with reference to the accompanying drawings, but the present invention is not limited to the above-described embodiments, which are merely illustrative and not restrictive, and many forms may be made by those having ordinary skill in the art without departing from the spirit of the present invention and the scope of the claims, which are to be protected by the present invention.

Claims

1. A method for correcting a curved text line of a seal, the seal comprising a straight text line and/or a curved text line, the method comprising:

2. The method for correcting a curved text line of a seal according to claim 1, wherein in the S1 text segmentation step, a real-time scene text detection algorithm model with differentiable binarization is used for text segmentation.

3. The method for correcting a curved text line of a seal according to claim 2, wherein in the S1 text segmentation step, the real-time scene text detection algorithm model includes three prediction heads with the same structure, and the prediction heads are respectively used for outputting the text line mask, the sub-character mask and the curved text line first character mask.

4. The method for correcting a curved text line of a seal according to claim 1, wherein said S1 text segmentation step further comprises: expanding the text line mask, the sub-character mask and the curved text line first character mask, which are output in shrunk form after processing, back to the original instance size.
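The expansion of a shrunk mask back to the original instance size can be illustrated with a minimal sketch. The claim does not state how the expansion amount is computed; the sketch below assumes the offset rule commonly used in differentiable binarization post-processing, D = A × r / L, where A and L are the area and perimeter of the shrunk polygon and r is an expansion ratio. The function names and the default ratio of 1.5 are assumptions for illustration only, and the final box helper handles only the axis-aligned case.

```python
import math


def polygon_area(points):
    """Absolute area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(points):
    """Sum of the edge lengths of a closed polygon."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))


def expansion_offset(points, expand_ratio=1.5):
    """Offset distance D = A * r / L for a shrunk polygon (assumed rule)."""
    return polygon_area(points) * expand_ratio / polygon_perimeter(points)


def expand_axis_aligned_box(box, offset):
    """Grow an (x_min, y_min, x_max, y_max) box outward by `offset` on every side."""
    x_min, y_min, x_max, y_max = box
    return (x_min - offset, y_min - offset, x_max + offset, y_max + offset)


# A shrunk 100 x 20 region: area 2000, perimeter 240.
shrunk = [(0, 0), (100, 0), (100, 20), (0, 20)]
offset = expansion_offset(shrunk)
expanded = expand_axis_aligned_box((0, 0, 100, 20), offset)
```

For arbitrary polygons a polygon-offsetting library would normally replace the axis-aligned helper; the offset computation itself stays the same.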

5. The method for correcting a curved text line of a seal according to claim 1, wherein said S2 instance construction step specifically comprises:

S21: determining text line coordinates according to the text line mask;

S22: mapping the text line coordinates onto the curved text line first character mask and determining whether a character exists at the curved text line first character position; if so, the text line is a curved text line; if not, the text line is a straight text line;
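Steps S21 and S22 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the text line is assumed to be given as a polygon of (x, y) vertices, the first character mask as a 2D list of 0/1 values, and a line is treated as curved as soon as any foreground pixel of the first character mask falls inside the polygon. The function names are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def classify_text_line(line_polygon, first_char_mask):
    """Return "curved" if the first character mask has a foreground pixel
    inside the text line polygon, otherwise "straight" (assumed criterion)."""
    height = len(first_char_mask)
    width = len(first_char_mask[0]) if height else 0
    xs = [p[0] for p in line_polygon]
    ys = [p[1] for p in line_polygon]
    # Restrict the scan to the polygon's bounding box, clipped to the mask.
    x_lo, x_hi = max(0, int(min(xs))), min(width - 1, int(max(xs)))
    y_lo, y_hi = max(0, int(min(ys))), min(height - 1, int(max(ys)))
    for row in range(y_lo, y_hi + 1):
        for col in range(x_lo, x_hi + 1):
            # Test the pixel centre against the polygon.
            if first_char_mask[row][col] and point_in_polygon(
                col + 0.5, row + 0.5, line_polygon
            ):
                return "curved"
    return "straight"
```

In practice a minimum overlap area, rather than a single pixel, may be a more robust criterion against segmentation noise.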

6. The method for correcting a curved text line of a seal according to claim 1, wherein said S3 character ordering step comprises:

7. The method for correcting a curved text line of a seal according to claim 6, wherein said step S31 specifically comprises:

thereby obtaining character ordering information.

8. The method for correcting a curved text line of a seal according to claim 7, wherein in the step S31, the character distance is obtained based on the straight-line distance between points of two character frames.
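A straight-line character distance of this kind can be sketched as below. The claim does not say which points of the two character frames are compared; the sketch assumes the centre of each frame. The greedy nearest-neighbour chain that follows is likewise only one plausible way to turn such distances into an ordering starting from a known first character, and is not confirmed by the claim text. All function names are hypothetical.

```python
import math


def box_center(box):
    """Centre of a character frame given as a list of (x, y) vertices."""
    xs = [p[0] for p in box]
    ys = [p[1] for p in box]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def character_distance(box_a, box_b):
    """Euclidean distance between the centres of two frames (assumed points)."""
    return math.dist(box_center(box_a), box_center(box_b))


def order_by_nearest_neighbor(boxes, first_index):
    """Greedy chain: start at the first character and repeatedly move to the
    closest frame that has not been visited yet (illustrative assumption)."""
    order = [first_index]
    remaining = set(range(len(boxes))) - {first_index}
    while remaining:
        current = boxes[order[-1]]
        # Ties are broken by index so the result is deterministic.
        nearest = min(
            remaining, key=lambda i: (character_distance(current, boxes[i]), i)
        )
        order.append(nearest)
        remaining.remove(nearest)
    return order
```

A greedy chain works when neighbouring characters along the arc are closer to each other than to characters further along it, which is typical for text arranged on a seal's circumference.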

9. The method for correcting a curved text line of a seal according to claim 6, wherein said step S32 specifically comprises:

10. The method for correcting a curved text line of a seal according to claim 9, wherein in step S323, the lower left vertex, the upper right vertex or the lower right vertex is determined according to the following principle:

11. The method for correcting a curved text line of a seal according to claim 1, wherein said S4 character cutting and splicing step comprises:

12. The method for correcting a curved text line of a seal according to claim 11, wherein in the step S4, the height of the text line is 32 pixels or 48 pixels.
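The role of the fixed text line height in splicing can be illustrated with a minimal sketch. It assumes each rectified character crop is a list of pixel rows already scaled to the line height, and that crops are joined left to right in the sorted order; the aspect-ratio-preserving width rule and the function names are assumptions, not taken from the claims.

```python
def target_width(src_width, src_height, line_height=32):
    """Width of a character crop after scaling it to the line height
    while keeping its aspect ratio (assumed scaling rule)."""
    return max(1, round(src_width * line_height / src_height))


def splice_horizontally(crops, line_height=32):
    """Join rectified character crops side by side into one text line.

    Each crop is a list of `line_height` rows; rows are lists of pixel values.
    """
    for crop in crops:
        if len(crop) != line_height:
            raise ValueError("every crop must have exactly line_height rows")
    return [
        [pixel for crop in crops for pixel in crop[row]]
        for row in range(line_height)
    ]
```

Fixing the height to a single value such as 32 or 48 pixels means every crop contributes rows of the same count, so splicing reduces to row-wise concatenation.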

13. The method for correcting a curved text line of a seal according to claim 11, wherein in the step S4, each sub-character frame is first expanded by 0.1 character width before the perspective transformation, so as to prevent adjacent characters in the spliced text line from sticking together because they are too close.
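The widening of a sub-character frame before perspective transformation can be sketched as follows. The claim does not say whether the 0.1 character width is added in total or on each side; this sketch assumes it is the total, split equally between the left and right sides, and assumes the frame vertices are ordered top-left, top-right, bottom-right, bottom-left. The function name is hypothetical.

```python
def expand_box_width(quad, ratio=0.1):
    """Widen a quadrilateral along its own width direction.

    quad: four (x, y) vertices ordered top-left, top-right,
    bottom-right, bottom-left. The total added width is ratio * width,
    half on each side (assumed interpretation of the claim).
    """
    top_left, top_right, bottom_right, bottom_left = quad
    pad = ratio / 2.0

    def widen(left, right):
        # Push both ends of an edge outward along the edge direction.
        dx, dy = right[0] - left[0], right[1] - left[1]
        return (
            (left[0] - dx * pad, left[1] - dy * pad),
            (right[0] + dx * pad, right[1] + dy * pad),
        )

    new_top_left, new_top_right = widen(top_left, top_right)
    new_bottom_left, new_bottom_right = widen(bottom_left, bottom_right)
    return [new_top_left, new_top_right, new_bottom_right, new_bottom_left]
```

Because the padding follows the top and bottom edges rather than the image axes, the same function applies to rotated frames along the curved text line.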

14. A seal curved text line correction apparatus, characterized in that the apparatus operates based on the seal curved text line correction method according to any one of claims 1 to 13, and comprises:

15. A seal curved text line correction system, the system comprising: a processor and a memory for storing executable instructions; characterized in that the processor is configured to execute the executable instructions to perform the seal curved text line correction method according to any one of claims 1 to 13.

16. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the seal curved text line correction method according to any one of claims 1 to 13.