CN115205861A - Method for acquiring abnormal character recognition area, electronic equipment and storage medium - Google Patents


Info

Publication number
CN115205861A
CN115205861A (application CN202210984470.0A)
Authority
CN
China
Prior art keywords
area
target
text
text recognition
list
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210984470.0A
Other languages
Chinese (zh)
Other versions
CN115205861B
Inventor
石江枫
于伟
靳雯
赵洲洋
王全修
吴凡
Current Assignee
Rizhao Ruian Information Technology Co ltd
Beijing Rich Information Technology Co ltd
Original Assignee
Rizhao Ruian Information Technology Co ltd
Beijing Rich Information Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Rizhao Ruian Information Technology Co ltd and Beijing Rich Information Technology Co ltd
Priority to CN202210984470.0A
Publication of CN115205861A
Application granted
Publication of CN115205861B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/147 Determination of region of interest
    • G06V10/82 Arrangements for image or video recognition or understanding using neural networks
    • G06V30/1478 Inclination or skew detection or correction of characters or of character lines
    • G06V30/1607 Correcting image deformation, e.g. trapezoidal deformation caused by perspective

Abstract

The invention relates to a method for acquiring an abnormal character recognition area, comprising the following steps: recognizing an intermediate image; when the r-th text recognition area is not rectangular, acquiring a first text recognition result list corresponding to the r-th text recognition area; acquiring a second text recognition result list based on a mapping ratio; dividing the r-th text recognition area in equal ratio to acquire a third text recognition result list; acquiring the k-th recognition area based on the first, second and third text recognition result lists; acquiring a first heightened recognition area based on the k-th recognition area, and so on up to the σ-th heightened area; and acquiring the single-character text recognition area corresponding to the σ-th heightened area as the final recognition area. The invention detects characters one by one and improves the detection precision of single characters.

Description

Method for acquiring abnormal character recognition area, electronic equipment and storage medium
Technical Field
The present invention relates to the field of text detection, and in particular, to a method, an electronic device, and a storage medium for acquiring an abnormal text recognition area.
Background
Image character detection and recognition technology has wide application scenarios. OCR-based text recognition is defined as recognizing printed characters from a paper document, analyzing and processing them, and recognizing the character information in an image. In ordinary document recognition, however, the quality of the text image is often low because of low scanner resolution, poor paper and ink quality, and similar factors, and the shape and direction of a text line may be horizontal, vertical, inclined, or curved; for the detection of curved text, methods such as PixelLink are commonly used.
Disclosure of Invention
A method for acquiring an abnormal character recognition area comprises the following steps: S201, when the r-th text recognition area is not rectangular, obtaining, based on the CRNN model, a mapping ratio K_3 and a first text recognition result list X = {X_1, …, X_k, …, X_k1} corresponding to the r-th text recognition area, where X_k is the X-axis coordinate of the center of the recognition region corresponding to the k-th character, k ranges from 1 to k1, and k1 is the number of characters in the r-th text recognition area;
S202, acquiring, based on the first text recognition result list and the mapping ratio K_3, a second text recognition result list O = {O_1, …, O_k, …, O_k1}, O_k = (X_k1, Y_k1, X_k2, Y_k2), wherein
X_k1 = K_3*X_k - H_r/2, Y_k1 = Y″_s, X_k2 = K_3*X_k + H_r/2, Y_k2 = Y″_s + H″_s, where Y″_s is the Y-axis coordinate of the upper-left corner of the r-th text recognition area and H_r is the height of the r-th text recognition area;
S203, dividing the length of the r-th text recognition region in equal ratio based on L″ to obtain a third text recognition result list O″ = {O″_1, …, O″_k, …, O″_k1}, O″_k = (X_k3, Y_k3, X_k4, Y_k4), where X_k3 is the X-axis coordinate of the upper-left corner of the third recognition region corresponding to the k-th character, Y_k3 is the Y-axis coordinate of that upper-left corner, X_k4 is the X-axis coordinate of the lower-right corner of the third recognition region corresponding to the k-th character, Y_k4 is the Y-axis coordinate of that lower-right corner, and L″ = L_r/k1;
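As a rough sketch of S201-S203 (the function and variable names are assumptions, and the patent fixes only the X-extent of the third list via L″ = L_r/k1; the Y-extent here is assumed to be the region's own top and bottom):

```python
def build_candidate_boxes(centers, K3, Ys, Hs, Hr, Lr, X0):
    """Build the two candidate box lists for a non-rectangular region.
    centers: X_k character centers from the CRNN model; K3: mapping ratio;
    Ys, Hs: Y''_s and H''_s of the region; Hr, Lr: region height and length;
    X0: the region's upper-left X coordinate (an assumption)."""
    k1 = len(centers)
    L2 = Lr / k1  # L'' = L_r / k1, the equal-ratio character width
    O, O2 = [], []
    for k, Xk in enumerate(centers):
        # Second list (S202): box centered on the mapped CRNN character center
        O.append((K3 * Xk - Hr / 2, Ys, K3 * Xk + Hr / 2, Ys + Hs))
        # Third list (S203): box from equal-ratio segmentation of the region
        O2.append((X0 + k * L2, Ys, X0 + (k + 1) * L2, Ys + Hs))
    return O, O2
```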
S204, when |X_k1 - X_k3| < (Y_k2 - Y_k1 + Y_k4 - Y_k3)/4, taking the first position coordinates ((X_k1 + X_k3)/2, (Y_k1 + Y_k3)/2, (X_k2 + X_k4)/2, (Y_k2 + Y_k4)/2) as the k-th recognition area;
S205, when |X_k1 - X_k3| ≥ (Y_k2 - Y_k1 + Y_k4 - Y_k3)/4, taking the second position coordinates (X_k1, Y_k1, X_k2, Y_k2) as the k-th recognition area;
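The selection rule of S204-S205 can be sketched as follows, assuming boxes are (X1, Y1, X2, Y2) tuples; a quarter of the summed box heights acts as an agreement tolerance between the CRNN-based and equal-ratio candidates:

```python
def select_region(Ok, O2k):
    """Ok = (Xk1, Yk1, Xk2, Yk2) from the CRNN mapping (S202);
    O2k = (Xk3, Yk3, Xk4, Yk4) from equal-ratio segmentation (S203)."""
    Xk1, Yk1, Xk2, Yk2 = Ok
    Xk3, Yk3, Xk4, Yk4 = O2k
    if abs(Xk1 - Xk3) < (Yk2 - Yk1 + Yk4 - Yk3) / 4:
        # S204: the two candidates agree, so average them
        return ((Xk1 + Xk3) / 2, (Yk1 + Yk3) / 2,
                (Xk2 + Xk4) / 2, (Yk2 + Yk4) / 2)
    # S205: they disagree, so keep the CRNN-based box
    return Ok
```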
S206, acquiring a first heightened recognition area based on the k-th recognition area, wherein the first heightened area takes the upper edge of the k-th recognition area as its starting position, extends a height ρ in the negative Y-axis direction, and takes the character length as its length, ρ being a second preset growth factor;
S207, when the first pixel value of the first heightened recognition area is greater than the preset pixel-value threshold, continuing to acquire heightened areas in the same way up to the σ1-th heightened area, where σ1 is the number of heightened areas and the first pixel value is the average of the pixel values of all points in the first heightened recognition area;
S208, when the average pixel value of the (σ+1)-th heightened area is not greater than the preset pixel threshold, acquiring the single-character text recognition area corresponding to the σ-th heightened area as the final recognition area, the single-character text recognition area being a rectangle that takes the upper edge of the σ-th heightened area as its starting position, extends in the positive Y-axis direction with height H_r, and has a width equal to the length of the third position of the k-th character.
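A minimal sketch of the heightening loop of S206-S208, assuming a grayscale image in which character pixels have high values and background pixels low ones; the stopping convention and the return value (the top edge reached) are simplifications:

```python
import numpy as np

def heighten_until_background(img, box, rho, pixel_thresh):
    """Grow a strip of height rho upward from the box's top edge until the
    strip's mean pixel value drops to pixel_thresh or below.
    img: 2-D grayscale array; box = (x1, y1, x2, y2); returns the top y."""
    x1, y1, x2, _ = (int(v) for v in box)
    top = y1
    while top - rho >= 0:
        strip = img[top - rho:top, x1:x2]  # the next heightened region
        if strip.mean() <= pixel_thresh:   # background reached, stop growing
            break
        top -= rho
    return top
```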
The invention has at least the following beneficial effects:
When abnormal characters exist, for example when characters are shifted up or down, detection using the text recognition area alone may be incomplete. The prior art usually enlarges the text recognition area to ensure that all characters are detected, but this also enlarges the detection area, and the recognition effect for vertically tilted characters is poor. The invention therefore detects characters one by one according to a second preset growth factor, which improves the detection precision of a single character.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description are only some embodiments of the present invention; those skilled in the art can obtain other drawings based on them without creative effort.
Fig. 1 is a flowchart of a data processing system for acquiring a text recognition area according to an embodiment of the present invention.
Fig. 2 is a flowchart of a method for obtaining an abnormal text recognition area according to an embodiment of the present invention.
Fig. 3 is a flowchart of a method for recognizing a target text based on a text according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings. The described embodiments are only a part of the embodiments of the present invention, not all of them. All other embodiments obtained by a person skilled in the art from these embodiments without creative effort shall fall within the protection scope of the present invention.
Example 1
A data processing system for acquiring a text recognition area, the system comprising an imaging device, a database, a processor, and a memory storing a computer program, the database storing a designated image list A = {A_1, …, A_i, …, A_m}, where A_i is the i-th designated image, i ranges from 1 to m, and m is the number of designated images; when the computer program is executed by the processor, the following steps are realized:
S101, performing affine transformation processing on a target image to obtain an intermediate image corresponding to the target image, wherein the target image is an image, corresponding to a target text, acquired by the imaging device;
wherein the affine transformation of the target image is based on the first target point list C = {C_1, C_2, C_3}, and the target point list C is acquired by the following steps:
S1011, acquiring a first preset point list C′ = {C′_1, …, C′_j, …, C′_n}, where C′_j is the j-th first preset point, j ranges from 1 to n, and n is the number of first preset points; the first preset points are pre-designated points and corner points of the target image;
S1013, acquiring the fixed area D_ζ1 in which C_1 is located, and randomly selecting a first preset point in D′, labeled C_2, where C_1 is a randomly selected first preset point and D′ is the second fixed area list obtained by removing the fixed area D_ζ1 from D = (D_1, …, D_ζ1, …, D_ζ2, …, D_ψ); D_ζ is the ζ-th fixed area divided in the target image, ζ ranges from 1 to ψ, and ψ is the number of fixed areas;
S1015, acquiring the fixed area D_ζ2 in which C_2 is located;
S1017, obtaining a second pre-set point list C \ 697 corresponding to D \ 697 1 ,…,Cʹʹ j1 ,…,Cʹʹ n1 },Cʹʹ j1 The number of the j1 second preset points is 1 to n1, n1 is the number of the second preset points, and the second preset point is a first preset point located at D \697; d \ 697 ζ1 And D ζ2 A third fixed area list of (2);
S1019, traversing the C″_j1 to form, with C_1 and C_2, a first planar region list α″ = {α″_1, …, α″_j1, …, α″_n1}, and acquiring the corresponding first planar region area list S″ = {S″_1, …, S″_j1, …, S″_n1}, where α″_j1 is the planar region formed by C″_j1, C_1 and C_2, and S″_j1 is the area corresponding to α″_j1;
S1021, acquiring, based on S″, a second planar region area list S = {S_1, …, S_j2, …, S_n1} in which S_j2 ≥ S_j2+1, taking the plane region corresponding to S_1 as the target plane region and marking the third point corresponding to the target plane region as C_3, where S_j2 is the j2-th plane area in the second plane area list and j2 ranges from 1 to n1;
Based on S1011-S1021, affine transformation is performed by selecting three points of the target image: two points are randomly selected from the pre-designated points and corner points, the fixed areas where they are located are obtained, each remaining point in the remaining fixed areas forms a planar region with the two points, and the third point giving the largest planar area is taken as the target point. This avoids the situation in which a common method relies on three corner points but two or more of them are unusable, and so expands the conditions under which affine transformation can be applied; at the same time, more special points on the target image can serve as target points that are easier to recognize and acquire, such as the center point of a preset position area or the center point of an identification code.
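The area-maximizing choice of the third target point (S1019-S1021) reduces to the shoelace formula; the sketch below omits the fixed-area bookkeeping of D, D′ and D″:

```python
def triangle_area(p1, p2, p3):
    """Shoelace area of the planar region spanned by three points."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    return abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)) / 2

def pick_third_point(C1, C2, candidates):
    """Among the second preset points, keep the one whose triangle with
    C1 and C2 has the largest area (the top of the sorted list S)."""
    return max(candidates, key=lambda p: triangle_area(C1, C2, p))
```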
S103, acquiring a text recognition area list B = {B_1, …, B_r, …, B_r1} of the intermediate image, B_r = (X_r, Y_r, H_r, L_r), where B_r is the r-th text recognition area of the intermediate image, r ranges from 1 to r1, r1 is the number of text recognition areas, X_r is the X-axis coordinate of the upper-left corner of B_r, Y_r is the Y-axis coordinate of the upper-left corner of B_r, H_r is the height of B_r, and L_r is the length of B_r.
Specifically, in the present invention, the upper-left corner of the target designated image is taken as the origin of the coordinate axes, with the positive X-axis horizontal to the right and the positive Y-axis vertical downward.
In one embodiment of the present invention, B_r is obtained by the following steps:
S1031, acquiring a target specified image based on the specified image list A and the target image;
S1032, acquiring a first history image list B′ = {B′_1, …, B′_s, …, B′_s1} corresponding to the target designated image, where s ranges from 1 to s1 and s1 is the number of history images;
S1033, normalizing the first history images to obtain second history images;
S1034, acquiring the r-th text recognition region list B″ = {B″_1, …, B″_s, …, B″_s1}, B″_s = (X″_s, Y″_s, H″_s, L″_s), where B″_s is the r-th text recognition region corresponding to the s-th second history image, X″_s is the X-axis coordinate of the upper-left corner of B″_s, Y″_s is the Y-axis coordinate of the upper-left corner of B″_s, H″_s is the height of B″_s, and L″_s is the length of B″_s;
S1035, obtaining B_r: X_r = X″_s, Y_r = Y″_s, H_r = max(H″_s), L_r = max(L″_s).
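A compact sketch of S1034-S1035, assuming (as the assignment X_r = X″_s suggests) that the upper-left corners coincide across the normalized history images:

```python
def region_from_history(history):
    """history: list of (X, Y, H, L) tuples, the r-th text recognition
    region observed in each normalized history image."""
    Xs, Ys, Hs, Ls = zip(*history)
    # Keep the shared upper-left corner, take the largest observed
    # height and length (the patent also describes mean-based variants).
    return Xs[0], Ys[0], max(Hs), max(Ls)
```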
Specifically, before S101, the method further includes:
s1, acquiring a target position corresponding to an intermediate image, identifying based on the target position, and acquiring a target position character string;
S2, traversing the designated-position character strings, corresponding to the target position, of the designated images A_i; when a designated-position character string is equal to the target-position character string, taking the designated image A_i corresponding to that character string as the target designated image corresponding to the target image.
Specifically, those skilled in the art know that the position of the target position character string can be obtained through a neural network training method.
Based on S1-S2, the target-position character string is recognized in the preset position region. When the designated character string equals the target character string, it can be understood that the title of the designated image is the same as the title of the target image, so the designated image is taken as the target designated image corresponding to the target image.
The invention further comprises, after S2:
S3, subjecting the first target point list C = {C_1, C_2, C_3} to affine transformation processing to obtain a second target point list θ = {θ_1, θ_2, θ_3};
S4, obtaining A_i = {A_i1, A_i2, A_i3} with θ = A_i, where A_i is the target point list corresponding to the designated image.
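Steps S3-S4 amount to solving for the 2×3 affine matrix that carries the three target points onto the designated image's target point list; a sketch with NumPy (OpenCV's cv2.getAffineTransform computes the same matrix from three point pairs):

```python
import numpy as np

def affine_from_points(C, theta):
    """Solve the 2x3 affine matrix M such that M @ [x, y, 1] maps each
    target point in C onto the corresponding point in theta."""
    src = np.array([[x, y, 1.0] for x, y in C])  # 3x3 homogeneous points
    dst = np.array(theta, dtype=float)           # 3x2 destination points
    return np.linalg.solve(src, dst).T           # 2x3 affine matrix

def apply_affine(M, pt):
    x, y = pt
    return tuple(M @ np.array([x, y, 1.0]))
```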
Specifically, when the number Sum of target designated images is greater than 1, the following steps are performed:
S31, acquiring a second preset character string list E = {E_1, …, E_g, …, E_z} corresponding to the target designated images in a second preset position area, where E_g is the second preset character string corresponding to the g-th target designated image, g ranges from 1 to z, and z is the number of target designated images;
s33, acquiring a second target character string corresponding to the target image in a second preset position area;
S35, traversing E so that a second intermediate preset character string equals the second target character string; when the number k′ of such second intermediate preset character strings is 1, taking the target designated image corresponding to that string as the final target designated image.
In an embodiment of the present invention, the second intermediate predetermined string may be a special field such as "copy".
It can be understood that when the number of target designated images is greater than 1, there are several templates with the same title, or templates of the same type under that title. When there are multiple target designated images, the final one is therefore determined by comparing the second preset character string: since the templates are of various types, the matched target image is more accurate, and comparing the designated-position character string and the second intermediate character string in sequence saves time and improves efficiency.
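The two-stage template matching (title string first, then the second preset character string when several templates share a title) might be sketched as follows; the tuple layout and field values are assumptions:

```python
def pick_template(target_title, target_field, templates):
    """templates: list of (title_string, second_preset_string, image_id)
    tuples, both strings read from fixed positions of each designated image."""
    by_title = [t for t in templates if t[0] == target_title]
    if len(by_title) == 1:
        return by_title[0][2]
    # Several templates share the title: disambiguate on the second string
    matches = [t for t in by_title if t[1] == target_field]
    return matches[0][2] if len(matches) == 1 else None
```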
In another embodiment of the present invention, H_r can also be obtained in S1035 as:
H_r = (1/s1)·Σ_{s=1}^{s1} H″_s
In yet another embodiment of the present invention, L_r can also be obtained in S1035 as:
L_r = (1/s1)·Σ_{s=1}^{s1} L″_s
Based on S101-S103, affine transformation is performed on the target image using three points, so that the orientation of the transformed image is exactly the same as that of the target designated image. The target image contains a plurality of text recognition areas, and the length of each text recognition area is determined by obtaining the minimum and maximum of the start and end positions of the corresponding text recognition area over the s1 text images.
Example 2
On the basis of Embodiment 1, the invention further comprises a method for recognition based on abnormal characters, the method comprising the following steps:
S201, when the r-th text recognition area is not rectangular, obtaining, based on the CRNN model, a mapping ratio K_3 and a first text recognition result list X = {X_1, …, X_k, …, X_k1} corresponding to the r-th text recognition area, where X_k is the X-axis coordinate of the center of the recognition region corresponding to the k-th character, k ranges from 1 to k1, and k1 is the number of characters in the r-th text recognition area;
S202, acquiring, based on the first text recognition result list and the mapping ratio K_3, a second text recognition result list O = {O_1, …, O_k, …, O_k1}, O_k = (X_k1, Y_k1, X_k2, Y_k2), wherein
X_k1 = K_3*X_k - H_r/2, Y_k1 = Y″_s, X_k2 = K_3*X_k + H_r/2, Y_k2 = Y″_s + H″_s, where Y″_s is the Y-axis coordinate of the upper-left corner of the r-th text recognition area and H_r is the height of the r-th text recognition area;
S203, dividing the length of the r-th text recognition region in equal ratio based on L″ to obtain a third text recognition result list O″ = {O″_1, …, O″_k, …, O″_k1}, O″_k = (X_k3, Y_k3, X_k4, Y_k4), where X_k3 is the X-axis coordinate of the upper-left corner of the third recognition region corresponding to the k-th character, Y_k3 is the Y-axis coordinate of that upper-left corner, X_k4 is the X-axis coordinate of the lower-right corner of the third recognition region corresponding to the k-th character, Y_k4 is the Y-axis coordinate of that lower-right corner, and L″ = L_r/k1;
S204, when |X_k1 - X_k3| < (Y_k2 - Y_k1 + Y_k4 - Y_k3)/4, taking the first position coordinates ((X_k1 + X_k3)/2, (Y_k1 + Y_k3)/2, (X_k2 + X_k4)/2, (Y_k2 + Y_k4)/2) as the k-th recognition area;
S205, when |X_k1 - X_k3| ≥ (Y_k2 - Y_k1 + Y_k4 - Y_k3)/4, taking the second position coordinates (X_k1, Y_k1, X_k2, Y_k2) as the k-th recognition area;
S206, acquiring a first heightened recognition area based on the k-th recognition area, wherein the first heightened area takes the upper edge of the k-th recognition area as its starting position, extends a height ρ in the negative Y-axis direction, and takes the character length as its length, ρ being a second preset growth factor;
S207, when the first pixel value of the first heightened recognition area is greater than the preset pixel-value threshold, continuing to acquire heightened areas in the same way up to the σ1-th heightened area, where σ1 is the number of heightened areas and the first pixel value is the average of the pixel values of all points in the first heightened recognition area;
S208, when the average pixel value of the (σ+1)-th heightened area is not greater than the preset pixel threshold, acquiring the single-character text recognition area corresponding to the σ-th heightened area as the final recognition area, the single-character text recognition area being a rectangle that takes the upper edge of the σ-th heightened area as its starting position, extends in the positive Y-axis direction with height H_r, and has a width equal to the length of the third position of the k-th character.
Based on S201-S208, when an abnormal character exists, for example when a character is shifted up or down, detection using the text recognition area alone may be incomplete. The prior art often enlarges the text recognition area to ensure that all characters are detected, but this also enlarges the detection area, and the recognition effect for vertically tilted characters is poor. The invention therefore uses the second preset growth factor to detect characters one by one, improving the detection accuracy of a single character.
Example 3
On the basis of Embodiment 2, the invention further comprises a method for recognizing a target text based on text, the method comprising the following steps:
s301, processing the target image to acquire polygonal labeling information corresponding to the text recognition area;
S302, shrinking the polygon inward in height and length based on the polygon labeling information to obtain a first label, wherein
L = L_1 - [L_1*H_1*r/(L_1 + H_1)]*(1 - k*L_1/H_1), H = H_1 - [L_1*H_1*r/(L_1 + H_1)]*(1 - k*L_1/H_1), where H_1 is the height of the rectangle labeled by the polygon, L_1 is the length of the rectangle labeled by the polygon, r is an empirical coefficient, H is the height of the shrunken rectangle, L is the length of the shrunken rectangle, and k is a preset reduction parameter;
S303, expanding the polygon outward in height and length based on the polygon labeling information to obtain a second label, wherein
L_2 = L_1 + [L_1*H_1*r/(L_1 + H_1)]*(1 - k*L_1/H_1), H_2 = H_1 + [L_1*H_1*r/(L_1 + H_1)]*(1 - k*L_1/H_1), where H_2 is the height of the expanded rectangle and L_2 is the length of the expanded rectangle;
s304, inputting the first label, the second label and the target image into an image processing model to obtain a final text recognition area;
and S305, performing character recognition based on the final text recognition area to obtain a target character string.
In the prior art, when DBNet computes labels, the shrink offset is D = A*r/L, where D is the amount by which each side of the labeled polygon is shortened, A is the area of the polygon, r is an empirical coefficient of 1.5, and L is the perimeter of the polygon. Assuming the text regions are all rectangles of width w and height h, the formula simplifies to D ∝ w*h/(w + h). Treating w as a constant and taking the partial derivative with respect to h gives r*(w/(w + h))^2, which is always greater than 0; substituting x = w/h gives r*(x/(x + 1))^2, which increases with x, so the offset grows with the aspect ratio. Consequently, text recognition regions with a large aspect ratio have a relatively small width after shrinking, which leads the model, when encountering such regions, to learn and output a relatively narrow region, so that the text region cannot completely cover the upper and lower boundaries of the characters.
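The monotonicity argument can be checked numerically: for a fixed height, the plain DBNet offset grows with the width and hence with the aspect ratio, so wide text lines are shrunk by more. This sketch keeps the full perimeter in the denominator, which differs from the simplified formula in the text only by a constant factor:

```python
def dbnet_offset(w, h, r=1.5):
    """Plain DBNet shrink offset D = A*r/perimeter for a w-by-h rectangle,
    which simplifies to r*w*h/(2*(w+h))."""
    return r * w * h / (2 * (w + h))
```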
Based on S301-S305, the text image to be processed is acquired and preprocessed to obtain an intermediate text image. During preprocessing, a label data set is acquired and augmented with data enhancement; the polygon labels are traversed, and when character content exists in a target area, the polygon label is kept and the corresponding target area is acquired. Each edge of the polygon label is then shrunk inward by several pixels, with a preset reduction parameter introduced into the shrinking process so that the length and width of the target area are reduced appropriately. This avoids the situation in which, for a rectangle whose length and width differ greatly, the text area cannot completely cover the upper and lower boundaries of the characters; the preset reduction parameter adaptively reduces the length and width of the rectangle, and the target characters are finally obtained.
Wherein, after S305, the method further comprises the following steps:
S3051, acquiring a target text recognition area list Q = {Q_1, …, Q_v, …, Q_β} corresponding to the text image to be processed and the target text character strings corresponding to the target text recognition areas, where Q_v is the v-th target text recognition area corresponding to the text image to be processed, v ranges from 1 to β, and β is the number of target text recognition areas;
S3053, traversing the target text recognition area list Q according to the r-th text recognition area corresponding to the target designated image corresponding to the text image to be processed, and acquiring the center point of Q_v;
S3055, when the center point coordinate of Q_v falls within the area range of the r-th text recognition area, acquiring the intersection-over-union IoU of the r-th text recognition area and the target text recognition area Q_v;
S3057, when the IoU is greater than a preset intersection threshold, associating the target text corresponding to Q_v with the r-th text recognition area to form a key-value pair.
Based on S3051-S3057, when the center point of the target text recognition area is judged to lie within the r-th text recognition area corresponding to the target designated image, the IoU of the r-th text recognition area and the target text recognition area is acquired, and when it meets the preset intersection range, the corresponding key-value pair is generated, so that the formed key-value pairs correspond more accurately.
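A sketch of the association rule of S3051-S3057; the IoU threshold value and the box layout (x1, y1, x2, y2) are assumptions:

```python
def associate(text_box, target_boxes, iou_thresh=0.5):
    """Keep each target text box whose center lies inside text_box and
    whose IoU with text_box exceeds iou_thresh."""
    x1, y1, x2, y2 = text_box
    out = []
    for b in target_boxes:
        cx, cy = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
        if not (x1 <= cx <= x2 and y1 <= cy <= y2):
            continue  # center-point test (S3053-S3055)
        ix = max(0, min(x2, b[2]) - max(x1, b[0]))
        iy = max(0, min(y2, b[3]) - max(y1, b[1]))
        inter = ix * iy
        union = (x2 - x1) * (y2 - y1) + (b[2] - b[0]) * (b[3] - b[1]) - inter
        if inter / union > iou_thresh:  # IoU test (S3057)
            out.append(b)
    return out
```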
In an embodiment of the present invention, the present invention further includes the following steps:
s10, the method comprises the following steps of, acquiring a predefined feature list U = { U = { (U) } 1 ,…,U γ ,…,U δ },U γ The method refers to the gamma-th characteristic, the value range of gamma is 1 to delta, and delta refers to the predefined characteristic quantity.
In one embodiment of the present invention, the predefined feature list includes an identification code, a stamp identifier, a fingerprint identifier, and a signature identifier.
Preferably, δ ≧ 3.
S20, detecting whether the text image to be processed comprises the predefined characteristics.
S30, when U_γ exists in the text image to be processed, marking the key-value pair corresponding to U_γ as "1".
S40, when U_γ does not exist in the text image to be processed, marking the key-value pair corresponding to U_γ as "0".
Further, the key-value-pair identifier indicates whether the feature exists in the text image to be processed, the identifier being "1" or "0". It can be understood that, as known to those skilled in the art, when a feature exists in the text image to be processed the key-value pair may be marked "1" (or "0"); conversely, when the feature does not exist, it is marked "0" (or "1").
Preferably, when the feature exists in the text image to be processed, the key-value pair is marked "1"; otherwise, when the feature does not exist in the text image to be processed, the key-value pair is marked "0".
Based on S10-S40, general-purpose object detection is used to judge whether the text image to be processed includes features such as the identification code, thereby detecting these features in the text image to be processed.
Embodiments of the present invention also provide a non-transitory computer-readable storage medium, which may be configured in an electronic device to store at least one instruction or at least one program for implementing a method of the method embodiments, where the at least one instruction or the at least one program is loaded into and executed by a processor to implement the method provided by the above embodiments.
Embodiments of the present invention also provide an electronic device comprising a processor and the aforementioned non-transitory computer-readable storage medium.
Embodiments of the present invention also provide a computer program product comprising program code which, when the program product is run on an electronic device, causes the electronic device to carry out the steps of the method according to the various exemplary embodiments of the invention described above.
Although some specific embodiments of the present invention have been described in detail by way of example, it should be understood by those skilled in the art that the above examples are for illustration only and are not intended to limit the scope of the invention. It will also be appreciated by those skilled in the art that various modifications may be made to the embodiments without departing from the scope and spirit of the invention. The scope of the invention is defined by the appended claims.

Claims (10)

1. A method for acquiring an abnormal character recognition area is characterized by comprising the following steps:
S201, when the r-th text recognition area is not rectangular, acquiring, based on a CRNN model, a mapping ratio K_3 and a first text recognition result list X = {X_1, …, X_k, …, X_{k_1}} corresponding to the r-th text recognition area, where X_k refers to the X-axis coordinate of the center of the recognition area corresponding to the k-th character, k ranges from 1 to k_1, and k_1 refers to the number of characters in the r-th text recognition area;
S202, acquiring, based on the first text recognition result list and the mapping ratio K_3, a second text recognition result list O = {O_1, …, O_k, …, O_{k_1}}, O_k = (X_{k1}, Y_{k1}, X_{k2}, Y_{k2}), where
X_{k1} = K_3·X_k − H_r/2, Y_{k1} = Y″_s, X_{k2} = K_3·X_k + H_r/2, Y_{k2} = Y″_s + H″_s, where Y″_s refers to the Y-axis coordinate of the upper-left corner of the r-th text recognition area and H_r is the height of the r-th text recognition area;
S203, performing equal-ratio segmentation of the length of the r-th text recognition area based on L″ to acquire a third text recognition result list O″ = {O″_1, …, O″_k, …, O″_{k_1}}, O″_k = (X_{k3}, Y_{k3}, X_{k4}, Y_{k4}), where X_{k3} refers to the X-axis coordinate of the upper-left corner of the third recognition area corresponding to the k-th character, Y_{k3} refers to the Y-axis coordinate of that upper-left corner, X_{k4} refers to the X-axis coordinate of the lower-right corner of the third recognition area corresponding to the k-th character, and Y_{k4} refers to the Y-axis coordinate of that lower-right corner, where L″ = L_r/k_1;
S204, when |X_{k1} − X_{k3}| < (Y_{k2} − Y_{k1} + Y_{k4} − Y_{k3})/4, taking the first position coordinates ((X_{k1} + X_{k3})/2, (Y_{k1} + Y_{k3})/2, (X_{k2} + X_{k4})/2, (Y_{k2} + Y_{k4})/2) as the k-th recognition area;
S205, when |X_{k1} − X_{k3}| ≥ (Y_{k2} − Y_{k1} + Y_{k4} − Y_{k3})/4, taking the second position coordinates (X_{k1}, Y_{k1}, X_{k2}, Y_{k2}) as the k-th recognition area;
S206, acquiring a first heightened recognition area based on the k-th recognition area, wherein the first heightened area takes the upper edge of the k-th recognition area as its starting position, extends by height ρ in the negative Y-axis direction, and has a length equal to the character length L_r, ρ being a second preset growth factor;
S207, when the first pixel value of the first heightened recognition area is greater than a preset pixel value threshold, continuing to heighten until the σ_1-th heightened area is reached, where the first pixel value is the average of the pixel values of all points in the first heightened recognition area;
S208, when the average pixel value of the (σ+1)-th heightened area is not greater than the preset pixel threshold, acquiring the single-character text recognition area corresponding to the σ-th heightened area as the final recognition area, wherein the single-character text recognition area is a rectangle that takes the upper edge of the σ-th heightened area as its starting position, extends in the Y-axis direction with height H_r, and has a width equal to the third-position length of the k-th character.
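As a non-authoritative sketch of S203-S205 in claim 1 (the helper names and sample coordinates below are assumptions for illustration, not the patented code):

```python
# Sketch of S203-S205: split the r-th area into k1 equal-width boxes (S203),
# then per character either average the CRNN-derived box with the equal-ratio
# box (S204) or keep the CRNN-derived box (S205).

def equal_ratio_boxes(x_r, y_r, h_r, l_r, k1):
    """S203: equal-ratio segmentation with per-character length L'' = L_r / k1."""
    l2 = l_r / k1
    return [(x_r + k * l2, y_r, x_r + (k + 1) * l2, y_r + h_r) for k in range(k1)]

def pick_region(o_k, o2_k):
    """S204/S205: o_k = (Xk1, Yk1, Xk2, Yk2) from the CRNN result,
    o2_k = (Xk3, Yk3, Xk4, Yk4) from the equal-ratio split."""
    xk1, yk1, xk2, yk2 = o_k
    xk3, yk3, xk4, yk4 = o2_k
    if abs(xk1 - xk3) < (yk2 - yk1 + yk4 - yk3) / 4:   # S204: the boxes agree
        return ((xk1 + xk3) / 2, (yk1 + yk3) / 2,
                (xk2 + xk4) / 2, (yk2 + yk4) / 2)
    return o_k                                          # S205: keep the CRNN box
```

The /4 threshold compares the horizontal disagreement of the two boxes against their mean height, which is how S204/S205 decide whether averaging is safe.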
2. The method for acquiring an abnormal character recognition area as claimed in claim 1, further comprising, before S201:
S101, performing affine transformation on a target image to acquire an intermediate image corresponding to the target image, wherein the target image is an image of a target text acquired by a camera device;
wherein, when affine transformation is performed on the target image, it is based on a first target point list C = {C_1, C_2, C_3}, and the target point list C is acquired by the following steps:
S1011, acquiring a first preset point list C′ = {C′_1, …, C′_j, …, C′_n}, where C′_j refers to the j-th first preset point, j ranges from 1 to n, and n is the number of first preset points, the first preset points being pre-designated points and corner points of the target image;
S1013, acquiring the fixed area D_{ζ1} where C_1 is located, and randomly selecting a first preset point in D′, labeled C_2; wherein C_1 is a randomly selected first preset point, and D′ refers to the second fixed area list obtained by removing the fixed area D_{ζ1} from D = (D_1, …, D_{ζ1}, …, D_{ζ2}, …, D_ψ), where D_ζ refers to the ζ-th fixed area divided in the target image, ζ ranges from 1 to ψ, and ψ refers to the number of fixed areas;
S1015, acquiring the fixed area D_{ζ2} where C_2 is located;
S1017, acquiring a second preset point list C″ = {C″_1, …, C″_{j1}, …, C″_{n1}} corresponding to D″, where C″_{j1} refers to the j1-th second preset point, j1 ranges from 1 to n1, n1 is the number of second preset points, and a second preset point is a first preset point located in D″, D″ being the third fixed area list obtained by removing D_{ζ1} and D_{ζ2};
S1019, traversing C″_{j1} to form, together with C_1 and C_2, a first planar region list α″ = {α″_1, …, α″_{j1}, …, α″_{n1}}, and acquiring the corresponding first planar region area list S″ = {S″_1, …, S″_{j1}, …, S″_{n1}}, where α″_{j1} refers to the planar region formed by C″_{j1}, C_1, and C_2, and S″_{j1} refers to the area corresponding to α″_{j1};
S1021, acquiring a second planar region area list S = {S_1, …, S_{j2}, …, S_{n1}} based on S″, taking the planar region corresponding to S_1 as the target planar region, and labeling the third point corresponding to the target planar region as C_3, where S_{j2} refers to the j2-th planar region area in the second list, j2 ranges from 1 to n1, and S_{j2} ≥ S_{j2+1};
S103, acquiring a text recognition area list B = {B_1, …, B_r, …, B_{r_1}} of the intermediate image, B_r = (X_r, Y_r, H_r, L_r), where B_r is the r-th text recognition area of the intermediate image, r ranges from 1 to r_1, r_1 refers to the number of text recognition areas, X_r refers to the X-axis coordinate of the upper-left corner of B_r, Y_r refers to the Y-axis coordinate of the upper-left corner of B_r, H_r refers to the height of B_r, and L_r refers to the length of B_r.
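A possible reading of S1019-S1021 in claim 2 is that C_3 is the candidate point forming the largest-area planar region (triangle) with C_1 and C_2, since the list is sorted so that S_{j2} ≥ S_{j2+1} and the region of S_1 is taken as the target. The sketch below assumes that reading and uses the shoelace formula; the point values are examples, not from the patent:

```python
# Sketch of S1019-S1021: compute the area of each planar region formed by a
# candidate C''_{j1} with C_1 and C_2, and take the point behind the largest
# area S_1 (the head of the descending-sorted list) as C_3.

def triangle_area(p1, p2, p3):
    """Shoelace formula for the area of the triangle (p1, p2, p3)."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2

def choose_c3(c1, c2, candidates):
    """Return the candidate C''_{j1} maximizing the area with C_1 and C_2."""
    return max(candidates, key=lambda c: triangle_area(c1, c2, c))
```

Choosing the widest-spread triple of points is a common way to stabilize an affine (three-point) transform, which is consistent with the role C_1, C_2, C_3 play in S101.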
3. The method as claimed in claim 2, wherein the center point of a predetermined location area is labeled C_1.
4. The method as claimed in claim 2, wherein the acquired center point of the identification code is labeled C_1.
5. The method as claimed in claim 1, further comprising the following steps:
s301, processing the target image to acquire polygonal labeling information corresponding to the text recognition area;
S302, based on the polygon labeling information, shrinking the polygon inward in height and length to acquire a first label, wherein
L = L_1 − [L_1·H_1·r/(L_1 + H_1)]·(1 − k·L_1/H_1), H = H_1 − [L_1·H_1·r/(L_1 + H_1)]·(1 − k·L_1/H_1), where H_1 refers to the height of the rectangle of the polygon label, L_1 refers to the length of the rectangle of the polygon label, r is an empirical coefficient, H is the height of the shrunken rectangle, L is the length of the shrunken rectangle, and k is a preset shrink parameter;
S303, based on the polygon labeling information, expanding the polygon outward in height and length to acquire a second label, wherein
L_2 = L_1 + [L_1·H_1·r/(L_1 + H_1)]·(1 − k·L_1/H_1), H_2 = H_1 + [L_1·H_1·r/(L_1 + H_1)]·(1 − k·L_1/H_1), where H_2 refers to the height of the rectangle of the expanded polygon and L_2 refers to the length of the rectangle of the expanded polygon;
s304, inputting the first label, the second label and the target image into an image processing model to obtain a final text recognition area;
S305, performing character recognition based on the final text recognition area to acquire a target character string.
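The shrink (S302) and expand (S303) formulas of claim 5 share a single offset term, so both labels can be computed together. A sketch under that reading; the sample values for the empirical coefficient r (an assumption here) and the shrink parameter k (0.05 per claim 8) are illustrative only:

```python
# Sketch of S302/S303: the first (shrunken) and second (expanded) labels differ
# from the original label rectangle (L_1, H_1) by the same offset
# delta = [L_1*H_1*r/(L_1+H_1)] * (1 - k*L_1/H_1).

def shrink_expand(l1, h1, r, k):
    """Return ((L, H), (L2, H2)): the first and second labels of S302/S303."""
    delta = (l1 * h1 * r / (l1 + h1)) * (1 - k * l1 / h1)
    return (l1 - delta, h1 - delta), (l1 + delta, h1 + delta)
```

For example, with L_1 = 100, H_1 = 50, r = 0.1 and k = 0.05, the offset is 3, giving a 97 x 47 first label and a 103 x 53 second label; the pair then brackets the true text boundary for the image processing model of S304.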
6. The method as claimed in claim 5, further comprising, after S305:
S3051, acquiring a target text recognition area list Q = {Q_1, …, Q_v, …, Q_β} corresponding to the text image to be processed and the target text character strings corresponding to the target text recognition areas, where Q_v refers to the v-th target text recognition area corresponding to the text image to be processed, v ranges from 1 to β, and β is the number of target text recognition areas;
S3053, according to the designated r-th text recognition area corresponding to the text image to be processed, traversing the target text recognition area list Q and acquiring the center point of Q_v;
S3055, when the center point coordinates of Q_v fall within the range of the r-th text recognition area, acquiring the intersection-over-union IoU of the r-th text recognition area and the target text recognition area Q_v;
S3057, when the IoU is greater than a preset intersection threshold, associating the target text corresponding to Q_v with the r-th text recognition area to form a key-value pair.
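S3055-S3057 in claim 6 amount to a center-point gate followed by an IoU test. The sketch below assumes axis-aligned boxes (x1, y1, x2, y2) and uses the 0.5 threshold of claim 7; the function names are illustrative, not from the patent:

```python
# Sketch of S3055-S3057: associate a target text with the r-th text recognition
# area when the target box's center lies inside the area (S3055) and the IoU
# exceeds the preset intersection threshold (S3057).

def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    if inter == 0:
        return 0.0
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def associate(region, targets, threshold=0.5):
    """Return the {target_text: box} key-value pairs associated with `region`."""
    pairs = {}
    for box, text in targets:
        cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
        inside = region[0] <= cx <= region[2] and region[1] <= cy <= region[3]
        if inside and iou(region, box) > threshold:   # S3055 then S3057
            pairs[text] = box
    return pairs
```

The center-point gate is cheap and prunes most non-overlapping targets before the IoU is computed, which matches the order of S3055 and S3057.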
7. The method of claim 6, wherein IoU is greater than or equal to 0.5.
8. The method of claim 6, wherein k = 0.05.
9. A non-transitory computer readable storage medium having stored therein at least one instruction or at least one program, the at least one instruction or the at least one program being loaded and executed by a processor to implement the method of any one of claims 1-8.
10. An electronic device comprising a processor and the non-transitory computer readable storage medium of claim 9.
CN202210984470.0A 2022-08-17 2022-08-17 Method for acquiring abnormal character recognition area, electronic equipment and storage medium Active CN115205861B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210984470.0A CN115205861B (en) 2022-08-17 2022-08-17 Method for acquiring abnormal character recognition area, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN115205861A true CN115205861A (en) 2022-10-18
CN115205861B CN115205861B (en) 2023-03-31

Family

ID=83585185

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210984470.0A Active CN115205861B (en) 2022-08-17 2022-08-17 Method for acquiring abnormal character recognition area, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115205861B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109271967A (en) * 2018-10-16 2019-01-25 腾讯科技(深圳)有限公司 The recognition methods of text and device, electronic equipment, storage medium in image
CN111091123A (en) * 2019-12-02 2020-05-01 上海眼控科技股份有限公司 Text region detection method and equipment
CN112241739A (en) * 2020-12-17 2021-01-19 北京沃东天骏信息技术有限公司 Method, device, equipment and computer readable medium for identifying text errors
CN112287941A (en) * 2020-11-26 2021-01-29 国际关系学院 License plate recognition method based on automatic character region perception
CN112364656A (en) * 2021-01-12 2021-02-12 北京睿企信息科技有限公司 Named entity identification method based on multi-dataset multi-label joint training
CN112613506A (en) * 2020-12-23 2021-04-06 金蝶软件(中国)有限公司 Method and device for recognizing text in image, computer equipment and storage medium
CN112990220A (en) * 2021-04-19 2021-06-18 烟台中科网络技术研究所 Intelligent identification method and system for target text in image
CN113537184A (en) * 2021-06-03 2021-10-22 广州市新文溯科技有限公司 OCR (optical character recognition) model training method and device, computer equipment and storage medium
CN114067339A (en) * 2021-11-26 2022-02-18 中国工商银行股份有限公司 Image recognition method and device, electronic equipment and computer readable storage medium

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
BOYU ZHANG et al.: "MULTI-SCALE VIDEO TEXT DETECTION BASED ON CORNER AND STROKE WIDTH VERIFICATION", 2013 Visual Communications and Image Processing *
XU ZHAO et al.: "Text From Corners: A Novel Approach to Detect Text and Caption in Videos", IEEE Transactions on Image Processing *
ZHANG An: "Research on Text Recognition Based on Tesseract", China Masters' Theses Full-text Database, Information Science and Technology *
YAN Jianqiang: "Research on Text Detection and Recognition in Complex Scenes of Images and Videos", China Doctoral Dissertations Full-text Database, Information Science and Technology *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117648296A (en) * 2024-01-29 2024-03-05 北京惠朗时代科技有限公司 Graphic data reading device and using method
CN117648296B (en) * 2024-01-29 2024-04-09 北京惠朗时代科技有限公司 Graphic data reading device and using method

Also Published As

Publication number Publication date
CN115205861B (en) 2023-03-31

Similar Documents

Publication Publication Date Title
CN110427793B (en) Bar code detection method and system based on deep learning
US8391602B2 (en) Character recognition
CN109740606B (en) Image identification method and device
US8687886B2 (en) Method and apparatus for document image indexing and retrieval using multi-level document image structure and local features
CN112001406B (en) Text region detection method and device
US10318803B1 (en) Text line segmentation method
CN110675940A (en) Pathological image labeling method and device, computer equipment and storage medium
CN112613506A (en) Method and device for recognizing text in image, computer equipment and storage medium
CN115205861B (en) Method for acquiring abnormal character recognition area, electronic equipment and storage medium
CN110210467B (en) Formula positioning method of text image, image processing device and storage medium
CN111652205A (en) Text correction method, device, equipment and medium based on deep learning
CN112699704B (en) Method, device, equipment and storage device for detecting bar code
CN109035285B (en) Image boundary determining method and device, terminal and storage medium
CN114648771A (en) Character recognition method, electronic device and computer readable storage medium
CN112580499A (en) Text recognition method, device, equipment and storage medium
CN115410191B (en) Text image recognition method, device, equipment and storage medium
CN112036232A (en) Image table structure identification method, system, terminal and storage medium
CN115331230B (en) Data processing system for acquiring text recognition area
CN115331231A (en) Method for recognizing target text based on text, electronic equipment and storage medium
US20220222537A1 (en) Method for Operating a Deep Neural Network
CN109871910B (en) Handwritten character recognition method and device
CN114241463A (en) Signature verification method and device, computer equipment and storage medium
CN113837119A (en) Method and equipment for recognizing confusable characters based on gray level images
CN113537216A (en) Dot matrix font text line inclination correction method and device
JPH07168910A (en) Document layout analysis device and document format identification device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant