CN115331230B - Data processing system for acquiring text recognition area - Google Patents
- Publication number
- CN115331230B (application CN202210984372.7A)
- Authority
- CN
- China
- Prior art keywords
- target
- image
- area
- preset
- list
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/146—Aligning or centring of the image pick-up or image-field
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/24—Aligning, centring, orientation detection or correction of the image
- G06V10/247—Aligning, centring, orientation detection or correction of the image by affine transforms, e.g. correction due to perspective effects; Quadrilaterals, e.g. trapezoids
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/146—Aligning or centring of the image pick-up or image-field
- G06V30/147—Determination of region of interest
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/62—Text, e.g. of license plates, overlay texts or captions on TV images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
- G06V30/41—Analysis of document content
- G06V30/414—Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Computation (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Character Input (AREA)
Abstract
The invention relates to a data processing system for acquiring a text recognition area. The system comprises a camera device, a database, a processor, and a memory storing a computer program, the database storing a list of specified images. The computer program, when executed by the processor, implements the following steps: performing affine transformation on a target image to obtain a corresponding intermediate image, the target image being an image of a target text acquired by the camera device; and obtaining the height and length of the text recognition area based on the intermediate image. Affine transformation is carried out using more distinctive points on the target image as target points, so that the case of missing corner points is avoided.
Description
Technical Field
The invention relates to the field of text detection, in particular to a data processing system for acquiring a text recognition area.
Background
At present, OCR technology is often used for text recognition. OCR generally refers to all image character detection and recognition technologies, including traditional document image recognition technology and scene character recognition technology; the latter can be regarded as an extension and upgrade of traditional OCR technology.
In the prior art, affine transformation is generally used when performing operations on an image such as rotation, translation, and stretching; the affine transformation enhances text image data through an affine matrix. When performing affine transformation, 3 or 4 vertices are generally used. In practice, however, vertices of the text image may be missing, and when two or more vertices are missing, affine transformation cannot be performed directly.
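As generic background for the 3-point affine transformation described above, the 2x3 affine matrix can be estimated from three point correspondences by solving a small linear system. This is a minimal NumPy sketch for illustration only, not the patent's specific procedure:

```python
import numpy as np

def affine_from_3_points(src, dst):
    """Solve the 2x3 affine matrix M such that M @ [x, y, 1]^T = [x', y']^T
    for each of the three (src, dst) correspondences."""
    A = np.zeros((6, 6))
    b = np.zeros(6)
    for k, ((x, y), (xp, yp)) in enumerate(zip(src, dst)):
        # Two equations per point pair: one for x', one for y'.
        A[2 * k] = [x, y, 1, 0, 0, 0]
        A[2 * k + 1] = [0, 0, 0, x, y, 1]
        b[2 * k], b[2 * k + 1] = xp, yp
    return np.linalg.solve(A, b).reshape(2, 3)

def apply_affine(M, pts):
    """Apply a 2x3 affine matrix to an (n, 2) array of points."""
    pts = np.asarray(pts, dtype=float)
    return np.hstack([pts, np.ones((len(pts), 1))]) @ M.T

src = [(0, 0), (1, 0), (0, 1)]
dst = [(1, 1), (2, 1), (1, 2)]  # a pure translation by (1, 1)
M = affine_from_3_points(src, dst)
print(apply_affine(M, [(2, 3)]))  # expected: approximately [[3. 4.]]
```

The three source points must not be collinear, or the system is singular; this is exactly why losing corner points is a problem for the prior art.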
Disclosure of Invention
Aiming at the above technical problems, the technical scheme adopted by the invention is as follows: a data processing system for acquiring a text recognition area, the system comprising an imaging device, a database, a processor, and a memory storing a computer program, wherein the database stores a specified image list A = {A_1, …, A_i, …, A_m}, A_i is the i-th specified image, i ranges from 1 to m, and m is the number of specified images; when the computer program is executed by the processor, the following steps are implemented:
S101, performing affine transformation processing on a target image to obtain an intermediate image corresponding to the target image, wherein the target image is an image corresponding to a target text acquired by a camera device;
wherein, when affine transformation is carried out on the target image, it is based on a first target point list C = {C_1, C_2, C_3}, obtained by the following steps:
S1011, acquiring a first preset point list C′ = {C′_1, …, C′_j, …, C′_n}, where C′_j refers to the j-th first preset point, j ranges from 1 to n, and n refers to the number of first preset points; the first preset points are points specified in advance and the corner points of the target image;
S1013, acquiring the fixed region D_ζ1 in which C_1 is located, and randomly selecting a first preset point in D′, labeled C_2; where C_1 is a randomly selected first preset point, D′ refers to the second fixed region list obtained by removing the fixed region D_ζ1 from D = (D_1, …, D_ζ1, …, D_ζ2, …, D_ψ), D_ζ refers to the ζ-th fixed region divided in the target image, ζ ranges from 1 to ψ, and ψ refers to the number of fixed regions;
S1015, acquiring the fixed region D_ζ2 in which C_2 is located;
S1017, acquiring a second preset point list C″ = {C″_1, …, C″_j1, …, C″_n1} corresponding to D″, where C″_j1 refers to the j1-th second preset point, j1 ranges from 1 to n1, n1 refers to the number of second preset points, and the second preset points are the first preset points located in D″; D″ refers to the third fixed region list obtained by removing the fixed regions D_ζ1 and D_ζ2 from D;
S1019, traversing C″ and acquiring the plane regions formed by each C″_j1 with C_1 and C_2, constituting a first plane region list α″ = {α″_1, …, α″_j1, …, α″_n1}, and acquiring the corresponding first plane area list S″ = {S″_1, …, S″_j1, …, S″_n1}, where α″_j1 refers to the plane region formed by C″_j1 with C_1 and C_2, and S″_j1 refers to the area of α″_j1;
S1021, based on S″, acquiring a second plane area list S = {S_1, …, S_j2, …, S_n1}, taking the plane region corresponding to S_1 as the target plane region, and marking the third point corresponding to the target plane region as C_3; where S_j2 refers to the j2-th plane area in the second plane area list, j2 ranges from 1 to n1, and S_j2 ≥ S_j2+1;
S103, acquiring a text recognition area list B = {B_1, …, B_r, …, B_r1} of the intermediate image, B_r = (X_r, Y_r, H_r, L_r), where B_r is the r-th text recognition area of the intermediate image, r ranges from 1 to r1, r1 refers to the number of text recognition areas, X_r refers to the X-axis coordinate of the upper-left corner of B_r, Y_r refers to the Y-axis coordinate of the upper-left corner of B_r, H_r refers to the height of B_r, and L_r refers to the length of B_r.
The invention has at least the following beneficial effects: the situation in which a common method uses three corner points but two or more corner points are missing is avoided, expanding the applicable conditions of affine transformation; meanwhile, more distinctive points on the target image can be used as target points, so that the target points are easier to distinguish and acquire.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a flowchart of a data processing system for acquiring a text recognition area according to an embodiment of the present invention.
Fig. 2 is a flowchart of a method for obtaining an abnormal text recognition area according to an embodiment of the present invention.
Fig. 3 is a flowchart of a method for recognizing a target text based on a text according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
A data processing system for acquiring a text recognition area, the system comprising an imaging device, a database, a processor, and a memory storing a computer program, wherein the database stores a specified image list A = {A_1, …, A_i, …, A_m}, A_i is the i-th specified image, i ranges from 1 to m, and m is the number of specified images; when the computer program is executed by the processor, the following steps are implemented:
S101, performing affine transformation processing on a target image to obtain an intermediate image corresponding to the target image, wherein the target image is an image corresponding to a target text acquired by a camera device;
wherein, when affine transformation is carried out on the target image, it is based on a first target point list C = {C_1, C_2, C_3}, obtained by the following steps:
S1011, acquiring a first preset point list C′ = {C′_1, …, C′_j, …, C′_n}, where C′_j refers to the j-th first preset point, j ranges from 1 to n, and n refers to the number of first preset points; the first preset points are points specified in advance and the corner points of the target image;
S1013, acquiring the fixed region D_ζ1 in which C_1 is located, and randomly selecting a first preset point in D′, labeled C_2; where C_1 is a randomly selected first preset point, D′ refers to the second fixed region list obtained by removing the fixed region D_ζ1 from D = (D_1, …, D_ζ1, …, D_ζ2, …, D_ψ), D_ζ refers to the ζ-th fixed region divided in the target image, ζ ranges from 1 to ψ, and ψ refers to the number of fixed regions;
S1015, acquiring the fixed region D_ζ2 in which C_2 is located;
S1017, acquiring a second preset point list C″ = {C″_1, …, C″_j1, …, C″_n1} corresponding to D″, where C″_j1 refers to the j1-th second preset point, j1 ranges from 1 to n1, n1 refers to the number of second preset points, and the second preset points are the first preset points located in D″; D″ refers to the third fixed region list obtained by removing the fixed regions D_ζ1 and D_ζ2 from D;
S1019, traversing C″ and acquiring the plane regions formed by each C″_j1 with C_1 and C_2, constituting a first plane region list α″ = {α″_1, …, α″_j1, …, α″_n1}, and acquiring the corresponding first plane area list S″ = {S″_1, …, S″_j1, …, S″_n1}, where α″_j1 refers to the plane region formed by C″_j1 with C_1 and C_2, and S″_j1 refers to the area of α″_j1;
S1021, based on S″, acquiring a second plane area list S = {S_1, …, S_j2, …, S_n1}, taking the plane region corresponding to S_1 as the target plane region, and marking the third point corresponding to the target plane region as C_3; where S_j2 refers to the j2-th plane area in the second plane area list, j2 ranges from 1 to n1, and S_j2 ≥ S_j2+1;
Based on S1011-S1021, affine transformation is performed by selecting 3 points of the target image: two points are randomly selected from the pre-designated points and corner points, the fixed regions in which the two points are located are obtained, the remaining points in the remaining fixed regions are used to form plane regions with them, and the third point giving the largest plane area is taken as a target point. This avoids the situation in which a common method uses three corner points but two or more corner points are missing, expanding the applicable conditions of affine transformation; meanwhile, more distinctive points on the target image can be used as target points, making them easier to identify and acquire, such as the center point of a preset position area or the center point of an identification code.
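The point selection of S1011-S1021 can be sketched as follows. This is a hypothetical implementation: `region_of` (mapping a point to its fixed-region id), the data layout, and the use of the shoelace triangle area for the "plane area formed by three points" are all illustrative assumptions:

```python
import random

def triangle_area(p, q, r):
    # Shoelace formula: half the absolute cross product of two edge vectors.
    return abs((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])) / 2

def pick_target_points(preset_points, region_of, rng=random):
    """preset_points: list of (x, y) preset points and corner points.
    region_of: maps a point to the id of the fixed region containing it."""
    c1 = rng.choice(preset_points)                                  # S1011/S1013
    rest = [p for p in preset_points if region_of(p) != region_of(c1)]
    c2 = rng.choice(rest)                                           # S1013/S1015
    candidates = [p for p in rest if region_of(p) != region_of(c2)]  # S1017
    # S1019-S1021: third point forming the largest plane area with C1, C2.
    c3 = max(candidates, key=lambda p: triangle_area(c1, c2, p))
    return c1, c2, c3
```

Because C3 maximizes the area, the three points are far from collinear, which keeps the affine system well conditioned even when image corners are missing.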
S103, acquiring a text recognition area list B = {B_1, …, B_r, …, B_r1} of the intermediate image, B_r = (X_r, Y_r, H_r, L_r), where B_r is the r-th text recognition area of the intermediate image, r ranges from 1 to r1, r1 refers to the number of text recognition areas, X_r refers to the X-axis coordinate of the upper-left corner of B_r, Y_r refers to the Y-axis coordinate of the upper-left corner of B_r, H_r refers to the height of B_r, and L_r refers to the length of B_r.
Specifically, in the invention, the upper-left corner of the target specified image is taken as the origin of the coordinate axes, with the positive X-axis pointing horizontally to the right and the positive Y-axis pointing vertically downward.
In one embodiment of the present invention, B_r is obtained by the following steps:
S1031, acquiring a target specified image based on the specified image list A and the target image;
S1032, acquiring a first history image list B′ = {B′_1, …, B′_s, …, B′_s1} corresponding to the target specified image, where s ranges from 1 to s1 and s1 refers to the number of history images;
S1033, normalizing the first history images to obtain second history images;
S1034, acquiring the r-th text recognition area list B″ = {B″_1, …, B″_s, …, B″_s1} of the second history images, B″_s = (X″_s, Y″_s, H″_s, L″_s), where B″_s is the r-th text recognition area corresponding to the s-th second history image, X″_s refers to the X-axis coordinate of the upper-left corner of B″_s, Y″_s refers to the Y-axis coordinate of the upper-left corner of B″_s, H″_s refers to the height of B″_s, and L″_s refers to the length of B″_s;
S1035, acquiring B_r: X_r = X″_s, Y_r = Y″_s, H_r = max(H″_s), L_r = max(L″_s).
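S1035 can be sketched as follows. This is a minimal illustration; the (x, y, h, l) tuple layout mirrors B″_s = (X″_s, Y″_s, H″_s, L″_s), and it assumes, per S1035, that the normalized history regions share the same upper-left corner:

```python
def merge_history_regions(history):
    """history: list of (x, y, h, l) records for the same recognition area
    across the normalized history images. Keeps the shared upper-left
    corner and takes the maximum height and length (S1035)."""
    x, y = history[0][0], history[0][1]
    h = max(rec[2] for rec in history)
    l = max(rec[3] for rec in history)
    return (x, y, h, l)
```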
Specifically, before S101, the method further includes:
s1, acquiring a target position corresponding to an intermediate image, identifying based on the target position, and acquiring a target position character string;
s2, traversing the specified image A i A designated position character string corresponding to the target position, and when the designated position character string is equal to the target position character string, a designated image A corresponding to the designated position character string i As a target designation image corresponding to the target image.
Specifically, those skilled in the art know that the position of the target position character string can be obtained through a neural network training method.
Based on S1-S2, the target position character string is recognized from the preset position region; when the designated title character string equals the target title character string, it can be understood that the title of the specified image is the same as the title of the target image, so that specified image is the target specified image corresponding to the target image.
After S2, the invention further comprises:
S3, subjecting the first target point list C = {C_1, C_2, C_3} to affine transformation processing to obtain a second target point list θ = {θ_1, θ_2, θ_3};
S4, acquiring A_i = {A_i1, A_i2, A_i3} and setting θ = A_i, where A_i is the target point list corresponding to the specified image.
Specifically, when the number Sum of target specified images is greater than 1, the following steps are performed:
S31, acquiring a second preset character string list E = {E_1, …, E_g, …, E_z} corresponding to the target specified images in a second preset position area, where E_g refers to the second preset character string corresponding to the g-th target specified image, g ranges from 1 to z, and z is the number of target specified images;
S33, acquiring a second target character string corresponding to the target image in the second preset position area;
S35, traversing E to find the second intermediate preset character strings equal to the second target character string; when the number of such strings k′ = 1, taking the target specified image corresponding to the second intermediate preset character string as the final target specified image.
In an embodiment of the present invention, the second intermediate preset character string may be a special field such as "copy".
It can be understood that when the number of target specified images is greater than 1, there are multiple templates with the same title, or templates of the same type under that title. Therefore, when multiple target specified images exist, the final target specified image is determined by comparing the second preset character strings. The template types are varied, so the matched target image is more accurate; meanwhile, comparing the designated position character string and the second intermediate character string in sequence saves time and improves efficiency.
In another embodiment of the present invention, H_r can also be obtained in S1035 by:
H_r = (1/s1) * Σ_{s=1}^{s1} H″_s.
In yet another embodiment of the present invention, L_r can also be obtained in S1035 by:
L_r = (1/s1) * Σ_{s=1}^{s1} L″_s.
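The averaged variants of H_r and L_r can be sketched as follows (illustrative only; the same assumed (x, y, h, l) tuple layout as the history records):

```python
def merge_history_regions_mean(history):
    """Averaged variant of S1035: H_r and L_r taken as the mean height and
    length over the s1 history records instead of the maximum."""
    x, y = history[0][0], history[0][1]
    s1 = len(history)
    h = sum(rec[2] for rec in history) / s1
    l = sum(rec[3] for rec in history) / s1
    return (x, y, h, l)
```

The max variant guarantees coverage of the tallest and longest historical instance; the mean variant trades a little coverage for robustness against outlier regions.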
Based on S101-S103, affine transformation is performed on the target image based on three points, so that the direction of the transformed image is exactly the same as that of the target specified image. The target image comprises a plurality of text recognition areas, and the length of each text recognition area is determined by obtaining the minimum and maximum values of the starting and ending positions of the corresponding text recognition area across the s1 history images.
Example 2
On the basis of Example 1, the invention further comprises a method for recognition based on an acquired abnormal character area, the method comprising the following steps:
S201, when the r-th text recognition area is not rectangular, based on the CRNN model, acquiring a mapping proportion K_3 and a first text recognition result list X = {X_1, …, X_k, …, X_k1} corresponding to the r-th text recognition area, where X_k refers to the X-axis coordinate of the center of the recognition area corresponding to the k-th character, k ranges from 1 to k1, and k1 refers to the number of characters in the r-th text recognition area;
S202, based on the first text recognition result list and the mapping proportion K_3, acquiring a second text recognition result list O = {O_1, …, O_k, …, O_k1}, O_k = (X_k1, Y_k1, X_k2, Y_k2), wherein X_k1 = K_3 * X_k - H_r/2, Y_k1 = Y″_s, X_k2 = K_3 * X_k + H_r/2, Y_k2 = Y″_s + H″_s, where Y″_s refers to the Y-axis coordinate of the upper-left corner of the r-th text recognition area and H_r is the height of the r-th text recognition area;
S203, dividing the length of the r-th text recognition area in equal proportions based on L′ and acquiring a third text recognition result list O″ = {O″_1, …, O″_k, …, O″_k1}, O″_k = (X_k3, Y_k3, X_k4, Y_k4), where X_k3 refers to the X-axis coordinate of the upper-left corner of the third recognition area corresponding to the k-th character, Y_k3 to the Y-axis coordinate of that upper-left corner, X_k4 to the X-axis coordinate of the lower-right corner of that area, and Y_k4 to the Y-axis coordinate of that lower-right corner, with L′ = L_r / k1;
S204, when |X_k1 - X_k3| < (Y_k2 - Y_k1 + Y_k4 - Y_k3)/4, setting the first position coordinates ((X_k1 + X_k3)/2, (Y_k1 + Y_k3)/2, (X_k2 + X_k4)/2, (Y_k2 + Y_k4)/2) as the k-th identification area;
S205, when |X_k1 - X_k3| ≥ (Y_k2 - Y_k1 + Y_k4 - Y_k3)/4, setting the second position coordinates (X_k1, Y_k1, X_k2, Y_k2) as the k-th identification area;
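The box choice of S204-S205 can be sketched as follows (a hypothetical helper; the (x1, y1, x2, y2) box layout is an assumption):

```python
def fuse_char_boxes(crnn_box, grid_box):
    """crnn_box: per-character box from the CRNN centers (S202).
    grid_box: per-character box from equal division of the region (S203).
    When the two agree closely (S204), average them; otherwise keep the
    CRNN-derived box (S205)."""
    x1, y1, x2, y2 = crnn_box
    x3, y3, x4, y4 = grid_box
    if abs(x1 - x3) < (y2 - y1 + y4 - y3) / 4:
        return ((x1 + x3) / 2, (y1 + y3) / 2, (x2 + x4) / 2, (y2 + y4) / 2)
    return crnn_box
```

Intuitively, the threshold is a quarter of the average box height: a horizontal disagreement smaller than that suggests both estimates found the same character.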
S206, based on the k-th identification area, acquiring a first heightened identification area, which takes the upper edge of the k-th identification area as the starting position, extends by height ρ in the negative Y-axis direction, and has the character length L_r as its length, where ρ is a second preset growth factor;
S207, when the first pixel value of the first heightened identification area is greater than a preset pixel value threshold, continuing to heighten until the σ_1-th heightened region is reached, where σ_1 is a preset height threshold and the first pixel value refers to the average of the pixel values of all points in the first heightened identification area;
S208, when the average pixel value of the (σ+1)-th heightened region is not greater than the preset pixel threshold, acquiring the single-character text recognition area corresponding to the σ-th heightened region as the final recognition area, where the single-character text recognition area is a rectangle that takes the upper edge of the σ-th heightened region as the starting position, extends in the Y-axis direction with height H_r, and has a width equal to the length of the third position of the k-th character.
Based on S201-S208, when there is an abnormal character, for example a character shifted up or down, detection using the text recognition area alone may be incomplete. In the prior art, the range of the text recognition area is often expanded to ensure that all characters are detected, but this enlarges the detection area, and the recognition effect for vertically shifted characters remains poor. The invention therefore uses the second preset growth factor to detect characters one by one, improving the detection accuracy of individual characters.
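The upward growth of S206-S208 can be sketched roughly as follows. This is a simplified, hypothetical version: one strip of ρ rows per step, grayscale with bright ink assumed, and all parameter names illustrative:

```python
import numpy as np

def grow_up(gray, top, x1, x2, rho, pixel_thresh, max_steps):
    """Extend a character box upward in steps of rho rows while the mean
    pixel value of the newly added strip exceeds pixel_thresh (ink still
    present). gray: 2-D array with origin at the top-left, Y growing down.
    Returns the final top row of the grown box."""
    steps = 0
    while steps < max_steps:
        new_top = top - rho
        if new_top < 0:
            break
        strip = gray[new_top:top, x1:x2]
        if strip.mean() <= pixel_thresh:  # strip is background: stop (S208)
            break
        top = new_top                     # strip has ink: keep growing (S207)
        steps += 1
    return top
```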
Example 3
On the basis of Example 2, the invention further comprises a method for recognizing the target text, the method comprising the following steps:
S301, processing the target image to acquire polygon labeling information corresponding to the text recognition area;
S302, based on the polygon labeling information, reducing the height and length of the polygon inward to obtain a first label, wherein
L = L_1 - [L_1 * H_1 * r/(L_1 + H_1)] * (1 - k * L_1/H_1), H = H_1 - [L_1 * H_1 * r/(L_1 + H_1)] * (1 - k * L_1/H_1), where H_1 is the height of the rectangle labeled by the polygon, L_1 is the length of the rectangle labeled by the polygon, r is an empirical coefficient, H is the height of the reduced rectangle, L is the length of the reduced rectangle, and k is a preset reduction parameter;
S303, based on the polygon labeling information, expanding the height and length of the polygon outward to obtain a second label, wherein
L_2 = L_1 + [L_1 * H_1 * r/(L_1 + H_1)] * (1 - k * L_1/H_1), H_2 = H_1 + [L_1 * H_1 * r/(L_1 + H_1)] * (1 - k * L_1/H_1), where H_2 refers to the height of the expanded rectangle and L_2 to the length of the expanded rectangle;
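The shrink and expand offsets of S302-S303 can be sketched as follows (a hypothetical helper; the default values of r and k are illustrative assumptions, not values fixed by the patent):

```python
def label_offsets(l1, h1, r=1.5, k=0.1):
    """Offset shared by the shrunk (first) and expanded (second) labels.
    l1, h1: length and height of the labeled rectangle; r: empirical
    coefficient; k: preset reduction parameter damping the offset for
    large aspect ratios."""
    d = (l1 * h1 * r / (l1 + h1)) * (1 - k * l1 / h1)
    shrunk = (l1 - d, h1 - d)    # first label (S302)
    expanded = (l1 + d, h1 + d)  # second label (S303)
    return shrunk, expanded
```

The factor (1 - k * l1/h1) is what distinguishes this from plain DBNet shrinking: the larger the aspect ratio l1/h1, the smaller the offset, so wide, short text lines are not shrunk into slivers.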
S304, inputting the first label, the second label, and the target image into an image processing model to obtain a final text recognition area;
S305, performing character recognition based on the final text recognition area to obtain a target character string.
In the prior art, when DBNet computes a label, it computes an offset D = A * r/L, where D is the amount by which each side of the labeled polygon is shrunk, A is the area of the polygon, r is an empirical coefficient of 1.5, and L is the perimeter of the polygon. Assuming the text targets are all rectangles, the formula simplifies to D = w * h * r/(2(w + h)). Treating w as a constant and taking the partial derivative with respect to h gives ∂D/∂h = (r/2) * (w/(w + h))², which is always greater than 0. Letting x = w/h, D = (r * h/2) * x/(x + 1), which increases with x; that is, D grows with the aspect ratio. A text recognition area with a larger aspect ratio therefore has a relatively small height after shrinking, which leads the model to learn a relatively narrow region when it encounters a text region with a large aspect ratio, and in turn to output a narrower region that does not completely cover the top and bottom borders of the text.
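For reference, the DBNet simplification under the rectangle assumption can be written out as follows (a reconstruction from the surrounding text, with w and h the rectangle's width and height):

```latex
D = \frac{A\,r}{L} = \frac{w h r}{2(w + h)}, \qquad
\frac{\partial D}{\partial h} = \frac{r}{2}\left(\frac{w}{w + h}\right)^{2} > 0, \qquad
D = \frac{r h}{2}\cdot\frac{x}{x + 1} \quad \text{with } x = \frac{w}{h}.
```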
Based on S301 to S305, a to-be-processed text image is obtained and preprocessed to obtain an intermediate text image. During preprocessing, a labeled data set is obtained and data enhancement is performed on it; the polygon labels are traversed, and when a target region exists in the character content, the polygon label is retained and its corresponding target region is obtained. Each side of the polygon label is shrunk inward by pixels, and a preset reduction parameter is introduced during shrinking so that the length and width of the target region are reduced appropriately. This avoids the situation in which, for a rectangle with a large aspect-ratio difference, the shrunk text region cannot completely cover the upper and lower boundaries of the characters; the preset reduction parameter adaptively reduces the length and width of the rectangle, and finally the target characters are obtained.
Wherein, after S305, the method further comprises the following steps:
S3051, acquiring a target text recognition area list Q = {Q_1, …, Q_v, …, Q_β} corresponding to the text image to be processed and the target text character strings corresponding to the target text recognition areas, where Q_v refers to the v-th target text recognition area corresponding to the text image to be processed, v ranges from 1 to β, and β is the number of target text recognition areas;
S3053, according to the r-th text recognition area corresponding to the target specified image corresponding to the text image to be processed, traversing the target text recognition area list Q to obtain the center point of Q_v;
S3055, when the center point coordinates of Q_v fall within the r-th text recognition area, acquiring the intersection over union IoU of the r-th text recognition area and the target text recognition area Q_v;
S3057, when the IoU is greater than a preset intersection threshold, associating the target text corresponding to Q_v with the r-th text recognition area to form a key-value pair.
Based on S3051-S3057, when the center point of the target text recognition area is judged to lie within the r-th text recognition area corresponding to the target specified image, the intersection over union of the r-th text recognition area and the target text recognition area is acquired, and when it meets the preset intersection threshold, the corresponding key-value pair is generated, so that the formed key-value pairs correspond more accurately.
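The association test of S3055-S3057 can be sketched as follows (a hypothetical implementation; the 0.5 threshold and the (x1, y1, x2, y2) box layout are assumptions):

```python
def iou(a, b):
    """Intersection over union of axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    if inter == 0:
        return 0.0
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def associate(text_area, target_area, target_text, thresh=0.5):
    """Form a key-value pair when the center of target_area lies inside
    text_area (S3055) and the IoU exceeds thresh (S3057); else None."""
    cx = (target_area[0] + target_area[2]) / 2
    cy = (target_area[1] + target_area[3]) / 2
    inside = text_area[0] <= cx <= text_area[2] and text_area[1] <= cy <= text_area[3]
    if inside and iou(text_area, target_area) > thresh:
        return {tuple(text_area): target_text}
    return None
```

The center-point check is a cheap prefilter; the IoU check then rejects boxes whose centers happen to fall inside a region they barely overlap.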
In an embodiment of the present invention, the present invention further includes the steps of:
S10, acquiring a predefined feature list U = {U_1, …, U_γ, …, U_δ}, where U_γ refers to the γ-th feature, γ ranges from 1 to δ, and δ refers to the number of predefined features.
In one embodiment of the present invention, the predefined feature list includes an identification code, a stamp identifier, a fingerprint identifier, and a signature identifier.
Preferably, δ ≧ 3.
S20, detecting whether the text image to be processed comprises the predefined features.
S30, when the text image to be processed has U_γ, labeling the key-value pair corresponding to U_γ as "1".
S40, when the text image to be processed does not have U_γ, labeling the key-value pair corresponding to U_γ as "0".
Further, the key-value pair identifier is used to identify whether the feature exists in the text image to be processed, the identifier being "1" or "0". It can be understood that those skilled in the art know that one of the two values marks the presence of the feature in the text image to be processed and the other marks its absence.
Preferably, when the characteristic exists in the to-be-processed text image, the key-value pair is marked as "1"; otherwise, when the feature does not have a text image to be processed, the key-value pair is identified as "0".
Based on S10-S40, generic object detection is used to judge whether the text image to be processed includes features such as the identification code, so that the features present in the text image to be processed are detected.
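A minimal sketch of the labeling in S10-S40, assuming the generic detector returns a set of detected feature names; the feature names and function name are illustrative, not from the patent:

```python
# Hypothetical predefined feature list (identification code, stamp,
# fingerprint, signature, per the embodiment above).
PREDEFINED_FEATURES = ["identification_code", "stamp", "fingerprint", "signature"]

def label_features(detected):
    """detected: set of feature names reported by a generic object detector.
    Returns a key-value map marking each predefined feature "1" (present)
    or "0" (absent), as in S30/S40."""
    return {f: "1" if f in detected else "0" for f in PREDEFINED_FEATURES}
```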
Embodiments of the present invention also provide a non-transitory computer-readable storage medium, which may be configured in an electronic device to store at least one instruction or at least one program, where the at least one instruction or program is loaded and executed by a processor to implement the method provided by the above embodiments.
Embodiments of the present invention also provide an electronic device comprising a processor and the aforementioned non-transitory computer-readable storage medium.
Embodiments of the present invention also provide a computer program product comprising program code that, when the program product is run on an electronic device, causes the electronic device to carry out the steps of the method according to the various exemplary embodiments of the invention described above.
Although some specific embodiments of the present invention have been described in detail by way of illustration, it should be understood by those skilled in the art that the above illustration is only for the purpose of illustration and is not intended to limit the scope of the invention. It will also be appreciated by those skilled in the art that various modifications may be made to the embodiments without departing from the scope and spirit of the invention. The scope of the invention is defined by the appended claims.
Claims (10)
1. A data processing system for acquiring a text recognition area, the system comprising an imaging device, a database, a processor, and a memory in which a computer program is stored, wherein the database stores a specified image list A = {A_1, …, A_i, …, A_m}, A_i is the i-th specified image, i ranges from 1 to m, and m is the number of specified images, and wherein, when the computer program is executed by the processor, the following steps are realized:
s101, carrying out affine transformation processing on a target image to obtain an intermediate image corresponding to the target image, wherein the target image is an image corresponding to a target text acquired by a camera device;
wherein the affine transformation of the target image is based on a first target point list C = {C_1, C_2, C_3}, and the target point list C is acquired by the following steps:
S1011, acquiring a first preset point list C′ = {C′_1, …, C′_j, …, C′_n}, where C′_j is the j-th first preset point, j ranges from 1 to n, and n is the number of first preset points, the first preset points being pre-specified points and corner points of the target image;
S1013, acquiring C_1 located in the fixed area D_ζ1, and randomly selecting a first preset point in D′, marked C_2; wherein C_1 is a randomly selected first preset point, D′ is the second fixed area list obtained by removing the fixed area D_ζ1 from D = (D_1, …, D_ζ1, …, D_ζ2, …, D_ψ), D_ζ1 is the ζ1-th fixed area divided in the target image, ζ1 ranges from 1 to ψ, D_ζ2 is the ζ2-th fixed area divided in the target image, ζ2 ranges from 1 to ψ, ζ1 ≠ ζ2, and ψ is the number of fixed areas;
S1015, acquiring C_2 located in the fixed area D_ζ2;
S1017, acquiring a second preset point list C″ = {C″_1, …, C″_j1, …, C″_n1} corresponding to D″, where C″_j1 is the j1-th second preset point, j1 ranges from 1 to n1, n1 is the number of second preset points, and the second preset points are the first preset points located in D″; D″ is the third fixed area list obtained by removing the fixed areas D_ζ1 and D_ζ2;
S1019, traversing C″ to acquire the plane area list α″ = {α″_1, …, α″_j1, …, α″_n1} formed by each C″_j1 with C_1 and C_2, and acquiring the corresponding first plane area size list S″ = {S″_1, …, S″_j1, …, S″_n1}, where α″_j1 is the plane area formed by C″_j1 with C_1 and C_2, and S″_j1 is the area size of α″_j1;
S1021, acquiring a second plane area size list S = {S_1, …, S_j2, …, S_n1} based on S″, taking the plane area corresponding to S_1 as the target plane area, and marking the third point corresponding to the target plane area as C_3, where S_j2 is the j2-th plane area size in the second plane area size list, j2 ranges from 1 to n1, and S_j2 ≥ S_j2+1;
S103, acquiring a text recognition area list B = {B_1, …, B_r, …, B_r1} of the intermediate image, B_r = (X_r, Y_r, H_r, L_r), where B_r is the r-th text recognition area of the intermediate image, r ranges from 1 to r1, r1 is the number of text recognition areas, X_r is the X-axis coordinate of the upper left corner of B_r, Y_r is the Y-axis coordinate of the upper left corner of B_r, H_r is the height of B_r, and L_r is the length of B_r.
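Outside the claim language, the point selection of S1019-S1021 — choosing C_3 as the remaining preset point that forms the largest triangle with C_1 and C_2 — can be sketched as follows for illustration; points as (x, y) tuples and the function names are assumptions:

```python
def triangle_area(p1, p2, p3):
    """Area of the triangle (p1, p2, p3) via the cross product of two edges."""
    return abs((p2[0] - p1[0]) * (p3[1] - p1[1])
               - (p3[0] - p1[0]) * (p2[1] - p1[1])) / 2.0

def pick_c3(c1, c2, candidates):
    """Return the candidate point that maximizes the triangle area with
    c1 and c2, i.e. the point whose plane area size ranks first after
    the descending sort of S1021."""
    return max(candidates, key=lambda p: triangle_area(c1, c2, p))
```

Spreading the three affine anchor points as far apart as possible in this way keeps the three-point affine estimate numerically stable.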
2. The system of claim 1, wherein S103 acquires B_r according to the following method:
S1031, acquiring a target specified image based on the specified image list A and the target image;
S1032, acquiring a first history image list B′ = {B′_1, …, B′_s, …, B′_s1} corresponding to the target specified image, where s ranges from 1 to s1 and s1 is the number of history images;
S1033, normalizing the first history images to obtain second history images;
S1034, acquiring the r-th text recognition area list B″ = {B″_1, …, B″_s, …, B″_s1} of the second history images, B″_s = (X″_s, Y″_s, H″_s, L″_s), where B″_s is the r-th text recognition area corresponding to the s-th second history image, X″_s is the X-axis coordinate of the upper left corner of B″_s, Y″_s is the Y-axis coordinate of the upper left corner of B″_s, H″_s is the height of B″_s, and L″_s is the length of B″_s;
S1035, acquiring B_r: X_r = X″_s, Y_r = Y″_s, H_r = max(H″_s), L_r = max(L″_s).
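An illustrative sketch of the aggregation in S1035, under the assumption that each history box is a tuple (x, y, h, l) sharing the same normalized upper-left corner; the function name is hypothetical:

```python
def aggregate_max(history_boxes):
    """Combine the r-th recognition areas of the history images: keep the
    shared upper-left corner, take the maximum height and length observed
    (H_r = max(H''_s), L_r = max(L''_s))."""
    xs, ys, hs, ls = zip(*history_boxes)
    return (xs[0], ys[0], max(hs), max(ls))
```

Taking the maximum guarantees that every text instance seen in the history images would still fit inside the resulting area.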
3. The system of claim 1, further comprising, prior to S101:
s1, acquiring a target position corresponding to the intermediate image, identifying based on the target position, and acquiring a target position character string;
s2, traversing the specified image A i A designated position character string corresponding to the target position, and when the designated position character string is equal to the target position character string, a designated image A corresponding to the designated position character string is assigned i As a target designation image corresponding to the target image.
4. The system of claim 3, further comprising after S2:
S3, processing the first target point list C = {C_1, C_2, C_3} by affine transformation to obtain a second target point list θ = {θ_1, θ_2, θ_3};
S4, acquiring A_i = {A_i1, A_i2, A_i3} with θ = A_i, where A_i1 is the first target point of A_i, A_i2 is the second target point of A_i, and A_i3 is the third target point of A_i.
5. The system according to claim 2 or 3, wherein the center point of the preset position area is acquired as C_1.
6. The system according to claim 2 or 3, wherein the center point of the identification code is acquired and marked as C_1.
7. The system according to claim 3, wherein S3 further comprises: when the number Sum of target specified images satisfies Sum > 1, performing the following steps:
S31, acquiring a second preset character string list E = {E_1, …, E_g, …, E_z} corresponding to the target specified images in a second preset position area, where E_g is the second preset character string corresponding to the g-th target specified image, g ranges from 1 to z, and z is the number of target specified images;
s33, acquiring a second target character string corresponding to the target image in a second preset position area;
S35, traversing E to find the second preset character strings equal to the second target character string, and when the number k′ of such strings satisfies k′ = 1, taking the target specified image corresponding to that second preset character string as the final target specified image.
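The disambiguation of S31-S35 amounts to filtering the remaining candidates by a second string and accepting only a unique match; this sketch uses hypothetical names and structures:

```python
def disambiguate(candidates, second_target):
    """candidates: mapping image_id -> second preset character string.
    Return the unique image whose second preset string equals the second
    target string (k' = 1); return None when the match is not unique."""
    matches = [i for i, s in candidates.items() if s == second_target]
    return matches[0] if len(matches) == 1 else None
```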
8. The system of claim 2, wherein in S1035 H_r is further obtained by:
H_r = (1/s1) · Σ_{s=1}^{s1} H″_s.
9. The system of claim 2, wherein in S1035 L_r is further obtained by:
L_r = (1/s1) · Σ_{s=1}^{s1} L″_s.
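Claims 8 and 9 replace the maximum of claim 2 with the mean of the history heights and lengths; a sketch under the same assumed (x, y, h, l) box format and hypothetical name:

```python
def aggregate_mean(history_boxes):
    """Mean-based variant of S1035: H_r and L_r are the averages of the
    history heights and lengths rather than their maxima."""
    xs, ys, hs, ls = zip(*history_boxes)
    n = len(history_boxes)
    return (xs[0], ys[0], sum(hs) / n, sum(ls) / n)
```

The mean is less sensitive to a single outsized history box than the maximum of claim 2.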
10. The system of claim 1, further comprising the steps of:
s301, processing the target image to acquire polygonal labeling information corresponding to the text recognition area;
S302, based on the polygon labeling information, reducing the polygon inward in height and length to obtain a first label, wherein
L = L_1 − [L_1·H_1·r/(L_1 + H_1)]·(1 − k·L_1/H_1), H = H_1 − [L_1·H_1·r/(L_1 + H_1)]·(1 − k·L_1/H_1),
where H_1 is the height of the rectangle of the polygon label, L_1 is the length of the rectangle of the polygon label, r is an empirical coefficient, H is the height of the rectangle after reduction, L is the length of the rectangle after reduction, and k is a preset reduction parameter;
S303, based on the polygon labeling information, expanding the polygon outward in height and length to obtain a second label, wherein
L_2 = L_1 + [L_1·H_1·r/(L_1 + H_1)]·(1 − k·L_1/H_1), H_2 = H_1 + [L_1·H_1·r/(L_1 + H_1)]·(1 − k·L_1/H_1),
where H_2 is the height of the rectangle after expansion and L_2 is the length of the rectangle after expansion;
s304, inputting the first label, the second label and the target image into an image processing model to obtain a final text recognition area;
and S305, performing character recognition based on the final text recognition area to obtain a target character string.
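The shrink and expand formulas of S302 and S303 can be transcribed directly as follows; the function name and the numeric arguments in the usage note are illustrative only:

```python
def shrink_expand(l1, h1, r, k):
    """Apply the offset delta = [L1*H1*r/(L1+H1)] * (1 - k*L1/H1) from
    S302/S303: subtract it for the first (shrunk) label, add it for the
    second (expanded) label. Returns ((L, H), (L2, H2))."""
    delta = (l1 * h1 * r / (l1 + h1)) * (1 - k * l1 / h1)
    shrunk = (l1 - delta, h1 - delta)
    expanded = (l1 + delta, h1 + delta)
    return shrunk, expanded
```

For example, `shrink_expand(100, 20, 0.5, 0.1)` offsets a 100x20 label rectangle by about 4.17 in each direction, yielding the tighter first label and looser second label that S304 feeds to the image processing model.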
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210984372.7A CN115331230B (en) | 2022-08-17 | 2022-08-17 | Data processing system for acquiring text recognition area |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115331230A CN115331230A (en) | 2022-11-11 |
CN115331230B true CN115331230B (en) | 2023-04-14 |
Family
ID=83923138
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210984372.7A Active CN115331230B (en) | 2022-08-17 | 2022-08-17 | Data processing system for acquiring text recognition area |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115331230B (en) |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107688806B (en) * | 2017-08-21 | 2021-04-20 | 西北工业大学 | Affine transformation-based free scene text detection method |
CN113537189A (en) * | 2021-06-03 | 2021-10-22 | 深圳市雄帝科技股份有限公司 | Handwritten character recognition method, device, equipment and storage medium |
CN113313113B (en) * | 2021-06-11 | 2022-09-23 | 北京百度网讯科技有限公司 | Certificate information acquisition method, device, equipment and storage medium |
CN114119949A (en) * | 2021-09-23 | 2022-03-01 | 上海仪电人工智能创新院有限公司 | Method and system for generating enhanced text synthetic image |
CN114155527A (en) * | 2021-11-12 | 2022-03-08 | 虹软科技股份有限公司 | Scene text recognition method and device |
CN114067339A (en) * | 2021-11-26 | 2022-02-18 | 中国工商银行股份有限公司 | Image recognition method and device, electronic equipment and computer readable storage medium |
Legal Events
Date | Code | Title | Description
---|---|---|---
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||