CN115331231B - Method for identifying target text based on text, electronic equipment and storage medium - Google Patents

Method for identifying target text based on text, electronic equipment and storage medium

Info

Publication number
CN115331231B
CN115331231B (application CN202210984550.6A)
Authority
CN
China
Prior art keywords
text
target
polygon
image
text recognition
Prior art date
Legal status
Active
Application number
CN202210984550.6A
Other languages
Chinese (zh)
Other versions
CN115331231A (en)
Inventor
石江枫
于伟
靳雯
赵洲洋
王全修
吴凡
Current Assignee
Rizhao Ruian Information Technology Co ltd
Beijing Rich Information Technology Co ltd
Original Assignee
Rizhao Ruian Information Technology Co ltd
Beijing Rich Information Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Rizhao Ruian Information Technology Co ltd, Beijing Rich Information Technology Co ltd filed Critical Rizhao Ruian Information Technology Co ltd
Priority to CN202210984550.6A
Publication of CN115331231A
Application granted
Publication of CN115331231B
Status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10: Character recognition
    • G06V30/14: Image acquisition
    • G06V30/146: Aligning or centring of the image pick-up or image-field
    • G06V30/1473: Recognising objects as potential recognition candidates based on visual cues, e.g. shapes
    • G06V30/1444: Selective acquisition, locating or processing of specific regions, e.g. highlighted text, fiducial marks or predetermined fields
    • G06V30/1448: Selective acquisition, locating or processing of specific regions based on markings or identifiers characterising the document or the area
    • G06V30/40: Document-oriented image-based pattern recognition
    • G06V30/41: Analysis of document content
    • G06V30/414: Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text
    • G06V2201/00: Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07: Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Artificial Intelligence (AREA)
  • Character Input (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a method for identifying target text based on text, comprising the following steps: processing a target image to obtain polygon labeling information corresponding to a text recognition area; shrinking the polygon inward by a length L based on the labeling information to obtain a first label; expanding the polygon outward by a length L based on the labeling information to obtain a second label; inputting the first label, the second label and the target image into an image processing model to obtain a final text recognition area; and acquiring a target character string based on the final text recognition area. A preset reduction parameter is introduced to adaptively shrink the length and width of the rectangle, so that the target text is finally obtained.

Description

Method for identifying target text based on text, electronic equipment and storage medium
Technical Field
The present invention relates to the field of semantic analysis, and in particular, to a method, an electronic device, and a storage medium for identifying a target text based on a text.
Background
Semantic segmentation is one of the key problems in the current computer vision field. From a macroscopic view it is a high-level task that paves the way for complete scene understanding, itself a core computer vision problem. Its importance lies in the fact that more and more applications infer knowledge from images, including automatic driving and human-computer interaction. Text detection based on semantic segmentation may require shrinking the labeled region during recognition; however, as the difference between the length and width of a text recognition area increases, the shrunk portion also increases, and the text recognition area can no longer completely cover the upper and lower boundaries of the text.
Disclosure of Invention
In view of the above technical problems, the invention adopts the following technical solution. A method of identifying target text based on text, the method comprising the steps of:
S301, processing a target image to obtain polygon labeling information corresponding to a text recognition area;
S302, shrinking the height and length of the polygon inward based on the labeling information of the polygon to obtain a first label, where
L = L1 - [L1*H1*r/(L1+H1)]*(1 - k*L1/H1), H = H1 - [L1*H1*r/(L1+H1)]*(1 - k*L1/H1), in which H1 is the height of the polygon-labeled rectangle, L1 is the length of the polygon-labeled rectangle, r is an empirical coefficient, H is the height of the polygon-labeled rectangle after shrinking, L is its length after shrinking, and k is a preset reduction parameter;
S303, expanding the height and length of the polygon outward based on the labeling information of the polygon to obtain a second label, where
L2 = L1 + [L1*H1*r/(L1+H1)]*(1 - k*L1/H1), H2 = H1 + [L1*H1*r/(L1+H1)]*(1 - k*L1/H1), in which H2 refers to the height of the polygon-labeled rectangle after expansion and L2 to its length after expansion;
S304, inputting the first label, the second label and the target image into an image processing model to obtain a final text recognition area;
S305, performing character recognition based on the final text recognition area to acquire a target character string.
The invention has at least the following beneficial effects:
A preset reduction parameter is introduced into the shrinking step so that the length and width of the target area are reduced by an appropriate amount. This avoids the situation in which, for rectangles with a large aspect ratio, the shrunk text area cannot completely cover the upper and lower boundaries of the text. By introducing the preset reduction parameter, the length and width of the rectangle are reduced adaptively, and the target text is finally obtained.
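The shrink and expansion of S302/S303 can be sketched directly from the two formulas. This is an illustrative helper, not the patented implementation; the defaults r = 1.5 and k = 0.05 are taken from the dependent claims.

```python
def shrink_expand_labels(L1, H1, r=1.5, k=0.05):
    """Compute the shrunk (first label) and expanded (second label)
    rectangle sizes of S302/S303 for a polygon-labeled rectangle of
    length L1 and height H1; r is the empirical coefficient and k the
    preset reduction parameter."""
    # Shared offset term: [L1*H1*r/(L1+H1)] * (1 - k*L1/H1)
    delta = (L1 * H1 * r / (L1 + H1)) * (1 - k * L1 / H1)
    first = (L1 - delta, H1 - delta)    # (L, H): shrunk label
    second = (L1 + delta, H1 + delta)   # (L2, H2): expanded label
    return first, second
```

With k = 0 the offset reduces to a plain area-over-perimeter term; the factor (1 - k*L1/H1) damps the offset as the aspect ratio L1/H1 grows, which is the adaptive behaviour described above.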
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a data processing system for obtaining text recognition areas according to an embodiment of the present invention.
Fig. 2 is a flowchart of a method for obtaining an abnormal text recognition area according to an embodiment of the present invention.
Fig. 3 is a flowchart of a method for identifying a target text based on a text according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to fall within the scope of the invention.
Example 1
A data processing system for acquiring text recognition areas, the system comprising a camera, a database, a processor and a memory storing a computer program, the database storing a specified image list A = {A1, …, Ai, …, Am}, where Ai refers to the i-th specified image, i ranges from 1 to m, and m is the number of specified images. When the computer program is executed by the processor, the following steps are implemented:
S101, carrying out affine transformation processing on a target image to obtain an intermediate image corresponding to the target image, where the target image is an image corresponding to the target text acquired by an imaging device;
wherein, when affine transformation is performed on the target image, it is based on a first target point list C = {C1, C2, C3}, acquired by the following steps:
S1011, acquiring a first preset point list C′ = {C′1, …, C′j, …, C′n}, where C′j refers to the j-th first preset point, j ranges from 1 to n, n is the number of first preset points, and the first preset points are points designated in advance together with the corner points of the target image;
S1013, acquiring the fixed region Dζ1 where C1 is located, and randomly selecting a first preset point in D′, labeled C2; where C1 is a randomly selected first preset point, D′ refers to the fixed-region list after removing Dζ1, D = (D1, …, Dζ1, …, Dζ2, …, Dψ), Dζ is the ζ-th fixed region into which the target image is divided, ζ ranges from 1 to ψ, and ψ is the number of fixed regions;
S1015, acquiring the fixed region Dζ2 where C2 is located;
S1017, acquiring a second preset point list C″ = {C″1, …, C″j1, …, C″n1} corresponding to D″, where C″j1 refers to the j1-th second preset point, j1 ranges from 1 to n1, n1 is the number of second preset points, and a second preset point is a first preset point located in D″; D″ refers to the fixed-region list after removing Dζ1 and Dζ2;
S1019, traversing C″, acquiring the planar region list α″ = {α″1, …, α″j1, …, α″n1} formed by each C″j1 with C1 and C2, and the corresponding area list S″ = {S″1, …, S″j1, …, S″n1}, where α″j1 refers to the planar region formed by C″j1, C1 and C2, and S″j1 refers to the area of α″j1;
S1021, acquiring a second planar-area list S = {S1, …, Sj2, …, Sn1} by sorting S″ in descending order, taking the planar region corresponding to S1 as the target planar region, and labeling the third point of the target planar region as C3; where Sj2 refers to the j2-th area of the second list, j2 ranges from 1 to n1, and Sj2 ≥ Sj2+1.
Based on S1011-S1021, affine transformation is carried out by selecting three points of the target image: two points are selected at random from the pre-designated points and corner points, the fixed regions containing them are obtained, and the remaining point is chosen among the remaining fixed regions as the one forming the planar region of largest area. This avoids relying on three corner points as in common methods, where two or more corner points may be indistinguishable, and thus enlarges the conditions under which affine transformation can be used. At the same time, a more distinctive point on the target image, such as the center point of a preset position area or the center point of an identification code, can be used as a target point, making the target point easier to distinguish and acquire.
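The largest-area selection of S1019-S1021 can be sketched as follows (an illustrative helper: the fixed-region bookkeeping of S1013-S1017 is assumed to have produced the candidate list already):

```python
def pick_third_point(c1, c2, candidates):
    """Among the remaining preset points, pick the one forming the
    largest-area planar region (triangle) with c1 and c2."""
    def tri_area(a, b, c):
        # Shoelace formula for the area of triangle a-b-c.
        return abs((b[0] - a[0]) * (c[1] - a[1])
                   - (c[0] - a[0]) * (b[1] - a[1])) / 2.0
    return max(candidates, key=lambda p: tri_area(c1, c2, p))
```

For example, with c1 = (0, 0) and c2 = (10, 0), the candidate (5, 8) wins over (5, 1) and (2, 3) because it forms the tallest triangle on the same base.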
S103, acquiring a text recognition area list B= { B of the intermediate image 1 ,…,B r ,…,B r1 },B r =(X r ,Y r ,H r ,L r ),B r Refers to the r text recognition area of the intermediate image, and the value range of r is 1 to r 1 ,r 1 Refers to the number of text recognition areas, X r Refers to B r Coordinates of the upper left-hand X-axis, Y r Refers to B r Coordinates of the upper left-hand Y-axis, H r Refers to B r Height, L of r Refers to B r Is a length of (c).
Specifically, in the present invention, with the upper left corner of the target specified image as the coordinate axis origin, the positive X-axis direction is horizontally rightward, and the positive Y-axis direction is vertically downward.
In one embodiment of the invention, Br is obtained by the following steps:
S1031, acquiring a target specified image based on the specified image list A and the target image;
S1032, acquiring a first history image list B′ = {B′1, …, B′s, …, B′s1} corresponding to the target specified image, where s ranges from 1 to s1 and s1 is the number of history images;
S1033, normalizing the first history images to obtain second history images;
S1034, acquiring the r-th text recognition area list B″ = {B″1, …, B″s, …, B″s1} of the second history images, B″s = (X″s, Y″s, H″s, L″s), where B″s refers to the r-th text recognition area corresponding to the s-th second history image, X″s is the X-axis coordinate of the upper-left corner of B″s, Y″s the Y-axis coordinate of its upper-left corner, H″s its height, and L″s its length;
S1035, acquiring Br: Xr = X″s, Yr = Y″s, Hr = max(H″s), Lr = max(L″s).
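The merging rule of S1035 can be sketched as below; the (x, y, h, l) tuple layout mirrors B″s = (X″s, Y″s, H″s, L″s), and the assumption that all history regions share one upper-left corner follows from Xr = X″s, Yr = Y″s:

```python
def merge_history_regions(regions):
    """Merge the r-th recognition region over the s1 normalized history
    images (S1035): keep the common upper-left corner and take the
    maximum height and length. `regions` holds (x, y, h, l) tuples."""
    x, y = regions[0][0], regions[0][1]   # X''_s, Y''_s
    h = max(reg[2] for reg in regions)    # H_r = max(H''_s)
    l = max(reg[3] for reg in regions)    # L_r = max(L''_s)
    return (x, y, h, l)
```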
Specifically, before S101, the method further includes:
S1, acquiring a target position corresponding to the intermediate image, and performing recognition based on the target position to acquire a target position character string;
S2, traversing the specified images Ai to obtain the specified position character string corresponding to the target position; when a specified position character string equals the target position character string, taking the specified image Ai corresponding to that character string as the target specified image corresponding to the target image.
Specifically, those skilled in the art know that the position of the target position string can be obtained by a neural network training method.
Based on S1-S2, the target position character string is recognized according to the preset position area. When the specified position character string equals the target position character string, it can be understood that the title of the specified image is identical to the title of the target image, so the specified image is taken as the target specified image corresponding to the target image.
After step S2, the invention further comprises:
S3, obtaining a second target point list Θ = {θ1, θ2, θ3} by affine transformation processing of the first target point list C = {C1, C2, C3};
S4, acquiring Ai = {Ai1, Ai2, Ai3} with Θ = Ai, where Ai is the target point list corresponding to the specified image.
Specifically, when the number Sum of target specified images is greater than 1, the following steps are performed:
S31, acquiring a second preset character string list E = {E1, …, Eg, …, Ez} corresponding to the target specified images in the second preset position area, where Eg refers to the g-th second preset character string, g ranges from 1 to z, and z is the number of target specified images;
S33, acquiring a second target character string corresponding to the target image in the second preset position area;
S35, traversing E for second intermediate preset character strings equal to the second target character string, and, when the number k′ of such second intermediate preset character strings equals 1, taking the target specified image corresponding to the second intermediate preset character string as the final target specified image.
In an embodiment of the present invention, the second intermediate preset string may be a special field such as "copy".
It can be understood that when the number of target specified images is greater than 1, there are multiple templates with the same title, or multiple template types under the same title. When there are multiple target specified images, the final target specified image is therefore determined by comparing the second preset character strings; with multiple template types the matched target image is more accurate, and comparing the character strings at the specified positions with the second intermediate character strings in sequence saves time and improves efficiency.
In another embodiment of the present invention, Hr can also be obtained in S1035 by:
Hr = (1/s1) * Σ_{s=1}^{s1} H″s.
In yet another embodiment of the present invention, Lr may also be obtained in S1035 by:
Lr = (1/s1) * Σ_{s=1}^{s1} L″s.
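The two averaging variants admit the same kind of sketch (illustrative helper, not the patented code):

```python
from statistics import mean

def mean_region_size(regions):
    """Alternative to the max rule of S1035: average the height and the
    length over the s1 history regions, each an (x, y, h, l) tuple."""
    h = mean(reg[2] for reg in regions)   # H_r = (1/s1) * sum(H''_s)
    l = mean(reg[3] for reg in regions)   # L_r = (1/s1) * sum(L''_s)
    return h, l
```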
Based on S101-S103, affine transformation is performed on the target image using three points so that the direction of the transformed image is identical to that of the target specified image. The target image comprises a plurality of text recognition areas; the length of a text recognition area is determined by acquiring the minimum and maximum of its starting and ending positions at the corresponding positions in the s1 history images. Alternatively, in the present invention, the average of the starting and ending positions over the s1 history images may be taken as the length of the text recognition area.
Example 2
On the basis of embodiment 1, the invention also comprises a method for recognition based on abnormal characters, which comprises the following steps:
S201, when the r-th text recognition area is not rectangular, obtaining a mapping proportion K3 and, based on a CRNN model, a first text recognition result list X = {X1, …, Xk, …, Xk1} corresponding to the r-th text recognition area, where Xk refers to the X-axis coordinate of the center of the recognition area corresponding to the k-th character, k ranges from 1 to k1, and k1 is the number of characters in the r-th text recognition area;
S202, based on the first text recognition list and the mapping proportion K3, obtaining a second text recognition result list O = {O1, …, Ok, …, Ok1}, Ok = (Xk1, Yk1, Xk2, Yk2), where
Xk1 = K3*Xk - Hr/2, Yk1 = Y″s, Xk2 = K3*Xk + Hr/2, Yk2 = Y″s + H″s, with Y″s the Y-axis coordinate of the upper-left corner of the r-th text recognition area and Hr the height of the r-th text recognition area;
S203, dividing the length of the r-th text recognition area into equal parts of size L′, obtaining a third text recognition result list O″ = {O″1, …, O″k, …, O″k1}, O″k = (Xk3, Yk3, Xk4, Yk4), where Xk3 refers to the X-axis coordinate of the upper-left corner of the third recognition area corresponding to the k-th character, Yk3 to its Y-axis coordinate, Xk4 to the X-axis coordinate of the lower-right corner of that area, Yk4 to its Y-axis coordinate, and L′ = Lr/k1;
S204, when |Xk1 - Xk3| < (Yk2 - Yk1 + Yk4 - Yk3)/4, taking the first position coordinate set ((Xk1+Xk3)/2, (Yk1+Yk3)/2, (Xk2+Xk4)/2, (Yk2+Yk4)/2) as the k-th recognition region;
S205, when |Xk1 - Xk3| ≥ (Yk2 - Yk1 + Yk4 - Yk3)/4, taking the second position coordinate set (Xk1, Yk1, Xk2, Yk2) as the k-th recognition region;
S206, based on the k-th recognition region, acquiring a first heightened recognition region, the first heightened region taking the upper edge of the k-th recognition region as its starting position, extending by height ρ in the negative Y-axis direction, with length equal to the character length Lr, ρ being a second preset growth factor;
S207, when the first pixel value of the first heightened recognition region is greater than a preset pixel value threshold, continuing to acquire heightened regions up to the σ-th, where the first pixel value refers to the average of the pixel values of each point of the first heightened recognition region;
S208, when the average pixel value of the (σ+1)-th heightened region is not greater than the preset pixel threshold, acquiring the single text recognition area corresponding to the σ-th heightened region as the final recognition region, the single text recognition area being a rectangle that takes the upper edge of the σ-th heightened region as its starting position, extends in the positive Y-axis direction with height Hr, and has width equal to the length of the third position of the k-th character.
Based on S201-S208, when abnormal characters exist, for example when characters deviate up or down, detection using the text recognition area alone may be incomplete. In the prior art, the range of the text recognition area is often enlarged to ensure that all characters are detected, but this enlarges the detection area as well, and vertically tilted characters are recognized poorly. The invention therefore detects characters one by one according to the second preset growth factor, improving the detection precision of single characters.
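The S204/S205 decision between the CRNN-based box and the equal-split box can be sketched as follows (names are illustrative; the heightened-region scan of S206-S208 is omitted):

```python
def pick_char_region(o1, o2):
    """Choose the k-th character's region: o1 = (Xk1, Yk1, Xk2, Yk2) is
    the CRNN-based box of S202, o2 = (Xk3, Yk3, Xk4, Yk4) the equal-split
    box of S203. If they are horizontally close (S204), average them;
    otherwise (S205) keep the CRNN-based box."""
    x1, y1, x2, y2 = o1
    x3, y3, x4, y4 = o2
    if abs(x1 - x3) < (y2 - y1 + y4 - y3) / 4:
        return ((x1 + x3) / 2, (y1 + y3) / 2,
                (x2 + x4) / 2, (y2 + y4) / 2)
    return o1
```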
Example 3
On the basis of embodiment 2, the invention further comprises a method for identifying target text based on the text, which comprises the following steps:
S301, processing a target image to obtain polygon labeling information corresponding to a text recognition area;
S302, shrinking the height and length of the polygon inward based on the labeling information of the polygon to obtain a first label, where
L = L1 - [L1*H1*r/(L1+H1)]*(1 - k*L1/H1), H = H1 - [L1*H1*r/(L1+H1)]*(1 - k*L1/H1), in which H1 is the height of the polygon-labeled rectangle, L1 is the length of the polygon-labeled rectangle, r is an empirical coefficient, H is the height of the polygon-labeled rectangle after shrinking, L is its length after shrinking, and k is a preset reduction parameter;
S303, expanding the height and length of the polygon outward based on the labeling information of the polygon to obtain a second label, where
L2 = L1 + [L1*H1*r/(L1+H1)]*(1 - k*L1/H1), H2 = H1 + [L1*H1*r/(L1+H1)]*(1 - k*L1/H1), in which H2 refers to the height of the polygon-labeled rectangle after expansion and L2 to its length after expansion;
S304, inputting the first label, the second label and the target image into an image processing model to obtain a final text recognition area;
S305, performing character recognition based on the final text recognition area to acquire a target character string.
In the prior art, when DBNet computes a label it computes the offset D = A*r/L, where D is the amount by which each side of the labeled polygon is shortened, A is the area of the polygon, r is the empirical coefficient 1.5, and L is the perimeter of the polygon. Assuming all text objects are rectangles of width w and height h, the formula simplifies to D = w*h*r/(2*(w+h)). Treating w as a constant and taking the partial derivative with respect to h gives ∂D/∂h = (r/2)*(w/(w+h))^2; substituting x = w/h yields (r/2)*(x/(x+1))^2, which is increasing in x. That is, as the aspect ratio increases, the partial derivative increases, so D grows with the aspect ratio. As a result, for a text recognition region with a large aspect ratio the width after shrinking is relatively small, so the model learns a relatively narrow region when it encounters such text regions and outputs a region too narrow to completely cover the upper and lower boundaries of the text.
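The growth of the offset with aspect ratio can be checked numerically under the rectangle assumption (area A = w*h, perimeter L = 2*(w+h)); this sketch only reproduces the prior-art behaviour discussed above:

```python
def dbnet_offset(w, h, r=1.5):
    """DBNet-style shrink offset D = A*r/L for a w-by-h rectangle."""
    return w * h * r / (2 * (w + h))

# At fixed height h = 20, widening the box makes D consume a growing
# fraction of the height: the failure mode the preset reduction
# parameter k is introduced to counteract.
for w in (20, 40, 100):
    print(w, dbnet_offset(w, 20) / 20)
```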
Based on S301-S305, a text image to be processed is acquired and preprocessed to obtain an intermediate text image. During preprocessing, a labeled data set is acquired and data enhancement is applied; the polygon labels are traversed, a polygon label being retained when character content exists in its target area; the target area corresponding to the polygon label is obtained, and each side of the polygon label is shrunk inward by a number of pixels. A preset reduction parameter is introduced into the shrinking so that the length and width of the target area are reduced by an appropriate amount, avoiding the situation in which, for rectangles with a large aspect ratio, the shrunk area cannot completely cover the upper and lower boundaries of the characters. The preset reduction parameter therefore shrinks the length and width of the rectangle adaptively, and the target characters are finally obtained.
After S305, the method further comprises the following steps:
S3051, acquiring a target text recognition area list Q = {Q1, …, Qv, …, Qβ} corresponding to the text image to be processed, together with the target text character string corresponding to each target text recognition area, where Qv refers to the v-th target text recognition area corresponding to the text image to be processed, v ranges from 1 to β, and β is the number of target text recognition areas;
S3053, traversing the target text recognition area list Q according to the r-th text recognition area corresponding to the target specified image corresponding to the text image to be processed, obtaining the center point of each Qv;
S3055, when the center point coordinates of Qv lie within the region range of the r-th text recognition area, acquiring the IoU of the intersection of the r-th text recognition area and the target text recognition area Qv;
S3057, when the IoU is greater than a preset intersection threshold, associating the target text corresponding to Qv with the r-th text recognition area to form a key-value pair.
Based on S3051-S3057, the center point of the target text recognition area is first checked against the r-th text recognition area corresponding to the target specified image. When the center point lies within that text recognition area, the intersection of the r-th text recognition area and the target text recognition area is acquired, and when it meets the preset intersection threshold a corresponding key-value pair is generated. This adds a judging criterion, so the key-value correspondence formed is more accurate.
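The IoU test of S3055-S3057 can be sketched with the usual axis-aligned box formula (boxes as (x1, y1, x2, y2); an illustrative helper, not the patented code):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0
```

A key-value pair would then be formed only when `iou(...)` exceeds the preset intersection threshold.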
In an embodiment of the present invention, the method further includes the following steps:
S10, acquiring a predefined feature list U = {U1, …, Uγ, …, Uδ}, where Uγ is the γ-th feature, γ ranges from 1 to δ, and δ is the number of predefined features.
In one embodiment of the invention, the predefined feature list includes an identification code, a stamp identification, a fingerprint identification and a signature identification.
Preferably, δ is greater than or equal to 3.
S20, detecting whether the text image to be processed includes the predefined features.
S30, when Uγ exists in the text image to be processed, marking the key-value pair corresponding to Uγ as "1".
S40, when Uγ does not exist in the text image to be processed, marking the key-value pair corresponding to Uγ as "0".
Further, the key-value pair identifier, taking the value "1" or "0", indicates whether the feature exists in the text image to be processed. It can be understood that when a feature exists in the image to be processed the key-value pair may be identified as either "1" or "0", and correspondingly as "0" or "1" when it does not; the two conventions are symmetric.
Preferably, when the feature exists in the image to be processed the key-value pair is identified as "1"; otherwise, when the feature does not exist, it is identified as "0".
Based on S10-S40, general object detection is used to judge whether the text image to be processed includes features such as identification codes, so that the features present in the text image to be processed are detected.
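The flag marking of S30/S40 amounts to a small dictionary; the feature names below are illustrative placeholders for the predefined list U:

```python
def mark_features(detected,
                  predefined=("identification_code", "stamp",
                              "fingerprint", "signature")):
    """Mark each predefined feature "1" if detected in the text image
    to be processed, "0" otherwise (S30/S40)."""
    return {feat: "1" if feat in detected else "0" for feat in predefined}
```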
Embodiments of the present invention also provide a non-transitory computer-readable storage medium that may be disposed in an electronic device to store at least one instruction or at least one program for implementing the above method embodiments, the at least one instruction or the at least one program being loaded and executed by a processor to implement the methods provided by the embodiments described above.
Embodiments of the present invention also provide an electronic device comprising a processor and the aforementioned non-transitory computer-readable storage medium.
Embodiments of the present invention also provide a computer program product comprising program code for causing an electronic device to carry out the steps of the method according to the various exemplary embodiments of the invention as described in the specification, when said program product is run on the electronic device.
While certain specific embodiments of the invention have been described in detail by way of example, it will be appreciated by those skilled in the art that the above examples are for illustration only and are not intended to limit the scope of the invention. Those skilled in the art will also appreciate that many modifications may be made to the embodiments without departing from the scope and spirit of the invention. The scope of the invention is defined by the appended claims.

Claims (10)

1. A method for identifying a target text based on text, the method comprising the steps of:
s301, processing a target image to obtain polygon labeling information corresponding to a text recognition area;
s302, based on the labeling information of the polygon, the inward height and length of the polygon are reduced, a first label is obtained, wherein,
L=L 1 -[L 1 *H 1 *r/(L 1 +H 1 )]*(1-k*L 1 /H 1 ),H=H 1 -[L 1 *H 1 *r/(L 1 +H 1 )]*(1-k*L 1 /H 1 ) Wherein H is 1 Is that
Height, L of polygon-labeled rectangle 1 The length of the rectangle marked by the polygon is r an empirical coefficient, H is the height of the rectangle marked by the polygon after reduction, L is the length of the rectangle marked by the polygon after reduction, and k is a preset reduction parameter;
s303, expanding the outward height and length of the polygon based on the labeling information of the polygon to obtain a second label, L 2 =L 1 +[L 1 *H 1 *r/(L 1 +H 1 )]*(1-k*L 1 /H 1 ),H 2 =H 1 +[L 1 *H 1 *r/(L 1 +H 1 )]*(1-k*L 1 /H 1 ),H 2 Refers to the height, L, of the polygon marked rectangle after expansion 2 The length of the polygon marked rectangle after expansion;
s304, inputting the first label, the second label and the target image into an image processing model to obtain a final text recognition area;
s305, performing character recognition based on the final text recognition area to acquire a target character string.
2. The method for identifying a target text based on text as recited in claim 1, wherein r is 1.5.
3. The method for recognizing a target text based on text according to claim 1, wherein k is 0.05.
4. The method for recognizing a target text based on text according to claim 1, further comprising the following steps after S305:
s3051, getTaking a target text recognition area list Q= { Q corresponding to a target image 1 ,…,Q v ,…,Q β Target text character string corresponding to target text recognition area, Q v The method is characterized in that the method refers to a v-th target text recognition area corresponding to a target image, the value range of v is 1 to beta, and beta refers to the number of the target text recognition areas;
s3053, traversing a target text recognition area list Q according to the r-th text recognition area corresponding to the target specified image corresponding to the target image to obtain Q v Is defined by a center point of (2);
s3055, when Q v When the center point coordinates of the (b) are within the region range of the (r) th text recognition region, acquiring the (r) th text recognition region and the target text recognition region (Q) v IoU of the intersection set;
s3057 when IoU is greater than the preset intersection threshold, Q v The corresponding target text is associated with the r text recognition region to form a key value pair.
5. The method for text-based recognition of a target text according to claim 4, wherein the target specified image and the target image are in the same coordinate system.
6. The method for recognizing a target text based on a text according to claim 1, further comprising the steps of:
s10, acquiring a predefined feature list U= { U 1 ,…,U γ ,…,U δ },U γ The gamma is the gamma characteristic, the value range of gamma is 1 to delta, and delta is the number of the characteristic which is defined in advance;
s20, detecting whether the target image comprises a predefined feature;
s30, when the target image exists U γ When U is set γ The corresponding key value pair is marked as "1";
s40, when the target image does not exist U γ When U is set γ The corresponding key value pair is marked as "0".
7. The method for text-based recognition of a target text of claim 6, wherein the predefined list of features includes an identification code.
8. The method for text-based recognition of a target text according to claim 6, wherein δ ≥ 3.
9. A non-transitory computer readable storage medium having stored therein at least one instruction or at least one program loaded and executed by a processor to implement the method of any one of claims 1-8.
10. An electronic device comprising a processor and the non-transitory computer readable storage medium of claim 9.
CN202210984550.6A 2022-08-17 2022-08-17 Method for identifying target text based on text, electronic equipment and storage medium Active CN115331231B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210984550.6A CN115331231B (en) 2022-08-17 2022-08-17 Method for identifying target text based on text, electronic equipment and storage medium


Publications (2)

Publication Number Publication Date
CN115331231A CN115331231A (en) 2022-11-11
CN115331231B true CN115331231B (en) 2023-05-05

Family

ID=83923663





Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant