CN111199224B - Method and device for recognizing curved characters - Google Patents
Method and device for recognizing curved characters
- Publication number
- CN111199224B (application CN201811379524.0A)
- Authority
- CN
- China
- Prior art keywords
- text
- word
- curved
- frames
- characters
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/24—Aligning, centring, orientation detection or correction of the image
- G06V10/245—Aligning, centring, orientation detection or correction of the image by locating a pattern; Special marks for positioning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Character Discrimination (AREA)
Abstract
The disclosure provides a method and a device for recognizing curved text, and relates to the field of text recognition. Text boxes in an image to be detected and the words corresponding to the text boxes are acquired; whether a word is curved text is detected according to the angle difference between adjacent text boxes in the word; spaces are inserted between the text boxes of the curved text; and the processed curved text is input into a text recognition model for text recognition. Detection and recognition of curved text are thereby realized.
Description
Technical Field
The present disclosure relates to the field of text recognition, and in particular, to a method and apparatus for recognizing curved text.
Background
At present, in the field of artificial intelligence, the single shot multibox detection (SSD) method can only detect horizontal text, and the extended SegLink method can only detect inclined text that lies on a single straight line. The related art cannot detect curved text.
Disclosure of Invention
The present disclosure proposes a scheme capable of detecting and recognizing curved text.
Some embodiments of the present disclosure provide a method for recognizing curved text, including:
acquiring text boxes in an image to be detected and the words corresponding to the text boxes;
detecting whether a word is curved text according to the angle difference between adjacent text boxes in the word;
inserting spaces between the text boxes of the curved text;
and inputting the processed curved text into a text recognition model for text recognition.
In some embodiments, the text boxes in the image to be detected are obtained by inputting the image to be detected into a convolutional neural network (CNN) algorithm,
the convolutional neural network algorithm being trained in advance using text samples.
In some embodiments, the word corresponding to each text box is obtained by inputting each text box into a depth-first search (DFS) algorithm.
In some embodiments, a word is determined to be a curved word if the angular difference between adjacent text boxes in the word is between a minimum threshold and a maximum threshold.
In some embodiments, a word is determined to be non-curved if the angular difference between adjacent text boxes in the word is less than or equal to a minimum threshold;
if the angle difference between adjacent text boxes in a word is greater than or equal to a maximum threshold, the word is split.
In some embodiments, the method further includes:
inputting the non-curved text, together with the mean angle of the text boxes in the non-curved text, into a text recognition model for text recognition.
In some embodiments, the text recognition model is a connectionist temporal classification (CTC) text recognition model.
In some embodiments, a graph model is built using each text box as a node, and a DFS algorithm is used to find connected components in the graph model, each connected component being a word.
Some embodiments of the present disclosure provide a curved text recognition device, including:
a memory; and
a processor coupled to the memory, the processor configured to perform the curved text recognition method of any of the previous embodiments based on instructions stored in the memory.
Some embodiments of the present disclosure propose a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the curved text recognition method of any of the previous embodiments.
Drawings
The drawings required for describing the embodiments or the related art are briefly introduced below. The present disclosure will be more clearly understood from the following detailed description taken in conjunction with the accompanying drawings. It will be apparent to those of ordinary skill in the art that the drawings described below are merely examples of the disclosure, and that other drawings may be derived from them without inventive effort.
Fig. 1 is a flow chart illustrating a method for recognizing curved text according to some embodiments of the present disclosure.
Fig. 2 is a schematic structural diagram of a curved text recognition device according to some embodiments of the present disclosure.
Detailed Description
The technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the drawings in the embodiments of the present disclosure.
Fig. 1 is a flow chart illustrating a method for recognizing curved text according to some embodiments of the present disclosure.
As shown in fig. 1, the method of this embodiment includes:
S110, acquiring text boxes (each called a segment, abbreviated seg) in the image to be detected.
Wherein a text box, also called a "segment", is a bounding box that covers a portion of a word.
In some embodiments, the image to be detected is input into a convolutional neural network (CNN) algorithm, which outputs the text boxes in the image to be detected. The convolutional neural network algorithm is trained in advance using text samples. The descriptive parameters of a text box are, for example, (x, y, w, h, θ), where (x, y) are the position coordinates, (w, h) are the width and height, and θ is the angle.
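The text box parameters above can be sketched as a small data structure. This is only an illustration: the source does not say whether (x, y) is the box center or a corner, nor the unit or direction of θ, so the sketch assumes a center point and an angle in degrees. The names `TextBox` and `corners` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TextBox:
    """One detected segment (assumption: (x, y) is the center, theta is in degrees)."""
    x: float
    y: float
    w: float
    h: float
    theta: float

    def corners(self):
        """Return the four corners of the rotated rectangle around (x, y)."""
        c = math.cos(math.radians(self.theta))
        s = math.sin(math.radians(self.theta))
        half = [(-self.w / 2, -self.h / 2), (self.w / 2, -self.h / 2),
                (self.w / 2, self.h / 2), (-self.w / 2, self.h / 2)]
        # Apply a plain 2D rotation to each half-extent offset, then translate.
        return [(self.x + dx * c - dy * s, self.y + dx * s + dy * c)
                for dx, dy in half]
```

With θ = 0 the corners reduce to an axis-aligned rectangle, which is a quick sanity check on whatever convention a real detector uses.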
S120, acquiring the word corresponding to each text box.
In some embodiments, information for each text box is input into a Depth-First Search (DFS) algorithm to obtain a word corresponding to each text box.
Specifically, each text box is used as a node to build a graph model, and a DFS algorithm is used for finding connected components from the graph model, wherein each connected component is a word.
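The grouping of text boxes into words can be sketched as a depth-first search for connected components. The source does not state how edges between text boxes are decided, so the sketch takes them as a given input; the function name `find_words` and the `links` parameter are hypothetical.

```python
def find_words(num_boxes, links):
    """Group text boxes 0..num_boxes-1 into words (connected components) via DFS.

    links is an iterable of (a, b) index pairs for boxes assumed to be connected;
    how such links are produced is not specified in the source.
    """
    adjacency = {i: [] for i in range(num_boxes)}
    for a, b in links:
        adjacency[a].append(b)
        adjacency[b].append(a)

    visited = set()
    words = []
    for start in range(num_boxes):
        if start in visited:
            continue
        # Iterative DFS from an unvisited box collects one connected component.
        stack, component = [start], []
        visited.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in visited:
                    visited.add(neighbor)
                    stack.append(neighbor)
        # Sorting assumes box indices already follow reading order.
        words.append(sorted(component))
    return words
```

For example, `find_words(5, [(0, 1), (1, 2), (3, 4)])` groups the five boxes into two words, `[0, 1, 2]` and `[3, 4]`.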
S130, detecting whether the word is curved text according to the angle difference between adjacent text boxes in the word.
The detection rule is, for example:
The angle difference between adjacent text boxes is compared with a preset minimum threshold and a preset maximum threshold, and the decision is made according to the comparison result.
If the angle difference between adjacent text boxes in a word is between the minimum threshold and the maximum threshold, the adjacent text boxes are on a curve, and the word is determined to be curved text.
If the angle difference between adjacent text boxes in a word is less than or equal to the minimum threshold, the adjacent text boxes are on a straight line, and the word is determined to be non-curved text, such as horizontal text or inclined text on a single straight line.
If the angle difference between adjacent text boxes in a word is greater than or equal to the maximum threshold, the two text boxes belong to different words, and the word is split.
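The three-way rule above can be sketched as a single comparison function. The function name `classify_pair` and the returned labels are hypothetical, and the source gives no concrete threshold values, so any numbers passed in are illustrative only.

```python
def classify_pair(theta_a, theta_b, diff_min, diff_max):
    """Classify two adjacent text boxes by the absolute difference of their angles."""
    diff = abs(theta_b - theta_a)
    if diff <= diff_min:
        return "straight"   # boxes lie on one straight line: non-curved text
    if diff < diff_max:
        return "curved"     # boxes lie on a curve: curved text
    return "split"          # boxes belong to different words
```

Note that both boundary values are assigned to the outer cases, matching the text: a difference exactly equal to the minimum threshold is straight, and one exactly equal to the maximum threshold splits the word.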
S140, inserting spaces between the text boxes of the curved text.
S150, inputting the processed curved text into a text recognition model for text recognition.
In some embodiments, the text recognition model is a connectionist temporal classification (CTC) text recognition model.
S160, inputting the non-curved text, together with the mean angle of the text boxes in the non-curved text, into a text recognition model for text recognition.
In some embodiments, the text recognition model is a CTC text recognition model.
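As background on the CTC model mentioned above: a CTC recognizer typically emits one label per time step, including a special blank label, and the usual best-path decoding merges consecutive repeated labels and then drops the blanks. The source does not describe any decoding step, so the following is a generic sketch rather than the patent's method; `ctc_greedy_decode` and the choice of 0 as the blank index are assumptions.

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Collapse per-frame labels: merge consecutive repeats, then drop blanks."""
    decoded = []
    previous = None
    for label in frame_labels:
        # Keep a label only when it starts a new run and is not the blank.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded
```

A blank between two identical labels keeps them distinct, so `[0, 1, 1, 0, 1, 2, 2, 0]` decodes to `[1, 1, 2]`.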
This embodiment realizes the detection and recognition of curved text and can be applied to detecting curved text in trademarks, advertisements, artistic lettering, and the like.
The above scheme is described algorithmically below.
First, the text boxes (x, y, w, h, θ) in the image to be detected are obtained through the CNN algorithm.
Next, the word corresponding to each text box is obtained using the DFS algorithm. A word composed of n text boxes can be expressed as:
word = (seg_1, seg_2, seg_3, ..., seg_i, ..., seg_n)
Initialize the loop counters: i = 0, j = 1, k = 0.
Let k = k + 1 and i = i + 1; while i < n holds, perform the following operations 1)-3) in a loop:
1) word_j[k] = seg_i, curve_j = false, sum_θ_j = θ_i
where curve_j is the curved-text flag of word_j (false means the word is not curved text, true means it is curved text), and θ_i is the angle of text box seg_i.
2) Calculate the angle difference between adjacent text boxes in the same connected component:
diff_θ_i = |θ_{i+1} - θ_i|
3) Compare the angle difference diff_θ_i with the minimum threshold diff_min and the maximum threshold diff_max:
3-1) if diff_θ_i ≤ diff_min, the adjacent text boxes are on a straight line, and the mean angle aver_j of the text boxes is calculated: sum_θ_j = sum_θ_j + θ_{i+1}, aver_j = sum_θ_j / k;
3-2) if diff_min < diff_θ_i < diff_max, the adjacent text boxes are on a curve; set curve_j = true and insert a space between text boxes seg_i and seg_{i+1};
3-3) if diff_θ_i ≥ diff_max, the two text boxes belong to different words; split the word and let j = j + 1, k = 0.
By performing operations 1)-3) in a loop, the individual words word_j are obtained.
Finally, each word_j is input into the CTC model for text recognition.
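The loop above can be sketched as follows for one connected component whose text box angles are already in reading order. This is one reading of the procedure, not a definitive implementation: it uses 0-based indices, records where a space would go instead of modifying any image or string (the source does not say how the space is realized), and computes the mean angle over all boxes kept in a word. The name `split_and_flag` and the record keys are hypothetical.

```python
def split_and_flag(angles, diff_min, diff_max):
    """Split one connected component into words and flag the curved ones.

    angles holds the angle of each text box in reading order (an assumption).
    Returns one record per word: box indices, curved flag, space positions,
    and the mean angle of the boxes in that word.
    """
    if not angles:
        return []

    def new_word(index):
        return {"boxes": [index], "curved": False, "gaps": []}

    words = [new_word(0)]
    for i in range(len(angles) - 1):
        diff = abs(angles[i + 1] - angles[i])
        if diff >= diff_max:
            # Case 3-3: the two boxes belong to different words, so split here.
            words.append(new_word(i + 1))
            continue
        word = words[-1]
        word["boxes"].append(i + 1)
        if diff > diff_min:
            # Case 3-2: the boxes lie on a curve; record where a space goes.
            word["curved"] = True
            word["gaps"].append((i, i + 1))
        # Case 3-1 (diff <= diff_min) needs no flag: the boxes are collinear.

    for word in words:
        word["mean_angle"] = sum(angles[b] for b in word["boxes"]) / len(word["boxes"])
    return words
```

For angles `[0, 10, 20, 90, 91]` with illustrative thresholds 5 and 45, the first three boxes form a curved word with two recorded gaps, and the last two form a straight word whose mean angle is 90.5. In the source, the mean angle is only passed to the recognizer for non-curved words.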
Fig. 2 is a schematic structural diagram of a curved text recognition device according to some embodiments of the present disclosure.
As shown in fig. 2, the apparatus 200 of this embodiment includes:
a memory 210 and a processor 220 coupled to the memory 210, the processor 220 being configured to perform the curved text recognition method of any of the foregoing embodiments based on instructions stored in the memory 210.
The memory 210 may include, for example, system memory, fixed nonvolatile storage media, and the like. The system memory stores, for example, an operating system, application programs, a boot loader, and other programs.
Some embodiments of the present disclosure also provide a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the curved text recognition method of any of the foregoing embodiments.
It will be appreciated by those skilled in the art that embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable non-transitory storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the disclosure. It will be understood that each flowchart and/or block of the flowchart illustrations and/or block diagrams, and combinations of flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing describes only preferred embodiments of the present disclosure and is not intended to limit the disclosure. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the present disclosure shall fall within its scope of protection.
Claims (9)
1. A method of curved text recognition, comprising:
acquiring text boxes in an image to be detected and the words corresponding to the text boxes;
detecting whether a word is curved text according to the angle difference between adjacent text boxes in the word;
inserting spaces between the text boxes of the curved text;
inputting the processed curved text into a text recognition model for text recognition;
and inputting the non-curved text, together with the mean angle of the text boxes in the non-curved text, into a text recognition model for text recognition.
2. The method of claim 1, wherein,
the text boxes in the image to be detected are obtained by inputting the image to be detected into a convolutional neural network (CNN) algorithm,
the convolutional neural network algorithm is trained by using text samples in advance.
3. The method of claim 1, wherein,
the word corresponding to each text box is obtained by inputting each text box into a depth-first search (DFS) algorithm.
4. The method of claim 1, wherein,
a word is determined to be a curved word if the angular difference between adjacent text boxes in the word is between a minimum threshold and a maximum threshold.
5. The method of claim 4, wherein,
if the angle difference between adjacent text boxes in a word is less than or equal to a minimum threshold value, the word is determined to be non-curved text;
if the angle difference between adjacent text boxes in a word is greater than or equal to a maximum threshold, the word is split.
6. The method of claim 1, wherein,
the text recognition model is a connectionist temporal classification (CTC) text recognition model.
7. The method of claim 3, wherein,
the graph model is built by taking each text box as a node, and the DFS algorithm is used for finding connected components from the graph model, wherein each connected component is a word.
8. A curved text recognition device, comprising:
a memory; and
a processor coupled to the memory, the processor configured to perform the curved text recognition method of any of claims 1-7 based on instructions stored in the memory.
9. A computer readable storage medium having stored thereon a computer program which when executed by a processor implements the curved text recognition method of any of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811379524.0A CN111199224B (en) | 2018-11-20 | 2018-11-20 | Method and device for recognizing curved characters |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811379524.0A CN111199224B (en) | 2018-11-20 | 2018-11-20 | Method and device for recognizing curved characters |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111199224A (en) | 2020-05-26 |
CN111199224B (en) | 2023-06-23 |
Family
ID=70745695
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811379524.0A Active CN111199224B (en) | 2018-11-20 | 2018-11-20 | Method and device for recognizing curved characters |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111199224B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1131301A (en) * | 1995-03-13 | 1996-09-18 | 财团法人工业技术研究院 | Word cutting method |
EP0905643A2 (en) * | 1997-09-29 | 1999-03-31 | Xerox Corporation | Method and system for recognizing handwritten words |
JPH11353415A (en) * | 1999-05-31 | 1999-12-24 | Fujitsu Ltd | Image extracting device |
US6188790B1 (en) * | 1996-02-29 | 2001-02-13 | Tottori Sanyo Electric Ltd. | Method and apparatus for pre-recognition character processing |
JP2007316754A (en) * | 2006-05-23 | 2007-12-06 | Canon Inc | Handwritten character processing device and method |
CN101408937A (en) * | 2008-11-07 | 2009-04-15 | 东莞市微模式软件有限公司 | Method and apparatus for locating character row |
CN105809164A (en) * | 2016-03-11 | 2016-07-27 | 北京旷视科技有限公司 | Character identification method and device |
JP2017161969A (en) * | 2016-03-07 | 2017-09-14 | 日本電気株式会社 | Character recognition device, method, and program |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
NO20052656D0 (en) * | 2005-06-02 | 2005-06-02 | Lumex As | Geometric image transformation based on text line searching |
JP4956366B2 (en) * | 2007-10-16 | 2012-06-20 | キヤノン株式会社 | Image processing device |
WO2016125177A1 (en) * | 2015-02-05 | 2016-08-11 | Hewlett-Packard Development Company, L.P. | Character spacing adjustment of text columns |
US10121088B2 (en) * | 2016-06-03 | 2018-11-06 | Adobe Systems Incorporated | System and method for straightening curved page content |
RU2628266C1 (en) * | 2016-07-15 | 2017-08-15 | Общество с ограниченной ответственностью "Аби Девелопмент" | Method and system of preparing text-containing images to optical recognition of symbols |
-
2018
- 2018-11-20 CN CN201811379524.0A patent/CN111199224B/en active Active
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1131301A (en) * | 1995-03-13 | 1996-09-18 | 财团法人工业技术研究院 | Word cutting method |
US6188790B1 (en) * | 1996-02-29 | 2001-02-13 | Tottori Sanyo Electric Ltd. | Method and apparatus for pre-recognition character processing |
EP0905643A2 (en) * | 1997-09-29 | 1999-03-31 | Xerox Corporation | Method and system for recognizing handwritten words |
JPH11353415A (en) * | 1999-05-31 | 1999-12-24 | Fujitsu Ltd | Image extracting device |
JP2007316754A (en) * | 2006-05-23 | 2007-12-06 | Canon Inc | Handwritten character processing device and method |
CN101408937A (en) * | 2008-11-07 | 2009-04-15 | 东莞市微模式软件有限公司 | Method and apparatus for locating character row |
JP2017161969A (en) * | 2016-03-07 | 2017-09-14 | 日本電気株式会社 | Character recognition device, method, and program |
CN105809164A (en) * | 2016-03-11 | 2016-07-27 | 北京旷视科技有限公司 | Character identification method and device |
Non-Patent Citations (2)
Title |
---|
Liu Yang. Synthetically supervised feature learning for scene text recognition. 《Computer Vision - ECCV 2018》. 2018, entire document. * |
Liu, Yuliang. Deep matching prior network: toward tighter multi-oriented text detection. 《30th IEEE Conference on Computer Vision and Pattern Recognition》. 2017, entire document. * |
Also Published As
Publication number | Publication date |
---|---|
CN111199224A (en) | 2020-05-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20230077355A1 (en) | Tracker assisted image capture | |
CN111327945B (en) | Method and apparatus for segmenting video | |
US8965127B2 (en) | Method for segmenting text words in document images | |
US9852511B2 (en) | Systems and methods for tracking and detecting a target object | |
CN110996169B (en) | Method, device, electronic equipment and computer-readable storage medium for clipping video | |
CN111696130B (en) | Target tracking method, target tracking device, and computer-readable storage medium | |
US10460175B1 (en) | Deep learning processing of video | |
CN109635740B (en) | Video target detection method and device and image processing equipment | |
CN110781960B (en) | Training method, classification method, device and equipment of video classification model | |
KR102540393B1 (en) | Method for detecting and learning of objects simultaneous during vehicle driving | |
CN112381104A (en) | Image identification method and device, computer equipment and storage medium | |
KR102195940B1 (en) | System and Method for Detecting Deep Learning based Human Object using Adaptive Thresholding Method of Non Maximum Suppression | |
KR20220093187A (en) | Positioning method and apparatus, electronic device, computer readable storage medium | |
Zhang et al. | Bing++: A fast high quality object proposal generator at 100fps | |
CN113052019B (en) | Target tracking method and device, intelligent equipment and computer storage medium | |
CN114078108B (en) | Method and device for processing abnormal region in image, and method and device for dividing image | |
US20240185590A1 (en) | Method for training object detection model, object detection method and apparatus | |
CN115393625A (en) | Semi-supervised training of image segmentation from coarse markers | |
KR101705584B1 (en) | System of Facial Feature Point Descriptor for Face Alignment and Method thereof | |
CN111199224B (en) | Method and device for recognizing curved characters | |
CN113128215A (en) | Artificial intelligence emotion analysis method and system | |
CN110428373B (en) | Training sample processing method and system for video frame interpolation | |
CN108154521B (en) | Moving target detection method based on target block fusion | |
CN110555182A (en) | User portrait determination method and device and computer readable storage medium | |
CN112989869B (en) | Optimization method, device, equipment and storage medium of face quality detection model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||