CN107301414B - Chinese positioning, segmenting and identifying method in natural scene image - Google Patents

Publication number
CN107301414B
Authority
CN
China
Prior art keywords
character
single character
candidate
region
segmentation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710484646.5A
Other languages
Chinese (zh)
Other versions
CN107301414A (en)
Inventor
陈凯
韦建
何建华
周异
黄征
杜保发
周文贵
查宏远
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen Shangji Network Technology Co ltd
Original Assignee
Xiamen Shangji Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen Shangji Network Technology Co ltd
Priority to CN201710484646.5A
Publication of CN107301414A
Application granted
Publication of CN107301414B
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/22Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/24Aligning, centring, orientation detection or correction of the image
    • G06V10/245Aligning, centring, orientation detection or correction of the image by locating a pattern; Special marks for positioning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Character Discrimination (AREA)

Abstract

The invention provides a method for positioning, segmenting and recognizing Chinese text in natural scene images. The method performs preliminary text positioning on an original picture with the FASText model and extracts candidate text regions, pre-segments the candidate text regions, then recognizes the single-character parts of the pre-segmented regions and further segments the field (multi-character) parts into single characters for recognition. By combining accurate extraction of character stroke features, the strong character recognition capability of a deep residual neural network, and a path tree method, the method achieves Chinese text positioning and recognition simply and effectively, and can be applied to a variety of natural scenes without supervised training.

Description

Chinese positioning, segmenting and identifying method in natural scene image
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to a Chinese positioning, segmenting and identifying method in a natural scene image.
Background
Text recognition in natural scenes is an important visual detection task: text in images carries much useful information and is important for understanding visual content. Related applications include the recognition of road signs, license plates, tickets, and the like.
Conventional OCR technology is generally affected by the complex backgrounds of natural scenes and has difficulty completing such tasks correctly. Overall, the task can be divided into two phases: text localization and text recognition. Text localization means precisely locating the text in the image, and mainly distinguishes text fields from the background by extracting character features such as MSERs (maximally stable extremal regions). In contrast to these traditional feature-based detection methods, methods that localize text by training a deep neural network have also appeared. However, such methods often require a large amount of manually labeled training data, and the trained model is difficult to extend directly to other application scenarios.
Disclosure of Invention
The invention aims to provide a simple and effective method for positioning, segmenting and recognizing Chinese text in natural scene images that can be extended to a wider range of scenes.
To achieve this purpose, the invention adopts the following technical scheme: a method for positioning, segmenting and recognizing Chinese text in natural scene images, comprising the following steps:
1) performing preliminary text positioning on an original picture with the FASText model, and extracting candidate text regions;
2) pre-segmenting the candidate text regions;
3) recognizing the single-character parts of the pre-segmented text regions, and further segmenting the field parts into single characters and recognizing them.
Further, the candidate text regions are extracted with the getCharSegmentation function of FASText.
Further, the specific process of the pre-segmentation in step 2) is as follows: the connected regions of the candidate text regions are labeled, the small connected regions (noise) are removed, the regions whose aspect ratio matches that of a Chinese character are treated as single characters and cut out directly, and the remaining connected regions are then taken out.
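The pre-segmentation rule above can be illustrated with a minimal sketch. It assumes the candidate text region is already available as a binary mask (1 for stroke pixels); the function names `connected_boxes` and `pre_segment`, the 8-connectivity, and the thresholds `min_pixels` and `ratio_tol` are hypothetical choices made for illustration, since the patent does not give concrete values.

```python
from collections import deque

def connected_boxes(mask):
    """Return (x0, y0, x1, y1, pixel_count) for each 8-connected foreground region."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            queue = deque([(x, y)])
            seen[y][x] = True
            x0 = x1 = x
            y0 = y1 = y
            count = 0
            while queue:
                cx, cy = queue.popleft()
                count += 1
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if 0 <= nx < w and 0 <= ny < h and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1, count))
    return boxes

def pre_segment(mask, min_pixels=4, ratio_tol=0.3):
    """Split connected regions into single-character boxes and field boxes.

    min_pixels and ratio_tol are assumed thresholds, not values from the patent.
    """
    singles, fields = [], []
    for x0, y0, x1, y1, count in connected_boxes(mask):
        if count < min_pixels:
            continue  # small connected region, treated as noise
        ratio = (x1 - x0 + 1) / (y1 - y0 + 1)
        box = (x0, y0, x1, y1)
        if abs(ratio - 1.0) <= ratio_tol:
            singles.append(box)   # roughly square: cut out as one character
        else:
            fields.append(box)    # needs further single-character segmentation
    return singles, fields

# Toy mask: one square region, one wide region, one isolated noise pixel.
mask = [[0] * 16 for _ in range(8)]
for y in range(4):
    for x in range(4):
        mask[y][x] = 1
    for x in range(6, 16):
        mask[y][x] = 1
mask[6][5] = 1

singles, fields = pre_segment(mask)
print(singles, fields)
```

In this toy mask the square region is kept as a single character, the wide region is passed on as a field, and the isolated pixel is discarded.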
Further, the specific process in step 3) of recognizing the single characters and further segmenting the field parts after pre-segmentation is as follows:
(1) training a deep residual neural network to obtain a single-character recognizer, ResNet;
(2) recognizing the single-character results obtained from pre-segmentation directly with the single-character recognizer;
(3) further segmenting the field results obtained from pre-segmentation into single characters: using FASText to obtain the region ranges of the candidate characters in the field picture, collecting the vertical lines of all the region ranges, and using them as the candidate segmentation line set for single-character segmentation, $L = \{l_1, l_2, \ldots, l_n\}$; then generating all candidate single-character segmentation schemes with a path tree method, wherein each path corresponds to one single-character segmentation scheme;
(4) for any single-character segmentation scheme (for example $S = (s_1, s_2, \ldots, s_m)$), recognizing it with the trained single-character recognizer ResNet, recording each single character and its recognition confidence ($p_1, p_2, \ldots, p_m$), and then taking the average, $\bar{p} = \frac{1}{m}\sum_{i=1}^{m} p_i$;
(5) selecting the single-character segmentation scheme with the highest average confidence as the optimal single-character segmentation scheme;
(6) taking the single-character recognition results corresponding to the optimal single-character segmentation scheme as the optimal field recognition scheme, and outputting the corresponding field recognition result.
Further, the rectangular boxes corresponding to the candidate single characters on each path in step (3) do not overlap one another and together cover all character strokes detected by FASText.
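The path tree search and the confidence-based selection described above can be sketched as follows. This is an illustrative reading, not the patented implementation: the field is reduced to sorted x-coordinates of candidate cut lines, every root-to-leaf path picks successive cut lines from the left edge to the right edge (so the resulting segments never overlap and cover the whole field), and `toy_recognizer` is a hypothetical stand-in for the trained ResNet recognizer. The names `enumerate_schemes`, `best_scheme`, and the pruning parameter `max_width` are assumptions introduced here.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Segment = Tuple[int, int]  # (left x, right x) of one candidate character

def enumerate_schemes(lines: Sequence[int], max_width: int) -> List[List[Segment]]:
    """Enumerate every path from the leftmost to the rightmost cut line."""
    cuts = sorted(set(lines))
    schemes: List[List[Segment]] = []
    if len(cuts) < 2:
        return schemes

    def extend(start: int, path: List[Segment]) -> None:
        if start == len(cuts) - 1:
            schemes.append(list(path))  # reached the right edge: one full scheme
            return
        for nxt in range(start + 1, len(cuts)):
            if cuts[nxt] - cuts[start] > max_width:
                break  # assumed pruning: a character cannot be wider than max_width
            path.append((cuts[start], cuts[nxt]))
            extend(nxt, path)
            path.pop()

    extend(0, [])
    return schemes

def best_scheme(
    lines: Sequence[int],
    recognize: Callable[[Segment], Tuple[str, float]],
    max_width: int,
) -> Optional[Tuple[float, List[Segment], str]]:
    """Return (average confidence, segments, text) of the highest-scoring scheme."""
    best = None
    for scheme in enumerate_schemes(lines, max_width):
        results = [recognize(seg) for seg in scheme]
        avg = sum(conf for _, conf in results) / len(results)
        if best is None or avg > best[0]:
            best = (avg, scheme, "".join(ch for ch, _ in results))
    return best

def toy_recognizer(seg: Segment) -> Tuple[str, float]:
    """Hypothetical recognizer: most confident for segments 10 pixels wide."""
    width = seg[1] - seg[0]
    return "X", 1.0 / (1.0 + abs(width - 10))

lines = [0, 4, 10, 20, 26, 30]
all_schemes = enumerate_schemes(lines, max_width=100)
best = best_scheme(lines, toy_recognizer, max_width=16)
print(len(all_schemes), best)
```

Without pruning, n cut lines yield 2^(n-2) schemes, so exhaustive scoring is practical only for short fields or with a width limit such as the assumed `max_width`.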
The invention uses FASText, a character stroke detector based on stroke features, to extract candidate single-character and field regions, and then provides a path tree method, built on the candidate single-character rectangular boxes, to generate candidate single-character segmentation schemes. For each scheme, the single-character recognizer ResNet, trained as a deep residual neural network, recognizes all the single characters of the scheme and records their recognition confidences; a field recognition confidence is computed for each scheme, and the scheme with the highest field recognition confidence is selected as the final single-character segmentation and recognition scheme. By combining accurate extraction of character stroke features, the strong character recognition capability of a deep residual neural network, and the path tree method, the method achieves Chinese text positioning and recognition simply and effectively, and can be applied to a variety of natural scenes without supervised training.
Compared with the prior art, the invention has the following beneficial effects. First, exploiting the pronounced stroke structure of Chinese characters, the FASText model detects the stroke parts of the text to achieve preliminary positioning of the text regions, which effectively eliminates the influence of background factors. Second, the extracted candidate regions contain both single characters and fields. For the field parts, the invention uses FASText to further separate the detected candidate field regions into single characters and uses a deep residual neural network to recognize the separated characters. The method thus integrates field separation with single-character recognition and finds the optimal scheme after trying all candidate separation schemes, which gives it higher robustness and accuracy.
Drawings
FIG. 1 is a flow chart of the present invention.
FIG. 2 is a diagram illustrating the Chinese text positioning and recognition results of an embodiment of the present invention.
Detailed Description
As shown in FIG. 1, the present embodiment provides a method for positioning, segmenting and recognizing Chinese text in natural scene images, which can be divided into the following steps:
1) performing preliminary text positioning on an original picture with the FASText model, and extracting candidate text regions;
2) pre-segmenting the candidate text regions;
3) recognizing the single-character parts of the pre-segmented text regions, and further segmenting the field parts into single characters and recognizing them.
As shown in FIG. 2, panel (a) is the original picture. Step 1 extracts the candidate image regions with the getCharSegmentation function of FASText; the extracted image is shown in panel (b). The pre-segmentation of step 2 labels the connected regions extracted in step 1 and removes the small connected regions (noise); regions whose aspect ratio matches that of a Chinese character (close to 1:1) are treated as single characters and cut out directly, and the remaining connected regions are then taken out (as shown in panel (c)). In step 3, the single-character results of pre-segmentation, such as "kou", "mao", "yi", "there", "limited", "official" and "department" in panel (c), can be recognized directly by the single-character recognizer, whereas the field results, such as "shanghai real in and out" in panel (c), must be further segmented into single characters. To do so, the region ranges (region boxes) of the candidate characters (Label Candidates) in the field picture are first obtained with FASText (as shown in panel (d)). The vertical lines of all the region ranges are then collected as the candidate segmentation line set for single-character segmentation, $L = \{l_1, l_2, \ldots, l_n\}$ (as shown in panel (e)). Next, the path tree method generates all candidate single-character segmentation schemes. For any scheme (for example $S = (s_1, s_2, \ldots, s_m)$), the confidence of each segmented region under the single-character recognizer ResNet ($p_1, p_2, \ldots, p_m$) is computed and then averaged, $\bar{p} = \frac{1}{m}\sum_{i=1}^{m} p_i$. Finally, the scheme with the highest average confidence among all segmentation schemes is selected as the final single-character segmentation scheme, and the field recognition result corresponding to that scheme is used as the final field recognition result.
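As a small illustration of how the candidate segmentation line set might be assembled, the sketch below assumes each FASText region box is given as `(x0, y0, x1, y1)` and reads "the vertical lines of a region range" as the left and right edges of its box. Both the box format and the helper name `candidate_cut_lines` are assumptions made for this example, and the box coordinates are made up.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), an assumed box format

def candidate_cut_lines(boxes: Sequence[Box]) -> List[int]:
    """Collect the distinct x-coordinates of the left and right box edges, sorted."""
    xs = set()
    for x0, _, x1, _ in boxes:
        xs.add(x0)
        xs.add(x1)
    return sorted(xs)

# Four overlapping candidate character boxes inside one field picture.
boxes = [(0, 0, 10, 12), (9, 1, 21, 12), (10, 0, 20, 12), (20, 0, 31, 12)]
cut_lines = candidate_cut_lines(boxes)

# With n cut lines, choosing any subset of the n - 2 interior lines
# gives one left-to-right segmentation, hence 2 ** (n - 2) candidate schemes.
num_schemes = 2 ** (len(cut_lines) - 2)
print(cut_lines, num_schemes)
```

For a scheme with three segments and recognizer confidences 0.9, 0.8 and 0.7, the averaged score would be (0.9 + 0.8 + 0.7) / 3 = 0.8; the scheme with the largest such average is kept.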
The present invention has been described in detail with reference to specific examples. These examples will help those skilled in the art to further understand the invention, but are not intended to limit the invention in any way. It should be noted that those skilled in the art can make variations and modifications without departing from the spirit of the invention, and all such variations and modifications fall within the scope of the present invention.

Claims (2)

1. A method for positioning, segmenting and recognizing Chinese text in natural scene images, characterized by comprising the following steps:
1) performing preliminary text positioning on an original picture with the FASText model, and extracting candidate text regions, wherein the candidate text regions are extracted with the getCharSegmentation function of FASText;
2) pre-segmenting the candidate text regions, wherein the specific process of the pre-segmentation is as follows: labeling the connected regions of the candidate text regions, removing the smaller connected regions, cutting out directly as single characters the regions whose aspect ratio matches that of a Chinese character, and taking out the remaining connected regions;
3) recognizing the single-character parts of the pre-segmented text regions, and further segmenting the field parts into single characters and recognizing them; the specific process of this step is as follows:
(1) training a deep residual neural network to obtain a single-character recognizer, ResNet;
(2) recognizing the single-character results obtained from pre-segmentation directly with the single-character recognizer;
(3) further segmenting the field parts into single characters: using FASText to obtain the region ranges of the candidate characters in the field picture, collecting the vertical lines of all the region ranges, and using them as the candidate segmentation line set for single-character segmentation; then generating all candidate single-character segmentation schemes with a path tree method, wherein each path corresponds to one single-character segmentation scheme;
(4) recognizing any single-character segmentation scheme with the trained single-character recognizer ResNet, recording each recognized single character and its recognition confidence, and then taking the average;
(5) selecting the single-character segmentation scheme with the highest average confidence as the optimal single-character segmentation scheme;
(6) taking the single-character recognition results corresponding to the optimal single-character segmentation scheme as the optimal field recognition result, and outputting the corresponding field recognition result.
2. The method of claim 1, wherein the rectangular boxes corresponding to the candidate single characters on each path in step (3) do not overlap one another and together cover all character strokes detected by FASText.
CN201710484646.5A 2017-06-23 2017-06-23 Chinese positioning, segmenting and identifying method in natural scene image Active CN107301414B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710484646.5A CN107301414B (en) 2017-06-23 2017-06-23 Chinese positioning, segmenting and identifying method in natural scene image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710484646.5A CN107301414B (en) 2017-06-23 2017-06-23 Chinese positioning, segmenting and identifying method in natural scene image

Publications (2)

Publication Number Publication Date
CN107301414A CN107301414A (en) 2017-10-27
CN107301414B CN107301414B (en) 2020-07-07

Family

ID=60135051

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710484646.5A Active CN107301414B (en) 2017-06-23 2017-06-23 Chinese positioning, segmenting and identifying method in natural scene image

Country Status (1)

Country Link
CN (1) CN107301414B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108458716B (en) * 2018-02-02 2023-04-18 北京交通大学 Electric vehicle charging navigation method based on charging pile dynamic occupancy rate prediction
CN108229463A (en) * 2018-02-07 2018-06-29 众安信息技术服务有限公司 Character recognition method based on image
CN110321760A (en) * 2018-03-29 2019-10-11 北京和缓医疗科技有限公司 A kind of medical document recognition methods and device
CN112789623B (en) * 2018-11-16 2024-08-16 北京比特大陆科技有限公司 Text detection method, device and storage medium
CN109615719B (en) * 2019-01-04 2021-04-20 天地协同科技有限公司 Freight vehicle non-stop charging system and method based on road safety monitoring system
CN109977878B (en) * 2019-03-28 2021-01-22 华南理工大学 Vehicle detection method based on heavily weighted Anchor
CN111062393B (en) * 2019-11-08 2021-12-17 西安理工大学 Natural scene Chinese character segmentation method based on spectral clustering
CN111488873B (en) * 2020-04-03 2023-10-24 中国科学院深圳先进技术研究院 Character level scene text detection method and device based on weak supervision learning

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101777124A (en) * 2010-01-29 2010-07-14 北京新岸线网络技术有限公司 Method for extracting video text message and device thereof

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
FASText: Efficient Unconstrained Scene Text Detector; Michal Bušta; Lukáš Neumann; Jiří Matas; 2015 IEEE International Conference on Computer Vision (ICCV); 2015-12-13; pp. 1206-1213 *

Also Published As

Publication number Publication date
CN107301414A (en) 2017-10-27

Similar Documents

Publication Publication Date Title
CN107301414B (en) Chinese positioning, segmenting and identifying method in natural scene image
Greenhalgh et al. Recognizing text-based traffic signs
Ye et al. Text detection and recognition in imagery: A survey
Neumann et al. Efficient scene text localization and recognition with local character refinement
Shahab et al. ICDAR 2011 robust reading competition challenge 2: Reading text in scene images
Kuettel et al. Figure-ground segmentation by transferring window masks
CN104299009A (en) Plate number character recognition method based on multi-feature fusion
CN112818951A (en) Ticket identification method
CN109325487B (en) Full-category license plate recognition method based on target detection
CN104063701B (en) Fast electric television stations TV station symbol recognition system and its implementation based on SURF words trees and template matches
CN106778777B (en) Vehicle matching method and system
Faustina Joan et al. A survey on text information extraction from born-digital and scene text images
CN110852327A (en) Image processing method, image processing device, electronic equipment and storage medium
CN102098449A (en) Method for realizing automatic inside segmentation of TV programs by utilizing mark detection
CN102521582B (en) Human upper body detection and splitting method applied to low-contrast video
CN113743389B (en) Facial expression recognition method and device and electronic equipment
Karanje et al. Survey on text detection, segmentation and recognition from a natural scene images
CN114926635B (en) Target segmentation method in multi-focus image combined with deep learning method
Rakhmatillaevich et al. A novel method for extracting text from naturalscene images and TTS
TWI430187B (en) License plate number identification method
Devi et al. Text extraction from images using gamma correction method and different text extraction methods—A comparative analysis
Huang A novel video text extraction approach based on Log-Gabor filters
Chen et al. Text detection in traffic informatory signs using synthetic data
CN111027399B (en) Remote sensing image water surface submarine recognition method based on deep learning
Rani et al. Object Detection in Natural Scene Images Using Thresholding Techniques

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 361000 room 3008, Xuan Ye Lou, Pioneer Park, Xiamen torch high tech Zone, Xiamen, Fujian

Applicant after: Xiamen Shang Ji Network Technology Co., Ltd.

Address before: 361000 room 3008, Xuan Ye Lou, Pioneer Park, Xiamen torch high tech Zone, Xiamen, Fujian

Applicant before: Xiamen Business Consulting Co., Ltd.

GR01 Patent grant