CN110598690A - End-to-end optical character detection and identification method and system - Google Patents


Info

Publication number
CN110598690A
Authority
CN
China
Prior art keywords
region
text image
image
interest
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910707220.0A
Other languages
Chinese (zh)
Other versions
CN110598690B (en)
Inventor
蔡华
陈运文
王文广
纪达麒
马振宇
周炳诚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Daerguan Information Technology (shanghai) Co Ltd
Original Assignee
Daerguan Information Technology (shanghai) Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Daerguan Information Technology (shanghai) Co Ltd
Priority to CN201910707220.0A
Publication of CN110598690A
Application granted
Publication of CN110598690B
Active legal status
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/62Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/63Scene text, e.g. street names

Abstract

The invention discloses an end-to-end optical character detection and recognition method and system. The recognition method comprises: extracting image features to obtain a region of interest; classifying the region of interest to obtain angle information of its bounding box; segmenting the region of interest to obtain the contour information of the text image within it; dividing the text image into a plurality of polar-coordinate-based circles according to the angle information and the contour information, and adjusting the circles and the coordinates of their content so as to rectify the text image; and recognizing the rectified text image. The invention integrates an equivariant transformation realized by a transformation network, achieving accurate rectification of curved text regions.

Description

End-to-end optical character detection and identification method and system
Technical Field
The invention belongs to the field of character recognition, and particularly relates to an end-to-end optical character detection and recognition method and system.
Background
Traditional OCR methods split the task into two separate stages: a picture is input, text detection first locates the characters, and the detected text regions are then cropped and fed into a separate recognition network. This is relatively time-consuming, and the detection and recognition stages do not share features. A further disadvantage is that detection may not be accurate enough, which makes recognition harder; for example, the detected box may include large blank areas around the text.
Meanwhile, conventional OCR methods perform poorly on curved text. The difficulty is that affine transformation of a horizontal or quadrilateral detection box cannot accurately locate the character area: the characters occupy only a small part of such a box, most of which is background, and neither a horizontal nor an oblique detection box can rectify the text. As a result, recognition methods based on the convolutional recurrent neural network (CRNN) with long short-term memory (LSTM) perform poorly. Moreover, because convolutional neural networks (CNNs) used for image feature extraction are not designed with rotation invariance in mind, they are generally weak at extracting rotation-invariant features; a CNN can only acquire rotation invariance through data augmentation (artificially mirroring, rotating, scaling, etc. the samples).
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides an end-to-end optical character detection and recognition method and system.
In order to achieve the purpose, the invention adopts the following technical scheme:
an end-to-end optical character detection and recognition method, the recognition method comprising: extracting image features to obtain a region of interest; classifying the region of interest to obtain angle information of its bounding box; segmenting the region of interest to obtain the contour information of the text image within it; dividing the text image into a plurality of polar-coordinate-based circles according to the angle information and the contour information, and adjusting the circles and the coordinates of their content so as to rectify the text image; and recognizing the rectified text image.
Preferably, the extracting of image features comprises: inputting the image into a feature pyramid network to obtain a backbone feature map of the image; and inputting the backbone feature map into a region proposal network to obtain the region of interest.
Preferably, the classifying of the region of interest comprises: classifying the region of interest into a specific category, and performing regression on the bounding box of the region of interest.
Preferably, the segmenting of the region of interest comprises: deconvolving the region of interest to generate a mask of the text image.
Preferably, dividing the text image into a plurality of polar-coordinate-based circles according to the angle information and the contour information comprises: finding the center line of the text image based on the angle information and the contour information; drawing a first circle centered at one end of the center line; and drawing subsequent circles at predetermined intervals along the center line until the text image is completely covered by the areas the circles define.
Preferably, finding the center line of the text image comprises: selecting a point on a boundary of the text image; determining the tangent line through that point and then the perpendicular to the tangent through the same point; moving the point along the perpendicular until it is equidistant, along the perpendicular, from the two boundaries of the text image, at which position it lies on the center line; and fitting a plurality of such points to obtain the center line of the text image.
Preferably, recognizing the rectified text image employs a convolutional recurrent neural network.
An end-to-end optical character detection and recognition system, the recognition system comprising: an image feature extraction module, which extracts image features to obtain a region of interest; a classification module, connected to the image feature extraction module, which classifies the region of interest to obtain angle information of its bounding box; a segmentation module, connected to the image feature extraction module, which segments the region of interest to obtain the contour information of the text image within it; an equivariant transformation module, connected to the image feature extraction module, the classification module and the segmentation module, which divides the text image into a plurality of polar-coordinate-based circles according to the angle information and the contour information and adjusts the circles and the coordinates of their content so as to rectify the text image; and
a character recognition module, connected to the equivariant transformation module, which recognizes the rectified text image.
Compared with the prior art, the invention has the following beneficial effects:
1. an equivariant transformation module is fused into the intelligent recognition system, realizing accurate rectification of curved text regions;
2. the network is a multi-task learning structure covering element classification, text recognition and instance segmentation;
3. in the network structure, image pyramid features are extracted by a convolution module, enabling detection and recognition of text at different scales;
4. the system does not restrict the script, and is applicable to intelligent detection and recognition of text in any language;
5. the extracted image features are shared by the classification module, the segmentation module and the equivariant transformation module, so features are not extracted repeatedly and efficiency is improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings required for the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic overall structure diagram of an embodiment of the present invention.
Fig. 2 is a schematic diagram of a convolution network structure of the image feature extraction module.
Fig. 3 is a schematic diagram of a network structure of the classification module.
Fig. 4 is a schematic diagram of a network structure of the segmentation module.
FIG. 5 is a schematic diagram of the sliding, centering and equivariant transformation structure.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without inventive effort based on the embodiments of the present invention, are within the scope of the present invention.
In the description of the present invention, it is to be understood that the terms "longitudinal", "lateral", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", and the like, indicate orientations or positional relationships based on those shown in the drawings, and are used merely for convenience of description and for simplicity of description, and do not indicate or imply that the referenced devices or elements must have a particular orientation, be constructed in a particular orientation, and be operated, and thus, are not to be construed as limiting the present invention.
As shown in fig. 1, the present embodiment mainly includes an image feature extraction module, an element classification and instance segmentation module, an equivariant transformation module, and a character recognition module.
1 Image feature extraction module
As shown in fig. 2, the image feature extraction module provides shared image feature information for the entire system, thereby improving the calculation efficiency and the accuracy of the calculation result.
An image feature pyramid can be constructed with a feature pyramid network (FPN) from the output features of the convolutional network blocks. Targets of different sizes have different features: simple targets can be distinguished by shallow features, while complex objects require deep features. In fig. 2 the convolutional network is divided into 5 stages, whose outputs correspond to [C1, C2, C3, C4, C5]. The input image is passed through these deep convolutional stages, and a 1x1 convolution layer is then applied to [C1, C2, C3, C4, C5] to extract features from each convolutional block, yielding the image feature pyramid [P1, P2, P3, P4, P5]. The features of P5 are upsampled to the size of the 1x1-convolved features of C4, an element-wise addition is performed on the two, and the result becomes P4; a 3x3 convolution on P5 produces the features used as input to the region proposal network (RPN). The same operation is applied in turn to P4, P3 and P2, accumulating the processed low-level features with the high-level ones. The purpose of this accumulation is that low-level features provide more accurate position information, while the localization information of a deep network carries errors from repeated down-sampling and up-sampling; combining the two builds a deeper feature pyramid that fuses multi-level feature information and outputs it at different scales.
That is, the performance of the standard feature extraction pyramid is improved by adding a second pyramid, which takes high-level features from the first pyramid and passes them down to the lower layers. This process allows the features at every level to be combined with both higher- and lower-level features. The idea is to obtain strong semantic information, which improves detection performance, and to build the feature pyramid deeper so as to use more robust information.
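As an illustrative sketch only (not the patent's implementation), the top-down merge of the two pyramids can be expressed as follows; here `lateral` stands in for the 1x1 convolution and is assumed to already equalize channel counts, and nearest-neighbour repetition stands in for the upsampling:

```python
import numpy as np

def upsample2x(x):
    # Nearest-neighbour 2x upsampling of a (C, H, W) feature map.
    return x.repeat(2, axis=1).repeat(2, axis=2)

def build_fpn(c_feats, lateral):
    """Top-down FPN merge: the coarsest level is lateral(C_top);
    each finer level is lateral(C_i) + upsample(previous level)."""
    tops = [lateral(c_feats[-1])]
    for c in reversed(c_feats[:-1]):
        tops.append(lateral(c) + upsample2x(tops[-1]))
    return list(reversed(tops))  # finest level first, e.g. [P2, P3, P4, P5]
```

With identity laterals and all-ones inputs, each finer level accumulates one more top-down contribution, which makes the merge easy to verify.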
The RPN is a lightweight neural network that scans an image with a sliding window and finds regions containing targets. The regions scanned by the RPN are called anchors, corresponding to rectangles distributed over the image region. The sliding window is realized by the RPN's convolutions, and the RPN scans the backbone feature map rather than the image itself, which lets it efficiently reuse the extracted features and avoid duplicate computation. The RPN generates a number of anchors over the multi-scale feature maps [P1, P2, P3, P4, P5]; after non-maximum suppression (NMS), part of the RoIs (regions of interest) are retained. Because the feature maps [P1, P2, P3, P4, P5] have different strides, they are aligned separately and then concatenated, and the concatenated features are fed into the fully connected element classification, fully convolutional pixel segmentation, and equivariant transformation tasks.
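The anchor-filtering step can be illustrated with a plain NumPy non-maximum suppression routine (a minimal sketch; a real RPN also applies score thresholds and per-level anchor generation):

```python
import numpy as np

def iou(a, b):
    # Intersection-over-union of two boxes given as (x1, y1, x2, y2).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def nms(boxes, scores, thr=0.5):
    # Greedily keep the highest-scoring box, dropping overlaps above thr.
    keep = []
    for i in np.argsort(scores)[::-1]:
        if all(iou(boxes[i], boxes[j]) < thr for j in keep):
            keep.append(int(i))
    return keep
```

Two heavily overlapping anchors collapse to the higher-scoring one, while a distant anchor survives.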
2 Classification module
As shown in fig. 3, the RoI classifier performs classification and regresses a bounding box. It is deeper than the RPN and can classify regions into specific classes, whereas the RPN only distinguishes foreground from background. It also fine-tunes the bounding box, further refining its position and size to enclose the target.
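The box refinement can be sketched with the standard Faster R-CNN delta parameterization (an assumption here, since the patent does not spell out the regression targets):

```python
import numpy as np

def apply_box_deltas(box, deltas):
    """Refine an RoI (x1, y1, x2, y2) with predicted (dx, dy, dw, dh):
    the centre shifts by a fraction of the box size, width/height scale by exp()."""
    w, h = box[2] - box[0], box[3] - box[1]
    cx, cy = box[0] + 0.5 * w, box[1] + 0.5 * h
    cx, cy = cx + deltas[0] * w, cy + deltas[1] * h
    w, h = w * np.exp(deltas[2]), h * np.exp(deltas[3])
    return np.array([cx - 0.5 * w, cy - 0.5 * h, cx + 0.5 * w, cy + 0.5 * h])
```

Zero deltas leave the box unchanged; a small positive dx shifts it right by that fraction of its width.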
3 Segmentation module
The instance segmentation method can accurately detect characters and generate a mask of the character areas. The RoI feature region is deconvolved to obtain a text mask consistent with the size of the input picture.
4 Equivariant transformation module
As shown in fig. 5, in the transformation structure of this embodiment, the angle information of a text region can be obtained from the regression box of the continuous text region in the classification module. The center line of the text region is then found using the angle information and the segmented text region information, and according to the center line and the contour boundary of the text region, the text region can be flattened out horizontally. This fits text of any shape well, such as horizontal text, multi-directional text, and curved text.
This embodiment randomly selects one pixel as a starting point and centers it. The search then proceeds in two opposite directions, sliding and centering, until the ends are reached. This process generates two ordered sequences of points in opposite directions, which combine into a final central axis that follows the progression of the text and describes its shape accurately. In addition, the embodiment uses local geometric attributes to describe the structure of a text instance and converts a predicted curved text instance into a canonical form, which greatly reduces the work of the subsequent recognition stage.
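A much-simplified version of the centering step can be sketched as follows; it assumes roughly horizontal text and takes per-column midpoints of the binary mask, rather than the tangent/perpendicular search described above:

```python
import numpy as np

def centerline(mask):
    """For each column of a binary text mask, return (x, centre_y, radius):
    the midpoint between the top and bottom foreground pixels and half the local height."""
    pts = []
    for x in range(mask.shape[1]):
        ys = np.nonzero(mask[:, x])[0]
        if ys.size:
            pts.append((x, (ys[0] + ys[-1]) / 2.0, (ys[-1] - ys[0] + 1) / 2.0))
    return pts
```

For a horizontal band of foreground pixels, every column yields the same centre and radius, which is the degenerate (straight-text) case of the central axis.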
This canonical-form conversion describes text by a series of ordered, overlapping disks, each located on the central axis of the text region, with varying radius and direction. The geometric attributes of a text instance (e.g., center-line points, radius, direction) are estimated by a fully convolutional network (FCN), characterizing the text region as a series of ordered, overlapping disks, each crossed by the center line and carrying a variable radius r and direction θ. The network module can change shape to accommodate rotation, scaling and bending. Mathematically, a text instance t containing several characters can be viewed as a sequence s(t), a set of such disks. Each disk D carries a set of geometric attributes: r is defined as half the local width of t, and the direction θ is that of the tangent of the center line at the center point c. Thus the text region t can easily be reconstructed by computing the union of the disks in s(t). Note that the disks do not correspond one-to-one to the characters of the text instance. However, the geometric attributes of the disk sequence can rectify an irregular text instance into a horizontal rectangle that is friendlier to the text recognizer: an inscribed circle is first found at the boundary and then moved along the center line at small intervals while drawing inscribed circles; that is, the text in the text area is transformed to the horizontal direction through the inscribed circles, completing the equivariant transformation.
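Given the disk attributes along the centre line, the rectification to a horizontal strip can be sketched like this (a toy nearest-neighbour resampler assuming vertical disk cross-sections; the actual transformation would also use the direction θ):

```python
import numpy as np

def rectify(img, disks):
    """disks: list of (x, centre_y, radius) along the centre line.
    Resample each disk's vertical extent to a fixed height, column by column,
    producing a horizontal strip for the recognizer."""
    height = 2 * max(int(r) for _, _, r in disks) + 1
    out = np.zeros((height, len(disks)), dtype=img.dtype)
    for i, (x, cy, r) in enumerate(disks):
        ys = np.linspace(cy - r, cy + r, height).round().astype(int)
        ys = np.clip(ys, 0, img.shape[0] - 1)
        out[:, i] = img[ys, int(x)]
    return out
```

Each output column is one disk; a curved centre line therefore straightens into a constant-height band.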
In theory, suppose we have a pattern x that can be changed into other forms by some transformation T; call the transformed pattern T(x|w), where all transformation parameters w can be determined (learned) from the original pattern. Of course, this transformation is not known in advance. That is, what we study is learning either the transformation itself or a recognition model that is invariant to it. The common class of transformations to which a recognition model should be invariant is spatial transformations. Invariance to such transformations is typically hard-coded by using a convolutional neural network (CNN). A common technique for achieving equivariant recognition is to extend the training set with spatially transformed versions of the original images. Ideally, a machine learning system should be able to extrapolate beyond the range of parameter values seen in the training set.
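The augmentation technique mentioned above can be sketched minimally; each training sample is extended with its four 90-degree rotations (mirroring and scaling would be added similarly):

```python
import numpy as np

def augment_rotations(sample):
    # Return the sample together with its 90/180/270-degree rotations.
    return [np.rot90(sample, k) for k in range(4)]
```

Training on all four views is what lets a plain CNN behave approximately invariantly to those rotations without any architectural change.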
Thus, a conventional CNN cannot generalize the concept of rotation without additional means (not only inferring unseen rotation angles, but transferring recognition of encountered angles from one category to another). Characters are image patterns and can, to some extent, be recognized from shape features. Sliding a window over the picture naturally yields translation invariance, which is inherent in the network: as long as the same feature is present, it can be detected regardless of where it is. Rotation invariance concerns the invariance of the spatial structure among the small features inside a feature: if different objects have distinct, unique structures, a neural network that learns them should exhibit rotation invariance. Similarly, invariance to scaling, slight deformation, and the like should be learned.
5 Character recognition module
The network features produced by the equivariant transformation, together with the image features from the convolutional network, are input into the character recognition module to recognize the text. The main structure of this module is a convolutional recurrent neural network (CRNN), a combination of a deep convolutional neural network (CNN) and a recurrent neural network (RNN), which can learn directly from sequence labels to generate a sequence of class labels. The character recognition module in this embodiment comprises a bidirectional long short-term memory network (Bi-LSTM), a fully connected layer, and a connectionist temporal classification (CTC) decoder. The high-level features extracted by the convolution module are mapped into a time-major sequence and fed into the RNN for encoding. The Bi-LSTM captures long-range dependencies in the input sequence features. The hidden states computed at each time step in both directions are then summed and passed through a fully connected layer to obtain, for each state, a distribution over the character class set; finally, CTC converts the per-frame classification scores into a character label sequence, yielding the text recognition output.
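The final CTC step can be illustrated with a greedy (best-path) decoder; this is a sketch of the standard collapse rule (merge consecutive repeats, then drop blanks), not the beam-search decoders often used in practice:

```python
import numpy as np

def ctc_greedy_decode(frame_scores, blank=0):
    """frame_scores: (T, C) per-frame class scores.
    Take the argmax per frame, collapse consecutive repeats, drop blanks."""
    best = frame_scores.argmax(axis=1)
    labels, prev = [], blank
    for b in best:
        if b != prev and b != blank:
            labels.append(int(b))
        prev = b
    return labels
```

A blank frame between two identical labels keeps them distinct, which is how CTC represents doubled characters.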
Although the present invention has been described in detail with reference to the above embodiments, it will be understood by those skilled in the art that modifications or improvements may be made based on the disclosure of the present invention without departing from its spirit and scope, and such modifications and improvements fall within the spirit and scope of the invention.

Claims (8)

1. An end-to-end optical character detection and recognition method, characterized in that the recognition method comprises:
extracting image features to obtain a region of interest;
classifying the region of interest to obtain angle information of its bounding box;
segmenting the region of interest to obtain the contour information of the text image within it;
dividing the text image into a plurality of polar-coordinate-based circles according to the angle information and the contour information, and adjusting the circles and the coordinates of their content so as to rectify the text image; and
recognizing the rectified text image.
2. The end-to-end optical character detection and recognition method of claim 1, wherein the extracting of image features comprises:
inputting the image into a feature pyramid network to obtain a backbone feature map of the image; and
inputting the backbone feature map into a region proposal network to obtain the region of interest.
3. The end-to-end optical character detection and recognition method of claim 1, wherein the classifying of the region of interest comprises:
classifying the region of interest into a specific category, and performing regression on the bounding box of the region of interest.
4. The end-to-end optical character detection and recognition method of claim 1, wherein the segmenting of the region of interest comprises:
deconvolving the region of interest to generate a mask of the text image.
5. The end-to-end optical character detection and recognition method of claim 1, wherein dividing the text image into a plurality of polar-coordinate-based circles according to the angle information and the contour information comprises:
finding the center line of the text image based on the angle information and the contour information;
drawing a first circle centered at one end of the center line; and
drawing subsequent circles at predetermined intervals along the center line until the text image is completely covered by the areas the circles define.
6. The end-to-end optical character detection and recognition method of claim 5, wherein finding the center line of the text image comprises:
selecting a point on a boundary of the text image;
determining the tangent line through the point and then the perpendicular to the tangent through the point;
moving the point along the perpendicular until it is equidistant, along the perpendicular, from the two boundaries of the text image, the point then lying on the center line; and
fitting a plurality of such points to obtain the center line of the text image.
7. The end-to-end optical character detection and recognition method of claim 1, wherein the recognizing of the rectified text image uses a convolutional recurrent neural network to recognize the text image.
8. An end-to-end optical character detection and recognition system, the recognition system comprising: an image feature extraction module, which extracts image features to obtain a region of interest;
a classification module, connected to the image feature extraction module, which classifies the region of interest to obtain angle information of its bounding box;
a segmentation module, connected to the image feature extraction module, which segments the region of interest to obtain the contour information of the text image within it;
an equivariant transformation module, connected to the image feature extraction module, the classification module and the segmentation module, which divides the text image into a plurality of polar-coordinate-based circles according to the angle information and the contour information and adjusts the circles and the coordinates of their content so as to rectify the text image; and
a character recognition module, connected to the equivariant transformation module, which recognizes the rectified text image.
CN201910707220.0A 2019-08-01 2019-08-01 End-to-end optical character detection and recognition method and system Active CN110598690B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910707220.0A CN110598690B (en) 2019-08-01 2019-08-01 End-to-end optical character detection and recognition method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910707220.0A CN110598690B (en) 2019-08-01 2019-08-01 End-to-end optical character detection and recognition method and system

Publications (2)

Publication Number Publication Date
CN110598690A true CN110598690A (en) 2019-12-20
CN110598690B CN110598690B (en) 2023-04-28

Family

ID=68853317

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910707220.0A Active CN110598690B (en) 2019-08-01 2019-08-01 End-to-end optical character detection and recognition method and system

Country Status (1)

Country Link
CN (1) CN110598690B (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111275034A (en) * 2020-01-19 2020-06-12 世纪龙信息网络有限责任公司 Method, device, equipment and storage medium for extracting text region from image
CN111401357A (en) * 2020-02-12 2020-07-10 杭州电子科技大学 Pointer instrument reading method based on text detection
CN111507328A (en) * 2020-04-13 2020-08-07 北京爱咔咔信息技术有限公司 Text recognition and model training method, system, equipment and readable storage medium
CN111524106A (en) * 2020-04-13 2020-08-11 北京推想科技有限公司 Skull fracture detection and model training method, device, equipment and storage medium
CN111539438A (en) * 2020-04-28 2020-08-14 北京百度网讯科技有限公司 Text content identification method and device and electronic equipment
CN112036398A (en) * 2020-10-15 2020-12-04 北京一览群智数据科技有限责任公司 Text correction method and system
CN112115836A (en) * 2020-09-11 2020-12-22 北京金堤科技有限公司 Information verification method and device, computer readable storage medium and electronic equipment
CN113221890A (en) * 2021-05-25 2021-08-06 深圳市瑞驰信息技术有限公司 OCR-based cloud mobile phone text content supervision method and system
CN113743400A (en) * 2021-07-16 2021-12-03 华中科技大学 Electronic official document intelligent examination method and system based on deep learning
WO2022105521A1 (en) * 2020-11-20 2022-05-27 深圳壹账通智能科技有限公司 Character recognition method and apparatus for curved text image, and computer device
CN114842487A (en) * 2021-12-09 2022-08-02 上海鹑火信息技术有限公司 Method and system for identifying veronica characters

Citations (2)

Publication number Priority date Publication date Assignee Title
US20120033852A1 (en) * 2010-08-06 2012-02-09 Kennedy Michael B System and method to find the precise location of objects of interest in digital images
CN109447078A (en) * 2018-10-23 2019-03-08 四川大学 A kind of detection recognition method of natural scene image sensitivity text

Patent Citations (2)

Publication number Priority date Publication date Assignee Title
US20120033852A1 (en) * 2010-08-06 2012-02-09 Kennedy Michael B System and method to find the precise location of objects of interest in digital images
CN109447078A (en) * 2018-10-23 2019-03-08 四川大学 A kind of detection recognition method of natural scene image sensitivity text

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
张博等: "基于Mask R-CNN的ORB去误匹配方法", 《液晶与显示》 *

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111275034A (en) * 2020-01-19 2020-06-12 世纪龙信息网络有限责任公司 Method, device, equipment and storage medium for extracting text region from image
CN111275034B (en) * 2020-01-19 2023-09-12 天翼数字生活科技有限公司 Method, device, equipment and storage medium for extracting text region from image
CN111401357A (en) * 2020-02-12 2020-07-10 杭州电子科技大学 Pointer-type instrument reading method based on text detection
CN111401357B (en) * 2020-02-12 2023-09-15 杭州电子科技大学 Pointer-type instrument reading method based on text detection
CN111524106B (en) * 2020-04-13 2021-05-28 推想医疗科技股份有限公司 Skull fracture detection and model training method, device, equipment and storage medium
CN111507328A (en) * 2020-04-13 2020-08-07 北京爱咔咔信息技术有限公司 Text recognition and model training method, system, equipment and readable storage medium
CN111524106A (en) * 2020-04-13 2020-08-11 北京推想科技有限公司 Skull fracture detection and model training method, device, equipment and storage medium
CN111539438A (en) * 2020-04-28 2020-08-14 北京百度网讯科技有限公司 Text content identification method and device and electronic equipment
US11810384B2 (en) 2020-04-28 2023-11-07 Beijing Baidu Netcom Science Technology Co., Ltd. Method and apparatus for recognizing text content and electronic device
CN111539438B (en) * 2020-04-28 2024-01-12 北京百度网讯科技有限公司 Text content identification method and device and electronic equipment
CN112115836A (en) * 2020-09-11 2020-12-22 北京金堤科技有限公司 Information verification method and device, computer readable storage medium and electronic equipment
CN112036398A (en) * 2020-10-15 2020-12-04 北京一览群智数据科技有限责任公司 Text correction method and system
CN112036398B (en) * 2020-10-15 2024-02-23 北京一览群智数据科技有限责任公司 Text correction method and system
WO2022105521A1 (en) * 2020-11-20 2022-05-27 深圳壹账通智能科技有限公司 Character recognition method and apparatus for curved text image, and computer device
CN113221890A (en) * 2021-05-25 2021-08-06 深圳市瑞驰信息技术有限公司 OCR-based cloud mobile phone text content supervision method and system
CN113743400A (en) * 2021-07-16 2021-12-03 华中科技大学 Electronic official document intelligent examination method and system based on deep learning
CN113743400B (en) * 2021-07-16 2024-02-20 华中科技大学 Electronic official document intelligent examination method and system based on deep learning
CN114842487A (en) * 2021-12-09 2022-08-02 上海鹑火信息技术有限公司 Method and system for identifying veronica characters
CN114842487B (en) * 2021-12-09 2023-11-03 上海鹑火信息技术有限公司 Method and system for identifying veronica characters

Also Published As

Publication number Publication date
CN110598690B (en) 2023-04-28

Similar Documents

Publication Publication Date Title
CN110598690B (en) End-to-end optical character detection and recognition method and system
CN108304873B (en) Target detection method and system based on high-resolution optical satellite remote sensing image
Liao et al. Rotation-sensitive regression for oriented scene text detection
CN107341517B (en) Multi-scale small object detection method based on deep learning inter-level feature fusion
CN104751187B (en) Automatic image-based meter reading recognition method
US8655070B1 (en) Tree detection from aerial imagery
Das et al. Use of salient features for the design of a multistage framework to extract roads from high-resolution multispectral satellite images
CN102509091B (en) Airplane tail number recognition method
CN108090906B (en) Cervical image processing method and device based on region proposals
Alidoost et al. A CNN-based approach for automatic building detection and recognition of roof types using a single aerial image
Khan et al. Deep learning approaches to scene text detection: a comprehensive review
CN103049763B (en) Context-constraint-based target identification method
CN111091105A (en) Remote sensing image target detection method based on new frame regression loss function
CN111738055B (en) Multi-category text detection system and bill form detection method based on the same
Jiao et al. A survey of road feature extraction methods from raster maps
CN111914698A (en) Method and system for segmenting human body in image, electronic device and storage medium
CN112232371A (en) American license plate recognition method based on YOLOv3 and text recognition
CN112766184A (en) Remote sensing target detection method based on multi-level feature selection convolutional neural network
Zelener et al. Cnn-based object segmentation in urban lidar with missing points
CN110210415A (en) Vehicle-mounted laser point cloud road marking recognition method based on graph structure
Zhou et al. Building segmentation from airborne VHR images using Mask R-CNN
Mukhiddinov et al. Robust text recognition for Uzbek language in natural scene images
CN113989604A (en) Tire DOT information identification method based on end-to-end deep learning
CN109117841B (en) Scene text detection method based on stroke width transformation and convolutional neural network
Han et al. Accurate and robust vanishing point detection method in unstructured road scenes

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant