CN110598690B - End-to-end optical character detection and recognition method and system - Google Patents

End-to-end optical character detection and recognition method and system

Info

Publication number
CN110598690B
Authority
CN
China
Prior art keywords
text image
region
image
interest
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910707220.0A
Other languages
Chinese (zh)
Other versions
CN110598690A (en)
Inventor
蔡华
陈运文
王文广
纪达麒
马振宇
周炳诚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Datagrand Information Technology Shanghai Co ltd
Original Assignee
Datagrand Information Technology Shanghai Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Datagrand Information Technology Shanghai Co ltd filed Critical Datagrand Information Technology Shanghai Co ltd
Priority to CN201910707220.0A
Publication of CN110598690A
Application granted
Publication of CN110598690B
Legal status: Active

Classifications

    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06V10/267: Segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
    • G06V10/40: Extraction of image or video features
    • G06V20/63: Scene text, e.g. street names

Abstract

The invention discloses an end-to-end optical character detection and recognition method and system. The recognition method comprises the following steps: extracting image features to obtain a region of interest; classifying the region of interest to obtain the angle information of its bounding box; segmenting the region of interest to obtain the contour information of the text image within it; dividing the text image into a plurality of polar-coordinate-based circles according to the angle information and the contour information, and adjusting the coordinates of the circles and the content they delineate so as to rectify the text image; and recognizing the rectified text image. The invention integrates an iso-deformation transformation network, realizing accurate rectification of curved text regions.

Description

End-to-end optical character detection and recognition method and system
Technical Field
The invention belongs to the field of character recognition, and particularly relates to an end-to-end optical character detection and recognition method and system.
Background
Traditional OCR methods split text detection and text recognition into two separate stages: an input picture first passes through a detector that localizes the text, and the detected text regions are then cropped out and sent to a recognition network. This is time-consuming, and the detection and recognition stages cannot share features. A further drawback is that the detected boxes may be insufficiently accurate, which hampers recognition, for example when blank areas around the text edges are included in the box.
Meanwhile, existing OCR methods perform poorly on curved text. The difficulty is that a horizontal or quadrilateral detection box, even after affine transformation, cannot localize the text region accurately: the text occupies only a small part of such a box, most of which is background, and a horizontal or slanted box cannot unwarp the text, so recognition methods such as the convolutional recurrent neural network (CRNN) based on long short-term memory (LSTM) are not effective. Moreover, since convolutional neural networks (CNNs) for image feature extraction are not designed with rotation invariance in mind, the ability of a CNN to extract rotation-invariant features is generally weak; a CNN can only learn rotation invariance through data augmentation (manually mirroring, rotating, scaling the samples, and so on).
Disclosure of Invention
To address the problems in the prior art, the invention provides an end-to-end optical character detection and recognition method and system.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
an end-to-end optical character detection recognition method, the recognition method comprising: extracting image features to obtain a region of interest; classifying the region of interest to obtain angle information of a frame of the region of interest; segmenting an interested region to obtain text image contour information in the region; dividing the text image into a plurality of circles based on polar coordinates based on the angle information and the text image contour information, and adjusting the coordinates of the circles and the delineating content so as to trim the text image; and identifying the trimmed text image.
Preferably, extracting the image features comprises: inputting the image into a feature pyramid network to obtain a backbone feature map of the image; and inputting the backbone feature map into a region proposal network to obtain the region of interest.
Preferably, classifying the region of interest comprises: classifying the region of interest into a specific category and performing regression on its bounding box.
Preferably, segmenting the region of interest comprises: deconvolving the region of interest to generate a mask of the text image.
Preferably, dividing the text image into a plurality of polar-coordinate-based circles according to the angle information and the text image contour information comprises: finding the centerline of the text image based on the angle information and the contour information; drawing a first circle centered at one end of the centerline; and drawing subsequent circles at predetermined intervals along the centerline until the text image is entirely covered by the areas the circles delineate.
Preferably, finding the centerline of the text image comprises: selecting a point on the boundary of the text image; determining the tangent line through the point and then the normal line through the point perpendicular to the tangent; moving the point along the normal into the interior of the text image until it is equidistant from the two sides of the text image boundary, at which position the point lies on the centerline; and fitting a plurality of such points to obtain the centerline of the text image.
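As a rough illustration of the two preferred steps above, the sketch below (Python/NumPy) approximates the centerline of a near-horizontal binary text mask by the per-column midpoint of the text pixels, then places circle centers at fixed intervals along it. This is a simplified stand-in for the tangent-and-normal construction described in the patent; the function names and the fixed radius are assumptions for exposition.

```python
import numpy as np

def approximate_centerline(mask):
    """Approximate the centerline of a binary text mask (H, W) by the midpoint
    of the text pixels in each column, i.e. the point equidistant from the two
    sides of the boundary. Valid for near-horizontal text only."""
    points = []
    for x in range(mask.shape[1]):
        ys = np.flatnonzero(mask[:, x])
        if ys.size:                                   # column contains text pixels
            points.append((x, (ys[0] + ys[-1]) / 2.0))
    return np.array(points)                           # ordered (x, y) samples

def place_circles(centerline, radius, step):
    """Place circle centers at fixed arc-length intervals along the centerline,
    mirroring 'drawing subsequent circles at predetermined intervals'."""
    centers, travelled = [centerline[0]], 0.0
    for prev, cur in zip(centerline[:-1], centerline[1:]):
        travelled += float(np.hypot(*(cur - prev)))   # arc length covered so far
        if travelled >= step:
            centers.append(cur)
            travelled = 0.0
    return [(tuple(c), radius) for c in centers]      # (center, radius) circles
```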
Preferably, the rectified text image is recognized using a convolutional recurrent neural network.
An end-to-end optical character detection and recognition system, the recognition system comprising: an image feature extraction module, which extracts image features to obtain a region of interest; a classification module, connected to the image feature extraction module, which classifies the region of interest to obtain the angle information of its bounding box; a segmentation module, connected to the image feature extraction module, which segments the region of interest to obtain the contour information of the text image within it; an iso-deformation transformation module, connected to the image feature extraction module, the classification module and the segmentation module, which divides the text image into a plurality of polar-coordinate-based circles according to the angle information and the contour information and adjusts the coordinates of the circles and the content they delineate so as to rectify the text image; and
a character recognition module, connected to the iso-deformation transformation module, which recognizes the rectified text image.
Compared with the prior art, the invention has the following beneficial effects:
1. the recognition system integrates an iso-deformation transformation module, realizing accurate rectification of curved text regions;
2. the network is a multi-task learning structure, jointly performing element classification, text recognition and instance segmentation;
3. the network structure extracts image pyramid features through a convolution module, enabling detection and recognition of text at different scales;
4. the system places no restriction on the script, and is suitable for detection and recognition of text in all languages;
5. the extracted image features are shared by the classification module, the segmentation module, the iso-deformation transformation module and so on; features are not extracted repeatedly, which improves efficiency.
Drawings
In order to more clearly illustrate the embodiments of the invention and the technical solutions in the prior art, the drawings required by the embodiments are briefly described below. It is apparent that the following drawings show only some embodiments of the invention, and that a person skilled in the art could obtain other drawings from them without inventive effort.
Fig. 1 is a schematic diagram of the overall structure of an embodiment of the present invention.
Fig. 2 is a schematic diagram of a convolutional network structure of the image feature extraction module.
Fig. 3 is a schematic diagram of a network structure of the classification module.
Fig. 4 is a schematic diagram of a network structure of the segmentation module.
Fig. 5 is a schematic diagram of the sliding, centering and iso-deformation transformation.
Detailed Description
The embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by those skilled in the art based on these embodiments without inventive effort fall within the scope of the invention.
In the description of the present invention, it should be understood that the terms "longitudinal," "transverse," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," and the like indicate orientations or positional relationships based on the orientation or positional relationships shown in the drawings, merely to facilitate describing the present invention and simplify the description, and do not indicate or imply that the devices or elements referred to must have a specific orientation, be configured and operated in a specific orientation, and therefore should not be construed as limiting the present invention.
As shown in fig. 1, the embodiment mainly comprises an image feature extraction module, an element classification and instance segmentation module, an iso-deformation transformation module and a character recognition module.
1 Image feature extraction module
As shown in fig. 2, the image feature extraction module provides shared image feature information for the whole system, improving both computational efficiency and the accuracy of the results.
An image feature pyramid can be constructed by a feature pyramid network (FPN) from the output features of the convolutional network blocks. Objects of different sizes have different features: simple objects can be distinguished by shallow features, while complex objects require deep features. In fig. 2 the convolutional network is divided into 5 stages whose outputs correspond to [C1, C2, C3, C4, C5]. Deep convolution is performed on the input image, and a 1×1 convolution layer is then applied to [C1, C2, C3, C4, C5] to extract the features of each convolution block, yielding the image feature pyramid [P1, P2, P3, P4, P5]. The features of P5 are upsampled so that their size matches the features of C4 after 1×1 convolution, the two are added element-wise, and the result is taken as P4; meanwhile, P5 is passed through a 3×3 convolution to obtain the features used as input to the region proposal network (RPN). The same operation is applied in turn to P4, P3 and P2, accumulating the processed low-level and high-level features. The purpose of this is that low-level features provide more accurate position information, whereas the localization information of a deep network carries errors introduced by repeated downsampling and upsampling; combining the two builds a deeper feature pyramid that fuses multiple levels of feature information and outputs distinct features. In other words, the performance of the standard feature-extraction pyramid is enhanced by adding a second pyramid that selects high-level features from the first and passes them down to the lower layers, allowing the features of each stage to be combined with both high-level and low-level features. The idea behind this is to obtain strong semantic information, which improves detection performance, while constructing the feature pyramid with deeper layers exploits more robust information.
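As a concrete illustration of the upsample-and-add pathway just described, the following is a minimal PyTorch sketch; the backbone channel counts and the choice of nearest-neighbor upsampling are assumptions for exposition, not details fixed by the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleFPN(nn.Module):
    """Minimal FPN top-down pathway: 1x1 lateral convolutions, upsample-and-add,
    then 3x3 smoothing convolutions. Channel counts are illustrative assumptions."""
    def __init__(self, in_channels=(256, 512, 1024, 2048), out_channels=256):
        super().__init__()
        self.lateral = nn.ModuleList(
            nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels)
        self.smooth = nn.ModuleList(
            nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1)
            for _ in in_channels)

    def forward(self, feats):  # feats = [C2, C3, C4, C5], fine to coarse
        laterals = [lat(f) for lat, f in zip(self.lateral, feats)]
        outs = [laterals[-1]]                         # start from the deepest level
        for lvl in range(len(laterals) - 2, -1, -1):  # top-down: upsample and add
            up = F.interpolate(outs[0], size=laterals[lvl].shape[-2:], mode="nearest")
            outs.insert(0, laterals[lvl] + up)        # element-wise addition
        return [sm(o) for sm, o in zip(self.smooth, outs)]  # [P2, P3, P4, P5]

# Example: four backbone maps at strides 4/8/16/32 yield four 256-channel levels.
fpn = SimpleFPN()
pyramid = fpn([torch.randn(1, 256, 64, 64), torch.randn(1, 512, 32, 32),
               torch.randn(1, 1024, 16, 16), torch.randn(1, 2048, 8, 8)])
```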
The RPN (region proposal network) is a lightweight neural network that scans the image with a sliding window and finds the regions where objects are present. The regions scanned by the RPN are called anchors and correspond to rectangles distributed over the image region. The sliding window is realized by the convolutional structure of the RPN, which does not scan the image directly but scans the backbone feature maps; this allows the RPN to effectively reuse the extracted features and avoid repeated computation. The RPN generates a number of anchor boxes from the feature maps [P1, P2, P3, P4, P5] of different scales, and a subset of RoIs (regions of interest) is retained after NMS (non-maximum suppression). Because the feature maps [P1, P2, P3, P4, P5] have different strides, they are aligned separately, then concatenated and fed into the fully connected element classification task, the fully convolutional pixel segmentation task and the iso-deformation transformation task.
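The NMS step that prunes the anchor boxes can be sketched as the standard greedy IoU suppression below (NumPy). The patent does not spell out its NMS variant, so this is the textbook form under that assumption.

```python
import numpy as np

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression over [x1, y1, x2, y2] boxes."""
    order = scores.argsort()[::-1]               # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # Intersection of the top box with all remaining boxes
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = ((boxes[order[1:], 2] - boxes[order[1:], 0]) *
                 (boxes[order[1:], 3] - boxes[order[1:], 1]))
        iou = inter / (area_i + areas - inter)   # IoU with the kept box
        order = order[1:][iou <= iou_threshold]  # suppress heavy overlaps
    return keep
```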
2 Classification module
As shown in fig. 3, the RoI classifier performs classification, and regression yields a bounding box. Unlike the RPN, which can only distinguish two categories (foreground and background), this network is deeper and can classify a region into a specific category. At the same time the bounding box is fine-tuned, further adjusting its position and size so as to enclose the target.
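A head of this kind might look as follows (a PyTorch sketch; the hidden sizes, the 7×7 RoI feature shape, and exposing the border angle as a separate regression output are assumptions made for illustration, not the patent's exact configuration).

```python
import torch.nn as nn

class RoIClassifierHead(nn.Module):
    """Per-RoI head producing class scores, box refinement and a border angle.
    Feature size and class count are illustrative assumptions."""
    def __init__(self, in_features=256 * 7 * 7, num_classes=2):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(in_features, 1024), nn.ReLU(inplace=True),
            nn.Linear(1024, 1024), nn.ReLU(inplace=True))
        self.cls_score = nn.Linear(1024, num_classes)      # specific category
        self.bbox_pred = nn.Linear(1024, num_classes * 4)  # box refinement
        self.angle_pred = nn.Linear(1024, 1)               # border angle

    def forward(self, roi_feat):  # roi_feat: (N, 256*7*7) flattened RoI features
        x = self.fc(roi_feat)
        return self.cls_score(x), self.bbox_pred(x), self.angle_pred(x)
```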
3 Segmentation module
Text can be accurately detected using the instance segmentation method, which generates a mask of the text region. Deconvolution is applied to the RoI feature region to obtain a mask region consistent with the size of the input picture, from which the text is obtained.
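A minimal sketch of such a deconvolution mask head is shown below (PyTorch); the depth, channel counts and 14×14 RoI input size are assumptions, with the final resize back to the input resolution left to a subsequent step.

```python
import torch.nn as nn

class MaskHead(nn.Module):
    """Upsamples RoI features with a transposed convolution and predicts a
    per-pixel text mask; depth and channel counts are illustrative assumptions."""
    def __init__(self, in_channels=256):
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv2d(in_channels, 256, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, 3, padding=1), nn.ReLU(inplace=True))
        self.deconv = nn.ConvTranspose2d(256, 256, kernel_size=2, stride=2)
        self.mask_logits = nn.Conv2d(256, 1, kernel_size=1)  # single text class

    def forward(self, roi_feat):  # roi_feat: (N, 256, 14, 14)
        x = self.convs(roi_feat)
        x = self.deconv(x).relu()     # 14x14 -> 28x28 via deconvolution
        return self.mask_logits(x)    # per-pixel text/background logits
```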
4 Iso-deformation transformation module
As shown in fig. 5, in the transformation structure of this embodiment, the angle information of a continuous text region can be obtained from the regression box in the classification module. The angle information and the segmented text region information are then used to find the centerline of the text region; based on this centerline and the contour boundary of the text region, the text region can be unrolled horizontally. Text of any shape, such as horizontal text, multidirectional text and curved text, can be fitted well.
This embodiment randomly selects one pixel as a starting point and centers it. The search process then proceeds in two opposite directions, sliding and centering, until the ends are reached. This process generates two ordered sequences of points in opposite directions, which can be combined into a final central axis that follows the course of the text and accurately describes its shape. In addition, the embodiment uses local geometric attributes to describe the structure of a text instance and converts a predicted curved text instance into a canonical form, which greatly lightens the work of the subsequent recognition stage.
This transformation to canonical form describes the text by a series of ordered, overlapping disks, each located on the central axis of the text region with a variable radius and direction. The geometric properties of a text instance (central axis, radius, direction) are estimated by a fully convolutional network (FCN), characterizing a text region as a series of ordered, overlapping disks, each crossed by the centerline and having a variable radius r and direction θ. The module can change shape to accommodate different variations such as rotation, scaling and bending. Mathematically, a text instance t containing several characters can be viewed as a sequence S(t), the set of such disks. Each disk D has a set of geometric properties: r is defined as half the local width of t, and the direction θ is the tangent of the centerline at the center point c. The text region t can thus easily be reconstructed by computing the union of the overlapping disks in S(t). Note that the disks do not correspond one-to-one to the characters of the text instance; rather, the geometric properties of the disk sequence allow an irregular text instance to be corrected and converted into a horizontal rectangle friendlier to the text recognizer. First an inscribed circle is found at the boundary, then the circle moves slowly at small intervals along the centerline, drawing inscribed circles as it goes; that is, the text in the text region is transformed to the horizontal direction via the inscribed circles, completing the iso-deformation transformation.
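The disk-by-disk straightening can be sketched as follows: each disk contributes one column of the output strip, sampled along the normal to the centerline at its center with bilinear interpolation (SciPy). The grayscale input, the output height and the one-column-per-disk sampling density are assumptions for exposition.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def rectify_with_disks(image, centers, radii, thetas, out_h=32):
    """Straighten a curved text region described by ordered disks.
    Each disk (center c, radius r, direction theta) yields one output column,
    sampled along the normal of the centerline at c."""
    columns = []
    for (cx, cy), r, th in zip(centers, radii, thetas):
        n = np.array([-np.sin(th), np.cos(th)])  # unit normal to the tangent
        t = np.linspace(-1.0, 1.0, out_h)        # span the local width 2r
        xs = cx + t * r * n[0]
        ys = cy + t * r * n[1]
        # Bilinear sampling of the grayscale image at the (row, col) positions
        columns.append(map_coordinates(image.astype(float), [ys, xs], order=1))
    return np.stack(columns, axis=1)             # (out_h, num_disks) horizontal strip
```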
Theoretically, suppose there is a pattern x that can be transformed into other forms by some transformation T; the transformed pattern is denoted T(x|w), and in principle all the transformation parameters w can be determined (learned) from the original pattern. Of course, the transformation itself is not known. That is, what is studied is either learning the transformation itself or learning a recognition model that is invariant to it. For common transformations, such as spatial transformations, the recognition model should be invariant. Invariance to transformations is typically hard-coded by using convolutional neural networks (CNNs), and a common technique for achieving transformation-invariant recognition is to augment the training set with spatially transformed versions of the original images. Ideally, a machine learning system should be able to extrapolate beyond the range of parameter values seen in the training set.
Thus a conventional CNN cannot generalize the concept of rotation without additional means (not merely inferring an unseen rotation angle, but transferring the recognition ability for an encountered angle from one category to another). Text is made of characters and can be recognized from shape features to a certain degree. When a window slides over the picture, the same features are detected wherever they appear, so translation invariance is a property the network has by itself. Rotation invariance, by contrast, is the invariance of the spatial structure among the small features inside a feature: different objects have different characteristic structures, and the neural network must learn them in order to possess rotation invariance. Likewise, invariances such as scaling and slight deformation have to be learned.
5 Character recognition module
The network features after the iso-deformation transformation, together with the image features obtained by the convolutional network, are fed into the text recognition module to recognize the text. The main structure of this module is a convolutional recurrent neural network (CRNN), a combination of a deep convolutional neural network (CNN) and a recurrent neural network (RNN) that can learn directly from sequence labels and produce a sequence of labels. The character recognition module in this embodiment comprises a bidirectional long short-term memory network (Bi-LSTM), a fully connected layer and a connectionist temporal classification (CTC) decoder. The high-order features extracted by the preceding convolution module are mapped into a time-major sequence and fed into the RNN for encoding; the Bi-LSTM captures the long-range dependencies of the input sequence features. The hidden states computed at each time step in the two directions are then summed and passed through a fully connected layer to obtain, for each state, a distribution over the character class set; finally, CTC converts the per-frame classification scores into a character label sequence, yielding the text recognition output.
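A compact sketch of such a recognition head is given below (PyTorch). One deliberate simplification is hedged here: the description sums the two directional hidden states, whereas nn.LSTM's bidirectional output concatenates them, so this sketch feeds the concatenated states to the fully connected layer; the layer sizes, the 37-class vocabulary and the greedy CTC decoding are likewise illustrative assumptions. During training, the (N, W, classes) log-probabilities would go to nn.CTCLoss.

```python
import torch
import torch.nn as nn

class CRNNHead(nn.Module):
    """Convolutional features -> Bi-LSTM -> per-timestep class scores for CTC.
    Sizes and vocabulary are illustrative assumptions."""
    def __init__(self, feat_channels=256, hidden=256, num_classes=37):
        super().__init__()
        self.rnn = nn.LSTM(feat_channels, hidden, num_layers=2,
                           bidirectional=True, batch_first=True)
        self.fc = nn.Linear(2 * hidden, num_classes)  # concatenated fwd/bwd states

    def forward(self, feat):  # feat: (N, C, 1, W) after pooling height to 1
        seq = feat.squeeze(2).permute(0, 2, 1)    # -> (N, W, C), width as time
        out, _ = self.rnn(seq)                    # Bi-LSTM over the width steps
        return self.fc(out).log_softmax(-1)       # (N, W, classes) for CTC

def greedy_ctc_decode(log_probs, blank=0):
    """Collapse repeats and remove blanks: the simplest CTC decoding."""
    best = log_probs.argmax(-1)                   # (N, W) best class per step
    decoded = []
    for seq in best.tolist():
        out, prev = [], blank
        for s in seq:
            if s != prev and s != blank:          # new non-blank symbol
                out.append(s)
            prev = s
        decoded.append(out)
    return decoded
```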
While the foregoing embodiments describe the present invention in detail, it will be apparent to one skilled in the art that modifications and improvements can be made on the basis of this disclosure without departing from the spirit and scope of the invention.

Claims (6)

1. An end-to-end optical character detection and recognition method, comprising:
extracting image features to obtain a region of interest;
classifying the region of interest to obtain the angle information of its bounding box;
segmenting the region of interest to obtain the contour information of the text image within it;
dividing the text image into a plurality of polar-coordinate-based circles according to the angle information and the text image contour information, comprising: finding the centerline of the text image based on the angle information and the contour information; drawing a first circle centered at one end of the centerline; and drawing subsequent circles at predetermined intervals along the centerline until the text image is entirely covered by the areas the circles delineate; wherein finding the centerline of the text image comprises: selecting a point on the boundary of the text image; determining the tangent line through the point and then the normal line through the point perpendicular to the tangent; moving the point along the normal into the interior of the text image until it is equidistant from the two sides of the text image boundary, at which position the point lies on the centerline; and fitting a plurality of such points to obtain the centerline of the text image;
adjusting the coordinates of the circles and the content they delineate so as to rectify the text image;
and recognizing the rectified text image.
2. The end-to-end optical character detection and recognition method according to claim 1, wherein extracting the image features comprises:
inputting the image into a feature pyramid network to obtain a backbone feature map of the image;
and inputting the backbone feature map into a region proposal network to obtain the region of interest.
3. The end-to-end optical character detection and recognition method according to claim 1, wherein classifying the region of interest comprises:
classifying the region of interest into a specific category and performing regression on its bounding box.
4. The end-to-end optical character detection and recognition method according to claim 1, wherein segmenting the region of interest comprises:
deconvolving the region of interest to generate a mask of the text image.
5. The end-to-end optical character detection and recognition method according to claim 1, wherein the rectified text image is recognized using a convolutional recurrent neural network.
6. An end-to-end optical character detection and recognition system, the recognition system comprising:
an image feature extraction module, which extracts image features to obtain a region of interest;
a classification module, connected to the image feature extraction module, which classifies the region of interest to obtain the angle information of its bounding box;
a segmentation module, connected to the image feature extraction module, which segments the region of interest to obtain the contour information of the text image within it;
an iso-deformation transformation module, connected to the image feature extraction module, the classification module and the segmentation module, which divides the text image into a plurality of polar-coordinate-based circles according to the angle information and the text image contour information and adjusts the coordinates of the circles and the content they delineate so as to rectify the text image, wherein dividing the text image into a plurality of polar-coordinate-based circles comprises: finding the centerline of the text image based on the angle information and the contour information; drawing a first circle centered at one end of the centerline; and drawing subsequent circles at predetermined intervals along the centerline until the text image is entirely covered by the areas the circles delineate; and wherein finding the centerline of the text image comprises: selecting a point on the boundary of the text image; determining the tangent line through the point and then the normal line through the point perpendicular to the tangent; moving the point along the normal into the interior of the text image until it is equidistant from the two sides of the text image boundary, at which position the point lies on the centerline; and fitting a plurality of such points to obtain the centerline of the text image; and
a character recognition module, connected to the iso-deformation transformation module, which recognizes the rectified text image.
CN201910707220.0A 2019-08-01 2019-08-01 End-to-end optical character detection and recognition method and system Active CN110598690B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910707220.0A CN110598690B (en) 2019-08-01 2019-08-01 End-to-end optical character detection and recognition method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910707220.0A CN110598690B (en) 2019-08-01 2019-08-01 End-to-end optical character detection and recognition method and system

Publications (2)

Publication Number Publication Date
CN110598690A CN110598690A (en) 2019-12-20
CN110598690B 2023-04-28

Family

ID=68853317

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910707220.0A Active CN110598690B (en) 2019-08-01 2019-08-01 End-to-end optical character detection and recognition method and system

Country Status (1)

Country Link
CN (1) CN110598690B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111275034B (en) * 2020-01-19 2023-09-12 天翼数字生活科技有限公司 Method, device, equipment and storage medium for extracting text region from image
CN111401357B (en) * 2020-02-12 2023-09-15 杭州电子科技大学 Pointer type instrument reading method based on text detection
CN111507328A (en) * 2020-04-13 2020-08-07 北京爱咔咔信息技术有限公司 Text recognition and model training method, system, equipment and readable storage medium
CN111524106B (en) * 2020-04-13 2021-05-28 推想医疗科技股份有限公司 Skull fracture detection and model training method, device, equipment and storage medium
CN111539438B (en) 2020-04-28 2024-01-12 北京百度网讯科技有限公司 Text content identification method and device and electronic equipment
CN112115836A (en) * 2020-09-11 2020-12-22 北京金堤科技有限公司 Information verification method and device, computer readable storage medium and electronic equipment
CN112036398B (en) * 2020-10-15 2024-02-23 北京一览群智数据科技有限责任公司 Text correction method and system
CN112364873A (en) * 2020-11-20 2021-02-12 深圳壹账通智能科技有限公司 Character recognition method and device for curved text image and computer equipment
CN113221890A (en) * 2021-05-25 2021-08-06 深圳市瑞驰信息技术有限公司 OCR-based cloud mobile phone text content supervision method, system and system
CN113743400B (en) * 2021-07-16 2024-02-20 华中科技大学 Electronic document intelligent examination method and system based on deep learning
CN114842487B (en) * 2021-12-09 2023-11-03 上海鹑火信息技术有限公司 Identification method and system for salomile characters

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109447078A (en) * 2018-10-23 2019-03-08 四川大学 A kind of detection recognition method of natural scene image sensitivity text

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8170372B2 (en) * 2010-08-06 2012-05-01 Kennedy Michael B System and method to find the precise location of objects of interest in digital images

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109447078A (en) * 2018-10-23 2019-03-08 四川大学 A kind of detection recognition method of natural scene image sensitivity text

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ORB mismatch elimination method based on Mask R-CNN; Zhang Bo et al.; Chinese Journal of Liquid Crystals and Displays (液晶与显示); 2018-08-15 (No. 08); full text *

Also Published As

Publication number Publication date
CN110598690A (en) 2019-12-20

Similar Documents

Publication Publication Date Title
CN110598690B (en) End-to-end optical character detection and recognition method and system
CN108304873B (en) Target detection method and system based on high-resolution optical satellite remote sensing image
US11568639B2 (en) Systems and methods for analyzing remote sensing imagery
CN105931295B (en) A kind of geologic map Extracting Thematic Information method
CN104915636B (en) Remote sensing image road recognition methods based on multistage frame significant characteristics
CN104751187B (en) Meter reading automatic distinguishing method for image
CN102509091B (en) Airplane tail number recognition method
CN108090906B (en) Cervical image processing method and device based on region nomination
US8655070B1 (en) Tree detection form aerial imagery
CN103049763B (en) Context-constraint-based target identification method
Alidoost et al. A CNN-based approach for automatic building detection and recognition of roof types using a single aerial image
CN111860348A (en) Deep learning-based weak supervision power drawing OCR recognition method
CN111738055B (en) Multi-category text detection system and bill form detection method based on same
CN110008900B (en) Method for extracting candidate target from visible light remote sensing image from region to target
CN109726717A (en) A kind of vehicle comprehensive information detection system
Jiao et al. A survey of road feature extraction methods from raster maps
Chen et al. Automatic building extraction via adaptive iterative segmentation with LiDAR data and high spatial resolution imagery fusion
CN102147867A (en) Method for identifying traditional Chinese painting images and calligraphy images based on subject
CN113408584A (en) RGB-D multi-modal feature fusion 3D target detection method
CN107992856A (en) High score remote sensing building effects detection method under City scenarios
CN110458019B (en) Water surface target detection method for eliminating reflection interference under scarce cognitive sample condition
CN114581307A (en) Multi-image stitching method, system, device and medium for target tracking identification
CN111275732B (en) Foreground object image segmentation method based on depth convolution neural network
Sirmacek et al. Road detection from remotely sensed images using color features
Bulatov et al. Land cover classification in combined elevation and optical images supported by OSM data, mixed-level features, and non-local optimization algorithms

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant