CN111368848A - Character detection method under complex scene - Google Patents
Character detection method under complex scene
- Publication number
- CN111368848A CN111368848A CN202010464622.5A CN202010464622A CN111368848A CN 111368848 A CN111368848 A CN 111368848A CN 202010464622 A CN202010464622 A CN 202010464622A CN 111368848 A CN111368848 A CN 111368848A
- Authority
- CN
- China
- Prior art keywords
- detection
- character
- value
- parameter value
- module
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
- G06V30/153—Segmentation of character regions using recognition of characters or words
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computational Linguistics (AREA)
- Multimedia (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Mathematical Physics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Image Analysis (AREA)
Abstract
The invention relates to the technical field of artificial intelligence and computer vision, in particular to a deep-learning-based character detection method for complex scenes. By fusing a segmentation module and a detection module into a single network structure (SDetNet) and using a loss function (Shape Loss) that learns the spatial distribution characteristics of the data, the method reduces the character false-detection rate, reduces detection-box redundancy, and has good interpretability. The character detection method under a complex scene comprises the following steps: scene preprocessing of the image data; network model design; loss function design.
Description
Technical Field
The invention relates to the technical field of artificial intelligence and computer vision, in particular to a character detection method based on deep learning under a complex scene.
Background
Optical Character Recognition (OCR) refers to converting the characters on an image into computer-editable text. Its most important step is locating the character-region features in the image through feature extraction, i.e. character detection. Text detection has three mainstream families of methods: algorithms based on text-box regression; algorithms based on pixel segmentation; and algorithms combining segmentation with regression. Character detection currently faces many challenges, such as variable character orientation, irregular character distribution, and non-uniform character size. Because of these challenges, character detection in complex scenes is prone to two problems, false detection and excessive detection-box redundancy, which in turn harm character recognition.
In the field of computer vision, text detection in complex scenes can follow two different ideas: object detection and object segmentation. The paper "Detecting Text in Natural Image with Connectionist Text Proposal Network", published in 2016 by Zhi Tian et al., first introduced an RNN into a detection network built on a target-detection approach. Deep features of the image are obtained through a CNN, fixed-width anchors are used to detect text proposals, the features of the anchors in the same row are serialized into a sequence and fed into the RNN, and finally a fully connected layer performs classification and regression; the accepted text proposals are then merged into text lines. This seamless combination of RNN and CNN improves detection accuracy. Baoguang Shi et al., in "Detecting Oriented Text in Natural Images by Linking Segments" (2017), first detect segments, each representing a text line or part of a word, which may be one character, one word, or several characters. Segments belonging to the same text line or word are connected by links, established between the center points of two overlapping segments; finally, a merging algorithm combines segments and links into a complete text line, yielding the position and rotation angle of its detection box. This direct-regression method performs well on scene-text detection, although scene text still varies greatly in scale, aspect ratio, and orientation.
Qiangpeng Yang et al., in "IncepText: A New Inception-Text Module with Deformable PSROI Pooling for Multi-Oriented Scene Text Detection" (2018), proposed a new Inception-Text module for multi-oriented scene-text detection. It uses a deformable PSROI pooling module to handle multi-oriented text, convolution branches with several different kernel sizes to handle text of different aspect ratios, and a deformable convolution layer after each branch to adapt to multi-oriented text, thereby detecting text in complex scenes.
In summary, target-detection and target-segmentation algorithms are two distinct and effective approaches to natural-scene text detection. However, character detection in complex scenes still has shortcomings: a complex character background easily causes false detection of characters. How to improve detection precision and reduce false detection is therefore a hot topic in complex-scene character-detection research.
In a complex scene, because real scenes, character distributions, and character sizes are all highly diverse, text detection suffers from false detections and detection-box redundancy. Moreover, when the image is large, characters occupy a small fraction of the pixels, so small targets are easily missed. A pure target-segmentation algorithm requires complex post-processing and still produces false detections; a pure target-detection algorithm easily produces redundant detection boxes and false detections in complex scenes.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a character detection method for complex scenes. By fusing a segmentation module and a detection module into a single network structure (SDetNet) and using a loss function (Shape Loss) that learns the spatial distribution characteristics of the data, the method reduces the character false-detection rate, reduces detection-box redundancy, and has good interpretability.
In order to solve the technical problem, the invention provides a character detection method under a complex scene, which is characterized by comprising the following steps:
Step one: scene preprocessing of the image data: divide the large-pixel image of the original complex scene into several small image blocks, detect each block separately, and then fuse the detection results.
Step two: design a network structure, SDetNet, that fuses a segmentation module and a detection module; calculate the intersection-over-union (IOU) of the detection module's box and the segmentation module's box, and let a merging module use this IOU parameter value together with the text-existence probability value to judge whether characters exist at a given local position in the scene. The IOU is calculated with formula (1):
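Formula (1) is not reproduced in this text (it appears only as an image in the original patent). Reading Pre_Rect as the predicted character region and Label_Rect as the ground-truth region, the intersection-over-union described here is presumably the standard one:

```latex
IOU = \frac{\left| Pre\_Rect \cap Label\_Rect \right|}{\left| Pre\_Rect \cup Label\_Rect \right|} \tag{1}
```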
where Pre_Rect is the predicted box shared by the segmentation module and the detection module in the intersection-ratio computation, and Label_Rect represents the real distribution area where characters exist.
Step three: design the loss function: set the IOU parameter value between the detection box and the ground-truth box as a dynamic weight parameter, use it in the model's final objective function, and then perform CNN iterative training. The loss function that regresses the length-width ratio is calculated as follows:
the coordinate origin is set as a point (0, 0), x and y respectively represent the length and the width of a text box, a point A (x 1 and y 1) and a point B (x 2 and y 2) in the coordinate respectively represent the truth value of the detection box and the result value predicted by the model, and a theta parameter is used as an included angle between the point A and the point B to measure the similarity of vector sums. Optimizing a theta parameter value, and adjusting a detection frame according to the following formulas (2) and (3):
where θ is the angle between the ground-truth coordinate A and the predicted coordinate B. As the θ parameter value increases, cos θ decreases, and the −ln term therefore increases. A gradient-descent algorithm adjusts the model so that the θ parameter value gradually decreases; AL is the resulting parameter value measuring the difference in vector direction.
A dynamic weight value is designed from the intersection-over-union of the ground-truth box and the predicted box. A larger IOU parameter value indicates that the detected region covers the character area well, so a higher weight is set; a smaller IOU parameter value indicates poor coverage, so a lower weight is set. The loss function is given by formula (4):
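Formula (4) is also missing from this text. A form consistent with the surrounding description, in which the IOU of the ground-truth box and the predicted box acts as a dynamic weight on the angular term AL, would be:

```latex
Shape\ Loss = IOU \cdot AL = IOU \cdot \left(-\ln\cos\theta\right) \tag{4}
```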
in the above character detection method, the number of the small image blocks dividing the image in the original complex scene is 4.
In the above character detection method, the detection module learns the character-region distribution and character inclination-angle features, and the segmentation module learns the character distribution probability and character detection-box features.
Owing to the above method, the invention has the following advantages over the prior art:
1. The invention fuses a segmentation module and a detection module into the network structure SDetNet; the segmentation branch effectively computes the character regions and the probability that characters exist, and combining it with the detection branch effectively reduces the character false-detection rate.
2. The target-box loss function Shape Loss uses the prior that character distributions have regular length-width ratios to standardize region-box detection, improving detection efficiency and reducing detection redundancy.
3. The method designs a dynamic weight parameter from the intersection-over-union (IOU) parameter. In the initial stage of network training, model learning is highly random and a large number of character detection boxes are generated. Through the IOU parameter value, positive and negative detection-box samples can be distinguished effectively. For a positive sample, the corresponding text region has a higher probability and the length-width ratio of the detection box is adjusted; conversely, for a negative sample, the probability is lower and the ratio adjustment is suppressed. Through this purposeful constraint, the model attends well to character-region features, so the IOU parameter value effectively and dynamically adjusts the model's learning.
The invention is further described with reference to the following figures and detailed description.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a diagram of the network architecture SDetNet of the present invention;
FIG. 3 is a coordinate diagram of the present invention.
Detailed Description
Referring to FIG. 1, the invention, a character detection method under a complex scene, comprises the following steps:
Step one: scene preprocessing of the image data. In an original complex-scene image of 1920 × 1080 pixels, small characters occupy few pixels. To raise the pixel proportion of the characters, the original image is divided into four 960 × 540 image blocks, each block is detected separately, and the detection results are fused.
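As an illustration of step one, the tiling and result fusion described above can be sketched as follows (function names are ours, not the patent's; a production version would also have to merge boxes that straddle tile borders, which this sketch ignores):

```python
def make_tiles(width, height, nx=2, ny=2):
    """Split a width x height image into nx*ny equal tiles.

    Returns a list of (x0, y0, x1, y1) tile rectangles in image
    coordinates, e.g. four 960x540 tiles for a 1920x1080 frame.
    """
    tw, th = width // nx, height // ny
    return [(ix * tw, iy * th, (ix + 1) * tw, (iy + 1) * th)
            for iy in range(ny) for ix in range(nx)]


def fuse_detections(per_tile_boxes, tiles):
    """Map per-tile detection boxes back into full-image coordinates.

    per_tile_boxes[i] is a list of (x0, y0, x1, y1) boxes expressed
    in the local coordinate system of tiles[i].
    """
    fused = []
    for boxes, (ox, oy, _, _) in zip(per_tile_boxes, tiles):
        for (x0, y0, x1, y1) in boxes:
            fused.append((x0 + ox, y0 + oy, x1 + ox, y1 + oy))
    return fused
```

For a 1920 × 1080 input, `make_tiles(1920, 1080)` yields the four 960 × 540 blocks of the embodiment, and `fuse_detections` shifts each tile's boxes by that tile's origin.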
Step two: network model design. Referring to FIG. 2, a network structure SDetNet fusing a segmentation module and a detection module is designed; the two modules share one backbone network. The segmentation branch produces the character distribution region and its probability, and the detection branch produces the character distribution angle parameter and region. The intersection-over-union (IOU) of the detection module's box and the segmentation module's box is calculated, and the merging module uses this IOU parameter value and the text-existence probability value to judge whether characters exist at a given local position in the scene. The IOU is calculated with formula (1):
where Pre_Rect is the predicted box shared by the segmentation module and the detection module in the intersection-ratio computation, and Label_Rect represents the real distribution area where characters exist;
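The intersection-over-union used by the merging module can be sketched for axis-aligned boxes as follows (a minimal illustration; the patent's detection boxes also carry a rotation angle, which this sketch ignores):

```python
def rect_iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x0, y0, x1, y1)."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)  # overlap area (0 if disjoint)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0
```

Identical boxes score 1.0, disjoint boxes 0.0, and partial overlaps fall in between; the merging module would compare this value (together with the text-existence probability) against a threshold.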
In FIG. 2, the input image size is batchsize × 3 × 512 × 512, and the number of channels output by each module is as follows:
Conv:16
Conv Stage 1:64
Conv Stage 2:256
Conv Stage 3:384
DeConv Stage 1:128
DeConv Stage 2:64
DeConv Stage 3:32
DeConv:32
Detection Block: 5
Segmentation: 2
the feature map size in the output result is:
Score Map: batchsize × 256 × 256 × 1
Box Geometry: batchsize × 256 × 256 × 4
Rotation Angle: batchsize × 256 × 256 × 1
As shown in FIG. 2, three results are obtained: the text-region score (Score Map), the text-box size (Box Geometry), and the text rotation angle (Rotation Angle). The segmentation block and the detection block share a U-shaped network structure.
Step three: loss function design. The collected samples are screened according to actual business requirements, and polygon annotation of the character areas is performed on the screened scene-specific samples. In complex scenes, character size, spacing, and distribution position are diverse, and the cross-entropy and IOU loss functions regress boxes with irregular length-width ratios redundantly. The invention therefore designs a new loss function that regresses the length-width ratio. To overcome the difficulty of model convergence, the IOU parameter value between the detection box and the ground-truth box is set as a dynamic weight parameter in the model's final objective function, and CNN iterative training is then performed; the trained model effectively reduces detection-box redundancy.
the effects of Shape Loss and IOU in FIG. 1 are: standardizing the length-width ratio; the loss weight parameter value is dynamically adjusted. Setting IOU parameter values of the detection frame and the real frame as dynamic weight parameters, and performing CNN iterative training as a final target function of the model, wherein the method for calculating the loss function of the regression length-width ratio comprises the following steps:
Referring to FIG. 3, set the coordinate origin at the point (0, 0); x and y represent the length and width of a text box; points A(x1, y1) and B(x2, y2) represent, respectively, the ground-truth value of the detection box and the value predicted by the model; and the parameter θ, the angle between points A and B, measures the similarity of the two vectors.
The θ parameter value is optimized and the detection box adjusted according to the following formulas (2) and (3):
where θ is the angle between the ground-truth coordinate A and the predicted coordinate B. As the θ parameter value increases, cos θ decreases, and the −ln term therefore increases. A gradient-descent algorithm adjusts the model so that the θ parameter value gradually decreases; AL is the resulting parameter value measuring the difference in vector direction.
Because a large number of detection boxes are generated early in network training, simply minimizing the θ parameter value makes the model hard to converge. A dynamic weight value is therefore designed from the intersection-over-union of the ground-truth box and the predicted box: a larger IOU parameter value indicates that the detected region covers the character area well, so a higher weight is set; a smaller IOU parameter value indicates poor coverage, so a lower weight is set. The loss function is given by formula (4):
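The exact formula (4) is not reproduced in this text; a runnable sketch of the angular term AL and its IOU-based dynamic weighting, under our reading of the description, is:

```python
import math

def _iou(a, b):
    """IOU of two axis-aligned boxes (x0, y0, x1, y1)."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def angle_loss(truth_wh, pred_wh):
    """AL = -ln(cos theta), where theta is the angle between the
    (length, width) vectors of the ground-truth and predicted boxes."""
    (x1, y1), (x2, y2) = truth_wh, pred_wh
    cos_t = (x1 * x2 + y1 * y2) / (math.hypot(x1, y1) * math.hypot(x2, y2))
    cos_t = min(1.0, max(1e-7, cos_t))  # clamp against rounding error
    return -math.log(cos_t)

def shape_loss(truth_box, pred_box):
    """IOU-weighted angular loss -- our assumed reading of formula (4)."""
    weight = _iou(truth_box, pred_box)  # larger IOU -> larger weight
    t = (truth_box[2] - truth_box[0], truth_box[3] - truth_box[1])
    p = (pred_box[2] - pred_box[0], pred_box[3] - pred_box[1])
    return weight * angle_loss(t, p)
```

The loss is zero when the predicted aspect ratio matches the ground truth, grows as the (length, width) vectors diverge, and is scaled down for low-IOU (negative-sample) boxes, matching the dynamic-weight behavior described above.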
and finally, removing redundant detection frames by using a non-maximum suppression algorithm (NMS) and outputting a final detection result.
The following alternatives, used on the basis of the technical scheme of the invention, all fall within the protection scope of the invention:
1. the scheme of the convolutional neural network CNN model can be replaced by a scheme combining other deep learning models or machine learning;
2. the segmentation and detection fusion network SDetNet designed by the invention can be replaced by other fusion methods;
3. the Loss function Shape Loss method designed by the invention can be replaced by other methods;
4. the dynamic threshold scheme designed by the invention can be replaced by other methods.
Claims (3)
1. A character detection method under a complex scene is characterized by comprising the following steps:
step one: scene preprocessing of the image data: divide the large-pixel image of the original complex scene into several small image blocks, detect each block separately, and then fuse the detection results;
step two: design a network structure, SDetNet, that fuses a segmentation module and a detection module; calculate the intersection-over-union (IOU) of the detection module's box and the segmentation module's box, and use the IOU parameter value together with the text-existence probability value, through a merging module, to judge whether characters exist at a given local position in the scene; the IOU is calculated with formula (1):
where Pre_Rect is the predicted box shared by the segmentation module and the detection module in the intersection-ratio computation, and Label_Rect represents the real distribution area where characters exist;
step three: loss function: set the IOU parameter value between the detection box and the ground-truth box as a dynamic weight parameter in the model's final objective function, and perform CNN iterative training; the loss function that regresses the length-width ratio is calculated as follows:
setting the origin of coordinates as a (0, 0) point, wherein x and y respectively represent the length and the width of a text box, points A (x 1 and y 1) and B (x 2 and y 2) in the coordinates respectively represent the truth value of a detection box and the result value predicted by a model, and a theta parameter is used as an included angle between the points A and B to measure the similarity of vector sums; optimizing a theta parameter value, and adjusting a detection frame according to the following formulas (2) and (3):
where θ is the angle between the ground-truth coordinate A and the predicted coordinate B; as the θ parameter value increases, cos θ decreases and the −ln term increases; the model is adjusted through a gradient-descent algorithm so that the θ parameter value gradually decreases, and AL is the resulting parameter value measuring the difference in vector direction;
a dynamic weight value is designed from the intersection-over-union of the ground-truth box and the predicted box; a larger IOU parameter value indicates that the detected region covers the character area well, so a higher weight is set; a smaller IOU parameter value indicates poor coverage, so a lower weight is set; the loss function Shape Loss is given by formula (4):
2. The character detection method under a complex scene according to claim 1, wherein the image of the original complex scene is divided into 4 image blocks.
3. The character detection method under the complex scene according to claim 1 or 2, wherein the detection module learns character region distribution and character inclination angle characteristics; the segmentation module learns the character distribution probability and character detection box features.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010464622.5A CN111368848B (en) | 2020-05-28 | 2020-05-28 | Character detection method under complex scene |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111368848A true CN111368848A (en) | 2020-07-03 |
CN111368848B CN111368848B (en) | 2020-08-21 |
Family
ID=71212292
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010464622.5A Active CN111368848B (en) | 2020-05-28 | 2020-05-28 | Character detection method under complex scene |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111368848B (en) |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107480648A (en) * | 2017-08-23 | 2017-12-15 | 南京大学 | A kind of method of natural scene text detection |
CN108615237A (en) * | 2018-05-08 | 2018-10-02 | 上海商汤智能科技有限公司 | A kind of method for processing lung images and image processing equipment |
CN109559300A (en) * | 2018-11-19 | 2019-04-02 | 上海商汤智能科技有限公司 | Image processing method, electronic equipment and computer readable storage medium |
CN109584251A (en) * | 2018-12-06 | 2019-04-05 | 湘潭大学 | A kind of tongue body image partition method based on single goal region segmentation |
US20190130229A1 (en) * | 2017-10-31 | 2019-05-02 | Adobe Inc. | Deep salient content neural networks for efficient digital object segmentation |
CN109815948A (en) * | 2019-01-14 | 2019-05-28 | 辽宁大学 | A kind of paper partitioning algorithm under complex scene |
CN110097568A (en) * | 2019-05-13 | 2019-08-06 | 中国石油大学(华东) | A kind of the video object detection and dividing method based on the double branching networks of space-time |
CN110428432A (en) * | 2019-08-08 | 2019-11-08 | 梅礼晔 | The deep neural network algorithm of colon body of gland Image Automatic Segmentation |
CN110689093A (en) * | 2019-12-10 | 2020-01-14 | 北京同方软件有限公司 | Image target fine classification method under complex scene |
CN110738207A (en) * | 2019-09-10 | 2020-01-31 | 西南交通大学 | character detection method for fusing character area edge information in character image |
CN110751154A (en) * | 2019-09-27 | 2020-02-04 | 西北工业大学 | Complex environment multi-shape text detection method based on pixel-level segmentation |
US10572760B1 (en) * | 2017-11-13 | 2020-02-25 | Amazon Technologies, Inc. | Image text localization |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112733639A (en) * | 2020-12-28 | 2021-04-30 | 贝壳技术有限公司 | Text information structured extraction method and device |
CN112926637A (en) * | 2021-02-08 | 2021-06-08 | 天津职业技术师范大学(中国职业培训指导教师进修中心) | Method for generating text detection training set |
CN112926637B (en) * | 2021-02-08 | 2023-06-09 | 天津职业技术师范大学(中国职业培训指导教师进修中心) | Method for generating text detection training set |
Also Published As
Publication number | Publication date |
---|---|
CN111368848B (en) | 2020-08-21 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||