CN110263877B - Scene character detection method - Google Patents


Info

Publication number
CN110263877B
CN110263877B (application CN201910567794.2A)
Authority
CN
China
Prior art keywords
feature
feature map
size
loss
representing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910567794.2A
Other languages
Chinese (zh)
Other versions
CN110263877A (en)
Inventor
张勇东
王裕鑫
谢洪涛
李岩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zhongke Research Institute
University of Science and Technology of China USTC
Original Assignee
Beijing Zhongke Research Institute
University of Science and Technology of China USTC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zhongke Research Institute, University of Science and Technology of China USTC filed Critical Beijing Zhongke Research Institute
Priority to CN201910567794.2A priority Critical patent/CN110263877B/en
Publication of CN110263877A publication Critical patent/CN110263877A/en
Application granted granted Critical
Publication of CN110263877B publication Critical patent/CN110263877B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/62Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/63Scene text, e.g. street names
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a scene character detection method comprising the following steps: performing feature extraction on an input image with a neural network, and up-sampling the extracted feature map to obtain feature maps of different sizes; taking the largest feature map as the standard, mapping the feature maps of the remaining sizes to that same size; performing scale-information fusion on the feature maps mapped to the same size to obtain a fused feature map, the fusion operation uniformly activating characters of different sizes in the fused feature map; and performing text-box regression and classification on the fused feature map to obtain the scene character detection result. The method improves the quality of the feature map at its source, thereby improving the performance of scene character detection.

Description

Scene character detection method
Technical Field
The invention relates to the technical field of character recognition, in particular to a scene character detection method.
Background
Detecting and recognizing characters in natural scenes is a general character recognition technology. It has become a hot research direction in the fields of computer vision and document analysis in recent years, and is widely applied in license plate recognition, autonomous driving, human-computer interaction and other fields.
Because character detection and recognition in natural scenes face difficulties such as complex backgrounds, low resolution and variable fonts, traditional character detection and recognition technology cannot be applied to natural scenes. Character detection, as the basis of recognition, therefore has great research significance.
In recent years, with the development of deep learning in the field of object detection, general object detection technology has achieved good results in scene character detection, and applying deep learning to natural scene character detection has become a trend. However, because these methods involve complicated post-processing steps, and because of the diversity of text itself, the speed and accuracy of detection still need to be improved.
Disclosure of Invention
The invention aims to provide a scene character detection method which can improve the recall rate of character detection.
The purpose of the invention is realized by the following technical scheme:
a scene text detection method comprises the following steps:
carrying out feature extraction on an input image by using a neural network, and carrying out up-sampling operation on the extracted feature map to obtain feature maps with different sizes;
mapping the feature maps with the rest sizes to be the same as the feature map with the maximum size by taking the feature map with the maximum size as a standard;
the feature graphs mapped to the same size are fused with information of different scales, and the fusion operation can enable the character features of different scales to be uniformly activated in the fused feature graphs of uniform size;
and performing regression and classification operation on the character frame on the fused feature graph to obtain a scene character detection result.
According to the technical scheme provided by the invention, the sizes of the feature maps are unified through a size mapping operation, and the scale information of the feature maps is transmitted by establishing a scale relation. Because feature maps of different sizes express characters of different scales differently (a small-size feature map detects large targets better but loses the details of small targets, and a large-size feature map is the opposite), characters of different scales are activated more uniformly in the feature maps, the quality of the feature maps is improved at its source, and the performance of scene character detection is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on the drawings without creative efforts.
Fig. 1 is a schematic diagram of a scene text detection method according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating a size map according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a bidirectional convolution operation provided by an embodiment of the present invention;
FIG. 4 is a schematic diagram of a feature aggregation operation provided by an embodiment of the present invention;
fig. 5 is a schematic diagram of a scene text detection result according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention provides a scene character detection method. Feature maps of different sizes are first mapped to a uniform size; a scale relation module is then established over the feature maps to realize feature transfer across different scales. Because feature maps of different sizes express characters of different scales differently (a small-size feature map detects large targets better but loses the details of small targets, and a large-size feature map is the opposite), this operation activates characters of different scales more uniformly in the feature maps and improves their quality at its source. In addition, the embodiment of the invention provides a new loss function, Recall Loss, which increases the weight that the loss terms of weakly detected character instances occupy in the loss function, so that the network pays more attention to undetected character regions, effectively improving the recall rate of character detection.
As shown in fig. 1, a schematic diagram of a scene text detection method provided in an embodiment of the present invention mainly includes:
1. and performing feature extraction on the input image by using a neural network, and performing up-sampling operation on the extracted feature map to obtain feature maps with different sizes.
In the embodiment of the invention, the extracted feature map is subjected to upsampling operation through a continuous upsampling module. In the up-sampling process, the current characteristic diagram and the shallow characteristic diagram with the same size are subjected to cascade operation.
Illustratively, four feature maps of different sizes may be obtained by this step.
Fig. 1 shows an exemplary network framework for implementing the method. The backbone removes the fully connected layers after the ResNet50 network and is embedded into the framework herein to perform the feature extraction operation. Four additional convolutional layers (F1, F2, F3, F4) are then added for the up-sampling operation. During up-sampling, shallow feature maps of the same size as the current feature map are cascaded with it. The two symbols in fig. 1 denote the cascade operation and the up-sampling module, respectively.
It is noted that the number of feature maps of different sizes can be set according to practical situations.
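The up-sample-and-cascade step above can be sketched as follows. This is an illustrative numpy sketch using nearest-neighbor up-sampling; the interpolation method and convolutions of the actual up-sampling module are not specified here, so treat the details as assumptions.

```python
import numpy as np

def upsample2x(feat):
    """Nearest-neighbor 2x spatial upsampling of a (C, H, W) feature map."""
    return feat.repeat(2, axis=1).repeat(2, axis=2)

def upsample_and_concat(deep, shallow):
    """Upsample the deeper map to the shallow map's size, then concatenate
    along the channel axis (the cascade operation in fig. 1)."""
    up = upsample2x(deep)
    assert up.shape[1:] == shallow.shape[1:], "spatial sizes must match"
    return np.concatenate([up, shallow], axis=0)

deep = np.zeros((256, 8, 8))       # deeper, smaller feature map
shallow = np.zeros((128, 16, 16))  # shallow map at twice the resolution
fused = upsample_and_concat(deep, shallow)
print(fused.shape)  # (384, 16, 16)
```

The channel count after cascading is the sum of the two inputs' channels, which is why the framework follows each cascade with further convolutions.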
2. And mapping the feature maps of the rest sizes to the same size as the feature map of the maximum size by taking the feature map of the maximum size as a standard.
In this step, the feature maps of the remaining sizes are mapped to the same size as the largest feature map, so that all feature-map sizes are uniform; this operation is realized by a size mapping module.
Taking the example of fig. 1 again, the feature map in F4 is the largest; the feature maps of the small-size convolutional layers F1, F2 and F3 are input to the size mapping module, which outputs feature maps of the same size as that of F4. As shown in fig. 2, the channel matching layer first changes the number of channels of the input feature map to a specified size; the size mapping layer then expands the spatial size of the feature map by compressing its number of channels, so that feature maps of different sizes are mapped to the same size. The dimension of the input feature map is C_i × H_i × W_i (i = 1, 2, 3), and the output dimension is (C_i / m²) × (m·H_i) × (m·W_i). In this example, m is 8, 4 and 2 for F1, F2 and F3, respectively, where C, H and W denote the number of channels, height and width of the feature map, and m denotes the scaling ratio.
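The channel-compressing expansion of the size mapping layer can be illustrated with a pixel-shuffle-style rearrangement, which matches the stated dimensions (C/m² channels, m× spatial size); the exact layer design in the patent may differ, so this is a sketch under that assumption:

```python
import numpy as np

def size_map(feat, m):
    """Map a (C, H, W) feature map to (C/m^2, m*H, m*W) by trading
    channels for spatial resolution (pixel-shuffle-style rearrangement)."""
    C, H, W = feat.shape
    assert C % (m * m) == 0, "channel count must be divisible by m^2"
    c_out = C // (m * m)
    # (c_out, m, m, H, W) -> (c_out, H, m, W, m) -> (c_out, m*H, m*W)
    x = feat.reshape(c_out, m, m, H, W)
    x = x.transpose(0, 3, 1, 4, 2)
    return x.reshape(c_out, m * H, m * W)

f3 = np.random.rand(32, 16, 16)  # hypothetical small-size map with m = 2
print(size_map(f3, 2).shape)     # (8, 32, 32)
```

The rearrangement only moves values, so no information is lost; the preceding channel matching layer is what ensures the channel count is divisible by m².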
3. And fusing the information of different scales on the feature graphs mapped to the same size, wherein the fusion operation can enable the character features of different scales to be activated more uniformly in the fused feature graphs of uniform size.
In this step, a scale relation building module transmits relevant scale information between the different feature maps so as to improve their quality; it mainly comprises two parts: a bidirectional convolution operation and a feature aggregation operation.
As shown in fig. 3, the bidirectional convolution operation transmits the feature maps containing different scale information through continuous convolution operations in two directions; meanwhile, an attention mechanism (a multiplication operation) controls the transmission of the scale information of the preceding layer.
as shown in fig. 4, the feature maps of the bidirectional convolution are fused together by a feature aggregation operation, so as to obtain a fused feature map.
4. And performing regression and classification operation on the character frame on the fused feature graph to obtain a scene character detection result.
This step outputs a six-channel feature map whose size is one quarter of the original image; the channels are as follows:
a. A single-channel character score map, giving the probability that each pixel belongs to a character.
b. A four-channel text-box size map, giving the distance from each pixel point to the four borders of its text box.
c. A single-channel text-box rotation-angle map, giving the rotation angle of the text box to which each pixel belongs.
Non-maximum suppression (NMS) is then performed on the obtained text boxes to obtain the final prediction result.
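Standard greedy NMS can be sketched as below for axis-aligned boxes; the patent's rotated text boxes would additionally require polygon intersection when computing the IoU, which is omitted here.

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression over (x1, y1, x2, y2) boxes:
    repeatedly keep the highest-scoring box and drop all remaining
    boxes that overlap it beyond iou_thresh."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        if order.size == 1:
            break
        rest = order[1:]
        # intersection of box i with every remaining box
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        order = rest[iou <= iou_thresh]
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 10, 10], [20, 20, 30, 30]], float)
scores = np.array([0.9, 0.8, 0.7])
print(nms(boxes, scores))  # [0, 2]
```

The second box is suppressed because its IoU with the first (about 0.81) exceeds the threshold, while the third box does not overlap at all.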
In the embodiment of the invention, the training process may adopt the stochastic gradient descent (SGD) method for end-to-end training; the overall loss function is:
L = L_cls + λ_reg · L_reg
In the above formula, L_cls is the classification loss, L_reg is the regression loss, and λ_reg is a balance parameter.
The embodiment of the invention provides a new loss function, Recall Loss, which increases the weight that the loss terms of weakly detected character instances occupy in the loss function, so that the network pays more attention to undetected character regions and the recall rate is increased. In the method proposed herein, Recall Loss and Dice Loss are combined for the classification task.
Based on this, the classification loss L_cls is:
L_cls = λ_R · RL + λ_D · L_Dice
[the formulas for RL and L_Dice are given as images in the original, defined over p, y, η_1, η_2, α, β, IoU and e]
IoU = |S ∩ G| / |S ∪ G|
where RL stands for Recall Loss, L_Dice represents Dice Loss, λ_R and λ_D represent balance parameters, G represents the corresponding label-box area, η_1 and η_2 represent balance parameters, p represents the probability of predicting a pixel as a character, y represents the label corresponding to the pixel, S represents each connected domain in the predicted single-channel character score map, IoU represents the intersection-over-union, β represents a threshold, α represents an increased weight, and e is a constant.
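Of the terms above, Dice Loss and IoU have common standard forms, sketched below under the assumption that L_Dice takes the usual smoothed form with e as the smoothing constant; the patent's exact Recall Loss formula is given only as an image and is not reproduced here.

```python
import numpy as np

def dice_loss(p, y, eps=1e-6):
    """A common form of Dice Loss over a predicted score map p and a
    binary label map y; eps plays the role of the constant e."""
    inter = np.sum(p * y)
    return 1.0 - (2.0 * inter + eps) / (np.sum(p) + np.sum(y) + eps)

def iou(mask_s, mask_g):
    """IoU = |S ∩ G| / |S ∪ G| for two binary masks."""
    inter = np.logical_and(mask_s, mask_g).sum()
    union = np.logical_or(mask_s, mask_g).sum()
    return inter / union

y = np.array([[1, 1], [0, 0]], float)
print(dice_loss(y, y))     # 0.0 for a perfect prediction
print(iou(y > 0, y > 0))   # 1.0
```

The IoU of each predicted connected domain S against its label box G is what the Recall Loss compares against the threshold β to decide which instances count as weakly detected.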
The regression loss L_reg is expressed as:
L_reg = L_loc + L_θ
[the formula for L_loc is given as an image in the original, defined over P and G]
L_θ = 1 − cos(θ′ − θ*)
where P represents the predicted text box, G represents the corresponding labeled text box, θ′ represents the predicted angle, θ* represents the true angle, and L_θ represents the angle loss.
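Assuming the common EAST-style IoU-log form for the localization term, L_loc = −log IoU(P, G) (the original gives L_loc only as an image, so this form is an assumption consistent with the surrounding definitions), the two regression terms can be sketched as:

```python
import math

def loc_loss(iou_pg):
    """IoU-based localization loss, L_loc = -log IoU(P, G); the exact
    form is not spelled out in the text, so this follows the common
    EAST-style choice."""
    return -math.log(iou_pg)

def angle_loss(theta_pred, theta_true):
    """Angle loss L_theta = 1 - cos(theta' - theta*)."""
    return 1.0 - math.cos(theta_pred - theta_true)

print(round(loc_loss(0.5), 4))  # 0.6931 (half-overlapping boxes)
print(angle_loss(0.3, 0.3))     # 0.0 for a perfect angle
```

Both terms are zero for a perfect prediction and grow smoothly as the box overlap shrinks or the angle error increases.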
As an example of the parameters set in training: the initial learning rate is 0.0001, and every 10k iterations the learning rate decays to 0.94 times its previous value until the model converges.
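The schedule above amounts to a simple step decay:

```python
def learning_rate(step, base_lr=1e-4, decay=0.94, decay_every=10_000):
    """Step decay: every 10k iterations the learning rate drops
    to 0.94 of its previous value."""
    return base_lr * decay ** (step // decay_every)

print(learning_rate(0))        # 0.0001
print(learning_rate(25_000))   # ≈ 8.836e-05 (two decay steps applied)
```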
In the testing stage, after the scene character detection result is obtained, a non-maximum suppression operation is added to screen out repeatedly detected character boxes and obtain the final detection result. Fig. 5 shows four different scene character detection results by way of example. It should be noted that fig. 5 is only an example of scene character detection, mainly illustrating that the scheme of the invention can accurately detect larger or smaller scene characters; the parts that are not sufficiently clear do not affect the implementation of the invention.
In order to verify the performance of the above scheme of the invention, relevant experiments were also performed.
Data set relevant for the experiment:
ICDAR 2015: the dataset is a dataset for detecting multi-directional text of different scales, ambiguities, resolutions. It contains 1000 training images and 500 test images. The label is the coordinates of the 4 fixed points of each text box.
MSRA-TD 500: the data set is a data set for detecting arbitrary directions and multilingual long lines of text. It contains 300 images for training and 200 test images. The label is the position of the fixed point at the upper left corner of the text box, the length and the width of the text box and the rotation angle.
HUST: has the same labeling method as MSRA-TD500, and contains 400 pictures in total. This data set is added to the training set of MSRA-TD00 data herein because of the lack of MSRA-TD500 data.
Experimental results show that the method achieves state-of-the-art performance in scene text detection: on the ICDAR 2015 dataset, the recall, precision and F-measure are 79.6%, 83.2% and 81.4% at 8.8 FPS; on the MSRA-TD500 dataset, they are 71.2%, 87.6% and 78.5% at 13.3 FPS.
Through the above description of the embodiments, it is clear to those skilled in the art that the above embodiments can be implemented by software, and can also be implemented by software plus a necessary general hardware platform. With this understanding, the technical solutions of the embodiments can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, a usb disk, a removable hard disk, etc.), and includes several instructions for enabling a computer device (which can be a personal computer, a server, or a network device, etc.) to execute the methods according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are also within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (5)

1. A scene character detection method is characterized by comprising the following steps:
carrying out feature extraction on an input image by using a neural network, and carrying out up-sampling operation on the extracted feature map to obtain feature maps with different sizes;
mapping the feature maps with the rest sizes to be the same as the feature map with the maximum size by taking the feature map with the maximum size as a standard;
the feature graphs mapped to the same size are fused with information of different scales, and the fusion operation can enable the character features of different scales to be uniformly activated in the fused feature graphs of uniform size;
performing regression and classification operation on the character frame on the fused feature graph to obtain a scene character detection result;
wherein mapping the feature maps of the remaining sizes to the same size as the feature map of the largest size includes: changing the channel number of the input feature map to a specified size through a channel matching layer; the size mapping layer realizes the expansion of the size by compressing the channel number of the feature map, thereby mapping the size of the input feature map to the same size of the feature map with the maximum size;
the fusion of different scale information of the feature maps mapped to the same size comprises the following steps: overlapping feature maps containing different scale information through continuous convolution operation in two directions; meanwhile, the attention mechanism is used for controlling the transmission of the scale information of the front layer; and aggregating the superposition result and the attention mechanism operation result through the characteristic aggregation operation.
2. The method for detecting scene characters as claimed in claim 1, wherein the extracted feature map is up-sampled by a continuous up-sampling module; in the up-sampling process, the current characteristic diagram and the shallow characteristic diagram with the same size are subjected to cascade operation.
3. The method for detecting scene characters of claim 1, wherein in the training stage, an end-to-end training is performed by using a stochastic gradient descent method, and the overall loss function is as follows:
L = L_cls + λ_reg · L_reg
in the above formula, L_cls is the classification loss, L_reg is the regression loss, and λ_reg is a balance parameter.
4. The method for detecting scene characters as claimed in claim 3, wherein
the classification loss L_cls is expressed as:
L_cls = λ_R · RL + λ_D · L_Dice
[the formulas for RL and L_Dice are given as images in the original]
IoU = |S ∩ G| / |S ∪ G|
and the regression loss L_reg is expressed as:
L_reg = L_loc + L_θ
[the formula for L_loc is given as an image in the original]
L_θ = 1 − cos(θ′ − θ*)
where RL represents the Recall Loss, L_Dice represents the Dice Loss, λ_R and λ_D represent balance parameters, η_1 and η_2 represent balance parameters, p represents the probability of predicting the current pixel as a character, y represents the label corresponding to the current pixel, S represents each connected domain in the predicted single-channel character score map, IoU represents the intersection-over-union, β represents a threshold, α represents an increased weight, and e is a constant; P represents the predicted text box, G represents the corresponding labeled text box, θ′ represents the predicted angle, θ* represents the true angle, and L_θ represents the angle loss.
5. The method of claim 1, wherein in the testing stage, after the scene character detection result is obtained, a non-maximum suppression operation is added, and the repeatedly detected text boxes are screened out to obtain the final detection result.
CN201910567794.2A 2019-06-27 2019-06-27 Scene character detection method Active CN110263877B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910567794.2A CN110263877B (en) 2019-06-27 2019-06-27 Scene character detection method


Publications (2)

Publication Number Publication Date
CN110263877A CN110263877A (en) 2019-09-20
CN110263877B true CN110263877B (en) 2022-07-08

Family

ID=67922320

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910567794.2A Active CN110263877B (en) 2019-06-27 2019-06-27 Scene character detection method

Country Status (1)

Country Link
CN (1) CN110263877B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111767935B (en) * 2019-10-31 2023-09-05 杭州海康威视数字技术股份有限公司 Target detection method and device and electronic equipment
CN111242120B (en) * 2020-01-03 2022-07-29 中国科学技术大学 Character detection method and system
CN111259764A (en) * 2020-01-10 2020-06-09 中国科学技术大学 Text detection method and device, electronic equipment and storage device
CN111680628B (en) * 2020-06-09 2023-04-28 北京百度网讯科技有限公司 Text frame fusion method, device, equipment and storage medium

Citations (8)

Publication number Priority date Publication date Assignee Title
CN107688808A (en) * 2017-08-07 2018-02-13 电子科技大学 A kind of quickly natural scene Method for text detection
CN107977620A (en) * 2017-11-29 2018-05-01 华中科技大学 A kind of multi-direction scene text single detection method based on full convolutional network
CN108288088A (en) * 2018-01-17 2018-07-17 浙江大学 A kind of scene text detection method based on end-to-end full convolutional neural networks
CN108446698A (en) * 2018-03-15 2018-08-24 腾讯大地通途(北京)科技有限公司 Method, apparatus, medium and the electronic equipment of text are detected in the picture
CN108549893A (en) * 2018-04-04 2018-09-18 华中科技大学 A kind of end-to-end recognition methods of the scene text of arbitrary shape
CN108764228A (en) * 2018-05-28 2018-11-06 嘉兴善索智能科技有限公司 Word object detection method in a kind of image
CN109165697A (en) * 2018-10-12 2019-01-08 福州大学 A kind of natural scene character detecting method based on attention mechanism convolutional neural networks
CN109299274A (en) * 2018-11-07 2019-02-01 南京大学 A kind of natural scene Method for text detection based on full convolutional neural networks

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US10679085B2 (en) * 2017-10-31 2020-06-09 University Of Florida Research Foundation, Incorporated Apparatus and method for detecting scene text in an image


Non-Patent Citations (4)

Title
Gated Bi-directional CNN for Object Detection; Xingyu Zeng et al.; Computer Vision – ECCV 2016; 2016; pp. 1-16 *
SAN: Learning Relationship between Convolutional Features for Multi-Scale Object Detection; Yonghyun Kim et al.; ECCV 2018; 2018; pp. 1-16 *
Scale-Transferrable Object Detection; Peng Zhou et al.; 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); 2018; pp. 528-537 *
Research on text detection and localization algorithms in natural scene images based on edge information and stroke features; Li Dongqin et al.; Journal of Chongqing University of Science and Technology (Natural Science Edition); Jun. 15, 2019; vol. 21, no. 3; pp. 81-83 *

Also Published As

Publication number Publication date
CN110263877A (en) 2019-09-20


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant