CN110458165B - Natural scene text detection method introducing attention mechanism - Google Patents

Natural scene text detection method introducing attention mechanism

Info

Publication number
CN110458165B
CN110458165B
Authority
CN
China
Prior art keywords
attention
text
channel
feature
space
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910750169.1A
Other languages
Chinese (zh)
Other versions
CN110458165A (en)
Inventor
牛作东
李捍东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guizhou University
Original Assignee
Guizhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guizhou University
Priority to CN201910750169.1A
Publication of CN110458165A
Application granted
Publication of CN110458165B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G06F 18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/60 Type of objects
    • G06V 20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V 20/63 Scene text, e.g. street names
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/07 Target detection

Abstract

The invention discloses a natural scene text detection method introducing an attention mechanism, which comprises the following steps: in the process of down-sampling images with a PVANet network, a spatial attention module is generated from the spatial relationship of the intermediate text feature information and is used to capture importance information for the target area in two-dimensional space; the feature information generated by each convolution satisfies I ∈ R^{1×H×W} and is activated through the sigmoid function. During image up-sampling, features are extracted by unpooling and used to approximate the target position features to generate a channel attention module, which is then adjusted through a shared MLP network. Finally, in the process of feature fusion, the channel attention weights and the spatial attention weights form the whole branch attention model. The method pays more attention to useful information and suppresses useless information when extracting text target features, effectively improves the ability of the EAST algorithm to detect long text, and improves detection precision without losing detection efficiency.

Description

Natural scene text detection method introducing attention mechanism
Technical Field
The invention relates to a natural scene text detection method introducing an attention mechanism, and belongs to the technical field of text detection methods.
Background
Under a classification strategy based on the original detection target, existing methods fall into three groups. Character-based detection algorithms first detect single characters or parts of the text and then group them into words. Word-based detection methods extract text directly, in a way similar to general object detection. Text-line-based detection algorithms first detect lines of text and then subdivide the lines into words. Detection methods under a classification strategy based on the shape of the target bounding box can be divided into two categories. The first category is horizontal or near-horizontal detection methods, which focus on detecting horizontal or near-horizontal text in the image. The second category is multi-directional detection methods; compared with horizontal or near-horizontal detection, multi-directional text detection is more robust, because text in a natural scene can lie in any direction in an image. The main research methods of this type exploit the rotation-invariant features of multi-directional text detection: before feature calculation they first estimate the center, scale and direction of the detection target, and then group chain-level features according to size change, color self-similarity and structural self-similarity.
The EAST algorithm provides a fast and accurate scene text detection pipeline with only two stages. The pipeline employs a fully convolutional network (FCN) model to directly generate word- or text-line-level predictions, without redundant and slow intermediate steps. The generated text predictions, which may be rotated rectangles or quadrangles, are sent to non-maximum suppression to produce the final result, as shown in Fig. 2. However, the method is limited when extracting long text, and its detection effect on long text is poor.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: a natural scene text detection method introducing an attention mechanism is provided to solve the problems in the prior art.
The technical scheme adopted by the invention is as follows: a natural scene text detection method introducing an attention mechanism comprises the following steps: in the process of down-sampling images with a PVANet network, a spatial attention module is generated from the spatial relationship of the intermediate text feature information and is used to capture importance information for the target area in two-dimensional space; the feature information generated by each convolution satisfies I ∈ R^{1×H×W} and is activated through the sigmoid function, with the expression:

W_S(I) = σ(f^{7×7}(Pool(I)))    (4)

where f^{7×7} is a convolution operation with a 7×7 convolution kernel. In the image up-sampling process, features are extracted by unpooling and used to approximate the target position features to generate a channel attention module, which is then adjusted through a shared MLP network, with the expression:

W_C(I′) = σ(MLP(unpool(I′))) = σ(W_1 W_0 I′)    (5)

where σ is the sigmoid activation function, and W_0 ∈ R^{C/r×C} and W_1 ∈ R^{C×C/r} are the MLP weights. Finally, in the process of feature fusion, the channel attention weights and the spatial attention weights form the whole branch attention model; the process is expressed as:

I′ = (W_S(I) + 1) ⊙ I    (6)
I″ = (W_C(I′) + 1) ⊙ I′    (7)

where ⊙ denotes element-wise multiplication of the corresponding matrices. Since each module is finally activated with the sigmoid function, every element of the attention channel lies between 0 and 1, so the attention modules enhance useful image information and suppress useless information.
The invention has the beneficial effects that: compared with the prior art, the invention addresses the problem that the field of view of the EAST algorithm is limited when extracting text direction features. By introducing an attention mechanism into the backbone network PVANet, an Attention-EAST detection method is obtained, so that the training model pays more attention to useful information and suppresses useless information when extracting text target features. Experiments show that the method effectively improves the ability of the EAST algorithm to detect long text and improves detection precision without losing detection efficiency.
Drawings
FIG. 1 is a basic flow diagram of an object detection algorithm;
FIG. 2 is a block diagram of the EAST algorithm;
FIG. 3 is a diagram of the Attention-EAST algorithm architecture;
FIG. 4 is a diagram of the EAST algorithm long text detection effect;
FIG. 5 is a diagram of the detection effect of the Attention-EAST algorithm for long text.
Detailed Description
The invention is further described with reference to the accompanying drawings and specific embodiments.
The feasibility of visual attention is mainly due to the reasonable assumption that human vision does not immediately process the entire image as a whole; instead, one focuses on selective portions of the entire visual space only when and where it is needed. In particular, attention is not directed to encoding images as static vectors, but rather to allowing image features to evolve from the sentence context at hand, resulting in richer and longer descriptions of cluttered images. In this way, visual attention can be viewed as a dynamic feature extraction mechanism that incorporates contextual localization over time.
When an attention mechanism is added to an image processing task to describe the features and information of the detection target in the image, the feature information the attention module needs to process contains an explicit sequence A = {a_1, a_2, a_3, …, a_L}, a_i ∈ R^D, where L represents the number of feature vectors and D represents the spatial dimension. The attention mechanism therefore needs to compute, at the current time t, the weight α_{t,i} of each feature vector a_i, with the formulas:

e_{ti} = f_att(a_i, h_{t-1})    (1)

α_{t,i} = exp(e_{ti}) / Σ_{k=1}^{L} exp(e_{tk})    (2)

where f_att(·) denotes a multi-layer perceptron, e_{ti} is an intermediate variable, h_{t-1} is the hidden state at the previous time step, and k indexes the feature vectors. After the weights are computed, the model can screen the input sequence A; the screened sequence item is:

ẑ_t = φ({a_i}, {α_{t,i}})    (3)

The choice of the function φ determines whether the attention mechanism is hard or soft.
Example 1: as shown in Figs. 3 to 5, a natural scene text detection method introducing an attention mechanism comprises the following steps: in the process of down-sampling images with a PVANet network, a spatial attention module is generated from the spatial relationship of the intermediate text feature information and is used to capture importance information for the target area in two-dimensional space; the feature information generated by each convolution satisfies I ∈ R^{1×H×W} and is activated through the sigmoid function, with the expression:

W_S(I) = σ(f^{7×7}(Pool(I)))    (4)

where f^{7×7} is a convolution operation with a 7×7 convolution kernel. In the image up-sampling process, features are extracted by unpooling and used to approximate the target position features to generate a channel attention module, which is then adjusted through a shared MLP network, with the expression:

W_C(I′) = σ(MLP(unpool(I′))) = σ(W_1 W_0 I′)    (5)

where σ is the sigmoid activation function, and W_0 ∈ R^{C/r×C} and W_1 ∈ R^{C×C/r} are the MLP weights. Finally, in the process of feature fusion, the channel attention weights and the spatial attention weights form the whole branch attention model; the process is expressed as:

I′ = (W_S(I) + 1) ⊙ I    (6)
I″ = (W_C(I′) + 1) ⊙ I′    (7)

where ⊙ denotes element-wise multiplication of the corresponding matrices. Since each module is finally activated with the sigmoid function, every element of the attention channel lies in [0, 1], so the attention modules enhance useful image information and suppress useless information.
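The following NumPy sketch mirrors Eqs. (4)-(7). It is an illustration rather than the patent's implementation: the pooling choices and the stand-in for the 7×7 convolution are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def spatial_attention(I, conv7x7):
    """Eq. (4): W_S(I) = sigmoid(f_7x7(Pool(I))). Pool is taken here as a
    channel-wise average; conv7x7 stands in for the real 7x7 convolution."""
    pooled = I.mean(axis=0, keepdims=True)   # (1, H, W) spatial descriptor
    return sigmoid(conv7x7(pooled))          # per-pixel weights in (0, 1)

def channel_attention(I1, W0, W1):
    """Eq. (5): W_C(I') = sigmoid(W1 W0 I'), with the unpooled map reduced to a
    C-dimensional descriptor by global average pooling for this sketch."""
    v = I1.mean(axis=(1, 2))                 # (C,) channel descriptor
    return sigmoid(W1 @ (W0 @ v))            # (C,) weights in (0, 1)

# Eqs. (6)-(7): residual re-weighting of the feature map
rng = np.random.default_rng(1)
C, H, W, r = 8, 16, 16, 2
I = rng.normal(size=(C, H, W))
conv7x7 = lambda x: x                        # placeholder for a trained 7x7 conv
I1 = (spatial_attention(I, conv7x7) + 1.0) * I                  # Eq. (6)
W0 = rng.normal(size=(C // r, C))            # W0 in R^{C/r x C}
W1 = rng.normal(size=(C, C // r))            # W1 in R^{C x C/r}
I2 = (channel_attention(I1, W0, W1)[:, None, None] + 1.0) * I1  # Eq. (7)
```

The "+ 1" in Eqs. (6)-(7) makes the re-weighting residual: even a zero attention weight passes the original feature through unchanged rather than suppressing it entirely.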
The text detection method of the invention has the following loss function:

L = L_s + λ_g L_g    (8)

where L_s and L_g denote the losses of the score map and the geometry, respectively, and λ_g represents the importance between the two losses; the invention sets λ_g to 1. To simplify the training process, the invention introduces class-balanced cross-entropy:

L_s = −β Y* log Ŷ − (1 − β)(1 − Y*) log(1 − Ŷ)    (9)

where Ŷ is the predicted value of the score map and Y* is the ground truth. The parameter β is a balance factor between positive and negative samples, given by:

β = 1 − (Σ_{y*∈Y*} y*) / |Y*|    (10)
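A minimal NumPy sketch of the class-balanced cross-entropy of Eqs. (9)-(10); the epsilon clipping is an added numerical-stability assumption, not part of the patent:

```python
import numpy as np

def balanced_bce(y_hat, y_true, eps=1e-7):
    """Eqs. (9)-(10): class-balanced cross-entropy for the score map.
    y_hat: predicted scores in (0, 1); y_true: binary ground-truth map."""
    beta = 1.0 - y_true.mean()               # Eq. (10): fraction of negatives
    y_hat = np.clip(y_hat, eps, 1.0 - eps)   # avoid log(0)
    loss = -(beta * y_true * np.log(y_hat)
             + (1.0 - beta) * (1.0 - y_true) * np.log(1.0 - y_hat))
    return loss.mean()
```

Because text pixels are usually a small minority of the image, β is close to 1, so positive pixels are up-weighted relative to the abundant negatives.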
in order to generate accurate geometric predictions for large and small text regions, keeping the regression loss scale unchanged, the rotated rectangular box RBox regression portion employs the IoU loss function because it is fixed for objects of different scales, whose expression is:
Figure BDA0002166919960000054
wherein
Figure BDA0002166919960000055
Expressed as predicted geometric shape, R is its corresponding true shape, intersecting rectangles
Figure BDA0002166919960000056
Respectively, the width and height of (a):
Figure BDA0002166919960000057
wherein d is 1 ,d 2 ,d 3 And d 4 Representing the distance of the pixel to the upper, right, lower and left boundaries of its corresponding rectangle, respectively. The union region is given by the following equation:
Figure BDA0002166919960000061
from this the intersection or union region is calculated, the rotation angle loss is calculated as follows:
Figure BDA0002166919960000062
in the formula (I), the compound is shown in the specification,
Figure BDA0002166919960000063
is a prediction of the angle of rotation, theta * Representing the actual value. Finally, the total geometric loss is calculated as:
L g =L Rθ L θ (15)
in the experimental process, the invention converts lambda into θ Set to 10.
In the algorithm shown in Fig. 3, the key part is a neural network model incorporating the attention modules, which is trained to predict the existence of text instances and their geometry directly from full images. The model is a fully convolutional neural network suited to text detection that outputs dense per-pixel word or text-line predictions, which eliminates intermediate steps such as candidate proposals, text region formation and word segmentation. The post-processing steps include only thresholding of the predicted geometry and NMS. As applied to text detection, the algorithm mainly comprises three parts: a feature extraction network, a feature fusion network and an output layer:
1. Feature extraction network: the convolutional neural network is first pre-trained on an ICDAR data set to generate initialization parameters for the neural network model. Then, in the feature extraction stage, based on a PVANet model, four levels of feature maps with sizes of 1/32, 1/16, 1/8 and 1/4 of the input image are extracted through convolution operations. The spatial attention feature of each feature map, used to focus on text features, is then computed with the spatial attention module; it is denoted f_i (i = 1, 2, 3, 4) and serves as the output for feature merging;
2. Feature fusion network: in this network, the features extracted by the feature extraction network are combined layer by layer (see the first sketch following this list), with the calculation formulas:

g_i = unpool(h_i) (i ≤ 3),  g_i = conv_{3×3}(h_i) (i = 4)    (16)

h_i = f_i (i = 1),  h_i = conv_{3×3}(conv_{1×1}([g_{i−1}; f_i])) (i > 1)    (17)

In each merging stage, the feature map from the previous stage is first fed into an up-sampling layer to enlarge its size, then passed through the channel attention module to focus on text position feature information, and then concatenated with the feature map of the current layer of the feature extraction network. Finally, a Conv1×1 operation reduces the number of channels and the amount of computation, and a Conv3×3 operation fuses the local information to generate the output h_i (i = 1, 2, 3, 4) of the merging stage. After the last merging stage, a Conv3×3 layer generates the final feature map of the merging branch and sends it to the output layer;
3. Output layer: the output layer contains several Conv1×1 operations that project the 32-channel feature map onto a 1-channel score map and a multi-channel geometry map. The geometry map regresses the position of the detected text with a rotated rectangular box: four channels describe the rectangular text box, representing the 4 distances from a pixel position to the top, right, bottom and left boundaries of the rectangle, and one channel represents the rotation angle of the text box (see the second sketch following this list). Finally, the text detected in the image is marked with the generated rotated rectangular box; the detection effect is shown in Fig. 5.
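First sketch, for the merging stage of part 2: one merge step of Eqs. (16)-(17), written as a TensorFlow/Keras illustration under assumed layer sizes, not the patent's exact network. The comment marks where the patent's channel attention module would act.

```python
import tensorflow as tf

def merge_stage(prev, f_i, out_ch):
    """One merging stage: upsample the previous map (the channel attention
    module would act here per the patent), concatenate with the backbone
    feature f_i, then reduce channels with 1x1 and fuse locally with 3x3."""
    g = tf.keras.layers.UpSampling2D(size=2, interpolation="bilinear")(prev)  # unpool
    x = tf.keras.layers.Concatenate(axis=-1)([g, f_i])
    x = tf.keras.layers.Conv2D(out_ch, 1, padding="same", activation="relu")(x)
    return tf.keras.layers.Conv2D(out_ch, 3, padding="same", activation="relu")(x)

# Usage with dummy maps: f_2 has twice the spatial size of h_1
h_1 = tf.random.normal([1, 16, 16, 128])
f_2 = tf.random.normal([1, 32, 32, 64])
h_2 = merge_stage(h_1, f_2, out_ch=64)  # -> shape (1, 32, 32, 64)
```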
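Second sketch, for the output layer of part 3: decoding one pixel's five geometry channels (four boundary distances plus an angle) into the corners of a rotated rectangle. This NumPy illustration assumes the rotation is applied about the pixel position, which the patent text does not specify.

```python
import numpy as np

def decode_rbox(x, y, d, theta):
    """d = (d_top, d_right, d_bottom, d_left): distances from pixel (x, y) to
    the four box boundaries; theta: rotation angle in radians. Returns the
    4 corner coordinates of the rotated rectangle."""
    d1, d2, d3, d4 = d
    # Corners relative to the pixel before rotation (top-left, top-right,
    # bottom-right, bottom-left); the image y-axis points down.
    corners = np.array([[-d4, -d1], [d2, -d1], [d2, d3], [-d4, d3]], dtype=float)
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    return corners @ rot.T + np.array([x, y], dtype=float)

# Usage: a pixel at (100, 50) inside a 40x16 box rotated by 0.1 rad
print(decode_rbox(100, 50, d=(8, 25, 8, 15), theta=0.1))
```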
Model training: the model proposed by the invention is trained end-to-end with an Adam optimizer, following the training scheme of the EAST algorithm. To accelerate learning, 512×512 samples of the original images are uniformly batched 24 at a time. The Adam learning rate starts at 1e-3, is reduced to one tenth every 27,300 mini-batches, and stops at 1e-5; the network is trained until the performance improvement plateaus.
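The stated schedule can be sketched as a step function of the training iteration (illustrative; the exact decay implementation in the original code is not specified):

```python
def learning_rate(step, base=1e-3, decay_steps=27300, floor=1e-5):
    """Start at 1e-3, divide by 10 every 27,300 mini-batches, stop at 1e-5."""
    return max(base * 0.1 ** (step // decay_steps), floor)

assert learning_rate(0) == 1e-3
assert learning_rate(27300) == 1e-4
assert abs(learning_rate(60000) - 1e-5) < 1e-12  # clamped at the floor
```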
Experimental verification and analysis:
the experimental environment is as follows: the experiment is carried out on an Ubuntu18.04LTS operating system, the development language is Python3.6, the integrated development environment is Pycharm, and the deep learning framework is TensorFlow of a GPU version. The hardware configuration CPU is i7-6700k with four cores and eight threads, the main frequency is 4GHz, the memory is 32GB, the GPU is NVIDIA GTX 1080T, and the video memory is 11G.
The experimental results are as follows: the data set adopted in the experiments is the one used in the ICDAR challenge, which is also a popular data set for text detection algorithms. It contains 1500 pictures: 1000 pictures are used for model training and 500 pictures form the test set. Text regions are annotated by the four vertices of a quadrangle, corresponding to the four-sided geometry of the target text. The pictures were shot casually with mobile phones or cameras, so the text in a scene can lie in any direction and may be affected by the natural environment; these characteristics make the data set well suited to evaluating and verifying text detection algorithms.
The invention compares the detection results of the Attention-EAST algorithm and the EAST algorithm on long text in natural scenes, as shown in Figs. 4-5. It can be seen that adding the attention modules strengthens the extraction of text and direction feature information, enlarges the text detection field of view, and effectively improves the detection of long text. Meanwhile, three indexes, Recall, Precision and the weighted harmonic mean F-measure, are used to evaluate the training effect of the detection method on the ICDAR data set. The experimental results are shown in Table 1; they show that the performance indexes of the method introducing the attention mechanism proposed herein are all improved over the original EAST algorithm.
Table 1 Comparison of experimental results of each text detection algorithm

Algorithm        Recall   Precision   F-measure
Attention-EAST   0.7902   0.8401      0.8144
EAST             0.7831   0.8224      0.8022
In order to analyze the influence of the attention modules on the detection efficiency of the original EAST algorithm, the frames-per-second (FPS) index, which represents the number of pictures processed per second, is adopted in the same experimental environment to evaluate the detection efficiency of both algorithms; the 500 detection pictures in the test set are randomly divided into 5 parts and tested separately. The experimental results are shown in Table 2; it can be seen that adding the attention modules does not lose the detection efficiency of the original algorithm.
Table 2 Text detection efficiency comparison data (FPS) of the two algorithms
(Table 2 appears as an image in the original publication; the per-part FPS values are not reproduced in the text.)
The above description is only an embodiment of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of changes or substitutions within the technical scope of the present invention, and therefore, the scope of the present invention should be determined by the scope of the claims.

Claims (1)

1. A natural scene text detection method introducing an attention mechanism, characterized in that the method comprises the following steps: in the process of down-sampling images with a PVANet network, a spatial attention module is generated from the spatial relationship of the intermediate text feature information and is used to capture importance information for the target area in two-dimensional space; the feature information generated by each convolution satisfies I ∈ R^{1×H×W} and is activated through the sigmoid function, with the expression:

W_S(I) = σ(f^{7×7}(Pool(I)))    (4)

where f^{7×7} is a convolution operation with a 7×7 convolution kernel; in the image up-sampling process, features are extracted by unpooling and used to approximate the target position features to generate a channel attention module, which is then adjusted through a shared MLP network, with the expression:

W_C(I′) = σ(MLP(unpool(I′))) = σ(W_1 W_0 I′)    (5)

where σ is the sigmoid activation function, and W_0 ∈ R^{C/r×C} and W_1 ∈ R^{C×C/r} are the MLP weights; finally, in the process of feature fusion, the channel attention weights and the spatial attention weights form the whole branch attention model; the process is expressed as:

I′ = (W_S(I) + 1) ⊙ I    (6)
I″ = (W_C(I′) + 1) ⊙ I′    (7)

where ⊙ denotes element-wise multiplication of the corresponding matrices, resulting in an attention channel whose element values lie in [0, 1].
CN201910750169.1A 2019-08-14 2019-08-14 Natural scene text detection method introducing attention mechanism Active CN110458165B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910750169.1A CN110458165B (en) 2019-08-14 2019-08-14 Natural scene text detection method introducing attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910750169.1A CN110458165B (en) 2019-08-14 2019-08-14 Natural scene text detection method introducing attention mechanism

Publications (2)

Publication Number Publication Date
CN110458165A CN110458165A (en) 2019-11-15
CN110458165B true CN110458165B (en) 2022-11-08

Family

ID=68486514

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910750169.1A Active CN110458165B (en) 2019-08-14 2019-08-14 Natural scene text detection method introducing attention mechanism

Country Status (1)

Country Link
CN (1) CN110458165B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111126243B (en) * 2019-12-19 2023-04-07 北京科技大学 Image data detection method and device and computer readable storage medium
CN113311700B (en) * 2020-02-27 2022-10-04 陕西师范大学 UUV cluster cooperative control method guided by non-average mechanism
CN111414875B (en) * 2020-03-26 2023-06-02 电子科技大学 Three-dimensional point cloud head posture estimation system based on depth regression forest
CN111753825A (en) * 2020-03-27 2020-10-09 北京京东尚科信息技术有限公司 Image description generation method, device, system, medium and electronic equipment
CN112749621B (en) * 2020-11-25 2023-06-13 厦门理工学院 Remote sensing image cloud layer detection method based on deep convolutional neural network
CN112446372B (en) * 2020-12-08 2022-11-08 电子科技大学 Text detection method based on channel grouping attention mechanism
CN113052159A (en) * 2021-04-14 2021-06-29 中国移动通信集团陕西有限公司 Image identification method, device, equipment and computer storage medium
CN113255646B (en) * 2021-06-02 2022-10-18 北京理工大学 Real-time scene text detection method
CN113554026A (en) * 2021-07-28 2021-10-26 广东电网有限责任公司 Power equipment nameplate identification method and device and electronic equipment
CN114863437B (en) * 2022-04-21 2023-04-07 北京百度网讯科技有限公司 Text recognition method and device, electronic equipment and storage medium
CN116636423B (en) * 2023-07-26 2023-09-26 云南农业大学 Efficient cultivation method of poria cocos strain


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109376611B (en) * 2018-09-27 2022-05-20 方玉明 Video significance detection method based on 3D convolutional neural network

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108288088A (en) * 2018-01-17 2018-07-17 浙江大学 A kind of scene text detection method based on end-to-end full convolutional neural networks
CN109299262A (en) * 2018-10-09 2019-02-01 中山大学 A kind of text implication relation recognition methods for merging more granular informations
CN109165697A (en) * 2018-10-12 2019-01-08 福州大学 A kind of natural scene character detecting method based on attention mechanism convolutional neural networks
CN109389091A (en) * 2018-10-22 2019-02-26 重庆邮电大学 The character identification system and method combined based on neural network and attention mechanism

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
EAST: An Efficient and Accurate Scene Text Detector; Xinyu Zhou et al.; 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 2017-11-09; 2642-2651 *
Link prediction algorithm based on the attention mechanism; Cheng Hua et al.; Journal of Huazhong University of Science and Technology (Natural Science Edition); 2019-02-28; Vol. 47, No. 02; 109-114 *
Deep learning image object detection combined with an attention mechanism; Sun Ping et al.; Computer Engineering and Applications; 2019-04-29; Vol. 55, No. 17; 180-184 *

Also Published As

Publication number Publication date
CN110458165A (en) 2019-11-15

Similar Documents

Publication Publication Date Title
CN110458165B (en) Natural scene text detection method introducing attention mechanism
CN110276316B (en) Human body key point detection method based on deep learning
TWI773189B (en) Method of detecting object based on artificial intelligence, device, equipment and computer-readable storage medium
CN108647585B (en) Traffic identifier detection method based on multi-scale circulation attention network
CN111126472A (en) Improved target detection method based on SSD
CN111950453B (en) Random shape text recognition method based on selective attention mechanism
CN108304820B (en) Face detection method and device and terminal equipment
CN111126258A (en) Image recognition method and related device
CN111079739B (en) Multi-scale attention feature detection method
CN113780296A (en) Remote sensing image semantic segmentation method and system based on multi-scale information fusion
KR20200091331A Learning method and learning device for object detector based on cnn, adaptable to customers' requirements such as key performance index, using target object merging network and target region estimating network, and testing method and testing device using the same to be used for multi-camera or surround view monitoring
CN113807361B (en) Neural network, target detection method, neural network training method and related products
CN112464912B (en) Robot end face detection method based on YOLO-RGGNet
CN113781164B (en) Virtual fitting model training method, virtual fitting method and related devices
CN110889421A (en) Target detection method and device
CN114463759A (en) Lightweight character detection method and device based on anchor-frame-free algorithm
CN114241277A (en) Attention-guided multi-feature fusion disguised target detection method, device, equipment and medium
CN111179272B (en) Rapid semantic segmentation method for road scene
CN116645592A (en) Crack detection method based on image processing and storage medium
CN115187786A (en) Rotation-based CenterNet2 target detection method
CN113239866B (en) Face recognition method and system based on space-time feature fusion and sample attention enhancement
CN115546468A (en) Method for detecting elongated object target based on transform
CN112052865A (en) Method and apparatus for generating neural network model
CN113393434A (en) RGB-D significance detection method based on asymmetric double-current network architecture
CN112070040A (en) Text line detection method for video subtitles

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant