CN106650725A - Full convolutional neural network-based candidate text box generation and text detection method - Google Patents

Publication number
CN106650725A
Authority
CN
China
Prior art keywords
text
candidate
detection
network
inception
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201611070587.9A
Other languages
Chinese (zh)
Other versions
CN106650725B (en)
Inventor
马景法
金连文
钟卓耀
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Priority to CN201611070587.9A priority Critical patent/CN106650725B/en
Publication of CN106650725A publication Critical patent/CN106650725A/en
Application granted granted Critical
Publication of CN106650725B publication Critical patent/CN106650725B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/62Text, e.g. of license plates, overlay texts or captions on TV images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Abstract

The invention discloses a candidate text box generation and text detection method based on a fully convolutional neural network. The method comprises: generating text region candidate boxes, in which an inception-RPN takes a natural scene image and a set of ground-truth bounding boxes marking the text regions as input and generates a controllable number of word region candidate boxes by sliding an inception network over the convolutional feature response map of a VGG16 model, assisted at each sliding position by a set of text-characteristic prior boxes; incorporating supervision information for an ambiguous text category, fusing multi-level region down-sampling information, and performing text detection; training the inception candidate box generation network and the text detection network end to end through back propagation and stochastic gradient descent; and applying iterative voting over candidate boxes to obtain a higher text recall rate in a complementary manner, then removing redundant detection boxes with a candidate box filtering algorithm. The method obtains accuracy rates of 0.83 and 0.85 on the ICDAR 2011 and 2013 robust text detection benchmark datasets, which is superior to the previous best results.

Description

Candidate text box generation and text detection method based on a fully convolutional neural network
Technical field
The present invention relates to techniques for generating text candidate boxes and detecting text in natural scene images, and more particularly to a candidate text box generation and text detection method based on a fully convolutional neural network.
Background
Text in images provides rich and accurate high-level semantic information, which is important for a large number of potential applications such as scene understanding, image and video retrieval, and content-based recommendation systems. Text detection in natural scene images has therefore attracted considerable attention in the computer vision and image understanding communities. However, text detection in natural scenes remains a challenging and unsolved problem. First, the background of a text image is very complex, and regions such as symbols, signs, bricks and grass are very hard to distinguish from text. In addition, confounding factors such as uneven illumination, strong exposure, low contrast, blur and low resolution add great difficulty to the text detection task.
Summary of the invention
To overcome the deficiencies of the prior art, the present invention proposes a candidate text box generation and text detection method based on a fully convolutional neural network.
The technical solution of the invention is realized as follows:
A candidate text box generation and text detection method based on a fully convolutional neural network, comprising the steps of:
S1: Generating text region candidate boxes. An inception-RPN takes a natural scene image and a set of ground-truth bounding boxes marking the text regions as input and produces a controllable number of word region candidate boxes; an inception network is slid over the convolutional feature response map of the VGG16 model, assisted at each sliding position by a set of text-characteristic prior boxes;
S2: Incorporating supervision information for an ambiguous text category, fusing multi-level region down-sampling information, and performing text detection;
S3: Training the inception candidate box generation network and the text detection network end to end through back propagation and stochastic gradient descent;
S4: Applying candidate box iterative voting to obtain a higher text recall rate in a complementary manner, and using a candidate box filtering algorithm to remove redundant detection boxes.
Further, step S1 comprises the steps of:
S11: Designing the text-characteristic prior boxes;
S12: Building the inception candidate box generation network.
Further, there are 24 kinds of text-characteristic prior boxes in step S11: at each sliding position the sliding window width is set to 32, 48, 64 and 80, and the aspect ratio to 0.2, 0.5, 0.8, 1.0, 1.2 and 1.5.
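The 24 prior boxes follow directly from combining the four widths with the six aspect ratios. The following is a minimal sketch of that enumeration; the function name `prior_boxes`, the corner-coordinate box format and the reading of aspect ratio as height divided by width are assumptions made for illustration and are not stated in the text.

```python
from itertools import product

# Values taken from the text: four window widths and six aspect ratios.
WIDTHS = (32, 48, 64, 80)
ASPECT_RATIOS = (0.2, 0.5, 0.8, 1.0, 1.2, 1.5)


def prior_boxes(cx, cy, widths=WIDTHS, ratios=ASPECT_RATIOS):
    """Return (x1, y1, x2, y2) prior boxes centred on the point (cx, cy).

    Assumption: aspect ratio means height / width, so height = width * ratio.
    """
    boxes = []
    for w, r in product(widths, ratios):
        h = w * r
        boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return boxes


boxes = prior_boxes(cx=100.0, cy=60.0)
print(len(boxes))  # 24
```

Under this reading, ratios below 1.0 give wide, flat boxes, which suits horizontal words.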
Further, in step S12 the inception candidate box generation network consists of a 3×3 convolutional layer, a 5×5 convolutional layer and a 3×3 max pooling layer, each connected to the corresponding spatial receptive field of the Conv5_3 feature response map, which serves as the input.
Further, the text category supervision information in step S2 is: a candidate box whose IoU overlap is greater than or equal to 0.5 is labeled as text; a candidate box whose IoU overlap is greater than or equal to 0.2 and less than 0.5 is labeled as "ambiguous text"; all other candidate boxes are labeled as containing no text.
Further, the multi-level region down-sampling information in step S2 is: multi-level region down-sampling is performed on the convolutional feature response maps of Conv4_3 and Conv5_3 of the VGG16 network to obtain two sampled features of size 512×H×W, and the concatenated feature is then decoded by a 512×1×1 convolutional layer.
The beneficial effects of the invention are as follows. Compared with the prior art, the invention proposes an inception candidate box generation network that applies sliding windows of different sizes to the convolutional feature map and, assisted by a set of text-characteristic prior boxes at each sliding position, generates word region candidate boxes. Sliding windows of different sizes retain local information at each position while also taking contextual information into account, which helps filter out candidate boxes that contain no text; the inception candidate box generation network achieves a very high recall rate with only a few hundred word candidate boxes. The invention also introduces into the text detection network additional supervision information for an ambiguous text category and fuses multi-level region down-sampling information, which helps the text detection network learn more discriminative features for separating text from complex backgrounds. In addition, to make better use of the models produced during training, the invention proposes a candidate box iterative voting scheme that obtains a higher word recall rate in a complementary manner, and the filtering algorithm used by the invention retains the best candidate boxes and removes redundant ones.
Brief description of the drawings
Fig. 1 is a flow chart of the candidate text box generation and text detection method based on a fully convolutional neural network according to the invention.
Fig. 2 shows examples of word region candidate boxes whose IoU overlap falls within specific intervals, according to one embodiment of the invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the invention.
Referring to Fig. 1, the candidate text box generation and text detection method based on a fully convolutional neural network comprises four steps: S1, text region candidate box generation; S2, text detection; S3, end-to-end learning and optimization; S4, heuristic processing.
Step S1 works as follows. The inception-RPN takes a natural scene image and a set of ground-truth bounding boxes marking the text regions as input and produces a controllable number of word region candidate boxes. To search for word region candidate boxes, an inception network is slid over the convolutional feature response map of the VGG16 model, assisted at each sliding position by a set of text-characteristic prior boxes. The step has two parts: (1) designing the text-characteristic prior boxes and (2) building the inception candidate box generation network. Each sliding position is given four scales (32, 48, 64 and 80) and six aspect ratios (0.2, 0.5, 0.8, 1.0, 1.2 and 1.5), for a total of k = 24 prior sliding windows. During learning, a prior box whose intersection over union (IoU) with a ground-truth text box is greater than 0.5 is assigned the text label, and a prior box whose IoU is less than 0.3 is assigned the background label. The inception candidate box generation network consists of a 3×3 convolutional layer, a 5×5 convolutional layer and a 3×3 max pooling layer, each connected to the corresponding spatial receptive field of the Conv5_3 feature response map, which serves as the input. In addition, to reduce dimensionality, a 1×1 convolution is applied on top of the 3×3 max pooling layer. The features of the individual branches are then concatenated along the channel axis, and the resulting 640-dimensional feature vector is fed to two output layers: a classification layer that predicts a score for whether the region contains text, and a regression layer that refines the text region position for each prior window at each sliding position.
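The label assignment used while training the candidate box generation network can be sketched as below. The thresholds 0.5 and 0.3 come from the text; the corner-coordinate box format, the function names and the choice to mark the in-between band with -1 (ignored during training) are assumptions, since the text does not say how boxes between the two thresholds are handled.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def rpn_label(prior, gt_boxes, pos_thr=0.5, neg_thr=0.3):
    """Return 1 (text), 0 (background) or -1 (assumed: ignored)."""
    best = max((iou(prior, g) for g in gt_boxes), default=0.0)
    if best > pos_thr:
        return 1
    if best < neg_thr:
        return 0
    return -1  # assumption: boxes between the thresholds are not used
```

For example, with a single ground-truth box `(0, 0, 100, 20)`, a prior box `(0, 0, 100, 50)` overlaps it with IoU 0.4 and falls into the in-between band.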
Step S2 comprises: (1) incorporating supervision information for an ambiguous text category, which provides more reasonable supervision, helps the classifier learn more discriminative features, identifies text regions against complex and varied backgrounds, and filters out candidate boxes that contain no text; and (2) fusing multi-level region down-sampling information, which makes better use of multi-level convolutional features and enriches the discriminative information of each sliding window.
Many previous works label a candidate box as text in the detection network when its IoU overlap is greater than 0.5 and as non-text otherwise. This way of deciding whether a candidate box contains text is unreasonable, because a box whose IoU overlap lies between 0.2 and 0.5 may contain partial or extended text, as shown in Fig. 2. Such mixed labels disturb the learning of the text versus non-text classification. For this reason, the invention labels a candidate box whose IoU overlap is greater than or equal to 0.5 as text, a candidate box whose IoU overlap is greater than or equal to 0.2 and less than 0.5 as "ambiguous text", and all other candidate boxes as containing no text. This strategy provides more reasonable supervision and helps the classifier learn more discriminative features, so that text can be identified against complex and varied backgrounds and candidate boxes without text can be filtered out.
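The three-way labeling rule for the text detection network can be written as a small function. This is a minimal sketch: the thresholds 0.5 and 0.2 are from the text, while the function name and the string labels are illustrative assumptions.

```python
def detection_label(iou_overlap):
    """Map a candidate box's IoU overlap with the ground truth to a class.

    >= 0.5        -> "text"
    [0.2, 0.5)    -> "ambiguous text"
    otherwise     -> "background" (no text)
    """
    if iou_overlap >= 0.5:
        return "text"
    if iou_overlap >= 0.2:
        return "ambiguous text"
    return "background"


for value in (0.75, 0.5, 0.35, 0.2, 0.1):
    print(value, detection_label(value))
```

Note that both boundaries are inclusive on the lower side, so an overlap of exactly 0.5 counts as text and exactly 0.2 counts as ambiguous text.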
To make better use of multi-level convolutional features and to enrich the discriminative information of each candidate box, the invention performs multi-level region down-sampling on the convolutional feature response maps of Conv4_3 and Conv5_3 of the VGG16 network and obtains two sampled features of size 512×H×W. The concatenated feature is then decoded by a 512×1×1 convolutional layer. This 1×1 convolutional layer (1) combines the multi-level sampled features and learns the fusion weights during training, and (2) reduces the dimensionality to match the first fully connected layer of VGG16.
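The fusion step amounts to concatenating the two pooled features along the channel axis and applying a 1×1 convolution, which is a per-pixel linear map over channels. The sketch below shows this with plain nested lists and toy sizes; in the method described above the two inputs would each have 512 channels and the convolution would map the 1024 concatenated channels to 512. The function names and the toy weights are assumptions for illustration only.

```python
def concat_channels(feat_a, feat_b):
    """Concatenate two [C][H][W] features along the channel axis."""
    return feat_a + feat_b


def conv1x1(feat, weight, bias):
    """Apply a 1x1 convolution: weight is [C_out][C_in], bias is [C_out]."""
    height, width = len(feat[0]), len(feat[0][0])
    out = []
    for w_row, b in zip(weight, bias):
        plane = [
            [b + sum(wc * ch[y][x] for wc, ch in zip(w_row, feat))
             for x in range(width)]
            for y in range(height)
        ]
        out.append(plane)
    return out


# Toy example: one channel pooled from each level, fused into one channel.
pooled_conv4 = [[[1.0, 2.0], [3.0, 4.0]]]
pooled_conv5 = [[[10.0, 20.0], [30.0, 40.0]]]
fused = conv1x1(concat_channels(pooled_conv4, pooled_conv5),
                weight=[[0.5, 0.5]], bias=[0.0])
print(fused)
```

Because the weights are learned, training decides how much each level contributes at every output channel, which matches role (1) of the 1×1 layer; choosing fewer output than input channels gives role (2).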
Step S3 differs from the previously proposed four-step training strategy that combines RPN and Fast R-CNN: the invention trains the inception candidate box generation network and the text detection network end to end by back propagation and stochastic gradient descent. The shared convolutional layers are initialized from a pre-trained ImageNet classification network. The weights of the new layers are initialized from a Gaussian distribution with mean 0 and standard deviation 0.01. The base learning rate is 0.001 and is reduced to one tenth of its value every 40,000 iterations. The momentum and weight decay are set to 0.9 and 0.0005 respectively.
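The hyperparameters in this paragraph can be illustrated with a step learning-rate schedule and a single momentum update. This is a sketch only: the numeric values are from the text, while the function names and the exact form of the update rule (momentum applied to a velocity term, weight decay added to the gradient) are assumptions, as the text does not spell out the update equations.

```python
def learning_rate(iteration, base_lr=0.001, step=40000, gamma=0.1):
    """Step schedule: multiply the rate by gamma every `step` iterations."""
    return base_lr * gamma ** (iteration // step)


def sgd_momentum_step(weight, grad, velocity, lr,
                      momentum=0.9, weight_decay=0.0005):
    """One scalar SGD update with momentum and L2 weight decay (assumed form).

    velocity <- momentum * velocity - lr * (grad + weight_decay * weight)
    weight   <- weight + velocity
    """
    velocity = momentum * velocity - lr * (grad + weight_decay * weight)
    return weight + velocity, velocity


for it in (0, 39999, 40000, 80000):
    print(it, learning_rate(it))
```

New-layer weights could be drawn with `random.gauss(0.0, 0.01)` from the standard library to match the stated Gaussian initialization.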
The inception candidate box generation network and the text detection network each have two sibling output layers: a classification layer and a regression layer. Their output layers differ as follows. (1) In the inception candidate box generation network, each prior box is parameterized independently, so all k = 24 prior boxes must be predicted simultaneously. The classification layer outputs 2k scores indicating whether each candidate box contains text, while the regression layer outputs 4k values giving the offsets of the refined candidate boxes relative to the original prior boxes. (2) The text detection network outputs three scores for each candidate box, corresponding to background, ambiguous text and text respectively. Its regression layer outputs 4 regression offsets for each text candidate box. Training minimizes the following multi-task loss function:
$$L(p, p^*, t, t^*) = L_{cls}(p, p^*) + \lambda L_{reg}(t, t^*) \qquad (0.1)$$
The classification loss $L_{cls}$ is the softmax loss, where $p$ and $p^*$ are the predicted label and the ground-truth label respectively. The regression loss $L_{reg}$ is the smooth-L1 loss. In addition, $t = \{t_x, t_y, t_w, t_h\}$ and $t^* = \{t^*_x, t^*_y, t^*_w, t^*_h\}$ denote the predicted and ground-truth regression offset vectors respectively, where $t^*$ is obtained from:
$$t^*_x = \frac{G_x - P_x}{P_w}, \quad t^*_y = \frac{G_y - P_y}{P_h}, \quad t^*_w = \log\frac{G_w}{P_w}, \quad t^*_h = \log\frac{G_h}{P_h}$$
Here, $P = \{P_x, P_y, P_w, P_h\}$ and $G = \{G_x, G_y, G_w, G_h\}$ denote the center coordinates, width and height of the candidate box P and the ground-truth text box G respectively. $\lambda$ is the loss balancing parameter: $\lambda = 3$ is used in the inception candidate box generation network to bias it toward better candidate box positions, and $\lambda = 1$ is used in the text detection network.
Step S4 comprises a candidate box iterative voting mechanism and a filtering algorithm. The iterative voting mechanism lets the invention obtain a higher text recall rate in a complementary manner and improves the performance of the text detection system. The filtering algorithm removes redundant detection boxes to improve precision.
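The text does not give the details of the voting mechanism, so the following is only one plausible sketch and not the patented procedure. It assumes that detections from several intermediate training snapshots are pooled, that overlapping boxes are grouped greedily in descending score order, and that each group is replaced by the score-weighted average of its boxes. All function names, the IoU threshold of 0.5 and the weighting scheme are assumptions; scores are assumed to be positive.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def vote(detections_per_model, iou_thr=0.5):
    """Pool (box, score) detections from several models and merge overlaps."""
    pooled = sorted((det for dets in detections_per_model for det in dets),
                    key=lambda det: det[1], reverse=True)
    used = [False] * len(pooled)
    merged = []
    for i, (box, score) in enumerate(pooled):
        if used[i]:
            continue
        # The seed box plus every unused lower-scoring box that overlaps it.
        group = [i] + [j for j in range(i + 1, len(pooled))
                       if not used[j] and iou(box, pooled[j][0]) >= iou_thr]
        for j in group:
            used[j] = True
        total = sum(pooled[j][1] for j in group)
        voted = tuple(sum(pooled[j][0][k] * pooled[j][1] for j in group) / total
                      for k in range(4))
        merged.append((voted, score))
    return merged


model_a = [((0, 0, 10, 10), 0.9)]
model_b = [((0, 0, 12, 10), 0.3), ((50, 50, 60, 60), 0.8)]
print(vote([model_a, model_b]))
```

In this toy run the box found only by the second model survives, which illustrates the complementary gain in recall, while the two overlapping boxes are merged into one.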
The invention first feeds a natural scene image and a set of ground-truth text boxes into the inception candidate box generation network, which produces a certain number of word region candidate boxes. These candidate boxes are then passed to a text detection network that performs text versus non-text classification and text localization; during training this network additionally uses the supervision information for the ambiguous text category and fuses multi-level region down-sampling information. The whole system is trained end to end by back propagation and gradient descent. To make full use of the intermediate models produced during training, the invention uses the candidate box iterative voting mechanism to obtain a high recall rate of text instances in a complementary manner, which improves the performance of the whole text detection system. Finally, the invention applies a filtering algorithm that uses coordinate positions to find the inner and outer candidate boxes of each text instance, retains the high-score candidate boxes and removes the low-score ones.
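The filtering step is described only in outline, so the sketch below is an assumed reading of it: two boxes belong to the same text instance when one lies entirely inside the other, and of such a pair only the higher-scoring box is kept. The function names, the containment test and the tie-breaking rule (keep the earlier box) are assumptions for illustration.

```python
def contains(outer, inner):
    """True if box `inner` lies entirely within box `outer` (x1, y1, x2, y2)."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def filter_nested(detections):
    """Drop the lower-scoring box of every nested pair of (box, score) items.

    A box is removed as soon as any box nested with it scores higher,
    even if that other box is itself removed by a third one.
    """
    kept = []
    for i, (box_i, score_i) in enumerate(detections):
        dominated = False
        for j, (box_j, score_j) in enumerate(detections):
            if i == j:
                continue
            nested = contains(box_i, box_j) or contains(box_j, box_i)
            lower = score_i < score_j or (score_i == score_j and i > j)
            if nested and lower:
                dominated = True
                break
        if not dominated:
            kept.append((box_i, score_i))
    return kept


detections = [((0, 0, 100, 20), 0.9),    # outer box of a word
              ((10, 2, 50, 18), 0.6),    # inner box of the same word
              ((200, 0, 260, 20), 0.7)]  # separate word
print(filter_nested(detections))
```

Here the inner box with score 0.6 is removed and the two remaining boxes, which do not nest, are both kept.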
The above is the preferred embodiment of the invention. It should be noted that a person of ordinary skill in the art may make improvements and modifications without departing from the principles of the invention, and such improvements and modifications are also regarded as falling within the scope of protection of the invention.

Claims (6)

1. A candidate text box generation and text detection method based on a fully convolutional neural network, characterized by comprising the steps of:
S1: Generating text region candidate boxes, wherein an inception-RPN takes a natural scene image and a set of ground-truth bounding boxes marking the text regions as input and produces a controllable number of word region candidate boxes, an inception network being slid over the convolutional feature response map of the VGG16 model and assisted at each sliding position by a set of text-characteristic prior boxes;
S2: Incorporating supervision information for an ambiguous text category, fusing multi-level region down-sampling information, and performing text detection;
S3: Training the inception candidate box generation network and the text detection network end to end through back propagation and stochastic gradient descent;
S4: Applying candidate box iterative voting to obtain a higher text recall rate in a complementary manner, and using a candidate box filtering algorithm to remove redundant detection boxes.
2. The candidate text box generation and text detection method based on a fully convolutional neural network as claimed in claim 1, characterized in that step S1 comprises the steps of:
S11: Designing the text-characteristic prior boxes;
S12: Building the inception candidate box generation network.
3. The candidate text box generation and text detection method based on a fully convolutional neural network as claimed in claim 2, characterized in that there are 24 kinds of text-characteristic prior boxes in step S11, wherein at each sliding position the sliding window width is set to 32, 48, 64 and 80, and the aspect ratio to 0.2, 0.5, 0.8, 1.0, 1.2 and 1.5.
4. The candidate text box generation and text detection method based on a fully convolutional neural network as claimed in claim 2, characterized in that in step S12 the inception candidate box generation network consists of a 3×3 convolutional layer, a 5×5 convolutional layer and a 3×3 max pooling layer, each connected to the corresponding spatial receptive field of the Conv5_3 feature response map, which serves as the input.
5. The candidate text box generation and text detection method based on a fully convolutional neural network as claimed in claim 1, characterized in that the text category supervision information in step S2 is: a candidate box whose IoU overlap is greater than or equal to 0.5 is labeled as text; a candidate box whose IoU overlap is greater than or equal to 0.2 and less than 0.5 is labeled as "ambiguous text"; all other candidate boxes are labeled as containing no text.
6. The candidate text box generation and text detection method based on a fully convolutional neural network as claimed in claim 1, characterized in that the multi-level region down-sampling information in step S2 is: multi-level region down-sampling is performed on the convolutional feature response maps of Conv4_3 and Conv5_3 of the VGG16 network to obtain two sampled features of size 512×H×W, and the concatenated feature is then decoded by a 512×1×1 convolutional layer.
CN201611070587.9A 2016-11-29 2016-11-29 Candidate text box generation and text detection method based on full convolution neural network Active CN106650725B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611070587.9A CN106650725B (en) 2016-11-29 2016-11-29 Candidate text box generation and text detection method based on full convolution neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611070587.9A CN106650725B (en) 2016-11-29 2016-11-29 Candidate text box generation and text detection method based on full convolution neural network

Publications (2)

Publication Number Publication Date
CN106650725A true CN106650725A (en) 2017-05-10
CN106650725B CN106650725B (en) 2020-06-26

Family

ID=58813359

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611070587.9A Active CN106650725B (en) 2016-11-29 2016-11-29 Candidate text box generation and text detection method based on full convolution neural network

Country Status (1)

Country Link
CN (1) CN106650725B (en)

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107316058A (en) * 2017-06-15 2017-11-03 国家新闻出版广电总局广播科学研究院 Improve the method for target detection performance by improving target classification and positional accuracy
CN107397658A (en) * 2017-07-26 2017-11-28 成都快眼科技有限公司 A kind of multiple dimensioned full convolutional network and vision blind-guiding method and device
CN107480649A (en) * 2017-08-24 2017-12-15 浙江工业大学 A kind of fingerprint pore extracting method based on full convolutional neural networks
CN108090443A (en) * 2017-12-15 2018-05-29 华南理工大学 Scene text detection method and system based on deeply study
CN108154145A (en) * 2018-01-24 2018-06-12 北京地平线机器人技术研发有限公司 The method and apparatus for detecting the position of the text in natural scene image
CN108288088A (en) * 2018-01-17 2018-07-17 浙江大学 A kind of scene text detection method based on end-to-end full convolutional neural networks
CN108647681A (en) * 2018-05-08 2018-10-12 重庆邮电大学 A kind of English text detection method with text orientation correction
CN108764228A (en) * 2018-05-28 2018-11-06 嘉兴善索智能科技有限公司 Word object detection method in a kind of image
CN109165697A (en) * 2018-10-12 2019-01-08 福州大学 A kind of natural scene character detecting method based on attention mechanism convolutional neural networks
CN109190458A (en) * 2018-07-20 2019-01-11 华南理工大学 A kind of person of low position's head inspecting method based on deep learning
CN109299274A (en) * 2018-11-07 2019-02-01 南京大学 A kind of natural scene Method for text detection based on full convolutional neural networks
CN109376658A (en) * 2018-10-26 2019-02-22 信雅达系统工程股份有限公司 A kind of OCR method based on deep learning
CN109389114A (en) * 2017-08-08 2019-02-26 富士通株式会社 Line of text acquisition device and method
CN109492630A (en) * 2018-10-26 2019-03-19 信雅达系统工程股份有限公司 A method of the word area detection positioning in the financial industry image based on deep learning
CN109598290A (en) * 2018-11-22 2019-04-09 上海交通大学 A kind of image small target detecting method combined based on hierarchical detection
CN109800756A (en) * 2018-12-14 2019-05-24 华南理工大学 A kind of text detection recognition methods for the intensive text of Chinese historical document
CN109918987A (en) * 2018-12-29 2019-06-21 中国电子科技集团公司信息科学研究院 A kind of video caption keyword recognition method and device
CN110135408A (en) * 2019-03-26 2019-08-16 北京捷通华声科技股份有限公司 Text image detection method, network and equipment
CN110135424A (en) * 2019-05-23 2019-08-16 阳光保险集团股份有限公司 Tilt text detection model training method and ticket image Method for text detection
CN110135248A (en) * 2019-04-03 2019-08-16 华南理工大学 A kind of natural scene Method for text detection based on deep learning
CN110619325A (en) * 2018-06-20 2019-12-27 北京搜狗科技发展有限公司 Text recognition method and device
CN112418207A (en) * 2020-11-23 2021-02-26 南京审计大学 Weak supervision character detection method based on self-attention distillation
CN112765353A (en) * 2021-01-22 2021-05-07 重庆邮电大学 Scientific research text-based biomedical subject classification method and device
CN113454638A (en) * 2018-12-19 2021-09-28 艾奎菲股份有限公司 System and method for joint learning of complex visual inspection tasks using computer vision

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015132665A2 (en) * 2014-03-07 2015-09-11 Wolf, Lior System and method for the detection and counting of repetitions of repetitive activity via a trained network
CN104915386A (en) * 2015-05-25 2015-09-16 中国科学院自动化研究所 Short text clustering method based on deep semantic feature learning
CN105740892A (en) * 2016-01-27 2016-07-06 北京工业大学 High-accuracy human body multi-position identification method based on convolutional neural network
CN105912611A (en) * 2016-04-05 2016-08-31 中国科学技术大学 CNN based quick image search method

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015132665A2 (en) * 2014-03-07 2015-09-11 Wolf, Lior System and method for the detection and counting of repetitions of repetitive activity via a trained network
CN104915386A (en) * 2015-05-25 2015-09-16 中国科学院自动化研究所 Short text clustering method based on deep semantic feature learning
CN105740892A (en) * 2016-01-27 2016-07-06 北京工业大学 High-accuracy human body multi-position identification method based on convolutional neural network
CN105912611A (en) * 2016-04-05 2016-08-31 中国科学技术大学 CNN based quick image search method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
KEZE WANG 等: "Dictionary Pair Classifier Driven Convolutional Neural Networks for Object Detection", 《2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR)》 *
金连文 等: "深度学习在手写汉字识别中的应用综述", 《自动化学报》 *

Cited By (38)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107316058A (en) * 2017-06-15 2017-11-03 国家新闻出版广电总局广播科学研究院 Improve the method for target detection performance by improving target classification and positional accuracy
CN107397658A (en) * 2017-07-26 2017-11-28 成都快眼科技有限公司 A kind of multiple dimensioned full convolutional network and vision blind-guiding method and device
CN107397658B (en) * 2017-07-26 2020-06-19 成都快眼科技有限公司 Multi-scale full-convolution network and visual blind guiding method and device
CN109389114B (en) * 2017-08-08 2021-12-03 富士通株式会社 Text line acquisition device and method
CN109389114A (en) * 2017-08-08 2019-02-26 富士通株式会社 Line of text acquisition device and method
CN107480649A (en) * 2017-08-24 2017-12-15 浙江工业大学 A kind of fingerprint pore extracting method based on full convolutional neural networks
CN108090443B (en) * 2017-12-15 2020-09-22 华南理工大学 Scene text detection method and system based on deep reinforcement learning
CN108090443A (en) * 2017-12-15 2018-05-29 华南理工大学 Scene text detection method and system based on deeply study
CN108288088A (en) * 2018-01-17 2018-07-17 浙江大学 A kind of scene text detection method based on end-to-end full convolutional neural networks
CN108288088B (en) * 2018-01-17 2020-02-28 浙江大学 Scene text detection method based on end-to-end full convolution neural network
CN108154145B (en) * 2018-01-24 2020-05-19 北京地平线机器人技术研发有限公司 Method and device for detecting position of text in natural scene image
CN108154145A (en) * 2018-01-24 2018-06-12 北京地平线机器人技术研发有限公司 The method and apparatus for detecting the position of the text in natural scene image
CN108647681A (en) * 2018-05-08 2018-10-12 重庆邮电大学 A kind of English text detection method with text orientation correction
CN108647681B (en) * 2018-05-08 2019-06-14 重庆邮电大学 A kind of English text detection method with text orientation correction
CN108764228A (en) * 2018-05-28 2018-11-06 嘉兴善索智能科技有限公司 Word object detection method in a kind of image
CN110619325A (en) * 2018-06-20 2019-12-27 北京搜狗科技发展有限公司 Text recognition method and device
CN110619325B (en) * 2018-06-20 2024-03-08 北京搜狗科技发展有限公司 Text recognition method and device
CN109190458A (en) * 2018-07-20 2019-01-11 华南理工大学 A kind of person of low position's head inspecting method based on deep learning
CN109165697A (en) * 2018-10-12 2019-01-08 福州大学 A kind of natural scene character detecting method based on attention mechanism convolutional neural networks
CN109165697B (en) * 2018-10-12 2021-11-30 福州大学 Natural scene character detection method based on attention mechanism convolutional neural network
CN109492630A (en) * 2018-10-26 2019-03-19 信雅达系统工程股份有限公司 A method of the word area detection positioning in the financial industry image based on deep learning
CN109376658A (en) * 2018-10-26 2019-02-22 信雅达系统工程股份有限公司 A kind of OCR method based on deep learning
CN109299274B (en) * 2018-11-07 2021-12-17 南京大学 Natural scene text detection method based on full convolution neural network
CN109299274A (en) * 2018-11-07 2019-02-01 南京大学 A kind of natural scene Method for text detection based on full convolutional neural networks
CN109598290A (en) * 2018-11-22 2019-04-09 上海交通大学 A kind of image small target detecting method combined based on hierarchical detection
CN109800756A (en) * 2018-12-14 2019-05-24 华南理工大学 A kind of text detection recognition methods for the intensive text of Chinese historical document
CN109800756B (en) * 2018-12-14 2021-02-12 华南理工大学 Character detection and identification method for dense text of Chinese historical literature
CN113454638A (en) * 2018-12-19 2021-09-28 艾奎菲股份有限公司 System and method for joint learning of complex visual inspection tasks using computer vision
CN109918987A (en) * 2018-12-29 2019-06-21 中国电子科技集团公司信息科学研究院 A kind of video caption keyword recognition method and device
CN109918987B (en) * 2018-12-29 2021-05-14 中国电子科技集团公司信息科学研究院 Video subtitle keyword identification method and device
CN110135408A (en) * 2019-03-26 2019-08-16 北京捷通华声科技股份有限公司 Text image detection method, network and equipment
CN110135408B (en) * 2019-03-26 2021-02-19 北京捷通华声科技股份有限公司 Text image detection method, network and equipment
CN110135248A (en) * 2019-04-03 2019-08-16 华南理工大学 Natural scene text detection method based on deep learning
CN110135424B (en) * 2019-05-23 2021-06-11 阳光保险集团股份有限公司 Inclined text detection model training method and ticket image text detection method
CN110135424A (en) * 2019-05-23 2019-08-16 阳光保险集团股份有限公司 Inclined text detection model training method and ticket image text detection method
CN112418207A (en) * 2020-11-23 2021-02-26 南京审计大学 Weak supervision character detection method based on self-attention distillation
CN112418207B (en) * 2020-11-23 2024-03-19 南京审计大学 Weak supervision character detection method based on self-attention distillation
CN112765353A (en) * 2021-01-22 2021-05-07 重庆邮电大学 Scientific research text-based biomedical subject classification method and device

Also Published As

Publication number Publication date
CN106650725B (en) 2020-06-26

Similar Documents

Publication Publication Date Title
CN106650725A (en) Full convolutional neural network-based candidate text box generation and text detection method
CN107066445B (en) Deep learning method for attribute sentiment word vectors
CN104217214B (en) RGB-D human activity recognition method based on configurable convolutional neural networks
US11687728B2 (en) Text sentiment analysis method based on multi-level graph pooling
CN103631859B (en) Intelligent review expert recommending method for science and technology projects
CN107943967A (en) Document classification algorithm based on multi-angle convolutional neural networks and recurrent neural networks
CN109461157A (en) Image semantic segmentation method based on multi-stage feature fusion and Gaussian conditional random field
CN106354710A (en) Neural network relation extracting method
CN110083700A (en) Enterprise public opinion sentiment classification method and system based on convolutional neural networks
CN109961132A (en) System and method for learning the structure of depth convolutional neural networks
CN106845499A (en) Image object detection method based on natural language semantics
CN109543502A (en) Semantic segmentation method based on deep multi-scale neural network
CN110516539A (en) Remote sensing image building extraction method, system, storage medium and equipment based on adversarial network
CN109492666A (en) Image recognition model training method, device and storage medium
CN108197294A (en) Automatic text generation method based on deep learning
CN108038205A (en) Opinion analysis prototype system for Chinese microblogs
CN110222634A (en) Human posture recognition method based on convolutional neural networks
CN107657056A (en) Artificial intelligence-based method and apparatus for displaying comment information
CN109063719A (en) Image classification method combining structural similarity and category information
CN107451230A (en) Question answering method and question answering system
CN113254652B (en) Social media posting authenticity detection method based on hypergraph attention network
CN112925908A (en) Attention-based text classification method and system for graph attention network
CN109558904A (en) Classification method, device and storage medium for image local features
CN106203510A (en) Hyperspectral image classification method based on morphological features and dictionary learning
CN113255895A (en) Graph neural network representation learning-based structure graph alignment method and multi-graph joint data mining method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant