CN110533041A - Regression-based multi-scale scene text detection method - Google Patents

Regression-based multi-scale scene text detection method

Info

Publication number
CN110533041A
Authority
CN
China
Prior art keywords
convolution
module
padding
stride
convolution kernel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910838235.0A
Other languages
Chinese (zh)
Other versions
CN110533041B (en)
Inventor
景小荣
朱莉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jinming Information Technology Co.,Ltd.
Original Assignee
Chongqing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Posts and Telecommunications
Priority to CN201910838235.0A (granted as CN110533041B)
Publication of CN110533041A
Application granted
Publication of CN110533041B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G06F18/253: Pattern recognition; fusion techniques of extracted features
    • G06N3/045: Neural network architectures; combinations of networks
    • G06N3/049: Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06N3/08: Neural networks; learning methods
    • G06V20/62: Scene-specific elements; text, e.g. of license plates, overlay texts or captions on TV images


Abstract

The present invention relates to a regression-based multi-scale scene text detection method and belongs to the field of digital image processing. The method specifically includes: S1: preparing sufficient training data with text-position annotations; S2: constructing a feature extraction network, comprising a bottom-up feed-forward pass and a top-down feature fusion pass, for extracting low-, mid- and high-level features of each training sample; S3: applying a cascade module to each feature layer fed into the detection layers; S4: using a regression-based detection framework, setting suitable default boxes according to the characteristics of text, and detecting the text positions in the image. The cascade module used by the present invention enlarges the receptive field of the network, so that the default boxes set for the characteristics of text fit well and the text positions in the image are finally detected accurately.

Description

Regression-based multi-scale scene text detection method
Technical field
The invention belongs to the field of digital image processing and relates to a regression-based multi-oriented scene text detection method.
Background art
With the popularity of smart devices, people can capture image information anytime and anywhere. Text in images, as a kind of high-level semantic information, provides important clues for understanding and analyzing image content. Text is a direct reflection of image content; compared with other elements it is easier to extract and understand, and many textual descriptions can be used directly, so it can easily be applied to all kinds of keyword-based image and video content retrieval and analysis. Text detection has therefore become a hot research topic in the field of computer vision.
There are many text detection methods. Traditional scene text detection methods rely on hand-crafted features, and different images often require different feature extraction schemes, so the workload is huge. At the same time, feature design places very high demands on the designer, requiring rich professional knowledge. All of this created a development bottleneck for hand-crafted features. The emergence of deep learning solved this problem.
As deep learning achieved outstanding detection results in the field of object detection, text detection methods improved from general object detection algorithms came into being. General object detection methods can be divided into two broad classes: methods based on candidate regions and methods based on regression. Unlike general object detection, the aspect ratio of text varies drastically, so making the network robust to text scale variation is an issue that must be considered. An example of a text detection algorithm improved from the candidate-region approach is the Connectionist Text Proposal Network (Detecting Text in Natural Image with Connectionist Text Proposal Network, CTPN). That algorithm notes that text sequences vary sharply in length and that the horizontal position is harder to predict than the vertical position; in order to generate text proposals more accurately, it fixes the default box width to 16 and predicts only the vertical position. Although this method realizes end-to-end training of a convolutional neural network with a recurrent neural network for the first time, extracts both the spatial features and the sequence features of text, and achieves high accuracy on multi-scale and multilingual text, it detects only horizontal text and is relatively slow. An example of a text detection algorithm improved from the regression approach is TextBoxes (A Fast Text Detector with a Single Deep Neural Network, TextBoxes), which predicts at different layers: low layers predict small targets and high layers predict large targets, and default boxes suited to text scales are designed. Although this method achieves decent speed and accuracy, the feature extraction of the middle and low layers is insufficient, so its detection of small targets is unsatisfactory.
Therefore, a text detection method with higher robustness to text scale variation is needed.
Summary of the invention
In view of this, the purpose of the present invention is to provide a regression-based multi-oriented scene text detection method, which solves the problem that existing regression-based text detection networks are not robust enough to text scale variation, sets suitable default boxes for the characteristics of text, and finally detects the text positions in an image.
In order to achieve the above objectives, the invention provides the following technical scheme:
A regression-based multi-scale scene text detection method, specifically comprising the following steps:
S1: preparing sufficient training data with text-position annotations;
S2: constructing a feature extraction network, comprising a bottom-up feed-forward pass and a top-down feature fusion pass, for extracting low-, mid- and high-level features of each training sample;
S3: applying a cascade (inception) module to each feature layer fed into the detection layers;
S4: using a regression-based detection framework, setting suitable default boxes according to the characteristics of text, and detecting the text positions in the image.
Further, in step S2, the bottom-up feed-forward network comprises: an input module, first to fifth convolution modules, first to fifth pooling modules, a recurrent neural network module, sixth to tenth convolution modules and a sixth pooling module; the input module serves as the input of the first convolution module, the first to fifth convolution modules respectively contain the first to fifth pooling modules, and the recurrent neural network module, the sixth to tenth convolution modules and the sixth pooling module are cascaded in sequence.
Further, in step S2, the top-down feature fusion means fusing high-level features with low-level features, specifically: the high-level feature map is first brought to the same size as the low-level feature map by deconvolution and then passed through a batch normalization (Batch Normalization, BatchNorm) module; the low-level feature map first passes through a convolution module whose kernel size is 1*1 with stride 1 and padding 0, followed by a BatchNorm module; finally, an element-wise dot-product operation (Eltwise) fuses the two feature layers, and the fused output serves as the output of the entire feature extraction network.
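A minimal PyTorch sketch of this fusion step, assuming the high-level map is exactly half the spatial size of the low-level map and using illustrative channel counts; the patent itself fixes only the 1*1 kernel, the BatchNorm modules and the element-wise product:

```python
import torch
import torch.nn as nn

class TopDownFusion(nn.Module):
    """Fuse a high-level feature map with a low-level one (sketch)."""
    def __init__(self, c_high, c_low, c_out):
        super().__init__()
        # High-level branch: deconvolution up to the low-level size, then BatchNorm.
        # kernel_size=2, stride=2 (an assumption) doubles the spatial resolution.
        self.deconv = nn.ConvTranspose2d(c_high, c_out, kernel_size=2, stride=2)
        self.bn_high = nn.BatchNorm2d(c_out)
        # Low-level branch: 1x1 convolution (stride 1, padding 0), then BatchNorm.
        self.lateral = nn.Conv2d(c_low, c_out, kernel_size=1, stride=1, padding=0)
        self.bn_low = nn.BatchNorm2d(c_out)

    def forward(self, high, low):
        up = self.bn_high(self.deconv(high))
        lat = self.bn_low(self.lateral(low))
        return up * lat  # element-wise (Eltwise) product fusion
```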
Further, the kernel size of the first to sixth convolution modules is 3*3 with stride 1 and padding 1; the kernel of the fifth pooling module is 3*3 with stride 1 and padding 1; the kernels of the remaining pooling modules are 2*2 with stride 2 and padding 0; the recurrent neural network module is a bi-directional long short-term memory recurrent neural network (Bi-directional Long Short-Term Memory Recurrent Neural Network, BLSTM-RNN) with 256 hidden units; the seventh convolution kernel is 1*1 with stride 1 and padding 0; each of the eighth to tenth convolution modules contains two convolution kernels, one of size 1*1 with stride 1 and padding 0 and the other of size 3*3 with stride 2 and padding 1.
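Combining the module list in step S2 with the kernel sizes just given, the following is a condensed PyTorch sketch of the bottom-up feed-forward path. The class names, the channel widths (64 to 512) and the row-wise reshaping used to feed the BLSTM are illustrative assumptions; the patent fixes only the kernel sizes, strides, padding and the 256 hidden units:

```python
import torch
import torch.nn as nn

class ConvPool(nn.Module):
    """One of the first five convolution modules: a 3x3 conv (stride 1, padding 1)
    followed by its pooling module."""
    def __init__(self, c_in, c_out, pool_k=2, pool_s=2, pool_p=0):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, 3, stride=1, padding=1)
        self.pool = nn.MaxPool2d(pool_k, stride=pool_s, padding=pool_p)

    def forward(self, x):
        return self.pool(torch.relu(self.conv(x)))

class BottomUpBackbone(nn.Module):
    def __init__(self):
        super().__init__()
        # Convolution modules 1-5 with pooling modules 1-5; pool 5 keeps the spatial size.
        self.stage1 = ConvPool(3, 64)
        self.stage2 = ConvPool(64, 128)
        self.stage3 = ConvPool(128, 256)
        self.stage4 = ConvPool(256, 512)
        self.stage5 = ConvPool(512, 512, pool_k=3, pool_s=1, pool_p=1)
        # Bidirectional LSTM with 256 hidden units, applied row by row (the reshaping is an assumption).
        self.blstm = nn.LSTM(512, 256, bidirectional=True, batch_first=True)
        # Convolution modules 6 (3x3, stride 1, padding 1) and 7 (1x1, stride 1, padding 0).
        self.conv6 = nn.Conv2d(512, 512, 3, stride=1, padding=1)
        self.conv7 = nn.Conv2d(512, 512, 1, stride=1, padding=0)
        # Convolution modules 8-10: a 1x1 conv (stride 1) followed by a 3x3 conv (stride 2, padding 1).
        def down(c_in, c_out):
            return nn.Sequential(
                nn.Conv2d(c_in, c_out // 2, 1, stride=1, padding=0), nn.ReLU(inplace=True),
                nn.Conv2d(c_out // 2, c_out, 3, stride=2, padding=1), nn.ReLU(inplace=True))
        self.conv8, self.conv9, self.conv10 = down(512, 512), down(512, 256), down(256, 256)
        self.pool6 = nn.MaxPool2d(2, stride=2)

    def forward(self, x):
        feats = []
        for stage in (self.stage1, self.stage2, self.stage3, self.stage4, self.stage5):
            x = stage(x)
            feats.append(x)
        # Run the BLSTM over each row of the conv-5 feature map to extract sequence features.
        n, c, h, w = x.shape
        seq = x.permute(0, 2, 3, 1).reshape(n * h, w, c)
        seq, _ = self.blstm(seq)
        x = seq.reshape(n, h, w, 512).permute(0, 3, 1, 2)
        x = torch.relu(self.conv7(torch.relu(self.conv6(x))))
        for block in (self.conv8, self.conv9, self.conv10):
            x = block(x)
            feats.append(x)
        feats.append(self.pool6(x))
        return feats  # multi-scale maps passed on to fusion and to the detection layers
```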
Further, in step S3, the cascade (inception) module comprises an input end and a feature-map cascade end, connected by four parallel convolution branches, each branch containing 1, 2 or 3 convolution modules.
Further, the cascade module comprises four parallel convolution branches:
the first convolution branch contains one convolution kernel of size 3*3 with stride 1 and padding 1;
the second convolution branch contains three convolution kernels: one of size 1*1 with stride 1 and padding 0, one of size 1*5 with stride 1 and padding 1, and one of size 5*1 with stride 1 and padding 1;
the third convolution branch contains three convolution kernels: one of size 1*1 with stride 1 and padding 0, one of size 5*1 with stride 1 and padding 1, and one of size 1*5 with stride 1 and padding 1;
the fourth convolution branch contains a pooling layer and a convolution kernel, the pooling kernel being of size 3*3 with stride 1 and padding 1 and the convolution kernel being of size 1*1 with stride 1 and padding 0;
a BatchNorm module and a ReLU module follow each of the above convolution kernels.
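A minimal PyTorch sketch of this cascade (inception) module. The branch channel width is an illustrative assumption, and the padding of the 1*5 and 5*1 kernels is chosen here so that all four branches keep the same spatial size and can be concatenated at the feature-map cascade end:

```python
import torch
import torch.nn as nn

def conv_bn_relu(c_in, c_out, k, s=1, p=0):
    """A convolution kernel followed by BatchNorm and ReLU, as described above."""
    return nn.Sequential(nn.Conv2d(c_in, c_out, k, stride=s, padding=p),
                         nn.BatchNorm2d(c_out), nn.ReLU(inplace=True))

class CascadeModule(nn.Module):
    def __init__(self, c_in, c_branch=64):
        super().__init__()
        # Branch 1: one 3x3 convolution kernel.
        self.b1 = conv_bn_relu(c_in, c_branch, 3, p=1)
        # Branch 2: 1x1, then 1x5, then 5x1 convolution kernels.
        self.b2 = nn.Sequential(conv_bn_relu(c_in, c_branch, 1),
                                conv_bn_relu(c_branch, c_branch, (1, 5), p=(0, 2)),
                                conv_bn_relu(c_branch, c_branch, (5, 1), p=(2, 0)))
        # Branch 3: 1x1, then 5x1, then 1x5 convolution kernels.
        self.b3 = nn.Sequential(conv_bn_relu(c_in, c_branch, 1),
                                conv_bn_relu(c_branch, c_branch, (5, 1), p=(2, 0)),
                                conv_bn_relu(c_branch, c_branch, (1, 5), p=(0, 2)))
        # Branch 4: a 3x3 pooling layer (stride 1, padding 1), then a 1x1 convolution kernel.
        self.b4 = nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1),
                                conv_bn_relu(c_in, c_branch, 1))

    def forward(self, x):
        # Concatenate the four parallel branches at the feature-map cascade end.
        return torch.cat([self.b1(x), self.b2(x), self.b3(x), self.b4(x)], dim=1)
```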
The beneficial effects of the present invention are: the text detection method of the present invention is more robust to text scale variation. The present invention uses a convolutional-recurrent neural network to extract the spatial features and the sequence features of text simultaneously. A feature-pyramid structure with multi-layer prediction outputs is used: low-level feature maps predict small targets and high-level feature maps predict large targets. Through feature fusion, high-level semantic information is used for classification and low-level structural information assists regression, which to some extent alleviates the problems that low-level features are insufficiently abstract and that the prediction accuracy on small targets is low. Finally, applying an inception module to each feature layer fed into the detection layers further enlarges the receptive field of the network; a regression-based detection framework is then used, suitable default boxes are set for the characteristics of text, and the text positions in the image are finally detected.
Other advantages, objects and features of the present invention will be set forth to some extent in the following description and, to some extent, will become apparent to those skilled in the art upon examination of the following, or may be learned from practice of the present invention. The objects and other advantages of the present invention may be realized and obtained through the following specification.
Brief description of the drawings
In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described in detail below in its preferred embodiments with reference to the accompanying drawings, in which:
Fig. 1 is a schematic diagram of the network structure of the invention;
Fig. 2 is a schematic diagram of feature fusion;
Fig. 3 is a schematic diagram of the structure of the cascade inception module.
Specific embodiment
The embodiments of the present invention are described below by way of specific examples, and those skilled in the art can easily understand other advantages and effects of the present invention from the contents disclosed in this specification. The present invention may also be implemented or applied through other, different specific embodiments, and various modifications or changes may be made to the details in this specification, based on different viewpoints and applications, without departing from the spirit of the present invention.
Referring to Fig. 1 to Fig. 3, which illustrate a preferred embodiment of the regression-based multi-scale scene text detection method of the present invention, the method includes the following steps:
Step 1: prepare data;
Collect several public data sets: SynthText, ICDAR2011, ICDAR2013 and SVT. SynthText, which contains 8*10^5 synthetic pictures, is used for network pre-training, while a total of 749 training pictures from ICDAR2011, ICDAR2013 and SVT are used to fine-tune the network. A total of 585 test pictures from the three data sets ICDAR2011, ICDAR2013 and SVT are used for testing.
Step 2: network pre-training, specifically including the following steps:
1) construct the network structure shown in Fig. 1;
2) pre-train the network on the SynthText synthetic data set: images normalized to 300*300 are input into the network model, the network outputs the localization results of text and the text classification scores, and the loss function shown in formula (1) is used.
The loss function consists of two parts: the binary classification loss of text lines and the default-box position regression loss of text lines; N denotes the number of matched default boxes, α=1, x is the matching matrix between default boxes and ground-truth boxes, c represents the confidence of whether each default box contains text, l represents the position predicted by the network for each default box, and g represents the position of the ground-truth box. The binary classification loss L_conf of text lines uses the cross-entropy loss, and the default-box position regression loss L_loc of text lines uses the smooth L1 loss;
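Formula (1) itself is not reproduced in this text; from the symbol definitions above it is presumably the standard SSD-style objective (a reconstruction, not a quotation from the patent):

$$ L(x, c, l, g) = \frac{1}{N}\Big( L_{conf}(x, c) + \alpha\, L_{loc}(x, l, g) \Big) $$

with L_conf the cross-entropy classification term and L_loc the smooth L1 regression term.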
3) optimize the loss obtained in 2) using the Adam optimizer (A Method for Stochastic Optimization, Adam): the loss function is minimized by the Adam optimizer, continuously updating the parameters in the network. The network is trained for 4*10^6 iterations in total; the learning rate is initialized to 10^-3 and multiplied by 0.1 every 4*10^5 iterations, and parameters are randomly dropped at a rate of 0.3.
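A minimal sketch of this pre-training schedule (PyTorch is an assumption; the patent does not name a framework, and model, criterion and batches are hypothetical placeholders for the network, the formula-(1) loss and the SynthText loader):

```python
import torch

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)                       # initial learning rate 10^-3
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=400_000, gamma=0.1)

for step, (images, targets) in zip(range(4_000_000), batches):                   # 4*10^6 iterations in total
    loss = criterion(model(images), targets)                                     # loss of formula (1)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()                                                             # lr * 0.1 every 4*10^5 iterations
```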
Step 3: network fine-tuning, specifically including the following steps:
1) fine-tune the network model obtained in Step 2 with the 749 real pictures from ICDAR2011, ICDAR2013 and SVT prepared in Step 1; data augmentation is applied to the 749 real pictures, including random flipping, adding noise, blurring and other operations;
2) default boxes of 6 different aspect ratios are set at the different output layers, namely 1, 2, 3, 5, 7 and 10 (a sketch of such default-box generation follows this list);
3) the detection layers use the cascade (inception) module, cascading convolution kernels of different sizes to increase the width of the network and enlarge its receptive field, which addresses the detection of text with extreme aspect ratios;
4) the learning rate is set to 10^-5, with 20000 iterations in total; stochastic gradient descent is used for optimization in this process, yielding the final deep neural network model;
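A minimal sketch of generating default boxes with the six aspect ratios at every cell of an output feature map; the per-layer base scale and the w = s*sqrt(r), h = s/sqrt(r) convention are assumptions borrowed from SSD-style detectors, not quoted from the patent:

```python
def default_boxes(fmap_h, fmap_w, scale, ratios=(1, 2, 3, 5, 7, 10)):
    """Return (cx, cy, w, h) default boxes, normalised to [0, 1]."""
    boxes = []
    for i in range(fmap_h):
        for j in range(fmap_w):
            cx, cy = (j + 0.5) / fmap_w, (i + 0.5) / fmap_h
            for r in ratios:
                w = scale * r ** 0.5   # wide, flat boxes suit long text lines
                h = scale / r ** 0.5
                boxes.append((cx, cy, w, h))
    return boxes

# Example: the boxes for one 38x38 output layer with an assumed base scale of 0.1.
boxes = default_boxes(38, 38, 0.1)
```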
Step 4: test the trained network on the test set: in this step, normalized test images are input into the network model, and the network outputs the localization results of text and the text classification scores.
Finally, it is noted that the above embodiments are only intended to illustrate the technical solution of the present invention and not to limit it. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art should understand that the technical solution of the present invention may be modified or equivalently replaced without departing from the purpose and scope of the technical solution, and all such modifications shall be covered by the scope of the claims of the present invention.

Claims (6)

1. A regression-based multi-scale scene text detection method, characterized in that the method specifically comprises the following steps:
S1: preparing sufficient training data with text-position annotations;
S2: constructing a feature extraction network, comprising a bottom-up feed-forward pass and a top-down feature fusion pass, for extracting low-, mid- and high-level features of each training sample;
S3: applying a cascade module to each feature layer fed into the detection layers;
S4: using a regression-based detection framework, setting suitable default boxes according to the characteristics of text, and detecting the text positions in the image.
2. The regression-based multi-scale scene text detection method according to claim 1, characterized in that in step S2, the bottom-up feed-forward network comprises: an input module, first to fifth convolution modules, first to fifth pooling modules, a recurrent neural network module, sixth to tenth convolution modules and a sixth pooling module; the input module serves as the input of the first convolution module, the first to fifth convolution modules respectively contain the first to fifth pooling modules, and the recurrent neural network module, the sixth to tenth convolution modules and the sixth pooling module are cascaded in sequence.
3. The regression-based multi-scale scene text detection method according to claim 1, characterized in that in step S2, the top-down feature fusion means fusing high-level features with low-level features, specifically: the high-level feature map is first brought to the same size as the low-level feature map by deconvolution and then passed through a batch normalization (Batch Normalization, BatchNorm) module; the low-level feature map first passes through a convolution module whose kernel size is 1*1 with stride 1 and padding 0, followed by a BatchNorm module; finally, an element-wise dot-product operation fuses the two feature layers, and the fused output serves as the output of the entire feature extraction network.
4. The regression-based multi-scale scene text detection method according to claim 2, characterized in that the kernel size of the first to sixth convolution modules is 3*3 with stride 1 and padding 1; the kernel of the fifth pooling module is 3*3 with stride 1 and padding 1; the kernels of the remaining pooling modules are 2*2 with stride 2 and padding 0; the recurrent neural network module is a bi-directional long short-term memory recurrent neural network (Bi-directional Long Short-Term Memory Recurrent Neural Network, BLSTM-RNN) with 256 hidden units; the seventh convolution kernel is 1*1 with stride 1 and padding 0; and each of the eighth to tenth convolution modules contains two convolution kernels, one of size 1*1 with stride 1 and padding 0 and the other of size 3*3 with stride 2 and padding 1.
5. The regression-based multi-scale scene text detection method according to claim 1, characterized in that in step S3, the cascade module comprises an input end and a feature-map cascade end connected by four parallel convolution branches, each branch containing 1, 2 or 3 convolution modules.
6. The regression-based multi-scale scene text detection method according to claim 5, characterized in that the cascade module comprises four parallel convolution branches:
the first convolution branch contains one convolution kernel of size 3*3 with stride 1 and padding 1;
the second convolution branch contains three convolution kernels: one of size 1*1 with stride 1 and padding 0, one of size 1*5 with stride 1 and padding 1, and one of size 5*1 with stride 1 and padding 1;
the third convolution branch contains three convolution kernels: one of size 1*1 with stride 1 and padding 0, one of size 5*1 with stride 1 and padding 1, and one of size 1*5 with stride 1 and padding 1;
the fourth convolution branch contains a pooling layer and a convolution kernel, the pooling kernel being of size 3*3 with stride 1 and padding 1 and the convolution kernel being of size 1*1 with stride 1 and padding 0;
a BatchNorm module and a rectified linear unit module (Rectified Linear Unit, ReLU) follow each of the above convolution kernels.
CN201910838235.0A 2019-09-05 2019-09-05 Regression-based multi-scale scene text detection method Active CN110533041B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910838235.0A CN110533041B (en) 2019-09-05 2019-09-05 Regression-based multi-scale scene text detection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910838235.0A CN110533041B (en) 2019-09-05 2019-09-05 Regression-based multi-scale scene text detection method

Publications (2)

Publication Number Publication Date
CN110533041A true CN110533041A (en) 2019-12-03
CN110533041B CN110533041B (en) 2022-07-01

Family

ID=68667081

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910838235.0A Active CN110533041B (en) 2019-09-05 2019-09-05 Regression-based multi-scale scene text detection method

Country Status (1)

Country Link
CN (1) CN110533041B (en)


Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105631426A (en) * 2015-12-29 2016-06-01 中国科学院深圳先进技术研究院 Image text detection method and device
CN107688808A (en) * 2017-08-07 2018-02-13 电子科技大学 A kind of quickly natural scene Method for text detection
CN107578060A (en) * 2017-08-14 2018-01-12 电子科技大学 A kind of deep neural network based on discriminant region is used for the method for vegetable image classification
US20190180154A1 (en) * 2017-12-13 2019-06-13 Abbyy Development Llc Text recognition using artificial intelligence
EP3534298A1 (en) * 2018-02-26 2019-09-04 Capital One Services, LLC Dual stage neural network pipeline systems and methods
CN108549893A (en) * 2018-04-04 2018-09-18 华中科技大学 A kind of end-to-end recognition methods of the scene text of arbitrary shape
CN108734169A (en) * 2018-05-21 2018-11-02 南京邮电大学 One kind being based on the improved scene text extracting method of full convolutional network
CN109086663A (en) * 2018-06-27 2018-12-25 大连理工大学 The natural scene Method for text detection of dimension self-adaption based on convolutional neural networks
CN109271967A (en) * 2018-10-16 2019-01-25 腾讯科技(深圳)有限公司 The recognition methods of text and device, electronic equipment, storage medium in image
CN109299274A (en) * 2018-11-07 2019-02-01 南京大学 A kind of natural scene Method for text detection based on full convolutional neural networks

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
WENHAO HE et al.: "Deep Direct Regression for Multi-oriented Scene Text Detection", 2017 IEEE International Conference on Computer Vision *
方清: "Natural scene text detection and recognition based on deep learning", China Masters' Theses Full-text Database, Information Science and Technology series *
杨小栋: "Multi-oriented scene text detection based on deep features", China Masters' Theses Full-text Database, Information Science and Technology series *
雷绮仑: "Research on multi-oriented natural scene text extraction methods", China Masters' Theses Full-text Database, Information Science and Technology series *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200005141A1 (en) * 2018-06-29 2020-01-02 Utechzone Co., Ltd. Automated optical inspection and classification apparatus based on a deep learning system and training apparatus thereof
US11455528B2 (en) * 2018-06-29 2022-09-27 Utechzone Co., Ltd. Automated optical inspection and classification apparatus based on a deep learning system and training apparatus thereof
CN113159079A (en) * 2020-01-07 2021-07-23 顺丰科技有限公司 Target detection method, target detection device, computer equipment and storage medium
CN111259764A (en) * 2020-01-10 2020-06-09 中国科学技术大学 Text detection method and device, electronic equipment and storage device
CN111881943A (en) * 2020-07-08 2020-11-03 泰康保险集团股份有限公司 Method, device, equipment and computer readable medium for image classification
CN112287962A (en) * 2020-08-10 2021-01-29 南京行者易智能交通科技有限公司 Training method, detection method and device of multi-scale target detection model, and terminal equipment
CN112287962B (en) * 2020-08-10 2023-06-09 南京行者易智能交通科技有限公司 Training method, detection method and device for multi-scale target detection model, and terminal equipment
CN113408525A (en) * 2021-06-17 2021-09-17 成都崇瑚信息技术有限公司 Multilayer ternary pivot and bidirectional long-short term memory fused text recognition method
CN115393868A (en) * 2022-08-18 2022-11-25 中化现代农业有限公司 Text detection method and device, electronic equipment and storage medium
CN116704248A (en) * 2023-06-07 2023-09-05 南京大学 Serum sample image classification method based on multi-semantic unbalanced learning

Also Published As

Publication number Publication date
CN110533041B (en) 2022-07-01

Similar Documents

Publication Publication Date Title
CN110533041A (en) Multiple dimensioned scene text detection method based on recurrence
CN110334705B (en) Language identification method of scene text image combining global and local information
Yuan et al. Gated CNN: Integrating multi-scale feature layers for object detection
CN111639544B (en) Expression recognition method based on multi-branch cross-connection convolutional neural network
CN110083700A (en) A kind of enterprise's public sentiment sensibility classification method and system based on convolutional neural networks
CN108537269B (en) Weak interactive object detection deep learning method and system thereof
CN110287960A (en) The detection recognition method of curve text in natural scene image
CN109858488A (en) A kind of handwriting samples recognition methods and system based on sample enhancing
CN108830334A (en) A kind of fine granularity target-recognition method based on confrontation type transfer learning
CN110866542B (en) Depth representation learning method based on feature controllable fusion
CN109886141A (en) A kind of pedestrian based on uncertainty optimization discrimination method again
CN106919920A (en) Scene recognition method based on convolution feature and spatial vision bag of words
CN110414344A (en) A kind of human classification method, intelligent terminal and storage medium based on video
CN108427740B (en) Image emotion classification and retrieval algorithm based on depth metric learning
CN111598183A (en) Multi-feature fusion image description method
CN106919710A (en) A kind of dialect sorting technique based on convolutional neural networks
CN112507904B (en) Real-time classroom human body posture detection method based on multi-scale features
CN109344898A (en) Convolutional neural networks image classification method based on sparse coding pre-training
CN110070106A (en) Smog detection method, device and electronic equipment
CN106874929A (en) A kind of pearl sorting technique based on deep learning
CN111723667A (en) Human body joint point coordinate-based intelligent lamp pole crowd behavior identification method and device
CN109726671A (en) The action identification method and system of expression study from the overall situation to category feature
Agrawal et al. Image caption generator using attention mechanism
CN116150747A (en) Intrusion detection method and device based on CNN and SLTM
CN114398485B (en) Expert portrait construction method and device based on multi-view fusion

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240623

Address after: 518000 1104, Building A, Zhiyun Industrial Park, No. 13, Huaxing Road, Henglang Community, Longhua District, Shenzhen, Guangdong Province

Patentee after: Shenzhen Hongyue Enterprise Management Consulting Co.,Ltd.

Country or region after: China

Address before: 400065 Chongqing Nan'an District huangjuezhen pass Chongwen Road No. 2

Patentee before: CHONGQING University OF POSTS AND TELECOMMUNICATIONS

Country or region before: China

TR01 Transfer of patent right

Effective date of registration: 20240625

Address after: 200030, Room 901-1606, Building 4, No. 2377 Shenkun Road, Minhang District, Shanghai

Patentee after: Shanghai Jinming Information Technology Co.,Ltd.

Country or region after: China

Address before: 518000 1104, Building A, Zhiyun Industrial Park, No. 13, Huaxing Road, Henglang Community, Longhua District, Shenzhen, Guangdong Province

Patentee before: Shenzhen Hongyue Enterprise Management Consulting Co.,Ltd.

Country or region before: China