CN110533088A - Scene text language identification method based on discriminative convolutional neural networks - Google Patents

Scene text language identification method based on discriminative convolutional neural networks

Info

Publication number
CN110533088A
CN110533088A (application number CN201910759386.7A)
Authority
CN
China
Prior art keywords
neural networks
convolutional neural
differentiated
feature
cnn
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910759386.7A
Other languages
Chinese (zh)
Inventor
王春枝
袁野
叶志伟
严灵毓
李敏
夏慧玲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hubei University of Technology
Original Assignee
Hubei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hubei University of Technology filed Critical Hubei University of Technology
Priority to CN201910759386.7A priority Critical patent/CN110533088A/en
Publication of CN110533088A publication Critical patent/CN110533088A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211Selection of the most significant subset of features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/14Image acquisition
    • G06V30/148Segmentation of character regions
    • G06V30/153Segmentation of character regions using recognition of characters or words
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40Document-oriented image-based pattern recognition
    • G06V30/41Analysis of document content
    • G06V30/413Classification of content, e.g. text, photographs or tables
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a scene text language identification method based on discriminative convolutional neural networks. Traditional image classification methods analyze an image holistically and lack explicit capture of details, so they handle this problem poorly. To solve this technical problem, a model called a "discriminative convolutional neural network" and an associated method are proposed: through a discriminative clustering algorithm, a group of "discriminative patterns", i.e. local features with discriminative power, is learned from the deep convolutional features of the image. Finally, the overall representation of the image is classified with two fully connected network layers. The invention significantly improves on traditional image classification methods, whose recognition under conditions such as font, noise, and illumination has shortcomings.

Description

Scene text language identification method based on discriminative convolutional neural networks
Technical field
The invention belongs to the technical field of deep learning for character recognition, and in particular relates to a scene text language identification method based on discriminative convolutional neural networks.
Background technique
Multilingual environments are very common in modern society. In places such as airports, railway stations, and hotels, one often encounters cases where different written languages appear simultaneously. The language itself is important information. In addition, different written languages have completely different characteristics, and processing them often requires targeted models and processing methods chosen according to language category. Therefore, in multilingual environments, language identification is of great significance.
Language identification is an important component of traditional optical character recognition (OCR) systems. Because of its importance in multilingual environments, the problem has been extensively studied over the past few decades. In recent years, with the continuous growth of multimedia data, especially pictures captured by mobile devices, the importance of scene text recognition has become more prominent and has triggered a research upsurge in the computer vision field. For this reason, language identification of scene text has also become indispensable.
Previous language identification algorithms were designed mainly for document images or video captions, whose backgrounds and foregrounds are relatively clean and suffer little noise interference. Methods based on binarization, region segmentation, morphological analysis, and the like are therefore often used in such work. However, when applied to scene text, such methods often cannot cope with the complexity and variability of factors such as background, font, noise, and illumination conditions, and are difficult to apply competently.
The task of scene text language identification is to predict the language category (English, Chinese, Greek, etc.) of the text in a given picture. The problem can naturally be regarded as an image classification problem. In recent years, convolutional neural networks (CNNs), with their powerful learning and generalization abilities, have provided good solutions to image classification. Nevertheless, language identification still poses unique challenges because of its own characteristics, and traditional image classification methods analyze the image holistically, lack explicit capture of details, and cannot handle such problems well.
Summary of the invention
To solve the above technical problem, the present invention proposes a scene text language identification method based on discriminative convolutional neural networks, which significantly improves on traditional image classification methods whose recognition under conditions such as font, noise, and illumination has shortcomings.
The technical scheme adopted by the invention is a scene text language identification method based on discriminative convolutional neural networks, characterized by comprising the following steps:
Step 1: build the language identification convolutional neural network model Disc CNN;
Step 1.1: obtain a data set consisting of several pictures, each picture containing one written language, the pictures together covering several written languages; divide the data set into a training set and a test set;
Step 1.2: crop the pictures in the training set to a preset size; from each cropped scene text image I, extract a convolutional feature hierarchy {h^1, ..., h^L} with a convolutional neural network, where h^l denotes the feature map extracted at the l-th feature layer and L is the number of feature layers;
Step 1.3: extract dense local features from the convolutional feature hierarchy using the convolutional neural network;
Step 1.4: apply discriminative clustering to the dense local features extracted by the convolutional neural network, and learn a discriminative codebook;
Step 1.5: encode each dense local feature with the discriminative codebook, then fuse the encoding results of the dense local features to obtain a fixed-dimension vector representation of the full image, denoted the full-image descriptor;
Step 1.6: build the language identification convolutional neural network model Disc CNN;
Disc CNN includes one hidden layer with ReLU activation; the output layer contains C nodes, whose activations are passed through a SoftMax operation to yield probability values over the C language categories. The discriminative mid-level encoding used by Disc CNN, a linear operation followed by a ReLU-type nonlinearity, is learned and then fine-tuned during end-to-end training of the overall model. The complete Disc CNN is an end-to-end trainable model, globally optimized with gradient descent;
Step 1.7: parameter migration and end-to-end optimization;
Through the backpropagation algorithm of the neural network, the error gradients of later stages are fed back to earlier stages, and gradient descent then adjusts the parameters of all stages simultaneously;
Step 2: use the language identification convolutional neural network model Disc CNN to identify the text language in a picture.
The Disc CNN network contains four convolutional layers: conv1, conv2, conv3, conv4; the heights of their output feature maps are 15, 7, 3, and 1 respectively, and the width is determined by the input image width. Discriminative patch discovery and mid-level representation are carried out on the feature maps output by conv2, conv3, and conv4.
The beneficial effects of the present invention are as follows: for the language identification problem, the invention proposes an image representation method that combines deep convolutional features, discriminative mid-level representations, and the spatial pyramid method, models this image representation as an end-to-end trainable neural network model, and further optimizes the model parameters through end-to-end training.
Detailed description of the invention
Fig. 1 is the overall flowchart of the embodiment of the present invention;
Fig. 2 is a schematic diagram of detail differences in scene text recognition in the embodiment of the present invention;
Fig. 3 is a schematic diagram of the Disc CNN neural network structure in the embodiment of the present invention;
Fig. 4 is a schematic diagram of the test text recognition pictures in the embodiment of the present invention.
Specific embodiment
To help those of ordinary skill in the art understand and implement the present invention, it is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the implementation examples described herein are merely intended to illustrate and explain the present invention, not to limit it.
The task of scene text language identification is to predict the language category (English, Chinese, Greek, etc.) of the text in a given picture. The problem can naturally be regarded as an image classification problem. In recent years, convolutional neural networks (CNNs), with their powerful learning and generalization abilities, have provided good solutions to image classification. Nevertheless, language identification still poses unique challenges because of its own characteristics, and traditional image classification methods analyze the image holistically, lack explicit capture of details, and cannot handle such problems well. To solve this technical problem, a model called a "discriminative convolutional neural network" and an associated method are proposed, which significantly improve on traditional image classification methods whose recognition under conditions such as font, noise, and illumination has shortcomings. Through a "discriminative clustering" algorithm, the method learns a group of "discriminative patterns", i.e. local features with discriminative power, from the deep convolutional features of the image. Finally, the overall representation of the image is classified with two fully connected network layers.
Referring to Fig. 1, the scene text language identification method based on discriminative convolutional neural networks provided by the invention comprises the following steps:
Step 1: build the language identification convolutional neural network model Disc CNN;
Step 1.1: obtain a data set consisting of several pictures, each picture containing one written language, the pictures together covering several written languages; divide the data set into a training set and a test set;
See Fig. 2 for a schematic diagram of detail differences in scene text recognition in this embodiment.
Step 1.2: crop the pictures in the training set to a preset size; from each cropped scene text image I, extract a convolutional feature hierarchy {h^1, ..., h^L} with a convolutional neural network, where h^l denotes the feature map extracted at the l-th feature layer and L is the number of feature layers;
In this embodiment, the feature hierarchy is extracted by a pre-trained convolutional neural network (CNN);
The input picture is first scaled to a fixed height and then fed into the CNN; the CNN extracts features by applying a series of convolution and max-pooling operations to the input picture.
Let k^l and b^l denote the convolution kernel and bias respectively; the feature extraction process is then:
h^l = pool_max(σ(h^(l-1) * k^l + b^l));
where pool_max denotes the max-pooling operation, σ is an arbitrary nonlinear activation function, * denotes the convolution operation, h^0 denotes the initial input picture, and the superscript l of h^l indexes the layer-l encoding of the input picture. The stride of all convolution operations is set to 1, the convolution kernel size is 3x3, and the boundary padding size is set to 1x1.
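To make the layer recurrence concrete, here is a minimal single-channel NumPy sketch of h^l = pool_max(σ(h^(l-1) * k^l + b^l)) with a 3x3 kernel, stride 1, and padding 1, using ReLU as the nonlinearity; all function names are illustrative and not from the patent:

```python
import numpy as np

def conv2d_3x3(h, k, b):
    """3x3 convolution, stride 1, zero padding 1 (spatial shape preserved)."""
    H, W = h.shape
    padded = np.pad(h, 1)
    out = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * k) + b
    return out

def max_pool_2x2(h):
    """2x2 max pooling with stride 2 (any odd trailing row/column is dropped)."""
    H2, W2 = h.shape[0] // 2, h.shape[1] // 2
    return h[:H2 * 2, :W2 * 2].reshape(H2, 2, W2, 2).max(axis=(1, 3))

def conv_layer(h_prev, k, b):
    """One step of the hierarchy: h_l = pool_max(sigma(h_{l-1} * k_l + b_l))."""
    return max_pool_2x2(np.maximum(0.0, conv2d_3x3(h_prev, k, b)))
```

Starting from h^0 (the input picture) and applying `conv_layer` repeatedly with each layer's kernel and bias yields the feature hierarchy {h^1, ..., h^L}.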
Step 1.3: extract dense local features from the convolutional feature hierarchy using the convolutional neural network;
Convolving a given image with a convolution kernel extracts one kind of textual feature from the image, and different convolution kernels extract different image features. In general, a convolutional layer is computed according to the formula:
featureMap = σ(imgMat ∘ W + b);
where σ denotes the activation function, imgMat the grayscale image matrix, W the convolution kernel, ∘ the convolution operation, and b the bias.
Step 1.4: apply discriminative clustering to the dense local features extracted by the convolutional neural network, and learn a discriminative codebook (discriminative codebook);
In this embodiment, the specific implementation of step 1.4 includes the following sub-steps:
Step 1.4.1: apply discriminative clustering to the dense local features extracted by the convolutional neural network;
Discriminative clustering is carried out separately within each language category and each feature level. The discovery set is the set of local features extracted at level l from pictures of category c; the natural set is the set of local features extracted from pictures of all other categories;
Step 1.4.2: learn a discriminative codebook;
The discriminative codebook consists of a group of linear classifiers. Each classifier is equivalent to a discriminative-patch detector, and its output indicates the response of one discriminative pattern. Distinguishing languages requires capturing local image regions with discriminative power; such a region is called a discriminative patch and acts as one discriminative pattern.
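The patent does not spell out how each linear detector is fit, so the sketch below is only one plausible reading under stated assumptions: each codebook entry is a linear classifier trained (here with a plain logistic loss, our choice) to separate one cluster of discovery-set features from the natural set; stacking the detectors gives the codebook (W, b). Function names are ours:

```python
import numpy as np

def train_patch_detector(discovery, natural, epochs=200, lr=0.1):
    """Fit one linear discriminative-patch detector (w, b) separating a
    cluster of discovery-set features (category c) from natural-set
    features (all other categories), via logistic-loss gradient descent."""
    X = np.vstack([discovery, natural])
    y = np.concatenate([np.ones(len(discovery)), np.zeros(len(natural))])
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # sigmoid responses
        g = p - y                                 # logistic-loss gradient
        w -= lr * (X.T @ g) / len(y)
        b -= lr * g.mean()
    return w, b

def build_codebook(clusters, natural):
    """Stack one detector per discriminative cluster into the codebook (W, b)."""
    pairs = [train_patch_detector(c, natural) for c in clusters]
    return np.stack([w for w, _ in pairs]), np.array([b for _, b in pairs])
```

Each row of the returned W, with its bias, then acts as one discriminative-pattern detector when applied to a local feature.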
Step 1.5: encode each dense local feature with the discriminative codebook, then fuse the encoding results of the dense local features to obtain a fixed-dimension vector representation of the full image, denoted the full-image descriptor;
In this embodiment, suppose the dense local feature map has shape w1 × h1 × n1, where w1, h1, n1 denote the three dimensions of the feature-map shape. Each position of the feature map corresponds to one local descriptor, so w × h local descriptors, each an m-dimensional vector, can be extracted from the feature map. Let h^l[i, j] denote the descriptor at row i, column j; it is encoded by a codebook of k classifiers, yielding a k-dimensional vector z^l[i, j] = max(0, W^l x^l[i, j] + b^l), where W^l x^l[i, j] + b^l is the response of the layer-l descriptor encoded by the codebook (W^l, b^l); max(0, ·) zeroes the negative responses, so the encoding result is the non-negative response of the codebook;
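The encoding z[i, j] = max(0, W x[i, j] + b), applied at every spatial position of the local feature map, can be sketched in a few lines (the function name is ours):

```python
import numpy as np

def encode_feature_map(h, W, b):
    """Codebook encoding at every spatial position of an H x W x m local
    feature map: z[i, j] = max(0, W @ h[i, j] + b), with a codebook of k
    linear classifiers (W: k x m, b: k). Returns an H x W x k map of
    non-negative responses."""
    responses = np.einsum('ijm,km->ijk', h, W) + b
    return np.maximum(0.0, responses)
```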
The encoding results of the local descriptors are aggregated into the full-image descriptor by the horizontal spatial pyramid pooling (HSPP) operation. The local encoding values z^l[i, j] are divided into several sub-regions by spatial position, and horizontal max pooling is performed within each region separately, selecting per dimension the maximum encoding value in each row as the pooling result, i.e. hspp(z^l) = max z^l[i, j];
The spatial sub-regions are divided along the vertical direction of the feature map, i.e., the feature map is divided into several blocks of equal height whose width matches that of the original feature map. Each sub-map is pooled separately by the horizontal spatial pooling operation hspp(z^l) = max z^l[i, j], and the resulting pooling results are concatenated to obtain the descriptor of the full image.
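A minimal sketch of one plausible reading of HSPP (the strip count and per-strip maximum over the whole strip area are our assumptions; the text only fixes that strips are equal-height, full-width, and that results are concatenated):

```python
import numpy as np

def hspp(z, n_strips=4):
    """Horizontal spatial pyramid pooling over an H x W x k coded map:
    split it into n_strips equal-height, full-width strips, take the
    per-channel maximum response inside each strip, and concatenate.
    The resulting length (n_strips * k) does not depend on the width W,
    so the full-image descriptor has fixed dimension."""
    H = z.shape[0]
    edges = np.linspace(0, H, n_strips + 1).astype(int)
    return np.concatenate(
        [z[a:b].max(axis=(0, 1)) for a, b in zip(edges[:-1], edges[1:])])
```

This width-independence is what lets images of varying width (after scaling to a fixed height) yield descriptors of one fixed dimension.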
Step 1.6: build the language identification convolutional neural network model Disc CNN;
See Fig. 3: the Disc CNN of this embodiment includes one hidden layer with ReLU activation; the output layer contains C nodes, whose activations are passed through a SoftMax operation to yield probability values over the C language categories. The discriminative mid-level encoding used by Disc CNN, a linear operation followed by a ReLU-type nonlinearity, is learned and then fine-tuned during end-to-end training of the overall model. The complete Disc CNN is an end-to-end trainable model, globally optimized with gradient descent;
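The classification head described above (one ReLU hidden layer, then C output nodes through softmax) amounts to the following minimal sketch; the weight names are hypothetical placeholders:

```python
import numpy as np

def softmax(a):
    e = np.exp(a - a.max())   # shift for numerical stability
    return e / e.sum()

def disc_cnn_head(descriptor, W1, b1, W2, b2):
    """Classification head on the full-image descriptor: one ReLU hidden
    layer, then C output nodes whose activations pass through softmax to
    give per-language probability values."""
    hidden = np.maximum(0.0, W1 @ descriptor + b1)
    return softmax(W2 @ hidden + b2)
```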
Step 1.7: parameter migration and end-to-end optimization;
Through the backpropagation algorithm of the neural network, the error gradients of later stages are fed back to earlier stages, and gradient descent then adjusts the parameters of all stages simultaneously;
In this embodiment, the fine-tuning optimization algorithm of Disc CNN is stochastic gradient descent (SGD). The initial learning rate is set to 10^-3, the momentum to 0.9, and the batch size to 128. In fc1, the network uses the dropout mechanism to avoid overfitting during training.
The feature extraction network is pre-trained as a separate CNN model.
Disc CNN is implemented in the C++ and Python programming languages.
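The SGD-with-momentum update used for fine-tuning (initial learning rate 10^-3, momentum 0.9) can be written out as follows; the function name and list-of-arrays layout are ours, and mini-batching (size 128) and the fc1 dropout are omitted:

```python
import numpy as np

def sgd_momentum_step(params, grads, velocities, lr=1e-3, momentum=0.9):
    """One SGD-with-momentum update with the hyperparameters reported for
    fine-tuning Disc CNN. Each entry of params/grads/velocities is a
    NumPy array for one parameter tensor."""
    for p, g, v in zip(params, grads, velocities):
        v *= momentum
        v -= lr * g
        p += v          # parameters and velocities updated in place
    return params, velocities
```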
Step 2: use the language identification convolutional neural network model Disc CNN to identify the text language in a picture.
The Disc CNN network contains four convolutional layers: conv1, conv2, conv3, conv4. The heights of their output feature maps are 15, 7, 3, and 1 respectively, and the width is determined by the input image width. Discriminative patch discovery and mid-level representation are carried out on the feature maps output by conv2, conv3, and conv4.
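The reported heights 15, 7, 3, 1 are consistent with a 2x2/stride-2 max pool after each of the four layers and an input scaled to height 31; note that both the per-layer pooling and the input height are our inference, not stated in the text:

```python
def pooled_heights(input_height, n_layers=4):
    """Feature-map height after each conv layer, assuming each layer ends
    in a 2x2 max pool with stride 2 (the 3x3/pad-1 convolutions described
    earlier preserve height on their own)."""
    heights, h = [], input_height
    for _ in range(n_layers):
        h //= 2
        heights.append(h)
    return heights
```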
Through the application of the so-called discriminative convolutional neural networks to scene text recognition, the invention learns a group of "discriminative patterns" from the deep convolutional features of the image via the discriminative clustering algorithm; finally, the overall representation of the image is classified with two fully connected network layers.
The present invention significantly improves on traditional image classification methods, whose recognition under conditions such as font, noise, and illumination has shortcomings. See Fig. 4, a schematic diagram of the test text recognition pictures of this embodiment. The figure shows samples from the collected data sets, which are mostly pictures of natural scenes found on the web; the font, color, layout, and writing style of the texts vary strongly, and the backgrounds, frequently affected by illumination, camera angle, and the like, are relatively cluttered.
Table 1 below compares the accuracy of the present invention with conventional methods; the accuracy of the invention is clearly higher than that of traditional methods.
Table 1
It should be understood that the parts not elaborated in this specification belong to the prior art. The above description of the preferred embodiment is relatively detailed, but it should not therefore be considered a limitation on the protection scope of the invention patent. Those of ordinary skill in the art may, under the inspiration of the present invention and without departing from the scope protected by the claims, make replacements or modifications, which all fall within the protection scope of the present invention; the claimed scope of the invention is determined by the appended claims.

Claims (5)

1. A scene text language identification method based on discriminative convolutional neural networks, characterized by comprising the following steps:
Step 1: build the language identification convolutional neural network model Disc CNN;
Step 1.1: obtain a data set consisting of several pictures, each picture containing one written language, the pictures together covering several written languages; divide the data set into a training set and a test set;
Step 1.2: crop the pictures in the training set to a preset size; from each cropped scene text image I, extract a convolutional feature hierarchy {h^1, ..., h^L} with a convolutional neural network, where h^l denotes the feature map extracted at the l-th feature layer and L is the number of feature layers;
Step 1.3: extract dense local features from the convolutional feature hierarchy using the convolutional neural network;
Step 1.4: apply discriminative clustering to the dense local features extracted by the convolutional neural network, and learn a discriminative codebook;
Step 1.5: encode each dense local feature with the discriminative codebook, then fuse the encoding results of the dense local features to obtain a fixed-dimension vector representation of the full image, denoted the full-image descriptor;
Step 1.6: build the language identification convolutional neural network model Disc CNN;
Disc CNN includes one hidden layer with ReLU activation; the output layer contains C nodes, whose activations are passed through a SoftMax operation to yield probability values over the C language categories; the discriminative mid-level encoding used by Disc CNN, a linear operation followed by a ReLU-type nonlinearity, is learned and then fine-tuned during end-to-end training of the overall model; the complete Disc CNN is an end-to-end trainable model, globally optimized with gradient descent;
Step 1.7: parameter migration and end-to-end optimization;
Through the backpropagation algorithm of the neural network, the error gradients of later stages are fed back to earlier stages, and gradient descent then adjusts the parameters of all stages simultaneously;
Step 2: use the language identification convolutional neural network model Disc CNN to identify the text language in a picture;
The Disc CNN network contains four convolutional layers: conv1, conv2, conv3, conv4; the heights of their output feature maps are 15, 7, 3, and 1 respectively, and the width is determined by the input image width; discriminative patch discovery and mid-level representation are carried out on the feature maps output by conv2, conv3, and conv4.
2. The scene text language identification method based on discriminative convolutional neural networks according to claim 1, characterized in that: in step 1.2, the feature hierarchy is extracted by a pre-trained convolutional neural network (CNN);
the input picture is first scaled to a fixed height and then fed into the CNN; the CNN extracts features by applying a series of convolution and max-pooling operations to the input picture;
let k^l and b^l denote the convolution kernel and bias respectively; the feature extraction process is then:
h^l = pool_max(σ(h^(l-1) * k^l + b^l));
where pool_max denotes the max-pooling operation, σ is an arbitrary nonlinear activation function, * denotes the convolution operation, h^0 denotes the initial input picture, and the superscript l of h^l indexes the layer-l encoding of the input picture; the stride of all convolution operations is set to 1, the convolution kernel size is 3x3, and the boundary padding size is set to 1x1.
3. The scene text language identification method based on discriminative convolutional neural networks according to claim 1, characterized in that the specific implementation of step 1.4 includes the following sub-steps:
Step 1.4.1: apply discriminative clustering to the dense local features extracted by the convolutional neural network;
discriminative clustering is carried out separately within each language category and each feature level; the discovery set is the set of local features extracted at level l from pictures of category c; the natural set is the set of local features extracted from pictures of all other categories;
Step 1.4.2: learn a discriminative codebook;
the discriminative codebook consists of a group of linear classifiers; each classifier is equivalent to a discriminative-patch detector, and its output indicates the response of one discriminative pattern; distinguishing languages requires capturing local image regions with discriminative power; such a region is called a discriminative patch and acts as one discriminative pattern.
4. The scene text language identification method based on discriminative convolutional neural networks according to claim 1, characterized in that: in step 1.5, suppose the dense local feature map has shape w1 × h1 × n1, where w1, h1, n1 denote the three dimensions of the feature-map shape; each position of the feature map corresponds to one local descriptor, so w × h local descriptors, each an m-dimensional vector, can be extracted from the feature map; let h^l[i, j] denote the descriptor at row i, column j; it is encoded by a codebook of k classifiers, yielding a k-dimensional vector z^l[i, j] = max(0, W^l x^l[i, j] + b^l), where W^l x^l[i, j] + b^l is the response of the layer-l descriptor encoded by the codebook (W^l, b^l); max(0, ·) zeroes the negative responses, so the encoding result is the non-negative response of the codebook.
5. The scene text language identification method based on discriminative convolutional neural networks according to claim 4, characterized in that: in step 1.5, the encoding results of the local descriptors are aggregated into the full-image descriptor by the horizontal spatial pyramid pooling (HSPP) operation; the local encoding values z^l[i, j] are divided into several sub-regions by spatial position, and horizontal max pooling is performed within each region separately, selecting per dimension the maximum encoding value in each row as the pooling result, i.e. hspp(z^l) = max z^l[i, j];
the spatial sub-regions are divided along the vertical direction of the feature map, i.e., the feature map is divided into several blocks of equal height whose width matches that of the original feature map; each sub-map is pooled separately by the horizontal spatial pooling operation hspp(z^l) = max z^l[i, j], and the resulting pooling results are concatenated to obtain the full-image descriptor.
CN201910759386.7A 2019-08-16 2019-08-16 Scene text language identification method based on discriminative convolutional neural networks Pending CN110533088A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910759386.7A CN110533088A (en) 2019-08-16 2019-08-16 Scene text language identification method based on discriminative convolutional neural networks

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910759386.7A CN110533088A (en) 2019-08-16 2019-08-16 Scene text language identification method based on discriminative convolutional neural networks

Publications (1)

Publication Number Publication Date
CN110533088A true CN110533088A (en) 2019-12-03

Family

ID=68663497

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910759386.7A Pending CN110533088A (en) 2019-08-16 2019-08-16 A kind of scene text Language Identification based on differentiated convolutional neural networks

Country Status (1)

Country Link
CN (1) CN110533088A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104036296A (en) * 2014-06-20 2014-09-10 深圳先进技术研究院 Method and device for representing and processing image
CN105917354A (en) * 2014-10-09 2016-08-31 微软技术许可有限责任公司 Spatial pyramid pooling networks for image processing
CN105956517A (en) * 2016-04-20 2016-09-21 广东顺德中山大学卡内基梅隆大学国际联合研究院 Motion identification method based on dense trajectory
US20170011281A1 (en) * 2015-07-09 2017-01-12 Qualcomm Incorporated Context-based priors for object detection in images
CN109685152A (en) * 2018-12-29 2019-04-26 北京化工大学 A kind of image object detection method based on DC-SPP-YOLO

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
石葆光 (Shi Baoguang): "Research on Deep-Learning-Based Text Detection and Recognition Methods for Natural Scenes", China Doctoral Dissertations Full-text Database, Information Science and Technology Series *

Similar Documents

Publication Publication Date Title
CN105469047B (en) Chinese detection method and system based on unsupervised learning deep learning network
CN108549893A (en) A kind of end-to-end recognition methods of the scene text of arbitrary shape
CN105608454B (en) Character detecting method and system based on text structure component detection neural network
CN107368831A (en) English words and digit recognition method in a kind of natural scene image
CN107871101A (en) A kind of method for detecting human face and device
CN110807422A (en) Natural scene text detection method based on deep learning
Radwan et al. Neural networks pipeline for offline machine printed Arabic OCR
CN108804397A (en) A method of the Chinese character style conversion based on a small amount of target font generates
CN112686345B (en) Offline English handwriting recognition method based on attention mechanism
CN105913053B (en) A kind of facial expression recognizing method for singly drilling multiple features based on sparse fusion
CN111126404B (en) Ancient character and font recognition method based on improved YOLO v3
CN113762269B (en) Chinese character OCR recognition method, system and medium based on neural network
Talukder et al. Real-time bangla sign language detection with sentence and speech generation
Ahmed et al. Bangladeshi sign language recognition using fingertip position
Malakar et al. A holistic approach for handwritten Hindi word recognition
CN108664975A (en) A kind of hand-written Letter Identification Method of Uighur, system and electronic equipment
CN108537109B (en) OpenPose-based monocular camera sign language identification method
CN112069900A (en) Bill character recognition method and system based on convolutional neural network
CN110517270A (en) A kind of indoor scene semantic segmentation method based on super-pixel depth network
CN110348280A (en) Water book character recognition method based on CNN artificial neural
Alghazo et al. An online numeral recognition system using improved structural features–a unified method for handwritten Arabic and Persian numerals
CN109002771A (en) A kind of Classifying Method in Remote Sensing Image based on recurrent neural network
Garg et al. Optical character recognition using artificial intelligence
Rajnoha et al. Handwriting comenia script recognition with convolutional neural network
Bankar et al. Real time sign language recognition using deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191203