CN109241974B - Text image identification method and system - Google Patents

Text image identification method and system Download PDF

Info

Publication number
CN109241974B
Authority
CN
China
Prior art keywords
image
text
network
training
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810965342.5A
Other languages
Chinese (zh)
Other versions
CN109241974A
Inventor
康立
齐伟
刘燕清
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Yantu Education Technology Co ltd
Original Assignee
Suzhou Yantu Education Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Yantu Education Technology Co ltd filed Critical Suzhou Yantu Education Technology Co ltd
Priority to CN201810965342.5A
Publication of CN109241974A
Application granted
Publication of CN109241974B
Active legal status: Current
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/60 Rotation of whole images or parts thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 Image coding
    • G06T9/002 Image coding using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Character Discrimination (AREA)

Abstract

The invention relates to a text image recognition method comprising the following steps: inputting an image to be recognized into an image compression righting network, which rotates the image so that the text in it lies horizontally, the image compression righting network being obtained by training with a machine learning method and having an image rotation function; and recognizing text in the image output by the image compression righting network. The beneficial effects of the invention are that the image to be recognized is automatically compressed and righted by a convolutional autoencoder and then recognized by a text recognition neural network, which ensures character recognition accuracy, removes the manual preprocessing step, saves manual labor, and is convenient for users.

Description

Text image identification method and system
Technical Field
The invention belongs to the technical field of character recognition, and particularly relates to a text image recognition method and system.
Background
OCR character recognition software converts the text content of a picture or photograph directly into editable text using Optical Character Recognition (OCR) technology.
The existing character recognition process is as follows: a paper document is converted into an electronic document by an electronic device, for example by acquiring an image file of the paper document with a scanner or digital camera; OCR software then analyzes the image file to extract the characters and layout information.
In practice, the image files acquired by such devices are rarely aligned horizontally, so an operator must manually rotate the text image until the characters run horizontally. When many paper documents need to be recognized, the operator's workload is heavy, recognition efficiency is low, manual operation is error-prone, and recognition accuracy is hard to guarantee.
Providing a more convenient text image recognition method is therefore an urgent problem for those skilled in the art.
Disclosure of Invention
To solve the problems of low text recognition efficiency and accuracy in the prior art, the invention provides a text image recognition method and system characterized by high recognition efficiency and high accuracy.
The aim of the invention is to provide a text image recognition method and system that are convenient to use, save manual labor, and offer higher recognition efficiency.
The text image recognition method according to an embodiment of the invention comprises the following steps: inputting an image to be recognized into an image compression righting network, which rotates the image so that the text in it lies horizontally, the image compression righting network being obtained by training with a machine learning method and having an image rotation function;
and recognizing text in the image output by the image compression righting network.
Preferably, while rotating the image to be recognized, the image compression righting network adds marker points at the edges of the text image; the marker points are used to distinguish characters from blank areas in the text image.
Recognizing text in the image output by the image compression righting network then comprises: recognizing text in the output image according to the marker points.
Preferably, the image compression righting network compresses the image to be recognized while rotating it.
Preferably, the compressed and rotated image is cut line by line and word by word according to the marker points;
and the cut image segments are input into a text recognition neural network for text recognition, the text recognition neural network being obtained by training with a machine learning method and having a text recognition function.
Preferably, the text recognition neural network is obtained as follows:
establishing a character library;
building a multi-class convolutional neural network;
selecting characters from the character library, splicing them into a complete image, and inputting that image into the image compression righting network for compression;
training the convolutional neural network with the character library compressed by the image compression righting network;
and obtaining the text recognition neural network.
Preferably, the text recognition neural network is composed of the convolutional layers, pooling layers, fully connected layers, and corresponding network weights of the convolutional neural network.
Preferably, the image compression righting network consists of convolutional layers and pooling layers of the convolutional neural network.
Preferably, the image compression righting network is obtained as follows:
acquiring training text images;
rotating the training images into the upright position to serve as training targets, and labeling the original images to serve as the training set;
cutting the text of each training-target sample line by line and word by word, and adding cutting points at the word intervals;
inputting the training samples and training targets into a convolutional autoencoder for training, and deleting the fully connected layers in the decoder of the trained convolutional autoencoder to obtain the image compression righting network with automatic correction and compression capability.
Preferably, the text recognition process adopts distributed processing, with several groups of text recognition neural networks working simultaneously; the distributed recognition results are integrated in order to obtain the final text recognition result.
The text image recognition system according to an embodiment of the invention includes:
a text image acquisition module, which acquires the user's image to be recognized;
the image compression righting network, which rotates and compresses the acquired image;
a text cutting module, which cuts the rotated and compressed image line by line and word by word; and
a text recognition module, which recognizes the cut image segments and outputs the corresponding characters.
The text image recognition method and system have the following advantages: by combining an autoencoder with a convolutional neural network, the user does not need to preprocess the original image, which is convenient for the user while maintaining high character recognition accuracy; the complicated steps of existing character recognition are simplified, and character recognition can be completed within a single network system.
Drawings
To illustrate the embodiments of the invention or the technical solutions of the prior art more clearly, the drawings needed for describing them are briefly introduced below. The drawings described below obviously show only some embodiments of the invention; those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a first flowchart of a text image recognition method according to an exemplary embodiment;
FIG. 2 is a second flowchart of a text image recognition method according to an exemplary embodiment;
FIG. 3 is a flowchart of the construction of the text recognition neural network according to an exemplary embodiment;
FIG. 4 is a first flowchart of the construction of the image compression righting network according to an exemplary embodiment;
FIG. 5 is a second flowchart of the construction of the image compression righting network according to an exemplary embodiment;
FIG. 6 is a block diagram of the overall recognition network according to an exemplary embodiment;
FIG. 7 is a block diagram of the text recognition system according to an exemplary embodiment;
FIG. 8 is a schematic illustration of a text image after rotation and compression according to an exemplary embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be described in detail below. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the examples given herein without any inventive step, are within the scope of the present invention.
Referring to FIG. 1, an embodiment of the invention provides a text image recognition method, comprising:
101. Acquiring an image to be recognized.
102. Inputting the image to be recognized into an image compression righting network, which rotates it so that the text lies horizontally; the image compression righting network is obtained by training with a machine learning method and has an image rotation function.
103. Recognizing text in the image output by the image compression righting network.
In the text image recognition method provided by this embodiment, combining an autoencoder with a convolutional neural network means the user does not need to preprocess the original image, which is convenient for the user while maintaining high character recognition accuracy; the complicated steps of existing character recognition are simplified, and character recognition can be completed within a single network system.
As a possible implementation of the above embodiment, the convolutional autoencoder includes an encoder composed of several convolutional and pooling layers, and a decoder composed of unpooling and deconvolution layers. Each convolutional layer contains several convolution kernels that extract features from the input image to produce feature maps; the activation of the k-th feature map may be written as h^k = σ(x * W^k + b^k). The pooling layers down-sample the feature maps to reduce the computational complexity of the convolution operations. The deconvolution operation convolves each feature map with the transpose of its corresponding convolution kernel and sums the results; its activation may be written as y = σ(Σ_k h^k * (W^k)^T + c).
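As an illustration of this encoder/decoder structure, the sketch below shows a minimal convolutional autoencoder, assuming PyTorch. It is not the patent's actual network: the layer counts, channel sizes, and the use of transposed convolutions in place of explicit unpooling layers are illustrative assumptions.

```python
# Minimal convolutional autoencoder sketch (PyTorch). Layer counts, channel
# sizes, and the use of transposed convolutions instead of explicit unpooling
# are illustrative assumptions, not values taken from the patent.
import torch.nn as nn

class ConvAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder: convolution + pooling, h^k = sigma(x * W^k + b^k)
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # Decoder: deconvolution back to the input resolution,
        # y = sigma(sum_k h^k * (W^k)^T + c)
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, kernel_size=2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, kernel_size=2, stride=2), nn.Sigmoid(),
        )

    def forward(self, x):
        code = self.encoder(x)        # compressed feature maps
        return self.decoder(code)     # reconstructed (righted, compressed) image
```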
The rotation righting operation is needed because, in most photographed images, the characters are not aligned horizontally with the upper edge of the page, which makes character segmentation and recognition harder and reduces accuracy. The convolutional autoencoder therefore applies a rotation transformation to the input text image so that the text in the output image runs horizontally.
Referring to FIG. 2, in an embodiment of the invention, while the image compression righting network rotates the image to be recognized, marker points are added at the edges of the text image; the marker points are used to distinguish characters from blank areas in the text image. Text is then recognized in the image output by the image compression righting network according to the marker points.
The image compression righting network compresses the image to be recognized while rotating it;
the compressed and rotated image is cut line by line and word by word according to the marker points;
and the cut image segments are input into a text recognition neural network for text recognition, the text recognition neural network being obtained by training with a machine learning method and having a text recognition function.
The image is compressed as well as righted because an autoencoder inherently has good image compression capability; using it only for image rotation would waste resources, whereas compressing the image while rotating it avoids that waste. An autoencoder is a neural network with three layers: an input layer, a hidden (coding) layer, and a decoding layer. The purpose of the network is to reconstruct its input so that its hidden layer learns a good representation of the input. An autoencoder is an unsupervised machine learning algorithm that applies back-propagation and sets the target value equal to the input value; its training goal is to copy the input to the output, and internally its hidden layer holds the code used to characterize the input. The convolutional autoencoder used by the invention borrows from the denoising autoencoder: parts of the input are randomly corrupted to avoid the risk of learning the identity function, so the autoencoder must recover, or denoise, the input. This technique yields a good representation of the input, that is, one that can be obtained robustly from the corrupted input and used to recover the corresponding noise-free input.
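The denoising behaviour described above can be imitated with a simple corruption step before encoding; the sketch below is an assumption-level illustration (including the 10% corruption ratio and the function name corrupt) of randomly zeroing pixels so the autoencoder cannot learn the identity function.

```python
# Denoising-style corruption sketch: randomly zero a fraction of the input
# pixels before encoding so the autoencoder must reconstruct the clean target.
# The 10% corruption ratio is an illustrative assumption.
import torch

def corrupt(x: torch.Tensor, drop_prob: float = 0.1) -> torch.Tensor:
    mask = (torch.rand_like(x) > drop_prob).float()  # 1 = keep, 0 = corrupt
    return x * mask
```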
After rotation and compression, the processed image contains cutting points, so it can easily be cut line by line and word by word. Referring to FIG. 8, in one embodiment of the invention the cutting point information of the rotated and compressed text image consists of three values: the first is the line number, the second the x-axis coordinate, and the third the y-axis coordinate, read from left to right. Every line of text is thus marked with a label and a position, and the text can be cut easily.
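One possible way of using these (line number, x, y) triples is sketched below. The triple format follows the description above; the fixed per-line crop height and the function name cut_words are illustrative assumptions.

```python
# Sketch of word-by-word cutting driven by the (line number, x, y) cutting
# points described above. The fixed per-line crop height is an assumption.
from collections import defaultdict
import numpy as np

def cut_words(image: np.ndarray, cut_points, row_height: int):
    """image: 2D grayscale array; cut_points: iterable of (line, x, y) triples."""
    lines = defaultdict(list)
    for line_no, x, y in cut_points:
        lines[line_no].append((x, y))
    crops = []
    for line_no in sorted(lines):
        points = sorted(lines[line_no])                     # left to right within a line
        for (x0, y0), (x1, _) in zip(points, points[1:]):   # neighbouring points bound a word
            crops.append(image[y0:y0 + row_height, x0:x1])
    return crops
```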
As a possible implementation of the above embodiment, and as shown in FIG. 3, the text recognition neural network may be obtained as follows:
301. Establishing a character library.
302. Building a multi-class convolutional neural network.
303. Selecting characters from the character library, splicing them into a complete image, inputting that image into the image compression righting network for compression, and training the convolutional neural network with the compressed character library.
304. Truncating the trained convolutional neural network to obtain the text recognition neural network.
In an embodiment of the invention, the training process of the convolutional character recognition network comprises:
first, collecting scanned pictures of different characters, establishing a complete character library, and compressing it with the image compression righting network to form the sample set;
initializing the convolutional neural network and assigning the network weights with random parameters so that the network is ready for training, with the convolutional layers using the ReLU activation function, i.e. f(x) = max(0, x);
shuffling, ordering, and grouping the collected character library, inputting it batch by batch into the initialized convolutional neural network, and training the network;
and monitoring the training progress and cross-validating the training results until the network performance converges, at which point training is complete.
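A compact training loop in the spirit of these steps is sketched below, again assuming PyTorch. The network shape, the 32x32 input crops, the batch size, the optimizer, and the epoch count are illustrative assumptions rather than the patent's settings.

```python
# Illustrative training loop for the multi-class character classifier.
# Architecture and hyper-parameters are assumptions, not the patent's values.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

class CharClassifier(nn.Module):
    def __init__(self, num_classes: int):
        super().__init__()
        self.features = nn.Sequential(                      # ReLU: f(x) = max(0, x)
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(64 * 8 * 8, num_classes)  # assumes 32x32 crops

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

def train(model, train_set, val_set, epochs=10, lr=1e-3):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in DataLoader(train_set, batch_size=64, shuffle=True):  # shuffled batches
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
        with torch.no_grad():                                # cross-validate each epoch
            correct = total = 0
            for x, y in DataLoader(val_set, batch_size=64):
                correct += (model(x).argmax(1) == y).sum().item()
                total += y.numel()
            print(f"validation accuracy: {correct / total:.3f}")  # stop once this converges
```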
The text image processed by the image compression righting network retains the high-level information of the original image, while unnecessary information is filtered out.
As a possible implementation of the foregoing embodiment, and as shown in FIG. 4, the image compression righting network may be obtained as follows:
401. Acquiring training text images.
402. Rotating the training images into the upright position to serve as training targets, and labeling the original images to serve as the training set.
403. Cutting the text of each training-target sample line by line and word by word, and adding cutting points at the word intervals.
404. Inputting the training samples and training targets into a convolutional autoencoder for training, then deleting the fully connected layers in the decoder of the trained convolutional autoencoder to obtain the image compression righting network with automatic correction and compression capability.
Referring to FIG. 5, in an embodiment of the invention, training the convolutional autoencoder comprises the following steps:
501. Collecting training samples.
502. Applying small rotations to the training samples to augment the training set, and labeling the original pictures.
503. Cutting the rotated, righted pictures in the training samples word by word.
504. Adding cutting points at the cutting positions and splicing the pieces back into a complete picture to serve as the training target.
505. Initializing the convolutional autoencoder and assigning the network weights with random values.
506. Shuffling the training samples and inputting them into the convolutional autoencoder in batches for training until convergence.
Training minimizes the reconstruction error between the image reconstructed by the convolutional autoencoder and the training target. The training loss uses the mean squared error function, i.e.

L = (1/n) Σ_i (y_i − ŷ_i)²,

where y_i is a value of the training target and ŷ_i is the corresponding value of the reconstructed image. The convolutional network parameters are then updated by gradient descent on this loss, in the standard form W ← W − η·∂L/∂W, where η is the learning rate.
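In code, the loss and update above might look like the sketch below, assuming PyTorch; the plain SGD optimiser, learning rate, and epoch count are illustrative assumptions.

```python
# Sketch of the autoencoder training step: minimise the mean squared
# reconstruction error against the righted/compressed training target and
# update the weights by gradient descent. Optimiser and learning rate are
# illustrative assumptions.
import torch
import torch.nn as nn

def train_autoencoder(model, loader, epochs=20, lr=1e-3):
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    mse = nn.MSELoss()                     # L = (1/n) * sum_i (y_i - y_hat_i)^2
    for _ in range(epochs):
        for original, target in loader:    # target = rotated + compressed sample
            reconstruction = model(original)
            loss = mse(reconstruction, target)
            opt.zero_grad()
            loss.backward()
            opt.step()                     # W <- W - lr * dL/dW
```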
referring to fig. 6, in an embodiment of the present invention, the overall identification network includes: the image compression positioning network, a program for cutting the image to be identified word by word after rotation compression and a text identification neural network. Because the output result of the image compression righting network has obvious cutting sites, the character cutting does not need to be carried out according to the traditional character cutting mode. The input image can be dynamically scanned, the cutting sites are taken as boundaries, and the images among the cutting sites are connected with a character recognition neural network. If the processing capacity of the server side is enough, the whole recognition network does not need to be divided into two subsystems, and the character recognition neural network can be directly connected to the tail end of the convolutional self-encoder to form a complete neural network. The design avoids a large amount of communication congestion between the GPU and the CPU caused by character-by-character segmentation, and greatly improves the utilization efficiency and the calculation speed of the GPU.
In some embodiments of the invention, character recognition may use distributed processing, with several overall recognition networks working simultaneously, which can greatly increase recognition speed.
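One plausible way to realize this distributed mode is with a process pool, as in the sketch below; the worker function recognize_batch and the pool size are assumptions.

```python
# Distributed-mode sketch: split the cropped word images into batches, let
# several worker processes (each running its own recognition network) handle
# them in parallel, then merge the partial results back in order.
from multiprocessing import Pool

def recognize_distributed(crop_batches, recognize_batch, workers=4):
    with Pool(workers) as pool:
        partial_results = pool.map(recognize_batch, crop_batches)  # order is preserved
    return "".join(partial_results)
```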
An embodiment of the invention also provides a text image recognition system, comprising:
a text image acquisition module, which acquires the user's image to be recognized;
the image compression righting network, which rotates and compresses the acquired image;
a text cutting module, which cuts the rotated and compressed image line by line and word by word; and
a text recognition module, which recognizes the cut image segments and outputs the corresponding characters.
In some embodiments of the invention, the environment for overall text image recognition includes several terminals and a server, with the text image recognition system installed on the server. The terminal may be, but is not limited to, a personal computer, laptop, personal digital assistant, smart phone, tablet computer, portable wearable device, or any other device capable of running the text image recognition method. The server may implement a single function or multiple functions, and may specifically be an independent physical server or a physical server cluster. The client terminal photographs the text to be recognized, such as an examination paper, and sends it to the server over the network; the server automatically preprocesses the picture with the convolutional autoencoder and then recognizes it with the character recognition network to obtain the final result. The recognized text content is returned to the client over the network, and the user obtains the recognition result.
In an embodiment of the invention, the server-side computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. The processor provides computation and control capability and supports the operation of the whole terminal. The memory includes a non-volatile storage medium and an internal memory; the non-volatile storage medium stores an operating system and a computer program that, when executed by the processor, causes the processor to implement the text image recognition method. The internal memory may also store a computer program that, when executed by the processor, causes the processor to perform the overall text image recognition method. The network interface of the computer device communicates with the terminal. The input device may be a touch layer covering the display screen, or an external keyboard, touch pad, or mouse; it captures instructions generated when the user operates the interface shown on the display screen, for example an instruction to input an image to be recognized by tapping a specific option on the terminal. The display screen may be used to display the input interface or the output text regions.
The above description covers only specific embodiments of the invention, but the scope of the invention is not limited to them; any changes or substitutions that a person skilled in the art could readily conceive within the technical scope of the invention shall be covered by the scope of the invention. The protection scope of the invention shall therefore be defined by the appended claims.

Claims (7)

1. A method for recognizing a text image, comprising:
inputting an image to be recognized into an image compression righting network, which rotates the image to be recognized so that the text in it lies horizontally, wherein the image compression righting network is obtained by training with a machine learning method and has an image rotation function;
recognizing text in the image output by the image compression righting network;
wherein, while rotating the image to be recognized, the image compression righting network adds marker points at the edges of the text image, the marker points being used to distinguish characters from blank areas in the text image;
recognizing text in the image output by the image compression righting network comprises: recognizing text in the image output by the image compression righting network according to the marker points;
the image compression righting network consists of the convolutional layers and pooling layers of a convolutional neural network;
and the image compression righting network is obtained by:
acquiring training text images;
rotating the training images into the upright position to serve as training targets, and labeling the original images to serve as the training set;
cutting the text of each training-target sample line by line and word by word, and adding cutting points at the word intervals;
inputting the training samples and training targets into a convolutional autoencoder for training, and deleting the fully connected layers in the decoder of the trained convolutional autoencoder to obtain the image compression righting network with automatic correction and compression capability.
2. The method of claim 1, wherein the image compression righting network compresses the image to be recognized while rotating it.
3. The method of claim 2, wherein:
the compressed and rotated image to be recognized is cut line by line and word by word according to the marker points;
and the cut image segments are input into a text recognition neural network for text recognition, the text recognition neural network being obtained by training with a machine learning method and having a text recognition function.
4. The method of claim 3, wherein the text recognition neural network is obtained by:
establishing a character library;
building a multi-class convolutional neural network;
selecting characters from the character library, splicing them into a complete image, and inputting that image into the image compression righting network for compression;
training the convolutional neural network with the character library compressed by the image compression righting network;
and obtaining the text recognition neural network.
5. The method of claim 4, wherein the text recognition neural network is composed of the convolutional layers, pooling layers, fully connected layers, and corresponding network weights of the convolutional neural network.
6. The method according to any one of claims 1 to 5, wherein the text recognition process adopts distributed processing, with several groups of text recognition neural networks working simultaneously; the distributed recognition results are integrated in order to obtain the final text recognition result.
7. A system for recognizing a text image, comprising:
a text image acquisition module, which acquires the user's image to be recognized;
the image compression righting network, which rotates and compresses the acquired image;
a text cutting module, which cuts the rotated and compressed image line by line and word by word; and a text recognition module, which recognizes the cut image segments and outputs the corresponding characters.
CN201810965342.5A 2018-08-23 2018-08-23 Text image identification method and system Active CN109241974B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810965342.5A CN109241974B (en) 2018-08-23 2018-08-23 Text image identification method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810965342.5A CN109241974B (en) 2018-08-23 2018-08-23 Text image identification method and system

Publications (2)

Publication Number Publication Date
CN109241974A CN109241974A (en) 2019-01-18
CN109241974B true CN109241974B (en) 2020-12-01

Family

ID=65069329

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810965342.5A Active CN109241974B (en) 2018-08-23 2018-08-23 Text image identification method and system

Country Status (1)

Country Link
CN (1) CN109241974B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111695385B (en) * 2019-03-15 2023-09-26 杭州海康威视数字技术股份有限公司 Text recognition method, device and equipment
CN110674811B (en) * 2019-09-04 2022-04-29 广东浪潮大数据研究有限公司 Image recognition method and device
CN111242024A (en) * 2020-01-11 2020-06-05 北京中科辅龙科技股份有限公司 Method and system for recognizing legends and characters in drawings based on machine learning
CN111444908B (en) * 2020-03-25 2024-02-02 腾讯科技(深圳)有限公司 Image recognition method, device, terminal and storage medium
US11216960B1 (en) 2020-07-01 2022-01-04 Alipay Labs (singapore) Pte. Ltd. Image processing method and system
CN117496531B (en) * 2023-11-02 2024-05-24 四川轻化工大学 Construction method of convolution self-encoder capable of reducing Chinese character recognition resource overhead

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104915386A (en) * 2015-05-25 2015-09-16 中国科学院自动化研究所 Short text clustering method based on deep semantic feature learning
CN105469047A (en) * 2015-11-23 2016-04-06 上海交通大学 Chinese detection method based on unsupervised learning and deep learning network and system thereof
CN107247950A (en) * 2017-06-06 2017-10-13 电子科技大学 A kind of ID Card Image text recognition method based on machine learning
CN107403130A (en) * 2017-04-19 2017-11-28 北京粉笔未来科技有限公司 A kind of character identifying method and character recognition device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106156766B (en) * 2015-03-25 2020-02-18 阿里巴巴集团控股有限公司 Method and device for generating text line classifier

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104915386A (en) * 2015-05-25 2015-09-16 中国科学院自动化研究所 Short text clustering method based on deep semantic feature learning
CN105469047A (en) * 2015-11-23 2016-04-06 上海交通大学 Chinese detection method based on unsupervised learning and deep learning network and system thereof
CN107403130A (en) * 2017-04-19 2017-11-28 北京粉笔未来科技有限公司 A kind of character identifying method and character recognition device
CN107247950A (en) * 2017-06-06 2017-10-13 电子科技大学 A kind of ID Card Image text recognition method based on machine learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Robust Scene Text Recognition with Automatic Rectification; Baoguang Shi, Xinggang Wang, Pengyuan Lyu, Cong Yao, Xiang Bai; 2016 IEEE Conference on Computer Vision and Pattern Recognition; 2016-12-12; whole document *
基于深度学习的场景文字检测与识别 (Scene text detection and recognition based on deep learning); Xiang Bai, Mingkun Yang, Baoguang Shi, Minghui Liao; 中国科学:信息科学 (Scientia Sinica Informationis); 2018-05-11; Vol. 48, No. 5; Chapter 3, Sections 3.2.1-3.2.2 *

Also Published As

Publication number Publication date
CN109241974A (en) 2019-01-18

Similar Documents

Publication Publication Date Title
CN109241974B (en) Text image identification method and system
WO2018010657A1 (en) Structured text detection method and system, and computing device
CN111325271B (en) Image classification method and device
CN109919077B (en) Gesture recognition method, device, medium and computing equipment
CN111597884A (en) Facial action unit identification method and device, electronic equipment and storage medium
CN108734653B (en) Image style conversion method and device
US11599727B2 (en) Intelligent text cleaning method and apparatus, and computer-readable storage medium
CN109408058B (en) Front-end auxiliary development method and device based on machine learning
CN110689658A (en) Taxi bill identification method and system based on deep learning
CN111428557A (en) Method and device for automatically checking handwritten signature based on neural network model
CN113657404B (en) Image processing method of Dongba pictograph
JP7389824B2 (en) Object identification method and device, electronic equipment and storage medium
CN111832449A (en) Engineering drawing display method and related device
CN110610180A (en) Method, device and equipment for generating recognition set of wrongly-recognized words and storage medium
CN112380566A (en) Method, apparatus, electronic device, and medium for desensitizing document image
CN112966685A (en) Attack network training method and device for scene text recognition and related equipment
CN115937546A (en) Image matching method, three-dimensional image reconstruction method, image matching device, three-dimensional image reconstruction device, electronic apparatus, and medium
US11106908B2 (en) Techniques to determine document recognition errors
CN111325190A (en) Expression recognition method and device, computer equipment and readable storage medium
CN111833413B (en) Image processing method, image processing device, electronic equipment and computer readable storage medium
WO2021051562A1 (en) Facial feature point positioning method and apparatus, computing device, and storage medium
CN113486171B (en) Image processing method and device and electronic equipment
WO2023130613A1 (en) Facial recognition model construction method, facial recognition method, and related device
CN114708420A (en) Visual positioning method and device based on local variance and posterior probability classifier
CN113610856A (en) Method and device for training image segmentation model and image segmentation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant