CN110427852B - Character recognition method and device, computer equipment and storage medium - Google Patents

Character recognition method and device, computer equipment and storage medium

Info

Publication number
CN110427852B
CN110427852B (application CN201910672543.0A)
Authority
CN
China
Prior art keywords
character
neural network
sequence
center
ordered list
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910672543.0A
Other languages
Chinese (zh)
Other versions
CN110427852A (en)
Inventor
龙上邦
关玉烁
姚聪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kuangshi Technology Co Ltd
Original Assignee
Beijing Kuangshi Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kuangshi Technology Co Ltd filed Critical Beijing Kuangshi Technology Co Ltd
Priority to CN201910672543.0A priority Critical patent/CN110427852B/en
Publication of CN110427852A publication Critical patent/CN110427852A/en
Application granted granted Critical
Publication of CN110427852B publication Critical patent/CN110427852B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/62Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/63Scene text, e.g. street names
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40Document-oriented image-based pattern recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition

Abstract

The application relates to a character recognition method, a character recognition device, a computer device and a storage medium. By sampling features at character center points, the method effectively filters out features at irrelevant positions; by collecting features in the order given by the ordered list of character center points, it preserves a holistic view of the features. The method thus combines the idea of two-dimensional spatial distribution with the idea of sequence learning, and performs well in recognizing irregular text in images.

Description

Character recognition method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of computer vision technologies, and in particular, to a character recognition method, a character recognition device, a computer device, and a storage medium.
Background
Scene text recognition has been a popular research topic in computer vision for the past few decades. Applications such as instant translation, robot navigation, industrial automation, and autonomous driving all rely on scene text recognition technology.
Conventional text recognition follows an encoder-decoder framework, under which an image is decomposed into a sequence of pixel frames from the left side of the image to the right. A representative instance of this framework is the Convolutional Recurrent Neural Network (CRNN): its convolutional layers encode the image into deep features, its pooling layer converts those features into C-dimensional sequence features with w time steps, and its RNN or LSTM layers perform further encoding and decoding. This framework achieves good recognition results on images containing horizontal text.
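For illustration only, the following is a minimal PyTorch sketch of such a CRNN-style pipeline; the layer widths, the two-layer convolutional encoder, and the class count are assumptions made for this example rather than details of any particular system.

```python
import torch
import torch.nn as nn

class TinyCRNN(nn.Module):
    """Sketch of a CRNN-style recognizer: a convolutional encoder,
    height pooling into C-dimensional sequence features with w time
    steps, and a recurrent layer over those steps."""
    def __init__(self, num_classes, c=256):
        super().__init__()
        self.conv = nn.Sequential(  # encodes the image into deep features
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, c, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.pool = nn.AdaptiveAvgPool2d((1, None))  # collapse height to 1
        self.rnn = nn.LSTM(c, 128, bidirectional=True, batch_first=True)
        self.fc = nn.Linear(256, num_classes)

    def forward(self, x):            # x: (N, 3, H, W)
        f = self.conv(x)             # (N, C, H/4, W/4)
        f = self.pool(f).squeeze(2)  # (N, C, w): w time steps of C-dim features
        out, _ = self.rnn(f.permute(0, 2, 1))  # (N, w, 256)
        return self.fc(out)          # per-time-step class scores
```

Note how the time steps in this sketch are tied directly to the horizontal axis of the image.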
However, this text recognition framework is not suitable for the case of curved text recognition.
Disclosure of Invention
In view of the above, it is desirable to provide a character recognition method, device, computer device and storage medium capable of better recognizing irregular characters.
A character recognition method, the method comprising:
inputting the image into a feature extraction neural network for feature extraction to obtain a shared feature map;
acquiring an ordered list of character center points according to the shared feature map and a character center detection neural network, wherein the ordered list of character center points comprises the coordinates of at least one predicted character center point;
extracting sequence features from the shared feature map based on the ordered list of character center points, and concatenating the sequence features in order to obtain a feature sequence;
and inputting the feature sequence into a character recognition neural network for character recognition to obtain a recognition result of the characters in the image.
In one embodiment, obtaining the ordered list of character center points according to the shared feature map and the character center detection neural network includes:
inputting the shared feature map into a character center detection neural network to predict character center areas, and acquiring a character center prediction map, wherein a character center area is a character box shrunk by a preset ratio, and the character center prediction map indicates the probability that each pixel belongs to a character center area;
and obtaining an ordered list of character center points according to the center prediction map, wherein the ordered list of character center points comprises the coordinates of at least one predicted character center point.
In one embodiment, obtaining the ordered list of character center points according to the center prediction map includes:
taking pixels in the center prediction map whose probability is greater than or equal to a preset threshold as positive samples;
aggregating positive samples that are spatially adjacent into groups to obtain a plurality of character center areas;
and acquiring the coordinates of the character center point of each character center area, and arranging the coordinates of the character center points according to a preset rule to obtain the ordered list of character center points.
In one embodiment, the feature extraction neural network comprises a residual network ResNet and a feature pyramid network;
the residual network ResNet is used for feature extraction; the feature pyramid network is used for fusing the features extracted by each network layer of the residual network ResNet.
In one embodiment, the residual network ResNet is a 50-layer residual network, and the upsampling layers of the feature pyramid network use bilinear interpolation.
In one embodiment, the character center detection neural network comprises a double-layer convolutional neural network, wherein a first network layer of the double-layer convolutional neural network is a filter, and a second network layer of the double-layer convolutional neural network is a pixel classification layer.
In one embodiment, extracting sequence features from the shared feature map based on the ordered list of character center points, and concatenating the sequence features in order to obtain a feature sequence includes:
inputting the shared feature map into a shared feature encoding neural network for shared feature encoding to obtain a feature sampling map;
sampling the feature sampling map based on the ordered list of character center points to obtain sequence features;
and concatenating the sequence features in order to obtain the feature sequence.
In one embodiment, sampling the feature sampling map based on the ordered list of character center points to obtain the sequence features includes:
selecting sampling points according to the character center points in the ordered list of character center points;
and if the coordinates of a sampling point are not integer values, acquiring the features of the sampling point by bilinear interpolation, and taking the acquired features of the sampling points as the sequence features.
In one embodiment, the encoder of the character recognition neural network is a single-layer bidirectional long short-term memory (LSTM) network, and the decoder of the character recognition neural network is a unidirectional LSTM network.
A character recognition device, the device comprising:
a shared feature extraction module, used for inputting the image into a feature extraction neural network for feature extraction to obtain a shared feature map;
a character pooling module, used for obtaining an ordered list of character center points according to the shared feature map and a character center detection neural network, extracting sequence features from the shared feature map based on the ordered list of character center points, and concatenating the sequence features in order to obtain a feature sequence, wherein the ordered list of character center points comprises the coordinates of at least one predicted character center point;
and a character recognition module, used for inputting the feature sequence into a character recognition neural network for character recognition to obtain a recognition result of the characters in the image.
A computer device, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the following steps when executing the computer program:
inputting the image into a feature extraction neural network for feature extraction to obtain a shared feature map;
inputting the shared feature map into a character center detection neural network to perform character center prediction to obtain a character center prediction map, wherein a character center area is a character box shrunk by a preset ratio, and the center prediction map indicates the probability that each pixel belongs to a character center area;
obtaining an ordered list of character center points according to the center prediction map, wherein the ordered list of character center points comprises at least one predicted character center point;
collecting features from the shared feature map based on the character center points in the ordered list of character center points, and concatenating the features in order to obtain a feature sequence;
and inputting the feature sequence into a character recognition neural network for character recognition to obtain a recognition result of the characters in the image.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
inputting the image into a feature extraction neural network for feature extraction to obtain a shared feature map;
inputting the shared feature map into a character center detection neural network to perform character center prediction to obtain a character center prediction map, wherein a character center area is a character box shrunk by a preset ratio, and the center prediction map indicates the probability that each pixel belongs to a character center area;
obtaining an ordered list of character center points according to the center prediction map, wherein the ordered list of character center points comprises at least one predicted character center point;
collecting features from the shared feature map based on the character center points in the ordered list of character center points, and concatenating the features in order to obtain a feature sequence;
and inputting the feature sequence into a character recognition neural network for character recognition to obtain a recognition result of the characters in the image.
According to the character recognition method, the character recognition device, the computer device, and the storage medium, the image is first input into a feature extraction neural network to obtain a shared feature map; an ordered list of character center points is then obtained from the shared feature map using a character center detection neural network; sequence features are then extracted based on the ordered list of character center points and concatenated into a feature sequence; and finally character recognition is performed on the obtained feature sequence. By sampling features at character center points, the method effectively filters out features at irrelevant positions; by collecting features in the order given by the ordered list of character center points, it preserves a holistic view of the features. The method thus combines the idea of two-dimensional spatial distribution with the idea of sequence learning, and performs well in recognizing irregular text in images.
Drawings
FIG. 1 is a diagram of an application environment of a text recognition method in one embodiment;
FIG. 2 is a flow diagram illustrating a method for text recognition in one embodiment;
FIG. 3 is a schematic diagram of a feature extraction neural network and a character center detection neural network in one embodiment;
FIG. 4 is a diagram illustrating pooling over the obtained shared feature map in one embodiment;
FIG. 5 is a schematic flowchart of the refinement of step S220 in one embodiment;
FIG. 6 is a schematic flowchart of the refinement of step S222 in one embodiment;
FIG. 7 is a schematic flowchart of the refinement of step S230 in another embodiment;
FIG. 8 is a block diagram of a text recognition device in one embodiment;
FIG. 9 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The character recognition method provided by the application can be applied to the application environment shown in fig. 1. The processor 100 and the image capturing device 200 are connected to each other. The image capturing device 200 is used for capturing an image to be processed, and the processor 100 is used for processing the image captured by the image capturing device 200.
Optionally, the processor 100 and the image capturing device 200 may be integrated in a notebook computer, a smart phone, a tablet computer, or another intelligent terminal that combines image capturing with image processing.
Alternatively, the processor 100 and the image capturing device 200 may be independent devices; for example, the image capturing device 200 is a digital camera and the processor 100 is a personal computer. In this case, when the processor 100 is required to process an image acquired by the image capturing device 200, a communication link between the processor 100 and the image capturing device 200 is first established, and the image acquired by the image capturing device 200 is then transmitted to the processor 100 through the communication link.
Further, the processor 100 may execute a correlation algorithm of the neural network model to perform a correlation operation of each network layer of the neural network model.
In one embodiment, as shown in fig. 2, a text recognition method is provided, which is exemplified by the application environment in fig. 1, and includes the following steps:
Step 210, inputting the image into a feature extraction neural network for feature extraction, and acquiring a shared feature map.
Specifically, the image is input into the feature extraction neural network, and the processor 100 runs the feature extraction neural network to perform feature extraction on the input image, so as to obtain a shared feature map. Further, the feature extraction neural network may include a residual network ResNet and a feature pyramid network FPN. The residual network ResNet performs feature extraction, and the feature pyramid network fuses the features extracted by each network layer of the residual network ResNet. Alternatively, as shown in fig. 3, the residual network ResNet may be a 50-layer residual network. The upsampling layers of the feature pyramid network use bilinear interpolation, and the other settings of the feature pyramid network are consistent with a standard feature pyramid network. Further, all convolution layers in the feature pyramid have 256 convolution kernels; the lateral-connection convolution kernels have size 1, and the top-down convolution kernels have size 3. Optionally, the size of the shared feature map is one quarter of the size of the input image.
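As a rough illustration of these settings, the following is a minimal sketch of a single pyramid fusion step, assuming a two-level fusion and the usual ResNet-50 stage channel counts (512 and 2048); it is an illustrative sketch under those assumptions, not the patented implementation itself.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MiniFPN(nn.Module):
    """One FPN fusion step with the settings described above:
    1x1 lateral convolutions, a 3x3 top-down convolution, 256
    output channels, and bilinear upsampling."""
    def __init__(self, c_low=512, c_high=2048, c_out=256):
        super().__init__()
        self.lat_low = nn.Conv2d(c_low, c_out, kernel_size=1)    # lateral, size 1
        self.lat_high = nn.Conv2d(c_high, c_out, kernel_size=1)  # lateral, size 1
        self.top_down = nn.Conv2d(c_out, c_out, kernel_size=3, padding=1)  # size 3

    def forward(self, f_low, f_high):
        # f_low: higher-resolution backbone stage; f_high: lower-resolution stage
        up = F.interpolate(self.lat_high(f_high), size=f_low.shape[-2:],
                           mode='bilinear', align_corners=False)  # bilinear upsampling
        return self.top_down(self.lat_low(f_low) + up)  # fused shared features
```

In a full pyramid this step would be repeated across all backbone stages, so that the resulting shared feature map sits at one quarter of the input image size.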
Step 220, obtaining an ordered list of character center points according to the shared feature map and the character center detection neural network.
The ordered list of character center points comprises the coordinates of at least one predicted character center point, and the coordinates describe the position of the character center point. Specifically, the processor 100 runs the character center detection neural network to process the shared feature map and obtain the ordered list of character center points. More specifically, the processor 100 may input the shared feature map into the character center detection neural network to perform character center prediction to obtain a plurality of character center areas, then acquire the coordinates of the character center point of each character center area, and obtain the ordered list of character center points according to the acquired coordinates of the character center points.
Step 230, extracting sequence features from the shared feature map based on the ordered list of character center points, and concatenating the sequence features in order to obtain a feature sequence.
Specifically, the processor 100 extracts sequence features from the shared feature map based on the ordered list of character center points, and concatenates the sequence features in order to obtain a feature sequence. Alternatively, referring to fig. 4, the processor 100 obtains the sequence features by uniformly sampling the shared feature map at the coordinates of each character center point in the ordered list, and then concatenates the sequence features in order to obtain the feature sequence. More specifically, with the preset number of sampling points set to a hyperparameter M, the processor 100 acquires a C × 1 × 1 feature vector from the position of the shared feature map corresponding to each character center point, and on this basis obtains a feature sequence of size M × C, i.e. a feature sequence with M time steps.
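The feature gathering described here can be sketched as follows; the nearest-pixel lookup and the zero padding used when an image contains fewer than M characters are assumptions made for this example (the bilinear variant used for non-integer coordinates is discussed under step S232 below).

```python
import torch

def gather_sequence_features(feature_map, centers, m):
    """Sample a C x H x W shared feature map at up to m character
    center points (x, y) and stack the C x 1 x 1 vectors into an
    (m, C) feature sequence with m time steps."""
    c, h, w = feature_map.shape
    seq = torch.zeros(m, c)  # unused slots stay zero (padding assumption)
    for i, (x, y) in enumerate(centers[:m]):
        xi = min(max(int(round(x)), 0), w - 1)  # clamp to the map borders
        yi = min(max(int(round(y)), 0), h - 1)
        seq[i] = feature_map[:, yi, xi]         # one C-dim vector per center
    return seq
```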
Step 240, inputting the feature sequence into a character recognition neural network for character recognition to obtain a recognition result of the characters in the image.
Specifically, the feature sequence is input into the character recognition neural network, and the processor 100 runs the character recognition neural network to perform character recognition, so as to obtain a recognition result of the characters in the image. Alternatively, the framework of the character recognition neural network may be an encoder-decoder structure with an attention mechanism. Optionally, the encoder of the character recognition neural network is a single-layer bidirectional long short-term memory (LSTM) network, which can capture long-term dependencies and maintain an overall perception of the serialized pooled features. This exploits a global understanding of the image to compensate for missing information when individual characters are blurred or occluded. Optionally, the hidden state size of each direction of the single-layer bidirectional LSTM is set to 256, and the hidden states of the two directions are concatenated. The decoder of the character recognition neural network is a unidirectional LSTM whose hidden state size may be 512; the initial hidden state is set equal to the last hidden state of the encoder. Optionally, the decoder is equipped with an attention mechanism, i.e. the input at each time step is a concatenation of the low-dimensional real-valued character embedding of the last output symbol and the average of the encoder hidden states weighted by the attention scores.
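As a rough sketch of this encoder-decoder structure (single-layer bidirectional LSTM encoder with hidden size 256 per direction, unidirectional LSTM decoder with hidden size 512, and a decoder input that concatenates the embedding of the last output symbol with an attention-weighted average of encoder states), the following minimal PyTorch example may help; the vocabulary size, embedding width, additive attention form, start symbol, and greedy decoding are all assumptions made for this example.

```python
import torch
import torch.nn as nn

class AttnRecognizer(nn.Module):
    """Sketch of the described recognizer head."""
    def __init__(self, c=256, vocab=64, emb=64):
        super().__init__()
        self.encoder = nn.LSTM(c, 256, bidirectional=True, batch_first=True)
        self.embed = nn.Embedding(vocab, emb)
        self.decoder = nn.LSTMCell(emb + 512, 512)
        self.attn = nn.Linear(512 + 512, 1)  # scores decoder state vs. encoder step
        self.out = nn.Linear(512, vocab)

    def forward(self, feats, max_len=25):
        enc, _ = self.encoder(feats)      # (N, M, 512): both directions joined
        h = enc[:, -1]                    # init decoder from last encoder state
        cell = torch.zeros_like(h)
        sym = torch.zeros(feats.size(0), dtype=torch.long)  # start symbol (id 0)
        outputs = []
        for _ in range(max_len):
            # attention score of the current decoder state against each step
            score = self.attn(torch.cat(
                [h.unsqueeze(1).expand_as(enc), enc], dim=-1)).softmax(dim=1)
            ctx = (score * enc).sum(dim=1)          # attention-weighted average
            h, cell = self.decoder(
                torch.cat([self.embed(sym), ctx], dim=-1), (h, cell))
            logits = self.out(h)
            outputs.append(logits)
            sym = logits.argmax(dim=-1)             # greedy decoding
        return torch.stack(outputs, dim=1)          # (N, max_len, vocab)
```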
In this character recognition method, the image is first input into the feature extraction neural network to obtain a shared feature map; an ordered list of character center points is then obtained from the shared feature map using the character center detection neural network; sequence features are then extracted based on the ordered list of character center points and concatenated into a feature sequence; and finally character recognition is performed on the obtained feature sequence. By sampling features at character center points, the method effectively filters out features at irrelevant positions; by collecting features in the order given by the ordered list of character center points, it preserves a holistic view of the features. The method thus combines the idea of two-dimensional spatial distribution with the idea of sequence learning, and performs well in recognizing irregular text in images.
In one embodiment, as shown in fig. 5, step S220 includes:
Step S221, inputting the shared feature map into the character center detection neural network to perform character center area prediction, and acquiring a character center prediction map.
The character center area is a character box shrunk by a preset ratio, and the center prediction map indicates the probability that each pixel belongs to a character center area. Specifically, the shared feature map is input into the character center detection neural network, and the processor 100 runs the character center detection neural network to perform character center area prediction, thereby obtaining a character center prediction map. With continued reference to fig. 3, the character center detection neural network comprises a double-layer convolutional neural network, in which the first network layer is a filter layer and the second network layer is essentially a pixel-level classification layer. Alternatively, the kernel size of the first network layer may be 3, with 256 filters in total.
Alternatively, the size of the character center prediction map obtained in step S221 is the same as the size of the shared feature map obtained in step S210, which simplifies subsequent data processing. Optionally, the size of a character center area is 1/4 of the corresponding character box.
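A minimal sketch of such a two-layer detection head is given below; the ReLU and sigmoid activations, and the assumption that the shared feature map has 256 channels, are illustrative choices rather than details stated in the application.

```python
import torch.nn as nn

# Two-layer character center detection head: a 3x3 convolution with
# 256 filters, followed by a pixel-level classification layer that
# outputs a per-pixel center-area probability map.
center_head = nn.Sequential(
    nn.Conv2d(256, 256, kernel_size=3, padding=1),  # first layer: 256 filters, kernel size 3
    nn.ReLU(),
    nn.Conv2d(256, 1, kernel_size=1),               # second layer: pixel classifier
    nn.Sigmoid(),                                   # probability per pixel
)
```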
Step S222, obtaining an ordered list of character center points according to the character center prediction map.
Specifically, the processor 100 obtains the ordered list of character center points according to the character center prediction map, which indicates the probability that each pixel belongs to a character center area. More specifically, referring to fig. 4, the center area of each character is first determined according to the center prediction map; the center point of each character is then determined according to its center area; and finally the character center points are connected in series according to the coordinates of each center point, so as to obtain the ordered list of character center points.
According to this method, the character center detection neural network is first used to predict the character center areas and obtain a character center prediction map, and the ordered list of character center points is then obtained according to that map. The data processing is therefore finer-grained, and the character recognition result obtained based on the ordered list of character center points is more accurate.
In one embodiment, as shown in fig. 6, step S222 includes:
step S2221, using the pixel in the character center prediction graph whose probability is greater than or equal to a preset threshold as a positive sample. And meanwhile, taking the pixel with the probability smaller than the preset threshold value in the central prediction image as a negative sample. Optionally, the preset threshold is a value set empirically or obtained through a plurality of experimental summaries. Optionally, the preset threshold may be 0.5.
Step S2222, the positive samples adjacent to each other in the spatial position are grouped together to obtain a character central area.
Step S2223, obtaining the coordinates of the character center point of each character center area, and arranging the coordinates of the character center points according to a preset rule to obtain an ordered list of the character center points.
Specifically, step S2223 may include: first constructing a coordinate system based on the edge lines of the character center prediction map; then obtaining the coordinates, in this coordinate system, of each pixel in each character center area; then obtaining the coordinates of the character center point corresponding to each character center area from the coordinates of its pixels; and finally arranging the coordinates of the character center points according to a preset rule to obtain the ordered list of character center points.
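For illustration, steps S2221 to S2223 can be sketched as follows; the use of connected-component labeling for the spatial grouping and the left-to-right ordering rule are assumptions made for this example (the application specifies only "a preset rule").

```python
import numpy as np
from scipy import ndimage

def ordered_center_points(center_prob, thresh=0.5):
    """Turn a character-center probability map into an ordered list
    of (x, y) center coordinates."""
    positive = np.asarray(center_prob) >= thresh     # S2221: positive samples
    labels, num = ndimage.label(positive)            # S2222: group adjacent pixels
    coms = ndimage.center_of_mass(positive, labels, range(1, num + 1))
    centers = [(x, y) for (y, x) in coms]            # (row, col) -> (x, y)
    return sorted(centers, key=lambda p: p[0])       # S2223: order left to right
```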
Since the size of a character center area is smaller than the corresponding character box (generally 0.25 times the character box), the method of this embodiment separates the features of each character, and then concatenates the separated features according to the coordinates (spatial information) of each character center point. The method therefore preserves the overall visual features of the text in the image while effectively separating the features of individual characters, which benefits the subsequent recognition of each character in the image.
In one embodiment, as shown in fig. 7, step S230 includes:
and S231, inputting the shared characteristic graph into a shared characteristic coding neural network for shared characteristic coding to obtain a characteristic sampling graph.
Specifically, the shared feature map is input into the shared feature encoding neural network, and the processor 100 runs the shared feature encoding neural network to further encode the shared features, thereby obtaining a feature sampling map.
Step S232, sampling the feature sampling map based on the ordered list of character center points to obtain sequence features.
Specifically, the processor 100 samples the feature sampling map based on the ordered list of character center points to obtain the sequence features. More specifically, the processor 100 first selects sampling points according to the character center points in the ordered list, and then uniformly samples the feature sampling map at the coordinates of each character center point to obtain the sequence features. Optionally, if the coordinates of a sampling point are not integer values, the features of the sampling point are acquired by bilinear interpolation, and the acquired features are taken as the sequence features.
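A minimal sketch of bilinear interpolation at a non-integer sampling point is given below; the clamping at the map borders is an assumption made for this example.

```python
import torch

def bilinear_sample(feature_map, x, y):
    """Bilinearly interpolate a C x H x W feature map at a
    non-integer point (x, y), returning a C-dim feature vector."""
    c, h, w = feature_map.shape
    x0 = min(max(int(x), 0), w - 1)          # top-left grid neighbor
    y0 = min(max(int(y), 0), h - 1)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    dx, dy = x - x0, y - y0                  # fractional offsets
    top = (1 - dx) * feature_map[:, y0, x0] + dx * feature_map[:, y0, x1]
    bot = (1 - dx) * feature_map[:, y1, x0] + dx * feature_map[:, y1, x1]
    return (1 - dy) * top + dy * bot
```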
Step S233, concatenating the sequence features in order to obtain a feature sequence.
Specifically, the processor 100 concatenates the sequence features in order to obtain the feature sequence.
The feature sequence obtained from the sequence features acquired in this embodiment yields better character recognition results in the subsequent recognition process. It should be understood that although the steps in the flowcharts of figs. 2 and 5-7 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise herein, the order of these steps is not strictly limited, and they may be performed in other orders. Moreover, at least some of the steps in figs. 2 and 5-7 may include multiple sub-steps or stages that are not necessarily performed at the same moment but may be performed at different moments, and these sub-steps or stages are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 8, there is provided a character recognition apparatus including:
and the shared feature extraction module 810 is configured to input the image into a feature extraction neural network to perform feature extraction, so as to obtain a shared feature map.
The character pooling module 820 is configured to obtain an ordered list of character center points according to the shared feature map and the character center detection neural network; extract sequence features from the shared feature map based on the ordered list of character center points, and concatenate the sequence features in order to obtain a feature sequence; wherein the ordered list of character center points includes the coordinates of at least one predicted character center point.
The character recognition module 830 is configured to input the feature sequence into a character recognition neural network for character recognition, so as to obtain a recognition result of the characters in the image.
In one embodiment, the character pooling module 820 is specifically configured to input the shared feature map into the character center detection neural network to perform character center area prediction, so as to obtain a character center prediction map, where a character center area is a character box shrunk by a preset ratio and the character center prediction map indicates the probability that each pixel belongs to a character center area; and to obtain an ordered list of character center points according to the center prediction map, where the ordered list of character center points comprises the coordinates of at least one predicted character center point.
In one embodiment, the character pooling module 820 is specifically configured to take pixels in the center prediction map whose probability is greater than or equal to a preset threshold as positive samples; aggregate positive samples that are spatially adjacent into groups to obtain a plurality of character center areas; and acquire the coordinates of the character center point of each character center area and arrange the coordinates according to a preset rule to obtain the ordered list of character center points.
In one embodiment, the character pooling module 820 is specifically configured to input the shared feature map into a shared feature encoding neural network for shared feature encoding to obtain a feature sampling map; sample the feature sampling map based on the ordered list of character center points to obtain sequence features; and concatenate the sequence features in order to obtain the feature sequence.
In one embodiment, the character pooling module 820 is specifically configured to select sampling points according to the character center points in the ordered list of character center points; and, if the coordinates of a sampling point are not integer values, acquire the features of the sampling point by bilinear interpolation and take the acquired features as the sequence features.
For the specific limitations of the character recognition device, reference may be made to the limitations of the character recognition method above, which are not repeated here. Each module in the character recognition device can be implemented wholly or partially by software, hardware, or a combination thereof. The modules can be embedded, in hardware form, in or independently of a processor in the computer device, or stored, in software form, in a memory in the computer device, so that the processor can invoke and perform the operations corresponding to each module.
In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 9. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a method of word recognition. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
Those skilled in the art will appreciate that the architecture shown in fig. 9 is merely a block diagram of part of the structure related to the solution of the present application and does not limit the computer devices to which the solution applies; a particular computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program: inputting the image into a feature extraction neural network for feature extraction to obtain a shared feature map; acquiring an ordered list of character center points according to the shared feature map and a character center detection neural network, wherein the ordered list of character center points comprises the coordinates of at least one predicted character center point; extracting sequence features from the shared feature map based on the ordered list of character center points, and concatenating the sequence features in order to obtain a feature sequence; and inputting the feature sequence into a character recognition neural network for character recognition to obtain a recognition result of the characters in the image.
In one embodiment, the processor, when executing the computer program, specifically implements the following steps: inputting the shared feature map into the character center detection neural network to predict character center areas, and acquiring a character center prediction map, wherein a character center area is a character box shrunk by a preset ratio, and the character center prediction map indicates the probability that each pixel belongs to a character center area; and obtaining the ordered list of character center points according to the center prediction map, wherein the ordered list of character center points comprises the coordinates of at least one predicted character center point.
In one embodiment, the processor, when executing the computer program, specifically implements the following steps: taking pixels in the center prediction map whose probability is greater than or equal to a preset threshold as positive samples; aggregating positive samples that are spatially adjacent into groups to obtain a plurality of character center areas; and acquiring the coordinates of the character center point of each character center area, and arranging the coordinates of the character center points according to a preset rule to obtain the ordered list of character center points.
In one embodiment, the processor, when executing the computer program, specifically implements the following steps: inputting the shared feature map into a shared feature encoding neural network for shared feature encoding to obtain a feature sampling map; sampling the feature sampling map based on the ordered list of character center points to obtain sequence features; and concatenating the sequence features in order to obtain the feature sequence.
In one embodiment, the processor, when executing the computer program, specifically implements the following steps: selecting sampling points according to the character center points in the ordered list of character center points; and if the coordinates of a sampling point are not integer values, acquiring the features of the sampling point by bilinear interpolation, and taking the acquired features of the sampling points as the sequence features.
In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon which, when executed by a processor, implements the following steps: inputting the image into a feature extraction neural network for feature extraction to obtain a shared feature map; acquiring an ordered list of character center points according to the shared feature map and a character center detection neural network, wherein the ordered list of character center points comprises the coordinates of at least one predicted character center point; extracting sequence features from the shared feature map based on the ordered list of character center points, and concatenating the sequence features in order to obtain a feature sequence; and inputting the feature sequence into a character recognition neural network for character recognition to obtain a recognition result of the characters in the image.
In one embodiment, the computer program, when executed by the processor, specifically implements the following steps: inputting the shared feature map into the character center detection neural network to predict character center areas, and acquiring a character center prediction map, wherein a character center area is a character box shrunk by a preset ratio, and the character center prediction map indicates the probability that each pixel belongs to a character center area; and obtaining the ordered list of character center points according to the center prediction map, wherein the ordered list of character center points comprises the coordinates of at least one predicted character center point.
In one embodiment, the computer program, when executed by the processor, specifically implements the following steps: taking pixels in the center prediction map whose probability is greater than or equal to a preset threshold as positive samples; aggregating positive samples that are spatially adjacent into groups to obtain a plurality of character center areas; and acquiring the coordinates of the character center point of each character center area, and arranging the coordinates of the character center points according to a preset rule to obtain the ordered list of character center points.
In one embodiment, the computer program, when executed by the processor, specifically implements the following steps: inputting the shared feature map into a shared feature encoding neural network for shared feature encoding to obtain a feature sampling map; sampling the feature sampling map based on the ordered list of character center points to obtain sequence features; and concatenating the sequence features in order to obtain the feature sequence.
In one embodiment, the computer program, when executed by the processor, specifically implements the following steps: selecting sampling points according to the character center points in the ordered list of character center points; and if the coordinates of a sampling point are not integer values, acquiring the features of the sampling point by bilinear interpolation, and taking the acquired features of the sampling points as the sequence features.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, a database, or another medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered within the scope of this specification.
The above embodiments merely express several implementations of the present application, and their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the invention. It should be noted that those skilled in the art can make several variations and improvements without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (11)

1. A method for recognizing a character, the method comprising:
inputting the image into a feature extraction neural network for feature extraction to obtain a shared feature map;
acquiring an ordered list of character center points according to the shared feature map and a character center detection neural network, wherein the ordered list of character center points comprises the coordinates of at least one predicted character center point;
extracting sequence features from the shared feature map based on the ordered list of character center points, and concatenating the sequence features in order to obtain a feature sequence;
inputting the feature sequence into a character recognition neural network for character recognition to obtain a recognition result of the characters in the image;
the character center detection neural network comprises a double-layer convolutional neural network, wherein a first network layer of the double-layer convolutional neural network is a filter, and a second network layer of the double-layer convolutional neural network is a pixel classification layer;
wherein acquiring the ordered list of character center points according to the shared feature map and the character center detection neural network comprises:
inputting the shared feature map into the character center detection neural network to perform character center prediction to obtain a plurality of character center areas;
and acquiring the coordinates of the character center point of each character center area, and obtaining the ordered list of character center points according to the coordinates of the character center points.
2. The method of claim 1, wherein acquiring the ordered list of character center points according to the shared feature map and the character center detection neural network comprises:
inputting the shared feature map into the character center detection neural network to predict character center areas, and acquiring a character center prediction map, wherein a character center area is a character box shrunk by a preset ratio, and the character center prediction map indicates the probability that each pixel belongs to a character center area;
and obtaining the ordered list of character center points according to the center prediction map, wherein the ordered list of character center points comprises the coordinates of at least one predicted character center point.
3. The method of claim 2, wherein obtaining the ordered list of character center points according to the center prediction map comprises:
taking pixels in the center prediction map whose probability is greater than or equal to a preset threshold as positive samples;
aggregating positive samples that are spatially adjacent into groups to obtain a plurality of character center areas;
and acquiring the coordinates of the character center point of each character center area, and arranging the coordinates of the character center points according to a preset rule to obtain the ordered list of character center points.
4. The method of claim 1, wherein the feature extraction neural network comprises a residual network ResNet and a feature pyramid network;
the residual network ResNet is used for feature extraction; the feature pyramid network is used for fusing the features extracted by each network layer of the residual network ResNet.
5. The method of claim 4, wherein the residual network ResNet is a 50-layer residual network, and wherein the upsampling layers of the feature pyramid network use bilinear interpolation.
6. The method of claim 1, wherein extracting the sequence features from the shared feature map based on the ordered list of character center points, and concatenating the sequence features in order to obtain the feature sequence comprises:
inputting the shared feature map into a shared feature encoding neural network for shared feature encoding to obtain a feature sampling map;
sampling the feature sampling map based on the ordered list of character center points to obtain the sequence features;
and concatenating the sequence features in order to obtain the feature sequence.
7. The method of claim 6, wherein sampling the feature sampling map based on the ordered list of character center points to obtain the sequence features comprises:
selecting sampling points according to the character center points in the ordered list of character center points;
and if the coordinates of a sampling point are not integer values, acquiring the features of the sampling point by bilinear interpolation, and taking the acquired features of the sampling points as the sequence features.
8. The method according to any one of claims 1 to 7, wherein the encoder of the character recognition neural network is a single-layer bidirectional long short-term memory (LSTM) network, and the decoder of the character recognition neural network is a unidirectional LSTM network.
9. A character recognition apparatus, comprising:
the shared feature extraction module is used for inputting an image into a feature extraction neural network for feature extraction to obtain a shared feature map;
the character pooling module is used for obtaining an ordered list of character center points according to the shared feature map and a character center detection neural network, extracting sequence features from the shared feature map based on the ordered list of character center points, and concatenating the sequence features in order to obtain a feature sequence, wherein the ordered list of character center points comprises the coordinates of at least one predicted character center point;
the character recognition module is used for inputting the feature sequence into a character recognition neural network for character recognition to obtain a recognition result of the characters in the image;
the character center detection neural network comprises a double-layer convolutional neural network, wherein a first network layer of the double-layer convolutional neural network is a filter, and a second network layer of the double-layer convolutional neural network is a pixel classification layer;
the character pooling module is further used for inputting the shared feature map into the character center detection neural network to perform character center prediction to obtain a plurality of character center areas; and for acquiring the coordinates of the character center point of each character center area and obtaining the ordered list of character center points according to the coordinates of the character center points.
10. A computer device comprising a memory and a processor, the memory having stored thereon a computer program operable on the processor, wherein the processor, when executing the computer program, performs the steps of the method of any of claims 1 to 8.
11. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 8.
CN201910672543.0A 2019-07-24 2019-07-24 Character recognition method and device, computer equipment and storage medium Active CN110427852B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910672543.0A CN110427852B (en) 2019-07-24 2019-07-24 Character recognition method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910672543.0A CN110427852B (en) 2019-07-24 2019-07-24 Character recognition method and device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110427852A CN110427852A (en) 2019-11-08
CN110427852B true CN110427852B (en) 2022-04-15

Family

ID=68412213

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910672543.0A Active CN110427852B (en) 2019-07-24 2019-07-24 Character recognition method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110427852B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111027613A (en) * 2019-12-04 2020-04-17 浙江省北大信息技术高等研究院 Scene character recognition method and device, storage medium and terminal
CN111027553A (en) * 2019-12-23 2020-04-17 武汉唯理科技有限公司 Character recognition method for circular seal
CN111178358A (en) * 2019-12-31 2020-05-19 上海眼控科技股份有限公司 Text recognition method and device, computer equipment and storage medium
CN111275046B (en) * 2020-01-10 2024-04-16 鼎富智能科技有限公司 Character image recognition method and device, electronic equipment and storage medium
CN111898374B (en) * 2020-07-30 2023-11-07 腾讯科技(深圳)有限公司 Text recognition method, device, storage medium and electronic equipment
CN113255668B (en) * 2021-06-22 2021-10-08 北京世纪好未来教育科技有限公司 Text recognition method and device, electronic equipment and storage medium
CN113361522B (en) * 2021-06-23 2022-05-17 北京百度网讯科技有限公司 Method and device for determining character sequence and electronic equipment
CN114842487B (en) * 2021-12-09 2023-11-03 上海鹑火信息技术有限公司 Identification method and system for salomile characters
CN114418001B (en) * 2022-01-20 2023-05-12 北方工业大学 Character recognition method and system based on parameter reconstruction network
CN114708580B (en) * 2022-04-08 2024-04-16 北京百度网讯科技有限公司 Text recognition method, text recognition model training method, text recognition device, model training device, text recognition program, model training program, and computer-readable storage medium

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9245191B2 (en) * 2013-09-05 2016-01-26 Ebay, Inc. System and method for scene text recognition
CN105740909A (en) * 2016-02-02 2016-07-06 华中科技大学 Text recognition method under natural scene on the basis of spatial transformation
CN106557768A (en) * 2016-11-25 2017-04-05 北京小米移动软件有限公司 The method and device is identified by word in picture
CN107977620A (en) * 2017-11-29 2018-05-01 华中科技大学 A kind of multi-direction scene text single detection method based on full convolutional network
CN108304835A (en) * 2018-01-30 2018-07-20 百度在线网络技术(北京)有限公司 character detecting method and device
CN108399419A (en) * 2018-01-25 2018-08-14 华南理工大学 Chinese text recognition methods in natural scene image based on two-dimentional Recursive Networks
CN108549893A (en) * 2018-04-04 2018-09-18 华中科技大学 A kind of end-to-end recognition methods of the scene text of arbitrary shape
CN109389091A (en) * 2018-10-22 2019-02-26 重庆邮电大学 The character identification system and method combined based on neural network and attention mechanism
CN109492638A (en) * 2018-11-07 2019-03-19 北京旷视科技有限公司 Method for text detection, device and electronic equipment
CN109829437A (en) * 2019-02-01 2019-05-31 北京旷视科技有限公司 Image processing method, text recognition method, device and electronic system
CN110008950A (en) * 2019-03-13 2019-07-12 南京大学 The method of text detection in the natural scene of a kind of pair of shape robust

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ASTER: An Attentional Scene Text Recognizer with Flexible Rectification; Baoguang Shi et al; IEEE Transactions on Pattern Analysis and Machine Intelligence; 2018-06-25; pp. 2035-2048 *
TextField: Learning a Deep Direction Field for Irregular Scene Text Detection; Yongchao Xu et al; IEEE Transactions on Image Processing; 2019-02-21; pp. 5566-5579 *
A Non-Straight-Line Text Detection and Recognition Method; Wu Hao; China Masters' Theses Full-text Database, Information Science and Technology; 2019-02-15; pp. I138-2214 *

Also Published As

Publication number Publication date
CN110427852A (en) 2019-11-08

Similar Documents

Publication Publication Date Title
CN110427852B (en) Character recognition method and device, computer equipment and storage medium
CN109886077B (en) Image recognition method and device, computer equipment and storage medium
CN108399052B (en) Picture compression method and device, computer equipment and storage medium
CN109472209B (en) Image recognition method, device and storage medium
CN110020582B (en) Face emotion recognition method, device, equipment and medium based on deep learning
CN110287836B (en) Image classification method and device, computer equipment and storage medium
CN112862828B (en) Semantic segmentation method, model training method and device
CN110046577B (en) Pedestrian attribute prediction method, device, computer equipment and storage medium
CN110390254B (en) Character analysis method and device based on human face, computer equipment and storage medium
CN111797834B (en) Text recognition method and device, computer equipment and storage medium
CN113496208B (en) Video scene classification method and device, storage medium and terminal
WO2023174098A1 (en) Real-time gesture detection method and apparatus
CN109543685A (en) Image, semantic dividing method, device and computer equipment
CN112633159A (en) Human-object interaction relation recognition method, model training method and corresponding device
CN111401322A (en) Station entering and exiting identification method and device, terminal and storage medium
CN112232140A (en) Crowd counting method and device, electronic equipment and computer storage medium
CN114519877A (en) Face recognition method, face recognition device, computer equipment and storage medium
CN110807463B (en) Image segmentation method and device, computer equipment and storage medium
CN116152226A (en) Method for detecting defects of image on inner side of commutator based on fusible feature pyramid
CN110222752B (en) Image processing method, system, computer device, storage medium and chip
CN111353442A (en) Image processing method, device, equipment and storage medium
CN111353429A (en) Interest degree method and system based on eyeball turning
CN107886093B (en) Character detection method, system, equipment and computer storage medium
CN108875611B (en) Video motion recognition method and device
CN111382638A (en) Image detection method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant