CN112419159A - Character image super-resolution reconstruction system and method - Google Patents

Character image super-resolution reconstruction system and method

Info

Publication number
CN112419159A
Authority
CN
China
Prior art keywords
character
resolution
image
super
loss function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011417305.4A
Other languages
Chinese (zh)
Inventor
张晓东
张月
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Internet Software Group Co ltd
Original Assignee
Shanghai Internet Software Group Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Internet Software Group Co ltd
Priority to CN202011417305.4A
Publication of CN112419159A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformation in the plane of the image
    • G06T3/40 Scaling the whole image or part thereof
    • G06T3/4053 Super resolution, i.e. output image resolution higher than sensor resolution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components

Abstract

The invention discloses a system and a method for super-resolution reconstruction of character images. The method comprises the following steps: a feature extraction module extracts a set feature layer corresponding to the character image to be processed; the feature layer is input into a super-resolution image reconstruction module, which up-samples the feature layer and performs feature extraction on the up-sampled feature layer to obtain a reconstructed super-resolution character image; the feature layer is input into a character recognition module, which down-samples the feature layer, extracts time sequence features from the down-sampled feature layer, and performs character recognition on the extracted time sequence features to obtain the character content of the character image to be processed; and the feature layer is input into a super-resolution gradient map reconstruction module, which up-samples the feature layer and performs feature extraction on the up-sampled feature layer to obtain a reconstructed super-resolution gradient map. The multitask character image super-resolution reconstruction system and method can improve the definition and the reliability of the reconstructed character image.

Description

Character image super-resolution reconstruction system and method
Technical Field
The invention belongs to the technical field of image processing, relates to an image processing system, and particularly relates to a system and a method for reconstructing super-resolution of a character image.
Background
A deep neural network is a complex mathematical model. Input data pass through the network to produce corresponding output data, and a loss function is constructed from the difference between the output data and the labeled data. The gradient of the loss function with respect to the network parameters is calculated, the parameters are updated through gradient back propagation, and repeated updates continuously reduce the difference between the output data and the labeled data. The input data and the labeled data together form the training data required for deep neural network training, and the performance of a deep neural network depends on both the network structure and the training data. Deep neural networks have achieved performance superior to that of traditional methods in fields such as image, speech and natural language processing, and are widely applied.
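The training loop described above can be illustrated with a minimal sketch. It fits a single parameter by gradient descent on a squared-error loss; the toy data, the learning rate and the step count are illustrative assumptions and are not taken from the patent.

```python
def train(inputs, labels, lr=0.05, steps=200):
    """Fit y = w * x by gradient descent on the mean squared error."""
    w = 0.0  # the single trainable parameter
    for _ in range(steps):
        # Gradient of mean((w * x - y) ** 2) with respect to w.
        grad = sum(2.0 * (w * x - y) * x for x, y in zip(inputs, labels)) / len(inputs)
        w -= lr * grad  # parameter update driven by the loss gradient
    return w


# Labeled data generated by y = 2 * x; training should recover w close to 2.
w = train([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(round(w, 3))
```

Each update moves the parameter against the gradient, so the difference between output and labeled data shrinks step by step.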
The image super-resolution reconstruction means reconstructing a corresponding high-resolution image from an observed low-resolution image. With the rapid development of the deep learning technology, the image super-resolution reconstruction method based on the deep neural network is the image super-resolution reconstruction method with the optimal performance at present.
An image super-resolution reconstruction method based on a deep neural network generally comprises two modules, as shown in fig. 1: a feature extraction module 21 and a super-resolution image reconstruction module 31, which together produce the reconstructed super-resolution character image 41. During training, an image loss function 51 is calculated between the reconstructed super-resolution character image 41 and the high-resolution image corresponding to the character image 11 to be processed; the image training gradient is propagated backward based on the image loss function 51, and the parameters of the feature extraction module 21 and the super-resolution image reconstruction module 31 are updated, so that the feature extraction module 21 can extract the image information of the character image 11 to be processed. Existing image super-resolution reconstruction methods based on deep neural networks perform well in natural image reconstruction. When such a method is used directly for character image super-resolution reconstruction, however, the reconstructed super-resolution character image suffers from blurred character edges and low reliability:
compared with a natural image, a character image contains a large amount of gradient information; when an existing image super-resolution reconstruction method is used directly for character image super-resolution reconstruction, this gradient information cannot be fully utilized, so that the character edges of the reconstructed super-resolution character image are blurred;
super-resolution reconstruction is by nature an ill-posed problem, that is, a single low-resolution image usually corresponds to many possible high-resolution images; this ill-posedness may change the character content of the reconstructed super-resolution character image, so that the reconstructed image has low reliability.
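The ill-posedness can be made concrete with a small sketch. It assumes that the low-resolution image is produced by 2x2 average pooling, which is an illustrative degradation model and not one stated in the patent; under that assumption two different stroke patterns collapse to the same low-resolution pixel.

```python
def downsample_2x(image):
    """Average each non-overlapping 2x2 block of a 2D list of numbers."""
    rows, cols = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1] + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


# Two different high-resolution patches: a vertical and a horizontal stroke.
vertical = [[1.0, 0.0],
            [1.0, 0.0]]
horizontal = [[1.0, 1.0],
              [0.0, 0.0]]

low_a = downsample_2x(vertical)
low_b = downsample_2x(horizontal)
print(low_a, low_b)  # both patches give the same low-resolution image
```

Because the low-resolution observation cannot distinguish the two strokes, a reconstruction network needs additional constraints to choose the right one.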
In view of the above, there is an urgent need to design a new text image reconstruction method to overcome at least some of the above-mentioned defects of the existing text image reconstruction methods.
Disclosure of Invention
The invention provides a character image super-resolution reconstruction system and method, which can reconstruct a character image with reduced resolution into a character image with super-resolution and provide a clear and credible image for high-level tasks such as character detection and identification.
In order to solve the technical problem, according to one aspect of the present invention, the following technical solutions are adopted:
a text image super-resolution reconstruction system, the system comprising:
the characteristic extraction module is used for extracting a set characteristic layer corresponding to the image to be processed;
the super-resolution image reconstruction module is connected with the feature extraction module and is used for up-sampling the feature layer and extracting features of the up-sampled feature layer to obtain a reconstructed super-resolution character image;
the character recognition module is connected with the feature extraction module and used for down-sampling the feature layer, extracting time sequence features of the down-sampled feature layer and recognizing characters of the extracted time sequence features to obtain character contents in the character image to be processed;
and the super-resolution gradient map reconstruction module is connected with the feature extraction module and is used for up-sampling the feature layer and extracting features of the up-sampled feature layer to obtain a reconstructed super-resolution gradient map.
As an embodiment of the present invention, the system further includes:
the image loss function acquisition module is used for calculating an image loss function according to the super-resolution character image acquired by the super-resolution image reconstruction module;
the character loss function acquisition module is used for calculating a character loss function according to the character content acquired by the character recognition module;
the gradient loss function acquisition module is used for calculating a gradient loss function according to the super-resolution gradient map acquired by the super-resolution gradient map reconstruction module;
a loss function fusion module, configured to fuse the image loss function obtained by the image loss function acquisition module, the character loss function obtained by the character loss function acquisition module, and the gradient loss function obtained by the gradient loss function acquisition module to obtain a fusion loss function; the multitask character image super-resolution reconstruction network is trained using the fusion loss function.
As an embodiment of the present invention, the feature extraction module is configured to obtain an advanced feature layer of a text image to be processed, where the advanced feature layer includes deep feature information of the text image to be processed;
the super-resolution image reconstruction module is used for carrying out up-sampling on the advanced feature layer by a deep neural network, carrying out feature extraction on the feature layer after up-sampling and obtaining features output by each layer of the deep neural network; determining the characteristics output by the last layer of deep neural network as a reconstructed super-resolution character image;
the character recognition module is used for down-sampling the advanced feature layer with a deep neural network comprising a pooling layer, so that the height of the down-sampled feature layer is a set value; sending the down-sampled feature layer into a bidirectional LSTM network to extract time sequence features, and outputting the time sequence features of the character image to be processed; further extracting features from the time sequence features through a fully connected layer and a softmax function, and determining the features of the last layer as the character content of the character image to be processed;
the super-resolution gradient map reconstruction module is used for performing up-sampling on the advanced feature layer by a deep neural network, performing feature extraction on the up-sampled feature layer and obtaining features output by each layer of the deep neural network; and determining the characteristics output by the last layer of deep neural network as a reconstructed super-resolution gradient map.
As an embodiment of the present invention, the image loss function obtaining module is configured to back-propagate the calculated image loss function to the feature extraction module through an image training gradient, so that the high-level feature layer extracted by the feature extraction module contains rich image information and the super-resolution character image reconstructed by the super-resolution image reconstruction module is more realistic;
the character loss function acquisition module is used for back-propagating the calculated character loss function to the feature extraction module through a character training gradient, so that the feature layer extracted by the feature extraction module contains rich character information, the character content of the super-resolution character image reconstructed by the super-resolution image reconstruction module is more accurate, and the reliability of the reconstructed super-resolution character image is improved;
the gradient loss function acquisition module is used for back-propagating the calculated gradient loss function to the feature extraction module through a gradient training gradient, so that the high-level feature layer extracted by the feature extraction module contains rich gradient information, the character edges of the super-resolution character image reconstructed by the super-resolution image reconstruction module are clearer, and the definition of the reconstructed super-resolution character image is improved.
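The three training gradients all reach the same feature extraction module, so the update it receives is the weighted sum of the per-task gradients. The following sketch checks this numerically for a single shared parameter. The three toy loss functions and the weights are hypothetical stand-ins chosen for illustration; they are not the patent's networks or values.

```python
def image_loss(w):
    return abs(w - 1.0)          # stand-in for the image branch loss


def text_loss(w):
    return (w - 2.0) ** 2        # stand-in for the character branch loss


def gradient_loss(w):
    return abs(0.5 * w - 1.0)    # stand-in for the gradient map branch loss


# (loss, weight) pairs; the weights are illustrative assumptions.
TASKS = [(image_loss, 1.0), (text_loss, 0.1), (gradient_loss, 0.5)]


def fusion_loss(w):
    """Weighted sum of the three task losses for shared parameter w."""
    return sum(weight * loss(w) for loss, weight in TASKS)


def numeric_grad(f, w, eps=1e-6):
    """Central finite-difference estimate of df/dw."""
    return (f(w + eps) - f(w - eps)) / (2.0 * eps)


w = 0.3
grad_of_fusion = numeric_grad(fusion_loss, w)
sum_of_task_grads = sum(weight * numeric_grad(loss, w) for loss, weight in TASKS)
print(grad_of_fusion, sum_of_task_grads)
```

The two printed values agree, which is why training on one fused loss lets image, character and gradient information all shape the shared feature layer.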
As an embodiment of the present invention, the image loss function obtaining module is configured to calculate an image loss function, specifically: calculating the L1 loss between the reconstructed super-resolution character image and the high-resolution character image corresponding to the character image to be processed, so that the pixel values of the reconstructed super-resolution character image approach those of the corresponding high-resolution character image;
the text loss function obtaining module is used for calculating a text loss function, specifically: calculating the CTC loss between the character content of the character image to be processed acquired by the character recognition module and the corresponding labeled character content, so that the character content recognized by the character recognition module is more accurate;
the gradient loss function obtaining module is configured to calculate a gradient loss function, specifically: calculating, through a Sobel operator, a gradient map of the high-resolution character image corresponding to the character image to be processed to obtain a target gradient map; and calculating the L1 loss between the target gradient map and the reconstructed super-resolution gradient map, so that the pixel values of the reconstructed super-resolution gradient map approach those of the target gradient map;
and the loss function fusion module performs weighted summation on the image loss function, the character loss function and the gradient loss function to obtain a fusion loss function.
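The four loss functions of this embodiment can be written compactly as follows. The notation is an assumption introduced here for illustration: \hat{I} is the reconstructed super-resolution character image, I the corresponding high-resolution character image, N the number of pixels, x the character image to be processed, y its labeled character content, \hat{G} the reconstructed super-resolution gradient map, S(\cdot) the Sobel operator, and the \lambda terms the fusion weights. The patent does not state whether the L1 loss is averaged or summed, nor does it give weight values; the mean is assumed below.

```latex
\begin{aligned}
\mathcal{L}_{\text{img}}    &= \frac{1}{N}\sum_{i=1}^{N}\bigl|\hat{I}_i - I_i\bigr| \\
\mathcal{L}_{\text{text}}   &= -\log p_{\text{CTC}}\bigl(y \mid x\bigr) \\
\mathcal{L}_{\text{grad}}   &= \frac{1}{N}\sum_{i=1}^{N}\bigl|\hat{G}_i - S(I)_i\bigr| \\
\mathcal{L}_{\text{fusion}} &= \lambda_{\text{img}}\,\mathcal{L}_{\text{img}}
                             + \lambda_{\text{text}}\,\mathcal{L}_{\text{text}}
                             + \lambda_{\text{grad}}\,\mathcal{L}_{\text{grad}}
\end{aligned}
```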
According to one aspect of the invention, the following technical scheme is adopted: a text image super-resolution reconstruction method comprises the following steps:
the feature extraction module extracts a set feature layer corresponding to the image to be processed;
inputting the characteristic layer into a super-resolution image reconstruction module, carrying out up-sampling on the characteristic layer, and carrying out feature extraction on the up-sampled characteristic layer to obtain a reconstructed super-resolution character image;
inputting the characteristic layer into a character recognition module, performing down-sampling on the characteristic layer, performing time sequence feature extraction on the feature layer after down-sampling, and performing character recognition on the extracted time sequence feature to obtain character contents in a character image to be processed;
and inputting the characteristic layer into a super-resolution gradient map reconstruction module, up-sampling the characteristic layer, and extracting the characteristics of the up-sampled characteristic layer to obtain a reconstructed super-resolution gradient map.
As an embodiment of the present invention, the method further comprises:
the image loss function acquisition module calculates an image loss function according to the super-resolution character image acquired by the super-resolution image reconstruction module;
the character loss function acquisition module calculates a character loss function according to the character content acquired by the character recognition module;
the gradient loss function acquisition module calculates a gradient loss function according to the super-resolution gradient map acquired by the super-resolution gradient map reconstruction module;
the loss function fusion module fuses the image loss function acquired by the image loss function acquisition module, the character loss function acquired by the character loss function acquisition module and the gradient loss function acquired by the gradient loss function acquisition module to acquire a fusion loss function; the multitask character image super-resolution reconstruction network is trained using the fusion loss function.
As an embodiment of the present invention, the feature extraction module obtains an advanced feature layer of the character image to be processed, where the advanced feature layer includes deep feature information of the character image to be processed;
the super-resolution image reconstruction module performs up-sampling on the advanced feature layer by a deep neural network, performs feature extraction on the up-sampled feature layer, and obtains features output by each layer of the deep neural network; determining the characteristics output by the last layer of deep neural network as a reconstructed super-resolution character image;
the character recognition module down-samples the high-level feature layer with a deep neural network comprising a pooling layer, so that the height of the down-sampled feature layer is a set value; sends the down-sampled feature layer into a bidirectional LSTM network to extract time sequence features, and outputs the time sequence features of the character image to be processed; further extracts features from the time sequence features through a fully connected layer and a softmax function, and determines the features of the last layer as the character content of the character image to be processed;
the super-resolution gradient map reconstruction module performs up-sampling on the advanced feature layer by a deep neural network, performs feature extraction on the up-sampled feature layer, and obtains features output by each layer of the deep neural network; and determining the characteristics output by the last layer of deep neural network as a reconstructed super-resolution gradient map.
As an embodiment of the present invention, the image loss function obtaining module back-propagates the calculated image loss function to the feature extraction module through an image training gradient, so that the high-level feature layer extracted by the feature extraction module contains rich image information and the super-resolution character image reconstructed by the super-resolution image reconstruction module is more realistic;
the character loss function acquisition module back-propagates the calculated character loss function to the feature extraction module through a character training gradient, so that the feature layer extracted by the feature extraction module contains rich character information, the character content of the super-resolution character image reconstructed by the super-resolution image reconstruction module is more accurate, and the reliability of the reconstructed super-resolution character image is improved;
the gradient loss function acquisition module back-propagates the calculated gradient loss function to the feature extraction module through a gradient training gradient, so that the high-level feature layer extracted by the feature extraction module contains rich gradient information, the character edges of the super-resolution character image reconstructed by the super-resolution image reconstruction module are clearer, and the definition of the reconstructed super-resolution character image is improved.
As an embodiment of the present invention, the image loss function obtaining module calculates an image loss function, specifically: calculating the L1 loss between the reconstructed super-resolution character image and the high-resolution character image corresponding to the character image to be processed, so that the pixel values of the reconstructed super-resolution character image approach those of the corresponding high-resolution character image;
the character loss function obtaining module calculates a character loss function, specifically: calculating the CTC loss between the character content of the character image to be processed acquired by the character recognition module and the corresponding labeled character content, so that the character content recognized by the character recognition module is more accurate;
the gradient loss function obtaining module calculates a gradient loss function, specifically: calculating, through a Sobel operator, a gradient map of the high-resolution character image corresponding to the character image to be processed to obtain a target gradient map; and calculating the L1 loss between the target gradient map and the reconstructed super-resolution gradient map, so that the pixel values of the reconstructed super-resolution gradient map approach those of the target gradient map;
and the loss function fusion module performs weighted summation on the image loss function, the character loss function and the gradient loss function to obtain a fusion loss function.
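The target gradient map obtained through the Sobel operator can be sketched in plain Python. Using the gradient magnitude of the horizontal and vertical responses and leaving border pixels at zero are assumptions made for this illustration; the patent only states that a Sobel operator is applied to the high-resolution character image.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal derivative kernel
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical derivative kernel


def sobel_gradient_map(image):
    """Return the Sobel gradient magnitude of a 2D list; borders stay 0."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = gy = 0.0
            for dr in range(3):
                for dc in range(3):
                    pixel = image[r + dr - 1][c + dc - 1]
                    gx += SOBEL_X[dr][dc] * pixel
                    gy += SOBEL_Y[dr][dc] * pixel
            out[r][c] = math.hypot(gx, gy)
    return out


# A 4x4 patch with a vertical edge between columns 1 and 2.
patch = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
target_gradient = sobel_gradient_map(patch)
for row in target_gradient:
    print(row)
```

The response is concentrated along the stroke edge and is zero in flat regions, which is the information the gradient branch is trained to reproduce.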
The invention has the following beneficial effects: the multitask character image super-resolution reconstruction system and method reconstruct a character image of reduced resolution into a super-resolution character image; they solve the problems of blurred character edges and low character content reliability that arise when existing image super-resolution reconstruction methods based on deep neural networks are applied to character image reconstruction; and they provide a clear and credible image for high-level tasks such as semantic analysis of the character image.
Compared with the existing image super-resolution reconstruction method based on the deep neural network, the method has the following two advantages:
(1) the reconstructed super-resolution character image has clear character edges:
according to the multitask character image super-resolution reconstruction method, a super-resolution gradient map reconstruction module is added in parallel with the super-resolution image reconstruction module and a gradient loss function is calculated; when the network parameters are updated, back propagation of the gradient training gradient makes the high-level feature layer extracted by the feature extraction module contain rich gradient information, so that the character edges of the super-resolution character image reconstructed by the super-resolution image reconstruction module are clearer.
(2) The reconstructed super-resolution character image has high character content reliability:
according to the multitask character image super-resolution reconstruction method, a character recognition module is added in parallel with the super-resolution image reconstruction module and a character loss function is calculated; when the network parameters are updated, back propagation of the character training gradient makes the high-level feature layer extracted by the feature extraction module contain rich character information, so that the character content of the super-resolution character image reconstructed by the super-resolution image reconstruction module is correct and its reliability is high.
Drawings
Fig. 1 is a schematic composition diagram of a conventional text image super-resolution reconstruction system.
Fig. 2 is a schematic composition diagram of a super-resolution reconstruction system for text images according to an embodiment of the present invention.
Fig. 3 is a schematic composition diagram of a super-resolution reconstruction system for text images according to an embodiment of the present invention.
Fig. 4 is a schematic composition diagram of a super-resolution reconstruction system for text images according to an embodiment of the present invention.
Detailed Description
Preferred embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
For a further understanding of the invention, reference will now be made to the preferred embodiments of the invention by way of example, and it is to be understood that the description is intended to further illustrate features and advantages of the invention, and not to limit the scope of the claims.
The description in this section covers only several exemplary embodiments, and the present invention is not limited to the scope of the embodiments described. Replacing some features of the embodiments with the same or similar prior art means also falls within the scope of disclosure and protection of the present invention.
The steps in the embodiments in the specification are only expressed for convenience of description, and the implementation manner of the present application is not limited by the order of implementation of the steps. The term "connected" in the specification includes both direct connection and indirect connection.
The invention discloses a character image super-resolution reconstruction system, and fig. 2 and 3 are schematic composition diagrams of the character image super-resolution reconstruction system in an embodiment of the invention; referring to fig. 2 and 3, the system includes: the system comprises a feature extraction module 1, a super-resolution image reconstruction module 2, a character recognition module 3 and a super-resolution gradient map reconstruction module 4; the feature extraction module 1 is respectively connected with the super-resolution image reconstruction module 2, the character recognition module 3 and the super-resolution gradient map reconstruction module 4.
The feature extraction module 1 is used for extracting a set feature layer corresponding to an image to be processed; the super-resolution image reconstruction module 2 is used for up-sampling the feature layer and extracting features of the up-sampled feature layer to obtain a reconstructed super-resolution character image; the character recognition module 3 is used for down-sampling the characteristic layer, extracting time sequence characteristics of the characteristic layer after down-sampling, and performing character recognition on the extracted time sequence characteristics to obtain character contents in the character image to be processed; the super-resolution gradient map reconstruction module 4 is used for up-sampling the feature layer, and extracting features of the up-sampled feature layer to obtain a reconstructed super-resolution gradient map.
In an embodiment of the invention, the feature extraction module is configured to obtain an advanced feature layer of the text image to be processed, where the advanced feature layer includes deep feature information of the text image to be processed. In one embodiment, the text image to be processed may be input to the feature extraction module of the ESRGAN generator network to obtain the advanced feature layer.
FIG. 4 is a schematic diagram illustrating a super-resolution reconstruction system for text images according to an embodiment of the present invention; referring to fig. 4, in an embodiment of the present invention, the super-resolution image reconstruction module 2 is configured to perform up-sampling on the feature layer by using a deep neural network, perform feature extraction on the up-sampled feature layer, and obtain features output by each layer of the deep neural network; and determining the characteristics output by the last layer of deep neural network as the reconstructed super-resolution character image.
The character recognition module 3 is used for down-sampling the advanced feature layer with a deep neural network comprising a pooling layer, so that the height of the down-sampled feature layer is a set value, for example 1; sending the down-sampled feature layer into a bidirectional LSTM network to extract time sequence features, and outputting the time sequence features of the character image to be processed; and further extracting features from the time sequence features through a fully connected layer and a softmax function, and determining the features of the last layer as the character content of the character image to be processed.
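Two steps of the recognition branch can be sketched without a neural network library: pooling the feature layer down to a height of 1, and turning per-time-step softmax outputs into character content. The bidirectional LSTM and the fully connected layer are omitted. Max pooling, greedy decoding with a blank class at index 0, and the two-letter alphabet are assumptions made for this illustration and are not specified in the patent.

```python
import math

BLANK = 0  # class index reserved for the CTC blank (assumption)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pool_height_to_one(feature_map):
    """Max-pool a (height x width) map over its height: one value per column."""
    return [max(column) for column in zip(*feature_map)]


def greedy_decode(step_scores, alphabet):
    """Take the most probable class per time step, merge repeats, drop blanks."""
    chars, previous = [], None
    for scores in step_scores:
        probs = softmax(scores)
        index = probs.index(max(probs))
        if index != previous and index != BLANK:
            chars.append(alphabet[index - 1])  # class k maps to alphabet[k - 1]
        previous = index
    return "".join(chars)


sequence = pool_height_to_one([[1, 5, 2], [4, 0, 3]])
step_scores = [[0, 5, 0], [0, 5, 0], [5, 0, 0], [0, 5, 0], [0, 0, 5], [0, 0, 5]]
text = greedy_decode(step_scores, alphabet="ab")
print(sequence, text)
```

With the height reduced to 1, each remaining column acts as one time step, which is what lets a sequence model read the image from left to right.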
The super-resolution gradient map reconstruction module 4 is used for performing up-sampling of the deep neural network on the high-level feature layer, performing feature extraction on the up-sampled feature layer, and obtaining features output by each layer of the deep neural network; and determining the characteristics output by the last layer of deep neural network as a reconstructed super-resolution gradient map.
As shown in fig. 4, in an embodiment of the present invention, the system further includes an image loss function obtaining module 5, a text loss function obtaining module 6, a gradient loss function obtaining module 7, and a loss function fusing module 8.
The image loss function acquisition module 5 is used for calculating an image loss function according to the super-resolution character image acquired by the super-resolution image reconstruction module; the text loss function acquisition module 6 is used for calculating a text loss function according to the text content acquired by the text recognition module; the gradient loss function acquisition module 7 is used for calculating a gradient loss function according to the super-resolution gradient map acquired by the super-resolution gradient map reconstruction module. The loss function fusion module 8 is configured to fuse the image loss function acquired by the image loss function acquisition module 5, the text loss function acquired by the text loss function acquisition module 6, and the gradient loss function acquired by the gradient loss function acquisition module 7 to acquire a fusion loss function; the multitask character image super-resolution reconstruction network is trained using the fusion loss function.
In an embodiment of the present invention, the image loss function obtaining module 5 is configured to calculate an image loss function, which specifically includes: calculating the L1 loss between the reconstructed super-resolution character image and the high-resolution character image corresponding to the character image to be processed, so that the pixel values of the reconstructed super-resolution character image approach those of the corresponding high-resolution character image.
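As a minimal sketch of the image loss, the L1 loss can be computed as the mean absolute pixel difference between two images of the same size. The source does not state whether the loss is averaged or summed, so the averaging here is an assumption, and the function name `l1_loss` is hypothetical.

```python
def l1_loss(predicted, target):
    """Mean absolute difference between two equally sized 2D images.

    Averaging over pixels is an assumption; the source only names an L1 loss.
    """
    if len(predicted) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(predicted, target)
    ):
        raise ValueError("images must have the same shape")
    total = 0.0
    count = 0
    for p_row, t_row in zip(predicted, target):
        for p, t in zip(p_row, t_row):
            total += abs(p - t)
            count += 1
    return total / count
```

The loss is zero only when every reconstructed pixel equals the corresponding high-resolution pixel, which is the sense in which training drives the two images together.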
The text loss function obtaining module 6 is configured to calculate a text loss function, which specifically includes: calculating the CTC loss between the character content of the character image to be processed acquired by the character recognition module and the corresponding labeled character content, so that the character content recognized by the character recognition module is more accurate.
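For readers unfamiliar with the CTC loss, the following sketch computes it with the standard forward (alpha) recursion. It assumes the recognition branch outputs a softmax distribution per time step, that class index 0 is the blank symbol, and that the labeled character content is given as a list of class indices; these conventions and the function name `ctc_loss` are illustrative and do not come from the source.

```python
import math

def ctc_loss(probs, target, blank=0):
    """Negative log-likelihood of `target` under CTC.

    probs[t][k] is the probability of class k at time step t;
    target is the labeled sequence as a list of class indices.
    """
    # Extended label sequence with blanks interleaved: [b, y1, b, y2, ..., b].
    ext = [blank]
    for label in target:
        ext += [label, blank]
    size = len(ext)

    # alpha[s]: total probability of all prefixes ending in ext[s] at the current step.
    alpha = [0.0] * size
    alpha[0] = probs[0][ext[0]]
    if size > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        new_alpha = [0.0] * size
        for s in range(size):
            total = alpha[s]
            if s >= 1:
                total += alpha[s - 1]
            # Skipping the blank is allowed only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new_alpha[s] = total * probs[t][ext[s]]
        alpha = new_alpha

    # Valid alignments end on the last label or on the trailing blank.
    likelihood = alpha[-1] + (alpha[-2] if size > 1 else 0.0)
    return math.inf if likelihood == 0.0 else -math.log(likelihood)
```

Production code would normally run this recursion in log space for numerical stability on long sequences; the plain-probability form is kept here for readability.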
The gradient loss function obtaining module 7 is configured to calculate a gradient loss function, which specifically includes: calculating a gradient map of the high-resolution character image corresponding to the character image to be processed through a Sobel operator to obtain a target gradient map; and calculating the L1 loss between the target gradient map and the reconstructed super-resolution gradient map, so that the pixel values of the reconstructed super-resolution gradient map approach those of the target gradient map.
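The target gradient map can be sketched as follows. The source only names the Sobel operator; combining the horizontal and vertical responses into a magnitude and replicating border pixels are assumptions made for this illustration, and `sobel_gradient_map` is a hypothetical name.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical intensity changes.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel_gradient_map(image):
    """Return the Sobel gradient magnitude of a 2D grayscale image.

    Border pixels are handled by replicating the nearest edge pixel
    (an assumption; the source does not specify border handling).
    """
    height, width = len(image), len(image[0])

    def pixel(r, c):
        r = min(max(r, 0), height - 1)
        c = min(max(c, 0), width - 1)
        return image[r][c]

    gradient = []
    for r in range(height):
        row = []
        for c in range(width):
            gx = sum(SOBEL_X[i][j] * pixel(r + i - 1, c + j - 1)
                     for i in range(3) for j in range(3))
            gy = sum(SOBEL_Y[i][j] * pixel(r + i - 1, c + j - 1)
                     for i in range(3) for j in range(3))
            row.append(math.hypot(gx, gy))
        gradient.append(row)
    return gradient
```

The gradient loss would then be the L1 loss between this target map, computed from the high-resolution character image, and the gradient map produced by the reconstruction branch. Flat regions give zero response, so the loss concentrates on character edges.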
And the loss function fusion module 8 performs weighted summation on the image loss function, the character loss function and the gradient loss function to obtain a fusion loss function.
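The weighted summation can be sketched in a few lines. The source does not give the weight values, so the default weights of 1.0 below are placeholders, and the function name `fuse_losses` is hypothetical.

```python
def fuse_losses(image_loss, text_loss, gradient_loss, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three task losses.

    `weights` holds (image, text, gradient) coefficients; the values are
    placeholders because the source does not specify them.
    """
    w_image, w_text, w_gradient = weights
    return w_image * image_loss + w_text * text_loss + w_gradient * gradient_loss
```

In practice the weights would be tuned so that no single task dominates training, for example by scaling down the CTC term if its magnitude is much larger than the two L1 terms.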
With reference to fig. 3 and 4, in an embodiment of the present invention, the image loss function obtaining module 5 is configured to back-propagate the calculated image loss function to the feature extraction module through an image training gradient, so that the high-level feature layer extracted by the feature extraction module contains abundant image information and the super-resolution character image reconstructed by the super-resolution image reconstruction module is more realistic.
The character loss function acquisition module 6 is used for back-propagating the calculated character loss function to the feature extraction module through a character training gradient, so that the feature layer extracted by the feature extraction module contains rich character information, thereby helping the character content of the super-resolution character image reconstructed by the super-resolution image reconstruction module to be more accurate and improving the reliability of the reconstructed super-resolution character image.
The gradient loss function acquisition module 7 is used for back-propagating the calculated gradient loss function to the feature extraction module through a gradient training gradient, so that the high-level feature layer extracted by the feature extraction module contains rich gradient information, thereby helping the character edges of the super-resolution character image reconstructed by the super-resolution image reconstruction module to be clearer and improving the sharpness of the reconstructed super-resolution character image.
The loss function fusion module 8 is further configured to back-propagate the fusion loss function to the image loss function obtaining module 5, the text loss function obtaining module 6, and the gradient loss function obtaining module 7.
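The effect of back-propagating the fusion loss on the shared feature extraction module can be written out explicitly. As a sketch, assume the fusion is the weighted summation of the three losses, with weights $\lambda_{img}$, $\lambda_{txt}$, $\lambda_{grad}$, and let $\theta_f$ denote the parameters of the feature extraction module. These symbols are introduced here for illustration only; the source gives neither notation nor weight values.

```latex
\begin{aligned}
\mathcal{L}_{fuse} &= \lambda_{img}\,\mathcal{L}_{img} + \lambda_{txt}\,\mathcal{L}_{txt} + \lambda_{grad}\,\mathcal{L}_{grad} \\
\frac{\partial \mathcal{L}_{fuse}}{\partial \theta_f} &= \lambda_{img}\,\frac{\partial \mathcal{L}_{img}}{\partial \theta_f} + \lambda_{txt}\,\frac{\partial \mathcal{L}_{txt}}{\partial \theta_f} + \lambda_{grad}\,\frac{\partial \mathcal{L}_{grad}}{\partial \theta_f}
\end{aligned}
```

Under this assumption, each branch contributes its own training gradient to the feature extraction module, scaled by its weight, which is why the extracted feature layer comes to carry image, character, and gradient information at the same time.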
The invention also discloses a multitask character image super-resolution reconstruction method, which can refer to fig. 4, and the method comprises the following steps:
the feature extraction module extracts a set feature layer corresponding to the image to be processed;
inputting the feature layer into a super-resolution image reconstruction module, up-sampling the feature layer, and performing feature extraction on the up-sampled feature layer to obtain a reconstructed super-resolution character image;
inputting the feature layer into a character recognition module, down-sampling the feature layer, extracting time sequence features from the down-sampled feature layer, and performing character recognition on the extracted time sequence features to obtain the character content of the character image to be processed;
and inputting the feature layer into a super-resolution gradient map reconstruction module, up-sampling the feature layer, and performing feature extraction on the up-sampled feature layer to obtain a reconstructed super-resolution gradient map.
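The steps above share one feature layer across three branches. The following sketch shows only that data flow. Every callable passed to `reconstruct` is a hypothetical stand-in for a trained sub-network, and nearest-neighbour up-sampling is used purely as an example of an up-sampling step; the source does not specify the up-sampling method.

```python
def upsample_nearest(feature_map, scale=2):
    """Nearest-neighbour up-sampling of a 2D map (an illustrative choice only)."""
    return [
        [value for value in row for _ in range(scale)]
        for row in feature_map
        for _ in range(scale)
    ]

def reconstruct(image, extract_features, sr_branch, recognition_branch, gradient_branch):
    """Run the shared feature extractor once and feed all three branches.

    All four callables are hypothetical placeholders for trained networks.
    """
    features = extract_features(image)  # shared feature layer
    return {
        "sr_image": sr_branch(features),            # reconstructed super-resolution image
        "text": recognition_branch(features),       # recognized character content
        "sr_gradient_map": gradient_branch(features),  # reconstructed gradient map
    }
```

Because the feature extractor runs once and its output is reused, any improvement that one branch induces in the shared features during training is available to the other two branches.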
In an embodiment of the present invention, the feature extraction module obtains an advanced feature layer of the text image to be processed, where the advanced feature layer includes deep feature information of the text image to be processed.
In an embodiment of the present invention, the super-resolution image reconstruction module performs up-sampling on the feature layer by using a deep neural network, and performs feature extraction on the up-sampled feature layer to obtain features output by each layer of the deep neural network; and determining the characteristics output by the last layer of deep neural network as the reconstructed super-resolution character image.
The character recognition module down-samples the advanced feature layer with a deep neural network comprising a pooling layer, so that the height of the down-sampled feature layer is a set value; sends the down-sampled feature layer into a bidirectional LSTM network to extract time sequence features, and outputs the time sequence features of the character image to be processed; and further extracts features from the time sequence features through a fully connected layer and a softmax function, and determines the features of the last layer as the character content of the character image to be processed.
The super-resolution gradient map reconstruction module performs up-sampling on the advanced feature layer by a deep neural network, performs feature extraction on the up-sampled feature layer, and obtains features output by each layer of the deep neural network; and determining the characteristics output by the last layer of deep neural network as a reconstructed super-resolution gradient map.
In an embodiment of the invention, the method further comprises a training process.
Referring to fig. 4, in an embodiment of the invention, the method further includes:
the image loss function acquisition module calculates an image loss function according to the super-resolution character image acquired by the super-resolution image reconstruction module;
the character loss function acquisition module calculates a character loss function according to the character content acquired by the character recognition module;
the gradient loss function acquisition module calculates a gradient loss function according to the super-resolution gradient map acquired by the super-resolution gradient map reconstruction module;
the loss function fusion module fuses the image loss function acquired by the image loss function acquisition module, the character loss function acquired by the character loss function acquisition module and the gradient loss function acquired by the gradient loss function acquisition module to acquire a fusion loss function, and the multitask character image super-resolution reconstruction network is trained by using the fusion loss function.
In an embodiment of the present invention, the image loss function obtaining module calculates an image loss function, which specifically includes: calculating the L1 loss between the reconstructed super-resolution character image and the high-resolution character image corresponding to the character image to be processed, so that the pixel values of the reconstructed super-resolution character image approach those of the corresponding high-resolution character image.
The character loss function obtaining module calculates a character loss function, which specifically includes: calculating the CTC loss between the character content of the character image to be processed acquired by the character recognition module and the corresponding labeled character content, so that the character content recognized by the character recognition module is more accurate.
The gradient loss function obtaining module calculates a gradient loss function, which specifically includes: calculating a gradient map of the high-resolution character image corresponding to the character image to be processed through a Sobel operator to obtain a target gradient map; and calculating the L1 loss between the target gradient map and the reconstructed super-resolution gradient map, so that the pixel values of the reconstructed super-resolution gradient map approach those of the target gradient map.
And the loss function fusion module performs weighted summation on the image loss function, the character loss function and the gradient loss function to obtain a fusion loss function.
Referring to fig. 4, in an embodiment of the invention, the method further includes:
the image loss function acquisition module back-propagates the calculated image loss function to the feature extraction module through an image training gradient, so that the high-level feature layer extracted by the feature extraction module contains rich image information and the super-resolution character image reconstructed by the super-resolution image reconstruction module is more realistic;
the character loss function acquisition module back-propagates the calculated character loss function to the feature extraction module through a character training gradient, so that the feature layer extracted by the feature extraction module contains rich character information, the character content of the super-resolution character image reconstructed by the super-resolution image reconstruction module is more accurate, and the reliability of the reconstructed super-resolution character image is improved;
the gradient loss function acquisition module back-propagates the calculated gradient loss function to the feature extraction module through a gradient training gradient, so that the high-level feature layer extracted by the feature extraction module contains rich gradient information, thereby helping the character edges of the super-resolution character image reconstructed by the super-resolution image reconstruction module to be clearer and improving the sharpness of the reconstructed super-resolution character image;
the loss function fusion module back-propagates the fusion loss function to the image loss function acquisition module, the character loss function acquisition module and the gradient loss function acquisition module.
In summary, the multitask character image super-resolution reconstruction system and the multitask character image super-resolution reconstruction method provided by the invention reconstruct the character image with reduced resolution into the super-resolution character image, solve the problems that when the existing image super-resolution reconstruction method based on the deep neural network is applied to character image reconstruction, the reconstructed super-resolution character image has fuzzy character edges and low character content reliability, and provide clear and credible images for high-level tasks such as semantic analysis of the character image.
It should be noted that the present application may be implemented in software and/or a combination of software and hardware; for example, it may be implemented using Application Specific Integrated Circuits (ASICs), general purpose computers, or any other similar hardware devices. In some embodiments, the software programs of the present application may be executed by a processor to implement the above steps or functions. As such, the software programs (including associated data structures) of the present application can be stored in a computer-readable recording medium; such as RAM memory, magnetic or optical drives or diskettes, and the like. In addition, some steps or functions of the present application may be implemented using hardware; for example, as circuitry that cooperates with the processor to perform various steps or functions.
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The description and applications of the invention herein are illustrative and are not intended to limit the scope of the invention to the embodiments described above. Effects or advantages referred to in the embodiments may not be reflected in the embodiments due to interference of various factors, and the description of the effects or advantages is not intended to limit the embodiments. Variations and modifications of the embodiments disclosed herein are possible, and alternative and equivalent various components of the embodiments will be apparent to those skilled in the art. It will be clear to those skilled in the art that the present invention may be embodied in other forms, structures, arrangements, proportions, and with other components, materials, and parts, without departing from the spirit or essential characteristics thereof. Other variations and modifications of the embodiments disclosed herein may be made without departing from the scope and spirit of the invention.

Claims (10)

1. A text image super-resolution reconstruction system, the system comprising:
the feature extraction module is used for extracting a set feature layer corresponding to the image to be processed;
the super-resolution image reconstruction module is connected with the feature extraction module and is used for up-sampling the feature layer and extracting features of the up-sampled feature layer to obtain a reconstructed super-resolution character image;
the character recognition module is connected with the feature extraction module and used for down-sampling the feature layer, extracting time sequence features of the down-sampled feature layer and recognizing characters of the extracted time sequence features to obtain character contents in the character image to be processed;
and the super-resolution gradient map reconstruction module is connected with the feature extraction module and is used for up-sampling the feature layer and extracting features of the up-sampled feature layer to obtain a reconstructed super-resolution gradient map.
2. The text image super-resolution reconstruction system according to claim 1, wherein:
the system further comprises:
the image loss function acquisition module is used for calculating an image loss function according to the super-resolution character image acquired by the super-resolution image reconstruction module;
the character loss function acquisition module is used for calculating a character loss function according to the character content acquired by the character recognition module;
the gradient loss function acquisition module is used for calculating a gradient loss function according to the super-resolution gradient map acquired by the super-resolution gradient map reconstruction module;
a loss function fusion module, configured to fuse the image loss function obtained by the image loss function obtaining module, the character loss function obtained by the character loss function obtaining module, and the gradient loss function obtained by the gradient loss function obtaining module to obtain a fusion loss function, and to train the multitask character image super-resolution reconstruction network by using the fusion loss function.
3. The text image super-resolution reconstruction system according to claim 1, wherein:
the feature extraction module is used for acquiring an advanced feature layer of the character image to be processed, and the advanced feature layer comprises deep feature information of the character image to be processed;
the super-resolution image reconstruction module is used for carrying out up-sampling on the advanced feature layer by a deep neural network, carrying out feature extraction on the feature layer after up-sampling and obtaining features output by each layer of the deep neural network; determining the characteristics output by the last layer of deep neural network as a reconstructed super-resolution character image;
the character recognition module is used for down-sampling the advanced feature layer with a deep neural network comprising a pooling layer, so that the height of the down-sampled feature layer is a set value; sending the down-sampled feature layer into a bidirectional LSTM network to extract time sequence features, and outputting the time sequence features of the character image to be processed; further extracting features from the time sequence features through a fully connected layer and a softmax function, and determining the features of the last layer as the character content of the character image to be processed;
the super-resolution gradient map reconstruction module is used for performing up-sampling on the advanced feature layer by a deep neural network, performing feature extraction on the up-sampled feature layer and obtaining features output by each layer of the deep neural network; and determining the characteristics output by the last layer of deep neural network as a reconstructed super-resolution gradient map.
4. The text image super-resolution reconstruction system according to claim 2, wherein:
the image loss function acquisition module is used for back-propagating the calculated image loss function to the feature extraction module through an image training gradient, so that the high-level feature layer extracted by the feature extraction module contains rich image information and the super-resolution character image reconstructed by the super-resolution image reconstruction module is more realistic;
the character loss function acquisition module is used for back-propagating the calculated character loss function to the feature extraction module through a character training gradient, so that the feature layer extracted by the feature extraction module contains rich character information, the character content of the super-resolution character image reconstructed by the super-resolution image reconstruction module is more accurate, and the reliability of the reconstructed super-resolution character image is improved;
the gradient loss function acquisition module is used for back-propagating the calculated gradient loss function to the feature extraction module through a gradient training gradient, so that the high-level feature layer extracted by the feature extraction module contains rich gradient information, the character edges of the super-resolution character image reconstructed by the super-resolution image reconstruction module are clearer, and the sharpness of the reconstructed super-resolution character image is improved.
5. The text image super-resolution reconstruction system according to claim 2 or 4, wherein:
the image loss function obtaining module is used for calculating an image loss function, which specifically includes: calculating the L1 loss between the reconstructed super-resolution character image and the high-resolution character image corresponding to the character image to be processed, so that the pixel values of the reconstructed super-resolution character image approach those of the corresponding high-resolution character image;
the text loss function obtaining module is used for calculating a text loss function, which specifically includes: calculating the CTC loss between the character content of the character image to be processed acquired by the character recognition module and the corresponding labeled character content, so that the character content recognized by the character recognition module is more accurate;
the gradient loss function obtaining module is used for calculating a gradient loss function, which specifically includes: calculating a gradient map of the high-resolution character image corresponding to the character image to be processed through a Sobel operator to obtain a target gradient map; and calculating the L1 loss between the target gradient map and the reconstructed super-resolution gradient map, so that the pixel values of the reconstructed super-resolution gradient map approach those of the target gradient map;
and the loss function fusion module performs weighted summation on the image loss function, the character loss function and the gradient loss function to obtain a fusion loss function.
6. A character image super-resolution reconstruction method is characterized by comprising the following steps:
the feature extraction module extracts a set feature layer corresponding to the image to be processed;
inputting the feature layer into a super-resolution image reconstruction module, up-sampling the feature layer, and performing feature extraction on the up-sampled feature layer to obtain a reconstructed super-resolution character image;
inputting the feature layer into a character recognition module, down-sampling the feature layer, extracting time sequence features from the down-sampled feature layer, and performing character recognition on the extracted time sequence features to obtain the character content of the character image to be processed;
and inputting the feature layer into a super-resolution gradient map reconstruction module, up-sampling the feature layer, and performing feature extraction on the up-sampled feature layer to obtain a reconstructed super-resolution gradient map.
7. The super-resolution reconstruction method for text images according to claim 6, wherein:
the method further comprises:
the image loss function acquisition module calculates an image loss function according to the super-resolution character image acquired by the super-resolution image reconstruction module;
the character loss function acquisition module calculates a character loss function according to the character content acquired by the character recognition module;
the gradient loss function acquisition module calculates a gradient loss function according to the super-resolution gradient map acquired by the super-resolution gradient map reconstruction module;
the loss function fusion module fuses the image loss function acquired by the image loss function acquisition module, the character loss function acquired by the character loss function acquisition module and the gradient loss function acquired by the gradient loss function acquisition module to acquire a fusion loss function, and the multitask character image super-resolution reconstruction network is trained by using the fusion loss function.
8. The super-resolution reconstruction method for text images according to claim 6, wherein:
the feature extraction module acquires an advanced feature layer of the character image to be processed, wherein the advanced feature layer comprises deep feature information of the character image to be processed;
the super-resolution image reconstruction module performs up-sampling on the advanced feature layer by a deep neural network, performs feature extraction on the up-sampled feature layer, and obtains features output by each layer of the deep neural network; determining the characteristics output by the last layer of deep neural network as a reconstructed super-resolution character image;
the character recognition module down-samples the advanced feature layer with a deep neural network comprising a pooling layer, so that the height of the down-sampled feature layer is a set value; sends the down-sampled feature layer into a bidirectional LSTM network to extract time sequence features, and outputs the time sequence features of the character image to be processed; further extracts features from the time sequence features through a fully connected layer and a softmax function, and determines the features of the last layer as the character content of the character image to be processed;
the super-resolution gradient map reconstruction module performs up-sampling on the advanced feature layer by a deep neural network, performs feature extraction on the up-sampled feature layer, and obtains features output by each layer of the deep neural network; and determining the characteristics output by the last layer of deep neural network as a reconstructed super-resolution gradient map.
9. The super-resolution reconstruction method for text images according to claim 6, wherein:
the image loss function acquisition module back-propagates the calculated image loss function to the feature extraction module through an image training gradient, so that the high-level feature layer extracted by the feature extraction module contains rich image information and the super-resolution character image reconstructed by the super-resolution image reconstruction module is more realistic;
the character loss function acquisition module back-propagates the calculated character loss function to the feature extraction module through a character training gradient, so that the feature layer extracted by the feature extraction module contains rich character information, the character content of the super-resolution character image reconstructed by the super-resolution image reconstruction module is more accurate, and the reliability of the reconstructed super-resolution character image is improved;
the gradient loss function acquisition module back-propagates the calculated gradient loss function to the feature extraction module through a gradient training gradient, so that the high-level feature layer extracted by the feature extraction module contains rich gradient information, the character edges of the super-resolution character image reconstructed by the super-resolution image reconstruction module are clearer, and the sharpness of the reconstructed super-resolution character image is improved.
10. The super-resolution reconstruction method for text images according to claim 9, wherein:
the image loss function obtaining module calculates an image loss function, which specifically includes: calculating the L1 loss between the reconstructed super-resolution character image and the high-resolution character image corresponding to the character image to be processed, so that the pixel values of the reconstructed super-resolution character image approach those of the corresponding high-resolution character image;
the character loss function obtaining module calculates a character loss function, which specifically includes: calculating the CTC loss between the character content of the character image to be processed acquired by the character recognition module and the corresponding labeled character content, so that the character content recognized by the character recognition module is more accurate;
the gradient loss function obtaining module calculates a gradient loss function, which specifically includes: calculating a gradient map of the high-resolution character image corresponding to the character image to be processed through a Sobel operator to obtain a target gradient map; and calculating the L1 loss between the target gradient map and the reconstructed super-resolution gradient map, so that the pixel values of the reconstructed super-resolution gradient map approach those of the target gradient map;
and the loss function fusion module performs weighted summation on the image loss function, the character loss function and the gradient loss function to obtain a fusion loss function.
CN202011417305.4A 2020-12-07 2020-12-07 Character image super-resolution reconstruction system and method Pending CN112419159A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011417305.4A CN112419159A (en) 2020-12-07 2020-12-07 Character image super-resolution reconstruction system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011417305.4A CN112419159A (en) 2020-12-07 2020-12-07 Character image super-resolution reconstruction system and method

Publications (1)

Publication Number Publication Date
CN112419159A true CN112419159A (en) 2021-02-26

Family

ID=74775215

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011417305.4A Pending CN112419159A (en) 2020-12-07 2020-12-07 Character image super-resolution reconstruction system and method

Country Status (1)

Country Link
CN (1) CN112419159A (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108986029A (en) * 2018-07-03 2018-12-11 南京览笛信息科技有限公司 Character image super resolution ratio reconstruction method, system, terminal device and storage medium
CN109902678A (en) * 2019-02-12 2019-06-18 北京奇艺世纪科技有限公司 Model training method, character recognition method, device, electronic equipment and computer-readable medium
CN110136063A (en) * 2019-05-13 2019-08-16 南京信息工程大学 A kind of single image super resolution ratio reconstruction method generating confrontation network based on condition
US20190370608A1 (en) * 2018-05-31 2019-12-05 Seoul National University R&Db Foundation Apparatus and method for training facial locality super resolution deep neural network
CN110633755A (en) * 2019-09-19 2019-12-31 北京市商汤科技开发有限公司 Network training method, image processing method and device and electronic equipment
CN110929726A (en) * 2020-02-11 2020-03-27 南京智莲森信息技术有限公司 Railway contact network support number plate identification method and system
US10671878B1 (en) * 2019-01-11 2020-06-02 Capital One Services, Llc Systems and methods for text localization and recognition in an image of a document
CN111402138A (en) * 2020-03-24 2020-07-10 天津城建大学 Image super-resolution reconstruction method of supervised convolutional neural network based on multi-scale feature extraction fusion
CN111553290A (en) * 2020-04-30 2020-08-18 北京市商汤科技开发有限公司 Text recognition method, device, equipment and storage medium
CN111754399A (en) * 2020-05-29 2020-10-09 清华大学 Image super-resolution method for keeping geometric structure based on gradient
CN112037131A (en) * 2020-08-31 2020-12-04 上海电力大学 Single-image super-resolution reconstruction method based on generation countermeasure network

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190370608A1 (en) * 2018-05-31 2019-12-05 Seoul National University R&Db Foundation Apparatus and method for training facial locality super resolution deep neural network
CN108986029A (en) * 2018-07-03 2018-12-11 南京览笛信息科技有限公司 Character image super resolution ratio reconstruction method, system, terminal device and storage medium
US10671878B1 (en) * 2019-01-11 2020-06-02 Capital One Services, Llc Systems and methods for text localization and recognition in an image of a document
CN109902678A (en) * 2019-02-12 2019-06-18 北京奇艺世纪科技有限公司 Model training method, character recognition method, device, electronic equipment and computer-readable medium
CN110136063A (en) * 2019-05-13 2019-08-16 南京信息工程大学 A kind of single image super resolution ratio reconstruction method generating confrontation network based on condition
CN110633755A (en) * 2019-09-19 2019-12-31 北京市商汤科技开发有限公司 Network training method, image processing method and device and electronic equipment
CN110929726A (en) * 2020-02-11 2020-03-27 南京智莲森信息技术有限公司 Railway catenary support number plate identification method and system
CN111402138A (en) * 2020-03-24 2020-07-10 天津城建大学 Image super-resolution reconstruction method of supervised convolutional neural network based on multi-scale feature extraction fusion
CN111553290A (en) * 2020-04-30 2020-08-18 北京市商汤科技开发有限公司 Text recognition method, device, equipment and storage medium
CN111754399A (en) * 2020-05-29 2020-10-09 清华大学 Image super-resolution method for keeping geometric structure based on gradient
CN112037131A (en) * 2020-08-31 2020-12-04 上海电力大学 Single-image super-resolution reconstruction method based on generative adversarial network

Non-Patent Citations (4)

Title
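XIANGDONG SU et al.: "Improving Text Image Resolution using a Deep Generative Adversarial Network for Optical Character Recognition", 2019 INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), 29 February 2020 (2020-02-29), pages 1193 - 1199 *
Liu Xianglong et al.: "Face Recognition and Beautification Algorithms in Practice: Based on Python, Machine Learning and Deep Learning", Xi'an: Xidian University Press, pages: 152 - 145 *
Zhan Wenshu et al.: "Super-resolution reconstruction of shale images based on a two-layer deep convolutional neural network in the pixel and gradient domains", Science Technology and Engineering, vol. 18, no. 3, pages 85 - 90 *
Chen Saijian: "Research on text image reconstruction methods based on deep learning", China Excellent Master's Theses Full-text Database, Information Science and Technology series, no. 8, pages 138 - 539 *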

Cited By (4)

Publication number Priority date Publication date Assignee Title
CN112712468A (en) * 2021-03-26 2021-04-27 北京万里红科技股份有限公司 Iris image super-resolution reconstruction method and computing device
CN112712468B (en) * 2021-03-26 2021-07-09 北京万里红科技股份有限公司 Iris image super-resolution reconstruction method and computing device
CN113591798A (en) * 2021-08-23 2021-11-02 京东科技控股股份有限公司 Document character reconstruction method and device, electronic equipment and computer storage medium
CN113591798B (en) * 2021-08-23 2023-11-03 京东科技控股股份有限公司 Method and device for reconstructing text of document, electronic equipment and computer storage medium

Similar Documents

Publication Publication Date Title
CN111461114B (en) Multi-scale feature pyramid text detection method based on segmentation
US10223586B1 (en) Multi-modal electronic document classification
CN111784762B (en) Method and device for extracting blood vessel center line of X-ray radiography image
CN105678292A (en) Complex optical text sequence identification system based on convolution and recurrent neural network
CN112419159A (en) Character image super-resolution reconstruction system and method
CN110853039B (en) Sketch image segmentation method, system and device for multi-data fusion and storage medium
CN113554032B (en) Remote sensing image segmentation method based on multi-path parallel network of high perception
CN117237197B (en) Image super-resolution method and device based on cross attention mechanism
CN111881768A (en) Document layout analysis method
CN113609892A (en) Handwritten poetry recognition method integrating deep learning with scenic spot knowledge map
CN114596566A (en) Text recognition method and related device
CN109766918A (en) Salient object detection method based on multi-level contextual information fusion
US20210383520A1 (en) Method and apparatus for generating image, device, storage medium and program product
CN113140023A (en) Text-to-image generation method and system based on space attention
CN112037239B (en) Text guidance image segmentation method based on multi-level explicit relation selection
CN115511705A (en) Image super-resolution reconstruction method based on deformable residual convolution neural network
CN113537187A (en) Text recognition method and device, electronic equipment and readable storage medium
CN112419158A (en) Image video super-resolution and super-definition reconstruction system and method
CN114693926A (en) Image semantic segmentation method based on deep learning
CN113947102A (en) Backbone two-path image semantic segmentation method for scene understanding of mobile robot in complex environment
CN114298909A (en) Super-resolution network model and application thereof
CN116258652B (en) Text image restoration model and method based on structure attention and text perception
CN112464733A (en) High-resolution optical remote sensing image ground feature classification method based on bidirectional feature fusion
Mu et al. Integration of gradient guidance and edge enhancement into super‐resolution for small object detection in aerial images
CN111524090A (en) RGB-D saliency detection method based on depth prediction image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination