CN117275006A - Image processing method and device, electronic equipment and storage medium - Google Patents


Publication number
CN117275006A
Authority
CN
China
Prior art keywords
image
text
blurred
processing
prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311374367.5A
Other languages
Chinese (zh)
Inventor
黎安
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lianren Healthcare Big Data Technology Co Ltd
Original Assignee
Lianren Healthcare Big Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lianren Healthcare Big Data Technology Co Ltd
Priority to CN202311374367.5A
Publication of CN117275006A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10: Character recognition
    • G06V30/18: Extraction of features or characteristics of the image
    • G06V30/1801: Detecting partial patterns, e.g. edges or contours, or configurations, e.g. loops, corners, strokes or intersections
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10: Character recognition
    • G06V30/14: Image acquisition
    • G06V30/148: Segmentation of character regions
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10: Character recognition
    • G06V30/19: Recognition using electronic means
    • G06V30/191: Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/19147: Obtaining sets of training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses an image processing method, an image processing device, electronic equipment and a storage medium. The method comprises the following steps: acquiring a text blurred image, and preprocessing the text blurred image to obtain a prior image of the text blurred image; performing prediction processing on the prior image and the text blurred image based on a first processing model to obtain a coarse predicted image of the text blurred image; performing residual prediction on the coarse predicted image based on a second processing model to obtain a residual image corresponding to the coarse predicted image; and determining a super-resolution image corresponding to the text blurred image based on the coarse predicted image and the residual image. Processing the text blurred image in this way improves the deblurring effect and the resolution of the text blurred image, and the resulting super-resolution image has high quality and high accuracy.

Description

Image processing method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to an image processing method, an image processing device, an electronic device, and a storage medium.
Background
In recent years, text recognition by OCR (Optical Character Recognition) has been one of the core problems in the field of computer vision. It aims to find all text information in an image and to determine its location and content. Because characters differ in appearance, shape, scale and orientation, and imaging is affected by factors such as illumination and occlusion, OCR text recognition remains one of the most challenging problems in computer vision. Low-resolution images caused by a limited capture area of the sample image, poor imaging conditions and similar factors reduce the performance of downstream tasks.
Disclosure of Invention
The invention provides an image processing method, an image processing device, electronic equipment and a storage medium, which are used for solving the problem of low resolution caused by factors such as limited sample image shooting area, insufficient imaging conditions and the like.
According to an aspect of the present invention, there is provided an image processing method including:
acquiring a text blurred image, and preprocessing the text blurred image to obtain a prior image of the text blurred image;
performing prediction processing on the prior image and the text blurred image based on a first processing model to obtain a coarse predicted image of the text blurred image;
performing residual prediction on the coarse predicted image based on a second processing model to obtain a residual image corresponding to the coarse predicted image;
and determining a super-resolution image corresponding to the text blurred image based on the coarse predicted image and the residual image.
Optionally, preprocessing the text blurred image to obtain a prior image of the text blurred image includes:
binarizing the text blurred image to obtain a mask image of the text blurred image, and taking the mask image as the prior image of the text blurred image; or,
performing image segmentation on the text blurred image to obtain a segmented image, and taking the segmented image as the prior image of the text blurred image.
Optionally, after obtaining the prior image of the text blurred image, the method further comprises:
extracting at least one layer of features of the prior image to obtain a feature image of the prior image;
correspondingly, the prior image and the text blurred image are predicted based on the first processing model to obtain a rough predicted image of the text blurred image, which comprises the following steps:
and carrying out prediction processing on the characteristic image and the text blurred image of the prior image based on the first processing model to obtain a rough predicted image of the text blurred image.
Optionally, performing residual prediction on the coarse prediction image based on the second processing model to obtain a residual image corresponding to the coarse prediction image, including:
the following processing procedure is iteratively executed until the iteration ending condition is met, and a residual image corresponding to the rough predicted image is obtained:
acquiring a current iteration image, a current iteration number and a rough prediction image; the current iteration image is an output image of the second processing model in the last iteration or an initial image input for the first time;
and inputting the iteration image of the current time, the current iteration times and the rough prediction image into a second processing model to obtain an image output by the current iteration.
Optionally, the method further comprises:
determining loss data of the output images of different iterations, and determining whether an iteration end condition is met based on the loss data, wherein the iteration end condition comprises: the loss data of the images output by different iterations reach a convergence state, or the loss data of the images output by the current iteration is smaller than a preset value.
Optionally, determining the super-resolution image corresponding to the text blurred image based on the coarse prediction image and the residual image includes:
and adding pixel values of corresponding pixel points of the coarse prediction image and the residual image to obtain a super-resolution image.
Optionally, the first processing model is a U-NET network model, and the second processing model is a diffusion model;
the first processing model is trained on a blurred sample image and a clear label image, wherein the blurred sample image is obtained by blurring the clear label image;
the second processing model is trained on the coarse predicted image that the first processing model produces for the blurred sample image, together with a difference label image of the blurred sample image and the clear label image.
According to another aspect of the present invention, there is provided an image processing apparatus characterized by comprising:
the prior image acquisition module is used for acquiring a text blurred image, preprocessing the text blurred image and obtaining a prior image of the text blurred image;
the rough prediction image determining module is used for carrying out prediction processing on the prior image and the text blurred image based on the first processing model to obtain a rough prediction image of the text blurred image;
the residual image determining module is used for carrying out residual prediction on the coarse predicted image based on the second processing model to obtain a residual image corresponding to the coarse predicted image;
and the super-resolution image determining module is used for determining a super-resolution image corresponding to the text blurred image based on the coarse prediction image and the residual image.
According to another aspect of the present invention, there is provided an electronic apparatus including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the image processing method of any one of the embodiments of the present invention.
According to another aspect of the present invention, there is provided a computer readable storage medium storing computer instructions for causing a processor to execute an image processing method of any one of the embodiments of the present invention.
According to the technical solution of the invention, a text blurred image is acquired and preprocessed to obtain a prior image of the text blurred image; prediction processing is performed on the prior image and the text blurred image based on the first processing model to obtain a coarse predicted image of the text blurred image; residual prediction is performed on the coarse predicted image based on the second processing model to obtain the corresponding residual image; and a super-resolution image corresponding to the text blurred image is determined based on the coarse predicted image and the residual image. This addresses the problem of low resolution caused by factors such as a limited capture area of the sample image and poor imaging conditions, improves the deblurring effect and the resolution of the text blurred image, and yields a super-resolution image of higher quality and accuracy.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention or to delineate the scope of the invention. Other features of the present invention will become apparent from the description that follows.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of an image processing method according to a first embodiment of the present invention;
Fig. 2 is a flowchart of an image processing method according to a second embodiment of the present invention;
Fig. 3 is a schematic diagram of a process flow of an image processing model to which embodiments of the present invention are applied;
Fig. 4 is a schematic structural view of an image processing apparatus according to a third embodiment of the present invention;
Fig. 5 is a schematic structural diagram of an electronic device implementing an image processing method according to an embodiment of the present invention.
Detailed Description
In order that those skilled in the art may better understand the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some embodiments of the present invention, not all of them. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort shall fall within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
Fig. 1 is a flowchart of an image processing method according to an embodiment of the present invention, where the method may be performed by an image processing apparatus, and the image processing apparatus may be implemented in hardware and/or software, and the image processing apparatus may be configured in an electronic device such as a computer. As shown in fig. 1, the method includes:
S110, acquiring a text blurred image, and preprocessing the text blurred image to obtain a priori image of the text blurred image.
A text blurred image is understood to mean, in particular, a blurred image containing text. Optionally, for an image containing text information, the image containing the text information can be subjected to quality evaluation through a pre-trained image quality analysis model to determine whether the image is a blurred image, if so, the image is taken as the text blurred image, and the image processing method of the scheme is started, otherwise, no processing is required. In this embodiment, the image containing text information may include, but is not limited to, an image containing medical record information, an image including a signature, etc., and text information in the text blurred image is not limited herein.
The prior image is specifically understood to be an image containing prior information corresponding to text information in the text blurred image, and may be obtained by performing image preprocessing on the text blurred image, where the image preprocessing method includes, but is not limited to, binarization processing, image segmentation processing, and the like.
Specifically, the existing image data is analyzed by the pre-trained image quality analysis model, a text blurred image is selected from the image data, and the text blurred image is preprocessed by an image preprocessing method to obtain the prior image of the text blurred image.
Optionally, preprocessing the text blurred image to obtain a priori image of the text blurred image, including: binarizing the text blurred image to obtain a mask image of the text blurred image, and taking the mask image as a priori image of the text blurred image; or, performing image segmentation on the text blurred image to obtain a segmented image, and taking the segmented image as a priori image of the text blurred image.
Specifically, the acquired text blurred image is binarized: the pixel value of each pixel is compared with a preset pixel threshold; pixels whose value is greater than the threshold are set to a first pixel value (for example, 0), and pixels whose value is less than or equal to the threshold are set to a second pixel value (for example, 255). This gives the mask image of the text blurred image, which is used as the prior image. Alternatively, image segmentation is performed on the text blurred image to segment out the text information, and the resulting segmented image is used as the prior image of the text blurred image. Image segmentation methods include, but are not limited to, edge-based segmentation, threshold-based segmentation and segmentation based on specific theories.
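The binarization rule described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `binarize_to_mask`, the threshold of 128 and the grayscale `uint8` input are hypothetical choices, while the mapping (above the threshold gives 0, otherwise 255) follows the example values in the text.

```python
import numpy as np

def binarize_to_mask(gray, threshold=128, first_value=0, second_value=255):
    """Build a mask image from a grayscale image.

    Pixels greater than the threshold get first_value,
    pixels less than or equal to it get second_value.
    """
    gray = np.asarray(gray)
    return np.where(gray > threshold, first_value, second_value).astype(np.uint8)

# Toy 2x3 "text blurred image": bright background, dark strokes.
blurred = np.array([[250, 240, 30],
                    [245, 128, 20]], dtype=np.uint8)
mask = binarize_to_mask(blurred)
print(mask)
```

With these values the dark stroke pixels become 255 (white) and the bright background becomes 0 (black), so the text region stands out in the mask.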
In this embodiment, an image with blurred text is first acquired and then preprocessed to obtain the corresponding mask image or segmented image, which is used as the prior image of the text blurred image. Prior information is thus carried by an image derived from the text blurred image itself, which helps avoid the influence of inaccurate prior information and helps ensure the quality and accuracy of the text image.
S120, carrying out prediction processing on the prior image and the text blurred image based on the first processing model to obtain a rough predicted image of the text blurred image.
The first processing model can be specifically understood as a model for performing super-resolution coarse prediction on the text blurred image, and the first processing model can be a U-NET network model. Optionally, the first processing model is trained based on a blurred sample image and a clear label image, wherein the blurred sample image is obtained by blurring the clear label image.
A clear image is acquired in advance and used as the clear label image, and the clear label image is blurred to obtain the corresponding blurred sample image. The blurring includes, but is not limited to, adding noise, Gaussian blurring and the like. Alternatively, a pre-trained image quality analysis model may be used to classify a large number of initial images in an initial image set by sharpness, and the images classified as sharp are taken as clear images.
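As a rough illustration of how a blurred sample image might be derived from a clear label image, the sketch below applies a separable Gaussian blur followed by optional additive noise. The function names, the kernel radius rule, and the `sigma`, `noise_std` and `seed` parameters are assumptions made for this example; the source does not specify blur parameters.

```python
import numpy as np

def gaussian_kernel(sigma, radius):
    """1-D Gaussian kernel normalized to sum to 1."""
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()

def make_blurred_sample(clear, sigma=1.0, noise_std=0.0, seed=0):
    """Blur a 2-D grayscale clear label image and optionally add Gaussian noise."""
    clear = np.asarray(clear, dtype=np.float64)
    radius = max(1, int(3 * sigma))
    kernel = gaussian_kernel(sigma, radius)
    # Replicate edge pixels so the output keeps the input size.
    padded = np.pad(clear, radius, mode="edge")
    # Separable blur: filter along rows first, then along columns.
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    blurred = np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)
    rng = np.random.default_rng(seed)
    blurred = blurred + rng.normal(0.0, noise_std, size=blurred.shape)
    return np.clip(blurred, 0.0, 255.0)

# A single bright pixel on a dark background gets spread out by the blur.
clear_label = np.zeros((7, 7))
clear_label[3, 3] = 255.0
blurred_sample = make_blurred_sample(clear_label, sigma=1.0)

# A constant image is unchanged by a normalized blur without noise.
flat = make_blurred_sample(np.full((5, 5), 100.0), sigma=1.0)
```

Each (blurred sample, clear label) pair produced this way could then serve as one training example.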
The blurred sample image is used as the input of the first processing model and the clear label image as the training label. The first processing model is trained iteratively and its model parameters are adjusted until the first processing model with optimal model parameters is obtained, so that its prediction results are more accurate.
Specifically, the prior image and the text blurred image are predicted through the trained first processing model, namely the prior image and the text blurred image are used as the input of the first processing model, and the rough predicted image corresponding to the text blurred image is obtained through the prediction processing of the first processing model. The specific formula is as follows:
$$X_t = P_t(I_m)$$
where $I_m$ is the fusion result of the prior image and the text blurred image, which may be obtained by processing the two images with a CONCAT function, and $P_t$ is the first processing model, which may be a U-NET network model.
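One plausible reading of the CONCAT fusion is channel-wise concatenation of the prior image and the text blurred image before they enter the first processing model. The sketch below assumes an (H, W, C) layout and a hypothetical helper name `fuse_prior_and_image`; the actual layout and channel order used by the authors are not stated.

```python
import numpy as np

def fuse_prior_and_image(prior, blurred):
    """Concatenate the blurred image and the prior image along the channel axis."""
    prior = np.asarray(prior, dtype=np.float32)
    blurred = np.asarray(blurred, dtype=np.float32)
    # Promote single-channel (H, W) inputs to (H, W, 1).
    if prior.ndim == 2:
        prior = prior[..., np.newaxis]
    if blurred.ndim == 2:
        blurred = blurred[..., np.newaxis]
    if prior.shape[:2] != blurred.shape[:2]:
        raise ValueError("prior and blurred image must share height and width")
    return np.concatenate([blurred, prior], axis=-1)

blurred = np.zeros((4, 6, 3), dtype=np.float32)  # RGB text blurred image
prior = np.ones((4, 6), dtype=np.float32)        # single-channel mask
fused = fuse_prior_and_image(prior, blurred)
print(fused.shape)
```

The fused array has one more channel than the blurred image, so a model consuming it would need its first layer sized accordingly.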
To train the first processing model, a clear label image is selected and blurred to obtain a text blurred sample image. The sample image is used as the input of the first processing model, which outputs the corresponding coarse predicted image. The difference between the coarse predicted image and the clear label image is measured by the pixel-wise mean square error $L_{pixel}$, calculated as follows:
$$L_{pixel} = \mathbb{E}\,\lVert X_t - X_{gt} \rVert^2$$
where $X_t$ is the coarse predicted image and $X_{gt}$ is the clear label image. Whether $L_{pixel}$ meets a preset error threshold serves as the condition for continuing the iterative training of the first processing model. If the threshold is not met, the model parameters of the first processing model are adjusted and iterative training continues until $L_{pixel}$ meets the preset error threshold, at which point the optimal parameters, and thus the trained first processing model, are obtained.
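The pixel mean square error and the threshold check can be illustrated as follows. This is a minimal sketch: the error is averaged over pixels (a normalized form of the squared norm), and the helper names and the "less than or equal" comparison are assumptions for this example.

```python
import numpy as np

def pixel_mse(coarse, clear):
    """Mean of squared per-pixel differences between two images."""
    coarse = np.asarray(coarse, dtype=np.float64)
    clear = np.asarray(clear, dtype=np.float64)
    return float(np.mean((coarse - clear) ** 2))

def training_should_stop(loss, error_threshold):
    """Stop iterative training once the loss meets the preset threshold."""
    return loss <= error_threshold

coarse_pred = np.array([[1.0, 2.0], [3.0, 4.0]])
clear_label = np.array([[1.0, 2.0], [3.0, 6.0]])
loss = pixel_mse(coarse_pred, clear_label)  # one pixel off by 2 -> 4 / 4 pixels
print(loss, training_should_stop(loss, error_threshold=0.5))
```

In a real training loop the parameter update between checks would be handled by the chosen framework; only the stopping logic is shown here.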
In this embodiment, the prior image and the text blurred image are predicted by the trained first processing model, so that the sharpness of the text blurred image can be primarily improved.
S130, performing residual prediction on the coarse predicted image based on the second processing model to obtain a residual image corresponding to the coarse predicted image.
The second processing model may be understood as a model for determining the residual of the coarse predicted image output by the first processing model, and it may be a diffusion model. Optionally, the second processing model is trained on the coarse predicted image that the first processing model produces for the blurred sample image, together with a difference label image of the blurred sample image and the clear label image.
To train the second processing model, the coarse predicted image output by the first processing model is used as its input. Loss data is determined between the predicted residual image output by the second processing model and the difference label image of the blurred sample image and the clear label image, and the model parameters of the second processing model are adjusted according to the loss data. Training ends when the loss data becomes stable or reaches a preset threshold, which gives the trained second processing model.
Specifically, residual prediction is performed on the coarse predicted image output by the first processing model through the trained second processing model, and a residual image corresponding to the coarse predicted image is output by the second processing model.
Optionally, performing residual prediction on the coarse prediction image based on the second processing model to obtain a residual image corresponding to the coarse prediction image, including: the following processing procedure is iteratively executed until the iteration ending condition is met, and a residual image corresponding to the rough predicted image is obtained: acquiring a current iteration image, a current iteration number and a rough prediction image; the current iteration image is an output image of the second processing model in the last iteration or an initial image input for the first time; and inputting the iteration image of the current time, the current iteration times and the rough prediction image into a second processing model to obtain an image output by the current iteration.
Specifically, the coarse predicted image output by the first processing model is input into the second processing model together with the current iteration image and the current iteration number. If the second processing model is performing residual prediction on the coarse predicted image for the first time, an image may be randomly selected as the initial image, and the current iteration number is t=1; the initial image may also be a blank image or a noise image, and its content is not limited here. Otherwise, the current iteration image (the output of the previous iteration), the current iteration number t=n (n = 2, 3, 4, ...) and the coarse predicted image are input into the second processing model to obtain the image output by the current iteration. Further, loss data of the images output by different iterations is determined, and whether the iteration end condition is met is determined based on the loss data. The iteration end condition includes: the loss data of the images output by different iterations reaches a convergence state, or the loss data of the image output by the current iteration is smaller than a preset value.
Specifically, the loss data of the image output by each iteration is obtained by taking the difference between the pixel values of corresponding pixels of that image and of the standard residual image. When the loss data reaches a convergence state, or when the loss data of the image output by the current iteration is smaller than the preset value, the prediction result of the current iteration is output as the residual image corresponding to the coarse predicted image.
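The iterative procedure can be sketched as a loop that feeds the previous output, the iteration number and the coarse predicted image back into the model and stops on either end condition. Everything here is hypothetical scaffolding: `predict_residual`, `toy_model` and `loss_fn` are stand-ins, the toy model is not a diffusion model, and the tolerance and iteration cap are assumed values.

```python
import numpy as np

def predict_residual(model, coarse, initial, loss_fn,
                     preset_value=1e-3, convergence_tol=1e-6, max_iterations=50):
    """Iterate model(current, t, coarse) until the loss is small or has converged."""
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")
    current = initial
    previous_loss = None
    for t in range(1, max_iterations + 1):
        # t = 1 uses the initial image; later steps use the previous output.
        current = model(current, t, coarse)
        loss = loss_fn(current)
        if loss < preset_value:
            break  # loss of the current output is below the preset value
        if previous_loss is not None and abs(previous_loss - loss) < convergence_tol:
            break  # loss has stopped changing between iterations
        previous_loss = loss
    return current, t

# Stand-in "standard residual image" and a toy model that halves the gap each step.
standard_residual = np.full((2, 2), 8.0)
toy_model = lambda image, t, coarse: image + 0.5 * (standard_residual - image)
loss_fn = lambda image: float(np.mean(np.abs(image - standard_residual)))

coarse = np.zeros((2, 2))
initial = np.zeros((2, 2))  # blank initial image
residual, steps = predict_residual(toy_model, coarse, initial, loss_fn, preset_value=0.1)
print(steps)
```

The iteration cap is a safety measure added for this sketch so the loop always terminates; the text itself names only the two loss-based end conditions.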
In this embodiment, residual prediction is performed on the coarse predicted image through the second processing model, and iterative processing is performed in the prediction process, so that an image output by the current iteration meeting the iteration end condition is obtained as a residual image corresponding to the coarse predicted image, so that the prediction precision and stability of determining the residual image are improved, and the accuracy of determining the super-resolution image is improved.
S140, determining a super-resolution image corresponding to the text blurred image based on the coarse predicted image and the residual image.
Specifically, the determined coarse prediction image and residual image are subjected to preset processing, including but not limited to addition processing, weighting processing and the like, to obtain a super-resolution image corresponding to the text blurred image.
Optionally, determining the super-resolution image corresponding to the text blurred image based on the coarse prediction image and the residual image includes: and adding pixel values of corresponding pixel points of the coarse prediction image and the residual image to obtain a super-resolution image.
Specifically, the pixel values of the corresponding pixels of the coarse prediction image and the residual image can be added to obtain a super-resolution image.
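The pixel-wise addition can be written directly with NumPy. Rounding and clipping the sum to the 8-bit range is an assumption of this sketch, as is the function name; the text only specifies adding corresponding pixel values.

```python
import numpy as np

def combine_coarse_and_residual(coarse, residual):
    """Add corresponding pixels, then round and clip to the valid 8-bit range."""
    total = np.asarray(coarse, dtype=np.float32) + np.asarray(residual, dtype=np.float32)
    return np.clip(np.rint(total), 0, 255).astype(np.uint8)

coarse = np.array([[100, 250], [5, 40]], dtype=np.uint8)
residual = np.array([[20.0, 10.0], [-10.0, 0.4]], dtype=np.float32)
super_res = combine_coarse_and_residual(coarse, residual)
print(super_res)
```

Casting to a floating-point type before adding avoids the wrap-around that unsigned 8-bit arithmetic would otherwise produce when the residual is negative or the sum exceeds 255.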
According to the technical solution of this embodiment, a text blurred image is acquired and preprocessed to obtain a prior image of the text blurred image; prediction processing is performed on the prior image and the text blurred image based on the first processing model to obtain a coarse predicted image of the text blurred image; residual prediction is performed on the coarse predicted image based on the second processing model to obtain the corresponding residual image; and a super-resolution image corresponding to the text blurred image is determined based on the coarse predicted image and the residual image. This addresses the problem of low resolution caused by factors such as a limited capture area of the sample image and poor imaging conditions, improves the deblurring effect and the resolution of the text blurred image, and yields a super-resolution image of higher quality and accuracy.
Example 2
Fig. 2 is a flowchart of an image processing method according to a second embodiment of the present invention, where the image processing method according to the foregoing embodiment is further optimized, and optionally, at least one layer of feature extraction is performed on a prior image to obtain a feature image of the prior image; and carrying out prediction processing on the characteristic image and the text blurred image of the prior image based on the first processing model to obtain a rough predicted image of the text blurred image. As shown in fig. 2, the method includes:
S210, acquiring a text blurred image, and preprocessing the text blurred image to obtain a prior image of the text blurred image.
S220, carrying out at least one layer of feature extraction on the prior image to obtain a feature image of the prior image.
Specifically, the prior image is first processed by a DW-VIT model, fusing the prior image with the original image. The fusion result is then processed by a Conv-Block model, which improves locality and spatial invariance. Finally, the image is up-sampled by a Pixel-Shuffle model, so that image details and text structure are better captured and learned. The prior image passes through the three models in sequence, with the output of each model serving as the input of the next, so that at least one layer of feature extraction is performed on the prior image and the feature image of the prior image is obtained.
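For the up-sampling step, the sketch below assumes that "Pixel-Shuffle" refers to the standard sub-pixel rearrangement, which trades channels for spatial resolution: an array of shape (C·r², H, W) becomes (C, H·r, W·r). The function name and the channel-first layout are choices made for this example, and the learned layers that would normally precede the rearrangement are omitted.

```python
import numpy as np

def pixel_shuffle(x, r):
    """Rearrange (C*r*r, H, W) into (C, H*r, W*r) by interleaving channel groups."""
    c_in, h, w = x.shape
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c = c_in // (r * r)
    x = x.reshape(c, r, r, h, w)    # split channels into (c, row offset, col offset)
    x = x.transpose(0, 3, 1, 4, 2)  # reorder to (c, h, row offset, w, col offset)
    return x.reshape(c, h * r, w * r)

# Four 1x1 channels become one 2x2 image.
features = np.arange(4).reshape(4, 1, 1)
upsampled = pixel_shuffle(features, r=2)
print(upsampled)

# Shape check on a larger feature map: (8, 3, 5) with r=2 gives (2, 6, 10).
bigger = pixel_shuffle(np.zeros((8, 3, 5)), r=2)
```

Because the operation only rearranges values, no information is lost; the resolution gain comes from channels computed by earlier layers.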
In the embodiment, a method for extracting at least one layer of features of the prior image is introduced, so that image details and text structures in the prior image can be captured and learned better, and the accuracy of text image processing is further improved.
S230, carrying out prediction processing on the characteristic image and the text blurred image of the prior image based on the first processing model to obtain a rough predicted image of the text blurred image.
Specifically, the characteristic image and the text blurred image of the prior image are predicted through the trained first processing model, namely the characteristic image and the text blurred image of the prior image are used as input parameters of the first processing model, text pixel generation is carried out through the first processing model, the blurred scene text image is predicted, and the rough predicted image of the text blurred image is obtained.
S240, residual prediction is carried out on the coarse predicted image based on the second processing model, and a residual image corresponding to the coarse predicted image is obtained.
S250, determining a super-resolution image corresponding to the text blurred image based on the coarse prediction image and the residual image.
In a specific embodiment, Fig. 3 shows a schematic processing flow of the image processing model. The flow has three parts: a prior image determining part, a coarse predicted image determining part, and a residual prediction part. The prior image determining part first applies mask processing to the text blurred image, that is, binarization in which the text part is displayed in white and the non-text part in black, to obtain the prior image. The prior image is then processed by the DW-VIT model, the Conv-Block model, and the Pixel-Shuffle model in sequence, with the output of each model used as the input parameter of the next, so that at least one layer of feature extraction is performed and the feature image of the prior image is obtained.
The feature image of the prior image and the text blurred image are then used as input parameters of the first processing model, which performs prediction to obtain the coarse predicted image of the text blurred image.
Next, residual prediction is performed by the second processing model, with the coarse predicted image as an input parameter. If the second processing model is performing prediction for the first time, an initial image must be selected as an input parameter; the initial image may be a randomly selected image or a blank image, and the current count is t=n. The current count, the initial image, and the coarse predicted image are then input to the second processing model for prediction. If this is not the first prediction, the current iteration image, the current iteration number, and the coarse predicted image are input to the second processing model instead. The prediction of the second processing model is iterative: the image output by each iteration is used as the input of the next iteration, until the iteration end condition is satisfied and the residual image corresponding to the coarse predicted image is obtained. Finally, the super-resolution image is obtained from the coarse predicted image and the residual image.
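The overall flow can be sketched as a chain of four stages. The sketch below is only an illustration under stated assumptions: the function `super_resolve` and the `toy_*` stand-ins are hypothetical names, images are nested lists of numbers, and the stand-ins are trivial placeholders rather than the trained DW-VIT, U-NET, or diffusion models.

```python
def super_resolve(blurred, make_prior, extract_features, coarse_model, residual_model):
    """Chain the parts of the flow and add coarse and residual images per pixel."""
    prior = make_prior(blurred)                # prior image, e.g. a binarized mask
    features = extract_features(prior)         # feature image of the prior image
    coarse = coarse_model(features, blurred)   # first processing model
    residual = residual_model(coarse)          # second processing model
    return [
        [c + r for c, r in zip(coarse_row, residual_row)]
        for coarse_row, residual_row in zip(coarse, residual)
    ]


# Toy stand-ins so the sketch runs; they are not the trained models.
def toy_prior(img):
    return [[255 if p < 128 else 0 for p in row] for row in img]

def toy_features(prior):
    return prior

def toy_coarse(features, blurred):
    return [row[:] for row in blurred]

def toy_residual(coarse):
    return [[1 for _ in row] for row in coarse]


result = super_resolve([[10, 200], [90, 130]],
                       toy_prior, toy_features, toy_coarse, toy_residual)
```

Passing the stages in as callables keeps the sketch independent of any particular model implementation.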
According to the technical scheme of this embodiment, a text blurred image is acquired and preprocessed to obtain a prior image of the text blurred image; at least one layer of feature extraction is performed on the prior image to obtain a feature image of the prior image; prediction processing is performed on the feature image of the prior image and the text blurred image based on the first processing model to obtain a coarse predicted image of the text blurred image; residual prediction is performed on the coarse predicted image based on the second processing model to obtain a residual image corresponding to the coarse predicted image; and a super-resolution image corresponding to the text blurred image is determined based on the coarse predicted image and the residual image. Because the coarse predicted image is predicted from the feature image of the prior image, image details and text structure in the prior image can be captured and learned better. The super-resolution image is then determined from the coarse predicted image and the residual image, which yields a higher-resolution result and improves the quality and accuracy of the super-resolution image.
Example III
Fig. 4 is a schematic structural diagram of an image processing apparatus according to a third embodiment of the present invention. As shown in fig. 4, the apparatus includes:
The prior image obtaining module 410 is configured to obtain a text blurred image, and pre-process the text blurred image to obtain a prior image of the text blurred image;
the rough predicted image determining module 420 is configured to perform prediction processing on the prior image and the text blurred image based on the first processing model, so as to obtain a rough predicted image of the text blurred image;
the residual image determining module 430 is configured to perform residual prediction on the coarse predicted image based on the second processing model, so as to obtain a residual image corresponding to the coarse predicted image;
the super-resolution image determining module 440 is configured to determine a super-resolution image corresponding to the text blur image based on the coarse prediction image and the residual image.
According to the technical scheme of this embodiment, the prior image acquisition module acquires a text blurred image and preprocesses it to obtain a prior image of the text blurred image; the coarse predicted image determining module performs prediction processing on the prior image and the text blurred image based on the first processing model to obtain a coarse predicted image of the text blurred image; the residual image determining module performs residual prediction on the coarse predicted image based on the second processing model to obtain a residual image corresponding to the coarse predicted image; and the super-resolution image determining module determines a super-resolution image corresponding to the text blurred image based on the coarse predicted image and the residual image. This addresses the low resolution caused by factors such as a limited shooting area of the sample image and inadequate imaging conditions, improves the deblurring effect and the resolution of the text blurred image, and yields a super-resolution image of higher quality and accuracy.
Based on the above embodiments, optionally, the prior image acquisition module 410 is specifically configured to:
binarizing the text blurred image to obtain a mask image of the text blurred image, and taking the mask image as a priori image of the text blurred image; or,
and carrying out image segmentation on the text blurred image to obtain a segmented image, and taking the segmented image as a priori image of the text blurred image.
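The binarization option can be illustrated with a short sketch. It assumes a grayscale image stored as nested lists of 0-255 values and a fixed global threshold; the function name `binarize`, the threshold of 128, and the `text_is_dark` flag are illustrative assumptions, since the embodiment does not specify how the threshold is chosen. Text pixels are shown in white and non-text pixels in black, as in the flow of Fig. 3.

```python
def binarize(gray, threshold=128, text_is_dark=True):
    """Return a mask with text pixels set to 255 (white) and the rest to 0 (black).

    `threshold` and `text_is_dark` are assumptions for illustration; a real
    system might pick the threshold adaptively.
    """
    mask = []
    for row in gray:
        # A pixel counts as text when its darkness matches the expected polarity.
        mask.append([255 if (p < threshold) == text_is_dark else 0 for p in row])
    return mask


# Dark strokes on a light background become white on black.
prior_image = binarize([[10, 200], [130, 90]])
```

For light text on a dark background, calling `binarize(gray, text_is_dark=False)` inverts the polarity so that the text still ends up white.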
After obtaining the prior image of the text blurred image, the method further comprises the following steps:
extracting at least one layer of features of the prior image to obtain a feature image of the prior image;
optionally, the coarse prediction image determining module 420 is specifically configured to:
and carrying out prediction processing on the feature image of the prior image and the text blurred image based on the first processing model to obtain a rough predicted image of the text blurred image.
Optionally, the residual image determining module 430 is specifically configured to:
the following processing procedure is iteratively executed until the iteration ending condition is met, and a residual image corresponding to the rough predicted image is obtained:
acquiring a current iteration image, a current iteration number and a rough prediction image; the current iteration image is an output image of the second processing model in the last iteration or an initial image input for the first time;
And inputting the current iteration image, the current iteration number, and the rough prediction image into the second processing model to obtain the image output by the current iteration.
Determining loss data of the output images of different iterations, and determining whether an iteration end condition is met based on the loss data, wherein the iteration end condition comprises: the loss data of the images output by different iterations reach a convergence state, or the loss data of the images output by the current iteration is smaller than a preset value.
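The iterative procedure can be sketched as a loop with an early exit. Several details below are assumptions, because the embodiment does not fix them: the loss data is taken to be the mean absolute change between successive outputs, the step index counts down from n (matching t=n on the first pass), and the names `predict_residual`, `mean_abs_change`, `n_steps`, and `preset_value` are hypothetical. The `model` argument stands in for the second processing model.

```python
def mean_abs_change(a, b):
    """Mean absolute per-pixel difference between two equally sized images."""
    total = sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    count = sum(len(row) for row in a)
    return total / count


def predict_residual(coarse, model, initial, n_steps, preset_value):
    """Iterate the model until the change falls below preset_value or steps run out."""
    current = initial                       # initial image on the first pass
    for t in range(n_steps, 0, -1):         # first pass uses t = n
        output = model(current, t, coarse)  # image output by this iteration
        loss = mean_abs_change(output, current)
        current = output                    # output feeds the next iteration
        if loss < preset_value:             # iteration end condition
            break
    return current


# Toy model that moves halfway toward a constant residual of 4.0 each step.
def toy_model(current, t, coarse):
    return [[(v + 4.0) / 2 for v in row] for row in current]


residual = predict_residual([[0.0]], toy_model, [[0.0]], 50, 0.3)
```

Bounding the loop by `n_steps` guarantees termination even if the loss never drops below the preset value.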
Optionally, the super-resolution image determining module 440 is specifically configured to:
and adding pixel values of corresponding pixel points of the coarse prediction image and the residual image to obtain a super-resolution image.
The first processing model is a U-NET network model, and the second processing model is a diffusion model;
the first processing model is obtained by training on a blurred sample image and a clear label image, wherein the blurred sample image is obtained by blurring the clear label image;
the second processing model is obtained by training on the rough prediction image of the blurred sample image, produced by the first processing model, and the difference label image between the blurred sample image and the clear label image.
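The construction of training data can be illustrated as follows. This is a sketch under assumptions: the embodiment does not name a blur kernel or the sign of the difference, so a 3x3 box blur and "clear minus blurred" are used here purely as examples, and `box_blur` and `make_training_pair` are hypothetical helper names.

```python
def box_blur(image, radius=1):
    """Average each pixel with its neighbours inside the image bounds."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            total, count = 0.0, 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        total += image[yy][xx]
                        count += 1
            row.append(total / count)
        out.append(row)
    return out


def make_training_pair(clear):
    """Return (blurred sample image, difference label image) for one clear label image."""
    blurred = box_blur(clear)
    difference = [
        [c - b for c, b in zip(clear_row, blurred_row)]
        for clear_row, blurred_row in zip(clear, blurred)
    ]
    return blurred, difference


clear = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
blurred, difference = make_training_pair(clear)
```

With this sign convention, adding the difference label back to the blurred sample reproduces the clear label image, which mirrors the pixel-wise addition used at inference time.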
The image processing device provided by the embodiment of the invention can execute the image processing method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
Example IV
Fig. 5 is a schematic structural diagram of an electronic device according to a fourth embodiment of the present invention. The electronic device 10 is intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 5, the electronic device 10 includes at least one processor 11, and a memory, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, etc., communicatively connected to the at least one processor 11, in which the memory stores a computer program executable by the at least one processor, and the processor 11 may perform various appropriate actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from the storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data required for the operation of the electronic device 10 may also be stored. The processor 11, the ROM 12 and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
Various components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, etc.; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The processor 11 may be any of a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, Digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 11 performs the respective methods and processes described above, for example, the image processing method.
In some embodiments, the image processing method may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as the storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into the RAM 13 and executed by the processor 11, one or more steps of the image processing method described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the image processing method in any other suitable way (e.g. by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that receives data and instructions from, and transmits data and instructions to, a storage system, at least one input device, and at least one output device.
The computer program for implementing the image processing method of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
Example V
The fifth embodiment of the present invention also provides a computer-readable storage medium storing computer instructions for causing a processor to execute an image processing method, the method comprising:
acquiring a text blurred image, and preprocessing the text blurred image to obtain a priori image of the text blurred image;
performing prediction processing on the prior image and the text blurred image based on the first processing model to obtain a rough predicted image of the text blurred image;
residual prediction is carried out on the coarse predicted image based on the second processing model, and a residual image corresponding to the coarse predicted image is obtained;
and determining a super-resolution image corresponding to the text blurred image based on the coarse predicted image and the residual image.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical hosts and VPS service are overcome.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved, and the present invention is not limited herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (10)

1. An image processing method, comprising:
acquiring a text blurred image, and preprocessing the text blurred image to obtain a priori image of the text blurred image;
performing prediction processing on the prior image and the text blurred image based on a first processing model to obtain a rough predicted image of the text blurred image;
residual prediction is carried out on the coarse predicted image based on a second processing model, and a residual image corresponding to the coarse predicted image is obtained;
and determining a super-resolution image corresponding to the text blurred image based on the coarse prediction image and the residual image.
2. The method of claim 1, wherein the preprocessing the text blur image to obtain a priori image of the text blur image comprises:
performing binarization processing on the text blurred image to obtain a mask image of the text blurred image, and taking the mask image as a priori image of the text blurred image; or,
and carrying out image segmentation on the text blurred image to obtain a segmented image, and taking the segmented image as a priori image of the text blurred image.
3. The method of claim 1, further comprising, after obtaining the prior image of the text blur image:
extracting at least one layer of features of the prior image to obtain a feature image of the prior image;
correspondingly, the predicting the prior image and the text blurred image based on the first processing model to obtain a rough predicted image of the text blurred image comprises the following steps:
and carrying out prediction processing on the characteristic image of the prior image and the text blurred image based on a first processing model to obtain a rough predicted image of the text blurred image.
4. The method according to claim 1, wherein performing residual prediction on the coarse predicted image based on the second processing model to obtain a residual image corresponding to the coarse predicted image includes:
the following processing procedure is iteratively executed until the iteration ending condition is met, and a residual image corresponding to the rough predicted image is obtained:
acquiring a current iteration image, a current iteration number and the rough prediction image; the current iteration image is an output image of the last iteration of the second processing model or an initial image input for the first time;
And inputting the current iteration image, the current iteration number, and the rough prediction image into the second processing model to obtain an image output by the current iteration.
5. The method according to claim 4, wherein the method further comprises:
determining loss data of output images of different iterations, and determining whether the iteration ending condition is met or not based on the loss data, wherein the iteration ending condition comprises: the loss data of the images output by different iterations reach a convergence state, or the loss data of the images output by the current iteration is smaller than a preset value.
6. The method of claim 1, wherein the determining a super-resolution image corresponding to the text blur image based on the coarse prediction image and the residual image comprises:
and adding pixel values of corresponding pixel points of the coarse prediction image and the residual image to obtain a super-resolution image.
7. The method of any one of claims 1-6, wherein the first processing model is a U-NET network model and the second processing model is a diffusion model;
the first processing model is obtained by training on a blurred sample image and a clear label image, wherein the blurred sample image is obtained by blurring the clear label image;
the second processing model is obtained by training on the rough prediction image of the blurred sample image, produced by the first processing model, and the difference label image between the blurred sample image and the clear label image.
8. An image processing apparatus, comprising:
the prior image acquisition module is used for acquiring a text blurred image, preprocessing the text blurred image and obtaining a prior image of the text blurred image;
the rough prediction image determining module is used for carrying out prediction processing on the prior image and the text blurred image based on a first processing model to obtain a rough prediction image of the text blurred image;
the residual image determining module is used for carrying out residual prediction on the coarse predicted image based on a second processing model to obtain a residual image corresponding to the coarse predicted image;
and the super-resolution image determining module is used for determining a super-resolution image corresponding to the text blurred image based on the coarse prediction image and the residual image.
9. An electronic device, the electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
The memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the image processing method of any one of claims 1-7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer instructions for causing a processor to implement the image processing method of any one of claims 1-7 when executed.
CN202311374367.5A 2023-10-23 2023-10-23 Image processing method and device, electronic equipment and storage medium Pending CN117275006A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311374367.5A CN117275006A (en) 2023-10-23 2023-10-23 Image processing method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311374367.5A CN117275006A (en) 2023-10-23 2023-10-23 Image processing method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117275006A true CN117275006A (en) 2023-12-22

Family

ID=89219342

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311374367.5A Pending CN117275006A (en) 2023-10-23 2023-10-23 Image processing method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117275006A (en)

Similar Documents

Publication Publication Date Title
CN112949767B (en) Sample image increment, image detection model training and image detection method
CN112561879B (en) Ambiguity evaluation model training method, image ambiguity evaluation method and image ambiguity evaluation device
CN112989995B (en) Text detection method and device and electronic equipment
CN115294332B (en) Image processing method, device, equipment and storage medium
CN114360074A (en) Training method of detection model, living body detection method, apparatus, device and medium
CN116952958B (en) Defect detection method, device, electronic equipment and storage medium
CN113643260A (en) Method, apparatus, device, medium and product for detecting image quality
CN115471476A (en) Method, device, equipment and medium for detecting component defects
CN116703925B (en) Bearing defect detection method and device, electronic equipment and storage medium
CN117333443A (en) Defect detection method and device, electronic equipment and storage medium
CN114724144B (en) Text recognition method, training device, training equipment and training medium for model
CN115482248B (en) Image segmentation method, device, electronic equipment and storage medium
CN115700758A (en) Sperm activity detection method, device, equipment and storage medium
CN113628192B (en) Image blur detection method, apparatus, device, storage medium, and program product
CN116363444A (en) Fuzzy classification model training method, fuzzy image recognition method and device
CN114821596A (en) Text recognition method and device, electronic equipment and medium
CN117275006A (en) Image processing method and device, electronic equipment and storage medium
CN118411382B (en) Boundary point detection method, boundary point detection device, electronic equipment and storage medium
CN118411381B (en) Boundary coordinate detection method, device, electronic equipment and storage medium
CN118135381B (en) Image blurring detection method, device, equipment and medium
CN114092739B (en) Image processing method, apparatus, device, storage medium, and program product
CN114037865B (en) Image processing method, apparatus, device, storage medium, and program product
CN117746069B (en) Graph searching model training method and graph searching method
CN118411382A (en) Boundary point detection method, boundary point detection device, electronic equipment and storage medium
CN117274588A (en) Image processing method, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination