CN115147280B - Training method, image processing method, device and equipment for deep learning model

Training method, image processing method, device and equipment for deep learning model

Info

Publication number
CN115147280B
CN115147280B
Authority
CN
China
Prior art keywords
image
transformation
deep learning
learning model
vertex
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210855822.2A
Other languages
Chinese (zh)
Other versions
CN115147280A (en)
Inventor
李鑫
张霖
何栋梁
李甫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202210855822.2A
Publication of CN115147280A
Application granted
Publication of CN115147280B
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00: Geometric image transformations in the plane of the image
    • G06T 3/40: Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T 3/4053: Scaling based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G06T 3/4076: Scaling based on super-resolution, using the original low-resolution images to iteratively correct the high-resolution images
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00: Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The disclosure provides a training method for a deep learning model, and relates to the technical field of artificial intelligence, in particular to the technical fields of image processing, computer vision, and deep learning. The specific implementation scheme is as follows: inputting a first sample image and a first reference image into a deep learning model to obtain a first super-resolution image; inputting the first super-resolution image and a second sample image into the deep learning model to obtain a second super-resolution image; determining a first loss from the first super-resolution image and a second loss from the second super-resolution image; and adjusting parameters of the deep learning model according to the first loss and the second loss. The disclosure also provides an image processing method, an image processing apparatus, an electronic device, and a storage medium.

Description

Training method, image processing method, device and equipment for deep learning model
Technical Field
The present disclosure relates to the field of artificial intelligence, and more particularly to image processing, computer vision, and deep learning techniques. More specifically, the present disclosure provides a training method, an image processing method, an apparatus, an electronic device, and a storage medium for a deep learning model.
Background
Image super-resolution processing takes a low-resolution image as input, with a high-quality, high-resolution image as the desired output. Image super-resolution may be implemented using deep learning.
Disclosure of Invention
The disclosure provides a training method for a deep learning model, an image processing method, an apparatus, a device, and a storage medium.
According to a first aspect, there is provided a training method of a deep learning model, the method comprising: inputting a first sample image and a first reference image into a deep learning model to obtain a first super-resolution image; inputting the first super-resolution image and a second sample image into the deep learning model to obtain a second super-resolution image; determining a first loss from the first super-resolution image and a second loss from the second super-resolution image; and adjusting parameters of the deep learning model according to the first loss and the second loss.
According to a second aspect, there is provided an image processing method comprising: acquiring an image to be processed and a second reference image; and inputting the image to be processed and the second reference image into a deep learning model to obtain a super-resolution image of the image to be processed; wherein the deep learning model is trained according to the training method described above.
According to a third aspect, there is provided a training apparatus of a deep learning model, the apparatus comprising: a first super-resolution module for inputting a first sample image and a first reference image into the deep learning model to obtain a first super-resolution image; a second super-resolution module for inputting the first super-resolution image and a second sample image into the deep learning model to obtain a second super-resolution image; a determining module for determining a first loss according to the first super-resolution image and a second loss according to the second super-resolution image; and an adjusting module for adjusting parameters of the deep learning model according to the first loss and the second loss.
According to a fourth aspect, there is provided an image processing apparatus comprising: an acquisition module for acquiring an image to be processed and a second reference image; and a second processing module for inputting the image to be processed and the second reference image into the deep learning model to obtain a super-resolution image of the image to be processed; wherein the deep learning model is trained by the above training apparatus.
According to a fifth aspect, there is provided an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method provided in accordance with the present disclosure.
According to a sixth aspect, there is provided a non-transitory computer readable storage medium storing computer instructions for causing a computer to perform a method provided according to the present disclosure.
According to a seventh aspect, there is provided a computer program product comprising a computer program which, when executed by a processor, implements a method provided according to the present disclosure.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a schematic diagram of a method of implementing image super-resolution processing using a deep learning model according to one embodiment of the related art;
FIG. 2A is a flow chart of a training method of a deep learning model according to one embodiment of the present disclosure;
FIG. 2B is a flow chart of a training method of a deep learning model according to another embodiment of the present disclosure;
FIGS. 3A-3C are schematic diagrams of a method of perspective-transforming an image according to one embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a training method of a deep learning model according to one embodiment of the present disclosure;
FIG. 5 is an effect contrast graph of image super resolution processing according to one embodiment of the present disclosure;
FIG. 6 is a flow chart of an image processing method according to one embodiment of the present disclosure;
FIG. 7 is a block diagram of a training apparatus of a deep learning model according to one embodiment of the present disclosure;
FIG. 8 is a block diagram of an image processing apparatus according to an embodiment of the present disclosure; and
FIG. 9 is a block diagram of an electronic device for the training method of a deep learning model and/or the image processing method according to one embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Image super-resolution processing may be implemented using deep learning. Deep learning-based implementations may include reference-based image super-resolution: given a low-resolution image X_LR and a high-resolution reference image Y_HR whose content is similar to X_LR, X_LR and Y_HR are input into the deep learning model to obtain the super-resolution image X_SR of X_LR. The deep learning model may be, for example, a Reference-based image Super-Resolution (RefSR) model.
Fig. 1 is a schematic diagram of a method of implementing image super-resolution processing using a deep learning model according to an embodiment of the related art.
As shown in fig. 1, the image 101 may be the low-resolution image X_LR to be super-resolved, and the image 102 may be the high-resolution reference image Y_HR. Image 101 has content similar to image 102, e.g., a similar scene or a similar person. Image 101 and image 102 are input into the RefSR model 110, and the RefSR model 110 may output image 103, which is the super-resolution image X_SR of image 101.
Because reference-based image super-resolution has a high-definition image with similar content as a reference, its results for a low-resolution image are greatly improved over traditional approaches (e.g., interpolation). In practical applications, however, the effect of reference-based super-resolution is still limited, and it is difficult to satisfy scenarios with higher definition requirements.
Therefore, embodiments of the present disclosure provide a training method for a deep learning model and a method of performing image super-resolution with the model, in which each training iteration includes two super-resolution passes. The two passes promote each other, further improving the super-resolution effect.
In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, and disclosure of the user's personal information all comply with the relevant laws and regulations, and do not violate public order and good customs.
In the technical scheme of the disclosure, the authorization or consent of the user is obtained before the personal information of the user is obtained or acquired.
Fig. 2A is a flow chart of a training method of a deep learning model according to one embodiment of the present disclosure.
As shown in fig. 2A, the training method 200 of the deep learning model may include operations S210 to S240.
In operation S210, a first sample image and a first reference image are input into a deep learning model, resulting in a first super-resolution image.
For example, the first sample image may be a low-resolution image X_LR, and the first reference image may be a high-resolution image Y_HR, where X_LR and Y_HR have similar content, e.g., a similar scene or a similar person. X_LR and Y_HR are input into a deep learning model (e.g., a RefSR model), which can use the rich texture from Y_HR to supplement the details missing in X_LR, thereby outputting the super-resolution image X_SR of X_LR (the first super-resolution image).
Here, the resolution of X_SR is higher than that of X_LR, and the sharpness of X_SR is also higher than that of X_LR.
In operation S220, the first super-resolution image and the second sample image are input into a deep learning model, resulting in a second super-resolution image.
It will be appreciated that if the quality of X_SR is sufficiently high (e.g., its resolution is greater than 1024×1000), X_SR can itself serve as a new reference image for super-resolving a new low-resolution image whose content is similar to X_SR.
For example, the new low-resolution image is the second sample image, which may be a transformed image of Y_HR. Because X_SR and Y_HR have similar content, Y_HR (or an image derived from it, e.g., a compressed version of Y_HR) can be used directly for the second super-resolution pass without introducing a new data set.
For example, the second sample image may be a first transformed image Ỹ_LR, obtained by first applying low-resolution processing to Y_HR to obtain the low-resolution compressed image Y_LR, and then perspective-transforming the compressed image Y_LR.
For example, the second sample image Ỹ_LR and the first super-resolution image X_SR (as the reference image of the second sample image) are input into the RefSR model for the second super-resolution pass. The RefSR model can use the rich texture from X_SR to supplement the details missing in Ỹ_LR, thereby outputting the super-resolution image Ỹ_SR of Ỹ_LR (the second super-resolution image).
Perspective transformation can be understood as projecting an image onto a new viewing plane: the two-dimensional coordinates of the image are lifted into three dimensions, and the three-dimensional coordinates are then mapped onto another two-dimensional plane (the new viewing plane). Perspective-transforming an image changes its viewing angle.
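In standard notation (the patent states the idea only in words), a perspective transformation maps pixel coordinates (x, y) through a 3×3 homography matrix H, with the last entry normalized to 1:

$$
\begin{bmatrix} x'w \\ y'w \\ w \end{bmatrix}
=
\begin{bmatrix}
h_{11} & h_{12} & h_{13}\\
h_{21} & h_{22} & h_{23}\\
h_{31} & h_{32} & 1
\end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix},
\qquad
(x', y') = \left( \frac{h_{11}x + h_{12}y + h_{13}}{h_{31}x + h_{32}y + 1},\ \frac{h_{21}x + h_{22}y + h_{23}}{h_{31}x + h_{32}y + 1} \right).
$$

The eight unknowns h_11 through h_32 are fixed by four point correspondences, which is why the four vertex pairs used below in FIGS. 3A-3C suffice to determine the transformation matrix.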
In operation S230, a first loss is determined from the first super-resolution image, and a second loss is determined from the second super-resolution image.
For example, the above two super-resolution passes may be optimized synchronously. The image quality (e.g., sharpness or resolution) of the second super-resolution image Ỹ_SR depends on the first super-resolution image X_SR, so the second pass pushes the image quality of X_SR to improve, and an improved X_SR in turn improves the image quality of Ỹ_SR. The two passes therefore form a reciprocal process.
The loss of the deep learning model (e.g., the RefSR model) may include two parts. The first partial loss L1 is determined by the result of the first pass; for example, L1 may be an error, such as the mean square error or cross entropy, between the first super-resolution image X_SR and the supervision image of the first sample image X_LR (e.g., X_HR). The second partial loss L2 is determined by the result of the second pass; for example, L2 may be an error, such as the mean square error or cross entropy, between the second super-resolution image Ỹ_SR and the supervision image Ỹ_HR of the second sample image Ỹ_LR, where Ỹ_HR is obtained, for example, by perspective-transforming the first reference image Y_HR.
In operation S240, parameters of the deep learning model are adjusted according to the first loss and the second loss.
For example, the sum of the two losses (L1+L2) can be used as the overall loss of the deep learning model to adjust the parameters of the deep learning model. The training process described above involving two superscore processes may be referred to as a reciprocal learning process. It will be appreciated that the overall loss of the deep learning model may also be obtained by weighting L1 and L2 according to the actual application, which is not limited by the present disclosure.
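As a concrete illustration, here is a minimal Python sketch of this combined loss, assuming PyTorch and mean-square-error supervision; the function and variable names are illustrative, not from the patent:

```python
import torch.nn.functional as F

def reciprocal_loss(x_sr, x_hr, y_sr_t, y_hr_t, w1=1.0, w2=1.0):
    l1 = F.mse_loss(x_sr, x_hr)      # L1: first pass vs. its supervision image X_HR
    l2 = F.mse_loss(y_sr_t, y_hr_t)  # L2: second pass vs. the transformed supervision Ỹ_HR
    return w1 * l1 + w2 * l2         # plain sum when w1 = w2 = 1.0
```

With w1 = w2 = 1.0 this reduces to the plain sum L1 + L2; other weightings correspond to the weighted variant mentioned above.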
In the embodiments of the present disclosure, each training iteration includes two super-resolution passes: the result of the first pass serves as a new reference image for the second pass over a new low-resolution image. The two passes promote each other, further improving the super-resolution performance of the deep learning model.
Fig. 2B is a flow chart of a training method of a deep learning model according to another embodiment of the present disclosure.
As shown in fig. 2B, the training method of the deep learning model may include operations S210 to S240 and operations S201 to S202.
For the specific implementation of operations S210 to S240, refer to the description of fig. 2A above.
Operations S201 to S202 are steps of generating a second sample image.
In operation S201, a resolution reduction process is performed on a first reference image to obtain a compressed image.
In operation S202, a perspective transformation is performed on the compressed image, resulting in a first transformed image as a second sample image.
For example, the first reference image Y_HR is first subjected to low-resolution processing (e.g., a compression operation) to obtain the compressed image Y_LR, and the compressed image Y_LR is then perspective-transformed to obtain the first transformed image Ỹ_LR, which serves as the second sample image.
It will be appreciated that the compressed image Y_LR is not used directly as the second sample image; instead, the first transformed image Ỹ_LR obtained by perspective-transforming Y_LR is used. This is because the reference image of the second super-resolution pass is X_SR (the result of the first pass). If Y_LR were used directly as the second sample image, the goal of the second pass would be to restore Y_LR using X_SR, i.e., to reproduce Y_HR from X_SR as faithfully as possible, while the first pass uses Y_HR to generate X_SR. The two passes together would then consist of: step one, generating X_SR from Y_HR; step two, regenerating Y_HR from X_SR. Such a pair of passes would be equivalent to an autoencoder over Y: with X_SR as the reference image of step two, step one would preserve as much of Y_HR as possible to ease the reconstruction in step two, and the first pass would lose its intended purpose of restoring X_LR to generate X_SR.
According to an embodiment of the disclosure, the training method of the deep learning model further includes performing perspective transformation on the first reference image to obtain a second transformed image. Operation S230 includes determining a second loss according to a difference between the second transformed image and the second super-resolution image.
For example, the supervision image of the second super-resolution result Ỹ_SR may be Ỹ_HR, which may be obtained by perspective-transforming the first reference image Y_HR. The second loss may then be, for example, the mean square error or cross entropy between Ỹ_SR and Ỹ_HR.
The perspective transformation operation provided by the present disclosure is described in detail below with reference to fig. 3A to 3C.
Fig. 3A-3C are schematic diagrams of a method of perspective transforming an image according to one embodiment of the present disclosure.
As shown in fig. 3A, for image 310, a sub-image may be truncated from image 310 using a randomly selected solid rectangular box 301. The truncated sub-image 320 is shown in fig. 3B. A specific implementation of perspective transformation of the sub-image 320 is as follows.
For each vertex of the solid-line rectangular frame 301 (the four vertices of frame 301 are also the four vertices of sub-image 320), two areas with different ranges are determined centered on that vertex, and the area between the two is taken as the target area 303 (e.g., the gray transparent area in fig. 3A). For example, the two areas may be rectangular: with the vertex as the origin, the larger rectangle spans [-20, 20] on both the horizontal and vertical axes, and the smaller rectangle spans [-5, 5], so the target area 303 spans [-20, -5] ∪ [5, 20] on each axis. The two areas may also take other shapes, such as annular regions. Similarly, the target areas 303 of the other three vertices can be determined; as shown in fig. 3A, each vertex of the solid-line rectangular frame 301 determines one target area 303, giving four target areas 303 in total.
For each vertex of the solid-line rectangular frame 301, a new vertex is determined by randomly moving the vertex within its target area 303. For example, referring to fig. 3A, each vertex moves randomly within the target area 303: it must not move beyond the outer boundary of the gray transparent area, nor into the white region inside it. Each vertex of the solid-line rectangular frame 301 thus determines one new vertex, for four new vertices in total. The four new vertices define a quadrilateral frame, such as the dashed-line frame 302 shown in fig. 3A.
For example, a perspective transformation matrix may be determined based on the mapping relationship between the vertices of the solid rectangular box 301 and the vertices of the dashed box 302. From this perspective transformation matrix, the sub-image 320 determined by the solid rectangular box 301 may be perspective transformed, resulting in a perspective transformed image 330 of the sub-image 320, as shown in fig. 3C. For example, perspective transformation of the sub-image 320 may be achieved by multiplying the coordinates of each pixel point in the sub-image 320 with a perspective transformation matrix.
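The following is a minimal Python sketch of this vertex-perturbation perspective transformation, assuming OpenCV and NumPy. The inner/outer ranges 5 and 20 follow the example above; the function name and sampling details are illustrative:

```python
import cv2
import numpy as np

def random_perspective(img, inner=5, outer=20, rng=None):
    """Randomly move each corner within [-outer, -inner] U [inner, outer]
    per axis (the gray target area), then warp with the resulting homography."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = img.shape[:2]
    src = np.float32([[0, 0], [w - 1, 0], [w - 1, h - 1], [0, h - 1]])

    def offset():
        # Sample a displacement at least `inner` and at most `outer` from the corner.
        mag = rng.uniform(inner, outer)
        return mag if rng.random() < 0.5 else -mag

    dst = np.float32([[x + offset(), y + offset()] for x, y in src])
    m = cv2.getPerspectiveTransform(src, dst)   # 3x3 homography from 4 vertex pairs
    return cv2.warpPerspective(img, m, (w, h)), m
```

Here cv2.getPerspectiveTransform solves for the 3×3 matrix from the four vertex correspondences, matching the matrix-determination step described above, and cv2.warpPerspective applies it to every pixel.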
The above describes an example implementation of perspective-transforming the sub-image 320. In embodiments of the present disclosure, the compressed image Y_LR needs to be perspective-transformed to obtain the first transformed image Ỹ_LR as the second sample image. The four vertices of the compressed image Y_LR correspond to the four vertices of the solid-line rectangular box 301 shown in fig. 3A, so the specific implementation of perspective-transforming the sub-image 320 shown in fig. 3A may be followed.
Similarly, in embodiments of the present disclosure, the first reference image Y_HR needs to be perspective-transformed to obtain the second transformed image Ỹ_HR, so that the second loss can be determined from the difference between the second transformed image Ỹ_HR and the second super-resolution image Ỹ_SR. Perspective-transforming the first reference image Y_HR to obtain Ỹ_HR may likewise follow the specific implementation of perspective-transforming the sub-image 320 shown in fig. 3A, which is not repeated here.
Fig. 4 is a schematic diagram of a training method of a deep learning model according to one embodiment of the present disclosure.
As shown in fig. 4, the image 401 may be the low-resolution image X_LR, and the image 402 may be the high-resolution first reference image Y_HR. Image 401 has content similar to image 402, e.g., a similar scene or a similar person. Image 401 and image 402 are input into the RefSR model 410, and the RefSR model 410 may output image 403, which is the super-resolution image X_SR of image 401.
The image 404 may be the first transformed image Ỹ_LR, obtained by first applying low-resolution processing to the first reference image Y_HR to obtain the low-resolution compressed image Y_LR, and then perspective-transforming the compressed image Y_LR. Image 403 may serve as a new reference image for image 404: image 403 and image 404 are input into the RefSR model 410, and the RefSR model 410 may output image 405, which is the super-resolution image Ỹ_SR of image 404.
It will be appreciated that the two RefSR models 410 shown in fig. 4 are the same RefSR model, which performs both of the super-resolution passes described above.
The loss of the RefSR model may include a first loss 411 and a second loss 412. The first loss 411 may be determined from the result image 403 of the first pass; for example, the first loss 411 may be an error, such as the mean square error or cross entropy, between image 403 and the supervision image of image 401 (e.g., X_HR). The second loss 412 may be determined from the result image 405 of the second pass; for example, the second loss 412 may be an error, such as the mean square error or cross entropy, between image 405 and the supervision image of image 404 (e.g., Ỹ_HR), where the supervision image Ỹ_HR of image 404 is obtained, for example, by perspective-transforming image 402.
The first loss 411 and the second loss 412 may together be used as a loss of the deep learning model to adjust parameters of the deep learning model. Parameters of RefSR model 410 are adjusted, for example, according to the sum of first penalty 411 and second penalty 412.
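Putting the two passes together, here is a condensed Python sketch of one training step of fig. 4, assuming a PyTorch RefSR model callable as model(lr, ref) -> sr; model, optimizer, and the make_second_sample helper are assumptions for illustration, not the patent's exact API:

```python
import torch.nn.functional as F

def train_step(model, optimizer, x_lr, x_hr, y_hr, make_second_sample):
    # First pass: super-resolve X_LR against the reference Y_HR (image 403).
    x_sr = model(x_lr, y_hr)
    # Build the second sample: downscale Y_HR to Y_LR (e.g., F.interpolate with
    # mode='bicubic'), then perspective-transform it; the same homography, rescaled
    # to full resolution, is applied to Y_HR to produce the supervision image Ỹ_HR.
    y_lr_t, y_hr_t = make_second_sample(y_hr)   # Ỹ_LR (image 404) and Ỹ_HR
    # Second pass: the first result X_SR serves as the new reference (image 405).
    y_sr_t = model(y_lr_t, x_sr)
    # Overall loss is the sum of the two partial losses, L1 + L2.
    loss = F.mse_loss(x_sr, x_hr) + F.mse_loss(y_sr_t, y_hr_t)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because both losses backpropagate through the shared model, improving the second pass also pushes the first pass to produce a more useful X_SR, which is the reciprocal-learning effect described above.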
According to the embodiments of the present disclosure, each training iteration includes two super-resolution passes that promote each other and form reciprocal learning, which can improve the super-resolution performance of the deep learning model. Moreover, in the two passes, the second pass operates on the perspective-transformed image Ỹ_LR rather than directly on the compressed image Y_LR. This avoids the deep learning model degenerating into an autoencoder over Y, which would degrade the first pass, and thus preserves the effect of the first super-resolution pass.
Fig. 5 is an effect contrast diagram of image super-resolution processing according to an embodiment of the present disclosure.
As shown in fig. 5, the image 501 is a low-resolution image (e.g., X_LR), and the image 502 is a high-resolution reference image (e.g., Y_HR). Image 503 is a partial-area image of image 501, e.g., the sub-image in the rectangular box in image 501. Image 504 is a partial-area image of image 502, e.g., the sub-image in the rectangular box in image 502. The rectangular box is at the same position in image 501 and image 502, and the content in image 503 is blurred compared with the content in image 504.
Image 505 is the supervision image of image 503. Super-resolution processing of image 501 aims to bring the image quality of image 503 (the partial-area image of image 501) as close as possible to that of image 505. Different ways of super-resolving image 501 improve the image quality of image 503 to different degrees.
For example, in one super-resolution approach, a deep learning model (e.g., a RefSR model) is trained so that each training iteration includes two super-resolution passes that promote each other and form reciprocal learning, but the second pass operates on a compressed version of the first pass's reference image (without the perspective transformation). After the trained model super-resolves image 501, the improvement of image 503 is shown as image 506. As shown in fig. 5, image 506 is sharper than image 503 but still shows noticeable shadows.
For example, in another approach, image 501 is super-resolved with a conventional reference-based image super-resolution model whose training does not include reciprocal learning. The improvement of image 503 is shown as image 507. As shown in fig. 5, image 507 is sharper than image 503, but the effect is limited and the content of image 507 is still hard to recognize.
For example, with the super-resolution approach provided by the present disclosure, a deep learning model (e.g., a RefSR model) is trained so that each training iteration includes two super-resolution passes that promote each other and form reciprocal learning, and the second pass operates on the perspective-transformed image derived from the first pass's reference image. After the trained model super-resolves image 501, the improvement of image 503 is shown as image 508. As shown in fig. 5, image 508 is further sharpened compared with images 503, 506, and 507.
Fig. 6 is a flowchart of an image processing method according to one embodiment of the present disclosure.
As shown in fig. 6, the image processing method 600 may include operations S610 to S620.
In operation S610, a to-be-processed image and a second reference image are acquired.
In operation S620, the image to be processed and the second reference image are input into the deep learning model, and a super-resolution image of the image to be processed is obtained.
The deep learning model is trained using the training method of the deep learning model described above.
For example, the image to be processed is a low-resolution image, and the second reference image is a high-resolution image with content similar to the image to be processed. Inputting the image to be processed and the second reference image into the deep learning model yields a high-quality super-resolution image of the image to be processed.
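At inference time the flow is a single pass. A minimal sketch under the same assumptions as above (model is the trained RefSR model; the (N, C, H, W) tensor inputs are illustrative):

```python
import torch

@torch.no_grad()
def super_resolve(model, image_to_process, reference_image):
    # image_to_process: low-resolution (N, C, H, W) tensor;
    # reference_image: high-resolution tensor with similar content.
    model.eval()
    return model(image_to_process, reference_image)
```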
Fig. 7 is a block diagram of a training apparatus of a deep learning model according to one embodiment of the present disclosure.
As shown in fig. 7, the training apparatus 700 of the deep learning model includes a first super-resolution module 701, a second super-resolution module 702, a determining module 703, and an adjusting module 704.
The first super-resolution module 701 is configured to input the first sample image and the first reference image into a deep learning model to obtain a first super-resolution image.
The second super-resolution module 702 is configured to input the first super-resolution image and the second sample image into the deep learning model to obtain a second super-resolution image.
The determining module 703 is configured to determine a first loss according to the first super-resolution image and a second loss according to the second super-resolution image.
The adjusting module 704 is configured to adjust parameters of the deep learning model according to the first loss and the second loss.
According to an embodiment of the present disclosure, the training apparatus 700 of the deep learning model further includes a first processing module and a first transformation module.
The first processing module is used for carrying out resolution reduction processing on the first reference image to obtain a compressed image.
The first transformation module is used for performing perspective transformation on the compressed image to obtain a first transformation image serving as a second sample image.
The first transformation module includes a first determination unit, a second determination unit, a first movement unit, a third determination unit, and a first transformation unit.
The first determination unit is configured to determine two first regions having different ranges from each other centering on the vertex of the compressed image.
The second determination unit is configured to determine, as a first target area, an area between two first areas different in range from each other.
The first moving unit is used for moving the vertex of the compressed image to a first target area to obtain a first vertex.
The third determination unit is configured to determine a first transformation matrix from the vertices of the compressed image and the first vertices.
The first transformation unit is used for performing perspective transformation on the compressed image according to the first transformation matrix to obtain a first transformation image.
The training apparatus 700 of the deep learning model further comprises a second transformation module.
The second transformation module is configured to perform perspective transformation on the first reference image to obtain a second transformed image.
The determining module 703 is configured to determine the second loss according to a difference between the second transformed image and the second super-resolution image.
According to an embodiment of the present disclosure, the second transformation module includes a fourth determination unit, a fifth determination unit, a second movement unit, a sixth determination unit, and a second transformation unit.
A fourth determining unit configured to determine two second regions having different ranges from each other centering on the vertex of the first reference image;
a fifth determining unit configured to determine, as a second target area, an area between two second areas different in range from each other;
the second moving unit is used for moving the vertex of the first reference image to a second target area to obtain a second vertex;
a sixth determining unit configured to determine a second transformation matrix according to the vertex of the first reference image and the second vertex; and
and the second transformation unit is used for performing perspective transformation on the first reference image according to the second transformation matrix to obtain a second transformation image.
Fig. 8 is a block diagram of an image processing apparatus according to one embodiment of the present disclosure.
As shown in fig. 8, the image processing apparatus 800 may include an acquisition module 801 and a second processing module 802.
The acquiring module 801 is configured to acquire an image to be processed and a second reference image.
The second processing module 802 is configured to input the image to be processed and the second reference image into a deep learning model, so as to obtain a super-resolution image of the image to be processed.
The deep learning model is obtained through training according to the training device of the deep learning model.
According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.
Fig. 9 shows a schematic block diagram of an example electronic device 900 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 9, the apparatus 900 includes a computing unit 901 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 902 or a computer program loaded from a storage unit 908 into a Random Access Memory (RAM) 903. In the RAM 903, various programs and data required for the operation of the device 900 can also be stored. The computing unit 901, the ROM 902, and the RAM 903 are connected to each other by a bus 904. An input/output (I/O) interface 905 is also connected to the bus 904.
Various components in device 900 are connected to I/O interface 905, including: an input unit 906 such as a keyboard, a mouse, or the like; an output unit 907 such as various types of displays, speakers, and the like; a storage unit 908 such as a magnetic disk, an optical disk, or the like; and a communication unit 909 such as a network card, modem, wireless communication transceiver, or the like. The communication unit 909 allows the device 900 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunications networks.
The computing unit 901 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 901 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 901 performs the respective methods and processes described above, for example, a training method of a deep learning model and/or an image processing method. For example, in some embodiments, the training method and/or image processing method of the deep learning model may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 908. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 900 via the ROM 902 and/or the communication unit 909. When the computer program is loaded into the RAM 903 and executed by the computing unit 901, one or more steps of the training method and/or the image processing method of the deep learning model described above may be performed. Alternatively, in other embodiments, the computing unit 901 may be configured to perform the training method and/or the image processing method of the deep learning model by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described here above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special purpose or general-purpose programmable processor, that may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a background component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), wide Area Networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the disclosed aspects are achieved, and are not limited herein.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (12)

1. A training method of a deep learning model, comprising:
inputting a first sample image and a first reference image into a deep learning model to obtain a first super-resolution image, wherein the first sample image and the first reference image have similar contents, and the resolution of the first sample image is lower than that of the first reference image;
inputting the first super-resolution image and a second sample image into the deep learning model to obtain a second super-resolution image;
determining a first loss according to the first super-resolution image and determining a second loss according to the second super-resolution image; and
adjusting parameters of the deep learning model according to the first loss and the second loss;
the method further comprises the steps of:
performing resolution reduction processing on the first reference image to obtain a compressed image; and
and performing perspective transformation on the compressed image to obtain a first transformation image serving as the second sample image.
2. The method of claim 1, wherein the perspective transforming the compressed image to obtain a first transformed image comprises:
determining two first areas with different ranges from each other by taking the vertex of the compressed image as the center;
determining a region between two first regions different in range from each other as a first target region;
moving the vertex of the compressed image to the first target area to obtain a first vertex;
determining a first transformation matrix according to the vertexes of the compressed image and the first vertexes; and
and performing perspective transformation on the compressed image according to the first transformation matrix to obtain the first transformation image.
3. The method of claim 1, further comprising:
performing perspective transformation on the first reference image to obtain a second transformed image;
the determining a second loss from the second superdivision image includes:
determining the second loss based on a difference between the second transformed image and the second super-resolution image.
4. A method according to claim 3, wherein said performing a perspective transformation on said first reference image to obtain a second transformed image comprises:
determining two second areas with different ranges from each other by taking the vertex of the first reference image as the center;
determining a region between two second regions different from each other in range as a second target region;
moving the vertex of the first reference image to the second target area to obtain a second vertex;
determining a second transformation matrix according to the vertexes of the first reference image and the second vertexes; and
and performing perspective transformation on the first reference image according to the second transformation matrix to obtain the second transformation image.
5. An image processing method, comprising:
acquiring an image to be processed and a second reference image; and
inputting the image to be processed and the second reference image into a deep learning model to obtain a super-resolution image of the image to be processed;
wherein the deep learning model is trained in accordance with the method of any one of claims 1 to 4.
6. A training device for a deep learning model, comprising:
the first super-division module is used for inputting a first sample image and a first reference image into the deep learning model to obtain a first super-division image, wherein the first sample image and the first reference image have similar contents, and the resolution of the first sample image is lower than that of the first reference image;
the second superdivision module is used for inputting the first superdivision image and the second sample image into the deep learning model to obtain a second superdivision image;
the determining module is used for determining a first loss according to the first superdivision image and determining a second loss according to the second superdivision image; and
the adjusting module is used for adjusting parameters of the deep learning model according to the first loss and the second loss;
the apparatus further comprises:
the first processing module is used for carrying out resolution reduction processing on the first reference image to obtain a compressed image;
and the first transformation module is used for performing perspective transformation on the compressed image to obtain a first transformation image serving as the second sample image.
7. The apparatus of claim 6, wherein the first transformation module comprises:
a first determination unit configured to determine two first regions having different ranges from each other with a vertex of the compressed image as a center;
a second determination unit configured to determine, as a first target area, an area between two first areas different in range from each other;
the first moving unit is used for moving the vertex of the compressed image to the first target area to obtain a first vertex;
a third determining unit configured to determine a first transformation matrix according to the vertex of the compressed image and the first vertex; and
and the first transformation unit is used for performing perspective transformation on the compressed image according to the first transformation matrix to obtain the first transformation image.
8. The apparatus of claim 6, further comprising:
the second transformation module is used for performing perspective transformation on the first reference image to obtain a second transformation image;
the determining module is configured to determine the second loss according to a difference between the second transformed image and the second super-resolution image.
9. The apparatus of claim 8, wherein the second transformation module comprises:
a fourth determining unit configured to determine two second regions having different ranges from each other with a vertex of the first reference image as a center;
a fifth determining unit configured to determine, as a second target area, an area between two second areas different in range from each other;
a second moving unit, configured to move the vertex of the first reference image to the second target area, to obtain a second vertex;
a sixth determining unit configured to determine a second transformation matrix according to the vertex of the first reference image and the second vertex; and
and the second transformation unit is used for performing perspective transformation on the first reference image according to the second transformation matrix to obtain the second transformation image.
10. An image processing apparatus comprising:
the acquisition module is used for acquiring the image to be processed and the second reference image; and
the second processing module is used for inputting the image to be processed and the second reference image into a deep learning model to obtain a super-resolution image of the image to be processed;
wherein the deep learning model is trained in accordance with the apparatus of any one of claims 6 to 9.
11. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1 to 5.
12. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of any one of claims 1 to 5.
CN202210855822.2A 2022-07-15 2022-07-15 Training method, image processing method, device and equipment for deep learning model Active CN115147280B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210855822.2A CN115147280B (en) 2022-07-15 2022-07-15 Training method, image processing method, device and equipment for deep learning model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210855822.2A CN115147280B (en) 2022-07-15 2022-07-15 Training method, image processing method, device and equipment for deep learning model

Publications (2)

Publication Number Publication Date
CN115147280A CN115147280A (en) 2022-10-04
CN115147280B 2023-06-02

Family

ID=83411641

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210855822.2A Active CN115147280B (en) 2022-07-15 2022-07-15 Training method, image processing method, device and equipment for deep learning model

Country Status (1)

Country Link
CN (1) CN115147280B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113313633A (en) * 2021-06-25 2021-08-27 西安紫光展锐科技有限公司 Training method and device of hyper-division network model and electronic equipment
CN114627343A (en) * 2022-03-14 2022-06-14 北京百度网讯科技有限公司 Deep learning model training method, image processing method, device and equipment

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11869170B2 (en) * 2018-11-16 2024-01-09 Google Llc Generating super-resolution images using neural networks
US11521131B2 (en) * 2019-01-24 2022-12-06 Jumio Corporation Systems and methods for deep-learning based super-resolution using multiple degradations on-demand learning
CN111182254B (en) * 2020-01-03 2022-06-24 北京百度网讯科技有限公司 Video processing method, device, equipment and storage medium
CN113554047B (en) * 2020-04-24 2024-08-23 京东方科技集团股份有限公司 Training method of image processing model, image processing method and corresponding device
US20220138500A1 (en) * 2020-10-30 2022-05-05 Samsung Electronics Co., Ltd. Unsupervised super-resolution training data construction
CN113191495A (en) * 2021-03-26 2021-07-30 网易(杭州)网络有限公司 Training method and device for hyper-resolution model and face recognition method and device, medium and electronic equipment
CN113362229B (en) * 2021-07-06 2022-07-22 北京百度网讯科技有限公司 Training method of image processing model, image processing method, device and equipment
CN113888410A (en) * 2021-09-30 2022-01-04 北京百度网讯科技有限公司 Image super-resolution method, apparatus, device, storage medium, and program product
CN113628116B (en) * 2021-10-12 2022-02-11 腾讯科技(深圳)有限公司 Training method and device for image processing network, computer equipment and storage medium
CN114022359A (en) * 2021-11-03 2022-02-08 深圳大学 Image super-resolution model training method and device, storage medium and equipment

Also Published As

Publication number Publication date
CN115147280A (en) 2022-10-04

Similar Documents

Publication Publication Date Title
CN112785674B (en) Texture map generation method, rendering device, equipment and storage medium
US11436702B2 (en) Systems and methods for super-resolusion image reconstruction
JP2020507850A (en) Method, apparatus, equipment, and storage medium for determining the shape of an object in an image
US10402941B2 (en) Guided image upsampling using bitmap tracing
CN113688907B (en) A model training and video processing method, which comprises the following steps, apparatus, device, and storage medium
CN112967381B (en) Three-dimensional reconstruction method, apparatus and medium
CN110889809A (en) Image processing method and device, electronic device and storage medium
CN113393468A (en) Image processing method, model training device and electronic equipment
CN113313631A (en) Image rendering method and device
CN115147280B (en) Training method, image processing method, device and equipment for deep learning model
CN111784733A (en) Image processing method, device, terminal and computer readable storage medium
CN115174774B (en) Depth image compression method, device, equipment and storage medium
EP4227904A2 (en) Method and apparatus for determining image depth information, electronic device, and media
CN116309158A (en) Training method, three-dimensional reconstruction method, device, equipment and medium of network model
CN114549303B (en) Image display method, image processing method, image display device, image processing apparatus, image display device, image processing program, and storage medium
CN115375539A (en) Image resolution enhancement, multi-frame image super-resolution system and method
CN113920027A (en) Method for rapidly enhancing sequence image based on bidirectional projection
CN113470131B (en) Sea surface simulation image generation method and device, electronic equipment and storage medium
CN114119419B (en) Image processing method, device, electronic equipment and storage medium
CN116363331B (en) Image generation method, device, equipment and storage medium
CN118521720B (en) Virtual person three-dimensional model determining method and device based on sparse view angle image
CN114820908B (en) Virtual image generation method and device, electronic equipment and storage medium
CN115439331B (en) Corner correction method and generation method and device of three-dimensional model in meta universe
CN118864326A (en) Image correction method, device, electronic equipment and storage medium
CN117336619A (en) Color balance method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant