CN115147280B - Training method, image processing method, device and equipment for deep learning model
- Publication number
- CN115147280B (application CN202210855822.2A)
- Authority
- CN
- China
- Prior art keywords
- image
- transformation
- deep learning
- learning model
- vertex
- Prior art date
- Legal status: Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4053—Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
- G06T3/4076—Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution using the original low-resolution images to iteratively correct the high-resolution images
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Abstract
The disclosure provides a training method for a deep learning model, and relates to the technical field of artificial intelligence, in particular to the fields of image processing, computer vision, and deep learning. The specific implementation scheme is as follows: input a first sample image and a first reference image into a deep learning model to obtain a first super-resolution image; input the first super-resolution image and a second sample image into the deep learning model to obtain a second super-resolution image; determine a first loss from the first super-resolution image and a second loss from the second super-resolution image; and adjust parameters of the deep learning model according to the first loss and the second loss. The disclosure also provides an image processing method, an image processing apparatus, an electronic device, and a storage medium.
Description
Technical Field
The present disclosure relates to the field of artificial intelligence, and more particularly to image processing, computer vision, and deep learning techniques. More specifically, the present disclosure provides a training method, an image processing method, an apparatus, an electronic device, and a storage medium for a deep learning model.
Background
Image super-resolution processing takes a low-resolution image as input and produces a high-quality, high-resolution image as the desired output. It can be implemented with deep learning.
Disclosure of Invention
The disclosure provides a training method for a deep learning model, an image processing method, an apparatus, a device, and a storage medium.
According to a first aspect, there is provided a training method for a deep learning model, the method comprising: inputting a first sample image and a first reference image into a deep learning model to obtain a first super-resolution image; inputting the first super-resolution image and a second sample image into the deep learning model to obtain a second super-resolution image; determining a first loss from the first super-resolution image and a second loss from the second super-resolution image; and adjusting parameters of the deep learning model according to the first loss and the second loss.
According to a second aspect, there is provided an image processing method comprising: acquiring an image to be processed and a second reference image; and inputting the image to be processed and the second reference image into a deep learning model to obtain a super-resolution image of the image to be processed, wherein the deep learning model is trained according to the training method above.
According to a third aspect, there is provided a training apparatus for a deep learning model, the apparatus comprising: a first super-resolution module for inputting the first sample image and the first reference image into the deep learning model to obtain a first super-resolution image; a second super-resolution module for inputting the first super-resolution image and the second sample image into the deep learning model to obtain a second super-resolution image; a determination module for determining a first loss from the first super-resolution image and a second loss from the second super-resolution image; and an adjustment module for adjusting parameters of the deep learning model according to the first loss and the second loss.
According to a fourth aspect, there is provided an image processing apparatus comprising: an acquisition module for acquiring the image to be processed and a second reference image; and a second processing module for inputting the image to be processed and the second reference image into a deep learning model to obtain a super-resolution image of the image to be processed, wherein the deep learning model is trained by the training apparatus above.
According to a fifth aspect, there is provided an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method provided in accordance with the present disclosure.
According to a sixth aspect, there is provided a non-transitory computer readable storage medium storing computer instructions for causing a computer to perform a method provided according to the present disclosure.
According to a seventh aspect, there is provided a computer program product comprising a computer program which, when executed by a processor, implements a method provided according to the present disclosure.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a schematic diagram of a method of implementing image super-resolution processing using a deep learning model according to one embodiment of the related art;
FIG. 2A is a flow chart of a training method of a deep learning model according to one embodiment of the present disclosure;
FIG. 2B is a flow chart of a training method of a deep learning model according to another embodiment of the present disclosure;
FIGS. 3A-3C are schematic diagrams of a method of perspective transforming an image according to one embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a training method of a deep learning model according to one embodiment of the present disclosure;
FIG. 5 is an effect contrast graph of image super resolution processing according to one embodiment of the present disclosure;
FIG. 6 is a flow chart of an image processing method according to one embodiment of the present disclosure;
FIG. 7 is a block diagram of a training apparatus of a deep learning model according to one embodiment of the present disclosure;
FIG. 8 is a block diagram of an image processing apparatus according to one embodiment of the present disclosure; and
FIG. 9 is a block diagram of an electronic device for the training method of a deep learning model and/or the image processing method according to one embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Image super-resolution processing may be implemented with deep learning. Deep learning-based implementations include reference-based image super-resolution: for a low-resolution image X_LR, given a high-resolution reference image Y_HR whose content is similar to X_LR, both X_LR and Y_HR are input into a deep learning model to obtain the super-resolution image X_SR of X_LR. The deep learning model may be, for example, a Reference-based image Super-Resolution (RefSR) model.
Fig. 1 is a schematic diagram of a method of implementing image super-resolution processing using a deep learning model according to an embodiment of the related art.
As shown in FIG. 1, the image 101 may be a low-resolution image X_LR to be super-resolved, and the image 102 may be a high-resolution reference image Y_HR. Image 101 has content similar to image 102, e.g., a similar scene or similar people. When image 101 and image 102 are input into the RefSR model 110, the RefSR model 110 may output image 103, the super-resolution image X_SR of image 101.
Reference-based image super-resolution processing uses a high-definition image with similar content as a reference, so its results on low-resolution images are greatly improved compared with traditional approaches (such as interpolation). In practice, however, the super-resolution effect of reference-based processing is still limited, and it is difficult to satisfy scenarios with higher definition requirements.
Therefore, embodiments of the present disclosure provide a training method for a deep learning model and a method of implementing image super-resolution processing with the deep learning model, in which each training iteration includes two super-resolution passes that mutually reinforce each other, further improving the super-resolution effect.
In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, and disclosure of users' personal information comply with relevant laws and regulations and do not violate public order and good customs.
In the technical solution of the present disclosure, the user's authorization or consent is obtained before the user's personal information is obtained or collected.
Fig. 2A is a flow chart of a training method of a deep learning model according to one embodiment of the present disclosure.
As shown in fig. 2A, the training method 200 of the deep learning model may include operations S210 to S240.
In operation S210, a first sample image and a first reference image are input into a deep learning model, resulting in a first super-resolution image.
For example, the first sample image may be a low-resolution image X_LR and the first reference image may be a high-resolution image Y_HR, where X_LR and Y_HR have similar content, such as similar scenes or similar people. X_LR and Y_HR are input into a deep learning model (e.g., a RefSR model), which can use the rich texture of Y_HR to supplement the details missing from X_LR and thereby output the super-resolution image X_SR of X_LR (the first super-resolution image).
The resolution of X_SR is higher than that of X_LR, and the sharpness of X_SR is also higher than that of X_LR.
In operation S220, the first super-resolution image and the second sample image are input into a deep learning model, resulting in a second super-resolution image.
It will be appreciated that if the resolution of X_SR is sufficiently high (e.g., greater than 1024×1000), X_SR can itself serve as a new reference image for super-resolving a new low-resolution image whose content is similar to X_SR.
For example, the new low-resolution image is the second sample image, which may be a transformed image of Y_HR. Because X_SR and Y_HR have similar content, Y_HR (or an image derived from it) can be used directly for the second super-resolution pass without introducing a new data set.
For example, the second sample image may be a first transformed image Ỹ_LR (the tilde marks perspective-transformed images throughout), obtained by first down-sampling Y_HR into a low-resolution compressed image Y_LR and then applying a perspective transformation to Y_LR.
For example, the second sample image Ỹ_LR and the first super-resolution image X_SR (serving as the reference image for the second sample image) are input into the RefSR model for a second super-resolution pass; the RefSR model can use the rich texture of X_SR to supplement the details missing from Ỹ_LR, thereby outputting the super-resolution image Ỹ_SR of Ỹ_LR (the second super-resolution image).
Perspective transformation can be understood as projecting an image onto a new viewing plane: the two-dimensional coordinates of the image are lifted into three dimensions and then mapped onto another two-dimensional plane (the new viewing plane). Perspective transformation changes the viewing angle of the image.
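In homogeneous coordinates, such a perspective (projective) transformation of the plane can be written with a 3×3 matrix, a standard formulation added here for clarity rather than notation taken from the patent:

$$
\begin{pmatrix} x' \\ y' \\ w' \end{pmatrix}
= \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix}
\begin{pmatrix} x \\ y \\ 1 \end{pmatrix},
\qquad
(u, v) = \left( \frac{x'}{w'},\ \frac{y'}{w'} \right)
$$

where (x, y) is a pixel coordinate in the source image and (u, v) is the corresponding coordinate on the new viewing plane.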
In operation S230, a first loss is determined from the first super-resolution image, and a second loss is determined from the second super-resolution image.
For example, the two super-resolution passes described above may be optimized jointly. The image quality (e.g., sharpness or resolution) of the second super-resolution image Ỹ_SR depends on the first super-resolution image X_SR, so optimizing the second pass pushes the quality of X_SR to improve, and a better X_SR in turn improves the quality of Ỹ_SR; the two passes therefore form a reciprocal process.
The loss of the deep learning model (e.g., the RefSR model) may include two parts. The first part is determined by the result of the first pass: for example, the first loss L1 may be an error, such as the mean squared error or cross entropy, between the first super-resolution image X_SR and the supervision image of the first sample image X_LR (e.g., X_HR). The second part is determined by the result of the second pass: for example, the second loss L2 may be an error, such as the mean squared error or cross entropy, between the second super-resolution image Ỹ_SR and the supervision image of the second sample image Ỹ_LR (e.g., Ỹ_HR), where the supervision image Ỹ_HR is obtained, for example, from the first reference image Y_HR through perspective transformation.
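Taking the mean squared error as the per-pass criterion, the two losses and the overall loss can be sketched as follows (the supervision images X_HR and Ỹ_HR are as described above; the patent equally allows cross entropy or a weighted combination):

$$
L_1 = \lVert X_{SR} - X_{HR} \rVert_2^2,
\qquad
L_2 = \lVert \tilde{Y}_{SR} - \tilde{Y}_{HR} \rVert_2^2,
\qquad
L = L_1 + L_2
$$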
In operation S240, parameters of the deep learning model are adjusted according to the first loss and the second loss.
For example, the sum of the two losses (L1 + L2) can be used as the overall loss of the deep learning model to adjust its parameters. The training process described above, which involves two super-resolution passes, may be referred to as reciprocal learning. It will be appreciated that the overall loss may also be obtained by weighting L1 and L2 according to the actual application, which is not limited by the present disclosure.
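A minimal PyTorch-style sketch of one reciprocal training step follows. The `RefSRModel` call signature, the scale factor, and the helpers `random_perspective` and `apply_perspective` are illustrative assumptions, not the patent's actual implementation:

```python
import torch
import torch.nn.functional as F

def reciprocal_training_step(model, optimizer, x_lr, x_hr, y_hr, scale=4):
    """One training iteration with two mutually reinforcing super-resolution passes."""
    # First pass: super-resolve the sample image guided by the high-resolution reference.
    x_sr = model(x_lr, reference=y_hr)

    # Build the second sample: downscale the reference, then perspective-transform it.
    y_lr = F.interpolate(y_hr, scale_factor=1.0 / scale, mode="bicubic")
    y_lr_t, matrix = random_perspective(y_lr)        # transformed low-res input (assumed helper)
    y_hr_t = apply_perspective(y_hr, matrix, scale)  # same transform, rescaled to HR (assumed helper)

    # Second pass: the first result serves as the new reference image.
    y_sr_t = model(y_lr_t, reference=x_sr)

    # Two-part loss; the patent also allows weighting the two terms.
    loss = F.mse_loss(x_sr, x_hr) + F.mse_loss(y_sr_t, y_hr_t)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```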
In embodiments of the present disclosure, each training iteration includes two super-resolution passes: the result of the first pass serves as the new reference image for a second pass on a new low-resolution image. The two passes reinforce each other, further improving the super-resolution effect of the deep learning model.
Fig. 2B is a flow chart of a training method of a deep learning model according to another embodiment of the present disclosure.
As shown in fig. 2B, the training method of the deep learning model may include operations S210 to S240 and operations S201 to S202.
For the specific implementation of operations S210 to S240, refer to the description of FIG. 2A above.
Operations S201 to S202 are steps of generating a second sample image.
In operation S201, a resolution reduction process is performed on a first reference image to obtain a compressed image.
In operation S202, a perspective transformation is performed on the compressed image, resulting in a first transformed image as a second sample image.
For example, the first reference image Y_HR is subjected to low-resolution processing (e.g., a compression operation) to obtain a compressed image Y_LR, and a perspective transformation is then applied to Y_LR to obtain the first transformed image Ỹ_LR, which serves as the second sample image.
It will be appreciated that the compressed image Y_LR is not used directly as the second sample image; instead, the first transformed image Ỹ_LR obtained by perspective-transforming Y_LR is used. The reason is that the reference image of the second pass is X_SR, the result of the first pass. If Y_LR were used directly as the second sample image, the goal of the second pass would be to restore Y_LR using X_SR (that is, to use X_SR to reproduce Y_HR as faithfully as possible), while the first pass uses Y_HR to generate X_SR. The two passes would then amount to: step one, generate X_SR from Y_HR; step two, generate Y_HR from X_SR. Such a pair of passes would be equivalent to an autoencoder over Y. Since X_SR is the reference image of step two, step one would be pushed to preserve as much of Y_HR as possible to ease the reconstruction in step two, which would defeat the purpose of the first pass, namely recovering X_LR to generate X_SR.
According to an embodiment of the disclosure, the training method of the deep learning model further includes performing perspective transformation on the first reference image to obtain a second transformed image. Operation S230 includes determining a second loss according to a difference between the second transformed image and the second super-resolution image.
For example, the supervision image of the second-pass result Ỹ_SR may be the second transformed image Ỹ_HR, which may be obtained from the first reference image Y_HR through perspective transformation. The second loss may be the mean squared error or cross entropy between Ỹ_SR and Ỹ_HR, or the like.
The perspective transformation operation provided by the present disclosure is described in detail below with reference to fig. 3A to 3C.
Fig. 3A-3C are schematic diagrams of a method of perspective transforming an image according to one embodiment of the present disclosure.
As shown in fig. 3A, for image 310, a sub-image may be truncated from image 310 using a randomly selected solid rectangular box 301. The truncated sub-image 320 is shown in fig. 3B. A specific implementation of perspective transformation of the sub-image 320 is as follows.
For each vertex of the solid-line rectangular frame 301 (whose four vertices are also the four vertices of the sub-image 320), two areas of different extents are determined centered on that vertex, and the area between the two is taken as the target area 303 (e.g., the gray transparent area in FIG. 3A). For example, the two areas may be rectangles: with the vertex as the origin, the larger rectangle spans [-20, 20] on both the horizontal and vertical axes and the smaller rectangle spans [-5, 5], so the target area 303 spans [-20, -5] ∪ [5, 20] on each axis. The two areas may also take other shapes, such as annular regions. Target areas 303 for the other three vertices are determined in the same way; as shown in FIG. 3A, each vertex of the solid-line rectangular frame 301 determines one target area 303, giving four target areas in total.
For each vertex of the solid-line rectangular frame 301, a new vertex is determined by randomly moving the vertex within its target area 303. For example, referring to FIG. 3A, a vertex moves randomly within the target area 303: it must not cross the outer boundary of the gray transparent area, nor enter the white inner region. Each vertex of the solid-line rectangular frame 301 thus yields one new vertex, for four new vertices in total, which define a quadrilateral such as the dashed box 302 shown in FIG. 3A.
For example, a perspective transformation matrix may be determined from the mapping between the vertices of the solid rectangular box 301 and the vertices of the dashed box 302. With this matrix, the sub-image 320 determined by the solid rectangular box 301 can be perspective-transformed to obtain the transformed image 330 shown in FIG. 3C. For example, the transformation can be realized by multiplying the homogeneous coordinates of each pixel in the sub-image 320 by the perspective transformation matrix.
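A sketch of this vertex-jitter transformation with OpenCV follows; the [-20, -5] ∪ [5, 20] offset range matches the example above, while the function name and the use of cv2 are illustrative assumptions rather than the patent's code:

```python
import numpy as np
import cv2

def random_perspective(image, inner=5, outer=20, rng=None):
    """Jitter each image corner within [-outer, -inner] U [inner, outer] on both
    axes, then warp the image with the homography defined by the new corners."""
    rng = rng or np.random.default_rng()
    h, w = image.shape[:2]
    src = np.float32([[0, 0], [w - 1, 0], [w - 1, h - 1], [0, h - 1]])

    def offset():
        # Magnitude in [inner, outer] with a random sign: stays inside the target
        # area, i.e., out of the small central square around each corner.
        return rng.uniform(inner, outer) * rng.choice([-1.0, 1.0])

    dst = np.float32([[x + offset(), y + offset()] for x, y in src])
    matrix = cv2.getPerspectiveTransform(src, dst)  # 3x3 perspective matrix
    return cv2.warpPerspective(image, matrix, (w, h)), matrix
```

In the training flow sketched earlier, the matrix returned here would also be applied (suitably rescaled) to the first reference image Y_HR to produce the supervision image Ỹ_HR.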
The above describes an example implementation of perspective-transforming the sub-image 320. In embodiments of the present disclosure, the compressed image Y_LR must be perspective-transformed to obtain the first transformed image Ỹ_LR as the second sample image. The four vertices of the compressed image Y_LR correspond to the four vertices of the solid rectangular box 301 shown in FIG. 3A, so the perspective transformation of Y_LR can follow the implementation described for the sub-image 320.
Similarly, in embodiments of the present disclosure, the first reference image Y_HR must be perspective-transformed to obtain the second transformed image Ỹ_HR, so that the second loss can be determined from the difference between the second transformed image Ỹ_HR and the second super-resolution image Ỹ_SR. The perspective transformation of Y_HR into Ỹ_HR can likewise follow the implementation shown for the sub-image 320 in FIG. 3A, and is not repeated here.
Fig. 4 is a schematic diagram of a training method of a deep learning model according to one embodiment of the present disclosure.
As shown in FIG. 4, the image 401 may be a low-resolution image X_LR and the image 402 may be a high-resolution first reference image Y_HR. Image 401 has content similar to image 402, e.g., a similar scene or similar people. When image 401 and image 402 are input into the RefSR model 410, the RefSR model 410 may output image 403, the super-resolution image X_SR of image 401.
The image 404 may be the first transformed image Ỹ_LR, obtained by first down-sampling the first reference image Y_HR into the low-resolution compressed image Y_LR and then applying a perspective transformation to Y_LR.
It will be appreciated that the two RefSR models 410 shown in FIG. 4 are the same model, which performs the two super-resolution passes described above.
The loss of the RefSR model may include a first loss 411 and a second loss 412. The first loss 411 may be determined from the first-pass result, image 403; for example, it may be an error, such as the mean squared error or cross entropy, between image 403 and the supervision image of image 401 (e.g., X_HR). The second loss 412 may be determined from the second-pass result, image 405; for example, it may be an error, such as the mean squared error or cross entropy, between image 405 and the supervision image of image 404 (e.g., Ỹ_HR), where the supervision image is obtained, for example, from the perspective transformation of image 402.
The first loss 411 and the second loss 412 may together be used as a loss of the deep learning model to adjust parameters of the deep learning model. Parameters of RefSR model 410 are adjusted, for example, according to the sum of first penalty 411 and second penalty 412.
According to embodiments of the present disclosure, each training iteration includes two super-resolution passes that mutually reinforce each other to form reciprocal learning, which can improve the super-resolution effect of the deep learning model. Moreover, in the two passes above, the second pass operates on the perspective-transformed image Ỹ_LR rather than directly on the compressed image Y_LR, which prevents the deep learning model from degenerating into an autoencoder over Y, a degeneration that would harm the first pass, and thereby preserves the effectiveness of the first super-resolution pass.
Fig. 5 is an effect contrast diagram of image super-resolution processing according to an embodiment of the present disclosure.
As shown in FIG. 5, image 501 is a low-resolution image (e.g., X_LR) and image 502 is a high-resolution reference image (e.g., Y_HR). Image 503 is a partial-region image of image 501, e.g., the sub-image in the rectangular box in image 501. Image 504 is a partial-region image of image 502, e.g., the sub-image in the rectangular box in image 502. The rectangular box occupies the same position in image 501 as in image 502, and the content of image 503 is blurred compared with that of image 504.
For example, in one super-resolution scheme, a deep learning model (e.g., a RefSR model) is trained such that each iteration includes two mutually reinforcing super-resolution passes forming reciprocal learning, but the second pass operates on the compressed image of the first pass's reference image (without the perspective transformation). When the trained model super-resolves image 501, the result for image 503 (the partial-region image of image 501) is image 506. As shown in FIG. 5, image 506 has improved sharpness compared with image 503, but still shows noticeable shadows.
For example, in another scheme, image 501 is super-resolved using a conventional reference-based image super-resolution model whose training does not include reciprocal learning; the result for image 503 is image 507. As shown in FIG. 5, image 507 has improved sharpness compared with image 503, but the effect is limited and the content of image 507 remains hard to recognize.
For example, in the super-resolution scheme provided by the present disclosure, a deep learning model (e.g., a RefSR model) is trained such that each iteration includes two mutually reinforcing super-resolution passes forming reciprocal learning, and the second pass operates on the perspective-transformed image derived from the first pass's reference image. When the trained model super-resolves image 501, the result for image 503 is image 508. As shown in FIG. 5, image 508 has further improved sharpness compared with images 503, 506, and 507.
Fig. 6 is a flowchart of an image processing method according to one embodiment of the present disclosure.
As shown in fig. 6, the image processing method 600 may include operations S610 to S620.
In operation S610, a to-be-processed image and a second reference image are acquired.
In operation S620, the image to be processed and the second reference image are input into the deep learning model, and a super-resolution image of the image to be processed is obtained.
The deep learning model is trained using the training method of the deep learning model described above.
For example, the image to be processed is a low-resolution image and the second reference image is a high-resolution image with content similar to the image to be processed; inputting both into the deep learning model yields a high-quality super-resolution image of the image to be processed.
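Under the same assumed model interface as the training sketch above, inference reduces to a single forward pass:

```python
import torch

@torch.no_grad()
def super_resolve(model, image_lr, reference_hr):
    """Super-resolve a low-resolution image guided by a high-resolution
    reference image with similar content (e.g., a similar scene or person)."""
    model.eval()
    return model(image_lr, reference=reference_hr)
```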
Fig. 7 is a block diagram of a training apparatus of a deep learning model according to one embodiment of the present disclosure.
As shown in FIG. 7, the training apparatus 700 for the deep learning model includes a first super-resolution module 701, a second super-resolution module 702, a determination module 703, and an adjustment module 704.
The first super-resolution module 701 is configured to input the first sample image and the first reference image into a deep learning model to obtain a first super-resolution image.
The second super-resolution module 702 is configured to input the first super-resolution image and the second sample image into the deep learning model to obtain a second super-resolution image.
The determination module 703 is configured to determine a first loss from the first super-resolution image and a second loss from the second super-resolution image.
The adjustment module 704 is configured to adjust parameters of the deep learning model according to the first loss and the second loss.
According to an embodiment of the present disclosure, the training apparatus 700 of the deep learning model further includes a first processing module and a first transformation module.
The first processing module is used for carrying out resolution reduction processing on the first reference image to obtain a compressed image.
The first transformation module is used for performing perspective transformation on the compressed image to obtain a first transformation image serving as a second sample image.
The first transformation module includes a first determination unit, a second determination unit, a first movement unit, a third determination unit, and a first transformation unit.
The first determination unit is configured to determine two first regions having different ranges from each other centering on the vertex of the compressed image.
The second determination unit is configured to determine, as a first target area, an area between two first areas different in range from each other.
The first moving unit is used for moving the vertex of the compressed image to a first target area to obtain a first vertex.
The third determination unit is configured to determine a first transformation matrix from the vertices of the compressed image and the first vertices.
The first transformation unit is used for performing perspective transformation on the compressed image according to the first transformation matrix to obtain a first transformation image.
The training apparatus 700 of the deep learning model further comprises a second transformation module.
The second transformation module is used for performing perspective transformation on the first reference image to obtain a second transformed image.
The determination module 703 is configured to determine the second loss according to a difference between the second transformed image and the second super-resolution image.
According to an embodiment of the present disclosure, the second transformation module includes a fourth determination unit, a fifth determination unit, a second movement unit, a sixth determination unit, and a second transformation unit.
A fourth determining unit configured to determine two second regions having different ranges from each other centering on the vertex of the first reference image;
a fifth determining unit configured to determine, as a second target area, an area between two second areas different in range from each other;
the second moving unit is used for moving the vertex of the first reference image to a second target area to obtain a second vertex;
a sixth determining unit configured to determine a second transformation matrix according to the vertex of the first reference image and the second vertex; and
and the second transformation unit is used for performing perspective transformation on the first reference image according to the second transformation matrix to obtain a second transformation image.
Fig. 8 is a block diagram of an image processing apparatus according to one embodiment of the present disclosure.
As shown in FIG. 8, the image processing apparatus 800 may include an acquisition module 801 and a second processing module 802.
The acquiring module 801 is configured to acquire an image to be processed and a second reference image.
The second processing module 802 is configured to input the image to be processed and the second reference image into a deep learning model, so as to obtain a super-resolution image of the image to be processed.
The deep learning model is trained by the training apparatus of the deep learning model described above.
According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.
Fig. 9 shows a schematic block diagram of an example electronic device 900 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be exemplary only and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 9, the apparatus 900 includes a computing unit 901 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 902 or a computer program loaded from a storage unit 908 into a Random Access Memory (RAM) 903. In the RAM 903, various programs and data required for the operation of the device 900 can also be stored. The computing unit 901, the ROM 902, and the RAM 903 are connected to each other by a bus 904. An input/output (I/O) interface 905 is also connected to the bus 904.
Various components in device 900 are connected to I/O interface 905, including: an input unit 906 such as a keyboard, a mouse, or the like; an output unit 907 such as various types of displays, speakers, and the like; a storage unit 908 such as a magnetic disk, an optical disk, or the like; and a communication unit 909 such as a network card, modem, wireless communication transceiver, or the like. The communication unit 909 allows the device 900 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunications networks.
The computing unit 901 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 901 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 901 performs the respective methods and processes described above, for example, a training method of a deep learning model and/or an image processing method. For example, in some embodiments, the training method and/or image processing method of the deep learning model may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 908. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 900 via the ROM 902 and/or the communication unit 909. When the computer program is loaded into the RAM 903 and executed by the computing unit 901, one or more steps of the training method and/or the image processing method of the deep learning model described above may be performed. Alternatively, in other embodiments, the computing unit 901 may be configured to perform the training method and/or the image processing method of the deep learning model by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuit systems, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a background component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), wide Area Networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the disclosed aspects are achieved, and are not limited herein.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.
Claims (12)
1. A training method of a deep learning model, comprising:
inputting a first sample image and a first reference image into a deep learning model to obtain a first super-resolution image, wherein the first sample image and the first reference image have similar contents, and the resolution of the first sample image is lower than that of the first reference image;
inputting the first super-resolution image and the second sample image into the deep learning model to obtain a second super-resolution image;
determining a first loss according to the first super-resolution image and determining a second loss according to the second super-resolution image; and
adjusting parameters of the deep learning model according to the first loss and the second loss;
the method further comprises the steps of:
performing resolution reduction processing on the first reference image to obtain a compressed image; and
and performing perspective transformation on the compressed image to obtain a first transformation image serving as the second sample image.
2. The method of claim 1, wherein the perspective transforming the compressed image to obtain a first transformed image comprises:
determining two first areas with different ranges from each other by taking the vertex of the compressed image as the center;
determining a region between two first regions different in range from each other as a first target region;
moving the vertex of the compressed image to the first target area to obtain a first vertex;
determining a first transformation matrix according to the vertexes of the compressed image and the first vertexes; and
and performing perspective transformation on the compressed image according to the first transformation matrix to obtain the first transformation image.
3. The method of claim 1, further comprising:
performing perspective transformation on the first reference image to obtain a second transformed image;
the determining a second loss from the second super-resolution image includes:
determining the second loss based on a difference between the second transformed image and the second super-resolution image.
4. A method according to claim 3, wherein said performing a perspective transformation on said first reference image to obtain a second transformed image comprises:
determining two second areas with different ranges from each other by taking the vertex of the first reference image as the center;
determining a region between two second regions different from each other in range as a second target region;
moving the vertex of the first reference image to the second target area to obtain a second vertex;
determining a second transformation matrix according to the vertexes of the first reference image and the second vertexes; and
and performing perspective transformation on the first reference image according to the second transformation matrix to obtain the second transformation image.
5. An image processing method, comprising:
acquiring an image to be processed and a second reference image; and
inputting the image to be processed and the second reference image into a deep learning model to obtain a super-resolution image of the image to be processed;
wherein the deep learning model is trained in accordance with the method of any one of claims 1 to 4.
6. A training device for a deep learning model, comprising:
the first super-resolution module is used for inputting a first sample image and a first reference image into the deep learning model to obtain a first super-resolution image, wherein the first sample image and the first reference image have similar contents, and the resolution of the first sample image is lower than that of the first reference image;
the second super-resolution module is used for inputting the first super-resolution image and the second sample image into the deep learning model to obtain a second super-resolution image;
the determining module is used for determining a first loss according to the first super-resolution image and determining a second loss according to the second super-resolution image; and
the adjusting module is used for adjusting parameters of the deep learning model according to the first loss and the second loss;
the apparatus further comprises:
the first processing module is used for carrying out resolution reduction processing on the first reference image to obtain a compressed image;
and the first transformation module is used for performing perspective transformation on the compressed image to obtain a first transformation image serving as the second sample image.
7. The apparatus of claim 6, wherein the first transformation module comprises:
a first determination unit configured to determine two first regions having different ranges from each other with a vertex of the compressed image as a center;
a second determination unit configured to determine, as a first target area, an area between two first areas different in range from each other;
the first moving unit is used for moving the vertex of the compressed image to the first target area to obtain a first vertex;
a third determining unit configured to determine a first transformation matrix according to the vertex of the compressed image and the first vertex; and
and the first transformation unit is used for performing perspective transformation on the compressed image according to the first transformation matrix to obtain the first transformation image.
8. The apparatus of claim 6, further comprising:
the second transformation module is used for performing perspective transformation on the first reference image to obtain a second transformation image;
the determining module is configured to determine the second loss according to a difference between the second transformed image and the second super-resolution image.
9. The apparatus of claim 8, wherein the second transformation module comprises:
a fourth determining unit configured to determine two second regions having different ranges from each other with a vertex of the first reference image as a center;
a fifth determining unit configured to determine, as a second target area, an area between two second areas different in range from each other;
a second moving unit, configured to move the vertex of the first reference image to the second target area, to obtain a second vertex;
a sixth determining unit configured to determine a second transformation matrix according to the vertex of the first reference image and the second vertex; and
and the second transformation unit is used for performing perspective transformation on the first reference image according to the second transformation matrix to obtain the second transformation image.
10. An image processing apparatus comprising:
the acquisition module is used for acquiring the image to be processed and the second reference image; and
the second processing module is used for inputting the image to be processed and the second reference image into a deep learning model to obtain a super-resolution image of the image to be processed;
wherein the deep learning model is trained in accordance with the apparatus of any one of claims 6 to 9.
11. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1 to 5.
12. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of any one of claims 1 to 5.
Priority Application
- CN202210855822.2A | Priority date: 2022-07-15 | Filing date: 2022-07-15 | Title: Training method, image processing method, device and equipment for deep learning model
Publications
- CN115147280A | 2022-10-04 (publication)
- CN115147280B | 2023-06-02 (grant)
Legal Events
- PB01 | Publication
- SE01 | Entry into force of request for substantive examination
- GR01 | Patent grant