CN111429371A

CN111429371A - Image processing method and device and terminal equipment

Info

Publication number: CN111429371A
Application number: CN202010207474.9A
Authority: CN
Inventors: 贾玉虎
Original assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Current assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date: 2020-03-23
Filing date: 2020-03-23
Publication date: 2020-07-17
Anticipated expiration: 2040-03-23
Also published as: CN111429371B

Abstract

The present application is applicable to the technical field of image processing, and provides an image processing method, device and terminal device, including: acquiring an image to be reconstructed; extracting high-frequency components and low-frequency components of the to-be-reconstructed image to obtain a The first high-frequency image composed of the first high-frequency image and the first low-frequency image composed of the low-frequency components; the first high-frequency image is input into the trained high-frequency image generation network, and the second output of the high-frequency image generation network is obtained. high-frequency image; inputting the first low-frequency image into the trained low-frequency image generation network, to obtain a second low-frequency image output by the low-frequency image generation network; generating according to the second high-frequency image and the second low-frequency image The reconstructed image, the resolution of the reconstructed image is higher than the resolution of the image to be reconstructed. By the above method, a better image can be reconstructed.

Description

Image processing method, device and terminal device

技术领域technical field

本申请属于图像处理技术领域，尤其涉及图像处理方法、装置、终端设备及计算机可读存储介质。The present application belongs to the technical field of image processing, and in particular, relates to an image processing method, apparatus, terminal device, and computer-readable storage medium.

背景技术Background technique

图像分辨率是指传感器观察或测量最小物体的能力，这取决于像素大小。而作为二维信号记录的数字图像，在大多数应用中总是需要具有较高的分辨率。在过去的几十年中，成像技术得到了迅速发展，并且图像分辨率达到了一个新的水平，但许多应用中仍希望能够更好地提升图像分辨率。例如，数字监控产品往往在某种程度上选择牺牲图像分辨率，以保证记录设备的长期稳定运行和动态场景的适当帧速率；在遥感领域也存在类似的情况：在空间，光谱和图像分辨率之间总是存在权衡；而对于医学成像，如何在降低辐射的同时用第一图像提取人体结构的三维模型仍然是一个挑战。Image resolution refers to the ability of a sensor to see or measure the smallest objects, and it depends on pixel size. And digital images, recorded as two-dimensional signals, always require higher resolution in most applications. Imaging technology has advanced rapidly over the past few decades, and image resolution has reached a new level, but there are still many applications where better image resolution is desired. For example, digital surveillance products often choose to sacrifice image resolution to some extent to ensure long-term stable operation of recording equipment and appropriate frame rates for dynamic scenes; a similar situation exists in remote sensing: spatial, spectral and image resolution There is always a trade-off; and for medical imaging, how to extract a 3D model of the human body structure with the first image while reducing radiation remains a challenge.

目前，可通过超分辨率方法提高图像的分辨率。Currently, the resolution of images can be increased by super-resolution methods.

现有的超分辨率方法大致可以分为两类:传统的插值方法和基于深度学习的方法。传统的插值算法已经发展有几十年了,但在单张图像超分辨率(Single Image Super-Resolution,SISR)领域,其效果远不如深度学习的方法。因此,许多新算法都是用数据驱动的深度学习模型来重建图像所需的细节,以获得精确的超分辨率。Existing super-resolution methods can be roughly divided into two categories: traditional interpolation methods and deep learning-based methods. Traditional interpolation algorithms have been developed for decades, but in the field of Single Image Super-Resolution (SISR), their effects are far inferior to deep learning methods. Therefore, many new algorithms use data-driven deep learning models to reconstruct the required details of images for accurate super-resolution.

目前基于深度学习方法的SISR的改进大多都是针对模型结构或者训练方法，这些方法针对仿真得到的低分辨率(Low Resolution,LR)图像进行超分辨率时效果很好，但当处理真实获取的第二图像时,效果就会显著降低，比如目前的SOTA(state of the art,最先进水平)的超分辨率模型ESRGAN(X.Wang et.al.,"ESRGAN:Enhanced Super-ResolutionGenerative Adversarial Networks"in ECCV 2018workshop)在处理智能手机获得的图像时,重建后有明显的细节过拟合和模糊,效果甚至不如低分辨率图本身(如图1(a)为原图，图1(b)为超分后的图)。这是因为，当处理类似智能手机获取的第二图像时,由于本身含有大量的噪声和图像处理过程中产生的伪影,而基于深度学习的模型结构会把很多的噪声和伪影当成是细节来增强,因此导致超分辨率之后的图像质量反而会降低,所以我们需要在超分辨处理的源头就先尽量区分需要增强的细节和图像噪声及后处理过程中的伪影,以此提高超分辨率重建后的图像质量。At present, most of the SISR improvements based on deep learning methods are aimed at the model structure or training methods. These methods work well for super-resolution of low-resolution (LR) images obtained by simulation, but when dealing with real acquired images When the second image is used, the effect will be significantly reduced, such as the current SOTA (state of the art, the most advanced level) super-resolution model ESRGAN (X.Wang et.al., "ESRGAN: Enhanced Super-ResolutionGenerative Adversarial Networks" in ECCV 2018workshop) when processing the image obtained by the smartphone, there is obvious over-fitting and blurring of details after reconstruction, and the effect is not even as good as the low-resolution image itself (Fig. 1(a) is the original image, and Fig. 1(b) is the original image. image after over-score). This is because, when processing the second image obtained by a similar smartphone, it contains a lot of noise and artifacts generated in the process of image processing, and the model structure based on deep learning will treat a lot of noise and artifacts as details. Therefore, the image quality after super-resolution will be reduced, so we need to distinguish the details that need to be enhanced from image noise and artifacts in the post-processing process at the source of super-resolution processing, so as to improve super-resolution image quality after reconstruction.

故，需要提出一种新的技术方案以解决上述技术问题。Therefore, it is necessary to propose a new technical solution to solve the above-mentioned technical problems.

发明内容SUMMARY OF THE INVENTION

本申请实施例提供了图像处理方法，可以解决现有方法在重建图像后所产生的伪细节过多的问题。The embodiments of the present application provide an image processing method, which can solve the problem of excessive pseudo details generated by the existing method after reconstructing an image.

第一方面，本申请实施例提供了一种图像处理方法，包括：In a first aspect, an embodiment of the present application provides an image processing method, including:

获取待重建的图像；Get the image to be reconstructed;

提取所述待重建的图像的高频分量和低频分量，得到由所述高频分量组成的第一高频图像和由所述低频分量组成的第一低频图像，其中，所述高频分量是指大于或等于预设频率阈值的频率分量，所述低频分量是指小于所述预设频率阈值的频率分量；Extracting high-frequency components and low-frequency components of the image to be reconstructed, to obtain a first high-frequency image composed of the high-frequency components and a first low-frequency image composed of the low-frequency components, wherein the high-frequency components are Refers to a frequency component greater than or equal to a preset frequency threshold, and the low-frequency component refers to a frequency component less than the preset frequency threshold;

将所述第一高频图像输入训练后的高频图像生成网络，得到所述高频图像生成网络输出的第二高频图像；Inputting the first high-frequency image into the trained high-frequency image generation network to obtain a second high-frequency image output by the high-frequency image generation network;

将所述第一低频图像输入训练后的低频图像生成网络，得到所述低频图像生成网络输出的第二低频图像；Inputting the first low-frequency image into the trained low-frequency image generation network to obtain a second low-frequency image output by the low-frequency image generation network;

根据所述第二高频图像和所述第二低频图像生成重建后的图像，所述重建后的图像的分辨率高于所述待重建的图像的分辨率。A reconstructed image is generated according to the second high frequency image and the second low frequency image, and the resolution of the reconstructed image is higher than the resolution of the image to be reconstructed.

第二方面，本申请实施例提供了一种图像处理装置，包括：In a second aspect, an embodiment of the present application provides an image processing apparatus, including:

待重建的图像获取单元，用于获取待重建的图像；an image acquisition unit to be reconstructed, used for acquiring the image to be reconstructed;

高低频图像提取单元，用于提取所述待重建的图像的高频分量和低频分量，得到由所述高频分量组成的第一高频图像和由所述低频分量组成的第一低频图像，其中，所述高频分量是指大于或等于预设频率阈值的频率分量，所述低频分量是指小于所述预设频率阈值的频率分量；a high- and low-frequency image extraction unit, configured to extract high-frequency components and low-frequency components of the image to be reconstructed, to obtain a first high-frequency image composed of the high-frequency components and a first low-frequency image composed of the low-frequency components, Wherein, the high-frequency component refers to a frequency component greater than or equal to a preset frequency threshold, and the low-frequency component refers to a frequency component less than the preset frequency threshold;

第二高频图像生成单元，用将所述第一高频图像输入训练后的高频图像生成网络，得到所述高频图像生成网络输出的第二高频图像；The second high-frequency image generation unit is used to input the first high-frequency image into a trained high-frequency image generation network to obtain a second high-frequency image output by the high-frequency image generation network;

第二低频图像生成单元，用于将所述第一低频图像输入训练后的低频图像生成网络，得到所述低频图像生成网络输出的第二低频图像；A second low-frequency image generation unit, configured to input the first low-frequency image into a trained low-frequency image generation network to obtain a second low-frequency image output by the low-frequency image generation network;

图像重建单元，用于根据所述第二高频图像和所述第二低频图像生成重建后的图像，所述重建后的图像的分辨率高于所述待重建的图像的分辨率。An image reconstruction unit, configured to generate a reconstructed image according to the second high-frequency image and the second low-frequency image, where the resolution of the reconstructed image is higher than the resolution of the image to be reconstructed.

第三方面，本申请实施例提供了一种终端设备，包括存储器、处理器以及存储在所述存储器中并可在所述处理器上运行的计算机程序，所述处理器执行所述计算机程序时实现如第一方面所述的方法。In a third aspect, an embodiment of the present application provides a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, when the processor executes the computer program A method as described in the first aspect is implemented.

第四方面，本申请实施例提供了一种计算机可读存储介质，所述计算机可读存储介质存储有计算机程序，所述计算机程序被处理器执行时实现如第一方面所述的方法。In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the method according to the first aspect is implemented.

第五方面，本申请实施例提供了一种计算机程序产品，当计算机程序产品在终端设备上运行时，使得终端设备执行上述第一方面所述的方法。In a fifth aspect, an embodiment of the present application provides a computer program product that, when the computer program product runs on a terminal device, enables the terminal device to execute the method described in the first aspect.

本申请实施例与现有技术相比存在的有益效果是：The beneficial effects that the embodiments of the present application have compared with the prior art are:

由于将图像的高频分量和低频分量分开处理，因此能够在细节被增强的同时压制噪声和伪影，从而能够重建出噪声和伪影更少，细节更清晰的超分图像。Since the high-frequency components and low-frequency components of the image are processed separately, noise and artifacts can be suppressed while details are enhanced, so that a super-resolution image with less noise and artifacts and clearer details can be reconstructed.

附图说明Description of drawings

为了更清楚地说明本申请实施例中的技术方案，下面将对实施例或现有技术描述中所需要使用的附图作简单地介绍。In order to illustrate the technical solutions in the embodiments of the present application more clearly, the following briefly introduces the accompanying drawings that are required to be used in the description of the embodiments or the prior art.

图1(b)和图1(a)是现有技术提供的重建后的图像与原图的比较示意图；Figure 1(b) and Figure 1(a) are schematic diagrams comparing the reconstructed image provided by the prior art with the original image;

图2是本申请实施例一提供的一种图像处理方法的流程图；2 is a flowchart of an image processing method provided in Embodiment 1 of the present application;

图3是本申请实施例一提供的另一种图像处理方法的流程图；3 is a flowchart of another image processing method provided in Embodiment 1 of the present application;

图4是本申请实施例一提供的一种生成网络训练方法的流程示意图；4 is a schematic flowchart of a method for generating network training provided in Embodiment 1 of the present application;

图5是本申请实施例一提供的一种高频图像生成网络的结构示意图；5 is a schematic structural diagram of a high-frequency image generation network provided in Embodiment 1 of the present application;

图6是本申请实施例一提供的一种待训练的判别网络的结构示意图；6 is a schematic structural diagram of a discriminant network to be trained provided by Embodiment 1 of the present application;

图7是本申请实施例一提供的另一种生成网络训练方法的流程示意图；7 is a schematic flowchart of another method for generating network training provided in Embodiment 1 of the present application;

图8是本申请实施例二提供的一种图像处理装置的结构框图；8 is a structural block diagram of an image processing apparatus provided in Embodiment 2 of the present application;

图9是本申请实施例三提供的一种终端设备的结构示意图。FIG. 9 is a schematic structural diagram of a terminal device according to Embodiment 3 of the present application.

具体实施方式Detailed ways

以下描述中，为了说明而不是为了限定，提出了诸如特定系统结构、技术之类的具体细节，以便透彻理解本申请实施例。然而，本领域的技术人员应当清楚，在没有这些具体细节的其它实施例中也可以实现本申请。在其它情况中，省略对众所周知的系统、装置、电路以及方法的详细说明，以免不必要的细节妨碍本申请的描述。In the following description, for the purpose of illustration rather than limitation, specific details such as a specific system structure and technology are set forth in order to provide a thorough understanding of the embodiments of the present application. However, it will be apparent to those skilled in the art that the present application may be practiced in other embodiments without these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.

应当理解，当在本申请说明书和所附权利要求书中使用时，术语“包括”指示所描述特征、整体、步骤、操作、元素和/或组件的存在，但并不排除一个或多个其它特征、整体、步骤、操作、元素、组件和/或其集合的存在或添加。It is to be understood that, when used in this specification and the appended claims, the term "comprising" indicates the presence of the described feature, integer, step, operation, element and/or component, but does not exclude one or more other The presence or addition of features, integers, steps, operations, elements, components and/or sets thereof.

还应当理解，在本申请说明书和所附权利要求书中使用的术语“和/或”是指相关联列出的项中的一个或多个的任何组合以及所有可能组合，并且包括这些组合。It will also be understood that, as used in this specification and the appended claims, the term "and/or" refers to and including any and all possible combinations of one or more of the associated listed items.

如在本申请说明书和所附权利要求书中所使用的那样，术语“如果”可以依据上下文被解释为“当...时”或“一旦”或“响应于确定”或“响应于检测到”。类似地，短语“如果确定”或“如果检测到[所描述条件或事件]”可以依据上下文被解释为意指“一旦确定”或“响应于确定”或“一旦检测到[所描述条件或事件]”或“响应于检测到[所描述条件或事件]”。As used in the specification of this application and the appended claims, the term "if" may be contextually interpreted as "when" or "once" or "in response to determining" or "in response to detecting ". Similarly, the phrases "if it is determined" or "if the [described condition or event] is detected" may be interpreted, depending on the context, to mean "once it is determined" or "in response to the determination" or "once the [described condition or event] is detected. ]" or "in response to detection of the [described condition or event]".

另外，在本申请说明书和所附权利要求书的描述中，术语“第一”、“第二”、“第三”、“第四”等仅用于区分描述，而不能理解为指示或暗示相对重要性。In addition, in the description of the specification of the present application and the appended claims, the terms "first", "second", "third", "fourth", etc. are only used to distinguish the description, and should not be construed as indicating or implying relative importance.

在本申请说明书中描述的参考“一个实施例”或“一些实施例”等意味着在本申请的一个或多个实施例中包括结合该实施例描述的特定特征、结构或特点。由此，在本说明书中的不同之处出现的语句“在一个实施例中”、“在一些实施例中”、“在其他一些实施例中”、“在另外一些实施例中”等不是必然都参考相同的实施例，而是意味着“一个或多个但不是所有的实施例”，除非是以其他方式另外特别强调。术语“包括”、“包含”、“具有”及它们的变形都意味着“包括但不限于”，除非是以其他方式另外特别强调。References in this specification to "one embodiment" or "some embodiments" and the like mean that a particular feature, structure or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," "in other embodiments," etc. in various places in this specification are not necessarily All refer to the same embodiment, but mean "one or more but not all embodiments" unless specifically emphasized otherwise. The terms "including", "including", "having" and their variants mean "including but not limited to" unless specifically emphasized otherwise.

实施例一：Example 1:

目前，基于深度学习方法的图像超分辨改进通常是针对模型结构或者训练方法，这些方法针对仿真得到的第二图像进行超分辨率处理时，效果很好，但当处理真实获取的第二图像时，效果就会显著降低。为了解决该技术问题，本申请在训练高频图像生成网络和低频图像生成网络时，所采用的训练样本都是直接选择终端设备采集的真实的第二图像和真实的第一图像，以保证训练后的高频图像生成网络和训练后的低频图像生成网络对真实的第二图像执行超分处理时的稳健性。此外，本申请在训练高频图像生成网络和低频图像生成网络时，是通过分别提取第二图像和第一图像的高频分量和低频分量，再对由高频分量组成的高频图像和由低频分量组成的低频图像进行处理来实现的，由于终端设备(如智能手机)采集的真实的第二图像，后处理所产生的伪影以及经过图像压缩留下的噪声大部分集中在图像低频区，真正需要被增强的细节集中在图像的高频区，因此，通过将图像的高频分量和低频分量分开处理，能够在细节被增强的同时压制噪声和伪影。即，根据训练后的高频图像生成网络生成的高频图像和根据训练后的低频图像生成网络生成低频图像，能够重建出噪声和伪影更少，细节更清晰的超分图像。At present, the improvement of image super-resolution based on deep learning methods is usually aimed at the model structure or training method. When these methods perform super-resolution processing on the second image obtained by simulation, the effect is very good, but when processing the second image obtained in real , the effect will be significantly reduced. In order to solve this technical problem, when training the high-frequency image generation network and the low-frequency image generation network, the training samples used in this application are all the real second images and real first images collected by the terminal device directly to ensure the training. The robustness of the post-high-frequency image generation network and the trained low-frequency image generation network when performing superdivision processing on the real second image. In addition, when training the high-frequency image generation network and the low-frequency image generation network, the present application extracts the high-frequency components and low-frequency components of the second image and the first image respectively, and then compares the high-frequency image composed of high-frequency components and the It is realized by processing the low-frequency image composed of low-frequency components. Due to the real second image collected by the terminal device (such as a smart phone), most of the artifacts generated by post-processing and the noise left by image compression are concentrated in the low-frequency area of the image. , the details that really need to be enhanced are concentrated in the high-frequency region of the image. Therefore, by processing the high-frequency and low-frequency components of the image separately, noise and artifacts can be suppressed while the details are enhanced. That is, the high-frequency image generated by the trained high-frequency image generation network and the low-frequency image generated by the trained low-frequency image generation network can reconstruct a super-segmented image with less noise and artifacts and clearer details.

下面详细介绍如何采用训练后的生成网络对待重建的图像进行重建，其中，该生成网络包括高频图像生成网络和低频图像生成网络。The following describes in detail how to use the trained generation network to reconstruct the image to be reconstructed, wherein the generation network includes a high-frequency image generation network and a low-frequency image generation network.

图2示出了本申请实施例提供的一种图像处理方法的流程图，详述如下：FIG. 2 shows a flowchart of an image processing method provided by an embodiment of the present application, and details are as follows:

步骤S21，获取待重建的图像；Step S21, acquiring the image to be reconstructed;

本实施例中，获取终端设备采集的待重建的图像，该待重建的图像的分辨率较低。具体地，可将终端设备(如手机)拍摄或截屏得到的图像作为待重建的图像，或者，将其他终端设备拍摄并传输过来的图像作为待重建的图像。In this embodiment, the image to be reconstructed collected by the terminal device is acquired, and the resolution of the image to be reconstructed is relatively low. Specifically, the image to be reconstructed may be taken as the image to be reconstructed by taking or screenshot of the terminal device (such as a mobile phone), or the image to be reconstructed may be taken and transmitted by other terminal devices.

步骤S22，提取所述待重建的图像的高频分量和低频分量，得到由所述高频分量组成的第一高频图像和由所述低频分量组成的第一低频图像，其中，所述高频分量是指大于或等于预设频率阈值的频率分量，所述低频分量是指小于所述预设频率阈值的频率分量；Step S22, extracting high-frequency components and low-frequency components of the image to be reconstructed, and obtaining a first high-frequency image composed of the high-frequency components and a first low-frequency image composed of the low-frequency components, wherein the high-frequency components are The frequency component refers to a frequency component greater than or equal to a preset frequency threshold, and the low-frequency component refers to a frequency component less than the preset frequency threshold;

具体地，通过滤波器提取待重建的图像的高频分量和低频分量。例如，采用低通滤波器(如高斯低通滤波器)提取待重建的图像的低频分量，再将待重建的图像与低频分量相减，得到该待重建的图像的高频分量。在一些实施例中，也可以采用平均滤波器提取该待重建的图像的高频分量和低频分量。Specifically, high-frequency components and low-frequency components of the image to be reconstructed are extracted through filters. For example, a low-pass filter (such as a Gaussian low-pass filter) is used to extract the low-frequency component of the image to be reconstructed, and then the low-frequency component of the image to be reconstructed is subtracted to obtain the high-frequency component of the image to be reconstructed. In some embodiments, an averaging filter can also be used to extract high frequency components and low frequency components of the image to be reconstructed.

若采用高斯低通滤波器提取低频分量，则可通过以下方式提取：If a Gaussian low-pass filter is used to extract the low-frequency components, it can be extracted in the following ways:

X_D＝w_L*X (1)X _D = w _L *X (1)

其中，“*”表示卷积操作，w_L表示高斯低通核，X表示待重建的图像，X_D表示X对应的低频分量。该w_L的核可采用3x3，也可以采用5x5，作为一个示例，当w_L为5x5时，Among them, "*" represents the convolution operation, w _L represents the Gaussian low-pass kernel, X represents the image to be reconstructed, and X _D represents the low-frequency component corresponding to X. The core of w _L can be 3x3 or 5x5. As an example, when w _L is 5x5,

在提取了低频分量之后，通过原图与低频分量相减即得到对应的高频分量：After the low-frequency components are extracted, the corresponding high-frequency components are obtained by subtracting the original image from the low-frequency components:

X_H＝X-X_D (3)X _H =XX _D (3)

其中，X_H表示X的高频分量。Among them, X _H represents the high frequency component of X.

步骤S23，将所述第一高频图像输入训练后的高频图像生成网络，得到所述高频图像生成网络输出的第二高频图像；Step S23, inputting the first high-frequency image into a trained high-frequency image generation network to obtain a second high-frequency image output by the high-frequency image generation network;

步骤S24，将所述第一低频图像输入训练后的低频图像生成网络，得到所述低频图像生成网络输出的第二低频图像；Step S24, inputting the first low-frequency image into a trained low-frequency image generation network to obtain a second low-frequency image output by the low-frequency image generation network;

步骤S25，根据所述第二高频图像和所述第二低频图像生成重建后的图像，所述重建后的图像的分辨率高于所述待重建的图像的分辨率。Step S25, generating a reconstructed image according to the second high-frequency image and the second low-frequency image, where the resolution of the reconstructed image is higher than the resolution of the image to be reconstructed.

具体地，将第二高频图像对应地数据作为重建后的图像的高频数据，将第二低频图像对应的数据作为该重建后的图像的低频数据，从而得到重建后的图像。Specifically, the data corresponding to the second high-frequency image is used as the high-frequency data of the reconstructed image, and the data corresponding to the second low-frequency image is used as the low-frequency data of the reconstructed image, so as to obtain the reconstructed image.

本申请实施例中，由于将图像的高频分量和低频分量分开处理，因此，能够在细节被增强的同时压制噪声和伪影，从而能够重建出噪声和伪影更少，细节更清晰的超分图像。In the embodiment of the present application, since the high-frequency components and low-frequency components of the image are processed separately, noise and artifacts can be suppressed while details are enhanced, so that an ultra-high-frequency image with less noise and artifacts and clearer details can be reconstructed. split image.

图3示出了本申请实施例提供的另一种图像处理方法的流程图，在本实施例中，为了降低网络的运行消耗，提高处理速度，只对待重建的图像的Y通道的数据进行处理，详述如下：FIG. 3 shows a flowchart of another image processing method provided by an embodiment of the present application. In this embodiment, in order to reduce the running consumption of the network and improve the processing speed, only the data of the Y channel of the image to be reconstructed is processed. , detailed as follows:

步骤S31，获取待重建的图像；Step S31, acquiring the image to be reconstructed;

步骤S32，将所述待重建的图像的颜色空间转换为YUV颜色空间，提取所述待重建的图像的Y通道的高频分量和低频分量，得到由Y通道的高频分量组成的第一高频图像，以及由Y通道的低频分量组成的第一低频图像，其中，所述高频分量是指大于或等于预设频率阈值的频率分量，所述低频分量是指小于所述预设频率阈值的频率分量。Step S32, convert the color space of the image to be reconstructed into the YUV color space, extract the high-frequency component and the low-frequency component of the Y channel of the image to be reconstructed, and obtain the first high-frequency component composed of the high-frequency components of the Y channel. frequency image, and a first low-frequency image composed of low-frequency components of the Y channel, wherein the high-frequency components refer to frequency components greater than or equal to a preset frequency threshold, and the low-frequency components refer to frequencies less than the preset frequency threshold frequency components.

本实施例中，待重建的图像的颜色空间通常为红绿蓝(RGB)颜色空间，为了降低处理过程对图像色彩的影响，将该RGB颜色空间转换为亮度、色度(YUV)颜色空间。在转换为YUV颜色空间后，提取该YUV颜色空间的Y通道，若采用NV21标准参数，则根据下面的公式(4)提取出Y通道：In this embodiment, the color space of the image to be reconstructed is usually a red, green and blue (RGB) color space. In order to reduce the influence of the processing process on the image color, the RGB color space is converted into a luminance and chrominance (YUV) color space. After converting to YUV color space, extract the Y channel of the YUV color space. If the NV21 standard parameters are used, the Y channel is extracted according to the following formula (4):

该步骤中，根据公式(4)提取待重建的图像的Y通道，得到该Y通道的数据所对应的灰度图像，再根据上述公式(1)～(3)提取该灰度图像对应的低频分量和高频分量，进而得到由提取的低频分量组成的第一低频图像和由对应的高频分量组成的第一高频图像。In this step, the Y channel of the image to be reconstructed is extracted according to formula (4) to obtain a grayscale image corresponding to the data of the Y channel, and then the low frequency corresponding to the grayscale image is extracted according to the above formulas (1) to (3). components and high-frequency components, and then obtain a first low-frequency image composed of the extracted low-frequency components and a first high-frequency image composed of the corresponding high-frequency components.

步骤S33，将所述第一高频图像输入训练后的高频图像生成网络，得到所述高频图像生成网络输出的第二高频图像；Step S33, inputting the first high-frequency image into a trained high-frequency image generation network to obtain a second high-frequency image output by the high-frequency image generation network;

步骤S34，将所述第一低频图像输入训练后的低频图像生成网络，得到所述低频图像生成网络输出的第二低频图像；Step S34, inputting the first low-frequency image into a trained low-frequency image generation network to obtain a second low-frequency image output by the low-frequency image generation network;

步骤S35，根据所述第二高频图像和所述第二低频图像生成重建后的图像，所述重建后的图像的分辨率高于所述待重建的图像的分辨率。Step S35, generating a reconstructed image according to the second high-frequency image and the second low-frequency image, where the resolution of the reconstructed image is higher than the resolution of the image to be reconstructed.

本申请实施例中，由于只提取待重建的图像的Y通道的数据，因此，减少了高频图像生成网络和低频图像生成网络所需处理的数据量，因此，能够有效降低网络的运行消耗，提高处理速度。并且，由于将RGB颜色空间转换为YUV颜色空间，因此，能够避免直接对RGB三通道的数据进行处理所产生的偏色的问题。In the embodiment of the present application, since only the data of the Y channel of the image to be reconstructed is extracted, the amount of data that needs to be processed by the high-frequency image generation network and the low-frequency image generation network is reduced, so the operation consumption of the network can be effectively reduced, Improve processing speed. In addition, since the RGB color space is converted into the YUV color space, the problem of color cast caused by directly processing the data of three RGB channels can be avoided.

下面将介绍本申请实施例提供的生成网络训练方法，其中，这里的生成网络包括高频图像生成网络和低频图像生成网络，详见图4：The generation network training method provided by the embodiments of the present application will be introduced below, wherein the generation network here includes a high-frequency image generation network and a low-frequency image generation network, as shown in Figure 4:

步骤S41，获取训练样本集合，所述训练样本集合存储的是：终端设备采集的第一图像和与所述第一图像一一对应的第二图像，其中，所述第一图像的分辨率高于所述第二图像的分辨率；Step S41, acquiring a training sample set, where the training sample set stores: a first image collected by a terminal device and a second image corresponding to the first image one-to-one, wherein the first image has a high resolution at the resolution of the second image;

其中，该终端设备包括：手机、相机、平板电脑等设备。Among them, the terminal equipment includes: mobile phone, camera, tablet computer and other equipment.

本实施例中，训练样本集合所存储的第一图像和第二图像都是终端设备采集的真实的图像，即保证了用于训练的样本具有真实的噪声。其中，第一图像和第二图像的对应关系可在确定后存入映射表，之后通过查找映射表就能确定与第一图像对应的第二图像，或者，能够确定与第二图像对应的第一图像。为了使得训练后的高频图像生成网络和低频图像生成网络更具有普适性，需要尽量丰富样本的多样性，比如，使得样本既包括背景丰富的图像又包括背景单一的图像；既包括光线充足的图像又包括光线不足的图像；既包括曝光正常的图像，又包括曝光不正常的图像等等。In this embodiment, both the first image and the second image stored in the training sample set are real images collected by the terminal device, that is, it is ensured that the samples used for training have real noise. The corresponding relationship between the first image and the second image can be stored in the mapping table after being determined, and then the second image corresponding to the first image can be determined by looking up the mapping table, or the first image corresponding to the second image can be determined. an image. In order to make the trained high-frequency image generation network and low-frequency image generation network more universal, it is necessary to enrich the diversity of samples as much as possible. For example, the samples include both images with rich backgrounds and images with single backgrounds; The images that are exposed include images with insufficient light; both images with normal exposure, images with abnormal exposure, and so on.

另外，为了使得第一图像包含的图像内容和与其对应的第二图像包含的内容差异更小，则需要保证终端设备在获取该第一图像的时间和获取对应的第二图像的时间的时间差足够小。In addition, in order to make the difference between the image content contained in the first image and the content contained in the corresponding second image smaller, it is necessary to ensure that the time difference between the time when the terminal device acquires the first image and the time when the corresponding second image is acquired is sufficient. Small.

步骤S42，从所述训练样本集合获取第一图像，提取所述第一图像的高频分量组成的第三高频图像和所述第一图像的低频分量组成的第三低频图像；Step S42, obtaining a first image from the training sample set, and extracting a third high-frequency image composed of high-frequency components of the first image and a third low-frequency image composed of low-frequency components of the first image;

其中，所述高频分量是指大于或等于预设频率阈值的频率分量，所述低频分量是指小于所述预设频率阈值的频率分量。The high frequency components refer to frequency components greater than or equal to a preset frequency threshold, and the low frequency components refer to frequency components less than the preset frequency threshold.

具体地，通过滤波器提取第一图像的高频分量和低频分量。例如，采用低通滤波器(如高斯低通滤波器)提取第一图像的低频分量，再将第一图像与低频分量相减，得到该第一图像的高频分量。Specifically, high frequency components and low frequency components of the first image are extracted through a filter. For example, a low-pass filter (such as a Gaussian low-pass filter) is used to extract the low-frequency component of the first image, and then the first image and the low-frequency component are subtracted to obtain the high-frequency component of the first image.

步骤S43，从所述训练样本集合获取与所述第一图像对应的第二图像，提取所述第二图像的高频分量组成的第四高频图像和所述第二图像的低频分量组成的第四低频图像；Step S43: Obtain a second image corresponding to the first image from the training sample set, and extract a fourth high-frequency image composed of high-frequency components of the second image and a low-frequency component of the second image. the fourth low frequency image;

其中，第二图像的低频分量和高频分量的提取与上述步骤中第一图像的低频分量和高频分量的提取类似，此处不再赘述。The extraction of the low-frequency component and the high-frequency component of the second image is similar to the extraction of the low-frequency component and the high-frequency component of the first image in the above steps, and details are not described herein again.

步骤S44，将所述第四高频图像输入待训练的高频图像生成网络，得到所述待训练的高频图像生成网络输出的第五高频图像；Step S44, inputting the fourth high-frequency image into the high-frequency image generation network to be trained, to obtain the fifth high-frequency image output by the high-frequency image generation network to be trained;

步骤S45，若所述待训练的高频图像生成网络的损失函数的输出值的变化在预设范围内，则停止对所述待训练的高频图像生成网络的训练，得到训练后的高频图像生成网络，否则，调整所述待训练的高频图像生成网络的参数，并返回执行将所述第四高频图像输入待训练的高频图像生成网络的步骤以及后续步骤，其中，所述待训练的高频图像生成网络的损失函数包括待训练的判别网络，所述待训练的判别网络通过所述待训练的判别网络的损失函数进行训练，且所述待训练的判别网络的损失函数的输入包括所述第三高频图像和所述第五高频图像；Step S45, if the change of the output value of the loss function of the high-frequency image generation network to be trained is within a preset range, stop the training of the high-frequency image generation network to be trained, and obtain the high-frequency image after training. image generation network, otherwise, adjust the parameters of the high-frequency image generation network to be trained, and return to perform the step of inputting the fourth high-frequency image into the high-frequency image generation network to be trained and subsequent steps, wherein the The loss function of the high-frequency image generation network to be trained includes the discriminant network to be trained, the discriminant network to be trained is trained by the loss function of the discriminant network to be trained, and the loss function of the discriminant network to be trained The input includes the third high-frequency image and the fifth high-frequency image;

其中，高频图像生成网络与判别网络构成生成对抗网络系统，即高频图像生成网络与判别网络相互训练。Among them, the high-frequency image generation network and the discriminant network constitute a generative adversarial network system, that is, the high-frequency image generation network and the discriminant network are mutually trained.

本实施例中，若待训练的高频图像生成网络的损失函数的相邻输出值的绝对值差值在预设范围内(该预设范围为较小的范围，比如，为[0，5])，则表明该损失函数达到收敛状态，此时，停止对待训练的高频图像生成网络的训练，即将最后一次训练过的高频图像生成网络作为训练后的高频图像生成网络。当然，若损失函数的相邻输出值的绝对值差值不在预设范围内，则返回执行步骤S42、步骤S43以及步骤S43之后的步骤。In this embodiment, if the absolute value difference between the adjacent output values of the loss function of the high-frequency image generation network to be trained is within a preset range (the preset range is a small range, for example, it is [0, 5 ]), it indicates that the loss function has reached a state of convergence. At this time, the training of the high-frequency image generation network to be trained is stopped, that is, the last trained high-frequency image generation network is used as the trained high-frequency image generation network. Of course, if the absolute value difference between the adjacent output values of the loss function is not within the preset range, return to step S42, step S43 and steps after step S43.

在一些实施例中，待训练的高频图像生成网络的损失函数包括3部分：表示内容损失的L_content，表示视觉差异的L_perceptual和语义损失的L_texture，如公式(5)～(7)所示。当然，若待训练的高频图像生成网络的损失函数包括上述列举的3部分，那么损失函数的输出值的变化在预设范围内可以指：3部分的变化的和在预设范围内；也可以指任一部分的变化都在预设范围内，比如L_content的输出值的变化在预设范围内、L_perceptual的输出值的变化在预设范围内、以及L_texture的变化在预设范围内。In some embodiments, the loss function of the high-frequency image generation network to be trained includes three parts: L _content , which represents the content loss, L _perceptual , which represents the visual difference, and L _texture , which represents the semantic loss, as shown in formulas (5)-(7) shown. Of course, if the loss function of the high-frequency image generation network to be trained includes the three parts listed above, then the change of the output value of the loss function within the preset range may refer to: the sum of the changes of the three parts is within the preset range; or It can mean that the change of any part is within the preset range, such as the change of the output value of L _content is within the preset range, the change of the output value of L _perceptual is within the preset range, and the change of L _texture is within the preset range .

其中G_H为待训练的高频图像生成网络，x_H ⁽ⁱ⁾表示第i个第四高频图像，

表示第i个第五高频图像，m表示一批第四高频图像的个数，GT_H ⁽ⁱ⁾表示第i个第三高频图像，“||||”为L1范数。需要指出的是，在一些实施例中，该L1范数也可以换为L2范数。where G _H is the high-frequency image generation network to be trained, x _H ⁽ⁱ⁾ represents the i-th fourth high-frequency image,

represents the ith fifth high-frequency image, m represents the number of fourth high-frequency images in a batch, GT _H ⁽ⁱ⁾ represents the ith third high-frequency image, and “||||” is the L1 norm. It should be noted that, in some embodiments, the L1 norm may also be replaced by the L2 norm.

L_perceptual代表高频图像生成网络的输入图像和输出图像在视觉上的差异，其中φ()表示图像经过VGG网络(Justin Johnson et.al.，′Perceptual Losses for Real-TimeStyle Transfer and Super-Resolution″in ECCV 2016)得到的特征(比如为relu3_3层的特征，或者为relu4_3层的特征，或者为relu2_2层的特征，或者为relu1_2层的特征等)，C、H、W分别表示x_H ⁽ⁱ⁾的通道数、高度和宽度。L _perceptual represents the visual difference between the input image and the output image of the high-frequency image generation network, where φ() represents the image passing through the VGG network (Justin Johnson et.al., 'Perceptual Losses for Real-TimeStyle Transfer and Super-Resolution' in ECCV 2016) (such as the features of the relu3_3 layer, or the features of the relu4_3 layer, or the features of the relu2_2 layer, or the features of the relu1_2 layer, etc.), C, H, W represent x _H ⁽ⁱ⁾ number of channels, height and width.

在一些实施例中，上述VGG网络得到的特征也可以通过学习感知图像修补相似度(Learned Perceptual Image Patch Similarity，LPIPS)得到。In some embodiments, the features obtained by the above-mentioned VGG network can also be obtained by learning Perceptual Image Patch Similarity (Learned Perceptual Image Patch Similarity, LPIPS).

L_texture表示经过待训练的判别网络(D_H)判断生成的第五高频图像的分数，越接近真实图像GT_H的图像特征，得到的分数就越高。因为经过待训练的判别网络得到的是一张特征图，所以单张图的语义损失的损失函数(loss)是特征图平均的结果，即公式中mean()表示对特征图取平均。L _texture represents the score of the fifth high-frequency image generated by the discriminant network (D _H ) to be trained. The closer it is to the image features of the real image GT _H , the higher the obtained score. Because the discriminant network to be trained obtains a feature map, the loss function (loss) of the semantic loss of a single image is the result of the average of the feature maps, that is, mean() in the formula means to average the feature maps.

用于训练待训练的判别网络的损失函数(loss)L_D：The loss function (loss)LD used to train the discriminative network to be _trained :

步骤S46，将所述第四低频图像输入待训练的低频图像生成网络，得到所述待训练的低频图像生成网络输出的第五低频图像；Step S46, inputting the fourth low-frequency image into a low-frequency image generation network to be trained, to obtain a fifth low-frequency image output by the low-frequency image generation network to be trained;

步骤S47，若所述待训练的低频图像生成网络的损失函数的输出值的变化在预设范围内，则停止对所述待训练的低频图像生成网络的训练，得到训练后的低频图像生成网络，否则，调整所述待训练的低频图像生成网络的参数，并返回执行步骤S46和步骤S47，其中，所述待训练的低频图像生成网络的损失函数的输入包括所述第三低频图像和所述第五低频图像。Step S47, if the change of the output value of the loss function of the low-frequency image generation network to be trained is within a preset range, stop the training of the low-frequency image generation network to be trained, and obtain the low-frequency image generation network after training , otherwise, adjust the parameters of the low-frequency image generation network to be trained, and return to step S46 and step S47, wherein the input of the loss function of the low-frequency image generation network to be trained includes the third low-frequency image and the The fifth low-frequency image is described.

其中，待训练的低频图像生成网络的损失函数包括：表示内容损失的L′_content和表示视觉差异的L′_perceptual，其中，L′_content和L′_perceptual的定义参考上述的公式(5)和公式(6)。Wherein, the loss function of the low-frequency image generation network to be trained includes: L' _content representing the content loss and L' _perceptual representing the visual difference, wherein the definitions of L' _content and L' _perceptual refer to the above formula (5) and formula (6).

本申请实施例中，由于训练高频图像生成网络以及低频图像生成网络的样本都是终端设备采集的真实的图像，因此，使得训练后的高频图像生成网络以及训练后的低频图像生成网络所输出的图像更准确，更符合要求。另外，由于在训练高频图像生成网络和低频图像生成网络时，是通过分别提取图像的高频分量和低频分量，再对由高频分量组成的高频图像和由低频分量组成的低频图像进行处理来实现的，而由于终端设备(如智能手机)采集的图像后处理所产生的伪影以及经过图像压缩留下的噪声大部分集中在图像低频区，真正需要被增强的细节集中在图像的高频区，因此，通过将图像的高频分量和低频分量分开处理，能够在细节被增强的同时压制噪声和伪影。即，根据训练后的高频图像生成网络生成的高频图像和根据训练后的低频图像生成网络生成低频图像，能够重建出噪声和伪影更少，细节更清晰的超分图像。In the embodiment of the present application, since the samples for training the high-frequency image generation network and the low-frequency image generation network are real images collected by the terminal device, the trained high-frequency image generation network and the trained low-frequency image generation network are all real images. The output image is more accurate and more in line with the requirements. In addition, when training the high-frequency image generation network and the low-frequency image generation network, the high-frequency components and low-frequency components of the image are extracted respectively, and then the high-frequency image composed of high-frequency components and the low-frequency image composed of low-frequency components are processed. However, due to the artifacts generated by the post-processing of the image collected by the terminal device (such as a smartphone) and the noise left by image compression, most of the noise is concentrated in the low-frequency area of the image, and the details that really need to be enhanced are concentrated in the image. The high frequency region, therefore, can suppress noise and artifacts while detail is enhanced by processing the high frequency and low frequency components of the image separately. That is, the high-frequency image generated by the trained high-frequency image generation network and the low-frequency image generated by the trained low-frequency image generation network can reconstruct a super-segmented image with less noise and artifacts and clearer details.

图5示出了本申请实施例提供的训练后的高频图像生成网络的结构图，在图5中，所述训练后的高频图像生成网络包括：第一特征提取层、第一特征转换层、放大层以及第一融合层，其中M表示第一特征转换层具有M层。第二高频图像分别经过第一特征提取层、第一特征转换层、放大层和第一融合层之后，将得到第三高频图像。FIG. 5 shows a structure diagram of a trained high-frequency image generation network provided by an embodiment of the present application. In FIG. 5 , the trained high-frequency image generation network includes: a first feature extraction layer, a first feature conversion layer layer, magnification layer, and first fusion layer, where M indicates that the first feature conversion layer has M layers. After the second high-frequency image passes through the first feature extraction layer, the first feature conversion layer, the amplification layer, and the first fusion layer, the third high-frequency image will be obtained.

具体地，第一特征提取层用于提取特征；第一特征转换层用于组合提取的特征；放大层用于放大组合后的特征；第一融合层用于将放大后的至少2个通道的特征融合为一张图像输出。Specifically, the first feature extraction layer is used to extract features; the first feature conversion layer is used to combine the extracted features; the amplification layer is used to amplify the combined features; the first fusion layer is used to The features are fused into one image output.

其中，所述放大层通过插值算法实现放大功能，所述第一特征提取层、所述第一特征转换层以及所述第一融合层均通过卷积算法实现对应的功能。由于将插值算法和卷积算法结合，因此，使得得到的第三高频图像能够减少出现棋盘格噪声的现象。Wherein, the amplification layer realizes the amplification function through an interpolation algorithm, and the first feature extraction layer, the first feature conversion layer and the first fusion layer all realize the corresponding functions through a convolution algorithm. Since the interpolation algorithm and the convolution algorithm are combined, the third high-frequency image obtained can reduce the phenomenon of checkerboard noise.

在一些实施例中，所述第一特征提取层至少包括2层，且后一层的通道数大于前一层的通道数；所述第一特征转换层至少包括2层，且每一层的通道数保持不变；所述第一特征提取层的卷积核和所述第一融合层的卷积核均大于所述第一特征转换层的卷积核，且所述第一特征提取层、所述第一融合层以及所述第一特征转换层对应的图像缩放比例均大于或等于1。In some embodiments, the first feature extraction layer includes at least two layers, and the number of channels in the latter layer is greater than the number of channels in the previous layer; the first feature conversion layer includes at least two layers, and the number of channels in each layer is greater than that in the previous layer. The number of channels remains unchanged; the convolution kernel of the first feature extraction layer and the convolution kernel of the first fusion layer are both larger than the convolution kernel of the first feature conversion layer, and the first feature extraction layer , the image scaling ratios corresponding to the first fusion layer and the first feature conversion layer are all greater than or equal to 1.

由于所述第一特征提取层的卷积核和所述第一融合层的卷积核均大于所述第一特征转换层的卷积核，因此，使得得到的第三高频图像的视野更大，细节更真实。此外，由于本申请是需要对图像进行超分处理，因此，需要保证图像缩放比例不能小于1。Since both the convolution kernel of the first feature extraction layer and the convolution kernel of the first fusion layer are larger than the convolution kernel of the first feature conversion layer, the obtained third high-frequency image has a wider field of view. Larger, more realistic details. In addition, since the application needs to perform super-score processing on the image, it is necessary to ensure that the image scaling ratio cannot be less than 1.

其中，第一特征提取层是用较大的感受野的卷积核来提取图像的低级特征的过程，这些低级特征包括梯度、亮度、大小关系等。为了提取出更多的细节，第一特征提取层的层数限定至少包括2层。假设第一特征提取层有2层，则第2层的通道数多于第1层的通道数。下面表1以一个示例说明第一特征提取层的具体参数。Among them, the first feature extraction layer is a process of extracting low-level features of an image by using a convolution kernel with a larger receptive field, and these low-level features include gradients, brightness, and size relationships. In order to extract more details, the number of layers of the first feature extraction layer is limited to include at least 2 layers. Assuming that the first feature extraction layer has 2 layers, the number of channels in the second layer is more than the number of channels in the first layer. Table 1 below illustrates the specific parameters of the first feature extraction layer with an example.

表1：Table 1:

其中，第一特征转换层是将第一特征提取层提取的低级特征经过非线性组合得到高级特征的过程，这些高级特征包括结构、形状等。层数越多，特征的非线性程度越高，就越能够表达更加复杂图像结构，越有利于提高重建的图像的真实性。由于卷积核越小，其处理速度越快，而第一特征转换层的主要目的是将得到低级特征进行组合，因此，无需通过更多的通道额外获取更多的信息，因此，设定第一特征转换层选取卷积核小的卷积，大小不变(x1)，通道数也不变，且重复M次。这里我们选M＝16，具体的参数可以参考表2。The first feature conversion layer is a process of nonlinearly combining the low-level features extracted by the first feature extraction layer to obtain high-level features, and these high-level features include structure, shape, and the like. The more layers, the higher the degree of nonlinearity of the feature, the more complex the image structure can be expressed, and the more beneficial to improve the authenticity of the reconstructed image. Since the smaller the convolution kernel is, the faster its processing speed is, and the main purpose of the first feature conversion layer is to combine the obtained low-level features, so there is no need to obtain more information through more channels. Therefore, set the first A feature conversion layer selects a convolution with a small convolution kernel, the size is unchanged (x1), the number of channels is also unchanged, and it is repeated M times. Here we choose M=16, and the specific parameters can refer to Table 2.

表2：Table 2:

其中，放大层并不是卷积层，而是插值放大层，其将第一特征转换层得到的特征的宽和高放大到所需的比例，比如，放大2倍(x2)，第一融合层紧跟在放大层之后，目的是将多个通道的特征融合为一个通道的一张图像输出，这两层的参数可参考表3。Among them, the enlargement layer is not a convolution layer, but an interpolation enlargement layer, which enlarges the width and height of the feature obtained by the first feature conversion layer to the required ratio, for example, enlarges 2 times (x2), the first fusion layer Immediately after the amplification layer, the purpose is to fuse the features of multiple channels into one image output of one channel. The parameters of the two layers can be referred to in Table 3.

表3：table 3:

需要指出的是，待训练的低频图像生成网络的结构与待训练的高频图像生成网络的结构类似，具体可参考图5。It should be pointed out that the structure of the low-frequency image generation network to be trained is similar to the structure of the high-frequency image generation network to be trained. For details, please refer to FIG. 5 .

在一些实施例中，所述待训练的判别网络包括：用于提取特征的第二特征提取层、用于组合提取的特征的第二特征转换层，以及用于将至少2个通道的特征融合为一张图像输出的第二融合层，且所述第二特征提取层、第二特征转换层以及第二融合层对应的图像缩放比例均小于1。In some embodiments, the discriminative network to be trained comprises: a second feature extraction layer for extracting features, a second feature transformation layer for combining the extracted features, and a feature fusion for at least 2 channels is the second fusion layer output from an image, and the image scaling ratios corresponding to the second feature extraction layer, the second feature conversion layer, and the second fusion layer are all less than 1.

具体地，图6示出了待训练的判别网络的结构示意图，在图6中，第一高频图像和第三高频图像分别输入待训练的判别网络中，经过第二特征提取层、第二特征转换层以及第二融合层之后，输入该待训练的判别网络的损失函数中进行比较，以确定是否继续对该待训练的判别网络进行训练。Specifically, Fig. 6 shows a schematic diagram of the structure of the discriminant network to be trained. In Fig. 6, the first high-frequency image and the third high-frequency image are respectively input into the discriminant network to be trained. After the second feature conversion layer and the second fusion layer, input the loss function of the discriminant network to be trained for comparison to determine whether to continue training the discriminant network to be trained.

需要指出的是，第二特征提取层、第二特征转换层和第二融合层的功能分别与第一特征提取层、第一特征转换层和第一融合层的功能类似，此处不再赘述。It should be pointed out that the functions of the second feature extraction layer, the second feature conversion layer and the second fusion layer are respectively similar to the functions of the first feature extraction layer, the first feature conversion layer and the first fusion layer, and will not be repeated here. .

本实施例中，由于待训练的判别网络主要是分辨经过待训练的高频图像生成网络生成的第五高频图像与第三高频图像，因此，其不需要获取更多对构建图像真实性有利的信息，即不需要为第二特征提取层、第二特征转换层设置更多的层数，这样也有利于提高运算速度。此外，设置第二特征提取层、第二特征转换层以及第二融合层对应的图像缩放比例均小于1也有利于提高运算速度。In this embodiment, since the discriminant network to be trained mainly distinguishes the fifth high-frequency image and the third high-frequency image generated by the high-frequency image generation network to be trained, it does not need to obtain more information on the authenticity of the constructed image. The favorable information is that it is not necessary to set more layers for the second feature extraction layer and the second feature conversion layer, which is also beneficial to improve the operation speed. In addition, setting the image scaling ratios corresponding to the second feature extraction layer, the second feature conversion layer, and the second fusion layer to be less than 1 is also beneficial to improve the operation speed.

具体地，第二特征提取层的层数可设置为1层，该第二特征提取层的参数可参考表4。Specifically, the number of layers of the second feature extraction layer can be set to 1 layer, and Table 4 can be referred to for parameters of the second feature extraction layer.

表4：Table 4:

第二特征转换层的层数可设置为2层，该第二特征转换层的参数可参考表5。The number of layers of the second feature conversion layer can be set to 2, and the parameters of the second feature conversion layer can refer to Table 5.

表5：table 5:

第二融合层紧跟在第二特征转换层之后，其具体参数可参考表6：The second fusion layer follows the second feature conversion layer, and its specific parameters can refer to Table 6:

表6：Table 6:

图7示出了本申请实施例提供的另一种生成网络的训练方法流程图。FIG. 7 shows a flowchart of another method for training a generative network provided by an embodiment of the present application.

在图7中，第一图片的分辨率高于第二图片的分辨率。第一图片将经过颜色空间转换操作，转换为YUV颜色空间，并提取Y通道的数据，再通过低通滤波器提取该Y通道中的高频分量和低频分量，得到由该高频分量组成的第三高频图片和由该低频分量组成的第三低频图片。In FIG. 7, the resolution of the first picture is higher than the resolution of the second picture. The first picture will be converted into YUV color space through the color space conversion operation, and the data of the Y channel will be extracted, and then the high-frequency components and low-frequency components in the Y channel will be extracted through a low-pass filter to obtain the high-frequency components. A third high frequency picture and a third low frequency picture composed of the low frequency component.

第二图片同样经过颜色空间转换操作，转换为YUV颜色空间，并提取Y通道的数据，再通过低通滤波器提取该Y通道中的高频分量和低频分量，得到由该高频分量组成的第四高频图片和由该低频分量组成的第四低频图片。该第四高频图片经过待训练的高频图像生成网络，得到该待训练的高频图像生成网络输出的第五高频图片。该第四低频图片经过待训练的低频图像生成网络，得到该待训练的低频图像生成网络输出的第五低频图片。The second picture is also converted into YUV color space through the color space conversion operation, and the data of the Y channel is extracted, and then the high-frequency component and the low-frequency component in the Y channel are extracted through a low-pass filter to obtain the high-frequency component. A fourth high frequency picture and a fourth low frequency picture composed of the low frequency component. The fourth high-frequency image is passed through the high-frequency image generation network to be trained to obtain a fifth high-frequency image output by the high-frequency image generation network to be trained. The fourth low-frequency image is passed through the low-frequency image generation network to be trained to obtain a fifth low-frequency image output by the low-frequency image generation network to be trained.

将第三低频图片和第五低频图片作为上述公式(5)和公式(6)的输入，计算低频图像生成网络的内容损失和视觉差异，并根据计算结果判断是否继续对该低频图像生成网络进行训练。Taking the third low-frequency picture and the fifth low-frequency picture as the input of the above formula (5) and formula (6), calculate the content loss and visual difference of the low-frequency image generation network, and judge whether to continue the low-frequency image generation network according to the calculation results. train.

将第三高频图片和第五高频图片作为上述公式(5)、公式(6)、公式(7)以及公式(8)的输入，计算高频图像生成网络的内容损失、视觉差异、语义损失，以及计算判别网络的损失函数的输出值，根据计算的内容损失、视觉差异、语义损失以及判别网络的损失函数的输出值判断是否继续对该高频图像生成网络进行训练。Taking the third and fifth high-frequency pictures as the input of the above formula (5), formula (6), formula (7) and formula (8), calculate the content loss, visual difference, semantics of high-frequency image generation network Loss, and calculate the output value of the loss function of the discriminant network, and determine whether to continue training the high-frequency image generation network according to the calculated content loss, visual difference, semantic loss, and the output value of the loss function of the discriminant network.

实施例二：Embodiment 2:

对应于上文实施例一所述的图像处理方法，图8示出了本申请实施例提供的图像处理装置的结构框图，为了便于说明，仅示出了与本申请实施例相关的部分。Corresponding to the image processing method described in Embodiment 1 above, FIG. 8 shows a structural block diagram of an image processing apparatus provided by an embodiment of the present application. For convenience of description, only parts related to the embodiment of the present application are shown.

参照图8，该图像处理装置8包括：待重建的图像获取单元81、高低频图像提取单元82、第二高频图像生成单元83、第二低频图像生成单元84、图像重建单元85，其中：8 , the image processing device 8 includes: an image acquisition unit 81 to be reconstructed, a high and low frequency image extraction unit 82, a second high frequency image generation unit 83, a second low frequency image generation unit 84, and an image reconstruction unit 85, wherein:

待重建的图像获取单元81，用于获取待重建的图像；an image acquisition unit 81 to be reconstructed, configured to acquire an image to be reconstructed;

高低频图像提取单元82，用于提取所述待重建的图像的高频分量和低频分量，得到由所述高频分量组成的第一高频图像和由所述低频分量组成的第一低频图像，其中，所述高频分量是指大于或等于预设频率阈值的频率分量，所述低频分量是指小于所述预设频率阈值的频率分量；High and low frequency image extraction unit 82, configured to extract high frequency components and low frequency components of the image to be reconstructed, and obtain a first high frequency image composed of the high frequency components and a first low frequency image composed of the low frequency components , wherein the high-frequency component refers to a frequency component greater than or equal to a preset frequency threshold, and the low-frequency component refers to a frequency component less than the preset frequency threshold;

第二高频图像生成单元83，用将所述第一高频图像输入训练后的高频图像生成网络，得到所述高频图像生成网络输出的第二高频图像；The second high-frequency image generation unit 83 is used to input the first high-frequency image into a trained high-frequency image generation network to obtain a second high-frequency image output by the high-frequency image generation network;

第二低频图像生成单元84，用于将所述第一低频图像输入训练后的低频图像生成网络，得到所述低频图像生成网络输出的第二低频图像；The second low-frequency image generation unit 84 is configured to input the first low-frequency image into a trained low-frequency image generation network to obtain a second low-frequency image output by the low-frequency image generation network;

图像重建单元85，用于根据所述第二高频图像和所述第二低频图像生成重建后的图像，所述重建后的图像的分辨率高于所述待重建的图像的分辨率。The image reconstruction unit 85 is configured to generate a reconstructed image according to the second high-frequency image and the second low-frequency image, where the resolution of the reconstructed image is higher than the resolution of the image to be reconstructed.

在一些实施例中，为了降低网络的运行消耗，提高处理速度，只对图像的Y通道的数据进行处理，此时，所述高低频图像提取单元82，具体用于：In some embodiments, in order to reduce the running consumption of the network and improve the processing speed, only the data of the Y channel of the image is processed. At this time, the high and low frequency image extraction unit 82 is specifically used for:

将所述待重建的图像的颜色空间转换为YUV颜色空间，提取所述待重建的图像的Y通道的高频分量和低频分量，得到由Y通道的高频分量组成的第一高频图像，以及由Y通道的低频分量组成的第一低频图像。Converting the color space of the image to be reconstructed into the YUV color space, extracting the high-frequency component and the low-frequency component of the Y channel of the image to be reconstructed, to obtain the first high-frequency image composed of the high-frequency component of the Y channel, and a first low frequency image consisting of the low frequency components of the Y channel.

需要说明的是，上述装置/单元之间的信息交互、执行过程等内容，由于与本申请方法实施例基于同一构思，其具体功能及带来的技术效果，具体可参见方法实施例部分，此处不再赘述。It should be noted that the information exchange, execution process and other contents between the above-mentioned devices/units are based on the same concept as the method embodiments of the present application. For specific functions and technical effects, please refer to the method embodiments section. It is not repeated here.

实施例三：Embodiment three:

图9为本申请实施例三提供的终端设备的结构示意图。如图9所示，该实施例的终端设备9包括：至少一个处理器90(图9中仅示出一个处理器)、存储器91以及存储在所述存储器91中并可在所述至少一个处理器90上运行的计算机程序92，所述处理器90执行所述计算机程序92时实现上述任意各个方法实施例中的步骤：FIG. 9 is a schematic structural diagram of a terminal device according to Embodiment 3 of the present application. As shown in FIG. 9 , the terminal device 9 in this embodiment includes: at least one processor 90 (only one processor is shown in FIG. 9 ), a memory 91 , and a memory 91 that is stored in the memory 91 and can be processed in the at least one processor. The computer program 92 running on the processor 90, when the processor 90 executes the computer program 92, implements the steps in any of the foregoing method embodiments:

获取待重建的图像；Get the image to be reconstructed;

所述终端设备9可以是桌上型计算机、笔记本、掌上电脑及云端服务器等计算设备。该终端设备可包括，但不仅限于，处理器90、存储器91。本领域技术人员可以理解，图9仅仅是终端设备9的举例，并不构成对终端设备9的限定，可以包括比图示更多或更少的部件，或者组合某些部件，或者不同的部件，例如还可以包括输入输出设备、网络接入设备等。The terminal device 9 may be a computing device such as a desktop computer, a notebook, a palmtop computer, and a cloud server. The terminal device may include, but is not limited to, a processor 90 and a memory 91 . Those skilled in the art can understand that FIG. 9 is only an example of the terminal device 9, and does not constitute a limitation on the terminal device 9. It may include more or less components than the one shown, or combine some components, or different components , for example, may also include input and output devices, network access devices, and the like.

所称处理器90可以是中央处理单元(Central Processing Unit，CPU)，该处理器90还可以是其他通用处理器、数字信号处理器(Digital Signal Processor，DSP)、专用集成电路(Application Specific Integrated Circuit，ASIC)、现场可编程门阵列(Field-Programmable Gate Array，FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。The so-called processor 90 may be a central processing unit (Central Processing Unit, CPU), and the processor 90 may also be other general-purpose processors, digital signal processors (Digital Signal Processors, DSP), application specific integrated circuits (Application Specific Integrated Circuits) , ASIC), field programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.

所述存储器91在一些实施例中可以是所述终端设备9的内部存储单元，例如终端设备9的硬盘或内存。所述存储器91在另一些实施例中也可以是所述终端设备9的外部存储设备，例如所述终端设备9上配备的插接式硬盘，智能存储卡(Smart Media Card,SMC)，安全数字(Secure Digital,SD)卡，闪存卡(Flash Card)等。进一步地，所述存储器91还可以既包括所述终端设备9的内部存储单元也包括外部存储设备。所述存储器91用于存储操作系统、应用程序、引导装载程序(BootLoader)、数据以及其他程序等，例如所述计算机程序的程序代码等。所述存储器91还可以用于暂时地存储已经输出或者将要输出的数据。The memory 91 may be an internal storage unit of the terminal device 9 in some embodiments, such as a hard disk or a memory of the terminal device 9 . The memory 91 may also be an external storage device of the terminal device 9 in other embodiments, such as a plug-in hard disk, a smart memory card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, flash memory card (Flash Card), etc. Further, the memory 91 may also include both an internal storage unit of the terminal device 9 and an external storage device. The memory 91 is used to store an operating system, an application program, a boot loader (Boot Loader), data, and other programs, such as program codes of the computer program. The memory 91 can also be used to temporarily store data that has been output or will be output.

所属领域的技术人员可以清楚地了解到，为了描述的方便和简洁，仅以上述各功能单元、模块的划分进行举例说明，实际应用中，可以根据需要而将上述功能分配由不同的功能单元、模块完成，即将所述装置的内部结构划分成不同的功能单元或模块，以完成以上描述的全部或者部分功能。实施例中的各功能单元、模块可以集成在一个处理单元中，也可以是各个单元单独物理存在，也可以两个或两个以上单元集成在一个单元中，上述集成的单元既可以采用硬件的形式实现，也可以采用软件功能单元的形式实现。另外，各功能单元、模块的具体名称也只是为了便于相互区分，并不用于限制本申请的保护范围。上述系统中单元、模块的具体工作过程，可以参考前述方法实施例中的对应过程，在此不再赘述。Those skilled in the art can clearly understand that, for the convenience and simplicity of description, only the division of the above-mentioned functional units and modules is used as an example. Module completion, that is, dividing the internal structure of the device into different functional units or modules to complete all or part of the functions described above. Each functional unit and module in the embodiment may be integrated in one processing unit, or each unit may exist physically alone, or two or more units may be integrated in one unit, and the above-mentioned integrated units may adopt hardware. It can also be realized in the form of software functional units. In addition, the specific names of the functional units and modules are only for the convenience of distinguishing from each other, and are not used to limit the protection scope of the present application. For the specific working processes of the units and modules in the above-mentioned system, reference may be made to the corresponding processes in the foregoing method embodiments, which will not be repeated here.

本申请实施例还提供了一种网络设备，该网络设备包括：至少一个处理器、存储器以及存储在所述存储器中并可在所述至少一个处理器上运行的计算机程序，所述处理器执行所述计算机程序时实现上述任意各个方法实施例中的步骤。An embodiment of the present application also provides a network device, the network device includes: at least one processor, a memory, and a computer program stored in the memory and executable on the at least one processor, the processor executing The computer program implements the steps in any of the foregoing method embodiments.

本申请实施例还提供了一种计算机可读存储介质，所述计算机可读存储介质存储有计算机程序，所述计算机程序被处理器执行时实现可实现上述各个方法实施例中的步骤。Embodiments of the present application further provide a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and when the computer program is executed by a processor, the steps in the foregoing method embodiments can be implemented.

本申请实施例提供了一种计算机程序产品，当计算机程序产品在移动终端上运行时，使得移动终端执行时实现可实现上述各个方法实施例中的步骤。The embodiments of the present application provide a computer program product, when the computer program product runs on a mobile terminal, the steps in the foregoing method embodiments can be implemented when the mobile terminal executes the computer program product.

所述集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时，可以存储在一个计算机可读取存储介质中。基于这样的理解，本申请实现上述实施例方法中的全部或部分流程，可以通过计算机程序来指令相关的硬件来完成，所述的计算机程序可存储于一计算机可读存储介质中，该计算机程序在被处理器执行时，可实现上述各个方法实施例的步骤。其中，所述计算机程序包括计算机程序代码，所述计算机程序代码可以为源代码形式、对象代码形式、可执行文件或某些中间形式等。所述计算机可读介质至少可以包括：能够将计算机程序代码携带到拍照装置/终端设备的任何实体或装置、记录介质、计算机存储器、只读存储器(ROM，Read-Only Memory)、随机存取存储器(RAM，RandomAccess Memory)、电载波信号、电信信号以及软件分发介质。例如U盘、移动硬盘、磁碟或者光盘等。在某些司法管辖区，根据立法和专利实践，计算机可读介质不可以是电载波信号和电信信号。The integrated unit, if implemented in the form of a software functional unit and sold or used as an independent product, may be stored in a computer-readable storage medium. Based on this understanding, the present application realizes all or part of the processes in the methods of the above embodiments, which can be completed by instructing the relevant hardware through a computer program, and the computer program can be stored in a computer-readable storage medium. When executed by a processor, the steps of each of the above method embodiments can be implemented. Wherein, the computer program includes computer program code, and the computer program code may be in the form of source code, object code, executable file or some intermediate form, and the like. The computer-readable medium may include at least: any entity or device capable of carrying the computer program code to the photographing device/terminal device, recording medium, computer memory, read-only memory (ROM, Read-Only Memory), random access memory (RAM, RandomAccess Memory), electrical carrier signal, telecommunication signal, and software distribution medium. For example, U disk, mobile hard disk, disk or CD, etc. In some jurisdictions, under legislation and patent practice, computer readable media may not be electrical carrier signals and telecommunications signals.

在上述实施例中，对各个实施例的描述都各有侧重，某个实施例中没有详述或记载的部分，可以参见其它实施例的相关描述。In the foregoing embodiments, the description of each embodiment has its own emphasis. For parts that are not described or described in detail in a certain embodiment, reference may be made to the relevant descriptions of other embodiments.

本领域普通技术人员可以意识到，结合本文中所公开的实施例描述的各示例的单元及算法步骤，能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行，取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能，但是这种实现不应认为超出本申请的范围。Those of ordinary skill in the art can realize that the units and algorithm steps of each example described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may implement the described functionality using different methods for each particular application, but such implementations should not be considered beyond the scope of this application.

在本申请所提供的实施例中，应该理解到，所揭露的装置/网络设备和方法，可以通过其它的方式实现。例如，以上所描述的装置/网络设备实施例仅仅是示意性的，例如，所述模块或单元的划分，仅仅为一种逻辑功能划分，实际实现时可以有另外的划分方式，例如多个单元或组件可以结合或者可以集成到另一个系统，或一些特征可以忽略，或不执行。另一点，所显示或讨论的相互之间的耦合或直接耦合或通讯连接可以是通过一些接口，装置或单元的间接耦合或通讯连接，可以是电性，机械或其它的形式。In the embodiments provided in this application, it should be understood that the disclosed apparatus/network device and method may be implemented in other manners. For example, the apparatus/network device embodiments described above are only illustrative. For example, the division of the modules or units is only a logical function division. In actual implementation, there may be other division methods, such as multiple units. Or components may be combined or may be integrated into another system, or some features may be omitted, or not implemented. On the other hand, the shown or discussed mutual coupling or direct coupling or communication connection may be through some interfaces, indirect coupling or communication connection of devices or units, and may be in electrical, mechanical or other forms.

所述作为分离部件说明的单元可以是或者也可以不是物理上分开的，作为单元显示的部件可以是或者也可以不是物理单元，即可以位于一个地方，或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。The units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.

以上所述实施例仅用以说明本申请的技术方案，而非对其限制；尽管参照前述实施例对本申请进行了详细的说明，本领域的普通技术人员应当理解：其依然可以对前述各实施例所记载的技术方案进行修改，或者对其中部分技术特征进行等同替换；而这些修改或者替换，并不使相应技术方案的本质脱离本申请各实施例技术方案的精神和范围，均应包含在本申请的保护范围之内。The above-mentioned embodiments are only used to illustrate the technical solutions of the present application, but not to limit them; although the present application has been described in detail with reference to the above-mentioned embodiments, those of ordinary skill in the art should understand that: it can still be used for the above-mentioned implementations. The technical solutions described in the examples are modified, or some technical features thereof are equivalently replaced; and these modifications or replacements do not make the essence of the corresponding technical solutions deviate from the spirit and scope of the technical solutions in the embodiments of the application, and should be included in the within the scope of protection of this application.

Claims

1. An image processing method, comprising:

acquiring an image to be reconstructed;

extracting a high-frequency component and a low-frequency component of the image to be reconstructed to obtain a first high-frequency image composed of the high-frequency component and a first low-frequency image composed of the low-frequency component, wherein the high-frequency component is a frequency component larger than or equal to a preset frequency threshold, and the low-frequency component is a frequency component smaller than the preset frequency threshold;

inputting the first high-frequency image into a trained high-frequency image generation network to obtain a second high-frequency image output by the high-frequency image generation network;

inputting the first low-frequency image into a trained low-frequency image generation network to obtain a second low-frequency image output by the low-frequency image generation network;

and generating a reconstructed image according to the second high-frequency image and the second low-frequency image, wherein the resolution of the reconstructed image is higher than that of the image to be reconstructed.

2. The image processing method according to claim 1, wherein said extracting high frequency components and low frequency components of the image to be reconstructed to obtain a first high frequency image composed of the high frequency components and a first low frequency image composed of the low frequency components comprises:

and converting the color space of the image to be reconstructed into a YUV color space, extracting the high-frequency component and the low-frequency component of the Y channel of the image to be reconstructed, and obtaining a first high-frequency image consisting of the high-frequency component of the Y channel and a first low-frequency image consisting of the low-frequency component of the Y channel.

3. The image processing method of claim 1, wherein the high frequency image generation network is trained by:

acquiring a training sample set, wherein the training sample set stores: the method comprises the steps that a first image and a second image are acquired by terminal equipment, wherein the first image corresponds to the first image one by one, and the resolution of the first image is higher than that of the second image;

acquiring a first image from the training sample set, and extracting a third high-frequency image formed by high-frequency components of the first image;

acquiring a second image corresponding to the first image from the training sample set, and extracting a fourth high-frequency image formed by high-frequency components of the second image;

inputting the fourth high-frequency image into a high-frequency image generation network to be trained to obtain a fifth high-frequency image output by the high-frequency image generation network to be trained;

if the variation of the output value of the loss function of the high-frequency image generation network to be trained is within a preset range, stopping training the high-frequency image generation network to be trained to obtain the trained high-frequency image generation network, otherwise, adjusting the parameters of the high-frequency image generation network to be trained, and returning to execute the step of inputting the fourth high-frequency image into the high-frequency image generation network to be trained and the subsequent steps, wherein the loss function of the high-frequency image generation network to be trained comprises a discriminant network to be trained, the discriminant network to be trained is trained through the loss function of the discriminant network to be trained, and the input of the loss function of the discriminant network to be trained comprises the third high-frequency image and the fifth high-frequency image.

4. The image processing method of any of claims 1 to 3, wherein the trained high frequency image generation network comprises: the system comprises a first feature extraction layer for extracting features, a first feature conversion layer for combining the extracted features, an amplification layer for amplifying the combined features, and a first fusion layer for fusing the amplified features of at least 2 channels into an image and outputting the image; the amplification layer realizes an amplification function through an interpolation algorithm, and the first feature extraction layer, the first feature conversion layer and the first fusion layer realize corresponding functions through a convolution algorithm.

5. The image processing method of claim 4, wherein the first feature extraction layer comprises at least 2 layers, and the number of channels of a subsequent layer is greater than the number of channels of a previous layer;

the first characteristic conversion layer at least comprises 2 layers, and the number of channels of each layer is kept unchanged;

the convolution kernel of the first feature extraction layer and the convolution kernel of the first fusion layer are both larger than the convolution kernel of the first feature conversion layer, and the image scaling ratios corresponding to the first feature extraction layer, the first fusion layer and the first feature conversion layer are all larger than or equal to 1.

6. The image processing method of claim 3, wherein the loss function of the high frequency image generation network to be trained further comprises L representing a loss of content_content：

Wherein G is_HGenerating a network, x, for the high-frequency image to be trained_H ⁽ⁱ⁾Denotes the ith fourth high frequency image, m denotes the number of the fourth high frequency images in a batch, GT_H ⁽ⁱ⁾Representing the ith third high-frequency image, "| | | |" is a norm of L1.

7. The image processing method of claim 3, wherein the discriminant network to be trained comprises: the image fusion system comprises a second feature extraction layer used for extracting features, a second feature conversion layer used for combining the extracted features, and a second fusion layer used for fusing the features of at least 2 channels into one image to be output, wherein image scaling ratios corresponding to the second feature extraction layer, the second feature conversion layer and the second fusion layer are all smaller than 1.

8. An image processing apparatus characterized by comprising:

the image acquisition unit to be reconstructed is used for acquiring an image to be reconstructed;

the high-low frequency image extraction unit is used for extracting a high-frequency component and a low-frequency component of the image to be reconstructed to obtain a first high-frequency image consisting of the high-frequency component and a first low-frequency image consisting of the low-frequency component, wherein the high-frequency component is a frequency component which is greater than or equal to a preset frequency threshold, and the low-frequency component is a frequency component which is less than the preset frequency threshold;

a second high-frequency image generating unit, which is used for inputting the first high-frequency image into the trained high-frequency image generating network to obtain a second high-frequency image output by the high-frequency image generating network;

the second low-frequency image generation unit is used for inputting the first low-frequency image into the trained low-frequency image generation network to obtain a second low-frequency image output by the low-frequency image generation network;

and the image reconstruction unit is used for generating a reconstructed image according to the second high-frequency image and the second low-frequency image, and the resolution of the reconstructed image is higher than that of the image to be reconstructed.

9. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the method according to any of claims 1 to 7 when executing the computer program.

10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 7.