WO2023046136A1 - Feature fusion method, image defogging method and device - Google Patents

Feature fusion method, image defogging method and device

Info

Publication number
WO2023046136A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature
fused
fusion
features
mth
Prior art date
Application number
PCT/CN2022/121209
Other languages
French (fr)
Chinese (zh)
Inventor
董航
Original Assignee
北京字跳网络技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京字跳网络技术有限公司
Publication of WO2023046136A1 publication Critical patent/WO2023046136A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features

Definitions

  • the present application relates to the technical field of image processing, in particular to a feature fusion method, an image defogging method and a device.
  • Image dehazing is a classic image processing problem.
  • the main purpose of image dehazing is to repair a foggy image so as to obtain a clear, fog-free image. Since various high-level computer vision tasks (image detection, image recognition, etc.) first require the image to be dehazed to obtain a clear image, the problem of image dehazing has received extensive attention in the computer vision community.
  • although the multi-scale network improves the overall performance of image dehazing by extracting and utilizing features from different scales, the multi-scale network architecture has problems such as losing the spatial information of image features during the downsampling process and lacking sufficient connections between features of different scales in non-adjacent network layers.
  • fusing multi-scale features and improving the reuse of network features has been proven in many deep learning architectures to be an effective means of improving network performance.
  • a widely used multi-scale feature fusion method at present is the multi-scale feature fusion method based on reprojection technology.
  • however, although the multi-scale feature fusion method based on reprojection technology can achieve multi-scale feature fusion, the reprojection technology restricts the content shared between features of different scales and limits the diversity of the generated features during multi-scale feature fusion, which in turn affects the learning ability of image dehazing.
  • the present application provides a feature fusion method, image defogging method and device, which are used to solve the problem that the multi-scale feature fusion method in the related art will limit the diversity of features in the network architecture.
  • the embodiment of the present application provides a feature fusion method, including: acquiring a target feature and at least one feature to be fused, the target feature and the at least one feature to be fused being features of the same image at different spatial scales; dividing the target feature into a first feature and a second feature; processing the first feature based on a residual dense block RDB to obtain a third feature; fusing the second feature and the at least one feature to be fused to obtain a fourth feature; and merging the third feature and the fourth feature to generate a fusion result of the target feature and the at least one feature to be fused.
  • fusing the second feature and the at least one feature to be fused to obtain the fourth feature includes: sorting the at least one feature to be fused in descending order according to the difference between the spatial scale of each feature to be fused and the spatial scale of the second feature, to obtain a sorting result; fusing the first feature to be fused in the sorting result with the second feature to generate a fusion result of the first feature to be fused; fusing the other features to be fused in the sorting result one by one with the fusion result of the previous feature to be fused, to generate a fusion result of the last feature to be fused in the sorting result; and using the fusion result of the last feature to be fused in the sorting result as the fourth feature.
  • fusing the first feature to be fused in the sorting result with the second feature to generate the fusion result of the first feature to be fused includes: sampling the second feature as a feature with the same spatial scale as the first feature to be fused, generating a first sampled feature corresponding to the first feature to be fused; calculating the difference between the first sampled feature corresponding to the first feature to be fused and the first feature to be fused, obtaining a feature difference corresponding to the first feature to be fused; sampling the feature difference corresponding to the first feature to be fused as a feature with the same spatial scale as the second feature, obtaining a second sampled feature corresponding to the first feature to be fused; and adding and fusing the second feature and the second sampled feature corresponding to the first feature to be fused, generating the fusion result of the first feature to be fused.
  • fusing the other features to be fused in the sorting result one by one with the fusion result of the previous feature to be fused includes: sampling the fusion result of the (m-1)-th feature to be fused in the sorting result as a feature with the same spatial scale as the m-th feature to be fused in the sorting result, generating a first sampled feature corresponding to the m-th feature to be fused, m being a positive integer greater than 1; calculating the difference between the m-th feature to be fused and the first sampled feature corresponding to the m-th feature to be fused, obtaining a feature difference corresponding to the m-th feature to be fused; sampling the feature difference corresponding to the m-th feature to be fused as a feature with the same spatial scale as the fusion result of the (m-1)-th feature to be fused, obtaining a second sampled feature corresponding to the m-th feature to be fused; and adding and fusing the fusion result of the (m-1)-th feature to be fused and the second sampled feature corresponding to the m-th feature to be fused, generating the fusion result of the m-th feature to be fused.
  • the dividing the target feature into a first feature and a second feature includes: dividing the target feature into a first feature and a second feature based on a feature channel of the target feature.
  • an embodiment of the present application provides an image defogging method, including: processing the target image through an encoding module to obtain encoding features, wherein the encoding module includes L cascaded encoders with different spatial scales, the m-th encoder is used to fuse, through the feature fusion method described in any one of the first aspect, the image features of the encoding module on the m-th encoder with the fusion results output by all encoders before the m-th encoder, generate the fusion result of the m-th encoder, and output the fusion result of the m-th encoder to all encoders after the m-th encoder, L and m both being positive integers with m ≤ L;
  • processing the encoding features through a feature restoration module composed of at least one residual dense block RDB to obtain restored features;
  • processing the restored features through a decoding module to obtain a defogging effect image of the target image;
  • wherein the decoding module includes L cascaded decoders with different spatial scales, and the m-th decoder is used to fuse, through the feature fusion method described in any one of the first aspect, the image features of the encoding module on the m-th encoder with the fusion results output by all decoders before the m-th decoder, generate the fusion result of the m-th decoder, and output the fusion result of the m-th decoder to all decoders after the m-th decoder.
  • an embodiment of the present application provides a feature fusion device, including: an acquisition unit, configured to acquire a target feature and at least one feature to be fused, the target feature and the at least one feature to be fused being features of the same image at different spatial scales; a division unit, configured to divide the target feature into a first feature and a second feature; a first processing unit, configured to process the first feature based on a residual dense connection network to obtain a third feature; a second processing unit, configured to fuse the second feature and the at least one feature to be fused to obtain a fourth feature; and a merging unit, configured to merge the third feature and the fourth feature to generate a fusion result of the target feature and the at least one feature to be fused.
  • the second processing unit is specifically configured to sort the at least one feature to be fused in descending order according to the difference between the spatial scale of each feature to be fused and the spatial scale of the second feature, to obtain a sorting result; fuse the first feature to be fused in the sorting result with the second feature to generate a fusion result of the first feature to be fused; fuse the other features to be fused in the sorting result one by one with the fusion result of the previous feature to be fused to generate a fusion result of the last feature to be fused in the sorting result; and use the fusion result of the last feature to be fused in the sorting result as the fourth feature.
  • the second processing unit is specifically configured to sample the second feature as a feature with the same spatial scale as the first feature to be fused, generating a first sampled feature corresponding to the first feature to be fused; calculate the difference between the first sampled feature corresponding to the first feature to be fused and the first feature to be fused, obtaining a feature difference corresponding to the first feature to be fused; sample the feature difference corresponding to the first feature to be fused as a feature with the same spatial scale as the second feature, obtaining a second sampled feature corresponding to the first feature to be fused; and add and fuse the second feature and the second sampled feature corresponding to the first feature to be fused, generating the fusion result of the first feature to be fused.
  • the second processing unit is specifically configured to sample the fusion result of the (m-1)-th feature to be fused in the sorting result as a feature with the same spatial scale as the m-th feature to be fused in the sorting result, generating a first sampled feature corresponding to the m-th feature to be fused, m being a positive integer greater than 1; calculate the difference between the m-th feature to be fused and the first sampled feature corresponding to the m-th feature to be fused, obtaining a feature difference corresponding to the m-th feature to be fused; sample the feature difference corresponding to the m-th feature to be fused as a feature with the same spatial scale as the fusion result of the (m-1)-th feature to be fused, obtaining a second sampled feature corresponding to the m-th feature to be fused; and add and fuse the fusion result of the (m-1)-th feature to be fused and the second sampled feature corresponding to the m-th feature to be fused, generating the fusion result of the m-th feature to be fused.
  • the dividing unit is specifically configured to divide the target feature into a first feature and a second feature based on a feature channel of the target feature.
  • an embodiment of the present application provides an image defogging device, including: a feature extraction unit, configured to process a target image through an encoding module to obtain encoding features, wherein the encoding module includes L cascaded encoders with different spatial scales, the m-th encoder is used to fuse, through the feature fusion method described in any one of the first aspect, the image features of the encoding module on the m-th encoder with the fusion results output by all encoders before the m-th encoder, generate the fusion result of the m-th encoder, and output the fusion result of the m-th encoder to all encoders after the m-th encoder, L and m both being positive integers with m ≤ L; a feature processing unit, configured to process the encoding features through a feature restoration module composed of at least one residual dense block RDB to obtain restored features; and an image generation unit, configured to process the restored features through a decoding module to obtain a defogging effect image of the target image.
  • an embodiment of the present application provides an electronic device, including: a memory and a processor, the memory being used to store a computer program, and the processor being configured to, when calling the computer program, enable the electronic device to implement the feature fusion method described in the first aspect or any embodiment of the first aspect.
  • an embodiment of the present application provides a computer-readable storage medium storing a computer program which, when executed by a computing device, enables the computing device to implement the feature fusion method described in the first aspect or any embodiment of the first aspect.
  • an embodiment of the present application provides a computer program product which, when run on a computer, enables the computer to implement the feature fusion method described in the first aspect or any embodiment of the first aspect.
  • in the feature fusion method, after the target feature and at least one feature to be fused are obtained, the target feature is first divided into the first feature and the second feature; the first feature is then processed based on the RDB to obtain the third feature, and the second feature and the at least one feature to be fused are fused to obtain the fourth feature; finally, the third feature and the fourth feature are merged to generate the fusion result of the target feature and the at least one feature to be fused.
  • that is, the feature fusion method provided by the embodiment of the present application divides the feature that needs enhanced fusion into two parts, the first feature and the second feature, processes one part (the first feature) based on the RDB, and fuses the other part (the second feature) with the features to be fused. Since the RDB can update features and generate redundant features, and fusing the second feature with the features to be fused introduces effective information from features of other spatial scales, multi-scale feature fusion is realized.
  • therefore, the feature fusion method provided in the embodiment can ensure the generation of new features while realizing multi-scale feature fusion, ensuring the diversity of features in the network architecture, and can thus solve the problem in the related art that multi-scale feature fusion methods limit the diversity of features in the network architecture.
  • Fig. 1 is a flow chart of the steps of the feature fusion method provided by the embodiment of the present application
  • Fig. 2 is one of the data flow diagrams of the feature fusion method provided by the embodiment of the present application.
  • Fig. 3 is the second schematic diagram of the data flow of the feature fusion method provided by the embodiment of the present application.
  • FIG. 4 is a flow chart of the steps of the image defogging method provided by the embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of a network model for implementing an image defogging method provided by an embodiment of the present application
  • FIG. 6 is a schematic structural diagram of a feature fusion device provided by an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of an image defogging device provided in an embodiment of the present application.
  • FIG. 8 is a schematic diagram of a hardware structure of an electronic device provided by an embodiment of the present application.
  • words such as “exemplary” or “for example” are used to mean serving as an example, instance or illustration. Any embodiment or design scheme described as “exemplary” or “for example” in the embodiments of the present application shall not be interpreted as being more preferred or more advantageous than other embodiments or design schemes. Rather, the use of words such as “exemplary” or “for example” is intended to present related concepts in a concrete manner.
  • the meaning of "plurality” refers to two or more.
  • the embodiment of the present application provides a feature fusion method, which can be used in the image processing process of any image processing scene.
  • for example, the feature fusion method provided by the embodiment of the present application can fuse the features extracted from an image in an image dehazing scene; as another example, it can also fuse the features extracted from an image during image restoration.
  • the feature fusion method provided in the embodiment of the present application can also fuse the features of the extracted image during the image super-resolution process.
  • the embodiment of the present application does not limit the usage scenario of the feature fusion method, as long as the usage scenario includes multiple image features of different spatial scales that need to be fused. Referring to Figure 1, the feature fusion method includes the following steps:
  • the target feature and the at least one feature to be fused are features of different spatial scales of the same image.
  • the target feature in the embodiment of the present application refers to a feature that needs to be fused and enhanced
  • the feature to be fused refers to a feature used to perform fusion and enhancement on the target feature.
  • feature extraction may be performed on the image to be processed based on feature extraction functions or feature extraction networks of different spatial scales, so as to obtain the target feature and the at least one feature to be fused.
  • the implementation of dividing the target feature into the first feature and the second feature may include:
  • the target feature is divided into a first feature and a second feature based on a feature channel of the target feature.
  • the channel of a feature in the embodiment of the present application refers to a feature map contained in the feature: a channel of the feature is the feature map obtained by feature extraction along a certain dimension, so a channel of the feature is a feature map in a specific sense. Dividing a feature based on feature channels means dividing the feature maps of some dimensions of the feature into one feature set and using the feature maps of the remaining dimensions as another feature set.
  • the ratio of the first feature to the second feature is not limited in the embodiment of the present application.
  • in general, the amount of effective information in the features of each spatial scale and the amount of new features that need to be generated determine the ratio of the first feature to the second feature.
  • the ratio of the first feature to the second feature may be 1:1.
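  • as a hedged illustration only (the PyTorch framework, the (N, C, H, W) tensor layout and the helper name `split_channels` are assumptions, not part of the disclosure), the channel-based division with a 1:1 ratio might be sketched as:

```python
import torch

def split_channels(target_feature: torch.Tensor, ratio: float = 0.5):
    """Divide a feature of shape (N, C, H, W) into two parts along the
    channel dimension; the first part goes to the RDB branch, the second
    part to the multi-scale fusion branch."""
    c = target_feature.shape[1]
    c1 = int(c * ratio)  # 1:1 by default, as in the example above
    first_feature, second_feature = torch.split(target_feature, [c1, c - c1], dim=1)
    return first_feature, second_feature
```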
  • the residual dense block includes three main parts, which are: Contiguous Memory (CM), Local Feature Fusion (LFF) and Local Residual Learning (LRL).
  • CM is mainly used to send the output of the previous RDB to each convolutional layer in the current RDB.
  • LFF is mainly used to fuse the output of the previous RDB with the outputs of all convolutional layers of the current RDB.
  • LRL is mainly used to add the output of the previous RDB to the output of the LFF of the current RDB, and use the addition result as the output of the current RDB.
  • RDB can perform feature update and redundant feature generation
  • processing the first feature based on the residual dense block can increase the diversity of features.
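  • for orientation, a minimal RDB along the lines of the CM/LFF/LRL description above might look as follows; the growth rate, layer count and kernel sizes are illustrative assumptions rather than values taken from the disclosure.

```python
import torch
import torch.nn as nn

class RDB(nn.Module):
    """Minimal residual dense block: densely connected conv layers (CM),
    a 1x1 conv fusing all intermediate outputs (LFF), and a residual
    skip connection (LRL)."""
    def __init__(self, channels: int, growth: int = 32, num_layers: int = 4):
        super().__init__()
        self.layers = nn.ModuleList()
        in_ch = channels
        for _ in range(num_layers):
            self.layers.append(nn.Sequential(
                nn.Conv2d(in_ch, growth, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
            ))
            in_ch += growth  # dense connectivity: each layer sees all earlier outputs
        self.lff = nn.Conv2d(in_ch, channels, kernel_size=1)  # local feature fusion

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))
        return x + self.lff(torch.cat(feats, dim=1))  # local residual learning
```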
  • step S14 (fusing the second feature and the at least one feature to be fused to obtain a fourth feature) includes the following steps a to d:
  • Step a: Sort the at least one feature to be fused in descending order according to the difference between the spatial scale of each feature to be fused and the spatial scale of the second feature, and obtain a sorting result.
  • the larger the difference between the spatial scale of a feature to be fused and the spatial scale of the second feature, the higher the position of that feature to be fused in the sorting result; the smaller the difference, the lower its position in the sorting result.
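  • a sketch of step a, under the assumptions that each feature is an (N, C, H, W) tensor and that "spatial scale" is measured by the H×W resolution (the disclosure does not fix a concrete measure):

```python
def sort_by_scale_gap(features_to_fuse, second_feature):
    """Step a: order the features to be fused by descending difference
    between their spatial scale and that of the second feature."""
    ref = second_feature.shape[-2] * second_feature.shape[-1]
    return sorted(
        features_to_fuse,
        key=lambda f: abs(f.shape[-2] * f.shape[-1] - ref),
        reverse=True,
    )
```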
  • Step b: Fuse the first feature to be fused in the sorting result with the second feature, generating a fusion result of the first feature to be fused.
  • the following illustrates step b using an example in which the first feature to be fused in the sorting result is $J_0$ and the second feature is $j_{n2}$.
  • the implementation of step b may include the following steps 1 to 4:
  • Step 1: Sample the second feature $j_{n2}$ as a feature with the same spatial scale as the first feature to be fused $J_0$, generating the first sampled feature corresponding to $J_0$.
  • the sampling in this step can be up-sampling or down-sampling, which is specifically determined by the spatial scale of the first feature to be fused $J_0$ and the spatial scale of the second feature $j_{n2}$.
  • Step 2: Calculate the difference between the first sampled feature corresponding to the first feature to be fused $J_0$ and the first feature to be fused $J_0$, obtaining the feature difference corresponding to $J_0$.
  • denoting the sampling of step 1 as $S_a(\cdot)$ and the feature difference as $\Delta_0$, the process of step 2 can be described as: $\Delta_0 = S_a(j_{n2}) - J_0$.
  • Step 3: Sample the feature difference corresponding to the first feature to be fused $J_0$ as a feature with the same spatial scale as the second feature $j_{n2}$, obtaining the second sampled feature corresponding to $J_0$.
  • the sampling in this step can be up-sampling or down-sampling, which is specifically determined by the spatial scale of the feature difference corresponding to $J_0$ and the spatial scale of the second feature $j_{n2}$.
  • Step 4: Add and fuse the second feature $j_{n2}$ and the second sampled feature corresponding to the first feature to be fused $J_0$, generating the fusion result $J_0^n$ of the first feature to be fused.
  • denoting the sampling of step 3 as $S_b(\cdot)$, the process of step 4 can be described as: $J_0^n = j_{n2} + S_b(\Delta_0)$.
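  • steps 1 to 4 amount to a single reprojection-style update, sketched below under the assumptions that all features share the same channel count, that `F.interpolate` stands in for the up-/down-sampling, and that the subtraction order follows the reading of step 2 above (the original formulas were not preserved):

```python
import torch
import torch.nn.functional as F

def fuse_pair(base: torch.Tensor, to_fuse: torch.Tensor) -> torch.Tensor:
    """Fuse `to_fuse` into `base` (steps 1-4, and likewise steps I-IV).
    For the first feature to be fused, `base` is the second feature
    j_n2; afterwards it is the previous fusion result."""
    # Step 1: sample `base` to the spatial scale of `to_fuse`
    sampled = F.interpolate(base, size=to_fuse.shape[-2:],
                            mode="bilinear", align_corners=False)
    # Step 2: feature difference at the scale of `to_fuse`
    diff = sampled - to_fuse
    # Step 3: sample the difference back to the spatial scale of `base`
    diff_back = F.interpolate(diff, size=base.shape[-2:],
                              mode="bilinear", align_corners=False)
    # Step 4: additive fusion
    return base + diff_back
```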
  • Step c: Fuse the other features to be fused in the sorting result one by one with the fusion result of the previous feature to be fused, generating a fusion result of the last feature to be fused in the sorting result.
  • taking as an example the fusion of the m-th feature to be fused (m being a positive integer greater than 1) in the sorting result with the fusion result of the previous feature to be fused (the (m-1)-th feature to be fused), the implementation includes the following steps I to IV:
  • Step I: Sample the fusion result of the (m-1)-th feature to be fused in the sorting result as a feature with the same spatial scale as the m-th feature to be fused in the sorting result, generating the first sampled feature corresponding to the m-th feature to be fused.
  • Step II: Calculate the difference between the m-th feature to be fused and the first sampled feature corresponding to the m-th feature to be fused, obtaining the feature difference corresponding to the m-th feature to be fused.
  • Step III: Sample the feature difference corresponding to the m-th feature to be fused as a feature with the same spatial scale as the fusion result of the (m-1)-th feature to be fused, obtaining the second sampled feature corresponding to the m-th feature to be fused.
  • Step IV: Add and fuse the fusion result of the (m-1)-th feature to be fused and the second sampled feature corresponding to the m-th feature to be fused, generating the fusion result of the m-th feature to be fused.
  • the only difference between obtaining the fusion result of the m-th feature to be fused (steps I to IV) and obtaining the fusion result of the first feature to be fused (steps 1 to 4) lies in the inputs: when obtaining the fusion result of the first feature to be fused, the inputs are the second feature and the first feature to be fused, whereas when obtaining the fusion result of the m-th feature to be fused, the inputs are the fusion result of the (m-1)-th feature to be fused and the m-th feature to be fused. The calculation itself is the same.
  • the following explains step c using an example in which the sorting result includes, in order: feature to be fused $J_0$, feature to be fused $J_1$, feature to be fused $J_2$, ..., feature to be fused $J_t$.
  • besides obtaining the fusion result $J_0^n$ of the first feature to be fused as described above, the process of obtaining the fusion result $J_t^n$ of the last feature to be fused $J_t$ includes:
  • sampling the fusion result $J_0^n$ of the first feature to be fused $J_0$ as a feature with the same spatial scale as the second feature to be fused $J_1$, generating the first sampled feature corresponding to $J_1$, and obtaining the corresponding feature difference; sampling the feature difference corresponding to $J_1$ as a feature with the same spatial scale as $J_0^n$, obtaining the second sampled feature corresponding to $J_1$; and adding and fusing $J_0^n$ with the second sampled feature corresponding to $J_1$, generating the fusion result $J_1^n$ of the second feature to be fused $J_1$;
  • sampling the fusion result $J_1^n$ of the second feature to be fused $J_1$ as a feature with the same spatial scale as the third feature to be fused $J_2$, generating the first sampled feature corresponding to $J_2$; sampling the feature difference corresponding to $J_2$ as a feature with the same spatial scale as $J_1^n$, obtaining the second sampled feature corresponding to $J_2$; and adding and fusing $J_1^n$ with the second sampled feature corresponding to $J_2$, generating the fusion result $J_2^n$;
  • the fusion results of the fourth feature to be fused $J_3$, the fifth feature to be fused $J_4$, ..., the t-th feature to be fused $J_{t-1}$ and the (t+1)-th feature to be fused $J_t$ in the sorting result are obtained one by one in the same way, finally yielding the fusion result $J_t^n$ of the last feature to be fused $J_t$.
  • Step d: Take the fusion result of the last feature to be fused in the sorting result as the fourth feature.
  • for example, if the sorting result includes, in order: feature to be fused $J_0$, feature to be fused $J_1$, feature to be fused $J_2$, ..., feature to be fused $J_t$, then the fusion result $J_t^n$ of the last feature to be fused $J_t$ is used as the fourth feature.
  • merging the third feature and the fourth feature may include: concatenating the third feature and the fourth feature along the channel dimension.
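  • putting the division, RDB processing, progressive fusion (steps a to d) and merge together, a hedged end-to-end sketch reusing the hypothetical helpers above (`split_channels`, `RDB`, `sort_by_scale_gap`, `fuse_pair`; the RDB instance is assumed to match the channel count of the first feature) could read:

```python
import torch

def feature_fusion(target_feature, features_to_fuse, rdb):
    """Fuse `target_feature` with features of other spatial scales,
    following the split / RDB / progressive-fusion / concat structure
    described above."""
    first, second = split_channels(target_feature)         # divide the target feature
    third = rdb(first)                                     # RDB branch -> third feature
    fourth = second
    for f in sort_by_scale_gap(features_to_fuse, second):  # step S14: steps a-d
        fourth = fuse_pair(fourth, f)                      # -> fourth feature
    return torch.cat([third, fourth], dim=1)               # channel-wise concat
```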
  • since the above embodiment divides the target feature into the first feature and the second feature, and only the second feature participates in the multi-spatial-scale feature fusion, the above embodiment can also reduce the size of the features that need to be fused (the number of channels of the second feature is less than that of the target feature), thereby reducing the computation of feature fusion and improving the efficiency of feature fusion.
  • the embodiments of the present application further provide an image defogging method.
  • the image defogging method provided by the embodiment of the present application includes the following steps S41 to S43:
  • S41: Process the target image through the encoding module to obtain encoding features.
  • the encoding module includes L cascaded encoders with different spatial scales; the m-th encoder is used to fuse, through the feature fusion method described in any of the above embodiments, the image features of the encoding module on the m-th encoder with the fusion results output by all encoders before the m-th encoder, generate the fusion result of the m-th encoder, and output the fusion result of the m-th encoder to all encoders after the m-th encoder. L and m are both positive integers, and m ≤ L.
  • S42: Process the encoding features through a feature restoration module composed of at least one residual dense block RDB to obtain restored features.
  • S43: Process the restored features through the decoding module to obtain the defogging effect image of the target image.
  • the decoding module includes L cascaded decoders with different spatial scales; the m-th decoder is used to fuse, through the feature fusion method described in any of the above embodiments, the image features of the encoding module on the m-th encoder with the fusion results output by all decoders before the m-th decoder, generate the fusion result of the m-th decoder, and output the fusion result of the m-th decoder to all decoders after the m-th decoder.
  • the encoding module, feature restoration module, and decoding module used to execute the embodiment shown in FIG. 4 above form a U-Net.
  • the U-Net is a special convolutional neural network. It mainly includes an encoding module (also known as the contraction path), a feature restoration module, and a decoding module (also known as the expansion path).
  • the encoding module is mainly used to capture context information in the original image, and the corresponding decoding module is used to accurately localize the parts that need to be segmented in the original image and then generate the processed image.
  • the improvement of the U-Net is that, in order to accurately locate the parts that need to be segmented in the original image, the features extracted by the encoding module are combined with new feature maps during the upsampling process, so that the important information in the features is preserved to the greatest extent, thereby reducing the number of training samples and the demand for computing resources.
  • the network model used to implement the embodiment shown in FIG. 4 includes: an encoding module 51 forming a U-shaped network, a feature restoration module 52 and a decoding module 53 .
  • the encoding module 51 includes L cascaded encoders with different spatial scales, which are used to process the target image I and obtain the encoding features $i_L$.
  • the m-th encoder is used to fuse, through the feature fusion method provided by the above embodiment, the image features of the encoding module on the m-th encoder with the fusion results output by all encoders before the m-th encoder, generate the fusion result of the m-th encoder, and output the fusion result of the m-th encoder to all encoders after the m-th encoder.
  • the feature restoration module 52 includes at least one RDB for receiving the encoding features $i_L$ output by the encoding module 51 and processing them through the at least one RDB to obtain the restored features $j_L$.
  • the decoding module 53 includes L cascaded decoders with different spatial scales. The m-th decoder is used to fuse, through the feature fusion method provided by the above embodiment, the image features of the decoding module on the m-th decoder with the fusion results output by all decoders before the m-th decoder, generate the fusion result of the m-th decoder, and output the fusion result of the m-th decoder to all decoders after the m-th decoder; the defogging effect image J of the target image I is obtained according to the fusion result $j_1$ output by the last decoder.
  • the operation by which the m-th encoder in the encoding module 51 fuses the image features of the encoding module on the m-th encoder with the fusion results output by all encoders before the m-th encoder (the 1st encoder to the (m-1)-th encoder) can be described as:
  • $i_m = [i_{m1}, i_{m2}]$, $\quad i_m^n = \big[f(i_{m1}),\ \mathcal{F}(i_{m2},\ i_1^n, \dots, i_{m-1}^n)\big]$
  • where $i_{m1}$ represents the first feature obtained by dividing the feature $i_m$ of the encoding module on the m-th encoder, $i_{m2}$ represents the second feature obtained by dividing the feature $i_m$, $f(\cdot)$ represents the operation of processing a feature based on the RDB, $\mathcal{F}(\cdot)$ denotes the progressive fusion of steps a to d, $[\cdot\,,\cdot]$ denotes channel-wise concatenation, and $i_m^n$ is the fusion result output by the m-th encoder.
  • the operation by which the m-th decoder in the decoding module 53 fuses the image features of the decoding module on the m-th decoder with the fusion results output by all decoders before the m-th decoder (the L-th decoder to the (m+1)-th decoder) can be described as:
  • $j_m = [j_{m1}, j_{m2}]$, $\quad j_m^n = \big[f(j_{m1}),\ \mathcal{F}(j_{m2},\ j_L^n, \dots, j_{m+1}^n)\big]$
  • where $j_{m1}$ represents the first feature obtained by dividing the feature $j_m$ of the decoding module on the m-th decoder, $j_{m2}$ represents the second feature obtained by dividing the feature $j_m$, $f(\cdot)$ represents the operation of processing a feature based on the RDB, L is the total number of encoders in the encoding module, and $j_m^n$ is the fusion result output by the m-th decoder of the decoding module.
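  • as a rough sketch of how the m-th encoder stage could apply this fusion inside the U-shaped network (the downsampling scheme, the constant and even channel width, the 1x1 projections, and the `RDB` and `feature_fusion` helpers above are all assumptions; the disclosure does not specify channel handling):

```python
import torch.nn as nn

class EncoderStage(nn.Module):
    """One encoder stage: downsample the incoming feature to get i_m, then
    fuse it with the fusion results output by all earlier stages."""
    def __init__(self, channels: int, num_previous: int):
        super().__init__()
        self.down = nn.Conv2d(channels, channels, kernel_size=3,
                              stride=2, padding=1)  # spatial downsampling
        self.rdb = RDB(channels // 2)               # acts on the first half i_m1
        # 1x1 projections so earlier fusion results match the width of i_m2
        self.proj = nn.ModuleList(
            [nn.Conv2d(channels, channels // 2, kernel_size=1)
             for _ in range(num_previous)]
        )

    def forward(self, x, previous_results):
        i_m = self.down(x)
        projected = [p(r) for p, r in zip(self.proj, previous_results)]
        return feature_fusion(i_m, projected, self.rdb)  # fusion result i_m^n
```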
  • since the image defogging method provided in the embodiment of the present application performs feature fusion through the feature fusion method provided in the above embodiment, it can ensure the generation of new features while realizing multi-scale feature fusion and thus ensure the diversity of features in the network architecture; therefore, the image defogging method provided in the embodiment of the present application can improve the performance of image defogging.
  • the embodiment of the present application also provides a feature fusion device.
  • the device embodiment corresponds to the aforementioned method embodiment.
  • for ease of reading, this device embodiment does not describe the details of the foregoing method embodiment one by one again, but it should be clear that the feature fusion device in this embodiment can correspondingly implement all the content of the foregoing method embodiments.
  • FIG. 6 is a schematic structural diagram of the feature fusion device. As shown in FIG. 6, the feature fusion device 600 includes:
  • the obtaining unit 61 is configured to obtain a target feature and at least one feature to be fused, and the target feature and the at least one feature to be fused are respectively features of different spatial scales of the same image.
  • a division unit 62 configured to divide the target feature into a first feature and a second feature.
  • the first processing unit 63 is configured to process the first feature based on the residual densely connected network to obtain a third feature.
  • the second processing unit 64 is configured to fuse the second feature and the at least one feature to be fused to obtain a fourth feature.
  • the combining unit 65 is configured to combine the third feature and the fourth feature to generate a fusion result of the target feature and at least one feature to be fused.
  • the second processing unit 64 is specifically configured to sort the at least one feature to be fused in descending order according to the difference between the spatial scale of each feature to be fused and the spatial scale of the second feature, to obtain a sorting result; fuse the first feature to be fused in the sorting result with the second feature to generate a fusion result of the first feature to be fused; fuse the other features to be fused in the sorting result one by one with the fusion result of the previous feature to be fused to generate a fusion result of the last feature to be fused in the sorting result; and use the fusion result of the last feature to be fused in the sorting result as the fourth feature.
  • the second processing unit 64 is specifically configured to sample the second feature as a feature with the same spatial scale as the first feature to be fused, generating a first sampled feature corresponding to the first feature to be fused; calculate the difference between the first sampled feature corresponding to the first feature to be fused and the first feature to be fused, obtaining a feature difference corresponding to the first feature to be fused; sample the feature difference corresponding to the first feature to be fused as a feature with the same spatial scale as the second feature, obtaining a second sampled feature corresponding to the first feature to be fused; and add and fuse the second feature and the second sampled feature corresponding to the first feature to be fused, generating the fusion result of the first feature to be fused.
  • the second processing unit 64 is specifically configured to sample the fusion result of the (m-1)-th feature to be fused in the sorting result as a feature with the same spatial scale as the m-th feature to be fused in the sorting result, generating a first sampled feature corresponding to the m-th feature to be fused, m being a positive integer greater than 1; calculate the difference between the m-th feature to be fused and the first sampled feature corresponding to the m-th feature to be fused, obtaining a feature difference corresponding to the m-th feature to be fused; sample the feature difference corresponding to the m-th feature to be fused as a feature with the same spatial scale as the fusion result of the (m-1)-th feature to be fused, obtaining a second sampled feature corresponding to the m-th feature to be fused; and add and fuse the fusion result of the (m-1)-th feature to be fused and the second sampled feature corresponding to the m-th feature to be fused, generating the fusion result of the m-th feature to be fused.
  • the dividing unit 62 is specifically configured to divide the target feature into a first feature and a second feature based on a feature channel of the target feature.
  • the feature fusion device provided in this embodiment can execute the feature fusion method provided in the above method embodiment, and its implementation principle and technical effect are similar, and will not be repeated here.
  • FIG. 7 is a schematic structural diagram of the image defogging device. As shown in FIG. 7 , the image defogging device 700 includes:
  • the feature extraction unit 71 is used to process the target image through the encoding module to obtain encoding features; wherein the encoding module includes L cascaded encoders with different spatial scales, the m-th encoder is used to fuse, through the feature fusion method described in any of the above embodiments, the image features of the encoding module on the m-th encoder with the fusion results output by all encoders before the m-th encoder, generate the fusion result of the m-th encoder, and output the fusion result of the m-th encoder to all encoders after the m-th encoder; L and m are both positive integers, and m ≤ L.
  • the feature processing unit 72 is configured to process the encoded features through a feature restoration module composed of at least one residual block RDB to obtain restored features.
  • the image generation unit 73 is configured to process the restored features through the decoding module to obtain the defogging effect image of the target image; wherein the decoding module includes L cascaded decoders with different spatial scales, and the m-th decoder is used to fuse, through the feature fusion method described in any of the above embodiments, the image features of the encoding module on the m-th encoder with the fusion results output by all decoders before the m-th decoder, generate the fusion result of the m-th decoder, and output the fusion result of the m-th decoder to all decoders after the m-th decoder.
  • the image defogging device provided in this embodiment can implement the image defogging method provided in the above method embodiment, and its implementation principle and technical effect are similar, and will not be repeated here.
  • FIG. 8 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • the electronic device provided by this embodiment includes: a memory 81 and a processor 82, the memory 81 being used to store a computer program; the processor 82 is configured to, when calling the computer program, execute the feature fusion method or the image defogging method provided in the above embodiments.
  • an embodiment of the present application also provides a computer-readable storage medium, on which a computer program is stored; when the computer program is executed by a processor, the computing device implements the feature fusion method or the image defogging method provided in the above embodiments.
  • an embodiment of the present application also provides a computer program product, which, when run on a computer, enables the computing device to implement the feature fusion method or the image defogging method provided in the foregoing embodiments.
  • the embodiments of the present application may be provided as methods, systems, or computer program products. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media having computer-usable program code embodied therein.
  • the processor can be a central processing unit (Central Processing Unit, CPU), or other general-purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), field-programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
  • Memory may include non-permanent storage in computer readable media, in the form of random access memory (RAM) and/or nonvolatile memory such as read only memory (ROM) or flash RAM.
  • Computer-readable media includes both volatile and non-volatile, removable and non-removable storage media.
  • the storage medium may store information by any method or technology, and the information may be computer-readable instructions, data structures, program modules, or other data.
  • Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic tape cartridges, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device.
  • computer readable media excludes transitory computer readable media, such as modulated data signals and carrier waves.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Image Processing (AREA)

Abstract

The present application provides a feature fusion method, an image defogging method, and a device. The method comprises: acquiring a target feature and at least one feature to be fused, where the target feature and the at least one feature to be fused are features on different spatial scales of the same image; dividing the target feature into a first feature and a second feature; processing the first feature on the basis of a residual dense block (RDB) to obtain a third feature; fusing the second feature and the at least one feature to be fused to obtain a fourth feature; and merging the third feature and the fourth feature to generate a fusion result of the target feature and the at least one feature to be fused.

Description

A feature fusion method, image defogging method and device
Cross-Reference to Related Applications
This application claims priority to Chinese Patent Application No. 202111138532.8, entitled "A Feature Fusion Method, Image Dehazing Method and Device" and filed on September 27, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the technical field of image processing, and in particular to a feature fusion method, an image defogging method and a device.
Background
Image dehazing is a classic image processing problem. The main purpose of image dehazing is to repair a foggy image so as to obtain a clear, fog-free image. Since various high-level computer vision tasks (image detection, image recognition, etc.) first require the image to be dehazed to obtain a clear image, the problem of image dehazing has received extensive attention in the computer vision community.
In the field of image dehazing, the input image generally contains a large amount of redundant information, and making full use of this redundant information can effectively improve the effect of image restoration. In order to make full use of this redundant information, it needs to be extracted from different positions of the image, so the receptive field of the deep learning network becomes an important design criterion. To expand the receptive field, multi-scale networks have been widely used in the field of image dehazing and have achieved good results. Although a multi-scale network improves the overall performance of image dehazing by extracting and utilizing features from different scales, the multi-scale network architecture has problems such as losing the spatial information of image features during downsampling and lacking sufficient connections between features of different scales in non-adjacent network layers. Fusing multi-scale features and improving the reuse of network features has been proven in many deep learning architectures to be an effective means of improving network performance. A widely used multi-scale feature fusion method at present is the multi-scale feature fusion method based on reprojection technology. However, although this method can achieve multi-scale feature fusion, the reprojection technology restricts the content shared between features of different scales and limits the diversity of the generated features during multi-scale feature fusion, which in turn affects the learning ability of image dehazing.
Technical Solution
In view of this, the present application provides a feature fusion method, an image defogging method and a device, which are used to solve the problem in the related art that multi-scale feature fusion methods limit the diversity of features in the network architecture.
In order to achieve the above purpose, the embodiments of the present application provide the following technical solutions:
In the first aspect, an embodiment of the present application provides a feature fusion method, including: acquiring a target feature and at least one feature to be fused, the target feature and the at least one feature to be fused being features of the same image at different spatial scales; dividing the target feature into a first feature and a second feature; processing the first feature based on a residual dense block RDB to obtain a third feature; fusing the second feature and the at least one feature to be fused to obtain a fourth feature; and merging the third feature and the fourth feature to generate a fusion result of the target feature and the at least one feature to be fused.
In some embodiments, fusing the second feature and the at least one feature to be fused to obtain the fourth feature includes: sorting the at least one feature to be fused in descending order according to the difference between the spatial scale of each feature to be fused and the spatial scale of the second feature, to obtain a sorting result; fusing the first feature to be fused in the sorting result with the second feature to generate a fusion result of the first feature to be fused; fusing the other features to be fused in the sorting result one by one with the fusion result of the previous feature to be fused, to generate a fusion result of the last feature to be fused in the sorting result; and using the fusion result of the last feature to be fused in the sorting result as the fourth feature.
In some embodiments, fusing the first feature to be fused in the sorting result with the second feature to generate the fusion result of the first feature to be fused includes: sampling the second feature as a feature with the same spatial scale as the first feature to be fused, generating a first sampled feature corresponding to the first feature to be fused; calculating the difference between the first sampled feature corresponding to the first feature to be fused and the first feature to be fused, obtaining a feature difference corresponding to the first feature to be fused; sampling the feature difference corresponding to the first feature to be fused as a feature with the same spatial scale as the second feature, obtaining a second sampled feature corresponding to the first feature to be fused; and adding and fusing the second feature and the second sampled feature corresponding to the first feature to be fused, generating the fusion result of the first feature to be fused.
In some embodiments, fusing the other features to be fused in the sorting result one by one with the fusion result of the previous feature to be fused includes: sampling the fusion result of the (m-1)-th feature to be fused in the sorting result as a feature with the same spatial scale as the m-th feature to be fused in the sorting result, generating a first sampled feature corresponding to the m-th feature to be fused, m being a positive integer greater than 1; calculating the difference between the m-th feature to be fused and the first sampled feature corresponding to the m-th feature to be fused, obtaining a feature difference corresponding to the m-th feature to be fused; sampling the feature difference corresponding to the m-th feature to be fused as a feature with the same spatial scale as the fusion result of the (m-1)-th feature to be fused, obtaining a second sampled feature corresponding to the m-th feature to be fused; and adding and fusing the fusion result of the (m-1)-th feature to be fused and the second sampled feature corresponding to the m-th feature to be fused, generating the fusion result of the m-th feature to be fused.
In some embodiments, dividing the target feature into a first feature and a second feature includes: dividing the target feature into a first feature and a second feature based on a feature channel of the target feature.
In the second aspect, an embodiment of the present application provides an image defogging method, including: processing the target image through an encoding module to obtain encoding features, wherein the encoding module includes L cascaded encoders with different spatial scales, the m-th encoder is used to fuse, through the feature fusion method described in any one of the first aspect, the image features of the encoding module on the m-th encoder with the fusion results output by all encoders before the m-th encoder, generate the fusion result of the m-th encoder, and output the fusion result of the m-th encoder to all encoders after the m-th encoder, L and m both being positive integers with m ≤ L; processing the encoding features through a feature restoration module composed of at least one residual dense block RDB to obtain restored features; and processing the restored features through a decoding module to obtain a defogging effect image of the target image, wherein the decoding module includes L cascaded decoders with different spatial scales, and the m-th decoder is used to fuse, through the feature fusion method described in any one of the first aspect, the image features of the encoding module on the m-th encoder with the fusion results output by all decoders before the m-th decoder, generate the fusion result of the m-th decoder, and output the fusion result of the m-th decoder to all decoders after the m-th decoder.
In a third aspect, an embodiment of the present application provides a feature fusion apparatus, including: an obtaining unit, configured to obtain a target feature and at least one feature to be fused, the target feature and the at least one feature to be fused being features of different spatial scales of a same image; a dividing unit, configured to divide the target feature into a first feature and a second feature; a first processing unit, configured to process the first feature based on a residual dense block (RDB) to obtain a third feature; a second processing unit, configured to fuse the second feature and the at least one feature to be fused to obtain a fourth feature; and a merging unit, configured to merge the third feature and the fourth feature to generate a fusion result of the target feature and the at least one feature to be fused.
In some embodiments, the second processing unit is specifically configured to: sort the at least one feature to be fused in descending order of the difference between the spatial scale of each feature to be fused and the spatial scale of the second feature, obtaining a ranking result; fuse the first feature to be fused in the ranking result with the second feature, generating a fusion result of the first feature to be fused; fuse, one by one, each of the other features to be fused in the ranking result with the fusion result of the preceding feature to be fused, generating a fusion result of the last feature to be fused in the ranking result; and take the fusion result of the last feature to be fused in the ranking result as the fourth feature.

In some embodiments, the second processing unit is specifically configured to: sample the second feature into a feature with the same spatial scale as the first feature to be fused, generating a first sampled feature corresponding to the first feature to be fused; compute the difference between the first sampled feature corresponding to the first feature to be fused and the first feature to be fused, obtaining a feature difference corresponding to the first feature to be fused; sample the feature difference corresponding to the first feature to be fused into a feature with the same spatial scale as the second feature, obtaining a second sampled feature corresponding to the first feature to be fused; and additively fuse the second feature and the second sampled feature corresponding to the first feature to be fused, generating the fusion result of the first feature to be fused.

In some embodiments, the second processing unit is specifically configured to: sample the fusion result of the (m-1)-th feature to be fused in the ranking result into a feature with the same spatial scale as the m-th feature to be fused in the ranking result, generating a first sampled feature corresponding to the m-th feature to be fused, m being a positive integer greater than 1; compute the difference between the m-th feature to be fused and the first sampled feature corresponding to the m-th feature to be fused, obtaining a feature difference corresponding to the m-th feature to be fused; sample the feature difference corresponding to the m-th feature to be fused into a feature with the same spatial scale as the fusion result of the (m-1)-th feature to be fused, obtaining a second sampled feature corresponding to the m-th feature to be fused; and additively fuse the fusion result of the (m-1)-th feature to be fused and the second sampled feature corresponding to the m-th feature to be fused, generating a fusion result of the m-th feature to be fused.

In some embodiments, the dividing unit is specifically configured to divide the target feature into the first feature and the second feature based on the feature channels of the target feature.
In a fourth aspect, an embodiment of the present application provides an image defogging apparatus, including: a feature extraction unit, configured to process a target image through an encoding module to obtain encoded features, where the encoding module includes L cascaded encoders, all of different spatial scales, and the m-th encoder is configured to fuse, via the feature fusion method of any one of the first aspect, the image feature of the encoding module at the m-th encoder with the fusion results output by all encoders before the m-th encoder, generate the fusion result of the m-th encoder, and output the fusion result of the m-th encoder to all encoders after the m-th encoder, L and m both being positive integers with m ≤ L; a feature processing unit, configured to process the encoded features through a feature restoration module composed of at least one residual dense block (RDB) to obtain restored features; and an image generation unit, configured to process the restored features through a decoding module to obtain a defogged image of the target image, where the decoding module includes L cascaded decoders, all of different spatial scales, and the m-th decoder is configured to fuse, via the feature fusion method of any one of the first aspect, the image feature of the encoding module at the m-th encoder with the fusion results output by all decoders before the m-th decoder, generate the fusion result of the m-th decoder, and output the fusion result of the m-th decoder to all decoders after the m-th decoder.
In a fifth aspect, an embodiment of the present application provides an electronic device, including a memory and a processor; the memory is configured to store a computer program, and the processor is configured to, when the computer program is invoked, cause the electronic device to implement the feature fusion method of the first aspect or any embodiment of the first aspect.

In a sixth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program; when the computer program is executed by a computing device, the computing device is caused to implement the feature fusion method of the first aspect or any embodiment of the first aspect.

In a seventh aspect, an embodiment of the present application provides a computer program product; when the computer program product runs on a computer, the computer is caused to implement the feature fusion method of the first aspect or any embodiment of the first aspect.
After obtaining the target feature and at least one feature to be fused, the feature fusion method provided by the embodiments of the present application first divides the target feature into a first feature and a second feature, then processes the first feature based on the RDB to obtain a third feature and fuses the second feature with the at least one feature to be fused to obtain a fourth feature, and finally merges the third feature and the fourth feature to generate the fusion result of the target feature and the at least one feature to be fused. That is, the feature fusion method provided by the embodiments of the present application divides the feature that needs fusion enhancement into two parts, the first feature and the second feature, processes one part (the first feature) based on the RDB, and fuses the other part (the second feature) with the features to be fused. Since RDB-based processing can update features and generate redundant features, and fusing the second feature with the features to be fused can introduce effective information from features of other spatial scales and thus achieve multi-scale feature fusion, the feature fusion method provided by the embodiments of the present application can guarantee the generation of new features while realizing multi-scale feature fusion and preserves the diversity of features in the network architecture. It can therefore solve the problem in the related art that multi-scale feature fusion approaches limit the diversity of features in the network architecture.
Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and, together with the description, serve to explain the principles of the present application.

To describe the technical solutions in the embodiments of the present application or the related art more clearly, the following briefly introduces the drawings required in the description of the embodiments or the related art. Evidently, a person of ordinary skill in the art may derive other drawings from these drawings without creative effort.
FIG. 1 is a flowchart of the steps of a feature fusion method provided by an embodiment of the present application;

FIG. 2 is a first schematic diagram of the data flow of the feature fusion method provided by an embodiment of the present application;

FIG. 3 is a second schematic diagram of the data flow of the feature fusion method provided by an embodiment of the present application;

FIG. 4 is a flowchart of the steps of an image defogging method provided by an embodiment of the present application;

FIG. 5 is a schematic structural diagram of a network model for implementing the image defogging method provided by an embodiment of the present application;

FIG. 6 is a schematic structural diagram of a feature fusion apparatus provided by an embodiment of the present application;

FIG. 7 is a schematic structural diagram of an image defogging apparatus provided by an embodiment of the present application;

FIG. 8 is a schematic diagram of the hardware structure of an electronic device provided by an embodiment of the present application.
Detailed Description
To understand the above objects, features, and advantages of the present application more clearly, the solutions of the present application are further described below. It should be noted that, where no conflict arises, the embodiments of the present application and the features in the embodiments may be combined with each other.

Many specific details are set forth in the following description to facilitate a full understanding of the present application, but the present application may also be implemented in other ways different from those described here; evidently, the embodiments in the specification are only a part, not all, of the embodiments of the present application.

In the embodiments of the present application, words such as "exemplary" or "for example" are used to indicate an example, illustration, or explanation. Any embodiment or design described as "exemplary" or "for example" in the embodiments of the present application should not be construed as more preferred or advantageous than other embodiments or designs. Rather, words such as "exemplary" or "for example" are intended to present related concepts in a concrete manner. In addition, in the description of the embodiments of the present application, unless otherwise specified, "plurality" means two or more.
An embodiment of the present application provides a feature fusion method that can be used in the image processing flow of any image processing scenario. For example, the feature fusion method provided by the embodiments of the present application may fuse features extracted from an image in an image defogging scenario; as another example, it may fuse features extracted from an image in an image restoration process; as yet another example, it may fuse features extracted from an image in an image super-resolution process. The embodiments of the present application do not limit the usage scenario of the feature fusion method, as long as the scenario involves multiple image features of different spatial scales that need to be fused. Referring to FIG. 1, the feature fusion method includes the following steps:
S11. Obtain a target feature and at least one feature to be fused.

The target feature and the at least one feature to be fused are features of different spatial scales of a same image.

Specifically, in the embodiments of the present application, the target feature refers to the feature that needs fusion enhancement, and a feature to be fused refers to a feature used to enhance the target feature through fusion. Concretely, feature extraction may be performed on the image to be processed based on feature extraction functions or feature extraction networks of different spatial scales, so as to obtain the target feature and the at least one feature to be fused.
S12. Divide the target feature into a first feature and a second feature.

In some embodiments, dividing the target feature into the first feature and the second feature may be implemented as follows:

dividing the target feature into the first feature and the second feature based on the feature channels of the target feature.

Specifically, in the embodiments of the present application, a channel of a feature refers to a feature map contained in the feature; one channel of a feature is the feature map obtained by extracting the feature along a certain dimension, so a channel of a feature is a feature map in a specific sense. Dividing a feature based on its feature channels means grouping the feature maps of some dimensions of the feature into one feature set and taking the feature maps of the remaining dimensions as another feature set.

The embodiments of the present application do not limit the ratio of the first feature to the second feature. The higher the proportion of the first feature, the more new features can be generated; the higher the proportion of the second feature, the more effective information from features of other spatial scales can be introduced. In practice, the ratio of the first feature to the second feature can therefore be determined according to the amount of effective information from other spatial scales that needs to be introduced and the amount of new features that needs to be generated. Illustratively, the ratio of the first feature to the second feature may be 1:1.
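As a concrete illustration, a channel-wise division with a configurable ratio might look like the following minimal sketch; the function name and the default 1:1 ratio are only illustrative assumptions, not fixed by the text.

```python
import torch

def split_channels(target_feature: torch.Tensor, ratio: float = 0.5):
    """Split a feature map of shape [N, C, H, W] into two parts along the
    channel axis; `ratio` is the fraction of channels given to the first
    feature (0.5 reproduces the 1:1 example in the text)."""
    c = target_feature.shape[1]
    c1 = int(c * ratio)
    first, second = torch.split(target_feature, [c1, c - c1], dim=1)
    return first, second
```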
S13. Process the first feature based on a residual dense block (RDB) to obtain a third feature.

Specifically, a residual dense block includes three main parts: contiguous memory (CM), local feature fusion (LFF), and local residual learning (LRL). CM is mainly used to send the output of the previous RDB to every convolutional layer in the current RDB; LFF is mainly used to fuse the output of the previous RDB with the outputs of all convolutional layers of the current RDB; LRL is mainly used to add the output of the previous RDB to the output of the LFF of the current RDB and take the sum as the output of the current RDB.

Since an RDB can update features and generate redundant features, processing the first feature based on a residual dense block can increase the diversity of features.
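For orientation, the sketch below shows a self-contained residual dense block in PyTorch: dense connections within the block play the role of CM, a 1x1 convolution performs LFF, and the skip connection from the block input performs LRL. The layer count and growth rate are illustrative assumptions, not values given in the text.

```python
import torch
import torch.nn as nn

class RDB(nn.Module):
    """Minimal residual dense block sketch (hyperparameters are assumed)."""

    def __init__(self, channels: int, growth: int = 32, num_layers: int = 4):
        super().__init__()
        self.layers = nn.ModuleList()
        in_ch = channels
        for _ in range(num_layers):
            self.layers.append(nn.Sequential(
                nn.Conv2d(in_ch, growth, 3, padding=1),
                nn.ReLU(inplace=True)))
            in_ch += growth  # each layer sees the block input and all earlier outputs (CM)
        self.lff = nn.Conv2d(in_ch, channels, 1)  # local feature fusion (LFF)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))
        return x + self.lff(torch.cat(feats, dim=1))  # local residual learning (LRL)
```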
S14. Fuse the second feature and the at least one feature to be fused to obtain a fourth feature.

In some embodiments, step S14 above (fusing the second feature and the at least one feature to be fused to obtain the fourth feature) includes the following steps a to d:

Step a. Sort the at least one feature to be fused in descending order of the difference between the spatial scale of each feature to be fused and the spatial scale of the second feature, obtaining a ranking result.

That is, the larger the difference between the spatial scale of a feature to be fused and the spatial scale of the second feature, the earlier that feature is placed in the ranking result; the smaller the difference, the later it is placed.

Step b. Fuse the first feature to be fused in the ranking result with the second feature, generating the fusion result of the first feature to be fused.
Referring to FIG. 2, step b is illustrated there with the first feature to be fused in the ranking result denoted J_0 and the second feature denoted j_n2. Step b may be implemented as the following steps 1 to 4:

Step 1. Sample the second feature j_n2 into a feature with the same spatial scale as the first feature to be fused, generating the first sampled feature Ĵ_0 corresponding to the first feature to be fused J_0.

The sampling in this step may be upsampling or downsampling, depending on the spatial scale of the first feature to be fused J_0 and the spatial scale of the second feature j_n2.

Step 2. Compute the difference between the first sampled feature Ĵ_0 corresponding to the first feature to be fused J_0 and the first feature to be fused J_0, obtaining the feature difference e_0 corresponding to the first feature to be fused J_0.

The process of step 2 can be described as:

e_0 = J_0 - Ĵ_0

Step 3. Sample the feature difference e_0 corresponding to the first feature to be fused J_0 into a feature with the same spatial scale as the second feature j_n2, obtaining the second sampled feature ê_0 corresponding to the first feature to be fused J_0.

The sampling in this step may be upsampling or downsampling, depending on the spatial scale of the feature difference e_0 corresponding to the first feature to be fused J_0 and the spatial scale of the second feature j_n2.

Step 4. Additively fuse the second feature j_n2 and the second sampled feature ê_0 corresponding to the first feature to be fused J_0, generating the fusion result J_0^n of the first feature to be fused.

The process of step 4 can be described as:

J_0^n = j_n2 + ê_0
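This pairwise round (sample across scales, subtract, sample the residual back, add) is easy to express directly. The sketch below is a minimal PyTorch rendering of steps 1 to 4 under two assumptions not fixed by the text: bilinear interpolation stands in for the unspecified up/downsampling, and both inputs are assumed to have the same channel count.

```python
import torch
import torch.nn.functional as F

def fuse_pair(base: torch.Tensor, to_fuse: torch.Tensor) -> torch.Tensor:
    """One pairwise fusion round: `base` plays the role of the second feature
    j_n2 (or of a previous fusion result), `to_fuse` the feature to be fused."""
    # Step 1: sample `base` to the spatial scale of the feature to be fused.
    sampled = F.interpolate(base, size=to_fuse.shape[-2:],
                            mode="bilinear", align_corners=False)
    # Step 2: feature difference at the scale of the feature to be fused.
    diff = to_fuse - sampled
    # Step 3: sample the difference back to the spatial scale of `base`.
    diff_back = F.interpolate(diff, size=base.shape[-2:],
                              mode="bilinear", align_corners=False)
    # Step 4: additive fusion.
    return base + diff_back
```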
Step c. Fuse, one by one, each of the other features to be fused in the ranking result with the fusion result of the preceding feature to be fused, generating the fusion result of the last feature to be fused in the ranking result.
In some embodiments, in step c, fusing the m-th feature to be fused in the ranking result (m being a positive integer greater than 1) with the fusion result of the preceding feature to be fused (the (m-1)-th feature to be fused) is implemented as the following steps I to IV:

Step I. Sample the fusion result of the (m-1)-th feature to be fused in the ranking result into a feature with the same spatial scale as the m-th feature to be fused in the ranking result, generating the first sampled feature corresponding to the m-th feature to be fused.

Step II. Compute the difference between the m-th feature to be fused and the first sampled feature corresponding to the m-th feature to be fused, obtaining the feature difference corresponding to the m-th feature to be fused.

Step III. Sample the feature difference corresponding to the m-th feature to be fused into a feature with the same spatial scale as the fusion result of the (m-1)-th feature to be fused, obtaining the second sampled feature corresponding to the m-th feature to be fused.

Step IV. Additively fuse the fusion result of the (m-1)-th feature to be fused and the second sampled feature corresponding to the m-th feature to be fused, generating the fusion result of the m-th feature to be fused.

Steps I to IV, which obtain the fusion result of the m-th feature to be fused in the ranking result, differ from steps 1 to 4, which obtain the fusion result of the first feature to be fused, only in their inputs: for the first feature to be fused, the inputs are the second feature and the first feature to be fused, whereas for the m-th feature to be fused, the inputs are the fusion result of the (m-1)-th feature to be fused and the m-th feature to be fused. The computation is otherwise the same.
Illustratively, referring to FIG. 3, step c is illustrated there with a ranking result consisting, in order, of the features to be fused J_0, J_1, J_2, ..., J_t. On the basis of the embodiment shown in FIG. 2, after the fusion result J_0^n of the first feature to be fused has been obtained, the process of obtaining the fusion result J_t^n of the last feature to be fused J_t further includes:

sampling the fusion result J_0^n of the first feature to be fused J_0 into a feature with the same spatial scale as the second feature to be fused J_1, generating the first sampled feature Ĵ_1 corresponding to the second feature to be fused;

computing the difference between the second feature to be fused J_1 and the first sampled feature Ĵ_1 corresponding to the second feature to be fused J_1, obtaining the feature difference e_1 = J_1 - Ĵ_1 corresponding to the second feature to be fused;

sampling the feature difference e_1 corresponding to the second feature to be fused J_1 into a feature with the same spatial scale as the fusion result J_0^n of the first feature to be fused J_0, obtaining the second sampled feature ê_1 corresponding to the second feature to be fused J_1;

additively fusing the fusion result J_0^n of the first feature to be fused J_0 and the second sampled feature ê_1 corresponding to the second feature to be fused J_1, generating the fusion result J_1^n = J_0^n + ê_1 of the second feature to be fused J_1.

Then: sampling the fusion result J_1^n of the second feature to be fused J_1 into a feature with the same spatial scale as the third feature to be fused J_2, generating the first sampled feature Ĵ_2 corresponding to the third feature to be fused; computing the difference between J_2 and Ĵ_2, obtaining the feature difference e_2 = J_2 - Ĵ_2 corresponding to the third feature to be fused; sampling e_2 into a feature with the same spatial scale as J_1^n, obtaining the second sampled feature ê_2 corresponding to the third feature to be fused J_2; and additively fusing J_1^n and ê_2, generating the fusion result J_2^n of the third feature to be fused J_2.

In the same manner, the fusion results of the fourth feature to be fused J_3, the fifth feature to be fused J_4, ..., the t-th feature to be fused J_(t-1), and the (t+1)-th feature to be fused J_t in the ranking result are obtained one by one, finally yielding the fusion result J_t^n of the (t+1)-th feature to be fused J_t.
Step d. Take the fusion result of the last feature to be fused in the ranking result as the fourth feature.

Continuing the embodiment shown in FIG. 3, where the ranking result consists, in order, of the features to be fused J_0, J_1, J_2, ..., J_t, the fusion result J_t^n of the last feature to be fused J_t is taken as the fourth feature.
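Steps a to d then amount to a sort followed by a left fold of the fuse_pair sketch above over the sorted candidates. The ordering key below, absolute difference in spatial size, is one plausible reading of the "difference between the spatial scales"; it is an assumption, as is the reuse of fuse_pair.

```python
import torch

def fuse_multi(second: torch.Tensor, to_fuse: list[torch.Tensor]) -> torch.Tensor:
    """Steps a-d: sort candidates by descending scale gap to the second
    feature, then fold them in one by one with fuse_pair (defined above)."""
    ordered = sorted(
        to_fuse,
        key=lambda f: abs(f.shape[-2] * f.shape[-1]
                          - second.shape[-2] * second.shape[-1]),
        reverse=True)                     # step a: largest scale gap first
    result = second
    for feat in ordered:                  # steps b and c
        result = fuse_pair(result, feat)  # J_m^n from J_(m-1)^n and J_m
    return result                         # step d: the fourth feature
```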
S15. Merge the third feature and the fourth feature, generating the fusion result of the target feature and the at least one feature to be fused.

Specifically, merging the third feature and the fourth feature may include: concatenating the third feature and the fourth feature along the channel dimension.
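Putting S12 to S15 together, one possible end-to-end sketch (reusing split_channels, RDB, and fuse_multi from the sketches above; the RDB instance must be built for the channel count of the first half) is:

```python
import torch

def feature_fusion(target: torch.Tensor,
                   to_fuse: list[torch.Tensor],
                   rdb: torch.nn.Module) -> torch.Tensor:
    """S12-S15 in sequence: split by channel, RDB branch, multi-scale
    fusion branch, then channel-wise concatenation."""
    first, second = split_channels(target)     # S12: channel-based division
    third = rdb(first)                         # S13: RDB processing
    fourth = fuse_multi(second, to_fuse)       # S14: multi-scale fusion
    return torch.cat([third, fourth], dim=1)   # S15: merge along channels
```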
After obtaining the target feature and at least one feature to be fused, the feature fusion method provided by the embodiments of the present application first divides the target feature into a first feature and a second feature, then processes the first feature based on the RDB to obtain a third feature and fuses the second feature with the at least one feature to be fused to obtain a fourth feature, and finally merges the third feature and the fourth feature to generate the fusion result of the target feature and the at least one feature to be fused. That is, the feature fusion method provided by the embodiments of the present application divides the feature that needs fusion enhancement into two parts, the first feature and the second feature, processes one part (the first feature) based on the RDB, and fuses the other part (the second feature) with the features to be fused. Since RDB-based processing can update features and generate redundant features, and fusing the second feature with the features to be fused can introduce effective information from features of other spatial scales and thus achieve multi-scale feature fusion, the feature fusion method provided by the embodiments of the present application can guarantee the generation of new features while realizing multi-scale feature fusion and preserves the diversity of features in the network architecture. It can therefore solve the problem in the related art that multi-scale feature fusion approaches limit the diversity of features in the network architecture.

It should also be noted that fusing features of multiple spatial scales generally requires convolution and deconvolution for upsampling/downsampling, which consume substantial computing resources, so the performance overhead is considerable. By dividing the target feature into the first feature and the second feature and letting only the second feature participate in the multi-spatial-scale feature fusion, the above embodiments also reduce the number of features that need to be fused (the second feature contains fewer feature maps than the target feature), thereby reducing the computation of feature fusion and improving its efficiency.
On the basis of the above embodiments, an embodiment of the present application further provides an image defogging method. Referring to FIG. 4, the image defogging method provided by the embodiment of the present application includes the following steps S41 to S43:

S41. Process a target image through an encoding module to obtain encoded features.

The encoding module includes L cascaded encoders, all of different spatial scales. The m-th encoder is configured to fuse, via the feature fusion method of any of the above embodiments, the image feature of the encoding module at the m-th encoder with the fusion results output by all encoders before the m-th encoder, generate the fusion result of the m-th encoder, and output the fusion result of the m-th encoder to all encoders after the m-th encoder; L and m are both positive integers, and m ≤ L.
S42. Process the encoded features through a feature restoration module composed of at least one RDB to obtain restored features.
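Under the same assumptions as the RDB sketch above, such a restoration module can be rendered as a simple stack of RDBs; the block count below is illustrative, the text only requires at least one block.

```python
import torch.nn as nn

def make_restoration_module(channels: int, num_blocks: int = 3) -> nn.Module:
    """Bottleneck chain of RDBs (RDB as sketched earlier)."""
    return nn.Sequential(*[RDB(channels) for _ in range(num_blocks)])
```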
S43. Process the restored features through a decoding module to obtain a defogged image of the target image.

The decoding module includes L cascaded decoders, all of different spatial scales. The m-th decoder is configured to fuse, via the feature fusion method of any of the above embodiments, the image feature of the encoding module at the m-th encoder with the fusion results output by all decoders before the m-th decoder, generate the fusion result of the m-th decoder, and output the fusion result of the m-th decoder to all decoders after the m-th decoder.

That is, the encoding module, the feature restoration module, and the decoding module used to execute the embodiment shown in FIG. 4 above form a U-shaped network (U-Net).

Specifically, a U-Net is a special convolutional neural network that mainly includes an encoding module (also called the contracting path), a feature restoration module, and a decoding module (also called the expansive path). The encoding module is mainly used to capture context information in the original image, while the symmetric decoding module precisely localizes the parts of the original image that need to be segmented, after which the processed image is generated. Compared with a fully convolutional network (FCN), U-Net improves on it in that, to precisely localize the parts of the original image to be segmented, the features extracted by the encoding module are combined with new feature maps during upsampling, so that the important information in the features is preserved to the greatest extent, which in turn reduces the demand for training samples and computing resources.
Referring to FIG. 5, the network model for executing the embodiment shown in FIG. 4 includes an encoding module 51, a feature restoration module 52, and a decoding module 53, which together form a U-Net.

The encoding module 51 includes L cascaded encoders, all of different spatial scales, and is configured to process the target image I to obtain the encoded feature i_L. The m-th encoder fuses, via the feature fusion method provided by the above embodiments, the image feature of the encoding module at the m-th encoder with the fusion results output by all encoders before the m-th encoder, generates the fusion result of the m-th encoder, and outputs the fusion result of the m-th encoder to all encoders after the m-th encoder.

The feature restoration module 52 includes at least one RDB and is configured to receive the encoded feature i_L output by the encoding module 51 and to process the encoded feature i_L through the at least one RDB to obtain the restored feature j_L.

The decoding module 53 includes L cascaded decoders, all of different spatial scales. The m-th decoder fuses, via the feature fusion method provided by the above embodiments, the image feature of the decoding module at the m-th decoder with the fusion results output by all decoders before the m-th decoder, generates the fusion result of the m-th decoder, and outputs the fusion result of the m-th decoder to all decoders after the m-th decoder; the defogged image J of the target image I is obtained from the fusion result j_1 output by the last decoder.
The operation by which the m-th encoder in the encoding module 51 fuses, via the feature fusion method provided by the above embodiments, the image feature of the encoding module at the m-th encoder with the fusion results output by all encoders before the m-th encoder (the 1st encoder through the (m-1)-th encoder) can be described as:

i_m = i_m1 + i_m2

i'_m1 = f(i_m1)

i'_m2 = F(i_m2; i_1^n, ..., i_(m-1)^n)

i_m^n = C(i'_m1, i'_m2)

where i_m1 denotes the first feature obtained by dividing the feature i_m of the encoding module at the m-th encoder; f(...) denotes the RDB-based processing, and i'_m1 denotes the third feature obtained by processing i_m1 based on the RDB; i_m2 denotes the second feature obtained by dividing the feature i_m of the encoding module at the m-th encoder; i_1^n, ..., i_(m-1)^n denote the fusion results output by the 1st encoder through the (m-1)-th encoder; F(...) denotes the operation of fusing i_m2 with i_1^n, ..., i_(m-1)^n, and i'_m2 denotes the fusion result so obtained; C(...) denotes merging the two parts (concatenation along the channel dimension, as in step S15); and i_m^n is the fusion result output by the m-th encoder of the encoding module.
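Read operationally, the four equations above can be sketched as a module that splits i_m, runs one half through the RDB branch f(.), fuses the other half with the preceding encoders' fusion results, and concatenates. The sketch reuses split_channels, RDB, and fuse_multi from above; it assumes an even channel count and that the preceding fusion results match i_m2 in channel count, both of which are assumptions rather than constraints stated in the text.

```python
import torch
import torch.nn as nn

class EncoderFusion(nn.Module):
    """Sketch of the m-th encoder's fusion step:
    i_m^n = C(f(i_m1), F(i_m2; i_1^n, ..., i_(m-1)^n))."""

    def __init__(self, channels: int):
        super().__init__()
        assert channels % 2 == 0, "1:1 channel split assumed for illustration"
        self.rdb = RDB(channels // 2)              # branch f(.) acting on i_m1

    def forward(self, i_m: torch.Tensor,
                prev_results: list[torch.Tensor]) -> torch.Tensor:
        i_m1, i_m2 = split_channels(i_m)           # i_m = i_m1 + i_m2
        out_1 = self.rdb(i_m1)                     # i'_m1 = f(i_m1)
        out_2 = fuse_multi(i_m2, prev_results) if prev_results else i_m2
        return torch.cat([out_1, out_2], dim=1)    # i_m^n = C(i'_m1, i'_m2)
```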
The operation by which the m-th decoder in the decoding module 53 fuses, via the feature fusion method provided by the above embodiments, the image feature of the decoding module at the m-th decoder with the fusion results output by all decoders before the m-th decoder (the L-th decoder through the (m+1)-th decoder) can be described as:

j_m = j_m1 + j_m2

j'_m1 = f(j_m1)

j'_m2 = F(j_m2; j_L^n, ..., j_(m+1)^n)

j_m^n = C(j'_m1, j'_m2)

where j_m1 denotes the first feature obtained by dividing the feature j_m of the decoding module at the m-th decoder; f(...) denotes the RDB-based processing, and j'_m1 denotes the third feature obtained by processing j_m1 based on the RDB; j_m2 denotes the second feature obtained by dividing the feature j_m of the decoding module at the m-th decoder; L is the total number of encoders in the encoding module; j_L^n, ..., j_(m+1)^n denote the fusion results output by the L-th decoder through the (m+1)-th decoder; F(...) denotes the operation of fusing j_m2 with j_L^n, ..., j_(m+1)^n, and j'_m2 denotes the fusion result so obtained; C(...) denotes merging the two parts (concatenation along the channel dimension, as in step S15); and j_m^n is the fusion result output by the m-th decoder of the decoding module.
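The decoder-side step has the same shape as the encoder-side one, so the EncoderFusion sketch above can be reused for a decoder by passing the fusion results of the deeper decoders (from the L-th down to the (m+1)-th) as prev_results. The wiring below is hypothetical; shapes and channel counts are illustrative only.

```python
import torch

# Hypothetical wiring for the m-th decoder, reusing EncoderFusion from above.
decoder_fusion = EncoderFusion(channels=64)
j_m = torch.randn(1, 64, 32, 32)            # decoder feature at level m (assumed shape)
prev_results = [torch.randn(1, 32, 8, 8),   # j_L^n, ..., j_(m+1)^n (assumed shapes)
                torch.randn(1, 32, 16, 16)]
j_m_n = decoder_fusion(j_m, prev_results)   # j_m^n = C(f(j_m1), F(j_m2; ...))
```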
Since the image defogging method provided by the embodiments of the present application can perform feature fusion through the feature fusion method provided by the above embodiments, it can guarantee the generation of new features while realizing multi-scale feature fusion and preserves the diversity of features in the network architecture; the image defogging method provided by the embodiments of the present application can therefore improve the performance of image defogging.
Based on the same application concept, as an implementation of the above methods, an embodiment of the present application further provides a feature fusion apparatus. This apparatus embodiment corresponds to the foregoing method embodiments; for ease of reading, the details of the foregoing method embodiments are not repeated one by one in this apparatus embodiment, but it should be clear that the feature fusion apparatus in this embodiment can correspondingly implement all the content of the foregoing method embodiments.

An embodiment of the present application provides a feature fusion apparatus. FIG. 6 is a schematic structural diagram of the feature fusion apparatus. As shown in FIG. 6, the feature fusion apparatus 600 includes:
an obtaining unit 61, configured to obtain a target feature and at least one feature to be fused, the target feature and the at least one feature to be fused being features of different spatial scales of a same image;

a dividing unit 62, configured to divide the target feature into a first feature and a second feature;

a first processing unit 63, configured to process the first feature based on a residual dense block to obtain a third feature;

a second processing unit 64, configured to fuse the second feature and the at least one feature to be fused to obtain a fourth feature;

a merging unit 65, configured to merge the third feature and the fourth feature to generate a fusion result of the target feature and the at least one feature to be fused.
In some embodiments, the second processing unit 64 is specifically configured to: sort the at least one feature to be fused in descending order of the difference between the spatial scale of each feature to be fused and the spatial scale of the second feature, obtaining a ranking result; fuse the first feature to be fused in the ranking result with the second feature, generating a fusion result of the first feature to be fused; fuse, one by one, each of the other features to be fused in the ranking result with the fusion result of the preceding feature to be fused, generating a fusion result of the last feature to be fused in the ranking result; and take the fusion result of the last feature to be fused in the ranking result as the fourth feature.

In some embodiments, the second processing unit 64 is specifically configured to: sample the second feature into a feature with the same spatial scale as the first feature to be fused, generating a first sampled feature corresponding to the first feature to be fused; compute the difference between the first sampled feature corresponding to the first feature to be fused and the first feature to be fused, obtaining a feature difference corresponding to the first feature to be fused; sample the feature difference corresponding to the first feature to be fused into a feature with the same spatial scale as the second feature, obtaining a second sampled feature corresponding to the first feature to be fused; and additively fuse the second feature and the second sampled feature corresponding to the first feature to be fused, generating the fusion result of the first feature to be fused.

In some embodiments, the second processing unit 64 is specifically configured to: sample the fusion result of the (m-1)-th feature to be fused in the ranking result into a feature with the same spatial scale as the m-th feature to be fused in the ranking result, generating a first sampled feature corresponding to the m-th feature to be fused, m being a positive integer greater than 1; compute the difference between the m-th feature to be fused and the first sampled feature corresponding to the m-th feature to be fused, obtaining a feature difference corresponding to the m-th feature to be fused; sample the feature difference corresponding to the m-th feature to be fused into a feature with the same spatial scale as the fusion result of the (m-1)-th feature to be fused, obtaining a second sampled feature corresponding to the m-th feature to be fused; and additively fuse the fusion result of the (m-1)-th feature to be fused and the second sampled feature corresponding to the m-th feature to be fused, generating a fusion result of the m-th feature to be fused.

In some embodiments, the dividing unit 62 is specifically configured to divide the target feature into the first feature and the second feature based on the feature channels of the target feature.
The feature fusion apparatus provided by this embodiment can execute the feature fusion method provided by the above method embodiments; its implementation principle and technical effects are similar and are not repeated here.
An embodiment of the present application provides an image defogging apparatus. FIG. 7 is a schematic structural diagram of the image defogging apparatus. As shown in FIG. 7, the image defogging apparatus 700 includes:

a feature extraction unit 71, configured to process a target image through an encoding module to obtain encoded features, where the encoding module includes L cascaded encoders, all of different spatial scales, and the m-th encoder is configured to fuse, via the feature fusion method of any of the above embodiments, the image feature of the encoding module at the m-th encoder with the fusion results output by all encoders before the m-th encoder, generate the fusion result of the m-th encoder, and output the fusion result of the m-th encoder to all encoders after the m-th encoder, L and m both being positive integers with m ≤ L;

a feature processing unit 72, configured to process the encoded features through a feature restoration module composed of at least one residual dense block (RDB) to obtain restored features;

an image generation unit 73, configured to process the restored features through a decoding module to obtain a defogged image of the target image, where the decoding module includes L cascaded decoders, all of different spatial scales, and the m-th decoder is configured to fuse, via the feature fusion method of any of the above embodiments, the image feature of the encoding module at the m-th encoder with the fusion results output by all decoders before the m-th decoder, generate the fusion result of the m-th decoder, and output the fusion result of the m-th decoder to all decoders after the m-th decoder.

The image defogging apparatus provided by this embodiment can execute the image defogging method provided by the above method embodiments; its implementation principle and technical effects are similar and are not repeated here.
Based on the same application concept, an embodiment of the present application further provides an electronic device. FIG. 8 is a schematic structural diagram of the electronic device provided by an embodiment of the present application. As shown in FIG. 8, the electronic device provided by this embodiment includes a memory 81 and a processor 82; the memory 81 is configured to store a computer program, and the processor 82 is configured to execute, when the computer program is invoked, the feature fusion method or the image defogging method provided by the above embodiments.

Based on the same application concept, an embodiment of the present application further provides a computer-readable storage medium storing a computer program; when the computer program is executed by a processor, the computing device is caused to implement the feature fusion method or the image defogging method provided by the above embodiments.

Based on the same application concept, an embodiment of the present application further provides a computer program product; when the computer program product runs on a computer, the computing device is caused to implement the feature fusion method or the image defogging method provided by the above embodiments.
本领域技术人员应明白,本申请的实施例可提供为方法、系统、或计算机程序产品。因此,本申请可采用完全硬件实施例、完全软件实施例、或结合软件和硬件方面的实施例的形式。而且,本申请可采用在一个或多个其中包含有计算机可用程序代码的计算机可用存储介质上实施的计算机程序产品的形式。Those skilled in the art should understand that the embodiments of the present application may be provided as methods, systems, or computer program products. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media having computer-usable program code embodied therein.
处理器可以是中央处理单元(Central Processing Unit,CPU),还可以是其他通用处理器、数字信号处理器(Digital Signal Processor,DSP)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现成可编程门阵列(Field-Programmable Gate Array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。The processor can be a central processing unit (Central Processing Unit, CPU), or other general-purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), off-the-shelf programmable Field-Programmable Gate Array (FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
存储器可能包括计算机可读介质中的非永久性存储器,随机存取存储器(RAM)和/或非易失性内存等形式,如只读存储器(ROM)或闪存(flash RAM)。存储器是计算机可读介质的示例。Memory may include non-permanent storage in computer readable media, in the form of random access memory (RAM) and/or nonvolatile memory such as read only memory (ROM) or flash RAM. The memory is an example of a computer readable medium.
计算机可读介质包括永久性和非永久性、可移动和非可移动存储介质。存储介质可以由任何方法或技术来实现信息存储,信息可以是计算机可读指令、数据结构、程序的模块或其他数据。计算机的存储介质的例子包括,但不限于相变内存(PRAM)、静态随机存取存储器(SRAM)、动态随机存取存储器(DRAM)、其他类型的随机存取存储器(RAM)、只读存储器(ROM)、电可擦除可编程只读存储器(EEPROM)、快闪记忆体或其他内存技术、只读光盘只读存储器(CD-ROM)、数字多功能光盘(DVD)或其他光学存储、磁盒式磁带,磁盘存储或其他磁性存储设备或任何其他非传输介质,可用于存储可以被计算设备访问的信息。根据本文中的界定,计算机可读介质不包括暂存电脑可读媒体(transitory media),如调制的数据信号和载波。Computer-readable media includes both volatile and non-volatile, removable and non-removable storage media. The storage medium may store information by any method or technology, and the information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Flash memory or other memory technology, Compact Disc Read-Only Memory (CD-ROM), Digital Versatile Disc (DVD) or other optical storage, A magnetic tape cartridge, disk storage or other magnetic storage device or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer readable media excludes transitory computer readable media, such as modulated data signals and carrier waves.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments, or make equivalent replacements for some or all of the technical features therein; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application.

Claims (11)

  1. A feature fusion method, comprising:
    acquiring a target feature and at least one feature to be fused, the target feature and the at least one feature to be fused being features of the same image at different spatial scales;
    dividing the target feature into a first feature and a second feature;
    processing the first feature based on a residual dense block (RDB) to obtain a third feature;
    fusing the second feature with the at least one feature to be fused to obtain a fourth feature;
    merging the third feature and the fourth feature to generate a fusion result of the target feature and the at least one feature to be fused.
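To make the data flow of claim 1 concrete, here is a minimal PyTorch sketch. Everything in it is an illustrative assumption rather than the patent's implementation: the module name, the two-convolution stand-in for a full residual dense block, the even channel count, and the channel-halving split of claim 5. The `fuse_multi_scale` helper it calls is sketched after claim 4 below.

```python
import torch
import torch.nn as nn

class FeatureFusionBlock(nn.Module):
    """Sketch of claim 1: split -> RDB branch + multi-scale fusion branch -> merge.

    The two-convolution `rdb` is only a stand-in for a full residual dense
    block, and an even channel count is assumed.
    """

    def __init__(self, channels: int):
        super().__init__()
        half = channels // 2
        self.rdb = nn.Sequential(
            nn.Conv2d(half, half, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(half, half, 3, padding=1),
        )

    def forward(self, target: torch.Tensor,
                to_fuse: list[torch.Tensor]) -> torch.Tensor:
        # Claim 5: split the target feature along its channel dimension.
        first, second = torch.chunk(target, 2, dim=1)
        # Third feature: the first half passed through the RDB branch.
        third = first + self.rdb(first)
        # Fourth feature: the second half fused with the multi-scale inputs.
        # Earlier fusion results are assumed to carry the same channel count
        # as `target`, so their second halves are taken to match `second`.
        halves = [torch.chunk(f, 2, dim=1)[1] for f in to_fuse]
        fourth = fuse_multi_scale(second, halves)  # sketched after claim 4
        # Merge both halves into the fusion result.
        return torch.cat([third, fourth], dim=1)
```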
  2. The method according to claim 1, wherein fusing the second feature with the at least one feature to be fused to obtain the fourth feature comprises:
    sorting the at least one feature to be fused in descending order of the difference between the spatial scale of each feature to be fused and the spatial scale of the second feature, to obtain a sorting result;
    fusing the first feature to be fused in the sorting result with the second feature, to generate a fusion result of the first feature to be fused;
    fusing each remaining feature to be fused in the sorting result, one by one, with the fusion result of the preceding feature to be fused, to generate a fusion result of the last feature to be fused in the sorting result;
    taking the fusion result of the last feature to be fused in the sorting result as the fourth feature.
  3. The method according to claim 2, wherein fusing the first feature to be fused in the sorting result with the second feature to generate the fusion result of the first feature to be fused comprises:
    sampling the second feature to a feature with the same spatial scale as the first feature to be fused, to generate a first sampled feature corresponding to the first feature to be fused;
    computing the difference between the first sampled feature corresponding to the first feature to be fused and the first feature to be fused, to obtain a feature difference corresponding to the first feature to be fused;
    sampling the feature difference corresponding to the first feature to be fused to a feature with the same spatial scale as the second feature, to obtain a second sampled feature corresponding to the first feature to be fused;
    adding the second feature and the second sampled feature corresponding to the first feature to be fused, to generate the fusion result of the first feature to be fused.
  4. The method according to claim 2, wherein fusing each remaining feature to be fused in the sorting result with the fusion result of the preceding feature to be fused comprises:
    sampling the fusion result of the (m-1)th feature to be fused in the sorting result to a feature with the same spatial scale as the mth feature to be fused in the sorting result, to generate a first sampled feature corresponding to the mth feature to be fused, where m is a positive integer greater than 1;
    computing the difference between the mth feature to be fused and the first sampled feature corresponding to the mth feature to be fused, to obtain a feature difference corresponding to the mth feature to be fused;
    sampling the feature difference corresponding to the mth feature to be fused to a feature with the same spatial scale as the fusion result of the (m-1)th feature to be fused, to obtain a second sampled feature corresponding to the mth feature to be fused;
    adding the fusion result of the (m-1)th feature to be fused and the second sampled feature corresponding to the mth feature to be fused, to generate a fusion result of the mth feature to be fused.
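Claims 2-4 describe an iterative error-feedback loop: the running result is resampled to each incoming feature's scale (largest scale gap first), the residual against that feature is computed, resampled back, and added. A hedged sketch of that loop follows; the bilinear interpolation mode, the use of spatial width as the scale measure, and the residual sign (taken from claim 4) are assumptions.

```python
import torch
import torch.nn.functional as F

def fuse_multi_scale(second: torch.Tensor,
                     to_fuse: list[torch.Tensor]) -> torch.Tensor:
    """Sketch of the fusion loop in claims 2-4 (names and interpolation
    mode are assumptions; the residual sign follows claim 4)."""
    # Claim 2: sort by the gap between each feature's spatial scale and
    # the scale of `second`, largest gap first.
    ordered = sorted(to_fuse,
                     key=lambda f: abs(f.shape[-1] - second.shape[-1]),
                     reverse=True)
    result = second
    for feat in ordered:
        # "First sampled feature": resample the running result to the
        # scale of the current feature to be fused.
        sampled = F.interpolate(result, size=feat.shape[-2:],
                                mode="bilinear", align_corners=False)
        # "Feature difference" at the scale of `feat`.
        diff = feat - sampled
        # "Second sampled feature": bring the difference back to the
        # scale of the running result.
        back = F.interpolate(diff, size=result.shape[-2:],
                             mode="bilinear", align_corners=False)
        # Additive fusion (claims 3 and 4).
        result = result + back
    return result
```

Because the residual is computed at the other feature's scale and only the correction is carried back, the running result keeps its own resolution throughout, which is what lets the loop absorb features of arbitrary scales without a fixed reprojection path.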
  5. The method according to any one of claims 1-4, wherein dividing the target feature into the first feature and the second feature comprises:
    dividing the target feature into the first feature and the second feature along the feature channels of the target feature.
  6. An image defogging method, comprising:
    processing a target image through an encoding module to obtain encoded features, wherein the encoding module comprises L cascaded encoders whose spatial scales all differ; the mth encoder is configured to fuse, using the feature fusion method according to any one of claims 1-5, the image features of the encoding module at the mth encoder with the fusion results output by all encoders preceding the mth encoder, to generate a fusion result of the mth encoder, and to output that fusion result to all encoders following the mth encoder, where L and m are positive integers and m≤L;
    processing the encoded features through a feature restoration module composed of at least one residual dense block (RDB) to obtain restored features;
    processing the restored features through a decoding module to obtain a defogged image of the target image, wherein the decoding module comprises L cascaded decoders whose spatial scales all differ; the mth decoder is configured to fuse, using the feature fusion method according to any one of claims 1-5, the image features of the encoding module at the mth encoder with the fusion results output by all decoders preceding the mth decoder, to generate a fusion result of the mth decoder, and to output that fusion result to all decoders following the mth decoder.
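Claim 6 wires these fusion blocks into a U-shaped dehazing network: L densely connected encoders, an RDB-based restoration trunk, and L densely connected decoders that draw on the encoder features. The structural sketch below reuses `FeatureFusionBlock` and `fuse_multi_scale` from the earlier sketches; the channel width, the 2x-per-level downsampling, the restored trunk seeding the decoder chain, and the deepest-first encoder/decoder pairing are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DehazeNet(nn.Module):
    """Structural sketch of claim 6; channel widths, scale factors, and the
    encoder/decoder pairing are assumptions for illustration only."""

    def __init__(self, levels: int = 3, base: int = 32):
        super().__init__()
        self.head = nn.Conv2d(3, base, 3, padding=1)
        self.encoders = nn.ModuleList(
            FeatureFusionBlock(base) for _ in range(levels))
        # Stand-in for the RDB-based feature restoration module.
        self.restore = nn.Sequential(
            nn.Conv2d(base, base, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(base, base, 3, padding=1),
        )
        self.decoders = nn.ModuleList(
            FeatureFusionBlock(base) for _ in range(levels))
        self.tail = nn.Conv2d(base, 3, 3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.head(x)
        enc_feats: list[torch.Tensor] = []
        # The mth encoder fuses its own scale's features with the fusion
        # results of all preceding encoders.
        for m, enc in enumerate(self.encoders):
            if m > 0:
                h = F.avg_pool2d(h, 2)  # assumed 2x downsampling per level
            h = enc(h, enc_feats)
            enc_feats.append(h)
        h = h + self.restore(h)
        # The mth decoder fuses the matching encoder's features with the
        # fusion results of all preceding decoders; the restored trunk
        # output seeds the chain (an assumption).
        dec_feats: list[torch.Tensor] = [h]
        for m, dec in enumerate(self.decoders):
            out = dec(enc_feats[-(m + 1)], dec_feats)
            dec_feats.append(out)
        return self.tail(dec_feats[-1])
```

Under these assumptions, `DehazeNet()(torch.rand(1, 3, 64, 64))` returns a `(1, 3, 64, 64)` tensor, i.e. a restored image at the input resolution; the decoders need no explicit upsampling layers because `fuse_multi_scale` resamples each contribution to the target scale.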
  7. A feature fusion apparatus, comprising:
    an acquisition unit configured to acquire a target feature and at least one feature to be fused, the target feature and the at least one feature to be fused being features of the same image at different spatial scales;
    a division unit configured to divide the target feature into a first feature and a second feature;
    a first processing unit configured to process the first feature based on a residual densely-connected network to obtain a third feature;
    a second processing unit configured to fuse the second feature with the at least one feature to be fused to obtain a fourth feature;
    a merging unit configured to merge the third feature and the fourth feature to generate a fusion result of the target feature and the at least one feature to be fused.
  8. An image defogging apparatus, comprising:
    a feature extraction unit configured to process a target image through an encoding module to obtain encoded features, wherein the encoding module comprises L cascaded encoders whose spatial scales all differ; the mth encoder is configured to fuse, using the feature fusion method according to any one of claims 1-5, the image features of the encoding module at the mth encoder with the fusion results output by all encoders preceding the mth encoder, to generate a fusion result of the mth encoder, and to output that fusion result to all encoders following the mth encoder, where L and m are positive integers and m≤L;
    a feature processing unit configured to process the encoded features through a feature restoration module composed of at least one residual dense block (RDB) to obtain restored features;
    an image generation unit configured to process the restored features through a decoding module to obtain a defogged image of the target image, wherein the decoding module comprises L cascaded decoders whose spatial scales all differ; the mth decoder is configured to fuse, using the feature fusion method according to any one of claims 1-5, the image features of the encoding module at the mth encoder with the fusion results output by all decoders preceding the mth decoder, to generate a fusion result of the mth decoder, and to output that fusion result to all decoders following the mth decoder.
  9. An electronic device, comprising a memory and a processor, wherein the memory is configured to store a computer program, and the processor is configured to, when invoking the computer program, cause the electronic device to implement the method according to any one of claims 1-6.
  10. A computer-readable storage medium storing a computer program which, when executed by a computing device, causes the computing device to implement the method according to any one of claims 1-6.
  11. A computer program product which, when run on a computer, causes the computer to implement the method according to any one of claims 1-6.
PCT/CN2022/121209 2021-09-27 2022-09-26 Feature fusion method, image defogging method and device WO2023046136A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111138532.8A CN115880192A (en) 2021-09-27 2021-09-27 Feature fusion method, image defogging method and device
CN202111138532.8 2021-09-27

Publications (1)

Publication Number Publication Date
WO2023046136A1 true WO2023046136A1 (en) 2023-03-30

Family

ID=85720113

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/121209 WO2023046136A1 (en) 2021-09-27 2022-09-26 Feature fusion method, image defogging method and device

Country Status (2)

Country Link
CN (1) CN115880192A (en)
WO (1) WO2023046136A1 (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090292468A1 (en) * 2008-03-25 2009-11-26 Shunguang Wu Collision avoidance method and system using stereo vision and radar sensor fusion
CN110544213A (en) * 2019-08-06 2019-12-06 天津大学 Image defogging method based on global and local feature fusion
CN111539886A (en) * 2020-04-21 2020-08-14 西安交通大学 Defogging method based on multi-scale feature fusion
CN111968064A (en) * 2020-10-22 2020-11-20 成都睿沿科技有限公司 Image processing method and device, electronic equipment and storage medium
CN112232132A (en) * 2020-09-18 2021-01-15 北京理工大学 Target identification and positioning method fusing navigation information
CN112801047A (en) * 2021-03-19 2021-05-14 腾讯科技(深圳)有限公司 Defect detection method and device, electronic equipment and readable storage medium
CN112884682A (en) * 2021-01-08 2021-06-01 福州大学 Stereo image color correction method and system based on matching and fusion

Also Published As

Publication number Publication date
CN115880192A (en) 2023-03-31

Similar Documents

Publication Publication Date Title
CN109816659B (en) Image segmentation method, device and system
US11604272B2 (en) Methods and systems for object detection
CN111429466A (en) Space-based crowd counting and density estimation method based on multi-scale information fusion network
CN115409855B (en) Image processing method, device, electronic equipment and storage medium
CN112861795A (en) Method and device for detecting salient target of remote sensing image based on multi-scale feature fusion
CN114694005A (en) Target detection model training method and device, and target detection method and device
CN113962861A (en) Image reconstruction method and device, electronic equipment and computer readable medium
CN113705575B (en) Image segmentation method, device, equipment and storage medium
WO2023046136A1 (en) Feature fusion method, image defogging method and device
CN114331982A (en) Target counting method and device
Zang et al. Cascaded dense-UNet for image super-resolution
WO2023125522A1 (en) Image processing method and apparatus
Liu et al. Single‐image super‐resolution using lightweight transformer‐convolutional neural network hybrid model
CN115358962B (en) End-to-end visual odometer method and device
Gao et al. Multi-branch aware module with channel shuffle pixel-wise attention for lightweight image super-resolution
CN116091319A (en) Image super-resolution reconstruction method and system based on long-distance context dependence
CN113255675B (en) Image semantic segmentation network structure and method based on expanded convolution and residual path
Shen et al. Itsrn++: Stronger and better implicit transformer network for continuous screen content image super-resolution
CN115660984A (en) Image high-definition restoration method and device and storage medium
WO2023072176A1 (en) Video super-resolution method and device
CN111353441B (en) Road extraction method and system based on position data fusion
CN110610459A (en) Image processing method and device
CN114565754A (en) Point cloud semantic segmentation method, device, equipment and medium based on attention mechanism
CN110633595B (en) Target detection method and device by utilizing bilinear interpolation
CN111524090A (en) Depth prediction image-based RGB-D significance detection method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22872180

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE