WO2023046136A1 - Feature fusion method, image dehazing method and device - Google Patents

Feature fusion method, image dehazing method and device

Info

Publication number
WO2023046136A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature
fused
fusion
features
mth
Prior art date
Application number
PCT/CN2022/121209
Other languages
English (en)
Chinese (zh)
Inventor
董航
Original Assignee
北京字跳网络技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京字跳网络技术有限公司
Publication of WO2023046136A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V 10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features

Definitions

  • the present application relates to the technical field of image processing, in particular to a feature fusion method, an image defogging method and a device.
  • Image dehazing is a classic image processing problem.
  • the main purpose of image dehazing is to restore a hazy image so as to obtain a clear, haze-free image. Because many high-level computer vision tasks (image detection, image recognition, etc.) first require dehazing the image to obtain a clear input, the image dehazing problem has received extensive attention in the computer vision community.
  • a multi-scale network improves the overall performance of image dehazing by extracting and utilizing features at different scales. However, a multi-scale network architecture loses the spatial information of image features during downsampling, and there is a lack of sufficient connection between non-adjacent network layers and between features at different scales.
  • integrating multi-scale features and improving the reuse of network features has been proven to be an effective way to improve network performance in many deep learning architectures.
  • a multi-scale feature fusion method that is widely used at present is the one based on re-projection technology. Although it can achieve multi-scale feature fusion, the re-projection technology constrains the content shared between features of different scales and limits the diversity of the features generated during multi-scale feature fusion, which in turn affects the learning ability of image dehazing.
  • the present application provides a feature fusion method, image defogging method and device, which are used to solve the problem that the multi-scale feature fusion method in the related art will limit the diversity of features in the network architecture.
  • the embodiment of the present application provides a feature fusion method, including: acquiring a target feature and at least one feature to be fused, where the target feature and the at least one feature to be fused are respectively features of the same image at different spatial scales; dividing the target feature into a first feature and a second feature; processing the first feature based on a residual dense block (RDB) to obtain a third feature; fusing the second feature and the at least one feature to be fused to obtain a fourth feature; and merging the third feature and the fourth feature to generate a fusion result of the target feature and the at least one feature to be fused.
  • the fusing of the second feature and the at least one feature to be fused to obtain the fourth feature includes: sorting the at least one feature to be fused in descending order according to the difference between the spatial scale of each feature to be fused and the spatial scale of the second feature, to obtain a sorting result; fusing the first feature to be fused in the sorting result with the second feature, to generate a fusion result of the first feature to be fused; fusing the other features to be fused one by one with the fusion result of the previous feature to be fused, to generate a fusion result of the last feature to be fused in the sorting result; and taking the fusion result of the last feature to be fused in the sorting result as the fourth feature.
  • the fusing of the first feature to be fused in the sorting result with the second feature to generate the fusion result of the first feature to be fused includes: sampling the second feature into a feature with the same spatial scale as the first feature to be fused, to generate a first sampled feature corresponding to the first feature to be fused; calculating the difference between the first sampled feature corresponding to the first feature to be fused and the first feature to be fused, to obtain a feature difference corresponding to the first feature to be fused; sampling the feature difference corresponding to the first feature to be fused into a feature with the same spatial scale as the second feature, to obtain a second sampled feature corresponding to the first feature to be fused; and additively fusing the second feature with the second sampled feature corresponding to the first feature to be fused, to generate the fusion result of the first feature to be fused.
  • the one-by-one fusion of the other features to be fused in the sorting result with the fusion result of the previous feature to be fused includes: sampling the fusion result of the (m-1)th feature to be fused in the sorting result into a feature with the same spatial scale as the mth feature to be fused in the sorting result, to generate a first sampled feature corresponding to the mth feature to be fused, where m is a positive integer greater than 1; calculating the difference between the mth feature to be fused and the first sampled feature corresponding to the mth feature to be fused, to obtain a feature difference corresponding to the mth feature to be fused; sampling the feature difference corresponding to the mth feature to be fused into a feature with the same spatial scale as the fusion result of the (m-1)th feature to be fused, to obtain a second sampled feature corresponding to the mth feature to be fused; and additively fusing the fusion result of the (m-1)th feature to be fused with the second sampled feature corresponding to the mth feature to be fused, to generate a fusion result of the mth feature to be fused.
  • the dividing the target feature into a first feature and a second feature includes: dividing the target feature into a first feature and a second feature based on a feature channel of the target feature.
  • an embodiment of the present application provides an image defogging method, including: processing a target image through an encoding module to obtain encoding features; wherein the encoding module includes L cascaded encoders with different spatial scales, the mth encoder is used to fuse, by the feature fusion method described in any one of the first aspect, the image features of the encoding module at the mth encoder with the fusion results output by all encoders before the mth encoder, generate the fusion result of the mth encoder, and output the fusion result of the mth encoder to all encoders after the mth encoder, and L and m are both positive integers with m ≤ L;
  • processing the encoding features through a feature restoration module composed of at least one residual dense block (RDB) to obtain restored features;
  • processing the restored features through a decoding module to obtain a dehazing effect image of the target image;
  • wherein the decoding module includes L cascaded decoders with different spatial scales, and the mth decoder is used to fuse, by the feature fusion method described in any one of the first aspect, the image features of the decoding module at the mth decoder with the fusion results output by all decoders before the mth decoder, generate the fusion result of the mth decoder, and output the fusion result of the mth decoder to all decoders after the mth decoder.
  • an embodiment of the present application provides a feature fusion device, including: an acquisition unit, configured to acquire a target feature and at least one feature to be fused, where the target feature and the at least one feature to be fused are respectively features of the same image at different spatial scales; a division unit, configured to divide the target feature into a first feature and a second feature; a first processing unit, configured to process the first feature based on a residual dense connection network to obtain a third feature; a second processing unit, configured to fuse the second feature and the at least one feature to be fused to obtain a fourth feature; and a merging unit, configured to merge the third feature and the fourth feature to generate a fusion result of the target feature and the at least one feature to be fused.
  • the second processing unit is specifically configured to: sort the at least one feature to be fused in descending order according to the difference between the spatial scale of each feature to be fused and the spatial scale of the second feature, to obtain a sorting result; fuse the first feature to be fused in the sorting result with the second feature, to generate a fusion result of the first feature to be fused; fuse the other features to be fused in the sorting result one by one with the fusion result of the previous feature to be fused, to generate a fusion result of the last feature to be fused in the sorting result; and take the fusion result of the last feature to be fused in the sorting result as the fourth feature.
  • the second processing unit is specifically configured to: sample the second feature into a feature with the same spatial scale as the first feature to be fused, to generate a first sampled feature corresponding to the first feature to be fused; calculate the difference between the first sampled feature corresponding to the first feature to be fused and the first feature to be fused, to obtain a feature difference corresponding to the first feature to be fused; sample the feature difference corresponding to the first feature to be fused into a feature with the same spatial scale as the second feature, to obtain a second sampled feature corresponding to the first feature to be fused; and additively fuse the second feature with the second sampled feature corresponding to the first feature to be fused, to generate a fusion result of the first feature to be fused.
  • the second processing unit is specifically configured to: sample the fusion result of the (m-1)th feature to be fused in the sorting result into a feature with the same spatial scale as the mth feature to be fused in the sorting result, to generate a first sampled feature corresponding to the mth feature to be fused, where m is a positive integer greater than 1; calculate the difference between the mth feature to be fused and the first sampled feature corresponding to the mth feature to be fused, to obtain a feature difference corresponding to the mth feature to be fused; sample the feature difference corresponding to the mth feature to be fused into a feature with the same spatial scale as the fusion result of the (m-1)th feature to be fused, to obtain a second sampled feature corresponding to the mth feature to be fused; and additively fuse the fusion result of the (m-1)th feature to be fused with the second sampled feature corresponding to the mth feature to be fused, to generate a fusion result of the mth feature to be fused.
  • the dividing unit is specifically configured to divide the target feature into a first feature and a second feature based on a feature channel of the target feature.
  • an embodiment of the present application provides an image defogging device, including: a feature extraction unit, configured to process a target image through an encoding module to obtain encoding features, wherein the encoding module includes L cascaded encoders with different spatial scales, the mth encoder is used to fuse, by the feature fusion method described in any one of the first aspect, the image features of the encoding module at the mth encoder with the fusion results output by all encoders before the mth encoder, generate the fusion result of the mth encoder, and output the fusion result of the mth encoder to all encoders after the mth encoder, and L and m are both positive integers with m ≤ L; a feature processing unit, configured to process the encoding features through a feature restoration module composed of at least one residual dense block (RDB) to obtain restored features; and an image generation unit, configured to process the restored features through a decoding module to obtain a dehazing effect image of the target image.
  • an embodiment of the present application provides an electronic device, including a memory and a processor, where the memory is used to store a computer program, and the processor is used to enable the electronic device, when executing the computer program, to implement the feature fusion method described in the first aspect or any embodiment of the first aspect.
  • the embodiment of the present application provides a computer-readable storage medium storing a computer program; when the computer program is executed by a computing device, the computing device implements the feature fusion method described in the first aspect or any embodiment of the first aspect.
  • an embodiment of the present application provides a computer program product, which enables the computer to implement the feature fusion method described in the first aspect or any embodiment of the first aspect when the computer program product is run on a computer.
  • according to the feature fusion method, after obtaining the target feature and at least one feature to be fused, the target feature is first divided into the first feature and the second feature; the first feature is then processed based on the RDB to obtain the third feature, and the second feature and the at least one feature to be fused are fused to obtain the fourth feature; finally, the third feature and the fourth feature are merged to generate the fusion result of the target feature and the at least one feature to be fused.
  • the feature fusion method provided by the embodiment of the present application divides the feature that needs enhanced fusion into two parts, the first feature and the second feature, processes one part (the first feature) based on the RDB, and fuses the other part (the second feature) with the features to be fused. Since the RDB can update features and generate redundant features, and since fusing the second feature with the features to be fused introduces effective information from features of other spatial scales, multi-scale feature fusion is realized while new features are still generated.
  • therefore, the feature fusion method provided in the embodiment of the present application can ensure the generation of new features and the diversity of features in the network architecture when realizing multi-scale feature fusion, and can thus solve the problem in the related art that the multi-scale feature fusion method limits the diversity of features in the network architecture.
  • Fig. 1 is a flow chart of the steps of the feature fusion method provided by the embodiment of the present application
  • Fig. 2 is the first schematic diagram of the data flow of the feature fusion method provided by the embodiment of the present application.
  • Fig. 3 is the second schematic diagram of the data flow of the feature fusion method provided by the embodiment of the present application.
  • FIG. 4 is a flow chart of the steps of the image defogging method provided by the embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of a network model for implementing an image defogging method provided by an embodiment of the present application
  • FIG. 6 is a schematic structural diagram of a feature fusion device provided by an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of an image defogging device provided in an embodiment of the present application.
  • FIG. 8 is a schematic diagram of a hardware structure of an electronic device provided by an embodiment of the present application.
  • words such as "exemplary" or "for example" are used to indicate an example, illustration or description. Any embodiment or design scheme described as "exemplary" or "for example" in the embodiments of the present application shall not be interpreted as being more preferred or more advantageous than other embodiments or design schemes. Rather, the use of words such as "exemplary" or "for example" is intended to present related concepts in a concrete manner.
  • the meaning of "plurality” refers to two or more.
  • the embodiment of the present application provides a feature fusion method, which can be used in the image processing process of any image processing scene.
  • for example, the feature fusion method provided by the embodiment of the present application can fuse the features extracted from an image in an image dehazing scene; as another example, it can also fuse the features extracted from an image during the image restoration process.
  • the feature fusion method provided in the embodiment of the present application can also fuse the features of the extracted image during the image super-resolution process.
  • the embodiment of the present application does not limit the usage scenario of the feature fusion method, as long as the usage scenario includes multiple image features of different spatial scales that need to be fused. Referring to Figure 1, the feature fusion method includes the following steps:
  • the target feature and the at least one feature to be fused are features of different spatial scales of the same image.
  • the target feature in the embodiment of the present application refers to a feature that needs to be fused and enhanced
  • the feature to be fused refers to a feature used to perform fusion and enhancement on the target feature.
  • feature extraction may be performed on the image to be processed based on feature extraction functions or feature extraction networks of different spatial scales, so as to obtain the target feature and the at least one feature to be fused.
  • the implementation of dividing the target feature into the first feature and the second feature may include:
  • the target feature is divided into a first feature and a second feature based on a feature channel of the target feature.
  • the channel of a feature in the embodiment of the present application refers to the feature maps contained in the feature: one channel of a feature is the feature map obtained by feature extraction along a certain dimension, so a channel of a feature is a feature map in a specific sense. Dividing a feature based on its feature channels means grouping the feature maps of some dimensions of the feature into one feature set and using the feature maps of the remaining dimensions as another feature set.
  • the ratio of the first feature to the second feature is not limited in the embodiment of the present application.
  • the ratio of the first feature to the second feature may be determined by the amount of effective information in the features of each spatial scale and the amount of new features that need to be generated.
  • the ratio of the first feature to the second feature may be 1:1.
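  • for illustration only, the following is a minimal PyTorch-style sketch (not part of the original disclosure; the function name `split_by_channel` and the 1:1 ratio are assumptions) of dividing the target feature into a first feature and a second feature along the channel dimension:

```python
import torch

def split_by_channel(target_feature: torch.Tensor, ratio: float = 0.5):
    """Divide a feature of shape (N, C, H, W) into two parts along the channel axis.

    With ratio=0.5 the first and second features each keep half of the channels (1:1).
    """
    channels = target_feature.shape[1]
    split_point = int(channels * ratio)
    first_feature = target_feature[:, :split_point]    # later processed by the RDB
    second_feature = target_feature[:, split_point:]   # later fused with the features to be fused
    return first_feature, second_feature

# Example: a 64-channel feature split into two 32-channel features.
x = torch.randn(1, 64, 128, 128)
first, second = split_by_channel(x)
```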
  • the residual dense block includes three main parts, which are: Contiguous Memory (CM), Local Feature Fusion (LFF) and Local Residual Learning (LRL).
  • CM is mainly used to send the output of the previous RDB to each convolutional layer in the current RDB; LFF is mainly used to fuse the output of the previous RDB with the outputs of all convolutional layers of the current RDB; LRL is mainly used to add the output of the previous RDB to the output of the LFF of the current RDB, and to take the addition result as the output of the current RDB.
  • since the RDB can perform feature update and redundant feature generation, processing the first feature based on the residual dense block can increase the diversity of features.
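  • as an illustration only, the following is one possible PyTorch-style sketch of a residual dense block with the three parts described above; the layer count, growth rate and channel sizes are assumptions, not values from the disclosure:

```python
import torch
import torch.nn as nn

class ResidualDenseBlock(nn.Module):
    """Illustrative RDB: dense connections (CM), a 1x1 fusion conv (LFF), and a residual add (LRL)."""

    def __init__(self, channels: int = 32, growth: int = 32, num_layers: int = 4):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(channels + i * growth, growth, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
            )
            for i in range(num_layers)
        ])
        # Local feature fusion: fuse the block input with the outputs of all conv layers.
        self.lff = nn.Conv2d(channels + num_layers * growth, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        features = [x]  # contiguous memory: every layer sees the block input and all previous outputs
        for layer in self.layers:
            features.append(layer(torch.cat(features, dim=1)))
        fused = self.lff(torch.cat(features, dim=1))
        return x + fused  # local residual learning
```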
  • step S14 (fusing the second feature and the at least one feature to be fused to obtain a fourth feature) includes the following steps a to d:
  • Step a: sort the at least one feature to be fused in descending order according to the difference between the spatial scale of each feature to be fused and the spatial scale of the second feature, to obtain a sorting result.
  • the greater the difference between the spatial scale of a feature to be fused and the spatial scale of the second feature, the higher the position of that feature to be fused in the sorting result; the smaller the difference, the lower its position in the sorting result.
  • Step b: fuse the first feature to be fused in the sorting result with the second feature, to generate a fusion result of the first feature to be fused.
  • the following takes the case where the first feature to be fused in the sorting result is J_0 and the second feature is j_n2 as an example to illustrate the above step b.
  • the implementation of the above step b may include the following steps 1 to 4:
  • Step 1: sample the second feature j_n2 into a feature with the same spatial scale as the first feature to be fused J_0, and generate the first sampled feature corresponding to the first feature to be fused J_0.
  • the sampling in this step can be up-sampling or down-sampling, which is determined by the spatial scale of the first feature to be fused J_0 and the spatial scale of the second feature j_n2.
  • Step 2: calculate the difference between the first sampled feature corresponding to the first feature to be fused J_0 and the first feature to be fused J_0, and obtain the feature difference corresponding to the first feature to be fused J_0.
  • Step 3: sample the feature difference corresponding to the first feature to be fused J_0 into a feature with the same spatial scale as the second feature j_n2, and obtain the second sampled feature corresponding to the first feature to be fused J_0.
  • the sampling in this step can be up-sampling or down-sampling, which is determined by the spatial scale of the feature difference corresponding to the first feature to be fused J_0 and the spatial scale of the second feature j_n2.
  • Step 4: additively fuse the second feature j_n2 with the second sampled feature corresponding to the first feature to be fused J_0, to generate the fusion result J_0^n of the first feature to be fused.
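  • for illustration only, a minimal PyTorch-style sketch of steps 1 to 4 follows; it assumes the features share the same number of channels and uses bilinear interpolation as the sampling operation, whereas the actual sampling in the disclosure may be implemented differently (e.g., with learned up/down-sampling layers). The name `fuse_one` is an assumption:

```python
import torch
import torch.nn.functional as F

def fuse_one(second_feature: torch.Tensor, to_fuse: torch.Tensor) -> torch.Tensor:
    """Steps 1-4 sketch: sample, take the feature difference, sample back, additively fuse."""
    # Step 1: sample the second feature to the spatial scale of the feature to be fused.
    first_sampled = F.interpolate(second_feature, size=to_fuse.shape[-2:],
                                  mode="bilinear", align_corners=False)
    # Step 2: feature difference between the feature to be fused and the first sampled feature.
    feature_diff = to_fuse - first_sampled
    # Step 3: sample the feature difference back to the spatial scale of the second feature.
    second_sampled = F.interpolate(feature_diff, size=second_feature.shape[-2:],
                                   mode="bilinear", align_corners=False)
    # Step 4: additive fusion.
    return second_feature + second_sampled
```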
  • Step c: fuse the other features to be fused in the sorting result one by one with the fusion result of the previous feature to be fused, to generate a fusion result of the last feature to be fused in the sorting result.
  • the following describes how the mth feature to be fused (m is a positive integer greater than 1) in the sorting result is fused with the fusion result of the previous feature to be fused (the (m-1)th feature to be fused); the implementation includes the following steps I to IV:
  • Step I: sample the fusion result of the (m-1)th feature to be fused in the sorting result into a feature with the same spatial scale as the mth feature to be fused in the sorting result, and generate the first sampled feature corresponding to the mth feature to be fused.
  • Step II: calculate the difference between the mth feature to be fused and the first sampled feature corresponding to the mth feature to be fused, and obtain the feature difference corresponding to the mth feature to be fused.
  • Step III: sample the feature difference corresponding to the mth feature to be fused into a feature with the same spatial scale as the fusion result of the (m-1)th feature to be fused, and obtain the second sampled feature corresponding to the mth feature to be fused.
  • Step IV: additively fuse the fusion result of the (m-1)th feature to be fused with the second sampled feature corresponding to the mth feature to be fused, to generate the fusion result of the mth feature to be fused.
  • the only difference between obtaining the fusion result of the mth feature to be fused (steps I to IV) and obtaining the fusion result of the first feature to be fused (steps 1 to 4) lies in the inputs: when obtaining the fusion result of the first feature to be fused, the inputs are the second feature and the first feature to be fused, whereas when obtaining the fusion result of the mth feature to be fused, the inputs are the fusion result of the (m-1)th feature to be fused and the mth feature to be fused. The calculation itself is performed in the same way.
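  • the equations referred to in steps 2 and 4 above did not survive extraction; the following is a hedged reconstruction from the surrounding prose rather than the original notation, writing S(x -> y) for sampling a feature x to the spatial scale of a feature y:

```latex
e_0 = J_0 - S(j_{n2} \to J_0), \qquad J_0^{n} = j_{n2} + S(e_0 \to j_{n2})
e_m = J_m - S(J_{m-1}^{n} \to J_m), \qquad J_m^{n} = J_{m-1}^{n} + S(e_m \to J_{m-1}^{n}), \qquad m = 1, \ldots, t
```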
  • the following takes the case where the sorting result includes, in order, the feature to be fused J_0, the feature to be fused J_1, the feature to be fused J_2, ..., and the feature to be fused J_t as an example to explain the above step c.
  • in this example, between obtaining the fusion result J_0^n of the first feature to be fused and obtaining the fusion result J_t^n of the last feature to be fused J_t, the process further includes the following operations.
  • the feature difference corresponding to the second feature to be fused J_1 is sampled into a feature with the same spatial scale as the fusion result J_0^n of the first feature to be fused J_0, to obtain the second sampled feature corresponding to the second feature to be fused J_1, which is then additively fused with J_0^n to obtain the fusion result J_1^n of the second feature to be fused J_1.
  • the fusion result J_1^n of the second feature to be fused J_1 is sampled into a feature with the same spatial scale as the third feature to be fused J_2, to generate the first sampled feature corresponding to the third feature to be fused J_2.
  • the feature difference corresponding to the third feature to be fused J_2 is sampled into a feature with the same spatial scale as the fusion result J_1^n of the second feature to be fused J_1, to obtain the second sampled feature corresponding to the third feature to be fused J_2.
  • the fourth feature to be fused J_3, the fifth feature to be fused J_4, ..., and the (t+1)th feature to be fused J_t in the sorting result are fused one by one in the same way, and finally the fusion result J_t^n of the (t+1)th feature to be fused J_t is obtained.
  • Step d: take the fusion result of the last feature to be fused in the sorting result as the fourth feature.
  • for example, if the sorting result includes, in order, the feature to be fused J_0, the feature to be fused J_1, the feature to be fused J_2, ..., and the feature to be fused J_t, then the fusion result J_t^n of the last feature to be fused J_t is taken as the fourth feature.
  • combining the third feature and the fourth feature may include: concatenating the third feature and the fourth feature along the channel dimension.
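  • building on the `fuse_one` sketch above, the following is an illustrative (not authoritative) PyTorch-style sketch of steps a to d plus the final channel concatenation; sorting by the difference in feature-map height is an assumption about how the spatial-scale difference is measured:

```python
import torch

def fuse_and_merge(third_feature, second_feature, features_to_fuse):
    """Steps a-d plus the final merge; features_to_fuse holds features of other spatial scales."""
    # Step a: sort in descending order of the difference between each feature's spatial
    # scale and the spatial scale of the second feature (difference in height, as an assumption).
    ordered = sorted(features_to_fuse,
                     key=lambda f: abs(f.shape[-2] - second_feature.shape[-2]),
                     reverse=True)
    # Steps b and c: fuse the sorted features one by one into a running fusion result.
    result = second_feature
    for feature in ordered:
        result = fuse_one(result, feature)
    # Step d: the fusion result of the last feature to be fused is the fourth feature.
    fourth_feature = result
    # Merge: concatenate the third and fourth features along the channel dimension.
    return torch.cat([third_feature, fourth_feature], dim=1)
```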
  • according to the feature fusion method, after obtaining the target feature and at least one feature to be fused, the target feature is first divided into the first feature and the second feature; the first feature is then processed based on the RDB to obtain the third feature, and the second feature and the at least one feature to be fused are fused to obtain the fourth feature; finally, the third feature and the fourth feature are merged to generate the fusion result of the target feature and the at least one feature to be fused.
  • the feature fusion method provided by the embodiment of the present application divides the feature that needs enhanced fusion into two parts, the first feature and the second feature, processes one part (the first feature) based on the RDB, and fuses the other part (the second feature) with the features to be fused. Since the RDB can update features and generate redundant features, and since fusing the second feature with the features to be fused introduces effective information from features of other spatial scales, multi-scale feature fusion is realized while new features are still generated.
  • therefore, the feature fusion method provided in the embodiment of the present application can ensure the generation of new features and the diversity of features in the network architecture when realizing multi-scale feature fusion, and can thus solve the problem in the related art that the multi-scale feature fusion method limits the diversity of features in the network architecture.
  • in addition, the above embodiment divides the target feature into the first feature and the second feature, and only the second feature participates in the multi-spatial-scale feature fusion, so the above embodiment can also reduce the amount of features that need to be fused (the amount of features in the second feature is less than that in the target feature), thereby reducing the computation of feature fusion and improving the efficiency of feature fusion.
  • the embodiments of the present application further provide an image defogging method.
  • the image defogging method provided by the embodiment of the present application includes the following steps S41 to S43:
  • S41: process the target image through an encoding module to obtain encoding features.
  • the encoding module includes L cascaded encoders with different spatial scales; the mth encoder is used to fuse, through the feature fusion method described in any of the above embodiments, the image features of the encoding module at the mth encoder with the fusion results output by all encoders before the mth encoder, generate the fusion result of the mth encoder, and output the fusion result of the mth encoder to all encoders after the mth encoder.
  • L and m are both positive integers, and m ≤ L.
  • S42: process the encoding features through a feature restoration module composed of at least one residual dense block (RDB) to obtain restored features.
  • S43: process the restored features through the decoding module to obtain the dehazing effect image of the target image.
  • the decoding module includes L cascaded decoders with different spatial scales; the mth decoder is used to fuse, through the feature fusion method described in any of the above embodiments, the image features of the decoding module at the mth decoder with the fusion results output by all decoders before the mth decoder, generate the fusion result of the mth decoder, and output the fusion result of the mth decoder to all decoders after the mth decoder.
  • the encoding module, feature restoration module, and decoding module used to execute the embodiment shown in FIG. 4 above form a U-Net.
  • the U-Net is a special convolutional neural network.
  • the U-Net mainly includes an encoding module (also known as the contraction path), a feature restoration module, and a decoding module (also known as the expansion path).
  • the encoding module is mainly used to capture the context information in the original image, and the corresponding decoding module is used to accurately localize the parts that need to be segmented in the original image and then generate the processed image.
  • an improvement of the U-Net is that, in order to accurately locate the parts that need to be segmented in the original image, the features extracted by the encoding module are combined with new feature maps during the upsampling process, so as to preserve the important information in the features to the greatest extent, thereby reducing the number of training samples and the demand for computing resources.
  • the network model used to implement the embodiment shown in FIG. 4 includes: an encoding module 51 forming a U-shaped network, a feature restoration module 52 and a decoding module 53.
  • the encoding module 51 includes L cascaded encoders with different spatial scales, which are used to process the target image I and obtain the encoding features i_L.
  • the mth encoder is used to fuse, through the feature fusion method provided by the above embodiment, the image features of the encoding module at the mth encoder with the fusion results output by all encoders before the mth encoder, generate the fusion result of the mth encoder, and output the fusion result of the mth encoder to all encoders after the mth encoder.
  • the feature restoration module 52 includes at least one RDB and is used to receive the encoding features i_L output by the encoding module 51 and process the encoding features i_L through the at least one RDB to obtain the restored features j_L.
  • the decoding module 53 includes L cascaded decoders with different spatial scales; the mth decoder is used to fuse, through the feature fusion method provided by the above embodiment, the image features of the decoding module at the mth decoder with the fusion results output by all decoders before the mth decoder, generate the fusion result of the mth decoder, and output the fusion result of the mth decoder to all decoders after the mth decoder; the dehazing effect image J of the target image I is obtained according to the fusion result j_1 output by the last decoder.
  • the operation in which the mth encoder in the encoding module 51 fuses the image features of the encoding module at the mth encoder with the fusion results output by all encoders before the mth encoder (from the first encoder to the (m-1)th encoder) can be described as follows: the feature i_m of the encoding module at the mth encoder is divided into a first feature i_m1 and a second feature i_m2; the first feature i_m1 is processed by f(·), the RDB-based processing operation; the second feature i_m2 is fused with the fusion results output by the encoders before the mth encoder; and the two results are merged to form the fusion result of the mth encoder.
  • similarly, the operation in which the mth decoder in the decoding module 53 fuses the image features of the decoding module at the mth decoder with the fusion results output by all decoders before the mth decoder (from the Lth decoder to the (m+1)th decoder) can be described as follows: the feature j_m of the decoding module at the mth decoder is divided into a first feature j_m1 and a second feature j_m2; the first feature j_m1 is processed by f(·), the RDB-based processing operation; the second feature j_m2 is fused with the fusion results output by the decoders before the mth decoder; and the two results are merged to form the fusion result output by the mth decoder of the decoding module, where L is the total number of encoders in the encoding module.
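  • to make the data flow concrete, the following is a highly simplified, illustrative PyTorch-style skeleton of such a network (not the disclosed architecture); the encoder and decoder internals, channel counts and the cross-scale fusion wiring are placeholder assumptions, and `ResidualDenseBlock` refers to the sketch given earlier:

```python
import torch
import torch.nn as nn

class DehazeNetSkeleton(nn.Module):
    """Illustrative skeleton only: L encoders, an RDB-based restoration stage, and L decoders."""

    def __init__(self, L: int = 3, base_channels: int = 32, num_rdbs: int = 2):
        super().__init__()
        self.head = nn.Conv2d(3, base_channels, kernel_size=3, padding=1)
        # Encoders: a strided conv stands in for one encoder; each halves the spatial scale.
        self.encoders = nn.ModuleList([
            nn.Conv2d(base_channels, base_channels, kernel_size=3, stride=2, padding=1)
            for _ in range(L)
        ])
        # Feature restoration module: a stack of RDBs (see the ResidualDenseBlock sketch above).
        self.restore = nn.Sequential(*[ResidualDenseBlock(base_channels) for _ in range(num_rdbs)])
        # Decoders: a transposed conv stands in for one decoder; each doubles the spatial scale.
        self.decoders = nn.ModuleList([
            nn.ConvTranspose2d(base_channels, base_channels, kernel_size=4, stride=2, padding=1)
            for _ in range(L)
        ])
        self.tail = nn.Conv2d(base_channels, 3, kernel_size=3, padding=1)

    def forward(self, hazy: torch.Tensor) -> torch.Tensor:
        x = self.head(hazy)
        for encoder in self.encoders:
            # In the disclosed method, each encoder would also fuse the fusion results of
            # all preceding encoders via the feature fusion method; omitted in this sketch.
            x = encoder(x)
        x = self.restore(x)
        for decoder in self.decoders:
            # Likewise, each decoder would fuse the fusion results of preceding decoders.
            x = decoder(x)
        return self.tail(x)  # dehazed image J
```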
  • since the image defogging method provided in the embodiment of the present application performs feature fusion through the feature fusion method provided in the above embodiment, it can ensure the generation of new features and the diversity of features in the network architecture when realizing multi-scale feature fusion, and can therefore improve the performance of image defogging.
  • the embodiment of the present application also provides a feature fusion device.
  • the device embodiment corresponds to the aforementioned method embodiment. For ease of reading, this device embodiment does not repeat the details of the aforementioned method embodiment one by one, but it should be clear that the feature fusion device in this embodiment can correspondingly implement all the content of the foregoing method embodiment.
  • FIG. 6 is a schematic structural diagram of the feature fusion device. As shown in FIG. 6, the feature fusion device 600 includes:
  • the obtaining unit 61 is configured to obtain a target feature and at least one feature to be fused, and the target feature and the at least one feature to be fused are respectively features of different spatial scales of the same image.
  • a division unit 62 configured to divide the target feature into a first feature and a second feature.
  • the first processing unit 63 is configured to process the first feature based on the residual densely connected network to obtain a third feature.
  • the second processing unit 64 is configured to fuse the second feature and the at least one feature to be fused to obtain a fourth feature.
  • the combining unit 65 is configured to combine the third feature and the fourth feature to generate a fusion result of the target feature and at least one feature to be fused.
  • the second processing unit 64 is specifically configured to: sort the at least one feature to be fused in descending order according to the difference between the spatial scale of each feature to be fused and the spatial scale of the second feature, to obtain a sorting result; fuse the first feature to be fused in the sorting result with the second feature, to generate a fusion result of the first feature to be fused; fuse the other features to be fused in the sorting result one by one with the fusion result of the previous feature to be fused, to generate a fusion result of the last feature to be fused in the sorting result; and take the fusion result of the last feature to be fused in the sorting result as the fourth feature.
  • the second processing unit 64 is specifically configured to: sample the second feature into a feature with the same spatial scale as the first feature to be fused, to generate the first sampled feature corresponding to the first feature to be fused; calculate the difference between the first sampled feature corresponding to the first feature to be fused and the first feature to be fused, to obtain the feature difference corresponding to the first feature to be fused; sample the feature difference corresponding to the first feature to be fused into a feature with the same spatial scale as the second feature, to obtain the second sampled feature corresponding to the first feature to be fused; and additively fuse the second feature with the second sampled feature corresponding to the first feature to be fused, to generate the fusion result of the first feature to be fused.
  • the second processing unit 64 is specifically configured to: sample the fusion result of the (m-1)th feature to be fused in the sorting result into a feature with the same spatial scale as the mth feature to be fused in the sorting result, to generate the first sampled feature corresponding to the mth feature to be fused, where m is a positive integer greater than 1; calculate the difference between the mth feature to be fused and the first sampled feature corresponding to the mth feature to be fused, to obtain the feature difference corresponding to the mth feature to be fused; sample the feature difference corresponding to the mth feature to be fused into a feature with the same spatial scale as the fusion result of the (m-1)th feature to be fused, to obtain the second sampled feature corresponding to the mth feature to be fused; and additively fuse the fusion result of the (m-1)th feature to be fused with the second sampled feature corresponding to the mth feature to be fused, to generate the fusion result of the mth feature to be fused.
  • the dividing unit 62 is specifically configured to divide the target feature into a first feature and a second feature based on a feature channel of the target feature.
  • the feature fusion device provided in this embodiment can execute the feature fusion method provided in the above method embodiment, and its implementation principle and technical effect are similar, and will not be repeated here.
  • FIG. 7 is a schematic structural diagram of the image defogging device. As shown in FIG. 7 , the image defogging device 700 includes:
  • the feature extraction unit 71 is used to process the target image through the encoding module to obtain encoding features; wherein the encoding module includes L cascaded encoders with different spatial scales, and the mth encoder is used to fuse, through the feature fusion method described in any of the above embodiments, the image features of the encoding module at the mth encoder with the fusion results output by all encoders before the mth encoder, generate the fusion result of the mth encoder, and output the fusion result of the mth encoder to all encoders after the mth encoder; L and m are positive integers, and m ≤ L.
  • the feature processing unit 72 is configured to process the encoding features through a feature restoration module composed of at least one residual dense block (RDB) to obtain restored features.
  • the image generation unit 73 is configured to process the restored features through a decoding module to obtain a defogging effect image of the target image; wherein the decoding module includes L cascaded decoders with different spatial scales, and the mth decoder is used to fuse, through the feature fusion method described in any of the above embodiments, the image features of the decoding module at the mth decoder with the fusion results output by all decoders before the mth decoder, generate the fusion result of the mth decoder, and output the fusion result of the mth decoder to all decoders after the mth decoder.
  • the image defogging device provided in this embodiment can implement the image defogging method provided in the above method embodiment, and its implementation principle and technical effect are similar, and will not be repeated here.
  • FIG. 8 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • the electronic device provided by this embodiment includes a memory 81 and a processor 82; the memory 81 is used to store a computer program, and the processor 82 is configured to execute the feature fusion method or the image defogging method provided in the above embodiments when calling the computer program.
  • an embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the feature fusion method or the image defogging method provided in the above embodiments is implemented.
  • an embodiment of the present application also provides a computer program product which, when run on a computer, enables the computer to implement the feature fusion method or the image defogging method provided in the foregoing embodiments.
  • the embodiments of the present application may be provided as methods, systems, or computer program products. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media having computer-usable program code embodied therein.
  • the processor can be a central processing unit (Central Processing Unit, CPU), or another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
  • Memory may include non-permanent storage in computer readable media, in the form of random access memory (RAM) and/or nonvolatile memory such as read only memory (ROM) or flash RAM.
  • Computer-readable media includes both volatile and non-volatile, removable and non-removable storage media.
  • the storage medium may store information by any method or technology, and the information may be computer-readable instructions, data structures, program modules, or other data.
  • Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Flash memory or other memory technology, Compact Disc Read-Only Memory (CD-ROM), Digital Versatile Disc (DVD) or other optical storage, A magnetic tape cartridge, disk storage or other magnetic storage device or any other non-transmission medium that can be used to store information that can be accessed by a computing device.
  • computer readable media excludes transitory computer readable media, such as modulated data signals and carrier waves.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Image Processing (AREA)

Abstract

The present application relates to a feature fusion method, an image dehazing method and a device. The method comprises the steps of: acquiring a target feature and at least one feature to be fused, the target feature and the at least one feature to be fused being features of the same image at different spatial scales; dividing the target feature into a first feature and a second feature; processing the first feature on the basis of a residual dense block (RDB) to obtain a third feature; fusing the second feature and the at least one feature to be fused to obtain a fourth feature; and merging the third feature and the fourth feature to generate a fusion result of the target feature and the at least one feature to be fused.
PCT/CN2022/121209 2021-09-27 2022-09-26 Feature fusion method, image dehazing method and device WO2023046136A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111138532.8A CN115880192A (zh) 2021-09-27 2021-09-27 一种特征融合方法、图像去雾方法及装置
CN202111138532.8 2021-09-27

Publications (1)

Publication Number Publication Date
WO2023046136A1 true WO2023046136A1 (fr) 2023-03-30

Family

ID=85720113

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/121209 WO2023046136A1 (fr) 2021-09-27 2022-09-26 Feature fusion method, image dehazing method and device

Country Status (2)

Country Link
CN (1) CN115880192A (fr)
WO (1) WO2023046136A1 (fr)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090292468A1 (en) * 2008-03-25 2009-11-26 Shunguang Wu Collision avoidance method and system using stereo vision and radar sensor fusion
CN110544213A (zh) * 2019-08-06 2019-12-06 天津大学 一种基于全局和局部特征融合的图像去雾方法
CN111539886A (zh) * 2020-04-21 2020-08-14 西安交通大学 一种基于多尺度特征融合的去雾方法
CN111968064A (zh) * 2020-10-22 2020-11-20 成都睿沿科技有限公司 一种图像处理方法、装置、电子设备及存储介质
CN112232132A (zh) * 2020-09-18 2021-01-15 北京理工大学 一种融合导航信息的目标识别定位方法
CN112801047A (zh) * 2021-03-19 2021-05-14 腾讯科技(深圳)有限公司 缺陷检测方法、装置、电子设备及可读存储介质
CN112884682A (zh) * 2021-01-08 2021-06-01 福州大学 一种基于匹配与融合的立体图像颜色校正方法及系统


Also Published As

Publication number Publication date
CN115880192A (zh) 2023-03-31

Similar Documents

Publication Publication Date Title
US20200334819A1 (en) Image segmentation apparatus, method and relevant computing device
US11604272B2 (en) Methods and systems for object detection
CN109816659B (zh) 图像分割方法、装置及系统
CN110544214A (zh) 一种图像修复方法、装置及电子设备
CN113505792A (zh) 面向非均衡遥感图像的多尺度语义分割方法及模型
WO2021143207A1 (fr) Procédé et appareil de traitement d'image, dispositif de traitement de calcul et support
CN109766918B (zh) 基于多层次上下文信息融合的显著性物体检测方法
CN111754404A (zh) 基于多尺度机制和注意力机制的遥感图像时空融合方法
CN111932480A (zh) 去模糊视频恢复方法、装置、终端设备以及存储介质
CN112861795A (zh) 基于多尺度特征融合的遥感图像显著目标检测方法及装置
CN112381716A (zh) 一种基于生成式对抗网络的图像增强方法
CN113962861A (zh) 图像重建方法、装置、电子设备和计算机可读介质
CN113705575B (zh) 一种图像分割方法、装置、设备及存储介质
Makarov et al. Sparse depth map interpolation using deep convolutional neural networks
Park et al. Pyramid attention upsampling module for object detection
Wang et al. ARFP: A novel adaptive recursive feature pyramid for object detection in aerial images
WO2023046136A1 (fr) Procédé de fusion de caractéristiques, procédé et dispositif de désembuage d'images
Chen et al. Multi‐feature fusion attention network for single image super‐resolution
Ge et al. Acsnet: adaptive cross-scale network with feature maps refusion for vehicle density detection
WO2023125522A1 (fr) Procédé et appareil de traitement d'image
Liu et al. Single‐image super‐resolution using lightweight transformer‐convolutional neural network hybrid model
CN115358962B (zh) 一种端到端视觉里程计方法及装置
CN115909088A (zh) 基于超分辨特征聚合的光学遥感图像目标检测方法
CN113255675B (zh) 基于扩张卷积和残差路径的图像语义分割网络结构及方法
Shen et al. Itsrn++: Stronger and better implicit transformer network for continuous screen content image super-resolution

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22872180

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE