WO2024131707A1 - Hair enhancement method, neural network, electronic device and storage medium - Google Patents

Hair enhancement method, neural network, electronic device and storage medium

Info

Publication number
WO2024131707A1
WO2024131707A1 (PCT/CN2023/139420)
Authority
WO
WIPO (PCT)
Prior art keywords
residual
features
feature
module
image
Prior art date
Application number
PCT/CN2023/139420
Other languages
English (en)
Chinese (zh)
Inventor
张航
许合欢
王进
Original Assignee
虹软科技股份有限公司
Priority date
Filing date
Publication date
Application filed by 虹软科技股份有限公司
Publication of WO2024131707A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00: Image enhancement or restoration
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; blind source separation
    • G06V 10/80: Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V 10/806: Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/20: Special algorithmic details
    • G06T 2207/20081: Training; Learning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/20: Special algorithmic details
    • G06T 2207/20084: Artificial neural networks [ANN]

Definitions

  • the present application relates to but is not limited to the field of image processing technology, and in particular to a hair enhancement method, a neural network, an electronic device and a storage medium.
  • the present application provides a hair enhancement method, the method comprising:
  • the residual calculation and feature fusion are performed on the image features of the original image through a plurality of sequentially connected residual modules to obtain residual module fusion features, wherein, in two adjacent residual modules, the input features of the latter residual module are the output features of the former residual module;
  • Feature reconstruction is performed based on the residual module fusion features to obtain an enhanced image corresponding to the original image.
  • the plurality of sequentially connected residual modules include a first residual module and a second residual module that are sequentially connected; performing residual calculation and feature fusion on the image features of the original image through the plurality of sequentially connected residual modules to obtain the residual module fusion features includes:
  • the first feature and the second feature are fused to obtain the residual module fusion feature.
  • the plurality of sequentially connected residual modules include a first residual module, a second residual module, a third residual module, and a fourth residual module that are sequentially connected; performing residual calculation and feature fusion on the image features of the original image through the plurality of sequentially connected residual modules to obtain the residual module fusion features includes:
  • the first feature, the second feature, the third feature and the fourth feature are fused to obtain the residual module fusion feature.
  • each of the residual modules includes an initial layer and a plurality of sequentially connected residual layers
  • the method for obtaining the output features of the residual module includes:
  • for two adjacent residual layers, a convolution calculation is performed by the subsequent residual layer on the final output feature of the previous residual layer to obtain a convolution output feature, and the convolution output feature is added to the final output feature of the previous residual layer to serve as the final output feature of the subsequent residual layer;
  • the final output feature of each residual layer and the final output feature of the initial layer of the residual module are concatenated to obtain a residual layer concatenated feature
  • acquiring the image features of the original image includes:
  • the initial features are downsampled to obtain image features of the original image.
  • downsampling the initial features to obtain the image features of the original image includes:
  • the initial features are downsampled step by step based on a plurality of sequentially connected downsampling modules to obtain image features of the original image, wherein, in two adjacent downsampling modules, the input features of the latter downsampling module are the output features of the former downsampling module.
  • downsampling the initial features is achieved by wavelet transform.
  • the step of reconstructing features based on the residual module fusion features to obtain an enhanced image corresponding to the original image includes:
  • the enhanced image is obtained by performing multiple upsampling and feature fusion calculations on the fusion features of the residual module based on multiple upsampling modules connected in sequence; wherein the number of the upsampling modules corresponds one to one to the number of the downsampling modules, and in two adjacent upsampling modules, the input features of the subsequent upsampling module are jointly determined according to the output features of the previous upsampling module and the output features of the target downsampling module, and the target downsampling module refers to the downsampling module corresponding to the subsequent upsampling module.
  • the wavelet transform comprises:
  • the initial features of the original image are sampled at intervals in rows and columns according to a preset step size to obtain sampling results;
  • a plurality of different frequency band information of the initial feature is calculated according to the sampling result as the image feature of the original image.
  • the hair enhancement method is implemented based on a neural network, and a method for obtaining sample image pairs for training the neural network includes:
  • the first sample image and the second sample image are regarded as a sample image pair.
  • an embodiment of the present application provides a neural network, including an acquisition module, a plurality of sequentially connected residual modules and a reconstruction module;
  • the acquisition module is configured to acquire image features of the original image
  • the plurality of residual modules are configured to sequentially perform residual calculation and feature fusion on the image features of the original image to obtain residual module fusion features, wherein, in two adjacent residual modules, the input features of the latter residual module are the output features of the former residual module;
  • the reconstruction module is configured to perform feature reconstruction based on the fusion features of the residual module to obtain an enhanced image corresponding to the original image.
  • an embodiment of the present application provides an electronic device, comprising a processor and a memory, wherein the memory is configured to store executable instructions of the processor; and the processor is configured to execute the hair enhancement method as described in any one of the first aspects above by executing the executable instructions.
  • an embodiment of the present application provides a computer-readable storage medium, which stores one or more programs, and the one or more programs can be executed by one or more processors to implement the hair enhancement method as described in any of the first aspects above.
  • FIG1 is a flow chart of a hair enhancement method according to an embodiment of the present application.
  • FIG2 is a flow chart of a method for generating a residual module fusion feature according to an embodiment of the present application
  • FIG3 is a schematic diagram of the structure of multiple residual modules according to an embodiment of the present application.
  • FIG4 is a flow chart of a method for calculating output features of a residual module according to an embodiment of the present application
  • FIG5 is a schematic diagram of the internal structure of a residual module according to an embodiment of the present application.
  • FIG6 is a flow chart of wavelet transform according to an embodiment of the present application.
  • FIG7 is a schematic diagram of the effect of wavelet transformation according to an embodiment of the present application.
  • FIG8 is a flow chart of a method for acquiring a sample image pair according to an embodiment of the present application.
  • FIG9 is a schematic diagram of the structure of a neural network according to an embodiment of the present application.
  • FIG10 is a schematic diagram showing a comparison between an original image and an enhanced image according to an embodiment of the present application.
  • FIG11 is a structural block diagram of a neural network according to an embodiment of the present application.
  • the term “connection” is not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect.
  • the “multiple” involved in this application refers to two or more.
  • “And/or” describes the association relationship of associated objects, indicating that there can be three relationships. For example, “A and/or B” can mean: A exists alone, A and B exist at the same time, and B exists alone. Usually, the character “/” indicates that the objects associated with each other are in an “or” relationship.
  • the terms “first”, “second”, “third”, etc. involved in this application are only used to distinguish similar objects and do not represent a specific ordering of the objects.
  • the specification may have presented the method and/or process as a specific sequence of steps. However, to the extent that the method or process does not rely on the specific order of the steps described herein, the method or process should not be limited to the steps of the specific order described. As will be understood by those of ordinary skill in the art, other sequences of steps are also possible. Therefore, the specific sequence of the steps set forth in the specification should not be interpreted as a limitation to the claims. In addition, the claims for the method and/or process should not be limited to the steps of performing them in the order written, and those skilled in the art can easily understand that these sequences can be changed and still remain within the spirit and scope of the embodiments of the present application.
  • the present application provides a hair enhancement method, as shown in FIG1 , which comprises the following steps:
  • Step S101 obtaining image features of an original image.
  • the image features of the original image in this embodiment are image features related to hair.
  • the hair in this embodiment includes pet hair and/or human hair.
  • the original image can be any type of image. If the original image contains hair, the method in this embodiment can enhance the detailed texture of the hair to obtain a clearer image.
  • the process of acquiring image features can be achieved by a trained neural network through convolution calculation.
  • Step S102 performing residual calculation and feature fusion on the image features of the original image through a plurality of sequentially connected residual modules to obtain residual module fusion features, wherein, in two adjacent residual modules, the input features of the latter residual module are the output features of the former residual module.
  • the “connected in sequence” in this step indicates the data transmission relationship between the residual modules, for example, multiple residual modules can be cascaded.
  • the neural network used for hair enhancement includes multiple residual modules, which perform residual calculations on input features in turn to obtain multiple residual features, and then perform feature fusion on the multiple residual features through convolution calculation to obtain residual module fusion features.
  • the output features of the first residual module are the input features of the second residual module
  • the output features of the second residual module are the input features of the third residual module
  • the input of the first residual module is the image feature of the original image.
  • the number of residual modules is not limited, so the number of residual modules can be 2, 3, 4, 5, or even more.
  • the receptive field of the neural network can be deepened, and features at different scales can be better extracted, which is conducive to restoring complex hair textures.
  • the scale of the convolutional layer used for feature fusion can be 1×1 to increase the correlation between features of different depths.
  • Step S103 reconstructing features based on the residual module fusion features to obtain an enhanced image corresponding to the original image.
  • feature reconstruction can be achieved through convolution calculation.
  • the image features of the original image are calculated based on multiple residual modules, so as to obtain the residual module fusion features including details such as direction and texture at different scales in the original image.
  • the enhanced image obtained by feature reconstruction based on the residual module fusion features has higher resolution and richer details than the original image, which can improve the processing effect of the hair texture details in the pet hair or portrait, and enhance the texture details in the image.
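  • By way of illustration only, the following Python/PyTorch sketch shows how the three steps above (feature acquisition, residual calculation and fusion, feature reconstruction) might be driven end to end at inference time. The class name HairEnhanceNet, the weight file, the image paths and the normalization are assumptions made for this sketch (a possible network definition is sketched after the architecture description below), not part of the application.

```python
import cv2
import numpy as np
import torch

# Hypothetical trained network implementing steps S101-S103; see the
# HairEnhanceNet sketch later in this document. The weight file name is assumed.
model = HairEnhanceNet()
model.load_state_dict(torch.load("hair_enhance.pth", map_location="cpu"))
model.eval()

img = cv2.imread("pet.jpg")  # original (possibly low-quality) image, H and W divisible by 8
x = torch.from_numpy(img).permute(2, 0, 1).float().unsqueeze(0) / 255.0

with torch.no_grad():
    y = model(x)  # enhanced image tensor

out = (y.squeeze(0).permute(1, 2, 0).clamp(0, 1).numpy() * 255).astype(np.uint8)
cv2.imwrite("pet_enhanced.jpg", out)
```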
  • FIG2 is a flow chart of a method for generating residual module fusion features according to an embodiment of the present application. As shown in FIG2, the method may include the following steps:
  • Step S201 performing convolution fusion on the image features based on the first residual module to obtain a first feature
  • Step S202 performing convolution fusion on the first feature based on the second residual module to obtain a second feature
  • Step S203 fusing the first feature and the second feature to obtain a residual module fusion feature.
  • a method for processing image features using multiple residual modules is provided.
  • the first feature output by the first residual module is used as the input of the second residual module, and the receptive field of the extracted features can be increased step by step. Finally, all outputs are fused to obtain features under different receptive fields, thereby enhancing the restoration effect of the original image.
  • the fusion of the first feature and the second feature can be achieved through a 1×1 convolution layer to enhance the correlation between features of different receptive fields.
  • the neural network may also include a third residual module and a fourth residual module. As shown in FIG3 , the neural network includes four residual modules (Multi-Scale Res-Block, referred to as MSRB).
  • the image features are convolutionally fused based on the first residual module to obtain the first feature; the first feature is convolutionally fused based on the second residual module to obtain the second feature, the second feature is convolutionally fused based on the third residual module to obtain the third feature, and the third feature is convolutionally fused based on the fourth residual module to obtain the fourth feature.
  • the first feature, the second feature, the third feature and the fourth feature are convolutionally fused through the convolution layer of the fusion module to obtain the residual module fusion feature.
  • the fusion module in this embodiment is a 1×1 convolution layer, which is used to change the number of output channels and increase the correlation between each feature at different receptive field depths.
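  • As an illustration of the cascade-and-fuse pattern described above, the following PyTorch sketch chains several residual modules and fuses their outputs with a 1×1 convolution. The channel count, the number of modules and the block_factory argument are assumptions; the internal structure of one residual module (MSRB) is sketched after the residual-layer description below.

```python
import torch
import torch.nn as nn

class MSRBCascade(nn.Module):
    """Cascade of residual modules whose outputs are concatenated and fused."""
    def __init__(self, block_factory, channels=64, num_blocks=4):
        super().__init__()
        # block_factory builds one residual module (e.g. the MSRB sketched later).
        self.blocks = nn.ModuleList([block_factory(channels) for _ in range(num_blocks)])
        # 1x1 convolution fuses features from different receptive-field depths
        # and restores the channel count after concatenation.
        self.fuse = nn.Conv2d(channels * num_blocks, channels, kernel_size=1)

    def forward(self, x):
        outputs = []
        feat = x
        for block in self.blocks:
            feat = block(feat)          # each module's input is the previous module's output
            outputs.append(feat)
        return self.fuse(torch.cat(outputs, dim=1))
```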
  • the residual module may include an initial layer and a plurality of sequentially connected residual layers, and the output features of the residual module are calculated through the plurality of residual layers.
  • FIG4 is a flow chart of a method for calculating the output features of the residual module according to an embodiment of the present application. The first feature, the second feature, the third feature, and the fourth feature in the above embodiment, as output features of the residual modules, can each be obtained by this method, which comprises the following steps:
  • Step S401 for two adjacent residual layers, convolution calculation is performed on the final output feature of the previous residual layer by the subsequent residual layer to obtain a convolution output feature, and the convolution output feature is added to the final output feature of the previous residual layer as the final output feature of the subsequent residual layer;
  • Step S402 when there are multiple residual layers, concatenate the final output feature of each residual layer with the final output feature of the initial layer of the residual module to obtain a residual layer concatenation feature;
  • Step S403 determining the output features of the residual module according to the input features of the residual module and the residual layer concatenation features.
  • the residual layer concatenation features can first be convolved through a convolution layer to reduce the number of channels, and then added to the input features of the residual module to obtain the output features of the residual module.
  • the first layer of the residual module serves as the initial layer; it can be an ordinary convolution layer that operates on the input features of the residual module and directly produces the final output features of the initial layer.
  • from the second layer onwards, the layers of the residual module are residual layers, and the second layer (i.e., the first residual layer) adds its own convolution output features to the final output features of the initial layer to obtain its own final output features.
  • FIG5 is a schematic diagram of the internal structure of a residual module according to an embodiment of the present application.
  • the residual module is composed of convolution layers used for residual calculation and a convolution layer used for concatenation and fusion.
  • the present embodiment includes 4 residual structures for residual calculation. After the residual calculation, the output features of each residual layer are concatenated (concat), and then a 1×1 convolution layer is connected to reduce the number of channels, which reduces the amount of calculation of the neural network.
  • the input feature S of the residual module is convolved through the initial layer of the residual module to obtain S01
  • S01 is the final output feature of the initial layer
  • S01 passes through a convolution layer to obtain the convolution output feature S01'
  • S01' and S01 are added to form the first residual structure, and the final output feature S02 of the first residual layer is obtained
  • S02 passes through a convolution layer to obtain the convolution output feature S02'
  • S02' and S02 are added to form the second residual structure, and the final output feature S03 of the second residual layer is obtained
  • S03 passes through a convolution layer to obtain the convolution output feature S03'.
  • S03' and S03 are added to form the third residual structure to obtain the final output feature S04 of the third residual layer.
  • S01, S02, S03, and S04 are concatenated (concat) in the channel dimension to obtain the residual layer concatenation feature.
  • the residual layer concatenation feature is convolved and fused by a 1×1 convolution layer to increase the correlation between features at different receptive field depths and to reduce the number of channels, obtaining S'.
  • S' and S are added to form a residual structure again to obtain the output feature of the residual module.
  • the “⊕” in Figure 5 represents addition.
  • the addition process may be elementwise add to achieve element-by-element addition, thereby retaining more information in the original image and ensuring that the texture details of the enhanced image are consistent with the direction information of the hair in the original image.
  • the receptive field is gradually increased through multiple residual layers, and multi-scale features under different receptive fields are obtained, which is conducive to restoring the hair texture.
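  • The internal structure walked through above (initial layer output S01, residual layers producing S02 to S04, concatenation, 1×1 fusion, and the outer residual addition) can be sketched as follows. Kernel sizes follow the text (3×3 residual convolutions, 1×1 fusion); the channel count and the number of residual layers are assumptions.

```python
import torch
import torch.nn as nn

class MSRB(nn.Module):
    """Minimal sketch of the multi-scale residual module described with Figure 5."""
    def __init__(self, channels=64, num_res_layers=3):
        super().__init__()
        self.initial = nn.Conv2d(channels, channels, 3, padding=1)      # initial layer -> S01
        self.res_layers = nn.ModuleList(
            [nn.Conv2d(channels, channels, 3, padding=1) for _ in range(num_res_layers)]
        )
        # 1x1 fusion reduces the concatenated channels back to `channels`.
        self.fuse = nn.Conv2d(channels * (num_res_layers + 1), channels, kernel_size=1)

    def forward(self, s):
        outs = [self.initial(s)]                   # S01
        for conv in self.res_layers:
            prev = outs[-1]
            outs.append(conv(prev) + prev)         # S0k' + S0k -> final output of this layer
        fused = self.fuse(torch.cat(outs, dim=1))  # concat S01..S04, 1x1 conv -> S'
        return fused + s                           # outer residual structure: S' + S
```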
  • the image features of the original image are obtained by first obtaining the initial features of the original image, such as the underlying features at the pixel level, and then downsampling the initial features to obtain the image features of the original image for residual calculation.
  • when the image features of the original image are obtained based on the initial features, this can be implemented by step-by-step downsampling, which may include: downsampling the initial features step by step based on multiple sequentially connected downsampling modules to obtain image features at different scales.
  • the input features of the later downsampling module are the output features of the previous downsampling module.
  • multiple step-by-step decomposition and downsampling of the initial features can be achieved through wavelet transform (WT).
  • wavelet transform can save calculation amount without losing various feature information of the original image. It can not only efficiently obtain the high and low frequency information after decomposition, but also restore it through inverse transformation without losing details, and the calculation amount is very small, which is very conducive to deployment on mobile terminals. Therefore, for texture features such as hair, the use of wavelet transform can better retain details and reduce losses.
  • the wavelet transform may be a discrete wavelet transform (DWT).
  • the input features after the wavelet transform can be convolved to reduce the number of channels, thereby finally obtaining the output features of the downsampling module.
  • the step-by-step decomposition and feature extraction of the initial features include a total of 3 downsampling modules, and each downsampling module includes DWT decomposition and convolution calculation.
  • the first layer of decomposition and convolution structure performs DWT decomposition on the initial feature x0, and then convolves the decomposed features to reduce the number of channels, and then enhances the nonlinearity through the ReLU operation to obtain x1.
  • the second layer of decomposition and convolution structure performs DWT decomposition on x1, and also performs convolution and ReLU operations on the decomposed features to obtain the output feature x2.
  • the third layer of decomposition and convolution operation performs DWT decomposition on the feature x2, and then performs convolution operation on the decomposed features to obtain the output feature x3, which can be used as the input feature S of the residual module.
  • the size of the convolution layer can be 3×3 to reduce the amount of calculation, and the number of convolution layers is not limited.
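  • A possible sketch of one decomposition-and-convolution downsampling stage is given below. The wavelet transform is injected as a callable (for example the haar_dwt helper sketched after the wavelet-transform description below); the channel widths in the commented usage are assumptions.

```python
import torch.nn as nn

class DownStage(nn.Module):
    """One downsampling module: wavelet decomposition, then a 3x3 convolution
    that reduces the channel count (the decomposition multiplies channels by 4),
    optionally followed by ReLU as described above."""
    def __init__(self, dwt, in_ch, out_ch, relu=True):
        super().__init__()
        self.dwt = dwt
        self.conv = nn.Conv2d(in_ch * 4, out_ch, kernel_size=3, padding=1)
        self.act = nn.ReLU(inplace=True) if relu else nn.Identity()

    def forward(self, x):
        return self.act(self.conv(self.dwt(x)))

# Illustrative three-stage encoder x0 -> x1 -> x2 -> x3 (widths assumed):
# down1 = DownStage(haar_dwt, 16, 32)
# down2 = DownStage(haar_dwt, 32, 64)
# down3 = DownStage(haar_dwt, 64, 64, relu=False)  # the text mentions only a convolution here
```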
  • FIG6 is a flow chart of wavelet transform according to an embodiment of the present application. As shown in FIG6 , the method includes:
  • Step S601 performing interval sampling on the initial features of the original image in rows and columns according to a preset step size to obtain sampling results.
  • the preset step size can be set according to the requirements.
  • p represents a pixel of the initial feature;
  • p01 represents the pixels obtained by sampling with a step of two pixels starting from 0 in the column direction of the image, i.e. half of the pixels of the sampling result;
  • p02 represents the pixels obtained by sampling with a step of two pixels starting from 1 in the column direction of the image, i.e. the other half of the sampling result;
  • p1 to p4 represent the four pixels of a 2×2 block;
  • p1 is the pixel obtained by sampling p01 with a step of two pixels starting from 0 in the row direction of the image;
  • p2 is the pixel obtained by sampling p02 with a step of two pixels starting from 0 in the row direction of the image;
  • p3 is the pixel obtained by sampling p01 with a step of two pixels starting from 1 in the row direction of the image;
  • p4 is the pixel obtained by sampling p02 with a step of two pixels starting from 1 in the row direction of the image. Proceeding in this way completes the entire sampling process and yields the sampling result.
  • Step S602 Calculate a plurality of different frequency band information of the initial feature according to the sampling result as the image feature of the original image.
  • LL represents low-frequency information, HL represents high-frequency information in the vertical direction, LH represents high-frequency information in the horizontal direction, and HH represents high-frequency information in the diagonal direction. Since low frequency reflects the image overview and high frequency reflects the image details, the image features can be better preserved through wavelet transform.
  • the image on the left is the original input image
  • the image on the right is a schematic diagram after one wavelet decomposition. After the wavelet transform, four different frequency bands are obtained, and the horizontal and vertical coordinates in the image on the right represent the image size after the wavelet transform.
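  • The interval sampling described above corresponds to a single-level 2-D Haar decomposition. The sketch below uses the standard orthonormal Haar band combinations; the application specifies the 2×2 sampling pattern but not the exact band formulas, so the signs and normalization are assumptions. The inverse transform is included because the text notes that the decomposition can be restored without loss.

```python
import torch

def haar_dwt(x):
    """2-D Haar DWT by interval sampling with step 2.
    Input (N, C, H, W) -> output (N, 4*C, H/2, W/2) with bands LL, HL, LH, HH."""
    p1 = x[:, :, 0::2, 0::2]   # even rows, even columns
    p2 = x[:, :, 1::2, 0::2]   # odd rows,  even columns
    p3 = x[:, :, 0::2, 1::2]   # even rows, odd columns
    p4 = x[:, :, 1::2, 1::2]   # odd rows,  odd columns
    ll = (p1 + p2 + p3 + p4) / 2    # low-frequency overview (LL)
    hl = (-p1 - p2 + p3 + p4) / 2   # first detail band (HL)
    lh = (-p1 + p2 - p3 + p4) / 2   # second detail band (LH)
    hh = (p1 - p2 - p3 + p4) / 2    # diagonal detail band (HH)
    return torch.cat([ll, hl, lh, hh], dim=1)

def haar_iwt(x):
    """Inverse of haar_dwt: restores the original resolution without loss."""
    c = x.shape[1] // 4
    ll, hl, lh, hh = x[:, :c], x[:, c:2 * c], x[:, 2 * c:3 * c], x[:, 3 * c:]
    p1 = (ll - hl - lh + hh) / 2
    p2 = (ll - hl + lh - hh) / 2
    p3 = (ll + hl - lh - hh) / 2
    p4 = (ll + hl + lh + hh) / 2
    out = x.new_zeros(x.shape[0], c, x.shape[2] * 2, x.shape[3] * 2)
    out[:, :, 0::2, 0::2] = p1
    out[:, :, 1::2, 0::2] = p2
    out[:, :, 0::2, 1::2] = p3
    out[:, :, 1::2, 1::2] = p4
    return out
```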
  • the process of reconstructing the enhanced image by fusion features of the residual module is to perform multiple upsampling and feature fusion calculations on the fusion features of the residual module based on multiple sequentially connected upsampling modules to obtain an enhanced image, wherein the number of upsampling modules corresponds to the number of downsampling modules one by one, and in two adjacent upsampling modules, the input features of the subsequent upsampling module are determined based on the output features of the previous upsampling module and the output features of the target downsampling module, and the target downsampling module refers to the downsampling module corresponding to the subsequent upsampling module.
  • the input features of the subsequent upsampling module are obtained by elementwise add element by element.
  • the downsampling is a wavelet transform
  • the upsampling corresponds to an inverse wavelet transform (Inverse Wavelet Transform, referred to as IWT) to reduce the loss of details in the original image.
  • the obtained LL, HL, LH, and HH components are first concatenated in the channel dimension and then restored.
  • Rlt is the result of the final inverse wavelet transform.
  • feature reconstruction and step-by-step synthetic upsampling include a total of three upsampling modules, each of which includes a convolution layer and an IWT layer.
  • the residual module fusion feature is regarded as the input y3 of the first upsampling module.
  • the first upsampling module first convolves y3 output by the bottom multi-scale residual module to increase the number of channels, followed by a ReLU operation to enhance nonlinearity, and then uses IWT to obtain the feature y3', which is added to x2 obtained in the downsampling process to obtain the input feature y2 of the second upsampling module.
  • the second upsampling module reconstructs the feature of y2 through convolution, ReLU operation and IWT to obtain y2', which is added to x1 to obtain the input feature y1 of the third upsampling module.
  • the third upsampling module also uses convolution, ReLU operation and IWT to reconstruct the feature of y1 to obtain y1', which is added to x0 to obtain the feature y0.
  • y0 is calculated through a 3×3 convolution layer to obtain the final output feature y as an enhanced image.
  • the addition process in this embodiment may be an elementwise add to perform element-by-element addition, so that more information in the original image can be retained, thereby ensuring that the texture details of the enhanced image are consistent with the direction information of the hair in the original image.
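  • A matching sketch of one reconstruction-and-upsampling stage follows, mirroring the description above (a convolution that raises the channel count, ReLU, the inverse wavelet transform, and an element-wise addition with the corresponding encoder feature). It reuses the haar_iwt helper sketched earlier; the channel widths in the commented usage are assumptions.

```python
import torch.nn as nn

class UpStage(nn.Module):
    """One upsampling module: 3x3 convolution that raises the channel count so the
    inverse wavelet transform can restore spatial resolution, ReLU, then IWT."""
    def __init__(self, iwt, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch * 4, kernel_size=3, padding=1)
        self.act = nn.ReLU(inplace=True)
        self.iwt = iwt

    def forward(self, x):
        return self.iwt(self.act(self.conv(x)))

# Illustrative decoder mirroring the three downsampling stages (widths assumed):
# up1, up2, up3 = UpStage(haar_iwt, 64, 64), UpStage(haar_iwt, 64, 32), UpStage(haar_iwt, 32, 16)
# y2 = up1(y3) + x2   # element-wise additions with the encoder features
# y1 = up2(y2) + x1
# y0 = up3(y1) + x0
```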
  • the present application implements the above-mentioned hair enhancement method based on a neural network.
  • a neural network When training the neural network, corresponding sample image pairs are required.
  • FIG8 is a flow chart of a method for obtaining sample image pairs according to an embodiment of the present application. The method includes the following steps:
  • Step S801 acquiring a first sample image, wherein the image quality of the first sample image meets a preset image quality threshold.
  • a first sample image of high-definition pet hair or human hair can be collected by a high-definition image acquisition device such as an SLR camera.
  • the collected hair is required to be smooth, with clear texture, high detail resolution, and good consistency of hair direction.
  • a corresponding image quality threshold can be set to screen the first sample image.
  • Step S802 performing image degradation on the first sample image to obtain a second sample image, wherein the image quality of the second sample image is lower than that of the first sample image.
  • degradation refers to the process of reducing image quality, which can be simulated through JPEG compression, raw noise, lens blur, zoom and other operations, and finally a low-quality pet hair image is obtained after the actual image is degraded.
  • Step S803 taking the first sample image and the second sample image as a sample image pair.
  • the training set is acquired entirely by real shooting with an image acquisition device capable of capturing high-quality images; high-definition hair images are collected under different lighting conditions, environments, and angles, and the hair in each image is required to be smooth, with clear texture, high detail resolution, and good consistency of hair direction.
  • paired low-quality images are obtained through degradation to simulate low-quality hair images taken in real scenes, and finally sample image pairs are obtained, ensuring that the input and output are strictly aligned, and there is no pixel misalignment problem, so that the training results of the neural network are better.
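  • The degradation step can be sketched as follows with OpenCV; the operations follow those named above (lens blur, zoom/rescaling, raw-like noise, JPEG compression), while the parameter values are illustrative assumptions.

```python
import cv2
import numpy as np

def degrade(img, jpeg_quality=40, noise_sigma=8.0, blur_sigma=1.5, scale=0.5):
    """Produce a low-quality second sample image from a high-quality first sample image."""
    out = cv2.GaussianBlur(img, (0, 0), blur_sigma)                  # lens blur
    h, w = out.shape[:2]
    out = cv2.resize(out, (int(w * scale), int(h * scale)))          # zoom / downscale
    out = cv2.resize(out, (w, h), interpolation=cv2.INTER_LINEAR)    # back to the original size
    noise = np.random.normal(0.0, noise_sigma, out.shape)            # simulated raw noise
    out = np.clip(out.astype(np.float32) + noise, 0, 255).astype(np.uint8)
    ok, enc = cv2.imencode(".jpg", out, [cv2.IMWRITE_JPEG_QUALITY, jpeg_quality])
    return cv2.imdecode(enc, cv2.IMREAD_COLOR)                       # JPEG compression

# first = cv2.imread("hq_hair.png")   # first sample image (high quality)
# second = degrade(first)             # second sample image (degraded)
# pair = (second, first)              # network input / target, pixel-aligned
```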
  • the training of the neural network may include the following steps:
  • the loss function in this embodiment is obtained by weighted summation of multiple sub-loss functions, for example L = λ1·L1 + λ2·LSSIM + λ3·LVGG + λ4·LGAN, where L represents the final loss function and n represents the number of sample image pairs over which the loss is computed;
  • L1 is the pixel-by-pixel calculation loss, LSSIM is the structural similarity loss, LVGG is the perceptual loss, and LGAN is the loss of the generative adversarial network;
  • the weights λ1, λ2, λ3 and λ4 can be set according to requirements.
  • the loss is calculated based on the output results of the neural network and the real training set. When the value of the loss function reaches the minimum or the number of iterations exceeds the preset threshold, the training ends.
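  • A minimal sketch of the weighted loss combination described above. The SSIM, perceptual (VGG) and adversarial terms are passed in as callables because their implementations are not specified here, and the weight values are illustrative assumptions.

```python
import torch.nn.functional as F

def total_loss(pred, target, ssim_loss, vgg_loss, gan_loss,
               w1=1.0, w2=0.2, w3=0.1, w4=0.01):
    """Weighted sum of sub-losses: pixel-wise L1, SSIM, perceptual and adversarial."""
    l1 = F.l1_loss(pred, target)                      # pixel-by-pixel loss
    return (w1 * l1 + w2 * ssim_loss(pred, target)
            + w3 * vgg_loss(pred, target) + w4 * gan_loss(pred))
```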
  • the structure of the neural network is shown in Figure 9, including an initial feature extraction module, multiple downsampling modules, multiple residual modules, a fusion module, multiple upsampling modules and a repair enhancement module, wherein the initial feature extraction module is configured to extract the basic features of the original image, multiple downsampling modules, multiple residual modules, a fusion module and multiple upsampling modules are configured to mine more feature information of the image, and the repair enhancement module is configured to achieve the final pet hair repair enhancement.
  • the initial feature extraction module is implemented by a 3×3 convolution layer, which is responsible for extracting the underlying pixel-level features x0 from the low-quality pet hair image x input to the neural network, and using more output channels to represent the feature information of x.
  • the size of the convolution kernel can be 3×3, which can avoid the increase in network parameters caused by too large a convolution kernel and reduce the computing performance consumed in the network inference stage.
  • the three downsampling modules decompose and downsample x0 step by step, and obtain the output features x1, x2, and x3 in turn through the DWT decomposition layer and the convolution layer.
  • the multiple residual modules are exemplified as 4 identical multi-scale residual modules
  • the fusion module is a 1×1 convolution layer, which is used to change the number of output channels and increase the correlation between each feature at different receptive field depths.
  • Each residual module consists of multiple 3×3 convolution layers and one 1×1 convolution layer, and its residual layers perform the residual calculation.
  • the final fusion module concatenates the output features of each of the four residual modules in the channel dimension and then convolves them to obtain the underlying feature extraction result, that is, the residual module fusion feature.
  • the multiple upsampling modules are exemplified as three upsampling modules, each of which includes a convolutional layer and an IWT reconstruction layer.
  • y1’ is obtained through calculation by the three upsampling modules, and the output feature y0 is obtained by adding y1’ and x0.
  • the final repair enhancement module is implemented by a deconvolution layer with a convolution kernel size of 3×3 and a step size of 2. Deconvolution is performed on y0 to obtain the final repair reconstruction result y.
  • the convolution layers of the initial feature extraction module, downsampling module, upsampling module and repair enhancement module in this embodiment are all 3×3, which can reduce parameter calculation and reduce the amount of calculation of the neural network, which is conducive to deployment on the mobile terminal.
  • the convolution layers of the fusion module are all 1×1, which can increase the correlation between each feature at different receptive field depths.
  • the “⊕” in Figure 9 represents addition.
  • the addition process can be elementwise add to achieve element-by-element addition, so that more information in the original image can be retained, ensuring that the texture details of the enhanced image are consistent with the direction information of the hair in the original image.
  • this embodiment uses DWT and IWT to implement step-by-step decomposition downsampling and step-by-step reconstruction upsampling, which has two advantages:
  • 1. DWT and IWT are parameter-free operations with simple calculations, avoiding the performance consumption caused by parameterized up- and down-sampling;
  • 2. the high-frequency detail information of the image can be effectively mined, and DWT and IWT are a pair of lossless conversion operations, which can ensure that the content of the original image is restored without losing details.
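  • Putting the pieces together, the following sketch assembles the components sketched earlier (haar_dwt, haar_iwt, DownStage, UpStage, MSRB, MSRBCascade) into an end-to-end network following the structure of Figure 9. The channel widths are assumptions, and a plain 3×3 convolution is used for the final reconstruction (the text also mentions a 3×3 transposed convolution for the repair enhancement module).

```python
import torch.nn as nn

class HairEnhanceNet(nn.Module):
    """End-to-end sketch: initial feature extraction, three DWT downsampling stages,
    four multi-scale residual modules with 1x1 fusion, three IWT upsampling stages
    with element-wise skip additions, and a final reconstruction convolution."""
    def __init__(self, in_ch=3, base=16):
        super().__init__()
        self.head = nn.Conv2d(in_ch, base, 3, padding=1)                  # x0
        self.down1 = DownStage(haar_dwt, base, base * 2)                  # x0 -> x1
        self.down2 = DownStage(haar_dwt, base * 2, base * 4)              # x1 -> x2
        self.down3 = DownStage(haar_dwt, base * 4, base * 4, relu=False)  # x2 -> x3
        self.body = MSRBCascade(lambda c: MSRB(c), channels=base * 4)     # 4 MSRBs + 1x1 fusion
        self.up1 = UpStage(haar_iwt, base * 4, base * 4)
        self.up2 = UpStage(haar_iwt, base * 4, base * 2)
        self.up3 = UpStage(haar_iwt, base * 2, base)
        self.tail = nn.Conv2d(base, in_ch, 3, padding=1)                  # repair / enhancement output

    def forward(self, x):                 # x: (N, 3, H, W), H and W divisible by 8
        x0 = self.head(x)
        x1 = self.down1(x0)
        x2 = self.down2(x1)
        x3 = self.down3(x2)
        y3 = self.body(x3)                # residual module fusion feature
        y2 = self.up1(y3) + x2            # element-wise skip additions
        y1 = self.up2(y2) + x1
        y0 = self.up3(y1) + x0
        return self.tail(y0)
```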
  • Src represents the original image
  • Rlt represents the enhanced image after restoration.
  • the texture of the restored and reconstructed pet hair is clearer, and the direction is consistent with the original image, which can significantly enhance the hair resolution of the original image and improve the visual effect of the human eye.
  • the hair enhancement method based on the multi-scale residual network structure in this embodiment can solve the problems of blur, noise, out-of-focus, etc. in the hair area of the image.
  • the multi-scale residual structure can not only obtain the characteristics of different receptive fields and better mine the missing high-frequency detail information, but also the residual structure is convenient for training, ensuring the stability of the training process, and ultimately achieving the repair and enhancement of low-quality hair areas.
  • a neural network is also provided, which is used to implement the above embodiments and implementation methods, and the descriptions that have been made will not be repeated.
  • the terms “module”, “unit”, “sub-unit”, etc. used below can be a combination of software and/or hardware that implements the predetermined functions.
  • although the devices described in the following embodiments are preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and conceivable.
  • FIG. 11 is a block diagram of a neural network according to an embodiment of the present application.
  • the neural network is used for hair enhancement, and includes an acquisition module 1101, a plurality of sequentially connected residual modules 1102, and a reconstruction module 1103;
  • An acquisition module 1101 is configured to acquire image features of an original image
  • a plurality of residual modules 1102 are configured to sequentially perform residual calculation and feature fusion on image features of the original image to obtain residual module fusion features, wherein, in two adjacent residual modules, the input features of the latter residual module are the output features of the former residual module;
  • the reconstruction module 1103 is configured to perform feature reconstruction based on the residual module fusion features to obtain an enhanced image corresponding to the original image.
  • the image features of the original image are calculated based on the multiple residual modules 1102, so as to obtain the residual module fusion features including details such as direction and texture in the original image, and the reconstruction module 1103 performs feature reconstruction based on these fusion features;
  • the enhanced image obtained by feature reconstruction based on the residual module fusion feature has higher resolution and richer details than the original image, which can improve the processing effect of the hair texture details in pet hair or portraits and enhance the texture details in the image.
  • a plurality of sequentially connected residual modules include a first residual module and a second residual module connected in sequence; residual calculation and feature fusion are performed on the image features of the original image through a plurality of sequentially connected residual modules to obtain residual module fusion features, including: the first residual module performs convolution fusion on the image features to obtain a first feature; the second residual module performs convolution fusion on the first feature to obtain a second feature; the fusion module fuses the first feature and the second feature to obtain a residual module fusion feature.
  • a plurality of sequentially connected residual modules include a first residual module, a second residual module, a third residual module and a fourth residual module that are sequentially connected; the residual calculation and feature fusion are performed on the image features of the original image through a plurality of sequentially connected residual modules to obtain the residual module fusion features, including: performing convolution fusion on the image features based on the first residual module to obtain the first feature; performing convolution fusion on the first feature based on the second residual module to obtain the second feature; performing convolution fusion on the second feature based on the third residual module to obtain the third feature; performing convolution fusion on the third feature based on the fourth residual module to obtain the fourth feature; and fusing the first feature, the second feature, the third feature and the fourth feature to obtain the residual module fusion feature.
  • a convolution output feature is obtained by performing a convolution calculation on the final output feature of the previous residual layer in the subsequent residual layer, and the convolution output feature is added to the final output feature of the previous residual layer as the final output feature of the subsequent residual layer; in the case of multiple residual layers, the fusion layer splices the final output feature of each residual layer and the final output feature of the initial layer of the residual module to obtain a residual layer splicing feature; the input feature of the residual module and the residual layer splicing feature jointly determine the output feature of the residual module.
  • the acquisition module 1101 is further configured to acquire initial features of the original image; and downsample the initial features to obtain image features of the original image.
  • the acquisition module 1101 downsamples the initial features step by step based on multiple sequentially connected downsampling modules to obtain image features of the original image, wherein, in two adjacent downsampling modules, the input features of the later downsampling module are the output features of the previous downsampling module.
  • downsampling of the initial features is achieved through wavelet transformation.
  • the reconstruction module 1103 is further configured to perform multiple upsampling and feature fusion calculations on the residual module fusion features based on a plurality of sequentially connected upsampling modules to obtain an enhanced image; wherein the upsampling modules correspond one-to-one to the downsampling modules, and in two adjacent upsampling modules, the input features of the subsequent upsampling module are determined based on the output features of the preceding upsampling module and the output features of the target downsampling module.
  • the target downsampling module refers to the downsampling module corresponding to the succeeding upsampling module.
  • the wavelet transform includes performing interval sampling on the initial features of the original image in rows and columns according to a preset step size to obtain sampling results; and calculating multiple different frequency band information of the initial features as image features of the original image based on the sampling results.
  • a method for acquiring a sample image pair for training a neural network may include: acquiring a first sample image, where the image quality of the first sample image meets a preset image quality threshold; performing image degradation on the first sample image to obtain a second sample image, where the image quality of the second sample image is lower than that of the first sample image; and treating the first sample image and the second sample image as a sample image pair.
  • This embodiment uses the acquisition method to acquire a large number of sample image pairs for training a neural network.
  • the hair enhancement method provided by the present application performs residual calculation and feature fusion on the image features of the original image through multiple sequentially connected residual modules to obtain residual module fusion features, wherein, in two adjacent residual modules, the input features of the latter residual module are the output features of the former residual module; feature reconstruction is performed based on the residual module fusion features to obtain an enhanced image corresponding to the original image, thereby improving the processing effect of hair texture details in pet hair or portraits and enhancing the texture details in the image.
  • Each of the above modules may be a functional module or a program module, and may be implemented by software or hardware.
  • each of the above modules may be located in the same processor; or each of the above modules may be located in different processors in any combination.
  • This embodiment also provides an electronic device, including a memory and a processor, wherein a computer program is stored in the memory, and the processor is configured to run the computer program to execute the steps in any one of the above method embodiments.
  • the electronic device may include a transmission device and an input/output device, wherein the transmission device is connected to the processor, and the input/output device is connected to the processor.
  • the processor may be configured to perform the following steps through a computer program:
  • a computer-readable storage medium may be provided in this embodiment for implementation.
  • the storage medium stores one or more programs, and the one or more programs may be executed by one or more processors; when the program is executed by the processor, any one of the methods in the above embodiments is implemented.
  • the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data used for analysis, stored data, displayed data, etc.) involved in this application are all information and data authorized by the user or fully authorized by all parties.
  • Such software may be distributed on a computer-readable medium, which may include a computer storage medium (or non-transitory medium) and a communication medium (or temporary medium).
  • a computer storage medium includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information (such as computer-readable instructions, data structures, program modules, or other data).
  • Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tapes, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and can be accessed by a computer.
  • communication media typically contain computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Processing (AREA)

Abstract

A hair enhancement method, a neural network, an electronic device and a storage medium are provided. The hair enhancement method comprises: acquiring an image feature of an original image; performing residual calculation and feature fusion on the image feature of the original image by means of a plurality of sequentially connected residual modules to obtain a residual module fusion feature, wherein, in two adjacent residual modules, an input feature of the latter residual module is an output feature of the former residual module; and performing feature reconstruction on the basis of the residual module fusion feature to obtain an enhanced image corresponding to the original image.
PCT/CN2023/139420 2022-12-22 2023-12-18 Procédé d'amélioration des cheveux, réseau neuronal, dispositif électronique et support de stockage WO2024131707A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211659099.7 2022-12-22
CN202211659099.7A CN116188295A (zh) 2022-12-22 2022-12-22 毛发增强方法、神经网络、电子装置和存储介质

Publications (1)

Publication Number Publication Date
WO2024131707A1 true WO2024131707A1 (fr) 2024-06-27

Family

ID=86449858

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/139420 WO2024131707A1 (fr) 2022-12-22 2023-12-18 Procédé d'amélioration des cheveux, réseau neuronal, dispositif électronique et support de stockage

Country Status (2)

Country Link
CN (1) CN116188295A (fr)
WO (1) WO2024131707A1 (fr)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180137603A1 (en) * 2016-11-07 2018-05-17 Umbo Cv Inc. Method and system for providing high resolution image through super-resolution reconstruction
CN112990171A (zh) * 2021-05-20 2021-06-18 腾讯科技(深圳)有限公司 图像处理方法、装置、计算机设备及存储介质
CN114129171A (zh) * 2021-12-01 2022-03-04 山东省人工智能研究院 一种基于改进的残差密集网络的心电信号降噪方法
CN114742733A (zh) * 2022-04-19 2022-07-12 中国工商银行股份有限公司 云去除方法、装置、计算机设备和存储介质
CN115100583A (zh) * 2022-08-29 2022-09-23 君华高科集团有限公司 一种后厨食品安全实时监管的方法及系统


Also Published As

Publication number Publication date
CN116188295A (zh) 2023-05-30

Similar Documents

Publication Publication Date Title
TWI728465B (zh) 圖像處理方法和裝置、電子設備及儲存介質
CN111898701B (zh) 模型训练、帧图像生成、插帧方法、装置、设备及介质
US10325346B2 (en) Image processing system for downscaling images using perceptual downscaling method
Zheng et al. Learning frequency domain priors for image demoireing
RU2706891C1 (ru) Способ формирования общей функции потерь для обучения сверточной нейронной сети для преобразования изображения в изображение с прорисованными деталями и система для преобразования изображения в изображение с прорисованными деталями
CN112801901A (zh) 基于分块多尺度卷积神经网络的图像去模糊算法
CN110060204B (zh) 一种基于可逆网络的单一图像超分辨率方法
TWI769725B (zh) 圖像處理方法、電子設備及電腦可讀儲存介質
CN111681177B (zh) 视频处理方法及装置、计算机可读存储介质、电子设备
KR20200132682A (ko) 이미지 최적화 방법, 장치, 디바이스 및 저장 매체
CN113129212B (zh) 图像超分辨率重建方法、装置、终端设备及存储介质
CN110428382A (zh) 一种用于移动终端的高效视频增强方法、装置和存储介质
Xu et al. Exploiting raw images for real-scene super-resolution
Fang et al. High-resolution optical flow and frame-recurrent network for video super-resolution and deblurring
CN111800630A (zh) 一种视频超分辨率重建的方法、系统及电子设备
CN112991231A (zh) 单图像超分与感知图像增强联合任务学习系统
CN114881888A (zh) 基于线性稀疏注意力Transformer的视频去摩尔纹方法
Rasheed et al. LSR: Lightening super-resolution deep network for low-light image enhancement
CN111429371A (zh) 图像处理方法、装置及终端设备
CN112150363B (zh) 一种基于卷积神经网络的图像夜景处理方法及运行该方法的计算模块与可读存储介质
CN111951171A (zh) Hdr图像生成方法、装置、可读存储介质及终端设备
WO2024131707A1 (fr) Procédé d'amélioration des cheveux, réseau neuronal, dispositif électronique et support de stockage
Zhou et al. Deep fractal residual network for fast and accurate single image super resolution
Li et al. RGSR: A two-step lossy JPG image super-resolution based on noise reduction
CN111383171B (zh) 一种图片处理方法、系统及终端设备