WO2023070495A9 - Image processing method, electronic device and non-transitory computer-readable medium - Google Patents
Image processing method, electronic device and non-transitory computer-readable medium
- Publication number
- WO2023070495A9 (PCT/CN2021/127282)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- model
- mask
- sub
- target
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4053—Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4046—Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/174—Segmentation; Edge detection involving the use of two or more images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20212—Image combination
- G06T2207/20221—Image fusion; Image merging
Definitions
- the present disclosure relates to the field of image processing technology, and in particular to an image processing method, an electronic device and a non-transitory computer-readable medium.
- the present disclosure aims to solve at least one of the technical problems existing in the prior art, and proposes an image processing method, an electronic device and a non-transitory computer-readable medium.
- an embodiment of the present disclosure provides an image processing method, including:
- the down-sampled image and the first target mask are input into a pre-trained mask super-resolution model; super-resolution processing is performed on the first target mask using the mask super-resolution model to obtain a second target mask, wherein the resolution of the second target mask is higher than the resolution of the first target mask;
- the second target mask and the original image are fused to obtain a target image.
- the mask super-resolution model includes a first sub-model and a second sub-model
- the method of using the mask super-resolution model to perform super-resolution processing on the first target mask to obtain a second target mask includes:
- the first sub-model includes P levels of first operation modules connected in sequence, and each level of the first operation module includes a first operation unit and a second operation unit, where P is a positive integer greater than 1;
- Using the first sub-model to extract image features corresponding to the downsampled image includes:
- for the n-th level first operation module, n being a positive integer not greater than P: its first operation unit is used to perform image feature extraction based on the downsampled image or the first feature map output by the upper-level first operation module, generating a second feature map, which is output to the second sub-model; its second operation unit is used to enlarge the size of the second feature map, and the enlarged second feature map is output to the first operation module of the next level;
- the method of using the second sub-model and combining the image features extracted by the first sub-model to perform super-resolution processing on the first target mask to obtain the second target mask includes:
- the first operation unit includes a convolution layer, a batch normalization layer and an excitation layer connected in sequence
- the second operation unit includes a transposed convolution layer
- the second sub-model includes P-level second operation modules connected in sequence.
- each level of the second operation module includes a splicing layer, a third operation unit and a fourth operation unit, wherein the first operation unit and the third operation unit at the same level are connected through a splicing layer;
- for the m-th level second operation module, m being a positive integer not greater than P: its splicing layer is used to splice the second feature map and the first target mask, or to splice the second feature map and the third feature map output by the upper-level second operation module, generating a fourth feature map; its third operation unit is used to extract image features from the fourth feature map, generating a fifth feature map; its fourth operation unit is used to enlarge the size of the fifth feature map, and the enlarged fifth feature map is output to the second operation module of the next level;
- for the last-level second operation module, the enlarged fifth feature map is taken as the second target mask and output.
- the third operation unit includes a convolution layer, a batch normalization layer and an excitation layer connected in sequence
- the fourth operation unit includes a transposed convolution layer
- extracting the target area in the down-sampled image and obtaining the first target mask includes:
- the target extraction model is a UNet network model
- the first sub-model includes three-level first operation modules connected in sequence
- the second sub-model includes three-level second operation modules connected in series.
- the mask super-resolution model is trained through the following steps:
- the mask super-resolution model to be trained is trained based on the down-sampled image samples and the target mask samples; wherein the first sub-model to be trained is used to extract the image features corresponding to the down-sampled image samples and to perform super-resolution processing on the down-sampled image samples; the image features extracted by the first sub-model to be trained are input into the second sub-model to be trained; and the second sub-model to be trained is used, in combination with the image features extracted by the first sub-model to be trained, to perform super-resolution processing on the target mask samples;
- the training ends and the mask super-resolution model is obtained.
- the preset convergence conditions include at least one of the following:
- the first loss value and the second loss value satisfy the preset loss value conditions, wherein the first loss value is calculated based on the original image sample corresponding to the downsampled image sample and the downsampled image sample after super-resolution processing.
- the second loss value is calculated based on the original image sample and the target mask sample after super-resolution processing.
- the method further includes:
- the first loss value is calculated according to the original image sample and the down-sampled image sample after super-resolution processing.
- Edge matching is performed on the first edge map and the second edge map, and the second loss value is determined according to the edge matching result.
- the target area is a portrait area
- the target image is a portrait image
- an embodiment of the present disclosure also provides an electronic device, including:
- one or more processors;
- Memory used to store one or more programs
- when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the image processing method as described in any of the above embodiments.
- embodiments of the present disclosure also provide a non-transitory computer-readable medium on which a computer program is stored, wherein, when the program is executed, the image processing method as described in any of the above embodiments is implemented.
- Figure 1 is a flow chart of an image processing method provided by an embodiment of the present disclosure
- Figure 2 is a flow chart of a specific implementation method of step S3 according to the embodiment of the present disclosure
- FIG. 3 is a flow chart of another specific implementation method of step S3 according to the embodiment of the present disclosure.
- Figure 4 is a flow chart of a training method for a mask super-resolution model provided by an embodiment of the present disclosure
- Figure 5 is a flow chart of a specific implementation method of step S02 according to the embodiment of the present disclosure.
- Figure 6 is a flow chart of a specific implementation method of step S2 according to the embodiment of the present disclosure.
- Figure 7 is a schematic structural diagram of a mask super-resolution model provided by an embodiment of the present disclosure.
- Figure 8 is a block diagram of an electronic device provided by an embodiment of the present disclosure.
- Figure 9 is a block diagram of a non-transitory computer-readable medium provided by an embodiment of the present disclosure.
- Figure 1 is a flow chart of an image processing method provided by an embodiment of the present disclosure. As shown in Figure 1, the method includes:
- Step S1 Downsample the original image according to the preset resolution to generate a downsampled image.
- the preset resolution is smaller than the resolution of the original image; downsampling the original image corresponds to a process of scaling the original image, which generates a downsampled image of lower resolution whose resolution is the preset resolution; the number of pixels in the down-sampled image is reduced compared to the original image, and the time consumed by operations and processing on the down-sampled image is reduced accordingly.
- the reduction ratio can be approximated as the ratio between the preset resolution and the resolution of the original image.
- Step S2 Extract the target area in the downsampled image to obtain the first target mask.
- the mask is a single-channel image, which can be used to occlude all or part of the image to be processed during image processing, so as to control the object area and progress of the processing; in this embodiment, the first target mask is the mask corresponding to the extracted target area, in which the target area serves as the foreground area of the down-sampled image and the other parts serve as the background area; if the first target mask is applied to the downsampled image, a downsampled image that retains only the foreground area is obtained; in some embodiments, the mask is a binary image composed of 0s and 1s, or in some embodiments, the mask can also be a multi-value image.
- the target object area is a portrait area and the target object image is a portrait image
- step S2 corresponds to the process of portrait cutout.
- a portrait is only one specific implementation provided by the embodiments of the present disclosure and does not limit the technical solution of the present disclosure.
- other types of target objects are also applicable to the technical solution of the present disclosure, such as animals and plants, vehicles and other means of transportation, license plates, etc.;
- the target object under the corresponding target object type must meet at least one of the following conditions: it has a specific shape; it has a clear outline; its area position in the image can be determined using a corresponding detection algorithm.
- Step S3 Input the downsampled image and the first target mask into the pre-trained mask super-resolution model, use the mask super-resolution model to perform super-resolution processing on the first target mask, and obtain the second target mask.
- the resolution of the second target mask is higher than the resolution of the first target mask; in step S3, the mask super-resolution model is used to super-resolve the first target mask in combination with the down-sampled image.
- super-resolution processing (Super Resolution, SR) corresponds to the process of reconstructing a high-resolution image based on a low-resolution image.
- the mask super-resolution model is trained in advance based on original image samples, downsampled image samples, and target mask samples.
- Step S4 Fusion of the second target mask and the original image to obtain a target image.
- the target image is the final cutout result of the target in the original image.
- in some embodiments, the second target mask is a binary image, and the second target mask and the original image are fused by multiplication; or, in some embodiments, as mentioned above, the second target mask is a single-channel image, and the second target mask and the original image can be fused through channel fusion; or, in some embodiments, the second target mask and the original image are fused through Poisson fusion.
- Embodiments of the present disclosure provide an image processing method, which can be used to extract the target area in a down-sampled image of an original image to obtain a first target mask; input the down-sampled image and the first target mask into a mask super-resolution model and use the model to perform super-resolution processing on the first target mask to obtain a second target mask; and fuse the second target mask and the original image to obtain the target image.
- thus, by raising the resolution of the mask corresponding to the target, the overall precision of the target extraction process is improved, which can effectively avoid the problem of jagged edges when extracting and matting targets from larger, high-resolution images.
- FIG. 2 is a flow chart of a specific implementation method of step S3 according to the embodiment of the present disclosure.
- the mask super-resolution model includes a first sub-model and a second sub-model; as shown in Figure 2, in step S3, the mask super-resolution model is used to perform super-resolution processing on the first target mask,
- the step of obtaining the second target object mask includes: step S301 and step S302.
- Step S301 Use the first sub-model to extract image features corresponding to the downsampled image.
- Step S302 Input the image features extracted by the first sub-model into the second sub-model, use the second sub-model to perform super-resolution processing on the first target mask in combination with the image features extracted by the first sub-model, and obtain Second target mask.
- the down-sampled image and the first target mask are input to the first sub-model and the second sub-model respectively;
- the first sub-model is used to extract the image features of the down-sampled image and output them to the second sub-model.
- in some embodiments, the first sub-model can finally output the down-sampled image after super-resolution processing.
- this output can be used to calibrate the model input and output and to check the super-resolution effect;
- the second sub-model is used to perform super-resolution processing on the first target mask based on the image features extracted by the first sub-model, and finally outputs the second target mask.
- the feature maps can be spliced through a concatenation layer (Concat), and in some embodiments, feature fusion can be achieved through 1*1 convolution and channel-dimension pooling, thereby combining the image features extracted by the first sub-model in the super-resolution processing of the first target mask.
- FIG. 3 is a flow chart of another specific implementation method of step S3 according to the embodiment of the present disclosure.
- this method is a concrete optional implementation based on the method shown in Figure 2; on that basis, the first sub-model includes P levels of first operation modules connected in sequence, and each level of the first operation module includes a first operation unit and a second operation unit, where P is a positive integer greater than 1; as shown in Figure 3, when performing step S301, using the first sub-model to extract image features corresponding to the down-sampled image, for the n-th level first operation module (n is a positive integer and not greater than P), the step includes: step S3011 and step S3012.
- Step S3011 Utilize the first operation unit of the first operation module at this level to extract image features based on the downsampled image or the first feature map output by the first operation module at the upper level, generate a second feature map, and output the second feature map to the second sub-model.
- Step S3012 Use the second computing unit of the first computing module at this level to enlarge the size of the second feature map, and output the enlarged second feature map to the first computing module at the next level.
- the second feature map output by the first operation module at this level is the first feature map received by the first operation module at the next level.
- in some embodiments, for the last-level first operation module, the enlarged feature map it outputs is the down-sampled image after super-resolution processing.
- the first computing unit includes a convolution layer, a batch normalization layer, and an excitation layer connected in sequence
- the second operation unit includes a transposed convolution layer, also known as a deconvolution layer.
- step S3012, the step of using the second operation unit of the first operation module at this level to enlarge the size of the feature map corresponding to the extracted image features, then specifically includes: using the second operation unit of the first operation module at this level to perform transposed convolution on the feature map corresponding to the extracted image features, so as to enlarge the size of the feature map.
- the parameter settings of the above-mentioned convolution layer, batch normalization layer, excitation layer and transposed convolution layer may be different.
- the term "convolution kernel” refers to the two-dimensional matrix used in the convolution process.
- each of the multiple entries in the two-dimensional matrix has a specific value.
- the term “convolution” refers to the process of processing images.
- Convolution kernel is used for convolution.
- Each pixel of the input image has a value, and the convolution kernel starts at one pixel of the input image and moves sequentially over each pixel.
- At each position, the convolution kernel overlaps several pixels of the image according to the kernel's scale.
- The value of each overlapping pixel is multiplied by the corresponding value of the convolution kernel to obtain a product.
- All products of the overlapping pixels are then added to obtain a sum corresponding to the position of the convolution kernel on the input image.
- By moving the convolution kernel over each pixel of the input image, all sums corresponding to all kernel positions are collected and output to form the output image.
- convolution can use different convolution kernels to extract different features of the input image.
- the convolution process can use different convolution kernels to add more features to the input image.
- the convolutional layer is used to perform convolution on the input image to obtain the output image.
- the excitation layer can perform non-linear mapping on the output signal output from the convolution layer.
- Various functions can be used in the excitation layer. Examples of functions suitable for use in the excitation layer include, but are not limited to: rectified linear unit (ReLU) functions, sigmoid functions, and hyperbolic tangent functions (eg, tanh functions).
- in some embodiments, the excitation layer and the batch normalization layer are included in the convolutional layer.
- the batch normalization layer (Batch Normalization, BN) can standardize the output of each layer of the network model over a small batch of data; standardization is the process of making the data conform to a standard normal distribution with a mean of 0 and a standard deviation of 1, and it can alleviate the problem of vanishing gradients in neural network models.
- the step of using the second sub-model, in combination with the image features extracted by the first sub-model, to perform super-resolution processing on the first target mask and obtain the second target mask includes: using the second sub-model, in combination with the second feature map extracted by the first operation module at each level, performing super-resolution processing on the first target mask to obtain the second target mask.
- the second sub-model includes P levels of second operation modules connected in sequence, and each level of the second operation module includes a splicing layer, a third operation unit and a fourth operation unit, where P is a positive integer greater than 1.
- the first operation unit and the third operation unit of the same level are connected through the splicing layer; thus, in some embodiments, as shown in Figure 3, when executing the above step of using the second sub-model, in combination with the second feature map extracted by the first operation module at each level, to perform super-resolution processing on the first target mask and obtain the second target mask, for the m-th level second operation module (m is a positive integer and not greater than P), the step includes: step S3021 to step S3023.
- Step S3021 Use the splicing layer of the second operation module at this level to splice the second feature map and the first target mask, or to splice the second feature map and the third feature map output by the second operation module at the upper level, to generate the fourth feature map.
- when m = 1, i.e., for the first-level second operation module, its splicing layer splices the second feature map and the first target mask; when m > 1, the splicing layer of the second operation module at this level splices the second feature map and the third feature map output by the second operation module at the upper level.
- the number of levels of the multi-level first operation module is the same as the number of levels of the multi-level second operation module.
- the splicing layer is used for concatenation along the channel dimension.
- Step S3022 Use the third computing unit of the second computing module at this level to extract image features based on the fourth feature map and generate a fifth feature map.
- Step S3023 Use the fourth computing unit of the second computing module at this level to enlarge the size of the fifth feature map, and output the enlarged fifth feature map to the second computing module at the next level.
- the fifth feature map output by the second operation module at this level is the third feature map received by the second operation module at the next level; for the last-level second operation module, the enlarged fifth feature map is taken as the second target mask and output.
- the third operation unit includes a convolution layer, a batch normalization layer and an excitation layer connected in sequence
- the fourth operation unit includes a transposed convolution layer
- the parameter settings of the above-mentioned convolution layer, batch normalization layer, excitation layer and transposed convolution layer may be different.
- Embodiments of the present disclosure provide an image processing method that can be used to perform super-resolution processing on a mask corresponding to a target object based on the image features of the downsampled image, thereby increasing its feature dimension and improving the precision of target object extraction.
- Figure 4 is a flow chart of a training method for a mask super-resolution model provided by an embodiment of the present disclosure.
- the mask super-resolution model is the mask super-resolution model corresponding to Figure 2, which includes a first sub-model and a second sub-model; as shown in Figure 4, the mask super-resolution model is trained through the following steps:
- Step S01 Input the downsampled image sample and its corresponding target mask sample into the mask super-resolution model to be trained.
- the down-sampled image sample is obtained by down-sampling its corresponding original image sample
- the target mask sample is obtained by extracting the target object from the down-sampled image sample.
- Step S02 Train the mask super-resolution model to be trained based on the down-sampled image samples and the target mask samples in an iterative manner.
- FIG. 5 is a flowchart of a specific implementation method of step S02 according to the embodiment of the present disclosure. As shown in Figure 5, step S02 includes: step S021 and step S022.
- Step S021 Use the first sub-model to be trained to extract image features corresponding to the down-sampled image samples, and perform super-resolution processing on the down-sampled image samples.
- Step S022 Input the image features extracted by the first sub-model to be trained into the second sub-model to be trained, and use the second sub-model to be trained, in combination with the image features extracted by the first sub-model to be trained, to perform super-resolution processing on the target mask samples.
- the image features corresponding to a down-sampled image sample include the image features of the down-sampled image sample and the image features of its feature maps; the above process of training the first sub-model and the second sub-model corresponds to the actual inference process of the first sub-model and the second sub-model.
- Step S03 In response to the preset convergence conditions being met, the training ends and the mask super-resolution model is obtained.
- the preset convergence condition includes at least one of the following: the preset number of iterations has been trained; the first loss value and the second loss value satisfy the preset loss value condition.
- the first loss value is calculated based on the original image sample corresponding to the down-sampled image sample and the down-sampled image sample after super-resolution processing
- the second loss value is calculated based on the original image sample and the target mask sample after super-resolution processing.
- in some embodiments, after step S021, using the first sub-model to be trained to extract image features corresponding to the down-sampled image samples and performing super-resolution processing on the down-sampled image samples, the method further includes: calculating the first loss value based on a mean square error (MSE) function, according to the original image samples and the down-sampled image samples after super-resolution processing.
- in some embodiments, after step S022, using the second sub-model to be trained, in combination with the image features extracted by the first sub-model to be trained, to perform super-resolution processing on the target mask samples, the method further includes: obtaining the first edge map corresponding to the original image sample; performing edge detection on the target mask sample after super-resolution processing to obtain the second edge map; and performing edge matching on the first edge map and the second edge map, and determining the second loss value according to the edge matching result.
- edge detection is performed on the original image sample to obtain the first edge map, or the pre-calculated first edge map is read from the storage area.
- in some embodiments, the edge map is an 8-bit grayscale image, with EM1(x_i, y_i) > 127 and EM2(x_i, y_i) > 127.
- FIG. 6 is a flow chart of a specific implementation method of step S2 according to the embodiment of the present disclosure. As shown in Figure 6, step S2, the step of extracting the target object area in the down-sampled image and obtaining the first target mask includes: step S201.
- Step S201 Input the down-sampled image into a pre-trained target extraction model, use the target extraction model to extract the target area in the down-sampled image, and obtain a first target mask.
- the target extraction model adopts the UNet network model, and the resolutions of its input image and output image are both 512*512; accordingly, in step S1, the preset resolution is 512*512.
- the first target mask obtained by using the target extraction model is input into the mask super-resolution model.
- the mask super-resolution model includes a first sub-model and a second sub-model.
- the first sub-model includes three levels of first operation modules connected in sequence, and the second sub-model includes three levels of second operation modules connected in sequence; the three levels of first operation modules can be used to perform image feature extraction on the downsampled image three times, and the three levels of second operation modules perform super-resolution processing on the first target mask in combination with the extracted image features.
- in some embodiments, each level of the second operation module uses its fourth operation unit to enlarge the size of its fifth feature map; based on the corresponding model parameter settings, the size of the feature map can be doubled each time, so that the resolution of the finally output second target mask can be 4096*4096, which is applicable to 4K scenes; specifically, the twofold enlargement can be achieved by setting the padding parameter of the transposed convolution layer, e.g., to "same".
- the image processing method provided by the embodiment of the present disclosure will be described in detail below in conjunction with practical applications. Specifically, taking the application to portrait cutout as an example, the target area in the downsampled image is the portrait area, and the final target image is the portrait image.
- the original image is first down-sampled according to a preset resolution to generate a down-sampled image; where the original image is a 4K image including portraits, and the preset resolution is 512*512.
- the down-sampled image is input into the pre-trained target extraction model, and the target extraction model is used to extract the portrait area in the down-sampled image to obtain the first target mask; this target extraction model is specifically intended for portrait matting and adopts the UNet network model.
- the downsampled image and the first target mask are input into the pre-trained mask super-resolution model; the mask super-resolution model includes a first sub-model and a second sub-model;
- the first sub-model includes three levels of first operation modules connected in sequence (i.e., P = 3), each level including a first operation unit and a second operation unit, and the downsampled image is input into the first sub-model; the second sub-model includes three levels of second operation modules connected in sequence, each level including a splicing layer, a third operation unit and a fourth operation unit, wherein the first operation unit and the third operation unit at the same level are connected through a splicing layer, and the first target mask is input into the second sub-model.
- for the n-th level first operation module (n is a positive integer not greater than 3), its first operation unit is used to perform image feature extraction based on the downsampled image or the first feature map output by the first operation module of the level above, generating a second feature map; and its second operation unit is used to enlarge the size of the second feature map and output the enlarged second feature map to the first operation module of the next level; the first operation unit of the first-level module extracts image features directly from the downsampled image, while the first operation units of the second-level and third-level modules extract image features from the feature maps output by the first-level and second-level modules respectively, and the third-level module outputs its enlarged second feature map directly; this second feature map is the downsampled image after super-resolution processing; in some embodiments, the first operation unit includes a convolution layer, a batch normalization layer and an excitation layer connected in sequence, and the second operation unit includes a transposed convolution layer.
- for the m-th level second operation module (m is a positive integer not greater than 3), its splicing layer is used to splice the second feature map output by the first operation module at the same level with the first target mask, or to splice the second feature map with the third feature map output by the second operation module of the level above, generating a fourth feature map; its third operation unit is used to extract image features from the fourth feature map, generating a fifth feature map; and its fourth operation unit is used to enlarge the size of the fifth feature map and output the enlarged fifth feature map to the second operation module of the next level; the splicing layer of the first-level module splices the feature map output by the first-level first operation module with the first target mask, while the splicing layers of the second-level and third-level modules splice the feature maps output by the first-level second operation module and the second-level first operation module, and by the second-level second operation module and the third-level first operation module, respectively; the third-level module outputs its enlarged fifth feature map directly, this fifth feature map being the second target mask; in some embodiments, the third operation unit includes a convolution layer, a batch normalization layer and an excitation layer connected in sequence, and the fourth operation unit includes a transposed convolution layer.
- finally, the second target mask and the original image are fused to obtain the portrait image.
- FIG. 7 is a schematic structural diagram of a mask super-resolution model provided by an embodiment of the present disclosure.
- the mask super-resolution model includes a first sub-model and a second sub-model;
- the first sub-model includes three levels of first operation modules 301 connected in sequence, and each level of the first operation module 301 includes a first operation unit CBR1 and a second operation unit T_conv1.
- the down-sampled image LR is input to the first sub-model, and the first sub-model outputs the down-sampled image HR after super-resolution processing; the arrows in the figure indicate the direction of data transfer.
- the second sub-model includes three levels of second operation modules 401 connected in sequence; each level of the second operation module 401 includes a splicing layer (not shown in the figure), a third operation unit CBR2 and a fourth operation unit T_conv2, wherein the first operation unit CBR1 and the third operation unit CBR2 at the same level are connected through a splicing layer.
- the first target mask MASK_LR is input to the second sub-model, and the second sub-model outputs the second target mask MASK_HR; the internal layer settings of CBR1 and CBR2 are similar; as shown in Figure 7, each includes a convolution layer Conv, a batch normalization layer Batch_norm and an excitation layer ReLU.
- FIG. 8 is a block diagram of an electronic device provided by an embodiment of the present disclosure. As shown in Figure 8, the electronic device includes:
- one or more processors 101;
- a memory 102 on which one or more programs are stored; when the one or more programs are executed by the one or more processors, the one or more processors 101 implement the image processing method of any of the above embodiments;
- One or more I/O interfaces 103 are connected between the processor and the memory, and are configured to implement information exchange between the processor and the memory.
- the processor 101 is a device with data processing capability, including but not limited to a central processing unit (CPU); the memory 102 is a device with data storage capability, including but not limited to random access memory (RAM, more specifically SDRAM, DDR, etc.), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM) and flash memory (FLASH); the I/O (read-write) interface 103 is connected between the processor 101 and the memory 102 and enables information exchange between them, and includes but is not limited to a data bus (Bus).
- in some embodiments, the processor 101, memory 102 and I/O interface 103 are connected to each other via a bus 104 and, in turn, to other components of the computing device.
- the plurality of processors 101 includes a plurality of graphics processors (GPUs), which are combined to form a graphics processor array.
- Figure 9 is a block diagram of a non-transitory computer-readable medium provided by an embodiment of the present disclosure.
- a computer program is stored on the computer-readable medium, wherein the computer program implements the image processing method in any of the above embodiments when executed by the processor.
- Non-transitory computer-readable media may include computer storage media (or non-transitory media) and communication media (or transitory media).
- computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for the storage of information such as computer-readable instructions, data structures, program modules or other data.
- computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer.
- communication media typically embodies computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.
- Example embodiments have been disclosed herein, and although specific terms are employed, they are used and should be interpreted in a generic and descriptive sense only and not for purposes of limitation. In some instances, as will be apparent to those skilled in the art, features, characteristics and/or elements described in connection with a particular embodiment may be used alone, or in combination with features, characteristics and/or elements described in connection with other embodiments, unless expressly stated otherwise. Accordingly, it will be understood by those skilled in the art that various changes in form and details may be made without departing from the scope of the present disclosure as set forth in the appended claims.
Abstract
The present disclosure provides an image processing method, including: downsampling an original image according to a preset resolution to generate a downsampled image; extracting a target area in the downsampled image to obtain a first target mask; inputting the downsampled image and the first target mask into a pre-trained mask super-resolution model; performing super-resolution processing on the first target mask using the mask super-resolution model to obtain a second target mask; and fusing the second target mask and the original image to obtain a target image. The present disclosure also provides an electronic device and a non-transitory computer-readable medium.
Description
The present disclosure relates to the field of image processing technology, and in particular to an image processing method, an electronic device and a non-transitory computer-readable medium.
Existing neural-network-based target detection and extraction algorithms have achieved good results and are widely used in processing various categories of images, such as natural images and medical images. However, current neural network models process at relatively low precision; when actually extracting and matting targets, larger images often exhibit problems such as jagged edges after processing.
Summary
The present disclosure aims to solve at least one of the technical problems in the prior art, and provides an image processing method, an electronic device and a non-transitory computer-readable medium.
To achieve the above object, in a first aspect, an embodiment of the present disclosure provides an image processing method, including:
downsampling an original image according to a preset resolution to generate a downsampled image;
extracting a target area in the downsampled image to obtain a first target mask;
inputting the downsampled image and the first target mask into a pre-trained mask super-resolution model; performing super-resolution processing on the first target mask using the mask super-resolution model to obtain a second target mask, wherein the resolution of the second target mask is higher than the resolution of the first target mask;
fusing the second target mask and the original image to obtain a target image.
In some embodiments, the mask super-resolution model includes a first sub-model and a second sub-model;
performing super-resolution processing on the first target mask using the mask super-resolution model to obtain the second target mask includes:
extracting image features corresponding to the downsampled image using the first sub-model;
inputting the image features extracted by the first sub-model into the second sub-model; and performing super-resolution processing on the first target mask using the second sub-model, in combination with the image features extracted by the first sub-model, to obtain the second target mask.
In some embodiments, the first sub-model includes P levels of first operation modules connected in sequence, each level of first operation module including a first operation unit and a second operation unit, where P is a positive integer greater than 1;
extracting the image features corresponding to the downsampled image using the first sub-model includes:
for the n-th level first operation module, where n is a positive integer not greater than P: using its first operation unit, performing image feature extraction based on the downsampled image or the first feature map output by the first operation module of the level above, generating a second feature map, and outputting the second feature map to the second sub-model; using its second operation unit, enlarging the size of the second feature map and outputting the enlarged second feature map to the first operation module of the next level;
performing super-resolution processing on the first target mask using the second sub-model, in combination with the image features extracted by the first sub-model, to obtain the second target mask includes:
performing super-resolution processing on the first target mask using the second sub-model, in combination with the second feature map extracted by the first operation module at each level, to obtain the second target mask.
In some embodiments, the first operation unit includes a convolution layer, a batch normalization layer and an excitation layer connected in sequence, and the second operation unit includes a transposed convolution layer.
In some embodiments, the second sub-model includes P levels of second operation modules connected in sequence, each level of second operation module including a splicing layer, a third operation unit and a fourth operation unit, wherein the first operation unit and the third operation unit at the same level are connected through a splicing layer;
performing super-resolution processing on the first target mask using the second sub-model, in combination with the second feature map extracted by the first operation module at each level, to obtain the second target mask includes:
for the m-th level second operation module, where m is a positive integer not greater than P: using its splicing layer, splicing the second feature map and the first target mask, or splicing the second feature map and the third feature map output by the second operation module of the level above, to generate a fourth feature map; using its third operation unit, performing image feature extraction based on the fourth feature map to generate a fifth feature map; using its fourth operation unit, enlarging the size of the fifth feature map and outputting the enlarged fifth feature map to the second operation module of the next level;
wherein, for the last-level second operation module, its enlarged fifth feature map is taken as the second target mask and output.
In some embodiments, the third operation unit includes a convolution layer, a batch normalization layer and an excitation layer connected in sequence, and the fourth operation unit includes a transposed convolution layer.
In some embodiments, extracting the target area in the downsampled image and obtaining the first target mask includes:
inputting the downsampled image into a pre-trained target extraction model, and extracting the target area in the downsampled image using the target extraction model to obtain the first target mask;
wherein the target extraction model is a UNet network model, the first sub-model includes three levels of first operation modules connected in sequence, and the second sub-model includes three levels of second operation modules connected in sequence.
In some embodiments, the mask super-resolution model is trained through the following steps:
inputting downsampled image samples and their corresponding target mask samples into the mask super-resolution model to be trained;
training the mask super-resolution model to be trained in an iterative manner based on the downsampled image samples and the target mask samples; wherein image features corresponding to the downsampled image samples are extracted using the first sub-model to be trained, and super-resolution processing is performed on the downsampled image samples; the image features extracted by the first sub-model to be trained are input into the second sub-model to be trained; and super-resolution processing is performed on the target mask samples using the second sub-model to be trained, in combination with the image features extracted by the first sub-model to be trained;
in response to a preset convergence condition being satisfied, ending the training to obtain the mask super-resolution model.
In some embodiments, the preset convergence condition includes at least one of the following:
a preset number of iterations has been trained;
a first loss value and a second loss value satisfy a preset loss value condition, wherein the first loss value is calculated based on the original image samples corresponding to the downsampled image samples and the super-resolution processed downsampled image samples, and the second loss value is calculated based on the original image samples and the super-resolution processed target mask samples.
In some embodiments, after extracting the image features corresponding to the downsampled image samples using the first sub-model to be trained and performing super-resolution processing on the downsampled image samples, the method further includes:
calculating the first loss value based on a mean square error function, according to the original image samples and the super-resolution processed downsampled image samples.
In some embodiments, after performing super-resolution processing on the target mask samples using the second sub-model to be trained in combination with the image features extracted by the first sub-model to be trained, the method further includes:
acquiring a first edge map corresponding to the original image samples;
performing edge detection on the super-resolution processed target mask samples to obtain a second edge map;
performing edge matching on the first edge map and the second edge map, and determining the second loss value according to the edge matching result.
In some embodiments, the target area is a portrait area, and the target image is a portrait image.
In a second aspect, an embodiment of the present disclosure further provides an electronic device, including:
one or more processors;
a memory for storing one or more programs;
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the image processing method of any of the above embodiments.
In a third aspect, an embodiment of the present disclosure further provides a non-transitory computer-readable medium on which a computer program is stored, wherein, when executed, the program implements the image processing method of any of the above embodiments.
The accompanying drawings are provided for a further understanding of the present disclosure and constitute a part of the specification. Together with the embodiments of the present disclosure, they serve to explain the present disclosure and do not constitute a limitation on it. The above and other features and advantages will become more apparent to those skilled in the art from the description of detailed example embodiments with reference to the accompanying drawings, in which:
Figure 1 is a flowchart of an image processing method provided by an embodiment of the present disclosure;
Figure 2 is a flowchart of a specific implementation of step S3 according to an embodiment of the present disclosure;
Figure 3 is a flowchart of another specific implementation of step S3 according to an embodiment of the present disclosure;
Figure 4 is a flowchart of a training method for a mask super-resolution model provided by an embodiment of the present disclosure;
Figure 5 is a flowchart of a specific implementation of step S02 according to an embodiment of the present disclosure;
Figure 6 is a flowchart of a specific implementation of step S2 according to an embodiment of the present disclosure;
Figure 7 is a schematic structural diagram of a mask super-resolution model provided by an embodiment of the present disclosure;
Figure 8 is a block diagram of an electronic device provided by an embodiment of the present disclosure;
Figure 9 is a block diagram of a non-transitory computer-readable medium provided by an embodiment of the present disclosure.
To enable those skilled in the art to better understand the technical solutions of the present disclosure, the image processing method, electronic device and non-transitory computer-readable medium provided by the present disclosure are described in detail below with reference to the accompanying drawings.
Example embodiments will be described more fully hereinafter with reference to the accompanying drawings, but they may be embodied in different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the present disclosure to those skilled in the art.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the present disclosure. As used herein, the singular forms "a/an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that when the terms "comprising" and/or "made of" are used in this specification, they specify the presence of the stated features, integers, steps, operations, elements and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof.
It will be understood that, although the terms first, second, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. Thus, a first element, a first component or a first module discussed below could be termed a second element, a second component or a second module without departing from the teachings of the present disclosure.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art. It will be further understood that terms such as those defined in commonly used dictionaries should be interpreted as having a meaning that is consistent with their meaning in the context of the related art and the present disclosure, and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
Figure 1 is a flowchart of an image processing method provided by an embodiment of the present disclosure. As shown in Figure 1, the method includes:
Step S1: Downsample the original image according to a preset resolution to generate a downsampled image.
The preset resolution is smaller than the resolution of the original image. Downsampling the original image corresponds to a process of scaling the original image, which generates a downsampled image of lower resolution; the resolution of the downsampled image is the preset resolution. The number of pixels in the downsampled image is reduced compared to the original image, and the time consumed by operations and processing on the downsampled image is reduced accordingly; the reduction ratio can be approximated as the ratio between the preset resolution and the resolution of the original image.
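As an illustration of step S1, the following minimal sketch downsamples an image; OpenCV and the INTER_AREA interpolation are assumptions chosen for illustration, and the 512*512 preset resolution is the value used in the embodiments below.

```python
# Step S1 sketch: generate a downsampled image at the preset resolution.
# OpenCV and INTER_AREA interpolation are illustrative assumptions.
import cv2

def downsample(original_path: str, preset=(512, 512)):
    original = cv2.imread(original_path)            # H x W x 3 BGR array
    assert original is not None, "could not read the original image"
    # INTER_AREA is a common choice when shrinking an image.
    downsampled = cv2.resize(original, preset, interpolation=cv2.INTER_AREA)
    return original, downsampled
```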
Step S2: Extract the target area in the downsampled image to obtain a first target mask.
A mask is a single-channel image that can be used during image processing to occlude all or part of the image to be processed, so as to control the object area and progress of the processing. In this embodiment, the first target mask is the mask corresponding to the extracted target area, in which the target area serves as the foreground area of the downsampled image and the other parts serve as the background area; applying the first target mask to the downsampled image yields a downsampled image in which only the foreground area is retained. In some embodiments, the mask is a binary image composed of 0s and 1s; in other embodiments, the mask may also be a multi-value image.
In some embodiments, the target area is a portrait area and the target image is a portrait image, in which case step S2 corresponds to a portrait matting process. It should be noted that taking a portrait as the target is only one specific implementation provided by the embodiments of the present disclosure and does not limit the technical solution of the present disclosure; other target types are equally applicable, such as animals and plants, vehicles and other means of transportation, license plates, etc. Specifically, a target of the corresponding type should meet at least one of the following conditions: it has a specific shape; it has a relatively clear outline; its area position in the image can be determined using a corresponding detection algorithm.
Step S3: Input the downsampled image and the first target mask into a pre-trained mask super-resolution model, and perform super-resolution processing on the first target mask using the mask super-resolution model to obtain a second target mask.
The resolution of the second target mask is higher than that of the first target mask. In step S3, the mask super-resolution model performs super-resolution processing on the first target mask in combination with the downsampled image; super-resolution (Super Resolution, SR) processing corresponds to the process of reconstructing a high-resolution image from a low-resolution image.
In some embodiments, the mask super-resolution model is trained in advance based on original image samples, downsampled image samples and target mask samples.
Step S4: Fuse the second target mask and the original image to obtain a target image.
The target image is the final matting result for the target in the original image. In some embodiments, the second target mask is a binary image, and the second target mask and the original image are fused by multiplication; or, in some embodiments, as mentioned above, the second target mask is a single-channel image, and the second target mask and the original image can be fused through channel fusion; or, in some embodiments, the second target mask and the original image are fused through Poisson fusion.
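A minimal sketch of the multiplicative fusion of step S4 is given below, assuming a single-channel mask already upscaled to the original image's resolution; channel fusion and Poisson fusion (e.g., OpenCV's seamlessClone) are the alternatives mentioned above and are not shown.

```python
# Step S4 sketch: multiplicative fusion of the second target mask with the
# original image. The mask is assumed single-channel, at the original
# resolution, with values in {0, 1} or [0, 1].
import numpy as np

def fuse(original: np.ndarray, mask: np.ndarray) -> np.ndarray:
    mask3 = mask.astype(np.float32)[..., None]      # broadcast over channels
    return (original.astype(np.float32) * mask3).astype(original.dtype)
```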
The embodiments of the present disclosure provide an image processing method that can be used to extract the target area in the downsampled image of an original image to obtain a first target mask; input the downsampled image and the first target mask into a mask super-resolution model and perform super-resolution processing on the first target mask using the model to obtain a second target mask; and fuse the second target mask with the original image to obtain the target image. By raising the resolution of the mask corresponding to the target, the overall precision of the target extraction process is improved, which effectively avoids the jagged-edge problem that arises when extracting and matting targets from large, high-resolution images.
Figure 2 is a flowchart of a specific implementation of step S3 according to an embodiment of the present disclosure. Specifically, the mask super-resolution model includes a first sub-model and a second sub-model; as shown in Figure 2, performing super-resolution processing on the first target mask using the mask super-resolution model in step S3 to obtain the second target mask includes steps S301 and S302.
Step S301: Extract image features corresponding to the downsampled image using the first sub-model.
Step S302: Input the image features extracted by the first sub-model into the second sub-model, and perform super-resolution processing on the first target mask using the second sub-model, in combination with the image features extracted by the first sub-model, to obtain the second target mask.
In the process of performing super-resolution processing on the first target mask using the mask super-resolution model, the downsampled image and the first target mask are input into the first sub-model and the second sub-model respectively. The first sub-model extracts the image features of the downsampled image and outputs them to the second sub-model; in addition, in some embodiments, the first sub-model can finally output the super-resolution processed downsampled image, and this output can be used to calibrate the model input and output and to check the super-resolution effect. The second sub-model performs super-resolution processing on the first target mask in combination with the image features extracted by the first sub-model and finally outputs the second target mask. In some embodiments, feature maps can be spliced through a concatenation layer (Concat), and in some embodiments, feature fusion can be achieved through 1*1 convolution and channel-dimension pooling, thereby combining the image features extracted by the first sub-model in the super-resolution processing of the first target mask.
Figure 3 is a flowchart of another specific implementation of step S3 according to an embodiment of the present disclosure. Specifically, this method is a concrete optional implementation based on the method shown in Figure 2. On that basis, the first sub-model includes P levels of first operation modules connected in sequence, and each level of first operation module includes a first operation unit and a second operation unit, where P is a positive integer greater than 1. As shown in Figure 3, when performing step S301 of extracting the image features corresponding to the downsampled image using the first sub-model, for the n-th level first operation module (n is a positive integer not greater than P), the step includes steps S3011 and S3012.
Step S3011: Use the first operation unit of the first operation module at this level to perform image feature extraction based on the downsampled image or the first feature map output by the first operation module of the level above, generate a second feature map, and output the second feature map to the second sub-model.
The multiple levels of first operation modules perform image feature extraction on the downsampled image several times, level by level. Specifically, when n = 1, i.e., for the first-level first operation module, its first operation unit performs image feature extraction directly on the downsampled image; when n > 1, the first operation unit of the current level performs image feature extraction on the first feature map output by the first operation module of the level above. Based on this multi-pass feature extraction procedure, the image features corresponding to the downsampled image include both the image features of the downsampled image itself and the image features of its feature maps.
Step S3012: Use the second operation unit of the first operation module at this level to enlarge the size of the second feature map, and output the enlarged second feature map to the first operation module of the next level.
The second feature map output by the first operation module at this level is the first feature map received by the first operation module of the next level. In some embodiments, for the last-level first operation module, the enlarged feature map it outputs is the super-resolution processed downsampled image.
In some embodiments, the first operation unit includes a convolution layer, a batch normalization layer and an excitation layer connected in sequence, and the second operation unit includes a transposed convolution layer, also known as a deconvolution layer. Step S3012, enlarging the size of the feature map corresponding to the extracted image features using the second operation unit of the first operation module at this level, then specifically includes: performing transposed convolution on the feature map corresponding to the extracted image features using the second operation unit of the first operation module at this level, so as to enlarge the size of the feature map.
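A sketch of one level of the first sub-model under these embodiments is shown below. PyTorch is an assumption (the patent names no framework), and the kernel sizes and channel counts are illustrative, not specified by the source.

```python
# One level of the first sub-model: a first operation unit ("CBR":
# convolution -> batch normalization -> excitation) followed by a second
# operation unit (transposed convolution that doubles H and W).
import torch.nn as nn

class FirstOperationModule(nn.Module):
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.cbr = nn.Sequential(                    # first operation unit
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )
        self.t_conv = nn.ConvTranspose2d(            # second operation unit
            out_ch, out_ch, kernel_size=4, stride=2, padding=1)

    def forward(self, x):
        feat = self.cbr(x)    # second feature map, also sent to sub-model 2
        return feat, self.t_conv(feat)  # enlarged map goes to the next level
```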
In some embodiments, the parameter settings of the above convolution layer, batch normalization layer, excitation layer and transposed convolution layer may differ between first operation modules at different levels.
Specifically, corresponding to the operations of the above layers, in the embodiments of the present disclosure the term "convolution kernel" refers to the two-dimensional matrix used in the convolution process. Optionally, each of the multiple entries in the two-dimensional matrix has a specific value.
In the embodiments of the present disclosure, the term "convolution" refers to a process of processing an image. A convolution kernel is used for convolution: each pixel of the input image has a value, and the convolution kernel starts at one pixel of the input image and moves sequentially over each pixel. At each position, the convolution kernel overlaps several pixels of the image according to its scale; the value of each overlapping pixel is multiplied by the corresponding value of the convolution kernel to obtain a product, and all the products of the overlapping pixels are summed to obtain a sum corresponding to that position of the convolution kernel on the input image. By moving the convolution kernel over each pixel of the input image, all the sums corresponding to all positions of the kernel are collected and output to form the output image. In some embodiments, convolution can use different convolution kernels to extract different features of the input image, and the convolution process can use different convolution kernels to add more features to the input image.
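The sliding-window computation described above can be written out directly; the sketch below is a single-channel, stride-1, "valid" version (like most deep-learning "convolutions", it is technically cross-correlation, i.e., the kernel is not flipped).

```python
# Manual illustration of the convolution described above.
import numpy as np

def convolve2d(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # multiply overlapping pixels by kernel values and sum them
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.array([[0, -1, 0], [-1, 4, -1], [0, -1, 0]], dtype=float)
print(convolve2d(image, kernel))   # 3x3 map of sums, one per kernel position
```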
The convolution layer performs convolution on an input image to obtain an output image. Optionally, different convolution kernels are used to perform different convolutions on the same input image; optionally, different convolution kernels are used to perform convolution on different parts of the same input image; optionally, different convolution kernels are used to perform convolution on different input images, for example, when multiple images are input to the convolution layer, a corresponding kernel is used to convolve each of them; optionally, different convolution kernels are used depending on the situation of the input image.
The excitation layer performs a non-linear mapping on the output signal of the convolution layer. Various functions can be used in the excitation layer; examples of functions suitable for the excitation layer include, but are not limited to, the rectified linear unit (ReLU) function, the sigmoid function and the hyperbolic tangent function (e.g., the tanh function). In some embodiments, the excitation layer and the batch normalization layer are included in the convolution layer.
The batch normalization (BN) layer standardizes the outputs of each layer of the network model over a mini-batch of data. Standardization is the process of making the data conform to a standard normal distribution with a mean of 0 and a standard deviation of 1, which can alleviate the vanishing gradient problem in neural network models.
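The standardization performed by the batch normalization layer corresponds to the following computation (a stripped-down sketch; a full BN layer also keeps running statistics and learnable scale/shift parameters, which are omitted here).

```python
# Standardize a mini-batch to zero mean and unit standard deviation.
import numpy as np

def standardize(batch: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    mean = batch.mean(axis=0)
    std = batch.std(axis=0)
    return (batch - mean) / (std + eps)  # eps guards against division by zero
```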
Specifically, based on step S3011, step S302 of performing super-resolution processing on the first target mask using the second sub-model, in combination with the image features extracted by the first sub-model, to obtain the second target mask includes: performing super-resolution processing on the first target mask using the second sub-model, in combination with the second feature map extracted by the first operation module at each level, to obtain the second target mask.
In some embodiments, the second sub-model includes P levels of second operation modules connected in sequence, each level of second operation module including a splicing layer, a third operation unit and a fourth operation unit, where P is a positive integer greater than 1 and the first operation unit and the third operation unit at the same level are connected through a splicing layer. Thus, in some embodiments, as shown in Figure 3, when performing the above step of using the second sub-model, in combination with the second feature map extracted by the first operation module at each level, to perform super-resolution processing on the first target mask and obtain the second target mask, for the m-th level second operation module (m is a positive integer not greater than P), the step includes steps S3021 to S3023.
Step S3021: Use the splicing layer of the second operation module at this level to splice the second feature map and the first target mask, or to splice the second feature map and the third feature map output by the second operation module of the level above, and generate a fourth feature map.
When m = 1, i.e., for the first-level second operation module, its splicing layer splices the second feature map and the first target mask; when m > 1, the splicing layer of the current level splices the second feature map and the third feature map output by the second operation module of the level above.
In some embodiments, the number of levels of the multi-level first operation modules is the same as the number of levels of the multi-level second operation modules.
In some embodiments, the splicing layer performs concatenation along the channel dimension.
Step S3022: Use the third operation unit of the second operation module at this level to perform image feature extraction based on the fourth feature map and generate a fifth feature map.
Step S3023: Use the fourth operation unit of the second operation module at this level to enlarge the size of the fifth feature map, and output the enlarged fifth feature map to the second operation module of the next level.
The fifth feature map output by the second operation module at this level is the third feature map received by the second operation module of the next level; for the last-level second operation module, its enlarged fifth feature map is taken as the second target mask and output.
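Mirroring the first sub-model sketch above, one level of the second sub-model can be sketched as follows, again assuming PyTorch and illustrative channel counts; the splicing layer is a channel-dimension concatenation.

```python
# One level of the second sub-model: splicing layer (channel concat), third
# operation unit (CBR) and fourth operation unit (transposed convolution).
import torch
import torch.nn as nn

class SecondOperationModule(nn.Module):
    def __init__(self, feat_ch: int, in_ch: int, out_ch: int):
        super().__init__()
        self.cbr = nn.Sequential(                    # third operation unit
            nn.Conv2d(feat_ch + in_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )
        self.t_conv = nn.ConvTranspose2d(            # fourth operation unit
            out_ch, out_ch, kernel_size=4, stride=2, padding=1)

    def forward(self, second_feat, mask_or_third_feat):
        fourth = torch.cat([second_feat, mask_or_third_feat], dim=1)  # splice
        fifth = self.cbr(fourth)             # fifth feature map
        return self.t_conv(fifth)            # enlarged, to the next level
```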
In some embodiments, similarly to the first operation module, the third operation unit includes a convolution layer, a batch normalization layer and an excitation layer connected in sequence, and the fourth operation unit includes a transposed convolution layer.
In some embodiments, the parameter settings of the above convolution layer, batch normalization layer, excitation layer and transposed convolution layer may differ between second operation modules at different levels.
The embodiments of the present disclosure provide an image processing method that can perform super-resolution processing on the mask corresponding to the target in combination with the image features of the downsampled image, increasing its feature dimensions and improving the precision of target extraction.
Figure 4 is a flowchart of a training method for a mask super-resolution model provided by an embodiment of the present disclosure. Specifically, the mask super-resolution model is the one corresponding to Figure 2, which includes a first sub-model and a second sub-model; as shown in Figure 4, the mask super-resolution model is trained through the following steps:
Step S01: Input downsampled image samples and their corresponding target mask samples into the mask super-resolution model to be trained.
The downsampled image samples are obtained by downsampling their corresponding original image samples, and the target mask samples are obtained by performing target extraction on the downsampled image samples.
Step S02: Train the mask super-resolution model to be trained in an iterative manner based on the downsampled image samples and the target mask samples.
Figure 5 is a flowchart of a specific implementation of step S02 according to an embodiment of the present disclosure. As shown in Figure 5, step S02 includes steps S021 and S022.
Step S021: Use the first sub-model to be trained to extract image features corresponding to the downsampled image samples, and perform super-resolution processing on the downsampled image samples.
Step S022: Input the image features extracted by the first sub-model to be trained into the second sub-model to be trained, and use the second sub-model to be trained, in combination with the image features extracted by the first sub-model to be trained, to perform super-resolution processing on the target mask samples.
Similarly to the image features corresponding to the downsampled image, the image features corresponding to a downsampled image sample include the image features of the sample itself and the image features of its feature maps; the above process of training the first sub-model and the second sub-model corresponds to the actual inference process of the two sub-models.
Step S03: In response to a preset convergence condition being satisfied, end the training to obtain the mask super-resolution model.
In some embodiments, the preset convergence condition includes at least one of the following: a preset number of iterations has been trained; the first loss value and the second loss value satisfy a preset loss value condition.
The first loss value is calculated based on the original image samples corresponding to the downsampled image samples and the super-resolution processed downsampled image samples, and the second loss value is calculated based on the original image samples and the super-resolution processed target mask samples.
In some embodiments, after step S021 of extracting the image features corresponding to the downsampled image samples using the first sub-model to be trained and performing super-resolution processing on the downsampled image samples, the method further includes: calculating the first loss value based on a mean square error (MSE) function, according to the original image samples and the super-resolution processed downsampled image samples.
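A one-line sketch of the first loss value, assuming both tensors share the same shape and value range:

```python
# First loss value: mean square error between the original image sample and
# the super-resolution processed downsampled image sample.
import torch.nn.functional as F

def first_loss(sr_image, original_image):
    return F.mse_loss(sr_image, original_image)
```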
In some embodiments, after step S022 of performing super-resolution processing on the target mask samples using the second sub-model to be trained in combination with the image features extracted by the first sub-model to be trained, the method further includes: acquiring a first edge map corresponding to the original image samples; performing edge detection on the super-resolution processed target mask samples to obtain a second edge map; and performing edge matching on the first edge map and the second edge map, and determining the second loss value according to the edge matching result. In some embodiments, edge detection is performed on the original image samples to obtain the first edge map, or a pre-computed first edge map is read from a storage area.
In some embodiments, the second loss value L_EM is calculated using a formula in which EM1(x_i, y_i) denotes the pixel value at pixel coordinates (x_i, y_i) in the first edge map, EM2(x_i, y_i) denotes the pixel value at pixel coordinates (x_i, y_i) in the second edge map, and i ∈ [1, n]; in some embodiments, the edge maps are 8-bit grayscale images, with EM1(x_i, y_i) > 127 and EM2(x_i, y_i) > 127.
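The patent's edge-matching formula is not reproduced in this text, so the sketch below substitutes a simple illustrative matching score (the fraction of first-edge-map pixels also present in the second edge map); only the 8-bit edge maps and the >127 threshold follow the description above, and the Canny detector and the scoring itself are hypothetical stand-ins.

```python
# Hypothetical sketch of the second loss value from two edge maps.
import cv2
import numpy as np

def second_loss(original_sample: np.ndarray, sr_mask: np.ndarray) -> float:
    gray = (cv2.cvtColor(original_sample, cv2.COLOR_BGR2GRAY)
            if original_sample.ndim == 3 else original_sample)
    em1 = cv2.Canny(gray, 100, 200)                   # first edge map, 8-bit
    # sr_mask is assumed to be an 8-bit (0-255) single-channel mask image
    em2 = cv2.Canny(sr_mask.astype(np.uint8), 100, 200)  # second edge map
    e1, e2 = em1 > 127, em2 > 127          # edge pixels, per the text above
    matched = np.logical_and(e1, e2).sum()
    return 1.0 - matched / max(e1.sum(), 1)   # lower when the edges agree
```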
Figure 6 is a flowchart of a specific implementation of step S2 according to an embodiment of the present disclosure. As shown in Figure 6, step S2, extracting the target area in the downsampled image and obtaining the first target mask, includes step S201.
Step S201: Input the downsampled image into a pre-trained target extraction model, and extract the target area in the downsampled image using the target extraction model to obtain a first target mask.
The target extraction model adopts a UNet network model whose input and output images both have a resolution of 512*512; accordingly, in step S1, the preset resolution is 512*512. Specifically, the first target mask obtained by the target extraction model is input into the mask super-resolution model, which includes a first sub-model and a second sub-model; the first sub-model includes three levels of first operation modules connected in sequence, and the second sub-model includes three levels of second operation modules connected in sequence, with the specific layer settings of the first and second operation modules being the same as those corresponding to Figure 3. Thus, the three levels of first operation modules can be used to perform image feature extraction on the downsampled image three times, and the three levels of second operation modules perform super-resolution processing on the first target mask in combination with the image features extracted at each pass.
In some embodiments, each level of second operation module uses its fourth operation unit to enlarge the size of its fifth feature map; based on the corresponding model parameter settings, the size of the feature map can be doubled each time, so that the resolution of the finally output second target mask can be 4096*4096, which is applicable to 4K scenarios. Specifically, the twofold enlargement can be achieved by setting the padding parameter of the transposed convolution layer, e.g., setting it to "same".
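The twofold enlargement can be checked numerically. The "same" padding parameter mentioned above is TensorFlow/Keras-style; the PyTorch equivalent below (kernel_size=4, stride=2, padding=1) is an assumption that produces the same doubling, taking a 512*512 input to 4096*4096 after three levels.

```python
# Each transposed convolution maps H x W to 2H x 2W:
# out = (H - 1) * stride - 2 * padding + kernel_size = 2H.
import torch
import torch.nn as nn

up = nn.ConvTranspose2d(1, 1, kernel_size=4, stride=2, padding=1)
x = torch.randn(1, 1, 512, 512)
for _ in range(3):                # three levels of second operation modules
    x = up(x)
print(x.shape)                    # torch.Size([1, 1, 4096, 4096])
```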
The image processing method provided by the embodiments of the present disclosure is described in detail below in connection with a practical application. Specifically, taking portrait matting as the example application, the target area in the downsampled image is a portrait area, and the finally obtained target image is a portrait image.
In a specific implementation, the original image is first downsampled according to a preset resolution to generate a downsampled image, where the original image is a 4K image containing a portrait and the preset resolution is 512*512.
The downsampled image is input into a pre-trained target extraction model, and the portrait area in the downsampled image is extracted using this model to obtain the first target mask; this target extraction model is specifically intended for portrait matting and adopts a UNet network model.
The downsampled image and the first target mask are input into the pre-trained mask super-resolution model, which includes a first sub-model and a second sub-model. The first sub-model includes three levels of first operation modules connected in sequence (i.e., P = 3), each level including a first operation unit and a second operation unit, and the downsampled image is input into the first sub-model. The second sub-model includes three levels of second operation modules connected in sequence, each level including a splicing layer, a third operation unit and a fourth operation unit, wherein the first operation unit and the third operation unit at the same level are connected through a splicing layer, and the first target mask is input into the second sub-model. Specifically, for the n-th level first operation module (n is a positive integer not greater than 3), its first operation unit performs image feature extraction based on the downsampled image or the first feature map output by the first operation module of the level above, generating a second feature map, and its second operation unit enlarges the size of the second feature map and outputs the enlarged second feature map to the first operation module of the next level; the first operation unit of the first-level module extracts image features directly from the downsampled image, while the first operation units of the second-level and third-level modules extract image features from the feature maps output by the first-level and second-level modules respectively, and the third-level module outputs its enlarged second feature map directly, this second feature map being the downsampled image after super-resolution processing; in some embodiments, the first operation unit includes a convolution layer, a batch normalization layer and an excitation layer connected in sequence, and the second operation unit includes a transposed convolution layer. Specifically, for the m-th level second operation module (m is a positive integer not greater than 3), its splicing layer splices the second feature map output by the first operation module at the same level with the first target mask, or splices the second feature map with the third feature map output by the second operation module of the level above, generating a fourth feature map; its third operation unit extracts image features from the fourth feature map, generating a fifth feature map; and its fourth operation unit enlarges the size of the fifth feature map and outputs the enlarged fifth feature map to the second operation module of the next level; the splicing layer of the first-level module splices the feature map output by the first-level first operation module with the first target mask, while the splicing layers of the second-level and third-level modules splice the feature maps output by the first-level second operation module and the second-level first operation module, and by the second-level second operation module and the third-level first operation module, respectively; the third-level module outputs its enlarged fifth feature map directly, this fifth feature map being the second target mask; in some embodiments, the third operation unit includes a convolution layer, a batch normalization layer and an excitation layer connected in sequence, and the fourth operation unit includes a transposed convolution layer.
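Putting the pieces together, the P = 3 walkthrough above can be sketched end to end as follows, reusing the FirstOperationModule and SecondOperationModule sketches from earlier; the channel width and the 1*1 output heads are illustrative assumptions, not the patented implementation.

```python
# End-to-end sketch of the three-level mask super-resolution model.
import torch
import torch.nn as nn

class MaskSuperResolutionModel(nn.Module):
    def __init__(self, ch: int = 16):
        super().__init__()
        self.first = nn.ModuleList([
            FirstOperationModule(3, ch),        # level 1 reads the LR image
            FirstOperationModule(ch, ch),       # level 2
            FirstOperationModule(ch, ch),       # level 3
        ])
        self.second = nn.ModuleList([
            SecondOperationModule(ch, 1, ch),   # level 1 splices MASK_LR
            SecondOperationModule(ch, ch, ch),  # levels 2 and 3 splice the
            SecondOperationModule(ch, ch, ch),  # previous third feature map
        ])
        self.to_hr = nn.Conv2d(ch, 3, kernel_size=1)    # HR image head
        self.to_mask = nn.Conv2d(ch, 1, kernel_size=1)  # HR mask head

    def forward(self, lr_image, mask_lr):
        x, feats = lr_image, []
        for module in self.first:
            feat, x = module(x)         # second feature map, then enlarged
            feats.append(feat)
        y = mask_lr
        for feat, module in zip(feats, self.second):
            y = module(feat, y)         # splice, extract, enlarge
        return self.to_hr(x), self.to_mask(y)   # HR image and MASK_HR

model = MaskSuperResolutionModel()
hr, mask_hr = model(torch.randn(1, 3, 512, 512), torch.randn(1, 1, 512, 512))
print(hr.shape, mask_hr.shape)   # both 4096 x 4096 after three 2x levels
```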
Finally, the second target mask and the original image are fused to obtain the portrait image.
Figure 7 is a schematic structural diagram of a mask super-resolution model provided by an embodiment of the present disclosure. As shown in Figure 7, the arrows in the figure indicate the direction of data transfer. The mask super-resolution model includes a first sub-model and a second sub-model. The first sub-model includes three levels of first operation modules 301 connected in sequence, each level of first operation module 301 including a first operation unit CBR1 and a second operation unit T_conv1; the downsampled image LR is input into the first sub-model, which outputs the super-resolution processed downsampled image HR. The second sub-model includes three levels of second operation modules 401 connected in sequence, each level of second operation module 401 including a splicing layer (not shown in the figure), a third operation unit CBR2 and a fourth operation unit T_conv2, wherein the first operation unit CBR1 and the third operation unit CBR2 at the same level are connected through a splicing layer; the first target mask MASK_LR is input into the second sub-model, which outputs the second target mask MASK_HR. The internal layer settings of CBR1 and CBR2 are similar; as shown in Figure 7, each includes a convolution layer Conv, a batch normalization layer Batch_norm and an excitation layer ReLU.
Figure 8 is a block diagram of an electronic device provided by an embodiment of the present disclosure. As shown in Figure 8, the electronic device includes:
one or more processors 101;
a memory 102 on which one or more programs are stored, which, when executed by the one or more processors, cause the one or more processors 101 to implement the image processing method of any of the above embodiments; and
one or more I/O interfaces 103, connected between the processors and the memory and configured to enable information exchange between the processors and the memory.
The processor 101 is a device with data processing capability, including but not limited to a central processing unit (CPU); the memory 102 is a device with data storage capability, including but not limited to random access memory (RAM, more specifically SDRAM, DDR, etc.), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM) and flash memory (FLASH); the I/O (read-write) interface 103 is connected between the processor 101 and the memory 102 and enables information exchange between them, and includes but is not limited to a data bus (Bus).
In some embodiments, the processor 101, memory 102 and I/O interface 103 are connected to one another via a bus 104, and in turn to the other components of the computing device.
In some embodiments, the multiple processors 101 include multiple graphics processing units (GPUs), which are combined to form a GPU array.
Figure 9 is a block diagram of a non-transitory computer-readable medium provided by an embodiment of the present disclosure. A computer program is stored on the computer-readable medium, and when executed by a processor, the computer program implements the image processing method of any of the above embodiments.
Those of ordinary skill in the art will appreciate that all or some of the steps in the methods disclosed above, and the functional modules/units in the devices, may be implemented as software, firmware, hardware, or suitable combinations thereof. In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed cooperatively by several physical components. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor or microprocessor, or as hardware, or as an integrated circuit such as an application-specific integrated circuit. Such software may be distributed on non-transitory computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embody computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.
Example embodiments have been disclosed herein, and although specific terms are employed, they are used and should be interpreted in a generic and descriptive sense only and not for purposes of limitation. In some instances, as will be apparent to those skilled in the art, features, characteristics and/or elements described in connection with a particular embodiment may be used alone, or in combination with features, characteristics and/or elements described in connection with other embodiments, unless expressly stated otherwise. Accordingly, those skilled in the art will understand that various changes in form and detail may be made without departing from the scope of the present disclosure as set forth in the appended claims.
Claims (14)
- An image processing method, comprising: downsampling an original image according to a preset resolution to generate a downsampled image; extracting a target area in the downsampled image to obtain a first target mask; inputting the downsampled image and the first target mask into a pre-trained mask super-resolution model; performing super-resolution processing on the first target mask using the mask super-resolution model to obtain a second target mask, wherein the resolution of the second target mask is higher than the resolution of the first target mask; and fusing the second target mask and the original image to obtain a target image.
- The image processing method according to claim 1, wherein the mask super-resolution model comprises a first sub-model and a second sub-model; and performing super-resolution processing on the first target mask using the mask super-resolution model to obtain the second target mask comprises: extracting image features corresponding to the downsampled image using the first sub-model; inputting the image features extracted by the first sub-model into the second sub-model; and performing super-resolution processing on the first target mask using the second sub-model, in combination with the image features extracted by the first sub-model, to obtain the second target mask.
- The image processing method according to claim 2, wherein the first sub-model comprises P levels of first operation modules connected in sequence, each level of first operation module comprising a first operation unit and a second operation unit, where P is a positive integer greater than 1; extracting the image features corresponding to the downsampled image using the first sub-model comprises: for the n-th level first operation module, where n is a positive integer not greater than P: performing image feature extraction using its first operation unit, based on the downsampled image or the first feature map output by the first operation module of the level above, to generate a second feature map, and outputting the second feature map to the second sub-model; and enlarging the size of the second feature map using its second operation unit, and outputting the enlarged second feature map to the first operation module of the next level; and performing super-resolution processing on the first target mask using the second sub-model, in combination with the image features extracted by the first sub-model, to obtain the second target mask comprises: performing super-resolution processing on the first target mask using the second sub-model, in combination with the second feature map extracted by the first operation module at each level, to obtain the second target mask.
- The image processing method according to claim 3, wherein the first operation unit comprises a convolution layer, a batch normalization layer and an excitation layer connected in sequence, and the second operation unit comprises a transposed convolution layer.
- The image processing method according to claim 3, wherein the second sub-model comprises P levels of second operation modules connected in sequence, each level of second operation module comprising a splicing layer, a third operation unit and a fourth operation unit, wherein the first operation unit and the third operation unit at the same level are connected through a splicing layer; and performing super-resolution processing on the first target mask using the second sub-model, in combination with the second feature map extracted by the first operation module at each level, to obtain the second target mask comprises: for the m-th level second operation module, where m is a positive integer not greater than P: splicing, using its splicing layer, the second feature map and the first target mask, or the second feature map and the third feature map output by the second operation module of the level above, to generate a fourth feature map; performing image feature extraction using its third operation unit based on the fourth feature map to generate a fifth feature map; and enlarging the size of the fifth feature map using its fourth operation unit, and outputting the enlarged fifth feature map to the second operation module of the next level; wherein, for the last-level second operation module, its enlarged fifth feature map is taken as the second target mask and output.
- The image processing method according to claim 5, wherein the third operation unit comprises a convolution layer, a batch normalization layer and an excitation layer connected in sequence, and the fourth operation unit comprises a transposed convolution layer.
- The image processing method according to claim 5, wherein extracting the target area in the downsampled image and obtaining the first target mask comprises: inputting the downsampled image into a pre-trained target extraction model, and extracting the target area in the downsampled image using the target extraction model to obtain the first target mask; wherein the target extraction model is a UNet network model, the first sub-model comprises three levels of first operation modules connected in sequence, and the second sub-model comprises three levels of second operation modules connected in sequence.
- The image processing method according to claim 2, wherein the mask super-resolution model is trained through the following steps: inputting downsampled image samples and their corresponding target mask samples into the mask super-resolution model to be trained; training the mask super-resolution model to be trained in an iterative manner based on the downsampled image samples and the target mask samples, wherein image features corresponding to the downsampled image samples are extracted using the first sub-model to be trained, and super-resolution processing is performed on the downsampled image samples; the image features extracted by the first sub-model to be trained are input into the second sub-model to be trained; and super-resolution processing is performed on the target mask samples using the second sub-model to be trained, in combination with the image features extracted by the first sub-model to be trained; and in response to a preset convergence condition being satisfied, ending the training to obtain the mask super-resolution model.
- The image processing method according to claim 8, wherein the preset convergence condition comprises at least one of the following: a preset number of iterations has been trained; and a first loss value and a second loss value satisfy a preset loss value condition, wherein the first loss value is calculated based on the original image samples corresponding to the downsampled image samples and the super-resolution processed downsampled image samples, and the second loss value is calculated based on the original image samples and the super-resolution processed target mask samples.
- The image processing method according to claim 9, wherein, after extracting the image features corresponding to the downsampled image samples using the first sub-model to be trained and performing super-resolution processing on the downsampled image samples, the method further comprises: calculating the first loss value based on a mean square error function, according to the original image samples and the super-resolution processed downsampled image samples.
- The image processing method according to claim 9, wherein, after performing super-resolution processing on the target mask samples using the second sub-model to be trained in combination with the image features extracted by the first sub-model to be trained, the method further comprises: acquiring a first edge map corresponding to the original image samples; performing edge detection on the super-resolution processed target mask samples to obtain a second edge map; and performing edge matching on the first edge map and the second edge map, and determining the second loss value according to the edge matching result.
- The image processing method according to any one of claims 1-11, wherein the target area is a portrait area and the target image is a portrait image.
- An electronic device, comprising: one or more processors; and a memory for storing one or more programs; wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the image processing method according to any one of claims 1-12.
- A non-transitory computer-readable medium on which a computer program is stored, wherein, when the program is executed, the image processing method according to any one of claims 1-12 is implemented.
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US18/273,026 US20240087085A1 (en) | 2021-10-29 | 2021-10-29 | Image processing method, electronic device, and non-transitory computer readable medium |
CN202180003148.9A CN116368512A (zh) | 2021-10-29 | 2021-10-29 | Image processing method, electronic device and non-transitory computer-readable medium |
PCT/CN2021/127282 WO2023070495A1 (zh) | 2021-10-29 | 2021-10-29 | Image processing method, electronic device and non-transitory computer-readable medium |
DE112021008413.5T DE112021008413T5 (de) | 2021-10-29 | 2021-10-29 | Image processing method, electronic device and non-transitory computer-readable medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2021/127282 WO2023070495A1 (zh) | 2021-10-29 | 2021-10-29 | Image processing method, electronic device and non-transitory computer-readable medium |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2023070495A1 WO2023070495A1 (zh) | 2023-05-04 |
WO2023070495A9 true WO2023070495A9 (zh) | 2024-01-11 |
Family
ID=86160396
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/127282 WO2023070495A1 (zh) | Image processing method, electronic device and non-transitory computer-readable medium | 2021-10-29 | 2021-10-29 |
Country Status (4)
Country | Link |
---|---|
US (1) | US20240087085A1 (zh) |
CN (1) | CN116368512A (zh) |
DE (1) | DE112021008413T5 (zh) |
WO (1) | WO2023070495A1 (zh) |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10909657B1 (en) * | 2017-09-11 | 2021-02-02 | Apple Inc. | Flexible resolution support for image and video style transfer |
CN112819720B (zh) * | 2021-02-02 | 2023-10-03 | Oppo广东移动通信有限公司 | 图像处理方法、装置、电子设备及存储介质 |
CN113449735B (zh) * | 2021-07-15 | 2023-10-31 | 北京科技大学 | 一种超像素分割的语义分割方法及装置 |
CN113763249A (zh) * | 2021-09-10 | 2021-12-07 | 平安科技(深圳)有限公司 | 文本图像超分辨率重建方法及其相关设备 |
- 2021-10-29: WO application PCT/CN2021/127282 filed (published as WO2023070495A1, active, Application Filing)
- 2021-10-29: CN application 202180003148.9A filed (published as CN116368512A, active, Pending)
- 2021-10-29: US application 18/273,026 filed (published as US20240087085A1, active, Pending)
- 2021-10-29: DE application 112021008413.5T filed (published as DE112021008413T5, active, Pending)
Also Published As
Publication number | Publication date |
---|---|
US20240087085A1 (en) | 2024-03-14 |
WO2023070495A1 (zh) | 2023-05-04 |
CN116368512A (zh) | 2023-06-30 |
DE112021008413T5 (de) | 2024-08-22 |
Legal Events
- 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 21961857; Country of ref document: EP; Kind code of ref document: A1
- WWE | WIPO information: entry into national phase | Ref document number: 18273026; Country of ref document: US
- WWE | WIPO information: entry into national phase | Ref document number: 202317084318; Country of ref document: IN
- WWE | WIPO information: entry into national phase | Ref document number: 112021008413; Country of ref document: DE