WO2024051690A1 - Image repair method and device, and electronic device - Google Patents
Image repair method and device, and electronic device
- Publication number
- WO2024051690A1 (PCT/CN2023/117018)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- cell
- area
- map
- feature map
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/77—Retouching; Inpainting; Scratch removal
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/70—Labelling scene content, e.g. deriving syntactic or semantic representations
Definitions
- Embodiments of the present disclosure relate to an image repair method and device, and an electronic device.
- Artificial intelligence is increasingly applied in the image field, typically to repair damaged original images or to remove occlusions from original images and generate new images. In new images produced by related-art processing, however, the processed area retains residual traces of the original image and the visual result is poor. A solution that can repair the modified areas of an image is therefore needed.
- The present disclosure provides an image repair method and device, and an electronic device.
- According to a first aspect, an image repair method is provided, including: acquiring a first image, where the first image is an image obtained by processing a target object in an original image; determining a first area to be repaired in the first image, where the first area is at least a partial area of the target object; acquiring a target semantic map corresponding to the first image; and repairing the first area based on the target semantic map to obtain a repaired second image.
- According to a second aspect, an image repair device is provided, including:
- a first acquisition module, used to acquire a first image, where the first image is an image obtained by processing a target object in an original image;
- a determination module, configured to determine a first area to be repaired in the first image, where the first area is at least a partial area of the target object;
- a second acquisition module, used to acquire a target semantic map corresponding to the first image; and
- a repair module, configured to repair the first area based on the target semantic map to obtain a repaired second image.
- According to a third aspect, a computer-readable storage medium is provided, which stores a computer program; when the computer program is executed by a processor, the above method is implemented.
- According to a fourth aspect, an electronic device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the program, the above method is implemented.
- Figure 1 is a schematic diagram of an image repair scene according to an exemplary embodiment of the present disclosure
- Figure 2 is a flow chart of an image repair method according to an exemplary embodiment of the present disclosure
- Figure 3 is a flow chart of another image repair method according to an exemplary embodiment of the present disclosure.
- Figure 4 is a block diagram of an image repair device according to an exemplary embodiment of the present disclosure.
- Figure 5 is a schematic block diagram of an electronic device provided by some embodiments of the present disclosure.
- Figure 6 is a schematic block diagram of another electronic device provided by some embodiments of the present disclosure.
- Figure 7 is a schematic diagram of a storage medium provided by some embodiments of the present disclosure.
- It should be understood that although the terms first, second, third, etc. may be used in this disclosure to describe various information, the information should not be limited to these terms. These terms are only used to distinguish information of the same type from each other.
- For example, without departing from the scope of the present disclosure, first information may also be called second information, and similarly, second information may also be called first information.
- Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".
- Artificial intelligence is increasingly applied in the image field, typically to repair damaged original images or to remove occlusions from original images and generate new images, for example, changing a person's long hair to short hair in a portrait, or removing trees or buildings from a landscape image. In new images produced by related-art processing, however, the processed area retains residual traces of the original image and the result is poor. Taking changing a person's long hair into short hair as an example, the occluded area exposed after the long hair is removed will have problems such as residual hair strands, unclear boundaries of the previously occluded clothes, and abnormal colors. Therefore, a solution that can repair the modified areas in an image is needed.
- The present disclosure provides an image repair solution that repairs at least part of the modified area in an image to be repaired by using the semantic map corresponding to the image to be repaired, thereby obtaining an image with a better display effect. Because the solution provided by this embodiment takes the semantic map of the image to be repaired into account when repairing the modified area, and that semantic map contains richer semantic information, the image can be repaired on the basis of richer semantic information. Residual traces of the original image in the repaired image are reduced, the boundaries between different semantic areas become clear, the texture is richer, and the image is more realistic.
- Referring to FIG. 1, a schematic diagram of an image repair scene is shown according to an exemplary embodiment.
- The solution of the present disclosure will be schematically explained below with reference to Figure 1 and a complete, specific application example.
- This application example describes a specific image repair process.
- As shown in Figure 1, the original image A is an image from which an occlusion needs to be removed or which has missing areas. After the original image A is modified (for example, by removing an occlusion or filling a missing area), an image B is obtained. Since the modified area a in image B suffers from problems such as loss of texture detail and unclear edges, image B needs to be further repaired with respect to area a.
- semantic segmentation processing can be performed on image B to obtain the semantic map C corresponding to image B, and obtain the information of area a.
- Then, a mask operation is performed on image B based on the information of area a, setting the pixel values of area a in image B to 0 to obtain image D.
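- As a rough illustration of this masking step only (not an implementation fixed by the disclosure), a minimal NumPy sketch is shown below; `image_b` and `mask_a` are hypothetical names for image B and a boolean mask of area a.

```python
import numpy as np

def mask_region(image_b: np.ndarray, mask_a: np.ndarray) -> np.ndarray:
    """Return image D: a copy of image B whose pixels inside area a are set to 0."""
    image_d = image_b.copy()
    image_d[mask_a] = 0  # mask_a is a boolean H x W array; this zeroes all channels of the masked pixels
    return image_d
```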
- the image D and the semantic map C are input into the pre-trained image repair network, and the image repair network performs repair for area a.
- the semantic map C used here is the semantic map corresponding to image B. This semantic map is essentially different from the semantic map corresponding to the original image A. Since the information of the area to be modified in the original image A is seriously missing, the semantic map corresponding to the original image A lacks the semantic information of the area to be modified.
- In the image repair network, image D can first be processed by a downsampling module, which downsamples image D and extracts its image features.
- the downsampling module can be composed of multiple convolutional layers, and the image D can be convolved by each convolutional layer in turn.
- the results of the convolution processing can be semantically corrected after each convolution processing.
- Specifically, based on the semantic map C, two parameters α and β (both vectors) can be learned through two different convolutional layers, and the parameters α and β can be used to semantically correct the feature map obtained by the convolution processing.
- For example, SPADE spatial adaptation can be used to perform the semantic correction according to the semantic map C. After the convolution processing of the multiple convolutional layers, a feature map to be repaired is obtained, which is then processed by the image repair module.
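- A minimal sketch of such a semantic correction block is shown below, assuming PyTorch; it is an illustrative SPADE-style modulation in which α and β are learned from the semantic map through two different convolutional layers. The layer sizes, the shared hidden layer, and the instance normalization are assumptions, not details specified by the disclosure.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SemanticCorrection(nn.Module):
    """Learn alpha and beta from the semantic map and use them to correct a feature map."""

    def __init__(self, feat_channels: int, sem_channels: int, hidden: int = 64):
        super().__init__()
        self.shared = nn.Sequential(nn.Conv2d(sem_channels, hidden, 3, padding=1), nn.ReLU())
        self.to_alpha = nn.Conv2d(hidden, feat_channels, 3, padding=1)  # produces alpha (scale)
        self.to_beta = nn.Conv2d(hidden, feat_channels, 3, padding=1)   # produces beta (shift)

    def forward(self, feat: torch.Tensor, sem_map: torch.Tensor) -> torch.Tensor:
        # Resize the semantic map (e.g. a one-hot encoding of semantic map C) to the feature size.
        sem = F.interpolate(sem_map, size=feat.shape[-2:], mode="nearest")
        h = self.shared(sem)
        alpha, beta = self.to_alpha(h), self.to_beta(h)
        # Normalize the convolution result, then modulate it with the learned parameters.
        return F.instance_norm(feat) * (1 + alpha) + beta
```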
- Specifically, based on the semantic map C, the unknown area corresponding to area a in the feature map to be repaired can be divided into multiple unknown sub-regions according to semantics, so that each unknown sub-region corresponds to only one semantic. The known area of the feature map to be repaired, i.e., everything outside the unknown area, is likewise divided into multiple known sub-regions, each of which corresponds to only one semantic.
- For any unknown sub-region, the initial features corresponding to that unknown sub-region in the feature map to be repaired can be determined, and its features can be reconstructed using the known sub-regions that have the same semantics, yielding reconstructed features (for the specific process, refer to the embodiment of Figure 3). By fusing the initial features with the reconstructed features through stacking processing, the repaired feature map is obtained.
- The repaired feature map is then upsampled to convert it into the repaired target image E.
- the upsampling module can be composed of multiple deconvolution layers, and the repaired feature maps can be deconvolved by each deconvolution layer in turn.
- Similarly, based on the semantic map C, the result of the deconvolution processing can be semantically corrected after each deconvolution.
- FIG. 2 is a flowchart of an image repair method according to an exemplary embodiment.
- the execution subject of this method can be implemented as any device, platform, server or device cluster with computing and processing capabilities.
- the method includes the following steps:
- In step 201, a first image is obtained, and a first area to be repaired in the first image is determined.
- the first image is an image obtained by processing the target object in the original image, and the first area is at least a partial area corresponding to the target object.
- In one scenario, the first image may be an image obtained by removing an occlusion from the original image (the target object being the occlusion), and the first area may be at least a partial area corresponding to the removed occlusion. For example, to change long hair in a portrait image into short hair, part of the hair ends in the image needs to be removed; the image obtained after this removal is the first image, and the removed hair-end area is the first area.
- In this scenario, the area to be repaired usually contains a variety of semantics, accounts for a large proportion of the image, and offers little known information to refer to; therefore, the effect achieved by repairing it with the solution provided in this embodiment is all the more significant.
- In another scenario, the first image may also be an image obtained by repairing and filling an area of the original image that is damaged or has missing information, and the first area may be at least part of that damaged or information-missing area (the target object being the damaged or missing part). For example, an old photo that is severely damaged in some areas can be scanned to obtain the original image; repairing the areas corresponding to the damaged parts yields a first image, in which the repaired area is the first area. It can be understood that this solution can also be applied in other scenarios, and this embodiment does not limit the specific application scenario.
- In step 202, a target semantic map corresponding to the first image is obtained, and in step 203, the first area is repaired based on the target semantic map to obtain a repaired second image.
- In this embodiment, semantic segmentation can be performed on the first image to obtain the target semantic map corresponding to the first image; based on the target semantic map, the features corresponding to the first region in the first image are repaired to obtain new features corresponding to the first region, and a repaired second image is then generated from those new features.
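- Taken together, steps 201-203 follow the outline sketched below. This is only an illustration of the control flow; `segment`, `repair_features`, and `generate_image` are hypothetical callables standing in for the semantic-segmentation, feature-repair, and image-generation stages described above, not functions defined by the disclosure.

```python
import numpy as np

def repair_image(first_image: np.ndarray, first_area_mask: np.ndarray,
                 segment, repair_features, generate_image) -> np.ndarray:
    """Outline of steps 201-203 with hypothetical helper callables."""
    # Step 202: obtain the target semantic map of the (already modified) first image.
    target_semantic_map = segment(first_image)
    # Step 203: repair the features of the first area under the guidance of the semantic map ...
    new_features = repair_features(first_image, first_area_mask, target_semantic_map)
    # ... and generate the repaired second image from the new features.
    return generate_image(new_features, target_semantic_map)
```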
- the semantic map used here is the semantic map corresponding to the modified first image, not the semantic map of the unmodified original image. This is because the semantic information of the area to be modified in the original image is missing, while the semantic information of the area to be repaired in the modified first image is richer.
- In one implementation, based on the target semantic map, the features corresponding to the second area in the first image (the area other than the first area) can be used to repair the features corresponding to the first area. For example, repair parameters are obtained from the target semantic map and the features corresponding to the second region, and those repair parameters are used to repair the features corresponding to the first region (for example, by adding or multiplying the repair parameters with the features corresponding to the first region, or performing another preset operation).
- In another implementation, a first feature map corresponding to the first image can be obtained; based on the target semantic map, the features corresponding to the second region in the first feature map are used to regenerate the features corresponding to the first region, yielding a second feature map, and the second image is obtained from the second feature map. For example, for a first region corresponding to one semantic, the features of the nearest second region with the same semantic within a preset surrounding range can be used to regenerate the features corresponding to that first region.
- Optionally, a first cell corresponding to the first region may also be determined, and, based on the target semantic map, at least one second cell having the same semantics as the first cell (the second cell corresponding to the second area) may be determined. The features corresponding to the first cell are then regenerated from the features of the second cell. Because this implementation further subdivides the first area to be repaired into first cells and regenerates each first cell's features from second cells with the same semantics, the repaired image is of higher quality and the semantic boundaries are clearer and more natural.
- The present disclosure thus provides an image repair method that repairs at least part of the modified area in an image to be repaired by using the semantic map corresponding to that image, thereby obtaining an image with a better display effect. Because the solution provided by this embodiment takes the semantic map of the image to be repaired into account when repairing the modified area, and that semantic map contains richer semantic information, the image can be repaired on the basis of richer semantic information. Residual traces of the original image in the repaired image are reduced, the boundaries between different semantic areas become clear, the texture is richer, and the image is more realistic.
- One application scenario may be: changing the long hair of the character in the original image 1 into short hair, that is, removing the tail part of the long hair in original image 1 to obtain image 2. Since the area exposed by removing the long hair in image 2 exhibits considerable loss of texture detail, image 2 needs to be further repaired.
- image 2 can be acquired as the first image, and the modified area f in image 2 is determined as the first area.
- the area f may be at least a partial area corresponding to the removed hair tail part.
- the area g other than area f in image 2 (for example, area g includes clothes, skin, background, etc. around the hair) can be used as the second area.
- Then, the semantic map corresponding to image 2 is obtained as the target semantic map, and areas f and g are semantically partitioned according to it, giving sub-regions f′ of area f and sub-regions g′ of area g that each correspond to one semantic. Each sub-region g′ is then used to repair the sub-region f′ with the same semantics: for example, the sub-region g1′ corresponding to the skin semantics is used to repair the sub-region f1′ corresponding to the skin semantics; the sub-region g2′ corresponding to the clothes semantics is used to repair the sub-region f2′ corresponding to the clothes semantics; the sub-region g3′ corresponding to the clothes semantics is used to repair the sub-region f3′ corresponding to the clothes semantics; and so on.
- the repaired image 3 can be obtained.
- Another application scenario may be: scanning a partially damaged old photo to obtain the original image 4, and filling in the missing areas in original image 4 to obtain image 5. Since the filled missing area in image 5 exhibits considerable loss of texture detail, image 5 needs to be further repaired.
- image 5 can be acquired as the first image, and at least part of the area w corresponding to the filled missing area in image 5 is determined as the first area.
- the area v other than area w in image 5 is regarded as the second area.
- the sub-region v' is used to repair the sub-region w' with the same semantics.
- the repaired image 6 can be obtained.
- Figure 3 is a flow chart of another image repair method according to an exemplary embodiment. This embodiment describes a process of repairing the first area, including the following steps:
- In step 301, the first feature map corresponding to the first image is obtained.
- the features of the first image can be extracted first to obtain the first feature map.
- the first image can be directly input into the downsampling module (for example, it can be composed of multiple convolutional layers) to obtain the first feature map output by the downsampling module.
- the first image may be masked using the first area, and the image after masking may be processed. Specifically, using the first area to perform mask processing on the first image may include assigning pixels in the first area in the first image to 0. Then, the masked image is input to the downsampling module.
- Optionally, the masked image can be convolved by multiple convolutional layers, and after the processing of each convolutional layer, the result of the convolution can be semantically corrected based on the target semantic map corresponding to the first image, thereby obtaining the first feature map.
- For example, the target semantic map can be used to perform a semantic correction after each convolutional layer; it is also possible to perform one semantic correction after the processing of multiple convolutional layers. It can be understood that this embodiment does not limit the specific number of semantic corrections.
- After the convolution processing of the multiple convolutional layers, the first feature map corresponding to the first image is obtained. Because this embodiment uses semantic information to correct the extracted features while extracting features from the first image, semantics guides the extraction and generation of subsequent features, making the boundaries of different semantic areas in the repaired image clearer and the texture richer.
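- As a sketch of such a downsampling module (PyTorch assumed), the encoder below masks the first image with the first area and applies a semantic correction after every convolution; it reuses the illustrative `SemanticCorrection` block from the earlier sketch, and the layer count and channel widths are assumptions rather than details fixed by the disclosure.

```python
import torch
import torch.nn as nn

class DownsampleEncoder(nn.Module):
    """Mask the first image with the first area, then downsample with per-layer semantic correction."""

    def __init__(self, in_channels: int = 3, sem_channels: int = 20, widths=(64, 128, 256)):
        super().__init__()
        self.convs, self.corrections = nn.ModuleList(), nn.ModuleList()
        prev = in_channels
        for w in widths:
            self.convs.append(nn.Conv2d(prev, w, kernel_size=3, stride=2, padding=1))
            self.corrections.append(SemanticCorrection(w, sem_channels))  # sketched earlier
            prev = w

    def forward(self, first_image, first_area_mask, target_semantic_map):
        # Mask processing: set the pixels of the first area to 0 (mask shape: N x 1 x H x W).
        x = first_image * (1.0 - first_area_mask)
        for conv, correct in zip(self.convs, self.corrections):
            x = torch.relu(conv(x))
            # Semantic correction after each convolution; the exact number of corrections is not limited.
            x = correct(x, target_semantic_map)
        return x  # the first feature map
```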
- In step 302, a plurality of first cells corresponding to the first region and a plurality of second cells corresponding to the second region are determined. In step 303, the first features corresponding to each first cell in the first feature map and the second features corresponding to each second cell in the first feature map are obtained.
- In this embodiment, each feature point in the first feature map corresponds to a pixel in the first image, and if downsampling has been performed, the number of feature points in the first feature map is smaller than the number of pixels in the first image. Therefore, for each feature point, a corresponding pixel can be found in the first image.
- a semantic label can be added to each pixel in the first image based on the target semantic map in advance, and a region mark (used to indicate whether the pixel belongs to the first region or the second region) is added to each pixel. Therefore, after the first feature map is obtained, each feature point in the first feature map also has the same semantic label and region label as its corresponding pixel point.
- the first feature map can be evenly divided into multiple cells.
- the cells can be squares, rectangles, etc., and each cell has the same size and includes the same number of feature points.
- each cell may include m ⁇ n feature points, etc.
- a plurality of first cells corresponding to the first region and a plurality of second cells corresponding to the second region may be determined according to the region marks corresponding to the feature points. For example, for a cell, if the cell includes feature points corresponding to the first area, the cell can be determined as a first cell. If the cell does not include feature points corresponding to the first region (that is, all included feature points correspond to the second region), the cell can be determined to be a second cell.
- the semantics corresponding to each cell can also be determined based on the semantic labels of the feature points included in each cell. For example, if the semantic labels of the feature points included in the cell are the same, the semantics indicated by the semantic labels are the semantics corresponding to the cell. If the semantic labels of the feature points included in the cell are different, the semantics indicated by the semantic labels with the largest number can be used as the semantics corresponding to the cell.
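- A sketch of this cell partition (NumPy assumed) is shown below. It assumes the per-feature-point semantic labels and region marks have already been propagated to the feature-map resolution and that the feature map divides evenly into m×n cells; the majority vote for a cell's semantics follows the description above.

```python
import numpy as np

def partition_cells(region_mark: np.ndarray, sem_label: np.ndarray, m: int, n: int):
    """Split an H x W grid of feature points into m x n cells.

    region_mark[i, j] is 1 if the feature point belongs to the first (to-be-repaired) area, else 0.
    sem_label[i, j] is a non-negative integer semantic label of the feature point.
    Returns two lists of (row, col, cell_semantics): first cells and second cells.
    """
    H, W = region_mark.shape
    first_cells, second_cells = [], []
    for r in range(0, H, m):
        for c in range(0, W, n):
            marks = region_mark[r:r + m, c:c + n]
            labels = sem_label[r:r + m, c:c + n]
            # The cell's semantics is the most frequent semantic label among its feature points.
            cell_sem = int(np.bincount(labels.ravel()).argmax())
            if marks.any():   # the cell contains at least one first-area feature point
                first_cells.append((r, c, cell_sem))
            else:             # every feature point of the cell belongs to the second area
                second_cells.append((r, c, cell_sem))
    return first_cells, second_cells
```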
- Next, the first features corresponding to each first cell in the first feature map (for example, the feature values of the feature points in the first cell) and the second features corresponding to each second cell in the first feature map can be obtained.
- In step 304, the features corresponding to each first cell are regenerated according to the first features corresponding to that first cell and the second features corresponding to each second cell, so as to obtain a second feature map.
- At least one second cell with the same semantics corresponding to each first cell can be determined based on the corresponding semantics of each first cell and each second cell.
- the feature corresponding to any first cell can be regenerated based on the second feature of the second cell corresponding to the first cell in the first feature map.
- For example, suppose the first feature map includes cells A1m, A2m, A3n, ..., B1m, B2m, B3n, B4n, B5m, B6n, ..., where A denotes a first cell, B denotes a second cell, and m and n denote two different semantics. The second cells with the same semantics as cell A1m are therefore B1m, B2m, and B5m, and cells B1m, B2m, and B5m can be used to regenerate the features corresponding to cell A1m.
- the second cell with the same semantics as cell A2m also includes B1m, B2m, and B5m. Cells B1m, B2m, and B5m can also be used to regenerate the features corresponding to cell A2m.
- the second cells with the same semantics as cell A3n include B3n, B4n, and B6n. Cells B3n, B4n, and B6n can be used to regenerate the features corresponding to cell A3n.
- Specifically, for any first cell, the features corresponding to that first cell can be regenerated in the following way: calculating the similarity between the first cell and each second cell that has the same semantics, determining the weight of each second cell's second feature according to the similarity, calculating a weighted sum of the second features based on those weights, and using the weighted sum to regenerate the feature corresponding to the first cell.
- the similarity between the first cell and the second cell can be calculated using an inner product. It can be understood that any method known in the art and that may appear in the future that can calculate image similarity can be applied. This embodiment does not limit the specific method of calculating image similarity.
- the similarities between cell A1m and cells B1m, B2m, and B5m with the same semantics are S1, S2, and S3 respectively.
- S1, S2, and S3 can be normalized to obtain weights w1, w2, and w3.
- The second features corresponding to cells B1m, B2m, and B5m in the first feature map are V1, V2, and V3, respectively. A weighted sum of the second features can be computed according to the weights to obtain a reconstructed feature V′, where V′ = w1·V1 + w2·V2 + w3·V3. Optionally, the first feature V″ corresponding to cell A1m in the first feature map can be obtained, and V′ and V″ can be stacked to obtain the regenerated feature V corresponding to cell A1m.
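- The A1m example can be written out directly, as in the NumPy sketch below: inner-product similarities are computed, normalized into weights (a softmax is used here for the normalization, which is an assumption since the disclosure only states that S1, S2, S3 are normalized), the weighted sum V′ is formed, and V′ is stacked with the cell's own first feature V″.

```python
import numpy as np

def regenerate_cell_feature(first_feat: np.ndarray, second_feats: list) -> np.ndarray:
    """Regenerate a first cell's feature from same-semantics second-cell features.

    first_feat: flattened feature of the first cell (V'').
    second_feats: list of flattened features of the same-semantics second cells (V1, V2, ...).
    """
    V = np.stack(second_feats)                      # shape (k, d)
    sims = V @ first_feat                           # inner-product similarities S1 ... Sk
    weights = np.exp(sims - sims.max())
    weights /= weights.sum()                        # normalized weights w1 ... wk
    reconstructed = (weights[:, None] * V).sum(axis=0)   # V' = w1*V1 + w2*V2 + ...
    # "Stacking" is realized here as concatenation of V' with the initial feature V''.
    return np.concatenate([reconstructed, first_feat])
```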
- A greater similarity between a first cell and a second cell with the same semantics indicates that their features are closer, so the weights determined from the similarities better reflect the association between the first and second cells. Because this embodiment regenerates the features of the first cell on the basis of the similarity between the first cell and the second cells, the image obtained from the regenerated features is more real and natural.
- In step 305, a second image is generated based on the target semantic map and the second feature map.
- the second feature map can be input into the upsampling module.
- the upsampling module can be composed of multiple deconvolution layers, and can perform deconvolution processing on the second feature map.
- the result of the deconvolution processing can be semantically corrected based on the target semantic map to obtain the second image.
- For example, the target semantic map can be used to perform a semantic correction after each deconvolution layer; it is also possible to perform one semantic correction after the processing of multiple deconvolution layers. It can be understood that this embodiment does not limit the specific number of semantic corrections. Because this embodiment uses semantic information to correct the results of upsampling during the upsampling process, semantics guides the generation of the subsequent image, making the boundaries of different semantic areas in the resulting image clearer and its texture richer.
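- A matching sketch of the upsampling module is shown below (PyTorch assumed). It mirrors the encoder sketch, applies a semantic correction after each deconvolution, and again reuses the illustrative `SemanticCorrection` block; the layer count, channel widths, and the final sigmoid are assumptions.

```python
import torch
import torch.nn as nn

class UpsampleDecoder(nn.Module):
    """Deconvolve the second feature map back to image resolution with per-layer semantic correction."""

    def __init__(self, in_channels: int = 256, sem_channels: int = 20, widths=(128, 64, 3)):
        super().__init__()
        self.deconvs, self.corrections = nn.ModuleList(), nn.ModuleList()
        prev = in_channels
        for w in widths:
            self.deconvs.append(nn.ConvTranspose2d(prev, w, kernel_size=4, stride=2, padding=1))
            self.corrections.append(SemanticCorrection(w, sem_channels))  # sketched earlier
            prev = w

    def forward(self, second_feature_map, target_semantic_map):
        x = second_feature_map
        for deconv, correct in zip(self.deconvs, self.corrections):
            x = torch.relu(deconv(x))
            # Semantic correction after each deconvolution guides the generation of the second image.
            x = correct(x, target_semantic_map)
        return torch.sigmoid(x)  # the repaired second image, values in [0, 1]
```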
- When repairing the image, this embodiment considers the correlation between the known area (i.e., the second area) and the unknown area (i.e., the first area) in the image and determines that correlation through semantics. Under the guidance of rich semantics, the features of known areas are used to regenerate the features of unknown areas with the same semantics, thereby obtaining the repaired image and further improving its quality.
- the present disclosure also provides an image repair device embodiment.
- FIG. 4 is a block diagram of an image repair device according to an exemplary embodiment of the present disclosure.
- the device may include: a determination module 401 , a first acquisition module 402 and a repair module 403 .
- the determination module 401 is used to obtain the first image and determine the first area to be repaired in the first image.
- The first image is an image obtained by processing the target object in the original image, and the first area is at least a partial area of the target object.
- the first acquisition module 402 is used to acquire the target semantic map corresponding to the first image.
- the repair module 403 is used to repair the first area based on the target semantic map to obtain a repaired second image.
- the above-described processing includes removing the target object.
- the repair module 403 may include: a first acquisition sub-module, a repair sub-module and a second acquisition sub-module (not shown in the figure).
- the first acquisition sub-module is used to acquire the first feature map corresponding to the first image.
- The repair sub-module is used to regenerate the features of the first region based on the target semantic map, using the features corresponding to the second region in the first feature map, so as to obtain the second feature map; the second region is the area in the first image other than the first area.
- the second acquisition submodule is used to acquire the second image based on the second feature map.
- In some embodiments, the first acquisition sub-module may acquire the first feature map corresponding to the first image in the following manner: masking the first image using the first area, performing down-sampling processing on the masked image, and semantically correcting the result of the down-sampling processing based on the target semantic map to obtain the first feature map.
- the repair sub-module may include: a determining sub-module and a generating sub-module (not shown in the figure).
- The determination sub-module is used to determine the first cell corresponding to the first area and, based on the target semantic map, to determine at least one second cell with the same semantics as the first cell, where the second cell corresponds to the second area.
- the generation sub-module is used to regenerate the features of the first cell based on the features corresponding to the second cell in the first feature map.
- In some embodiments, the generation sub-module is configured to: obtain the first feature corresponding to the first cell in the first feature map and the second features corresponding to each second cell in the first feature map, and regenerate the features corresponding to the first cell according to the first feature and the second features.
- the second acquisition sub-module is configured to generate the second image based on the target semantic map and the second feature map.
- In some embodiments, the second acquisition sub-module generates the second image based on the target semantic map and the second feature map in the following manner: performing up-sampling processing on the second feature map, and semantically correcting the result of the up-sampling processing based on the target semantic map to obtain the second image.
- In some embodiments, the generation sub-module regenerates the features corresponding to the first cell according to the first feature and the second features in the following manner: calculating the similarity between the first feature and each second feature, and regenerating the features corresponding to the first cell based on the similarities.
- In some embodiments, the generation sub-module regenerates the features corresponding to the first cell based on the similarities in the following manner: determining the weight of each second feature from the similarities, calculating the weighted sum of the second features, and stacking the weighted sum with the first feature to obtain the features corresponding to the first cell.
- As for the device embodiment, since it basically corresponds to the method embodiment, reference may be made to the description of the method embodiment for relevant details.
- the device embodiments described above are only illustrative.
- the units described as separate components may or may not be physically separated.
- The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the embodiments of the present disclosure. Persons of ordinary skill in the art can understand and implement them without creative effort.
- FIG. 5 is a schematic block diagram of an electronic device provided by some embodiments of the present disclosure.
- the electronic device 910 includes a processor 911 and a memory 912, which can be used to implement a client or a server.
- Memory 912 is used to non-transitoryly store computer-executable instructions (eg, one or more computer program modules).
- The processor 911 is configured to run the computer-executable instructions; when the computer-executable instructions are run by the processor 911, they can perform one or more steps of the image repair method described above, thereby implementing that method.
- Memory 912 and processor 911 may be interconnected by a bus system and/or other forms of connection mechanisms (not shown).
- the processor 911 may be a central processing unit (CPU), a graphics processing unit (GPU), or other forms of processing units with data processing capabilities and/or program execution capabilities.
- the central processing unit (CPU) may be of X86 or ARM architecture.
- the processor 911 may be a general-purpose processor or a special-purpose processor and may control other components in the electronic device 910 to perform desired functions.
- memory 912 may include any combination of one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory.
- Volatile memory may include, for example, random access memory (RAM) and/or cache memory (cache), etc.
- Non-volatile memory may include, for example, read-only memory (ROM), hard disk, erasable programmable read-only memory (EPROM), portable compact disk ROM (CD-ROM), USB memory, flash memory, etc.
- One or more computer program modules may be stored on a computer-readable storage medium, and the processor 911 may run one or more computer program modules to implement various functions of the electronic device 910 .
- Various application programs and various data, as well as various data used and/or generated by the application programs, etc. can also be stored in the computer-readable storage medium.
- FIG. 6 is a schematic block diagram of another electronic device provided by some embodiments of the present disclosure.
- the electronic device 920 is, for example, suitable for implementing the image repair method provided by the embodiment of the present disclosure.
- the electronic device 920 may be a terminal device or the like, and may be used to implement a client or a server.
- The electronic device 920 may include, but is not limited to, mobile terminals such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), a vehicle-mounted terminal (such as a vehicle-mounted navigation terminal), and wearable electronic devices, as well as fixed terminals such as digital TVs, desktop computers, and smart home devices. It should be noted that the electronic device 920 shown in Figure 6 is only an example and does not limit the functions and scope of use of the embodiments of the present disclosure.
- As shown in Figure 6, the electronic device 920 may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 921, which may perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 922 or a program loaded from a storage device 928 into a random access memory (RAM) 923.
- In the RAM 923, various programs and data required for the operation of the electronic device 920 are also stored.
- the processing device 921, ROM 922 and RAM 923 are connected to each other through a bus 924.
- An input/output (I/O) interface 925 is also connected to bus 924.
- In general, the following devices may be connected to the I/O interface 925: an input device 926 including, for example, a touch screen, a touch pad, a keyboard, a mouse, a camera, a microphone, an accelerometer, a gyroscope, etc.; an output device 927 including, for example, a liquid crystal display (LCD), a speaker, a vibrator, etc.; a storage device 928 including, for example, a magnetic tape, a hard disk, etc.; and a communication device 929.
- the communication device 929 may allow the electronic device 920 to communicate wirelessly or wiredly with other electronic devices to exchange data.
- FIG. 6 illustrates electronic device 920 having various means, it should be understood that implementation or provision of all illustrated means is not required and electronic device 920 may alternatively implement or be provided with more or fewer means.
- the above image repair method may be implemented as a computer software program.
- For example, embodiments of the present disclosure include a computer program product, which includes a computer program carried on a non-transitory computer-readable medium, the computer program containing program code for executing the image repair method described above.
- the computer program may be downloaded and installed from the network via communication device 929, or from storage device 928, or from ROM 922.
- When the computer program is executed by the processing device 921, the functions defined in the image repair method provided by the embodiments of the present disclosure can be implemented.
- Figure 7 is a schematic diagram of a storage medium provided by some embodiments of the present disclosure.
- the storage medium 930 may be a non-transitory computer-readable storage medium for storing non-transitory computer-executable instructions 931 .
- When the non-transitory computer-executable instructions 931 are executed by a processor, the image repair method described in the embodiments of the present disclosure can be implemented; for example, one or more steps of the image repair method described above may be executed.
- the storage medium 930 may be applied in the above-mentioned electronic device.
- the storage medium 930 may include a memory in the electronic device.
- For example, the storage medium may include a memory card of a smartphone, a storage component of a tablet computer, a hard drive of a personal computer, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), portable compact disc read-only memory (CD-ROM), flash memory, any combination of the above storage media, or other suitable storage media.
- For the description of the storage medium 930, reference may be made to the description of the memory in the electronic device embodiments, and repeated content will not be described again.
- the specific functions and technical effects of the storage medium 930 please refer to the above description of the image repair method, which will not be described again here.
- It should be noted that, in the context of the present disclosure, a computer-readable medium may be a tangible medium that may contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
- the computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium or any combination of the two.
- the computer-readable storage medium may be, for example, but is not limited to: an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, device or device, or any combination thereof.
- More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
- In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device.
- a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code therein.
- Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the above.
- a computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device .
- Program code embodied on a computer-readable medium may be transmitted using any suitable medium, including but not limited to: wire, optical cable, RF (radio frequency), etc., or any suitable combination of the above.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- Computational Linguistics (AREA)
- Image Processing (AREA)
- Processing Or Creating Images (AREA)
Abstract
Embodiments of the present disclosure disclose an image repair method and device, and an electronic device. A specific implementation of the method includes: acquiring a first image, the first image being an image obtained by processing a target object in an original image; determining a first area to be repaired in the first image, the first area being at least a partial area of the target object; acquiring a target semantic map corresponding to the first image; and repairing the first area based on the target semantic map to obtain a repaired second image. This implementation can repair the image to be repaired on the basis of semantic information, reducing residual traces of the original image in the repaired image, so that the boundaries between different semantic areas are clear, the texture is richer, and the image is more realistic.
Description
本申请要求于2022年9月6日递交的中国专利申请第202211098607.9号的优先权,在此全文引用上述中国专利申请公开的内容以作为本申请的一部分。
本公开的实施例涉及一种图像的修复方法、装置及电子设备。
人工智能技术在图像领域的应用越来越广泛,通常采用人工智能技术对损坏的原始图像进行修复,或者将原始图像中的遮蔽物进行去除,生成新的图像。目前来说,采用相关技术对原始图像进行处理得到的新的图像中,被处理过的区域会有原始图像的残留痕迹,效果较差。因此,需要一种能对图像中被修改过的区域进行修复的方案。
发明内容
本公开提供一种图像的修复方法、装置及电子设备。
根据第一方面,提供一种图像的修复方法,所述方法包括:
获取第一图像;所述第一图像为对原始图像中目标对象进行处理后得到的图像;
确定所述第一图像中待修复的第一区域;所述第一区域为所述目标对象的至少部分区域;
获取所述第一图像对应的目标语义图;
基于所述目标语义图,对所述第一区域进行修复,得到修复后的第二图像。
根据第二方面,提供一种图像的修复装置,所述装置包括:
第一获取模块,用于获取第一图像;所述第一图像为对原始图像中目标对象进行处理后得到的图像;
确定模块,用于确定所述第一图像中待修复的第一区域;所述第一区域为所述目标对象的至少部分区域;
第二获取模块,用于获取所述第一图像对应的目标语义图;
修复模块,用于基于所述目标语义图对所述第一区域进行修复,得到修复后的第二图像。
根据第三方面,提供一种计算机可读存储介质,所述存储介质存储有计算机程序,所述计算机程序被处理器执行时实现上述的方法。
根据第四方面,提供一种电子设备,包括存储器、处理器及存储在存储器上并可在处理器上运行的计算机程序,所述处理器执行所述程序时实现上述的方法。
应当理解的是,以上的一般描述和后文的细节描述仅是示例性和解释性的,并不能限制本公开。
为了更清楚地说明本公开实施例的技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本公开中记载的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动性的前提下,还可以根据这些附图获得其他的附图。
图1是本公开根据一示例性实施例示出的一种图像的修复场景示意图;
图2是本公开根据一示例性实施例示出的一种图像的修复方法的流程图;
图3是本公开根据一示例性实施例示出的另一种图像的修复方法的流程图;
图4是是本公开根据一示例性实施例示出的一种图像的修复装置框图;
图5是本公开一些实施例提供的一种电子设备的示意框图;
图6是本公开一些实施例提供的另一种电子设备的示意框图;以及
图7是本公开一些实施例提供的一种存储介质的示意图。
为了使本技术领域的人员更好地理解本公开中的技术方案,下面将结合本公开实施例中的附图,对本公开实施例中的技术方案进行清楚、完整地描
述,显然,所描述的实施例仅仅是本公开一部分实施例,而不是全部的实施例。基于本公开中的实施例,本领域普通技术人员在没有作出创造性劳动前提下所获得的所有其他实施例,都应当属于本公开保护的范围。
下面的描述涉及附图时,除非另有表示,不同附图中的相同数字表示相同或相似的要素。以下示例性实施例中所描述的实施方式并不代表与本公开相一致的所有实施方式。相反,它们仅是与如所附权利要求书中所详述的、本公开的一些方面相一致的装置和方法的例子。
在本公开使用的术语是仅仅出于描述特定实施例的目的,而非旨在限制本公开。在本公开中所使用的单数形式的“一种”、“所述”和“该”也旨在包括多数形式,除非上下文清楚地表示其他含义。还应当理解,本文中使用的术语“和/或”是指并包含一个或多个相关联的列出项目的任何或所有可能组合。
应当理解,尽管在本公开可能采用术语第一、第二、第三等来描述各种信息,但这些信息不应限于这些术语。这些术语仅用来将同一类型的信息彼此区分开。例如,在不脱离本公开范围的情况下,第一信息也可以被称为第二信息,类似地,第二信息也可以被称为第一信息。取决于语境,如在此所使用的词语“如果”可以被解释成为“在……时”或“当……时”或“响应于确定”。
人工智能技术在图像领域的应用越来越广泛,通常采用人工智能技术对损坏的原始图像进行修复,或者将原始图像中的遮蔽物进行去除,生成新的图像。例如,将人物图像中人物的长发变成短发,或者将风景图像中的树木或建筑去除等。目前来说,采用相关技术对原始图像进行处理得到的新的图像中,被处理过的区域会有原始图像的残留痕迹,效果较差。以将图像中人物的长发变成短发为例,在去除长发后露出的遮挡区域中,会存在发丝残留,遮挡住的衣服边界不清晰,颜色异常等问题。因此,需要一种能对图像中被修改过的区域进行修复的方案。
本公开提供的一种图像的修复方案,通过待修复图像所对应的语义图,对待修复图像中被修改过的至少部分区域进行修复,从而得到显示效果更好的图像。由于本实施例提供的方案在对待修复图像中被修改过的区域进行修复时,考虑了待修复图像的语义图,而待修复图像的语义图中包含更为丰富
的语义信息,因此,能够基于更为丰富的语义信息对待修复图像进行修复。减少了修复后得到的图像中的原始图像的残留痕迹,使不同语义区域的边界清晰,纹理更丰富,图像更真实。
参见图1,为根据一示例性实施例示出的一种图像的修复场景示意图。下面参考图1,结合一个完整具体的应用实例,对本公开的方案进行示意性说明。该应用实例描述了一个具体的修复图像的过程。
如图1所示,原始图像A为需要去除遮挡物或者存在缺失区域的图像,对原始图像A进行修改处理(例如去除遮挡物的处理,或者填补缺失区域的处理等)之后,可以得到图像B。由于图像B中经过修改处理的区域a存在纹理细节损失较多,边缘不清晰等问题,因此,还需要针对区域a对图像B进行进一步地修复。具体地,可以对图像B进行语义分割处理,得到图像B对应的语义图C,并获取区域a的信息。然后,根据区域a的信息对图像B进行掩膜操作,以将图像B中区域a的像素点赋值为0,得到图像D。将图像D和语义图C输入至预先训练好的图像修复网络中,由该图像修复网络进行针对区域a的修复。
需要说明的是,这里使用的语义图C是图像B对应的语义图,该语义图与原始图像A对应的语义图具有本质的区别。由于原始图像A中待修改区域的信息缺失严重,因此,原始图像A对应的语义图中缺少待修改区域的语义信息。
在图像修复网络中,可以先通过下采样模块对图像D进行处理,从而对图像D进行下采样,并提取图像D的图像特征。例如,下采样模块可以由多个卷积层构成,可以依次由各个卷积层对图像D进行卷积处理。同时,可以基于语义图C,在每次进行卷积处理之后,对卷积处理的结果进行语义修正。具体地,可以基于语义图C,通过两个不同的卷积层学习得到两个参数α和β(α和β均为向量),利用参数α和β对卷积处理得到的特征图进行语义修正。例如,可以采用SPADE空间自适应的方式,根据语义图C进行语义修正。经过多个卷积层的卷积处理之后,可以得到待修复的特征图,再利用图像修复模块对待修复的特征图进行处理。
具体来说,可以基于语义图C,按照语义将待修复的特征图中对应于区域a的未知区域划分成多个未知分区域,使得每个未知分区域仅对应于一种语义。并确定待修复的特征图中除未知区域之外的已知区域,已知区域也被
划分成多个已知分区域,每个已知分区域仅对应于一种语义。对于任一未知分区域,可以确定该未知分区域在待修复的特征图中所对应的初始特征,并利用与该未知分区域语义相同的已知分区域对该未知分区域的特征进行重构,得到重构特征(具体过程可参见图3实施例)。通过堆叠处理将初始特征与该重构特征进行特征融合,可以得到修复后的特征图。
再通过上采样对修复后的特征图进行处理,从而对修复后的特征图进行上采样,以将修复后的特征图转换成修复后的目标图像E。例如,上采样模块可以由多个反卷积层构成,可以依次由各个反卷积层对修复后的特征图进行反卷积处理。同样,也可以基于语义图C,在每次进行反卷积处理之后,对反卷积处理的结果进行语义修正。
需要说明的是,在训练上述图像修复网络的阶段,可以选取完整真实的图像作为样本图像,并获取样本图像对应的语义图。选取样本图像中的部分区域(例如语义信息丰富的区域)进行掩膜处理。将样本图像对应的语义图和经过掩膜处理的图像输入至待训练的图像修复网络中,获取图像修复网络输出的预测图像。基于预测图像和样本图像计算预测损失,并根据预测损失调整图像修复网络的网络参数,从而对图像修复网络进行训练。
下面将结合具体的实施例对本公开进行详细描述。
图2为根据一示例性实施例示出的一种图像的修复方法的流程图。该方法的执行主体可以实现为任何具有计算、处理能力的设备、平台、服务器或设备集群。该方法包括以下步骤:
如图2所示,在步骤201中,获取第一图像,并确定第一图像中待修复的第一区域。
在本实施例中,第一图像为对原始图像中目标对象进行处理后得到的图像,第一区域为目标对象对应的至少部分区域。其中,在一种场景下,第一图像可以是对原始图像中的遮挡物进行去除而得到的图像(目标对象为遮挡物),第一区域可以是去除掉的遮挡物对应的至少部分区域。例如,将人物图像中的长发变成短发,需要去除图像中的部分发尾。经过去除发尾处理之后得到的图像为第一图像,去除的发尾区域为第一区域。由于此场景下,图像中的待修复区域通常包含多种语义,并且,待修复区域在图像中的占比较大,可参考的已知信息较少,因此,通过本实施例提供的方案进行修复所能
达到的效果更为显著。
在另一种场景下,第一图像也可以是对原始图像中被损坏或者信息缺失的区域进行修复填补而得到的图像,第一区域可以是被损坏或者信息缺失的至少部分区域(目标对象为被损坏或信息缺失的部分)。例如,对部分区域严重损坏的老照片进行扫描,得到原始图像。将原始图像中对应于损坏部分的区域进行修复,可以得到第一图像,其中,被修复的区域为第一区域。可以理解,还可以在其它场景下应用本方案,本实施例对具体应用场景方面不限定。
在步骤202中,获取第一图像对应的目标语义图,以及在步骤203中,基于目标语义图,对第一区域进行修复,得到修复后的第二图像。
在本实施例中,可以对第一图像进行语义分割,得到第一图像对应的目标语义图,并基于目标语义图,对第一图像中第一区域对应的特征进行修复,得到修复后第一区域对应的新特征,再基于第一区域对应的新特征生成修复后的第二图像。
需要注意的是,这里使用的语义图是经过修改后的第一图像对应的语义图,而非未经修改的原始图像的语义图。因为,原始图像中待修改区域的语义信息缺失较多,而经过修改后的第一图像中待修复区域的语义信息更为丰富。
在一种实现方式中,可以基于目标语义图,利用第一图像中第二区域(第一图像中第一区域之外的区域)对应的特征,对第一图像中第一区域对应的特征进行修复。例如,根据目标语义图和第二区域对应的特征,得到修复参数,并利用修复参数对第一区域对应的特征进行修复(如将修复参数与第一区域对应的特征相加或相乘,或进行预设运算等)。
在另一种实现方式中,也可以获取第一图像对应的第一特征图,基于目标语义图,利用第二区域在第一特征图中对应的特征,重新生成第一区域对应的特征,得到第二特征图。并基于第二特征图,获取第二图像。例如,对于对应于一个语义的第一区域,可以采用其周围预设范围内最近且语义相同的第二区域在第一特征图中对应的特征,重新生成该第一区域对应的特征。
可选地,还可以确定对应于第一区域的第一单元格,并基于目标语义图,确定与该第一单元格语义相同的至少一个第二单元格(该第二单元格对应于
第二区域)。然后,根据第二单元格的特征重新生成该第一单元格对应的特征。由于本实现方式将待修复的第一区域进一步细分成第一单元格,利用和该第一单元格语义相同的第二单元格的特征,重新生成该第一单元格对应的特征,因此,能够使修复后的图像质量更高,语义的边界处更清晰自然。
本公开提供的一种图像的修复方法,通过待修复图像所对应的语义图,对待修复图像中被修改过的至少部分区域进行修复,从而得到显示效果更好的图像。由于本实施例提供的方案在对待修复图像中被修改过的区域进行修复时,考虑了待修复图像的语义图,而待修复图像的语义图中包含更为丰富的语义信息,因此,能够基于更为丰富的语义信息对待修复图像进行修复。减少了修复后得到的图像中的原始图像的残留痕迹,使不同语义区域的边界清晰,纹理更丰富,图像更真实。
需要说明的是,虽然一些示例中存在多种对图像进行修复的方法,但是,修复后得到的图像质量较差,修复后的图像中存在原始图像的残留痕迹,不同语义区域的边界模糊且不自然。本领域技术人员并未发现问题所在,是由于在修复时没有考虑修复后的图像的语义信息对修复效果的影响。导致图像修复效果不好的原因可能有多种,本领域技术人员在不付出劳动的前提下难以想到上述原因。本公开的技术方案考虑了修复后的图像的语义信息对修复效果的影响,因此,也通过问题的发现,解决了在上述技术问题。
下面结合两个完整的应用实例,对本公开的方案进行示意性说明。
一种应用场景可以为:将原始图像1中人物的长发变成短发,即将原始图像1中长发的发尾部分去除掉,得到图像2。而由于图像2中去掉长发遮挡的区域存在纹理损失细节较多,因此,需要对图像2进行进一步修复。
具体来说,首先,可以获取图像2作为第一图像,并确定图像2中被修改过的区域f作为第一区域。其中,区域f可以是去除掉的发尾部分所对应的至少部分区域。可以将图像2中区域f以外的区域g(例如区域g包括头发周围的衣服、皮肤和背景等)作为第二区域。然后,获取图像2对应的语义图C作为目标语义图。根据语义图C对区域f和区域g进行语义划分,确定区域f中对应于不同语义的多个子区域f′,以及区域g中对应于不同语义的多个子区域g′。
接着,利用子区域g′对语义相同的子区域f′进行修复。例如,利用对应于皮肤语义的子区域g1′对对应于皮肤语义的子区域f1′进行修复;利用
对应于衣服语义的子区域g2′对对应于衣服语义的子区域f2′进行修复;利用对应于衣服语义的子区域g3′对对应于衣服语义的子区域f3′进行修复等。最后,可以得到修复后的图像3。
另一种应用场景可以为:对部分损坏的老照片进行扫描得到原始图像4,对原始图像4中缺失区域进行填补得到图像5。而由于图像5中填补的缺失区域存在纹理损失细节较多,因此,需要对图像5进行进一步修复。
具体来说,首先,可以获取图像5作为第一图像,并确定图像5中填补的缺失区域所对应的至少部分区域w作为第一区域。将图像5中区域w以外的区域v作为第二区域。然后,获取图像5对应的语义图D作为目标语义图。根据语义图D对区域w和区域v进行语义划分,确定区域w中对应于不同语义的多个子区域w′,以及区域v中对应于不同语义的多个子区域v′。接着,利用子区域v′对语义相同的子区域w′进行修复。最后,可以得到修复后的图像6。
图3是根据一示例性实施例示出的另一种图像的修复方法的流程图,该实施例描述了对第一区域进行修复的过程,包括以下步骤:
如图3所示,在步骤301中,获取第一图像对应的第一特征图。
在本实施例中,可以先提取第一图像的特征,得到第一特征图。例如,可以直接将第一图像输入至下采样模块(例如可以由多层卷积层构成)中,得到下采样模块输出的第一特征图。又例如,还可以先利用第一区域对第一图像进行掩膜处理,对经过掩膜处理之后的图像进行处理。具体地,利用第一区域对第一图像进行掩膜处理,可以是将第一图像中第一区域的像素点赋值为0。然后,将经过掩膜处理之后的图像输入至下采样模块。可选地,可以由多层卷积层对掩膜处理之后的图像进行卷积处理,经过卷积层的处理之后,可以基于第一图像对应的目标语义图对卷积处理的结果进行语义修正,从而得到第一特征图。
例如,可以每经过一次卷积层的处理,利用目标语义图进行一次语义修正。也可以在经过多次卷积层的处理之后,利用目标语义图进行一次语义修正。可以理解,本实施例对进行语义修正的具体次数方面不限定。在经过多个卷积层的卷积处理之后,可以得到第一图像对应的第一特征图。由于本实施例在对第一图像提取特征的过程中,利用语义信息对提取的特征进行修正,
从而利用语义引导后续特征的提取和生成,使得修复后得到的图像中不同语义区域的边界更清晰,纹理更丰富。
在步骤302中,确定第一区域对应的多个第一单元格和第二区域对应的多个第二单元格。以及,在步骤303中,获取每个第一单元格在第一特征图中各自对应的各个第一特征以及每个第二单元格在第一特征图中各自对应的各个第二特征。
在本实施例中,第一特征图中的每个特征点均与第一图像中的像素点相对应,并且,若进行了下采样处理,第一特征图中的特征点的数目小于第一图像中的像素点的数目。因此,每个特征点在第一图像中均可以找到相对应的像素点。可以预先基于目标语义图对第一图像中的每个像素点添加语义标签,并对每个像素点添加区域标记(用于指示像素点是属于第一区域还是第二区域)。因此,得到第一特征图之后,第一特征图中的每个特征点也具有和其对应的像素点相同的语义标签和区域标记。
然后,可以将第一特征图均匀划分成多个单元格,单元格可以是正方形,或者长方形等,每个单元格尺寸相同,包括相同数量的特征点。例如,每个单元格可以包括m×n个特征点等。可以根据特征点对应的区域标记确定对应于第一区域的多个第一单元格以及对应于第二区域的多个第二单元格。例如,对于一个单元格来说,如果该单元格中包括对应于第一区域的特征点,则可以将该单元格确定为一个第一单元格。如果该单元格中不包括对应于第一区域的特征点(即包括的特征点全部对应于第二区域),则可以将该单元格确定为一个第二单元格。
另外,还可以根据每个单元格中包括的特征点的语义标签,确定每个单元格对应的语义。例如,如果该单元格中包括的特征点的语义标签都相同,则该语义标签指示的语义即为该单元格对应的语义。如果该单元格中包括的特征点的语义标签不同,则可以将数量最多是语义标签所指示的语义作为该单元格对应的语义。
接着,可以获取每个第一单元格在第一特征图中各自对应的各个第一特征(例如可以是第一单元格中特征点的特征值)和每个第二单元格在第一特征图中各自对应的各个第二特征。
在步骤304中,根据每个第一单元格各自对应的各个第一特征和每个第
二单元格各自对应的各个第二特征,重新生成每个第一单元格对应的特征,得到第二特征图。
具体来说,可以根据每个第一单元格和每个第二单元格各自对应的语义,确定每个第一单元格对应的至少一个语义相同的第二单元格。可以根据任一第一单元格对应的第二单元格在第一特征图中的第二特征,重新生成该第一单元格对应的特征。
例如,第一特征图包括单元格A1m,A2m,A3n……,B1m,B2m,B3n,B4n,B5m,B6n……,其中,A表示第一单元格,B表示第二单元格,m和n分别表示两种不同的语义。因此,与单元格A1m语义相同的第二单元格包括B1m,B2m,B5m,可以利用单元格B1m,B2m,B5m重新生成单元格A1m对应的特征。与单元格A2m语义相同的第二单元格也包括B1m,B2m,B5m,同样可以利用单元格B1m,B2m,B5m重新生成单元格A2m对应的特征。与单元格A3n语义相同的第二单元格包括B3n,B4n,B6n,可以利用单元格B3n,B4n,B6n重新生成单元格A3n对应的特征。
具体来说,针对任一第一单元格,可以通过如下方式重新生成该第一单元格对应的特征:计算该第一单元格和每个与其语义相同的第二单元格的相似度,根据该相似度确定每个第二单元格对应的第二特征的权重,基于该权重计算第二特征的加权和,利用该加权和重新生成该第一单元格对应的特征。可选地,可以采用内积的方式计算第一单元格和第二单元格的相似度,可以理解,本领域中已知的以及将来可能出现的任何能够计算图像相似度的方法都可以应用于本实施例,本实施例对计算图像相似度的具体方式方面不限定。
例如,单元格A1m与语义相同的单元格B1m,B2m,B5m的相似度分别为S1,S2,S3,可以将S1,S2,S3进行归一化,得到权重w1,w2,w3。单元格B1m,B2m,B5m在第一特征图中对应的第二特征分别为V1,V2,V3。可以根据权重计算第二特征的加权和,得到重构特征V’,V’=w1V1+w2V2+w3V3。可选地,可以获取单元格A1m在第一特征图中对应的第一特征V”,对V’和V”进行堆叠处理,得到重新生成的单元格A1m对应的特征V。
由于语义相同的第一单元格和第二单元格之间相似度越大,说明相应的特征越接近,因此根据相似度确定的权重更能体现第一单元格和第二单元格之间的关联关系。本实施例基于第一单元格和第二单元格之间相似度,重新
生成第一单元格对应的特征,使得基于重新生成的特征而得到的图像更真实和自然。
在步骤305中,基于目标语义图和第二特征图,生成第二图像。
在本实施例中,可以将第二特征图输入至上采样模块中,例如上采样模块可以由多层反卷积层构成,可以对第二特征图进行反卷积处理,可选地,经过反卷积层的处理之后,可以基于目标语义图对反卷积处理的结果进行语义修正,得到第二图像。例如,可以每经过一次反卷积层的处理,利用目标语义图进行一次语义修正。也可以在经过多次反卷积层的处理之后,利用目标语义图进行一次语义修正。可以理解,本实施例对进行语义修正的具体次数方面不限定。由于本实施例在进行上采样的过程中,利用语义信息对上采样处理的结果进行修正,从而利用语义引导后续图像的生成,使得到的图像中不同语义区域的边界更清晰,纹理更丰富。
本实施例在对图像进行修复时,考虑了图像中已知区域(即第二区域)和未知区域(即第一区域)的关联性,通过语义确定已知区域和未知区域的关联关系,在丰富的语义的指导下,利用已知区域的特征重新生成语义相同的未知区域的特征,从而得到修复后的图像,进一步提高了修复后得到的图像的质量。
应当注意,尽管在上述实施例中,以特定顺序描述了本公开实施例的方法的操作,但是,这并非要求或者暗示必须按照该特定顺序来执行这些操作,或是必须执行全部所示的操作才能实现期望的结果。相反,流程图中描绘的步骤可以改变执行顺序。附加地或备选地,可以省略某些步骤,将多个步骤合并为一个步骤执行,和/或将一个步骤分解为多个步骤执行。
与前述图像的修复方法实施例相对应,本公开还提供了图像的修复装置的实施例。
如图4所示,图4是本公开根据一示例性实施例示出的一种图像的修复装置的框图,该装置可以包括:确定模块401,第一获取模块402和修复模块403。
其中,确定模块401,用于获取第一图像,并确定第一图像中待修复的第一区域,第一图像为对原始图像中目标对象进行处理后得到的图像,第一区域为目标对象的至少部分区域。
第一获取模块402,用于获取第一图像对应的目标语义图。
修复模块403,用于基于目标语义图,对第一区域进行修复,得到修复后的第二图像。
在一些实施方式中,上述处理包括去除掉目标对象的操作。
在另一些实施方式中,修复模块403可以包括:第一获取子模块,修复子模块和第二获取子模块(图中未示出)。
其中,第一获取子模块,用于获取第一图像对应的第一特征图。
修复子模块,用于基于上述目标语义图,利用第二区域在第一特征图中对应的特征,重新生成第一区域的特征,得到第二特征图,第二区域为第一图像中第一区域之外的区域。
第二获取子模块,用于基于第二特征图,获取第二图像。
在另一些实施方式中,第一获取子模块可以通过如下方式获取第一图像对应的第一特征图:利用第一区域对第一图像进行掩膜处理,对经过掩膜处理之后的图像进行下采样处理,并基于目标语义图对下采样处理得到的结果进行语义修正,得到第一特征图。
在另一些实施方式中,修复子模块可以包括:确定子模块和生成子模块(图中未示出)。
其中,确定子模块,用于确定对应于第一区域的第一单元格,并基于目标语义图,确定与第一单元格语义相同的至少一个第二单元格,第二单元格对应于第二区域。
生成子模块,用于根据第二单元格在第一特征图中对应的特征,重新生成第一单元格的特征。
在另一些实施方式中,生成子模块被配置用于:获取第一单元格在第一特征图中对应的第一特征以及每个第二单元格在第一特征图中各自对应的各个第二特征,根据上述第一特征和上述第二特征重新生成第一单元格对应的特征。
在另一些实施方式中,第二获取子模块被配置用于:基于目标语义图和第二特征图,生成第二图像。
在另一些实施方式中,第二获取子模块通过如下方式基于目标语义图和第二特征图,生成第二图像:对第二特征图进行上采样处理,并基于目标语义图对上采样处理得到的结果进行语义修正,得到第二图像。
在另一些实施方式中,生成子模块通过如下方式根据上述第一特征和上述第二特征重新生成第一单元格对应的特征:计算上述第一特征和各个第二特征的相似度,基于该相似度,重新生成第一单元格对应的特征。
在另一些实施方式中,生成子模块通过如下方式基于该相似度,重新生成第一单元格对应的特征:基于该相似度确定各个第二特征各自对应的权重,并计算第二特征的加权和,对该加权和以及第一特征进行堆叠处理,得到第一单元格对应的特征。
对于装置实施例而言,由于其基本对应于方法实施例,所以相关之处参见方法实施例的部分说明即可。以上所描述的装置实施例仅仅是示意性的,其中所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部模块来实现本公开实施例方案的目的。本领域普通技术人员在不付出创造性劳动的情况下,即可以理解并实施。
图5为本公开一些实施例提供的一种电子设备的示意框图。如图5所示,该电子设备910包括处理器911和存储器912,可以用于实现客户端或服务器。存储器912用于非瞬时性地存储有计算机可执行指令(例如一个或多个计算机程序模块)。处理器911用于运行该计算机可执行指令,该计算机可执行指令被处理器911运行时可以执行上文所述的图像的修复方法中的一个或多个步骤,进而实现上文所述的图像的修复方法。存储器912和处理器911可以通过总线系统和/或其它形式的连接机构(未示出)互连。
例如,处理器911可以是中央处理单元(CPU)、图形处理单元(GPU)或者具有数据处理能力和/或程序执行能力的其它形式的处理单元。例如,中央处理单元(CPU)可以为X86或ARM架构等。处理器911可以为通用处理器或专用处理器,可以控制电子设备910中的其它组件以执行期望的功能。
例如,存储器912可以包括一个或多个计算机程序产品的任意组合,计算机程序产品可以包括各种形式的计算机可读存储介质,例如易失性存储器和/或非易失性存储器。易失性存储器例如可以包括随机存取存储器(RAM)和/或高速缓冲存储器(cache)等。非易失性存储器例如可以包括只读存储器(ROM)、硬盘、可擦除可编程只读存储器(EPROM)、便携式紧致盘只读存储
器(CD-ROM)、USB存储器、闪存等。在计算机可读存储介质上可以存储一个或多个计算机程序模块,处理器911可以运行一个或多个计算机程序模块,以实现电子设备910的各种功能。在计算机可读存储介质中还可以存储各种应用程序和各种数据以及应用程序使用和/或产生的各种数据等。
需要说明的是,本公开的实施例中,电子设备910的具体功能和技术效果可以参考上文中关于图像的修复方法的描述,此处不再赘述。
图6为本公开一些实施例提供的另一种电子设备的示意框图。该电子设备920例如适于用来实施本公开实施例提供的图像的修复方法。电子设备920可以是终端设备等,可以用于实现客户端或服务器。电子设备920可以包括但不限于诸如移动电话、笔记本电脑、数字广播接收器、PDA(个人数字助理)、PAD(平板电脑)、PMP(便携式多媒体播放器)、车载终端(例如车载导航终端)、可穿戴电子设备等等的移动终端以及诸如数字TV、台式计算机、智能家居设备等等的固定终端。需要注意的是,图6示出的电子设备920仅仅是一个示例,其不会对本公开实施例的功能和使用范围带来任何限制。
如图6所示,电子设备920可以包括处理装置(例如中央处理器、图形处理器等)921,其可以根据存储在只读存储器(ROM)922中的程序或者从存储装置928加载到随机访问存储器(RAM)923中的程序而执行各种适当的动作和处理。在RAM 923中,还存储有电子设备920操作所需的各种程序和数据。处理装置921、ROM 922以及RAM 923通过总线924彼此相连。输入/输出(I/O)接口925也连接至总线924。
通常,以下装置可以连接至I/O接口925:包括例如触摸屏、触摸板、键盘、鼠标、摄像头、麦克风、加速度计、陀螺仪等的输入装置926;包括例如液晶显示器(LCD)、扬声器、振动器等的输出装置927;包括例如磁带、硬盘等的存储装置928;以及通信装置929。通信装置929可以允许电子设备920与其他电子设备进行无线或有线通信以交换数据。虽然图6示出了具有各种装置的电子设备920,但应理解的是,并不要求实施或具备所有示出的装置,电子设备920可以替代地实施或具备更多或更少的装置。
例如,根据本公开的实施例,上述图像的修复方法可以被实现为计算机软件程序。例如,本公开的实施例包括一种计算机程序产品,其包括承载在非暂态计算机可读介质上的计算机程序,该计算机程序包括用于执行上述图
像的修复方法的程序代码。在这样的实施例中,该计算机程序可以通过通信装置929从网络上被下载和安装,或者从存储装置928安装,或者从ROM922安装。在该计算机程序被处理装置921执行时,可以实现本公开实施例提供的图像的修复方法中限定的功能。
图7为本公开一些实施例提供的一种存储介质的示意图。例如,如图7所示,存储介质930可以为非暂时性计算机可读存储介质,用于存储非暂时性计算机可执行指令931。当非暂时性计算机可执行指令931由处理器执行时可以实现本公开实施例所述的图像的修复方法,例如,当非暂时性计算机可执行指令931由处理器执行时,可以执行根据上文所述的图像的修复方法中的一个或多个步骤。
例如,该存储介质930可以应用于上述电子设备中,例如,该存储介质930可以包括电子设备中的存储器。
例如,存储介质可以包括智能电话的存储卡、平板电脑的存储部件、个人计算机的硬盘、随机存取存储器(RAM)、只读存储器(ROM)、可擦除可编程只读存储器(EPROM)、便携式紧致盘只读存储器(CD-ROM)、闪存、或者上述存储介质的任意组合,也可以为其他适用的存储介质。
例如,关于存储介质930的说明可以参考电子设备的实施例中对于存储器的描述,重复之处不再赘述。存储介质930的具体功能和技术效果可以参考上文中关于图像的修复方法的描述,此处不再赘述。
需要说明的是,在本公开的上下文中,计算机可读介质可以是有形的介质,其可以包含或存储以供指令执行系统、装置或设备使用或与指令执行系统、装置或设备结合地使用的程序。计算机可读介质可以是计算机可读信号介质或者计算机可读存储介质或者是上述两者的任意组合。计算机可读存储介质例如可以是,但不限于:电、磁、光、电磁、红外线、或半导体的系统、装置或器件,或者任意以上的组合。计算机可读存储介质的更具体的例子可以包括但不限于:具有一个或多个导线的电连接、便携式计算机磁盘、硬盘、随机访问存储器(RAM)、只读存储器(ROM)、可擦式可编程只读存储器(EPROM或闪存)、光纤、便携式紧凑磁盘只读存储器(CD-ROM)、光存储器件、磁存储器件、或者上述的任意合适的组合。在本公开中,计算机可读存储介质可以是任何包含或存储程序的有形介质,该程序可以被指令执行系统、
装置或者器件使用或者与其结合使用。而在本公开中,计算机可读信号介质可以包括在基带中或者作为载波一部分传播的数据信号,其中承载了计算机可读的程序代码。这种传播的数据信号可以采用多种形式,包括但不限于电磁信号、光信号或上述的任意合适的组合。计算机可读信号介质还可以是计算机可读存储介质以外的任何计算机可读介质,该计算机可读信号介质可以发送、传播或者传输用于由指令执行系统、装置或者器件使用或者与其结合使用的程序。计算机可读介质上包含的程序代码可以用任何适当的介质传输,包括但不限于:电线、光缆、RF(射频)等等,或者上述的任意合适的组合。
本领域技术人员在考虑本公开后,将容易想到本公开的其它实施方案。本公开旨在涵盖本公开的任何变型、用途或者适应性变化,这些变型、用途或者适应性变化遵循本公开的一般性原理并包括本公开未公开的本技术领域中的公知常识或惯用技术手段。本公开的实施例仅被视为示例性的,本公开的真正范围和精神由权利要求指出。
应当理解的是,本公开并不局限于上面已经描述并在附图中示出的精确结构,并且可以在不脱离其范围进行各种修改和改变。本公开的范围仅由所附的权利要求来限制。
Claims (14)
- An image repair method, comprising: acquiring a first image, wherein the first image is an image obtained by processing a target object in an original image; determining a first area to be repaired in the first image, wherein the first area is at least a partial area of the target object; acquiring a target semantic map corresponding to the first image; and repairing the first area based on the target semantic map to obtain a repaired second image.
- The method according to claim 1, wherein the processing comprises an operation of removing the target object.
- The method according to claim 1 or 2, wherein repairing the first area based on the target semantic map to obtain the repaired second image comprises: acquiring a first feature map corresponding to the first image; based on the target semantic map, regenerating features of the first area by using features corresponding to a second area in the first feature map to obtain a second feature map, wherein the second area is an area in the first image other than the first area; and acquiring the second image based on the second feature map.
- The method according to claim 3, wherein acquiring the first feature map corresponding to the first image comprises: performing mask processing on the first image by using the first area; and performing down-sampling processing on the masked image, and performing semantic correction on a result of the down-sampling processing based on the target semantic map to obtain the first feature map.
- The method according to claim 3 or 4, wherein regenerating the features of the first area by using the features corresponding to the second area in the first feature map based on the target semantic map comprises: determining a first cell corresponding to the first area, and determining, based on the target semantic map, at least one second cell having the same semantics as the first cell, wherein the second cell corresponds to the second area; and regenerating features of the first cell according to features corresponding to the second cell in the first feature map.
- The method according to claim 5, wherein regenerating the features of the first cell according to the features corresponding to the second cell in the first feature map comprises: acquiring a first feature corresponding to the first cell in the first feature map and respective second features corresponding to each of the second cells in the first feature map; and regenerating the features of the first cell according to the first feature and the second features.
- The method according to any one of claims 3 to 6, wherein acquiring the second image based on the second feature map comprises: generating the second image based on the target semantic map and the second feature map.
- The method according to claim 7, wherein generating the second image based on the target semantic map and the second feature map comprises: performing up-sampling processing on the second feature map, and performing semantic correction on a result of the up-sampling processing based on the target semantic map to obtain the second image.
- The method according to claim 6, wherein regenerating the features of the first cell according to the first feature and the second features comprises: calculating similarities between the first feature and each of the second features; and regenerating the features of the first cell based on the similarities.
- The method according to claim 9, wherein regenerating the features of the first cell based on the similarities comprises: determining a weight corresponding to each of the second features based on the similarities, and calculating a weighted sum of the second features; and regenerating the features of the first cell according to the weighted sum.
- The method according to claim 10, wherein regenerating the features of the first cell according to the weighted sum comprises: performing stacking processing on the weighted sum and the first feature to obtain the features of the first cell.
- An image repair device, comprising: a first acquisition module configured to acquire a first image, wherein the first image is an image obtained by processing a target object in an original image; a determination module configured to determine a first area to be repaired in the first image, wherein the first area is at least a partial area of the target object; a second acquisition module configured to acquire a target semantic map corresponding to the first image; and a repair module configured to repair the first area based on the target semantic map to obtain a repaired second image.
- A computer-readable storage medium having a computer program stored thereon, wherein, when the computer program is executed in a computer, the computer is caused to perform the method according to any one of claims 1 to 11.
- An electronic device, comprising a memory and a processor, wherein the memory stores executable code, and when the processor executes the executable code, the method according to any one of claims 1 to 11 is implemented.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211098607.9 | 2022-09-06 | ||
CN202211098607.9A CN117726551A (zh) | 2022-09-06 | 2022-09-06 | 图像的修复方法、装置及电子设备 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024051690A1 true WO2024051690A1 (zh) | 2024-03-14 |
Family
ID=90192057
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2023/117018 WO2024051690A1 (zh) | 2022-09-06 | 2023-09-05 | 图像的修复方法、装置及电子设备 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN117726551A (zh) |
WO (1) | WO2024051690A1 (zh) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080238942A1 (en) * | 2007-03-29 | 2008-10-02 | Microsoft Corporation | Object-Based Image Inpainting |
- CN113034390A (zh) * | 2021-03-17 | 2021-06-25 | 复旦大学 | Image inpainting method and system based on wavelet prior attention |
- WO2021133001A1 (ko) * | 2019-12-26 | 2021-07-01 | 주식회사 픽스트리 | Semantic image inference method and apparatus |
- CN113888417A (zh) * | 2021-09-12 | 2022-01-04 | 天津工业大学 | Face image inpainting method based on guidance generated by semantic parsing |
- CN114331912A (zh) * | 2022-01-06 | 2022-04-12 | 北京字跳网络技术有限公司 | Image inpainting method and apparatus |
- 2022: 2022-09-06 CN CN202211098607.9A patent/CN117726551A/zh active Pending
- 2023: 2023-09-05 WO PCT/CN2023/117018 patent/WO2024051690A1/zh unknown
Also Published As
Publication number | Publication date |
---|---|
CN117726551A (zh) | 2024-03-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111368685B (zh) | 关键点的识别方法、装置、可读介质和电子设备 | |
US11436863B2 (en) | Method and apparatus for outputting data | |
CN110413812B (zh) | 神经网络模型的训练方法、装置、电子设备及存储介质 | |
CN110781923B (zh) | 特征提取方法及装置 | |
CN109829432B (zh) | 用于生成信息的方法和装置 | |
CN112132847A (zh) | 模型训练方法、图像分割方法、装置、电子设备和介质 | |
CN108875931B (zh) | 神经网络训练及图像处理方法、装置、系统 | |
US10891471B2 (en) | Method and system for pose estimation | |
CN115965791B (zh) | 图像生成方法、装置及电子设备 | |
CN114898177B (zh) | 缺陷图像生成方法、模型训练方法、设备、介质及产品 | |
CN108170751A (zh) | 用于处理图像的方法和装置 | |
CN112967196A (zh) | 图像修复方法及装置、电子设备和介质 | |
CN115937033A (zh) | 图像生成方法、装置及电子设备 | |
JP2023526899A (ja) | 画像修復モデルを生成するための方法、デバイス、媒体及びプログラム製品 | |
CN118042246A (zh) | 视频生成方法、装置、电子设备及可读存储介质 | |
CN117315758A (zh) | 面部表情的检测方法、装置、电子设备及存储介质 | |
WO2024051690A1 (zh) | 图像的修复方法、装置及电子设备 | |
CN115393868B (zh) | 文本检测方法、装置、电子设备和存储介质 | |
CN112487943B (zh) | 关键帧去重的方法、装置和电子设备 | |
CN116309158A (zh) | 网络模型的训练方法、三维重建方法、装置、设备和介质 | |
CN115797920A (zh) | 车牌识别方法、装置、电子设备及存储介质 | |
CN112288748B (zh) | 一种语义分割网络训练、图像语义分割方法及装置 | |
CN109741250B (zh) | 图像处理方法及装置、存储介质和电子设备 | |
CN116309274B (zh) | 图像中小目标检测方法、装置、计算机设备及存储介质 | |
CN118429229B (zh) | 图像修复方法、设备、存储介质及计算机程序产品 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 23862384 Country of ref document: EP Kind code of ref document: A1 |