WO2023039865A1 - Image processing method, video processing method, training method, device, program product and storage medium - Google Patents

Image processing method, video processing method, training method, device, program product and storage medium

Info

Publication number
WO2023039865A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
target area
area
processed
target
Prior art date
Application number
PCT/CN2021/119186
Other languages
English (en)
French (fr)
Inventor
张雪
刘鹏
Original Assignee
深圳市大疆创新科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市大疆创新科技有限公司
Priority to PCT/CN2021/119186
Publication of WO2023039865A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation

Definitions

  • The present application relates to the technical field of visual information processing, and in particular, to an image processing method, a video processing method, a training method, a device, a program product, and a storage medium.
  • With the development of intelligent devices, the processing of visual information (such as images and videos) has become an indispensable part of work and daily life; one of the more popular visual information processing methods is to replace certain areas in images or video frames.
  • one of the objectives of the present application is to provide an image processing method, a video processing method, a training method, a device, a program product and a storage medium.
  • The inventors found that in image replacement methods in the related art, when a material image is used to replace a region to be replaced in an image, the material image is simply filled rigidly into the region to be replaced according to its position, which makes the quality of the replaced image poor.
  • Based on this, an embodiment of the present application provides an image processing method, including: acquiring a first image containing a target area and a material image to be used to replace the target area; performing mask processing on the target area in the first image to obtain a mask image; determining, according to feature information of the mask image, combination parameters of the material image and a non-target area in the first image other than the target area; and replacing, according to the combination parameters, the target area in the first image with at least part of the material image, so as to generate a second image combining the non-target area and at least part of the material image.
  • In this embodiment, when the material image is used to replace the target area in the first image, the combined effect of the material image and the remaining un-replaced part of the image is fully considered: mask processing is performed on the target area to obtain a mask image, the combination parameters of the non-target area and the material image are determined according to the feature information of the mask image, and the combination parameters are used to assist the process of combining the non-target area with at least part of the material image, which is beneficial to improving the quality of the generated second image.
  • The inventors found that in video replacement methods in the related art, when a material image is used to replace regions to be replaced in multiple video frames, the same material image is simply filled rigidly into different video frames according to the positions of the regions to be replaced in those frames, so that the playback quality of the replaced video is poor.
  • Based on this, an embodiment of the present application provides a video processing method, including: acquiring, from a video frame sequence, at least two video frames to be processed that contain a target area; performing motion estimation according to the video frame sequence to obtain a transformation relationship between different video frames to be processed; acquiring a material image used to replace the target area of the video frames to be processed, and transforming the material image according to the transformation relationship; and replacing the target area in the video frames to be processed with the transformed material image, so that the material image adapts to the motion of the different video frames to be processed.
  • In this embodiment, the material image is transformed according to the transformation relationship between different video frames to be processed, so that the motion between those frames is applied to the material image; after transformation, the material image can match the different video frames to be processed, thereby improving the realism and quality of the replaced video frames.
  • the inventors found that the image segmented by the image segmentation model in the related art has relatively rough edges, which makes subsequent edge-based image processing results less accurate.
  • Based on this, an embodiment of the present application provides a training method for an image segmentation model, including: inputting a training image into the image segmentation model to obtain a segmentation prediction map; performing edge detection on a segmentation annotation map corresponding to the training image to obtain an edge map; performing mask processing on the segmentation annotation map and the segmentation prediction map respectively by using the edge map, to obtain a first mask image and a second mask image; and adjusting parameters of the image segmentation model at least according to the difference between the first mask image and the second mask image, to obtain a trained image segmentation model.
  • In this way, the edge-related difference between the first mask image and the second mask image is used to optimize the image segmentation model, which strengthens the model's learning of edges and makes the edges of images segmented by the trained model finer.
  • An embodiment of the present application further provides an electronic device, including: a memory for storing executable instructions; and one or more processors; wherein, when the one or more processors execute the executable instructions, they are individually or collectively configured to perform the method of any one of the first aspect, the second aspect, or the third aspect.
  • An embodiment of the present application further provides a computer-readable storage medium storing executable instructions which, when executed by a processor, implement the method described in any one of the first aspect, the second aspect, or the third aspect.
  • An embodiment of the present application further provides a computer program product, including a computer program which, when executed, implements the method described in any one of the first aspect, the second aspect, or the third aspect.
  • FIG. 1 is a schematic diagram of a first image provided by an embodiment of the present application.
  • FIG. 2 is a schematic diagram of a second image provided by an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of an image processing method provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of a mask image provided by an embodiment of the present application.
  • FIG. 5A is a schematic diagram of another mask image provided by an embodiment of the present application.
  • FIG. 5B is a schematic diagram of a filtered mask image provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of traversing a mask image provided by an embodiment of the present application.
  • FIG. 7 is a schematic flowchart of a video processing method provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of a boundary line determined by the lowest point of an edge provided by an embodiment of the present application.
  • FIG. 9 is a schematic flowchart of a training method for an image segmentation model provided by an embodiment of the present application.
  • FIG. 10A is a schematic diagram of a training image provided by an embodiment of the present application.
  • FIG. 10B is a schematic diagram of a segmentation annotation map provided by an embodiment of the present application.
  • FIG. 10C is a schematic diagram of an edge map provided by an embodiment of the present application.
  • FIG. 11 is a schematic diagram of an image processed by an image segmentation model in the related art.
  • FIG. 12 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • one of the more popular visual information processing methods is to use material images to replace certain areas in images or video frames to obtain target images or target videos that include at least part of the material images that meet user needs.
  • However, in the related art, the material image is simply filled rigidly into the region to be replaced according to the position of that region, so that the quality of the synthesized, replaced image is poor.
  • In view of this, an embodiment of the present application provides an image processing method, which includes: acquiring a first image containing a target area and a material image to be used to replace the target area; performing mask processing on the target area in the first image to obtain a mask image; determining, according to feature information of the mask image, combination parameters of the material image and the non-target area in the first image other than the target area; and replacing, according to the combination parameters, the target area in the first image with at least part of the material image, so as to generate a second image combining the non-target area and at least part of the material image.
  • In this way, the combination parameters of the non-target area and the material image can be determined according to the feature information of the mask image, so that the second image, whose combination is assisted by the combination parameters, has better display quality.
  • the image processing method provided in the embodiment of the present application may be executed by an electronic device.
  • the electronic device may include a program for executing the image processing method.
  • the electronic device includes at least a memory and a processor, the memory stores executable instructions of the image processing method, and the processor can be configured to execute the executable instructions.
  • The electronic devices include, but are not limited to, remote controls, smartphones/cell phones, tablet computers, personal digital assistants (PDAs), laptop computers, desktop computers, media content players, video game stations/systems, virtual reality systems, augmented reality systems, wearable devices (for example, watches, glasses, helmets or pendants, etc.), and other computing devices with image processing capabilities.
  • For example, the target area included in the first image shown in FIG. 1 includes a sky area that needs to be replaced this time, and the material image includes a sky image; in this replacement process, the sky image is used to replace the sky area in the first image.
  • The electronic device may perform mask processing on the sky area in the first image to obtain a mask image, where the mask image includes a blank portion indicating the sky area as well as the non-sky area; according to the feature information of the mask image, the combination parameters of the non-sky area and the sky image are determined, and based on the combination parameters the sky area is replaced with at least part of the sky image, so as to combine the at least part of the sky image with the non-sky area and generate the second image shown in FIG. 2.
  • In this way, the combined effect of the sky image and the non-sky area can be considered comprehensively in the process of replacement and synthesis, thereby helping to improve the quality of the generated second image.
  • FIG. 3 is a schematic flowchart of an image processing method provided by an embodiment of the present application. The method may be performed by an electronic device, and the method includes:
  • In step S101, a first image containing a target area and a material image to be used to replace the target area are acquired.
  • In step S102, mask processing is performed on the target area in the first image to acquire a mask image.
  • In step S103, combination parameters of the material image and a non-target area in the first image other than the target area are determined according to feature information of the mask image.
  • In step S104, the target area in the first image is replaced with at least part of the material image according to the combination parameters, so as to generate a second image combining the non-target area and at least part of the material image.
  • For example, the target area includes a sky area, and correspondingly the material image includes a sky image; or, the target area includes a sea surface area, and correspondingly the material image includes an ocean image.
  • the target area may also be a background area such as the ground or a building, or the target area may be an area where a specified object (such as a human body, an object, etc.) is located.
  • the embodiment of the present application does not impose any limitation on the sources of the first image and the material image, which can be selected according to actual application scenarios.
  • the first image can be collected by a user using an imaging device.
  • the material image may be synthesized by professional image processing software and the like.
  • In some embodiments, the electronic device may provide an interactive interface that includes an image adding control, and the user may use the image adding control to obtain, from the storage medium of the electronic device, the first image containing the target area; the interactive interface further includes a material selection control, and the user can select the material image desired to replace the target area from multiple candidate material images by operating the material selection control.
  • After acquiring the first image and the material image, the electronic device may perform mask processing on the target area in the first image to acquire a mask image, and then determine the combination parameters of the non-target area and the material image according to the feature information of the mask image.
  • The mask image includes a blank area indicating the target area, as well as the non-target area. For example, referring to FIG. 1 and FIG. 4, the target area is a sky area; after the electronic device performs mask processing on the first image shown in FIG. 1, a mask image as shown in FIG. 4 can be obtained, in which the area corresponding to the sky area is a blank area.
  • In some embodiments, the mask image can be obtained by classifying each pixel in the first image using a preset image segmentation model and then processing the pixels belonging to the target area; for example, the pixel values of all pixels belonging to the target area are set to white (255, 255, 255).
  • the electronic device may input the first image into the preset image segmentation model, so as to obtain the mask image through the image segmentation model.
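  • As an illustration of this step, the following is a minimal sketch (in Python with NumPy, an assumed toolchain) of turning a segmentation model's per-pixel class map into a mask image whose target-area pixels are set to white; TARGET_CLASS is a hypothetical class id, not a value specified by the patent.

```python
import numpy as np

# A minimal sketch, assuming `class_map` is the per-pixel class output of a
# segmentation model; TARGET_CLASS is a hypothetical id for the target area.
TARGET_CLASS = 1

def make_mask(class_map: np.ndarray) -> np.ndarray:
    """Build a 3-channel mask: target-area pixels white (255, 255, 255)."""
    mask = np.zeros((*class_map.shape, 3), dtype=np.uint8)
    mask[class_map == TARGET_CLASS] = (255, 255, 255)  # blank (white) area
    return mask
```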
  • the image segmentation model includes but is not limited to a semantic segmentation model, an instance segmentation model or a panoramic segmentation model, etc.;
  • the semantic segmentation model classifies all pixels on the image;
  • the instance segmentation model is a combination of target detection and semantic segmentation, which can distinguish different individuals of the same type of objects in the image;
  • the panoramic segmentation model is a combination of semantic segmentation and instance segmentation, which can detect and segment all objects in the image including the background.
  • In some cases, in the mask image directly output by the image segmentation model, the edge accuracy between the blank area indicating the target area and the non-target area is low. For example, when the image segmentation model classifies the pixels in the first image containing the sky area, it often misclassifies non-sky pixels as sky at some confusing edges (such as near the edges of buildings and green plants), and sky pixels are likewise easily misclassified as non-sky. Therefore, in order to reduce the degree of edge misclassification in the mask image output by the image segmentation model, after acquiring the mask image, the electronic device may filter the mask image in a way that effectively preserves the edge information in the image.
  • the mask image used to determine the above combining parameters may be an image filtered by an edge-preserving filter, so as to effectively preserve edge details in the mask image.
  • For example, referring to FIG. 5A and FIG. 5B, the electronic device uses an edge-preserving filter to filter the mask image shown in FIG. 5A, and the filtered mask image shown in FIG. 5B can be obtained; comparing the two, it can be seen that the edges of the filtered mask image are finer.
  • the embodiment of the present application does not impose any limitation on the specific type of the edge-preserving filter.
  • For example, the edge-preserving filter includes but is not limited to a bilateral filter, a guided filter, a weighted least squares filter, or a non-local means filter, etc.
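  • The following is a minimal sketch of this filtering step using OpenCV's bilateral filter, one of the filters named above; the parameter values are illustrative assumptions, not values specified by the patent.

```python
import cv2

# A minimal sketch of the edge-preserving filtering step.
def smooth_mask_keep_edges(mask_bgr):
    # d: pixel-neighbourhood diameter; sigmaColor / sigmaSpace control how
    # strongly colour difference and spatial distance damp the averaging,
    # so strong edges (e.g. sky/building boundaries) survive the smoothing.
    return cv2.bilateralFilter(mask_bgr, d=9, sigmaColor=75, sigmaSpace=75)
```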
  • In some embodiments, the feature information of the mask image includes but is not limited to the size information of the non-target area in the mask image, the position information of the blank area indicating the target area, the size information of the blank area, and/or the edge information between the blank area and the non-target area, and the like.
  • In some embodiments, the combination parameters include but are not limited to an edge alignment parameter, the size of the material image to be displayed in the second image, and/or the position and/or size, in the second image, of the target object in the material image; the edge alignment parameter is used to align the edge between the non-target area and the target area with the edge of the material image.
  • the position and/or size of the object to be displayed in the second image in the material image is determined according to the width of the blank area in the mask image.
  • the position of the target object in the material image to be displayed in the second image is determined according to the position where the blank area has the largest width in the mask image.
  • the size of the material image in the second image may be determined according to size information of a blank area in the mask image.
  • For example, the size of the material image in the finally generated second image may be determined according to the difference between the size information of the blank area and the size information of the material image; for example, if the size of the blank area is larger than the size of the material image, the material image can be up-sampled to enlarge it, and if the size of the blank area is smaller than the size of the material image, the material image can be down-sampled to shrink it.
  • It can be understood that resizing the material image for the second image is not limited to up-sampling and down-sampling, and other solutions capable of resizing are also within the protection scope of the present application.
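  • The following is a minimal sketch of this size adaptation, assuming the blank area's bounding size has already been measured from the mask image; the interpolation choices are assumptions.

```python
import cv2

# A minimal sketch of fitting the material image to the blank (target) area.
def fit_material_to_blank(material, blank_w, blank_h):
    h, w = material.shape[:2]
    if blank_w > w or blank_h > h:
        interp = cv2.INTER_CUBIC   # blank larger than material: up-sample
    else:
        interp = cv2.INTER_AREA    # blank smaller: down-sample
    return cv2.resize(material, (blank_w, blank_h), interpolation=interp)
```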
  • the size of the target object in the material image in the second image may be determined according to the size information of the non-target area.
  • the size of the target object in the material image to be displayed in the second image may be determined according to the difference between the size information of the non-target area and the size information of the material image.
  • For example, the size of the target object in the material image to be displayed in the second image can be determined according to the size information of the non-target area and a preset size-ratio relationship (referring to the size ratio between the non-target area and the target object).
  • the size of the target object in the material image in the second image may be determined according to the proportion of the non-target area in the first image.
  • the material image and the target object in the material image can be enlarged or reduced in equal proportions, or can be enlarged or reduced in different proportions respectively, and specific settings can be made according to actual application scenarios.
  • In some embodiments, an edge alignment parameter may be determined according to the edge information between the blank area and the non-target area in the mask image, and the edge alignment parameter is used to align the edge between the non-target area and the target area with the edge of the material image.
  • For example, when the target area is a sky area and the material image is a sky image, the edge between the non-target area and the target area refers to the skyline; the skyline of the sky image is determined according to the lower edge of the sky image, and the skyline of the non-sky area is determined according to the lowest point where the non-sky area meets the sky area.
  • the position of the target object in the material image to be displayed in the second image may be determined according to the position information of the blank area indicating the target area in the mask image.
  • For example, when the target area is a sky area, the material image is a sky image, and the target object includes at least the sun or the moon, this embodiment can flexibly determine the display position of the sun or the moon according to the position information of the blank area.
  • In some embodiments, the target object includes, but is not limited to, one or more of the sun, the moon, clouds, stars, airplanes, and the like.
  • In some embodiments, the electronic device may determine, according to the position information of the blank area indicating the target area in the mask image, a position and/or area range suitable for displaying the target object in the material image; for example, the area of the blank area included in the area range corresponding to the position is greater than a preset value, or the area range corresponding to the position only includes the blank area.
  • the position and/or area range are used to indicate the position of the object in the material image to be displayed in the second image.
  • In this way, the target object in the material image is displayed as much as possible in the part of the second image corresponding to the blank area, so as to prevent the target object from blocking the non-target area.
  • the preset value may be specifically set according to an actual application scenario, which is not limited in this embodiment.
  • When the preset value is greater than or equal to the area corresponding to the area range, it means that the area range corresponding to the position may only include the blank area; when the preset value is smaller than the area corresponding to the area range, it means that the area range corresponding to the position may contain a small amount of the non-target area.
  • In some embodiments, the electronic device may obtain one or more search boxes related to the target object, and use the one or more search boxes to traverse the mask image with a preset step size.
  • In some embodiments, the display size of the search box may be greater than or equal to the size of the target object to be displayed in the second image, where that size is determined according to the size information of the non-target area.
  • In some embodiments, the preset step size may also be determined according to the display size of the target object in the second image; for example, the larger the display size, the larger the preset step size may be set in order to save running resources during traversal, that is, the display size is positively correlated with the preset step size.
  • Then, according to the position information of the blank area in the mask image, the electronic device may determine the area of the blank area contained in each sub-area traversed by the search box; if the area of the blank area contained in a sub-area is greater than the preset value, that sub-area is determined to be a position and/or area range suitable for displaying the target object. A sketch of this traversal is given below.
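  • The following is a minimal sketch of the search-box traversal, assuming a single-channel mask in which 255 marks the blank (target) area; the box size, step size, and preset value are illustrative assumptions.

```python
import numpy as np

# Slide a search box over the mask and collect sub-areas whose blank
# coverage exceeds the preset value.
def find_display_positions(mask_gray, box_w, box_h, step, preset_value):
    blank = (mask_gray == 255).astype(np.uint32)
    h, w = blank.shape
    candidates = []
    for y in range(0, h - box_h + 1, step):
        for x in range(0, w - box_w + 1, step):
            blank_area = int(blank[y:y + box_h, x:x + box_w].sum())
            if blank_area > preset_value:   # enough blank area: suitable
                candidates.append((x, y, blank_area))
    return candidates
```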
  • In this way, the display position of the target object in the material image can be adaptively determined according to the position of the blank area (or the target area), so that even when the same material image is used, the positions suitable for displaying the target object differ for different target areas (blank areas); that is, the display position of the target object in the second image can change flexibly based on different target areas, thereby improving the display effect of the second image.
  • In some embodiments, the electronic device may use the one or more search boxes to traverse, with a preset step size, at least part of the blank area and/or at least part of the non-target area in the mask image; in order to prevent the target object from excessively occluding the non-target area, the at least part of the non-target area may include a portion of the non-target area that is close to the target area.
  • In some embodiments, the electronic device may use the one or more search boxes to traverse the mask image starting from a preset position, with a preset step size and in a preset direction; the preset position and/or the preset direction can be determined according to the type of the target object.
  • For example, when the target area is a sky area and the target object is the moon, considering that the moon is usually in the upper part of the sky area (that is, the blank area), the preset position can be determined in the upper part of the blank area in the mask image (such as a position above the bisector of the blank area), and the preset direction can be from left to right and/or from top to bottom.
  • For example, in the mask image shown in FIG. 6, the electronic device may use the search box to traverse the mask image from the preset position along a left-to-right direction, and then continue to traverse from top to bottom or from bottom to top; or, in order to improve traversal efficiency, the electronic device may also use multiple search boxes to traverse the mask image from different preset positions from left to right.
  • the preset direction includes one or more of from left to right, from right to left, from top to bottom, and from bottom to top.
  • In an example, the preset position is located in the middle of the mask image; in another example, the preset position is located in the upper half of the mask image, for example, one third of the way down from the upper edge.
  • If multiple suitable sub-areas are found, the electronic device may acquire a target sub-area according to a preset position priority, and determine the target sub-area to be the position and/or area range suitable for displaying the target object.
  • the position priority may be set according to actual needs, so that the selected target sub-area is the position where the user expects to display the target object.
  • a position located in the middle of the mask image has a higher priority than other positions.
  • For example, the sub-area in the middle of the mask image can be preferentially selected as the target sub-area according to the preset position priority, and sub-areas at other positions are selected afterwards; for instance, the sub-area in the left half of the mask image is selected first, and then the sub-area in the right half.
  • In some embodiments, the electronic device may process the material image according to the position and/or area range suitable for displaying the target object in the material image; the processing performed on the material image includes but is not limited to cropping, scaling and/or rotation; the electronic device then replaces the target area in the first image with the processed material image.
  • In this way, the material image can be effectively processed according to the determined combination parameters (the position and/or area range suitable for displaying the target object in the material image), so that the processed material image is adapted to the non-target area, thereby improving the display effect of the generated second image.
  • In some embodiments, when generating the second image, the electronic device may use the mask image to extract an image containing the non-target area from the first image, and then fuse the extracted image with at least part of the material image to generate the second image, for example as sketched below.
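  • A minimal sketch of this fusion step, assuming `mask` is a single-channel uint8 mask (255 = target area) with the same height and width as `first_image`, and `material` has already been resized/cropped to the same shape:

```python
import numpy as np

# Keep the non-target area from the first image, take the target area
# from the material image.
def fuse(first_image, material, mask):
    alpha = (mask.astype(np.float32) / 255.0)[..., None]  # 1 = target area
    second = first_image.astype(np.float32) * (1.0 - alpha) \
           + material.astype(np.float32) * alpha
    return second.astype(np.uint8)
```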
  • In some embodiments, the material image may be selected according to the replacement time of the first image, with different replacement times indicating different target objects and/or color tones in the material image.
  • Taking the target area as the sky as an example, if the replacement time is in the daytime, the target object in the material image may be the sun; if the replacement time is in the evening, the target object in the material image may be the moon.
  • Taking the target area as the sky as an example, the brightness of the material image when the replacement time is noon is greater than its brightness when the replacement time is evening.
  • Taking the moon as an example, if the replacement time is around the fifteenth day of the lunar month, the moon is full; if the replacement time is at the beginning of the lunar month, the moon is crescent-shaped.
  • In some embodiments, the electronic device may adjust the display parameters of the non-target area according to the color information of the target area and the color information of the material image, so as to generate a second image combining the adjusted non-target area and at least part of the material image.
  • the display parameters include brightness and/or hue of the non-target area.
  • In this way, the display parameters of the non-target area are coordinated according to the color information of the target area and the color information of the material image, so that the color tone and lighting of the non-target area and the material image in the second image are consistent.
  • In some embodiments, when generating the second image, the electronic device adjusts the target object of the material image according to the recognition result of the non-target area; for example, according to the category recognition result of the non-target area, the size, type, movement speed, and shape-change speed of the target object in the material image are adjusted. For example, when it is recognized that the non-target area includes ancient buildings, the type of the material image is adjusted to moving clouds.
  • the second image includes multiple frames of images, and the positions of the target object in the second image are different among the multiple frames of images.
  • multiple frames of images constitute a video clip, and the target moves at a preset speed in the video clip.
  • the target object includes a first target object and a second target object, and the first target object and the second target object have different speeds in the video clip.
  • For example, the movement speed of the clouds is greater than the movement speed of the moon, and the speeds of the first target object and the second target object in the video clip are both non-zero.
  • In some embodiments, multiple frames of images constitute a video clip in which the shape of the target object changes continuously at a preset speed, for example, the shape of a cloud changes; different types of target objects have different shape-change speeds.
  • the second image includes multiple frames of images, and the shapes of the objects in the second images are different among the multiple frames of images.
  • multiple frames of images constitute a video clip, and the shape of the target object changes in the video clip.
  • The changes include continuous changes, for example, the continuous change of clouds or the continuous change of strokes.
  • In some embodiments, the state (including shape, category, etc.) of the target object in the target area of the first image is recognized, and corresponding stickers are added according to the shape to generate the second image.
  • In some embodiments, the first image includes multiple frames of images, and the shapes of the target object differ among the multiple frames; accordingly, the display state of the corresponding sticker in the second image is adjusted, for example, stickers of different shapes are used to accommodate the changes in the shape of the target object.
  • In some embodiments, the stickers include strokes, icons, etc., and the strokes displayed in consecutive second images change continuously.
  • In some embodiments, the color information of the target area includes the average values of the three BGR color channels of the target area, and the color information of the material image includes the average values of the three BGR color channels of the material image.
  • In some embodiments, the electronic device may adjust the brightness of the non-target area according to the ratio of the color information of the material image to the color information of the target area, and/or adjust the hue of the non-target area according to the difference between the color information of the material image and the color information of the target area, as sketched below.
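  • The following is a minimal sketch of such a color-coordination step; treating the ratio as a scalar brightness gain and the difference as a per-channel additive shift is an assumption about the exact rule, which the patent does not spell out.

```python
import numpy as np

# Adjust the non-target area using the BGR channel means of the target area
# and of the material image.
def harmonize_non_target(non_target, target_mean_bgr, material_mean_bgr):
    target_mean = np.asarray(target_mean_bgr, dtype=np.float32)
    material_mean = np.asarray(material_mean_bgr, dtype=np.float32)
    gain = material_mean.mean() / max(float(target_mean.mean()), 1e-6)
    shift = material_mean - target_mean        # per-channel tone difference
    out = non_target.astype(np.float32) * gain + shift
    return np.clip(out, 0, 255).astype(np.uint8)
```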
  • In some embodiments, when generating the second image, the electronic device may perform color transition processing on the edge between the non-target area and at least part of the material image, so as to improve the realism of the second image.
  • a skyline halo may be added to an edge between a non-sky area and at least part of the sky image, so as to ensure a natural transition between the two.
  • The above processing of the image processing method does not need to interact with other devices and can be completed locally by the electronic device alone; further, the processing can also be performed locally offline.
  • the above image processing method thus has wide applicability.
  • The method described in the above image processing method embodiment may also be adopted to replace the target area in a video frame to be processed with the material image.
  • On this basis, an embodiment of the present application provides a video processing method, which can perform motion estimation on the video frame sequence to obtain the transformation relationship between different video frames to be processed, and then transform the material image according to that transformation relationship, so that the motion between different video frames to be processed is applied to the material image; after transformation, the material image can match the different video frames to be processed, thereby improving the realism and quality of the replaced video frames.
  • Specifically, the above image processing method can be used to process the first video frame to be processed that contains the target area: mask processing is performed on the target area in the video frame to obtain a mask image; according to the feature information of the mask image, the combination parameters of the material image and the non-target area in the video frame other than the target area are determined; and according to the combination parameters, the target area is replaced with at least part of the material image, so as to generate a video frame combining the non-target area and at least part of the material image.
  • Furthermore, the video processing method provided in the embodiment of the present application can perform motion estimation on the video frame sequence to obtain the transformation relationship between different video frames to be processed, transform the material image according to that transformation relationship, and then use the transformed material image to replace the target area in the video frame to be processed.
  • The video processing method provided in the embodiment of the present application may be applied to an electronic device.
  • the electronic device may include a program for executing the image processing method.
  • the electronic device includes at least a memory and a processor, the memory stores executable instructions of the image processing method, and the processor can be configured to execute the executable instructions.
  • The electronic devices include, but are not limited to, remote controls, smartphones/cell phones, tablet computers, personal digital assistants (PDAs), laptop computers, desktop computers, media content players, video game stations/systems, virtual reality systems, augmented reality systems, wearable devices (for example, watches, glasses, helmets or pendants, etc.), and other computing devices with image processing capabilities.
  • FIG. 7 is a schematic flowchart of a video processing method provided in an embodiment of the present application. The method may be executed by the electronic device, and the method includes:
  • In step S201, at least two video frames to be processed that contain a target area are acquired from a video frame sequence.
  • In step S202, motion estimation is performed according to the video frame sequence, and a transformation relationship between different video frames to be processed is acquired.
  • In step S203, a material image used to replace the target area of the video frames to be processed is acquired, and the material image is transformed according to the transformation relationship.
  • In step S204, the transformed material image is used to replace the target area in the video frames to be processed, so that the material image adapts to the motion of the different video frames to be processed.
  • When motion estimation is performed on the video frame sequence, if computing resources are sufficient, motion estimation may be performed based on all regions of each of the at least two video frames to be processed.
  • For example, feature extraction can be performed on two adjacent video frames to be processed to obtain feature information (such as corner points, edges, etc.); feature matching is then performed according to the feature information (for example, using the KLT tracking algorithm), and the transformation relationship between the two adjacent video frames is obtained from the matched feature information, as sketched below.
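  • A minimal sketch of this step with OpenCV: track corner features from the previous frame into the current one with the pyramidal KLT tracker, then fit a similarity transform; the parameter values are illustrative assumptions.

```python
import cv2

# Estimate the transformation relationship between two adjacent frames.
def estimate_transform(prev_gray, curr_gray):
    pts_prev = cv2.goodFeaturesToTrack(prev_gray, maxCorners=500,
                                       qualityLevel=0.01, minDistance=8)
    pts_curr, status, _err = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray,
                                                      pts_prev, None)
    ok = status.ravel() == 1
    # 2x3 matrix mapping the previous frame onto the current one; it can then
    # be applied to the material image, e.g. with cv2.warpAffine.
    matrix, _inliers = cv2.estimateAffinePartial2D(pts_prev[ok], pts_curr[ok])
    return matrix
```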
  • the electronic device may perform motion estimation according to a partial area of each of the at least two video frames to be processed.
  • the video frame to be processed includes the target area and non-target areas other than the target area.
  • In an example, the partial region includes at least part of the target region in the video frame to be processed; or, the partial region includes at least part of the target region in the video frame to be processed and a part of the non-target region close to the target region; or, the partial region includes the part of the video frame to be processed on one side of a boundary line, where the area of the target region on that side of the boundary line is larger than its area on the other side.
  • the electronic device can perform motion estimation according to at least part of the target area of each of the at least two video frames to be processed.
  • Alternatively, the electronic device may perform motion estimation according to at least part of the target area and a part of the non-target area close to the target area in each of the at least two video frames to be processed; including the part of the non-target area close to the target area is beneficial to increasing the feature information available for motion estimation, thereby improving the accuracy of the motion estimation result.
  • Considering that the transformation of the material image pays more attention to the motion between the target areas of different video frames to be processed, the feature information extracted from the target area should occupy a larger proportion in the motion estimation process, while the feature information of the part of the non-target area close to the target area can occupy a smaller proportion.
  • Based on this, the electronic device may perform motion estimation according to at least part of the target area and its first weight, and a part of the non-target area close to the target area and its second weight, in each of the at least two video frames to be processed; the first weight is greater than the second weight, the first weight represents the proportion of the feature information of the at least part of the target area in the motion estimation process, and the second weight represents the proportion of the feature information of the part of the non-target area in the motion estimation process.
  • the weight of at least part of the target area is increased, so that a transformation relationship adapted to the target area can be obtained.
  • the part of the non-target area close to the target area is determined according to an edge between the target area and the non-target area in the video frame to be processed.
  • In some embodiments, a boundary line may be determined according to the edge between the target area and the non-target area; for example, the boundary line may be a straight line determined according to the lowest point of the edge of the target area, and the boundary line divides the non-target area to obtain the part of the non-target area close to the target area, as sketched below.
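  • A minimal sketch of deriving such a boundary line from the lowest point of the target-area edge (cf. the white line in FIG. 8), assuming `target` is a boolean array with True marking target-area (e.g. sky) pixels:

```python
import numpy as np

# Find the row of the lowest target-area pixel; a horizontal line through it
# serves as the boundary line.
def boundary_row(target: np.ndarray) -> int:
    rows = np.where(target.any(axis=1))[0]  # rows that contain target pixels
    return int(rows.max())                  # lowest such row: the boundary

# Non-target pixels above this row (between the skyline and the boundary)
# form the part of the non-target area "close to the target area".
```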
  • the edge is a dividing line or one or more dividing points therein.
  • In some embodiments, a boundary line may be acquired in the video frame to be processed, where the area of the target area on one side of the boundary line is larger than that on the other side; the electronic device may use the part on that side of the boundary line (that is, the part where the target area is larger) for motion estimation, so that a transformation relationship adapted to the target area can be obtained.
  • The boundary line is used to divide the image to be processed into two parts, and the part of the video frame to be processed on one side of the boundary line includes part of the target area and part of the non-target area; on that side, the area of the target-area part is larger than the area of the non-target-area part, while on the other side of the boundary line, the area of the target-area part is smaller than the area of the non-target-area part.
  • the boundary line may be determined according to an edge between the target area and the non-target area.
  • the boundary line may be determined according to one of the edge points in the edges.
  • one of the edge points includes the lowest point of the target area, and the white straight line shown in FIG. 8 is the boundary line determined according to the lowest point.
  • the boundary line may also be determined according to other edge points, which is not limited in this embodiment.
  • the edge points are segmentation points.
  • In some embodiments, the boundary line between the target area and the non-target area in the video frame to be processed may be obtained, and motion estimation is performed according to the boundary line.
  • Using the boundary line for motion estimation can effectively reduce the amount of calculation and improve the efficiency of motion estimation.
  • The boundary line may be determined according to the edge between the target area and the non-target area in the video frame to be processed; in one example, the boundary line may be formed by the edge itself; in another example, the boundary line may be determined according to one of the edge points in the edge, such as a straight line segment passing through that edge point.
  • In some embodiments, the boundary line may be dilated to obtain a dilated boundary line, and motion estimation is performed according to the dilated boundary line in the at least two video frames to be processed; dilating the boundary line can effectively increase the feature information available for motion estimation, thereby improving the accuracy of the motion estimation result.
  • In some embodiments, the display range of the material image may be adapted to the motion of different video frames to be processed. For example, the material image is synthesized from a plurality of images captured at different acquisition angles, so that no matter how large the motion between different video frames to be processed is, the display range of the material image can be adapted.
  • For example, when the material image includes a sky image, the material image may be stored in the form of a sky box or a hemisphere.
  • In a real-time video playback scene, such as a live broadcast scene, motion estimation is performed on two adjacent video frames to be processed (such as the current video frame to be processed and the previous video frame to be processed) to obtain the transformation relationship; since this process may be synchronized with the playback of the video frame sequence, the transformation relationship cannot be applied to the current video frame to be processed, and instead the transformation relationship can be used to transform the material image corresponding to the next video frame to be processed, thereby taking into account both the transformation process of the material image and the real-time playback of the video frames.
  • In some embodiments, the transformation relationship obtained by performing motion estimation on two adjacent video frames to be processed can be directly applied to the current video frame to be processed; that is, the transformation relationship can be used to transform the material image corresponding to the current video frame to be processed, ensuring that the transformed material image matches the current video frame to be processed.
  • In the above image processing process, the mask image can be obtained through an image segmentation model; the image segmentation models provided in the related art include but are not limited to a semantic segmentation model, an instance segmentation model, a panoramic segmentation model, etc.
  • the inventors have found that when the size of the image input into the image segmentation model is small, the image segmented by the above segmentation model has rough edges.
  • For example, the outer contour of a building is rectangular, but the corners of the outer contour output by the image segmentation model are rounded.
  • For another example, foliage has hollowed-out gaps, but in the segmentation map output by the image segmentation model the leaf region is solid, and the edges of the actual hollowed-out parts are not detected. In this way, the segmentation map output by the image segmentation model leads to low accuracy of subsequent edge-based image processing results.
  • In view of this, an embodiment of the present application provides a training method for an image segmentation model that strengthens the model's learning of edges, so that the edges of images segmented by the trained image segmentation model are finer.
  • FIG. 9 is a schematic flow chart of the training method of the image segmentation model provided by the embodiment of the present application.
  • The training method can be executed by an electronic device, and the electronic device includes but is not limited to computing devices such as desktop computers, notebooks, palmtop computers, wearable devices, servers, cloud servers, and mobile phones.
  • the methods include:
  • In step S301, a training image is input into the image segmentation model to obtain a segmentation prediction map.
  • In step S302, edge detection is performed on the segmentation annotation map corresponding to the training image to obtain an edge map.
  • In step S303, mask processing is performed on the segmentation annotation map and the segmentation prediction map respectively by using the edge map, to obtain a first mask image and a second mask image.
  • In step S304, parameters of the image segmentation model are adjusted at least according to the difference between the first mask image and the second mask image, to obtain a trained image segmentation model.
  • the image segmentation model can be trained in a supervised learning manner.
  • Specifically, a number of training images containing target regions are acquired, and each training image corresponds to a segmentation annotation map, which is a ground-truth map with the target region pre-labeled.
  • For example, the image segmentation model is used to segment the sky area and the non-sky area; as shown in FIG. 10A and FIG. 10B, FIG. 10A shows one of the training images, and FIG. 10B shows the segmentation annotation map corresponding to the training image, where the target area and the non-target area are displayed in different colors.
  • After acquiring the training image, the electronic device inputs the training image into a preset image segmentation model, so as to acquire the segmentation prediction map output by the image segmentation model. It can be understood that the embodiment of the present application does not involve improving the structure of the image segmentation model, and does not impose any limitation on the specific type of the image segmentation model, which can be set according to actual application scenarios.
  • the image segmentation model includes, but is not limited to, a semantic segmentation model, an instance segmentation model, or a panoramic segmentation model, etc.; wherein, the semantic segmentation model classifies all pixels on the image; the instance segmentation model is a target detection model Combining with semantic segmentation, different individuals of the same type of objects in the image can be distinguished; the panoramic segmentation model is a combination of semantic segmentation and instance segmentation, which can detect and segment all objects in the image, including the background.
  • In some embodiments, the electronic device may perform edge detection on the segmentation annotation map corresponding to the training image to obtain the edge map. It can be understood that this embodiment does not impose any restriction on the edge detection algorithm used, which can be selected according to the actual application scenario; for example, the Canny, Sobel, Prewitt or Roberts edge detection algorithm may be used to perform edge detection on the segmentation annotation map.
  • Considering that the detected edges are relatively thin and may occupy only one or two pixels in the segmentation annotation map, in order to further highlight the detected edges and improve the image segmentation model's learning of edges, the detected edges are dilated to obtain the dilated edge map.
  • For example, edge detection is performed on the segmentation annotation map shown in FIG. 10B, and the detected edges are dilated to obtain the dilated edge map shown in FIG. 10C, as sketched below.
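  • A minimal sketch of edge detection plus dilation using the Canny detector, one of the algorithms named above; the thresholds and kernel size are illustrative assumptions.

```python
import cv2
import numpy as np

# Detect edges on a grayscale annotation map, then thicken them.
def make_edge_map(annotation_gray):
    edges = cv2.Canny(annotation_gray, threshold1=50, threshold2=150)
    kernel = np.ones((5, 5), np.uint8)   # thicken the 1-2 px detected edges
    return cv2.dilate(edges, kernel, iterations=1)
```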
  • After acquiring the edge map, the electronic device may use the edge map to perform mask processing on the segmentation annotation map to obtain a first mask image, where the first mask image is used to highlight the edge (the edge between the target area and the non-target area) in the segmentation annotation map; similarly, after obtaining the segmentation prediction map, the electronic device may use the edge map to perform mask processing on the segmentation prediction map to obtain a second mask image, where the second mask image is used to highlight the edge (the edge between the target area and the non-target area) in the segmentation prediction map.
  • edges and non-edges in the edge map are distinguished by different colors.
  • the edges in the edge map are displayed in white (255, 255, 255), and the non-edges in the edge map are displayed in black (0,0,0) is displayed.
  • In some embodiments, the mask processing may be to multiply the segmentation annotation map or the segmentation prediction map by the edge map; that is, the first mask image is the product of the segmentation annotation map and the edge map, where the non-edge part is black (0, 0, 0) and the edge part is non-zero, so that the edge part is highlighted; correspondingly, the second mask image is the product of the segmentation prediction map and the edge map, where the non-edge part is black (0, 0, 0) and the edge part is non-zero.
  • After acquiring the first mask image and the second mask image, the electronic device may adjust the parameters of the image segmentation model at least according to the difference between the first mask image and the second mask image, so as to obtain the trained image segmentation model.
  • In this way, the edge-related difference between the first mask image and the second mask image is used to optimize the image segmentation model, which strengthens the model's learning of edges and makes the edges of images segmented by the trained model finer.
  • In some embodiments, the loss function used to optimize the image segmentation model includes both a term indicating the difference between the first mask image and the second mask image and a term indicating the difference between the segmentation prediction map and the segmentation annotation map; that is, the electronic device can adjust the parameters of the image segmentation model according to the difference between the segmentation prediction map and the segmentation annotation map together with the difference between the first mask image and the second mask image, so as to obtain the trained image segmentation model. A sketch of such a combined loss is given below.
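  • A minimal sketch of the combined loss, written in PyTorch (an assumed framework; the patent names none). Here `pred` and `label` are segmentation maps in [0, 1], `edge` is the dilated edge map scaled to [0, 1], and the weight `lam` is an illustrative assumption.

```python
import torch
import torch.nn.functional as F

def edge_aware_loss(pred, label, edge, lam=1.0):
    seg_loss = F.binary_cross_entropy(pred, label)  # prediction vs annotation
    first_mask = label * edge     # masked annotation map (first mask image)
    second_mask = pred * edge     # masked prediction map (second mask image)
    edge_loss = F.l1_loss(second_mask, first_mask)  # edge-region difference
    return seg_loss + lam * edge_loss
```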
  • the image segmentation model trained in this embodiment does not have any requirements on the size of the input image, and can obtain a finer segmentation map than the image segmentation model in the related art no matter how small the size of the input image is.
  • For example, when the image shown in FIG. 1 is input into an image segmentation model in the related art (without strengthened learning of edges) for segmentation processing, the segmentation map shown in FIG. 11 is obtained; when the image shown in FIG. 1 is input into the image segmentation model obtained by the embodiment of the present application (with strengthened learning of edges) for segmentation processing, the segmentation map shown in FIG. 5A is obtained. The edges of the segmentation map shown in FIG. 5A are finer than the edges of the segmentation map shown in FIG. 11.
  • Referring to FIG. 12, an embodiment of the present application further provides an electronic device 40, including: a memory for storing executable instructions; and one or more processors;
  • wherein, when the one or more processors execute the executable instructions, they are individually or collectively configured to perform any one of the above methods.
  • The processor 41 executes the executable instructions included in the memory 42. The processor 41 may be a central processing unit (Central Processing Unit, CPU), or another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
  • The memory 42 stores executable instructions of the image processing method, the video processing method, or the training method of the image segmentation model. The memory 42 may include at least one type of storage medium, including a flash memory, a hard disk, a multimedia card, a card-type memory (for example, SD or DX memory, etc.), a random access memory (RAM), a static random access memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, etc. Moreover, the device may cooperate with a network storage that performs the storage function of the memory through a network connection.
  • The memory 42 may be an internal storage unit of the device 40, such as a hard disk or memory of the device 40; the memory 42 may also be an external storage device of the device 40, for example, a plug-in hard disk, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, a flash card (Flash Card), etc. equipped on the device 40.
  • the memory 42 may also include both an internal storage unit of the device 40 and an external storage device.
  • the memory 42 is used to store the computer program 44 and other programs and data required by the device.
  • the memory 42 can also be used to temporarily store data that has been output or will be output.
  • Various implementations described herein can be implemented using a computer readable medium such as computer software, hardware, or any combination thereof.
  • The embodiments described herein can be implemented by using Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), processors, controllers, microcontrollers, microprocessors, and electronic units designed to perform the functions described herein.
  • For a software implementation, an embodiment such as a procedure or a function may be implemented with a separate software module that allows at least one function or operation to be performed.
  • The software codes can be implemented by a software application (or program) written in any suitable programming language, and may be stored in a memory and executed by a controller.
  • The electronic device 40 may be a computing device such as a desktop computer, a notebook, a palmtop computer, a server, a cloud server, or a mobile phone.
  • The device may include, but is not limited to, a processor 41 and a memory 42.
  • Those skilled in the art can understand that FIG. 4 is only an example of the electronic device 40 and does not constitute a limitation on the electronic device 40; the device may include more or fewer components than shown, or combine certain components, or have different components. For example, the device may also include input and output devices, network access devices, buses, and so on.
  • non-transitory computer-readable storage medium including instructions, such as a memory including instructions, which are executable by a processor of an apparatus to perform the above method.
  • the non-transitory computer readable storage medium may be ROM, random access memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, and the like.
  • a non-transitory computer-readable storage medium enabling the terminal to execute the above method when instructions in the storage medium are executed by a processor of the terminal.
  • a computer program product including the computer program of any one of the above methods.

Abstract

An image processing method, a video processing method, a training method, a device, a program product, and a storage medium. The image processing method includes: acquiring a first image containing a target area and a material image to replace the target area; performing mask processing on the target area in the first image to acquire a mask image; determining, according to feature information of the mask image, combination parameters of the material image and the non-target area of the first image other than the target area; and replacing, according to the combination parameters, the target area in the first image with at least part of the material image, so as to generate a second image combining the non-target area and at least part of the material image. This embodiment helps improve the quality of the replaced second image.

Description

Image processing method, video processing method, training method, device, program product and storage medium

Technical Field

The present application relates to the technical field of visual information processing, and in particular to an image processing method, a video processing method, a training method, a device, a program product, and a storage medium.

Background

With the development of intelligent devices in today's society, the processing of visual information (such as images and videos) has become an indispensable part of people's work and life. Whether for professional visual information processing at work or for recreational visual information processing in daily life, one of the more widespread ways of processing visual information is to replace certain areas in images or video frames.

Summary of the Invention

In view of this, one of the objectives of the present application is to provide an image processing method, a video processing method, a training method, a device, a program product, and a storage medium.

In a first aspect, the inventors found that in the image replacement methods of the related art, when a material image is used to replace a region to be replaced in an image, the material image is merely rigidly filled into that region according to its position, so that the quality of the replaced image is poor.

Based on this, an embodiment of the present application provides an image processing method, including:

acquiring a first image containing a target area and a material image to replace the target area;

performing mask processing on the target area in the first image to acquire a mask image;

determining, according to feature information of the mask image, combination parameters of the material image and the non-target area of the first image other than the target area;

replacing, according to the combination parameters, the target area in the first image with at least part of the material image, so as to generate a second image combining the non-target area and at least part of the material image.

In this embodiment, when the material image is used to replace the target area in the first image, the combined effect of the material image and the remaining unreplaced part of the image is fully considered: mask processing is performed on the target area in the first image to acquire a mask image, the combination parameters of the non-target area and the material image are determined according to the feature information of the mask image, and the combination parameters are used to assist the process of combining the non-target area with at least part of the material image, which helps improve the quality of the generated second image.

In a second aspect, the inventors found that in the video replacement methods of the related art, when a material image is used to replace regions to be replaced in multiple video frames, the same material image is merely rigidly filled into different video frames according to the positions of the regions to be replaced in those frames, so that the playback quality of the replaced video is poor.
Based on this, an embodiment of the present application provides a video processing method, including:

acquiring at least two to-be-processed video frames containing a target area from a video frame sequence;

performing motion estimation according to the video frame sequence to acquire the transformation relationship between different to-be-processed video frames;

acquiring a material image for replacing the target area of the to-be-processed video frames, and performing transformation processing on the material image according to the transformation relationship;

replacing the target area in the to-be-processed video frames with the transformed material image, so that the material image adapts to the motion of the different to-be-processed video frames.

In this embodiment, the material image is transformed according to the transformation relationship between different to-be-processed video frames, so that the motion between different to-be-processed video frames is applied to the material image and the transformed material image can match the different to-be-processed video frames, thereby improving the realism and quality of the replaced video frames.

In a third aspect, the inventors found that the images segmented by the image segmentation models of the related art have relatively rough segmentation edges, so that the accuracy of subsequent edge-based image processing results is low.

Based on this, an embodiment of the present application provides a training method for an image segmentation model, including:

inputting a training image into the image segmentation model to acquire a segmentation prediction map;

performing edge detection on the segmentation annotation map corresponding to the training image to acquire an edge map;

performing mask processing on the segmentation annotation map and the segmentation prediction map respectively using the edge map, to obtain a first mask image and a second mask image;

adjusting the parameters of the image segmentation model at least according to the difference between the first mask image and the second mask image, to obtain a trained image segmentation model.

In the embodiments of the present application, during the training of the image segmentation model, the difference between the edge-related first mask image and second mask image is used to optimize the model, which strengthens the model's learning of edges, so that the edges of images segmented by the trained model are finer.
In a fourth aspect, an embodiment of the present application provides an electronic device, including: a memory for storing executable instructions; and one or more processors; wherein, when the one or more processors execute the executable instructions, they are individually or collectively configured to perform the method of any one of the first, second, or third aspects.

In a fifth aspect, an embodiment of the present application provides a computer-readable storage medium storing executable instructions which, when executed by a processor, implement the method of any one of the first, second, or third aspects.

In a sixth aspect, an embodiment of the present application provides a computer program product, including the computer program of the method of any one of the first, second, or third aspects.
Brief Description of the Drawings

In order to explain the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application, and those of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic diagram of a first image provided by an embodiment of the present application;
Fig. 2 is a schematic diagram of a second image provided by an embodiment of the present application;
Fig. 3 is a schematic flowchart of an image processing method provided by an embodiment of the present application;
Fig. 4 is a schematic diagram of a mask image provided by an embodiment of the present application;
Fig. 5A is a schematic diagram of another mask image provided by an embodiment of the present application;
Fig. 5B is a schematic diagram of a filtered mask image provided by an embodiment of the present application;
Fig. 6 is a schematic diagram of traversing a mask image provided by an embodiment of the present application;
Fig. 7 is a schematic flowchart of a video processing method provided by an embodiment of the present application;
Fig. 8 is a schematic diagram of a boundary line determined from the lowest point of the edge, provided by an embodiment of the present application;
Fig. 9 is a schematic flowchart of a training method for an image segmentation model provided by an embodiment of the present application;
Fig. 10A is a schematic diagram of a training image provided by an embodiment of the present application;
Fig. 10B is a schematic diagram of a segmentation annotation map provided by an embodiment of the present application;
Fig. 10C is a schematic diagram of an edge map provided by an embodiment of the present application;
Fig. 11 is a schematic diagram of an image obtained by an image segmentation model of the related art, provided by an embodiment of the present application;
Fig. 12 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
Detailed Description

The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are only some of the embodiments of the present application, not all of them. Based on the embodiments in the present application, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the scope of protection of the present application.

In terms of visual information processing, one of the more widespread approaches is to use a material image to replace certain areas in an image or video frame, obtaining a target image or target video that contains at least part of the material image and meets the user's needs.

In the image replacement methods of the related art, when a material image is used to replace a region to be replaced in an image, the material image is merely rigidly filled into that region according to its position, so that the quality of the replaced and composited image is poor.

Based on this, an embodiment of the present application provides an image processing method: acquiring a first image containing a target area and a material image to replace the target area; performing mask processing on the target area in the first image to acquire a mask image; determining, according to feature information of the mask image, combination parameters of the material image and the non-target area of the first image other than the target area; and replacing, according to the combination parameters, the target area in the first image with at least part of the material image, so as to generate a second image combining the non-target area and at least part of the material image. In this embodiment, the combined effect of the material image and the remaining unreplaced part of the image is fully considered, and the combination parameters of the non-target area and the material image can be determined according to the feature information of the mask image, so that the second image, whose combination is assisted by the combination parameters, has better display quality.

The image processing method provided by the embodiments of the present application may be performed by an electronic device. Exemplarily, the electronic device may include a program that performs the image processing method. Exemplarily, the electronic device includes at least a memory and a processor, the memory stores executable instructions of the image processing method, and the processor may be configured to execute the executable instructions. Exemplarily, the electronic device includes, but is not limited to, computing devices with image processing capability such as remote controls, smartphones/mobile phones, tablet computers, personal digital assistants (PDAs), laptop computers, desktop computers, media content players, video game stations/systems, virtual reality systems, augmented reality systems, and wearable devices (for example, watches, glasses, helmets, or pendants).

In an exemplary sky-replacement scenario, as shown in Fig. 1, the target area included in the first image includes a sky area that is to be replaced, and the material image includes a sky image used to replace the sky area in the first image. The electronic device may perform mask processing on the target area in the first image to acquire a mask image, the mask image including a blank area indicating the sky area and the non-sky area; determine the combination parameters of the non-sky area and the sky image according to the feature information of the mask image; and, based on the combination parameters, replace the sky area with at least part of the sky image, so as to combine the at least part of the sky image with the non-sky area to generate the second image shown in Fig. 2. In this way, the combined effect of the sky image and the non-sky area can be comprehensively considered in the process of replacement and compositing, which helps improve the quality of the generated second image.
The image processing method provided by the embodiments of the present application is described next. Referring to Fig. 3, which shows an image processing method provided by an embodiment of the present application, the method may be performed by an electronic device and includes:

In step S101, a first image containing a target area and a material image to replace the target area are acquired.

In step S102, mask processing is performed on the target area in the first image to acquire a mask image.

In step S103, combination parameters of the material image and the non-target area of the first image other than the target area are determined according to feature information of the mask image.

In step S104, the target area in the first image is replaced with at least part of the material image according to the combination parameters, so as to generate a second image combining the non-target area and at least part of the material image.
It can be understood that this embodiment does not place any restriction on the specific types of the target area and the material image, which may be set according to the actual application scenario. Exemplarily, the target area includes a sky area and the material image includes a sky image. Exemplarily, the target area includes a sea-surface area and the material image includes an ocean image. Exemplarily, the target area may also be a background area such as the ground or buildings, or the target area may be the area where a specified object (such as a human body or another object) is located.

Of course, the embodiments of the present application do not place any restriction on the sources of the first image and the material image either, which may be selected according to the actual application scenario; for example, the first image may be captured by the user with an imaging device, and the material image may be composited by professional image processing software, and so on.

In an exemplary application scenario, the electronic device may provide an interactive interface containing an image-adding control, with which the user can acquire the first image containing the target area from the storage medium of the electronic device; the interactive interface also contains a material-selection control, by operating which the user can select, from multiple candidate material images, the material image desired to replace the target area.

In some embodiments, after acquiring the first image and the material image, in order to improve the combination effect of the non-target area in the first image and the material image, the electronic device may perform mask processing on the target area in the first image to acquire a mask image, and then determine the combination parameters of the non-target area and the material image according to the feature information of the mask image. The mask image includes a blank area indicating the target area and the non-target area. For example, referring to Fig. 1 and Fig. 4, the target area is the sky area; after the electronic device performs mask processing on the target area of the first image shown in Fig. 1, the mask image shown in Fig. 4 can be obtained, in which the area corresponding to the sky area is a blank area.
In some embodiments, the mask image may be obtained by classifying each pixel in the first image using a preset image segmentation model and then processing the pixels belonging to the target area; for example, the values of the pixels belonging to the target area may all be set to white (255, 255, 255). Exemplarily, the electronic device may input the first image into the preset image segmentation model, so as to obtain the mask image through the image segmentation model.
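To make this mask-generation step concrete, the following is a minimal sketch in Python, assuming a segmentation model has already produced a per-pixel class map for the first image; the names make_mask_image and SKY_ID are illustrative and not taken from the application.

```python
import numpy as np

SKY_ID = 1  # hypothetical class ID for the target ("sky") area

def make_mask_image(first_image: np.ndarray, class_map: np.ndarray) -> np.ndarray:
    """Copy the first image and set every pixel classified as the target area
    to white (255, 255, 255), producing the blank area of the mask image."""
    mask_image = first_image.copy()
    mask_image[class_map == SKY_ID] = (255, 255, 255)
    return mask_image
```

A binary version of the same mask (1 inside the target area, 0 elsewhere) is often kept alongside, since later steps such as compositing and the search-box traversal only need the blank/non-blank distinction.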
The image segmentation model includes, but is not limited to, a semantic segmentation model, an instance segmentation model, or a panoptic segmentation model. A semantic segmentation model classifies all pixels in an image; an instance segmentation model is a combination of object detection and semantic segmentation and can distinguish different individuals of the same type of object in an image; a panoptic segmentation model is a combination of semantic segmentation and instance segmentation and can detect and segment all objects in an image, including the background.

In some embodiments, it is considered that in the mask image output by the image segmentation model, the accuracy of the edge between the blank area indicating the target area and the non-target area is low. For example, when the image segmentation model classifies the pixels in a first image containing a sky area, at some easily confused edges (such as near the edges of buildings and greenery) it often misclassifies non-sky pixels as sky and, likewise, sky pixels as non-sky. Therefore, in order to reduce the degree of edge misclassification in the mask image output by the image segmentation model, after acquiring the mask image output by the model, the electronic device may filter the mask image with an edge-preserving filter, which can effectively retain the edge information in the image.
That is to say, the mask image used to determine the above combination parameters may be an image filtered by an edge-preserving filter, so that the edge details in the mask image are effectively retained. In one example, assuming that the mask image output by the image segmentation model is as shown in Fig. 5A, the electronic device filters this mask image with an edge-preserving filter to obtain the filtered mask image shown in Fig. 5B; by comparison, the edges of the filtered mask image are finer. It can be understood that the embodiments of the present application do not place any restriction on the specific type of the edge-preserving filter; for example, the edge-preserving filter includes, but is not limited to, a bilateral filter, a guided filter, a weighted least squares filter, or a non-local means filter.
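As one possible realization of this filtering step, the sketch below applies OpenCV's bilateral filter to the mask image; the filter parameters are illustrative rather than values specified by the application, and a guided or weighted-least-squares filter could be substituted.

```python
import cv2

mask_image = cv2.imread("mask.png")  # mask image output by the segmentation model

# Smooth the mask while preserving the edge between the blank (target) area
# and the non-target area; d and the two sigmas are illustrative values.
filtered_mask = cv2.bilateralFilter(mask_image, d=9, sigmaColor=75, sigmaSpace=75)
cv2.imwrite("mask_filtered.png", filtered_mask)
```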
In some embodiments, in the process of determining the combination parameters according to the feature information of the mask image, the feature information of the mask image includes, but is not limited to, size information of the non-target area in the mask image, position information of the blank area indicating the target area, size information of the blank area, and/or edge information between the blank area and the non-target area.

The combination parameters include, but are not limited to, an edge alignment parameter, the size at which the material image is to be displayed in the second image, and/or the position and/or size at which the target object in the material image is to be displayed in the second image; the edge alignment parameter is used to align the edge between the non-target area and the target area with the edge of the material image.

Exemplarily, the position and/or size at which the target object in the material image is to be displayed in the second image is determined according to the width of the blank area in the mask image; for example, the position at which the target object in the material image is to be displayed in the second image is determined according to the position where the blank area is widest in the mask image.
Exemplarily, the size of the material image in the second image may be determined according to the size information of the blank area in the mask image. In one example, the size of the material image in the finally generated second image may be determined according to the difference between the size information of the blank area and the size information of the material image; for example, if the size of the blank area is larger than the size of the material image, the material image may be up-sampled to enlarge it, and if the size of the blank area is smaller than the size of the material image, the material image may be down-sampled to shrink it. The size adjustment of the material image in the second image is not limited to up- and down-sampling; other schemes that can achieve size adjustment also fall within the scope of protection of the present invention.
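A minimal sketch of this size-adjustment logic follows, under the assumption that the blank area has already been measured from the mask image; the interpolation choices are common practice rather than requirements of the application.

```python
import cv2

def fit_material_to_blank(material, blank_w: int, blank_h: int):
    """Up-sample the material image when the blank area is larger than it,
    and down-sample it when the blank area is smaller."""
    h, w = material.shape[:2]
    if blank_w > w or blank_h > h:
        interp = cv2.INTER_CUBIC   # enlarging: up-sampling
    else:
        interp = cv2.INTER_AREA    # shrinking: down-sampling
    return cv2.resize(material, (blank_w, blank_h), interpolation=interp)
```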
Exemplarily, the size of the target object in the material image in the second image may be determined according to the size information of the non-target area. In one example, the size at which the target object in the material image is to be displayed in the second image may be determined according to the difference between the size information of the non-target area and the size information of the material image. In another example, the size at which the target object in the material image is to be displayed in the second image may be determined according to the size information of the non-target area and a preset size-ratio relationship (referring to the size ratio between the non-target area and the target object).

Exemplarily, the size of the target object in the material image in the second image may be determined according to the proportion of the first image occupied by the non-target area.

It can be understood that the material image and the target object in the material image may be enlarged or reduced proportionally, or may be enlarged or reduced separately at different ratios, which may be set according to the actual application scenario.

Exemplarily, an edge alignment parameter may be determined according to the edge information between the blank area and the non-target area in the mask image; the edge alignment parameter is used to align the edge between the non-target area and the target area with the edge of the material image. In one example, the target area is a sky area and the material image is a sky image, so the edge between the non-target area and the target area is the skyline; in order to accurately fuse the non-sky area with the sky image, the skyline of the non-sky area needs to be aligned with the skyline of the sky image. The skyline of the sky image is determined according to the lower edge of the sky image, and the skyline of the non-sky area is determined according to the lowest point where the non-sky area meets the sky area.

Exemplarily, the position at which the target object in the material image is to be displayed in the second image may be determined according to the position information of the blank area indicating the target area in the mask image. In one example, the target area is a sky area, the material image is a sky image, and the target object includes at least the sun or the moon; this embodiment can then flexibly determine the display position of the sun or the moon according to the position information of the blank area. In some embodiments, the target object includes, but is not limited to, one or more of the sun, the moon, clouds, stars, airplanes, and the like.

In some embodiments, the electronic device may determine, in the mask image and according to the position information of the blank area indicating the target area, a position and/or region range suitable for displaying the target object in the material image; for example, the area of the blank area contained in the region range corresponding to the position is greater than a preset value, or the region range corresponding to the position contains only the blank area. The position and/or region range is used to indicate the position at which the target object in the material image is to be displayed in the second image. In this embodiment, the target object in the material image is displayed as much as possible in the part of the second image corresponding to the blank area, preventing the target object from occluding the non-target area.

It can be understood that the preset value may be set according to the actual application scenario, and this embodiment does not place any restriction on it. In one example, when the preset value is greater than or equal to the area corresponding to the region range, it means that the region range corresponding to the position may contain only the blank area. In another example, when the preset value is smaller than the area corresponding to the region range, it means that the region range corresponding to the position may contain a small amount of non-target area.

Exemplarily, in the process of determining the position and/or region range suitable for displaying the target object in the material image, the electronic device may acquire one or more search boxes related to the target object and traverse the mask image with the one or more search boxes at a preset stride. In order to make the sub-regions searched by the search box adapt to the size at which the target object is to be displayed in the second image, the display size of the search box may be greater than or equal to the size at which the target object is to be displayed in the second image; that size is determined according to the size information of the non-target area. The preset stride may also be determined according to the size at which the target object is to be displayed in the second image; for example, the larger the display size, the larger the preset stride may be set in order to save computing resources during traversal, i.e., the display size is positively correlated with the preset stride.
During traversal, the electronic device may determine, according to the position information of the blank area in the mask image, the area of the blank area contained in each sub-region traversed by the search box; if the area of the blank area contained in a sub-region is greater than the preset value, that sub-region is determined to be a position and/or region range suitable for displaying the target object. In this embodiment, the display of the target object in the material image can be determined adaptively according to the location of the blank area (in other words, the target area), so that even if the same material image is used, the suitable positions for displaying the target object determined for different target areas (blank areas) differ; that is, the display position of the target object in the material image in the second image can vary flexibly based on different target areas, which helps improve the display effect of the second image.
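The traversal just described can be sketched as a simple sliding-window scan; the function below is a hypothetical illustration that scans left-to-right, top-to-bottom and returns the first qualifying sub-region, whereas the application also allows other start positions, directions, and priority rules.

```python
import numpy as np

def find_object_region(blank_mask: np.ndarray, box_h: int, box_w: int,
                       stride: int, min_area: int):
    """blank_mask is a 0/1 map of the blank (target) area; return the top-left
    corner of the first search-box sub-region whose blank area exceeds min_area."""
    h, w = blank_mask.shape
    for top in range(0, h - box_h + 1, stride):
        for left in range(0, w - box_w + 1, stride):
            window = blank_mask[top:top + box_h, left:left + box_w]
            if int(window.sum()) > min_area:
                return top, left  # suitable position for the target object
    return None  # no sub-region satisfies the preset value
```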
When traversing the mask image with the one or more search boxes, in order to save computing resources during traversal, the electronic device may, as actually needed, traverse at least part of the blank area and/or at least part of the non-target area in the mask image with the one or more search boxes at a preset stride; in order to prevent the target object from excessively occluding the non-target area, the at least part of the non-target area may include the part of the non-target area close to the target area.

When traversing the mask image with the one or more search boxes, considering that different types of target objects have different display requirements, the electronic device may traverse the mask image with the one or more search boxes starting from a preset position, at a preset stride and in a preset direction; the preset position and/or the preset direction may be determined according to the type of the target object.

In one example, assume the target area is a sky area and the target object is the moon. The sky area (i.e., the blank area) is usually in the upper half of the first image, and the moon is usually in the upper part of the sky area, so the preset position may be determined in the upper part of the blank area of the mask image (for example, a position above the bisecting line of the blank area). Assume the preset direction may be from left to right and/or from top to bottom; for example, in the mask image shown in Fig. 6, the electronic device may traverse the mask image with the search box from the preset position in the left-to-right direction, and may then continue traversing the mask image from top to bottom or from bottom to top; or, to improve traversal efficiency, the electronic device may traverse the mask image from left to right with multiple search boxes starting from different preset positions. In some embodiments, the preset direction includes one or more of left-to-right, right-to-left, top-to-bottom, and bottom-to-top. The preset position may be located in the middle of the mask image, or in the upper half of the mask image, for example at one third of the height from the upper edge.

Exemplarily, during traversal, if the areas of the blank areas contained in multiple sub-regions are all greater than the preset value, the electronic device may acquire a target sub-region according to the respective positions of the multiple sub-regions in the mask image and preset position priorities, and determine the target sub-region as the position and/or region range suitable for displaying the target object. In this embodiment, the position priorities may be set as actually needed, so that the selected target sub-region is the position where the user expects the target object to be displayed.

As an example, in the preset position priorities, a position located in the middle of the mask image has a higher priority than other positions. Taking the moon as the target object, if multiple sub-regions are all suitable for displaying the moon, the sub-region in the middle of the mask image may be selected first as the target sub-region according to the preset position priorities; if there is no sub-region in the middle of the mask image, sub-regions at other positions are selected, for example first sub-regions in the left half of the mask image and then sub-regions in the right half.

In some embodiments, after determining the position and/or region range suitable for displaying the target object in the material image, the electronic device may process the material image according to that position and/or region range; exemplarily, the processing performed on the material image includes, but is not limited to, cropping, scaling, and/or rotation. The electronic device then replaces the target area in the first image with the processed material image. In this embodiment, the material image can be effectively processed according to the determined compositing parameters (the position and/or region range suitable for displaying the target object in the material image), so that the processed material image adapts to the non-target area, thereby improving the display effect of the generated second image.
In some embodiments, when generating the second image, the electronic device may extract an image containing the non-target area from the first image using the mask image, and then fuse the image containing the non-target area with at least part of the material image to generate the second image.
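A minimal sketch of this fusion step follows, assuming the material image has already been aligned and resized to the first image; the 0/1 blank mask acts as a per-pixel selector (a hard alpha), though softer blending at the edge is equally possible.

```python
import numpy as np

def fuse(first_image: np.ndarray, material: np.ndarray,
         blank_mask: np.ndarray) -> np.ndarray:
    """Keep non-target pixels from the first image and take target pixels
    from the material image to form the second image."""
    alpha = blank_mask.astype(np.float32)[..., None]  # 1 inside the target area
    second = first_image.astype(np.float32) * (1.0 - alpha) + \
             material.astype(np.float32) * alpha
    return np.clip(second, 0, 255).astype(np.uint8)
```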
In some embodiments, the material image may be selected according to the replacement time of the first target image, with different replacement times indicating different target objects and/or tones in the material image. Exemplarily, taking the sky as the target area, if the replacement time is during the day, the target object in the material image may be the sun; if the replacement time is at night, the target object in the material image may be the moon. Exemplarily, still taking the sky as the target area, the brightness of the material image when the replacement time is noon is greater than when the replacement time is dusk. Exemplarily, taking the moon as the target object, if the replacement time is around the fifteenth day of the lunar month the moon is round, and if the replacement time is at the beginning of the lunar month the moon is crescent-shaped.

In some embodiments, when generating the second image, the electronic device may adjust display parameters of the non-target area according to the color information of the target area and the color information of the material image, so as to generate a second image combining the adjusted non-target area and at least part of the material image. Exemplarily, the display parameters include the brightness and/or tone of the non-target area. In this embodiment, the display parameters of the non-target area are coordinated according to the color information of the target area and of the material image, so that the non-target area and the material image are consistent in color tone, lighting, and so on.

In some embodiments, when generating the second image, the electronic device adjusts the target object of the material image according to the recognition result of the non-target area. For example, the size, type, movement speed, and shape-change speed of the target object of the material image are adjusted according to the category recognition result of the non-target area; for example, when it is recognized that the non-target area includes ancient buildings, the type of the material image is adjusted to moving clouds.

In some embodiments, the second image includes multiple frames, and the position of the target object in the second image differs between frames. Exemplarily, the multiple frames form a video clip, and the target object moves in the video clip at a preset speed. The target object includes a first target object and a second target object whose speeds in the video clip differ; for example, the movement speed of the clouds is greater than that of the moon. Neither the speed of the first target object nor that of the second target object in the video clip is zero.

The multiple frames form a video clip in which the shape of the target object changes continuously at a preset speed, for example the change of cloud shapes; different types of target objects have different shape-change speeds.

In some embodiments, the second image includes multiple frames, and the shape of the target object in the second image differs between frames. Exemplarily, the multiple frames form a video clip in which the shape of the target object changes, and the change includes continuous change, for example the change of clouds or the continuity of strokes.

In some embodiments, the state (including shape, category, etc.) of a target object in the target area of the first image is recognized, and a corresponding sticker is added according to the shape to generate the second image. The first image includes multiple frames, and the shape of the target object in the first image differs between frames. The display state of the sticker in the corresponding second image is adjusted according to the shape changes of the target object across different first images; for example, stickers of different shapes adapt to the changes in the shape of the target object. The stickers include strokes, icons, and the like, and the strokes displayed in consecutive second images change continuously.
In one example, the color information of the target area includes the means of the three B, G, and R color channels of the target area, and the color information of the material image includes the means of the three B, G, and R color channels of the material image. The electronic device may determine the brightness of the non-target area according to the ratio of the color information of the material image to the color information of the target area, and/or adjust the tone of the non-target area according to the difference between the color information of the material image and the color information of the target area.
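A minimal sketch of this color coordination follows, assuming the per-channel means of the target area and of the material image have already been computed; the epsilon guard and the simple per-channel gain are illustrative choices, not details specified by the application.

```python
import numpy as np

def harmonize(non_target: np.ndarray, target_means, material_means) -> np.ndarray:
    """Scale the non-target area's B, G, R channels by the ratio of the material
    image's channel means to the target area's channel means."""
    gain = np.asarray(material_means, np.float32) / \
           (np.asarray(target_means, np.float32) + 1e-6)
    adjusted = non_target.astype(np.float32) * gain  # broadcast over B, G, R
    return np.clip(adjusted, 0, 255).astype(np.uint8)
```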
In some embodiments, when generating the second image, the electronic device may perform color transition processing on the edge between the non-target area and at least part of the material image, so as to improve the realism of the second image. In one example, in a sky-replacement scenario, a skyline halo may be added at the edge between the non-sky area and at least part of the sky image to ensure a natural transition between the two.

In some embodiments, none of the above processing steps of the image processing method requires interaction with other devices, and they can all be completed locally by the electronic device alone; further, the electronic device can also perform the above image processing method locally while offline, so the method has wide applicability.

In some embodiments, in the scenario of replacing regions in video frames, for at least two to-be-processed video frames containing a target area in a video frame sequence, the target area in the to-be-processed video frames may also be replaced with a material image in the manner mentioned in the embodiments of the above image processing method. Further, the inventors considered that a video frame sequence has temporal continuity: if the same material image is merely rigidly filled into different video frames according to the positions of the regions to be replaced in those frames, the motion of the different to-be-processed video frames is ignored, which reduces the quality of the replaced video frames.

Therefore, when performing region replacement on at least two to-be-processed video frames containing a target area in the video frame sequence, the influence of the motion of the different to-be-processed video frames on the material image needs to be further considered. Based on this, an embodiment of the present application provides a video processing method that can perform motion estimation on the video frame sequence to acquire the transformation relationship between different to-be-processed video frames, and then transform the material image according to that transformation relationship, so that the motion between different to-be-processed video frames is applied to the material image and the transformed material image can match the different to-be-processed video frames, thereby improving the realism and quality of the replaced video frames.

It should be noted that, for the first to-be-processed video frame containing the target area in the video frame sequence, the above image processing method may be used: mask processing is performed on the target area in that first to-be-processed video frame to acquire a mask image; then, according to the feature information of the mask image, the combination parameters of the material image and the non-target area of the to-be-processed video frame other than the target area are determined; and, according to the combination parameters, the target area is replaced with at least part of the material image, so as to generate a video frame combining the non-target area and at least part of the material image.

For the to-be-processed video frames in the video frame sequence other than the first one containing the target area, the video processing method provided by the embodiments of the present application may be used: motion estimation is performed on the video frame sequence to acquire the transformation relationship between different to-be-processed video frames, the material image is transformed according to that transformation relationship, and the transformed material image is then used to replace the target area in the to-be-processed video frames.

The video processing method provided by the embodiments of the present application may be applied to an electronic device. Exemplarily, the electronic device may include a program that performs the method. Exemplarily, the electronic device includes at least a memory and a processor, the memory stores executable instructions of the method, and the processor may be configured to execute the executable instructions. Exemplarily, the electronic device includes, but is not limited to, computing devices with image processing capability such as remote controls, smartphones/mobile phones, tablet computers, personal digital assistants (PDAs), laptop computers, desktop computers, media content players, video game stations/systems, virtual reality systems, augmented reality systems, and wearable devices (for example, watches, glasses, helmets, or pendants).
Referring to Fig. 7, which is a schematic flowchart of a video processing method provided by an embodiment of the present application, the method may be performed by the electronic device and includes:

In step S201, at least two to-be-processed video frames containing a target area are acquired from a video frame sequence.

In step S202, motion estimation is performed according to the video frame sequence to acquire the transformation relationship between different to-be-processed video frames.

In step S203, a material image for replacing the target area of the to-be-processed video frames is acquired, and the material image is transformed according to the transformation relationship.

In step S204, the target area in the to-be-processed video frames is replaced with the transformed material image, so that the material image adapts to the motion of the different to-be-processed video frames.
In some embodiments, when performing motion estimation on the video frame sequence, if computing resources are sufficient, motion estimation may be performed according to the entire region of each of the at least two to-be-processed video frames. In one example, for two adjacent to-be-processed video frames, feature extraction may be performed on each of them to obtain feature information (such as corner points and edges); feature matching is performed according to the feature information (for example, using the KLT tracking algorithm), and the transformation relationship between the two adjacent to-be-processed video frames is then acquired from the matched feature information.
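The corner-tracking variant just mentioned can be sketched with OpenCV as follows; the feature counts and thresholds are illustrative, and the similarity (partial affine) model is one reasonable choice of transformation, not the only one.

```python
import cv2

def estimate_transform(prev_gray, cur_gray):
    """Track corner features between two adjacent to-be-processed frames with the
    pyramidal Lucas-Kanade (KLT) tracker and fit a 2x3 similarity transform."""
    pts_prev = cv2.goodFeaturesToTrack(prev_gray, maxCorners=500,
                                       qualityLevel=0.01, minDistance=8)
    pts_cur, status, _err = cv2.calcOpticalFlowPyrLK(prev_gray, cur_gray,
                                                     pts_prev, None)
    good = status.ravel() == 1
    matrix, _inliers = cv2.estimateAffinePartial2D(pts_prev[good], pts_cur[good])
    return matrix  # can be applied to the material image via cv2.warpAffine
```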
In other embodiments, in order to improve the efficiency of motion estimation, the electronic device may perform motion estimation according to a partial region of each of the at least two to-be-processed video frames. The to-be-processed video frame contains the target area and the non-target area other than the target area. Exemplarily, the partial region includes at least part of the target area in the to-be-processed video frame; or, the partial region includes at least part of the target area and a part of the non-target area close to the target area in the to-be-processed video frame; or, the partial region includes the part of the to-be-processed video frame on one side of a boundary line, where the area of the target area on that side of the boundary line is larger than its area on the other side.

In a possible implementation, considering that the material image replaces the target area in the to-be-processed video frames, the transformation of the material image is more concerned with the motion between the target areas of different to-be-processed video frames; therefore, the electronic device may perform motion estimation according to at least part of the target area of each of the at least two to-be-processed video frames.

In a possible implementation, considering that in some scenarios the feature information of the target area is relatively sparse (for example, in some cases few features can be extracted from a sky area), which makes the motion estimation result insufficiently accurate, the electronic device may perform motion estimation according to at least part of the target area and a part of the non-target area close to the target area of each of the at least two to-be-processed video frames; the part of the non-target area close to the target area helps increase the feature information available for motion estimation and thus improves the accuracy of the motion estimation result.

Further, considering that the material image replaces the target area in the to-be-processed video frames, the transformation of the material image is more concerned with the motion between the target areas of different to-be-processed video frames; that is, the feature information extracted from the target area should carry a larger weight in the motion estimation process, while the feature information of the part of the non-target area close to the target area may carry a smaller weight. Therefore, during motion estimation, the electronic device may perform motion estimation according to at least part of the target area of each of the at least two to-be-processed video frames and its first weight, and the part of the non-target area close to the target area and its second weight; the first weight is greater than the second weight, the first weight represents the proportion of the feature information of the at least part of the target area in the motion estimation process, and the second weight represents the proportion of the feature information of the part of the non-target area in the motion estimation process. In this embodiment, by increasing the weight of the at least part of the target area during motion estimation, a transformation relationship adapted to the target area can be acquired.

Exemplarily, the part of the non-target area close to the target area is determined according to the edge between the target area and the non-target area in the to-be-processed video frame. As an example, a boundary line may be determined according to the edge between the target area and the non-target area; for example, the boundary line may be a straight line determined from the lowest point of that edge in the target area, and the non-target area is divided by this boundary line to obtain the part of the non-target area close to the target area.

In some embodiments, the edge is a segmentation line or one or more segmentation points thereof.

In a possible implementation, a boundary line may be acquired in the to-be-processed video frame, the area of the target area on one side of the boundary line being larger than on the other side; the electronic device may perform motion estimation using the part on that side of the boundary line (i.e., the part with the larger proportion of the target area), so as to acquire a transformation relationship adapted to the target area.

The boundary line is used to divide the to-be-processed image into two parts, and the part of the to-be-processed video frame on the same side of the boundary line includes part of the target area and part of the non-target area. On one side of the boundary line, the area of the partial target area is larger than the area of the partial non-target area; on the other side, the area of the partial target area is smaller than the area of the partial non-target area.

Exemplarily, the boundary line may be determined according to the edge between the target area and the non-target area. In one example, the boundary line may be determined according to one of the edge points of that edge; for example, the edge point includes the lowest point of the target area, and the white straight line shown in Fig. 8 is the boundary line determined from the lowest point. Of course, the boundary line may also be determined from other edge points, and this embodiment places no restriction on this. In some embodiments, the edge point is a segmentation point.

In a possible implementation, in order to further improve the efficiency of motion estimation, the boundary line between the target area and the non-target area in the to-be-processed video frame may be acquired, and motion estimation may be performed according to the boundary lines in at least two to-be-processed video frames. In this embodiment, using the boundary line for motion estimation can effectively reduce the amount of computation and improve estimation efficiency. The boundary line may be determined according to the edge between the target area and the non-target area in the to-be-processed video frame; in one example, the boundary line may consist of that edge; in another example, the boundary line may be determined according to one of the edge points of the edge between the target area and the non-target area, for example a straight line segment formed from that edge point.
Further, in order to improve accuracy during motion estimation, the boundary line may be dilated to acquire a dilated boundary line, and motion estimation may be performed according to the dilated boundary lines in at least two to-be-processed video frames. In this embodiment, dilating the boundary line can effectively increase the feature information available for motion estimation, thereby improving the accuracy of the motion estimation result.
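A minimal sketch of this dilation step follows, assuming a binary image of the boundary line is available; restricting feature detection to the dilated band is one way to realize motion estimation according to the dilated boundary line, and the kernel size is illustrative.

```python
import cv2
import numpy as np

boundary = cv2.imread("boundary_line.png", cv2.IMREAD_GRAYSCALE)  # line pixels > 0
prev_gray = cv2.imread("prev_frame.png", cv2.IMREAD_GRAYSCALE)

band = cv2.dilate(boundary, np.ones((15, 15), np.uint8))  # widen the boundary line

# Detect features only inside the dilated band, so motion estimation uses
# more feature information while remaining local to the boundary.
pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=300, qualityLevel=0.01,
                              minDistance=8, mask=band)
```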
In some embodiments, in order to prevent large motion between to-be-processed video frames from exceeding the display range of the material image, the display range of the material image may be adapted to the motion of the different to-be-processed video frames; for example, the material image is composited from multiple images captured at different angles, so that no matter how large the motion of the different to-be-processed video frames is, the display range of the material image can accommodate it.

In one example, the material image includes a sky image, and the material image is stored in the form of a sky bounding box or a hemisphere.

In some embodiments, in a real-time video playback scenario (such as live streaming), the process of performing motion estimation on two adjacent to-be-processed video frames (for example, the current to-be-processed video frame and the previous one) to acquire the transformation relationship may run in step with the playback of the video frame sequence, so the transformation relationship cannot be applied to the current to-be-processed video frame; the transformation relationship may then be used to transform the material image corresponding to the next to-be-processed video frame, balancing the transformation of the material image against the real-time playback of the video frames.

In other embodiments, in a video editing scenario where real-time playback is not required, the transformation relationship acquired by performing motion estimation on two adjacent to-be-processed video frames can be applied directly to the current to-be-processed video frame; this transformation relationship may be used to transform the material image corresponding to the current to-be-processed video frame, ensuring that the transformed material image adapts to the current to-be-processed video frame.

The processing of the to-be-processed video frames may refer to the above processing of the second image, which is not repeated here.
In some embodiments, in the above image processing process, the mask image may be obtained through an image segmentation model. The image segmentation models provided in the related art include, but are not limited to, semantic segmentation models, instance segmentation models, or panoptic segmentation models. The inventors found that when the size of the image input into the image segmentation model is small, the edges of the images segmented by the above segmentation models are relatively rough; for example, the outer contour of a building is rectangular, but the corners (outer contour) output by the image segmentation model are rounded. As another example, leaves have holes, but in the segmentation map output by the image segmentation model the location of the leaves is a solid region, and no edges are detected for the actual holes. Segmentation maps output in this way lead to lower accuracy in subsequent edge-based image processing results.

Based on this, an embodiment of the present application provides a training method for an image segmentation model, which adds an edge constraint during the training of the image segmentation model and strengthens the model's learning of edges, so that the edges of the images segmented by the trained model are finer.

Referring to Fig. 9, which is a schematic flowchart of the training method for an image segmentation model provided by an embodiment of the present application, the training method may be performed by an electronic device including, but not limited to, computing devices such as desktop computers, notebooks, palmtop computers, wearable devices, servers, cloud servers, and mobile phones. The method includes:

In step S301, a training image is input into the image segmentation model to acquire a segmentation prediction map.

In step S302, edge detection is performed on the segmentation annotation map corresponding to the training image to acquire an edge map.

In step S303, mask processing is performed on the segmentation annotation map and the segmentation prediction map respectively using the edge map, to obtain a first mask image and a second mask image.

In step S304, the parameters of the image segmentation model are adjusted at least according to the difference between the first mask image and the second mask image, to obtain a trained image segmentation model.
In some embodiments, the image segmentation model may be trained in a supervised manner. During training, a number of training images containing a target area are acquired, each training image corresponding to a segmentation annotation map, which is a ground-truth map in which the target area has been annotated in advance.

In one example, assume the target area is a sky area and the image segmentation model is used to segment the sky area from the non-sky area. As shown in Fig. 10A and Fig. 10B, Fig. 10A shows one of the training images and Fig. 10B shows the segmentation annotation map corresponding to that training image, in which the target area and the non-target area are displayed in different colors.

After acquiring the training image, the electronic device inputs it into a preset image segmentation model to acquire the segmentation prediction map output by the model. It can be understood that the embodiments of the present application do not involve any improvement to the structure of the image segmentation model and place no restriction on its specific type, which may be set according to the actual application scenario. In one example, the image segmentation model includes, but is not limited to, a semantic segmentation model, an instance segmentation model, or a panoptic segmentation model; a semantic segmentation model classifies all pixels in an image; an instance segmentation model is a combination of object detection and semantic segmentation and can distinguish different individuals of the same type of object in an image; a panoptic segmentation model is a combination of semantic segmentation and instance segmentation and can detect and segment all objects in an image, including the background.

In some embodiments, the electronic device may perform edge detection on the segmentation annotation map corresponding to the training image to acquire an edge map. It can be understood that this embodiment places no restriction on the edge detection algorithm used, which may be selected according to the actual application scenario; for example, edge detection may be performed on the segmentation annotation map using the Canny, Sobel, Prewitt, or Roberts edge detection algorithm, among others.
Considering that after edge detection on the segmentation annotation map the detected edge is usually quite thin, possibly occupying only one or two pixels of the map, in order to further highlight the detected edge and improve the fineness of the model's edge detection, the detected edge may be dilated after performing edge detection on the segmentation annotation map corresponding to the training image, so as to acquire the dilated edge map. In one example, in the embodiment shown in Fig. 10A and Fig. 10B, edge detection is performed on the segmentation annotation map shown in Fig. 10B and the detected edge is dilated, yielding the dilated edge map shown in Fig. 10C.
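A minimal sketch of this edge-detection-plus-dilation step follows, using Canny as the detector; the thresholds and the 5x5 kernel are illustrative, and any of the other detectors named above could replace Canny.

```python
import cv2
import numpy as np

def make_edge_map(annotation_map: np.ndarray) -> np.ndarray:
    """Detect the edge of the segmentation annotation map, then dilate it so the
    one- or two-pixel-wide edge becomes a clearly visible band."""
    edges = cv2.Canny(annotation_map, 50, 150)
    return cv2.dilate(edges, np.ones((5, 5), np.uint8), iterations=1)
```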
Next, in step S303, after acquiring the edge map, the electronic device may perform mask processing on the segmentation annotation map using the edge map to obtain a first mask image, which is used to highlight the edge in the segmentation annotation map (the edge between the target area and the non-target area); and, after acquiring the segmentation prediction map, the electronic device may perform mask processing on the segmentation prediction map using the edge map to obtain a second mask image, which is used to highlight the edge in the segmentation prediction map (the edge between the target area and the non-target area).

In the edge map, edges and non-edges are distinguished by different colors; for example, in Fig. 10C the edges in the edge map are displayed in white (255, 255, 255) and the non-edges are displayed in black (0, 0, 0).

In one example, taking the case where the edges in the edge map are displayed in white and the non-edges in black, the mask processing may consist of multiplying the segmentation annotation map or the segmentation prediction map by the edge map: the first mask image is the product of the segmentation annotation map and the edge map, in which the non-edge part is black (0, 0, 0) and the edge part is non-zero, so the edge part is highlighted; correspondingly, the second mask image is the product of the segmentation prediction map and the edge map, in which the non-edge part is black (0, 0, 0) and the edge part is non-zero, so the edge part is highlighted.

In some embodiments, after acquiring the first mask image and the second mask image, the electronic device may adjust the parameters of the image segmentation model at least according to the difference between the first mask image and the second mask image, to obtain a trained image segmentation model. In this embodiment, during the training of the image segmentation model, the difference between the edge-related first mask image and second mask image is used to optimize the model, which strengthens the model's learning of edges, so that the edges of images segmented by the trained model are finer.
Exemplarily, on top of the related art, where the image segmentation model is optimized according to the difference between the segmentation prediction map and the segmentation annotation map, this embodiment adds a term concerning the difference between the first mask image and the second mask image. During training, the loss function used to optimize the image segmentation model includes a part indicating the difference between the first mask image and the second mask image and a part indicating the difference between the segmentation prediction map and the segmentation annotation map; the electronic device may adjust the parameters of the image segmentation model according to the difference between the segmentation prediction map and the segmentation annotation map as well as the difference between the first mask image and the second mask image, to obtain a trained image segmentation model. The image segmentation model trained in this embodiment places no requirement on the size of the input image; no matter how small the input image is, it can obtain a segmentation map finer than those of the image segmentation models in the related art.
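A minimal sketch of such a combined objective in PyTorch follows, assuming pred is a sigmoid probability map, label a 0/1 annotation map, and edge_map the 0/1 dilated edge map, all of the same shape; the choice of BCE and L1 terms and the weighting factor are illustrative, not specified by the application.

```python
import torch
import torch.nn.functional as F

def training_loss(pred: torch.Tensor, label: torch.Tensor,
                  edge_map: torch.Tensor, edge_weight: float = 1.0) -> torch.Tensor:
    """Segmentation term on the whole maps plus an edge term on the first and
    second mask images (annotation and prediction multiplied by the edge map)."""
    seg_term = F.binary_cross_entropy(pred, label)
    first_mask = label * edge_map    # masked annotation map
    second_mask = pred * edge_map    # masked prediction map
    edge_term = F.l1_loss(second_mask, first_mask)
    return seg_term + edge_weight * edge_term
```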
In one example, referring to Fig. 1, Fig. 5A and Fig. 11: inputting the image shown in Fig. 1 into an image segmentation model of the related art (without strengthened edge learning) for segmentation yields the segmentation map shown in Fig. 11, while inputting the image shown in Fig. 1 into the image segmentation model trained by the embodiment of the present application (with strengthened edge learning) yields the segmentation map shown in Fig. 5A; by comparison, the edges of the segmentation map in Fig. 5A are finer than those of the segmentation map in Fig. 11.

Correspondingly, referring to Fig. 12, an embodiment of the present application further provides an electronic device 40, including:

a memory for storing executable instructions;

one or more processors;

wherein, when the one or more processors execute the executable instructions, they are individually or collectively configured to perform any one of the above methods.

The processor 41 executes the executable instructions included in the memory 42. The processor 41 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. A general-purpose processor may be a microprocessor, or any conventional processor, or the like.

The memory 42 stores executable instructions of the image processing method, the video processing method, or the training method of the image segmentation model. The memory 42 may include at least one type of storage medium, including flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disc, and so on. Moreover, the device may cooperate with a network storage device that performs the storage function of the memory through a network connection. The memory 42 may be an internal storage unit of the device 40, such as a hard disk or internal memory of the device 40. The memory 42 may also be an external storage device of the device 40, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the device 40. Further, the memory 42 may include both an internal storage unit and an external storage device of the device 40. The memory 42 is used to store the computer program 44 and other programs and data required by the device, and may also be used to temporarily store data that has been output or is to be output.

The various implementations described herein can be implemented using a computer-readable medium such as computer software, hardware, or any combination thereof. For a hardware implementation, the implementations described herein can be implemented using at least one of application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), processors, controllers, microcontrollers, microprocessors, and electronic units designed to perform the functions described herein. For a software implementation, an embodiment such as a procedure or a function may be implemented with a separate software module that allows at least one function or operation to be performed. The software codes can be implemented by a software application (or program) written in any suitable programming language, and may be stored in a memory and executed by a controller.

The electronic device 40 may be a computing device such as a desktop computer, a notebook, a palmtop computer, a server, a cloud server, or a mobile phone. The device may include, but is not limited to, a processor 41 and a memory 42. Those skilled in the art can understand that Fig. 4 is only an example of the electronic device 40 and does not constitute a limitation on it; the device may include more or fewer components than shown, or combine certain components, or have different components; for example, the device may also include input and output devices, network access devices, buses, and so on.

The implementation of the functions and roles of each unit in the above device is detailed in the implementation of the corresponding steps in the above methods, and is not repeated here.

In an exemplary embodiment, there is also provided a non-transitory computer-readable storage medium including instructions, such as a memory including instructions, which are executable by a processor of an apparatus to perform the above methods. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.

A non-transitory computer-readable storage medium which, when instructions in the storage medium are executed by a processor of a terminal, enables the terminal to execute the above methods.

In an exemplary embodiment, there is also provided a computer program product, including the computer program of any one of the above methods.

It should be noted that, herein, relational terms such as first and second are only used to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. The terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device including a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article, or device that includes that element.

The method and device provided by the embodiments of the present application have been introduced in detail above. Specific examples have been used herein to explain the principles and implementations of the present application, and the description of the above embodiments is only intended to help understand the method of the present application and its core idea. Meanwhile, for those of ordinary skill in the art, there will be changes in the specific implementation and scope of application according to the idea of the present application. In summary, the content of this specification should not be construed as a limitation on the present application.

Claims (44)

  1. An image processing method, characterized by comprising:
    acquiring a first image containing a target area and a material image to replace the target area;
    performing mask processing on the target area in the first image to acquire a mask image;
    determining, according to feature information of the mask image, combination parameters of the material image and a non-target area of the first image other than the target area;
    replacing, according to the combination parameters, the target area in the first image with at least part of the material image, so as to generate a second image combining the non-target area and at least part of the material image.
  2. The method according to claim 1, wherein the mask image includes a blank area indicating the target area and the non-target area.
  3. The method according to claim 1 or 2, wherein the feature information includes at least one of the following: size information of the non-target area in the mask image, position information of a blank area indicating the target area, or edge information between the blank area and the non-target area.
  4. The method according to any one of claims 1 to 3, wherein the combination parameters include at least one of the following:
    an edge alignment parameter, or a position or size at which a target object in the material image is to be displayed in the second image; wherein the edge alignment parameter is used to align the edge between the non-target area and the target area with the edge of the material image.
  5. The method according to claim 1, wherein the determining, according to feature information of the mask image, combination parameters of the material image and a non-target area of the first image other than the target area comprises:
    determining, in the mask image according to position information of a blank area indicating the target area, a position and/or region range suitable for displaying a target object in the material image.
  6. The method according to claim 5, wherein the area of the blank area contained in the region range corresponding to the position is greater than a preset value, or the region range corresponding to the position contains only the blank area.
  7. The method according to claim 5, wherein the determining, in the mask image according to position information of the blank area indicating the target area, a position range suitable for displaying the target object in the material image comprises:
    acquiring one or more search boxes related to the target object, and traversing the mask image with the one or more search boxes at a preset stride;
    determining, according to the position information of the blank area in the mask image, the area of the blank area contained in each sub-region traversed by the search box;
    if the area of the blank area contained in a sub-region is greater than a preset value, determining the sub-region to be a position and/or region range suitable for displaying the target object.
  8. The method according to claim 7, wherein the one or more search boxes are used to traverse at least part of the blank area and/or at least part of the non-target area in the mask image at a preset stride;
    wherein the at least part of the non-target area includes a part of the non-target area close to the target area.
  9. The method according to claim 7, wherein the search box is used to traverse the mask image starting from a preset position, at a preset stride and in a preset direction;
    wherein the preset position and/or the preset direction is determined according to the type of the target object.
  10. The method according to claim 7, wherein the determining the sub-region to be a position range suitable for displaying the target object if the area of the blank area contained in the sub-region is greater than a preset value comprises:
    if the areas of the blank areas contained in multiple sub-regions are all greater than the preset value, acquiring a target sub-region according to the respective positions of the multiple sub-regions in the mask image and preset position priorities, and determining the target sub-region to be the position and/or region range suitable for displaying the target object.
  11. The method according to claim 10, wherein, among the preset position priorities, a position located in the middle of the mask image has a higher priority than other positions.
  12. The method according to claim 7, wherein the display size of the search box is greater than or equal to the size at which the target object is to be displayed in the second image;
    wherein the size at which the target object is to be displayed in the second image is determined according to size information of the non-target area.
  13. The method according to claim 5, wherein the replacing, according to the combination parameters, the target area in the first image with at least part of the material image comprises:
    processing the material image according to the position and/or region range suitable for displaying the target object in the material image;
    replacing the target area in the first image with the processed material image.
  14. The method according to claim 13, wherein the processing performed on the material image includes cropping, scaling, and/or rotation.
  15. The method according to claim 1, wherein the generating a second image combining the non-target area and at least part of the material image comprises:
    extracting an image containing the non-target area from the first image using the mask image;
    fusing the image containing the non-target area with at least part of the material image to generate the second image.
  16. The method according to claim 1, wherein the mask image is obtained by classifying each pixel in the first image using a preset image segmentation model and processing the pixels belonging to the target area.
  17. The method according to claim 1, wherein the mask image is an image that has been filtered by an edge-preserving filter.
  18. The method according to claim 1, wherein the material image is selected according to the replacement time of the first target image, and different replacement times indicate different target objects, tones, and/or brightness of the material image.
  19. The method according to claim 1, further comprising:
    adjusting display parameters of the non-target area according to color information of the target area and color information of the material image, so as to generate a second image combining the adjusted non-target area and at least part of the material image.
  20. The method according to claim 19, wherein the display parameters include the brightness and/or tone of the non-target area.
  21. The method according to claim 1, further comprising:
    performing color transition processing on the edge between the non-target area and at least part of the material image.
  22. The method according to any one of claims 1 to 21, wherein the target area includes a sky area and the material image includes a sky image.
  23. The method according to claim 1, wherein the first image is the first video frame containing a target area in a video frame sequence;
    the method further comprising:
    acquiring at least two to-be-processed video frames containing the target area from the video frame sequence, wherein each of the at least two to-be-processed video frames is a video frame in the video frame sequence, other than the first, that contains the target area;
    performing motion estimation according to the video frame sequence to acquire a transformation relationship between different to-be-processed video frames;
    performing transformation processing on the material image according to the transformation relationship;
    replacing the target area in the to-be-processed video frames with the transformed material image, so that the material image adapts to the motion of the different to-be-processed video frames.
  24. A video processing method, characterized by comprising:
    acquiring at least two to-be-processed video frames containing a target area from a video frame sequence;
    performing motion estimation according to the video frame sequence to acquire a transformation relationship between different to-be-processed video frames;
    acquiring a material image for replacing the target area of the to-be-processed video frames, and performing transformation processing on the material image according to the transformation relationship;
    replacing the target area in the to-be-processed video frames with the transformed material image, so that the material image adapts to the motion of the different to-be-processed video frames.
  25. The method according to claim 24, wherein the performing motion estimation according to the video frame sequence comprises:
    performing motion estimation according to all or a partial region of each of the at least two to-be-processed video frames.
  26. The method according to claim 25, wherein the to-be-processed video frame contains the target area and a non-target area other than the target area;
    the partial region includes at least part of the target area in the to-be-processed video frame; or,
    the partial region includes at least part of the target area and a part of the non-target area close to the target area in the to-be-processed video frame; or,
    the partial region includes the part of the to-be-processed video frame on one side of a boundary line, the area of the target area on that side of the boundary line being larger than its area on the other side.
  27. The method according to claim 25, wherein the to-be-processed video frame contains the target area and a non-target area other than the target area;
    the performing motion estimation according to a partial region of each of the at least two to-be-processed video frames comprises:
    performing motion estimation according to at least part of the target area of each of the at least two to-be-processed video frames and a first weight thereof, and a part of the non-target area close to the target area and a second weight thereof; wherein the first weight is greater than the second weight.
  28. The method according to claim 26 or 27, wherein the part of the non-target area close to the target area is determined according to the edge between the target area and the non-target area in the to-be-processed video frame.
  29. The method according to claim 26, wherein the part of the to-be-processed video frame on the same side of the boundary line includes part of the target area and part of the non-target area.
  30. The method according to claim 26, wherein the boundary line is determined according to one of the edge points of the edge between the target area and the non-target area.
  31. The method according to claim 30, wherein the one edge point includes the lowest point of the target area.
  32. The method according to claim 24, wherein the to-be-processed video frame contains the target area and a non-target area other than the target area;
    the performing motion estimation according to the video frame sequence comprises:
    acquiring the boundary line between the target area and the non-target area in the to-be-processed video frame, and performing motion estimation according to the boundary lines in at least two to-be-processed video frames.
  33. The method according to claim 32, wherein the performing motion estimation according to the boundary lines in at least two to-be-processed video frames comprises:
    performing dilation processing on the boundary line to acquire a dilated boundary line;
    performing motion estimation according to the dilated boundary lines in at least two to-be-processed video frames.
  34. The method according to claim 24, wherein the display range of the material image is adapted to the motion of the different to-be-processed video frames.
  35. The method according to claim 24 or 34, wherein the material image is composited from multiple images, the capture angles of the multiple images being different.
  36. The method according to claim 35, wherein the material image includes a sky image, and the material image is stored in the form of a sky bounding box or a hemisphere.
  37. The method according to claim 24, wherein each of the at least two to-be-processed video frames is a video frame in the video frame sequence, other than the first, that contains the target area;
    the method further comprising:
    for the first to-be-processed video frame containing the target area in the video frame sequence:
    performing mask processing on the target area in the first to-be-processed video frame to acquire a mask image;
    determining, according to feature information of the mask image, combination parameters of the material image and a non-target area of the first to-be-processed video frame other than the target area;
    replacing, according to the combination parameters, the target area in the first to-be-processed video frame with at least part of the material image, so as to generate a video frame combining the non-target area and at least part of the material image.
  38. A training method for an image segmentation model, characterized by comprising:
    inputting a training image into the image segmentation model to acquire a segmentation prediction map;
    performing edge detection on the segmentation annotation map corresponding to the training image to acquire an edge map;
    performing mask processing on the segmentation annotation map and the segmentation prediction map respectively using the edge map, to obtain a first mask map and a second mask map;
    adjusting the parameters of the image segmentation model at least according to the difference between the first mask map and the second mask map, to obtain a trained image segmentation model.
  39. The method according to claim 38, wherein the performing edge detection on the segmentation annotation map corresponding to the training image to acquire an edge map comprises:
    after performing edge detection on the segmentation annotation map corresponding to the training image, performing dilation processing on the detected edge to acquire the edge map.
  40. The method according to claim 38, wherein the adjusting the parameters of the image segmentation model at least according to the difference between the first mask map and the second mask map to obtain a trained image segmentation model comprises:
    adjusting the parameters of the image segmentation model according to the difference between the segmentation prediction map and the segmentation annotation map, as well as the difference between the first mask map and the second mask map, to obtain the trained image segmentation model.
  41. The method according to claim 38, wherein the training image includes a target area;
    and the segmentation annotation map corresponding to the training image annotates the target area.
  42. An electronic device, characterized by comprising:
    a memory for storing executable instructions;
    one or more processors;
    wherein, when the one or more processors execute the executable instructions, they are individually or collectively configured to perform the method of any one of claims 1 to 41.
  43. A computer-readable storage medium, characterized in that the computer-readable storage medium stores executable instructions which, when executed by a processor, implement the method of any one of claims 1 to 41.
  44. A computer program product, characterized by comprising the computer program of the method of any one of claims 1 to 41.
PCT/CN2021/119186 2021-09-17 2021-09-17 图像处理方法、视频处理方法、训练方法、设备、程序产品及存储介质 WO2023039865A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/119186 WO2023039865A1 (zh) 2021-09-17 2021-09-17 图像处理方法、视频处理方法、训练方法、设备、程序产品及存储介质

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/119186 WO2023039865A1 (zh) 2021-09-17 2021-09-17 图像处理方法、视频处理方法、训练方法、设备、程序产品及存储介质

Publications (1)

Publication Number Publication Date
WO2023039865A1 true WO2023039865A1 (zh) 2023-03-23

Family

ID=85602343

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/119186 WO2023039865A1 (zh) 2021-09-17 2021-09-17 图像处理方法、视频处理方法、训练方法、设备、程序产品及存储介质

Country Status (1)

Country Link
WO (1) WO2023039865A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103716534A (zh) * 2012-10-09 2014-04-09 三星电子株式会社 摄像装置及合成图像的方法
CN111127307A (zh) * 2019-12-09 2020-05-08 上海传英信息技术有限公司 图像处理方法、装置、电子设备及计算机可读存储介质
CN111292337A (zh) * 2020-01-21 2020-06-16 广州虎牙科技有限公司 图像背景替换方法、装置、设备及存储介质
CN112101320A (zh) * 2020-11-18 2020-12-18 北京世纪好未来教育科技有限公司 模型训练方法、图像生成方法、装置、设备及存储介质

Similar Documents

Publication Publication Date Title
US11601630B2 (en) Video processing method, electronic device, and non-transitory computer-readable medium
US11595737B2 (en) Method for embedding advertisement in video and computer device
US10074161B2 (en) Sky editing based on image composition
CN107452010B (zh) 一种自动抠图算法和装置
Vaquero et al. A survey of image retargeting techniques
RU2587425C2 (ru) Способ получения карты глубины изображения повышенного качества
WO2021213067A1 (zh) 物品显示方法、装置、设备及存储介质
Ni et al. Learning to photograph: A compositional perspective
US11704357B2 (en) Shape-based graphics search
US11625871B2 (en) System and method for capturing and interpreting images into triple diagrams
CN111553923B (zh) 一种图像处理方法、电子设备及计算机可读存储介质
US11978216B2 (en) Patch-based image matting using deep learning
CN110909724A (zh) 一种多目标图像的缩略图生成方法
WO2012153744A1 (ja) 情報処理装置、情報処理方法および情報処理プログラム
Han et al. Circular array targets detection from remote sensing images based on saliency detection
WO2023039865A1 (zh) 图像处理方法、视频处理方法、训练方法、设备、程序产品及存储介质
Li et al. Aggregating complementary boundary contrast with smoothing for salient region detection
Jia et al. Image-based label placement for augmented reality browsers
Garg et al. A survey on visual saliency detection and computational methods
Shankar et al. A novel semantics and feature preserving perspective for content aware image retargeting
WO2021008322A1 (zh) 图像处理方法、装置及设备
Dickman et al. Smart Scaling: A Hybrid Deep-Learning Approach to Content-Aware Image Retargeting
WO2023193648A1 (zh) 一种图像处理方法、装置、电子设备和存储介质
Zavalishin et al. Visually aesthetic image contrast enhancement
Dickman Smart Resizing: A Hybrid Deep-Learning Approach to Content-Aware and Selective Image Retargeting

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
    Ref document number: 21957145
    Country of ref document: EP
    Kind code of ref document: A1