CN112203024B - Matting method, device, equipment and storage medium - Google Patents

Matting method, device, equipment and storage medium

Info

Publication number
CN112203024B
Authority
CN
China
Prior art keywords
pixel point
image
color
background image
pixel
Prior art date
Legal status
Active
Application number
CN202011255943.0A
Other languages
Chinese (zh)
Other versions
CN112203024A
Inventor
Zhu Yurong (朱玉荣)
Huang Jianchao (黄建超)
Current Assignee
Anhui Wenxiang Technology Co ltd
Original Assignee
Anhui Wenxiang Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Anhui Wenxiang Technology Co ltd filed Critical Anhui Wenxiang Technology Co ltd
Publication of CN112203024A
Application granted
Publication of CN112203024B
Legal status: Active

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/222Studio circuitry; Studio devices; Studio equipment
    • H04N5/262Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
    • H04N5/272Means for inserting a foreground image in a background image, i.e. inlay, outlay

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of the present application disclose a matting method, device, equipment, and storage medium. The method includes: acquiring a background image and an image to be segmented for a static scene, where the background image does not include the target object and the pixel points in the background image have corresponding color thresholds; calculating a first color distance between a first pixel point in the image to be segmented and the corresponding second pixel point in the background image, the first pixel point being any pixel point in the image to be segmented; when the first color distance is not smaller than the color threshold corresponding to the second pixel point, determining the first pixel point to be a target pixel point, i.e., a pixel point of the target object; and segmenting the target object from the image to be segmented according to the target pixel points. The method segments the target object in a static scene without requiring an additional purpose-built background, reducing both the complexity and the cost of virtual matting.

Description

Matting method, device, equipment and storage medium
Technical Field
The present application relates to the field of data processing, and in particular to a matting method, device, equipment, and storage medium.
Background
Virtual matting can be used to key out a target object from an image of one scene and combine the keyed-out object with images of other scenes to generate a new image. It is widely applied in many settings, such as the recording of teaching videos and the shooting of films and television dramas.
At present, the main approach to virtual matting is as follows: build a specific background, such as a blue-green screen, record a video of the target object against that background, and then matte the target object out of the background so that it can be composited with the desired scene images. However, this existing scheme requires constructing a specific background, which increases both the complexity and the cost of matting.
Disclosure of Invention
To solve the above technical problem, the present application provides a matting method, device, equipment, and storage medium that can segment a target object in a static scene without building an additional background, reducing both the complexity and the cost of virtual matting.
The embodiment of the application discloses the following technical scheme:
in one aspect, an embodiment of the present application provides a matting method, where the method includes:
Acquiring a background image and an image to be segmented aiming at a static scene; wherein the background image does not include a target object, and pixel points in the background image have corresponding color thresholds;
calculating a first color distance between a first pixel point in the image to be segmented and a corresponding second pixel point in the background image; the first pixel point is any pixel point in the image to be segmented;
when the first color distance is not smaller than a color threshold value corresponding to the second pixel point, determining the first pixel point as a target pixel point, wherein the target pixel point is a pixel point in the target object;
and segmenting the target object from the image to be segmented according to the target pixel points.
Optionally, the background image is any one of the first N frame images of the target video, or the background image is obtained by processing at least part of the first N frame images of the target video.
Optionally, when the background image is any one frame image in the previous N frame images of the target video, the color threshold corresponding to the second pixel point is determined by the following manner:
determining a second color distance between a third pixel point in each frame of image in the previous N frames of images and the second pixel point, wherein the third pixel point corresponds to the second pixel point;
And determining a color threshold corresponding to the second pixel point according to the second color distance.
Optionally, when the background image is obtained by processing at least part of the first N frame images of the target video, the color value of the second pixel point and the color threshold corresponding to the second pixel point are determined in the following manner:
calculating a third color distance between a third pixel point in the n-th frame image and a second pixel point in the (n-1)-th background image; the (n-1)-th background image is the background image corresponding to the (n-1)-th frame image, and the third pixel point corresponds to the second pixel point;
if the third color distance is larger than the color threshold corresponding to the second pixel point, updating the color threshold corresponding to the second pixel point to the third color distance, and updating the color value of the second pixel point according to the color value of the third pixel point and the color value of the second pixel point; otherwise, keeping the color value of the second pixel point and its color threshold unchanged;
and if the third color distance corresponding to each pixel point in the n-th frame image is smaller than or equal to its color threshold, taking the (n-1)-th background image corresponding to the (n-1)-th frame image as the background image, where the color threshold corresponding to each pixel point in the background image is the color threshold corresponding to that pixel point in the (n-1)-th background image.
Optionally, the image to be segmented is the (N+1)-th frame image in the target video.
Optionally, if the image to be segmented is the M-th frame image in the target video, where M > N+1, the color threshold corresponding to the second pixel point is determined according to a fourth color distance between the second pixel point and a sixth pixel point, where the sixth pixel point is the pixel point corresponding to the second pixel point in a third image, and the third image is the (M-1)-th frame image in the target video.
Optionally, the color threshold corresponding to the second pixel point is determined by:
determining whether the fourth color distance is larger than a color threshold corresponding to the second pixel point;
if yes, updating the color threshold value corresponding to the second pixel point to be the fourth color distance;
if not, the color threshold value corresponding to the second pixel point is not updated.
Optionally, the method further comprises:
adding a transparency channel to the first pixel point according to the first color distance, wherein when the first color distance is larger, the transparency channel added to the first pixel point is smaller;
processing the image to be segmented through a bidirectional anisotropic filtering algorithm according to the transparency channel corresponding to the first pixel point;
The segmenting of the target object from the image to be segmented according to the target pixel points includes:
segmenting the target object, according to the target pixel points, from the image to be segmented as processed by the bidirectional anisotropic filtering algorithm.
In another aspect, an embodiment of the present application provides a matting apparatus, including:
an acquisition unit configured to acquire a background image and an image to be segmented for a stationary scene; wherein the background image does not include a target object, and pixel points in the background image have corresponding color thresholds;
the computing unit is used for computing the first color distance between the first pixel point in the image to be segmented and the corresponding second pixel point in the background image; the first pixel point is any pixel point in the image to be segmented;
a determining unit, configured to determine, when the first color distance is not less than a color threshold corresponding to the second pixel, that the first pixel is a target pixel, where the target pixel is a pixel in the target object;
and the segmentation unit is used for segmenting the target object from the image to be segmented according to the target pixel point.
Optionally, the background image is any one of the first N frame images of the target video, or the background image is obtained by processing at least part of the first N frame images of the target video.
Optionally, when the background image is any one frame image in the previous N frame images of the target video, the color threshold corresponding to the second pixel point is determined by the following manner:
determining a second color distance between a third pixel point in each frame of image in the previous N frames of images and the second pixel point, wherein the third pixel point corresponds to the second pixel point;
and determining a color threshold corresponding to the second pixel point according to the second color distance.
Optionally, when the background image is obtained by processing at least part of the first N frame images of the target video, the color value of the second pixel point and the color threshold corresponding to the second pixel point are determined in the following manner:
calculating a third color distance between a third pixel point in the n-th frame image and a second pixel point in the (n-1)-th background image; the (n-1)-th background image is the background image corresponding to the (n-1)-th frame image, and the third pixel point corresponds to the second pixel point;
if the third color distance is larger than the color threshold corresponding to the second pixel point, updating the color threshold corresponding to the second pixel point to the third color distance, and updating the color value of the second pixel point according to the color value of the third pixel point and the color value of the second pixel point; otherwise, keeping the color value of the second pixel point and its color threshold unchanged;
and if the third color distance corresponding to each pixel point in the n-th frame image is smaller than or equal to its color threshold, taking the (n-1)-th background image corresponding to the (n-1)-th frame image as the background image, where the color threshold corresponding to each pixel point in the background image is the color threshold corresponding to that pixel point in the (n-1)-th background image.
Optionally, the image to be segmented is the (N+1)-th frame image in the target video.
Optionally, if the image to be segmented is the M-th frame image in the target video, where M > N+1, the color threshold corresponding to the second pixel point is determined according to a fourth color distance between the second pixel point and a sixth pixel point, where the sixth pixel point is the pixel point corresponding to the second pixel point in a third image, and the third image is the (M-1)-th frame image in the target video.
Optionally, the color threshold corresponding to the second pixel point is determined by:
determining whether the fourth color distance is larger than a color threshold corresponding to the second pixel point;
if yes, updating the color threshold value corresponding to the second pixel point to be the fourth color distance;
if not, the color threshold value corresponding to the second pixel point is not updated.
Optionally, the determining unit is further specifically configured to:
adding a transparency channel to the first pixel point according to the first color distance, wherein when the first color distance is larger, the transparency channel added to the first pixel point is smaller;
processing the image to be segmented through a bidirectional anisotropic filtering algorithm according to the transparency channel corresponding to the first pixel point;
segmenting the target object, according to the target pixel points, from the image to be segmented as processed by the bidirectional anisotropic filtering algorithm.
In another aspect, an embodiment of the present application provides a matting device, where the device includes a processor and a memory:
the memory is used for storing a computer program and transmitting the computer program to the processor;
the processor is configured to execute, according to instructions in the computer program, the matting method described in any one of the above aspects.
In another aspect, an embodiment of the present application provides a computer-readable storage medium for storing a computer program, where the computer program is configured to perform the matting method described in any one of the above aspects.
As can be seen from the above technical solution, the matting method includes: acquiring a background image and an image to be segmented for a static scene, where the background image does not include the target object and the pixel points in the background image have corresponding color thresholds; calculating a first color distance between a first pixel point in the image to be segmented and the corresponding second pixel point in the background image, the first pixel point being any pixel point in the image to be segmented; when the first color distance is not smaller than the color threshold corresponding to the second pixel point, determining the first pixel point to be a target pixel point, i.e., a pixel point of the target object; and segmenting the target object from the image to be segmented according to the target pixel points. The method can segment the target object in a static scene without building an additional background, thereby reducing both the complexity and the cost of virtual matting.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings required in the description of the embodiments or of the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and a person skilled in the art may derive other drawings from them without inventive effort.
Fig. 1 is a flowchart of a matting method provided in an embodiment of the present application;
fig. 2 is a schematic diagram of a background image and an image to be segmented according to an embodiment of the present application;
FIG. 3a is a flowchart of a method for determining a color threshold according to an embodiment of the present application;
fig. 3b is a flowchart of a background image and color threshold determining method according to an embodiment of the present application;
fig. 3c is a flowchart of a background image and color threshold determining method according to an embodiment of the present application;
fig. 4 is a flowchart of a color threshold determining method according to an embodiment of the present application;
FIG. 5 is a flowchart of a method for partitioning a target object according to an embodiment of the present disclosure;
fig. 6 is a schematic diagram of a matting device provided in an embodiment of the present application;
Fig. 7 is a schematic structural diagram of a server according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of a terminal according to an embodiment of the present application.
Detailed Description
Embodiments of the present application are described below with reference to the accompanying drawings.
At present, the main approach to virtual matting is to build a blue-green background, record a video of the target object against it, and then matte the target object out of the background. This scheme requires constructing a background, which increases both the complexity and the cost of matting.
For this reason, the matting method provided by the present application requires no additional background construction, reducing both the complexity and the cost of virtual matting.
First, the execution body of the embodiments of the present application is described. The matting method provided by the present application may be executed by a data processing device, such as a terminal device or a server. The terminal device may be, for example, a smartphone, a computer, a personal digital assistant (Personal Digital Assistant, PDA), or a tablet computer. The server may be an independent server or a server in a cluster.
Referring to fig. 1, which is a flowchart of a matting method provided by an embodiment of the present application, as shown in fig. 1, the method includes:
S101: and acquiring a background image and an image to be segmented for the static scene.
The target object to be matted out in the embodiment of the present application may be any object, for example a human body (such as a host), an automobile, or an animal.
It should be noted that the target object in the present application may be located in a static scene, which may be any static scene, for example an office or a living room. No specific background, such as a blue-green screen, needs to be built in the static scene. Of course, the solution is not restricted to scenes without such backgrounds: its range of application is simply wider than blue-green-screen matting, so if the static scene does contain a specific background such as a built blue-green screen, the solution of the present application still applies.
In the embodiment of the present application, when the shooting device records a video of a static scene, the target object has not yet entered the scene at the beginning of shooting; it enters only after a certain period of time, and once in the scene it need not stay still but may move arbitrarily. Thus, in the video captured by the shooting device (denoted the target video for convenience of description), only the static scene appears in the first several frame images (denoted the first N frame images), with no target object present.
The background image in the embodiment of the application is obtained based on the image which does not contain the target object in the target video, and the image to be segmented is the image which contains the target object in the target video.
In addition, the pixels in the background image have corresponding color thresholds. The color threshold may be used to distinguish whether a pixel in the image to be segmented belongs to the background. The manner in which the color threshold is applied to distinguish whether a pixel in the image to be segmented belongs to the background will be described in detail later.
S102: and calculating the first color distance between the first pixel point in the image to be segmented and the corresponding second pixel point in the background image.
In the embodiment of the present application, a pixel point in the image to be segmented corresponds to one pixel point in the background image. Corresponding pixel points in the two images can be understood as depicting the same background scene as captured by the shooting device.
For example, referring to fig. 2, which is a schematic diagram of a background image and an image to be segmented provided in an embodiment of the present application: the background image shows a wall surface, the image to be segmented shows a person in front of the wall surface, and the pixel point 201 in the background image corresponds to the pixel point 202 in the image to be segmented; both correspond to the same position on the wall surface in the real background scene.
In the embodiment of the application, the color distance between the pixel point in the image to be segmented and the corresponding pixel point in the background image can be calculated. Taking a first pixel point in the image to be segmented (any pixel point in the image to be segmented) as an example, a color distance between the first pixel point and a corresponding second pixel point in the background image can be calculated and recorded as a first color distance.
The first color distance between the first pixel point and the corresponding second pixel point in the background image may be calculated as follows: if the color value of the first pixel point is (R1, G1, B1) and the color value of the second pixel point is (R2, G2, B2), the first color distance between the two is

d = √((R1 - R2)² + (G1 - G2)² + (B1 - B2)²)

In addition, for convenience of calculation, the color values of the first and second pixel points may first be normalized, and the first color distance d then computed from the normalized values. The normalization is FR = R/255, FG = G/255, FB = B/255, so that the color value of the first pixel point is (FR1, FG1, FB1), that of the second pixel point is (FR2, FG2, FB2), and the first color distance between the two is

d = √((FR1 - FR2)² + (FG1 - FG2)² + (FB1 - FB2)²)
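As a minimal illustration of this calculation, the following Python sketch computes the normalized color distance; the function and variable names are ours, not the patent's:

```python
import math

def color_distance(pixel1, pixel2):
    """Euclidean color distance between two (R, G, B) pixel points,
    with each channel first normalized to [0, 1] via F = value / 255."""
    f1 = [c / 255.0 for c in pixel1]
    f2 = [c / 255.0 for c in pixel2]
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))

# Example: distance between a mid-gray pixel and a slightly darker one
d = color_distance((128, 128, 128), (120, 125, 130))
print(round(d, 4))
```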
S103: and when the first color distance is not smaller than the color threshold value corresponding to the second pixel point, determining the first pixel point as a target pixel point.
The target pixel point may refer to a pixel point in the target object.
The second pixel point has a corresponding color threshold, which can be used to decide whether the first pixel point belongs to the target object: when the first color distance between the first pixel point and the second pixel point is greater than or equal to the color threshold of the second pixel point, the first pixel point can be determined to belong to the target object; when the first color distance is smaller than that color threshold, the first pixel point can be determined not to belong to the target object.
For this reason, in the embodiment of the present application, when the first color distance is determined to be not smaller than the color threshold corresponding to the second pixel point, the first pixel point is determined to be a target pixel point.
S104: and dividing the target object from the image to be divided according to the target pixel point.
After every pixel point in the image to be segmented has been checked in this way, the target can be matted out of the image to be segmented according to all of the target pixel points in it.
It should be noted that if the region formed by the target pixel points includes the edge of the target object, the target object may be segmented from the image to be segmented along the edge determined by the target pixel points.
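To make the decision rule of S102-S104 concrete, here is a hedged Python sketch that builds a per-pixel mask of target pixel points. It reuses the color_distance function from the sketch above, and all names are ours, not the patent's:

```python
def build_target_mask(image, background, thresholds):
    """Classify every pixel point of the image to be segmented: a pixel
    point is a target pixel point when its first color distance to the
    corresponding background pixel point is not smaller than that pixel
    point's color threshold (steps S102-S103)."""
    height, width = len(image), len(image[0])
    mask = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            d = color_distance(image[y][x], background[y][x])  # first color distance
            mask[y][x] = d >= thresholds[y][x]  # True: belongs to the target object
    return mask
```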
As can be seen from the above technical solution, the matting method includes: acquiring a background image and an image to be segmented for a static scene, where the background image does not include the target object and the pixel points in the background image have corresponding color thresholds; calculating a first color distance between a first pixel point in the image to be segmented and the corresponding second pixel point in the background image, the first pixel point being any pixel point in the image to be segmented; when the first color distance is not smaller than the color threshold corresponding to the second pixel point, determining the first pixel point to be a target pixel point, i.e., a pixel point of the target object; and segmenting the target object from the image to be segmented according to the target pixel points. The method can segment the target object in a static scene without building an additional background, thereby reducing both the complexity and the cost of virtual matting.
Next, a method for determining the color threshold of the pixel point in the background image will be described.
In one possible implementation, the background image is determined from the first N frame images in the target video, where none of the first N frame images in the target video includes the target object.
An alternative implementation manner is that the background image may be any one of the first N frame images of the target video. For example, the background image may be the first frame image in the target video. In this way, referring to fig. 3a, which shows a flowchart of a method for determining a color threshold, as shown in fig. 3a, according to an embodiment of the present application, a color threshold corresponding to a second pixel point in a background image may be determined by:
s301: and determining a second color distance between a third pixel point in each frame of image in the previous N frames of images and the second pixel point.
In each of the first N frames other than the background image, the pixel point corresponding to the second pixel point is denoted a third pixel point. For each such frame, the color distance between its third pixel point and the second pixel point is determined and denoted a second color distance. In this way, N-1 second color distances are obtained.
S302: and determining a color threshold corresponding to the second pixel point according to the second color distance.
In a specific implementation, the average of the N-1 second color distances may be used as the color threshold corresponding to the second pixel point; this threshold is in effect the jitter (dithering) color threshold of the second pixel point.
The above is one way of determining the color thresholds of the pixel points in the background image.
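A small Python sketch of this averaging scheme, under the same assumptions as the earlier sketches (images as nested lists of (R, G, B) tuples; all names ours):

```python
def thresholds_from_first_frames(frames, background_index=0):
    """Take one of the first N frames as the background image and set each
    pixel point's color threshold to the average of its second color
    distances to the corresponding pixel points of the other N-1 frames."""
    background = frames[background_index]
    others = [f for i, f in enumerate(frames) if i != background_index]
    height, width = len(background), len(background[0])
    thresholds = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dists = [color_distance(f[y][x], background[y][x]) for f in others]
            thresholds[y][x] = sum(dists) / len(dists)  # average second color distance
    return background, thresholds
```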
In another alternative implementation, the background image is obtained by processing at least part of the first N frames of the target video. The background image corresponding to the (n-1)-th frame (denoted the (n-1)-th background image for convenience of description) may be updated using the n-th frame, repeating until an update-end condition is satisfied; the updated background image obtained at that point is used as the background image for detecting the target object, where n is a positive integer greater than 1 and at most N. Referring to fig. 3b, which shows a flowchart of a method for determining a background image and color thresholds according to an embodiment of the present application, the method may include:
S311: determine the n-th frame of the first N frames as the first image and the (n-1)-th frame of the first N frames as the second image.
S312: calculate the third color distance between a fourth pixel point in the first image and a fifth pixel point in the first background image (this distance may also be called the third color distance corresponding to the fourth pixel point in the first image). The first background image is the background image corresponding to the second image; the fourth pixel point is any pixel point in the first image; the fifth pixel point is the pixel point in the first background image corresponding to the fourth pixel point in the first image.
The 1st background image, corresponding to the 1st frame image (the first of the first N frames and also the first frame of the target video), is the 1st frame image itself, and the color threshold corresponding to each of its pixel points is a preset value, for example 0.
S313: if the third color distance corresponding to each pixel point in the first image is smaller than or equal to the color threshold corresponding to the fifth pixel point, the first background image is used as the background image for detecting the target object, and the color threshold corresponding to the fourth pixel point in the first background image is the color threshold corresponding to the pixel point in the background image for detecting the target object, and the process is ended. Otherwise, the process advances to step S314.
S314: updating the first background image based on the third color distance: if the third color distance is larger than the color threshold corresponding to the fifth pixel point, updating the color threshold corresponding to the fifth pixel point to the third color distance, and updating the color value of the fifth pixel point according to the color value of the fourth pixel point and the color value of the fifth pixel point, otherwise, keeping the color value of the fifth pixel point and the corresponding color threshold unchanged.
The color value of the fifth pixel point may be updated according to a formula in which the left-hand side (R'_{d,x,y}, G'_{d,x,y}, B'_{d,x,y}) denotes the updated color value of the fifth pixel point (i.e., the color value of the fifth pixel point in the background image corresponding to the first image), (R_{d,x,y}, G_{d,x,y}, B_{d,x,y}) on the right-hand side denotes the color value of the fifth pixel point before the update (i.e., its color value in the first background image), and (R_{n,x,y}, G_{n,x,y}, B_{n,x,y}) denotes the color value of the fourth pixel point in the first image. It should be noted that the color values involved in the formula are normalized color values; in an alternative embodiment, they may be non-normalized color values.
After all pixel points of the first image have been processed in this way, the background image corresponding to the first image is obtained.
S315: and determining an n+1st frame image in the previous N frame images as a first image, determining an N frame image in the previous N frame images as a second image, and returning to execute the step S312 and subsequent steps.
Optionally, if the first image is the last of the first N frames (i.e., n = N) and there exists a first-type pixel point in the first image, i.e., a pixel point whose third color distance is greater than the color threshold of its corresponding pixel point in the (N-1)-th background image, then the background image corresponding to the first image is used as the background image for detecting the target object, and the color threshold corresponding to each pixel point in that background image is used as the color threshold of the corresponding pixel point in the background image for detecting the target object.
Specifically, the third color distance between the fourth pixel point in the 2nd frame image and the corresponding fifth pixel point in the 1st background image may be calculated. The fourth pixel point is any pixel point in the 2nd frame image; the fifth pixel point is the pixel point in the 1st background image corresponding to the fourth pixel point in the 2nd frame image.
If the third color distance between the fourth pixel point in the 2nd frame image and its corresponding fifth pixel point in the 1st background image is larger than the color threshold corresponding to the fifth pixel point, the color threshold corresponding to the fifth pixel point is updated to that third color distance, and the color value of the fifth pixel point in the 1st background image is updated according to the color value of the fourth pixel point in the 2nd frame image and the color value of the fifth pixel point in the 1st background image; otherwise, the color value of the fifth pixel point in the 1st background image and its color threshold remain unchanged. After all pixel points of the 2nd frame image have been processed, the 2nd background image corresponding to the 2nd frame image is obtained.
The third color distance between the fourth pixel point in the 3rd frame image and the corresponding fifth pixel point in the 2nd background image is then calculated. The fourth pixel point is any pixel point in the 3rd frame image; the fifth pixel point is the pixel point in the 2nd background image corresponding to the fourth pixel point in the 3rd frame image.
If the third color distance between the fourth pixel point in the 3rd frame image and its corresponding fifth pixel point in the 2nd background image is larger than the color threshold corresponding to the fifth pixel point, the color threshold corresponding to the fifth pixel point is updated to that third color distance, and the color value of the fifth pixel point in the 2nd background image is updated according to the color value of the fourth pixel point in the 3rd frame image and the color value of the fifth pixel point in the 2nd background image; otherwise, the color value of the fifth pixel point in the 2nd background image and its color threshold remain unchanged. After all pixel points of the 3rd frame image have been processed, the 3rd background image corresponding to the 3rd frame image is obtained.
Then the third color distance between the fourth pixel point in the 4th frame image and the corresponding fifth pixel point in the 3rd background image is calculated, and so on.
If the third color distance corresponding to each pixel point in the 2nd frame image is smaller than or equal to the color threshold corresponding to its pixel point in the 1st background image, the 1st background image corresponding to the 1st frame image is used as the background image for detecting the target object, and the color thresholds corresponding to the pixel points in the 1st background image are used as the color thresholds of that background image.
Similarly, if the third color distance corresponding to each pixel point in the 3rd frame image is smaller than or equal to the color threshold corresponding to its pixel point in the 2nd background image, the 2nd background image corresponding to the 2nd frame image is used as the background image for detecting the target object, and the color thresholds corresponding to the pixel points in the 2nd background image are used as the color thresholds of that background image.
And so on.
After the background image for detecting the target object is determined, the process of determining the background image and the color threshold is ended.
Based on the manner of determining the color value and the color threshold of the background image shown in fig. 3b, as shown in fig. 3c, the color value and the color threshold of the second pixel point in the background image may be determined as follows:
s321: and calculating the third color distance between the third pixel point in the nth frame image and the second pixel point in the (n-1) th background image. The third pixel point is a pixel point corresponding to the second pixel point in the n-1 background image in the n-th frame image.
The background image corresponding to the 1 st frame image (i.e., the first frame image in the previous N frame images) is the 1 st frame image.
S322: if the third color distance is larger than the color threshold corresponding to the second pixel point, updating the color threshold corresponding to the second pixel point to be the third color distance, and updating the color value of the second pixel point according to the color value of the third pixel point and the color value of the second pixel point, otherwise, keeping the color value of the second pixel point and the corresponding color threshold unchanged.
And after all pixel points of the nth frame image are calculated, obtaining an nth background image corresponding to the nth frame image. At this time, the nth background image is not necessarily a background image for detecting the target object. The third color distance between the third pixel point in the n+1th frame image and the second pixel point in the n background image needs to be calculated, and then whether the n background image is the background image for detecting the target object or not is judged according to the third color distance between the third pixel point in the n+1th frame image and the second pixel point in the n background image, and the specific judgment principle is the same as step S323 (i.e. whether the n-1 background image is the background image for detecting the target object or not is judged according to the third color distance between the third pixel point in the n frame image and the second pixel point in the n-1 background image), which will not be described in detail herein.
S323: if the third color distance corresponding to each pixel point in the nth frame image is smaller than or equal to the color threshold corresponding to the pixel point, taking the nth-1 background image corresponding to the nth-1 frame image as the background image for detecting the target object, namely, the color value of the second pixel point in the nth-1 background image is the color value of the second pixel point in the background image for detecting the target object, and the color threshold corresponding to the second pixel point in the nth-1 background image is the color threshold corresponding to the second pixel point in the background image for detecting the target object.
Optionally, if the nth frame image is the last frame image in the previous N frame images and there is a pixel point in the nth frame image, where the third color distance corresponding to the pixel point is greater than the color threshold corresponding to the pixel point, the nth background image corresponding to the nth frame image is used as the background image for detecting the target object, and the color threshold corresponding to the fourth pixel point in the nth background image is the color threshold corresponding to the pixel point in the background image for detecting the target object.
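The following Python sketch outlines the iterative update of figs. 3b/3c, under the same representation assumptions as the earlier sketches. Note that the exact rule for combining the old and new color values is not reproduced in the text above, so averaging them is our assumption, not the patent's formula:

```python
def build_background(frames):
    """Iteratively refine the background image and per-pixel color thresholds
    from the first N frames (steps S311-S315 / S321-S323)."""
    height, width = len(frames[0]), len(frames[0][0])
    background = [list(row) for row in frames[0]]        # 1st background image = 1st frame
    thresholds = [[0.0] * width for _ in range(height)]  # preset threshold, e.g. 0
    for frame in frames[1:]:
        any_update = False
        for y in range(height):
            for x in range(width):
                d = color_distance(frame[y][x], background[y][x])  # third color distance
                if d > thresholds[y][x]:
                    thresholds[y][x] = d
                    # Assumed combining rule: average of old and new color values.
                    background[y][x] = tuple(
                        (a + b) / 2.0 for a, b in zip(background[y][x], frame[y][x])
                    )
                    any_update = True
        if not any_update:
            # Every third color distance <= its threshold: the previous
            # background image is kept as the background for detection.
            break
    return background, thresholds
```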
In one possible implementation, the image to be segmented is the (N+1)-th frame image in the target video. That is, when the image to be segmented is the (N+1)-th frame of the target video, target matting may be performed according to the color thresholds determined in the manner of fig. 3a, 3b, or 3c above.
Next, another method for determining a color threshold of a pixel point in a background image provided in the embodiment of the present application will be described.
In one possible implementation manner, the background image is any one of the first N frame images of the target video, for example, the background image is the first frame image of the target video; alternatively, the background image is obtained by processing at least part of the first N frame images of the target video (see the foregoing embodiment for specific ways of obtaining). Wherein, the first N frames of images of the target video do not comprise the target object.
For this purpose, if the image to be segmented is the M-th frame image in the target video, where M > N+1, the color threshold corresponding to the second pixel point is determined according to the fourth color distance between the second pixel point and a sixth pixel point. The sixth pixel point is the pixel point corresponding to the second pixel point in a third image, where the third image is the (M-1)-th frame image in the target video.
That is, the color thresholds of the pixel points in the background image may be updated in real time. Taking the second pixel point as an example: after target matting has been performed on the (M-1)-th frame of the target video, the sixth pixel point corresponding to the second pixel point in that frame can be determined, and the color distance between the sixth and second pixel points, i.e., the fourth color distance, is computed. The color threshold of the second pixel point is then updated according to the fourth color distance, and the updated thresholds of the background image are applied to the M-th frame for its target matting.
In a possible implementation manner, referring to fig. 4, a flowchart of another color threshold determining method provided in an embodiment of the present application is shown, and as shown in fig. 4, a color threshold corresponding to a second pixel point in a background image may be determined by:
S401: and determining whether the fourth color distance is larger than a color threshold corresponding to the second pixel point. If yes, S402 is executed, and if no, S403 is executed.
S402: and updating the color threshold value corresponding to the second pixel point to be the fourth color distance.
S403: and not updating the color threshold value corresponding to the second pixel point.
That is, if the fourth color distance is greater than the color threshold corresponding to the second pixel point, the color threshold thereof may be updated to the fourth color distance, that is, the fourth color distance is the updated color threshold of the second pixel point.
In an actual scene, when target objects need to be segmented from the images of a target video, the background image and the color thresholds of its pixel points can first be determined from the first N frames. When the (N+1)-th frame needs target matting, it can be matted according to the color thresholds determined in the manner of fig. 3a, 3b, or 3c above. When any frame after the (N+1)-th frame needs target matting, it can be matted according to the color thresholds updated in real time by the method of S401-S403.
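A Python sketch of the per-frame threshold refresh of S401-S403 (names ours; whether pixel points already classified as target should be excluded from the refresh is not stated in the text, so a practical implementation might skip them):

```python
def refresh_thresholds(previous_frame, background, thresholds):
    """Real-time threshold update: after matting the (M-1)-th frame, compute
    each background pixel point's fourth color distance to the corresponding
    pixel point of that frame and raise the threshold if it was exceeded."""
    for y in range(len(background)):
        for x in range(len(background[0])):
            d = color_distance(previous_frame[y][x], background[y][x])  # fourth color distance
            if d > thresholds[y][x]:
                thresholds[y][x] = d   # S402: update the threshold to the distance
            # S403: otherwise the threshold is left unchanged
    return thresholds
```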
In the embodiment of the present application, after S103, referring to fig. 5, this illustrates a flowchart of a method for segmenting a target object provided in the embodiment of the present application, as shown in fig. 5, the method may further include:
s501: and adding a transparency channel to the first pixel point according to the first color distance.
The larger the first color distance, the smaller the transparency value added for the first pixel point; the smaller the first color distance, the larger the transparency value.
S502: and processing the image to be segmented through a bidirectional anisotropic filtering algorithm according to the transparency channel corresponding to the first pixel point.
The bidirectional anisotropic filtering algorithm may be an algorithm for smoothing an image. It processes the transparency channel of the image to be segmented so that, after processing, pixel points that had high transparency become more transparent and pixel points with low transparency become less transparent. That is, in the processed image to be segmented, the pixel points of the background area have higher transparency and the target pixel points have lower transparency, which facilitates the subsequent segmentation of the target object.
Accordingly, the segmenting of the target object from the image to be segmented according to the target pixel points in S104 may include:
s503: and dividing the target object from the image to be divided processed by the bi-directional anisotropic filtering algorithm according to the target pixel point.
In this method, the bidirectional anisotropic filtering algorithm filters noise from the picture of the target object in the image to be segmented, making the picture clearer and helping to obtain a smoother target object region.
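As an illustration only, the sketch below assigns the transparency channel from the first color distance (S501) and then smooths it. OpenCV's edge-preserving bilateral filter is used as a stand-in for the patent's bidirectional anisotropic filtering, whose implementation the text does not specify; the linear distance-to-transparency mapping is likewise our assumption, since the patent fixes only the monotonic relationship:

```python
import numpy as np
import cv2  # OpenCV, used here only as a stand-in smoothing filter

def alpha_from_distances(distances, thresholds):
    """Assign a transparency value per pixel point so that a larger first
    color distance yields a smaller transparency value (assumed mapping)."""
    ratio = np.clip(distances / np.maximum(thresholds, 1e-6), 0.0, 1.0)
    return (1.0 - ratio).astype(np.float32)  # large distance -> low transparency

def smooth_alpha(alpha):
    """Smooth the transparency channel while preserving object boundaries,
    approximating the patent's bidirectional anisotropic filtering."""
    return cv2.bilateralFilter(alpha, 9, 0.1, 5.0)

# Usage sketch: distances and thresholds are HxW float arrays
# alpha = smooth_alpha(alpha_from_distances(distances, thresholds))
```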
Based on the above-mentioned image matting method, the embodiment of the present application further provides an image matting apparatus, referring to fig. 6, which shows a schematic diagram of the image matting apparatus provided by the embodiment of the present application, where the apparatus includes:
an acquisition unit 601, configured to acquire a background image and an image to be segmented for a still scene; wherein the background image does not include a target object, and pixel points in the background image have corresponding color thresholds;
a calculating unit 602, configured to calculate a first color distance between a first pixel point in the image to be segmented and a corresponding second pixel point in the background image; the first pixel point is any pixel point in the image to be segmented;
A determining unit 603, configured to determine, when the first color distance is not less than a color threshold corresponding to the second pixel, that the first pixel is a target pixel, where the target pixel is a pixel in the target object;
and a segmentation unit 604, configured to segment the target object from the image to be segmented according to the target pixel point.
In a possible implementation, the background image is any one of the first N frame images of the target video, or is obtained by processing at least part of the first N frame images of the target video.
In one possible implementation manner, when the background image is any one frame image of the first N frames of images of the target video, the color threshold corresponding to the second pixel point is determined by the following manner:
determining a second color distance between a third pixel point in each frame of image in the previous N frames of images and the second pixel point, wherein the third pixel point corresponds to the second pixel point;
and determining a color threshold corresponding to the second pixel point according to the second color distance.
In one possible implementation, when the background image is obtained by processing at least part of the first N frame images of the target video, the color value of the second pixel point and the color threshold corresponding to the second pixel point are determined in the following manner:
calculating a third color distance between a third pixel point in the n-th frame image and a second pixel point in the (n-1)-th background image; the (n-1)-th background image is the background image corresponding to the (n-1)-th frame image, and the third pixel point corresponds to the second pixel point;
if the third color distance is larger than the color threshold corresponding to the second pixel point, updating the color threshold corresponding to the second pixel point to the third color distance, and updating the color value of the second pixel point according to the color value of the third pixel point and the color value of the second pixel point; otherwise, keeping the color value of the second pixel point and its color threshold unchanged;
and if the third color distance corresponding to each pixel point in the n-th frame image is smaller than or equal to its color threshold, taking the (n-1)-th background image corresponding to the (n-1)-th frame image as the background image, where the color threshold corresponding to each pixel point in the background image is the color threshold corresponding to that pixel point in the (n-1)-th background image.
In one possible implementation, the image to be segmented is the (N+1)-th frame image in the target video.
In one possible implementation, if the image to be segmented is the M-th frame image in the target video, where M > N+1, the color threshold corresponding to the second pixel point is determined according to a fourth color distance between the second pixel point and a sixth pixel point, where the sixth pixel point is the pixel point corresponding to the second pixel point in a third image, and the third image is the (M-1)-th frame image in the target video.
In one possible implementation manner, the color threshold value corresponding to the second pixel point is determined by the following manner:
determining whether the fourth color distance is larger than a color threshold corresponding to the second pixel point;
if yes, updating the color threshold value corresponding to the second pixel point to be the fourth color distance;
if not, the color threshold value corresponding to the second pixel point is not updated.
In a possible implementation manner, the determining unit 603 is further specifically configured to:
adding a transparency channel to the first pixel point according to the first color distance, wherein when the first color distance is larger, the transparency channel added to the first pixel point is smaller;
processing the image to be segmented through a bidirectional anisotropic filtering algorithm according to the transparency channel corresponding to the first pixel point;
segmenting the target object, according to the target pixel points, from the image to be segmented as processed by the bidirectional anisotropic filtering algorithm.
As can be seen from the above technical solution, the matting method includes: acquiring a background image and an image to be segmented for a static scene, where the background image does not include the target object and the pixel points in the background image have corresponding color thresholds; calculating a first color distance between a first pixel point in the image to be segmented and the corresponding second pixel point in the background image, the first pixel point being any pixel point in the image to be segmented; when the first color distance is not smaller than the color threshold corresponding to the second pixel point, determining the first pixel point to be a target pixel point, i.e., a pixel point of the target object; and segmenting the target object from the image to be segmented according to the target pixel points. The method can segment the target object in a static scene without building an additional background, thereby reducing both the complexity and the cost of virtual matting.
Embodiments of the present application also provide a device, which may specifically be a server. Fig. 7 is a schematic diagram of a server structure provided in an embodiment of the present application. The server 600 may vary considerably in configuration or performance and may include one or more central processing units (CPU) 622 (e.g., one or more processors), a memory 632, and one or more storage media 630 (e.g., one or more mass storage devices) storing application programs 642 or data 644. The memory 632 and the storage medium 630 may be transitory or persistent storage. The program stored on the storage medium 630 may include one or more modules (not shown), each of which may include a series of instruction operations on the server. Furthermore, the central processing unit 622 may be configured to communicate with the storage medium 630 and to execute on the server 600 the series of instruction operations in the storage medium 630.
The server 600 may also include one or more power supplies 626, one or more wired or wireless network interfaces 650, one or more input/output interfaces 658, and/or one or more operating systems 641, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and the like.
The steps performed by the server in the above embodiments may be based on the server structure shown in fig. 7.
Wherein, CPU622 is configured to perform the following steps:
acquiring a background image and an image to be segmented aiming at a static scene; wherein the background image does not include a target object, and pixel points in the background image have corresponding color thresholds;
calculating a first color distance between a first pixel point in the image to be segmented and a corresponding second pixel point in the background image; the first pixel point is any pixel point in the image to be segmented;
when the first color distance is not smaller than a color threshold value corresponding to the second pixel point, determining the first pixel point as a target pixel point, wherein the target pixel point is a pixel point in the target object;
and segmenting the target object from the image to be segmented according to the target pixel points.
Optionally, the CPU622 is further configured to perform steps of any implementation of the matting method provided in the embodiment of the present application.
The embodiment of the present application further provides another device, which may specifically be a terminal, such as a desktop or a notebook computer. As shown in fig. 8, for convenience of explanation, only the portions related to the embodiment of the present application are shown; for specific technical details not disclosed, refer to the method portion of the embodiments of the present application. A desktop is taken as an example of the terminal:
Fig. 8 is a block diagram of a partial desktop structure related to the terminal provided in an embodiment of the present application. Referring to fig. 8, the desktop includes: a radio frequency (RF) circuit 710, a memory 720, an input unit 730, a display unit 740, a sensor 750, an audio circuit 760, a wireless fidelity (WiFi) module 770, a processor 780, and a power supply 790. It will be appreciated by those skilled in the art that the desktop structure shown in fig. 8 is not limiting; the desktop may include more or fewer components than shown, combine certain components, or arrange the components differently.
The following describes the respective constituent elements of the desktop in detail with reference to fig. 8:
the RF circuit 710 may be used to receive and transmit information or signals during a call. Generally, the RF circuit 710 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier (LNA), a duplexer, and the like. In addition, the RF circuit 710 may also communicate with networks and other devices via wireless communication.
The memory 720 may be used to store software programs and modules; the processor 780 performs the various functional applications and data processing of the desktop by running the software programs and modules stored in the memory 720. The memory 720 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, application programs required for at least one function (such as a sound playing function, an image playing function, etc.), and the like, and the data storage area may store data created according to the use of the desktop (such as audio data, etc.). In addition, the memory 720 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device.
The input unit 730 may be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the desktop. In particular, the input unit 730 may include a touch panel 731 and other input devices 732. The touch panel 731, also referred to as a touch screen, may collect touch operations by the user on or near it (e.g., operations performed on or near the touch panel 731 with a finger, a stylus, or any other suitable object or accessory) and drive the corresponding connection device according to a preset program. Optionally, the touch panel 731 may include two parts: a touch detection device and a touch controller. The touch detection device detects the position of the user's touch, detects the signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into touch point coordinates, and sends them to the processor 780; it can also receive and execute commands from the processor 780. The touch panel 731 may be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave. Besides the touch panel 731, the other input devices 732 may include, but are not limited to, one or more of a physical keyboard, function keys (e.g., volume control keys, a switch key, etc.), a trackball, a mouse, a joystick, and the like.
The display unit 740 may be used to display information input by the user or provided to the user, as well as the various menus of the desktop. The display unit 740 may include a display panel 741; optionally, the display panel 741 may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED), or the like. Further, the touch panel 731 may cover the display panel 741; when the touch panel 731 detects a touch operation on or near it, the operation is transferred to the processor 780 to determine the type of the touch event, after which the processor 780 provides a corresponding visual output on the display panel 741 according to the type of the touch event. Although in fig. 8 the touch panel 731 and the display panel 741 are two independent components implementing the input and output functions of the desktop, in some embodiments the touch panel 731 and the display panel 741 may be integrated to implement the input and output functions.
The desktop may also include at least one sensor 750, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor and a proximity sensor: the ambient light sensor may adjust the brightness of the display panel 741 according to the brightness of the ambient light, and the proximity sensor may turn off the display panel 741 and/or the backlight when the desktop moves close to the ear. As one type of motion sensor, an accelerometer can detect the magnitude of acceleration in all directions (generally three axes), and can detect the magnitude and direction of gravity when stationary; it can be used in applications that recognize the attitude of the desktop, in vibration-recognition-related functions (such as a pedometer or tap detection), and the like. Other sensors that may be configured on the desktop, such as a gyroscope, barometer, hygrometer, thermometer, or infrared sensor, are not described in detail here.
The audio circuit 760, a speaker 761, and a microphone 762 may provide an audio interface between the user and the desktop. The audio circuit 760 may transmit the electrical signal converted from the received audio data to the speaker 761, which converts it into a sound signal for output; conversely, the microphone 762 converts the collected sound signal into an electrical signal, which the audio circuit 760 receives and converts into audio data; the audio data is then processed by the processor 780 and transmitted, for example, to another desktop via the RF circuit 710, or output to the memory 720 for further processing.
WiFi is a short-range wireless transmission technology. Through the WiFi module 770, the desktop can help the user send and receive e-mail, browse web pages, access streaming media, and the like, providing wireless broadband Internet access. Although fig. 8 shows the WiFi module 770, it is understood that it is not an essential component of the desktop and may be omitted as needed within the scope that does not change the essence of the invention.
The processor 780 is the control center of the desktop. It connects the various parts of the entire desktop using various interfaces and lines, and performs the various functions and data processing of the desktop by running or executing the software programs and/or modules stored in the memory 720 and calling the data stored in the memory 720, thereby monitoring the desktop as a whole. Optionally, the processor 780 may include one or more processing units; preferably, the processor 780 may integrate an application processor, which mainly handles the operating system, user interfaces, applications, and the like, and a modem processor, which mainly handles wireless communication. It will be appreciated that the modem processor may also not be integrated into the processor 780.
The desktop further includes a power supply 790 (e.g., a battery) for powering the various components. Preferably, the power supply is logically coupled to the processor 780 via a power management system, so that charging, discharging, and power consumption are managed through the power management system.
Although not shown, the desktop may further include a camera, a Bluetooth module, and the like, which are not described herein.
In the embodiment of the present application, the processor 780 included in the terminal further has the following functions:
acquiring a background image and an image to be segmented for a static scene; wherein the background image does not include a target object, and pixel points in the background image have corresponding color thresholds;
calculating a first color distance between a first pixel point in the image to be segmented and a corresponding second pixel point in the background image; the first pixel point is any pixel point in the image to be segmented;
when the first color distance is not smaller than a color threshold value corresponding to the second pixel point, determining the first pixel point as a target pixel point, wherein the target pixel point is a pixel point in the target object;
and segmenting the target object from the image to be segmented according to the target pixel point.
Optionally, the processor 780 is further configured to perform the steps of any implementation of the matting method provided in the embodiments of the present application.
The embodiment of the application further provides a computer readable storage medium for storing a computer program, where the computer program is configured to perform any one of the implementations of the matting method described in the foregoing embodiments.
The embodiments also provide a computer program product comprising instructions which, when executed on a computer, cause the computer to perform any one of the matting methods described in the foregoing embodiments.
Those of ordinary skill in the art will appreciate that all or part of the steps for implementing the above method embodiments may be implemented by hardware instructed by a program; the program may be stored in a computer readable storage medium and, when executed, performs the steps of the above method embodiments. The aforementioned storage medium may be any medium that can store program code, such as read-only memory (ROM), RAM, a magnetic disk, or an optical disk.
It should be noted that the embodiments in this specification are described in a progressive manner: identical or similar parts of the embodiments may be referred to across embodiments, and each embodiment focuses on its differences from the others. In particular, the apparatus and system embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, refer to the description of the method embodiments. The apparatus and system embodiments described above are merely illustrative: units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment. Those of ordinary skill in the art can understand and implement this without creative effort.
The foregoing is merely a specific embodiment of the present application, but the protection scope of the present application is not limited thereto. Any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope disclosed in the present application shall be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (16)

1. A matting method, the method comprising:
acquiring a background image and an image to be segmented for a static scene; wherein the background image does not include a target object, and pixel points in the background image have corresponding color thresholds;
calculating a first color distance between a first pixel point in the image to be segmented and a corresponding second pixel point in the background image; the first pixel point is any pixel point in the image to be segmented;
when the first color distance is not smaller than a color threshold value corresponding to the second pixel point, determining the first pixel point as a target pixel point, wherein the target pixel point is a pixel point in the target object;
adding a transparency channel to the first pixel point according to the first color distance, wherein a larger first color distance corresponds to a smaller transparency value added to the first pixel point;
processing the image to be segmented through a bidirectional anisotropic filtering algorithm according to the transparency channel corresponding to the first pixel point;
and segmenting the target object, according to the target pixel point, from the image to be segmented as processed by the bidirectional anisotropic filtering algorithm; wherein, if the region corresponding to the target pixel point is the edge of the target object, the target object is segmented from the image to be segmented according to the edge determined by the target pixel point.
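As a rough illustration of the transparency step in claim 1: the claim only requires that a larger first color distance yields a smaller transparency value, without fixing the mapping, and it does not define the bidirectional anisotropic filtering algorithm. The Python sketch below therefore assumes a linear mapping clipped to [0, 1], and uses OpenCV's ordinary bilateral filter purely as an edge-preserving stand-in for the unspecified filter; all names and parameter values are illustrative.

```python
import cv2
import numpy as np

def add_transparency(distance, scale=64.0):
    # Assumed linear mapping: the larger the color distance, the smaller
    # the transparency value, as claim 1 requires (exact form unspecified).
    return np.clip(1.0 - distance / scale, 0.0, 1.0)

def smooth_transparency(transparency):
    # Stand-in for the unspecified bidirectional anisotropic filter:
    # a bilateral filter smooths the matte while preserving object edges.
    return cv2.bilateralFilter(transparency.astype(np.float32),
                               d=9, sigmaColor=0.1, sigmaSpace=5.0)
```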
2. The method according to claim 1, wherein the background image is any one of the first N frames of images of the target video, or the background image is obtained by processing at least part of the first N frames of images of the target video.
3. The method according to claim 2, wherein when the background image is any one of the first N frames of images of the target video, the color threshold corresponding to the second pixel point is determined by:
determining a second color distance between a third pixel point in each frame of image among the first N frames of images and the second pixel point, wherein the third pixel point corresponds to the second pixel point;
and determining the color threshold corresponding to the second pixel point according to the second color distance.
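A sketch of the threshold construction in claim 3, assuming the per-pixel threshold is taken as the largest second color distance observed over the first N frames; the claim only says the threshold is determined "according to" those distances, so the maximum is an assumption (a scaled mean or a percentile would fit the claim equally well), and all names are illustrative.

```python
import numpy as np

def color_thresholds(first_n_frames, background):
    """first_n_frames: iterable of (H, W, 3) uint8 frames of the static scene.

    Returns an (H, W) array of per-pixel color thresholds.
    """
    bg = background.astype(np.float64)
    distances = [np.sqrt(((f.astype(np.float64) - bg) ** 2).sum(axis=2))
                 for f in first_n_frames]
    # Assumed statistic: the largest distance seen per pixel across N frames.
    return np.max(distances, axis=0)
```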
4. The method according to claim 2, wherein when the background image is obtained by processing at least part of the first N frames of images of the target video, the color value of the second pixel point and the color threshold corresponding to the second pixel point are determined by:
calculating a third color distance between a third pixel point in the n-th frame image and the second pixel point in the (n-1)-th background image, wherein the (n-1)-th background image is the background image corresponding to the (n-1)-th frame image, and the third pixel point corresponds to the second pixel point;
if the third color distance is greater than the color threshold corresponding to the second pixel point, updating the color threshold corresponding to the second pixel point to the third color distance and updating the color value of the second pixel point according to the color value of the third pixel point and the color value of the second pixel point; otherwise, keeping the color value of the second pixel point and the corresponding color threshold unchanged;
and if the third color distance corresponding to each pixel point in the n-th frame image is smaller than or equal to the corresponding color threshold, taking the (n-1)-th background image as the background image, wherein the color threshold corresponding to each pixel point in the background image is the color threshold corresponding to that pixel point in the (n-1)-th background image.
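The update rule of claim 4 can be sketched one frame at a time as below. The claim states that the background color is updated "according to" the old and new color values without fixing the combination, so this sketch assumes a simple average; convergence is the condition in the last clause, namely that no pixel of the n-th frame exceeds its (pre-update) threshold. All names are illustrative.

```python
import numpy as np

def update_background(frame_n, bg, thresholds):
    """One claim-4 iteration.

    frame_n: (H, W, 3) uint8 frame; bg: (H, W, 3) float background;
    thresholds: (H, W) float. Returns (new_bg, new_thresholds, converged).
    """
    f = frame_n.astype(np.float64)
    distance = np.sqrt(((f - bg) ** 2).sum(axis=2))
    exceeded = distance > thresholds
    # Where the threshold is exceeded, raise it to the observed distance.
    new_thresholds = np.where(exceeded, distance, thresholds)
    # Assumed combination rule: average the old and new color values.
    new_bg = np.where(exceeded[..., None], (bg + f) / 2.0, bg)
    # Converged when every distance is <= its pre-update threshold.
    return new_bg, new_thresholds, not exceeded.any()
```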
5. The method according to any one of claims 2-4, wherein the image to be segmented is the (n+1)-th frame image in the target video.
6. The method of claim 2, wherein, if the image to be segmented is the M-th frame image in the target video, where M > n+1, the color threshold corresponding to the second pixel point is determined according to a fourth color distance between the second pixel point and a sixth pixel point, wherein the sixth pixel point is the pixel point corresponding to the second pixel point in a third image, and the third image is the (M-1)-th frame image in the target video.
7. The method of claim 6, wherein the color threshold corresponding to the second pixel point is determined by:
determining whether the fourth color distance is greater than the color threshold corresponding to the second pixel point;
if yes, updating the color threshold value corresponding to the second pixel point to be the fourth color distance;
if not, leaving the color threshold corresponding to the second pixel point unchanged.
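For these later frames, claims 6 and 7 keep the per-pixel threshold monotonically non-decreasing: it is raised to the fourth color distance whenever that distance exceeds it, and is otherwise left unchanged. In NumPy this reduces to an element-wise maximum; the sketch below uses illustrative names.

```python
import numpy as np

def maintain_thresholds(thresholds, fourth_distance):
    # Raise each threshold to the observed fourth color distance if it was
    # exceeded; otherwise keep it unchanged (claims 6-7).
    return np.maximum(thresholds, fourth_distance)
```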
8. A matting apparatus, the apparatus comprising:
an acquisition unit, configured to acquire a background image and an image to be segmented for a static scene, wherein the background image does not include a target object, and pixel points in the background image have corresponding color thresholds;
a computing unit, configured to compute a first color distance between a first pixel point in the image to be segmented and a corresponding second pixel point in the background image, wherein the first pixel point is any pixel point in the image to be segmented;
a determining unit, configured to determine, when the first color distance is not smaller than the color threshold corresponding to the second pixel point, that the first pixel point is a target pixel point, where the target pixel point is a pixel point in the target object; to add a transparency channel to the first pixel point according to the first color distance, wherein a larger first color distance corresponds to a smaller transparency value added to the first pixel point; and to process the image to be segmented through a bidirectional anisotropic filtering algorithm according to the transparency channel corresponding to the first pixel point;
and a segmentation unit, configured to segment the target object, according to the target pixel point, from the image to be segmented as processed by the bidirectional anisotropic filtering algorithm; wherein, if the region corresponding to the target pixel point is the edge of the target object, the target object is segmented from the image to be segmented according to the edge determined by the target pixel point.
9. The apparatus of claim 8, wherein the background image is any one of the first N frames of images of the target video, or is obtained by processing at least part of the first N frames of images of the target video.
10. The apparatus of claim 9, wherein when the background image is any one of the first N frames of images of the target video, the color threshold corresponding to the second pixel point is determined by:
determining a second color distance between a third pixel point in each frame of image among the first N frames of images and the second pixel point, wherein the third pixel point corresponds to the second pixel point;
and determining a color threshold corresponding to the second pixel point according to the second color distance.
11. The apparatus of claim 9, wherein when the background image is obtained by processing at least part of the first N frames of images of the target video, the color value of the second pixel point and the color threshold corresponding to the second pixel point are determined by:
calculating a third color distance between a third pixel point in the n-th frame image and the second pixel point in the (n-1)-th background image, wherein the (n-1)-th background image is the background image corresponding to the (n-1)-th frame image, and the third pixel point corresponds to the second pixel point;
if the third color distance is greater than the color threshold corresponding to the second pixel point, updating the color threshold corresponding to the second pixel point to the third color distance and updating the color value of the second pixel point according to the color value of the third pixel point and the color value of the second pixel point; otherwise, keeping the color value of the second pixel point and the corresponding color threshold unchanged;
and if the third color distance corresponding to each pixel point in the n-th frame image is smaller than or equal to the corresponding color threshold, taking the (n-1)-th background image corresponding to the (n-1)-th frame image as the background image, wherein the color threshold corresponding to each pixel point in the background image is the color threshold corresponding to that pixel point in the (n-1)-th background image.
12. The apparatus according to any one of claims 9-11, wherein the image to be segmented is the (n+1)-th frame image in the target video.
13. The apparatus of claim 9, wherein, if the image to be segmented is the M-th frame image in the target video, where M > n+1, the color threshold corresponding to the second pixel point is determined according to a fourth color distance between the second pixel point and a sixth pixel point, wherein the sixth pixel point is the pixel point corresponding to the second pixel point in a third image, and the third image is the (M-1)-th frame image in the target video.
14. The apparatus of claim 13, wherein the color threshold corresponding to the second pixel point is determined by:
determining whether the fourth color distance is greater than the color threshold corresponding to the second pixel point;
if yes, updating the color threshold value corresponding to the second pixel point to be the fourth color distance;
if not, leaving the color threshold corresponding to the second pixel point unchanged.
15. A matting apparatus, the apparatus comprising a processor and a memory:
the memory is used for storing a computer program and transmitting the computer program to the processor;
the processor is configured to perform the matting method according to any one of claims 1 to 7 in accordance with the instructions in the computer program.
16. A computer readable storage medium storing a computer program for performing a matting method according to any one of claims 1 to 7.