Background
Virtual matting extracts a target object from an image of one scene and merges the extracted object with images of other scenes to generate a new image. Virtual matting is widely applied in various scenarios, such as recording teaching videos and shooting film and television productions.
At present, the main way of virtual matting is to build a specific background, such as a blue-green background, record a video of the target object against that background, and then matte the target object out of the background to combine it with the required scene images. However, this existing scheme needs a specific background to be constructed before matting can be performed, which increases both the complexity and the cost of matting.
Disclosure of Invention
In order to solve the above technical problem, the present application provides a matting method, apparatus, device, and storage medium, which can segment a target object in a static scene without building an additional background, thereby reducing the complexity of virtual matting as well as the matting cost.
The embodiment of the application discloses the following technical scheme:
in one aspect, an embodiment of the present application provides a matting method, where the method includes:
acquiring a background image and an image to be segmented for a static scene; the background image does not comprise a target object, and pixel points in the background image have corresponding color thresholds;
calculating a first color distance between a first pixel point in the image to be segmented and a corresponding second pixel point in the background image; the first pixel point is any pixel point in the image to be segmented;
when the first color distance is not smaller than the color threshold corresponding to the second pixel point, determining the first pixel point as a target pixel point, wherein the target pixel point is a pixel point in the target object;
and according to the target pixel points, the target object is segmented from the image to be segmented.
Optionally, the background image is any one of the first N frames of images of the target video, or the background image is obtained by processing at least part of the first N frames of images of the target video.
Optionally, when the background image is any one of the first N frames of images of the target video, the color threshold corresponding to the second pixel point is determined as follows:
determining a second color distance between a third pixel point in each frame of image in the previous N frames of images and the second pixel point, wherein the third pixel point corresponds to the second pixel point;
and determining a color threshold corresponding to the second pixel point according to the second color distance.
Optionally, when the background image is obtained by processing at least part of the first N frames of images of the target video, the color value of the second pixel point and the color threshold corresponding to the second pixel point are determined as follows:
calculating a third color distance between a third pixel point in the nth frame image and a second pixel point in the (n-1) th background image; the (n-1) th background image is a background image corresponding to the (n-1) th frame image, and the third pixel point corresponds to the second pixel point;
if the third color distance is larger than the color threshold corresponding to the second pixel point, updating the color threshold corresponding to the second pixel point to be the third color distance, and updating the color value of the second pixel point according to the color value of the third pixel point and the color value of the second pixel point, otherwise, keeping the color value of the second pixel point and the corresponding color threshold unchanged;
and if the third color distances corresponding to the pixel points in the n-th frame image are all less than or equal to the corresponding color thresholds, taking the (n-1)-th background image corresponding to the (n-1)-th frame image as the background image, wherein the color threshold corresponding to each pixel point in the background image is the color threshold corresponding to each pixel point in the (n-1)-th background image.
Optionally, the image to be segmented is the (N+1)-th frame image in the target video.
Optionally, if the image to be segmented is an M-th frame image in the target video, where M > N + 1, the color threshold corresponding to the second pixel point is determined according to a fourth color distance between the second pixel point and a sixth pixel point, wherein the sixth pixel point is a pixel point corresponding to the second pixel point in a third image, and the third image is the (M-1)-th frame image in the target video.
Optionally, the color threshold corresponding to the second pixel point is determined as follows:
determining whether the fourth color distance is greater than a color threshold corresponding to the second pixel point;
if so, updating the color threshold corresponding to the second pixel point to be the fourth color distance;
and if not, not updating the color threshold corresponding to the second pixel point.
Optionally, the method further includes:
adding a transparency channel to the first pixel point according to the first color distance, wherein the larger the first color distance is, the smaller the transparency channel value added to the first pixel point is;
processing the image to be segmented through a bidirectional anisotropic filtering algorithm according to the transparency channel corresponding to the first pixel point;
the segmenting the target object from the image to be segmented according to the target pixel point comprises:
and according to the target pixel points, segmenting the target object from the image to be segmented which is processed by the bidirectional anisotropic filtering algorithm.
In another aspect, an embodiment of the present application provides a matting device, where the matting device includes:
the acquisition unit is used for acquiring a background image and an image to be segmented for a static scene; the background image does not comprise a target object, and pixel points in the background image have corresponding color thresholds;
the computing unit is used for computing a first color distance between a first pixel point in the image to be segmented and a corresponding second pixel point in the background image; the first pixel point is any pixel point in the image to be segmented;
the determining unit is used for determining the first pixel point as a target pixel point when the first color distance is not smaller than the color threshold corresponding to the second pixel point, and the target pixel point is a pixel point in the target object;
and the segmentation unit is used for segmenting the target object from the image to be segmented according to the target pixel point.
Optionally, the background image is any one of the first N frames of images of the target video, or the background image is obtained by processing at least part of the first N frames of images of the target video.
Optionally, when the background image is any one of the first N frames of images of the target video, the color threshold corresponding to the second pixel point is determined as follows:
determining a second color distance between a third pixel point in each frame of image in the previous N frames of images and the second pixel point, wherein the third pixel point corresponds to the second pixel point;
and determining a color threshold corresponding to the second pixel point according to the second color distance.
Optionally, when the background image is obtained by processing at least part of the first N frames of images of the target video, the color value of the second pixel point and the color threshold corresponding to the second pixel point are determined as follows:
calculating a third color distance between a third pixel point in the nth frame image and a second pixel point in the (n-1) th background image; the (n-1) th background image is a background image corresponding to the (n-1) th frame image, and the third pixel point corresponds to the second pixel point;
if the third color distance is larger than the color threshold corresponding to the second pixel point, updating the color threshold corresponding to the second pixel point to be the third color distance, and updating the color value of the second pixel point according to the color value of the third pixel point and the color value of the second pixel point, otherwise, keeping the color value of the second pixel point and the corresponding color threshold unchanged;
and if the third color distances corresponding to the pixel points in the n-th frame image are all less than or equal to the corresponding color thresholds, taking the (n-1)-th background image corresponding to the (n-1)-th frame image as the background image, wherein the color threshold corresponding to each pixel point in the background image is the color threshold corresponding to each pixel point in the (n-1)-th background image.
Optionally, the image to be segmented is the (N+1)-th frame image in the target video.
Optionally, if the image to be segmented is an M-th frame image in the target video, where M > N + 1, the color threshold corresponding to the second pixel point is determined according to a fourth color distance between the second pixel point and a sixth pixel point, wherein the sixth pixel point is a pixel point corresponding to the second pixel point in a third image, and the third image is the (M-1)-th frame image in the target video.
Optionally, the color threshold corresponding to the second pixel point is determined as follows:
determining whether the fourth color distance is greater than a color threshold corresponding to the second pixel point;
if so, updating the color threshold corresponding to the second pixel point to be the fourth color distance;
and if not, not updating the color threshold corresponding to the second pixel point.
Optionally, the determining unit is further specifically configured to:
adding a transparency channel to the first pixel point according to the first color distance, wherein the larger the first color distance is, the smaller the transparency channel value added to the first pixel point is;
processing the image to be segmented through a bidirectional anisotropic filtering algorithm according to the transparency channel corresponding to the first pixel point;
and according to the target pixel points, segmenting the target object from the image to be segmented which is processed by the bidirectional anisotropic filtering algorithm.
In another aspect, an embodiment of the present application provides a matting device, where the matting device includes a processor and a memory:
the memory is used for storing a computer program and transmitting the computer program to the processor;
the processor is configured to perform the matting method according to any one of the above according to the instructions in the computer program.
In another aspect, an embodiment of the present application provides a computer-readable storage medium for storing a computer program, where the computer program is configured to execute the matting method described in any one of the above.
According to the technical scheme, the matting method comprises the following steps: acquiring a background image and an image to be segmented for a static scene, wherein the background image does not comprise a target object and pixel points in the background image have corresponding color thresholds; calculating a first color distance between a first pixel point in the image to be segmented and a corresponding second pixel point in the background image, the first pixel point being any pixel point in the image to be segmented; when the first color distance is not smaller than the color threshold corresponding to the second pixel point, determining the first pixel point as a target pixel point, the target pixel point being a pixel point in the target object; and segmenting the target object from the image to be segmented according to the target pixel points. The method can segment the target object in a static scene without additionally constructing a background, thereby reducing the complexity of virtual matting and reducing the matting cost.
Detailed Description
Embodiments of the present application are described below with reference to the accompanying drawings.
At present, the main mode of virtual matting is to build a blue-green background, record a video of the target object against that background, and then matte the target object out of the background. This scheme needs a background to be constructed, which increases both the complexity and the cost of matting.
Therefore, the present application provides a matting method that does not require an additional background to be built, which reduces the complexity of virtual matting and reduces the matting cost.
First, the execution body of the embodiments of the present application is described. The matting method provided by the present application can be executed by a data processing device, such as a terminal device or a server. The terminal device may be, for example, a smartphone, a computer, a personal digital assistant (PDA), or a tablet computer. The server may be a stand-alone server or a server in a cluster.
The matting method provided by the embodiment of the present application is described below, and referring to fig. 1, this figure shows a flowchart of a matting method provided by the embodiment of the present application, and as shown in fig. 1, the method includes:
s101: a background image and an image to be segmented for a static scene are acquired.
The target object to be matted in the embodiments of the present application may be any object, such as a human body (e.g., a host), a car, or an animal.
It should be noted that the target object in the present application may be in a static scene, and the static scene may be any static scene, such as an office or a living room. No specific background, such as a blue-green background, is built in the static scene. Of course, the scheme of the present application has a wide application scope and is not limited to a specific background: it applies to backgrounds other than a specific one, and if the static scene is itself a constructed specific background such as a blue-green background, the scheme of the present application is also applicable.
In the embodiments of the present application, when the shooting device shoots a video of a static scene, the target object has not yet entered the scene in the initial stage of shooting; it enters the static scene after a certain time and, once there, is not static but may move freely. Thus, in the video captured by the shooting device (denoted the target video for convenience of description), the first several frame images (denoted the first N frame images) contain only the static scene and no target object.
The background image in the embodiment of the application is obtained based on the image which does not contain the target object in the target video, and the image to be segmented is the image which contains the target object in the target video.
In addition, the pixel points in the background image have corresponding color thresholds. The color threshold can be used to distinguish whether a pixel point in the image to be segmented belongs to the background. The manner of distinguishing whether the pixel points in the image to be segmented belong to the background by applying the color threshold will be described in detail later.
S102: and calculating a first color distance between a first pixel point in the image to be segmented and a corresponding second pixel point in the background image.
In the embodiments of the present application, a pixel point in the image to be segmented may correspond to a pixel point in the background image. Corresponding pixel points in the two images can be understood as depicting the same position of the real background scene as captured by the shooting device.
For example, referring to fig. 2, which shows a schematic diagram of a background image and an image to be segmented provided by an embodiment of the present application, as shown in fig. 2, the background image shows a wall surface, and the image to be segmented shows that a person is in front of the wall surface, where a pixel point 201 in the background image corresponds to a pixel point 202 in the image to be segmented, and both real background scenes corresponding to the two pixel points are at the same position on the wall surface.
In the embodiment of the application, the color distance between the pixel point in the image to be segmented and the corresponding pixel point in the background image can be calculated. Taking a first pixel point (any pixel point in the image to be segmented) in the image to be segmented as an example, a color distance between the first pixel point and a corresponding second pixel point in the background image can be calculated and recorded as a first color distance.
The method for calculating the first color distance between the first pixel point and the corresponding second pixel point in the background image may be:
Let the color value of the first pixel point be (R1, G1, B1) and the color value of the second pixel point be (R2, G2, B2). The first color distance between the two is

d = sqrt((R1 - R2)^2 + (G1 - G2)^2 + (B1 - B2)^2)

In addition, for convenience of calculation, the color values of the first pixel point and the second pixel point can first be normalized, and the first color distance d then calculated. The normalization is FR = R/255, FG = G/255, FB = B/255, so that the color value of the first pixel point is (FR1, FG1, FB1), the color value of the second pixel point is (FR2, FG2, FB2), and the first color distance between the two is

d = sqrt((FR1 - FR2)^2 + (FG1 - FG2)^2 + (FB1 - FB2)^2)
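For illustration, the color distance calculation above can be sketched as follows (a minimal Python sketch; the function name `color_distance` and the NumPy-based implementation are illustrative assumptions, not part of the disclosed scheme):

```python
import numpy as np

def color_distance(p1, p2, normalize=True):
    """Euclidean distance between two RGB color values.

    p1, p2: (R, G, B) triples with channel values in 0..255.
    If normalize is True, channels are first scaled to 0..1
    (FR = R / 255, etc.), matching the normalized form above.
    """
    a = np.asarray(p1, dtype=np.float64)
    b = np.asarray(p2, dtype=np.float64)
    if normalize:
        a, b = a / 255.0, b / 255.0
    # d = sqrt(sum of squared per-channel differences)
    return float(np.sqrt(np.sum((a - b) ** 2)))
```

For example, `color_distance((255, 0, 0), (0, 0, 0), normalize=False)` yields 255.0, while the normalized distance between pure white and pure black is sqrt(3).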
S103: and when the first color distance is not smaller than the color threshold corresponding to the second pixel point, determining the first pixel point as a target pixel point.
The target pixel point may refer to a pixel point in the target object.
The second pixel point has a corresponding color threshold, which can be used to judge whether the first pixel point belongs to the target object: when the first color distance between the first pixel point and the second pixel point is greater than or equal to the color threshold of the second pixel point, the first pixel point can be determined to be a pixel point in the target object; when the first color distance is less than the color threshold, the first pixel point can be determined not to be a pixel point in the target object.
Therefore, in the embodiments of the present application, when it is determined that the first color distance is not less than the color threshold corresponding to the second pixel point, the first pixel point is determined to be a target pixel point.
S104: and according to the target pixel points, the target object is segmented from the image to be segmented.
After it has been determined, by the above method, whether each pixel point in the image to be segmented is a target pixel point, the target object can be extracted from the image to be segmented according to all the target pixel points in it.
It should be noted that, if the region corresponding to the target pixel point is the edge of the target object, the target object may be segmented from the image to be segmented according to the edge determined by the target pixel point.
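Steps S102 to S104 can be sketched in Python as follows (a minimal vectorized sketch; the function name `segment_target` and the convention of zeroing out background pixels are illustrative assumptions, not part of the disclosed scheme):

```python
import numpy as np

def segment_target(image, background, thresholds):
    """Segment the target object from `image` by per-pixel color distance.

    image, background: H x W x 3 float arrays (normalized color values).
    thresholds: H x W array of per-pixel color thresholds.
    Returns a boolean H x W mask (True = target pixel point) and the
    matted image with all background pixels set to zero.
    """
    # First color distance between each pixel and its background counterpart.
    dist = np.sqrt(np.sum((image - background) ** 2, axis=-1))
    # A pixel is a target pixel point when the distance is not smaller
    # than the color threshold of the corresponding background pixel.
    mask = dist >= thresholds
    matted = np.where(mask[..., None], image, 0.0)
    return mask, matted
```

In practice the mask (or the edge it implies) is what drives the matting; zeroing out the background here is just one simple way to visualize the result.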
According to the technical scheme, the matting method comprises the following steps: acquiring a background image and an image to be segmented for a static scene, wherein the background image does not comprise a target object and pixel points in the background image have corresponding color thresholds; calculating a first color distance between a first pixel point in the image to be segmented and a corresponding second pixel point in the background image, the first pixel point being any pixel point in the image to be segmented; when the first color distance is not smaller than the color threshold corresponding to the second pixel point, determining the first pixel point as a target pixel point, the target pixel point being a pixel point in the target object; and segmenting the target object from the image to be segmented according to the target pixel points. The method can segment the target object in a static scene without additionally constructing a background, thereby reducing the complexity of virtual matting and reducing the matting cost.
Next, a method for determining a color threshold of a pixel point in a background image is described.
In one possible implementation, the background image is determined according to the first N frames of images in the target video, where none of the first N frames of images in the target video includes the target object.
In an alternative implementation, the background image may be any one of the first N frames of images of the target video. For example, the background image may be the first frame image in the target video. Thus, referring to fig. 3a, which shows a flowchart of a color threshold determination method provided in an embodiment of the present application, as shown in fig. 3a, a color threshold corresponding to a second pixel point in a background image may be determined as follows:
s301: and determining a second color distance between a third pixel point in each image frame in the previous N image frames and the second pixel point.
The pixel points corresponding to the second pixel point in the images other than the background image among the first N frames are denoted third pixel points. For each of those images, the color distance between its third pixel point and the second pixel point can be determined and denoted a second color distance. In this way, N-1 second color distances can be determined.
S302: and determining a color threshold corresponding to the second pixel point according to the second color distance.
In a specific implementation, the average value of the N-1 second color distances may be used as a color threshold corresponding to the second pixel point, and the color threshold is also a dithering color threshold of the second pixel point.
By the method, the color threshold of the pixel point in the background image is determined.
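Steps S301 and S302 can be sketched in Python as follows (a minimal sketch; the function name `thresholds_from_frames` and the mean-of-distances rule stated above are the only assumptions beyond standard NumPy):

```python
import numpy as np

def thresholds_from_frames(background, frames):
    """Per-pixel color thresholds from the first N frames (S301-S302).

    background: H x W x 3 array, one of the first N frames (e.g. frame 1).
    frames: list of the remaining N-1 frames, same shape and normalization.
    The threshold of each pixel is the mean of its second color distances
    to the corresponding pixels of the other N-1 frames.
    """
    dists = [np.sqrt(np.sum((f - background) ** 2, axis=-1)) for f in frames]
    return np.mean(dists, axis=0)   # H x W array of thresholds
```

Averaging the per-frame distances captures the per-pixel "jitter" of the static scene, so the threshold is larger for noisier pixels and smaller for stable ones.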
In another alternative implementation, the background image is obtained by processing at least part of the first N frames of the target video. Specifically, the background image corresponding to the (n-1)-th frame image (denoted the (n-1)-th background image for convenience of description) may be updated with the n-th frame image to obtain the background image corresponding to the n-th frame image (denoted the n-th background image), and this is repeated until an update end condition is satisfied; the updated background image obtained when the condition is satisfied is used as the background image for detecting the target object, where n is a positive integer greater than 1 and less than or equal to N. Thus, referring to fig. 3b, which shows a flowchart of a method for determining a background image and a color threshold provided by an embodiment of the present application, the method may include:
s311, determining the nth frame image in the previous N frame images as a first image, and determining the (N-1) th frame image in the previous N frame images as a second image;
s312: calculating a third color distance between a fourth pixel point in the first image and a fifth pixel point in the first background image, wherein the third color distance can also be called a third color distance corresponding to the fourth pixel point in the first image; the first background image is a background image corresponding to the second image; the fourth pixel point is any pixel point in the first image; and the fifth pixel point is a pixel point in the first background image corresponding to the fourth pixel point in the first image.
The 1st background image corresponding to the 1st frame image (i.e., the first frame among the first N frame images, which is also the first frame of the target video) is the 1st frame image itself, and the color threshold corresponding to each pixel point in the 1st background image is a preset value, which may be 0, for example.
S313: and if the third color distance corresponding to each pixel point in the first image is smaller than or equal to the color threshold corresponding to the fifth pixel point, taking the first background image as a background image for detecting the target object, wherein the color threshold corresponding to the fourth pixel point in the first background image is the color threshold corresponding to the pixel point in the background image for detecting the target object, and ending the process. Otherwise, the process proceeds to step S314.
S314: updating the first background image based on the third color distance: and if the third color distance is greater than the color threshold corresponding to the fifth pixel point, updating the color threshold corresponding to the fifth pixel point to be the third color distance, and updating the color value of the fifth pixel point according to the color value of the fourth pixel point and the color value of the fifth pixel point, otherwise, keeping the color value of the fifth pixel point and the corresponding color threshold unchanged.
The color value of the fifth pixel point can be updated, for example, by averaging the two color values:

(R'd,x,y, G'd,x,y, B'd,x,y) = ((R'd,x,y + R'n,x,y)/2, (G'd,x,y + G'n,x,y)/2, (B'd,x,y + B'n,x,y)/2)

wherein (R'd,x,y, G'd,x,y, B'd,x,y) on the left of the equal sign is the updated color value of the fifth pixel point (i.e., its color value in the background image corresponding to the first image), (R'd,x,y, G'd,x,y, B'd,x,y) on the right of the equal sign is its color value before the update (i.e., its color value in the first background image), and (R'n,x,y, G'n,x,y, B'n,x,y) is the color value of the fourth pixel point in the first image. It should be noted that the color values involved in the formula are normalized color values; in an optional embodiment, they may also be non-normalized color values.
After all the pixel points of the first image are calculated, the background image corresponding to the first image can be obtained.
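One update step S314 can be sketched in Python as follows (a minimal sketch; the function name `update_background` is illustrative, and averaging the old and new color values is an assumed combination rule, since the exact formula is implementation-specific):

```python
import numpy as np

def update_background(bg_color, bg_thresh, frame):
    """One step of updating the previous background image with a new frame.

    bg_color: H x W x 3 array, color values of the current background image.
    bg_thresh: H x W array, per-pixel color thresholds.
    frame: H x W x 3 array, the incoming frame (same normalization).
    Returns (new_colors, new_thresholds, changed), where `changed` is True
    if any pixel exceeded its threshold (i.e. the update must continue).
    """
    # Third color distance between each frame pixel and its background pixel.
    dist = np.sqrt(np.sum((frame - bg_color) ** 2, axis=-1))
    exceeds = dist > bg_thresh
    # Where the distance exceeds the threshold: raise the threshold to the
    # distance and refresh the color; elsewhere keep both unchanged.
    new_thresh = np.where(exceeds, dist, bg_thresh)
    new_color = np.where(exceeds[..., None],
                         (bg_color + frame) / 2.0,  # assumed averaging rule
                         bg_color)
    return new_color, new_thresh, bool(exceeds.any())
```

When `changed` is False, every third color distance was within its threshold, which is exactly the condition of step S313 for accepting the current background image.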
S315: and determining the (N + 1) th frame image in the first N frame images as a first image, determining the nth frame image in the first N frame images as a second image, and returning to execute the step S312 and the subsequent steps.
Optionally, if the first image is the last image among the first N frame images (that is, the first image is the N-th frame image of the target video), and there exist first-type pixel points in the first image whose third color distances are greater than the color thresholds of the corresponding pixel points in the (N-1)-th background image, then the background image corresponding to the first image is used as the background image for detecting the target object, and the color threshold corresponding to the fourth pixel point in that background image is the color threshold corresponding to the pixel point in the background image for detecting the target object.
Specifically, the third color distance between the fourth pixel point in the 2nd frame image and the corresponding fifth pixel point in the 1st background image may be calculated first. The fourth pixel point is any pixel point in the 2nd frame image; the fifth pixel point is the pixel point in the 1st background image corresponding to the fourth pixel point in the 2nd frame image.
If the third color distance between the fourth pixel point in the 2nd frame image and the corresponding fifth pixel point in the 1st background image is greater than the color threshold corresponding to the fifth pixel point, the color threshold corresponding to the fifth pixel point is updated to that third color distance, and the color value of the fifth pixel point in the 1st background image is updated according to the color value of the fourth pixel point in the 2nd frame image and the color value of the fifth pixel point in the 1st background image; otherwise, the color value of the fifth pixel point in the 1st background image and the corresponding color threshold are kept unchanged. After all the pixel points of the 2nd frame image have been processed, the 2nd background image corresponding to the 2nd frame image is obtained.
Then the third color distance between the fourth pixel point in the 3rd frame image and the corresponding fifth pixel point in the 2nd background image is calculated. The fourth pixel point is any pixel point in the 3rd frame image; the fifth pixel point is the pixel point in the 2nd background image corresponding to the fourth pixel point in the 3rd frame image.
If the third color distance between the fourth pixel point in the 3rd frame image and the corresponding fifth pixel point in the 2nd background image is greater than the color threshold corresponding to the fifth pixel point, the color threshold corresponding to the fifth pixel point is updated to that third color distance, and the color value of the fifth pixel point in the 2nd background image is updated according to the color value of the fourth pixel point in the 3rd frame image and the color value of the fifth pixel point in the 2nd background image; otherwise, the color value of the fifth pixel point in the 2nd background image and the corresponding color threshold are kept unchanged. After all the pixel points of the 3rd frame image have been processed, the 3rd background image corresponding to the 3rd frame image is obtained.
Then, the third color distance between the fourth pixel point in the 4th frame image and the corresponding fifth pixel point in the 3rd background image is calculated, and so on.
And if the third color distances corresponding to the pixel points in the 2nd frame image are all less than or equal to the color thresholds of the corresponding pixel points in the 1st background image, the 1st background image corresponding to the 1st frame image is used as the background image for detecting the target object, and the color threshold corresponding to the fourth pixel point in the 1st background image is the color threshold corresponding to the pixel point in the background image for detecting the target object.
Similarly, if the third color distances corresponding to the pixel points in the 3rd frame image are all less than or equal to the color thresholds of the corresponding pixel points in the 2nd background image, the 2nd background image corresponding to the 2nd frame image is used as the background image for detecting the target object, and the color threshold corresponding to the fourth pixel point in the 2nd background image is the color threshold corresponding to the pixel point in the background image for detecting the target object.
And so on.
Once the background image for detecting the target object has been determined, the process of determining the background image and the color thresholds ends.
Based on the manner of determining the color value and the color threshold of the background image shown in fig. 3b, and as shown in fig. 3c, the color value and the color threshold of the second pixel point in the background image may be determined as follows:
s321: and calculating a third color distance between a third pixel point in the nth frame image and a second pixel point in the (n-1) th background image. And the third pixel point is a pixel point corresponding to the second pixel point in the n-1 th background image in the nth frame image.
The background image corresponding to the 1 st frame image (i.e. the first frame image in the previous N frame images) is the 1 st frame image.
S322: if the third color distance is greater than the color threshold corresponding to the second pixel point, update the color threshold corresponding to the second pixel point to the third color distance, and update the color value of the second pixel point according to the color value of the third pixel point and the color value of the second pixel point; otherwise, keep the color value of the second pixel point and its corresponding color threshold unchanged.
After all pixel points of the nth frame image have been processed, the nth background image corresponding to the nth frame image is obtained. At this point, the nth background image is not necessarily the background image for detecting the target object. The third color distance between the third pixel point in the (n+1)th frame image and the second pixel point in the nth background image must still be calculated, and whether the nth background image is the background image for detecting the target object is then judged according to that distance. The judgment principle is the same as in step S323 (namely, judging whether the (n-1)th background image is the background image for detecting the target object according to the third color distance between the third pixel point in the nth frame image and the second pixel point in the (n-1)th background image), and is not repeated here.
S323: if the third color distance corresponding to every pixel point in the nth frame image is less than or equal to the color threshold corresponding to that pixel point, take the (n-1)th background image corresponding to the (n-1)th frame image as the background image for detecting the target object. That is, the color value of the second pixel point in the (n-1)th background image is the color value of the second pixel point in the background image for detecting the target object, and the color threshold corresponding to the second pixel point in the (n-1)th background image is the color threshold corresponding to the second pixel point in the background image for detecting the target object.
Optionally, if the nth frame image is the last frame of the first N frames and some pixel point in the nth frame image still has a third color distance greater than the color threshold corresponding to that pixel point, the nth background image corresponding to the nth frame image is taken as the background image for detecting the target object, and the color threshold corresponding to each pixel point in the nth background image is the color threshold corresponding to that pixel point in the background image for detecting the target object.
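The iteration of steps S321 to S323 can be sketched as follows. This is an illustrative sketch only: the Euclidean RGB distance, the 0.5 blending weight, and the initial threshold value are assumptions, since the patent does not fix a concrete color distance metric, blending rule, or starting threshold.

```python
import numpy as np

def determine_background(frames, initial_threshold=10.0):
    """Sketch of steps S321-S323: derive a background image and per-pixel
    color thresholds from the first N frames of a static-scene video.

    `frames` is a list of float arrays of shape (H, W, 3). The 0.5 blend
    weight and `initial_threshold` are illustrative assumptions.
    """
    background = frames[0].astype(np.float64)  # 1st background image = 1st frame
    thresholds = np.full(frames[0].shape[:2], initial_threshold)

    for n in range(1, len(frames)):
        frame = frames[n].astype(np.float64)
        # Third color distance: per-pixel Euclidean distance in color space.
        dist = np.linalg.norm(frame - background, axis=2)
        exceeded = dist > thresholds
        if not exceeded.any():
            # S323: every distance is within its threshold, so the previous
            # background image serves as the background for detection.
            break
        # S322: raise the threshold to the observed distance, and blend the
        # background color toward the new frame where the threshold was exceeded.
        thresholds[exceeded] = dist[exceeded]
        background[exceeded] = 0.5 * (background[exceeded] + frame[exceeded])
    return background, thresholds
```

A pixel disturbed in one early frame (e.g., by sensor noise) thus permanently widens its own threshold, which is how the scheme absorbs per-pixel noise in a static scene.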
In one possible implementation manner, the image to be segmented is the (N+1)th frame image in the target video. That is, only when the image to be segmented is the (N+1)th frame image in the target video can object matting be performed according to the color threshold determined in the manner of fig. 3a, 3b, or 3c above.
Next, a method for determining a color threshold of a pixel point in a background image according to another embodiment of the present application is described.
In a possible implementation manner, the background image is any one of the first N frames of the target video, for example, the first frame image in the target video; alternatively, the background image is obtained by processing at least part of the first N frames of the target video (see the foregoing embodiment for the specific manner of obtaining it). The first N frames of the target video do not include the target object.
Therefore, if the image to be segmented is the Mth frame image in the target video, where M > N+1, the color threshold corresponding to the second pixel point is determined according to a fourth color distance between the second pixel point and a sixth pixel point. The sixth pixel point is the pixel point corresponding to the second pixel point in a third image, where the third image is the (M-1)th frame image in the target video.
That is, the color threshold of each pixel point in the background image can be updated in real time. Taking the color threshold update of a second pixel point in the background image as an example: after target matting is performed on the (M-1)th frame image in the target video, the sixth pixel point corresponding to the second pixel point in that frame can be determined, and the color distance between the sixth pixel point and the second pixel point, namely the fourth color distance, is computed. The color threshold of the second pixel point is then updated according to the fourth color distance, and the updated color thresholds of the background image are applied to the Mth frame image to perform object matting of the Mth frame image.
In a possible implementation manner, referring to fig. 4, which shows a flowchart of another color threshold determination method provided in an embodiment of the present application, as shown in fig. 4, a color threshold corresponding to a second pixel point in a background image may be determined by:
s401: determining whether the fourth color distance is greater than a color threshold corresponding to the second pixel point. If so, go to step S402, otherwise, go to step S403.
S402: and updating the color threshold corresponding to the second pixel point to be the fourth color distance.
S403: and not updating the color threshold corresponding to the second pixel point.
That is, if the fourth color distance is greater than the color threshold corresponding to the second pixel point, the color threshold is updated to the fourth color distance; the fourth color distance then becomes the updated color threshold of the second pixel point.
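Steps S401 to S403 amount to a per-pixel maximum over the stored threshold and the latest fourth color distance. The sketch below assumes Euclidean RGB distance and NumPy array representations; both are illustrative choices not fixed by the patent.

```python
import numpy as np

def update_threshold(background, thresholds, prev_frame):
    """Sketch of S401-S403: after matting the (M-1)th frame, update each
    pixel's color threshold using the fourth color distance between that
    frame and the background image.

    `background` and `prev_frame` are (H, W, 3) float arrays; `thresholds`
    is (H, W). Names and the distance metric are illustrative assumptions.
    """
    # Fourth color distance between the (M-1)th frame and the background.
    dist = np.linalg.norm(prev_frame.astype(np.float64)
                          - background.astype(np.float64), axis=2)
    # S402: where the distance exceeds the stored threshold, adopt it as the
    # new threshold; S403: elsewhere the threshold is left unchanged.
    return np.maximum(thresholds, dist)
```

The returned array would then be used as the color thresholds when matting the Mth frame.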
In an actual scene, when a target object needs to be segmented from an image in a target video, the background image and the color thresholds corresponding to the pixel points in the background image can be determined from the first N frames. When target matting is needed for the (N+1)th frame image in the target video, it can be performed according to the color thresholds determined in the manner of fig. 3a, 3b, or 3c above. When target matting is needed for any frame image after the (N+1)th frame in the target video, it can be performed according to the color thresholds updated in real time by the method of S401-S403 above.
In the embodiment of the present application, after S103, referring to fig. 5, which illustrates a flowchart of a method for segmenting a target object according to an embodiment of the present application, as shown in fig. 5, the method may further include:
S501: add a transparency channel to the first pixel point according to the first color distance.
The larger the first color distance, the smaller the transparency value added for the first pixel point; the smaller the first color distance, the larger the transparency value added for the first pixel point.
S502: and processing the image to be segmented through a bidirectional anisotropic filtering algorithm according to the transparency channel corresponding to the first pixel point.
The bidirectional anisotropic filtering algorithm may be an algorithm for smoothing an image; here, the image to be segmented is processed with respect to its transparency channel. Thus, for pixel points with high transparency in the image to be segmented, the transparency after processing remains higher, and for pixel points with low transparency, the transparency after processing remains lower. That is, in the processed image to be segmented, the transparency of the pixel points in the background area is higher and the transparency of the target pixel points is lower, which facilitates the subsequent segmentation of the target object.
Then, in the above S104, the method for segmenting the target object from the image to be segmented according to the target pixel point may include:
s503: and according to the target pixel points, segmenting the target object from the image to be segmented which is processed by the bidirectional anisotropic filtering algorithm.
The method utilizes a bidirectional anisotropic filtering algorithm to carry out noise filtering on the picture of the target object in the image to be segmented, so that the picture of the target object in the image to be segmented is clearer, and a smoother target object area is obtained.
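Steps S501 to S503 can be sketched as follows. Note the hedges: the distance-to-alpha mapping (a clipped linear ramp with a `softness` parameter) and the 3x3 mean smoothing are illustrative stand-ins, since the patent specifies neither the exact mapping nor the internals of the bidirectional anisotropic filtering step. Alpha is expressed here as opacity, so a larger first color distance gives a larger alpha, i.e., lower transparency, consistent with the text.

```python
import numpy as np

def matte_with_alpha(image, background, thresholds, softness=8.0):
    """Sketch of S501-S503: derive a per-pixel alpha (opacity) channel from
    the first color distance, smooth it, and attach it to the image.

    Arrays are floats: images (H, W, 3), thresholds (H, W). The ramp and
    the 3x3 mean filter are illustrative placeholders.
    """
    dist = np.linalg.norm(image.astype(np.float64)
                          - background.astype(np.float64), axis=2)
    # S501: larger distance -> more opaque (lower transparency) foreground.
    alpha = np.clip((dist - thresholds) / softness, 0.0, 1.0)

    # S502: smooth the alpha channel (3x3 mean as a placeholder for the
    # bidirectional anisotropic filtering step).
    h, w = alpha.shape
    padded = np.pad(alpha, 1, mode="edge")
    smoothed = sum(padded[i:i + h, j:j + w]
                   for i in range(3) for j in range(3)) / 9.0

    # S503: the alpha channel makes background pixels transparent, so the
    # RGBA result retains only the target object when composited.
    rgba = np.dstack([image.astype(np.float64), smoothed * 255.0])
    return rgba
```

Compositing such an RGBA result over any new scene image then yields the merged image described in the Background section.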
Based on the matting method provided above, an embodiment of the present application further provides a matting device, see fig. 6, which shows a schematic diagram of a matting device provided in an embodiment of the present application, where the matting device includes:
an obtaining unit 601, configured to obtain a background image and an image to be segmented for a still scene; the background image does not comprise a target object, and pixel points in the background image have corresponding color thresholds;
a calculating unit 602, configured to calculate a first color distance between a first pixel point in the image to be segmented and a corresponding second pixel point in the background image; the first pixel point is any pixel point in the image to be segmented;
a determining unit 603, configured to determine, when the first color distance is not smaller than a color threshold corresponding to the second pixel point, that the first pixel point is a target pixel point, where the target pixel point is a pixel point in the target object;
a segmenting unit 604, configured to segment the target object from the image to be segmented according to the target pixel point.
In a possible implementation manner, the background image is any one of the first N frames of images of the target video, or the background image is obtained by processing at least part of the first N frames of images of the target video.
In a possible implementation manner, when the background image is any one of the first N frames of images of the target video, the color threshold corresponding to the second pixel point is determined as follows:
determining a second color distance between a third pixel point in each frame of image in the previous N frames of images and the second pixel point, wherein the third pixel point corresponds to the second pixel point;
and determining a color threshold corresponding to the second pixel point according to the second color distance.
In a possible implementation manner, when the background image is obtained by processing at least a partial image in the first N frames of images of the target video, the color value of the second pixel point and the color threshold corresponding to the second pixel point are determined as follows:
calculating a third color distance between a third pixel point in the nth frame image and a second pixel point in the (n-1)th background image; the (n-1)th background image is a background image corresponding to the (n-1)th frame image, and the third pixel point corresponds to the second pixel point;
if the third color distance is larger than the color threshold corresponding to the second pixel point, updating the color threshold corresponding to the second pixel point to be the third color distance, and updating the color value of the second pixel point according to the color value of the third pixel point and the color value of the second pixel point, otherwise, keeping the color value of the second pixel point and the corresponding color threshold unchanged;
and if the third color distances corresponding to the pixel points in the nth frame image are all smaller than or equal to the corresponding color thresholds, taking the (n-1)th background image corresponding to the (n-1)th frame image as the background image, where the color threshold corresponding to each pixel point in the background image is the color threshold corresponding to that pixel point in the (n-1)th background image.
In a possible implementation manner, the image to be segmented is an N +1 th frame image in the target video.
In a possible implementation manner, if the image to be segmented is the Mth frame image in the target video, where M > N+1, the color threshold corresponding to the second pixel point is determined according to a fourth color distance between the second pixel point and a sixth pixel point, where the sixth pixel point is the pixel point corresponding to the second pixel point in a third image, and the third image is the (M-1)th frame image in the target video.
In a possible implementation manner, the color threshold corresponding to the second pixel point is determined by:
determining whether the fourth color distance is greater than a color threshold corresponding to the second pixel point;
if so, updating the color threshold corresponding to the second pixel point to be the fourth color distance;
and if not, not updating the color threshold corresponding to the second pixel point.
In a possible implementation manner, the determining unit 603 is further specifically configured to:
adding a transparency channel to the first pixel point according to the first color distance, wherein the transparency channel added to the first pixel point is smaller when the first color distance is larger;
processing the image to be segmented through a bidirectional anisotropic filtering algorithm according to the transparency channel corresponding to the first pixel point;
and according to the target pixel points, segmenting the target object from the image to be segmented which is processed by the bidirectional anisotropic filtering algorithm.
According to the technical scheme, the matting method comprises the following steps: acquiring a background image and an image to be segmented aiming at a static scene; the background image does not comprise a target object, and pixel points in the background image have corresponding color thresholds; calculating a first color distance between a first pixel point in the image to be segmented and a corresponding second pixel point in the background image; the first pixel point is any pixel point in the image to be segmented; when the first color distance is not smaller than the color threshold corresponding to the second pixel point, determining the first pixel point as a target pixel point, wherein the target pixel point is a pixel point in the target object; and according to the target pixel points, the target object is segmented from the image to be segmented. The method can realize the segmentation of the target object under the static scene without additionally constructing a background, thereby reducing the complexity of virtual matting and reducing the matting cost.
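The core decision rule summarized above can be sketched in a few lines. The Euclidean RGB distance is an assumption (the claims leave the color distance metric open), and zeroing out background pixels is one illustrative way to realize "segmenting the target object".

```python
import numpy as np

def matte(image, background, thresholds):
    """Minimal sketch of the claimed method: a pixel in the image to be
    segmented is a target pixel when its first color distance to the
    corresponding background pixel is not smaller than that pixel's color
    threshold. Arrays are floats: images (H, W, 3), thresholds (H, W).
    """
    # First color distance between first pixel points and second pixel points.
    dist = np.linalg.norm(image.astype(np.float64)
                          - background.astype(np.float64), axis=2)
    target_mask = dist >= thresholds  # target pixel points
    # Segment the target object: suppress everything judged as background.
    segmented = np.where(target_mask[..., None], image, 0)
    return target_mask, segmented
```

Because the thresholds are per-pixel, a noisy region of the static background can carry a wide threshold without loosening the test elsewhere in the frame.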
An embodiment of the present application further provides a device, which may specifically be a server. Fig. 7 is a schematic structural diagram of the server provided in the embodiment of the present application. The server 600 may vary considerably in configuration or performance, and may include one or more central processing units (CPUs) 622 (e.g., one or more processors), a memory 632, and one or more storage media 630 (e.g., one or more mass storage devices) storing an application program 642 or data 644. The memory 632 and the storage medium 630 may be transient or persistent storage. The program stored in the storage medium 630 may include one or more modules (not shown), each of which may include a series of instruction operations for the server. Still further, the central processor 622 may be configured to communicate with the storage medium 630 and execute the series of instruction operations in the storage medium 630 on the server 600.
The server 600 may also include one or more power supplies 626, one or more wired or wireless network interfaces 650, one or more input/output interfaces 658, and/or one or more operating systems 641, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so forth.
The steps performed by the server in the above embodiments may be based on the server structure shown in fig. 7.
The CPU 622 is configured to execute the following steps:
acquiring a background image and an image to be segmented aiming at a static scene; the background image does not comprise a target object, and pixel points in the background image have corresponding color thresholds;
calculating a first color distance between a first pixel point in the image to be segmented and a corresponding second pixel point in the background image; the first pixel point is any pixel point in the image to be segmented;
when the first color distance is not smaller than the color threshold corresponding to the second pixel point, determining the first pixel point as a target pixel point, wherein the target pixel point is a pixel point in the target object;
and according to the target pixel points, the target object is segmented from the image to be segmented.
Optionally, the CPU622 is further configured to execute steps of any implementation manner of the matting method provided in the embodiment of the present application.
The embodiment of the present application further provides another device, which may specifically be a terminal, such as a desktop or a notebook computer. As shown in fig. 8, for convenience of description, only the parts related to the embodiment of the present application are shown; for specific technical details not disclosed, please refer to the method part of the embodiment of the present application. The following takes a desktop as an example of the terminal:
Fig. 8 is a block diagram illustrating the structure of the desktop part related to the terminal provided in an embodiment of the present application. Referring to fig. 8, the desktop includes: a radio frequency (RF) circuit 710, a memory 720, an input unit 730, a display unit 740, a sensor 750, an audio circuit 760, a wireless fidelity (WiFi) module 770, a processor 780, and a power supply 790. Those skilled in the art will appreciate that the desktop architecture shown in fig. 8 does not constitute a limitation on the desktop, which may include more or fewer components than shown, combine some components, or use a different arrangement of components.
The following specifically describes each constituent component of the desktop with reference to fig. 8:
the RF circuit 710 may be used for receiving and transmitting signals during a message or call. In general, RF circuit 710 includes, but is not limited to, an antenna, at least one Amplifier, a transceiver, a coupler, a Low Noise Amplifier (Low Noise Amplifier; LNA), a duplexer, and the like. In addition, the RF circuit 710 may also communicate with networks and other devices via wireless communication.
The memory 720 may be used to store software programs and modules, and the processor 780 may execute various functional applications of the desktop and data processing by operating the software programs and modules stored in the memory 720. The memory 720 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data) created according to the use of the desktop, and the like. Further, the memory 720 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.
The input unit 730 may be used to receive input numeric or character information and generate key signal inputs related to user setting and function control of the desktop. Specifically, the input unit 730 may include a touch panel 731 and other input devices 732. The touch panel 731, also referred to as a touch screen, can collect touch operations of a user (e.g. operations of the user on or near the touch panel 731 by using any suitable object or accessory such as a finger, a stylus, etc.) and drive the corresponding connection device according to a preset program. Alternatively, the touch panel 731 may include two portions of a touch detection device and a touch controller. The touch detection device detects the touch direction of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch sensing device, converts it to touch point coordinates, and sends the touch point coordinates to the processor 780, and can receive and execute commands from the processor 780. In addition, the touch panel 731 may be implemented by various types, such as a resistive type, a capacitive type, an infrared ray, and a surface acoustic wave. The input unit 730 may include other input devices 732 in addition to the touch panel 731. In particular, other input devices 732 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like.
The display unit 740 may be used to display information input by the user or information provided to the user and various menus of the desktop. The display unit 740 may include a display panel 741; optionally, the display panel 741 may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED), or the like. Further, the touch panel 731 can cover the display panel 741; when the touch panel 731 detects a touch operation on or near it, the touch operation is transmitted to the processor 780 to determine the type of the touch event, and the processor 780 then provides a corresponding visual output on the display panel 741 according to the type of the touch event. Although in fig. 8 the touch panel 731 and the display panel 741 are two independent components implementing the input and output functions of the desktop, in some embodiments the touch panel 731 and the display panel 741 may be integrated to implement the input and output functions of the desktop.
The desktop may also include at least one sensor 750, such as light sensors, motion sensors, and other sensors. Specifically, the light sensor may include an ambient light sensor that may adjust the brightness of the display panel 741 according to the brightness of ambient light, and a proximity sensor that may turn off the display panel 741 and/or the backlight when the desktop moves to the ear. As one of the motion sensors, the accelerometer sensor can detect the magnitude of acceleration in each direction (generally, three axes), detect the magnitude and direction of gravity when stationary, and can be used for applications of recognizing a desktop gesture, related functions of vibration recognition (such as pedometer and tapping), and the like; as for the desktop, other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor may be further configured, which are not described herein.
The audio circuit 760, speaker 761, and microphone 762 may provide an audio interface between the user and the desktop. The audio circuit 760 can transmit the electrical signal converted from the received audio data to the speaker 761, where it is converted into a sound signal and output; conversely, the microphone 762 converts the collected sound signal into an electrical signal, which the audio circuit 760 receives and converts into audio data. The audio data is then processed by the processor 780 and either transmitted via the RF circuit 710 to, for example, another device, or output to the memory 720 for further processing.
WiFi is a short-range wireless transmission technology. Through the WiFi module 770, the desktop can help the user receive and send e-mails, browse web pages, access streaming media, and the like, providing wireless broadband Internet access for the user. Although fig. 8 shows the WiFi module 770, it is understood that it is not an essential part of the desktop and can be omitted as needed within a scope that does not change the essence of the invention.
The processor 780 is a control center of the desktop, connects each part of the entire desktop by using various interfaces and lines, and performs various functions of the desktop and processes data by running or executing software programs and/or modules stored in the memory 720 and calling data stored in the memory 720, thereby performing overall monitoring of the desktop. Optionally, processor 780 may include one or more processing units; preferably, the processor 780 may integrate an application processor, which primarily handles operating systems, user interfaces, applications, etc., and a modem processor, which primarily handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into processor 780.
The desktop also includes a power supply 790 (e.g., a battery) for supplying power to the various components, which may preferably be logically connected to the processor 780 via a power management system, so as to manage charging, discharging, and power consumption via the power management system.
Although not shown, the desktop may further include a camera, a bluetooth module, and the like, which are not described in detail herein.
In the embodiment of the present application, the processor 780 included in the terminal further has the following functions:
acquiring a background image and an image to be segmented aiming at a static scene; the background image does not comprise a target object, and pixel points in the background image have corresponding color thresholds;
calculating a first color distance between a first pixel point in the image to be segmented and a corresponding second pixel point in the background image; the first pixel point is any pixel point in the image to be segmented;
when the first color distance is not smaller than the color threshold corresponding to the second pixel point, determining the first pixel point as a target pixel point, wherein the target pixel point is a pixel point in the target object;
and according to the target pixel points, the target object is segmented from the image to be segmented.
Optionally, the processor 780 is further configured to perform the steps of any one implementation of the matting method of the present application.
The embodiment of the present application further provides a computer-readable storage medium for storing a computer program, where the computer program is used to execute any one implementation manner of the matting method described in the foregoing embodiments.
The present application further provides a computer program product including instructions, which when run on a computer, cause the computer to perform any one of the embodiments of a matting method described in the foregoing embodiments.
Those of ordinary skill in the art will understand that: all or part of the steps for realizing the method embodiments can be completed by hardware related to program instructions, the program can be stored in a computer readable storage medium, and the program executes the steps comprising the method embodiments when executed; and the aforementioned storage medium may be at least one of the following media: various media that can store program codes, such as read-only memory (ROM), RAM, magnetic disk, or optical disk.
It should be noted that, in the present specification, all the embodiments are described in a progressive manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the apparatus and system embodiments, since they are substantially similar to the method embodiments, they are described in a relatively simple manner, and reference may be made to some of the descriptions of the method embodiments for related points. The above-described embodiments of the apparatus and system are merely illustrative, and the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
The above description is only one specific embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application should be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.