CN117274141A - Chrominance matting method and device and video live broadcast system - Google Patents

Chrominance matting method and device and video live broadcast system

Info

Publication number
CN117274141A
CN117274141A (application CN202210680404.4A)
Authority
CN
China
Prior art keywords
face
image
matting
target
key points
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210680404.4A
Other languages
Chinese (zh)
Inventor
麦广灿
陈增海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Cubesili Information Technology Co Ltd
Original Assignee
Guangzhou Cubesili Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Cubesili Information Technology Co Ltd filed Critical Guangzhou Cubesili Information Technology Co Ltd
Priority to CN202210680404.4A priority Critical patent/CN117274141A/en
Publication of CN117274141A publication Critical patent/CN117274141A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0012Biomedical image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20021Dividing image into blocks, subimages or windows
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person
    • G06T2207/30201Face

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Image Processing (AREA)

Abstract

The application relates to a chrominance matting method and device and a video live broadcast system. The method comprises the following steps: matting a target picture according to input matting parameters, converting the target picture into an initial semitransparent channel image; detecting first face key points of the target picture; generating a face region mask over the face region of the target picture according to the first face key points; and fusing the initial semitransparent channel image with the face region mask to obtain a target semitransparent channel image. When this technical scheme is applied to chrominance matting of a picture, useful parts of the face region are prevented from being matted out by mistake, thereby improving the matting effect.

Description

Chrominance matting method and device and video live broadcast system
Technical Field
The application relates to the technical field of image processing, in particular to a chrominance image matting method and device and a video live broadcast system.
Background
In image processing, chroma-based matting (commonly called green/blue screen matting) has been widely applied in fields such as image editing, live video broadcasting, and film and television production. Its basic principle is to first calculate the similarity between each pixel color in the image to be processed and a key color (the background curtain color selected by the user), then convert that similarity into a transparency value, and finally complete the matting according to the transparency.
In general, the higher the similarity of a pixel to the key color, the higher its transparency, so that the pixel is not displayed in the final composite image and that image area is matted out. In practical applications, however, the image subject often contains areas whose color is highly similar to the background, for example glasses or metal jewelry worn in the face region that reflect the background curtain color. Because these reflections are highly similar to the key color, the corresponding areas of the face region are matted out by mistake, which seriously affects the matting effect.
Disclosure of Invention
Based on this, it is necessary to provide a chrominance matting method, device and video live broadcast system addressing at least one of the above technical defects, so as to improve the chrominance matting effect.
A chroma matting method comprising:
matting a target picture according to input matting parameters, and converting the target picture into an initial semitransparent channel image;
detecting first face key points of the target picture;
generating a face region mask on a face region of the target picture according to the first face key points;
and fusing the initial semitransparent channel image and the face region mask to obtain a target semitransparent channel image.
In one embodiment, generating a face region mask on a face region of the target picture according to the first face keypoints includes:
calculating a face center point of the face area according to the first face key points;
scaling the positions of the first face key points according to the face center points to obtain second face key points;
and generating a face region mask by using the face center point and the second face key point.
In one embodiment, before the generating the face region mask by using the face center point and the second face key point, the method further includes:
acquiring forehead key points of a forehead area of the target picture;
and adjusting the point positions of the second face key points by utilizing the forehead key points so that the second face key points cover the forehead area.
In one embodiment, the calculating the face center point of the face area according to the first face key point includes:
acquiring two corner points of eyes and two corner points of mouth of the face region;
and calculating point coordinates of the eye corner points and the mouth corner points to obtain a face center point of the face region.
In one embodiment, the scaling the position of the first face key point according to the face center point to obtain a second face key point includes:
setting a scaling coefficient of a key point of a human face; wherein the value range of the scaling coefficient is (0, 1);
and scaling the positions of the first face key points by using the scaling coefficients to obtain second face key points of the face region.
In an embodiment, the acquiring the forehead key point of the forehead area of the target picture includes:
selecting two second face key points on the left side and the right side of a face area of the target picture;
constructing a plurality of key points in the forehead area according to the distance between the nose tip and the nose bridge position;
and forming forehead key points according to the selected second face key points and the constructed key points.
In one embodiment, the generating a face region mask using the face center point and the second face key point includes:
selecting a face center point and any two adjacent second face key points to construct a triangle, and assigning 1 to all pixels in the triangle;
and sequentially executing the triangle construction and pixel assignment operation on all the second face key points to obtain a face region mask of the face region.
In one embodiment, the fusing the initial semitransparent channel image and the face region mask to obtain the target semitransparent channel image includes:
calculating the maximum pixel value of the initial semitransparent channel image at positions where the face region mask equals 1;
and calculating a target semitransparent channel image according to the face region mask and the maximum pixel value.
In one embodiment, the target picture is a video image collected by an anchor end participating in a co-hosting link in a network live broadcast system, and the semitransparent channel image is an image describing portrait matting information.
In one embodiment, before the matting the target picture according to the input matting parameter, the method further includes:
responding to a live co-hosting request from the live broadcast server to establish a link connection, and adjusting the current anchor end to use the same broadcast resolution as the other anchor ends;
and collecting video images of the current anchor end's co-hosting anchor, shot in front of a background curtain of a set color.
In one embodiment, the chroma matting method further includes:
uploading the video image and the corresponding target semitransparent channel image to a live broadcast server, so that the live broadcast server extracts the anchor's portrait from the video image according to the target semitransparent channel image, mixes it with the portraits of the other co-hosting anchors, draws them onto a background image, and pushes the generated co-hosting video stream to the audience end.
A chroma matting apparatus comprising:
the initial matting module is used for matting the target picture according to the input matting parameters and converting the target picture into an initial semitransparent channel image;
the face detection module is used for detecting and acquiring the first face key points of the target picture;
the mask generating module is used for generating a face region mask on a face region of the target picture according to the first face key points;
and the image fusion module is used for fusing the initial semitransparent channel image with the face region mask to obtain a target semitransparent channel image.
A video live broadcast system, comprising: at least one anchor end and a live broadcast server; the anchor end is used for acquiring live video images and matting the target picture by the chrominance matting method described above to obtain a target semitransparent channel image;
the live broadcast server is used for receiving the video image and the target semitransparent channel image uploaded by the anchor end, and matting the video image according to the target semitransparent channel image.
An electronic device, comprising:
one or more processors;
a memory;
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more applications configured to perform the chroma matting method described above.
A computer readable storage medium storing at least one instruction, at least one program, code set, or instruction set, which is loaded and executed by a processor to perform the chroma matting method described above.
According to the chrominance matting method and device and the video live broadcast system described above, the target picture is matted according to the input matting parameters and first converted into an initial semitransparent channel image; the first face key points of the target picture are then detected, a face region mask is generated over the face region using the detected key points, and finally the initial semitransparent channel image and the face region mask are fused into a target semitransparent channel image with which the target picture is matted. When this scheme is applied to chrominance matting of a picture, useful parts of the face region are prevented from being matted out by mistake, thereby improving the matting effect.
Furthermore, the face center point is calculated from the two eye corner points and two mouth corner points of the face region, which ensures that the face always lies in the central region of the face region mask even for a large-angle face, preventing the constructed mask from containing a background region.
Furthermore, all face key points are scaled toward the face center point, which prevents key points from falling outside the face area when the face key-point model is insufficiently accurate, ensuring the matting effect.
Furthermore, the face key points are adjusted with newly constructed key points so that they cover all areas of the face, avoiding the situation where part of the face cannot be effectively protected.
Furthermore, the triangle-based face region mask generation scheme relied upon in the mask generation process can be implemented quickly in mainstream rendering languages using techniques such as vertex shaders; it has advantages in both computation and speed and can be implemented at low cost on a wide range of devices.
Further, when the face region mask and the initial semitransparent channel image are fused, the maximum value of the initial semitransparent channel image over the region where the mask equals 1 is calculated first, and the final target semitransparent channel image is computed from it; this ensures the expected matting effect even under unreasonable matting parameters and avoids phenomena such as unnatural transitions, thereby improving the matting effect.
Drawings
FIG. 1 is a schematic diagram of an exemplary chroma matting principle;
FIG. 2 is an exemplary network live system topology;
FIG. 3 is a flow diagram of a chroma matting method of one embodiment;
FIG. 4 is a schematic diagram of an exemplary face keypoint detection;
FIG. 5 is a schematic diagram of an exemplary scaling process and expansion forehead key points;
FIG. 6 is a schematic illustration of an exemplary face region mask;
FIG. 7 is a schematic diagram of an exemplary chroma matting process flow;
fig. 8 is a schematic structural diagram of a chroma matting device according to an embodiment;
FIG. 9 is a schematic diagram of an exemplary live video system architecture;
fig. 10 is a block diagram of an example electronic device.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
In the embodiments of the present application, "first", "second", etc. are used to distinguish identical or similar items with substantially the same function; "at least one" means one or more, and "a plurality" means two or more, for example a plurality of objects means two or more objects. The words "comprise" or "comprising" mean that the information listed after the word, together with its equivalents, is encompassed, without excluding additional information. "And/or" indicates that three relationships may exist, and the character "/" generally indicates an "or" relationship between the associated objects.
In general, as shown in fig. 1, an exemplary schematic diagram of the chrominance matting principle, the chrominance matting technique converts a target picture of an input video image into a semitransparent channel image according to matting parameters P_t input by the user, such as the key color and similarity settings. For example, the t-th frame of the video image is denoted I_t ∈ R^(H×W×C); using the input matting parameters P_t, it is converted into a semitransparent channel image M_t ∈ R^(H×W). Here R denotes the set of real numbers, M_t is the semitransparent channel image of the t-th frame, P_t the matting parameters of the t-th frame, W and H the width and height of the target picture, and C the number of channels of the input picture (typically the three RGB channels). M_t takes values between 0 and 1: the higher the value, the lower the similarity between the pixel at that position of the target picture and the key color, and the lower its transparency.
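The similarity-to-transparency conversion described above can be sketched as follows. This is an illustrative NumPy sketch, not the patent's implementation: the Euclidean color distance and the `threshold`/`softness` ramp, together with their values, are assumptions standing in for the matting parameters P_t.

```python
import numpy as np

def chroma_alpha(frame, key_color, threshold=60.0, softness=30.0):
    """Convert an RGB frame into a semitransparent (alpha) channel image.

    Pixels close to the key color become transparent (alpha near 0);
    dissimilar pixels stay opaque (alpha near 1). `threshold` and
    `softness` are illustrative stand-ins for the matting parameters.
    """
    diff = frame.astype(np.float32) - np.asarray(key_color, np.float32)
    dist = np.sqrt((diff ** 2).sum(axis=-1))          # color distance per pixel
    alpha = (dist - threshold) / max(softness, 1e-6)  # linear ramp to [0, 1]
    return np.clip(alpha, 0.0, 1.0)                   # M_t, shape H x W

# A pure green background pixel is fully matted out; a red pixel is kept.
frame = np.zeros((2, 2, 3), np.uint8)
frame[0, 0] = (0, 255, 0)   # background green
frame[0, 1] = (255, 0, 0)   # foreground red
alpha = chroma_alpha(frame, key_color=(0, 255, 0))
```

The choice of ramp is a design decision: a hard threshold would produce jagged matte edges, while the soft ramp yields partial transparency near the boundary.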
When the chrominance matting technique is used to matte a picture containing a face, reflective objects such as glasses and metal ornaments are often present on the face, so the corresponding facial areas can be matted out, producing an unexpected result. Take a network live broadcast system as an example; referring to fig. 2, an exemplary network live broadcast system topology, several anchor ends (anchor clients) are shown. Anchor end 1 and anchor end 2 are connected to the live broadcast server, which establishes the live co-hosting link between them; after the link is established, the portrait images matted from each anchor's video images are sent to the live broadcast server for mixed drawing, and the generated co-hosting video stream is pushed to the audience end. In such a system, when chrominance matting is applied to the co-hosting anchor's video images, areas affected by partial reflections on the anchor are easily matted out; to avoid this phenomenon, the application provides a chrominance matting method.
As shown in fig. 3, fig. 3 is a flowchart of a chroma matting method according to an embodiment, including the following steps:
s10, matting is carried out on the target picture according to the input matting parameters, and the target picture is converted into an initial semitransparent channel image.
Specifically, according to the matting parameters P_t input by the user, the input target picture I_t is preliminarily converted into an initial semitransparent channel image M̂_t ∈ R^(H×W) using the green (blue) curtain matting method, where M̂_t denotes the initial semitransparent channel image of the t-th frame.
S20, detecting the first face key points of the target picture, and acquiring the first face key points of the target image.
In this step, a face key point detection technique can be used to detect all face key points K_t ∈ R^(n×k×2) in the input target picture I_t. The face key points comprise the contour points of each face, where n is the number of faces in the target picture I_t, K_t denotes the face key points of the t-th frame, and k is the number of key points in a single face. As shown in fig. 4, an exemplary schematic diagram of face key point detection, the detected key points of a single face are visualized.
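The layout of the detected key points K_t ∈ R^(n×k×2) can be illustrated with a placeholder array. The detector itself and the landmark count k are not specified by the source, so k = 68 and random coordinates stand in for real detections:

```python
import numpy as np

# K_t layout: n faces, k landmarks per face, one (x, y) coordinate per
# landmark. The values here are random stand-ins, not real detections.
n_faces, n_landmarks = 2, 68
rng = np.random.default_rng(0)
keypoints = rng.uniform(0, 256, size=(n_faces, n_landmarks, 2)).astype(np.float32)

first_face = keypoints[0]                # (k, 2): landmarks of one face
xs, ys = first_face[:, 0], first_face[:, 1]
```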
S30, generating a face region mask on the face region of the target picture according to the first face key points.
In one embodiment, generating a face region mask on the face region of the target picture according to the first face key point in step S30 includes:
s301, calculating a face center point of the face area according to the first face key point.
In this step, the face key points are mainly used to calculate the face center points of the face region, and the face center points are key elements for face key point scaling and face region mask generation, so that the face center points are required to be always located in the face region mask center region, and the face region mask finally constructed is prevented from containing a background region.
In general, the conventional technique selects the nose-tip key point as the face center point, but for a large-angle face the nose tip can move outside the face area, so the face center does not meet the expected effect.
As an embodiment, to avoid the occurrence of the foregoing situation, the present application provides a method for calculating a face center point, according to which the step of calculating the face center point of the face area according to the first face key point in S301 may include the following steps:
(1) Acquiring the two eye corner points and two mouth corner points of the face region.
(2) Calculating the point coordinates of the eye corner points and mouth corner points to obtain the face center point of the face region; specifically, the mean of the four point coordinates is taken as the face center point C_t ∈ R^(n×1×2), where C_t denotes the face center point of the t-th frame.
With this scheme, even for a large-angle face the calculated center point does not fall outside the face region, so the face is always in the central region of the face region mask and the constructed mask is prevented from containing a background region.
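The center-point calculation described above amounts to averaging the four corner points; a minimal sketch, with corner coordinates chosen arbitrarily for illustration:

```python
import numpy as np

def face_center(eye_corners, mouth_corners):
    """Face center as the mean of the two eye corners and two mouth corners.

    Sketch of step S301. Which landmark indices correspond to the
    corners depends on the key-point model and is assumed here.
    """
    pts = np.vstack([eye_corners, mouth_corners]).astype(np.float32)  # (4, 2)
    return pts.mean(axis=0)                                           # (x, y)

center = face_center(eye_corners=[(40.0, 50.0), (80.0, 50.0)],
                     mouth_corners=[(50.0, 90.0), (70.0, 90.0)])
```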
S302, scaling the positions of the first face key points according to the face center points to obtain second face key points.
In this step, by scaling the positions of the first face key points, it is possible to avoid that the face key points exceed the face area due to insufficient accuracy of the face key point model.
As an embodiment, the method for scaling the location of the first face key point in S302 may include the following steps:
(1) Setting a scaling coefficient for the face key points; specifically, the key points on the face contour are selected and scaled toward the face center point C_t by a coefficient α, where α ∈ (0, 1).
(2) And scaling the positions of the first face key points by using the scaling coefficients to obtain second face key points of the face region.
Specifically, referring to fig. 5, an exemplary schematic diagram of the scaling process and the expanded forehead key points, applying the scaling K'_t = α(K_t − C_t) + C_t yields the new face key points K'_t of the face region, i.e., the second face key points; here K'_t denotes the second face key points obtained after scaling.
With the scheme of this embodiment, face key points no longer exceed the face area when the face key-point model is insufficiently accurate; that is, background curtain pixels are not left in the final picture through over-protection, ensuring the matting effect.
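The scaling step K'_t = alpha*(K_t - C_t) + C_t can be sketched directly; the coefficient value 0.9 is an illustrative assumption, since the source only requires α ∈ (0, 1):

```python
import numpy as np

def scale_keypoints(keypoints, center, alpha=0.9):
    """Shrink face key points toward the face center: K' = alpha*(K - C) + C.

    Sketch of step S302; alpha must lie in (0, 1). The value 0.9 is an
    illustrative choice, not one given by the source.
    """
    assert 0.0 < alpha < 1.0
    k = np.asarray(keypoints, np.float32)
    c = np.asarray(center, np.float32)
    return alpha * (k - c) + c

scaled = scale_keypoints([(100.0, 0.0), (0.0, 100.0)],
                         center=(0.0, 0.0), alpha=0.9)
```

Because the transform is affine around the center, point order is preserved, which matters for the triangle-fan mask construction later.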
In general, a face key-point model only identifies the eyebrows and the area below them. When reflective objects such as glasses worn on the face extend beyond this area, they are easily matted out by mistake, so part of the face cannot be protected and the matting effect suffers.
To avoid leaving part of the face unprotected, the application further provides an improvement: before the face region mask is generated from the face center point and the second face key points, additional region key points are expanded so that all areas of the face are protected. Accordingly, step S303 is added in this embodiment; as an example, step S303 may include the following:
(1) Acquiring forehead key points of a forehead area of the target picture; specifically, the point positions of the forehead key points are estimated according to the point positions of the existing face key points.
As an embodiment, with continued reference to fig. 5, preferably, the method for obtaining the forehead key point of the forehead area of the target picture may include the following steps:
(a) Selecting two second face key points on the left and right sides of the face area of the target picture; that is, one left and one right face contour key point are selected from the scaled face key points.
(b) Constructing a plurality of key points in the forehead area according to the distance between the nose tip and the nose bridge position; for example, two key points are constructed in the forehead area by taking the distance between the nose tip and the bridge of the nose as a reference.
(c) Forming the forehead key points from the selected second face key points and the constructed key points; for example, 5 forehead key points are formed by combining the key points constructed in the forehead region with the two face key points selected on the left and right sides of the face.
(2) And adjusting the point positions of the second face key points by utilizing the forehead key points so that the second face key points cover the forehead area.
As shown in fig. 5, the point positions of the second face key points are adjusted through the constructed key points, which mainly lie in the forehead area above the eyebrows, so that the face key points cover all effective face areas. This avoids leaving part of the face region unprotected and improves subsequent matting accuracy.
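One possible construction of the forehead key points is sketched below. The source only says the nose-tip-to-nose-bridge distance is used as a reference, so the number of constructed points and the upward offset multiple are assumptions:

```python
import numpy as np

def forehead_keypoints(left_contour, right_contour, nose_tip, nose_bridge,
                       n_new=2, lift=2.0):
    """Estimate forehead key points above the eyebrows (sketch of step S303).

    Two scaled contour points on the left/right of the face are combined
    with points constructed above the face, offset upward by a multiple
    of the nose-tip-to-nose-bridge distance. `n_new` and `lift` are
    illustrative assumptions.
    """
    tip = np.asarray(nose_tip, np.float32)
    bridge = np.asarray(nose_bridge, np.float32)
    d = np.linalg.norm(tip - bridge)               # reference distance
    left = np.asarray(left_contour, np.float32)
    right = np.asarray(right_contour, np.float32)
    # Place new points on the segment between the two contour points,
    # shifted upward (smaller y in image coordinates) by lift * d.
    ts = np.linspace(0.0, 1.0, n_new + 2)[1:-1]    # interior fractions
    built = [(1 - t) * left + t * right - np.array([0.0, lift * d]) for t in ts]
    return np.vstack([left[None], np.vstack(built), right[None]])

pts = forehead_keypoints(left_contour=(20.0, 60.0), right_contour=(100.0, 60.0),
                         nose_tip=(60.0, 80.0), nose_bridge=(60.0, 70.0))
```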
S304, generating a face region mask by using the face center point and the second face key point.
In this step, the second face key points and the face center point obtained above are used to draw the face region mask of the target picture.
In one embodiment, the method for generating the face region mask in step S304 may include the following steps:
(1) Selecting the face center point and any two adjacent second face key points to construct a triangle, and assigning 1 to all pixels within the triangle; the face region mask is drawn progressively by connecting the face key points with the face center point to form triangles.
(2) And sequentially executing the triangle construction and pixel assignment operation on all the second face key points to obtain a face region mask of the face region.
Specifically, referring to fig. 6, an exemplary schematic view of a face region mask, the mask is obtained by taking the face key points of every face region point by point, drawing the corresponding triangles, and repeating this drawing operation until the mask M^f_t ∈ R^(H×W) is complete, where M^f_t denotes the face region mask image of the t-th frame.
In this scheme, triangles are constructed from the face key points one after another and their patterns drawn continuously, until the face region mask of the whole face is obtained.
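The triangle-fan rasterization can be sketched on the CPU as follows; a production implementation would draw the same fan with a vertex shader, as the description notes. The half-plane point-in-triangle test is an implementation choice, not taken from the source:

```python
import numpy as np

def face_mask(center, keypoints, height, width):
    """Rasterize a face region mask as a triangle fan around the center.

    Sketch of step S304: each triangle (center, K_i, K_{i+1}) over the
    ordered contour key points has all interior pixels set to 1.
    """
    mask = np.zeros((height, width), np.float32)
    ys, xs = np.mgrid[0:height, 0:width]
    pix = np.stack([xs, ys], axis=-1).astype(np.float32)   # (H, W, 2)
    c = np.asarray(center, np.float32)
    k = np.asarray(keypoints, np.float32)

    def half_plane(o, a):
        # z component of (a - o) x (pix - o), evaluated for every pixel
        return ((a[0] - o[0]) * (pix[..., 1] - o[1])
                - (a[1] - o[1]) * (pix[..., 0] - o[0]))

    for i in range(len(k)):
        a, b = k[i], k[(i + 1) % len(k)]
        d1, d2, d3 = half_plane(c, a), half_plane(a, b), half_plane(b, c)
        # A pixel is inside when all three half-plane signs agree.
        inside = (((d1 >= 0) & (d2 >= 0) & (d3 >= 0))
                  | ((d1 <= 0) & (d2 <= 0) & (d3 <= 0)))
        mask[inside] = 1.0
    return mask

# Square fan around (5, 5) on a 10 x 10 grid, as a toy "face".
mask = face_mask(center=(5.0, 5.0),
                 keypoints=[(2.0, 2.0), (8.0, 2.0), (8.0, 8.0), (2.0, 8.0)],
                 height=10, width=10)
```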
And S40, fusing the initial semitransparent channel image and the face region mask to obtain a target semitransparent channel image.
In this step, the initial semitransparent channel image M̂_t and the face region mask M^f_t are fused to form the target semitransparent channel image M_t. Because the face region is protected by the face region mask, it cannot be matted out by mistake due to its high similarity to the key color, ensuring the matting effect.
During fusion, conventional techniques generally take, for each pixel, the larger of M̂_t and M^f_t as the final fused result, i.e., M_t = max(M̂_t, M^f_t). However, under certain matting parameters P_t this can produce undesirable effects, such as transitions that lack naturalness.
To avoid the above undesirable effects, the present application provides a fusion scheme, according to which, in one embodiment, the fusion process performed in the step S40 may include the following steps:
(1) Calculating the maximum pixel value of the initial semitransparent channel image over the positions where the face region mask equals 1; specifically, the maximum m_t = max{ M̂_t(i, j) : M^f_t(i, j) = 1 } is computed first.
(2) Calculating the target semitransparent channel image from the face region mask and this maximum pixel value; the calculation may take the form
M_t = (1 − M^f_t) · M̂_t + M^f_t · m_t.
In the above operation, M̂_t and m_t together determine the final semitransparent channel image M_t, through which the target picture can be matted.
In the solution of the above embodiment, referring to fig. 7, fig. 7 is a schematic diagram of an exemplary chroma matting process, first using matting parameter P t Green (blue) curtain imaging is carried out on the input target image to obtain an initial semitransparent channel image, and meanwhile, the target image I is processed t Detecting the key points of the human face to obtain the key points of the human face, generating a mask of the human face area by using the key points of the human face, and finally fusing the initial semitransparent channel image and the mask of the human face area to obtain a target semitransparent image, wherein the calculation formula is as follows:
M̂_t ∈ R^(H×W) is the initial semi-transparent channel image; I_t ∈ R^(H×W×C) is the target image; K_t ∈ R^(n×k×2) represents the face key points (n faces, k points each, with 2 coordinates per point); M̃_t ∈ R^(H×W) is the face region mask; and M_t ∈ R^(H×W) is the target semi-transparent channel image.
The technical solution of the above embodiment ensures that the obtained semi-transparent channel image M_t achieves the expected matting effect even under unreasonable matting parameters P_t, avoiding phenomena such as unnatural transitions, thereby improving the matting effect.
In summary, the chroma matting method of this embodiment is suitable for application scenarios such as green (blue) screen matting in live broadcast services and content editing and creation in short video services, and can intelligently and automatically assist users in green (blue) screen live broadcasting and video editing, avoiding useful parts of the face region being mistakenly keyed out and producing unexpected results. Meanwhile, the triangle-based face region mask generation scheme relied on in the mask generation process can be implemented rapidly in mainstream rendering languages through techniques such as vertex shaders, so that sub-millisecond generation can essentially be achieved on mainstream devices. Owing to this low computational cost and implementation speed, the method can be realized at low cost on a variety of devices and used in various applications, such as real-time video live broadcasting and editing.
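The triangle-based mask generation can be sketched on the CPU as follows. A shader implementation would emit one triangle per adjacent key-point pair and let the rasterizer fill it; here a point-in-triangle sign test stands in for the rasterizer. All function names are illustrative, not from the patent:

```python
import numpy as np

def _cross_z(u, v):
    # z-component of the 2-D cross product u x v, for v an (N, 2) array.
    return u[0] * v[:, 1] - u[1] * v[:, 0]

def face_region_mask(center, keypoints, h, w):
    """Rasterize the face region mask as a triangle fan around the face
    center: one triangle per pair of adjacent contour key points, with every
    interior pixel assigned 1. `keypoints` are (x, y) pairs ordered around
    the face contour."""
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)
    mask = np.zeros(h * w, dtype=bool)
    c = np.asarray(center, dtype=float)
    k = len(keypoints)
    for i in range(k):
        a = np.asarray(keypoints[i], dtype=float)
        b = np.asarray(keypoints[(i + 1) % k], dtype=float)
        # A pixel lies in triangle (c, a, b) when the three edge cross
        # products share a sign (zero means exactly on an edge).
        d0 = _cross_z(a - c, pts - c)
        d1 = _cross_z(b - a, pts - a)
        d2 = _cross_z(c - b, pts - b)
        inside = ((d0 >= 0) & (d1 >= 0) & (d2 >= 0)) | \
                 ((d0 <= 0) & (d1 <= 0) & (d2 <= 0))
        mask |= inside
    return mask.reshape(h, w).astype(np.uint8)
```

Because each triangle is filled independently, the same loop maps directly onto a vertex-shader triangle fan, which is what makes the sub-millisecond generation mentioned above feasible.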
An embodiment of the chroma matting device is set forth below.
Referring to fig. 8, fig. 8 is a schematic structural diagram of a chroma matting device according to an embodiment, including:
the initial matting module 10 is configured to perform matting on a target picture according to an input matting parameter, and convert the target picture into an initial semitransparent channel image;
the face detection module 20 is configured to perform face key point detection on the target picture to obtain the first face key points of the target picture;
a mask generating module 30, configured to generate a face region mask on a face region of the target picture according to the first face key point;
and the image fusion module 40 is configured to fuse the initial semitransparent channel image with the face region mask to obtain a target semitransparent channel image.
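The four modules above can be wired together roughly as follows. This is a hypothetical sketch: the keyer, detector, mask generator, and fuser are injected as callables because their concrete implementations are described elsewhere in this application:

```python
import numpy as np

class ChromaMattingPipeline:
    """Minimal sketch of the device: each constructor argument plays the role
    of one of the four modules described above."""
    def __init__(self, keyer, detector, mask_gen, fuser):
        self.keyer = keyer        # initial matting module (10)
        self.detector = detector  # face detection module (20)
        self.mask_gen = mask_gen  # mask generating module (30)
        self.fuser = fuser        # image fusion module (40)

    def __call__(self, image, matting_params):
        alpha_init = self.keyer(image, matting_params)         # initial alpha matte
        keypoints = self.detector(image)                       # first face key points
        face_mask = self.mask_gen(keypoints, image.shape[:2])  # face region mask
        return self.fuser(alpha_init, face_mask)               # target alpha matte
```

The pipeline only fixes the data flow between modules; any keyer, detector, mask generator, or fusion rule with matching signatures can be substituted.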
The chroma matting device of this embodiment may execute the chroma matting method provided by the embodiments of the present application, and its implementation principle is similar. The actions executed by each module of the chroma matting device correspond to the steps of the chroma matting method in the respective embodiments; for detailed functional descriptions of each module, refer to the descriptions of the corresponding chroma matting method above, which are not repeated here.
An embodiment of a live video system is set forth below.
The video live broadcast system of the present application includes at least one anchor end and a live broadcast server. The anchor end is configured to collect live video images and to matte the target picture using the chroma matting method of the above embodiments, obtaining a target semi-transparent channel image. The server is configured to receive the video image and the target semi-transparent channel image uploaded by the anchor end and to matte the video image according to the target semi-transparent channel image.
To facilitate a more detailed implementation of the chroma matting solution of the present application, the following description is given with reference to an example in a video live broadcast system. As described in the foregoing embodiments, referring to fig. 9, a schematic structural diagram of an exemplary video live broadcast system, the system includes anchor ends, a live broadcast server, and viewer ends. An anchor end may consist of a mobile phone and a PC, or of a camera and a portable computer, etc. In practical application, the anchor ends connect to the live broadcast server through a network; the live broadcast server mixes the co-streaming anchors' pictures and generates a live video stream that is pushed to the viewer ends, which may be a PDA, a tablet computer, a PC, or a portable computer.
The operation process of the live broadcast system is described below. As shown in the figure, the anchor end may include a broadcast tool and a client. The broadcast tool integrates a virtual camera and provides functions such as beautification and matting; the client is a software client for voice and video live broadcasting. Various types of live templates (entertainment/friend-making/battle/game/education, etc.) can be provided, and in this example virtual same-channel co-streaming can be realized through the entertainment template oriented to show-style live broadcasting.
For the anchor:
(1) The broadcast tool is responsible for camera capture; it applies treatments such as beautification, skin smoothing, and face slimming to the anchor's video image, then performs pre-matting on the foreground image based on the chroma matting technique and extracts behavior data (such as arm actions, gestures, and whole-body contour) to obtain the semi-transparent channel image. The client uploads the target image, the semi-transparent channel image, and image-related information (AI key point information carried in SEI messages, such as face, gesture, and head key points; special-effect playback information; gift playback information; and other data) to the live broadcast server.
(2) The anchor end initiates a co-streaming connection; for example, anchor end 1 and anchor end 2 connect through the live broadcast server and select a background image (static picture, dynamic video, etc.), which is placed into SEI information in the form of a URL and sent to the live broadcast server.
(3) Beautification and virtual special-effect processing are performed on the anchor's video image, such as skin smoothing, face slimming, face changing, and wearing sunglasses.
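The pre-matting in step (1) above — turning a green (blue) screen frame into an initial semi-transparent channel image — can be sketched with a simple color-distance keyer. This is a toy stand-in: the two thresholds play the role of the patent's unspecified matting parameters P_t, and all names are illustrative:

```python
import numpy as np

def chroma_key_alpha(image, key_color, tol_low, tol_high):
    """Toy green-screen keyer: alpha from the Euclidean distance of each
    pixel to the key color, with two thresholds giving a soft,
    semi-transparent edge.
    image: H x W x 3 uint8 frame; key_color: (r, g, b) of the screen."""
    diff = image.astype(float) - np.asarray(key_color, dtype=float)
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # 0 (fully keyed out) below tol_low, 1 (opaque) above tol_high,
    # linear ramp in between -> the "semi-transparent channel image".
    alpha = (dist - tol_low) / (tol_high - tol_low)
    return np.clip(alpha, 0.0, 1.0)
```

Real keyers typically work in a chroma-separated space (e.g. YUV) rather than on raw RGB distance, but the two-threshold soft ramp is the essence of producing a semi-transparent rather than binary matte.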
For the live broadcast server:
(1) Forward the anchors' video stream information; for example, forward anchor end 1's video stream to anchor end 2, so that anchor end 2 can mix and compose anchor 1's picture locally.
(2) Push the video stream generated by the co-streaming session to the audience; for example, mix the portrait images matted from the target images uploaded by anchor end 1 and anchor end 2 into the background image, encode the mixed picture, and push it to a CDN distribution network for delivery to each viewer end.
(3) Re-render virtual special-effect content for the anchor ends directly on the server; for example, when anchor end 1 and anchor end 2 each need to display virtual gift effects during co-streaming, the rendering position of the effect content is obtained by conversion from the AI key point information and re-rendered on the live broadcast server. In addition, joint virtual gift effects generated by co-streaming interaction are rendered synchronously, and the co-streaming video stream is generated and pushed to the viewer ends.
For the audience end:
(1) Receive the co-streaming video stream pushed by the live broadcast server and play it on the viewer-end interface.
(2) A viewer may join the mic during the co-streaming broadcast: the viewer end first establishes a connection with the live broadcast server, downloads and locally previews the video image data of anchor end 1 and anchor end 2, and uploads the mic user's audio stream, which the live broadcast server adds to the co-streaming video stream and pushes to the viewer ends.
Embodiments of an electronic device and a computer-readable storage medium are set forth below.
The application provides a technical scheme of electronic equipment, which is used for realizing the related functions of a chroma matting method.
In one embodiment, the present application provides an electronic device comprising:
one or more processors;
a memory;
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more applications being configured to perform the chroma matting method of any embodiment.
As shown in fig. 10, fig. 10 is a block diagram of an example electronic device. The electronic device may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, or the like. Referring to fig. 10, the apparatus 1000 may include one or more of the following components: a processing component 1002, a memory 1004, a power component 1006, a multimedia component 1008, an audio component 1010, an input/output (I/O) interface 1012, a sensor component 1014, and a communication component 1016.
The processing component 1002 generally controls overall operation of the apparatus 1000, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations.
The memory 1004 is configured to store various types of data to support operation of the device 1000, and may be implemented by any type of volatile or non-volatile storage device, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, or a magnetic or optical disk.
The power supply component 1006 provides power to the various components of the device 1000.
The multimedia component 1008 includes a screen providing an output interface between the device 1000 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). In some embodiments, the multimedia component 1008 includes a front-facing camera and/or a rear-facing camera.
The audio component 1010 is configured to output and/or input audio signals.
The I/O interface 1012 provides an interface between the processing assembly 1002 and peripheral interface modules, which may be a keyboard, click wheel, buttons, and the like. These buttons may include, but are not limited to: homepage button, volume button, start button, and lock button.
The sensor assembly 1014 includes one or more sensors for providing status assessment of various aspects of the device 1000. The sensor assembly 1014 may include a proximity sensor configured to detect the presence of nearby objects in the absence of any physical contact.
The communication component 1016 is configured to facilitate communication between the apparatus 1000 and other devices, either wired or wireless. The device 1000 may access a wireless network based on a communication standard, such as WiFi, an operator network (e.g., 2G, 3G, 4G, or 5G), or a combination thereof.
The application provides a technical scheme of a computer readable storage medium, which is used for realizing functions related to a chroma matting method. The computer readable storage medium stores at least one instruction, at least one program, code set, or instruction set, the at least one instruction, at least one program, code set, or instruction set being loaded by a processor and performing the chroma matting method of any embodiment.
In an exemplary embodiment, the computer-readable storage medium may be a non-transitory computer-readable storage medium including instructions, such as a memory including instructions, for example, the non-transitory computer-readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
The foregoing examples represent only a few embodiments of the present application and are described in relative detail, but they are not to be construed as limiting the scope of the invention. It should be noted that various modifications and improvements can be made by those skilled in the art without departing from the concept of the present application, and these all fall within the protection scope of the present application. Accordingly, the scope of protection of the present application shall be subject to the appended claims.

Claims (15)

1. A chroma matting method, comprising:
according to the input image matting parameters, matting is carried out on a target image, and the target image is converted into an initial semitransparent channel image;
detecting first face key points of the target picture to acquire the first face key points of the target picture;
generating a face region mask on a face region of the target picture according to the first face key points;
and fusing the initial semitransparent channel image and the face region mask to obtain a target semitransparent channel image.
2. A chroma matting method as defined in claim 1, wherein generating a face region mask over a face region of the target picture from the first face keypoints comprises:
calculating a face center point of the face area according to the first face key points;
scaling the positions of the first face key points according to the face center points to obtain second face key points;
and generating a face region mask by using the face center point and the second face key point.
3. A chroma matting method in accordance with claim 2, further comprising, prior to said generating a face region mask using the face center point and the second face key points:
acquiring forehead key points of a forehead area of the target picture;
and adjusting the point positions of the second face key points by utilizing the forehead key points so that the second face key points cover the forehead area.
4. A chroma matting method as defined in claim 2, wherein the computing a face center point of the face region from the first face key point comprises:
acquiring two corner points of eyes and two corner points of mouth of the face region;
and calculating point coordinates of the eye corner points and the mouth corner points to obtain a face center point of the face region.
5. The chroma matting method of claim 2 wherein scaling the location of the first face keypoint based on the face center point to obtain a second face keypoint comprises:
setting a scaling coefficient of a key point of a human face; wherein the value range of the scaling coefficient is (0, 1);
and scaling the positions of the first face key points by using the scaling coefficients to obtain second face key points of the face region.
6. The chroma matting method of claim 3, wherein the acquiring the forehead key points of the forehead region of the target picture comprises:
selecting two second face key points on the left side and the right side of a face area of the target picture;
constructing a plurality of key points in the forehead area according to the distance between the nose tip and the nose bridge position;
and forming forehead key points according to the selected second face key points and the constructed key points.
7. A chroma matting method as defined in claim 1, wherein the generating a face region mask using the face center point and a second face key point comprises:
selecting a face center point and any two adjacent second face key points to construct a triangle, and assigning 1 to all pixels in the triangle;
and sequentially executing the triangle construction and pixel assignment operation on all the second face key points to obtain a face region mask of the face region.
8. The chroma matting method of claim 1 wherein the fusing the initial semi-transparent channel image with the face region mask to obtain a target semi-transparent channel image comprises:
when the pixel of the mask of the face area is calculated to be 1, the maximum pixel value of the corresponding position of the initial semitransparent channel image is calculated;
and calculating a target semitransparent channel image according to the face region mask and the maximum pixel value.
9. A chroma matting method according to any one of claims 1-8, characterized in that the target picture is a video image collected by an anchor end participating in co-streaming in a network live broadcast system, and the semi-transparent channel image is an image describing portrait matting information.
10. A chroma matting method as defined in claim 9, wherein before matting the target picture according to the input matting parameter, further comprising:
responding to a co-streaming request of a live broadcast server to establish a co-streaming connection, and adjusting the current anchor end's broadcast resolution to be consistent with that of the other anchor ends;
and collecting the video image shot by the current anchor end's co-streaming anchor in front of a background curtain of a set color.
11. A chroma matting method in accordance with claim 10, further comprising:
uploading the video image and the corresponding target semi-transparent channel image to the live broadcast server, so that the live broadcast server extracts the anchor's portrait from the video image according to the target semi-transparent channel image, mixes it with the portraits of other anchors onto a background image, and generates a co-streaming video stream to be pushed to viewer ends.
12. A chroma matting device, comprising:
the initial matting module is used for matting the target picture according to the input matting parameters and converting the target picture into an initial semitransparent channel image;
the face detection module is used for detecting the first face key points of the target picture and acquiring the first face key points of the target image;
the mask generating module is used for generating a face region mask on a face region of the target picture according to the first face key points;
and the image fusion module is used for fusing the initial semitransparent channel image with the face region mask to obtain a target semitransparent channel image.
13. A video live broadcast system, comprising: at least one anchor end and a live broadcast server; the anchor side is used for acquiring a live video image, and performing matting on the target image by adopting the chromaticity matting method according to any one of claims 1-11 to obtain a target semitransparent channel image;
the server is used for receiving the video image and the target semitransparent channel image uploaded by the anchor end and carrying out image matting on the video image according to the target semitransparent channel image.
14. An electronic device, comprising:
one or more processors;
a memory;
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more applications configured to perform the chroma matting method of any one of claims 1-11.
15. A computer readable storage medium having stored thereon at least one instruction, at least one program, code set, or instruction set, the at least one instruction, the at least one program, the code set, or the instruction set being loaded by a processor to perform the chroma matting method according to any one of claims 1 to 11.
CN202210680404.4A 2022-06-15 2022-06-15 Chrominance matting method and device and video live broadcast system Pending CN117274141A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210680404.4A CN117274141A (en) 2022-06-15 2022-06-15 Chrominance matting method and device and video live broadcast system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210680404.4A CN117274141A (en) 2022-06-15 2022-06-15 Chrominance matting method and device and video live broadcast system

Publications (1)

Publication Number Publication Date
CN117274141A true CN117274141A (en) 2023-12-22

Family

ID=89220151

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210680404.4A Pending CN117274141A (en) 2022-06-15 2022-06-15 Chrominance matting method and device and video live broadcast system

Country Status (1)

Country Link
CN (1) CN117274141A (en)

Similar Documents

Publication Publication Date Title
US11423556B2 (en) Methods and systems to modify two dimensional facial images in a video to generate, in real-time, facial images that appear three dimensional
CN107993216B (en) Image fusion method and equipment, storage medium and terminal thereof
US11100664B2 (en) Depth-aware photo editing
CN107851299B (en) Information processing apparatus, information processing method, and program
CN106730815B (en) Somatosensory interaction method and system easy to realize
CN111971713A (en) 3D face capture and modification using image and time tracking neural networks
US20130101164A1 (en) Method of real-time cropping of a real entity recorded in a video sequence
US6945869B2 (en) Apparatus and method for video based shooting game
CN108876886B (en) Image processing method and device and computer equipment
CN111652123B (en) Image processing and image synthesizing method, device and storage medium
CN109584358A (en) A kind of three-dimensional facial reconstruction method and device, equipment and storage medium
CN111182350B (en) Image processing method, device, terminal equipment and storage medium
KR102353556B1 (en) Apparatus for Generating Facial expressions and Poses Reappearance Avatar based in User Face
CN112199016A (en) Image processing method, image processing device, electronic equipment and computer readable storage medium
CN110267079B (en) Method and device for replacing human face in video to be played
CN112348841B (en) Virtual object processing method and device, electronic equipment and storage medium
CN113645476A (en) Picture processing method and device, electronic equipment and storage medium
CN110719415B (en) Video image processing method and device, electronic equipment and computer readable medium
CN109885172B (en) Object interaction display method and system based on Augmented Reality (AR)
CN116958344A (en) Animation generation method and device for virtual image, computer equipment and storage medium
CN116962748A (en) Live video image rendering method and device and live video system
Leung et al. Realistic video avatar
KR100422470B1 (en) Method and apparatus for replacing a model face of moving image
CN117274141A (en) Chrominance matting method and device and video live broadcast system
US11287658B2 (en) Picture processing device, picture distribution system, and picture processing method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination