CN112561790A - Live-broadcast class portrait slimming processing method and device and electronic equipment - Google Patents

Info

Publication number
CN112561790A
Authority
CN
China
Prior art keywords
portrait
picture
slimming
position information
scene video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011539638.4A
Other languages
Chinese (zh)
Inventor
杨森
蔡红
王岩
安�晟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zuoyebang Education Technology Beijing Co Ltd
Original Assignee
Zuoyebang Education Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zuoyebang Education Technology Beijing Co Ltd filed Critical Zuoyebang Education Technology Beijing Co Ltd
Priority to CN202011539638.4A priority Critical patent/CN112561790A/en
Publication of CN112561790A publication Critical patent/CN112561790A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/04Context-preserving transformations, e.g. by using an importance map
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/20Education
    • G06Q50/205Education administration or guidance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/003D [Three Dimensional] image rendering
    • G06T15/50Lighting effects
    • G06T15/506Illumination models
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/003D [Three Dimensional] image rendering
    • G06T15/50Lighting effects
    • G06T15/80Shading
    • G06T15/83Phong shading
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Health & Medical Sciences (AREA)
  • Tourism & Hospitality (AREA)
  • Strategic Management (AREA)
  • Computer Graphics (AREA)
  • Health & Medical Sciences (AREA)
  • Educational Technology (AREA)
  • Educational Administration (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • General Business, Economics & Management (AREA)
  • Image Processing (AREA)

Abstract

The invention belongs to the technical field of live-broadcast classes and provides a portrait slimming method, a device and electronic equipment for live-broadcast classes. The method comprises the following steps: respectively acquiring a first scene video picture containing a portrait and a second scene video picture containing blackboard writing in a live-broadcast class; extracting a portrait picture from the first scene video picture; locating the slimming position information of the portrait picture; slimming the portrait picture according to the slimming position information to obtain a slimmed portrait picture; fusing the slimmed portrait picture with the second scene video picture; and outputting the fused video picture. The invention slims the portrait in a live-broadcast class in real time while keeping the blackboard writing undeformed, which improves the teacher's on-screen image and thereby the students' enthusiasm and concentration in class.

Description

Live-broadcast class portrait slimming processing method and device and electronic equipment
Technical Field
The invention belongs to the technical field of online education, is particularly suitable for online live-broadcast classes, and specifically relates to a live-broadcast class portrait slimming processing method and device, electronic equipment and a computer-readable medium.
Background
With the development of internet technology and growing attention to education, internet live-broadcast classes have become popular with students and parents because they offer guidance from well-known teachers at a lower price than traditional courses.
Internet live-broadcast classes generally require students to watch through a display device that can connect to the internet, such as a tablet computer or a smartphone, and to complete the learning process of attending lessons and doing exercises. Teachers in a live-broadcast class interact with the students through network video. A good teacher image in the live video can attract students to attend, and improves their enthusiasm and concentration in class. Therefore, how to slim the teacher's portrait in a live-broadcast class and improve the teacher's image has become an urgent problem for live-broadcast class video.
Disclosure of Invention
(I) Technical problem to be solved
The technical problem to be solved by the invention is how to slim the teacher's portrait in a live-broadcast class and improve the teacher's image, thereby improving students' enthusiasm and concentration in class.
(II) technical scheme
In order to solve the technical problem, the invention provides a live-broadcast class portrait slimming method on one hand, which comprises the following steps:
respectively acquiring a first scene video picture containing a portrait and a second scene video picture containing a blackboard writing in a live broadcast class;
extracting a portrait picture in the first scene video picture;
positioning the slimming position information of the portrait picture;
carrying out slimming treatment on the slimming position information to obtain a portrait picture after slimming;
fusing the slim portrait picture with the second scene video picture;
and outputting the fused video picture.
According to a preferred embodiment of the present invention, the positioning the slimming position information of the portrait picture includes:
extracting key part position information and key part height information of the portrait picture;
determining the standard position information of the key part according to the height information of the key part;
and comparing the key part position information with the key part standard position information to obtain the slimming position information needing to be slimmed.
According to a preferred embodiment of the present invention, the portrait image is input into a deep learning model, so as to obtain the position information of the key part of the portrait image.
According to a preferred embodiment of the present invention, before the portrait image is input into the deep learning model and the key location information of the portrait image is obtained, the method further includes:
acquiring portrait pictures in a historical live broadcast course as a sample set;
marking the position information of the key part in the sample set picture;
and training a deep learning model according to the sample set and the position information of the key part.
According to a preferred embodiment of the present invention, the pixels of the slimming position information are mapped to a preset range by image warping, so as to obtain a portrait picture after slimming.
According to a preferred embodiment of the present invention, the fusing the slimmed portrait picture and the second scene video picture includes:
pasting the portrait picture after slimming to a position except for the blackboard writing in the second scene video picture;
and processing the pasting boundary of the portrait picture and the second scene video picture by adopting a Gaussian aliasing algorithm.
According to a preferred embodiment of the present invention, after the fusing the slimmed portrait picture with the second scene video picture, the method further includes:
and processing the fused picture by adopting a lighting model of graphics.
According to a preferred embodiment of the invention, the method further comprises:
and when a scene switching instruction is received, switching the second scene video picture to a corresponding scene picture.
According to a preferred embodiment of the invention, the method further comprises:
when receiving the key knowledge prompting instruction, prompting key knowledge information.
In a second aspect, the invention provides a live-broadcast class portrait slimming processing device, which comprises:
the acquisition module is used for respectively acquiring a first scene video picture containing a portrait and a second scene video picture containing a blackboard writing picture in a live broadcast class;
the extraction module is used for extracting a portrait picture in the first scene video picture;
the positioning module is used for positioning the slimming position information of the portrait picture;
the slimming processing module is used for carrying out slimming processing on the slimming position information to obtain a portrait picture after slimming;
the fusion module is used for fusing the slim portrait picture with the second scene video picture;
and the output module is used for outputting the fused video pictures.
A third aspect of the invention proposes an electronic device comprising a processor and a memory for storing a computer-executable program, which, when executed by the processor, performs the method.
The fourth aspect of the present invention also provides a computer-readable medium storing a computer-executable program, which when executed, implements the method.
(III) advantageous effects
The method collects at least two scene videos of a live-broadcast class in real time: one consists of first scene video pictures containing a portrait, and the other consists of second scene video pictures containing blackboard writing. A portrait picture is extracted from the first scene video picture; the slimming position information of the portrait picture is located; slimming is then applied according to the slimming position information, so that the portrait in the live-broadcast class is slimmed independently; the slimmed portrait picture is then fused with the second scene video picture; and the fused video picture is output. This slims the portrait in real time while keeping the blackboard writing undeformed, which improves the teacher's image in the live-broadcast class and thereby the students' enthusiasm and concentration in class.
According to the invention, the pasting boundary between the portrait picture and the second scene video picture is processed with a Gaussian aliasing algorithm, which effectively reduces the mosaic-like artifacts at the pasting boundary caused by inconsistent illumination and improves the quality of the fused video picture.
According to the invention, when a scene switching instruction is received, the second scene video picture is switched to the corresponding scene picture, so that the learning attention of students can be further improved, and the classroom atmosphere is improved.
When the key knowledge prompting instruction is received, the invention prompts key knowledge information, deepens the attention and memory of students to the key knowledge and improves the learning effect.
Drawings
FIG. 1 is a schematic flow chart of a method for processing a portrait of a live broadcast class for slimming according to the present invention;
FIG. 2 is a schematic diagram of the present invention for locating the slimming position information of the portrait image;
FIG. 3a is a schematic diagram of a live-broadcast class of the present invention showing the slimmed teacher and the blackboard writing in real time;
fig. 3b is a schematic diagram of automatically switching a second scene video picture after receiving a scene switching instruction in the live broadcasting process of fig. 3 a;
FIG. 4 is a schematic illustration of an emphasis display of the present invention;
FIG. 5 is a schematic structural diagram of a device for processing a portrait of a live-broadcast class for slimming according to the present invention;
FIG. 6 is a schematic diagram of the structure of an electronic device of one embodiment;
fig. 7 is a schematic diagram of a computer-readable recording medium of an embodiment of the present invention.
Detailed Description
In describing particular embodiments, specific details of structures, properties, effects, or other features are set forth in order to provide a thorough understanding of the embodiments by one skilled in the art. However, it is not excluded that a person skilled in the art may implement the invention in a specific case without the above-described structures, performances, effects or other features.
The flow chart in the drawings is only an exemplary flow demonstration, and does not represent that all the contents, operations and steps in the flow chart are necessarily included in the scheme of the invention, nor does it represent that the execution is necessarily performed in the order shown in the drawings. For example, some operations/steps in the flowcharts may be divided, some operations/steps may be combined or partially combined, and the like, and the execution order shown in the flowcharts may be changed according to actual situations without departing from the gist of the present invention.
The block diagrams in the figures generally represent functional entities and do not necessarily correspond to physically separate entities. I.e. these functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different network and/or processing unit devices and/or microcontroller devices.
The same reference numerals denote the same or similar elements, components, or parts throughout the drawings, and thus, a repetitive description thereof may be omitted hereinafter. It will be further understood that, although the terms first, second, third, etc. may be used herein to describe various elements, components, or sections, these elements, components, or sections should not be limited by these terms. That is, these phrases are used only to distinguish one from another. For example, a first device may also be referred to as a second device without departing from the spirit of the present invention. Furthermore, the term "and/or", "and/or" is intended to include all combinations of any one or more of the listed items.
The method is mainly used for slimming the teacher in an online live-broadcast class. Because online live broadcast is a real-time scene rather than a single static image, the speed requirement is high and real-time processing must be ensured. In addition, the teacher writes on the display screen during the class, and if slimming is simply applied to the teacher, the blackboard writing within the slimming area is deformed and distorted. The method therefore needs to ensure both real-time performance and undistorted blackboard writing after slimming. On this basis, at least two scene videos of a live-broadcast class are collected in real time: one consists of first scene video pictures containing a portrait, and the other consists of second scene video pictures containing blackboard writing. A portrait picture is extracted from the first scene video picture; the slimming position information of the portrait picture is located; slimming is then applied according to the slimming position information, so that the portrait in the live-broadcast class is slimmed independently; the slimmed portrait picture is then fused with the second scene video picture; and the fused video picture is output. This slims the portrait in real time while keeping the blackboard writing undeformed.
In a specific embodiment, the invention extracts the key part position information and the key part height information of the portrait picture; determines the key part standard position information from the key part height information, based on the standard distribution of body build versus height; and compares the key part position information with the key part standard position information to obtain the slimming position information. The key parts of the portrait picture are the parts that need to be slimmed, including but not limited to the waist and the face. Correspondingly, the key part position information of the portrait picture can be the waist width, the face width and the like. The key part position information can be obtained through a pre-trained deep learning model.
The invention uses image warping to map the pixels at the slimming position into a preset range to obtain the slimmed portrait picture. The preset range is a slimming range set according to the portrait height.
The slimmed portrait picture is pasted at a position in the second scene video picture other than the blackboard writing, and the pasting boundary between the portrait picture and the second scene video picture is processed with a Gaussian aliasing algorithm. This effectively reduces the mosaic-like artifacts at the pasting boundary caused by inconsistent illumination and improves the quality of the fused video picture. The fused picture is further processed with a graphics lighting model, which further reduces the illumination inconsistency between the portrait picture and the second scene video picture and improves the realism of the fused picture.
According to the invention, when a scene switching instruction is received, the second scene video picture is switched to the corresponding scene picture, so that the learning attention of students can be further improved, and the classroom atmosphere is improved.
When the key knowledge prompting instruction is received, the invention prompts key knowledge information, deepens the attention and memory of students to the key knowledge and improves the learning effect.
In order that the objects, technical solutions and advantages of the present invention will become more apparent, the present invention will be further described in detail with reference to the accompanying drawings in conjunction with the following specific embodiments.
Fig. 1 is a schematic flow chart of a live-broadcast class portrait slimming processing method of the present invention, as shown in fig. 1, the method includes the following steps:
S1, respectively collecting a first scene video picture containing a portrait and a second scene video picture containing blackboard writing in a live-broadcast class;
To slim the portrait in a live-broadcast class while preserving real-time performance and keeping the blackboard writing undistorted after slimming, the scene containing the portrait and the scene containing the blackboard writing need to be collected in real time. Specifically, two cameras can shoot simultaneously: one collects the first scene video pictures containing the portrait in real time, and the other collects the second scene video pictures containing the blackboard writing in real time. For example, the first camera captures the real scene of the teacher giving the live-broadcast class, and the second camera captures the pure blackboard-writing scene.
To ensure that the portrait is slimmed in real time, the video pictures are the pictures contained in the video that is generated by the live-broadcast class and played in real time with it. To balance the speed and the effect of slimming, every frame of that video can be selected as a video picture, or one picture can be selected every few frames.
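The frame selection described above can be sketched as a small generator. This is only an illustration: the function name `sample_frames` and the `step` parameter are hypothetical and do not come from the patent text.

```python
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def sample_frames(frames: Iterable[T], step: int = 1) -> Iterator[T]:
    """Yield every `step`-th frame; step=1 keeps every frame."""
    if step < 1:
        raise ValueError("step must be >= 1")
    for index, frame in enumerate(frames):
        # Keep frame 0, step, 2*step, ... and skip the frames in between.
        if index % step == 0:
            yield frame
```

A larger `step` lowers the processing load at the cost of updating the slimmed portrait less often.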
S2, extracting a portrait picture in the first scene video picture;
Specifically, the teacher's image may be cut out of the first scene video picture using an image segmentation technique. Image segmentation is the process of dividing an image into several regions with similar properties, and can be applied to scene object segmentation, human body/background segmentation, face and human body parsing, three-dimensional reconstruction and the like. Image segmentation methods fall broadly into three categories: graph theory-based methods, pixel clustering-based methods, and deep semantic-based methods. The invention preferably employs a deep semantic-based method to extract the teacher's image from the first scene video picture.
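Once a segmentation method has produced a binary person mask, cutting out the portrait amounts to attaching that mask as an alpha channel. The sketch below assumes the mask is already available (the segmentation model itself is not specified here) and uses NumPy; `extract_portrait` is a hypothetical helper name.

```python
import numpy as np


def extract_portrait(frame: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Return an H x W x 4 cut-out: colour from `frame`, alpha from `mask`.

    frame: H x W x 3 uint8 picture from the first scene video.
    mask:  H x W array, non-zero where the segmentation marks the person.
    """
    if frame.shape[:2] != mask.shape:
        raise ValueError("frame and mask must have the same height and width")
    # 255 = opaque portrait pixel, 0 = transparent background pixel.
    alpha = (mask > 0).astype(np.uint8) * 255
    return np.dstack([frame, alpha])
```

Keeping the background transparent rather than deleting it lets the later fusion step decide how the portrait is combined with the second scene video picture.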
S3, positioning the slimming position information of the portrait picture;
Illustratively, the invention extracts the key part position information and the key part height information of the portrait picture; determines the key part standard position information from the key part height information, based on the standard distribution of body build versus height; and compares the key part position information with the key part standard position information to obtain the slimming position information.
According to the invention, waist slimming can be achieved by reducing the width of the teacher's waist, and face slimming by reducing the width of the teacher's face. The key parts of the portrait picture therefore include, but are not limited to, the waist and the face. The key part height of the portrait picture corresponds to the part to be slimmed: if the waist is to be slimmed, the key part height is the height of the portrait; if the face is to be slimmed, the key part height is the height of the face. Correspondingly, the key part position information may be the waist width, in which case the key part height information is the portrait height; or it may be the face width, in which case the key part height information is the face height.
In the invention, the key part position information of the portrait picture can be obtained through a pre-trained deep learning model. To train the model, portrait pictures from historical live-broadcast classes can be obtained from historical data or public data sets as a sample set; the key part position information (such as waist width and face width) is labeled in the sample set pictures; and the deep learning model is trained on the sample set and the labeled key part position information. Further, a portrait picture to be verified can be input into the trained deep learning model to obtain predicted key part position information; the actual key part position information of the picture to be verified is compared with the predicted key part position information, and a loss function is calculated; if the loss is smaller than a preset value, the model is taken as the finally trained deep learning model. The deep learning model can be implemented with network structures such as CNN, Hourglass, Attention, Transformer and LSTM.
Similarly, the key part height information of the portrait picture can also be obtained by a trained deep learning model. A correspondence table can be configured in advance, in the coordinate system of the first scene video picture, according to the standard distribution of body build versus height. The table contains the key part standard position information corresponding to each key part height. The standard position information for the current portrait is determined from the extracted key part height information and the table, and the key part position information extracted from the current portrait is compared with it to obtain the slimming position information. The slimming position information includes the coordinates of the part to be slimmed (such as the waist or the face) and the slimming length (that is, the difference between the key part position information and the key part standard position information in the first scene video picture coordinate system).
Taking waist slimming of the teacher in a live-broadcast class as an example, as shown in fig. 2, in this step the waist width D and the teacher height H of the teacher image 21 in the first scene video picture are obtained through the trained deep learning model, the standard waist width D_K corresponding to the teacher height H is looked up in the correspondence table, and the teacher's waist width D is then compared with the standard waist width D_K to determine the coordinates of the slimming part and the slimming length.
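The table lookup and comparison in the waist example can be sketched as follows. The table values, the nearest-row-at-or-below lookup rule, and the clamping of negative differences to zero are all assumptions made for illustration; the text only states that D_K is found from H and compared with D.

```python
from bisect import bisect_right

# Hypothetical correspondence table in picture coordinates:
# (portrait height H in pixels, standard waist width D_K in pixels).
STANDARD_WAIST = [(300, 60), (400, 80), (500, 100), (600, 120)]


def standard_waist_width(height_px: float) -> float:
    """Look up D_K for height H using the nearest table row at or below H."""
    heights = [h for h, _ in STANDARD_WAIST]
    # Heights below the first row fall back to the first row.
    idx = max(bisect_right(heights, height_px) - 1, 0)
    return STANDARD_WAIST[idx][1]


def slimming_length(waist_px: float, height_px: float) -> float:
    """Return D - D_K, clamped at zero when the waist is within the standard."""
    return max(waist_px - standard_waist_width(height_px), 0.0)
```

For instance, with this illustrative table a portrait of height 450 and waist width 95 maps to D_K = 80, so the slimming length is 15 pixels.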
S4, carrying out slimming treatment on the slimming position information to obtain a portrait picture after slimming;
The invention uses image warping to map the pixels at the slimming position into a preset range to obtain the slimmed portrait picture. Specifically, the coordinates of the slimming part are converted into slimming-part pixels according to the coordinate-to-pixel conversion relation of the first scene video picture, the slimming length is converted into the corresponding number of slimming pixels, and the slimming-part pixels are then mapped into the slimming pixel range by image warping, giving the slimmed portrait picture. Image warping maps each pixel of the picture to a new position according to a given rule; in effect it is the process of solving for the new pixel coordinates x and y.
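One very simple warping rule is a horizontal squeeze of a band of rows about the portrait's centre column. The sketch below is an assumed rule chosen for illustration, not the specific warp of the invention; it uses inverse mapping with nearest-neighbour sampling in NumPy, and `slim_band` and its parameters are hypothetical names.

```python
import numpy as np


def slim_band(img: np.ndarray, y0: int, y1: int, cx: float,
              src_width: float, dst_width: float) -> np.ndarray:
    """Squeeze rows y0..y1-1 horizontally about column cx.

    A span of `src_width` pixels centred on cx is mapped into `dst_width`
    pixels. Pixels whose source falls outside the picture become 0.
    """
    out = img.copy()
    width = img.shape[1]
    scale = src_width / dst_width  # > 1 means the band gets narrower
    xs = np.arange(width)
    # Inverse mapping: for each output column, find the source column.
    src_x = np.rint(cx + (xs - cx) * scale).astype(int)
    valid = (src_x >= 0) & (src_x < width)
    band = np.zeros_like(img[y0:y1])
    band[:, valid] = img[y0:y1, src_x[valid]]
    out[y0:y1] = band
    return out
```

The scale changes abruptly at rows y0 and y1 in this sketch; a practical warp would presumably taper the scale towards the band edges so the silhouette stays smooth.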
S5, fusing the slim portrait picture with the second scene video picture;
specifically, the portrait picture after slimming is pasted to a position, except for the blackboard writing, in the second scene video picture; and processing the pasting boundary of the portrait picture and the second scene video picture by adopting a Gaussian aliasing algorithm. The mosaic phenomenon of the sticking boundary of the portrait picture and the second scene video picture caused by the inconsistent illumination can be effectively relieved, and the quality of the fused video picture is improved.
The Gaussian aliasing algorithm comprises the following steps:
S11, constructing a Gaussian residual pyramid for each of the pictures L and R to be fused (the number of layers is a preset value, level), and retaining the image at the top of the Gaussian downsampling pyramid (the smallest image, layer level + 1).
The Gaussian residual pyramid is constructed as follows, taking the picture L as an example:
(1) Perform Gaussian downsampling on the picture L to obtain downL, which can be implemented with the pyrDown() function in OpenCV. Then perform Gaussian upsampling on downL to obtain upL, which can be implemented with the pyrUp() function in OpenCV.
(2) Calculate the residual between the pictures L and upL to obtain the residual map lapL0, which serves as the image at the lowest end of the Gaussian residual pyramid.
(3) Repeat steps (1) and (2) on downL to calculate the residual maps lapL1, lapL2, lapL3, and so on. The resulting series of residual maps is the Gaussian residual pyramid.
(4) The Gaussian residual pyramid contains level images in total. The Gaussian-downsampled image topL of layer level + 1 is retained for later use.
S12, constructing a Gaussian pyramid by downsampling the binary mask, likewise using pyrDown(); this pyramid has level + 1 layers.
S13, using the mask image of each layer of the mask pyramid, merging the corresponding layers of the Gaussian residual pyramids of L and R into one image, which yields a merged Gaussian residual pyramid. At the same time, topL and topR retained in step S11 are merged into topLR using the topmost mask.
S14, taking topLR as the topmost image of the pyramid, performing Gaussian upsampling on it with the pyrUp() function to obtain upTopLR, and adding upTopLR to the corresponding layer of the residual pyramid merged in step S13 to reconstruct the image of that layer.
S15, repeating step S14 until layer 0, the image at the lowest end of the pyramid (blendmag), is reconstructed, and outputting it.
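Steps S11 to S15 can be sketched as follows for single-channel pictures whose sides are divisible by 2 to the power of the level count. To keep the example dependency-light, `down` and `up` are simple NumPy stand-ins (2x2 box averaging and nearest-neighbour enlargement) rather than the OpenCV pyrDown() and pyrUp() functions named in the text, so this is an illustration of the structure, not of the exact filters. All function names are chosen for this example.

```python
import numpy as np


def down(img):
    """2x2 box-filter downsample (stand-in for pyrDown)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def up(img):
    """Nearest-neighbour 2x upsample (stand-in for pyrUp)."""
    return img.repeat(2, axis=0).repeat(2, axis=1)


def residual_pyramid(img, levels):
    """S11: residual maps lap[0..levels-1] plus the retained top image."""
    laps, cur = [], img.astype(float)
    for _ in range(levels):
        smaller = down(cur)
        laps.append(cur - up(smaller))
        cur = smaller
    return laps, cur


def mask_pyramid(mask, levels):
    """S12: levels + 1 progressively downsampled copies of the mask."""
    pyr = [mask.astype(float)]
    for _ in range(levels):
        pyr.append(down(pyr[-1]))
    return pyr


def blend(L, R, mask, levels=3):
    laps_l, top_l = residual_pyramid(L, levels)
    laps_r, top_r = residual_pyramid(R, levels)
    masks = mask_pyramid(mask, levels)
    # S13: merge each layer; mask == 1 selects L, mask == 0 selects R.
    merged = [m * a + (1 - m) * b for m, a, b in zip(masks, laps_l, laps_r)]
    out = masks[-1] * top_l + (1 - masks[-1]) * top_r
    # S14-S15: upsample and add residuals from the top layer down to layer 0.
    for lap in reversed(merged):
        out = up(out) + lap
    return out
```

Because each residual is defined against the upsampled coarser layer, the reconstruction is exact wherever the mask selects a single picture; only the region around the mask edge is mixed, and it is mixed more broadly at coarser layers, which is what softens the pasting boundary.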
In addition, the slimmed portrait picture and the second scene video picture may differ in illumination distribution. To make the fused picture more realistic, this difference needs to be eliminated, so this step may further process the fused picture with an illumination model from computer graphics.
An illumination model in computer graphics describes the relationship between the light sources and the illumination of object surfaces in a three-dimensional scene. By solving for the illumination model parameters, the invention obtains the illumination distributions of the slimmed portrait picture and of the second scene video picture, and then corrects the illumination distribution of the slimmed portrait picture according to that of the second scene video picture so that the two are consistent. Conversely, the illumination distribution of the second scene video picture may be corrected according to that of the slimmed portrait picture. Alternatively, the illumination distributions of both pictures may be adjusted to a preset optimal illumination intensity at the same time so that they are consistent.
Specifically, computer graphics offers a variety of illumination models. Taking the Phong lighting model as an example, it consists of three components: ambient, diffuse and specular lighting.
Ambient lighting uses an ambient lighting constant to simulate the fact that, even in the dark, an object still receives some light (for example from the moon or a distant lamp). The ambient illumination $C_{amb}$ can be calculated by the following formula:
$C_{amb} = g_{amb} \otimes m_{amb}$
where $m_{amb}$ is the ambient light component of the material, which is always equal to the diffuse reflection component, $g_{amb}$ is the ambient light value of the entire scene, and $\otimes$ denotes component-wise color multiplication.
Diffuse lighting simulates the directional impact of a light source on an object. It obeys Lambert's law: the reflected light intensity is proportional to the cosine of the angle between the surface normal and the light direction, and this cosine is computed with a dot product. The diffuse illumination $C_{diff}$ can be calculated by the following formula:
$C_{diff} = (n \cdot l)\, S_{diff} \otimes m_{diff}$
where n is the surface normal vector, l is the unit vector pointing toward the light source, $m_{diff}$ is the diffuse color of the material (what is commonly regarded as the color of the object), and $S_{diff}$ is the diffuse color of the light source.
Specular lighting simulates the bright highlights that appear on glossy objects. The specular illumination $C_{spec}$ can be calculated by the following formula:
$C_{spec} = (\cos\theta)^{m_{gls}}\, S_{spec} \otimes m_{spec} = (v \cdot r)^{m_{gls}}\, S_{spec} \otimes m_{spec}$
where $\theta$ is the angle between r and v, whose cosine is given by r · v and describes the direction of the mirror reflection; v is the vector pointing toward the viewer; r is the "mirror" (reflection) vector; $m_{gls}$ is the glossiness of the material, also known as the Phong exponent; $m_{spec}$ is the specular color of the material, which controls the intensity of the highlight; and $S_{spec}$ is the specular color of the light source.
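The three Phong components can be written out as a short sketch using their commonly cited forms: ambient as the component-wise product of the scene and material ambient colors, diffuse scaled by n · l, and specular scaled by (v · r) raised to the glossiness. Clamping the dot products at zero is a common convention assumed here, not something stated in the text, and the helper names are illustrative.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def modulate(a, b):
    """Component-wise color product."""
    return tuple(x * y for x, y in zip(a, b))


def scale(color, factor):
    return tuple(x * factor for x in color)


def ambient(g_amb, m_amb):
    # C_amb: scene ambient value times the material ambient component.
    return modulate(g_amb, m_amb)


def diffuse(n, l, s_diff, m_diff):
    # C_diff: Lambert term n . l (clamped at 0, an assumption) times colors.
    return scale(modulate(s_diff, m_diff), max(dot(n, l), 0.0))


def specular(v, r, m_gls, s_spec, m_spec):
    # C_spec: cos(theta) = v . r raised to the glossiness m_gls.
    return scale(modulate(s_spec, m_spec), max(dot(v, r), 0.0) ** m_gls)
```

All vectors are expected to be unit length; a surface facing away from the light (negative n · l) then receives no diffuse contribution.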
With the above formulas, the ambient, diffuse and specular illumination of the slimmed portrait picture and of the second scene video picture can be obtained. Taking light attenuation into account gives the following illumination attenuation formula:
Figure BDA0002854537230000121
wherein: a is ambient illumination, D is diffuse reflection illumination, S is specular illumination, k is coefficient of corresponding parameter, a0、a1And a2Is an attenuation parameter.
By adjusting a of the portrait picture and/or the video picture of the second scene after slimming0、a1And a2The three parameters can realize different light intensity attenuation effects, so that the light distribution of the thin portrait picture and the second scene video picture is consistent.
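The attenuation formula itself is only available as an image in the source, so the sketch below rests on an assumption: it uses the widely used constant/linear/quadratic attenuation factor 1 / (a0 + a1·d + a2·d²) applied to the diffuse and specular terms, with the ambient term left unattenuated. The distance variable `distance` and the function names are introduced for illustration and may not match the patent's exact formula.

```python
def attenuation(distance, a0, a1, a2):
    """Assumed attenuation factor: 1 / (a0 + a1 * d + a2 * d^2)."""
    return 1.0 / (a0 + a1 * distance + a2 * distance ** 2)


def lit_intensity(A, D, S, k_a, k_d, k_s, distance, a0=1.0, a1=0.0, a2=0.0):
    """Combine ambient A, diffuse D and specular S with coefficients k.

    Only the diffuse and specular terms are attenuated (an assumption).
    """
    return k_a * A + attenuation(distance, a0, a1, a2) * (k_d * D + k_s * S)
```

Under this assumed form, raising a2 makes the light fall off faster with distance, which is the kind of adjustment the paragraph above describes for bringing the two pictures' illumination into agreement.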
S6, outputting the fused video picture.
The invention collects at least two scene videos of a live class in real time, one consisting of first scene video pictures containing a portrait and the other consisting of second scene video pictures containing blackboard writing. It extracts the portrait picture from the first scene video picture, locates the slimming position information of the portrait picture, and performs slimming processing on that position information, so that the portrait in the live class is slimmed on its own. The slimmed portrait picture is then fused with the second scene video picture, and the fused video picture is output. In this way the portrait in the live class is slimmed in real time while the blackboard writing remains undeformed.
To further improve students' attention and the classroom atmosphere, the invention switches the second scene video picture to a corresponding scene picture when a scene switching instruction is received. For example, in Fig. 3a the live class displays in real time a video containing the slimmed teacher 31 and the blackboard-writing picture 32. When the teacher explains the ancient verse "In the vast desert a lone plume of smoke rises straight; over the long river the setting sun hangs round" in the live class, the teacher can issue a scene switching instruction, for example by tapping the screen of the live-broadcast equipment, and the blackboard writing 32 in the live class is then automatically switched to a scene of the Gobi desert at sunset, as shown in Fig. 3b.
Furthermore, to deepen students' attention to and memory of key knowledge and thereby improve the learning effect, the invention prompts key knowledge information when a key knowledge prompting instruction is received. The instruction can be obtained by detecting the teacher's touch on a detection device (such as a touch screen or a physical key provided on the live-broadcast equipment). The key knowledge may be prompted by playing it at a volume higher than normal, by displaying it on the live screen in an emphasized manner, or by a combination of the two. As shown in Fig. 4, the emphasized display may show a cartoon character 11 (for example, a swinging bear) in the upper right corner of the live screen 10 with the key knowledge XXX displayed below the cartoon character 11. The emphasized display may also show the key knowledge on the live screen in an enlarged, colored font; the invention imposes no specific limitation here.
Fig. 5 is a schematic structural diagram of a live-broadcast class portrait slimming processing device. As shown in Fig. 5, the device includes:
an acquisition module 51, configured to respectively acquire a first scene video picture containing a portrait and a second scene video picture containing blackboard writing in a live-broadcast class;
an extraction module 52, configured to extract a portrait picture from the first scene video picture;
a positioning module 53, configured to locate the slimming position information of the portrait picture;
a slimming processing module 54, configured to perform slimming processing on the slimming position information to obtain a slimmed portrait picture;
a fusion module 55, configured to fuse the slimmed portrait picture with the second scene video picture; and
an output module 56, configured to output the fused video picture.
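One way to picture how modules 51 to 56 cooperate is a small pipeline object whose fields are the six modules. This is only a structural sketch: the class name `SlimmingPipeline`, the callable signatures and the `Frame` placeholder type are assumptions for illustration, and each callable would be backed by the corresponding processing described in the method embodiment.

```python
from dataclasses import dataclass
from typing import Any, Callable

Frame = Any  # placeholder type for a video picture


@dataclass
class SlimmingPipeline:
    acquire: Callable[[], tuple]           # module 51: returns (first, second) scene pictures
    extract: Callable[[Frame], Frame]      # module 52: portrait picture from first scene
    locate: Callable[[Frame], Any]         # module 53: slimming position information
    slim: Callable[[Frame, Any], Frame]    # module 54: slimmed portrait picture
    fuse: Callable[[Frame, Frame], Frame]  # module 55: fused video picture
    output: Callable[[Frame], None]        # module 56: deliver the fused picture

    def process_one_frame(self):
        """Run the six modules once, in the order S1 to S6."""
        first_scene, second_scene = self.acquire()
        portrait = self.extract(first_scene)
        position = self.locate(portrait)
        slimmed = self.slim(portrait, position)
        fused = self.fuse(slimmed, second_scene)
        self.output(fused)
        return fused
```

Keeping each module behind a plain callable reflects the remark later in the description that modules may be combined into one or split into sub-modules without changing the overall flow.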
In one embodiment, the positioning module 53 includes:
an extraction module, configured to extract key part position information and key part height information of the portrait picture;
a determining module, configured to determine key part standard position information according to the key part height information; and
a comparison module, configured to compare the key part position information with the key part standard position information to obtain the slimming position information that needs slimming.
Specifically, the portrait picture is input into a deep learning model to obtain the key part position information of the portrait picture. The apparatus further comprises:
an acquisition module, configured to acquire portrait pictures from historical live-broadcast classes as a sample set;
a marking module, configured to mark the key part position information in the pictures of the sample set; and
a training module, configured to train the deep learning model according to the sample set and the key part position information.
In one embodiment, the slimming processing module 54 uses image warping to map the pixels indicated by the slimming position information into a preset range, so as to obtain the slimmed portrait picture.
The fusion module 55 includes:
a pasting module, configured to paste the slimmed portrait picture to a position in the second scene video picture other than the blackboard writing; and
a sub-processing module, configured to process the pasting boundary between the portrait picture and the second scene video picture with a Gaussian aliasing algorithm.
Further, the apparatus further comprises:
a graphics processing module, configured to process the fused picture with an illumination model from computer graphics.
In one embodiment, the apparatus further comprises:
a switching module 57, configured to switch the second scene video picture to a corresponding scene picture when a scene switching instruction is received; and
a prompting module 58, configured to prompt key knowledge information when a key knowledge prompting instruction is received.
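The behaviour of the switching module 57 and the prompting module 58 can be sketched as a small state holder that reacts to instructions. The instruction names `"switch_scene"` and `"key_knowledge"`, the class name and the scene dictionary are hypothetical; the text only says that such instructions are received, not how they are encoded.

```python
class LiveScreenState:
    """Holds what is currently fused behind the portrait and any active prompt."""

    def __init__(self, second_scene, scenes):
        self.second_scene = second_scene  # picture currently used as the second scene
        self.scenes = scenes              # hypothetical mapping: scene name -> picture
        self.prompt = None                # key knowledge currently being emphasized

    def handle(self, instruction, payload):
        if instruction == "switch_scene":      # switching module 57
            # Unknown scene names leave the current picture unchanged.
            self.second_scene = self.scenes.get(payload, self.second_scene)
        elif instruction == "key_knowledge":   # prompting module 58
            self.prompt = payload
        return self
```

A tap on the live-broadcast screen would then translate into a call such as `state.handle("switch_scene", "desert")`, after which the fusion step uses the new second scene picture.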
Those skilled in the art will appreciate that the modules in the above-described embodiments of the apparatus may be distributed as described in the apparatus, and may be correspondingly modified and distributed in one or more apparatuses other than the above-described embodiments. The modules of the above embodiments may be combined into one module, or further split into multiple sub-modules.
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention, where the electronic device includes a processor and a memory, where the memory is used to store a computer-executable program, and when the computer program is executed by the processor, the processor executes a live-course portrait slimming processing method.
As shown in fig. 6, the electronic device is in the form of a general purpose computing device. The processor can be one or more and can work together. The invention also does not exclude that distributed processing is performed, i.e. the processors may be distributed over different physical devices. The electronic device of the present invention is not limited to a single entity, and may be a sum of a plurality of entity devices.
The memory stores a computer executable program, typically machine readable code. The computer readable program may be executed by the processor to enable an electronic device to perform the method of the invention, or at least some of the steps of the method.
The memory may include volatile memory, such as Random Access Memory (RAM) and/or cache memory, and may also be non-volatile memory, such as read-only memory (ROM).
Optionally, in this embodiment, the electronic device further includes an I/O interface for data exchange between the electronic device and an external device. The I/O interface may be one or more of several types of bus structures, including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, and a processing unit bus or local bus using any of a variety of bus architectures.
It should be understood that the electronic device shown in fig. 6 is only one example of the present invention, and elements or components not shown in the above example may be further included in the electronic device of the present invention. For example, some electronic devices further include a display unit such as a display screen, and some electronic devices further include a human-computer interaction element such as a button, a keyboard, and the like. Electronic devices are considered to be covered by the present invention as long as the electronic devices are capable of executing a computer-readable program in a memory to implement the method of the present invention or at least a part of the steps of the method.
Fig. 7 is a schematic diagram of a computer-readable recording medium according to an embodiment of the present invention. As shown in Fig. 7, the computer-readable recording medium stores a computer-executable program which, when executed, implements the live-broadcast class portrait slimming processing method of the present invention. The computer-readable medium may also be a readable signal medium, which may include a data signal propagated in baseband or as part of a carrier wave and carrying readable program code. Such a propagated data signal may take many forms, including, but not limited to, electromagnetic signals, optical signals, or any suitable combination thereof. A readable signal medium may be any readable medium other than a readable storage medium that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, or any suitable combination of the foregoing.
Program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++ as well as conventional procedural programming languages such as the "C" programming language or similar languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. Where a remote computing device is involved, it may be connected to the user's computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or it may be connected to an external computing device (for example, through the Internet using an Internet service provider).
From the above description of the embodiments, those skilled in the art will readily appreciate that the present invention can be implemented by hardware capable of executing a specific computer program, such as the system of the present invention and the electronic processing units, servers, clients, mobile phones, control units, processors and the like included in the system; the present invention can also be implemented by a vehicle including at least a part of the above system or components. The invention can also be implemented by computer software that executes the method of the invention, for example control software executed by a microprocessor, an electronic control unit, a client, a server or the like of a live-broadcast device. It should be noted that the computer software for executing the method of the present invention is not limited to being executed by one specific hardware entity; it may also be executed in a distributed manner by unspecified hardware entities. The software product may be stored in a computer-readable storage medium (such as a CD-ROM, a USB disk or a removable hard disk) or stored in a distributed manner on a network, as long as it enables an electronic device to execute the method according to the present invention.
The foregoing embodiments describe the objects, technical solutions and advantages of the present invention in further detail. It should be understood that the present invention is not inherently tied to any particular computer, virtual machine or electronic device, and that various general-purpose machines may be used to implement it. The invention is not limited to the specific embodiments described above; all modifications, changes and equivalents that come within the spirit and scope of the invention are covered by it.

Claims (10)

1. A live broadcast class portrait slimming processing method is characterized by comprising the following steps:
respectively acquiring a first scene video picture containing a portrait and a second scene video picture containing a blackboard writing in a live broadcast class;
extracting a portrait picture in the first scene video picture;
positioning the slimming position information of the portrait picture;
carrying out slimming treatment on the slimming position information to obtain a portrait picture after slimming;
fusing the slim portrait picture with the second scene video picture;
and outputting the fused video picture.
2. The live-broadcast-class portrait slimming processing method according to claim 1, wherein the positioning of the slimming position information of the portrait picture comprises:
extracting key part position information and key part height information of the portrait picture;
determining the standard position information of the key part according to the height information of the key part;
and comparing the key part position information with the key part standard position information to obtain the slimming position information needing to be slimmed.
3. The live-broadcast class portrait slimming processing method according to claim 1 or 2, wherein the portrait picture is input into a deep learning model to obtain key part position information of the portrait picture.
4. The live-broadcast class portrait slimming processing method according to any one of claims 1 to 3, wherein before the portrait picture is input into a deep learning model to obtain the key part position information of the portrait picture, the method further comprises:
acquiring portrait pictures in a historical live broadcast course as a sample set;
marking the position information of the key part in the sample set picture;
and training a deep learning model according to the sample set and the position information of the key part.
5. The live-broadcast class portrait slimming processing method of any one of claims 1 to 4, wherein the pixels of the slimming position information are mapped to a preset range by image warping to obtain a portrait picture after slimming.
6. The live-course portrait slimming processing method of any one of claims 1 to 5, wherein said fusing the thinned portrait picture with the second scene video picture comprises:
pasting the portrait picture after slimming to a position except for the blackboard writing in the second scene video picture;
processing a pasting boundary of the portrait picture and the second scene video picture by adopting a Gaussian aliasing algorithm;
optionally, after the blending the slimmed portrait picture with the second scene video picture, the method further includes:
and processing the fused picture by adopting a lighting model of graphics.
7. The live lesson portrait slimming processing method of any one of claims 1 to 6, wherein the method further comprises:
when a scene switching instruction is received, switching the second scene video picture to a corresponding scene picture;
optionally, the method further comprises:
when receiving the key knowledge prompting instruction, prompting key knowledge information.
8. A live course portrait slimming device, the device comprising:
the acquisition module is used for respectively acquiring a first scene video picture containing a portrait and a second scene video picture containing a blackboard writing picture in a live broadcast class;
the extraction module is used for extracting a portrait picture in the first scene video picture;
the positioning module is used for positioning the slimming position information of the portrait picture;
the slimming processing module is used for carrying out slimming processing on the slimming position information to obtain a portrait picture after slimming;
the fusion module is used for fusing the slim portrait picture with the second scene video picture;
and the output module is used for outputting the fused video pictures.
9. An electronic device comprising a processor and a memory, the memory for storing a computer-executable program, characterized in that:
the computer program, when executed by the processor, performs the method of any of claims 1-7.
10. A computer-readable medium storing a computer-executable program, wherein the computer-executable program, when executed, implements the method of any of claims 1-7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011539638.4A CN112561790A (en) 2020-12-23 2020-12-23 Live-broadcast class portrait slimming processing method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011539638.4A CN112561790A (en) 2020-12-23 2020-12-23 Live-broadcast class portrait slimming processing method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN112561790A true CN112561790A (en) 2021-03-26

Family

ID=75030983

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011539638.4A Pending CN112561790A (en) 2020-12-23 2020-12-23 Live-broadcast class portrait slimming processing method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN112561790A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106991654A (en) * 2017-03-09 2017-07-28 广东欧珀移动通信有限公司 Human body beautification method and apparatus and electronic installation based on depth
CN107566853A (en) * 2017-09-21 2018-01-09 北京奇虎科技有限公司 Realize the video data real-time processing method and device, computing device of scene rendering
CN107945188A (en) * 2017-11-20 2018-04-20 北京奇虎科技有限公司 Personage based on scene cut dresss up method and device, computing device
CN107977927A (en) * 2017-12-14 2018-05-01 北京奇虎科技有限公司 Stature method of adjustment and device, computing device based on view data
US20200311962A1 (en) * 2019-03-26 2020-10-01 Nec Laboratories America, Inc. Deep learning based tattoo detection system with optimized data labeling for offline and real-time processing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination