WO2022083133A1 - Method and device for remote video conference presentation - Google Patents

Method and device for remote video conference presentation

Info

Publication number
WO2022083133A1
WO2022083133A1 (application PCT/CN2021/098991, CN2021098991W)
Authority
WO
WIPO (PCT)
Prior art keywords
area
presentation
portrait
presenter
image
Prior art date
Application number
PCT/CN2021/098991
Other languages
English (en)
French (fr)
Inventor
邵猛
魏博
Original Assignee
深圳市前海手绘科技文化有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市前海手绘科技文化有限公司 filed Critical 深圳市前海手绘科技文化有限公司
Publication of WO2022083133A1 publication Critical patent/WO2022083133A1/zh

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 7/00 Television systems
    • H04N 7/14 Systems for two-way working
    • H04N 7/15 Conference systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/4302 Content synchronisation processes, e.g. decoder synchronisation
    • H04N 21/4307 Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 5/00 Details of television systems
    • H04N 5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N 5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N 5/272 Means for inserting a foreground image in a background image, i.e. inlay, outlay

Definitions

  • The invention belongs to the technical field of hand-drawn animation, and in particular relates to a method, device, electronic device and storage medium for remote video conference presentation.
  • In modern office scenarios, there are many situations that require remote presentations for remote conference discussion and sharing.
  • Remote work is usually realized through video conferencing software.
  • Its drawback is that it can only display either the presentation file or the camera video on its own; it cannot play the camera video and the presentation file at the same time, let alone allow them to interact. This reduces the richness of the remote presentation and weakens the effectiveness of presentation, discussion and sharing.
  • A method for remote video conference presentation comprises the steps of: acquiring the presentation-area image, the presenter audio and the participation-area image in real time; performing portrait recognition on the two images to obtain the presentation-area portrait and the participation-area portrait; compositing the presentation file, the presentation-area portrait and the presenter audio to obtain a presentation video; and
  • synchronously displaying the presentation video and the participation-area portrait.
  • Correspondingly, the present invention provides a device for remote video conference presentation, comprising:
  • An acquisition module, which acquires the presentation-area image, the presenter audio and the participation-area image in real time;
  • A recognition module, which performs portrait recognition on the presentation-area image and the participation-area image to obtain the presentation-area portrait and the participation-area portrait;
  • A synthesis module, which composites the presentation file, the presentation-area portrait and the presenter audio to obtain a presentation video;
  • A synchronization module, which synchronously displays the presentation video and the participation-area portrait.
  • Regarding the technical effect: participants synchronize the video captured by their own cameras in real time, so that the presenter can view a live list of participant portraits and, in a teaching scenario, monitor how students are attending the class. At the same time, participants receive the video captured by the presenter's camera in real time and use it to watch the presentation.
  • Participants can watch the presentation file played by the presenter and the presenter's portrait, and listen to the presenter's audio, all in real time. Having the presentation file and the camera video stream appear together and interact greatly improves the richness and effectiveness of remote video conference presentations.
  • The step of acquiring the presentation-area image, the presenter audio and the participation-area image in real time includes: acquiring in real time the presentation-area image captured by the camera; acquiring in real time the participation-area image captured by the camera; and acquiring in real time the presenter audio recorded by the recording device.
  • Correspondingly, the acquisition module further includes:
  • A first acquisition unit, which acquires in real time the presentation-area image captured by the camera;
  • A second acquisition unit, which acquires in real time the participation-area image captured by the camera;
  • A third acquisition unit, which acquires in real time the presenter audio recorded by the recording device.
  • Regarding the technical effect: to run a remote video conference smoothly, video and audio recording devices are needed. The presenter's camera is aimed at the presenter and captures the presentation-area image in real time, which is transmitted from the presenter's computer to the participant's computer in real time over the remote network; the participant's camera captures the participation-area image in real time, which is likewise transmitted in real time over the remote network;
  • while the presenter is speaking, the recording device records the presenter's audio in real time and transmits it from the presenter's computer to the participant's computer over the remote network.
  • The present invention transmits the presenter's image and audio to the participants in real time while simultaneously transmitting the participants' images to the presenter, so that the presenter and the participants see each other's images in sync; the participants also receive the presenter's audio at the same time, which makes the presentation video more vivid and the remote video conference more orderly.
  • The step of performing portrait recognition on the presentation-area image and the participation-area image to obtain the presentation-area portrait and the participation-area portrait is as follows:
  • identify whether a portrait exists in each image; if the portrait exists, obtain the region containing it; and blank the pixels of the presentation area and of the participation area that do not contain the portrait.
  • Correspondingly, the recognition module further includes:
  • A first recognition unit, which identifies whether a portrait exists in the presentation-area image;
  • A fourth acquisition unit, which, if the portrait exists, obtains the region containing the portrait;
  • A first processing unit, which blanks the pixels of the presentation area that do not contain the portrait;
  • A second recognition unit, which identifies whether a portrait exists in the participation-area image;
  • A fifth acquisition unit, which, if the portrait exists, obtains the region containing the portrait;
  • A second processing unit, which blanks the pixels of the participation area that do not contain the portrait.
  • Regarding the technical effect: the images of the presenter and the participants captured by the cameras usually contain both the portrait and the background, and for both participants and presenter it is enough to see only the portrait; there is no need to show the background behind it. In this application, the system therefore first identifies whether a portrait exists in the camera image. If it does, the region containing the portrait is cut out using portrait edge detection, and the background pixels outside the portrait are made transparent. As a result, both the portrait seen by the presenter and the portrait seen by the participants show only the human body, with no visible background. Cutting out the portrait in this way highlights the person and helps the presenter and the participants observe each other's expressions more clearly.
  • The step of compositing the presentation file, the presentation-area portrait and the presenter audio to obtain the presentation video is as follows:
  • the presentation file is placed above the underlying background and the presentation-area portrait above the presentation file to form a presentation image, and the presenter audio and the presentation image are then superimposed to obtain the presentation video.
  • Correspondingly, the synthesis module further includes:
  • A setting unit, which places the presentation file above the underlying background and the presentation-area portrait above the presentation file to form a presentation image;
  • A superimposing unit, which superimposes the presenter audio and the presentation image to obtain the presentation video.
  • Regarding the technical effect: the underlying background of the presentation area is configurable and supports both image and video backgrounds; the presentation file specified by the presenter is then read algorithmically, its display information and size information are obtained, and the file is rendered at the specified position on the underlying background; finally, the cut-out image containing only the human body is placed on top of the presentation file to form the presentation image.
  • More importantly, the presenter's computer superimposes the presenter audio and transmits the presentation image to the participant's computer at a specified frame rate, yielding the presentation video.
  • In addition, compositing the presentation file, the presentation-area portrait and the presenter audio to obtain the presentation video includes:
  • Receiving initial data, where the initial data includes initial position information and/or initial size information;
  • Initializing the position and size of the presentation file and the presentation-area portrait based on the initial position information and/or initial size information to obtain the presentation video.
  • After the position and size of the presentation file and the presentation-area portrait are initialized, the method further includes:
  • Receiving input adjustment data, where the adjustment data is used to adjust the initial position information and/or initial size information to obtain modified position information and/or modified size information;
  • Reprocessing the position and size of the presentation file and the presentation-area portrait based on the modified position information and/or modified size information to obtain the presentation video.
  • In addition, the method further includes: acquiring the presenter's operation information, and reprocessing the position and size of the presentation file and the presentation-area portrait based on the operation information to obtain the presentation video.
  • Acquiring the presenter's operation information includes: acquiring the presenter's operation on the presentation file.
  • Reprocessing the position and size of the presentation file and the presentation-area portrait based on the operation information to obtain the presentation video includes:
  • If the presenter's operation on the presentation file reaches a first preset condition, constraining the presentation-area portrait until the presentation-area portrait and the presentation file reach a second preset condition.
  • Correspondingly, the synthesis module includes:
  • An initial receiving unit, configured to receive initial data, where the initial data includes initial position information and/or initial size information;
  • An initialization unit, configured to initialize the position and size of the presentation file and the presentation-area portrait based on the initial position information and/or initial size information to obtain the presentation video.
  • Correspondingly, the synthesis module further includes:
  • An adjustment receiving unit, configured to receive input adjustment data, where the adjustment data is used to adjust the initial position information and/or initial size information to obtain modified position information and/or modified size information;
  • A modification unit, which reprocesses the position and size of the presentation file and the presentation-area portrait based on the modified position information and/or modified size information to obtain the presentation video.
  • An acquiring unit, configured to acquire the presenter's operation information;
  • A processing unit, configured to reprocess the position and size of the presentation file and the presentation-area portrait based on the operation information to obtain the presentation video.
  • The acquiring unit is further configured to acquire the presenter's operation on the presentation file.
  • The processing unit is further configured to perform the following step:
  • If the presenter's operation on the presentation file reaches the first preset condition, constraining the presentation-area portrait until the presentation-area portrait and the presentation file reach the second preset condition.
  • The present invention also provides an electronic device comprising a memory and a processor, where the memory stores a computer program that, when executed by the processor, implements any one of the above methods.
  • The electronic device may be a mobile terminal or a web terminal.
  • The present invention also provides a storage medium storing a computer program which, when executed by a processor, implements any one of the above methods.
  • The method and device for remote video conference presentation provided by the present invention acquire the presentation-area image, the presenter audio and the participation-area image in real time; perform portrait recognition on the presentation-area image and the participation-area image to obtain the presentation-area portrait and the participation-area portrait; composite the presentation file, the presentation-area portrait and the presenter audio to obtain a presentation video; and synchronously display the presentation video and the participation-area portrait. This realizes superimposed playback of participant portraits and the presentation file, as well as interaction between participants and the presenter, greatly improving the richness of remote presentations and the effectiveness of discussion and sharing.
  • FIG. 1 is a flowchart of a remote video conference presentation method provided by an embodiment;
  • FIG. 2 is a device architecture diagram corresponding to the method in FIG. 1, provided by an embodiment;
  • FIG. 3 is a flowchart of a method for acquiring images and audio, provided by an embodiment;
  • FIG. 4 is a device architecture diagram corresponding to the method in FIG. 3, provided by an embodiment;
  • FIG. 5 is a flowchart of matting the portrait in an image, provided by an embodiment;
  • FIG. 6 is a device architecture diagram corresponding to the method in FIG. 5, provided by an embodiment;
  • FIG. 7 is a flowchart of a method for generating a presentation video, provided by an embodiment;
  • FIG. 8 is a device architecture diagram corresponding to the method in FIG. 7, provided by an embodiment.
  • The term "storage medium" may refer to various media that can store a computer program, such as ROM, RAM, a magnetic disk or an optical disc.
  • The term "processor" may refer to a CPLD (Complex Programmable Logic Device), FPGA (Field-Programmable Gate Array), MCU (Microcontroller Unit), PLC (Programmable Logic Controller), CPU (Central Processing Unit), or other chips or circuits with data processing capabilities.
  • The term "electronic device" may refer to any device with data processing and storage capabilities, and generally includes fixed terminals, such as desktop computers, and mobile terminals, such as mobile phones, tablets and mobile robots. In addition, the technical features involved in the different embodiments of the present invention described below can be combined with each other as long as they do not conflict.
  • Referring to FIG. 1, this embodiment provides a method for remote video conference presentation, including the following steps: S1, acquire the presentation-area image, the presenter audio and the participation-area image in real time;
  • S2, perform portrait recognition on the presentation-area image and the participation-area image to obtain the presentation-area portrait and the participation-area portrait; S3, composite the presentation file, the presentation-area portrait and the presenter audio to obtain a presentation video; S4, synchronously display the presentation video and the participation-area portrait.
  • Referring to FIG. 2, this embodiment provides a device for remote video conference presentation, including:
  • An acquisition module 1, which acquires the presentation-area image, the presenter audio and the participation-area image in real time;
  • A recognition module 2, which performs portrait recognition on the presentation-area image and the participation-area image to obtain the presentation-area portrait and the participation-area portrait;
  • A synthesis module 3, which composites the presentation file, the presentation-area portrait and the presenter audio to obtain a presentation video;
  • A synchronization module 4, which synchronously displays the presentation video and the participation-area portrait.
  • Participants synchronize the video captured by their own cameras in real time, so that the presenter can view a live list of participant portraits and, in a teaching scenario, monitor how students are attending the class.
  • At the same time, participants receive the video captured by the presenter's camera in real time and use it to watch the presentation.
  • Participants can watch the presentation file played by the presenter and the presenter's portrait, and listen to the presenter's audio, all in real time. Having the presentation file and the camera video stream appear together and interact greatly improves the richness and effectiveness of remote video conference presentations.
  • Steps S1 and S2 are combined into an inseparable, integral technical means for obtaining the presentation-area portrait and the participation-area portrait.
  • Step S1 acquires the presentation-area image, the presenter audio and the participation-area image in real time, which prepares for synchronizing the presenter's portrait and the participants' portraits;
  • step S2 performs portrait recognition on the presentation-area image and the participation-area image to obtain the presentation-area portrait and the participation-area portrait.
  • Together, steps S1 and S2 complete the portrait cut-out (matting) of the presenter's portrait and the participants' portraits.
  • Participants synchronize the video captured by their own cameras in real time, so that the presenter can view a live list of participant portraits and monitor how students are attending the class; at the same time, participants receive the video captured by the presenter's camera in real time and use it to watch the presentation.
  • Steps S3 and S4 are likewise combined into an integral technical means: participants can watch the presentation file played by the presenter and the presenter's portrait, and listen to the presenter's audio, in real time. Through the simultaneous appearance and interaction of the presentation file and the camera video streams, the richness and effectiveness of remote video conference presentations are greatly improved.
  • Steps S1 through S4 are combined into an inseparable, integral technical means, so that participants can watch the presenter's presentation video and portrait in real time, while the presenter can watch in real time a portrait list composed of the portraits of all participants, enabling monitoring and coordination of all meeting participants.
  • Referring to FIG. 3, step S1 includes: S10, acquiring in real time the presentation-area image captured by the camera; S11, acquiring in real time the participation-area image captured by the camera; S12, acquiring in real time the presenter audio recorded by the recording device.
  • Referring to FIG. 4, correspondingly, the acquisition module 1 includes:
  • A first acquisition unit 10, which acquires in real time the presentation-area image captured by the camera;
  • A second acquisition unit 11, which acquires in real time the participation-area image captured by the camera;
  • A third acquisition unit 12, which acquires in real time the presenter audio recorded by the recording device.
  • In step S10, the presenter's camera is aimed at the presenter and captures the presentation-area image in real time; the presentation-area image is transmitted from the presenter's computer to the participant's computer in real time over the remote network.
  • In step S11, the participant's camera captures the participation-area image in real time, which is likewise transmitted in real time over the remote network.
  • In step S12, while the presenter is speaking, the recording device records the presenter's audio in real time, and the presenter audio is transmitted over the remote network from the presenter's computer to the participant's computer.
  • The present invention transmits the presenter's image and audio to the participants in real time while simultaneously transmitting the participants' images to the presenter, so that the presenter and the participants see each other's images in sync; the participants also receive the presenter's audio at the same time, which makes the presentation video more vivid and the remote video conference more orderly.
  • Referring to FIG. 5, step S2 includes the steps: S20, identifying whether a portrait exists in the presentation-area image; S21, if the portrait exists, obtaining the region containing the portrait; S22, blanking the pixels of the presentation area that do not contain the portrait; S23, identifying whether a portrait exists in the participation-area image; S24, if the portrait exists, obtaining the region containing the portrait; S25, blanking the pixels of the participation area that do not contain the portrait.
  • Referring to FIG. 6, correspondingly, the recognition module 2 includes:
  • A first recognition unit 20, which identifies whether a portrait exists in the presentation-area image;
  • A fourth acquisition unit 21, which, if the portrait exists, obtains the region containing the portrait;
  • A first processing unit 22, which blanks the pixels of the presentation area that do not contain the portrait;
  • A second recognition unit 23, which identifies whether a portrait exists in the participation-area image;
  • A fifth acquisition unit 24, which, if the portrait exists, obtains the region containing the portrait;
  • A second processing unit 25, which blanks the pixels of the participation area that do not contain the portrait.
  • Regarding the technical effect: the images of the presenter and the participants captured by the cameras usually contain both the portrait and the background, and for both participants and presenter it is enough to see only the portrait; there is no need to show the background behind it. Steps S20 and S23 therefore first identify whether a portrait exists in the camera image; if it does, steps S21 and S24 cut out the region containing the portrait using portrait edge detection, and steps S22 and S25 make the background pixels outside the portrait transparent. As a result, both the portrait seen by the presenter and the portrait seen by the participants show only the human body, with no visible background. Cutting out the portrait in this way highlights the person and helps the presenter and the participants observe each other's expressions more clearly.
  • Referring to FIG. 7, step S3 includes the steps: S30, placing the presentation file above the underlying background and the presentation-area portrait above the presentation file to form a presentation image; S31, superimposing the presenter audio and the presentation image to obtain the presentation video.
  • Referring to FIG. 8, correspondingly, the synthesis module 3 includes:
  • A setting unit 30, configured to place the presentation file above the underlying background and the presentation-area portrait above the presentation file to form a presentation image;
  • A superimposing unit 31, configured to superimpose the presenter audio and the presentation image to obtain the presentation video.
  • Regarding the technical effect: step S30 sets the underlying background of the presentation area, which supports both image and video backgrounds; the presentation file specified by the presenter is then read algorithmically, its display information and size information are obtained, and the file is rendered at the designated position on the underlying background. Step S31 places the cut-out image containing only the human body above the presentation file to form the presentation image.
  • Steps S30 and S31 are combined into an inseparable, integral technical means and jointly form the presentation image.
  • Within the area of the underlying background, the presentation file can be freely resized and repositioned,
  • and so can the presentation-area portrait.
  • The resulting presentation image is transmitted from the presenter's computer to the participants, while the image containing only the participant's body is transmitted to the presenter's computer over the remote network, so that the presenter and the participants appear and interact simultaneously.
  • In the present invention, compositing the presentation file, the presentation-area portrait and the presenter audio to obtain the presentation video includes:
  • Receiving initial data, where the initial data includes initial position information and/or initial size information.
  • The initial position information and/or initial size information may be preset.
  • Each time a remote video conference presentation is run, the presentation file and the presentation-area portrait have a default display layout, i.e. they are displayed according to the initial position information and/or initial size information.
  • The position and size of the presentation file and the presentation-area portrait are initialized based on the initial position information and/or initial size information to obtain the presentation video.
  • After the presentation, the corresponding presentation video is obtained; in this video the presentation file and the presentation-area portrait follow the initial position information and/or initial size information.
  • The initial position information may be relative, i.e. the position of the presentation-area portrait relative to the presentation file, or it may be the position of the presentation-area portrait relative to the display on which the video is played.
  • The initial size information may be a specific value, for example 4 inches, 9 inches, and so on.
  • After the position and size of the presentation file and the presentation-area portrait are initialized, the method further includes:
  • Receiving input adjustment data, where the adjustment data is used to adjust the initial position information and/or initial size information to obtain modified position information and/or modified size information.
  • The adjustment data allows the initial position information and/or initial size information to be adjusted appropriately for different scenarios.
  • The position and size of the presentation file and the presentation-area portrait are reprocessed based on the modified position information and/or modified size information to obtain the presentation video.
  • During an actual remote video conference presentation, the presenter may adjust the position and/or size of the presentation file and the presentation-area portrait according to the situation; the adjustment data is then taken from the user's input.
  • The adjustment data may be obtained through devices such as a mouse, a keyboard or a touch screen.
  • The method also includes acquiring the presenter's operation information. During the presentation, the presenter may operate on the presentation file and the presentation-area portrait, which involves adjusting them.
  • The position and size of the presentation file and the presentation-area portrait are reprocessed based on the operation information to obtain the presentation video.
  • Acquiring the presenter's operation information includes:
  • Acquiring the presenter's operation on the presentation file; for example, the presenter may enlarge or shrink the presentation file.
  • Reprocessing the position and size of the presentation file and the presentation-area portrait based on the operation information to obtain the presentation video includes: if the presenter's operation on the presentation file reaches a first preset condition, constraining the presentation-area portrait until the presentation-area portrait and the presentation file reach a second preset condition. The first preset condition may be a preset ratio, or a relationship between the presentation-area portrait and the presentation file; for example, if the first condition is that the presentation-area portrait occludes the presentation file, then once the presentation-area portrait occludes the presentation file, the presenter's operation has reached the first preset condition. At this point, the presentation-area portrait is constrained, for example by shrinking it.
  • The second preset condition may be that the presentation-area portrait no longer occludes the presentation file, i.e. once the presentation-area portrait has been shrunk so that it no longer blocks the presentation file, the second preset condition is reached. This step actively adjusts the presentation-area portrait and the presentation file so that the adjusted remote video conference presentation is clearer.
  • In the present invention, the synthesis module includes:
  • An initial receiving unit, configured to receive initial data, where the initial data includes initial position information and/or initial size information.
  • The initial position information and/or initial size information may be preset.
  • Each time a remote video conference presentation is run, the presentation file and the presentation-area portrait have a default display layout, i.e. they are displayed according to the initial position information and/or initial size information.
  • An initialization unit, configured to initialize the position and size of the presentation file and the presentation-area portrait based on the initial position information and/or initial size information to obtain the presentation video. After the presentation, the corresponding presentation video is obtained; in this video the presentation file and the presentation-area portrait follow the initial position information and/or initial size information.
  • The initial position information may be relative, i.e. the position of the presentation-area portrait relative to the presentation file, or it may be the position of the presentation-area portrait relative to the display on which the video is played.
  • The initial size information may be a specific value, for example 4 inches, 9 inches, and so on.
  • The synthesis module of the present invention further includes:
  • An adjustment receiving unit, configured to receive input adjustment data, where the adjustment data is used to adjust the initial position information and/or initial size information to obtain modified position information and/or modified size information.
  • The adjustment data allows the initial position information and/or initial size information to be adjusted appropriately for different scenarios.
  • A modification unit, which reprocesses the position and size of the presentation file and the presentation-area portrait based on the modified position information and/or modified size information to obtain the presentation video.
  • During an actual remote video conference presentation, the presenter may adjust the position and/or size of the presentation file and the presentation-area portrait according to the situation; the adjustment data is then taken from the user's input.
  • The present invention further includes:
  • An acquiring unit, configured to acquire the presenter's operation information.
  • During the presentation, the presenter may operate on the presentation file and the presentation-area portrait, which involves adjusting them.
  • A processing unit, configured to reprocess the position and size of the presentation file and the presentation-area portrait based on the operation information to obtain the presentation video.
  • The acquiring unit is further configured to perform the following step:
  • Acquiring the presenter's operation on the presentation file; for example, the presenter may enlarge or shrink the presentation file.
  • The processing unit of the present invention is further configured to perform the following step:
  • If the presenter's operation on the presentation file reaches the first preset condition, constraining the presentation-area portrait until the presentation-area portrait and the presentation file reach the second preset condition. The first preset condition may be a preset ratio, or a relationship between the presentation-area portrait and the presentation file; for example, if the first condition is that the presentation-area portrait occludes the presentation file, then once the presentation-area portrait occludes the presentation file, the presenter's operation has reached the first preset condition. At this point, the presentation-area portrait is constrained, for example by shrinking it.
  • The second preset condition may be that the presentation-area portrait no longer occludes the presentation file, i.e. once the presentation-area portrait has been shrunk so that it no longer blocks the presentation file, the second preset condition is reached. This step actively adjusts the presentation-area portrait and the presentation file so that the adjusted remote video conference presentation is clearer.
  • The present invention also provides an electronic device comprising a memory and a processor, where the memory stores a computer program that, when executed by the processor, implements any one of the above methods.
  • The electronic device may be a mobile terminal or a web terminal.
  • The present invention also provides a storage medium storing a computer program which, when executed by a processor, implements any one of the above methods.

Abstract

The present invention belongs to the technical field of hand-drawn animation and provides a method and device for remote video conference presentation. The presentation-area image, the presenter audio and the participation-area image are acquired in real time; portrait recognition is performed on the presentation-area image and the participation-area image to obtain the presentation-area portrait and the participation-area portrait; the presentation file, the presentation-area portrait and the presenter audio are then composited to obtain a presentation video; finally, the presentation video and the participation-area portrait are displayed synchronously, so that the presentation file and the portrait video are played in sync and the presentation quality of remote video conferences is improved.

Description

Method and device for remote video conference presentation
Technical Field
The present invention belongs to the technical field of hand-drawn animation, and in particular relates to a method, device, electronic device and storage medium for remote video conference presentation.
Background Art
In modern office scenarios, there are many situations that require remote presentations for remote conference discussion and sharing. At present, remote work is usually realized through video conferencing software, whose drawback is that it can only display either the presentation file or the camera video on its own; it cannot play the camera video and the presentation file at the same time, let alone allow them to interact. This reduces the richness of the remote presentation and weakens the effectiveness of presentation, discussion and sharing.
In summary, the prior art has the problem that the camera video and the presentation file cannot be played interactively in real time.
Summary of the Invention
A method for remote video conference presentation, characterized in that it comprises the steps of:
acquiring the presentation-area image, the presenter audio and the participation-area image in real time;
performing portrait recognition on the presentation-area image and the participation-area image to obtain the presentation-area portrait and the participation-area portrait;
compositing the presentation file, the presentation-area portrait and the presenter audio to obtain a presentation video;
synchronously displaying the presentation video and the participation-area portrait.
Correspondingly, the present invention provides a device for remote video conference presentation, characterized in that it comprises:
an acquisition module, which acquires the presentation-area image, the presenter audio and the participation-area image in real time;
a recognition module, which performs portrait recognition on the presentation-area image and the participation-area image to obtain the presentation-area portrait and the participation-area portrait;
a synthesis module, which composites the presentation file, the presentation-area portrait and the presenter audio to obtain a presentation video;
a synchronization module, which synchronously displays the presentation video and the participation-area portrait.
Regarding the technical effect: participants synchronize the video captured by their own cameras in real time, so that the presenter can view a live list of participant portraits and, in a teaching scenario, monitor how students are attending the class. At the same time, participants can also receive the video captured by the presenter's camera in real time and use it to watch the presentation.
It should also be noted that participants can watch the presentation file played by the presenter and the presenter's portrait, and listen to the presenter's audio, all in real time. Having the presentation file and the camera video stream appear together and interact greatly improves the richness and effectiveness of remote video conference presentations.
In addition, the step of acquiring the presentation-area image, the presenter audio and the participation-area image in real time includes:
acquiring in real time the presentation-area image captured by a camera;
acquiring in real time the participation-area image captured by a camera;
acquiring in real time the presenter audio recorded by a recording device.
Correspondingly, the acquisition module further includes:
a first acquisition unit, which acquires in real time the presentation-area image captured by the camera;
a second acquisition unit, which acquires in real time the participation-area image captured by the camera;
a third acquisition unit, which acquires in real time the presenter audio recorded by the recording device.
Regarding the technical effect: to run a remote video conference smoothly, video and audio recording devices are needed. The presenter's camera is aimed at the presenter and captures the presentation-area image in real time, which is transmitted from the presenter's computer to the participant's computer in real time over the remote network; the participant's camera captures the participation-area image in real time, which is likewise transmitted in real time over the remote network; while the presenter is speaking, the recording device records the presenter's audio in real time, and the presenter audio is transmitted over the remote network from the presenter's computer to the participant's computer.
It should also be noted that the present invention transmits the presenter's image and audio to the participants in real time while simultaneously transmitting the participants' images to the presenter, so that the presenter and the participants see each other's images in sync; the participants also receive the presenter's audio at the same time, which makes the presentation video more vivid and the remote video conference more orderly.
In addition, the step of performing portrait recognition on the presentation-area image and the participation-area image to obtain the presentation-area portrait and the participation-area portrait is as follows:
identifying whether a portrait exists in the presentation-area image;
if the portrait exists, obtaining the region containing the portrait;
blanking the pixels of the presentation area that do not contain the portrait;
identifying whether a portrait exists in the participation-area image;
if the portrait exists, obtaining the region containing the portrait;
blanking the pixels of the participation area that do not contain the portrait.
Correspondingly, the recognition module further includes:
a first recognition unit, which identifies whether a portrait exists in the presentation-area image;
a fourth acquisition unit, which, if the portrait exists, obtains the region containing the portrait;
a first processing unit, which blanks the pixels of the presentation area that do not contain the portrait;
a second recognition unit, which identifies whether a portrait exists in the participation-area image;
a fifth acquisition unit, which, if the portrait exists, obtains the region containing the portrait;
a second processing unit, which blanks the pixels of the participation area that do not contain the portrait.
Regarding the technical effect: the images of the presenter and the participants captured by the cameras usually contain both the portrait and the background, and for both participants and presenter it is enough to see only the portrait; there is no need to show the background behind it. In this application, the system therefore first identifies whether a portrait exists in the camera image. If it does, the region containing the portrait is cut out using portrait edge detection, and the background pixels outside the portrait are made transparent. As a result, both the portrait seen by the presenter and the portrait seen by the participants show only the human body, with no visible background. Cutting out the portrait in this way highlights the person and helps the presenter and the participants observe each other's expressions more clearly.
In addition, the step of compositing the presentation file, the presentation-area portrait and the presenter audio to obtain a presentation video is as follows:
placing the presentation file above the underlying background and the presentation-area portrait above the presentation file to form a presentation image;
superimposing the presenter audio and the presentation image to obtain the presentation video.
Correspondingly, the synthesis module further includes:
a setting unit, which places the presentation file above the underlying background and the presentation-area portrait above the presentation file to form a presentation image;
a superimposing unit, which superimposes the presenter audio and the presentation image to obtain the presentation video.
Regarding the technical effect: the underlying background of the presentation area is configurable and supports both image and video backgrounds; the presentation file specified by the presenter is then read algorithmically, its display information and size information are obtained, and the file is rendered at the specified position on the underlying background; finally, the cut-out image containing only the human body is placed on top of the presentation file to form the presentation image.
More importantly, the presenter's computer superimposes the presenter audio and transmits the presentation image to the participant's computer at a specified frame rate, yielding the presentation video.
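As an illustrative aid that is not part of the original disclosure, the short Python sketch below shows one way to pace transmission at a fixed frame rate; the 25 fps value is an assumption made for this example, since the text only speaks of a "specified frame rate".

```python
import time

FRAME_INTERVAL = 1.0 / 25.0  # assumed transmission rate of 25 frames per second

def next_send_time(deadline: float) -> float:
    """Sleep until `deadline`, then return the deadline for the following frame."""
    delay = deadline - time.monotonic()
    if delay > 0:
        time.sleep(delay)
    return deadline + FRAME_INTERVAL
```

A sending loop would call next_send_time once per composed frame so that presentation images and the accompanying audio blocks leave the presenter's computer at a steady rate.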
In addition, compositing the presentation file, the presentation-area portrait and the presenter audio to obtain the presentation video includes:
receiving initial data, where the initial data includes initial position information and/or initial size information;
initializing the position and size of the presentation file and the presentation-area portrait based on the initial position information and/or initial size information to obtain the presentation video.
In addition, after the position and size of the presentation file and the presentation-area portrait have been initialized, the method further includes:
receiving input adjustment data, where the adjustment data is used to adjust the initial position information and/or initial size information to obtain modified position information and/or modified size information;
reprocessing the position and size of the presentation file and the presentation-area portrait based on the modified position information and/or modified size information to obtain the presentation video.
In addition, the method further includes:
acquiring the presenter's operation information;
reprocessing the position and size of the presentation file and the presentation-area portrait based on the operation information to obtain the presentation video.
In addition, acquiring the presenter's operation information includes:
acquiring the presenter's operation on the presentation file;
reprocessing the position and size of the presentation file and the presentation-area portrait based on the operation information to obtain the presentation video includes:
if the presenter's operation on the presentation file reaches a first preset condition, constraining the presentation-area portrait until the presentation-area portrait and the presentation file reach a second preset condition.
Correspondingly, the synthesis module includes:
an initial receiving unit, configured to receive initial data, where the initial data includes initial position information and/or initial size information;
an initialization unit, configured to initialize the position and size of the presentation file and the presentation-area portrait based on the initial position information and/or initial size information to obtain the presentation video.
Correspondingly, the synthesis module further includes:
an adjustment receiving unit, configured to receive input adjustment data, where the adjustment data is used to adjust the initial position information and/or initial size information to obtain modified position information and/or modified size information;
a modification unit, which reprocesses the position and size of the presentation file and the presentation-area portrait based on the modified position information and/or modified size information to obtain the presentation video.
Correspondingly, the device further includes:
an acquiring unit, configured to acquire the presenter's operation information;
a processing unit, configured to reprocess the position and size of the presentation file and the presentation-area portrait based on the operation information to obtain the presentation video.
Correspondingly, the acquiring unit is further configured to perform the following step:
acquiring the presenter's operation on the presentation file;
and the processing unit is further configured to perform the following step:
if the presenter's operation on the presentation file reaches the first preset condition, constraining the presentation-area portrait until the presentation-area portrait and the presentation file reach the second preset condition.
The present invention also provides an electronic device comprising a memory and a processor, where the memory stores a computer program that, when executed by the processor, implements any one of the above methods. The electronic device may be a mobile terminal or a web terminal.
The present invention also provides a storage medium storing a computer program which, when executed by a processor, implements any one of the above methods.
The method and device for remote video conference presentation provided by the present invention acquire the presentation-area image, the presenter audio and the participation-area image in real time; perform portrait recognition on the presentation-area image and the participation-area image to obtain the presentation-area portrait and the participation-area portrait; composite the presentation file, the presentation-area portrait and the presenter audio to obtain a presentation video; and synchronously display the presentation video and the participation-area portrait. This realizes superimposed playback of participant portraits and the presentation file, as well as interaction between participants and the presenter, greatly improving the richness of remote presentations and the effectiveness of discussion and sharing.
Brief Description of the Drawings
FIG. 1 is a flowchart of a remote video conference presentation method provided by an embodiment;
FIG. 2 is a device architecture diagram corresponding to the method in FIG. 1, provided by an embodiment;
FIG. 3 is a flowchart of a method for acquiring images and audio, provided by an embodiment;
FIG. 4 is a device architecture diagram corresponding to the method in FIG. 3, provided by an embodiment;
FIG. 5 is a flowchart of matting the portrait in an image, provided by an embodiment;
FIG. 6 is a device architecture diagram corresponding to the method in FIG. 5, provided by an embodiment;
FIG. 7 is a flowchart of a method for generating a presentation video, provided by an embodiment;
FIG. 8 is a device architecture diagram corresponding to the method in FIG. 7, provided by an embodiment.
Detailed Description of the Embodiments
To make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that, in the description of the present invention, unless otherwise expressly specified and limited, the term "storage medium" may be any medium that can store a computer program, such as ROM, RAM, a magnetic disk or an optical disc. The term "processor" may be a CPLD (Complex Programmable Logic Device), FPGA (Field-Programmable Gate Array), MCU (Microcontroller Unit), PLC (Programmable Logic Controller), CPU (Central Processing Unit), or any other chip or circuit with data processing capabilities. The term "electronic device" may be any device with data processing and storage capabilities, and generally includes fixed terminals, such as desktop computers, and mobile terminals, such as mobile phones, tablets and mobile robots. In addition, the technical features involved in the different embodiments of the present invention described below can be combined with each other as long as they do not conflict.
Below, the present invention presents some preferred embodiments to teach those skilled in the art how to implement it.
Embodiment 1
Referring to FIG. 1, this embodiment provides a method for remote video conference presentation, including the following steps:
S1. Acquire the presentation-area image, the presenter audio and the participation-area image in real time;
S2. Perform portrait recognition on the presentation-area image and the participation-area image to obtain the presentation-area portrait and the participation-area portrait;
S3. Composite the presentation file, the presentation-area portrait and the presenter audio to obtain a presentation video;
S4. Synchronously display the presentation video and the participation-area portrait.
Embodiment 2
Referring to FIG. 2, correspondingly, this embodiment provides a device for remote video conference presentation, including:
an acquisition module 1, which acquires the presentation-area image, the presenter audio and the participation-area image in real time;
a recognition module 2, which performs portrait recognition on the presentation-area image and the participation-area image to obtain the presentation-area portrait and the participation-area portrait;
a synthesis module 3, which composites the presentation file, the presentation-area portrait and the presenter audio to obtain a presentation video;
a synchronization module 4, which synchronously displays the presentation video and the participation-area portrait.
It should be noted that participants synchronize the video captured by their own cameras in real time, so that the presenter can view a live list of participant portraits and, in a teaching scenario, monitor how students are attending the class. At the same time, participants can also receive the video captured by the presenter's camera in real time and use it to watch the presentation.
It should also be noted that participants can watch the presentation file played by the presenter and the presenter's portrait, and listen to the presenter's audio, all in real time. Having the presentation file and the camera video stream appear together and interact greatly improves the richness and effectiveness of remote video conference presentations.
It should also be noted that steps S1 and S2 are combined into an inseparable, integral technical means for obtaining the presentation-area portrait and the participation-area portrait. Step S1 acquires the presentation-area image, the presenter audio and the participation-area image in real time, which prepares for synchronizing the presenter's portrait and the participants' portraits; step S2 performs portrait recognition on the presentation-area image and the participation-area image to obtain the presentation-area portrait and the participation-area portrait. Together, steps S1 and S2 complete the portrait cut-out (matting) of the presenter's portrait and the participants' portraits. Participants synchronize the video captured by their own cameras in real time, so that the presenter can view a live list of participant portraits and monitor how students are attending the class; at the same time, participants can also receive the video captured by the presenter's camera in real time and use it to watch the presentation.
It should also be noted that steps S3 and S4 are combined into an inseparable, integral technical means: participants can watch the presentation file played by the presenter and the presenter's portrait, and listen to the presenter's audio, in real time. Through the simultaneous appearance and interaction of the presentation file and the camera video streams, the richness and effectiveness of remote video conference presentations are greatly improved.
It should also be noted that steps S1, S2, S3 and S4 are combined into an inseparable, integral technical means, so that participants can watch the presenter's presentation video and portrait in real time, while the presenter can watch in real time a portrait list composed of the portraits of all participants, enabling monitoring and coordination of all meeting participants.
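As an illustrative aid that is not part of the original disclosure, the following Python sketch shows one possible way to chain steps S1 through S4 on the presenter's side. All of the callables passed in (read_frame, read_audio, cut_out_portrait, compose_frame, send, running) are placeholders assumed for this example; the patent does not prescribe any particular API.

```python
from typing import Any, Callable

def presentation_loop(read_frame: Callable[[], Any],
                      read_audio: Callable[[], Any],
                      cut_out_portrait: Callable[[Any], Any],
                      compose_frame: Callable[[Any], Any],
                      send: Callable[[Any, Any], None],
                      running: Callable[[], bool]) -> None:
    """One iteration per frame: S1 acquire, S2 matting, S3 compose, S4 transmit/display."""
    while running():
        frame = read_frame()                  # S1: presentation-area image, acquired in real time
        audio = read_audio()                  # S1: presenter audio, recorded in real time
        portrait = cut_out_portrait(frame)    # S2: portrait recognition, background blanked
        canvas = compose_frame(portrait)      # S3: background + presentation file + portrait
        send(canvas, audio)                   # S4: transmit so video and audio stay in sync
```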
Embodiment 3
Referring to FIG. 3, specifically, step S1 includes:
S10. Acquire in real time the presentation-area image captured by the camera;
S11. Acquire in real time the participation-area image captured by the camera;
S12. Acquire in real time the presenter audio recorded by the recording device.
Embodiment 4
Referring to FIG. 4, correspondingly, the acquisition module 1 includes:
a first acquisition unit 10, which acquires in real time the presentation-area image captured by the camera;
a second acquisition unit 11, which acquires in real time the participation-area image captured by the camera;
a third acquisition unit 12, which acquires in real time the presenter audio recorded by the recording device.
Regarding the technical effect: to run a remote video conference smoothly, video and audio recording devices are needed. In step S10, the presenter's camera is aimed at the presenter and captures the presentation-area image in real time; the presentation-area image is transmitted from the presenter's computer to the participant's computer in real time over the remote network. In step S11, the participant's camera captures the participation-area image in real time, and the participation-area image is likewise transmitted in real time over the remote network. In step S12, while the presenter is speaking, the recording device records the presenter's audio in real time, and the presenter audio is transmitted over the remote network from the presenter's computer to the participant's computer.
It should also be noted that the present invention transmits the presenter's image and audio to the participants in real time while simultaneously transmitting the participants' images to the presenter, so that the presenter and the participants see each other's images in sync; the participants also receive the presenter's audio at the same time, which makes the presentation video more vivid and the remote video conference more orderly.
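As an illustrative aid that is not part of the original disclosure, the sketch below shows one way steps S10 to S12 could be realized with the OpenCV and sounddevice packages; the camera index, sample rate and block length are assumptions for this example, and a real implementation would keep the capture device open between frames and add the network transmission.

```python
import cv2
import sounddevice as sd

SAMPLE_RATE = 16000  # assumed audio sample rate in Hz

def grab_camera_frame(camera_index: int = 0):
    """S10/S11: read one frame from the camera aimed at the presenter or the participant.
    (A production implementation would keep the VideoCapture open instead of reopening it.)"""
    cap = cv2.VideoCapture(camera_index)
    ok, frame = cap.read()
    cap.release()
    return frame if ok else None

def record_presenter_audio(seconds: float = 0.04):
    """S12: record one short block of presenter audio from the default microphone."""
    block = sd.rec(int(seconds * SAMPLE_RATE), samplerate=SAMPLE_RATE, channels=1)
    sd.wait()  # block until the recording of this chunk is finished
    return block
```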
Embodiment 5
Referring to FIG. 5, as an improvement, step S2 includes the steps:
S20. Identify whether a portrait exists in the presentation-area image;
S21. If a portrait exists, obtain the region containing the portrait;
S22. Blank the pixels of the presentation area that do not contain the portrait;
S23. Identify whether a portrait exists in the participation-area image;
S24. If a portrait exists, obtain the region containing the portrait;
S25. Blank the pixels of the participation area that do not contain the portrait.
Embodiment 6
Referring to FIG. 6, correspondingly, the recognition module 2 further includes:
a first recognition unit 20, which identifies whether a portrait exists in the presentation-area image;
a fourth acquisition unit 21, which, if the portrait exists, obtains the region containing the portrait;
a first processing unit 22, which blanks the pixels of the presentation area that do not contain the portrait;
a second recognition unit 23, which identifies whether a portrait exists in the participation-area image;
a fifth acquisition unit 24, which, if the portrait exists, obtains the region containing the portrait;
a second processing unit 25, which blanks the pixels of the participation area that do not contain the portrait.
Regarding the technical effect: the images of the presenter and the participants captured by the cameras usually contain both the portrait and the background, and for both participants and presenter it is enough to see only the portrait; there is no need to show the background behind it. Steps S20 and S23 therefore first identify whether a portrait exists in the camera image; if it does, steps S21 and S24 cut out the region containing the portrait using portrait edge detection, and steps S22 and S25 make the background pixels outside the portrait transparent. As a result, both the portrait seen by the presenter and the portrait seen by the participants show only the human body, with no visible background. Cutting out the portrait in this way highlights the person and helps the presenter and the participants observe each other's expressions more clearly.
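As an illustrative aid that is not part of the original disclosure, the sketch below shows the blanking of non-portrait pixels described in steps S22 and S25, assuming a person mask has already been produced by some portrait edge detection or segmentation method (the patent does not prescribe which); OpenCV and NumPy are assumed.

```python
import cv2
import numpy as np

def cut_out_portrait(frame_bgr: np.ndarray, person_mask: np.ndarray):
    """Steps S20-S22 / S23-S25: keep only the portrait region and blank the rest.

    `person_mask` is an HxW uint8 mask (255 = portrait pixel) produced by any person
    segmentation or portrait edge detection step; its source is left open here.
    """
    if person_mask is None or cv2.countNonZero(person_mask) == 0:
        return None  # no portrait found in this frame (negative case of S20/S23)
    rgba = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2BGRA)
    rgba[:, :, 3] = person_mask  # background pixels get alpha 0, i.e. blank/transparent
    return rgba
```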
Embodiment 7
Referring to FIG. 7, specifically, step S3 includes the steps:
S30. Place the presentation file above the underlying background and the presentation-area portrait above the presentation file to form a presentation image;
S31. Superimpose the presenter audio and the presentation image to obtain the presentation video.
Embodiment 8
Referring to FIG. 8, correspondingly, the synthesis module 3 includes:
a setting unit 30, configured to place the presentation file above the underlying background and the presentation-area portrait above the presentation file to form a presentation image;
a superimposing unit 31, configured to superimpose the presenter audio and the presentation image to obtain the presentation video.
Regarding the technical effect: step S30 sets the underlying background of the presentation area, which supports both image and video backgrounds; the presentation file specified by the presenter is then read algorithmically, its display information and size information are obtained, and the file is rendered at the designated position on the underlying background. Step S31 places the cut-out image containing only the human body above the presentation file to form the presentation image.
It should also be noted that steps S30 and S31 are combined into an inseparable, integral technical means and jointly form the presentation image.
It should also be noted that, within the area of the underlying background, the presentation file can be freely resized and repositioned, and so can the presentation-area portrait.
It should also be noted that the resulting presentation image is transmitted from the presenter's computer to the participants, while the image containing only the participant's body is transmitted to the presenter's computer over the remote network, so that the presenter and the participants appear and interact simultaneously.
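As an illustrative aid that is not part of the original disclosure, the sketch below composites one presentation image in the layer order described in step S30: underlying background at the bottom, presentation file above it, and the alpha-matted portrait on top. The positions are assumed to lie fully inside the canvas; bounds checking is omitted for brevity.

```python
import numpy as np

def compose_presentation_image(background: np.ndarray,
                               slide: np.ndarray, slide_xy: tuple,
                               portrait_rgba: np.ndarray, portrait_xy: tuple) -> np.ndarray:
    """Layer order from bottom to top: background, presentation file (slide), portrait."""
    canvas = background.copy()
    x, y = slide_xy
    h, w = slide.shape[:2]
    canvas[y:y + h, x:x + w] = slide  # presentation file layer
    if portrait_rgba is not None:
        px, py = portrait_xy
        ph, pw = portrait_rgba.shape[:2]
        alpha = portrait_rgba[:, :, 3:4].astype(np.float32) / 255.0
        roi = canvas[py:py + ph, px:px + pw].astype(np.float32)
        blended = alpha * portrait_rgba[:, :, :3] + (1.0 - alpha) * roi
        canvas[py:py + ph, px:px + pw] = blended.astype(np.uint8)  # portrait layer with alpha
    return canvas
```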
Embodiment 9
In the present invention, compositing the presentation file, the presentation-area portrait and the presenter audio to obtain the presentation video includes:
Receiving initial data, where the initial data includes initial position information and/or initial size information. The initial position information and/or initial size information may be preset. Each time a remote video conference presentation is run, the presentation file and the presentation-area portrait have a default display layout, i.e. they are displayed according to the initial position information and/or initial size information.
Initializing the position and size of the presentation file and the presentation-area portrait based on the initial position information and/or initial size information to obtain the presentation video. After the presentation, the corresponding presentation video is obtained; in this video the presentation file and the presentation-area portrait follow the initial position information and/or initial size information.
The initial position information may be relative, i.e. the position of the presentation-area portrait relative to the presentation file, or it may be the position of the presentation-area portrait relative to the display on which the video is played. The initial size information may be a specific value, for example 4 inches, 9 inches, and so on.
In addition, after the position and size of the presentation file and the presentation-area portrait have been initialized, the method further includes:
Receiving input adjustment data, where the adjustment data is used to adjust the initial position information and/or initial size information to obtain modified position information and/or modified size information. The adjustment data allows the initial position information and/or initial size information to be adjusted appropriately for different scenarios.
Reprocessing the position and size of the presentation file and the presentation-area portrait based on the modified position information and/or modified size information to obtain the presentation video. During an actual remote video conference presentation, the presenter may adjust the position and/or size of the presentation file and the presentation-area portrait according to the situation; the adjustment data is then taken from the user's input.
The adjustment data may be obtained through devices such as a mouse, a keyboard or a touch screen.
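As an illustrative aid that is not part of the original disclosure, the sketch below shows one way the initial position and size information and the subsequent adjustment data could be represented; the field names and the restriction to portrait adjustments are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Layout:
    slide_pos: Tuple[int, int]      # initial position information of the presentation file
    slide_size: Tuple[int, int]     # initial size information of the presentation file
    portrait_pos: Tuple[int, int]   # initial position information of the portrait
    portrait_size: Tuple[int, int]  # initial size information of the portrait

def apply_adjustment(layout: Layout,
                     new_portrait_pos: Optional[Tuple[int, int]] = None,
                     new_portrait_size: Optional[Tuple[int, int]] = None) -> Layout:
    """Turn adjustment data (e.g. a mouse drag or resize) into modified position/size."""
    if new_portrait_pos is not None:
        layout.portrait_pos = new_portrait_pos
    if new_portrait_size is not None:
        layout.portrait_size = new_portrait_size
    return layout
```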
In addition, the method further includes:
Acquiring the presenter's operation information. During the presentation, the presenter may operate on the presentation file and the presentation-area portrait, which involves adjusting them.
Reprocessing the position and size of the presentation file and the presentation-area portrait based on the operation information to obtain the presentation video.
In addition, acquiring the presenter's operation information includes:
Acquiring the presenter's operation on the presentation file; for example, the presenter may enlarge or shrink the presentation file.
In the present invention, reprocessing the position and size of the presentation file and the presentation-area portrait based on the operation information to obtain the presentation video includes:
If the presenter's operation on the presentation file reaches a first preset condition, constraining the presentation-area portrait until the presentation-area portrait and the presentation file reach a second preset condition. The first preset condition may be a preset ratio, or a relationship between the presentation-area portrait and the presentation file; for example, if the first condition is that the presentation-area portrait occludes the presentation file, then once the presentation-area portrait occludes the presentation file, the presenter's operation has reached the first preset condition. At this point, the presentation-area portrait is constrained, for example by shrinking it. The second preset condition may be that the presentation-area portrait no longer occludes the presentation file, i.e. once the presentation-area portrait has been shrunk so that it no longer blocks the presentation file, the second preset condition is reached. This step actively adjusts the presentation-area portrait and the presentation file so that the adjusted remote video conference presentation is clearer.
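As an illustrative aid that is not part of the original disclosure, the sketch below implements one simple reading of the first and second preset conditions: the first condition is taken to be "the portrait overlaps the presentation file" and the constraint is a stepwise shrink until the overlap disappears; the shrink step and minimum scale are assumptions for this example.

```python
def rects_overlap(a, b) -> bool:
    """a and b are (x, y, w, h) rectangles; return True if they intersect."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def constrain_portrait(portrait: tuple, slide: tuple,
                       shrink_step: float = 0.9, min_scale: float = 0.3) -> tuple:
    """If the portrait occludes the slide (first preset condition), shrink it step by step
    until it no longer does (second preset condition); min_scale bounds the loop."""
    x, y, w, h = portrait
    scale = 1.0
    while rects_overlap((x, y, w, h), slide) and scale > min_scale:
        scale *= shrink_step
        w = int(w * shrink_step)
        h = int(h * shrink_step)
    return (x, y, w, h)
```

Shrinking keeps the portrait's top-left corner fixed; a fuller implementation might also move the portrait, which is another valid way of satisfying the second preset condition.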
Embodiment 10
In the present invention, the synthesis module includes:
an initial receiving unit, configured to receive initial data, where the initial data includes initial position information and/or initial size information. The initial position information and/or initial size information may be preset. Each time a remote video conference presentation is run, the presentation file and the presentation-area portrait have a default display layout, i.e. they are displayed according to the initial position information and/or initial size information;
an initialization unit, configured to initialize the position and size of the presentation file and the presentation-area portrait based on the initial position information and/or initial size information to obtain the presentation video. After the presentation, the corresponding presentation video is obtained; in this video the presentation file and the presentation-area portrait follow the initial position information and/or initial size information.
The initial position information may be relative, i.e. the position of the presentation-area portrait relative to the presentation file, or it may be the position of the presentation-area portrait relative to the display on which the video is played. The initial size information may be a specific value, for example 4 inches, 9 inches, and so on.
The synthesis module of the present invention further includes:
an adjustment receiving unit, configured to receive input adjustment data, where the adjustment data is used to adjust the initial position information and/or initial size information to obtain modified position information and/or modified size information. The adjustment data allows the initial position information and/or initial size information to be adjusted appropriately for different scenarios;
a modification unit, which reprocesses the position and size of the presentation file and the presentation-area portrait based on the modified position information and/or modified size information to obtain the presentation video. During an actual remote video conference presentation, the presenter may adjust the position and/or size of the presentation file and the presentation-area portrait according to the situation; the adjustment data is then taken from the user's input.
The present invention further includes:
an acquiring unit, configured to acquire the presenter's operation information. During the presentation, the presenter may operate on the presentation file and the presentation-area portrait, which involves adjusting them;
a processing unit, configured to reprocess the position and size of the presentation file and the presentation-area portrait based on the operation information to obtain the presentation video.
Correspondingly, the acquiring unit is further configured to perform the following step:
acquiring the presenter's operation on the presentation file; for example, the presenter may enlarge or shrink the presentation file.
The processing unit of the present invention is further configured to perform the following step:
if the presenter's operation on the presentation file reaches the first preset condition, constraining the presentation-area portrait until the presentation-area portrait and the presentation file reach the second preset condition. The first preset condition may be a preset ratio, or a relationship between the presentation-area portrait and the presentation file; for example, if the first condition is that the presentation-area portrait occludes the presentation file, then once the presentation-area portrait occludes the presentation file, the presenter's operation has reached the first preset condition. At this point, the presentation-area portrait is constrained, for example by shrinking it. The second preset condition may be that the presentation-area portrait no longer occludes the presentation file, i.e. once the presentation-area portrait has been shrunk so that it no longer blocks the presentation file, the second preset condition is reached. This step actively adjusts the presentation-area portrait and the presentation file so that the adjusted remote video conference presentation is clearer.
In addition, the present invention also provides an electronic device comprising a memory and a processor, where the memory stores a computer program that, when executed by the processor, implements any one of the above methods. The electronic device may be a mobile terminal or a web terminal.
The present invention also provides a storage medium storing a computer program which, when executed by a processor, implements any one of the above methods.
The above are only preferred embodiments of the present invention and are not intended to limit it; any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (18)

  1. A method for remote video conference presentation, characterized in that it comprises the steps of:
    acquiring the presentation-area image, the presenter audio and the participation-area image in real time;
    performing portrait recognition on the presentation-area image and the participation-area image to obtain the presentation-area portrait and the participation-area portrait;
    compositing the presentation file, the presentation-area portrait and the presenter audio to obtain a presentation video;
    synchronously displaying the presentation video and the participation-area portrait.
  2. The method according to claim 1, characterized in that the step of acquiring the presentation-area image, the presenter audio and the participation-area image in real time comprises:
    acquiring in real time the presentation-area image captured by a camera;
    acquiring in real time the participation-area image captured by a camera;
    acquiring in real time the presenter audio recorded by a recording device.
  3. The method according to claim 1, characterized in that the step of performing portrait recognition on the presentation-area image and the participation-area image to obtain the presentation-area portrait and the participation-area portrait comprises:
    identifying whether a portrait exists in the presentation-area image;
    if the portrait exists, obtaining the region containing the portrait;
    blanking the pixels of the presentation area that do not contain the portrait;
    identifying whether a portrait exists in the participation-area image;
    if the portrait exists, obtaining the region containing the portrait;
    blanking the pixels of the participation area that do not contain the portrait.
  4. The method according to claim 1, characterized in that the step of compositing the presentation file, the presentation-area portrait and the presenter audio to obtain a presentation video comprises:
    placing the presentation file above the underlying background and the presentation-area portrait above the presentation file to form a presentation image;
    superimposing the presenter audio and the presentation image to obtain the presentation video.
  5. A device for remote video conference presentation, characterized in that it comprises:
    an acquisition module, which acquires the presentation-area image, the presenter audio and the participation-area image in real time;
    a recognition module, which performs portrait recognition on the presentation-area image and the participation-area image to obtain the presentation-area portrait and the participation-area portrait;
    a synthesis module, which composites the presentation file, the presentation-area portrait and the presenter audio to obtain a presentation video;
    a synchronization module, which synchronously displays the presentation video and the participation-area portrait.
  6. The device according to claim 5, characterized in that the acquisition module comprises:
    a first acquisition unit, which acquires in real time the presentation-area image captured by a camera;
    a second acquisition unit, which acquires in real time the participation-area image captured by a camera;
    a third acquisition unit, which acquires in real time the presenter audio recorded by a recording device.
  7. The device according to claim 5, characterized in that the recognition module comprises:
    a first recognition unit, which identifies whether a portrait exists in the presentation-area image;
    a fourth acquisition unit, which, if the portrait exists, obtains the region containing the portrait;
    a first processing unit, which blanks the pixels of the presentation area that do not contain the portrait;
    a second recognition unit, which identifies whether a portrait exists in the participation-area image;
    a fifth acquisition unit, which, if the portrait exists, obtains the region containing the portrait;
    a second processing unit, which blanks the pixels of the participation area that do not contain the portrait.
  8. The device according to claim 5, characterized in that the synthesis module comprises:
    a setting unit, which places the presentation file above the underlying background and the presentation-area portrait above the presentation file to form a presentation image;
    a superimposing unit, which superimposes the presenter audio and the presentation image to obtain the presentation video.
  9. The method according to claim 1, characterized in that compositing the presentation file, the presentation-area portrait and the presenter audio to obtain a presentation video comprises:
    receiving initial data, where the initial data includes initial position information and/or initial size information;
    initializing the position and size of the presentation file and the presentation-area portrait based on the initial position information and/or initial size information to obtain the presentation video.
  10. The method according to claim 9, characterized in that, after the position and size of the presentation file and the presentation-area portrait have been initialized, the method further comprises:
    receiving input adjustment data, where the adjustment data is used to adjust the initial position information and/or initial size information to obtain modified position information and/or modified size information;
    reprocessing the position and size of the presentation file and the presentation-area portrait based on the modified position information and/or modified size information to obtain the presentation video.
  11. The method according to claim 9, characterized in that it further comprises:
    acquiring the presenter's operation information;
    reprocessing the position and size of the presentation file and the presentation-area portrait based on the operation information to obtain the presentation video.
  12. The method according to claim 11, characterized in that
    acquiring the presenter's operation information comprises:
    acquiring the presenter's operation on the presentation file;
    and reprocessing the position and size of the presentation file and the presentation-area portrait based on the operation information to obtain the presentation video comprises:
    if the presenter's operation on the presentation file reaches a first preset condition, constraining the presentation-area portrait until the presentation-area portrait and the presentation file reach a second preset condition.
  13. The device according to claim 5, characterized in that the synthesis module comprises:
    an initial receiving unit, configured to receive initial data, where the initial data includes initial position information and/or initial size information;
    an initialization unit, configured to initialize the position and size of the presentation file and the presentation-area portrait based on the initial position information and/or initial size information to obtain the presentation video.
  14. The device according to claim 13, characterized in that the synthesis module further comprises:
    an adjustment receiving unit, configured to receive input adjustment data, where the adjustment data is used to adjust the initial position information and/or initial size information to obtain modified position information and/or modified size information;
    a modification unit, which reprocesses the position and size of the presentation file and the presentation-area portrait based on the modified position information and/or modified size information to obtain the presentation video.
  15. The device according to claim 13, characterized in that it further comprises:
    an acquiring unit, configured to acquire the presenter's operation information;
    a processing unit, configured to reprocess the position and size of the presentation file and the presentation-area portrait based on the operation information to obtain the presentation video.
  16. The device according to claim 15, characterized in that
    the acquiring unit is further configured to perform the following step:
    acquiring the presenter's operation on the presentation file;
    and the processing unit is further configured to perform the following step:
    if the presenter's operation on the presentation file reaches the first preset condition, constraining the presentation-area portrait until the presentation-area portrait and the presentation file reach the second preset condition.
  17. An electronic device comprising a memory and a processor, the memory storing a computer program, characterized in that the computer program, when executed by the processor, implements the method of any one of claims 1-4 and 9-12.
  18. A storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the method of any one of claims 1-4 and 9-12.
PCT/CN2021/098991 2020-10-20 2021-06-08 Method and device for remote video conference presentation WO2022083133A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011128627.7A CN112333415A (zh) 2020-10-20 2020-10-20 Method and device for remote video conference presentation
CN202011128627.7 2020-10-20

Publications (1)

Publication Number Publication Date
WO2022083133A1 true WO2022083133A1 (zh) 2022-04-28

Family

ID=74311154

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/098991 WO2022083133A1 (zh) 2020-10-20 2022-04-28 Method and device for remote video conference presentation

Country Status (2)

Country Link
CN (1) CN112333415A (zh)
WO (1) WO2022083133A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112333415A (zh) * 2020-10-20 2021-02-05 深圳市前海手绘科技文化有限公司 Method and device for remote video conference presentation
CN113344962A (zh) * 2021-06-25 2021-09-03 北京市商汤科技开发有限公司 Portrait display method and device, electronic device and storage medium
CN113794824B (zh) * 2021-09-15 2023-10-20 深圳市智像科技有限公司 Intelligent interactive indoor visual document capture method, device, system and medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040161728A1 (en) * 2003-02-14 2004-08-19 Benevento Francis A. Distance learning system
US9124765B2 (en) * 2012-12-27 2015-09-01 Futurewei Technologies, Inc. Method and apparatus for performing a video conference
US9497412B1 (en) * 2015-07-27 2016-11-15 Cisco Technology, Inc. Video conference audio/video verification
CN111654715B (zh) * 2020-06-08 2024-01-09 腾讯科技(深圳)有限公司 Live-streaming video processing method and apparatus, electronic device and storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108091192A (zh) * 2017-12-14 2018-05-29 尹子悦 Interactive online teaching system and method, teacher system and student system
CN109584655A (zh) * 2018-12-03 2019-04-05 贵阳朗玛信息技术股份有限公司 Remote presentation method, device, teaching system and readable storage medium
CN110009951A (zh) * 2019-03-26 2019-07-12 乐佰科(深圳)教育科技有限公司 Teaching method and system for online live-streamed programming instruction
CN111028580A (zh) * 2019-12-23 2020-04-17 杭州当虹科技股份有限公司 Method for teaching large classes based on a video conference system
CN112333415A (zh) * 2020-10-20 2021-02-05 深圳市前海手绘科技文化有限公司 Method and device for remote video conference presentation

Also Published As

Publication number Publication date
CN112333415A (zh) 2021-02-05

Similar Documents

Publication Publication Date Title
WO2022083133A1 (zh) 一种远程视频会议演示的方法和装置
US10609332B1 (en) Video conferencing supporting a composite video stream
US8044989B2 (en) Mute function for video applications
US8291326B2 (en) Information-processing apparatus, information-processing methods, recording mediums, and programs
US8958686B2 (en) Information processing device, synchronization method, and program
US20080303949A1 (en) Manipulating video streams
GB2590545A (en) Video photographing method and apparatus, electronic device and computer readable storage medium
WO2018006377A1 (zh) Holographic projection system and method for real-time interactive animation, and artificial intelligence robot
US20090202223A1 (en) Information processing device and method, recording medium, and program
US10560752B2 (en) Apparatus and associated methods
JPH11219446A (ja) Audio-visual reproduction system
US10984537B2 (en) Expression transfer across telecommunications networks
JP2001313915A (ja) Video conference device
TW201707444A (zh) Gaze correction (part 1)
US20230105064A1 (en) System and method for rendering virtual reality interactions
CN112839190A (zh) Method for synchronized video recording or live streaming of a virtual image and a real scene
CN114040318A (zh) Spatial audio playback method and device
CN111163280B (zh) Asymmetric video conference system and method
KR20200001750A (ko) Virtual reality image playback device that plays a plurality of virtual reality images to improve the image quality of a specific area, and virtual reality image generation method
CN112422882A (zh) Method and device for providing a video source for a video conference system
US20230319234A1 (en) System and Methods for Enhanced Videoconferencing
JP4321751B2 (ja) Drawing processing device, drawing processing method, drawing processing program, and electronic conference system provided with them
TW201639347A (zh) Gaze correction (part 2)
US11399166B2 (en) Relationship preserving projection of digital objects
US11381793B2 (en) Room capture and projection

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21881556

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 01/08/2023)

122 Ep: pct application non-entry in european phase

Ref document number: 21881556

Country of ref document: EP

Kind code of ref document: A1