WO2020238973A1 - Real-time push method for automatic detection of multi-channel audio and video - Google Patents

Real-time push method for automatic detection of multi-channel audio and video

Info

Publication number
WO2020238973A1
Authority
WO
WIPO (PCT)
Prior art keywords
time point
default channel
video
time
computer interaction
Prior art date
Application number
PCT/CN2020/092669
Other languages
English (en)
French (fr)
Inventor
罗辉
刘鑫
陈挚
Original Assignee
成都依能科技股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 成都依能科技股份有限公司
Publication of WO2020238973A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302 Content synchronisation processes, e.g. decoder synchronisation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/438 Interfacing the downstream path of the transmission network originating from a server, e.g. retrieving encoded video stream packets from an IP network
    • H04N21/4383 Accessing a communication channel
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45 Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/466 Learning process for intelligent management, e.g. learning user preferences for recommending movies
    • H04N21/4668 Learning process for intelligent management, e.g. learning user preferences for recommending movies for recommending content, e.g. movies

Definitions

  • The invention relates to the field of remotely pushing audio and video streams, and in particular to a real-time push method for automatic detection of multi-channel audio and video.
  • In the teaching field, multi-channel audio and video generally comprise two types: camera video and the screen video of the teaching computer. Multi-channel audio and video can be pushed in several ways:
  • Multi-channel push: selective push of multiple audio and video channels, as currently done by TV stations and live conferences. It is implemented manually on a video switcher: one of the audio and video channels is selected and pushed to viewers for a period of time, and when the displayed content needs to change, it must be switched manually at the director's station. This scheme increases hardware cost and labor input, there is no standard to follow, and it is strongly affected by subjective human factors.
  • Multiple artificial-intelligence cameras with an analysis and processing server: smart cameras capture the movements of people in the scene, the server analyzes the data and decides which camera should give a close-up of the person, and the resulting video is pushed to the user. This is currently common in webcasting and classroom-teaching live broadcasts. However, during teaching, differences in people's movements and interference from other moving people or objects in the scene are significant; the processing center often misjudges and issues wrong instructions, so the final result does not meet requirements or is not the picture the presenter most wants to show the audience.
  • Current video streaming solutions such as online remote teaching, webcasting, and online classrooms either push a single camera channel, or switch the displayed content among multiple audio and video channels manually or by artificial-intelligence analysis before pushing it to the audience.
  • A solution that fixes one streaming channel offers limited shots and single content, and cannot film and display the observed subject from different directions and perspectives at the same time.
  • The result displays similar picture information for a long time, which is not conducive to users' viewing and easily causes visual fatigue in the audience.
  • Selective multi-channel push solutions often require manual intervention, or a powerful computing center and artificial-intelligence camera equipment; they usually demand large investment and multi-person cooperation, with high equipment cost, complex equipment, and very inconvenient maintenance and management.
  • The technical problem to be solved by the present invention is: in view of the above problem that existing push methods are not suitable for the teaching field, the present invention proposes a real-time push method for automatic detection of multi-channel audio and video.
  • The real-time push method for automatic detection of multi-channel audio and video includes the following steps:
  • The multi-channel audio and video comprise one default-channel video and at least one non-default-channel video, and the default-channel video is pushed by default;
  • The video stream of the corresponding channel is selected and pushed.
  • Step B includes:
  • Step S101: Determine whether a new human-computer interaction time point exists within a first preset duration of the current human-computer interaction time point; the initial value of the current human-computer interaction time point is the start time of the multi-channel audio and video push. If not, go to step S103; if so, go to step S102;
  • Step B includes:
  • Step S201: Determine whether a new human-computer interaction time point exists within a first preset duration of the current human-computer interaction time point; the initial value of the current human-computer interaction time point is the start time of the multi-channel audio and video push. If not, go to step S203; if so, go to step S202;
  • Step S202: If the type of the last marked time point is a default-channel cut-out time point, add a marked time point of type default-channel cut-in time point, with its value set to the new human-computer interaction time point; set the current human-computer interaction time point to the new human-computer interaction time point, and go to step S201;
  • Step S204: Determine whether a new human-computer interaction time point exists within the second preset duration after the image time start point; if so, go to step S202; otherwise go to step S205;
  • Step S205: Determine whether the image of the default-channel video picture at the image time start point is consistent with the image of the default-channel video picture at the image time start point plus the second preset duration; if not, go to step S206; otherwise go to step S207;
  • Step S206: If the type of the last marked time point is a default-channel cut-out time point, add a marked time point of type default-channel cut-in time point, with its value set to the time point corresponding to the image time start point plus the second preset duration; set the new image time start point to the current image time start point plus the second preset duration; go to step S204;
  • Step S207: If the type of the last marked time point is a default-channel cut-in time point, add a marked time point of type default-channel cut-out time point, with its value set to the time point corresponding to the image time start point plus the second preset duration; set the new image time start point to the current image time start point plus the second preset duration; go to step S204.
  • Step B includes:
  • Step S301: Determine whether a new human-computer interaction time point exists within a first preset duration of the current human-computer interaction time point; the initial value of the current human-computer interaction time point is the start time of the multi-channel audio and video push. If not, go to step S303; if so, go to step S302;
  • Step S304: Determine whether a new human-computer interaction time point exists within the second preset duration after the image time start point; if so, go to step S302; otherwise go to step S305;
  • Step S305: Determine whether the image of the default-channel video picture at the image time start point is consistent with the image of the default-channel video picture at the image time start point plus the second preset duration; if not, go to step S306; otherwise go to step S307;
  • The human-computer interaction events include screen touch instructions, keyboard operations, mouse operations, electronic lecture-notes control instructions, and camera-channel display operation instructions.
  • Step C includes: the pushed video stream is set to the default-channel video by default; when a default-channel cut-out time point exists, the pushed video stream is switched from the default-channel video to a non-default-channel video; when a default-channel cut-in time point exists, the pushed video stream is switched from the non-default-channel video back to the default-channel video.
  • The multi-channel audio and video include N non-default-channel videos, N ≥ 2; the N non-default-channel videos are denoted the first non-default-channel video to the Nth non-default-channel video.
  • The above step C includes: the pushed video stream is set to the default-channel video by default.
  • When a default-channel cut-out time point exists, the pushed video stream is switched from the default-channel video to the first non-default-channel video; after the first non-default-channel video has played for a preset rotation duration, the pushed video stream is switched to the second non-default-channel video, and so on, cycling through to the Nth non-default-channel video and back to the first, until a default-channel cut-in time point is reached; when a default-channel cut-in time point exists, the pushed video stream is switched from the non-default-channel video back to the default-channel video.
  • A cut-out transition effect is pushed at the default-channel cut-out time point; when a default-channel cut-in time point exists, a cut-in transition effect is pushed at the default-channel cut-in time point.
  • The same computer the teacher lectures on is used to automatically and intelligently analyze and judge the teacher's human-computer interaction information, and to automatically select the channel to switch to among the multi-channel audio and video.
  • The judgment based on the teacher's human-computer interaction information is accurate; it is completed automatically and intelligently by software without assistance from others, on the same computer the teacher lectures on, without configuring a third-party independent hardware system.
  • This method has the advantages of low implementation cost, extremely simple operation, and accurate switching of multi-channel audio and video. It truly reflects and expresses the lecturer's intent about what content to display, and relieves learners' visual fatigue through the picture changes brought by switching the pushed video, thereby improving knowledge transfer and teaching effectiveness.
  • FIG. 1 is a flowchart of the real-time push method for automatic detection of multi-channel audio and video according to an embodiment of the present invention;
  • FIG. 2 is a schematic diagram of the method according to an embodiment of the present invention;
  • FIG. 3 is another schematic diagram of the method according to an embodiment of the present invention;
  • FIG. 4 is yet another schematic diagram of the method according to an embodiment of the present invention.
  • The present invention monitors human-computer interaction information on the teaching computer, and determines which channel's video content to push according to the monitored information, such as the teacher's writing and operations on the computer.
  • When the teacher explains the content on the screen, writes or annotates on the screen, or performs various software operations on the computer screen, such human-computer interaction events are judged to call for pushing the default screen-channel video.
  • When the teacher has not operated on the screen for a long time, the push automatically switches from the default-channel video to another video channel after a preset time, displaying the content from a different perspective.
  • When the teacher calls up a camera picture to display on the computer screen, it is automatically judged that the teacher is currently operating and explaining content on the computer screen, and the push automatically switches to the computer-screen video picture, ensuring that the operating process and the displayed camera picture are shown to learners together.
  • The above process is analyzed and judged automatically and intelligently during the teacher's lecture, without professionals intervening to perform switching operations. By judging the teacher's (user's) human-computer interaction on the computer, the method accurately captures what the teacher (user) wants to express, truly conveys the teacher's (user's) current intent, and avoids the inaccuracy and personal factors brought by third-party manual operation.
  • Through the multi-channel audio and video picture push scheme of the present invention, switching between video-channel pictures relieves learners' visual fatigue; at the same time, a changing picture activates the brain to keep processing the learning content at better efficiency, improving the effect and quality of learning.
  • Compared with pushing multiple audio and video channels simultaneously, the present invention pushes only one audio and video channel to the learning terminal at a time, which greatly reduces bandwidth requirements; software automatically and intelligently selects and switches the audio and video channel to push according to the human-computer interaction information, the pushing process is automated, no intervention from users or learners is needed, and no complex equipment needs to be deployed, achieving low implementation and maintenance costs.
  • The real-time push method for automatic detection of multi-channel audio and video includes the following steps:
  • The multi-channel audio and video comprise one default-channel video and at least one non-default-channel video, and the default-channel video is pushed by default;
  • The video stream of the corresponding channel is selected and pushed.
  • The default-channel video is generally set to the teaching-computer screen video; the non-default channels can come from a camera channel filming the students, a camera channel filming the teacher, and/or a camera channel filming the experiment bench.
  • The default-channel cut-out time point and the default-channel cut-in time point are two types of marked points; their numbers are calculated from the human-computer interaction time points, and there can be multiple points of each type.
  • The present invention establishes a human-computer interaction judgment and analysis mechanism at the beginning of the video push, which analyzes the teacher's operating behavior on the source terminal of the default-channel video and captures and monitors operations of the mouse, keyboard, page-turning pen, and other human input devices. If such operating behavior occurs, the teacher needs the students to watch the operated content shown on the screen, and the video pushed at this time is the default-channel video. When no such operation occurs for a period of time, for example when no human-input-device operation occurs for 15 s, the main communication at this time is the teacher's explanation and body movements conveying information to the students, and the focus needs to be switched to the teacher.
  • The software automatically and intelligently judges and displays the lecture screen (the demonstration shot is displayed on the screen in real time), on which the teacher can perform operations such as writing text and marking key points.
  • This strategy avoids the situation where third-party equipment automatically judges and selects the experiment shots while omitting information such as on-screen writing and annotations of key content.
  • Human-computer interaction events may include screen touch instructions, keyboard operations, mouse operations, electronic lecture-notes control instructions, and display operations of the various camera channels.
  • Screen touch instructions: any touching and writing by the user on the screen, mainly content marking, circling, and problem-solving derivations during teaching, including touch information from infrared electronic whiteboards, all-in-one machines, electromagnetic-induction writing screens, digitizer tablets, capacitive writing touch screens, and so on;
  • Keyboard operations: computer keyboard input by the user, such as text input, software operation instructions, function keys, etc.;
  • Mouse operations: left- and right-click events, drag events, scroll events, and operations combined with function keys such as zoom in, zoom out, and copy;
  • Electronic lecture-notes control instructions: a wireless page-turning device controls lecture-notes page turning, screen blanking, and air-mouse control on the computer;
  • Camera-channel display operation instructions: during teaching, the user can call up the video of any channel and display it on the screen, such as displaying the image of the demonstration-experiment camera channel or the image of the physical-object camera channel.
  • Step C includes: the pushed video stream is set to the default-channel video by default; when a default-channel cut-out time point exists, the pushed video stream is switched from the default-channel video to a non-default-channel video; when a default-channel cut-in time point exists, the pushed video stream is switched from the non-default-channel video back to the default-channel video.
  • The multi-channel audio and video include N non-default-channel videos, N ≥ 2, denoted the first to the Nth non-default-channel video. Step C includes: the pushed video stream is set to the default-channel video by default; when a default-channel cut-out time point exists, the pushed video stream is switched from the default-channel video to the first non-default-channel video, and after the first non-default-channel video has played for a preset rotation duration, the pushed video stream is switched to the second non-default-channel video, and so on, cycling through to the Nth non-default-channel video and back to the first, until a default-channel cut-in time point is reached; when a default-channel cut-in time point exists, the pushed video stream is switched from the non-default-channel video back to the default-channel video.
  • A cut-out transition effect can be set at the above default-channel cut-out time point, and a cut-in transition effect can be set at the above default-channel cut-in time point when it exists; in this way the transition effects serve as reminders of the video switch. Similarly, when the pushed video stream switches between non-default-channel videos, a rotation transition effect can be pushed at the corresponding switching time point.
  • Step B may include:
  • Step S101: Determine whether a new human-computer interaction time point exists within a first preset duration of the current human-computer interaction time point; the initial value of the current human-computer interaction time point is the start time of the multi-channel audio and video push. If not, go to step S103; if so, go to step S102;
  • In the figure, the multi-channel audio and video include one default-channel video, one first non-default-channel video, and one second non-default-channel video. The durations of the three channel videos are the same.
  • The default-channel video reaches the first human-computer interaction time point M1 after t0 has elapsed since the push started, t0 < t1, and no new human-computer interaction time point occurs within the first preset duration t1 after M1. M2 is the second human-computer interaction time point immediately after; t3 is the preset rotation duration; t5 is the length of time from the second human-computer interaction time point to the video push end point, t5 < t1. It can be seen that M2 - M1 > t1, so the pushed video stream pushes the default-channel video at the beginning and then switches to the first non-default-channel video at t0 + t1. No new human-computer interaction time point occurs within the preset rotation duration t3 of the first non-default-channel video, so the pushed video stream switches to the second non-default-channel video; a new human-computer interaction time point occurs within the preset rotation duration of the second non-default-channel video, so after it has been pushed for a further duration t4 the second human-computer interaction time point arrives and the pushed video stream needs to switch back to the default-channel video. That is, the pushed video stream consists of the videos in the slash-shaded regions shown in Figure 2, pushed in sequence: the default-channel video in the period t0 + t1, the first non-default-channel video in the period t3, and the second non-default-channel video in the period t4.
  • A cut-out transition effect can be added at t0 + t1, a rotation transition effect at t0 + t1 + t3, and a cut-in transition effect at t0 + t1 + t3 + t4.
  • Step B includes:
  • Step S201: Determine whether a new human-computer interaction time point exists within a first preset duration of the current human-computer interaction time point; the initial value of the current human-computer interaction time point is the start time of the multi-channel audio and video push. If not, go to step S203; if so, go to step S202;
  • Step S202: If the type of the last marked time point is a default-channel cut-out time point, add a marked time point of type default-channel cut-in time point, with its value set to the new human-computer interaction time point; set the current human-computer interaction time point to the new human-computer interaction time point, and go to step S201;
  • Step S204: Determine whether a new human-computer interaction time point exists within the second preset duration after the image time start point; if so, go to step S202; otherwise go to step S205;
  • Step S205: Determine whether the image of the default-channel video picture at the image time start point is consistent with the image of the default-channel video picture at the image time start point plus the second preset duration; if not, go to step S206; otherwise go to step S207;
  • Step S206: If the type of the last marked time point is a default-channel cut-out time point, add a marked time point of type default-channel cut-in time point, with its value set to the time point corresponding to the image time start point plus the second preset duration; set the new image time start point to the current image time start point plus the second preset duration; go to step S204;
  • Step S207: If the type of the last marked time point is a default-channel cut-in time point, add a marked time point of type default-channel cut-out time point, with its value set to the time point corresponding to the image time start point plus the second preset duration; set the new image time start point to the current image time start point plus the second preset duration; go to step S204.
  • In the figure, the multi-channel audio and video include one default-channel video and one first non-default-channel video; the durations of the two videos are the same.
  • The default-channel video reaches the first human-computer interaction time point M1, and M2 is the second human-computer interaction time point immediately after it; t6 is the second preset duration. It can be seen that M2 - M1 > t1, so the pushed video stream pushes the default-channel video by default at the beginning, switches channels at the time point corresponding to t0 + t1, and starts pushing the first non-default-channel video. The picture of the default-channel video does not change during this period, so the first non-default-channel video continues to be pushed. t5 is the length of time from the second human-computer interaction time point to the video push end point, t5 < t1, so the pushed video stream switches back to the default-channel video at M2 and pushes it until the end.
  • Step B includes:
  • Step S301: Determine whether a new human-computer interaction time point exists within a first preset duration of the current human-computer interaction time point; the initial value of the current human-computer interaction time point is the start time of the multi-channel audio and video push. If not, go to step S303; if so, go to step S302;
  • Step S304: Determine whether a new human-computer interaction time point exists within the second preset duration after the image time start point; if so, go to step S302; otherwise go to step S305;
  • Step S305: Determine whether the image of the default-channel video picture at the image time start point is consistent with the image of the default-channel video picture at the image time start point plus the second preset duration; if not, go to step S306; otherwise go to step S307;
  • In the figure, the multi-channel audio and video include one default-channel video and one first non-default-channel video; the durations of the two videos are the same.
  • The default-channel video reaches the first human-computer interaction time point M1, and M2 is the second human-computer interaction time point immediately after it; t6 is the second preset duration. It can be seen that M2 - M1 > t1. Assuming the image of the default-channel video at the time point corresponding to t0 + t1 + t6 is inconsistent with the image of the default-channel video at the time point corresponding to t0 + t1, and the default-channel video reaches the second human-computer interaction time point M2 after 2*t6 + t7 beyond t0 + t1 + t6, with the default-channel video changing throughout this period, then the pushed video stream is always the default-channel video.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Television Signal Processing For Recording (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The present invention relates to the field of remote video pushing. To address the problem that existing push methods are not suitable for the teaching field, the present invention proposes a real-time push method for automatic detection of multi-channel audio and video, comprising the following steps: A. Selectively push the captured multi-channel audio and video, where the multi-channel audio and video comprise one default-channel video and at least one non-default-channel video, and the default-channel video is pushed by default; B. Detect, in real time, human-computer interaction events on the source terminal of the default-channel video during recording, record the corresponding human-computer interaction time points, and compute default-channel cut-out time points and default-channel cut-in time points from the human-computer interaction time points; C. Select the video stream of the appropriate channel for pushing according to the default-channel cut-out time points and default-channel cut-in time points. By judging the state of the video through human-computer interaction and switching the push accordingly, the invention is suitable for real-time remote teaching.

Description

Real-time push method for automatic detection of multi-channel audio and video
Technical Field
The present invention relates to the field of remotely pushing audio and video streams, and in particular to a real-time push method for automatic detection of multi-channel audio and video.
Background Art
In the teaching field, multi-channel audio and video generally comprise two types: camera video and the screen video of the teaching computer. Multi-channel audio and video can be pushed in the following ways:
1. Single-channel push: one fixed audio and video channel is pushed throughout. The picture seen at the user end is monotonous and easily causes visual fatigue, and the subject cannot be observed from different directions and perspectives at the same time.
2. Multi-channel push (multiple cameras + computer screen): selective push of multiple audio and video channels, as currently done by TV stations and live conferences. It is implemented manually on a video switcher: one of the audio and video channels is selected and pushed to viewers for a period of time, and when the displayed content needs to change, it must be switched manually at the director's station. This scheme increases hardware cost and labor input, there is no standard to follow, and it is strongly affected by subjective human factors.
3. Existing teaching live broadcasts mostly push camera video. If the teacher's lecture screen needs to be pushed during class, it usually has to be filmed with a camera, and the filmed picture is of low quality because it is affected by the scene and the lighting; displaying the camera also requires manual operation. Alternatively, teaching live-broadcast software pushes the computer screen and the camera video simultaneously and shows both to students; if one particular picture is to be displayed, the live host or the learner must manually select which one to watch, so the true intent of what the presenter wants to show during the broadcast cannot be fully expressed.
4. Pushing multiple audio and video channels simultaneously occupies high bandwidth; the user selects one channel to watch at the client end and must switch channels personally while watching, which is inconvenient.
5. Multiple artificial-intelligence cameras with an analysis and processing server: smart cameras capture the movements of people in the scene, the server analyzes the data and decides which camera should give a close-up of the person, and the resulting video is pushed to the user. This is currently common in webcasting and classroom-teaching live broadcasts. However, during teaching, differences in people's movements and interference from other moving people or objects in the scene are significant; the processing center often misjudges and issues wrong instructions, so the final result does not meet requirements or is not the picture the presenter most wants to show the audience.
That is, current video streaming solutions such as online remote teaching, webcasting, and online classrooms either push a single camera channel, or switch the displayed content among multiple audio and video channels manually or by artificial-intelligence analysis before pushing it to the audience. Fixing one streaming channel offers limited shots and single content, and the observed subject cannot be filmed and displayed from different directions and perspectives at the same time. Moreover, similar picture information is displayed for a long time, which is not conducive to users' viewing and easily causes visual fatigue in the audience. Selective multi-channel push solutions usually require manual intervention, or a powerful computing center plus artificial-intelligence camera equipment, which means large investment, multi-person cooperation, high equipment cost, complex equipment, and very inconvenient maintenance and management.
Summary of the Invention
The technical problem to be solved by the present invention is: in view of the above problem that existing push methods are not suitable for the teaching field, the present invention proposes a real-time push method for automatic detection of multi-channel audio and video.
To solve the above technical problem, the present invention adopts the following technical solution:
A real-time push method for automatic detection of multi-channel audio and video, comprising the following steps:
A. Selectively push the captured multi-channel audio and video, where the multi-channel audio and video comprise one default-channel video and at least one non-default-channel video, and the default-channel video is pushed by default;
B. Detect, in real time, human-computer interaction events on the source terminal (computer) of the default-channel video, record the corresponding human-computer interaction time points, and compute default-channel cut-out time points and default-channel cut-in time points from the human-computer interaction time points;
C. Select the video stream of the corresponding channel for pushing according to the default-channel cut-out time points and default-channel cut-in time points.
Preferably, step B comprises:
S101. Determine whether a new human-computer interaction time point exists within a first preset duration of the current human-computer interaction time point, where the initial value of the current human-computer interaction time point is the start time of the multi-channel audio and video push; if not, go to step S103; if so, go to step S102;
S102. Set the current human-computer interaction time point to the new human-computer interaction time point, and go to step S101;
S103. Add a marked time point of type default-channel cut-out time point, with its value set to the time point corresponding to the current human-computer interaction time point plus the first preset duration;
S104. When a new human-computer interaction time point is detected in real time after the latest marked time point, set the current human-computer interaction time point to the new human-computer interaction time point, and add a marked time point of type default-channel cut-in time point with its value set to the new human-computer interaction time point; go to step S101.
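The S101-S104 loop can be rendered as a short function. The following is an illustrative Python sketch only, not part of the patent: `interactions` is a sorted list of interaction timestamps in seconds, `start` is the push start time, and `t1` is the first preset duration. For simplicity the sketch ignores the push end time, so a trailing cut-out mark may fall after the push actually ends.

```python
def mark_switch_points(interactions, start, t1):
    """Compute default-channel cut-out/cut-in marked time points (steps S101-S104)."""
    marks = []      # list of (timestamp, kind), kind in {"cut_out", "cut_in"}
    current = start # initial current interaction time point = push start (S101)
    i = 0
    while True:
        # S101/S102: while a new interaction arrives within t1, it becomes the current point
        if i < len(interactions) and interactions[i] <= current + t1:
            current = interactions[i]
            i += 1
            continue
        # S103: quiet for t1 -> mark a default-channel cut-out at current + t1
        marks.append((current + t1, "cut_out"))
        # S104: the next interaction, if any, becomes a default-channel cut-in mark
        if i == len(interactions):
            return marks
        current = interactions[i]
        i += 1
        marks.append((current, "cut_in"))
```

For example, with interactions at 5 s and 30 s and t1 = 15 s, the sketch marks a cut-out at 20 s (15 s of quiet after the interaction at 5 s) and a cut-in at 30 s.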
Further preferably, step B comprises:
S201. Determine whether a new human-computer interaction time point exists within a first preset duration of the current human-computer interaction time point, where the initial value of the current human-computer interaction time point is the start time of the multi-channel audio and video push; if not, go to step S203; if so, go to step S202;
S202. If the type of the last marked time point is a default-channel cut-out time point, add a marked time point of type default-channel cut-in time point, with its value set to the new human-computer interaction time point; set the current human-computer interaction time point to the new human-computer interaction time point, and go to step S201;
S203. Set the image time start point to the time point corresponding to the current human-computer interaction time point plus the first preset duration; if the type of the last marked time point is a default-channel cut-in time point, or no marked point exists, add a marked time point of type default-channel cut-out time point, with its value set to the time point corresponding to the current human-computer interaction time point plus the first preset duration;
S204. Determine whether a new human-computer interaction time point exists within the second preset duration after the image time start point; if so, go to step S202; otherwise go to step S205;
S205. Determine whether the image of the default-channel video picture at the image time start point is consistent with the image of the default-channel video picture at the image time start point plus the second preset duration; if not, go to step S206; otherwise go to step S207;
S206. If the type of the last marked time point is a default-channel cut-out time point, add a marked time point of type default-channel cut-in time point, with its value set to the time point corresponding to the image time start point plus the second preset duration; set the new image time start point to the current image time start point plus the second preset duration; go to step S204;
S207. If the type of the last marked time point is a default-channel cut-in time point, add a marked time point of type default-channel cut-out time point, with its value set to the time point corresponding to the image time start point plus the second preset duration; set the new image time start point to the current image time start point plus the second preset duration; go to step S204.
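Steps S201-S207 extend the loop with a frame-consistency check: after a cut-out, an unchanged default-channel picture keeps the stream cut out, while a changed picture or a new interaction cuts back in. The sketch below is an illustrative Python rendering and not the patent's implementation; `frame_at(t)` is a hypothetical callback returning a comparable representation of the default-channel frame at time `t`, `t1`/`t2` are the first and second preset durations, and `end` bounds the push.

```python
def mark_with_frame_check(interactions, frame_at, start, t1, t2, end):
    """Mark cut-out/cut-in points with frame comparison (steps S201-S207)."""
    marks = []  # (timestamp, kind) with kind "cut_out" or "cut_in"
    queue = sorted(interactions)
    i = 0

    def last_kind():
        return marks[-1][1] if marks else None

    current = start
    while True:
        # S201/S202: consume interactions arriving within t1 of the current point
        while i < len(queue) and queue[i] <= current + t1:
            if last_kind() == "cut_out":
                marks.append((queue[i], "cut_in"))
            current = queue[i]
            i += 1
        # S203: quiet for t1 -> begin frame comparison; mark a cut-out
        img = current + t1
        if img > end:
            return marks
        if last_kind() in (None, "cut_in"):
            marks.append((img, "cut_out"))
        # S204-S207: every t2, an interaction or a changed frame cuts back in;
        # an unchanged frame (re)marks a cut-out
        while True:
            if i < len(queue) and queue[i] <= img + t2:
                if last_kind() == "cut_out":
                    marks.append((queue[i], "cut_in"))
                current = queue[i]
                i += 1
                break  # back to S201
            if img + t2 > end:
                return marks
            if frame_at(img) != frame_at(img + t2):   # S205 -> S206
                if last_kind() == "cut_out":
                    marks.append((img + t2, "cut_in"))
            else:                                     # S205 -> S207
                if last_kind() == "cut_in":
                    marks.append((img + t2, "cut_out"))
            img += t2
```

With no interactions, t1 = 15 s, t2 = 10 s, and a frame that changes once at 30 s, the sketch cuts out at 15 s, cuts back in at 35 s when the change is detected, and cuts out again at 45 s once the picture is static again.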
Still further preferably, step B comprises:
S301. Determine whether a new human-computer interaction time point exists within a first preset duration of the current human-computer interaction time point, where the initial value of the current human-computer interaction time point is the start time of the multi-channel audio and video push; if not, go to step S303; if so, go to step S302;
S302. If the type of the last marked time point is a default-channel cut-out time point, add a marked time point of type default-channel cut-in time point, with its value set to the new human-computer interaction time point; set the current human-computer interaction time point to the new human-computer interaction time point, and go to step S301;
S303. Set the image time start point to the time point corresponding to the current human-computer interaction time point plus the first preset duration;
S304. Determine whether a new human-computer interaction time point exists within the second preset duration after the image time start point; if so, go to step S302; otherwise go to step S305;
S305. Determine whether the image of the default-channel video picture at the image time start point is consistent with the image of the default-channel video picture at the image time start point plus the second preset duration; if not, go to step S306; otherwise go to step S307;
S306. If the type of the last marked time point is a default-channel cut-out time point, add a marked time point of type default-channel cut-in time point, with its value set to the image time start point; set the new image time start point to the current image time start point plus the second preset duration, and go to step S304;
S307. If the type of the last marked time point is a default-channel cut-in time point, or no marked time point exists, add a marked time point of type default-channel cut-out time point, with its value set to the image time start point; set the new image time start point to the current image time start point plus the second preset duration; go to step S304.
Specifically, the human-computer interaction events include screen-touch commands, keyboard operations, mouse operations, electronic-presentation control commands and camera-channel display commands.
Preferably, step C comprises: the pushed video stream being the default-channel video by default; when a default-channel switch-out time point exists, switching the pushed video stream from the default-channel video to a non-default-channel video; when a default-channel switch-in time point exists, switching the pushed video stream from the non-default-channel video back to the default-channel video.
Further, when the multi-channel audio/video includes N non-default-channel videos, N≥2, the N non-default-channel videos are denoted the first to the N-th non-default-channel video; step C comprises: the pushed video stream being the default-channel video by default; when a default-channel switch-out time point exists, switching the pushed video stream from the default-channel video to the first non-default-channel video; after the first non-default-channel video has played for a preset rotation duration, switching the pushed video stream to the second non-default-channel video, and so on, cycling through to the N-th non-default-channel video and back to the first, until a default-channel switch-in time point is reached; when a default-channel switch-in time point exists, switching the pushed video stream from the non-default-channel video back to the default-channel video.
Further, when a default-channel switch-out time point exists, a switch-out transition effect is pushed at that default-channel switch-out time point; when a default-channel switch-in time point exists, a switch-in transition effect is pushed at that default-channel switch-in time point.
Further, when the pushed video stream switches between non-default-channel videos, a rotation transition effect is pushed at the corresponding switch time point.
The beneficial effects of the present invention are:
Drawing on the study and analysis of everyday lecturing and online teaching, the method uses the very computer the teacher lectures on to automatically and intelligently analyse and judge the teacher's human-computer interactions, and automatically selects which channel of the multi-channel audio/video to push. Judgement based on the lecturing teacher's interaction information is accurate, is completed automatically by software without anyone's assistance, and runs on the teacher's own lecturing computer without any third-party stand-alone hardware. The method is cheap to implement, extremely simple to operate, and switches among the multi-channel audio/video precisely; it faithfully reflects and expresses what the lecturer intends to show, and the changing pushed video pictures relieve learners' visual fatigue, thereby improving knowledge delivery and teaching effectiveness.
Brief Description of the Drawings
Fig. 1 is a flowchart of the multi-channel audio/video automatic-detection real-time push method of an embodiment of the present invention;
Fig. 2 is a schematic diagram of the multi-channel audio/video automatic-detection real-time push method of an embodiment of the present invention;
Fig. 3 is another schematic diagram of the multi-channel audio/video automatic-detection real-time push method of an embodiment of the present invention;
Fig. 4 is a further schematic diagram of the multi-channel audio/video automatic-detection real-time push method of an embodiment of the present invention.
Detailed Description of the Embodiments
The present invention monitors human-computer interaction information on the teaching computer and decides which channel's video to push based on the detected interactions, such as the teacher writing or operating on the computer. While the teacher explains on-screen content, writes or annotates on the screen, or operates software on the computer screen, such interaction events cause the default screen-channel video to be pushed. When the teacher has not operated the screen for a long time, the push automatically switches, after a preset time, from the default-channel video to another video channel, presenting the content from a different viewpoint. When the teacher brings a camera feed onto the computer screen for display, this is automatically judged as the teacher operating and explaining on the screen, and the push automatically switches to the computer-screen video, ensuring that both the operation process and the displayed camera picture are shown to learners together.
All of the above analysis and judgement is carried out automatically and intelligently during the lecture, with no need for a professional operator to perform the switching. By judging the teacher's (user's) human-computer interactions on the computer, the method precisely captures what the teacher (user) wants to express and faithfully conveys the teacher's (user's) current intent, free of the inaccuracy and personal bias that third-party manual control introduces.
Compared with traditional schemes that push a single computer-screen feed or a single camera feed, the multi-channel audio/video push of the present invention switches between channel pictures, which relieves learners' visual fatigue; at the same time, the changing pictures keep the brain processing the learning content at good efficiency, improving learning effect and quality.
Compared with traditional schemes that push multiple audio/video channels simultaneously and rely on a human to select the picture, the present invention pushes only one audio/video stream to the learning terminal at a time, greatly reducing bandwidth requirements; the software automatically and intelligently selects which audio/video channel to push based on the interaction information, so the push process is fully automated, needs no intervention from the user or learners, requires no complex equipment deployment, and is cheap to implement and maintain.
To make the objects, technical solution and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and the following embodiments.
As shown in Fig. 1, the multi-channel audio/video automatic-detection real-time push method comprises the following steps:
A. selectively pushing captured multi-channel audio/video, the multi-channel audio/video comprising one default-channel video and at least one non-default-channel video, the default-channel video being pushed by default;
B. detecting in real time human-computer interaction events on the source terminal of the default-channel video, recording the corresponding human-computer interaction time points, and computing default-channel switch-out time points and default-channel switch-in time points from the human-computer interaction time points;
C. selecting the video stream of the corresponding channel for pushing according to the default-channel switch-out time points and default-channel switch-in time points.
It should be noted that the default-channel video is generally the teaching-computer screen video, while non-default channels may come from a camera channel filming the students and/or a camera channel filming the teacher and/or a camera channel filming an experiment platform. Default-channel switch-out time points and default-channel switch-in time points are two types of record points; their number is computed from the human-computer interaction time points and may be more than one of each.
The present invention establishes a human-computer interaction judgement and analysis mechanism as soon as the video push starts, analysing the teacher's operations on the source terminal hosting the default-channel video by capturing and monitoring mouse, keyboard, page-turner and other human-input-device operations. When such operations occur, the teacher wants students to see what is being done on the screen, so the default-channel video is pushed. When no such operation has occurred for a while, for example no input-device operation for 15 s, communication shifts mainly to the teacher's spoken explanation and body language, so attention should move to the teacher: a default-channel switch-out time point is set by analysing the interaction time points, and the pushed picture switches to the video filming the teacher, so that after this switch-out time point the focus moves to the teacher or to a second or third camera scene. When the teacher operates an input device again, a point of the other type, a default-channel switch-in time point, is recorded, indicating that the teacher is explaining key content and wants students to see what is shown on the screen; when this point is recorded, the push automatically switches back to the screen video. During the lecture these two types of detection points alternate, so remote students watching the video switch between the screen and the camera pictures, reproducing the shifts of focus during the lecture. Such focus shifts convey precisely what the teacher wants to express and which video pictures the teacher wants watched, relieve the visual fatigue of single-channel video, and improve learning efficiency.
If during the lecture an experiment demonstration must be carried out and its video shown simultaneously, the software automatically and intelligently decides according to a preset strategy to display the lecture screen picture (on which the demonstration feed is shown in real time), so the teacher can perform operations on the screen such as writing text and marking key points. This strategy neatly avoids the situation in which third-party equipment automatically selects the experiment feed and thereby misses the on-screen writing and key-point annotations.
It should be noted that the above human-computer interaction events may include screen-touch commands, keyboard operations, mouse operations, electronic-presentation control commands and camera-channel display operations.
Screen-touch commands: any touching or writing by the user on the screen, mainly content annotation, circling and solution derivation during the lecture, including touch input from infrared electronic whiteboards, all-in-one machines, electromagnetic writing screens, graphics tablets, capacitive writing screens and the like;
Keyboard operations: computer-keyboard input by the user, such as text entry, software commands and function keys;
Mouse operations: left/right click events, drag events, scroll events, and zoom, copy and other operations combined with function keys;
Electronic-presentation control commands: wireless page-turner commands to the computer such as page turning, blanking the screen and air-mouse control;
Camera-channel display commands: the user bringing a channel's video onto the screen during the lecture, for example displaying the feed of an experiment-demonstration camera channel or of a physical-object display camera channel.
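As an illustration of the event capture that feeds step B, the following minimal Python sketch records a timestamp for each qualifying interaction event. It is not part of the patent: the class name `InteractionRecorder`, the event-kind labels and the injectable clock are hypothetical, and real OS-level hooking of touch, keyboard, mouse and presenter input is platform specific and therefore omitted.

```python
import time

# Interaction-event kinds named in the description (labels are illustrative)
INTERACTION_EVENTS = {
    "touch",        # on-screen writing, annotation, circling
    "keyboard",     # text entry, software commands, function keys
    "mouse",        # clicks, drags, scrolls, zoom/copy with modifiers
    "presenter",    # wireless page turner: page turn, blank screen, air mouse
    "camera_show",  # bringing a camera channel's feed onto the screen
}

class InteractionRecorder:
    """Records a timestamp for every qualifying human-computer interaction
    event on the source terminal (the input that step B consumes)."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock          # injectable for testing
        self.timestamps = []        # the human-computer interaction time points

    def on_event(self, kind):
        """Called from platform input hooks; returns True if recorded."""
        if kind in INTERACTION_EVENTS:
            self.timestamps.append(self.clock())
            return True
        return False
```

A real implementation would call `on_event` from platform-specific input hooks; injecting the clock keeps the sketch deterministic and testable.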
Preferably, step C comprises: the pushed video stream being the default-channel video by default; when a default-channel switch-out time point exists, switching the pushed video stream from the default-channel video to a non-default-channel video; when a default-channel switch-in time point exists, switching the pushed video stream from the non-default-channel video back to the default-channel video.
When the multi-channel audio/video includes N non-default-channel videos, N≥2, the N non-default-channel videos are denoted the first to the N-th non-default-channel video; step C comprises: the pushed video stream being the default-channel video by default; when a default-channel switch-out time point exists, switching the pushed video stream from the default-channel video to the first non-default-channel video; after the first non-default-channel video has played for a preset rotation duration, switching the pushed video stream to the second non-default-channel video, and so on, cycling through to the N-th non-default-channel video and back to the first, until a default-channel switch-in time point is reached; when a default-channel switch-in time point exists, switching the pushed video stream from the non-default-channel video back to the default-channel video.
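The N-channel selection rule of step C can be sketched as a pure function from a time to a channel index. This is a hypothetical helper (`channel_at`, with channel 0 standing for the default channel) assuming marker points are kept as a time-sorted list of `(time, kind)` pairs with times in seconds:

```python
def channel_at(t, markers, n_channels, rotate):
    """Sketch of the step-C selection rule: which channel feeds the push
    stream at time t. Channel 0 is the default channel; channels
    1..n_channels are the non-default channels, rotated every `rotate`
    seconds while the default channel is switched out."""
    switched_out_at = None
    for time_point, kind in markers:       # markers sorted by time
        if time_point > t:
            break
        switched_out_at = time_point if kind == "switch_out" else None
    if switched_out_at is None:
        return 0                            # default channel is live
    hops = int((t - switched_out_at) // rotate)
    return 1 + hops % n_channels            # cycle through non-default channels
```

For example, with a switch-out marker at 15 s, a switch-in marker at 40 s, two non-default channels and a 12 s rotation duration, the stream visits channel 0, then 1, then 2, then returns to 0.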
Further, when a default-channel switch-out time point exists, a switch-out transition effect may be set at that switch-out time point; when a default-channel switch-in time point exists, a switch-in transition effect may be set at that switch-in time point, serving as a cue that the video is changing. Likewise, when the pushed video stream switches between non-default-channel videos, a rotation transition effect may be pushed at the corresponding switch time point.
The default-channel switch-out time points and default-channel switch-in time points may be obtained in the following ways.
Mode 1: step B may comprise:
S101. judging whether a new human-computer interaction time point occurs within a first preset duration of the current human-computer interaction time point, the initial value of the current human-computer interaction time point being the start time of the multi-channel audio/video push; if not, proceeding to step S103; if so, proceeding to step S102;
S102. setting the current human-computer interaction time point to the new human-computer interaction time point, and returning to step S101;
S103. adding a marker time point of type default-channel switch-out time point, its value being set to the time point corresponding to the current human-computer interaction time point plus the first preset duration;
S104. when a new human-computer interaction time point is detected in real time after the latest marker time point, setting the current human-computer interaction time point to the new human-computer interaction time point, and adding a marker time point of type default-channel switch-in time point, its value being set to the new human-computer interaction time point; returning to step S101.
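The idle-timeout logic of steps S101-S104 can be sketched as follows. This is an illustrative offline (post-hoc) formulation, not the patented real-time implementation: the helper name `compute_markers` is hypothetical, and time points are assumed to be plain numbers (e.g. seconds from push start).

```python
def compute_markers(events, start, end, t1):
    """Offline sketch of Mode 1 (steps S101-S104): derive default-channel
    switch-out / switch-in marker points from interaction timestamps.
    `events` are human-computer interaction time points, `start`/`end`
    delimit the push, `t1` is the first preset duration (all in seconds)."""
    events = sorted(e for e in events if start <= e <= end)
    markers, i, current = [], 0, start
    while True:
        # S101/S102: absorb every interaction within t1 of the current point
        while i < len(events) and events[i] - current <= t1:
            current = events[i]
            i += 1
        if current + t1 > end:          # push ends before the idle gap elapses
            break
        markers.append((current + t1, "switch_out"))    # S103
        if i == len(events):
            break
        current = events[i]             # S104: next interaction switches back in
        i += 1
        markers.append((current, "switch_in"))
    return markers
```

With interactions at 5 s and 40 s, a push over [0, 45] s and t1 = 10 s, the sketch yields a switch-out at 15 s and a switch-in at 40 s.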
A case of Mode 1 is explained more clearly below with reference to Fig. 2. In the figure the multi-channel audio/video comprises one default-channel video, a first non-default-channel video and a second non-default-channel video, all three of the same duration. After the default-channel video has been pushed for a duration t0, the first human-computer interaction time point M1 arrives, with t0 < t1; no new interaction time point occurs within the first preset duration t1 after M1; M2 is the immediately following second interaction time point; t3 is the preset rotation duration; t5 is the time from the second interaction time point to the end of the push, with t5 < t1; and it can be seen that M2 − M1 > t1. The pushed video stream starts with the default-channel video, then switches channels at t0+t1 to the first non-default-channel video. No new interaction time point occurs within the first non-default channel's preset rotation duration t3, so the pushed stream switches to the second non-default-channel video. Within the second non-default channel's preset rotation duration a new interaction time point does occur, namely the second interaction time point M2 after a further duration t4, so t0+t1+t3+t4 is set as a default-channel switch-in time point and the pushed stream switches back to the default-channel video. The pushed video stream is therefore composed of the four diagonally hatched segments of Fig. 2 pushed in order: the default-channel video for the period t0+t1, the first non-default-channel video for the period t3, the second non-default-channel video for the period t4, and the default-channel video for the period t5. Correspondingly, a switch-out transition effect may be added at t0+t1, a rotation transition effect at t0+t1+t3, and a switch-in transition effect at t0+t1+t3+t4.
To suit the application scenario in which the terminal hosting the default-channel video has no human-computer interaction but is playing a video, Mode 2 is proposed: step B comprises:
S201. judging whether a new human-computer interaction time point occurs within the first preset duration of the current human-computer interaction time point, the initial value of the current human-computer interaction time point being the start time of the multi-channel audio/video push; if not, proceeding to step S203; if so, proceeding to step S202;
S202. if the type of the previous marker time point is default-channel switch-out time point, adding a marker time point of type default-channel switch-in time point, its value being set to the new human-computer interaction time point; setting the current human-computer interaction time point to the new human-computer interaction time point, and returning to step S201;
S203. setting the image time origin to the time point corresponding to the current human-computer interaction time point plus the first preset duration; if the type of the previous marker time point is default-channel switch-in time point, or no marker time point exists, adding a marker time point of type default-channel switch-out time point, its value being set to the time point corresponding to the current human-computer interaction time point plus the first preset duration;
S204. judging whether a new human-computer interaction time point occurs within a second preset duration after the image time origin; if so, proceeding to step S202; otherwise proceeding to step S205;
S205. judging whether the default-channel video frame at the image time origin is identical to the default-channel video frame at the image time origin plus the second preset duration; if not, proceeding to step S206; otherwise proceeding to step S207;
S206. if the type of the previous marker time point is default-channel switch-out time point, adding a marker time point of type default-channel switch-in time point, its value being set to the time point corresponding to the image time origin plus the second preset duration; setting the new image time origin to the current image time origin plus the second preset duration; proceeding to step S204;
S207. if the type of the previous marker time point is default-channel switch-in time point, adding a marker time point of type default-channel switch-out time point, its value being set to the time point corresponding to the image time origin plus the second preset duration; setting the new image time origin to the current image time origin plus the second preset duration; proceeding to step S204.
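Steps S201-S207 extend the idle-timeout logic with a frame comparison every second preset duration, so that a default channel that is merely playing a video (changing frames, no interaction) is still switched back in. Below is a minimal offline sketch, illustrative only: the name `mode2_markers` is hypothetical, time points are plain seconds, and `frames_differ(a, b)` is a stand-in for the image comparison of step S205.

```python
def mode2_markers(events, frames_differ, start, end, t1, t2):
    """Offline sketch of Mode 2 (steps S201-S207). `t1`/`t2` are the
    first/second preset durations; `frames_differ(a, b)` reports whether
    the default-channel frames at times a and b differ."""
    events = sorted(e for e in events if start <= e <= end)
    markers, i, current = [], 0, start

    def last():
        return markers[-1][1] if markers else None

    while True:
        # S201/S202: absorb interactions within t1 of the current point
        while i < len(events) and events[i] - current <= t1:
            if last() == "switch_out":
                markers.append((events[i], "switch_in"))
            current = events[i]
            i += 1
        img = current + t1                                  # S203
        if img > end:
            break
        if last() in (None, "switch_in"):
            markers.append((img, "switch_out"))             # S203
        resumed = False
        while not resumed:                                  # S204 idle loop
            if img + t2 > end:
                return markers
            if i < len(events) and events[i] - img <= t2:   # S204 -> S202
                if last() == "switch_out":
                    markers.append((events[i], "switch_in"))
                current = events[i]
                i += 1
                resumed = True
            elif frames_differ(img, img + t2):              # S205 -> S206
                if last() == "switch_out":
                    markers.append((img + t2, "switch_in"))
                img += t2
            else:                                           # S205 -> S207
                if last() == "switch_in":
                    markers.append((img + t2, "switch_out"))
                img += t2
    return markers
```

In a Fig.-3-style run (interactions at 5 s and 43 s, t1 = 10 s, t2 = 8 s, frames changing throughout the idle phase), the sketch switches out at 15 s and back in at 23 s, then stays on the default channel until the push ends.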
A case of Mode 2 is explained more clearly below with reference to Fig. 3. In the figure the multi-channel audio/video comprises one default-channel video and one first non-default-channel video of the same duration. After the default-channel video has been pushed for a duration t0, the first human-computer interaction time point M1 arrives; no new interaction time point occurs within the first preset duration t1 after M1, with t0 < t1; M2 is the immediately following second interaction time point; t6 is the second preset duration; and it can be seen that M2 − M1 > t1. The pushed video stream starts with the default-channel video and switches channels at the time point corresponding to t0+t1, beginning to push the first non-default-channel video. Suppose the default-channel frame at the time point corresponding to t0+t1 differs from the default-channel frame at the time point corresponding to t0+t1+t6, meaning the default channel has no human-computer interaction but its picture is changing, for example playing a demonstration video; then t0+t1+t6 is set as a default-channel switch-in time point. Suppose the second interaction time point M2 is reached 2·t6+t7 after t0+t1+t6, with the default-channel picture changing throughout, and that t5, the time from the second interaction time point to the end of the push, satisfies t5 < t1. The pushed video stream is then composed of the three diagonally hatched segments of Fig. 3 pushed in order: the default-channel video for the period t0+t1, the first non-default-channel video for the period t6, and the default-channel video for the period 2·t6+t7+t5. Correspondingly, a switch-out transition effect may be added at t0+t1 and a switch-in transition effect at t0+t1+t6.
To suit the application scenario in which the terminal hosting the default-channel video has no human-computer interaction but is playing a video, there is also Mode 3: step B comprises:
S301. judging whether a new human-computer interaction time point occurs within the first preset duration of the current human-computer interaction time point, the initial value of the current human-computer interaction time point being the start time of the multi-channel audio/video push; if not, proceeding to step S303; if so, proceeding to step S302;
S302. if the type of the previous marker time point is default-channel switch-out time point, adding a marker time point of type default-channel switch-in time point, its value being set to the new human-computer interaction time point; setting the current human-computer interaction time point to the new human-computer interaction time point, and returning to step S301;
S303. setting the image time origin to the time point corresponding to the current human-computer interaction time point plus the first preset duration;
S304. judging whether a new human-computer interaction time point occurs within the second preset duration after the image time origin; if so, proceeding to step S302; otherwise proceeding to step S305;
S305. judging whether the default-channel video frame at the image time origin is identical to the default-channel video frame at the image time origin plus the second preset duration; if not, proceeding to step S306; otherwise proceeding to step S307;
S306. if the type of the previous marker time point is default-channel switch-out time point, adding a marker time point of type default-channel switch-in time point, its value being set to the image time origin; setting the new image time origin to the current image time origin plus the second preset duration; proceeding to step S304;
S307. if the type of the previous marker time point is default-channel switch-in time point, or no marker time point exists, adding a marker time point of type default-channel switch-out time point, its value being set to the image time origin; setting the new image time origin to the current image time origin plus the second preset duration; proceeding to step S304.
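Mode 3 differs from Mode 2 in two respects: S303 creates no switch-out marker up front, and S306/S307 place their markers at the image time origin itself rather than one second preset duration later. As a result, a default channel whose picture keeps changing never produces a switch-out marker at all. A minimal offline sketch under the same illustrative assumptions as before (hypothetical name `mode3_markers`, plain numeric time points, `frames_differ` standing in for the S305 image comparison):

```python
def mode3_markers(events, frames_differ, start, end, t1, t2):
    """Offline sketch of Mode 3 (steps S301-S307)."""
    events = sorted(e for e in events if start <= e <= end)
    markers, i, current = [], 0, start

    def last():
        return markers[-1][1] if markers else None

    while True:
        # S301/S302: absorb interactions within t1 of the current point
        while i < len(events) and events[i] - current <= t1:
            if last() == "switch_out":
                markers.append((events[i], "switch_in"))
            current = events[i]
            i += 1
        img = current + t1                                  # S303 (no marker yet)
        resumed = False
        while not resumed:                                  # S304 idle loop
            if img + t2 > end:
                return markers
            if i < len(events) and events[i] - img <= t2:   # S304 -> S302
                if last() == "switch_out":
                    markers.append((events[i], "switch_in"))
                current = events[i]
                i += 1
                resumed = True
            elif frames_differ(img, img + t2):              # S305 -> S306
                if last() == "switch_out":
                    markers.append((img, "switch_in"))      # marker at img itself
                img += t2
            else:                                           # S305 -> S307
                if last() in (None, "switch_in"):
                    markers.append((img, "switch_out"))     # marker at img itself
                img += t2
    return markers
```

In a Fig.-4-style run (frames changing throughout), no marker is ever emitted and the pushed stream remains the default-channel video; with a static screen, a switch-out appears at the first image time origin.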
A case of Mode 3 is explained more clearly below with reference to Fig. 4. In the figure the multi-channel audio/video comprises one default-channel video and one first non-default-channel video of the same duration. After the default-channel video has been pushed for a duration t0, the first human-computer interaction time point M1 arrives; no new interaction time point occurs within the first preset duration t1 after M1, with t0 < t1; M2 is the immediately following second interaction time point; t6 is the second preset duration; and it can be seen that M2 − M1 > t1. Suppose the default-channel frame at the time point corresponding to t0+t1+t6 differs from the default-channel frame at the time point corresponding to t0+t1, and the second interaction time point M2 is reached 2·t6+t7 after t0+t1+t6, with the default-channel picture changing throughout; then the pushed video stream remains the default-channel video the whole time.

Claims (9)

  1. A multi-channel audio/video automatic-detection real-time push method, characterised by comprising the following steps:
    A. selectively pushing captured multi-channel audio/video, the multi-channel audio/video comprising one default-channel video and at least one non-default-channel video, the default-channel video being pushed by default;
    B. detecting in real time human-computer interaction events on the source terminal of the default-channel video and recording the corresponding human-computer interaction time points, and computing default-channel switch-out time points and default-channel switch-in time points from the human-computer interaction time points;
    C. selecting the video stream of the corresponding channel for pushing according to the default-channel switch-out time points and default-channel switch-in time points.
  2. The method according to claim 1, characterised in that
    step B comprises:
    S101. judging whether a new human-computer interaction time point occurs within a first preset duration of the current human-computer interaction time point, the initial value of the current human-computer interaction time point being the start time of the multi-channel audio/video push; if not, proceeding to step S103; if so, proceeding to step S102;
    S102. setting the current human-computer interaction time point to the new human-computer interaction time point, and returning to step S101;
    S103. adding a marker time point of type default-channel switch-out time point, its value being set to the time point corresponding to the current human-computer interaction time point plus the first preset duration;
    S104. when a new human-computer interaction time point is detected in real time after the latest marker time point, setting the current human-computer interaction time point to the new human-computer interaction time point, and adding a marker time point of type default-channel switch-in time point, its value being set to the new human-computer interaction time point; returning to step S101.
  3. The method according to claim 1, characterised in that
    step B comprises:
    S201. judging whether a new human-computer interaction time point occurs within the first preset duration of the current human-computer interaction time point, the initial value of the current human-computer interaction time point being the start time of the multi-channel audio/video push; if not, proceeding to step S203; if so, proceeding to step S202;
    S202. if the type of the previous marker time point is default-channel switch-out time point, adding a marker time point of type default-channel switch-in time point, its value being set to the new human-computer interaction time point; setting the current human-computer interaction time point to the new human-computer interaction time point, and returning to step S201;
    S203. setting the image time origin to the time point corresponding to the current human-computer interaction time point plus the first preset duration; if the type of the previous marker time point is default-channel switch-in time point, or no marker time point exists, adding a marker time point of type default-channel switch-out time point, its value being set to the time point corresponding to the current human-computer interaction time point plus the first preset duration;
    S204. judging whether a new human-computer interaction time point occurs within a second preset duration after the image time origin; if so, proceeding to step S202; otherwise proceeding to step S205;
    S205. judging whether the default-channel video frame at the image time origin is identical to the default-channel video frame at the image time origin plus the second preset duration; if not, proceeding to step S206; otherwise proceeding to step S207;
    S206. if the type of the previous marker time point is default-channel switch-out time point, adding a marker time point of type default-channel switch-in time point, its value being set to the time point corresponding to the image time origin plus the second preset duration; setting the new image time origin to the current image time origin plus the second preset duration; proceeding to step S204;
    S207. if the type of the previous marker time point is default-channel switch-in time point, adding a marker time point of type default-channel switch-out time point, its value being set to the time point corresponding to the image time origin plus the second preset duration; setting the new image time origin to the current image time origin plus the second preset duration; proceeding to step S204.
  4. The method according to claim 1, characterised in that
    step B comprises:
    S301. judging whether a new human-computer interaction time point occurs within the first preset duration of the current human-computer interaction time point, the initial value of the current human-computer interaction time point being the start time of the multi-channel audio/video push; if not, proceeding to step S303; if so, proceeding to step S302;
    S302. if the type of the previous marker time point is default-channel switch-out time point, adding a marker time point of type default-channel switch-in time point, its value being set to the new human-computer interaction time point; setting the current human-computer interaction time point to the new human-computer interaction time point, and returning to step S301;
    S303. setting the image time origin to the time point corresponding to the current human-computer interaction time point plus the first preset duration;
    S304. judging whether a new human-computer interaction time point occurs within the second preset duration after the image time origin; if so, proceeding to step S302; otherwise proceeding to step S305;
    S305. judging whether the default-channel video frame at the image time origin is identical to the default-channel video frame at the image time origin plus the second preset duration; if not, proceeding to step S306; otherwise proceeding to step S307;
    S306. if the type of the previous marker time point is default-channel switch-out time point, adding a marker time point of type default-channel switch-in time point, its value being set to the image time origin; setting the new image time origin to the current image time origin plus the second preset duration; proceeding to step S304;
    S307. if the type of the previous marker time point is default-channel switch-in time point, or no marker time point exists, adding a marker time point of type default-channel switch-out time point, its value being set to the image time origin; setting the new image time origin to the current image time origin plus the second preset duration; proceeding to step S304.
  5. The method according to claim 1, characterised in that
    the human-computer interaction events include screen-touch commands, keyboard operations, mouse operations, electronic-presentation control commands and camera-channel display commands.
  6. The method according to any one of claims 1 to 5, characterised in that
    step C comprises: the pushed video stream being the default-channel video by default; when a default-channel switch-out time point exists, switching the pushed video stream from the default-channel video to a non-default-channel video; when a default-channel switch-in time point exists, switching the pushed video stream from the non-default-channel video back to the default-channel video.
  7. The method according to claim 6, characterised in that
    when the multi-channel audio/video includes N non-default-channel videos, N≥2, the N non-default-channel videos are denoted the first to the N-th non-default-channel video; step C comprises: the pushed video stream being the default-channel video by default; when a default-channel switch-out time point exists, switching the pushed video stream from the default-channel video to the first non-default-channel video; after the first non-default-channel video has played for a preset rotation duration, switching the pushed video stream to the second non-default-channel video, and so on, cycling through to the N-th non-default-channel video and back to the first, until a default-channel switch-in time point is reached; when a default-channel switch-in time point exists, switching the pushed video stream from the non-default-channel video back to the default-channel video.
  8. The method according to claim 6 or 7, characterised in that
    when a default-channel switch-out time point exists, a switch-out transition effect is pushed at the default-channel switch-out time point; when a default-channel switch-in time point exists, a switch-in transition effect is pushed at the default-channel switch-in time point.
  9. The method according to claim 7, characterised in that
    when the pushed video stream switches between non-default-channel videos, a rotation transition effect is pushed at the corresponding switch time point.
PCT/CN2020/092669 2019-05-28 2020-05-27 Multi-channel audio/video automatic-detection real-time push method WO2020238973A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910452583.4A CN110166841B (zh) 2019-05-28 2019-05-28 Multi-channel audio/video automatic-detection real-time push method
CN201910452583.4 2019-05-28

Publications (1)

Publication Number Publication Date
WO2020238973A1 true WO2020238973A1 (zh) 2020-12-03

Family

ID=67629457

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/092669 WO2020238973A1 (zh) 2019-05-28 2020-05-27 Multi-channel audio/video automatic-detection real-time push method

Country Status (2)

Country Link
CN (1) CN110166841B (zh)
WO (1) WO2020238973A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110166841B (zh) * 2019-05-28 2021-06-29 成都依能科技股份有限公司 Multi-channel audio/video automatic-detection real-time push method

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060228692A1 (en) * 2004-06-30 2006-10-12 Panda Computer Services, Inc. Method and apparatus for effectively capturing a traditionally delivered classroom or a presentation and making it available for review over the Internet using remote production control
CN104822038A (zh) * 2015-04-30 2015-08-05 广州瀚唐电子科技有限公司 Recording-and-broadcasting system and picture switching method thereof
CN205179215U (zh) * 2015-09-22 2016-04-20 杭州海康威视系统技术有限公司 Teaching recording-and-broadcasting system
CN107507474A (zh) * 2016-06-14 2017-12-22 广东紫旭科技有限公司 Video switching method for a remote interactive recording-and-broadcasting system
CN108243327A (zh) * 2016-12-27 2018-07-03 天津阳冰科技有限公司 Novel video recording-and-broadcasting system
CN110166841A (zh) * 2019-05-28 2019-08-23 成都依能科技股份有限公司 Multi-channel audio/video automatic-detection real-time push method

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101394479B (zh) * 2008-09-25 2010-06-16 上海交通大学 Teacher motion tracking method based on motion detection combined with multi-channel fusion
CN101909160A (zh) * 2009-12-17 2010-12-08 新奥特(北京)视频技术有限公司 Broadcast-control switching method and device for live network video
US20120196254A1 (en) * 2011-01-27 2012-08-02 Bobby Joe Marsh Methods and systems for concurrent teaching of assembly processes at disparate locations
CN102547248B (zh) * 2012-02-03 2014-03-26 深圳锐取信息技术股份有限公司 Method for recording multiple real-time monitoring channels into a single video file
CN103347155B (zh) * 2013-06-18 2016-08-10 北京汉博信息技术有限公司 Transition-effect module and method for switching between two video streams with different transition effects
CN104410834A (zh) * 2014-12-04 2015-03-11 重庆晋才富熙科技有限公司 Intelligent switching method for teaching video
CN104469303A (zh) * 2014-12-04 2015-03-25 重庆晋才富熙科技有限公司 Intelligent switching method for teaching videos
CN105744340A (zh) * 2016-02-26 2016-07-06 上海卓越睿新数码科技有限公司 Method for real-time picture fusion of live video and presentation slides
CN105939480A (zh) * 2016-04-18 2016-09-14 乐视控股(北京)有限公司 Live broadcast method and device for terminal video
CN106454200A (zh) * 2016-08-10 2017-02-22 惠州紫旭科技有限公司 Video interaction method and system based on scene switching


Also Published As

Publication number Publication date
CN110166841B (zh) 2021-06-29
CN110166841A (zh) 2019-08-23


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
    Ref document number: 20813581; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase
    Ref country code: DE
122 Ep: pct application non-entry in european phase
    Ref document number: 20813581; Country of ref document: EP; Kind code of ref document: A1