CN114286114B - Intelligent video stream switching method and system - Google Patents

Publication number
CN114286114B
Authority
CN
China
Prior art keywords
volume
sound
volume detection
channel
mono
Prior art date
Legal status
Active
Application number
CN202111363782.1A
Other languages
Chinese (zh)
Other versions
CN114286114A (en
Inventor
陆趣
杨君蔚
黄海峰
朱竝清
徐俊
李辉石
周鑫
胡春玲
Current Assignee
Shanghai Media Tech Co ltd
Original Assignee
Shanghai Media Tech Co ltd

Landscapes

  • Television Receiver Circuits (AREA)

Abstract

The invention provides an intelligent video stream switching method and system, relating to the technical field of signal processing. The method comprises: continuously acquiring pulse code modulation audio data of multiple sources and calculating a two-channel volume value for each source; performing mono conversion on the two-channel volume values to obtain the mono volume values of the sources, and adding the mono volume values of each source, in order, to a volume buffer set; and, for each volume buffer set, performing volume detection on each mono volume value to obtain a corresponding volume detection result, and, when the audio duration corresponding to the mono volume values in the volume buffer set is not smaller than a preset value, intelligently switching the video pictures of the video streams corresponding to the sources according to the sound judgment results of the current volume detection of all the sources, and updating each volume buffer set. The beneficial effect is that video pictures are switched intelligently based on the sound judgment results while noise interference is prevented, which effectively saves labor cost.

Description

Intelligent video stream switching method and system
Technical Field
The present invention relates to the field of signal processing technologies, and in particular, to a method and a system for intelligent switching of video streams.
Background
With the development of internet technology, live video broadcasting has attracted increasing attention. During the live broadcast of some video programs (such as product launch events), it is often necessary to show the audience each anchor at various different positions, and after the target anchor is determined, the live picture is switched from the current anchor to the target anchor, who may be at a different position.
Traditional live picture switching is realized with hardware such as a production switcher. However, this traditional approach is costly and is not suited to the development of live internet broadcasting. With the development of the internet, cloud director platforms, that is, virtual director consoles for switching multi-channel video streams, have come into use. On top of basic switching they support multi-channel audio and video switching with custom scenes, multi-channel live stream pushing, and the addition of custom picture logos, custom text, custom scoreboards, custom caption bars, custom elements, delayed broadcasting and the like. However, although the existing cloud director platforms can switch video streams at low cost, the switching still requires manual operation by an operator.
Disclosure of Invention
To address the problems in the prior art, the invention provides an intelligent video stream switching method, which comprises the following steps:
step S1, continuously collecting two-channel pulse code modulation audio of a plurality of sources, and calculating the two-channel volume values of the pulse code modulation audio of each source;
step S2, carrying out mono conversion on the two-channel sound volume values to obtain mono sound volume values of the information sources, and correspondingly adding the mono sound volume values corresponding to each channel of the information sources into a sound volume buffer set according to the sequence of the corresponding acquisition time;
step S3, for each volume buffer set, respectively performing volume detection on each mono volume value to obtain a corresponding volume detection result, and judging whether the audio duration of the pulse code modulation audio corresponding to each mono volume value in the volume buffer set is smaller than a preset value or not:
if yes, returning to the step S1;
if not, processing according to each volume detection result to obtain a corresponding sound judgment result of the current volume detection of the information source, and then turning to step S4;
step S4, intelligently switching the video pictures of the video streams corresponding to the sources according to the sound judgment results of the current volume detection of all the sources, updating each volume buffer set, and returning to the step S3.
Preferably, the volume detection result is silent or voiced; then in step S4:
processing to obtain the sound judgment result of the corresponding information source according to a first duty ratio of the sound volume detection result representing silence in a first preset time period and a second duty ratio of the sound volume detection result representing sound in a second preset time period aiming at each sound volume cache set;
the first preset duration and the second preset duration are both greater than the preset value.
Preferably, in the step S3, the detecting the volume of each mono volume value includes:
comparing each mono volume value with a volume threshold value respectively, and judging whether the mono volume value is smaller than the volume threshold value or not:
if yes, outputting the volume detection result representing silence;
if not, outputting the volume detection result indicating sound.
Preferably, the first preset duration is smaller than the second preset duration, and the pulse code modulation audio data of the second preset duration includes the pulse code modulation audio data of the first preset duration; the process of processing the sound judgment result of the corresponding information source according to each volume detection result comprises the following steps:
step A1a, calculating the first duty ratio of the volume detection result representing silence in the first preset duration, and judging whether the first duty ratio is smaller than a first threshold value:
if not, outputting the sound judging result indicating that the information source is continuously silent in the first preset duration, and then turning to the step S4;
if yes, turning to the step A2a;
step A2a, calculating the second duty ratio of the volume detection result representing the sound in the second preset time period, and judging whether the second duty ratio is smaller than a second threshold value:
if not, outputting the sound judging result indicating that the information source continuously sounds within the second preset duration, and then turning to the step S4;
if yes, the sound judgment result of the last sound volume detection is used as the sound judgment result of the current sound volume detection, and then the step S4 is performed.
Preferably, the first preset duration is not less than the second preset duration, and the pulse code modulation audio data of the first preset duration includes the pulse code modulation audio data of the second preset duration; the process of processing the sound judgment result of the corresponding information source according to each volume detection result comprises the following steps:
step A1b, calculating the second duty ratio of the volume detection result representing the sound in the second preset time period, and judging whether the second duty ratio is smaller than a second threshold value:
if not, outputting the sound judging result indicating that the information source continuously sounds within the second preset duration, and then turning to the step S4;
if yes, turning to the step A2b;
step A2b, calculating the first duty ratio of the volume detection result representing silence in the first preset duration, and judging whether the first duty ratio is smaller than a first threshold:
if not, outputting the sound judging result indicating that the information source is continuously silent in the first preset duration, and then turning to the step S4;
if yes, the sound judgment result of the last sound volume detection is used as the sound judgment result of the current sound volume detection, and then the step S4 is performed.
Preferably, in the step S2, a Max channel conversion algorithm is used to perform mono conversion on the two-channel volume values.
Preferably, in the step S4, the video pictures of the video streams corresponding to the sources are intelligently switched according to the number of sound judgment results, among the sound judgment results of the current volume detection of all the sources, that indicate that a source is continuously sounding.
Preferably, in the step S4, the number of the sound judgment results indicating that a source is continuously sounding is counted; when the number is one, the current video playing interface is intelligently switched to the video picture of the video stream corresponding to the source that is continuously sounding, and when the number is greater than one, the current video playing interface is intelligently switched to the panoramic video picture.
Preferably, in step S3, for each volume buffer set, a sliding time window is adopted to perform volume detection on each mono volume value to obtain a corresponding volume detection result.
The invention also provides an intelligent video stream switching system, which applies the above intelligent video stream switching method and comprises:
the volume calculation module is used for continuously collecting the double-channel pulse code modulation audio of the multi-channel information sources and calculating to obtain the double-channel volume value of the pulse code modulation audio of each information source;
the volume conversion module is connected with the volume calculation module and is used for carrying out mono-channel conversion on the two-channel volume values to obtain mono-channel volume values of the information sources, and adding the mono-channel volume values corresponding to each channel of the information sources into a volume buffer set correspondingly according to the sequence of the corresponding acquisition time;
the volume detection module is connected with the volume conversion module and is used for respectively carrying out volume detection on each single-channel volume value aiming at each volume cache set to obtain a corresponding volume detection result, and outputting a judging signal when the audio duration of the pulse code modulation audio corresponding to each single-channel volume value in the volume cache set is not less than a preset value;
and the sound judging module is connected with the volume detection module and is used for obtaining, according to the judging signal and the volume detection results, the sound judgment result of the corresponding source, intelligently switching the video pictures of the video streams corresponding to the sources according to the sound judgment results of the current volume detection of all the sources, and updating each volume buffer set.
The technical scheme has the following advantages or beneficial effects: by detecting and analyzing the pulse code modulation audio of multiple sources, the voiced or silent judgment is protected against interference from noise; at the same time, the corresponding video picture can be switched intelligently based on the voiced or silent sound judgment result, which realizes an unattended, automated cloud directing mode of operation and effectively saves labor cost.
Drawings
FIG. 1 is a flow chart of an intelligent video stream switching method according to a preferred embodiment of the invention;
FIG. 2 is a schematic diagram comparing the mono volume values in the volume buffer set before and after volume detection according to the preferred embodiment of the present invention;
FIG. 3 is a flow chart illustrating a sound determination process when the first preset duration is less than the second preset duration in a preferred embodiment of the present invention;
FIG. 4 is a schematic diagram of volume detection results corresponding to a first preset duration and a second preset duration respectively in a preferred embodiment of the present invention;
FIG. 5 is a flow chart of a sound judging process when the first preset time period is not less than the second preset time period in the preferred embodiment of the present invention;
FIG. 6 is a schematic diagram showing the sliding step between two adjacent volume detections according to the preferred embodiment of the present invention;
FIG. 7 is a schematic structural diagram of an intelligent video stream switching system according to a preferred embodiment of the present invention.
Detailed Description
The invention will now be described in detail with reference to the drawings and specific examples. The present invention is not limited to the embodiment, and other embodiments may fall within the scope of the present invention as long as they conform to the gist of the present invention.
In a preferred embodiment of the present invention, based on the above-mentioned problems existing in the prior art, an intelligent video stream switching method is now provided, as shown in fig. 1, which includes:
step S1, continuously collecting two-channel pulse code modulation audio of multiple sources, and calculating the two-channel volume values of the pulse code modulation audio of each source;
step S2, carrying out mono-channel conversion on the two-channel volume values to obtain mono-channel volume values of all the information sources, and correspondingly adding the mono-channel volume values corresponding to each channel of the information sources into a volume buffer set according to the sequence of the corresponding acquisition time;
step S3, for each volume buffer set, respectively performing volume detection on each mono volume value to obtain a corresponding volume detection result, and judging whether the audio duration of the pulse code modulation audio corresponding to each mono volume value in the volume buffer set is smaller than a preset value or not:
if yes, returning to the step S1;
if not, processing according to each volume detection result to obtain a sound judgment result of the current volume detection of the corresponding information source, and then turning to step S4;
step S4, intelligently switching the video pictures of the video streams corresponding to the sources according to the sound judgment results of the current volume detection of all the sources, updating each volume buffer set, and returning to the step S3.
Specifically, in this embodiment, the technical solution may be applied to a cloud director platform, which is used to switch video pictures intelligently while a television program is being played. During playback, the two-channel PCM (Pulse Code Modulation) audio of the multiple sources corresponding to the television program is continuously collected, the pulse code modulation audio of each source is then continuously processed to obtain the sound judgment result of that source, and the video pictures of the video streams corresponding to the sources are then switched intelligently according to the sound judgment results of all the sources.
More specifically, after the two-channel pulse code modulation audio of the multiple sources is acquired, the two-channel volume values of the pulse code modulation audio of each source can first be calculated by a volume decibel calculator. The two-channel volume values can then be sent to a volume processor, which performs mono conversion on the left and right channel volume values to obtain a mono volume value. Finally, the continuously acquired mono volume values of each source can be sent to an intelligent switcher for volume detection and sound judgment.
Preferably, a plurality of audio buffers may be configured in the volume decibel calculator, each audio buffer buffering the pulse code modulation audio of one source. Further preferably, a plurality of volume buffers may be configured in the intelligent switcher, each volume buffer holding the volume buffer set of one source. Both the audio buffers and the volume buffers follow the first-in first-out principle: the pulse code modulation audio that enters an audio buffer first is output first once its two-channel volume values have been calculated, and is sent to the volume processor to calculate the mono volume value; that mono volume value is in turn the first to be stored in the volume buffer and the first to take part in volume detection and sound judgment. After a round of volume detection and judgment is finished, the mono volume values that have already taken part can be removed from the volume buffer set, updating the set for the next volume detection. Each round of volume detection is followed by one video picture switching based on the corresponding sound judgment results. It can be understood that if the sound judgment results of two adjacent rounds are the same, the corresponding video picture remains unchanged.
Further preferably, in order to prevent noise from interfering with volume detection, in the present technical solution the sound judgment is made over a plurality of volume detection results covering a continuous period of time. The sound judgment is therefore made only when the audio duration of the pulse code modulation audio corresponding to the mono volume values in the volume buffer set is long enough, that is, when enough values have been buffered in the volume buffer set. Specifically, the preset value is configured as the criterion for judging whether the audio duration is long enough.
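The volume decibel calculation for one block of audio can be sketched as follows. This is only an illustration under stated assumptions: the text does not specify the sample format or the decibel formula, so the sketch assumes interleaved 16-bit little-endian stereo PCM and an RMS level relative to full scale (dBFS). The function name `stereo_volume_db` is hypothetical.

```python
import math
import struct


def stereo_volume_db(pcm_bytes):
    """Return (left_db, right_db) for one block of stereo PCM audio.

    Assumptions (not stated in the source text): samples are interleaved
    16-bit little-endian (L, R, L, R, ...) and volume is the RMS level in
    dBFS. A channel with no signal is reported as negative infinity.
    """
    count = len(pcm_bytes) // 2                  # number of 16-bit samples
    samples = struct.unpack("<%dh" % count, pcm_bytes[: count * 2])
    left, right = samples[0::2], samples[1::2]   # de-interleave the channels

    def rms_db(channel):
        if not channel:
            return -math.inf
        rms = math.sqrt(sum(s * s for s in channel) / len(channel))
        if rms == 0:
            return -math.inf
        return 20.0 * math.log10(rms / 32768.0)  # 32768 = 16-bit full scale

    return rms_db(left), rms_db(right)
```

A block whose left channel alternates between +16384 and -16384 and whose right channel is all zeros yields roughly -6.02 dBFS on the left and negative infinity on the right.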
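The first-in first-out behaviour of a volume buffer can be sketched with a double-ended queue. The class name `VolumeBuffer`, its methods, and the fixed per-value audio duration `frame_ms` are hypothetical; the text only states that values are stored in acquisition order and that the oldest values are removed after they have been used.

```python
from collections import deque


class VolumeBuffer:
    """FIFO volume buffer set for one source (illustrative sketch).

    frame_ms is an assumed, fixed audio duration covered by each mono
    volume value; the source text does not give a figure.
    """

    def __init__(self, frame_ms):
        self.frame_ms = frame_ms
        self._values = deque()

    def push(self, mono_volume):
        # Newest value goes to the back, preserving acquisition order.
        self._values.append(mono_volume)

    def duration_ms(self):
        # Total audio duration currently represented in the buffer.
        return len(self._values) * self.frame_ms

    def drop_oldest(self, count):
        # Remove up to `count` of the oldest values and return them.
        dropped = []
        for _ in range(min(count, len(self._values))):
            dropped.append(self._values.popleft())
        return dropped

    def snapshot(self):
        # Copy of the buffered values, oldest first.
        return list(self._values)
```

Comparing `duration_ms()` with the preset value is one way to decide whether enough audio has been buffered to make a sound judgment.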
In a preferred embodiment of the present invention, the volume detection result is silent or voiced; then in step S4:
processing to obtain a sound judgment result of a corresponding information source according to a first duty ratio of a sound volume detection result representing silence in a first preset time period and a second duty ratio of a sound volume detection result representing sound in a second preset time period aiming at each sound volume cache set;
the first preset duration and the second preset duration are both greater than a preset value.
Specifically, in this embodiment, when the audio duration of the pulse code modulation audio corresponding to the mono volume values in the volume buffer set reaches the preset value, the volume detection results within the first preset duration and within the second preset duration can each be selected from the volume buffer set for the silent or voiced judgment. The first preset duration and the second preset duration can be set as required.
In a preferred embodiment of the present invention, in step S3, performing volume detection on each monaural volume value includes:
comparing each mono volume value with a volume threshold value respectively, and judging whether the mono volume value is smaller than the volume threshold value or not:
if yes, outputting a volume detection result representing silence;
if not, outputting a volume detection result representing sound.
Specifically, in this embodiment, as shown in FIG. 2, taking the volume buffer set corresponding to one source as an example, the volume detection result obtained by performing volume detection on each mono volume value can be seen intuitively.
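The per-value comparison against the volume threshold can be sketched as below. The default threshold of -40 dB is purely an assumed placeholder, since the text gives no value, and the function name `detect_volume` is hypothetical.

```python
def detect_volume(mono_volumes, volume_threshold=-40.0):
    """Map each mono volume value to a volume detection result.

    True means "voiced" and False means "silent". A value smaller than
    the threshold is silent; anything else is voiced. The -40.0 default
    is an assumption made for this sketch.
    """
    return [value >= volume_threshold for value in mono_volumes]
```

For example, `detect_volume([-60.0, -40.0, -12.5])` gives `[False, True, True]`: a value equal to the threshold is not smaller than it and therefore counts as voiced.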
As a preferred embodiment, after the volume detection result of each mono volume value has been obtained, the sound judgment follows a pre-configured priority: when the first preset duration is the same as the second preset duration, the voiced judgment is made first; when the two durations differ, the judgment with the shorter duration is made first.
Further specifically, the first preset duration is less than the second preset duration, and the pulse code modulated audio data of the second preset duration comprises pulse code modulated audio data of the first preset duration; as shown in fig. 3, the process of obtaining the sound determination result of the corresponding source according to each volume detection result includes:
step A1a, calculating a first duty ratio of a volume detection result representing silence in a first preset time period, and judging whether the first duty ratio is smaller than a first threshold value:
if not, outputting a sound judgment result indicating that the information source is continuously silent in the first preset time period, and then turning to the step S4;
if yes, turning to the step A2a;
step A2a, calculating a second duty ratio of the volume detection result representing sound within a second preset time period, and judging whether the second duty ratio is smaller than a second threshold value:
if not, outputting a sound judgment result indicating that the information source continuously sounds in a second preset time period, and then turning to the step S4;
if yes, the sound determination result of the last sound volume detection is taken as the sound determination result of the current sound volume detection, and then the step S4 is performed.
Specifically, when the first preset duration is smaller than the second preset duration, the silent judgment is made first. If the first duty ratio is not smaller than the first threshold, the proposition that the source is silent is regarded as true, the sound judgment result indicating that the source is continuously silent within the first preset duration is output, and no voiced judgment is made. If the first duty ratio is smaller than the first threshold, the proposition that the source is silent is regarded as false, and the voiced judgment is then required. If the second duty ratio is not smaller than the second threshold, the proposition that the source is voiced is regarded as true, and the sound judgment result indicating that the source is continuously sounding within the second preset duration is output. If the second duty ratio is smaller than the second threshold, the proposition that the source is voiced is also regarded as false, which means the current volume detection is invalid, and the sound judgment result of the last volume detection is taken as the sound judgment result of the current volume detection.
Preferably, the first threshold and the second threshold may both be 80%, or may be set as required. For each volume buffer set, the volume detection results obtained by performing volume detection on the mono volume values are shown in FIG. 4, in which the volume detection result at the far right corresponds to the earliest audio collection time. When the sound judgment is made, that audio collection time can be used as the starting point for selecting the volume detection results of the first preset duration. It can be seen that the first duty ratio of volume detection results representing silence within the first preset duration is 20%, which is less than 80%, so the voiced judgment is then made. As shown in FIG. 4, the starting point of the second preset duration must coincide with that of the first preset duration. The second duty ratio of volume detection results representing sound within the second preset duration is 75%, which is also less than 80%, so the sound judgment result of the last volume detection is taken as the sound judgment result of the current volume detection.
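Steps A1a and A2a can be sketched as a single function. The function name, the string labels, and the representation of detection results as booleans ordered oldest first are assumptions for illustration. The window sizes used in the usage note (5 and 8 results) are likewise hypothetical; they are chosen only so that the duty ratios (proportions) match the 20% and 75% figures quoted for FIG. 4.

```python
def judge_silent_first(results, n_first, n_second, previous,
                       first_threshold=0.8, second_threshold=0.8):
    """Sound judgment when the first preset duration is the shorter one.

    results  -- detection results, oldest first (True = voiced)
    n_first  -- number of results covered by the first preset duration
    n_second -- number of results covered by the second preset duration
                (assumed n_first < n_second <= len(results))
    previous -- sound judgment result of the last volume detection
    """
    # Step A1a: proportion of silent results in the first window.
    silent_ratio = results[:n_first].count(False) / n_first
    if silent_ratio >= first_threshold:
        return "silent"

    # Step A2a: proportion of voiced results in the second window,
    # which starts at the same point as the first window.
    voiced_ratio = results[:n_second].count(True) / n_second
    if voiced_ratio >= second_threshold:
        return "voiced"

    # Neither proposition holds: reuse the previous judgment.
    return previous
```

With `results = [True, True, False, True, True, False, True, True]`, `n_first=5` and `n_second=8`, the silent ratio is 1/5 = 20% and the voiced ratio is 6/8 = 75%, both below 80%, so the function returns whatever `previous` was.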
In a preferred embodiment of the present invention, the first preset duration is not less than the second preset duration, and the pulse code modulated audio data of the first preset duration includes pulse code modulated audio data of the second preset duration; as shown in fig. 5, the process of obtaining the sound determination result of the corresponding source according to each volume detection result includes:
step A1b, calculating a second duty ratio of the volume detection result representing sound within a second preset time period, and judging whether the second duty ratio is smaller than a second threshold value:
if not, outputting a sound judgment result indicating that the information source continuously sounds in a second preset time period, and then turning to the step S4;
if yes, turning to the step A2b;
step A2b, calculating a first duty ratio of the volume detection result representing silence in a first preset time period, and judging whether the first duty ratio is smaller than a first threshold value:
if not, outputting a sound judgment result indicating that the information source is continuously silent in the first preset time period, and then turning to the step S4;
if yes, the sound determination result of the last sound volume detection is taken as the sound determination result of the current sound volume detection, and then the step S4 is performed.
Specifically, in this embodiment, when the first preset duration is not less than the second preset duration, the voiced judgment is made first. If the second duty ratio is not smaller than the second threshold, the proposition that the source is voiced is regarded as true, the sound judgment result indicating that the source is continuously sounding within the second preset duration is output, and no silent judgment is made. If the second duty ratio is smaller than the second threshold, the proposition that the source is voiced is regarded as false, and the silent judgment is then required. If the first duty ratio is not smaller than the first threshold, the proposition that the source is silent is regarded as true, and the sound judgment result indicating that the source is continuously silent within the first preset duration is output. If the first duty ratio is smaller than the first threshold, the proposition that the source is silent is also regarded as false, which means the current volume detection is invalid, and the sound judgment result of the last volume detection is taken as the sound judgment result of the current volume detection.
In a preferred embodiment of the present invention, in step S2, a Max channel conversion algorithm is used to convert the two-channel volume values into a mono volume value.
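The voiced-first order of steps A1b and A2b can be sketched in the same way. As before, the function name, the string labels and the oldest-first boolean list are assumptions made for illustration, and the 0.8 defaults simply reuse the 80% figure mentioned for the thresholds.

```python
def judge_voiced_first(results, n_first, n_second, previous,
                       first_threshold=0.8, second_threshold=0.8):
    """Sound judgment when the first preset duration is not the shorter one.

    results  -- detection results, oldest first (True = voiced)
    n_first  -- number of results covered by the first preset duration
    n_second -- number of results covered by the second preset duration
                (assumed n_second <= n_first <= len(results))
    previous -- sound judgment result of the last volume detection
    """
    # Step A1b: proportion of voiced results in the second window.
    voiced_ratio = results[:n_second].count(True) / n_second
    if voiced_ratio >= second_threshold:
        return "voiced"

    # Step A2b: proportion of silent results in the first window.
    silent_ratio = results[:n_first].count(False) / n_first
    if silent_ratio >= first_threshold:
        return "silent"

    # Neither proposition holds: reuse the previous judgment.
    return previous
```

The only difference from the silent-first order is which proposition is tested first, so a source that briefly speaks at the start of a long, otherwise quiet window is reported as voiced rather than silent.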
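The text names the Max channel conversion algorithm without defining it. One plausible reading, assumed here, is that the mono volume value is simply the larger of the left and right channel volume values. The function name `to_mono_max` is hypothetical.

```python
def to_mono_max(left_volume, right_volume):
    """Collapse a two-channel volume value into a mono volume value.

    Assumed interpretation of "Max channel conversion": keep the louder
    of the two channels, so a speaker present on only one channel is
    not diluted by the silent channel.
    """
    return max(left_volume, right_volume)
```

Under this reading, a source at -12 dB on the right and -30 dB on the left is treated as a -12 dB mono source, whereas averaging the channels would understate it.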
In a preferred embodiment of the present invention, in step S4, the video pictures of the video streams corresponding to the sources are intelligently switched according to the number of sound judgment results, among the sound judgment results of the current volume detection of all the sources, that indicate that a source is continuously sounding.
In the preferred embodiment of the present invention, in step S4, the number of sound judgment results indicating that a source is continuously sounding is counted; when the number is one, the current video playing interface is intelligently switched to the video picture of the video stream corresponding to the continuously sounding source, and when the number is greater than one, the current video playing interface is intelligently switched to the panoramic video picture.
Specifically, in this embodiment, when the number of sound judgment results indicating that a source is continuously sounding is one, it can be considered that only one anchor in the television program is speaking continuously at this time, and the current video playing interface is switched to the corresponding video picture to give a close-up. If the number of sound judgment results indicating that a source is continuously sounding is greater than one, it can be considered that several anchors in the television program are speaking continuously at this time, and the current video playing interface is switched to the panoramic video picture so as to include all the anchors who are speaking.
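The switching rule can be sketched as a small selection function. The function name, the string labels and the `"panorama"` identifier are hypothetical. The text does not say what happens when no source is sounding; the sketch assumes the current picture is kept in that case.

```python
def choose_picture(judgments, current, panorama="panorama"):
    """Pick the video picture to show from per-source sound judgments.

    judgments -- mapping of source id to "voiced" or "silent"
    current   -- picture currently shown
    """
    voiced = [source for source, result in judgments.items()
              if result == "voiced"]
    if len(voiced) == 1:
        return voiced[0]      # exactly one speaker: close-up on that source
    if len(voiced) > 1:
        return panorama       # several speakers: panoramic picture
    return current            # assumption: nobody speaking, keep the picture
```

Because the function returns the current picture whenever the outcome is unchanged, calling it after every round of volume detection does not cause a visible switch unless the set of sounding sources actually changes.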
In the preferred embodiment of the present invention, in step S3, for each volume buffer set, a sliding time window is adopted to perform volume detection on each monaural volume value to obtain a corresponding volume detection result.
Specifically, in the present embodiment, the sliding step size of the sliding time window includes, but is not limited to, 100ms. For example, as shown in fig. 6, a schematic diagram of a sliding step length of a sliding time window corresponding to a first preset duration and a second preset duration is shown when two adjacent volume measurements are performed; it can be seen that when the volume is detected this time, the starting time of the sliding time window corresponding to the first preset duration and the second preset duration is the rightmost side, and when the volume is detected next time, the starting time of the sliding time window corresponding to the first preset duration and the second preset duration is the audio acquisition time of the third audio volume detection result from the right side, and the sliding step length covers the audio time of the two audio volume detection results. Further, when the next volume detection is finished, the two sound volume detection results covered by the sliding step length can be deleted from the volume buffer set, so as to update the volume buffer set. It can be understood that the number of the volume detection results covered by the first preset duration, the second preset duration and the sliding step length is related to the acquisition period of the pulse code modulation audio, and the set values of the first preset duration, the second preset duration and the sliding step length, which are shown as an embodiment and are not limited in this way.
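The sliding time window can be sketched as a generator over the detection results. The function name and parameters are hypothetical. Expressing the window and step as counts of results assumes a fixed audio duration per result; for instance, a 100 ms step covering two results would imply 50 ms per result, a figure the text does not state.

```python
def sliding_windows(results, window_count, step_count):
    """Yield successive windows over detection results, oldest first.

    window_count -- number of results covered by the preset duration
    step_count   -- number of results covered by the sliding step
    Each new window starts step_count results later than the previous
    one, which corresponds to deleting the oldest step_count results
    from the volume buffer set between detections.
    """
    start = 0
    while start + window_count <= len(results):
        yield results[start:start + window_count]
        start += step_count
```

For six results `a` to `f`, a window of four and a step of two produce the windows `a b c d` and then `c d e f`, so the second detection starts at the third result, as in the FIG. 6 description.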
The invention also provides an intelligent video stream switching system that applies the above intelligent video stream switching method. As shown in fig. 7, the system comprises:
the volume calculation module 1 is used for continuously collecting the two-channel pulse code modulation audio of the multiple information sources and calculating to obtain the two-channel volume values of the pulse code modulation audio of each information source;
the volume conversion module 2 is connected with the volume calculation module 1 and is used for carrying out mono conversion on the two-channel volume values to obtain the mono volume values of all the information sources, and adding the mono volume values corresponding to each information source into a corresponding volume buffer set in order of acquisition time;
the volume detection module 3 is connected with the volume conversion module 2 and is used for carrying out volume detection on each mono volume value in each volume buffer set to obtain a corresponding volume detection result, and outputting a judging signal when the audio duration of the pulse code modulation audio corresponding to the mono volume values in the volume buffer set is not less than a preset value;
the sound judging module 4 is connected with the volume detection module 3 and is used for processing, according to the judging signal and each volume detection result, to obtain the sound judging result of the corresponding information source, intelligently switching the video pictures of the video streams corresponding to the information sources according to the sound judging results of the current volume detection of all the information sources, and respectively updating the volume buffer sets.
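The chain formed by the conversion, detection and judging modules can be sketched as below. This is an illustrative sketch only. It assumes that the Max channel conversion keeps the larger of the two channel volume values, it follows the threshold comparison of claim 2 and the judging order of claim 3 (silence checked first, then sound), and the function names and all numeric thresholds are hypothetical.

```python
from typing import Sequence

SILENT, VOICED = "silent", "voiced"


def to_mono(left: float, right: float) -> float:
    # Assumption: "Max" conversion keeps the louder of the two channels.
    return max(left, right)


def detect(mono: float, volume_threshold: float) -> bool:
    # Below the threshold means silent (False); otherwise voiced (True).
    return mono >= volume_threshold


def judge(first_window: Sequence[bool], second_window: Sequence[bool],
          first_threshold: float, second_threshold: float,
          previous: str) -> str:
    """Return the sound judging result for one source."""
    if not first_window or not second_window:
        return previous  # not enough data: keep the previous result
    silent_ratio = sum(1 for r in first_window if not r) / len(first_window)
    if silent_ratio >= first_threshold:
        return SILENT    # continuously silent within the first duration
    voiced_ratio = sum(1 for r in second_window if r) / len(second_window)
    if voiced_ratio >= second_threshold:
        return VOICED    # continuously voiced within the second duration
    return previous      # neither condition met: reuse the previous result


# Usage with made-up two-channel volume values and thresholds.
stereo = [(0.1, 0.2), (0.6, 0.3), (0.7, 0.8), (0.5, 0.9)]
results = [detect(to_mono(l, r), 0.5) for l, r in stereo]
state = judge(results[:2], results, 0.8, 0.7, SILENT)
```

In this usage the shorter first window is taken from the start of the longer second window, which matches the case where the second preset duration includes the first.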
The foregoing description is only illustrative of the preferred embodiments of the present invention and is not to be construed as limiting the scope of the invention, and it will be appreciated by those skilled in the art that equivalent substitutions and obvious variations may be made using the description and drawings, and are intended to be included within the scope of the present invention.

Claims (7)

1. An intelligent video stream switching method, applied to cloud broadcasting, characterized by comprising the following steps:
step S1, continuously collecting two-channel pulse code modulation audio of a plurality of information sources, and calculating to obtain two-channel volume values of the pulse code modulation audio of each information source;
step S2, carrying out mono conversion on the two-channel volume values to obtain mono volume values of the information sources, and adding the mono volume values corresponding to each information source into a corresponding volume buffer set in order of acquisition time;
step S3, for each volume buffer set, respectively performing volume detection on each mono volume value to obtain a corresponding volume detection result, and judging whether the audio duration of the pulse code modulation audio corresponding to the mono volume values in the volume buffer set is smaller than a preset value:
if yes, returning to the step S1;
if not, processing according to each volume detection result to obtain the sound judgment result of the current volume detection of the corresponding information source, and then turning to step S4;
step S4, intelligently switching the video pictures of the video streams corresponding to the information sources according to the sound judgment results of the current volume detection of all the information sources, respectively updating the volume buffer sets, and then returning to the step S3;
the volume detection result is silent or voiced;
the sound judgment result comprises that the information source is continuously silent within a first preset duration or continuously voiced within a second preset duration;
in the step S4, the video pictures of the video streams corresponding to the information sources are intelligently switched according to the number of the sound judgment results indicating that the information source is continuously voiced among the sound judgment results of the current volume detection of all the information sources;
in the step S4, the number of the sound judgment results indicating that the information source is continuously voiced, among the sound judgment results of the current volume detection of all the information sources, is counted; when the number is one, the current video playing interface is intelligently switched to the video picture of the video stream corresponding to the continuously voiced information source; and when the number is greater than one, the current video playing interface is intelligently switched to a panorama video picture including all speaking anchors.
2. The intelligent video stream switching method according to claim 1, wherein in the step S3, the detecting the volume of each of the mono volume values includes:
comparing each mono volume value with a volume threshold value respectively, and judging whether the mono volume value is smaller than the volume threshold value or not:
if yes, outputting the volume detection result representing silence;
if not, outputting the volume detection result indicating sound.
3. The method for intelligent switching of video streams according to claim 1, wherein the first preset duration is less than the second preset duration, and the pulse code modulated audio data of the second preset duration includes the pulse code modulated audio data of the first preset duration; the process of processing the sound judgment result of the corresponding information source according to each volume detection result comprises the following steps:
step A1a, calculating the first duty ratio of the volume detection result representing silence in the first preset duration, and judging whether the first duty ratio is smaller than a first threshold value:
if not, outputting the sound judging result indicating that the information source is continuously silent in the first preset duration, and then turning to the step S4;
if yes, turning to the step A2a;
step A2a, calculating the second duty ratio of the volume detection result representing the sound in the second preset time period, and judging whether the second duty ratio is smaller than a second threshold value:
if not, outputting the sound judging result indicating that the information source continuously sounds within the second preset duration, and then turning to the step S4;
if yes, taking the sound judgment result of the previous volume detection as the sound judgment result of the current volume detection, and then turning to the step S4;
in the step S4:
for each volume buffer set, processing to obtain the sound judgment result of the corresponding information source according to a first duty ratio of the volume detection results representing silence within a first preset duration and a second duty ratio of the volume detection results representing sound within a second preset duration;
the first preset duration and the second preset duration are both greater than the preset value.
4. The video stream intelligent switching method according to claim 2, wherein the first preset duration is not less than the second preset duration, and the pulse code modulated audio data of the first preset duration includes the pulse code modulated audio data of the second preset duration; the process of processing the sound judgment result of the corresponding information source according to each volume detection result comprises the following steps:
step A1b, calculating the second duty ratio of the volume detection result representing the sound in the second preset time period, and judging whether the second duty ratio is smaller than a second threshold value:
if not, outputting the sound judging result indicating that the information source continuously sounds within the second preset duration, and then turning to the step S4;
if yes, turning to the step A2b;
step A2b, calculating the first duty ratio of the volume detection result representing silence in the first preset duration, and judging whether the first duty ratio is smaller than a first threshold value:
if not, outputting the sound judging result indicating that the information source is continuously silent in the first preset duration, and then turning to the step S4;
if yes, taking the sound judgment result of the previous volume detection as the sound judgment result of the current volume detection, and then turning to the step S4;
in the step S4:
for each volume buffer set, processing to obtain the sound judgment result of the corresponding information source according to a first duty ratio of the volume detection results representing silence within a first preset duration and a second duty ratio of the volume detection results representing sound within a second preset duration;
the first preset duration and the second preset duration are both greater than the preset value.
5. The intelligent video stream switching method according to claim 1, wherein in the step S2, a Max channel conversion algorithm is used to perform mono conversion on the binaural sound values.
6. The method according to claim 1, wherein in step S3, for each volume buffer set, a sliding time window is adopted to perform volume detection on each mono volume value to obtain the corresponding volume detection result.
7. An intelligent video stream switching system, applied to cloud broadcasting, characterized in that it applies the intelligent video stream switching method according to any one of claims 1 to 6, the intelligent video stream switching system comprising:
the volume calculation module is used for continuously collecting the double-channel pulse code modulation audio of the multi-channel information sources and calculating to obtain the double-channel volume value of the pulse code modulation audio of each information source;
the volume conversion module is connected with the volume calculation module and is used for carrying out mono-channel conversion on the two-channel volume values to obtain mono-channel volume values of the information sources, and adding the mono-channel volume values corresponding to each channel of the information sources into a volume buffer set correspondingly according to the sequence of the corresponding acquisition time;
the volume detection module is connected with the volume conversion module and is used for carrying out volume detection on each mono volume value in each volume buffer set to obtain a corresponding volume detection result, and outputting a judging signal when the audio duration of the pulse code modulation audio corresponding to the mono volume values in the volume buffer set is not less than a preset value;
the sound judging module is connected with the volume detection module and is used for processing, according to the judging signal and each volume detection result, to obtain the sound judging result of the corresponding information source, intelligently switching the video pictures of the video streams corresponding to the information sources according to the sound judging results of the current volume detection of all the information sources, and respectively updating the volume buffer sets;
the volume detection result is silent or voiced;
the sound judging result comprises that the information source is continuously silent within a first preset duration or continuously voiced within a second preset duration;
the sound judging module intelligently switches the video pictures of the video streams corresponding to the information sources according to the number of the sound judging results indicating that the information source is continuously voiced among the sound judging results of the current volume detection of all the information sources;
the sound judging module counts the number of the sound judging results indicating that the information source is continuously voiced among the sound judging results of the current volume detection of all the information sources, intelligently switches the current video playing interface to the video picture of the video stream corresponding to the continuously voiced information source when the number is one, and intelligently switches the current video playing interface to a panorama video picture including all speaking anchors when the number is greater than one.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111363782.1A CN114286114B (en) 2021-11-17 2021-11-17 Intelligent video stream switching method and system

Publications (2)

Publication Number Publication Date
CN114286114A CN114286114A (en) 2022-04-05
CN114286114B true CN114286114B (en) 2024-02-09

Family

ID=80869316

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111363782.1A Active CN114286114B (en) 2021-11-17 2021-11-17 Intelligent video stream switching method and system

Country Status (1)

Country Link
CN (1) CN114286114B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05111020A (en) * 1991-10-17 1993-04-30 Matsushita Electric Ind Co Ltd Picture switching control device for video conference
JPH10308816A (en) * 1997-05-06 1998-11-17 Fujitsu Ltd Voice switch for speaking equipment
JP2008294759A (en) * 2007-05-24 2008-12-04 Toshiba Corp Video and sound reproducing device and method
CN102368816A (en) * 2011-12-01 2012-03-07 中科芯集成电路股份有限公司 Intelligent front end system of video conference
CN107105176A (en) * 2017-05-19 2017-08-29 上海东方传媒技术有限公司 A kind of system that switch contents are pinpointed in digital program
CN107948577A (en) * 2017-12-26 2018-04-20 深圳市保千里电子有限公司 A kind of method and its system of panorama video conference
CN108924469A (en) * 2018-08-01 2018-11-30 广州视源电子科技股份有限公司 A kind of display screen switching Transmission system, intelligent interaction plate and method
CN109905616A (en) * 2019-01-22 2019-06-18 视联动力信息技术股份有限公司 A kind of method and apparatus of Switch Video picture
CN111586341A (en) * 2020-05-20 2020-08-25 深圳随锐云网科技有限公司 Shooting method and picture display method of video conference shooting device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10524053B1 (en) * 2018-06-22 2019-12-31 EVA Automation, Inc. Dynamically adapting sound based on background sound

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
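A Dynamic Video Streaming Scheme Based on SP/SI Frames of H.264/AVC; Po-Chyi Su; 2012 41st International Conference on Parallel Processing Workshops; full text *
Discussion of the automatic emergency switching control mechanism and its application under a 4K ultra-high-definition broadcast system architecture; Zhang Hui; 现代电视技术 (No. 12); full text *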

Also Published As

Publication number Publication date
CN114286114A (en) 2022-04-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant