CN114286114A - Intelligent switching method and system for video streams - Google Patents


Info

Publication number
CN114286114A
Authority
CN
China
Prior art keywords
volume
sound
volume detection
channel
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111363782.1A
Other languages
Chinese (zh)
Other versions
CN114286114B (en)
Inventor
陆趣
杨君蔚
黄海峰
朱竝清
徐俊
李辉石
周鑫
胡春玲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Media Tech Co ltd
Original Assignee
Shanghai Media Tech Co ltd

Landscapes

  • Television Receiver Circuits (AREA)

Abstract

The invention provides a method and a system for intelligently switching video streams, relating to the technical field of signal processing. The method comprises: continuously acquiring pulse code modulation audio data of multiple information sources and calculating a dual-channel volume value for each information source; performing monaural conversion on the dual-channel volume value to obtain a monaural volume value for each information source, and adding the monaural volume values of each information source in sequence to a volume cache set; and, for each volume cache set, performing volume detection on each monaural volume value to obtain a corresponding volume detection result and, when the audio duration corresponding to the monaural volume values in the volume cache set is not less than a preset value, intelligently switching the video pictures of the video streams corresponding to the information sources according to the sound judgment results of the volume detection of all the information sources, and updating each volume cache set. The beneficial effects are that video pictures are intelligently switched based on the sound judgment results while noise interference is prevented, which effectively saves labor cost.

Description

Intelligent switching method and system for video streams
Technical Field
The invention relates to the technical field of signal processing, in particular to a method and a system for intelligently switching video streams.
Background
With the development of internet technology, live video has attracted more and more attention. During the live broadcast of some video programs (for example, product launch events), it is often necessary to show viewers the anchors located at various different positions; after the target anchor is determined, the live picture is switched from the current anchor to the target anchor, who may be at a different position.
Traditional live picture switching relies on hardware such as a director's switcher. However, this approach is costly and ill-suited to internet live broadcasting. With the development of the internet, the cloud director platform has emerged: a virtual director platform for switching among multiple video streams. It supports switching among multiple audio and video inputs and user-defined scenes, as well as live broadcasting to multiple push streams, and adds functions such as a user-defined picture logo, user-defined text, user-defined scores, a user-defined caption bar, user-defined elements, and delayed broadcast.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides an intelligent video stream switching method, which comprises the following steps:
step S1, continuously collecting the pulse code modulation audio of the double channels of the multi-channel information sources, and calculating to obtain the double-channel volume value of the pulse code modulation audio of each information source;
step S2, performing single channel conversion on the dual-channel volume value to obtain a single channel volume value of each information source, and correspondingly adding each single channel volume value corresponding to each information source into a volume cache set according to the sequence of corresponding acquisition time;
step S3, for each volume buffer set, respectively performing volume detection on each monaural volume value to obtain a corresponding volume detection result, and determining whether an audio duration of the pulse code modulation audio corresponding to each monaural volume value in the volume buffer set is less than a preset value:
if yes, returning to the step S1;
if not, processing according to each volume detection result to obtain a sound judgment result of the current volume detection of the corresponding information source, and then turning to the step S4;
step S4, intelligently switching video frames of video streams corresponding to the information sources according to the sound determination result of the current volume detection of all the information sources, updating each volume cache set, and then returning to the step S3.
Preferably, the volume detection result is silent or voiced; then in step S4:
for each volume cache set, processing to obtain the sound judgment result of the corresponding information source according to a first proportion of the volume detection results representing silence within a first preset time length and a second proportion of the volume detection results representing voiced sound within a second preset time length;
the first preset time length and the second preset time length are both larger than the preset value.
Preferably, in step S3, the detecting the volume of each of the monaural volume values includes:
comparing each single sound channel volume value with a volume threshold value respectively, and judging whether the single sound channel volume value is smaller than the volume threshold value:
if yes, outputting the volume detection result representing silence;
and if not, outputting the volume detection result indicating sound.
Preferably, the first preset duration is less than the second preset duration, and the pulse code modulation audio data of the second preset duration includes the pulse code modulation audio data of the first preset duration; the process of processing to obtain the sound determination result of the corresponding information source according to each of the volume detection results includes:
step A1a, calculating the first proportion of the volume detection results indicating silence within the first preset duration, and determining whether the first proportion is less than a first threshold:
if not, outputting the sound judgment result indicating that the signal source is continuously silent within the first preset time period, and then turning to the step S4;
if yes, go to step A2a;
step A2a, calculating the second proportion of the volume detection result indicating voiced sound within the second preset time period, and determining whether the second proportion is smaller than a second threshold:
if not, outputting the sound judgment result indicating that the signal source is continuously voiced within the second preset time period, and then turning to the step S4;
if so, the sound determination result of the last volume detection is used as the sound determination result of the current volume detection, and the process then proceeds to step S4.
Preferably, the first preset duration is not less than the second preset duration, and the pulse code modulation audio data of the first preset duration includes the pulse code modulation audio data of the second preset duration; the process of processing to obtain the sound determination result of the corresponding information source according to each of the volume detection results includes:
step A1b, calculating the second proportion of the volume detection result indicating voiced sound within the second preset time period, and determining whether the second proportion is smaller than a second threshold:
if not, outputting the sound judgment result indicating that the signal source is continuously voiced within the second preset time period, and then turning to the step S4;
if yes, go to step A2b;
step A2b, calculating the first proportion of the volume detection results indicating silence within the first preset time period, and determining whether the first proportion is less than a first threshold:
if not, outputting the sound judgment result indicating that the signal source is continuously silent within the first preset time period, and then turning to the step S4;
if so, the sound determination result of the last volume detection is used as the sound determination result of the current volume detection, and the process then proceeds to step S4.
Preferably, in step S2, the two-channel volume value is mono-converted by using a Max channel conversion algorithm.
Preferably, in step S4, the video frames of the video streams corresponding to the respective signal sources are intelligently switched according to the number of the sound determination results indicating that the signal sources are continuously voiced in the sound determination results of the current volume detection of all the signal sources.
Preferably, in step S4, the number of the sound determination results indicating that the source is continuously voiced is counted, and when the number is one, the current video playing interface is intelligently switched to the video frame of the video stream corresponding to the source that is continuously voiced, and when the number is greater than one, the current video playing interface is intelligently switched to the video frame indicating a panorama.
Preferably, in step S3, for each volume buffer set, a sliding time window is adopted to perform volume detection on each monaural volume value to obtain the corresponding volume detection result.
The invention also provides a video stream intelligent switching system, which is applied to the video stream intelligent switching method, and the video stream intelligent switching system comprises:
the volume calculation module is used for continuously acquiring the pulse code modulation audio of the double channels of the multi-channel information sources and calculating to obtain the double-channel volume value of the pulse code modulation audio of each information source;
the volume conversion module is connected with the volume calculation module and used for carrying out single-channel conversion on the double-channel volume value to obtain a single-channel volume value of each information source and correspondingly adding each single-channel volume value corresponding to each information source into a volume cache set according to the sequence of corresponding acquisition time;
the volume detection module is connected with the volume conversion module and is used for respectively carrying out volume detection on each single sound channel volume value aiming at each volume cache set to obtain a corresponding volume detection result and outputting a judgment signal when the audio time of the pulse code modulation audio corresponding to each single sound channel volume value in the volume cache set is not less than a preset value;
and the sound judgment module is connected with the volume detection module and used for processing according to the judgment signal and each volume detection result to obtain a sound judgment result of the corresponding information source, intelligently switching the video pictures of the video streams corresponding to the information sources according to the sound judgment result of the current volume detection of all the information sources, and updating each volume cache set respectively.
The technical scheme has the following advantages or beneficial effects: by detecting and analyzing the pulse code modulation audio of the multiple information sources, voiced/silent protection is realized, which prevents noise from interfering with detection; at the same time, the corresponding video pictures can be intelligently switched based on the voiced or silent sound judgment results, realizing an unattended, automatic cloud directing mode and effectively saving labor cost.
Drawings
FIG. 1 is a flow chart illustrating a method for intelligently switching video streams according to a preferred embodiment of the present invention;
FIG. 2 is a schematic diagram illustrating the comparison between the volume detection of each single-channel volume value in the volume buffer set according to the preferred embodiment of the present invention;
FIG. 3 is a flowchart illustrating a sound determination process when a first predetermined duration is less than a second predetermined duration according to a preferred embodiment of the present invention;
FIG. 4 is a schematic diagram illustrating volume detection results corresponding to a first preset duration and a second preset duration, respectively, in a preferred embodiment of the present invention;
FIG. 5 is a flowchart illustrating a sound determination process performed when the first predetermined duration is not less than the second predetermined duration according to a preferred embodiment of the present invention;
FIG. 6 is a schematic diagram of two adjacent sliding steps of volume detection according to the preferred embodiment of the present invention;
FIG. 7 is a schematic structural diagram of an intelligent video stream switching system according to a preferred embodiment of the present invention.
Detailed Description
The invention is described in detail below with reference to the figures and specific embodiments. The present invention is not limited to the embodiment, and other embodiments may be included in the scope of the present invention as long as the gist of the present invention is satisfied.
In a preferred embodiment of the present invention, based on the above problems in the prior art, there is provided a method for intelligently switching video streams, as shown in fig. 1, including:
step S1, continuously collecting the pulse code modulation audio of the double channels of the multi-channel information sources, and calculating to obtain the double-channel volume value of the pulse code modulation audio of each information source;
step S2, mono-channel conversion is carried out on the double-channel volume value to obtain mono-channel volume value of each information source, and each mono-channel volume value corresponding to each information source is correspondingly added into a volume cache set according to the sequence of corresponding acquisition time;
step S3, for each volume buffer set, respectively carrying out volume detection on each single sound channel volume value to obtain a corresponding volume detection result, and judging whether the audio time length of the pulse code modulation audio corresponding to each single sound channel volume value in the volume buffer set is less than a preset value:
if yes, return to step S1;
if not, processing according to each volume detection result to obtain a sound judgment result of the current volume detection of the corresponding information source, and then turning to the step S4;
and step S4, intelligently switching the video pictures of the video streams corresponding to the information sources according to the sound judgment results of the current volume detection of all the information sources, respectively updating each volume cache set, and then returning to the step S3.
Specifically, in this embodiment, the technical scheme may be applied to cloud directing, which is used to intelligently switch video pictures while a television program is playing. During playback, the dual-channel Pulse Code Modulation (PCM) audio of the multiple information sources corresponding to the television program is continuously acquired, and the PCM audio of each information source is continuously processed to obtain the sound judgment result of each information source, so that the video pictures of the video streams corresponding to the information sources are intelligently switched according to the sound judgment results of all the information sources.
Further specifically, after acquiring the dual-channel pulse code modulation audio of the multiple information sources, the dual-channel volume value of the pulse code modulation audio of each information source can be calculated through a volume decibel calculator, then the dual-channel volume value can be sent to a volume processor to be processed, so that the dual-channel volume values of the left and right dual channels are subjected to single channel conversion to obtain single channel volume, and finally the continuously acquired single channel volume of each information source can be sent to an intelligent switcher to be subjected to volume detection and judgment.
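The source does not state the formula used by the volume decibel calculator. As a minimal sketch, assuming interleaved 16-bit little-endian stereo PCM and an RMS level expressed in decibels relative to full scale, the dual-channel volume value of one audio chunk could be computed as follows. The function name `channel_volumes_db` and the sample format are assumptions for illustration only.

```python
import math
import struct


def channel_volumes_db(pcm_bytes: bytes) -> tuple[float, float]:
    """Return (left_db, right_db) for interleaved 16-bit stereo PCM.

    Assumption: samples are signed 16-bit little-endian, ordered L, R, L, R...
    """
    n = len(pcm_bytes) // 2
    samples = struct.unpack(f"<{n}h", pcm_bytes[: n * 2])
    left, right = samples[0::2], samples[1::2]

    def rms_db(channel) -> float:
        # An empty or all-zero channel has no energy: report -inf dB.
        if not channel:
            return -math.inf
        rms = math.sqrt(sum(s * s for s in channel) / len(channel))
        return 20 * math.log10(rms / 32768) if rms > 0 else -math.inf

    return rms_db(left), rms_db(right)


# Two stereo frames: left channel at half scale, right channel digital silence.
chunk = struct.pack("<4h", 16384, 0, 16384, 0)
left_db, right_db = channel_volumes_db(chunk)
```

Here the left channel comes out at about -6 dB and the right channel at negative infinity, which is the kind of left/right pair that the later monaural conversion has to reconcile.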
Preferably, a plurality of audio buffers may be configured in the volume decibel calculator, each audio buffer buffering the pulse code modulation audio of one information source. Further preferably, a plurality of volume buffers may be configured in the intelligent switcher, each volume buffer buffering the volume buffer set of one information source. Both the audio buffers and the volume buffers follow a first-in, first-out principle: the pulse code modulation audio stored first in an audio buffer is the first to have its dual-channel volume calculated and to be sent to the volume processor to obtain the monaural volume, and the monaural volume stored first in a volume buffer is the first to participate in volume detection and judgment. After the volume detection and judgment are finished, the monaural volume values that participated can be removed from the volume buffer set, so that the volume buffer set is updated for the next volume detection. Each volume detection is followed by one switching of the video picture based on the corresponding sound judgment result. It can be understood that if two adjacent sound judgment results are the same, the corresponding video picture is kept unchanged.
Further preferably, to prevent noise from interfering with volume detection, the present technical solution bases each sound determination on multiple volume detection results over a continuous time period. Sound determination therefore proceeds only when the audio duration of the pulse code modulation audio corresponding to the monaural volume values in the volume buffer set is long enough, that is, when the volume buffer set holds enough buffered values. Specifically, a preset value is configured as the criterion for judging whether the audio duration is long enough.
In a preferred embodiment of the present invention, the volume detection result is silent or voiced; in step S4:
processing each volume cache set according to a first proportion representing a silent volume detection result in a first preset time length and a second proportion representing a voiced volume detection result in a second preset time length to obtain a sound judgment result of a corresponding information source;
the first preset time length and the second preset time length are both larger than a preset value.
Specifically, in this embodiment, when the audio time of the pulse code modulation audio corresponding to each monaural volume value in the volume buffer set reaches a preset value, the volume detection results of the first preset time and the second preset time may be respectively selected from the volume buffer set to perform the silent or voiced determination. The first preset time and the second preset time can be set according to the requirement.
In a preferred embodiment of the present invention, in step S3, the detecting the sound volume of each monaural sound volume value includes:
comparing each single sound track volume value with a volume threshold value respectively, and judging whether the single sound track volume value is smaller than the volume threshold value:
if yes, outputting a volume detection result representing silence;
if not, outputting a volume detection result indicating sound.
Specifically, in this embodiment, as shown in fig. 2, taking a volume cache set corresponding to one of the channels of information sources as an example, a volume detection result obtained by respectively performing volume detection on each monaural volume value can be visually seen.
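The comparison above can be sketched in a few lines. The threshold of -50 dB and the names `SILENT`, `VOICED`, and `detect_volume` are hypothetical; the source only says that a volume threshold is used and that a value below it counts as silent.

```python
SILENT, VOICED = "silent", "voiced"


def detect_volume(mono_db: float, threshold_db: float = -50.0) -> str:
    """Classify one monaural volume value against the volume threshold.

    Values strictly below the threshold are silent; values equal to or
    above it are voiced. The default threshold is an assumed example value.
    """
    return SILENT if mono_db < threshold_db else VOICED


# One detection result per monaural volume value, in acquisition order.
results = [detect_volume(v) for v in (-72.0, -50.0, -18.5)]
```

Note that a value exactly at the threshold is classified as voiced, because the source treats only "smaller than the threshold" as silent.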
In a preferred embodiment, after the volume detection result of each monaural volume value is obtained, the priority of the silent determination and the voiced determination is configured in advance for sound determination: when the first preset time length and the second preset time length are the same, the voiced determination is performed first; when they differ, the determination with the shorter time length is performed first.
Further specifically, the first preset duration is less than the second preset duration, and the pulse code modulation audio data of the second preset duration comprises the pulse code modulation audio data of the first preset duration; as shown in fig. 3, the process of processing the sound determination result of the corresponding source according to each volume detection result includes:
step A1a, calculating a first ratio of the volume detection result indicating silence within a first preset time period, and determining whether the first ratio is less than a first threshold:
if not, outputting a sound judgment result indicating that the information source is continuously silent within a first preset time length, and then turning to the step S4;
if yes, go to step A2a;
step A2a, calculating a second ratio of the voiced sound volume detection result within a second preset time period, and determining whether the second ratio is smaller than a second threshold:
if not, outputting a sound judgment result indicating that the information source is continuously voiced within a second preset time period, and then turning to the step S4;
if so, the sound determination result of the last volume detection is used as the sound determination result of the current volume detection, and the process then proceeds to step S4.
Specifically, when the first preset time length is less than the second preset time length, the silent determination is performed first. If the first ratio is not less than the first threshold, the proposition that the information source is silent is true, the sound judgment result is silent, and the voiced determination is not performed. If the first ratio is less than the first threshold, the proposition that the information source is silent is false, and the voiced determination is then required. If the second ratio is not less than the second threshold, the proposition that the information source is voiced is true, and the sound judgment result indicating that the information source is continuously voiced within the second preset time length is output. If the second ratio is less than the second threshold, the proposition that the information source is voiced is also false; the sound judgment of the current volume detection is then invalid, and the sound judgment result of the last volume detection is taken as the sound judgment result of the current volume detection.
Preferably, the first threshold and the second threshold may both be 80%, or may be set as needed. For each volume buffer set, the volume detection results for the monaural volume values are shown in fig. 4, where the voiced volume detection result on the rightmost side has the earliest audio acquisition time. When sound determination is performed, that audio acquisition time may be used as the time starting point, and the volume detection results covering the first preset time length are selected. In the example, the first ratio of volume detection results indicating silence within the first preset time length is 20%, which is less than 80%, so the voiced determination is then performed. As shown in fig. 4, the starting point of the second preset time length must coincide with that of the first preset time length. The second ratio of volume detection results indicating voiced sound within the second preset time length is 75%, which is also less than 80%, so the sound judgment result of the last volume detection is taken as the sound judgment result of the current volume detection.
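The silent-first procedure (steps A1a and A2a) and the 20% / 75% example can be sketched as follows. This is an illustration under assumptions: the function name `judge_silent_first` is hypothetical, index 0 of `results` stands for the earliest detection result (the shared starting point of both windows), and the window sizes of 5 and 8 results are invented to reproduce the percentages, since fig. 4 is not available here.

```python
def judge_silent_first(
    results: list[str],
    n_first: int,
    n_second: int,
    previous: str,
    first_threshold: float = 0.8,
    second_threshold: float = 0.8,
) -> str:
    """Sound judgment when the first preset duration is the shorter one.

    n_first / n_second are the numbers of detection results covered by the
    first and second preset durations; both windows start at results[0].
    """
    first_window = results[:n_first]
    silent_ratio = first_window.count("silent") / len(first_window)
    if silent_ratio >= first_threshold:        # step A1a: silence confirmed
        return "silent"

    second_window = results[:n_second]
    voiced_ratio = second_window.count("voiced") / len(second_window)
    if voiced_ratio >= second_threshold:       # step A2a: voiced confirmed
        return "voiced"

    return previous                            # neither holds: keep last result


# Assumed data: 1 of the first 5 results is silent (20%),
# 6 of the first 8 results are voiced (75%).
results = ["voiced", "silent", "voiced", "voiced", "voiced",
           "silent", "voiced", "voiced"]
decision = judge_silent_first(results, n_first=5, n_second=8, previous="silent")
```

With these inputs neither ratio reaches 80%, so `decision` is simply the previous judgment, which keeps the current picture stable during ambiguous audio.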
In a preferred embodiment of the present invention, the first predetermined duration is not less than the second predetermined duration, and the pulse code modulation audio data of the first predetermined duration includes pulse code modulation audio data of the second predetermined duration; as shown in fig. 5, the process of processing the sound determination result of the corresponding source according to each volume detection result includes:
step A1b, calculating a second percentage of the voiced sound volume detection result within a second preset time period, and determining whether the second percentage is smaller than a second threshold:
if not, outputting a sound judgment result indicating that the information source is continuously voiced within a second preset time period, and then turning to the step S4;
if yes, go to step A2b;
step A2b, calculating a first ratio of the volume detection result indicating silence within a first preset time period, and determining whether the first ratio is less than a first threshold:
if not, outputting a sound judgment result indicating that the information source is continuously silent within a first preset time length, and then turning to the step S4;
if so, the sound determination result of the last volume detection is used as the sound determination result of the current volume detection, and the process then proceeds to step S4.
Specifically, in the present embodiment, when the first preset time length is not less than the second preset time length, the voiced determination is performed first. If the second ratio is not less than the second threshold, the proposition that the information source is voiced is true, the sound judgment result is voiced, and the silent determination is not performed. If the second ratio is less than the second threshold, the proposition that the information source is voiced is false, and the silent determination is then required. If the first ratio is not less than the first threshold, the proposition that the information source is silent is true, and the sound judgment result indicating that the information source is continuously silent within the first preset time length is output. If the first ratio is less than the first threshold, the proposition that the information source is silent is also false; the sound judgment of the current volume detection is then invalid, and the sound judgment result of the last volume detection is taken as the sound judgment result of the current volume detection.
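The voiced-first procedure (steps A1b and A2b) mirrors the silent-first one with the order of the two checks swapped. The sketch below uses the hypothetical name `judge_voiced_first` and assumed window sizes, and treats index 0 of `results` as the shared starting point of both windows. The example is chosen so that both propositions would hold on their own, which shows why the order of evaluation matters.

```python
def judge_voiced_first(
    results: list[str],
    n_first: int,
    n_second: int,
    previous: str,
    first_threshold: float = 0.8,
    second_threshold: float = 0.8,
) -> str:
    """Sound judgment when the first preset duration is not the shorter one."""
    second_window = results[:n_second]
    voiced_ratio = second_window.count("voiced") / len(second_window)
    if voiced_ratio >= second_threshold:       # step A1b: voiced confirmed
        return "voiced"

    first_window = results[:n_first]
    silent_ratio = first_window.count("silent") / len(first_window)
    if silent_ratio >= first_threshold:        # step A2b: silence confirmed
        return "silent"

    return previous                            # neither holds: keep last result


# Assumed data: 5 voiced results followed by 25 silent ones.
# Short window (5 results): 100% voiced. Long window (30 results): 83% silent.
results = ["voiced"] * 5 + ["silent"] * 25
decision = judge_voiced_first(results, n_first=30, n_second=5, previous="silent")
```

Because the shorter voiced window is checked first, the result here is voiced even though the longer window is mostly silent.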
In the preferred embodiment of the present invention, in step S2, a Max channel conversion algorithm is used to perform mono conversion on the two-channel volume value.
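The source names the Max channel conversion algorithm without defining it. A plausible reading, stated here as an assumption, is that the monaural volume value is the larger of the left and right channel volume values, so that a source audible in only one channel still registers as voiced. The function name `to_mono_max` is hypothetical.

```python
def to_mono_max(left_db: float, right_db: float) -> float:
    """Collapse a dual-channel volume value to one monaural value.

    Assumption: "Max" means taking the louder of the two channels.
    """
    return max(left_db, right_db)


mono_db = to_mono_max(-40.0, -12.5)
```

Averaging the channels instead would pull a one-sided signal toward silence, which may be why a maximum is preferred in this setting.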
In a preferred embodiment of the present invention, in step S4, the video frames of the video streams corresponding to the signal sources are intelligently switched according to the number of sound determination results indicating that the signal sources are continuously voiced in the sound determination results of the current volume detection of all the signal sources.
In a preferred embodiment of the present invention, in step S4, the number of sound determination results indicating that the signal source has continuous sound is counted, and when the number is one, the current video playing interface is intelligently switched to the video frame of the video stream corresponding to the signal source having continuous sound, and when the number is more than one, the current video playing interface is intelligently switched to the video frame indicating the panorama.
Specifically, in this embodiment, when exactly one sound determination result indicates a continuously voiced information source, it may be considered that only one anchor in the television program is speaking continuously; the current video playing interface is then switched to the corresponding video picture as a close-up. When more than one sound determination result indicates a continuously voiced information source, it may be considered that several anchors are speaking continuously; the current video playing interface is then switched to the panoramic video picture so that all speaking anchors are included.
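The counting rule can be sketched as a small selection function. The source identifiers (`cam1`, `cam2`), the `"panorama"` label, and the function name `choose_picture` are hypothetical. The source does not say what happens when no information source is voiced; keeping the current picture in that case is an assumption made here.

```python
def choose_picture(
    judgments: dict[str, str], current: str, panorama: str = "panorama"
) -> str:
    """Pick the video picture from the per-source sound judgment results."""
    voiced = [source for source, result in judgments.items() if result == "voiced"]
    if len(voiced) == 1:
        return voiced[0]      # exactly one speaker: close-up on that source
    if len(voiced) > 1:
        return panorama       # several speakers: show the panorama
    return current            # assumption: nobody voiced, leave picture as is


picture = choose_picture({"cam1": "voiced", "cam2": "silent"}, current="panorama")
```

If the returned picture equals the current one, no switch is needed, which matches the statement that identical consecutive judgments leave the picture unchanged.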
In a preferred embodiment of the present invention, in step S3, for each volume buffer set, a sliding time window is adopted to perform volume detection on each monaural volume value to obtain a corresponding volume detection result.
Specifically, in the present embodiment, the sliding step of the sliding time window may be, but is not limited to, 100 ms. For example, fig. 6 illustrates the sliding step of the sliding time windows corresponding to the first preset time length and the second preset time length across two adjacent volume detections. In the current volume detection, the start of the sliding time windows corresponding to the first preset time length and the second preset time length is the rightmost volume detection result; in the next volume detection, the start is the audio acquisition time of the third voiced volume detection result counted from the right, so the sliding step covers the audio durations of two voiced volume detection results. Further, the two voiced volume detection results covered by the sliding step may be deleted from the volume buffer set at the end of the next volume detection, thereby updating the volume buffer set. It is understood that the numbers of volume detection results covered by the first preset time length, the second preset time length, and the sliding step depend on the acquisition period of the PCM audio and on the configured values of the first preset time length, the second preset time length, and the sliding step; the illustration is only an example and is not intended to limit the present application.
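A first-in, first-out volume buffer set with a sliding step can be sketched with `collections.deque`. The class name `VolumeCache` and all durations are assumptions: each detection result is taken to cover 50 ms of audio so that a 100 ms step covers two results, as in the example above. For simplicity the sketch drops the covered results immediately after a detection, which is one possible timing for the update.

```python
from collections import deque


class VolumeCache:
    """FIFO buffer of volume detection results for one information source."""

    def __init__(self, sample_ms: int = 50, step_ms: int = 100, min_ms: int = 1000):
        self.results: deque[str] = deque()
        self.sample_ms = sample_ms              # audio covered by one result
        self.step = step_ms // sample_ms        # results dropped per slide
        self.min_len = min_ms // sample_ms      # results needed before judging

    def push(self, result: str) -> None:
        self.results.append(result)             # newest result goes to the back

    def ready(self) -> bool:
        # True once the buffered audio duration reaches the preset value.
        return len(self.results) >= self.min_len

    def window(self, duration_ms: int) -> list[str]:
        # Results covering duration_ms, starting from the earliest one.
        return list(self.results)[: duration_ms // self.sample_ms]

    def slide(self) -> None:
        # Advance the window start by one step, dropping the oldest results.
        for _ in range(min(self.step, len(self.results))):
            self.results.popleft()


cache = VolumeCache(sample_ms=50, step_ms=100, min_ms=200)
for r in ("voiced", "voiced", "silent", "silent"):
    cache.push(r)
first_window = cache.window(100)   # the two earliest results
cache.slide()                      # drop them before the next detection
```

After the slide, the buffer holds less than the preset amount of audio again, so the next detection waits until more monaural volume values arrive.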
The present invention further provides an intelligent video stream switching system, which applies the above intelligent video stream switching method. As shown in fig. 7, the intelligent video stream switching system includes:
the volume calculation module 1 is used for continuously collecting the pulse code modulation audio of the dual channels of the multi-channel information sources and calculating to obtain the dual-channel volume value of the pulse code modulation audio of each information source;
the volume conversion module 2 is connected with the volume calculation module 1 and is used for performing single-channel conversion on the dual-channel volume value to obtain a single-channel volume value of each information source, and correspondingly adding the single-channel volume values corresponding to each information source into a volume cache set according to the sequence of corresponding acquisition time;
the volume detection module 3 is connected with the volume conversion module 2 and is used for respectively carrying out volume detection on each single sound channel volume value aiming at each volume cache set to obtain a corresponding volume detection result and outputting a judgment signal when the audio time length of the pulse code modulation audio corresponding to each single sound channel volume value in the volume cache set is not less than a preset value;
and the sound judgment module 4 is connected with the volume detection module 3 and is used for processing the judgment signal and each volume detection result to obtain a sound judgment result of the corresponding information source, intelligently switching the video pictures of the video streams corresponding to each information source according to the sound judgment results of the volume detection of all the information sources at this time, and updating each volume cache set respectively.
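The text does not state how the volume calculation module derives a dual-channel volume value from the pulse code modulation audio. One possible approach, shown here purely as an assumption, is to take the root mean square level of each channel in decibels relative to full scale, for 16-bit signed little-endian interleaved stereo samples. The function names `stereo_volume_db` and `_rms_db` are hypothetical.

```python
import math
import struct
from typing import Sequence, Tuple

FULL_SCALE = 32768.0  # magnitude of the most negative 16-bit sample


def _rms_db(channel: Sequence[int]) -> float:
    """RMS level of one channel in dBFS; -inf for empty or all-zero input."""
    if not channel:
        return -math.inf
    rms = math.sqrt(sum(s * s for s in channel) / len(channel))
    if rms == 0:
        return -math.inf
    return 20 * math.log10(rms / FULL_SCALE)


def stereo_volume_db(pcm: bytes) -> Tuple[float, float]:
    """Return (left, right) volume for interleaved 16-bit stereo PCM.

    Assumption: samples are signed 16-bit little-endian, ordered L, R, L, R.
    """
    count = len(pcm) // 2
    samples = struct.unpack("<%dh" % count, pcm[:count * 2])
    return _rms_db(samples[0::2]), _rms_db(samples[1::2])
```

A block whose left channel is at full scale and whose right channel is all zeros yields a left value close to 0 dBFS and a right value of negative infinity.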
While the invention has been described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention.

Claims (10)

1. An intelligent video stream switching method is characterized by comprising the following steps:
step S1, continuously collecting the pulse code modulation audio of the double channels of the multi-channel information sources, and calculating to obtain the double-channel volume value of the pulse code modulation audio of each information source;
step S2, performing single channel conversion on the dual-channel volume value to obtain a single channel volume value of each information source, and correspondingly adding each single channel volume value corresponding to each information source into a volume cache set according to the sequence of corresponding acquisition time;
step S3, for each volume buffer set, respectively performing volume detection on each monaural volume value to obtain a corresponding volume detection result, and determining whether an audio duration of the pulse code modulation audio corresponding to each monaural volume value in the volume buffer set is less than a preset value:
if yes, returning to the step S1;
if not, processing according to each volume detection result to obtain a sound judgment result of the current volume detection of the corresponding information source, and then turning to the step S4;
step S4, intelligently switching video frames of video streams corresponding to the information sources according to the sound determination result of the current volume detection of all the information sources, updating each volume cache set, and then returning to the step S3.
2. The intelligent switching method for video streams according to claim 1, wherein the volume detection result is silence or voiced; then in step S4:
for each volume cache set, processing to obtain the sound judgment result of the corresponding information source according to a first proportion of the volume detection results indicating silence within a first preset time length and a second proportion of the volume detection results indicating voiced sound within a second preset time length;
the first preset time length and the second preset time length are both larger than the preset value.
3. The method for intelligently switching video streams according to claim 2, wherein the step S3, performing volume detection on each of the mono volume values includes:
comparing each single sound channel volume value with a volume threshold value respectively, and judging whether the single sound channel volume value is smaller than the volume threshold value:
if yes, outputting the volume detection result representing silence;
and if not, outputting the volume detection result indicating sound.
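The threshold comparison of claim 3 can be sketched in Python as follows. The function name `detect`, the string labels, and the example threshold are illustrative assumptions; the claim does not fix a threshold value or a unit for the volume.

```python
from typing import Iterable, List


def detect(volumes: Iterable[float], threshold: float) -> List[str]:
    """Label each single channel volume value as silent or voiced.

    A value smaller than the threshold is silent; otherwise it is voiced,
    so a value equal to the threshold counts as voiced.
    """
    return ["silent" if v < threshold else "voiced" for v in volumes]
```

With an assumed threshold of -40.0, the values -60.0, -40.0 and -12.5 are labelled silent, voiced and voiced respectively.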
4. The method according to claim 2, wherein the first predetermined duration is less than the second predetermined duration, and the pulse code modulated audio data of the second predetermined duration comprises the pulse code modulated audio data of the first predetermined duration; the process of processing to obtain the sound determination result of the corresponding information source according to each of the volume detection results includes:
step A1a, calculating the first proportion of the volume detection results indicating silence within the first preset duration, and determining whether the first proportion is less than a first threshold:
if not, outputting the sound judgment result indicating that the signal source is continuously silent within the first preset time period, and then turning to the step S4;
if yes, go to step A2a;
step A2a, calculating the second proportion of the volume detection result indicating voiced sound within the second preset time period, and determining whether the second proportion is smaller than a second threshold:
if not, outputting the sound judgment result indicating that the signal source is continuously voiced within the second preset time period, and then turning to the step S4;
if so, the sound determination result of the last volume detection is used as the sound determination result of the current volume detection, and the process then proceeds to step S4.
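Steps A1a and A2a of claim 4 can be read as a two-stage decision with hysteresis, sketched below in Python. The function name `judge`, the use of result counts (`n_first`, `n_second`) in place of durations, and the assumption that both windows end at the newest result are illustrative choices; the claim only requires that the second window contain the first.

```python
from typing import Optional, Sequence


def judge(results: Sequence[bool], n_first: int, n_second: int,
          first_threshold: float, second_threshold: float,
          previous: Optional[bool]) -> Optional[bool]:
    """Return True for continuously voiced, False for continuously silent.

    results: volume detection results, oldest first (True = voiced).
    n_first / n_second: number of results in the first / second window,
    with n_first < n_second (assumed to end at the newest result).
    previous: sound determination result of the last volume detection.
    """
    if not 0 < n_first < n_second <= len(results):
        raise ValueError("need 0 < n_first < n_second <= len(results)")
    # Step A1a: proportion of silent results in the first (shorter) window.
    first = results[-n_first:]
    silent_ratio = sum(1 for r in first if not r) / n_first
    if silent_ratio >= first_threshold:
        return False
    # Step A2a: proportion of voiced results in the second (longer) window.
    second = results[-n_second:]
    voiced_ratio = sum(1 for r in second if r) / n_second
    if voiced_ratio >= second_threshold:
        return True
    # Neither test passed: reuse the previous sound determination result.
    return previous
```

Falling back to the previous result when neither threshold is reached keeps the picture from flickering during short pauses or brief noises.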
5. The method according to claim 3, wherein the first predetermined duration is not less than the second predetermined duration, and the pulse code modulation audio data of the first predetermined duration comprises the pulse code modulation audio data of the second predetermined duration; the process of processing to obtain the sound determination result of the corresponding information source according to each of the volume detection results includes:
step A1b, calculating the second proportion of the volume detection result indicating voiced sound within the second preset time period, and determining whether the second proportion is smaller than a second threshold:
if not, outputting the sound judgment result indicating that the signal source is continuously voiced within the second preset time period, and then turning to the step S4;
if yes, go to step A2b;
step A2b, calculating the first proportion of the volume detection results indicating silence within the first preset time period, and determining whether the first proportion is less than a first threshold:
if not, outputting the sound judgment result indicating that the signal source is continuously silent within the first preset time period, and then turning to the step S4;
if so, the sound determination result of the last volume detection is used as the sound determination result of the current volume detection, and the process then proceeds to step S4.
6. The method for intelligent switching of video streams according to claim 1, wherein in step S2, the dual-channel volume value is converted into the single channel volume value by using a Max channel conversion algorithm.
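The claim does not define the Max channel conversion algorithm. A natural reading, adopted here only as an assumption, is that the single channel volume value is the larger of the left and right channel volume values. The sketch below also appends each converted value to a per-source volume cache in acquisition order; the names `to_mono_max`, `add_volume`, and `volume_cache` are hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, List


def to_mono_max(left: float, right: float) -> float:
    """Assumed Max conversion: keep the louder of the two channels."""
    return max(left, right)


# One volume cache set per information source, in acquisition order.
volume_cache: DefaultDict[str, List[float]] = defaultdict(list)


def add_volume(source: str, left: float, right: float) -> None:
    """Convert one dual-channel volume value and append it to the cache."""
    volume_cache[source].append(to_mono_max(left, right))
```

Taking the maximum means a source counts as loud when either channel is loud, which suits a microphone that feeds only one channel.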
7. The method for intelligently switching video streams according to claim 4 or 5, wherein in step S4, the video frames of the video streams corresponding to the respective sources are intelligently switched according to the number of the sound determination results indicating that the sound of the source continues in the sound determination results of the current volume detection of all the sources.
8. The method for intelligently switching video streams according to claim 7, wherein in step S4, the number of the sound determination results indicating that the source is continuously voiced is counted, and when the number is one, the current video playing interface is intelligently switched to the video picture of the video stream corresponding to the source that is continuously voiced, and when the number is more than one, the current video playing interface is intelligently switched to the video picture indicating panorama.
9. The method for intelligently switching video streams according to claim 1, wherein in step S3, for each volume buffer set, a sliding time window is used to perform volume detection on each monaural volume value to obtain the corresponding volume detection result.
10. An intelligent video stream switching system, applied to the intelligent video stream switching method according to any one of claims 1 to 9, the intelligent video stream switching system comprising:
the volume calculation module is used for continuously acquiring the pulse code modulation audio of the double channels of the multi-channel information sources and calculating to obtain the double-channel volume value of the pulse code modulation audio of each information source;
the volume conversion module is connected with the volume calculation module and used for carrying out single-channel conversion on the double-channel volume value to obtain a single-channel volume value of each information source and correspondingly adding each single-channel volume value corresponding to each information source into a volume cache set according to the sequence of corresponding acquisition time;
the volume detection module is connected with the volume conversion module and is used for respectively carrying out volume detection on each single sound channel volume value aiming at each volume cache set to obtain a corresponding volume detection result and outputting a judgment signal when the audio time of the pulse code modulation audio corresponding to each single sound channel volume value in the volume cache set is not less than a preset value;
and the sound judgment module is connected with the volume detection module and used for processing according to the judgment signal and each volume detection result to obtain a sound judgment result of the corresponding information source, intelligently switching the video pictures of the video streams corresponding to the information sources according to the sound judgment result of the current volume detection of all the information sources, and updating each volume cache set respectively.
CN202111363782.1A 2021-11-17 2021-11-17 Intelligent video stream switching method and system Active CN114286114B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111363782.1A CN114286114B (en) 2021-11-17 2021-11-17 Intelligent video stream switching method and system

Publications (2)

Publication Number Publication Date
CN114286114A true CN114286114A (en) 2022-04-05
CN114286114B CN114286114B (en) 2024-02-09

Family

ID=80869316

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111363782.1A Active CN114286114B (en) 2021-11-17 2021-11-17 Intelligent video stream switching method and system

Country Status (1)

Country Link
CN (1) CN114286114B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05111020A (en) * 1991-10-17 1993-04-30 Matsushita Electric Ind Co Ltd Picture switching control device for video conference
JPH10308816A (en) * 1997-05-06 1998-11-17 Fujitsu Ltd Voice switch for speaking equipment
JP2008294759A (en) * 2007-05-24 2008-12-04 Toshiba Corp Video and sound reproducing device and method
CN102368816A (en) * 2011-12-01 2012-03-07 中科芯集成电路股份有限公司 Intelligent front end system of video conference
CN107105176A (en) * 2017-05-19 2017-08-29 上海东方传媒技术有限公司 A kind of system that switch contents are pinpointed in digital program
CN107948577A (en) * 2017-12-26 2018-04-20 深圳市保千里电子有限公司 A kind of method and its system of panorama video conference
CN108924469A (en) * 2018-08-01 2018-11-30 广州视源电子科技股份有限公司 A kind of display screen switching Transmission system, intelligent interaction plate and method
CN109905616A (en) * 2019-01-22 2019-06-18 视联动力信息技术股份有限公司 A kind of method and apparatus of Switch Video picture
US20190394567A1 (en) * 2018-06-22 2019-12-26 EVA Automation, Inc. Dynamically Adapting Sound Based on Background Sound
CN111586341A (en) * 2020-05-20 2020-08-25 深圳随锐云网科技有限公司 Shooting method and picture display method of video conference shooting device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
PO-CHYI SU: "A Dynamic Video Streaming Scheme Based on SP/SI Frames of H.264/AVC", 2012 41ST INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING WORKSHOPS *
张辉: "探讨4K超高清播出系统架构下自动应急切换控制机制及应用", 现代电视技术, no. 12 *

Also Published As

Publication number Publication date
CN114286114B (en) 2024-02-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant