WO2023011408A1 - Multi-window video communication method, device and system - Google Patents

Multi-window video communication method, device and system

Info

Publication number
WO2023011408A1
Authority
WO
WIPO (PCT)
Prior art keywords
audio
video
window
bandwidth
windows
Prior art date
Application number
PCT/CN2022/109423
Other languages
French (fr)
Chinese (zh)
Inventor
张帮明
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2023011408A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431 Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/443 OS processes, e.g. booting an STB, implementing a Java virtual machine in an STB or power management in an STB
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45 Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/462 Content or additional data management, e.g. creating a master electronic program guide from data received from the Internet and a Head-end, controlling the complexity of a video stream by scaling the resolution or bit-rate based on the client capabilities
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems

Definitions

  • the embodiments of the present application relate to the field of communication technologies, and in particular, to a multi-window video communication method, device, and system.
  • multi-window video communication is used in professional conference scenarios (such as multi-party conference scenarios), life scenarios (such as group video chat scenarios) or online education scenarios in the form of multi-party video calls.
  • the sending end can forward the audio and video streams to the receiving end through the cloud.
  • the cloud side forwards the audio and video streams from the sending end to the receiving end, and performs bandwidth prediction on the communication link for forwarding the audio and video streams.
  • the bandwidth predicted by the cloud side is used to make communication decisions (such as frame extraction decisions, etc.) when the network is poor.
  • the present application provides a multi-window video communication method, device and system, which can make full use of bandwidth resources to avoid network congestion during a multi-party video call, thereby ensuring smoothness and/or clarity of the video.
  • a multi-window video communication method is provided. The method is applied during a video call between a plurality of first devices and a second device, and includes: the second device receives a plurality of audio and video streams respectively from the plurality of first devices, wherein the audio and video corresponding to the plurality of audio and video streams are respectively played in multiple windows on the interface of the second device; the second device determines that the bandwidth resources are not enough to meet the bandwidth requirements of the multiple audio and video streams; and the second device adjusts the subscription strategy for the audio and video streams in one or more windows according to the priorities corresponding to the multiple windows.
  • the adjustment of the subscription policy by the second device may include, but is not limited to, one or more of unsubscribing, resuming subscription, delaying subscription, reducing clarity, or increasing clarity.
  • the bandwidth resources at the receiving end are not enough to meet the bandwidth requirements of multiple audio and video streams (i.e. a weak network environment), for example, when the total bandwidth prediction value used to characterize the bandwidth resources is less than the number of audio and video streams
  • the priority corresponding to the window can be used to represent specific business requirements or user preferences.
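The priority-driven adjustment described above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the `Window` class, the bitrate ladder per window, the convention that a larger `priority` value means a more important window, and the rule of exhausting lower definitions before unsubscribing are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Window:
    name: str
    priority: int                 # assumed: larger value = more important window
    bitrates_kbps: List[int]      # assumed definition ladder, highest definition first
    level: Optional[int] = 0      # index into bitrates_kbps; None = unsubscribed


def demand(windows: List[Window]) -> int:
    """Total bandwidth currently required by all subscribed windows."""
    return sum(w.bitrates_kbps[w.level] for w in windows if w.level is not None)


def adjust_subscriptions(windows: List[Window],
                         total_bandwidth_kbps: int) -> List[Tuple[str, str]]:
    """Degrade the lowest-priority windows first until demand fits the budget."""
    actions = []
    for w in sorted(windows, key=lambda win: win.priority):
        while demand(windows) > total_bandwidth_kbps and w.level is not None:
            if w.level + 1 < len(w.bitrates_kbps):
                w.level += 1                      # subscribe to a lower definition
                actions.append((w.name, "reduce_definition"))
            else:
                w.level = None                    # already at lowest definition
                actions.append((w.name, "unsubscribe"))
    return actions
```

In this sketch, a high-priority window is only touched after every lower-priority window has been reduced and then unsubscribed, which mirrors the stated goal of protecting the fluency and/or clarity of high-priority videos.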
  • the above-mentioned second device receiving multiple audio and video streams respectively from multiple first devices includes: the second device receives the multiple audio and video streams, respectively from the multiple first devices, as forwarded by a third device.
  • the multi-window video communication method provided in this application can be applied to a network architecture where a third device forwards audio and video streams, so as to improve the applicability of the method provided in this application and compatibility with different network architectures.
  • the above method further includes: the above-mentioned second device receives multiple bandwidth prediction results for multiple links from the third device; and the second device determines, according to the multiple bandwidth prediction results, that the bandwidth resources are insufficient to meet the bandwidth requirements of the multiple audio and video streams.
  • the above multiple links correspond to the above multiple audio and video streams.
  • the above multiple links are respectively used to transmit the above multiple audio and video streams.
  • the third device may perform bandwidth prediction.
  • the third device may perform bandwidth prediction when forwarding audio and video streams to the second device.
  • the above method further includes: the second device measures and obtains multiple bandwidth prediction results for multiple links; and the second device determines, according to the multiple bandwidth prediction results, that the bandwidth resources are insufficient to meet the bandwidth requirements of the above multiple audio and video streams.
  • the above multiple links correspond to the above multiple audio and video streams.
  • the above multiple links are respectively used to transmit the above multiple audio and video streams.
  • the second device can perform bandwidth prediction, and through this solution, the compatibility of the method provided by the present application with different network architectures can be improved.
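Whether the per-link predictions come from the third device or are measured by the second device, the insufficiency check can be sketched in the same way. The following Python sketch assumes, for illustration only, that the total bandwidth prediction value is the sum of the per-link predictions and that the bandwidth requirement is the sum of the bitrates of the subscribed streams; the function names and the dictionary layout are hypothetical.

```python
from typing import Dict


def total_predicted_bandwidth(link_predictions_kbps: Dict[str, int]) -> int:
    """Assumed aggregation: sum the bandwidth predicted for each link."""
    return sum(link_predictions_kbps.values())


def bandwidth_is_insufficient(link_predictions_kbps: Dict[str, int],
                              stream_requirements_kbps: Dict[str, int]) -> bool:
    """True when the predicted total cannot carry all subscribed streams."""
    required = sum(stream_requirements_kbps.values())
    return total_predicted_bandwidth(link_predictions_kbps) < required
```

The check is independent of where the predictions originate, which is consistent with the text's point that the method is compatible with different network architectures.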
  • the first window plays the corresponding audio and video stream at a first definition, the first window being one of the above-mentioned multiple windows; the second device adjusting the subscription strategy for the audio and video streams in one or more windows includes: the second device subscribes to an audio and video stream of a second definition for the first window, where the second definition is lower than the first definition.
  • when the receiving end is in a weak network environment, it can adjust the subscription policy to reduce the video definition for low-priority windows, avoiding network congestion while ensuring the fluency and/or clarity of high-priority videos.
  • the above method further includes: when the first preset condition is met, the second device subscribes to the audio and video stream of the first definition for the first window.
  • the second device may determine whether the first preset condition is satisfied by monitoring the predicted value of the total bandwidth.
  • the first preset condition may be that the latest total bandwidth prediction value is sufficient for the first window to resume playing the audio and video stream at the first definition.
  • the receiving end can adjust the subscription policy according to the real-time bandwidth situation. For example, when the bandwidth requirement for restoring the definition is met, the definition of the degraded video is restored, so as to ensure the smoothness and/or definition of the video to the greatest extent.
  • the above method further includes: when the first preset condition is met for a preset time period, the second device subscribes to the audio and video stream of the first definition for the first window.
  • the second device may determine whether the first preset condition is satisfied by monitoring the predicted value of the total bandwidth.
  • the first preset condition may be that the latest total bandwidth prediction value is sufficient for the first window to resume playing the audio and video stream at the first definition.
  • the receiving end can adjust the subscription policy according to the real-time bandwidth situation. For example, when the bandwidth requirement for restoring the definition is met, the definition of the degraded video is restored, so as to ensure the smoothness and/or definition of the video to the greatest extent.
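The two restoration variants above (restore as soon as the first preset condition is met, or only after it has been met for a preset time period) can be sketched with one small helper. This is a minimal Python sketch under stated assumptions: the class name `RestoreGate`, the use of caller-supplied timestamps in seconds, and the comparison of the total prediction against a single required bitrate are illustrative choices, not details from the source.

```python
from typing import Optional


class RestoreGate:
    """Signals when a degraded window may return to its first definition."""

    def __init__(self, hold_seconds: float):
        # hold_seconds = 0 models the variant that restores immediately.
        self.hold_seconds = hold_seconds
        self._met_since: Optional[float] = None

    def update(self, now: float, total_predicted_kbps: int,
               required_kbps: int) -> bool:
        """Feed the latest total bandwidth prediction; return True to restore."""
        if total_predicted_kbps < required_kbps:
            self._met_since = None          # condition broken, restart the timer
            return False
        if self._met_since is None:
            self._met_since = now           # condition has just become true
        return now - self._met_since >= self.hold_seconds
```

Requiring the condition to hold for a period is a common way to avoid flapping between definitions when the prediction oscillates around the threshold; the source does not state this rationale, so it is offered only as a plausible reading.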
  • the second window plays audio and video streams with a second definition, the second definition is less than or equal to a preset value, and the second window is a window with the lowest priority among the plurality of windows;
  • the second device adjusts the subscription strategy for the audio and video streams in one or more windows according to the priorities corresponding to the multiple windows, including: the second device unsubscribes from the video stream corresponding to the second window.
  • when the receiving end is in a weak network environment, it can adjust the subscription policy by unsubscribing from the streams of low-priority windows, avoiding network congestion while ensuring the fluency and/or clarity of high-priority videos.
  • the above method further includes: the second device displays a mask on the second window.
  • the above method further includes: when a second preset condition is met, the second device resumes subscribing to the video stream of the second definition for the second window.
  • the second device may determine whether the second preset condition is met by monitoring the predicted value of the total bandwidth.
  • the second preset condition may be that the latest total bandwidth prediction value is sufficient for the second window to resume playing the audio and video stream at the second definition.
  • the receiving end can adjust the subscription policy according to the real-time bandwidth situation. For example, when the bandwidth requirement for resuming subscription is met, the subscription to the audio and video stream corresponding to the unsubscribed window is resumed, so as to ensure the fluency and/or clarity of the video to the greatest extent.
  • the above method further includes: when the second preset condition is met for a preset time period, the second device resumes subscribing to the video stream of the second definition for the second window.
  • the second device may determine whether the second preset condition is met by monitoring the predicted value of the total bandwidth.
  • the second preset condition may be that the latest total bandwidth prediction value is sufficient for the second window to resume playing the audio and video stream at the second definition.
  • the receiving end can adjust the subscription policy according to the real-time bandwidth situation. For example, when the bandwidth requirement for resuming subscription is met, the subscription to the audio and video stream corresponding to the unsubscribed window is resumed, so as to ensure the fluency and/or clarity of the video to the greatest extent.
  • the priorities corresponding to the multiple windows are determined by the second device according to one or more of the following: the initial volume of the audio corresponding to the multiple windows; the playback volume of the audio corresponding to the multiple windows; the functions of the services in the multiple windows.
  • the initial volume of the audio is used to represent the original volume of the audio stream when the second device receives the audio stream.
  • various window priority settings can be supported.
  • the second device may determine the priorities corresponding to the multiple windows according to the volume of the initial/playing audio corresponding to the multiple windows and/or the functions of the services in the multiple windows.
  • the diversified window priority setting can facilitate diversified operations of the user and improve user experience.
  • the priorities corresponding to the foregoing multiple windows are determined by the second device according to user-defined specified operations.
  • various window priority settings can be supported. For example, it can be user-defined.
  • the diversified window priority setting can facilitate diversified operations of the user and improve user experience.
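The priority sources listed above (audio volume, the function of the service in a window, and user-defined operations) can be combined in many ways; the source does not fix a formula. The Python sketch below shows one hypothetical combination: windows pinned by the user come first, and the rest are ordered by an assumed `function_weight` and then by `playback_volume`. All field names and the ordering rule are illustrative assumptions.

```python
from typing import Dict, List, Optional


def rank_windows(windows: Dict[str, Dict[str, float]],
                 user_order: Optional[List[str]] = None) -> List[str]:
    """Return window names ordered from highest to lowest priority."""

    def automatic_key(name: str):
        info = windows[name]
        # Assumed rule: service function dominates, volume breaks ties.
        return (info["function_weight"], info["playback_volume"])

    # User-defined order takes precedence for the windows it names.
    pinned = [n for n in dict.fromkeys(user_order or []) if n in windows]
    remaining = sorted((n for n in windows if n not in pinned),
                       key=automatic_key, reverse=True)
    return pinned + remaining
```

For example, with hypothetical windows `teacher` (weight 2), `student_a` (weight 1, louder) and `student_b` (weight 1, quieter), the automatic order is `teacher`, `student_a`, `student_b`, while a user pin on `student_b` moves that window to the front.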
  • the above-mentioned second device receiving multiple audio and video streams from multiple first devices forwarded by the third device includes: the second device receives the first audio and video stream from the first cloud device; and the second device receives the second audio and video stream and the third audio and video stream from the second cloud device.
  • the multi-window video communication method provided by this application can be applied to a network architecture in which a distributed cloud device (that is, a third device) forwards audio and video streams, so as to improve the applicability of the method provided by this application and its compatibility with different network architectures.
  • the foregoing third device is a selective forwarding unit (SFU).
  • an electronic device, such as a second device, is provided. The electronic device includes: a transceiver unit, configured to receive a plurality of audio and video streams from a plurality of first devices, wherein the audio and video corresponding to the plurality of audio and video streams are respectively played in multiple windows on the interface of the second device; a display unit, configured to play the multiple audio and video streams through the multiple windows; and a processing unit, configured to determine that bandwidth resources are not enough to satisfy the bandwidth requirements of the multiple audio and video streams, and to adjust the subscription strategy for the audio and video streams in one or more windows according to the priorities corresponding to the multiple windows.
  • the adjustment of the subscription policy by the second device may include, but is not limited to, one or more of unsubscribing, resuming subscription, delaying subscription, reducing clarity, or increasing clarity.
  • the bandwidth resources of the receiving end are insufficient to meet the bandwidth requirements of multiple audio and video streams (i.e. a weak network environment), for example, when the total bandwidth prediction value used to characterize the bandwidth resources is less than the number of audio and video streams
  • the priority corresponding to the window may be used to represent a specific service requirement or a user's preference.
  • the above-mentioned transceiving unit is specifically configured to: receive multiple audio and video streams respectively from multiple first devices forwarded by the third device.
  • the multi-window video communication method provided in this application can be applied to a network architecture where a third device forwards audio and video streams, so as to improve the applicability of the method provided in this application and compatibility with different network architectures.
  • the transceiver unit is further configured to: receive multiple bandwidth prediction results for multiple links from the third device; the processing unit is further configured to: determine, according to the multiple bandwidth prediction results, that the bandwidth resources are insufficient to meet the bandwidth requirements of the multiple audio and video streams.
  • the above multiple links correspond to the above multiple audio and video streams.
  • the above multiple links are respectively used to transmit the above multiple audio and video streams.
  • the third device may perform bandwidth prediction.
  • the third device may perform bandwidth prediction when forwarding audio and video streams to the second device.
  • the above processing unit is further configured to: measure and obtain multiple bandwidth prediction results for multiple links; and determine, according to the multiple bandwidth prediction results, that bandwidth resources are insufficient to meet the bandwidth requirements of the above multiple audio and video streams.
  • the above multiple links correspond to the above multiple audio and video streams.
  • the above multiple links are respectively used to transmit the above multiple audio and video streams.
  • the second device can perform bandwidth prediction, and through this solution, the compatibility of the method provided by the present application with different network architectures can be improved.
  • the first window plays the corresponding audio and video stream at the first definition; the first window is among the above-mentioned multiple windows; the above-mentioned processing unit is specifically configured to: subscribe the first window to the second definition audio and video stream; the second definition is smaller than the first definition.
  • when the receiving end is in a weak network environment, it can adjust the subscription policy to reduce the video definition for low-priority windows, avoiding network congestion while ensuring the fluency and/or clarity of high-priority videos.
  • the above-mentioned processing unit is further configured to, when the first preset condition is satisfied, subscribe the first window to the audio-video stream of the first definition.
  • the processing unit may determine whether the first preset condition is met by monitoring the predicted value of the total bandwidth.
  • the first preset condition may be that the latest total bandwidth prediction value is sufficient for the first window to resume playing the audio and video stream at the first definition.
  • the receiving end can adjust the subscription policy according to the real-time bandwidth situation. For example, when the bandwidth requirement for restoring the definition is met, the definition of the degraded video is restored, so as to ensure the smoothness and/or definition of the video to the greatest extent.
  • the above-mentioned processing unit is further configured to: when the first preset condition is met for a preset time period, subscribe the first window to the audio and video stream of the first definition.
  • the processing unit may determine whether the first preset condition is met by monitoring the total bandwidth prediction value.
  • the first preset condition may be that the latest total bandwidth prediction value is sufficient for the first window to resume playing the audio and video stream at the first definition.
  • the receiving end can adjust the subscription policy according to the real-time bandwidth situation. For example, when the bandwidth requirement for restoring the definition is met, the definition of the degraded video is restored, so as to ensure the smoothness and/or definition of the video to the greatest extent.
  • the second window plays audio and video streams with a second definition, the second definition is less than or equal to a preset value, and the second window is a window with the lowest priority among the plurality of windows;
  • the above processing unit is specifically configured to: unsubscribe from the video stream corresponding to the second window.
  • when the receiving end is in a weak network environment, it can adjust the subscription policy by unsubscribing from the streams of low-priority windows, avoiding network congestion while ensuring the fluency and/or clarity of high-priority videos.
  • the display unit is further configured to: display a mask on the second window.
  • the above-mentioned processing unit is further configured to: resume subscribing to the video stream of the second definition for the second window when the second preset condition is satisfied.
  • the processing unit may determine whether the second preset condition is satisfied by monitoring the total bandwidth prediction value.
  • the second preset condition may be that the latest total bandwidth prediction value is sufficient for the second window to resume playing the audio and video stream at the second definition.
  • the receiving end can adjust the subscription policy according to the real-time bandwidth situation. For example, when the bandwidth requirement for resuming subscription is met, the subscription to the audio and video stream corresponding to the unsubscribed window is resumed, so as to ensure the fluency and/or clarity of the video to the greatest extent.
  • the processing unit is further configured to: when the second preset condition is met for a preset time period, resume subscribing to the video stream of the second definition for the second window.
  • the processing unit may determine whether the second preset condition is satisfied by monitoring the total bandwidth prediction value.
  • the second preset condition may be that the latest total bandwidth prediction value is sufficient for the second window to resume playing the audio and video stream at the second definition.
  • the receiving end can adjust the subscription policy according to the real-time bandwidth situation. For example, when the bandwidth requirement for resuming subscription is met, the subscription to the audio and video stream corresponding to the unsubscribed window is resumed, so as to ensure the fluency and/or clarity of the video to the greatest extent.
  • the above processing unit is further configured to determine the priorities corresponding to multiple windows according to one or more of the following: the initial volume of the audio corresponding to the multiple windows; the playback volume of the audio corresponding to the multiple windows; the functions of the services in the multiple windows.
  • the initial volume of the audio is used to represent the original volume of the audio stream when the second device receives the audio stream.
  • various window priority settings can be supported.
  • the second device may determine the priorities corresponding to the multiple windows according to the volume of the initial/playing audio corresponding to the multiple windows and/or the functions of the services in the multiple windows.
  • the diversified window priority setting can facilitate diversified operations of the user and improve user experience.
  • the above processing unit is further configured to determine priorities corresponding to multiple windows according to user-defined specified operations.
  • various window priority settings can be supported. For example, it can be user-defined.
  • the diversified window priority setting can facilitate diversified operations of the user and improve user experience.
  • the above-mentioned transceiving unit is specifically configured to: receive the first audio-video stream from the first cloud device; and receive the second audio-video stream and the third audio-video stream from the second cloud device.
  • the multi-window video communication method provided by this application can be applied to a network architecture in which a distributed cloud device (that is, a third device) forwards audio and video streams, so as to improve the applicability of the method provided by this application and its compatibility with different network architectures.
  • the foregoing third device is an SFU.
  • an electronic device, such as a second device, is provided. The electronic device includes: a memory for storing a computer program; a transceiver for receiving or sending a radio signal; a display for displaying an interface; and a processor, configured to execute the computer program, so that the electronic device: receives multiple audio and video streams from multiple first devices through the transceiver, wherein the audio and video corresponding to the multiple audio and video streams are respectively played in multiple windows on the interface of the second device; determines that the bandwidth resources are not enough to meet the bandwidth requirements of the multiple audio and video streams; and adjusts the subscription strategy for the audio and video streams in one or more windows according to the priorities corresponding to the multiple windows.
  • the adjustment of the subscription policy by the second device may include, but is not limited to, one or more of unsubscribing, resuming subscription, delaying subscription, reducing clarity, or increasing clarity.
  • the bandwidth resources at the receiving end are insufficient to meet the bandwidth requirements of multiple audio and video streams (i.e. a weak network environment), for example, when the total bandwidth prediction value used to characterize the bandwidth resources is less than the number of audio and video streams
  • the priority corresponding to the window may be used to represent a specific service requirement or a user's preference.
  • the foregoing transceiver is specifically configured to: receive multiple audio and video streams respectively from multiple first devices forwarded by the third device.
  • the multi-window video communication method provided in this application can be applied to a network architecture where a third device forwards audio and video streams, so as to improve the applicability of the method provided in this application and compatibility with different network architectures.
  • the transceiver is further configured to: receive multiple bandwidth prediction results for multiple links from the third device; the processor is further configured to: determine, according to the multiple bandwidth prediction results, that the bandwidth resources are insufficient to meet the bandwidth requirements of the multiple audio and video streams.
  • the above multiple links correspond to the above multiple audio and video streams.
  • the above multiple links are respectively used to transmit the above multiple audio and video streams.
  • the third device may perform bandwidth prediction.
  • the third device may perform bandwidth prediction when forwarding audio and video streams to the second device.
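  • the above-mentioned processor is further configured to: measure and obtain multiple bandwidth prediction results for multiple links; and determine, according to the multiple bandwidth prediction results, that bandwidth resources are insufficient to meet the bandwidth requirements of the above multiple audio and video streams.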
  • the above multiple links correspond to the above multiple audio and video streams.
  • the above multiple links are respectively used to transmit the above multiple audio and video streams.
  • the second device can perform bandwidth prediction, and through this solution, the compatibility of the method provided by the present application with different network architectures can be improved.
  • the first window plays the corresponding audio and video stream at the first definition; the first window is among the plurality of windows; the processor is specifically configured to: subscribe the first window to the second definition audio and video stream; the second definition is smaller than the first definition.
  • when the receiving end is in a weak network environment, it can adjust the subscription policy to reduce the video definition for low-priority windows, avoiding network congestion while ensuring the fluency and/or clarity of high-priority videos.
  • the above-mentioned processor is further configured to: when the first preset condition is satisfied, subscribe the first window to the audio-video stream of the first definition.
  • the processor may monitor the total bandwidth prediction value to determine whether the first preset condition is met.
  • the first preset condition may be that the latest total bandwidth prediction value is sufficient for the first window to resume playing the audio and video stream at the first definition.
  • the receiving end can adjust the subscription policy according to the real-time bandwidth situation. For example, when the bandwidth requirement for restoring the definition is met, the definition of the degraded video is restored, so as to ensure the smoothness and/or definition of the video to the greatest extent.
  • the above-mentioned processor is further configured to: when the first preset condition is met for a preset time period, subscribe the first window to the audio and video stream of the first definition.
  • the processor may monitor the total bandwidth prediction value to determine whether the first preset condition is met.
  • the first preset condition may be that the latest total bandwidth prediction value is sufficient for the first window to resume playing the audio and video stream at the first definition.
  • the receiving end can adjust the subscription policy according to the real-time bandwidth situation. For example, when the bandwidth requirement for restoring the definition is met, the definition of the degraded video is restored, so as to ensure the smoothness and/or definition of the video to the greatest extent.
  • the second window plays audio and video streams with a second definition, the second definition is less than or equal to a preset value, and the second window is a window with the lowest priority among the plurality of windows;
  • the above processor is specifically configured to: unsubscribe from the video stream corresponding to the second window.
  • when the receiving end is in a weak network environment, it can adjust the subscription policy by unsubscribing from the streams of low-priority windows, avoiding network congestion while ensuring the fluency and/or clarity of high-priority videos.
  • the display is further configured to display a mask on the second window.
  • the above-mentioned processor is further configured to: resume subscribing to the video stream of the second definition for the second window when the second preset condition is met.
  • the processor may monitor the total bandwidth prediction value to determine whether the second preset condition is met.
  • the second preset condition may be that the latest total bandwidth prediction value is sufficient for the second window to resume playing the audio and video stream at the second definition.
  • the receiving end can adjust the subscription policy according to the real-time bandwidth situation. For example, when the bandwidth requirement for resuming subscription is met, the subscription to the audio and video stream corresponding to the unsubscribed window is resumed, so as to ensure the fluency and/or clarity of the video to the greatest extent.
  • the processor is further configured to: resume subscribing to the video stream of the second definition for the second window when the second preset condition has been met for a preset time period.
  • the processor may monitor the total bandwidth prediction value to determine whether the second preset condition is met.
  • the second preset condition may be that the latest total bandwidth prediction value is sufficient for the second window to resume playing the audio and video stream at the second definition.
  • the receiving end can adjust the subscription policy according to the real-time bandwidth situation. For example, when the bandwidth requirement for resuming subscription is met, the subscription to the audio and video stream corresponding to the unsubscribed window is resumed, so as to ensure the fluency and/or clarity of the video to the greatest extent.
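  • The requirement that a preset condition be "met for a preset time period" can be read as a debounce on the bandwidth check, so that a brief bandwidth spike does not trigger a resume. The following is a minimal sketch under that reading; the class name `SustainedCondition`, the `hold_seconds` parameter, and the caller-supplied timestamps are illustrative assumptions and are not defined in this application.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SustainedCondition:
    """Reports True only after a condition has held for hold_seconds (hypothetical helper)."""

    hold_seconds: float
    _since: Optional[float] = field(default=None, init=False, repr=False)

    def update(self, condition_met: bool, now: float) -> bool:
        # Any sample where the condition fails restarts the waiting period.
        if not condition_met:
            self._since = None
            return False
        # Remember the first moment at which the condition became true.
        if self._since is None:
            self._since = now
        return now - self._since >= self.hold_seconds


# Example: resume only after the bandwidth check has passed for 3 seconds.
resume_check = SustainedCondition(hold_seconds=3.0)
samples = [(0.0, True), (2.0, True), (2.5, False), (3.0, True), (6.0, True)]
results = [resume_check.update(ok, t) for t, ok in samples]
```

  • In this sketch the dip at 2.5 s resets the timer, so the condition is reported as sustained only at 6.0 s.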
  • the above-mentioned processor is further configured to determine the priorities corresponding to the multiple windows according to one or more of the following: the initial volume of the audio corresponding to the multiple windows; the playback volume of the audio corresponding to the multiple windows; the functions of the services in the multiple windows.
  • the initial volume of the audio is used to represent the original volume of the audio stream when the second device receives the audio stream.
  • various window priority settings can be supported.
  • the second device may determine the priorities corresponding to the multiple windows according to the volume of the initial/playing audio corresponding to the multiple windows and/or the functions of the services in the multiple windows.
  • the diversified window priority setting can facilitate diversified operations of the user and improve user experience.
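  • As an illustration of deriving window priorities from service function and audio volume, the sketch below ranks windows by a function weight first and by playback volume second. The weights in `FUNCTION_WEIGHT`, the `Window` fields, and the rule that a louder window ranks higher are assumptions made for this example; this application does not prescribe a specific ranking formula.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical weights: a larger value means a more important service function.
FUNCTION_WEIGHT = {"whiteboard": 2, "speaker": 1, "participant": 0}


@dataclass(frozen=True)
class Window:
    name: str
    function: str           # service shown in the window (assumed labels)
    playback_volume: float  # 0.0 (muted) to 1.0 (maximum)


def rank_windows(windows: List[Window]) -> List[str]:
    """Return window names ordered from highest to lowest priority."""
    ordered = sorted(
        windows,
        key=lambda w: (FUNCTION_WEIGHT.get(w.function, 0), w.playback_volume),
        reverse=True,
    )
    return [w.name for w in ordered]


windows = [
    Window("A", "participant", 0.9),
    Window("B", "whiteboard", 0.1),
    Window("C", "participant", 0.4),
]
ranking = rank_windows(windows)
```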
  • the above-mentioned processor is further configured to: determine priorities corresponding to multiple windows according to user-defined specified operations.
  • various window priority settings can be supported. For example, it can be user-defined.
  • the diversified window priority setting can facilitate diversified operations of the user and improve user experience.
  • the above-mentioned transceiver is specifically configured to: receive the first audio-video stream from the first cloud device; and receive the second audio-video stream and the third audio-video stream from the second cloud device.
  • the multi-window video communication method provided by this application can be applied to a network architecture in which distributed cloud devices (that is, third devices) forward audio and video streams, so as to improve the applicability of the method provided by this application and its compatibility with different network architectures.
  • the foregoing third device is an SFU.
  • a multi-window video communication method is provided, which is applied in the process of a video call between multiple first devices and a second device in a communication system. The method includes: the multiple first devices send multiple audio and video streams to the second device, where the audio and video corresponding to the multiple audio and video streams are respectively played in multiple windows on the interface of the second device; the second device determines that the bandwidth resources are insufficient to meet the bandwidth requirements of the multiple audio and video streams; and the second device adjusts the subscription strategy for the audio and video streams in one or more windows according to the priorities corresponding to the multiple windows.
  • the adjustment of the subscription policy by the second device may include, but is not limited to, one or more of unsubscribing, resuming subscription, delaying subscription, reducing definition, or increasing definition.
  • the bandwidth resources at the receiving end are insufficient to meet the bandwidth requirements of multiple audio and video streams (i.e. a weak network environment), for example, when the total bandwidth prediction value used to characterize the bandwidth resources is less than the number of audio and video streams
  • the priority corresponding to the window may be used to represent a specific service requirement or a user's preference.
  • the above-mentioned communication system further includes: one or more third devices, configured to receive the above-mentioned multiple audio and video streams from the above-mentioned multiple first devices, and forward the above-mentioned multiple audio and video streams to the second device.
  • the multi-window video communication method provided in this application can be applied to a network architecture where a third device forwards audio and video streams, so as to improve the applicability of the method provided in this application and compatibility with different network architectures.
  • the third device is further configured to: measure and obtain multiple bandwidth prediction results for multiple links during the process of forwarding the multiple audio and video streams to the second device;
  • the second device is specifically configured to: determine, according to the multiple bandwidth prediction results, that bandwidth resources are insufficient to meet the bandwidth requirements of the multiple audio and video streams.
  • the above multiple links correspond to the above multiple audio and video streams.
  • the above multiple links are respectively used to transmit the above multiple audio and video streams.
  • the third device may perform bandwidth prediction.
  • the third device may perform bandwidth prediction when forwarding audio and video streams to the second device.
  • the second device is further configured to: during the process of receiving the above multiple audio and video streams, perform measurement to obtain multiple bandwidth prediction results for multiple links; the second device is specifically configured to: determine, according to the multiple bandwidth prediction results, that the bandwidth resources are insufficient to meet the bandwidth requirements of the multiple audio and video streams.
  • the above multiple links correspond to the above multiple audio and video streams.
  • the above multiple links are respectively used to transmit the above multiple audio and video streams.
  • the second device can perform bandwidth prediction, and through this solution, the compatibility of the method provided by the present application with different network architectures can be improved.
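  • One way to picture the weak-network determination is to compare a total predicted bandwidth against the total bandwidth required by the subscribed streams. The sketch below assumes that the total prediction is the sum of the per-link predictions and that the requirement is the sum of the per-stream requirements; both the aggregation rule and the function name `is_weak_network` are assumptions for illustration only.

```python
from typing import Dict


def is_weak_network(link_predictions_kbps: Dict[str, float],
                    required_kbps: Dict[str, float]) -> bool:
    """Return True when predicted bandwidth cannot cover the required bandwidth.

    Assumption: each link carries one audio and video stream, and the total
    prediction is the plain sum of the per-link predictions.
    """
    total_predicted = sum(link_predictions_kbps.values())
    total_required = sum(required_kbps.values())
    return total_predicted < total_required


required = {"large": 700, "small1": 500, "small2": 500}
weak = is_weak_network({"large": 600, "small1": 400, "small2": 400}, required)
healthy = is_weak_network({"large": 800, "small1": 500, "small2": 500}, required)
```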
  • the above-mentioned second device is further configured to: play the corresponding audio and video stream at the first definition through the first window, where the first window is one of the above-mentioned multiple windows; the second device adjusting the subscription strategy for the audio and video streams in one or more windows according to the priorities corresponding to the multiple windows includes: the second device subscribes to the audio and video stream of the second definition for the first window, where the second definition is lower than the first definition.
  • when the receiving end is in a weak network environment, it can adjust the subscription policy by reducing the video definition for low-priority windows, which avoids network congestion while ensuring the fluency and/or clarity of high-priority videos.
  • the second device is further configured to: subscribe to the audio and video stream of the first definition for the first window when the first preset condition is met.
  • the second device may determine whether the first preset condition is satisfied by monitoring the predicted value of the total bandwidth.
  • the first preset condition may be that the latest total bandwidth prediction value is sufficient for the first window to resume playing the audio and video stream at the first definition.
  • the receiving end can adjust the subscription policy according to the real-time bandwidth situation. For example, when the bandwidth requirement for restoring the definition is met, the definition of the degraded video is restored, so as to ensure the smoothness and/or definition of the video to the greatest extent.
  • the second device is further configured to: subscribe to the audio and video stream of the first definition for the first window when the first preset condition has been met for a preset time period.
  • the second device may determine whether the first preset condition is satisfied by monitoring the predicted value of the total bandwidth.
  • the first preset condition may be that the latest total bandwidth prediction value is sufficient for the first window to resume playing the audio and video stream at the first definition.
  • the receiving end can adjust the subscription policy according to the real-time bandwidth situation. For example, when the bandwidth requirement for restoring the definition is met, the definition of the degraded video is restored, so as to ensure the smoothness and/or definition of the video to the greatest extent.
  • the second device is further configured to: play an audio and video stream at a second definition through a second window, where the second definition is less than or equal to a preset value, and the second window is the window with the lowest priority among the above-mentioned multiple windows; the second device adjusting the subscription strategy for the audio and video streams in one or more windows according to the priorities corresponding to the multiple windows includes: the second device unsubscribes from the video stream corresponding to the second window.
  • when the receiving end is in a weak network environment, it can adjust the subscription policy by unsubscribing from the video of low-priority windows, which avoids network congestion while ensuring the fluency and/or clarity of high-priority videos.
  • the second device is further configured to: display a mask on the second window.
  • by displaying the mask, users can be reminded that they are currently in a weak network environment, which improves user experience.
  • the second device is further configured to: resume subscribing to the video stream of the second definition for the second window when the second preset condition is met.
  • For example, the second device may determine whether the second preset condition is met by monitoring the predicted value of the total bandwidth.
  • the second preset condition may be that the latest total bandwidth prediction value is sufficient for the second window to resume playing the audio and video stream at the second definition.
  • the receiving end can adjust the subscription policy according to the real-time bandwidth situation. For example, when the bandwidth requirement for resuming subscription is met, the subscription to the audio and video stream corresponding to the unsubscribed window is resumed, so as to ensure the fluency and/or clarity of the video to the greatest extent.
  • the second device is further configured to: resume subscribing to the video stream of the second definition for the second window when the second preset condition has been met for a preset time period.
  • the second device may determine whether the second preset condition is met by monitoring the predicted value of the total bandwidth.
  • the second preset condition may be that the latest total bandwidth prediction value is sufficient for the second window to resume playing the audio and video stream at the second definition.
  • the receiving end can adjust the subscription policy according to the real-time bandwidth situation. For example, when the bandwidth requirement for resuming subscription is met, the subscription to the audio and video stream corresponding to the unsubscribed window is resumed, so as to ensure the fluency and/or clarity of the video to the greatest extent.
  • the second device is further configured to determine the priorities corresponding to the multiple windows according to one or more of the following: the initial volume of the audio corresponding to the multiple windows; the playback volume of the audio corresponding to the multiple windows; the functions of the services in the multiple windows.
  • the initial volume of the audio is used to represent the original volume of the audio stream when the second device receives the audio stream.
  • various window priority settings can be supported.
  • the second device may determine the priorities corresponding to the multiple windows according to the volume of the initial/playing audio corresponding to the multiple windows and/or the functions of the services in the multiple windows.
  • the diversified window priority setting can facilitate diversified operations of the user and improve user experience.
  • the above-mentioned second device is further configured to: determine priorities corresponding to multiple windows according to user-defined specified operations.
  • various window priority settings can be supported. For example, it can be user-defined.
  • the diversified window priority setting can facilitate diversified operations of the user and improve user experience.
  • the above-mentioned one or more third devices include a first cloud device and a second cloud device, wherein the first cloud device is used to forward the first audio and video stream to the second device, and the second cloud device The device is configured to forward the second audio-video stream and the third audio-video stream to the second device.
  • the multi-window video communication method provided by this application can be applied to a network architecture in which distributed cloud devices (that is, third devices) forward audio and video streams, so as to improve the applicability of the method provided by this application and its compatibility with different network architectures.
  • the foregoing third device is an SFU.
  • in a fifth aspect, a communication system is provided. The communication system includes: a plurality of first devices and the electronic device in any possible implementation manner of the second aspect or the third aspect.
  • the above-mentioned multiple first devices send audio and video streams to the second device for playing in multiple windows on the interface of the second device respectively.
  • the communication system is used to implement the method in any possible implementation manner of the fourth aspect.
  • the foregoing communication system further includes: one or more third devices, configured to forward audio and video streams from the plurality of first devices to an electronic device.
  • a computer-readable storage medium is provided.
  • Computer program code is stored on the computer-readable storage medium.
  • when the computer program code is executed by a processor, the processor implements the method in any possible implementation manner of the first aspect.
  • in a seventh aspect, a chip system is provided. The chip system includes a processor and a memory, and computer program code is stored in the memory; when the computer program code is executed by the processor, the processor implements the method in any possible implementation manner of the first aspect.
  • the chip system may consist of chips, or may include chips and other discrete devices.
  • a computer program product comprising computer instructions.
  • when the computer instructions are run on a computer, the computer is made to implement the method in any possible implementation manner of the first aspect.
  • FIG. 1 is an exemplary diagram of a selective forwarding unit (SFU) forwarding scheme provided by an embodiment of the present application;
  • FIG. 2 is a video communication interaction diagram based on SFU frame extraction provided by an embodiment of the present application;
  • FIG. 3 is a schematic diagram of a layered encoding provided by an embodiment of the present application.
  • FIG. 4 is an example diagram of a multi-window video communication scenario provided by an embodiment of the present application.
  • FIG. 5A is a schematic diagram of a hardware structure of an electronic device provided by an embodiment of the present application.
  • FIG. 5B is a software architecture diagram of an electronic device provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of a call service architecture based on real time communication (RTC) provided by an embodiment of the present application;
  • FIG. 7 is a structure diagram of a multi-window video communication provided by an embodiment of the present application.
  • FIG. 8 is a flowchart of a multi-window video communication method provided by an embodiment of the present application.
  • FIG. 9 is an interaction diagram of a multi-window video communication method provided by an embodiment of the present application.
  • FIG. 10 is an example diagram of three types of multi-window display provided by the embodiment of the present application.
  • FIG. 11 is an example diagram of a multi-window display provided by the embodiment of the present application.
  • FIG. 12 is an example diagram of displaying a mask under a weak network, provided by an embodiment of the present application.
  • FIG. 13 is a flow chart of another multi-window video communication method provided by the embodiment of the present application.
  • FIG. 14 is an example diagram of displaying a weak network prompt under a weak network, provided by an embodiment of the present application.
  • FIG. 15 is example diagram 1 of a method for setting the priority corresponding to a window provided in an embodiment of the present application;
  • FIG. 16 is example diagram 2 of a method for setting the priority corresponding to a window provided in an embodiment of the present application;
  • FIG. 17 is a structural block diagram of an electronic device provided by an embodiment of the present application.
  • The terms “first” and “second” are used for descriptive purposes only, and cannot be understood as indicating or implying relative importance or implicitly specifying the quantity of indicated technical features. Thus, a feature defined as “first” or “second” may explicitly or implicitly include one or more of these features. In the description of this embodiment, unless otherwise specified, “plurality” means two or more.
  • An embodiment of the present application provides a multi-window video communication method, which is applied to a process in which multiple users conduct a multi-party real-time video call.
  • a multi-window video communication method provided in this embodiment of the application can be applied to a multi-party conference scenario.
  • user A, user B, and user C join a conference for a multi-party real-time video call, where user A is the conference speaker, and user B and user C are conference participants.
  • the multi-window video communication method provided in the embodiment of the present application may be applied to a group video chat scene.
  • user A, user B, and user C join a group chat for a multi-party real-time video call, wherein during the group chat, user A, user B, and user C all speak freely.
  • the multi-window video communication method provided in the embodiment of the present application may be applied to an online education scenario.
  • user A, user B, and user C join an online class group for a multi-party real-time video call, where user A is a teacher, and user B and user C are students.
  • User A shows courseware/whiteboard to user B and user C while teaching and explaining.
  • the sending end (such as the first device) can forward audio and video streams to the receiving end (such as the second device) through a third device (such as a cloud device).
  • the third device may complete the forwarding of the audio and video stream based on the publishing and subscription relationship of the audio and video stream.
  • the third device supports the end-to-end encryption feature, and does not need to parse the audio and video streams from the sending end.
  • the sending end and the receiving end store keys (such as a public key and a private key), which are not known by the third device, so the third device cannot parse the user's audio and video streams, which is highly secure.
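  • The publish and subscribe forwarding described above can be sketched as a small in-memory relay that treats each payload as opaque bytes, which reflects the point that the forwarding device never needs the keys. The class `ForwardingUnit`, its method names, and the `delivered` log are hypothetical and stand in for real network transport.

```python
from collections import defaultdict
from typing import DefaultDict, List, Set, Tuple


class ForwardingUnit:
    """Toy stand-in for a forwarding device: routes by subscription only."""

    def __init__(self) -> None:
        # stream id -> ids of the receivers subscribed to that stream
        self._subscribers: DefaultDict[str, Set[str]] = defaultdict(set)
        # log of (receiver id, stream id, payload) used instead of a network send
        self.delivered: List[Tuple[str, str, bytes]] = []

    def subscribe(self, receiver_id: str, stream_id: str) -> None:
        self._subscribers[stream_id].add(receiver_id)

    def unsubscribe(self, receiver_id: str, stream_id: str) -> None:
        self._subscribers[stream_id].discard(receiver_id)

    def publish(self, stream_id: str, encrypted_payload: bytes) -> None:
        # The payload is forwarded unchanged; it is never decrypted or parsed here.
        for receiver_id in sorted(self._subscribers.get(stream_id, ())):
            self.delivered.append((receiver_id, stream_id, encrypted_payload))


sfu = ForwardingUnit()
sfu.subscribe("device 4", "stream 3")
sfu.subscribe("device 5", "stream 3")
sfu.publish("stream 3", b"\x01")
sfu.unsubscribe("device 5", "stream 3")
sfu.publish("stream 3", b"\x02")
```

  • After the unsubscribe call, only device 4 receives the second payload, which mirrors how a receiving end can stop the downlink traffic of one window without affecting other receivers.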
  • the sending end may forward the audio and video streams to the receiving end through a selective forwarding unit (SFU).
  • multiple SFUs can be deployed in a distributed manner to improve network scalability.
  • FIG. 1 shows an example diagram of an SFU forwarding solution. As shown in FIG. 1, device 1 and device 2 respectively select SFU 1 and SFU 2 to transmit audio and video streams to device 4, and device 3 selects SFU 3 to send audio and video streams to device 4 and device 5.
  • the sending end can select the optimal SFU for audio and video stream forwarding.
  • the sending end may select the optimal SFU according to network conditions (such as operators, regions, etc.), which is not limited in this application.
  • when forwarding, the third device (such as a cloud device) can also perform bandwidth prediction on the downlink of the forwarded audio and video stream, and send the prediction result to the receiving end, so that the receiving end can use it to make communication decisions when the network is poor.
  • a third device (such as an SFU) can make frame extraction decisions when the network is poor.
  • frame extraction can reduce downlink bandwidth consumption, avoid network congestion, and ensure audio and video fluency and/or clarity in a weak network environment.
  • FIG. 2 takes the bandwidth prediction performed by the SFU through the bandwidth prediction module as an example, and shows a video communication interaction diagram based on SFU frame extraction.
  • the video communication process based on SFU frame extraction mainly includes the following five steps:
  • Step 1: The sending end collects, encodes, and encrypts audio and video data;
  • Step 2: The sending end sends the frame data, frame type, etc. to the SFU;
  • Step 3: The forwarding module of the SFU forwards the frame data to the receiving end, and at the same time, the bandwidth prediction module of the SFU performs downlink bandwidth prediction;
  • Step 4: The SFU makes a frame extraction decision based on the bandwidth prediction value;
  • Step 5: After receiving the frame data, the receiving end performs decryption, decoding, and rendering.
  • the bandwidth prediction performed by the bandwidth prediction module may include: the bandwidth prediction module calculates the downlink bandwidth prediction value according to data such as the delay and packet loss of downlink data transmission.
  • IDR frame compression can achieve a compression ratio of 6:1 without any perceptible blurring. Using P frame compression together with IDR frame compression can achieve a higher compression ratio without noticeable blurring.
  • B frame compression can achieve a compression ratio of 200:1, and its file size is generally 15% of the IDR frame compression size, less than half of the P frame compression size.
  • the IDR frame compression can remove the spatial redundancy of the image
  • the P frame and B frame compression can remove the temporal redundancy.
  • the IDR frame compression adopts full-frame compression encoding, that is, the full-frame image information is subjected to joint photographic experts group (JPEG) compression encoding.
  • the IDR frame describes the details of the image background and the moving subject; therefore, the IDR frame is also called a key frame. Based on this, IDR frames can be decoded and rendered independently: when decoding a video stream, the data of the IDR frame alone is sufficient to reconstruct the complete picture.
  • the IDR frame can be used as a reference for P and B frames.
  • an IDR frame can be used as a reference for a P frame and a B frame after performing intra prediction, residual determination, residual transformation and quantization, variable length coding and arithmetic coding, image reconstruction and filtering, respectively.
  • the residual can be determined by subtracting the predicted value from the pixel value.
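  • The residual relationship stated above (pixel value minus predicted value) and its inverse can be checked on a tiny block. The sketch below is lossless for clarity, which is an assumption made for this example; an actual encoder transforms and quantizes the residual, so reconstruction is normally only approximate. The function names and sample values are illustrative.

```python
from typing import List

Block = List[List[int]]


def residual(pixels: Block, predicted: Block) -> Block:
    """Element-wise residual: actual pixel value minus predicted value."""
    return [
        [p - q for p, q in zip(pixel_row, predicted_row)]
        for pixel_row, predicted_row in zip(pixels, predicted)
    ]


def reconstruct(predicted: Block, res: Block) -> Block:
    """Inverse step on the decoding side: predicted value plus residual."""
    return [
        [q + r for q, r in zip(predicted_row, residual_row)]
        for predicted_row, residual_row in zip(predicted, res)
    ]


pixels = [[52, 55], [61, 59]]      # sample 2x2 block (made-up values)
predicted = [[50, 50], [60, 60]]   # output of some predictor (made-up values)
res = residual(pixels, predicted)
restored = reconstruct(predicted, res)
```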
  • the P frame is a coded frame separated by 1 to 2 frames behind the IDR frame.
  • the P frame belongs to the interframe coding of forward prediction, so it only refers to the IDR frame or P frame closest to it for prediction.
  • the P frame adopts the method of motion compensation to predict the difference and the motion vector between the current frame and the previous nearest IDR frame or P frame.
  • the complete P frame image can be reconstructed only after the prediction value obtained from the IDR frame and the prediction error are summed.
  • a P frame is predicted based on its preceding IDR frame.
  • the P frame can be the reference frame of the P frame behind it, or the reference frame of the B frame before and after it.
  • B-frames are bidirectionally inter-coded.
  • B frames extract data from preceding and following IDR frames or P frames.
  • the B frame is compressed based on the difference between the current frame and the image of the previous frame and the next frame to complete decoding and rendering. For example, the B frame predicts the prediction error and motion vector between the current frame and the preceding IDR frame or between the P frame and the following P frame. As shown in Figure 3, a B frame is predicted based on its preceding IDR frame and P frame, or based on its preceding and following P frames. B frames are not referenced by other frames.
  • because B frames are not referenced by other frames, discarding B frames does not affect the reference relationships of the video frames. Based on this, when the network at the receiving end is poor, whether to extract B frames can be decided according to the bandwidth prediction value and the bandwidth requirements of the various types of encoded frames. Extracting (that is, discarding) B frames reduces the downlink bandwidth load.
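  • Because B frames are not referenced by other frames, frame extraction can be pictured as a simple filter over the frame sequence that leaves every IDR frame and P frame in place. The sketch below illustrates this; the frame sizes are made-up numbers and the helper names are hypothetical.

```python
from typing import List, Tuple

# (frame type, size in bytes); the types follow the text: "IDR", "P", or "B".
Frame = Tuple[str, int]


def extract_b_frames(frames: List[Frame]) -> List[Frame]:
    """Drop B frames; IDR and P frames stay, so reference chains are unaffected."""
    return [frame for frame in frames if frame[0] != "B"]


def total_bytes(frames: List[Frame]) -> int:
    return sum(size for _, size in frames)


# Made-up group of pictures for illustration only.
gop = [("IDR", 6000), ("B", 400), ("B", 400), ("P", 1500), ("B", 400), ("P", 1500)]
kept = extract_b_frames(gop)
saved_bytes = total_bytes(gop) - total_bytes(kept)
```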
  • the bandwidth requirements of various types of coded frames can be obtained by counting the original frame data from the sending end.
  • Table 1 shows an example of bandwidth requirements.
  • the high-definition video is played in the large window, and the resolution of the video in the large window is 540P; the normal definition video is played in the small window, and the resolution of the video in the small window is 360P.
  • the required bandwidth of the video stream played in the large window shown in Figure 4 is 700 kilobits per second (kbps) before frame extraction, and 400 kbps after the B frames are extracted.
  • the video streams played in small window 1 and small window 2 shown in Figure 4 each require a bandwidth of 500 kbps before frame extraction, and 300 kbps after the B frames are extracted.
  • the bandwidth required for audio and video streams is affected by the codec, the bit rate control of the upstream bandwidth, the number of layers of layered coding, and the resolution/frame rate of the video.
  • This example does not limit the specific calculation strategies and specific algorithms for the required bandwidth of audio and video streams.
  • the receiving end decides not to extract frames.
  • the receiving end decides to extract B frames.
  • if the predicted bandwidth value is less than the required bandwidth of the IDR frames plus the required bandwidth of the P frames, then even if the B frames are extracted, the predicted bandwidth still cannot meet the required bandwidth after frame extraction, which results in network congestion. Network congestion increases the delay, resulting in audio and video freezes.
  • the receiving end decides not to extract frames.
  • if the bandwidth prediction value is less than 1000 kbps, the predicted bandwidth cannot meet the minimum required bandwidth of the three windows. In this case, even if B frames are extracted, the predicted bandwidth still cannot meet the required bandwidth after frame extraction. Therefore, after frame extraction there are still audio and video freezes in the large window, small window 1, and small window 2 shown in Figure 4.
  • the receiving end decides to extract B frames for the large window, small window 1, and small window 2 shown in Figure 4, so the smoothness of the video in the large window, small window 1, and small window 2 is reduced.
  • if the predicted bandwidth value is less than 1000 kbps, the receiving end decides to extract B frames for the large window, small window 1, and small window 2 shown in Figure 4, but the predicted bandwidth after frame extraction still cannot meet the minimum required bandwidth of the three windows, so there is network congestion in all three windows. Network congestion increases the delay, resulting in audio and video freezes in the large window, small window 1, and small window 2 shown in Figure 4.
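  • The three outcomes discussed above can be reproduced with the example figures given for Figure 4 (700/400 kbps for the large window and 500/300 kbps for each small window), which sum to 1700 kbps before frame extraction and 1000 kbps after it. The sketch below is one possible reading: the 1000 kbps boundary follows the text, while the 1700 kbps boundary for skipping frame extraction, the function name, and the returned labels are assumptions.

```python
from typing import Dict, Tuple

# window -> (required kbps before frame extraction, required kbps after B frames are extracted)
REQUIRED_KBPS: Dict[str, Tuple[int, int]] = {
    "large window": (700, 400),
    "small window 1": (500, 300),
    "small window 2": (500, 300),
}


def frame_extraction_decision(predicted_kbps: float,
                              required: Dict[str, Tuple[int, int]] = REQUIRED_KBPS) -> str:
    full = sum(before for before, _ in required.values())   # 1700 kbps here
    reduced = sum(after for _, after in required.values())  # 1000 kbps here
    if predicted_kbps >= full:
        return "no_extraction"            # enough bandwidth for every frame
    if predicted_kbps >= reduced:
        return "extract_b_frames"         # fits only once B frames are dropped
    return "congested_even_after_extraction"


decisions = [frame_extraction_decision(kbps) for kbps in (1800, 1200, 900)]
```

  • The last outcome is the case that uniform frame extraction across all windows cannot resolve, since every window is degraded and congestion remains.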
  • an embodiment of the present application provides a multi-window video communication method.
  • the terminal data stream can be adjusted according to the specific business scenarios and demands of the multiple video call users, so as to give priority to ensuring normal playback of high-priority video streams. For example, in the scenario shown in FIG. 4, priority is given to ensuring video fluency and/or clarity in the large window. As another example, in online education scenarios, priority is given to ensuring the video fluency and/or clarity of the courseware/whiteboard.
  • a weak network means that bandwidth resources are insufficient to meet the bandwidth requirements of audio and video streams.
  • bandwidth-limited scenarios, etc.
  • using the method provided by the embodiment of this application to downgrade and subscribe to video streams based on specific priorities refines the granularity of video stream control and avoids network congestion while making full use of the downlink bandwidth, thereby ensuring the smoothness and/or clarity of the video to the greatest extent.
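  • The priority-based downgrade can be sketched as a loop that degrades the lowest-priority window first: its definition is lowered step by step, and its video is unsubscribed only once it is already at the lowest definition. The ladder below reuses the 540P/700 kbps and 360P/500 kbps figures from the example above; the 180P rung, its 200 kbps value, the function name, and the choice to fully degrade one window before touching the next are assumptions for illustration.

```python
from typing import Dict, Optional

# Definition ladder, highest first: (label, required kbps). The last rung is hypothetical.
LADDER = [("540P", 700), ("360P", 500), ("180P", 200)]


def adjust_subscriptions(levels: Dict[str, Optional[int]],
                         priorities: Dict[str, int],
                         predicted_kbps: float) -> Dict[str, Optional[int]]:
    """levels maps window -> index into LADDER, or None when the video is unsubscribed.

    priorities maps window -> number, where a larger number is more important.
    """
    levels = dict(levels)  # do not mutate the caller's state

    def total_required() -> int:
        return sum(LADDER[i][1] for i in levels.values() if i is not None)

    lowest_first = sorted(levels, key=lambda w: priorities[w])
    while total_required() > predicted_kbps:
        subscribed = [w for w in lowest_first if levels[w] is not None]
        if not subscribed:
            break  # nothing left to degrade
        window = subscribed[0]
        if levels[window] == len(LADDER) - 1:
            levels[window] = None   # already at the lowest definition: unsubscribe
        else:
            levels[window] += 1     # step down one definition
    return levels


current = {"large": 0, "small 1": 1, "small 2": 1}
priority = {"large": 3, "small 1": 2, "small 2": 1}
mild = adjust_subscriptions(current, priority, predicted_kbps=1500)
severe = adjust_subscriptions(current, priority, predicted_kbps=1300)
```

  • With 1500 kbps predicted, only small window 2 drops one rung; with 1300 kbps, its video is unsubscribed while the large window keeps its original definition.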
  • the multi-window video communication method provided by the embodiment of the present application can be applied to, but is not limited to, smart phones, netbooks, tablet computers, smart watches, smart bracelets, phone watches, smart cameras, palmtop computers, personal computers (PCs), personal digital assistants (PDAs), portable multimedia players (PMPs), augmented reality (AR)/virtual reality (VR) devices, TVs, projection devices, or somatosensory game consoles in human-computer interaction scenarios.
  • the method may also be applied to electronic devices of other types or structures, which is not limited in this application.
  • FIG. 5A shows a schematic diagram of a hardware structure of an electronic device provided by an embodiment of the present application by taking a smart phone as an example.
  • the electronic device can include a processor 510, a memory (comprising an external memory interface 520 and an internal memory 521), a universal serial bus (universal serial bus, USB) interface 530, a charging management module 540, and a power management module 541 , battery 542, antenna 1, antenna 2, mobile communication module 550, wireless communication module 560, audio module 570, speaker 570A, receiver 570B, microphone 570C, earphone jack 570D, sensor module 580, button 590, motor 591, indicator 592 , a camera 593, a display screen 594, and a subscriber identification module (subscriber identification module, SIM) card interface 595, etc.
  • the sensor module 580 may include a pressure sensor, a gyroscope sensor, an air pressure sensor, a magnetic sensor, an acceleration sensor, a distance sensor, a proximity light sensor, a fingerprint sensor, a temperature sensor, a touch sensor, an ambient light sensor, a bone conduction sensor, and the like.
  • the structure shown in the embodiment of the present invention does not constitute a specific limitation on the electronic device.
  • the electronic device may include more or fewer components than shown in the illustrations, or combine certain components, or separate certain components, or arrange different components.
  • the illustrated components can be realized in hardware, software or a combination of software and hardware.
  • Processor 510 may include one or more processing units.
  • the processor 510 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a flight controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc.
  • a memory may also be provided in the processor 510 for storing instructions and data.
  • the memory in processor 510 is a cache memory.
  • the memory may hold instructions or data that the processor 510 has just used or uses repeatedly. If the processor 510 needs to use the instructions or data again, they can be fetched directly from this memory. This avoids repeated access and reduces the waiting time of the processor 510, thereby improving system efficiency.
  • processor 510 may include one or more interfaces.
  • the interface may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface, etc.
  • the charging management module 540 is used for receiving charging input from the charger.
  • the power management module 541 is used for connecting the battery 542 , the charging management module 540 and the processor 510 .
  • the power management module 541 receives the input of the battery 542 and/or the charging management module 540, and supplies power for the processor 510, the internal memory 521, the display screen 594, the camera component 593, and the wireless communication module 560, etc.
  • the wireless communication function of the electronic device can be realized by the antenna 1, the antenna 2, the mobile communication module 550, the wireless communication module 560, the modem processor and the baseband processor.
  • Antenna 1 and Antenna 2 are used to transmit and receive electromagnetic wave signals.
  • Each antenna in an electronic device can be used to cover single or multiple communication frequency bands. Different antennas can also be multiplexed to improve the utilization of the antennas.
  • Antenna 1 can be multiplexed as a diversity antenna of a wireless local area network.
  • the antenna may be used in conjunction with a tuning switch.
  • the mobile communication module 550 can provide wireless communication solutions including 2G/3G/4G/5G applied to electronic devices.
  • the mobile communication module 550 may include at least one filter, switch, power amplifier, low noise amplifier (low noise amplifier, LNA) and the like.
  • the mobile communication module 550 can receive electromagnetic waves through the antenna 1, filter and amplify the received electromagnetic waves, and send them to the modem processor for demodulation.
  • the mobile communication module 550 can also amplify the signal modulated by the modem processor, convert it into electromagnetic wave and radiate it through the antenna 1 .
  • at least part of the functional modules of the mobile communication module 550 may be set in the processor 510 .
  • at least part of the functional modules of the mobile communication module 550 and at least part of the modules of the processor 510 may be set in the same device.
  • the electronic device can communicate with the cloud device through the mobile communication module 550, for example, sending audio and video streams and the like.
  • a modem processor may include a modulator and a demodulator.
  • the modulator is used for modulating the low-frequency baseband signal to be transmitted into a medium-high frequency signal.
  • the demodulator is used to demodulate the received electromagnetic wave signal into a low frequency baseband signal. Then the demodulator sends the demodulated low-frequency baseband signal to the baseband processor for processing.
  • the low-frequency baseband signal is passed to the application processor after being processed by the baseband processor.
  • the application processor outputs a sound signal through an audio device (not limited to a speaker 570A, a receiver 570B, etc.), or displays an image or video through a display screen 594 .
  • the modem processor may be a stand-alone device. In some other embodiments, the modem processor may be independent of the processor 510, and be set in the same device as the mobile communication module 550 or other functional modules.
  • the electronic device may play audio corresponding to multiple windows through an audio device, and play corresponding video through multiple windows on the display screen 594 .
  • the wireless communication module 560 can provide wireless communication solutions including wireless local area network (WLAN) (such as a WiFi network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), and infrared (IR) technology.
  • the wireless communication module 560 may be one or more devices integrating at least one communication processing module.
  • the wireless communication module 560 receives electromagnetic waves via the antenna 2 , frequency-modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 510 .
  • the wireless communication module 560 can also receive the signal to be transmitted from the processor 510 , frequency-modulate it, amplify it, and convert it into electromagnetic waves through the antenna 2 for radiation.
  • the antenna 1 of the electronic device is coupled to the mobile communication module 550, and the antenna 2 is coupled to the wireless communication module 560, so that the electronic device can communicate with the network and other devices through wireless communication technology.
  • the wireless communication technology may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technologies, etc.
  • the GNSS may include a global positioning system (GPS), a global navigation satellite system (GLONASS), a Beidou navigation satellite system (BDS), a quasi-zenith satellite system (QZSS) and/or satellite based augmentation systems (SBAS).
  • the electronic device realizes the display function through the GPU, the display screen 594, and the application processor.
  • the GPU is a microprocessor for image processing, connected to the display screen 594 and the application processor. GPUs are used to perform mathematical and geometric calculations for graphics rendering.
  • Processor 510 may include one or more GPUs that execute program instructions to generate or alter display information.
  • the display screen 594 is used to display images, videos and the like.
  • Display 594 includes a display panel.
  • the electronic device may include 1 or N display screens 594, where N is a positive integer greater than 1.
  • the electronic device may render videos in multiple windows through the GPU, and the display screen 594 is used to play corresponding videos through the multiple windows.
  • the electronic device can realize the shooting function through ISP, camera component 593 , video codec, GPU, display screen 594 and application processor.
  • the external memory interface 520 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device.
  • the external memory card communicates with the processor 510 through the external memory interface 520 to implement a data storage function, for example, saving music, video and other files on the external memory card.
  • the internal memory 521 may be used to store computer-executable program codes including instructions.
  • the internal memory 521 may include an area for storing programs and an area for storing data.
  • the stored program area can store an operating system, at least one application program required by a function (such as a sound playing function, an image playing function, etc.) and the like.
  • the storage data area can store data (such as audio data, phone book, etc.) created during the use of the electronic device.
  • the internal memory 521 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, flash memory device, universal flash storage (universal flash storage, UFS) and the like.
  • the processor 510 executes various functional applications and data processing of the electronic device by executing instructions stored in the internal memory 521 and/or instructions stored in a memory provided in the processor.
  • the electronic device can realize audio functions, such as music playback and recording, through the audio module 570, the speaker 570A, the receiver 570B, the microphone 570C, and the application processor.
  • for the specific working principles and functions of the audio module 570, the speaker 570A, the receiver 570B, the microphone 570C, the buttons 590, the motor 591, the indicator 592 and the SIM card interface 595, refer to the descriptions in the conventional technology.
  • the software of the electronic device can be divided into several layers, and each layer has a clear role and division of labor. Layers communicate through software interfaces. As shown in Figure 5B, the software structure of the electronic device can be divided into three layers from top to bottom:
  • application program layer (application layer for short), application framework layer (framework layer for short), system library, Android runtime and kernel layer (also referred to as driver layer).
  • the application layer may include a series of application packages, such as camera, gallery, calendar, call, map, navigation, Bluetooth, music, video, short message and other application programs (applications for short below).
  • the application layer may also include an RTC software development kit (software development kit, SDK) and a call SDK.
  • the call application is mainly used to complete a user interface (user interface, UI) and interaction logic.
  • the call SDK is mainly used to connect with the communication cloud, provide account management, contact management, signaling communication and other capabilities, complete audio and video data collection, broadcasting, encoding and decoding, connect with RTC SDK, and complete the intercommunication of media streams during the call.
  • the RTC SDK is mainly used to interact with the RTC cloud to provide the ability to send media streams (such as audio and video streams).
  • the application framework layer provides an application programming interface (application programming interface, API) and a programming framework for applications in the application layer.
  • the application framework layer may include a window management server (window manager service, WMS), an activity management server (activity manager service, AMS) and an input event management server (input manager service, IMS).
  • the application framework layer may also include a content provider, a view system, a phone manager, a resource manager, a notification manager, etc. (not shown in FIG. 5B ).
  • the system library and the Android runtime include the functions that the FWK needs to call, the Android core library, and the Android virtual machine.
  • a system library can include multiple function modules. For example: browser kernel, three-dimensional (3 dimensional, 3D) graphics, font library, etc.
  • a system library can include multiple function modules. For example: surface manager (surface manager), media library (Media Libraries), 3D graphics processing library (eg: OpenGL ES), 2D graphics engine (eg: SGL), etc.
  • the kernel layer is the layer between hardware and software.
  • the kernel layer can include display drivers, input/output device drivers (for example, keyboards, touch screens, earphones, speakers, microphones, etc.), device nodes, camera drivers, audio drivers, and sensor drivers.
  • the user performs an input operation through the input device, and the kernel layer can generate corresponding original input events according to the input operation, and store them in the device node.
  • Input/output device drivers can detect user input events. For example, an operation where the user sets the window priority by dragging the window.
  • the end-side device (such as the receiving end) also includes a weak network decision module and a coder/decoder.
  • the weak network decision module is used to predict the overall bandwidth according to the predicted bandwidth, determine whether it is in a weak network environment according to the overall bandwidth and the bandwidth requirements of multiple audio and video streams, and perform weak network decision-making and processing when it is determined to be in a weak network environment.
  • the encoding/decoding module is responsible for decoding audio and video streams.
  • the weak network decision-making module and the encoding/decoding module are located at the application framework layer as an example.
  • the weak network decision module and the encoding/decoding module can be located at any software architecture layer of the end-side device (such as the receiving end).
  • the above-mentioned weak network decision-making module and encoding/decoding module may also be located at the software architecture layer such as the application program layer, the kernel layer, or the system library of the end-side device (such as the receiving end).
  • a multi-window video communication method provided in the embodiment of the present application may be implemented based on a real time communication (real time communication, RTC) call service architecture.
  • FIG. 6 shows a schematic diagram of an RTC-based call service architecture.
  • the end-side devices communicate through cloud devices.
  • the architecture includes two end-side devices as an example, but this embodiment of the present application does not limit the specific number of end-side devices in multi-window video communication.
  • cloud devices include communication cloud and RTC cloud.
  • the communication cloud includes an account server and a signaling server.
  • RTC cloud includes RTC server and RTC SFU.
  • the account server is mainly used for storing and maintaining account information, contact information, and push (Push) information.
  • the signaling server is mainly responsible for forwarding call information and call control signaling.
  • the RTC server is mainly responsible for room (Room) access authentication, SFU resource allocation, routing policy, and Room operation/interaction.
  • the RTC SFU is mainly used to maintain the publish/subscribe relationship of media streams, forward media streams (such as audio and video streams), and provide network adaptability.
  • the end-side device is a sending end (such as a first device) or a receiving end (such as a second device).
  • the call application, call SDK and RTC SDK are installed in the end-side device.
  • the call application is mainly used to complete UI and interaction logic.
  • the call SDK is mainly used to connect with the communication cloud, provide account management, contact management, signaling communication and other capabilities, complete audio and video data collection, broadcasting, encoding and decoding, connect with RTC SDK, and complete the intercommunication of media streams during the call.
  • the RTC SDK is mainly used to interact with the RTC cloud to provide the ability to send media streams (such as audio and video streams).
  • the end-side device also includes an encoding/decoding module.
  • when the end-side device is a sending end, the encoding module is used to encode audio and video streams; when the end-side device is a receiving end, the decoding module is used to decode audio and video streams.
  • the structure of the RTC SFU in the call service architecture shown in FIG. 6 may be the structure of the SFU shown in FIG. 2 .
  • the RTC SFU may include a forwarding module and a bandwidth prediction module.
  • the forwarding module is used to forward audio and video streams.
  • the bandwidth prediction module is used for bandwidth prediction.
  • the terminal-side device (such as the receiving terminal) further includes a weak network decision module.
  • the weak network decision module is used to predict the overall bandwidth (that is, predict bandwidth resources), determine whether it is in a weak network environment according to the overall bandwidth and the bandwidth requirements of multiple audio and video streams, and perform weak network decision-making and processing when it is determined to be in a weak network environment.
  • for details about the weak network decision module, refer to the specific introduction below.
  • FIG. 7 shows a multi-window video communication architecture diagram provided by an embodiment of the present application.
  • RTC SFU 1 and RTC SFU 2 are deployed in a distributed manner to improve network scalability.
  • Each sender shown in Figure 7 can select the optimal RTC SFU for audio and video stream forwarding by itself, for example, select the optimal RTC SFU according to network conditions (such as operators, regions, etc.).
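  • The selection of an RTC SFU according to network conditions can be illustrated with a minimal sketch. The text only names operator and region as example conditions; the `SfuNode` type, the round-trip-time tie-breaker, and all sample values below are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SfuNode:
    name: str
    operator: str   # network operator the SFU is reachable through (assumed field)
    region: str     # deployment region (assumed field)
    rtt_ms: float   # round-trip time measured by the sending end (assumed field)


def select_sfu(candidates, operator, region):
    """Prefer the SFU matching the sender's operator and region; break ties on lower RTT."""
    if not candidates:
        raise ValueError("no SFU candidates available")

    def score(node):
        # Count how many of the two conditions match; more matches sort first.
        matches = (node.operator == operator) + (node.region == region)
        return (-matches, node.rtt_ms)

    return min(candidates, key=score)


# Hypothetical deployment with two distributed SFUs.
sfus = [
    SfuNode("RTC SFU 1", "operator-x", "region-east", 35.0),
    SfuNode("RTC SFU 2", "operator-y", "region-west", 20.0),
]
chosen = select_sfu(sfus, operator="operator-x", region="region-east")
print(chosen.name)
```

  • In this sketch a sender that matches neither SFU simply falls back to the one with the lowest measured round-trip time.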
  • sender A chooses to send audio and video stream 1 to receiver D through RTC SFU 1; sender B and sender C choose to send audio and video stream 2 and audio and video stream 3, respectively, to receiver D through RTC SFU 2.
  • the bandwidth prediction module of RTC SFU 1 shown in Figure 7 performs bandwidth prediction according to the downlink data (audio and video stream 1 shown in Figure 7), and sends the bandwidth prediction result to the weak network decision module of the receiving end D.
  • the bandwidth prediction module calculates the bandwidth prediction value 1 according to the data such as time delay and packet loss of the audio and video stream 1 shown in FIG. 7 .
  • the bandwidth prediction module of the RTC SFU 2 shown in Figure 7 calculates the bandwidth prediction value 2 according to data such as the delay and packet loss of the audio and video stream 2 shown in Figure 7, and calculates the bandwidth prediction value 3 according to data such as the delay and packet loss of the audio and video stream 3 shown in Figure 7.
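  • The application states only that each bandwidth prediction value is calculated from data such as delay and packet loss; it does not give a formula. The following is a minimal sketch of one possible heuristic, loosely modeled on common loss-based rate control for real-time media. The function name, the 2% and 10% loss thresholds, the 50 ms queuing-delay threshold, and the scaling factors are all assumptions.

```python
def predict_bandwidth_kbps(current_kbps, loss_ratio, delay_ms, base_delay_ms,
                           delay_threshold_ms=50.0):
    """Return an updated bandwidth estimate from packet loss and delay (illustrative only)."""
    if not 0.0 <= loss_ratio <= 1.0:
        raise ValueError("loss_ratio must be between 0 and 1")

    # Delay above the path's baseline is treated as queuing delay.
    queuing_delay_ms = delay_ms - base_delay_ms

    if loss_ratio > 0.10:
        # Heavy loss: back off in proportion to the loss ratio.
        return current_kbps * (1.0 - 0.5 * loss_ratio)
    if queuing_delay_ms > delay_threshold_ms:
        # Growing queues without heavy loss: back off by a fixed factor.
        return current_kbps * 0.85
    if loss_ratio < 0.02:
        # Clean path: probe upward gently.
        return current_kbps * 1.05
    # Moderate loss: hold the current estimate.
    return current_kbps


estimate = predict_bandwidth_kbps(500.0, loss_ratio=0.2, delay_ms=40.0, base_delay_ms=40.0)
print(estimate)
```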
  • the receiving end D shown in FIG. 7 includes a decoding module and a weak network decision module.
  • the decoding module of the receiving end D is used to decode the received audio and video stream (as shown in FIG. 7 , audio and video stream 1 & audio and video stream 2 & audio and video stream 3).
  • the decoding module may include multiple decoders, and the multiple decoders are used to share the decoding work of audio and video streams.
  • the decoding module includes a decoder A, a decoder B and a decoder C, where the decoder A is used to decode the audio and video stream 1, the decoder B is used to decode the audio and video stream 2, and the decoder C is used to decode the audio and video stream 3.
  • the weak network decision-making module shown in Figure 7 is used to perform overall bandwidth prediction (that is, bandwidth resource prediction) to obtain the total bandwidth prediction value. Further, the weak network decision-making module is also used to make a weak network judgment based on the total bandwidth prediction value and the bandwidth requirements of the audio and video streams and, when a weak network environment is determined, to preferentially downgrade the subscription to the video in the lower-priority window, so as to ensure smooth playback of the video in the higher-priority window.
  • the downgrading, by the weak network decision-making module, of the subscription to the video in the lower-priority window may include, but is not limited to, one or more of the following: unsubscribing, resuming a subscription, delaying a subscription, reducing the definition, increasing the definition, and the like.
  • the third device may also be an electronic device such as a smart phone.
  • the third device can form a peer-to-peer (P2P) network architecture with the first device and the second device, wherein the first device, the second device and the third device can all serve as the receiver.
  • the third device may also serve as a forwarding device, forwarding the audio and video stream from the first device to the second device.
  • the following takes the multi-window video communication architecture shown in Figure 7 as an example, in which multiple sending ends (that is, first devices) send audio and video streams to the receiving end (that is, the second device) through the third device (such as the SFU), and the display interface of the receiving end includes the large window, small window 1, and small window 2 shown in FIG. 4. A multi-window video communication method provided by the embodiments of this application is described in detail with reference to the accompanying drawings.
  • the multi-window video communication method provided by the embodiment of the present application may include the following steps S801-S804:
  • S801. Multiple sending ends (that is, first devices) send audio and video streams to a third device (such as the SFU).
  • the audio and video stream also carries the identification (ID) of the receiving end (such as the second device), which is used by the third device (such as the SFU) to forward the audio and video stream to the receiving end according to the ID of the receiving end.
  • the ID of the receiver carried in the audio and video stream is also used by the SFU to predict the downlink bandwidth of the corresponding downlink path.
  • the audio and video stream also carries the ID of the sending end.
  • the above step S801 may specifically include: the sending end A sends the audio and video stream 1 to the RTC SFU 1, and the sending end B and the sending end C respectively send the audio and video stream 2 and the audio and video stream 3 to the RTC SFU 2.
  • the audio and video stream 1, the audio and video stream 2 and the audio and video stream 3 carry the ID of the receiving end D. In addition, the audio and video stream 1 carries the ID of the sending end A, the audio and video stream 2 carries the ID of the sending end B, and the audio and video stream 3 carries the ID of the sending end C.
  • the audio and video streams in this embodiment of the present application may also carry the following information: stream resolution information, frame rate information, codec, stream level, and the like.
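  • The information carried alongside each audio and video stream can be pictured as a small record. The sketch below is illustrative only: the `StreamInfo` type and its field names are hypothetical, and the frame rate, codec, and stream level values are placeholders rather than values from the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamInfo:
    stream_id: str       # e.g. "audio and video stream 1"
    sender_id: str       # ID of the sending end
    receiver_id: str     # ID of the receiving end, used by the SFU for forwarding
    resolution: str      # stream resolution information
    frame_rate_fps: int  # frame rate information (placeholder value below)
    codec: str           # codec (placeholder value below)
    stream_level: str    # stream level (placeholder value below)


streams = [
    StreamInfo("audio and video stream 1", "A", "D", "540P", 30, "H.264", "high"),
    StreamInfo("audio and video stream 2", "B", "D", "360P", 30, "H.264", "normal"),
    StreamInfo("audio and video stream 3", "C", "D", "360P", 30, "H.264", "normal"),
]

# Every stream in this example is addressed to receiving end D.
receivers = {s.receiver_id for s in streams}
print(receivers)
```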
  • audio-video stream 1, audio-video stream 2, and audio-video stream 3 shown in FIG. 9 not only include audio-video information, but also carry information as shown in Table 4 below:
  • the embodiment of the present application does not limit the specific order and timing of sending audio and video streams by multiple sending ends.
  • the sending end A, sending end B, and sending end C shown in FIG. 7 may send audio and video streams at the same time, or may send audio and video streams in any order of time.
  • S802. The third device (such as the SFU) forwards audio and video streams to the receiving end (such as the second device), predicts downlink bandwidth, and sends the bandwidth prediction result to the receiving end.
  • the above step S802 may specifically include: RTC SFU 1 forwards audio and video stream 1 to the receiving end D, performs bandwidth prediction at the same time to obtain bandwidth prediction value 1, and sends bandwidth prediction value 1 to the receiving end D; RTC SFU 2 forwards audio and video stream 2 to the receiving end D, performs bandwidth prediction at the same time to obtain bandwidth prediction value 2, and sends bandwidth prediction value 2 to the receiving end D; RTC SFU 2 forwards audio and video stream 3 to the receiving end D, performs bandwidth prediction at the same time to obtain bandwidth prediction value 3, and sends bandwidth prediction value 3 to the receiving end D.
  • the predicted bandwidth value 1, the predicted bandwidth value 2, and the predicted bandwidth value 3 shown in FIG. 9 are all 500 kbps.
  • the third device may forward the audio-video stream to the receiver according to the ID of the receiver carried in the audio-video stream.
  • RTC SFU 1 shown in Figure 9 can forward audio and video stream 1 to receiving end D according to the ID of receiving end D carried in audio and video stream 1 from sending end A; RTC SFU 2 shown in Figure 9 can forward audio and video stream 2 to receiving end D according to the ID of receiving end D carried in audio and video stream 2 from sending end B, and forward audio and video stream 3 to receiving end D according to the ID of receiving end D carried in audio and video stream 3 from sending end C.
  • the third device (such as the SFU) can predict the downlink bandwidth of the corresponding downlink path according to the ID of the receiver carried in the audio and video stream.
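  • Forwarding by receiver ID, with statistics kept per downlink path, might look like the following minimal sketch. The `Sfu` class, the packet dictionary layout, and the byte counter are hypothetical illustrations; the application does not describe the SFU's internal data structures.

```python
from collections import defaultdict


class Sfu:
    """Toy selective forwarding unit that routes packets by the receiver ID they carry."""

    def __init__(self, name):
        self.name = name
        self.delivered = defaultdict(list)          # receiver ID -> forwarded packets
        self.bytes_per_receiver = defaultdict(int)  # receiver ID -> downlink bytes seen

    def forward(self, packet):
        receiver_id = packet["receiver_id"]
        # Forward to the receiving end named in the packet.
        self.delivered[receiver_id].append(packet)
        # Keep per-receiver statistics so the downlink path can be measured separately.
        self.bytes_per_receiver[receiver_id] += len(packet["payload"])


sfu2 = Sfu("RTC SFU 2")
sfu2.forward({"sender_id": "B", "receiver_id": "D", "payload": b"\x00" * 1200})
sfu2.forward({"sender_id": "C", "receiver_id": "D", "payload": b"\x00" * 800})
print(len(sfu2.delivered["D"]), sfu2.bytes_per_receiver["D"])
```

  • Keying the statistics by receiver ID is what lets a bandwidth predictor work on one downlink path at a time.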
  • the audio and video corresponding to the audio and video streams forwarded by multiple sending ends to the receiving end through a third device are respectively played in multiple windows displayed on the display screen of the receiving end.
  • FIG. 10 shows several examples of multi-window display.
  • (a) in FIG. 10 and (b) in FIG. 10 show the multi-window display interface of the receiving end in a multi-window video communication scenario (such as a multi-party conference scenario or a group video chat scenario).
  • (c) in FIG. 10 shows the multi-window display interface of the receiving end in the online education scenario.
  • FIG. 10 only shows windows related to this application.
  • function keys, menu bars, navigation keys/bars, etc. may also be displayed on the interface of the receiving end, which is not limited by this application.
  • the audio and video stream 1, audio and video stream 2, and audio and video stream 3 sent to the receiving end D by the sending end A, the sending end B, and the sending end C shown in Figure 7 are played in the large window, small window 1, and small window 2 of the receiving end D, respectively.
  • the SFU may also send the bandwidth requirement to the receiving end.
  • the required bandwidths of audio-video stream 1, audio-video stream 2, and audio-video stream 3 shown in FIG. 9 are 600 kbps, 500 kbps, and 500 kbps respectively.
  • the third device may also send the frame extraction status and the corresponding bandwidth requirement to the receiving end.
  • the frame extraction state is used to indicate whether the video coding mode is to extract frames or not to extract frames.
  • the frame extraction state information and corresponding bandwidth requirements of audio-video stream 1, audio-video stream 2, and audio-video stream 3 shown in FIG. 9 are shown in Table 5 below:
  • Audio and video stream | Frame extraction state | Bandwidth requirement
    Audio and video stream 1 | No frame extraction | 400 kbps
    Audio and video stream 2 | Frame extraction | 300 kbps
    Audio and video stream 3 | Frame extraction | 300 kbps
  • S803. The receiving end obtains a total bandwidth prediction value according to bandwidth prediction results from one or more third devices (such as SFUs).
  • the size of the total bandwidth prediction value can be used to represent the amount of bandwidth resources. For example, a larger total bandwidth prediction value indicates more sufficient bandwidth resources; a smaller total bandwidth prediction value indicates less bandwidth resources.
  • the above step S803 may specifically include: the receiving end D obtains the total bandwidth prediction value according to the bandwidth prediction value 1 from the RTC SFU 1 and the bandwidth prediction value 2 and the bandwidth prediction value 3 from the RTC SFU 2.
  • total bandwidth prediction value = bandwidth prediction value 1 + bandwidth prediction value 2 + bandwidth prediction value 3.
  • if the predicted bandwidth value 1, the predicted bandwidth value 2, and the predicted bandwidth value 3 shown in FIG. 9 are all 500 kbps, the total predicted bandwidth value is 1500 kbps.
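  • The summation and the weak network judgment can be checked with a short sketch using the example figures from the text (three predictions of 500 kbps; requirements of 600 kbps, 500 kbps, and 500 kbps). The function names are hypothetical, and comparing the total prediction with the sum of the requirements is an assumed reading of the weak network definition, not a rule stated by the application.

```python
def total_bandwidth_prediction(predictions_kbps):
    """Sum the per-stream bandwidth prediction values reported by the SFUs."""
    return sum(predictions_kbps)


def is_weak_network(total_kbps, requirements_kbps):
    """Assumed rule: weak network when predicted bandwidth cannot cover total demand."""
    return total_kbps < sum(requirements_kbps)


predictions = [500, 500, 500]    # bandwidth prediction values 1, 2, and 3 (kbps)
requirements = [600, 500, 500]   # requirements of streams 1, 2, and 3 (kbps)

total = total_bandwidth_prediction(predictions)
weak = is_weak_network(total, requirements)
print(total, weak)
```

  • With these figures the predicted 1500 kbps falls short of the 1600 kbps required, so the receiving end would be treated as being in a weak network environment.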
  • this application does not limit the specific algorithm for the receiving end to obtain the total bandwidth prediction value based on the bandwidth prediction results from one or more SFUs.
  • for this part, reference can be made to calculation methods in conventional technologies.
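  • as a minimal sketch of the summation described above (the application does not limit the algorithm, so plain addition is only the example given in the text; the function name is hypothetical):

```python
def total_bandwidth_prediction(predictions_kbps):
    """Add up the per-SFU downlink bandwidth predictions, in kbps."""
    return sum(predictions_kbps)


# Values from the example: three predictions of 500 kbps each.
print(total_bandwidth_prediction([500, 500, 500]))
```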
  • the receiving end adjusts a subscription strategy for audio and video streams in one or more windows according to priorities corresponding to the multiple windows.
  • the above step S804 may specifically include: when a weak network is determined according to the total bandwidth prediction value, the receiving end D adjusts, according to the priorities corresponding to the large window, small window 1, and small window 2, the subscription strategy for audio and video streams in one or more of the large window, small window 1, and small window 2.
  • multiple windows on the display screen of the receiving end may have priority attributes.
  • the priority corresponding to the window is used to represent the importance of the video played in the window to the user at the receiving end. For example, relatively important windows have a higher priority than relatively less important windows.
  • the large window plays high-definition video at a higher resolution (540P), and the small windows 1 and 2 play normal-definition video at a lower resolution (360P).
  • the video played in the large window is more important to the user than the videos played in the small window 1 and the small window 2. Therefore, the priority of the large window shown in FIG. 11 is higher than that of the small window 1 and the small window 2 .
  • the courseware/whiteboard/screen sharing window is the core of online education, while the portrait of the lecturer is relatively unimportant. Therefore, the priority of the courseware/whiteboard/screen sharing window is higher than that of the lecturer portrait window. The specific method for determining the priority will be introduced below.
  • weak network means that under the current network environment, bandwidth resources are insufficient to meet the bandwidth requirements of audio and video streams.
  • the weak network means that under the current network environment, the total bandwidth prediction value obtained by the receiving end (for example, 300kbps-1000kbps) is smaller than the bandwidth value required by the audio and video stream (for example, 1200kbps).
  • the weak network means that under the current network environment, the bandwidth resources are insufficient to meet the bandwidth requirement of the video stream before frame extraction.
  • the video stream corresponding to the large window displayed on the receiving end D requires a bandwidth of 700 kbps before frame extraction
  • the video stream corresponding to small window 1 and small window 2 requires a bandwidth of 500 kbps before frame extraction.
  • a weak network means that under the current network environment, bandwidth resources are insufficient to meet the bandwidth requirement of the video stream after frame extraction.
  • the video stream corresponding to the large window displayed on the receiving end D requires a bandwidth of 400 kbps after frame extraction
  • the video stream corresponding to small window 1 and small window 2 requires a bandwidth of 300 kbps after frame extraction.
  • the receiving end can determine that it is in a weak network environment.
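  • the weak-network check above can be sketched as a comparison between the total bandwidth prediction value and the sum of the per-window requirements. This is only an illustration: the function name is hypothetical, it assumes the per-window figures in the example apply to each small window individually, and whether the pre- or post-frame-extraction requirements are used depends on the embodiment.

```python
def is_weak_network(total_prediction_kbps, required_kbps):
    """Return True when the predicted bandwidth cannot cover the requirements."""
    return total_prediction_kbps < sum(required_kbps)


# Per-window requirements from the example (large window, small window 1, small window 2).
before_extraction = [700, 500, 500]
after_extraction = [400, 300, 300]

print(is_weak_network(1500, before_extraction))
print(is_weak_network(1500, after_extraction))
```

  • with a total prediction of 1500 kbps, the network counts as weak under the pre-extraction definition but not under the post-extraction definition.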
  • the embodiment of the present application uses layered coding as an example; the multi-window video communication method provided in the embodiment of the present application is also applicable to other video coding methods. In addition, the embodiment of the present application takes layered encoding of IDR frame + P frame + B frame, or IDR frame + P frame, as an example; the method is also applicable to other layered encoding types.
  • the adjustment of the subscription policy by the receiving end may include, but is not limited to, one or more of the following: unsubscribing, resuming subscription, delaying subscription, reducing definition, improving definition, and the like.
  • when the receiving end adjusts the subscription policy, it may adjust only the subscription policy for the video stream and keep the audio stream playing normally, so as to ensure normal voice communication and the user experience.
  • unsubscribing refers to unsubscribing from the corresponding video stream to cancel the video display in the window.
  • Restoring the subscription refers to resuming the subscription to the corresponding video stream to restore the video display in the window.
  • Delayed subscription refers to delaying the subscription to the corresponding video stream, thereby delaying the video display in the window.
  • Reducing the definition refers to subscribing to the reduced-definition video stream for the window, for example, switching from subscribing to a high-definition video stream to subscribing to a normal-definition video stream.
  • Raising the definition refers to subscribing to the video stream with improved definition for the window, for example, switching from subscribing to a normal-definition video stream to subscribing to a high-definition video stream.
  • the following takes as an example a display interface of the receiving end that includes m windows (m ≥ 3, m is an integer), all of which display high-definition video, and uses specific examples to illustrate how the receiving end cancels subscription, resumes subscription, delays subscription, reduces definition, or improves definition. It should be noted that the embodiment of the present application does not limit the number of windows on the display interface of the receiving end; for example, the display interface of the receiving end may also include 2 windows.
  • the display interface of the receiving end includes windows W_1, W_2, ..., W_m (where the subscript represents the priority corresponding to the window); among them, W_1 has the highest priority, W_2 the second highest, and W_m the lowest. If the bandwidth resources are not enough to keep windows W_1, W_2, ..., W_m at the current resolution, but are enough to keep windows W_1, W_2, ..., W_{m-1} at the current resolution with window W_m at a lowered resolution, the receiving end decides to subscribe to the lowered-resolution video stream for window W_m, so as to preserve the resolution of the video in the higher-priority windows (such as windows W_1, W_2, ..., W_{m-1}).
  • the receiving end decides to switch from subscribing to the first-definition audio and video stream for the window to subscribing to the second-definition audio and video stream for the window.
  • the second definition is lower than the first definition.
  • the receiving end decides to reduce the definition of the video in window W_m to ensure the high-definition display of the video in windows W_1, W_2, ..., W_{m-1}. That is, the receiving end decides to subscribe to the normal-definition video stream (that is, the second-definition video stream) for window W_m.
  • RW_1, RW_2, ..., RW_m are the bandwidth requirements of windows W_1, W_2, ..., W_m respectively.
  • the resolution of high-definition video may be 540P
  • the resolution of normal-definition video may be 360P.
  • after the receiving end reduces the definition of the video in window W_m, if $RW_{1/\mathrm{HD}} + RW_{2/\mathrm{HD}} + \dots + RW_{m-1/\mathrm{HD}} + RW_{m/\mathrm{ND}} > \text{total bandwidth prediction value} \ge RW_{1/\mathrm{HD}} + RW_{2/\mathrm{HD}} + \dots + RW_{m-1/\mathrm{ND}} + RW_{m/\mathrm{ND}}$ (where the subscripts HD and ND denote high definition and normal definition), the receiving end can further decide to subscribe to the normal-definition video stream (that is, the second-definition video stream) for window W_{m-1}, thereby reducing the definition of the video in window W_{m-1} to ensure the high-definition display of the video in windows W_1, W_2, ..., W_{m-2}, and so on.
  • alternatively, after the receiving end reduces the definition of the video in window W_m, if $RW_{1/\mathrm{HD}} + RW_{2/\mathrm{HD}} + \dots + RW_{m-1/\mathrm{HD}} + RW_{m/\mathrm{ND}} > \text{total bandwidth prediction value} \ge RW_{1/\mathrm{HD}} + RW_{2/\mathrm{HD}} + \dots + RW_{m-1/\mathrm{ND}} + RW_{m/\mathrm{ND}}$, the receiving end can further decide to cancel the video stream subscription for window W_m, thereby canceling the video display in window W_m, to ensure the high-definition display of the video in windows W_1, W_2, ..., W_{m-2}.
  • the definition may also include three levels: ultra-definition, high definition, and normal definition.
  • when reducing the definition, it can be reduced step by step in the order ultra-definition → high definition → normal definition.
  • the display interface of the receiving end includes windows W_1, W_2, ..., W_m. If the bandwidth resources are insufficient to meet the bandwidth requirement of displaying all windows in high definition, but meet the bandwidth requirement of keeping windows W_1, W_2, ..., W_{m-1} displayed with the current parameters while window W_m does not display video, the receiving end decides to cancel the video stream subscription for window W_m, which has the lowest priority, thereby canceling the video display in the lowest-priority window to ensure that the other windows are displayed in the first definition.
  • the receiving end decides to cancel the video stream subscription for the window with the lowest priority, instead of subscribing to the second-definition audio and video stream for that window.
  • the second definition is less than or equal to a preset value.
  • the receiving end decides to cancel the video stream subscription for window W_m, thereby canceling the video display in window W_m, so as to ensure the high-definition display of the video in windows W_1, W_2, ..., W_{m-1}.
  • the receiving end can further decide to subscribe to a reduced-definition (such as normal-definition) video stream for window W_{m-1}, thereby reducing the definition of the video in window W_{m-1}; or, the receiving end can further decide to cancel the video stream subscription for window W_{m-1}, to ensure the high-definition display of the video in windows W_1, W_2, ..., W_{m-2}, and so on.
  • the unsubscribed window may display the last image frame before unsubscribing.
  • the unsubscribed window may display a mask.
  • FIG. 12 shows an example of displaying a mask in the window after small window 2, which has a lower priority (priority 2), is unsubscribed in a weak network environment.
  • the unsubscribed window may display a mask superimposed on the last image frame before unsubscribing.
  • at least one of the windows W_1, W_2, ..., W_m must be guaranteed to display video. For example, at least ensure that the video in window W_1 is displayed at the lowest resolution.
  • this application does not limit the specific adjustment strategy adopted by the receiver device for subscribing to audio and video streams when it is in a weak network environment.
  • if the bandwidth resources are insufficient to meet the bandwidth requirement of keeping the video streams of all windows displayed in the first definition, but meet the bandwidth required when the video display in the window with the lowest priority is canceled and the other windows are displayed in the first definition, the receiving end may also decide to cancel the video stream subscription for the window with the lowest priority, thereby canceling the display of the video in that window, so as to ensure that the other windows are displayed in the first definition.
  • the receiving end decides to cancel the video stream subscription for window W_m, thereby canceling the video display in window W_m, so as to ensure the high-definition display of the videos in windows W_1, W_2, ..., W_{m-1}.
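  • one possible downgrade order can be sketched as follows. The application does not limit the specific adjustment strategy, so this is an assumption-laden illustration rather than the claimed method: it first lowers windows from high definition (HD) to normal definition (ND) starting with the lowest priority, then unsubscribes windows starting with the lowest priority, and never unsubscribes the highest-priority window. The `Window` class, the `downgrade` function, and the bandwidth figures are hypothetical; audio streams are left untouched.

```python
from dataclasses import dataclass

HD, ND, OFF = "HD", "ND", "OFF"


@dataclass
class Window:
    name: str
    priority: int          # 1 is the highest priority
    kbps: dict             # video bandwidth requirement per definition
    state: str = HD

    def need(self):
        """Current video bandwidth requirement of this window in kbps."""
        return 0 if self.state == OFF else self.kbps[self.state]


def downgrade(windows, total_prediction_kbps):
    """Lower definition, then unsubscribe, starting from the lowest priority."""
    ordered = sorted(windows, key=lambda w: w.priority, reverse=True)

    def total_need():
        return sum(w.need() for w in windows)

    # Pass 1: switch HD -> ND, lowest-priority window first.
    for w in ordered:
        if total_need() <= total_prediction_kbps:
            return windows
        if w.state == HD:
            w.state = ND

    # Pass 2: unsubscribe, but always keep the highest-priority window.
    for w in ordered[:-1]:
        if total_need() <= total_prediction_kbps:
            break
        w.state = OFF
    return windows


# Hypothetical requirements: 600 kbps in HD and 300 kbps in ND per window.
windows = [Window(f"W{i}", i, {HD: 600, ND: 300}) for i in (1, 2, 3)]
downgrade(windows, total_prediction_kbps=1300)
print([(w.name, w.state) for w in windows])
```

  • with these hypothetical figures and a total prediction of 1300 kbps, W3 and W2 drop to normal definition while W1 stays in high definition.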
  • if the receiving end, through total bandwidth prediction value monitoring (i.e., bandwidth resource monitoring), determines that the latest total bandwidth prediction value satisfies the bandwidth requirement for the highest-priority window among the unsubscribed windows to resume subscription while the other windows maintain their current display (that is, the second preset condition), the receiving end decides to resume subscribing to the video stream for the highest-priority window among the unsubscribed windows, so as to restore the display of the corresponding video.
  • if the receiving end, through total bandwidth prediction value monitoring (i.e., bandwidth resource monitoring), determines that within a preset time (such as 6 seconds) the latest total bandwidth prediction value satisfies the bandwidth requirement for the highest-priority window among the unsubscribed windows to resume subscription while the other windows maintain their current display, the receiving end decides to resume subscribing to the video stream for the highest-priority window among the unsubscribed windows, so as to restore the display of the corresponding video.
  • the receiving end decides to resume subscribing to the video stream for window W_m, so as to restore the video display in window W_m. For example, the receiving end decides to resume subscribing to the second-definition video for window W_m.
  • if the receiving end confirms, through total bandwidth prediction value monitoring, that for 6 consecutive seconds $RW_{1/\mathrm{HD}} + RW_{2/\mathrm{HD}} + \dots + RW_{m-1/\mathrm{ND}} + RW_{m/\mathrm{ND}} > \text{total bandwidth prediction value} \ge RW_{1/\mathrm{HD}} + RW_{2/\mathrm{HD}} + \dots + RW_{m-1/\mathrm{ND}}$ (where the subscripts HD and ND denote high definition and normal definition), the receiving end decides to resume subscribing to the video stream for window W_{m-1} to restore the video display in window W_{m-1}, while window W_m remains unsubscribed.
  • Delayed subscription means that, while a window with a higher priority than a certain window is in the unsubscribed state, subscription to the video stream for that window is delayed, so as to delay the restoration of the video display in that window.
  • for example, assuming that windows W_1, W_2, ..., W_{m-2} currently all display high-definition video and windows W_{m-1} and W_m do not display video, window W_m remains unsubscribed until window W_{m-1} resumes subscription.
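  • the resume and delayed-subscription behavior can be sketched as below, assuming the preset time is applied as a requirement that the bandwidth condition hold continuously (6 seconds in the example). `ResumeMonitor` and `pick_resume_candidate` are hypothetical names, and the figures in the usage lines are illustrative only.

```python
class ResumeMonitor:
    """Tracks how long the bandwidth condition for resuming has held."""

    def __init__(self, hold_seconds=6.0):
        self.hold_seconds = hold_seconds
        self._ok_since = None

    def update(self, now_s, total_prediction_kbps, current_need_kbps, candidate_kbps):
        """Return True once the condition has held for hold_seconds."""
        if total_prediction_kbps >= current_need_kbps + candidate_kbps:
            if self._ok_since is None:
                self._ok_since = now_s
            if now_s - self._ok_since >= self.hold_seconds:
                self._ok_since = None
                return True
        else:
            # Condition broken: restart the timer.
            self._ok_since = None
        return False


def pick_resume_candidate(windows):
    """windows: (name, priority, subscribed) tuples; 1 is the highest priority.

    Only the highest-priority unsubscribed window is eligible, so lower-priority
    windows stay unsubscribed until it has resumed (delayed subscription).
    """
    unsubscribed = [w for w in windows if not w[2]]
    if not unsubscribed:
        return None
    return min(unsubscribed, key=lambda w: w[1])[0]


monitor = ResumeMonitor(hold_seconds=6.0)
candidate = pick_resume_candidate([("W1", 1, True), ("W2", 2, False), ("W3", 3, False)])
for t in (0.0, 3.0, 6.0):
    print(candidate, t, monitor.update(t, 1000, 600, 300))
```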
  • after reducing the definition, if the receiving end determines, through total bandwidth prediction value monitoring (i.e., bandwidth resource monitoring), that the latest total bandwidth prediction value satisfies the bandwidth requirement for the highest-priority window among the multiple windows whose definition has been reduced to be displayed with improved definition while the other windows maintain their current display (that is, the first preset condition), the receiving end decides to improve the definition of the video in that highest-priority window, so as to ensure the definition of the video in the high-priority window.
  • after reducing the definition, if the receiving end determines, through total bandwidth prediction value monitoring (i.e., bandwidth resource monitoring), that within a preset time (such as 6 seconds) the latest total bandwidth prediction value satisfies the bandwidth requirement for the highest-priority window among the multiple windows whose definition has been reduced to be displayed with improved definition while the other windows maintain their current display, the receiving end decides to improve the definition of the video in that highest-priority window, so as to ensure the definition of the video in the high-priority window.
  • the receiving end decides to increase the definition of window W_{m-1}, for example, switching from normal-definition display (that is, display in the second definition) to high-definition display (that is, display in the first definition).
  • the receiving end decides to improve the definition of window W_{m-1}, for example, switching from normal-definition display (that is, display in the second definition) to high-definition display (that is, display in the first definition), while window W_m still displays the normal-definition video (that is, the video in the second definition).
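  • the definition-improvement check can be sketched in the same spirit: among the windows currently in normal definition (ND), pick the one with the highest priority and raise it to high definition (HD) only if the total bandwidth prediction value covers the resulting requirement. The function name, the dictionary layout, and the figures are hypothetical, and the preset-time condition is omitted for brevity.

```python
def pick_upgrade(windows, total_prediction_kbps):
    """Return the name of the window to raise to HD, or None.

    windows: dicts with keys "name", "priority" (1 is highest),
    "state" ("HD", "ND" or "OFF") and "kbps" (requirement per definition).
    """
    reduced = [w for w in windows if w["state"] == "ND"]
    if not reduced:
        return None
    candidate = min(reduced, key=lambda w: w["priority"])
    current = sum(w["kbps"][w["state"]] for w in windows if w["state"] != "OFF")
    # Requirement if only the candidate switches from ND to HD.
    needed = current - candidate["kbps"]["ND"] + candidate["kbps"]["HD"]
    return candidate["name"] if total_prediction_kbps >= needed else None


rates = {"HD": 600, "ND": 300}
windows = [
    {"name": "W1", "priority": 1, "state": "HD", "kbps": rates},
    {"name": "W2", "priority": 2, "state": "ND", "kbps": rates},
    {"name": "W3", "priority": 3, "state": "ND", "kbps": rates},
]
print(pick_upgrade(windows, 1500))
```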
  • the receiving end can adjust the subscription strategy for audio and video streams in multiple windows according to the priorities corresponding to the multiple windows.
  • the receiving end may decide to reduce the definition of the video in window W_{m-1} (for example, switching from the first-definition display to the second-definition display) while unsubscribing from window W_m.
  • the receiving end may decide to reduce the definition of the video in window W_{m-1} while reducing the definition of the video in window W_m, for example, switching from the first-definition display to the second-definition display.
  • the receiving end may decide to improve the definition of the video in window W_{m-1} (for example, switching from the second-definition display to the first-definition display) while restoring the display of window W_m in the second definition.
  • the receiving end may decide to improve the definition of the video in window W_{m-1} while improving the definition of the video in window W_m, for example, switching from the second-definition display to the first-definition display.
  • the method provided by the embodiment of the present application further includes step S805:
  • the receiving end subscribes to multiple sending ends for audio and video streams according to the latest subscription policy.
  • the above step S801 may specifically include: the receiving end D subscribes to audio and video streams from the sending end A, the sending end B, and the sending end C according to the determined latest subscription policy.
  • the receiving end may send the following information to the third device to request the third device to subscribe to the corresponding audio and video stream from the first device: the identification (ID) of the sending end, the priority corresponding to the window, the original subscription parameter, the target subscription parameter, and the window identification (Surface ID).
  • for example, suppose the receiving end determines to switch from subscribing to the high-definition video stream to subscribing to the normal-definition video stream for window W_m, where the sending end of the audio and video stream corresponding to window W_m is the sending end A and the priority corresponding to window W_m is 3. The receiving end can then send the following information to the third device to request the third device to subscribe to the audio and video stream with the corresponding parameters from the first device: the ID of sending end A, 3 (priority), high definition (original subscription parameter), normal definition (target subscription parameter), and the ID of window W_m.
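  • the information listed above could be carried in a small record such as the following. This is only a sketch: the application does not define a message format, so the class name, field names, and string values are hypothetical.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class SubscriptionRequest:
    sender_id: str        # identification (ID) of the sending end
    priority: int         # priority corresponding to the window
    original_param: str   # original subscription parameter
    target_param: str     # target subscription parameter
    surface_id: str       # window identification (Surface ID)


# Example from the text: window W_m of sending end A switches from
# high definition to normal definition at priority 3.
request = SubscriptionRequest(
    sender_id="A",
    priority=3,
    original_param="high definition",
    target_param="normal definition",
    surface_id="W_m",
)
print(asdict(request))
```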
  • the above-mentioned embodiments shown in FIG. 8 and FIG. 9 of this application only take as an example the case where the third device (such as an SFU) predicts the downlink bandwidth and sends the bandwidth prediction result to the receiving end; this application does not limit the specific device responsible for downlink bandwidth prediction.
  • the receiving end (that is, the second device) may also measure and obtain bandwidth prediction results of multiple links during the process of receiving multiple audio and video streams.
  • the above multiple links correspond to multiple audio and video streams.
  • the above multiple links are respectively used to transmit the above multiple audio and video streams.
  • a prompt message can be displayed on the display screen of the electronic device to remind the user that the current network is relatively poor.
  • FIG. 14 shows an example of displaying a weak network prompt after small window 2, which has a lower priority (priority 2), is unsubscribed in a weak network environment.
  • the priority corresponding to the window can be specified by the user, or can be determined by the electronic device itself.
  • the following will introduce several methods of determining the priority corresponding to the window with reference to the accompanying drawings:
  • the user can customize the priority setting by changing the size of the window. For example, a small window is enlarged to a large window, so as to increase the priority corresponding to the window.
  • assume that the priorities of window A, window B, window C and window D shown in (a) in Figure 15 are all 2. In response to the user's operation 1401 of stretching window A into a large window, the electronic device displays a large window D in the middle of the screen and sets the priority corresponding to window D to 1.
  • the user can change the order of the windows by dragging the windows, so as to customize the priority settings. For example, drag a window to the middle of the screen to increase the corresponding priority of the window.
  • assume that the priorities of window B, window C and window D shown in (a) in Figure 16 are all 2. In response to the user's operation 1501 of dragging window D to the middle of the screen, as shown in Figure 16, the electronic device displays window D in the middle of the screen and sets the priority corresponding to window D to 1.
  • the embodiment of the present application does not specify a specific manner and method for the user to specify the priority corresponding to the window.
  • the user can also set the priority of the window in the menu.
  • the priority corresponding to the window is determined by the electronic device according to the volume of the audio corresponding to the multiple windows.
  • the electronic device may determine the priorities corresponding to the windows according to the initial volumes of the audios corresponding to the multiple windows.
  • the initial volume of the audio is used to represent the original volume of the audio stream when the electronic device receives the audio stream.
  • the electronic device may adaptively adjust the priority corresponding to the windows according to the initial volume of the audio corresponding to the multiple windows.
  • the electronic device may determine the priority corresponding to the windows according to the playback volume of the audio corresponding to the multiple windows.
  • the user at the receiving end can individually set the playback volume of the audio corresponding to different windows according to their concerns and points of interest. For example, for the window that the user pays most attention to, the user can increase the playback volume, and for the window that the user does not pay attention to, the user can lower the playback volume.
  • the electronic device may adaptively adjust the priority corresponding to the windows according to the playback volume of the audio corresponding to the multiple windows.
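  • a volume-based priority assignment might look like the sketch below. It assumes that a louder window is the one the user cares about more and therefore gets a higher priority (1 being the highest); the application does not fix the exact mapping, and the function name and volume values are hypothetical.

```python
def priorities_from_volume(volumes):
    """Map window name -> priority, where the loudest window gets priority 1."""
    ordered = sorted(volumes, key=lambda name: volumes[name], reverse=True)
    return {name: rank for rank, name in enumerate(ordered, start=1)}


# Hypothetical playback volumes on a 0.0 to 1.0 scale.
print(priorities_from_volume({"window A": 0.2, "window B": 0.9, "window C": 0.5}))
```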
  • the priority corresponding to the window is determined by the electronic device according to the functions of the services in the multiple windows.
  • the interface shown in (c) in Figure 10 includes the courseware/whiteboard/screen sharing window and the window where the lecturer portrait is located, wherein the courseware/whiteboard/screen sharing window It is used to display the courseware/whiteboard or display the shared interface, and the window where the lecturer portrait is located is used to play the lecturer video in real time.
  • the core function of online education lies in the display of classroom content. Whether the lecturer portrait is smooth and clear will not affect the acquisition of classroom content. Therefore, the corresponding priority of the courseware/whiteboard/screen sharing window is higher than that of the lecturer portrait window.
  • determining the priority corresponding to the window according to the volume of the audio corresponding to the multiple windows, or according to the functions of the services in the multiple windows, serves only as two examples. The embodiment of the present application does not limit the specific rules and methods by which the electronic device determines the priority corresponding to the window.
  • the priorities corresponding to the windows may also be determined by the electronic device according to other factors such as attributes of videos in multiple windows.
  • multiple windows may correspond to different priorities under different business scenarios or different user requirements.
  • when the downlink bandwidth is limited or fluctuates greatly, the electronic device can perform downgraded subscription to video streams based on specific priorities, such as reducing video definition, unsubscribing from video streams, or delaying subscription to video streams, so as to avoid network congestion and ensure the smoothness and/or clarity of high-priority video.
  • when the electronic device determines that the network situation has improved, it can restore the unsubscribed video (that is, resume the subscription) or restore the definition of the downgraded video (that is, improve the definition), so as to protect, to the greatest extent, the smoothness and/or clarity of video playback in multiple windows.
  • the serial numbers of the above-mentioned processes do not imply an order of execution; the order of execution of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of this application.
  • the electronic device (such as the first device, the second device or the third device) includes corresponding hardware structures and/or software modules for performing various functions.
  • the present application can be implemented in the form of hardware or a combination of hardware and computer software in combination with the units and algorithm steps of each example described in the embodiments disclosed herein. Whether a certain function is executed by hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementation should not be regarded as exceeding the scope of the present application.
  • the embodiment of the present application can divide the functional modules of the electronic device (such as the first device, the second device or the third device), for example, each functional module can be divided corresponding to each function, or two or more functions can be integrated in one processing module.
  • the above-mentioned integrated modules can be implemented in the form of hardware or in the form of software function modules. It should be noted that the division of modules in the embodiment of the present application is schematic, and is only a logical function division, and there may be other division methods in actual implementation.
  • FIG. 17 is a structural block diagram of an electronic device provided in the embodiment of the present application.
  • the electronic device may be a first device, a second device or a third device.
  • the electronic device may include a transceiver unit 1710 , a processing unit 1720 and a storage unit 1730 .
  • the transceiver unit 1710 is configured to support the second device to receive audio and video streams from multiple first devices.
  • the transceiver unit 1710 is configured to support the second device to receive audio and video streams from multiple first devices forwarded by the third device.
  • the transceiver unit 1710 may also be configured to support the second device to receive bandwidth prediction values corresponding to multiple audio and video streams from the third device.
  • the transceiver unit 1710 may also be used to support the second device to subscribe to the audio and video stream from the first device, and/or other processes related to the embodiment of the present application.
  • the processing unit 1720 is configured to support the second device to obtain a total bandwidth prediction value according to multiple bandwidth prediction results, determine that it is in a weak network environment according to the total bandwidth prediction value, and adjust a subscription strategy for audio and video streams in one or more windows. In some embodiments, the processing unit 1720 is further configured to support the second device to measure bandwidth prediction results corresponding to multiple audio and video streams, and/or other processes related to the embodiments of the present application.
  • the storage unit 1730 is used to store computer programs and the processing data and/or processing results involved in implementing the methods provided by the embodiments of the present application.
  • the transceiver unit 1710 may include a radio frequency circuit.
  • the electronic device can receive and send wireless signals through a radio frequency circuit.
  • radio frequency circuitry includes, but is not limited to, an antenna, at least one amplifier, transceiver, coupler, low noise amplifier, duplexer, and the like.
  • radio frequency circuits can also communicate with other devices through wireless communication.
  • the wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile Communications, General Packet Radio Service, Code Division Multiple Access, Wideband Code Division Multiple Access, Long Term Evolution, Email, Short Message Service, etc.
  • each module in the electronic device may be implemented in the form of software and/or hardware, which is not specifically limited. In other words, electronic equipment is presented in the form of functional modules.
  • the "module” here may refer to an application-specific integrated circuit ASIC, a circuit, a processor and memory executing one or more software or firmware programs, an integrated logic circuit, and/or other devices that can provide the above-mentioned functions.
  • the computer program product includes one or more computer instructions.
  • the computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave).
  • the computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or data center, that integrates one or more available media.
  • the available medium can be a magnetic medium (such as a floppy disk, a hard disk, or a magnetic tape), an optical medium (such as a digital video disk (DVD)), or a semiconductor medium (such as a solid state disk (SSD)), etc.
  • the steps of the methods or algorithms described in conjunction with the embodiments of the present application may be implemented in hardware, or may be implemented in a manner in which a processor executes software instructions.
  • the software instructions can be composed of corresponding software modules, and the software modules can be stored in RAM, flash memory, ROM, EPROM, EEPROM, registers, a hard disk, a removable hard disk, a CD-ROM or any other form of storage medium known in the art.
  • An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium.
  • the storage medium may also be a component of the processor.
  • the processor and storage medium can be located in the ASIC.
  • the ASIC may be located in the electronic device.
  • the processor and the storage medium can also exist in the electronic device as discrete components.

Abstract

The present application relates to the technical field of communications, and discloses a multi-window video communication method, device and system. In a multiparty video call process, bandwidth resources can be fully used to avoid network congestion, thereby ensuring the fluency and/or definition of a video. In the present application, multiple windows are displayed on an interface of a receiving end, and when bandwidth resources are insufficient to meet the bandwidth requirements of multiple audio/video streams (i.e., in a weak network environment), the receiving end can perform downgrade subscription according to specific priorities corresponding to the multiple windows, such as reducing the definition of a video, unsubscribing from a video stream or delaying the subscription to a video stream, such that network congestion is avoided, and the fluency and/or definition of a high-priority video is ensured.

Description

A multi-window video communication method, device and system
This application claims priority to Chinese patent application No. 202110887044.0, entitled "A multi-window video communication method, device and system", filed with the China National Intellectual Property Administration on August 3, 2021, the entire contents of which are incorporated herein by reference.
Technical field
The embodiments of the present application relate to the field of communication technologies, and in particular to a multi-window video communication method, device and system.
Background
With the spread of online learning, online meetings and online chat, the application scenarios of multi-window video communication are becoming increasingly diverse. For example, multi-window video communication is used, in the form of multi-party video calls, in professional conference scenarios (such as multi-party conferences), everyday scenarios (such as group video chats) and online education scenarios.
In multi-window video communication, as one implementation, the sending end may forward audio and video streams to the receiving end through the cloud side. The cloud side forwards the audio and video streams from the sending end to the receiving end, and performs bandwidth prediction on the communication links used to forward them. The bandwidth predicted by the cloud side is used to make communication decisions (such as frame-dropping decisions) when the network is poor.
It can be understood that, when the network is poor, the above conventional technology treats all communicating parties equally. However, different multi-window video communication scenarios have different priorities. For example, an online education scenario needs to guarantee the fluency of the courseware/whiteboard first and the lecturer's portrait video second; a multi-party conference scenario needs to guarantee the video and audio of the current speaker (such as the person with the highest voice volume) first and those of the other participants second. Therefore, with the above conventional technology, the receiving end cannot adapt its handling to specific needs when the network is poor, which causes important video to stutter on a weak network.
Summary of the invention
The present application provides a multi-window video communication method, device and system that can make full use of bandwidth resources during a multi-party video call to avoid network congestion, thereby ensuring the fluency and/or definition of the video.
To achieve the above purpose, the embodiments of the present application adopt the following technical solutions:
In a first aspect, a multi-window video communication method is provided. The method is applied during a video call between multiple first devices and a second device, and includes: the second device receives multiple audio and video streams from the multiple first devices respectively, where the audio and video corresponding to the multiple audio and video streams are played in multiple windows on the interface of the second device; the second device determines that bandwidth resources are insufficient to meet the bandwidth requirements of the multiple audio and video streams; and the second device adjusts the subscription policy for the audio and video streams in one or more windows according to the priorities corresponding to the multiple windows. For example, the adjustment of the subscription policy by the second device may include, but is not limited to, one or more of unsubscribing, resuming a subscription, delaying a subscription, reducing the definition, or increasing the definition.
In the solution provided by the first aspect, when bandwidth resources are insufficient to meet the bandwidth requirements of the multiple audio and video streams (that is, in a weak network environment), for example when the total bandwidth prediction value that characterizes the bandwidth resources is less than the bandwidth requirements of the multiple audio and video streams, the receiving end can downgrade its subscriptions according to the specific priorities of the multiple windows, for example by reducing the video definition, unsubscribing from a video stream, or delaying the subscription to a video stream. This avoids network congestion while ensuring the fluency and/or definition of high-priority video. The priority corresponding to a window may represent specific service requirements, user preferences, and so on.
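The priority-based downgrade described in the first aspect can be illustrated with a minimal Python sketch. The bitrate ladder, the `Window` type, the convention that a larger `priority` number is more important, and the "step down, then unsubscribe" order are all assumptions made for illustration; the application does not specify them.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical bitrate ladder in kbit/s; the source gives no concrete values.
LADDER_KBPS = {"1080p": 3000, "720p": 1500, "360p": 600}
ORDER = ["1080p", "720p", "360p"]  # highest to lowest definition


@dataclass
class Window:
    name: str
    priority: int               # assumed convention: larger = more important
    definition: Optional[str]   # None means the video stream is unsubscribed


def demand_kbps(windows):
    """Total bandwidth needed by the currently subscribed video streams."""
    return sum(LADDER_KBPS[w.definition] for w in windows if w.definition)


def downgrade(windows, total_predicted_kbps):
    """Lower, then drop, low-priority video until demand fits the prediction."""
    for w in sorted(windows, key=lambda w: w.priority):  # lowest priority first
        while w.definition and demand_kbps(windows) > total_predicted_kbps:
            idx = ORDER.index(w.definition)
            # Step down one rung; unsubscribe once already at the lowest rung.
            w.definition = ORDER[idx + 1] if idx + 1 < len(ORDER) else None
    return windows
```

In this sketch only the video subscription is changed; audio is left untouched, and a higher-priority window is downgraded only after every lower-priority window has been fully dropped. Other orderings are equally compatible with the text.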
In a possible implementation, the second device receiving the multiple audio and video streams from the multiple first devices includes: the second device receives the multiple audio and video streams, which originate from the multiple first devices respectively, as forwarded by a third device. As an example, the multi-window video communication method provided in this application is applicable to a network architecture in which a third device forwards audio and video streams, which improves the applicability of the method and its compatibility with different network architectures.
In a possible implementation, the method further includes: the second device receives, from the third device, multiple bandwidth prediction results for multiple links; and the second device determines, according to the multiple bandwidth prediction results, that bandwidth resources are insufficient to meet the bandwidth requirements of the multiple audio and video streams. The multiple links correspond to the multiple audio and video streams; for example, the multiple links are used to transmit the multiple audio and video streams respectively. As an example, in the multi-window video communication method provided in this application, bandwidth prediction may be performed by the third device, for example while it forwards audio and video streams to the second device. This improves the compatibility of the method with different network architectures.
In a possible implementation, the method further includes: the second device obtains, by measurement, multiple bandwidth prediction results for multiple links; and the second device determines, according to the multiple bandwidth prediction results, that bandwidth resources are insufficient to meet the bandwidth requirements of the multiple audio and video streams. The multiple links correspond to the multiple audio and video streams; for example, the multiple links are used to transmit the multiple audio and video streams respectively. As an example, in the multi-window video communication method provided in this application, bandwidth prediction may be performed by the second device. This improves the compatibility of the method with different network architectures.
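Whether the per-link predictions come from the third device or from the second device's own measurement, the insufficiency check can be sketched as follows. Summing the per-link predictions into one total, and the function and parameter names, are assumptions of this sketch; the application only states that a total bandwidth prediction value is compared with the bandwidth requirements of the streams.

```python
def bandwidth_insufficient(link_predictions_kbps, stream_demands_kbps):
    """Return True when the predicted total cannot carry all subscribed streams.

    Both arguments map a link/stream id to a value in kbit/s. Summing the
    per-link predictions into one total is an assumption of this sketch.
    """
    total_predicted = sum(link_predictions_kbps.values())
    total_demand = sum(stream_demands_kbps.values())
    return total_predicted < total_demand
```

A caller would typically run this check each time a new set of predictions arrives and trigger a subscription adjustment when it returns True.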
In a possible implementation, a first window plays its corresponding audio and video stream at a first definition, the first window being one of the multiple windows. The second device adjusting the subscription policy for the audio and video streams in one or more windows according to the priorities corresponding to the multiple windows includes: the second device subscribes to an audio and video stream of a second definition for the first window, where the second definition is lower than the first definition. In this application, in a weak network environment the receiving end can adjust the subscription policy of a low-priority window by reducing its video definition, which avoids network congestion while ensuring the fluency and/or definition of high-priority video.
In a possible implementation, after the second device subscribes to the audio and video stream of the second definition for the first window, the method further includes: when a first preset condition is met, the second device subscribes to the audio and video stream of the first definition for the first window. For example, the second device may monitor the total bandwidth prediction value to judge whether the first preset condition is met. For example, the first preset condition may be that the latest total bandwidth prediction value is sufficient for the first window to resume playing the audio and video stream at the first definition. In this application, the receiving end can adjust the subscription policy according to real-time bandwidth conditions. For example, when the bandwidth requirement for restoring the definition is met, the definition of the downgraded video is restored, so as to ensure the fluency and/or definition of the video to the greatest extent.
In a possible implementation, after the second device subscribes to the audio and video stream of the second definition for the first window, the method further includes: when the first preset condition has been met for a preset time period, the second device subscribes to the audio and video stream of the first definition for the first window. For example, the second device may monitor the total bandwidth prediction value to judge whether the first preset condition is met. For example, the first preset condition may be that the latest total bandwidth prediction value is sufficient for the first window to resume playing the audio and video stream at the first definition. In this application, the receiving end can adjust the subscription policy according to real-time bandwidth conditions. For example, when the bandwidth requirement for restoring the definition is met, the definition of the downgraded video is restored, so as to ensure the fluency and/or definition of the video to the greatest extent.
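Requiring the preset condition to hold for a preset time period before restoring the definition avoids flapping between definitions when the prediction fluctuates. A minimal sketch of such a gate is given here; the class name `RestoreGate`, the threshold in kbit/s and the use of caller-supplied timestamps are illustrative assumptions, not details from the application.

```python
class RestoreGate:
    """Signals a restore only after the bandwidth condition holds continuously."""

    def __init__(self, required_kbps, hold_seconds):
        self.required_kbps = required_kbps  # bandwidth needed after restoring
        self.hold_seconds = hold_seconds    # the preset time period
        self._since = None                  # when the condition started to hold

    def update(self, now, predicted_total_kbps):
        """Feed the latest total prediction; return True when restore is allowed."""
        if predicted_total_kbps < self.required_kbps:
            self._since = None              # condition broken, restart the timer
            return False
        if self._since is None:
            self._since = now
        return now - self._since >= self.hold_seconds
```

Setting `hold_seconds` to 0 reduces the gate to the simpler variant in which the subscription is restored as soon as the preset condition is met.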
In a possible implementation, a second window plays an audio and video stream at a second definition, the second definition is less than or equal to a preset value, and the second window is the lowest-priority window among the multiple windows. The second device adjusting the subscription policy for the audio and video streams in one or more windows according to the priorities corresponding to the multiple windows includes: the second device unsubscribes from the video stream corresponding to the second window. In this application, in a weak network environment the receiving end can adjust the subscription policy of a low-priority window by unsubscribing, which avoids network congestion while ensuring the fluency and/or definition of high-priority video.
In a possible implementation, after the second window is unsubscribed, the method further includes: the second device displays a mask layer over the second window. Displaying the mask layer reminds the user that the network is currently weak, which improves the user experience.
In a possible implementation, after the second device unsubscribes from the video stream corresponding to the second window, the method further includes: when a second preset condition is met, the second device resumes the subscription to the video stream of the second definition for the second window. For example, the second device may monitor the total bandwidth prediction value to judge whether the second preset condition is met. For example, the second preset condition may be that the latest total bandwidth prediction value is sufficient for the second window to resume playing the audio and video stream at the second definition. In this application, the receiving end can adjust the subscription policy according to real-time bandwidth conditions. For example, when the bandwidth requirement for resuming the subscription is met, the subscription to the audio and video stream corresponding to the unsubscribed window is resumed, so as to ensure the fluency and/or definition of the video to the greatest extent.
In a possible implementation, after the second device unsubscribes from the video stream corresponding to the second window, the method further includes: when the second preset condition has been met for a preset time period, the second device resumes the subscription to the video stream of the second definition for the second window. For example, the second device may monitor the total bandwidth prediction value to judge whether the second preset condition is met. For example, the second preset condition may be that the latest total bandwidth prediction value is sufficient for the second window to resume playing the audio and video stream at the second definition. In this application, the receiving end can adjust the subscription policy according to real-time bandwidth conditions. For example, when the bandwidth requirement for resuming the subscription is met, the subscription to the audio and video stream corresponding to the unsubscribed window is resumed, so as to ensure the fluency and/or definition of the video to the greatest extent.
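The unsubscribe, mask and resume sequence for the lowest-priority window can be sketched as a small state change on a window record. The dictionary keys, the definition names and the default preset value are hypothetical; the mask is represented only as a flag that a user interface would act on.

```python
DEFINITION_RANK = {"360p": 0, "720p": 1, "1080p": 2}  # hypothetical definitions


def unsubscribe_lowest(windows, preset="360p"):
    """Drop the lowest-priority window's video if it is at or below the preset."""
    target = min(windows, key=lambda w: w["priority"])
    video = target["video"]
    if video is None or DEFINITION_RANK[video] > DEFINITION_RANK[preset]:
        return None  # nothing to drop under this rule
    target["resume_definition"] = video  # remember what to resume later
    target["video"] = None               # video unsubscribed; audio untouched
    target["masked"] = True              # UI would draw a mask layer here
    return target


def resume(window):
    """Re-subscribe at the definition the window had before it was dropped."""
    window["video"] = window.pop("resume_definition")
    window["masked"] = False
    return window
```

When the lowest-priority window is still above the preset definition, the function returns `None`, which in this sketch signals that lowering the definition should be tried before unsubscribing.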
In a possible implementation, the priorities corresponding to the multiple windows are determined by the second device according to one or more of the following: the initial volume of the audio corresponding to the multiple windows; the playback volume of the audio corresponding to the multiple windows; and the function of the service in the multiple windows. The initial volume of the audio is the original volume of the audio stream when the second device receives it. This application supports diverse ways of setting window priorities. For example, the second device may itself determine the priorities corresponding to the multiple windows according to the initial or playback volume of the audio corresponding to the multiple windows and/or the function of the service in the multiple windows. Such diverse priority settings accommodate varied user operations and improve the user experience.
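One way to combine service function and audio volume into a priority order is sketched here. The service names, their weights, and the choice to compare service function first and use volume as a tie-breaker are assumptions of this sketch; the application leaves the combination open.

```python
# Hypothetical weights per service function; the source names no concrete values.
SERVICE_WEIGHT = {"screen_share": 2, "camera": 1}


def rank_windows(windows):
    """Order windows from highest to lowest priority.

    Service function is compared first and audio volume breaks ties; this
    ordering of the two criteria is an assumption of the sketch.
    """
    return sorted(
        windows,
        key=lambda w: (SERVICE_WEIGHT.get(w["service"], 0), w["volume"]),
        reverse=True,
    )
```

With this ordering a shared courseware window outranks every camera window, and among camera windows the loudest participant comes first, which matches the online education and conference examples in the background section.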
In a possible implementation, the priorities corresponding to the multiple windows are determined by the second device according to a user-defined designation operation. This application supports diverse ways of setting window priorities; for example, the priorities may be defined by the user. Such diverse priority settings accommodate varied user operations and improve the user experience.
In a possible implementation, the second device receiving the multiple audio and video streams, which originate from the multiple first devices respectively, as forwarded by the third device includes: the second device receives a first audio and video stream from a first cloud device; and the second device receives a second audio and video stream and a third audio and video stream from a second cloud device. As an example, the multi-window video communication method provided in this application is applicable to a network architecture in which distributed cloud devices (that is, the third device) forward audio and video streams, which improves the applicability of the method and its compatibility with different network architectures.
In a possible implementation, the third device is a selective forwarding unit (SFU). This solution improves the applicability of the method provided in this application and its compatibility with different network architectures.
In a second aspect, an electronic device (such as the second device) is provided. The electronic device includes: a transceiver unit, configured to receive multiple audio and video streams from multiple first devices respectively, where the audio and video corresponding to the multiple audio and video streams are played in multiple windows on the interface of the second device; a display unit, configured to play the multiple audio and video streams through the multiple windows; and a processing unit, configured to determine that bandwidth resources are insufficient to meet the bandwidth requirements of the multiple audio and video streams, and to adjust the subscription policy for the audio and video streams in one or more windows according to the priorities corresponding to the multiple windows. For example, the adjustment of the subscription policy by the second device may include, but is not limited to, one or more of unsubscribing, resuming a subscription, delaying a subscription, reducing the definition, or increasing the definition.
In the solution provided by the second aspect, when bandwidth resources are insufficient to meet the bandwidth requirements of the multiple audio and video streams (that is, in a weak network environment), for example when the total bandwidth prediction value that characterizes the bandwidth resources is less than the bandwidth requirements of the multiple audio and video streams, the receiving end can downgrade its subscriptions according to the specific priorities of the multiple windows, for example by reducing the video definition, unsubscribing from a video stream, or delaying the subscription to a video stream. This avoids network congestion while ensuring the fluency and/or definition of high-priority video. The priority corresponding to a window may represent specific service requirements, user preferences, and so on.
In a possible implementation, the transceiver unit is specifically configured to: receive the multiple audio and video streams, which originate from the multiple first devices respectively, as forwarded by a third device. As an example, the multi-window video communication method provided in this application is applicable to a network architecture in which a third device forwards audio and video streams, which improves the applicability of the method and its compatibility with different network architectures.
In a possible implementation, the transceiver unit is further configured to receive, from the third device, multiple bandwidth prediction results for multiple links; and the processing unit is further configured to determine, according to the multiple bandwidth prediction results, that bandwidth resources are insufficient to meet the bandwidth requirements of the multiple audio and video streams. The multiple links correspond to the multiple audio and video streams; for example, the multiple links are used to transmit the multiple audio and video streams respectively. As an example, in the multi-window video communication method provided in this application, bandwidth prediction may be performed by the third device, for example while it forwards audio and video streams to the second device. This improves the compatibility of the method with different network architectures.
In a possible implementation, the processing unit is further configured to: obtain, by measurement, multiple bandwidth prediction results for multiple links; and determine, according to the multiple bandwidth prediction results, that bandwidth resources are insufficient to meet the bandwidth requirements of the multiple audio and video streams. The multiple links correspond to the multiple audio and video streams; for example, the multiple links are used to transmit the multiple audio and video streams respectively. As an example, in the multi-window video communication method provided in this application, bandwidth prediction may be performed by the second device. This improves the compatibility of the method with different network architectures.
In a possible implementation, a first window plays its corresponding audio and video stream at a first definition, the first window being one of the multiple windows. The processing unit is specifically configured to: subscribe to an audio and video stream of a second definition for the first window, where the second definition is lower than the first definition. In this application, in a weak network environment the receiving end can adjust the subscription policy of a low-priority window by reducing its video definition, which avoids network congestion while ensuring the fluency and/or definition of high-priority video.
In a possible implementation, the processing unit is further configured to: when a first preset condition is met, subscribe to the audio and video stream of the first definition for the first window. For example, the processing unit may monitor the total bandwidth prediction value to judge whether the first preset condition is met. For example, the first preset condition may be that the latest total bandwidth prediction value is sufficient for the first window to resume playing the audio and video stream at the first definition. In this application, the receiving end can adjust the subscription policy according to real-time bandwidth conditions. For example, when the bandwidth requirement for restoring the definition is met, the definition of the downgraded video is restored, so as to ensure the fluency and/or definition of the video to the greatest extent.
In a possible implementation, the processing unit is further configured to: when the first preset condition has been met for a preset time period, subscribe to the audio and video stream of the first definition for the first window. For example, the processing unit may monitor the total bandwidth prediction value to judge whether the first preset condition is met. For example, the first preset condition may be that the latest total bandwidth prediction value is sufficient for the first window to resume playing the audio and video stream at the first definition. In this application, the receiving end can adjust the subscription policy according to real-time bandwidth conditions. For example, when the bandwidth requirement for restoring the definition is met, the definition of the downgraded video is restored, so as to ensure the fluency and/or definition of the video to the greatest extent.
In a possible implementation, a second window plays an audio and video stream at a second definition, the second definition is less than or equal to a preset value, and the second window is the lowest-priority window among the multiple windows. The processing unit is specifically configured to: unsubscribe from the video stream corresponding to the second window. In this application, in a weak network environment the receiving end can adjust the subscription policy of a low-priority window by unsubscribing, which avoids network congestion while ensuring the fluency and/or definition of high-priority video.
In a possible implementation, after the second window is unsubscribed, the display unit is further configured to: display a mask layer over the second window.
在一种可能的实现方式中,上述处理单元还用于:在满足第二预设条件时,为第二窗口恢复订阅第二清晰度的视频流。例如,处理单元可以通过进行总带宽预测值监测以判断是否满足第二预设条件。例如,第二预设条件可以是最新的总带宽预测值满足第二窗口恢复以第二清晰度播放音视频流。本申请中,接收端可以根据实时带宽情况进行订阅策略调整。例如在满足恢复订阅的带宽需求时,恢复对被取消订阅的窗口对应的音视频流的订阅,以最大限度的保证视频的流畅度和/或清晰度。In a possible implementation manner, the above-mentioned processing unit is further configured to: resume subscribing to the video stream of the second definition for the second window when the second preset condition is satisfied. For example, the processing unit may determine whether the second preset condition is satisfied by monitoring the total bandwidth prediction value. For example, the second preset condition may be that the latest total bandwidth prediction value satisfies the second window and resumes playing the audio and video stream at the second definition. In this application, the receiving end can adjust the subscription policy according to the real-time bandwidth situation. For example, when the bandwidth requirement for resuming subscription is met, the subscription to the audio and video stream corresponding to the unsubscribed window is resumed, so as to ensure the fluency and/or clarity of the video to the greatest extent.
在一种可能的实现方式中,上述处理单元还用于:在满足第二预设条件预设时间段时,处理单元为第二窗口恢复订阅第二清晰度的视频流。例如,处理单元可以通过进行总带宽预测值监测以判断是否满足第二预设条件。例如,第二预设条件可以是最新的总带宽预测值满足第二窗口恢复以第二清晰度播放音视频流。本申请中,接收端可以根据实时带宽情况进行订阅策略调整。例如在满足恢复订阅的带宽需求时,恢复对被取消订阅的窗口对应的音视频流的订阅,以最大限度的保证视频的流畅度和/或清晰度。In a possible implementation manner, the processing unit is further configured to: when the second preset condition is met for a preset time period, the processing unit resumes subscribing to the video stream of the second definition for the second window. For example, the processing unit may determine whether the second preset condition is satisfied by monitoring the total bandwidth prediction value. For example, the second preset condition may be that the latest total bandwidth prediction value satisfies the second window and resumes playing the audio and video stream at the second definition. In this application, the receiving end can adjust the subscription policy according to the real-time bandwidth situation. For example, when the bandwidth requirement for resuming subscription is met, the subscription to the audio and video stream corresponding to the unsubscribed window is resumed, so as to ensure the fluency and/or clarity of the video to the greatest extent.
In a possible implementation, the above processing unit is further configured to determine the priorities corresponding to the multiple windows according to one or more of the following: the initial volume of the audio corresponding to the multiple windows; the playback volume of the audio corresponding to the multiple windows; and the functions of the services in the multiple windows. The initial volume of the audio represents the original volume of the audio stream when the second device receives the audio stream. This application can support diversified window priority settings. For example, the second device may itself determine the priorities corresponding to the multiple windows according to the initial and/or playback volume of the audio corresponding to the multiple windows and/or the functions of the services in the multiple windows. Such diversified window priority settings allow varied user operation and improve user experience.
In a possible implementation, the above processing unit is further configured to determine the priorities corresponding to the multiple windows according to a user-defined specifying operation. This application can support diversified window priority settings. For example, the priorities may be defined by the user. Such diversified window priority settings allow varied user operation and improve user experience.
In a possible implementation, the above transceiving unit is specifically configured to: receive a first audio and video stream from a first cloud device; and receive a second audio and video stream and a third audio and video stream from a second cloud device. As an example, the multi-window video communication method provided in this application is applicable to a network architecture in which distributed cloud devices (that is, third devices) forward audio and video streams, which improves the applicability of the method and its compatibility with different network architectures.
In a possible implementation, the above third device is an SFU. This solution improves the applicability of the method provided in this application and its compatibility with different network architectures.
In a third aspect, an electronic device (such as a second device) is provided. The electronic device includes: a memory, configured to store a computer program; a transceiver, configured to receive or send radio signals; a display, configured to display an interface; and a processor, configured to execute the computer program, so that the electronic device: receives, through the transceiver, multiple audio and video streams respectively from multiple first devices, where the audio and video corresponding to the multiple audio and video streams are played in multiple windows on the interface of the second device, respectively; determines that bandwidth resources are insufficient to meet the bandwidth requirements of the multiple audio and video streams; and adjusts the subscription policy for the audio and video streams in one or more windows according to the priorities corresponding to the multiple windows. For example, the adjustment of the subscription policy by the second device may include, but is not limited to, one or more of canceling a subscription, resuming a subscription, delaying a subscription, lowering the definition, or raising the definition.
In the solution provided by the third aspect, when bandwidth resources are insufficient to meet the bandwidth requirements of the multiple audio and video streams (that is, in a weak network environment), for example when the total bandwidth prediction value used to represent the bandwidth resources is less than the bandwidth requirement of the multiple audio and video streams, the receiving end can downgrade its subscriptions according to the specific priorities corresponding to the multiple windows, for example by lowering the video definition, canceling the subscription to a video stream, or delaying the subscription to a video stream. This avoids network congestion while ensuring the smoothness and/or definition of high-priority video. The priority corresponding to a window may represent a specific service requirement, a user preference, or the like.
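The priority-based downgrade described above can be illustrated with a minimal sketch. It is not the implementation of this application: the definition ladder, the bitrate figures, the `Window` class, and the choice of `360p` as the "preset value" are all hypothetical assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical definition ladder and bitrates in kbit/s (illustrative assumptions,
# not values taken from the application).
BITRATE_KBPS: Dict[str, int] = {"1080p": 3000, "720p": 1500, "360p": 600}
LADDER: List[str] = ["1080p", "720p", "360p"]  # highest to lowest definition
MIN_DEFINITION = "360p"  # assumed "preset value": at this level the video is dropped


@dataclass
class Window:
    name: str
    priority: int              # larger value means higher priority
    definition: Optional[str]  # None means the video stream is unsubscribed


def demand(windows: List[Window]) -> int:
    """Total bandwidth demand of all currently subscribed video streams."""
    return sum(BITRATE_KBPS[w.definition] for w in windows if w.definition)


def downgrade(windows: List[Window], predicted_kbps: int) -> None:
    """Lower or drop the lowest-priority windows until demand fits the prediction."""
    while demand(windows) > predicted_kbps:
        candidates = [w for w in windows if w.definition is not None]
        if not candidates:
            break
        victim = min(candidates, key=lambda w: w.priority)
        if victim.definition == MIN_DEFINITION:
            victim.definition = None  # cancel the video subscription
        else:
            # Move one step down the definition ladder.
            victim.definition = LADDER[LADDER.index(victim.definition) + 1]
```

In this sketch each pass acts only on the lowest-priority window that is still subscribed, so higher-priority windows keep their definition for as long as possible.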
In a possible implementation, the above transceiver is specifically configured to receive the multiple audio and video streams that originate from the multiple first devices and are forwarded by a third device. As an example, the multi-window video communication method provided in this application is applicable to a network architecture in which a third device forwards audio and video streams, which improves the applicability of the method and its compatibility with different network architectures.
In a possible implementation, the above transceiver is further configured to receive, from the third device, multiple bandwidth prediction results for multiple links; and the above processor is further configured to determine, according to the multiple bandwidth prediction results, that bandwidth resources are insufficient to meet the bandwidth requirements of the multiple audio and video streams. The multiple links correspond to the multiple audio and video streams. For example, the multiple links are respectively used to transmit the multiple audio and video streams. As an example, in the multi-window video communication method provided in this application, bandwidth prediction may be performed by the third device. For example, the third device may perform bandwidth prediction while forwarding audio and video streams to the second device. This solution improves the compatibility of the method provided in this application with different network architectures.
In a possible implementation, the above processor is further configured to: obtain, by measurement, multiple bandwidth prediction results for multiple links; and determine, according to the multiple bandwidth prediction results, that bandwidth resources are insufficient to meet the bandwidth requirements of the multiple audio and video streams. The multiple links correspond to the multiple audio and video streams. For example, the multiple links are respectively used to transmit the multiple audio and video streams. As an example, in the multi-window video communication method provided in this application, bandwidth prediction may be performed by the second device. This solution improves the compatibility of the method provided in this application with different network architectures.
In a possible implementation, a first window plays the corresponding audio and video stream at a first definition, the first window being one of the above multiple windows; and the above processor is specifically configured to subscribe to an audio and video stream of a second definition for the first window, where the second definition is lower than the first definition. In this application, in a weak network environment, the receiving end can adjust the subscription policy by lowering the video definition of low-priority windows, so as to avoid network congestion while ensuring the smoothness and/or definition of high-priority video.
In a possible implementation, the above processor is further configured to: when a first preset condition is met, subscribe to the audio and video stream of the first definition for the first window. For example, the processor may monitor the total bandwidth prediction value to determine whether the first preset condition is met. For example, the first preset condition may be that the latest total bandwidth prediction value is sufficient for the first window to resume playing the audio and video stream at the first definition. In this application, the receiving end can adjust the subscription policy according to real-time bandwidth conditions. For example, when the bandwidth requirement for restoring the definition is met, the definition of the downgraded video is restored, so as to ensure the smoothness and/or definition of the video to the greatest extent.
In a possible implementation, the above processor is further configured to: when the first preset condition has been met for a preset time period, subscribe to the audio and video stream of the first definition for the first window. For example, the processor may monitor the total bandwidth prediction value to determine whether the first preset condition is met. For example, the first preset condition may be that the latest total bandwidth prediction value is sufficient for the first window to resume playing the audio and video stream at the first definition. In this application, the receiving end can adjust the subscription policy according to real-time bandwidth conditions. For example, when the bandwidth requirement for restoring the definition is met, the definition of the downgraded video is restored, so as to ensure the smoothness and/or definition of the video to the greatest extent.
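Requiring the preset condition to hold for a preset time period can be sketched as a small timer gate. The class name `RestoreGate`, its parameters, and the interpretation that the condition must hold continuously (with the timer restarting whenever it fails) are assumptions made for illustration; the application does not prescribe this structure.

```python
from typing import Optional


class RestoreGate:
    """Report True once the bandwidth condition has held for hold_s seconds."""

    def __init__(self, hold_s: float) -> None:
        self.hold_s = hold_s                 # assumed "preset time period"
        self._since: Optional[float] = None  # time at which the condition began to hold

    def update(self, now: float, predicted_kbps: float, required_kbps: float) -> bool:
        if predicted_kbps < required_kbps:
            self._since = None  # condition no longer met; restart the timer
            return False
        if self._since is None:
            self._since = now
        return now - self._since >= self.hold_s
```

A caller would invoke `update` each time a new total bandwidth prediction value arrives and restore the higher definition only when it returns `True`, which avoids switching back and forth when the prediction fluctuates.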
In a possible implementation, a second window plays an audio and video stream at a second definition, the second definition is less than or equal to a preset value, and the second window is the window with the lowest priority among the above multiple windows; and the above processor is specifically configured to cancel the subscription to the video stream corresponding to the second window. In this application, in a weak network environment, the receiving end can adjust the subscription policy by canceling the subscription for low-priority windows, so as to avoid network congestion while ensuring the smoothness and/or definition of high-priority video.
In a possible implementation, after the subscription for the second window is canceled, the above display is further configured to display a mask layer on the second window.
In a possible implementation, the above processor is further configured to: when a second preset condition is met, resume the subscription to the video stream of the second definition for the second window. For example, the processor may monitor the total bandwidth prediction value to determine whether the second preset condition is met. For example, the second preset condition may be that the latest total bandwidth prediction value is sufficient for the second window to resume playing the audio and video stream at the second definition. In this application, the receiving end can adjust the subscription policy according to real-time bandwidth conditions. For example, when the bandwidth requirement for resuming the subscription is met, the subscription to the audio and video stream corresponding to the unsubscribed window is resumed, so as to ensure the smoothness and/or definition of the video to the greatest extent.
In a possible implementation, the above processor is further configured to: when the second preset condition has been met for a preset time period, resume the subscription to the video stream of the second definition for the second window. For example, the processor may monitor the total bandwidth prediction value to determine whether the second preset condition is met. For example, the second preset condition may be that the latest total bandwidth prediction value is sufficient for the second window to resume playing the audio and video stream at the second definition. In this application, the receiving end can adjust the subscription policy according to real-time bandwidth conditions. For example, when the bandwidth requirement for resuming the subscription is met, the subscription to the audio and video stream corresponding to the unsubscribed window is resumed, so as to ensure the smoothness and/or definition of the video to the greatest extent.
In a possible implementation, the above processor is further configured to determine the priorities corresponding to the multiple windows according to one or more of the following: the initial volume of the audio corresponding to the multiple windows; the playback volume of the audio corresponding to the multiple windows; and the functions of the services in the multiple windows. The initial volume of the audio represents the original volume of the audio stream when the second device receives the audio stream. This application can support diversified window priority settings. For example, the second device may itself determine the priorities corresponding to the multiple windows according to the initial and/or playback volume of the audio corresponding to the multiple windows and/or the functions of the services in the multiple windows. Such diversified window priority settings allow varied user operation and improve user experience.
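One way the second device might derive window priorities from these inputs is sketched below. The ordering rule (service function first, then playback volume, then initial volume), the `FUNCTION_WEIGHT` table, and the dictionary keys are hypothetical assumptions; the application only names the inputs and does not specify how they are combined.

```python
from typing import Dict, List, Tuple

# Hypothetical weights per service function (illustrative assumption).
FUNCTION_WEIGHT: Dict[str, int] = {"screen_share": 2, "speaker": 1, "participant": 0}


def rank_windows(windows: List[dict]) -> List[str]:
    """Return window names ordered from highest to lowest priority."""

    def key(w: dict) -> Tuple[int, float, float]:
        # Compare by service function, then playback volume, then initial volume.
        return (
            FUNCTION_WEIGHT.get(w["function"], 0),
            w["playback_volume"],
            w["initial_volume"],
        )

    return [w["name"] for w in sorted(windows, key=key, reverse=True)]
```

A user-defined priority, where one exists, could simply replace the computed order in such a sketch.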
In a possible implementation, the above processor is further configured to determine the priorities corresponding to the multiple windows according to a user-defined specifying operation. This application can support diversified window priority settings. For example, the priorities may be defined by the user. Such diversified window priority settings allow varied user operation and improve user experience.
In a possible implementation, the above transceiver is specifically configured to: receive a first audio and video stream from a first cloud device; and receive a second audio and video stream and a third audio and video stream from a second cloud device. As an example, the multi-window video communication method provided in this application is applicable to a network architecture in which distributed cloud devices (that is, third devices) forward audio and video streams, which improves the applicability of the method and its compatibility with different network architectures.
In a possible implementation, the above third device is an SFU. This solution improves the applicability of the method provided in this application and its compatibility with different network architectures.
In a fourth aspect, a multi-window video communication method is provided. The method is applied in the process of a video call between multiple first devices and a second device in a communication system, and the method includes: the multiple first devices send audio and video streams to the second device, where the audio and video corresponding to the multiple audio and video streams are played in multiple windows on the interface of the second device, respectively; the second device determines that bandwidth resources are insufficient to meet the bandwidth requirements of the multiple audio and video streams; and the second device adjusts the subscription policy for the audio and video streams in one or more windows according to the priorities corresponding to the multiple windows. For example, the adjustment of the subscription policy by the second device may include, but is not limited to, one or more of canceling a subscription, resuming a subscription, delaying a subscription, lowering the definition, or raising the definition.
In the solution provided by the fourth aspect, when bandwidth resources are insufficient to meet the bandwidth requirements of the multiple audio and video streams (that is, in a weak network environment), for example when the total bandwidth prediction value used to represent the bandwidth resources is less than the bandwidth requirement of the multiple audio and video streams, the receiving end can downgrade its subscriptions according to the specific priorities corresponding to the multiple windows, for example by lowering the video definition, canceling the subscription to a video stream, or delaying the subscription to a video stream. This avoids network congestion while ensuring the smoothness and/or definition of high-priority video. The priority corresponding to a window may represent a specific service requirement, a user preference, or the like.
In a possible implementation, the above communication system further includes one or more third devices, configured to receive the multiple audio and video streams from the multiple first devices and forward the multiple audio and video streams to the second device. As an example, the multi-window video communication method provided in this application is applicable to a network architecture in which a third device forwards audio and video streams, which improves the applicability of the method and its compatibility with different network architectures.
In a possible implementation, the above third device is further configured to: in the process of forwarding the multiple audio and video streams to the second device, perform measurement to obtain multiple bandwidth prediction results for multiple links; and the second device is specifically configured to determine, according to the multiple bandwidth prediction results, that bandwidth resources are insufficient to meet the bandwidth requirements of the multiple audio and video streams. The multiple links correspond to the multiple audio and video streams. For example, the multiple links are respectively used to transmit the multiple audio and video streams. As an example, in the multi-window video communication method provided in this application, bandwidth prediction may be performed by the third device. For example, the third device may perform bandwidth prediction while forwarding audio and video streams to the second device. This solution improves the compatibility of the method provided in this application with different network architectures.
In a possible implementation, the second device is further configured to: in the process of receiving the multiple audio and video streams, perform measurement to obtain multiple bandwidth prediction results for multiple links; and the second device is specifically configured to determine, according to the multiple bandwidth prediction results, that bandwidth resources are insufficient to meet the bandwidth requirements of the multiple audio and video streams. The multiple links correspond to the multiple audio and video streams. For example, the multiple links are respectively used to transmit the multiple audio and video streams. As an example, in the multi-window video communication method provided in this application, bandwidth prediction may be performed by the second device. This solution improves the compatibility of the method provided in this application with different network architectures.
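The determination that bandwidth resources are insufficient can be sketched as a comparison between a total bandwidth prediction value and the total demand. Summing the per-link predictions to obtain the total is an assumption made here for illustration, as are the function name and the use of kbit/s; the application does not state how the per-link results are combined.

```python
from typing import Dict


def is_weak_network(
    link_predictions_kbps: Dict[str, float],
    stream_demand_kbps: Dict[str, float],
) -> bool:
    """Return True when the predicted total bandwidth is below the total demand."""
    # Assumption: the total prediction is the sum of the per-link predictions.
    total_predicted = sum(link_predictions_kbps.values())
    total_demand = sum(stream_demand_kbps.values())
    return total_predicted < total_demand
```

The same check applies whether the per-link predictions are measured by the second device or reported by a third device.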
In a possible implementation, the above second device is further configured to play the corresponding audio and video stream at a first definition through a first window, the first window being one of the above multiple windows; and the second device adjusting the subscription policy for the audio and video streams in one or more windows according to the priorities corresponding to the multiple windows includes: the second device subscribes to an audio and video stream of a second definition for the first window, where the second definition is lower than the first definition. In this application, in a weak network environment, the receiving end can adjust the subscription policy by lowering the video definition of low-priority windows, so as to avoid network congestion while ensuring the smoothness and/or definition of high-priority video.
In a possible implementation, after the second device subscribes to the audio and video stream of the second definition for the first window, the second device is further configured to: when a first preset condition is met, subscribe to the audio and video stream of the first definition for the first window. For example, the second device may monitor the total bandwidth prediction value to determine whether the first preset condition is met. For example, the first preset condition may be that the latest total bandwidth prediction value is sufficient for the first window to resume playing the audio and video stream at the first definition. In this application, the receiving end can adjust the subscription policy according to real-time bandwidth conditions. For example, when the bandwidth requirement for restoring the definition is met, the definition of the downgraded video is restored, so as to ensure the smoothness and/or definition of the video to the greatest extent.
In a possible implementation, after the second device subscribes to the audio and video stream of the second definition for the first window, the second device is further configured to: when the first preset condition has been met for a preset time period, subscribe to the audio and video stream of the first definition for the first window. For example, the second device may monitor the total bandwidth prediction value to determine whether the first preset condition is met. For example, the first preset condition may be that the latest total bandwidth prediction value is sufficient for the first window to resume playing the audio and video stream at the first definition. In this application, the receiving end can adjust the subscription policy according to real-time bandwidth conditions. For example, when the bandwidth requirement for restoring the definition is met, the definition of the downgraded video is restored, so as to ensure the smoothness and/or definition of the video to the greatest extent.
In a possible implementation, the above second device is further configured to play an audio and video stream at a second definition through a second window, where the second definition is less than or equal to a preset value and the second window is the window with the lowest priority among the above multiple windows; and the second device adjusting the subscription policy for the audio and video streams in one or more windows according to the priorities corresponding to the multiple windows includes: the second device cancels the subscription to the video stream corresponding to the second window. In this application, in a weak network environment, the receiving end can adjust the subscription policy by canceling the subscription for low-priority windows, so as to avoid network congestion while ensuring the smoothness and/or definition of high-priority video.
In a possible implementation, after the subscription for the second window is canceled, the above second device is further configured to display a mask layer on the second window. Displaying the mask layer reminds the user that the device is currently in a weak network environment, which improves user experience.
In a possible implementation, after the second device cancels the subscription to the video stream corresponding to the second window, the second device is further configured to: when a second preset condition is met, resume the subscription to the video stream of the second definition for the second window. For example, the second device may monitor the total bandwidth prediction value to determine whether the second preset condition is met. For example, the second preset condition may be that the latest total bandwidth prediction value is sufficient for the second window to resume playing the audio and video stream at the second definition. In this application, the receiving end can adjust the subscription policy according to real-time bandwidth conditions. For example, when the bandwidth requirement for resuming the subscription is met, the subscription to the audio and video stream corresponding to the unsubscribed window is resumed, so as to ensure the smoothness and/or definition of the video to the greatest extent.
In a possible implementation, after the second device cancels the subscription to the video stream corresponding to the second window, the second device is further configured to: when the second preset condition has been met for a preset time period, resume the subscription to the video stream of the second definition for the second window. For example, the second device may monitor the total bandwidth prediction value to determine whether the second preset condition is met. For example, the second preset condition may be that the latest total bandwidth prediction value is sufficient for the second window to resume playing the audio and video stream at the second definition. In this application, the receiving end can adjust the subscription policy according to real-time bandwidth conditions. For example, when the bandwidth requirement for resuming the subscription is met, the subscription to the audio and video stream corresponding to the unsubscribed window is resumed, so as to ensure the smoothness and/or definition of the video to the greatest extent.
In a possible implementation, the above second device is further configured to determine the priorities corresponding to the multiple windows according to one or more of the following: the initial volume of the audio corresponding to the multiple windows; the playback volume of the audio corresponding to the multiple windows; and the functions of the services in the multiple windows. The initial volume of the audio represents the original volume of the audio stream when the second device receives the audio stream. This application can support diversified window priority settings. For example, the second device may itself determine the priorities corresponding to the multiple windows according to the initial and/or playback volume of the audio corresponding to the multiple windows and/or the functions of the services in the multiple windows. Such diversified window priority settings allow varied user operation and improve user experience.
In a possible implementation, the above second device is further configured to determine the priorities corresponding to the multiple windows according to a user-defined specifying operation. This application can support diversified window priority settings. For example, the priorities may be defined by the user. Such diversified window priority settings allow varied user operation and improve user experience.
In a possible implementation, the above one or more third devices include a first cloud device and a second cloud device, where the first cloud device is configured to forward a first audio and video stream to the second device, and the second cloud device is configured to forward a second audio and video stream and a third audio and video stream to the second device. As an example, the multi-window video communication method provided in this application is applicable to a network architecture in which distributed cloud devices (that is, third devices) forward audio and video streams, which improves the applicability of the method and its compatibility with different network architectures.
在一种可能的实现方式中,上述第三设备是SFU。通过该方案,可以提高本申请提供的方法的适用度和与不同网络架构的兼容性。In a possible implementation manner, the foregoing third device is an SFU. Through this solution, the applicability of the method provided in this application and the compatibility with different network architectures can be improved.
第五方面,提供一种通信系统,该通信系统包括:多个第一设备和如第二方面或第三方面任一种可能的实现方式中的电子设备。其中,上述多个第一设备向第二设备发送音视频流,用于分别在第二设备界面上的多个窗口中播放。该通信系统用于实现如第四方面任一种可能的实现方式中的方法。In a fifth aspect, a communication system is provided, and the communication system includes: a plurality of first devices and the electronic device in any possible implementation manner of the second aspect or the third aspect. Wherein, the above-mentioned multiple first devices send audio and video streams to the second device for playing in multiple windows on the interface of the second device respectively. The communication system is used to implement the method in any possible implementation manner of the fourth aspect.
在一种可能的实现方式中,上述通信系统还包括:一个或多个第三设备,用于将来自上述多个第一设备的音视频流转发至电子设备。In a possible implementation manner, the foregoing communication system further includes: one or more third devices, configured to forward audio and video streams from the plurality of first devices to an electronic device.
第六方面,提供一种计算机可读存储介质,该计算机可读存储介质上存储有计算机程序代码,该计算机程序代码被处理器执行时,使得处理器实现如第一方面任一种可能的实现方式中的方法。According to a sixth aspect, a computer-readable storage medium is provided. Computer program code is stored on the computer-readable storage medium. When the computer program code is executed by a processor, the processor can realize any possible implementation of the first aspect. methods in methods.
第七方面,提供一种芯片系统,该芯片系统包括处理器、存储器,存储器中存储有计算机程序代码;所述计算机程序代码被所述处理器执行时,使得处理器实现如第 一方面任一种可能的实现方式中的方法。该芯片系统可以由芯片构成,也可以包含芯片和其他分立器件。In a seventh aspect, there is provided a chip system, the chip system includes a processor and a memory, and computer program code is stored in the memory; when the computer program code is executed by the processor, the processor implements any one of the first aspect. method in one possible implementation. The system-on-a-chip may consist of chips, or may include chips and other discrete devices.
第八方面,提供一种计算机程序产品,该计算机程序产品包括计算机指令。当该计算机指令在计算机上运行时,使得计算机实现如第一方面任一种可能的实现方式中的方法。In an eighth aspect, a computer program product is provided, the computer program product comprising computer instructions. When the computer instructions are run on the computer, the computer is made to implement the method in any possible implementation manner of the first aspect.
Description of Drawings
FIG. 1 is an example diagram of a selective forwarding unit (SFU) forwarding scheme according to an embodiment of this application;
FIG. 2 is a video communication interaction diagram based on SFU frame extraction according to an embodiment of this application;
FIG. 3 is a schematic diagram of layered coding according to an embodiment of this application;
FIG. 4 is an example diagram of a multi-window video communication scenario according to an embodiment of this application;
FIG. 5A is a schematic diagram of the hardware structure of an electronic device according to an embodiment of this application;
FIG. 5B is a software architecture diagram of an electronic device according to an embodiment of this application;
FIG. 6 is a schematic diagram of a call service architecture based on real-time communication (RTC) according to an embodiment of this application;
FIG. 7 is an architecture diagram of multi-window video communication according to an embodiment of this application;
FIG. 8 is a flowchart of a multi-window video communication method according to an embodiment of this application;
FIG. 9 is an interaction diagram of a multi-window video communication method according to an embodiment of this application;
FIG. 10 is an example diagram of three types of multi-window display according to an embodiment of this application;
FIG. 11 is an example diagram of a multi-window display according to an embodiment of this application;
FIG. 12 is an example diagram of displaying a mask layer under a weak network according to an embodiment of this application;
FIG. 13 is a flowchart of another multi-window video communication method according to an embodiment of this application;
FIG. 14 is an example diagram of displaying a weak-network prompt under a weak network according to an embodiment of this application;
FIG. 15 is a first example diagram of a method for setting the priorities corresponding to windows according to an embodiment of this application;
FIG. 16 is a second example diagram of a method for setting the priorities corresponding to windows according to an embodiment of this application;
FIG. 17 is a structural block diagram of an electronic device according to an embodiment of this application.
Detailed Description
The following describes the technical solutions in the embodiments of this application with reference to the accompanying drawings. In the description of the embodiments of this application, unless otherwise specified, "/" means "or"; for example, A/B may represent A or B. The term "and/or" herein merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may represent three cases: only A exists, both A and B exist, and only B exists. In addition, in the description of the embodiments of this application, "multiple" means two or more.
Hereinafter, the terms "first" and "second" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or implicitly specifying the number of the indicated technical features. A feature qualified by "first" or "second" may therefore explicitly or implicitly include one or more such features. In the description of the embodiments, unless otherwise specified, "multiple" means two or more.
An embodiment of this application provides a multi-window video communication method, which is applied when multiple users conduct a multi-party real-time video call.
As an example, the multi-window video communication method provided in this embodiment of this application may be applied to a multi-party conference scenario. For example, user A, user B, and user C join a conference and conduct a multi-party real-time video call, where user A is the conference speaker and user B and user C are conference participants.
As another example, the multi-window video communication method provided in this embodiment of this application may be applied to a group video chat scenario. For example, user A, user B, and user C join a group chat and conduct a multi-party real-time video call, during which user A, user B, and user C all speak freely.
As another example, the multi-window video communication method provided in this embodiment of this application may be applied to an online education scenario. For example, user A, user B, and user C join an online class group and conduct a multi-party real-time video call, where user A is the teacher and user B and user C are students. While lecturing, user A shows courseware, a whiteboard, or the like to user B and user C.
In multi-window video communication scenarios such as the foregoing multi-party conference, group video chat, or online education scenarios, as one implementation, the sending end (such as a first device) may forward audio and video streams to the receiving end (such as a second device) through a third device (such as a cloud device). For example, the third device may forward the audio and video streams based on the publish and subscribe relationships of the streams. The third device supports end-to-end encryption and does not need to parse the audio and video streams of the sending end. For example, the sending end and the receiving end hold keys (for example, a public key and a private key) that are unknown to the third device, so the third device cannot parse the user's audio and video streams, which provides high security.
For example, the sending end may forward audio and video streams to the receiving end through a selective forwarding unit (SFU). In some embodiments, multiple SFUs may be deployed in a distributed manner to improve network scalability. Refer to FIG. 1, which shows an example of an SFU forwarding scheme. As shown in FIG. 1, device 1 and device 2 select SFU 1 and SFU 2, respectively, to forward audio and video streams to device 4, and device 3 selects SFU 3 to send audio and video streams to device 4 and device 5.
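The publish and subscribe forwarding described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the implementation of this application: the `Sfu` class and its `subscribe` and `forward` methods are hypothetical names, and the payload is treated as opaque bytes to mirror the statement that the third device does not parse the encrypted stream.

```python
class Sfu:
    """Toy selective forwarding unit that relays opaque payloads to subscribers."""

    def __init__(self, name):
        self.name = name
        self._subscribers = {}  # publisher id -> list of receiver ids

    def subscribe(self, receiver, publisher):
        """Record that `receiver` wants the stream published by `publisher`."""
        self._subscribers.setdefault(publisher, []).append(receiver)

    def forward(self, publisher, payload):
        """Return (receiver, payload) pairs; the payload is never decoded here."""
        return [(r, payload) for r in self._subscribers.get(publisher, [])]


# Mirrors the FIG. 1 example: device 3 sends through SFU 3 to devices 4 and 5.
sfu3 = Sfu("SFU 3")
sfu3.subscribe("device 4", "device 3")
sfu3.subscribe("device 5", "device 3")
deliveries = sfu3.forward("device 3", b"\x00encrypted-frame")
```

Because the unit only copies bytes according to the subscription table, it needs no access to the keys held by the sending and receiving ends.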
In some embodiments, in the SFU forwarding scheme, the sending end may itself select the optimal SFU for forwarding audio and video streams. For example, the sending end may select the optimal SFU according to network conditions (such as the operator and region); this is not limited in this application.
Further, in the foregoing forwarding scheme, the third device (such as a cloud device) may also predict the bandwidth of the downlink over which the audio and video streams are forwarded, and send the prediction result to the receiving end so that the receiving end can make communication decisions when the network is poor.
Taking layered video coding as an example, the third device (such as an SFU) may make frame extraction decisions, that is, decisions to drop selected frames, when the network is poor. Frame extraction reduces downlink bandwidth consumption, avoids network congestion, and safeguards the fluency and/or clarity of audio and video in a weak network environment.
Refer to FIG. 2. Taking an SFU that performs bandwidth prediction through a bandwidth prediction module as an example, FIG. 2 shows a video communication interaction diagram based on SFU frame extraction. As shown in FIG. 2, the video communication process based on SFU frame extraction mainly includes the following five steps:
Step 1: The sending end collects, encodes, and encrypts the audio and video data;
Step 2: The sending end sends the frame data, frame type, and the like to the SFU;
Step 3: The forwarding module of the SFU forwards the frame data to the receiving end, and at the same time the bandwidth prediction module of the SFU predicts the downlink bandwidth;
Step 4: The SFU makes a frame extraction decision according to the bandwidth prediction value;
Step 5: After receiving the frame data, the receiving end decrypts, decodes, and renders it.
In some examples, bandwidth prediction by the bandwidth prediction module may include: the bandwidth prediction module calculates the downlink bandwidth prediction value from data such as the delay and packet loss of downlink data transmission. For the specific method of downlink bandwidth prediction, refer to conventional technologies; this is not limited in this application.
Taking layered coding as an example, as shown in FIG. 3, when consecutive moving image frames in a video stream are encoded, they can be compressed into three types of frames: instantaneous decoding refresh (IDR) frames, forward-predictive coded frames (P frames), and bidirectionally predictive interpolated coded frames (B frames). FIG. 3 uses a group of pictures (GOP) of 30 as an example, where a GOP is a group of consecutive pictures. In some embodiments, IDR-frame compression can achieve a ratio of 6:1 without any perceptible blurring. Using P-frame compression together with IDR-frame compression can achieve a higher compression ratio, still without perceptible blurring. B-frame compression can achieve a compression ratio of 200:1; the resulting size is generally 15% of the IDR-frame compressed size and less than half of the P-frame compressed size. IDR-frame compression removes spatial redundancy from the image, whereas P-frame and B-frame compression remove temporal redundancy.
IDR-frame compression uses full-frame compression coding; that is, the full-frame image information is subjected to Joint Photographic Experts Group (JPEG) compression coding. An IDR frame describes the details of the image background and the moving subject, so an IDR frame is also called a key frame. For this reason, an IDR frame can be decoded and rendered independently: when a video stream is decoded, the complete image can be reconstructed from the data of the IDR frame alone.
An IDR frame can serve as a reference for P frames and B frames. For example, after intra-frame prediction, residual determination, residual transformation and quantization, variable-length coding and arithmetic coding, image reconstruction, and filtering are performed in turn, the IDR frame can serve as a reference for P frames and B frames. The residual can be determined by subtracting the predicted value from the pixel value.
A P frame is a coded frame located 1 to 2 frames after an IDR frame. A P frame uses forward-predictive inter-frame coding, so it is predicted only with reference to the nearest preceding IDR frame or P frame. For example, a P frame uses motion compensation to predict the difference and the motion vector between the current frame and the nearest preceding IDR frame or P frame. When a P frame is decoded, the complete P-frame image can be reconstructed only after the prediction value in the IDR frame is summed with the prediction error. As shown in FIG. 3, a P frame is predicted from the IDR frame that precedes it. A P frame may serve as a reference frame for the P frame that follows it, or for the B frames before and after it.
A B frame uses bidirectional inter-frame coding and extracts data from the preceding and following IDR frames or P frames. A B frame is compressed based on the differences between the current frame and the preceding and following frames, and is decoded and rendered on that basis. For example, a B frame predicts the prediction error and motion vector between the current frame and the preceding IDR frame or P frame and the following P frame. As shown in FIG. 3, a B frame is predicted from the IDR frame and the P frame around it, or from the P frames before and after it. B frames are not referenced by other frames.
It can be understood that, in layered coding like that shown in FIG. 3, B frames are not referenced by other frames, so discarding B frames does not affect the reference relationships among video frames. Based on this, when the network is poor, the receiving end can decide whether to extract B frames according to the bandwidth prediction value and the bandwidth requirements of the various types of coded frames. Removing B frames reduces the downlink bandwidth load.
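The reason B frames can be discarded safely can be illustrated with a small sketch. The `Frame` class, the seven-frame excerpt, and the exact reference indices below are hypothetical and chosen only to be consistent with the reference relationships described above (P frames refer to the preceding IDR or P frame; B frames refer to the frames around them and are never referenced).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    index: int
    kind: str         # "IDR", "P", or "B"
    refs: tuple = ()  # indices of the frames this frame is predicted from


def drop_b_frames(gop):
    """Return the frame sequence with every B frame removed."""
    return [f for f in gop if f.kind != "B"]


def references_intact(frames):
    """True if every reference of every remaining frame is still present."""
    present = {f.index for f in frames}
    return all(r in present for f in frames for r in f.refs)


# Hypothetical excerpt of a GOP in display order: IDR B B P B B P
gop = [
    Frame(0, "IDR"),
    Frame(1, "B", (0, 3)),
    Frame(2, "B", (0, 3)),
    Frame(3, "P", (0,)),
    Frame(4, "B", (3, 6)),
    Frame(5, "B", (3, 6)),
    Frame(6, "P", (3,)),
]

kept = drop_b_frames(gop)                        # only IDR and P frames remain
without_p = [f for f in gop if f.index != 3]     # dropping a P frame instead
```

Removing the B frames leaves every remaining reference resolvable, whereas removing the P frame at index 3 would leave the frames that depend on it undecodable.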
As an example, in this embodiment of this application, the bandwidth requirements of the various types of coded frames (such as IDR frames, P frames, and B frames) can be obtained by collecting statistics on the original frame data from the sending end.
For example, taking the multi-window video communication scenario shown in FIG. 4, which includes a large window and small windows, Table 1 below shows an example of bandwidth requirements. The large window plays high-definition video at a resolution of 540P; the small windows play standard-definition video at a resolution of 360P.
Table 1
Figure PCTCN2022109423-appb-000001
As shown in Table 1, the video stream played in the large window in FIG. 4 requires a bandwidth of 700 kilobits per second (kbps) before frame extraction and 400 kbps after B frames are extracted. The video streams played in small window 1 and small window 2 in FIG. 4 each require 500 kbps before frame extraction and 300 kbps after B frames are extracted.
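The per-window figures above can be added up to obtain the total bandwidth needed by the three windows. The short sketch below does this arithmetic; the dictionary layout and the names `WINDOWS` and `total_required` are illustrative assumptions, while the kbps values are the ones stated for Table 1.

```python
# Per-window bandwidth needs in kbps, as stated for Table 1.
WINDOWS = {
    "large window": {"full": 700, "b_dropped": 400},
    "small window 1": {"full": 500, "b_dropped": 300},
    "small window 2": {"full": 500, "b_dropped": 300},
}


def total_required(windows, mode):
    """Sum the bandwidth needed by all windows in the given mode."""
    return sum(needs[mode] for needs in windows.values())


max_required = total_required(WINDOWS, "full")       # no frame extraction
min_required = total_required(WINDOWS, "b_dropped")  # B frames extracted
```

The two totals, 1700 kbps and 1000 kbps, are the maximum and minimum bandwidth required by the three windows in this example.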
In the embodiments of this application, the bandwidth required by an audio and video stream is affected by factors such as the codec, the bit rate control of the uplink bandwidth, the number of layers in layered coding, and the resolution and frame rate of the video. The embodiments of this application do not limit the specific calculation strategy or algorithm for the bandwidth required by an audio and video stream.
Refer to Table 2 below, which shows an example of an SFU frame extraction strategy.
Table 2
Figure PCTCN2022109423-appb-000002
As shown in Table 2, if the predicted bandwidth satisfies the bandwidth requirements of IDR frames, P frames, and B frames, that is, bandwidth prediction value ≥ bandwidth required by IDR frames + bandwidth required by P frames + bandwidth required by B frames, the receiving end decides not to extract frames.
If the predicted bandwidth satisfies the bandwidth requirements of IDR frames and P frames after B frames are extracted but does not satisfy the bandwidth requirements of IDR frames, P frames, and B frames together, that is, bandwidth required by IDR frames + bandwidth required by P frames ≤ bandwidth prediction value < bandwidth required by IDR frames + bandwidth required by P frames + bandwidth required by B frames, or if the predicted bandwidth does not satisfy the bandwidth requirements of IDR frames and P frames, that is, bandwidth prediction value < bandwidth required by IDR frames + bandwidth required by P frames, the receiving end decides to extract B frames.
It should be noted that, when bandwidth prediction value < bandwidth required by IDR frames + bandwidth required by P frames, the predicted bandwidth still cannot satisfy the bandwidth required after frame extraction even if B frames are extracted, so network congestion persists after frame extraction. Network congestion increases delay, which causes audio and video to freeze.
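The three cases described above can be written as a small decision function. This is a minimal sketch of the conventional strategy as described in the text; the function name, parameter names, and return format are assumptions made for illustration, and the example values of 1700 kbps and 1000 kbps are the totals that follow from the Table 1 figures.

```python
def frame_extraction_decision(predicted_kbps, full_kbps, b_dropped_kbps):
    """Decide whether to extract B frames for a given bandwidth prediction.

    full_kbps:      bandwidth required by IDR + P + B frames
    b_dropped_kbps: bandwidth required by IDR + P frames only
    Returns (action, still_congested).
    """
    if predicted_kbps >= full_kbps:
        return ("no extraction", False)
    # Below the full requirement, B frames are extracted in both remaining
    # cases; congestion persists only if even IDR + P frames do not fit.
    still_congested = predicted_kbps < b_dropped_kbps
    return ("extract B frames", still_congested)


# Example with the three-window totals derived from Table 1.
enough = frame_extraction_decision(1700, 1700, 1000)
partial = frame_extraction_decision(1200, 1700, 1000)
too_low = frame_extraction_decision(900, 1700, 1000)
```

Note that the decision depends only on the aggregate bandwidth, so every stream is treated the same way regardless of its importance.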
That is, taking the bandwidth requirements in Table 1 as an example, if the SFU frame extraction strategy in Table 2 is used, then as shown in Table 3: when the bandwidth prediction value ≥ 1700 kbps, the predicted bandwidth can satisfy the maximum bandwidth required by the three windows (700 + 500 + 500 kbps), and the receiving end decides not to extract frames. When 1000 kbps ≤ bandwidth prediction value < 1700 kbps, the predicted bandwidth does not satisfy the maximum bandwidth required by the three windows but can satisfy their minimum required bandwidth (400 + 300 + 300 kbps), and the receiving end decides to extract B frames. When the bandwidth prediction value < 1000 kbps, the predicted bandwidth cannot satisfy the minimum bandwidth required by the three windows; in this case, even if B frames are extracted, the predicted bandwidth still cannot satisfy the bandwidth required after frame extraction, so audio and video in the large window, small window 1, and small window 2 shown in FIG. 4 still freeze after frame extraction.
Table 3
Figure PCTCN2022109423-appb-000003
Based on the foregoing example, it can be understood that, when the network is poor, conventional technologies treat all communicating parties equally. However, different multi-window video communication scenarios have different priorities. For example, a multi-party conference scenario needs to safeguard first the video and audio of the current speaker (such as the person with the highest voice volume, or the user corresponding to the large window shown in FIG. 4) and then those of the other conference participants. An online education scenario needs to safeguard first the fluency and/or clarity of the courseware or whiteboard and then the portrait picture of the lecturer. With the foregoing conventional technologies, the receiving end cannot adapt its processing to such specific requirements when the network is poor, so important video freezes in a weak network environment.
Taking Table 3 as an example, when 1000 kbps ≤ bandwidth prediction value < 1700 kbps, the receiving end decides to extract B frames for the large window, small window 1, and small window 2 shown in FIG. 4, so the video fluency in all three windows is reduced. As another example, when the bandwidth prediction value < 1000 kbps, the receiving end decides to extract B frames for the large window, small window 1, and small window 2 shown in FIG. 4, but because the predicted bandwidth still cannot satisfy the minimum bandwidth required by the three windows after frame extraction, network congestion exists in all three windows. Network congestion increases delay, so audio and video freeze in the large window, small window 1, and small window 2 shown in FIG. 4.
To solve the foregoing problem, an embodiment of this application provides a multi-window video communication method. In this method, in a weak network environment, the receiving end can adjust the terminal data streams according to the specific service scenarios and requirements of the multiple video call users, so as to preferentially guarantee the normal playback of high-priority video streams. For example, in the scenario shown in FIG. 4, the fluency and/or clarity of the video in the large window is guaranteed first. As another example, in an online education scenario, the fluency and/or clarity of the courseware or whiteboard video is guaranteed first.
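To make the idea of priority-aware adjustment concrete, the following sketch contrasts it with treating all windows equally. It is purely an assumed illustration and not the procedure of this application: the `allocate` function, its strict priority rule, and the two quality levels are hypothetical, and the kbps values are those stated for Table 1.

```python
def allocate(predicted_kbps, windows):
    """Pick a quality level per window, favoring higher-priority windows.

    windows: list of (name, full_kbps, b_dropped_kbps), highest priority first.
    Returns a dict mapping each window name to "full" or "b_dropped".
    """
    # Start from the cheapest plan: B frames extracted for every window.
    plan = {name: "b_dropped" for name, _, _ in windows}
    spare = predicted_kbps - sum(reduced for _, _, reduced in windows)
    # Spend spare bandwidth on full streams in priority order.
    for name, full, reduced in windows:
        upgrade_cost = full - reduced
        if spare < upgrade_cost:
            break  # strict priority: never skip ahead to a lower-priority window
        plan[name] = "full"
        spare -= upgrade_cost
    return plan


# Windows from the Table 1 example, large window first (assumed priority order).
WINDOWS = [
    ("large window", 700, 400),
    ("small window 1", 500, 300),
    ("small window 2", 500, 300),
]
plan_1300 = allocate(1300, WINDOWS)
plan_1700 = allocate(1700, WINDOWS)
```

With a prediction of 1300 kbps, this sketch keeps the large window at full quality and extracts B frames only for the two small windows, whereas the equal-treatment strategy would degrade all three.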
其中,在本申请实施例中,弱网是指带宽资源不足以满足音视频流的带宽需求。Wherein, in the embodiment of the present application, a weak network means that bandwidth resources are insufficient to meet the bandwidth requirements of audio and video streams.
示例性的,在下行带宽受限或带宽波动较大的场景,例如高铁场景、远离无线局域网(wireless local area networks,WLAN)(如WiFi网络)热点的场景、偏远地区或者人群密集区的带宽受限场景等,采用本申请实施例提供的方法,基于具体优先级对视频流进行降级订阅,可以细化视频流控制的梯度,在充分利用下行带宽的同时,避免网络拥塞,从而最大程度地保障视频的流畅度和/或清晰度。Exemplarily, in a scene where the downlink bandwidth is limited or the bandwidth fluctuates greatly, such as a high-speed rail scene, a scene far away from a wireless local area network (wireless local area networks, WLAN) (such as a WiFi network) hotspot, a remote area or a densely populated area, the bandwidth is limited Limiting scenarios, etc., using the method provided by the embodiment of this application to downgrade and subscribe to video streams based on specific priorities can refine the gradient of video stream control, and avoid network congestion while making full use of downlink bandwidth, thereby ensuring maximum protection The smoothness and/or clarity of the video.
本申请实施例提供的多窗口视频通信方法可以应用但不限于智能手机、上网本、平板电脑、智能手表、智能手环、电话手表、智能相机、掌上电脑、个人计算机(personal computer,PC)、个人数字助理(personal digital assistant,PDA)、便携式多媒体播 放器(portable multimedia player,PMP)、增强现实(augmented reality,AR)/虚拟现实(virtual reality,VR)设备、电视机、投影设备或人机交互场景中的体感游戏机等。或者,该方法还可以应用于其他类型或结构的电子设备,本申请不限定。The multi-window video communication method provided by the embodiment of the present application can be applied to, but not limited to, smart phones, netbooks, tablet computers, smart watches, smart bracelets, phone watches, smart cameras, palmtop computers, personal computers (personal computers, PCs), personal Digital assistant (personal digital assistant, PDA), portable multimedia player (portable multimedia player, PMP), augmented reality (augmented reality, AR) / virtual reality (virtual reality, VR) equipment, TV, projection equipment or human-computer interaction Somatosensory game consoles in the scene, etc. Alternatively, the method may also be applied to electronic devices of other types or structures, which is not limited in this application.
请参考图5A,图5A以智能手机为例,示出了本申请实施例提供的一种电子设备的硬件结构示意图。如图5A所示,电子设备可以包括处理器510,存储器(包括外部存储器接口520和内部存储器521),通用串行总线(universal serial bus,USB)接口530,充电管理模块540,电源管理模块541,电池542,天线1,天线2,移动通信模块550,无线通信模块560,音频模块570,扬声器570A,受话器570B,麦克风570C,耳机接口570D,传感器模块580,按键590,马达591,指示器592,摄像头593,显示屏594,以及用户标识模块(subscriber identification module,SIM)卡接口595等。其中传感器模块580可以包括压力传感器,陀螺仪传感器,气压传感器,磁传感器,加速度传感器,距离传感器,接近光传感器,指纹传感器,温度传感器,触摸传感器,环境光传感器,骨传导传感器等。Please refer to FIG. 5A . FIG. 5A shows a schematic diagram of a hardware structure of an electronic device provided by an embodiment of the present application by taking a smart phone as an example. As shown in Figure 5A, the electronic device can include a processor 510, a memory (comprising an external memory interface 520 and an internal memory 521), a universal serial bus (universal serial bus, USB) interface 530, a charging management module 540, and a power management module 541 , battery 542, antenna 1, antenna 2, mobile communication module 550, wireless communication module 560, audio module 570, speaker 570A, receiver 570B, microphone 570C, earphone jack 570D, sensor module 580, button 590, motor 591, indicator 592 , a camera 593, a display screen 594, and a subscriber identification module (subscriber identification module, SIM) card interface 595, etc. The sensor module 580 may include a pressure sensor, a gyroscope sensor, an air pressure sensor, a magnetic sensor, an acceleration sensor, a distance sensor, a proximity light sensor, a fingerprint sensor, a temperature sensor, a touch sensor, an ambient light sensor, a bone conduction sensor, and the like.
It can be understood that the structure illustrated in this embodiment of the present invention does not constitute a specific limitation on the electronic device. In other embodiments of the present application, the electronic device may include more or fewer components than illustrated, combine some components, split some components, or arrange the components differently. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
The processor 510 may include one or more processing units. For example, the processor 510 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a flight controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). Different processing units may be independent devices or may be integrated into one or more processors.
A memory may also be provided in the processor 510 to store instructions and data. In some embodiments, the memory in the processor 510 is a cache. The cache may hold instructions or data that the processor 510 has just used or uses repeatedly. If the processor 510 needs the instructions or data again, it can fetch them directly from the cache. This avoids repeated accesses and reduces the waiting time of the processor 510, thereby improving system efficiency.
In some embodiments, the processor 510 may include one or more interfaces. The interfaces may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface.
The charging management module 540 is used to receive charging input from a charger. The power management module 541 is used to connect the battery 542 and the charging management module 540 to the processor 510. The power management module 541 receives input from the battery 542 and/or the charging management module 540 and supplies power to the processor 510, the internal memory 521, the display screen 594, the camera component 593, the wireless communication module 560, and the like.
The wireless communication function of the electronic device may be implemented by the antenna 1, the antenna 2, the mobile communication module 550, the wireless communication module 560, the modem processor, the baseband processor, and the like.
The antenna 1 and the antenna 2 are used to transmit and receive electromagnetic wave signals. Each antenna in the electronic device may cover one or more communication frequency bands. Different antennas may also be multiplexed to improve antenna utilization. For example, the antenna 1 may be multiplexed as a diversity antenna of a wireless local area network. In other embodiments, an antenna may be used in combination with a tuning switch.
The mobile communication module 550 may provide wireless communication solutions applied to the electronic device, including 2G/3G/4G/5G. The mobile communication module 550 may include at least one filter, a switch, a power amplifier, a low noise amplifier (LNA), and the like. The mobile communication module 550 may receive electromagnetic waves through the antenna 1, filter and amplify the received electromagnetic waves, and pass them to the modem processor for demodulation. The mobile communication module 550 may also amplify a signal modulated by the modem processor and radiate it as electromagnetic waves through the antenna 1. In some embodiments, at least some functional modules of the mobile communication module 550 may be provided in the processor 510. In some embodiments, at least some functional modules of the mobile communication module 550 and at least some modules of the processor 510 may be provided in the same device.
In the embodiments of the present application, the electronic device may communicate with a cloud device through the mobile communication module 550, for example to send audio and video streams.
The modem processor may include a modulator and a demodulator. The modulator is used to modulate a low-frequency baseband signal to be sent into a medium- or high-frequency signal. The demodulator is used to demodulate a received electromagnetic wave signal into a low-frequency baseband signal. The demodulator then passes the demodulated low-frequency baseband signal to the baseband processor for processing. After being processed by the baseband processor, the low-frequency baseband signal is passed to the application processor. The application processor outputs a sound signal through an audio device (not limited to the speaker 570A, the receiver 570B, etc.), or displays an image or video through the display screen 594. In some embodiments, the modem processor may be an independent device. In other embodiments, the modem processor may be independent of the processor 510 and provided in the same device as the mobile communication module 550 or other functional modules.
In the embodiments of the present application, the electronic device may play the audio corresponding to multiple windows through the audio device, and play the corresponding video in multiple windows on the display screen 594.
The wireless communication module 560 may provide wireless communication solutions applied to the electronic device, including wireless local area network (WLAN) (such as a WiFi network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), and infrared (IR). The wireless communication module 560 may be one or more devices integrating at least one communication processing module. The wireless communication module 560 receives electromagnetic waves via the antenna 2, frequency-modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 510. The wireless communication module 560 may also receive a signal to be sent from the processor 510, frequency-modulate and amplify it, and radiate it as electromagnetic waves through the antenna 2.
In some embodiments, the antenna 1 of the electronic device is coupled to the mobile communication module 550, and the antenna 2 is coupled to the wireless communication module 560, so that the electronic device can communicate with networks and other devices through wireless communication technologies. The wireless communication technologies may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technologies. The GNSS may include the global positioning system (GPS), the global navigation satellite system (GLONASS), the Beidou navigation satellite system (BDS), the quasi-zenith satellite system (QZSS), and/or satellite based augmentation systems (SBAS).
The electronic device implements the display function through the GPU, the display screen 594, the application processor, and the like. The GPU is a microprocessor for image processing and connects the display screen 594 to the application processor. The GPU performs mathematical and geometric calculations for graphics rendering. The processor 510 may include one or more GPUs that execute program instructions to generate or change display information.
The display screen 594 is used to display images, videos, and the like. The display screen 594 includes a display panel. In some embodiments, the electronic device may include 1 or N display screens 594, where N is a positive integer greater than 1.
In the embodiments of the present application, the electronic device may render the video in multiple windows through the GPU, and the display screen 594 is used to play the corresponding video in the multiple windows.
The electronic device may implement the shooting function through the ISP, the camera component 593, the video codec, the GPU, the display screen 594, the application processor, and the like.
The external memory interface 520 may be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device. The external memory card communicates with the processor 510 through the external memory interface 520 to implement a data storage function, for example saving music, video, and other files on the external memory card.
The internal memory 521 may be used to store computer-executable program code, which includes instructions. The internal memory 521 may include a program storage area and a data storage area. The program storage area may store an operating system and the application programs required by at least one function (such as a sound playback function or an image playback function). The data storage area may store data created during use of the electronic device (such as audio data and a phone book). In addition, the internal memory 521 may include a high-speed random access memory and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or universal flash storage (UFS). The processor 510 executes the various functional applications and data processing of the electronic device by running instructions stored in the internal memory 521 and/or instructions stored in the memory provided in the processor.
The electronic device may implement audio functions, such as music playback and recording, through the audio module 570, the speaker 570A, the receiver 570B, the microphone 570C, the application processor, and the like. For the specific working principles and functions of the audio module 570, the speaker 570A, the receiver 570B, and the microphone 570C, as well as those of the button 590, the motor 591, the indicator 592, and the SIM card interface 595, reference may be made to the descriptions in the conventional technology.
It should be noted that the hardware modules included in the electronic device shown in FIG. 5A are described only as examples and do not limit the specific structure of the electronic device. For example, the electronic device may also include other functional modules.
Taking an end-side device (such as a sending end or a receiving end) running an Android system with a layered architecture as an example, as shown in FIG. 5B, the software of the electronic device can be divided into several layers, each with a clear role and division of labor. The layers communicate with each other through software interfaces. As shown in FIG. 5B, the software structure of the electronic device can be divided into three layers from top to bottom:
the application program layer (application layer for short), the application framework layer (framework layer for short), and the system library, Android runtime, and kernel layer (also called the driver layer).
The application layer may include a series of application packages, such as camera, gallery, calendar, call, map, navigation, Bluetooth, music, video, and short message applications. For convenience of description, an application program is referred to below as an application. As shown in FIG. 5B, the application layer may also include an RTC software development kit (SDK) and a call SDK.
The call application is mainly used to implement the user interface (UI) and interaction logic. The call SDK is mainly used to connect to the communication cloud; provide capabilities such as account management, contact management, and signaling communication; capture, play, encode, and decode audio and video data; and connect to the RTC SDK to exchange media streams during a call. The RTC SDK is mainly responsible for interacting with the RTC cloud to provide the ability to send media streams (such as audio and video streams).
The application framework layer provides an application programming interface (API) and a programming framework for the applications in the application layer. As shown in FIG. 5B, the application framework layer may include a window manager service (WMS), an activity manager service (AMS), and an input manager service (IMS). In some embodiments, the application framework layer may also include a content provider, a view system, a phone manager, a resource manager, a notification manager, and the like (not shown in FIG. 5B).
The system library and the Android runtime contain the functions that the framework layer (FWK) needs to call, the Android core library, and the Android virtual machine. The system library may include multiple functional modules, for example a browser kernel, three-dimensional (3D) graphics, and a font library.
The system library may include multiple functional modules, for example a surface manager, media libraries, a three-dimensional graphics processing library (for example, OpenGL ES), and a 2D graphics engine (for example, SGL).
The kernel layer is the layer between hardware and software. The kernel layer may contain a display driver, input/output device drivers (for example, for a keyboard, touch screen, earphone, speaker, or microphone), device nodes, a camera driver, an audio driver, a sensor driver, and the like. When the user performs an input operation through an input device, the kernel layer can generate a corresponding raw input event according to the input operation and store it in a device node. The input/output device drivers can detect the user's input events, for example an operation in which the user sets a window's priority by dragging the window.
Further, as shown in FIG. 5B, the end-side device (such as the receiving end) also includes a weak network decision module and an encoding/decoding module. The weak network decision module is used to predict the overall bandwidth from the predicted bandwidth values, determine from the overall bandwidth and the bandwidth requirements of multiple audio and video streams whether the device is in a weak network environment, and perform weak network decision-making and processing when a weak network environment is determined.
The encoding/decoding module is responsible for decoding audio and video streams.
It should be noted that FIG. 5B shows the weak network decision module and the encoding/decoding module in the application framework layer only as an example. In fact, in the embodiments of the present application, the weak network decision module and the encoding/decoding module may be located in any software architecture layer of the end-side device (such as the receiving end). For example, they may also be located in the application layer, the kernel layer, or the system library of the end-side device (such as the receiving end).
As an example, the multi-window video communication method provided in the embodiments of the present application may be implemented on a call service architecture based on real time communication (RTC).
Please refer to FIG. 6, which shows a schematic diagram of an RTC-based call service architecture. As shown in FIG. 6, the end-side devices communicate through cloud devices. It should be noted that FIG. 6 takes an architecture with two end-side devices as an example, but the embodiments of the present application do not limit the specific number of end-side devices in multi-window video communication.
As shown in FIG. 6, the cloud devices include a communication cloud and an RTC cloud. The communication cloud includes an account server and a signaling server. The RTC cloud includes an RTC server and an RTC SFU.
The account server is mainly responsible for storing and maintaining account information, contact information, and push information. The signaling server is mainly responsible for forwarding call information and call control signaling. The RTC server is mainly responsible for room access authentication, SFU resource allocation, routing policy, and room operation/interaction. The RTC SFU is mainly responsible for maintaining the publish/subscribe relationships of media streams, forwarding media streams (such as audio and video streams), and network adaptability.
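The publish/subscribe bookkeeping attributed to the RTC SFU above can be pictured with a minimal in-memory sketch. This is only an illustration under stated assumptions: the class `SfuRelay`, its method names, and the string stream and receiver IDs are hypothetical and do not come from the application, and a real SFU forwards network packets rather than appending to lists.

```python
from collections import defaultdict


class SfuRelay:
    """Toy stand-in for the publish/subscribe bookkeeping of an SFU (hypothetical)."""

    def __init__(self):
        # stream_id -> set of receiver IDs currently subscribed to that stream
        self._subscribers = defaultdict(set)
        # receiver_id -> list of (stream_id, packet) pairs delivered so far
        self.delivered = defaultdict(list)

    def subscribe(self, receiver_id, stream_id):
        self._subscribers[stream_id].add(receiver_id)

    def unsubscribe(self, receiver_id, stream_id):
        self._subscribers[stream_id].discard(receiver_id)

    def forward(self, stream_id, packet):
        """Forward one packet unchanged to every current subscriber."""
        receivers = sorted(self._subscribers[stream_id])
        for receiver_id in receivers:
            self.delivered[receiver_id].append((stream_id, packet))
        return receivers
```

In this sketch, canceling a subscription simply removes the receiver from the set, so later packets of that stream are no longer delivered to it.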
An end-side device is, for example, a sending end (such as the first device) or a receiving end (such as the second device). As shown in FIG. 6, a call application, a call SDK, and an RTC SDK are installed on the end-side device.
The call application is mainly used to implement the UI and interaction logic. The call SDK is mainly used to connect to the communication cloud; provide capabilities such as account management, contact management, and signaling communication; capture, play, encode, and decode audio and video data; and connect to the RTC SDK to exchange media streams during a call. The RTC SDK is mainly responsible for interacting with the RTC cloud to provide the ability to send media streams (such as audio and video streams).
As shown in FIG. 6, the end-side device also includes an encoding/decoding module. Exemplarily, when the end-side device is a sending end, the encoding module is used to encode audio and video streams; when the end-side device is a receiving end, the decoding module is used to decode audio and video streams.
As an example, the RTC SFU in the call service architecture shown in FIG. 6 may have the SFU structure shown in FIG. 2. The RTC SFU may include a forwarding module and a bandwidth prediction module. In the embodiments of the present application, the forwarding module is used to forward audio and video streams, and the bandwidth prediction module is used to perform bandwidth prediction.
As shown in FIG. 6, to implement the multi-window video communication method provided by the embodiments of the present application, the end-side device (such as the receiving end) further includes a weak network decision module. The weak network decision module is used to predict the overall bandwidth (that is, to predict the bandwidth resources), determine from the overall bandwidth and the bandwidth requirements of multiple audio and video streams whether the device is in a weak network environment, and perform weak network decision-making and processing when a weak network environment is determined. The specific role of the weak network decision module is described in detail below.
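The weak network determination described above can be sketched as follows. This is a minimal sketch that assumes the overall bandwidth is the sum of the per-stream predicted values and that a weak network is declared when this sum is smaller than the summed bandwidth requirements; the function name `is_weak_network` and the dictionary layout are hypothetical, and the application may combine the values differently.

```python
def is_weak_network(predicted_kbps, required_kbps):
    """Compare the overall predicted bandwidth with the overall demand.

    predicted_kbps: stream ID -> bandwidth prediction value (kbps)
    required_kbps:  stream ID -> bandwidth requirement of that stream (kbps)
    Returns (weak, total_predicted, total_required).
    """
    # Assumption: the overall bandwidth is the plain sum of the predictions.
    total_predicted = sum(predicted_kbps.values())
    total_required = sum(required_kbps.values())
    return total_predicted < total_required, total_predicted, total_required
```

For instance, three predictions of 500 kbps each against requirements of 1200, 300, and 300 kbps give 1500 kbps available against 1800 kbps required, so this sketch would report a weak network.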
Taking the call service architecture shown in FIG. 6 as an example, please refer to FIG. 7, which shows a multi-window video communication architecture provided by an embodiment of the present application.
As shown in FIG. 7, RTC SFU 1 and RTC SFU 2 are deployed in a distributed manner to improve the scalability of the network. Each sending end shown in FIG. 7 may itself select the optimal RTC SFU for forwarding its audio and video stream, for example according to network conditions (such as operator and region). For example, as shown in FIG. 7, sending end A chooses to send audio and video stream 1 to receiving end D through RTC SFU 1; sending end B and sending end C choose to send audio and video stream 2 and audio and video stream 3, respectively, to receiving end D through RTC SFU 2.
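One way a sending end could pick an RTC SFU from network conditions is sketched below. The ranking rule is an assumption made for illustration (prefer the same operator, then the same region, then the lowest measured round-trip time); the application only names operator and region as example criteria, and `SfuCandidate`, `select_sfu`, and the `rtt_ms` field are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SfuCandidate:
    name: str
    operator: str
    region: str
    rtt_ms: float  # hypothetical measured round-trip time to this SFU


def select_sfu(candidates, operator, region):
    """Return the candidate that ranks best for this sender (illustrative rule)."""

    def rank(candidate):
        # False sorts before True, so a matching operator or region ranks first;
        # ties are broken by the lower round-trip time.
        return (
            candidate.operator != operator,
            candidate.region != region,
            candidate.rtt_ms,
        )

    return min(candidates, key=rank)
```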
The bandwidth prediction module of RTC SFU 1 shown in FIG. 7 performs bandwidth prediction according to the downlink data (audio and video stream 1 shown in FIG. 7) and sends the bandwidth prediction result to the weak network decision module of receiving end D. For example, the bandwidth prediction module calculates bandwidth prediction value 1 from data such as the delay and packet loss observed when sending audio and video stream 1 shown in FIG. 7. Similarly, the bandwidth prediction module of RTC SFU 2 shown in FIG. 7 calculates bandwidth prediction value 2 from data such as the delay and packet loss observed when sending audio and video stream 2, and calculates bandwidth prediction value 3 from data such as the delay and packet loss observed when sending audio and video stream 3.
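The application does not state how a prediction value is computed from delay and packet loss. The sketch below swaps in a simple rule-based update purely for illustration: the estimate is reduced under heavy loss or high delay, raised slightly when loss is low, and otherwise held. The function name, all thresholds, and the step sizes are assumptions, not values from the application.

```python
def update_bandwidth_estimate(
    current_kbps,
    loss_ratio,
    delay_ms,
    *,
    high_loss=0.10,      # assumed threshold above which the estimate is cut
    low_loss=0.02,       # assumed threshold below which the estimate may grow
    max_delay_ms=300.0,  # assumed delay above which the path counts as congested
    floor_kbps=50.0,     # assumed lower bound so the estimate never reaches zero
):
    """Return an updated downlink bandwidth estimate in kbps (illustrative rule)."""
    if not 0.0 <= loss_ratio <= 1.0:
        raise ValueError("loss_ratio must be between 0 and 1")
    if loss_ratio > high_loss:
        # Heavy loss: back off in proportion to the loss ratio.
        estimate = current_kbps * (1.0 - 0.5 * loss_ratio)
    elif delay_ms > max_delay_ms:
        # Queues are building up even though loss is moderate: back off gently.
        estimate = current_kbps * 0.85
    elif loss_ratio < low_loss:
        # Path looks healthy: probe upward by 5 %.
        estimate = current_kbps * 1.05
    else:
        estimate = current_kbps
    return max(estimate, floor_kbps)
```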
Receiving end D shown in FIG. 7 includes a decoding module and a weak network decision module. The decoding module of receiving end D is used to decode the received audio and video streams (audio and video stream 1, audio and video stream 2, and audio and video stream 3 shown in FIG. 7). As an example, the decoding module may include multiple decoders that share the work of decoding the audio and video streams. For example, the decoding module includes decoder A, decoder B, and decoder C, where decoder A decodes audio and video stream 1, decoder B decodes audio and video stream 2, and decoder C decodes audio and video stream 3.
The weak network decision module shown in FIG. 7 is used to perform overall bandwidth prediction (that is, bandwidth resource prediction) according to bandwidth prediction value 1 from the bandwidth prediction module of RTC SFU 1 and bandwidth prediction values 2 and 3 from the bandwidth prediction module of RTC SFU 2, to obtain a total bandwidth prediction value. Further, the weak network decision module shown in FIG. 7 is also used to make a weak network judgment according to the total bandwidth prediction value and the bandwidth requirements of the audio and video streams and, when a weak network environment is determined, to preferentially downgrade the subscription to the video in lower-priority windows, so as to ensure smooth playback of the video in higher-priority windows.
Exemplarily, the downgraded subscription that the weak network decision module applies to the video in a lower-priority window may include, but is not limited to, one or more of the following: canceling the subscription, resuming the subscription, delaying the subscription, reducing the definition, increasing the definition, and the like.
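A priority-ordered downgrade of this kind can be sketched as a small planning routine. The sketch covers only two of the listed actions (reducing the definition and canceling the subscription) and assumes each stream offers a ladder of bitrates; `WindowStream`, `plan_degradation`, the priority convention, and the bitrate values in the usage note are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WindowStream:
    window: str
    priority: int      # assumption: a larger value means a more important window
    ladder_kbps: list  # assumption: available bitrates, highest definition first
    level: int = 0     # index into ladder_kbps; len(ladder_kbps) means unsubscribed

    def demand_kbps(self):
        if self.level < len(self.ladder_kbps):
            return self.ladder_kbps[self.level]
        return 0


def plan_degradation(streams, total_predicted_kbps):
    """Downgrade the least important windows first until the total demand fits."""
    actions = []
    for stream in sorted(streams, key=lambda s: s.priority):
        while (
            sum(s.demand_kbps() for s in streams) > total_predicted_kbps
            and stream.level < len(stream.ladder_kbps)
        ):
            stream.level += 1
            if stream.level < len(stream.ladder_kbps):
                actions.append((stream.window, "reduce_definition"))
            else:
                actions.append((stream.window, "unsubscribe"))
    return actions
```

With a large window demanding 1200 kbps and two small windows demanding 300 kbps each (every stream also offering a half-rate version), a total prediction of 1650 kbps leads this sketch to reduce the definition of only the lowest-priority small window. The large window is left untouched.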
It should be noted that FIG. 7 above takes the case where the third device is an SFU only as an example; the embodiments of the present application do not limit the specific structure, function, or form of the third device. For example, the third device may also be an electronic device such as a smartphone. For example, the third device may form a peer-to-peer (P2P) network architecture with the first device and the second device, in which the first device, the second device, and the third device can each act as both a sending end and a receiving end. As an example, in a multi-window video communication scenario, the third device may also act as a forwarding device that forwards the audio and video stream from the first device to the second device.
In the following, the multi-window video communication method provided by the embodiments of the present application is described in detail with reference to the accompanying drawings, taking the multi-window video communication architecture shown in FIG. 7 as an example, that is, taking the case where multiple sending ends (the first devices) send audio and video streams to the receiving end (the second device) through the third device (such as an SFU), and taking the case where the display interface of the receiving end includes the large window, small window 1, and small window 2 shown in FIG. 4.
As shown in FIG. 8, the multi-window video communication method provided by the embodiments of the present application may include the following steps S801-S804:
S801. Multiple sending ends (the first devices) send audio and video streams to the third device (such as an SFU).
In addition to the audio and video information, each audio and video stream also carries the identification (ID) of the receiving end (such as the second device), which the third device (such as the SFU) uses to forward the audio and video stream to the receiving end. Further, in some embodiments, the ID of the receiving end carried in the audio and video stream is also used by the SFU to predict the downlink bandwidth of the corresponding downlink path.
In the embodiments of the present application, the audio and video stream also carries the ID of the sending end.
Taking the multi-window video communication architecture shown in FIG. 7 as an example, as shown in FIG. 9, step S801 may specifically include: sending end A sends audio and video stream 1 to RTC SFU 1, and sending end B and sending end C send audio and video stream 2 and audio and video stream 3, respectively, to RTC SFU 2. Audio and video stream 1, audio and video stream 2, and audio and video stream 3 each carry the ID of receiving end D. Further, audio and video stream 1 carries the ID of sending end A, audio and video stream 2 carries the ID of sending end B, and audio and video stream 3 carries the ID of sending end C.
As an example, the audio and video streams in the embodiments of the present application may also carry the following information: stream resolution information, frame rate information, the encoding codec, the stream level, and the like.
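The information carried alongside the audio and video data can be modeled as a small record. This is a sketch only: the field names, types, and sample values used with it are hypothetical, since the application lists the kinds of information but the concrete layout is not given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamInfo:
    sender_id: str      # ID of the sending end
    receiver_id: str    # ID of the receiving end, used by the SFU for forwarding
    width: int          # stream resolution (hypothetical representation)
    height: int
    frame_rate: int     # frames per second
    codec: str          # encoding codec name
    stream_level: str   # stream level label (hypothetical representation)


def streams_for_receiver(infos, receiver_id):
    """Return the sender IDs of all streams addressed to the given receiver."""
    return [info.sender_id for info in infos if info.receiver_id == receiver_id]
```

A forwarding device could use such a record to look up which streams are addressed to a given receiving end.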
示例性的,图9所示音视频流1、音视频流2和音视频流3中除了包括有音视频信息外,还携带有如以下表4所示信息:Exemplarily, audio-video stream 1, audio-video stream 2, and audio-video stream 3 shown in FIG. 9 not only include audio-video information, but also carry information as shown in Table 4 below:
表4Table 4
Figure PCTCN2022109423-appb-000004
Figure PCTCN2022109423-appb-000005
It should be noted that the embodiments of this application do not limit the order or timing in which the multiple sending ends send their audio-video streams. For example, sending end A, sending end B, and sending end C shown in FIG. 7 may send their audio-video streams simultaneously or in any order.
S802. The third device (such as an SFU) forwards the audio-video stream to the receiving end (such as the second device), predicts the downlink bandwidth, and sends the bandwidth prediction result to the receiving end.
Taking the multi-window video communication architecture shown in FIG. 7 as an example, as shown in FIG. 9, step S802 may specifically include: RTC SFU 1 forwards audio-video stream 1 to receiving end D, performs bandwidth prediction at the same time to obtain bandwidth prediction value 1, and sends bandwidth prediction value 1 to receiving end D; RTC SFU 2 forwards audio-video stream 2 to receiving end D, performs bandwidth prediction at the same time to obtain bandwidth prediction value 2, and sends bandwidth prediction value 2 to receiving end D; and RTC SFU 2 forwards audio-video stream 3 to receiving end D, performs bandwidth prediction at the same time to obtain bandwidth prediction value 3, and sends bandwidth prediction value 3 to receiving end D. For example, bandwidth prediction value 1, bandwidth prediction value 2, and bandwidth prediction value 3 shown in FIG. 9 are all 500 kbps.
For example, the third device (such as an SFU) may forward an audio-video stream to the receiving end according to the receiver ID carried in that stream.
For example, RTC SFU 1 shown in FIG. 9 may forward audio-video stream 1 to receiving end D according to the ID of receiving end D carried in audio-video stream 1 from sending end A; RTC SFU 2 shown in FIG. 9 may forward audio-video stream 2 to receiving end D according to the ID of receiving end D carried in audio-video stream 2 from sending end B; and RTC SFU 2 may forward audio-video stream 3 to receiving end D according to the ID of receiving end D carried in audio-video stream 3 from sending end C.
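The forwarding rule above can be illustrated with a minimal Python sketch. The `StreamPacket` type, its field names, and the `forward` helper are hypothetical names introduced only for illustration; the application does not specify a packet layout or an SFU API.

```python
from dataclasses import dataclass


@dataclass
class StreamPacket:
    """Hypothetical packet: IDs carried alongside the media payload."""
    sender_id: str
    receiver_id: str
    payload: bytes


def forward(packet, connections):
    """Queue the packet for the receiver named in the packet.

    `connections` maps a receiver ID to an outbound queue (a list here).
    Returns False when the SFU has no connection for that receiver.
    """
    queue = connections.get(packet.receiver_id)
    if queue is None:
        return False
    queue.append(packet)
    return True
```

In this sketch the sender ID is simply carried along, so the receiving end could tell which window a forwarded stream belongs to.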
Further, in some embodiments, the third device (such as an SFU) may predict the downlink bandwidth of the corresponding downlink path according to the receiver ID carried in the audio-video stream.
In the embodiments of this application, the audio and video corresponding to the audio-video streams that multiple sending ends deliver to the receiving end through the third device (such as an SFU) are played in separate windows on the display of the receiving end.
Refer to FIG. 10, which shows several examples of multi-window display. Parts (a) and (b) of FIG. 10 show the multi-window display interface of the receiving end in a multi-window video communication scenario (such as a multi-party conference or a group video chat). Part (c) of FIG. 10 shows the multi-window display interface of the receiving end in an online education scenario.
It should be noted that FIG. 10 shows only the windows relevant to this application. In some examples, the interface of the receiving end may also display function keys, a menu bar, navigation keys or a navigation bar, and the like, which is not limited in this application.
For example, assume that the display interface of receiving end D shown in FIG. 7 includes the large window, small window 1, and small window 2 shown in FIG. 4. Audio-video stream 1, audio-video stream 2, and audio-video stream 3, sent to receiving end D by sending end A, sending end B, and sending end C shown in FIG. 7, are then played in the large window, small window 1, and small window 2 of receiving end D, respectively.
Further, in some embodiments, the SFU may also send bandwidth requirements to the receiving end. For example, the bandwidths required by audio-video stream 1, audio-video stream 2, and audio-video stream 3 shown in FIG. 9 are 600 kbps, 500 kbps, and 500 kbps, respectively.
Further, in some embodiments, for example for a video encoding scheme that supports frame extraction, the third device (such as an SFU) may also send the frame extraction state and the corresponding bandwidth requirement to the receiving end. The frame extraction state indicates whether frames are extracted under the video encoding scheme. For example, the frame extraction states and corresponding bandwidth requirements of audio-video stream 1, audio-video stream 2, and audio-video stream 3 shown in FIG. 9 are listed in Table 5 below:
Table 5
Audio-video stream | Frame extraction state | Bandwidth requirement
Audio-video stream 1 | No frame extraction | 400 kbps
Audio-video stream 2 | Frame extraction | 300 kbps
Audio-video stream 3 | Frame extraction | 300 kbps
S803. The receiving end obtains a total bandwidth prediction value according to the bandwidth prediction results from one or more third devices (such as SFUs).
The total bandwidth prediction value indicates how much bandwidth is available: the larger the value, the more plentiful the bandwidth resources; the smaller the value, the scarcer they are.
Taking the multi-window video communication architecture shown in FIG. 7 as an example, as shown in FIG. 9, step S803 may specifically include: receiving end D obtains the total bandwidth prediction value according to bandwidth prediction value 1 from RTC SFU 1 and bandwidth prediction value 2 and bandwidth prediction value 3 from RTC SFU 2.
As an example, total bandwidth prediction value = bandwidth prediction value 1 + bandwidth prediction value 2 + bandwidth prediction value 3. For example, if bandwidth prediction value 1, bandwidth prediction value 2, and bandwidth prediction value 3 shown in FIG. 9 are all 500 kbps, the total bandwidth prediction value is 1500 kbps.
It should be noted that this application does not limit the specific algorithm by which the receiving end obtains the total bandwidth prediction value from the bandwidth prediction results of one or more SFUs; calculation methods in conventional technologies may be used.
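The summation example in step S803 can be sketched in Python as follows. The function name and the dictionary keys are illustrative assumptions, and plain summation is only the example rule given above, not a required algorithm.

```python
def total_bandwidth_kbps(predictions_kbps):
    """Sum the per-path downlink predictions reported by the SFUs.

    `predictions_kbps` maps a stream label to the predicted downlink
    bandwidth (in kbps) of the path that stream is forwarded on.
    """
    return sum(predictions_kbps.values())


# Values from the FIG. 9 example: three predictions of 500 kbps each.
fig9_predictions = {"stream 1": 500, "stream 2": 500, "stream 3": 500}
fig9_total = total_bandwidth_kbps(fig9_predictions)
```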
S804. When a weak network is determined according to the total bandwidth prediction value, the receiving end adjusts the subscription policy for the audio-video streams in one or more windows according to the priorities of the multiple windows.
Taking the multi-window video communication architecture shown in FIG. 7 as an example, as shown in FIG. 9, step S804 may specifically include: when a weak network is determined according to the total bandwidth prediction value, receiving end D adjusts the subscription policy for the audio-video streams in one or more of the large window, small window 1, and small window 2 according to the priorities of the large window, small window 1, and small window 2.
It can be understood that, in the embodiments of this application, the windows on the display of the receiving end may have a priority attribute so that the playback smoothness and/or definition of important video is guaranteed first.
The priority of a window indicates how important the video played in that window is to the user of the receiving end. For example, a relatively important window has a higher priority than a relatively minor one.
For example, on the display interface of the multi-window video communication scenario shown in FIG. 11, the large window plays high-definition video at a higher resolution (540P), and small window 1 and small window 2 play standard-definition video at a lower resolution (360P). The video played in the large window is therefore more important to the user than the video played in small window 1 and small window 2, and the large window shown in FIG. 11 has a higher priority than small window 1 and small window 2.
It should be noted that the above example merely uses 540P for high-definition video and 360P for standard-definition video; this application does not limit the specific resolution that corresponds to each definition level.
As another example, on the display interface of the multi-window video communication scenario shown in part (c) of FIG. 10, the courseware/whiteboard/screen-sharing window is the core of online education, while the lecturer's portrait is comparatively unimportant. The courseware/whiteboard/screen-sharing window therefore has a higher priority than the window showing the lecturer's portrait. The specific method for determining priorities is described below.
In the embodiments of this application, a weak network means that, in the current network environment, the bandwidth resources are insufficient to meet the bandwidth requirement of the audio-video streams. For example, a weak network means that the total bandwidth prediction value obtained by the receiving end (for example, 300 kbps to 1000 kbps) is less than the bandwidth required by the audio-video streams (for example, 1200 kbps).
It can be understood that an audio stream needs far fewer communication resources than a video stream. Therefore, in the embodiments of this application, whether the network is weak may also be determined based only on the bandwidth requirement of the video streams.
Taking the determination based on the bandwidth requirement of the video streams as an example, in some embodiments, for example for a video encoding scheme without frame extraction, a weak network means that the bandwidth resources in the current network environment are insufficient to meet the bandwidth requirement of the video streams before frame extraction.
For example, suppose the video stream for the large window displayed on receiving end D requires 700 kbps before frame extraction, and the video streams for small window 1 and small window 2 each require 500 kbps before frame extraction. The total bandwidth required by the video streams is then 700 kbps + 500 kbps + 500 kbps = 1700 kbps. If the total bandwidth prediction value is 1500 kbps, which is less than the total bandwidth required by the video streams, the receiving end can determine that it is currently in a weak network environment.
In other embodiments, for example for a video encoding scheme with frame extraction, a weak network means that the bandwidth resources in the current network environment are insufficient to meet the bandwidth requirement of the video streams after frame extraction.
For example, suppose the video stream for the large window displayed on receiving end D requires 400 kbps after frame extraction, and the video streams for small window 1 and small window 2 each require 300 kbps after frame extraction. The total bandwidth required by the video streams is then 400 kbps + 300 kbps + 300 kbps = 1000 kbps. If the total bandwidth prediction value is 900 kbps, which is less than the total bandwidth required by the video streams, the receiving end can determine that it is in a weak network environment.
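The two weak-network examples above reduce to the same comparison, sketched below in Python. The function name is a hypothetical choice; the only rule taken from the text is that the network is weak when the total bandwidth prediction value is less than the total required bandwidth.

```python
def is_weak_network(total_predicted_kbps, required_kbps):
    """Return True when the predicted total cannot cover the demand.

    `required_kbps` lists the per-window video bandwidth requirements,
    taken before or after frame extraction depending on the encoding.
    """
    return total_predicted_kbps < sum(required_kbps)


# Without frame extraction: 1500 kbps predicted vs 1700 kbps required.
weak_before_extraction = is_weak_network(1500, [700, 500, 500])
# With frame extraction: 900 kbps predicted vs 1000 kbps required.
weak_after_extraction = is_weak_network(900, [400, 300, 300])
```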
It should be noted that the embodiments of this application use layered coding as an example; the multi-window video communication method provided in the embodiments of this application is equally applicable to other video encoding schemes. Likewise, the embodiments use layered coding with IDR frames + P frames + B frames, or IDR frames + P frames, as an example; the method is equally applicable to other layered coding types.
As an example, in the embodiments of this application, adjusting the subscription policy at the receiving end may include, but is not limited to, one or more of the following: unsubscribing, resuming a subscription, delaying a subscription, lowering the definition, and raising the definition.
In some embodiments, when adjusting the subscription policy, the receiving end may adjust only the video streams and keep the audio streams playing normally, so that users can continue to communicate by voice and the user experience is preserved.
Taking adjustment of the subscription policy for video streams as an example: unsubscribing means cancelling the subscription to the corresponding video stream so that video is no longer displayed in the window. Resuming a subscription means subscribing to the corresponding video stream again so that video display in the window is restored. Delaying a subscription means postponing the subscription to the corresponding video stream so that video display in the window is postponed. Lowering the definition means subscribing to a lower-definition video stream for the window, for example switching from a high-definition video stream to a standard-definition video stream. Raising the definition means subscribing to a higher-definition video stream for the window, for example switching from a standard-definition video stream to a high-definition video stream.
The following takes a receiving end whose display interface includes m windows (m ≥ 3, m being an integer), all displaying high-definition video, as an example to illustrate how the receiving end unsubscribes, resumes a subscription, delays a subscription, lowers the definition, or raises the definition according to the priorities of the windows. It should be noted that the embodiments of this application do not limit the number of windows on the display interface of the receiving end; for example, the display interface may instead include 2 windows.
(1) Lowering the definition
For example, assume that the display interface of the receiving end includes windows $W_1, W_2, \ldots, W_m$ (where the subscript indicates the priority of the window): $W_1$ has the highest priority, $W_2$ the next highest, and $W_m$ the lowest. If the bandwidth resources are insufficient to keep windows $W_1, W_2, \ldots, W_m$ displaying at their current definition, but are sufficient to keep windows $W_1, W_2, \ldots, W_{m-1}$ at their current definition while window $W_m$ displays at a lower definition, the receiving end decides to subscribe to a lower-definition video stream for window $W_m$, so as to preserve the definition of the video in the higher-priority windows (such as $W_1, W_2, \ldots, W_{m-1}$). For example, the receiving end decides to switch the window from a subscription to an audio-video stream of a first definition to a subscription to an audio-video stream of a second definition, where the second definition is lower than the first definition.
For example, assume that windows $W_1, W_2, \ldots, W_m$ currently all display high-definition video (that is, video of the first definition). If $RW_{1/\mathrm{HD}} + RW_{2/\mathrm{HD}} + \cdots + RW_{m/\mathrm{HD}} > \text{total bandwidth prediction value} \geq RW_{1/\mathrm{HD}} + RW_{2/\mathrm{HD}} + \cdots + RW_{m-1/\mathrm{HD}} + RW_{m/\mathrm{SD}}$, the receiving end decides to lower the definition of the video in window $W_m$ so as to keep the video in windows $W_1, W_2, \ldots, W_{m-1}$ in high definition. That is, the receiving end decides to subscribe to a standard-definition video stream (that is, a video stream of the second definition) for window $W_m$. Here $RW_1, RW_2, \ldots, RW_m$ are the bandwidth requirements of windows $W_1, W_2, \ldots, W_m$, respectively, and the subscripts HD and SD denote the requirement at high definition and at standard definition. For example, the resolution of high-definition video may be 540P and the resolution of standard-definition video may be 360P.
In some embodiments, after the receiving end lowers the definition of the video in window $W_m$, if $RW_{1/\mathrm{HD}} + RW_{2/\mathrm{HD}} + \cdots + RW_{m-1/\mathrm{HD}} + RW_{m/\mathrm{SD}} > \text{total bandwidth prediction value} \geq RW_{1/\mathrm{HD}} + RW_{2/\mathrm{HD}} + \cdots + RW_{m-1/\mathrm{SD}} + RW_{m/\mathrm{SD}}$, the receiving end may further decide to subscribe to a standard-definition video stream (that is, a video stream of the second definition) for window $W_{m-1}$, thereby lowering the definition of the video in window $W_{m-1}$ so as to keep the video in windows $W_1, W_2, \ldots, W_{m-2}$ in high definition, and so on.
In other embodiments, after the receiving end lowers the definition of the video in window $W_m$, if $RW_{1/\mathrm{HD}} + RW_{2/\mathrm{HD}} + \cdots + RW_{m-1/\mathrm{HD}} + RW_{m/\mathrm{SD}} > \text{total bandwidth prediction value} \geq RW_{1/\mathrm{HD}} + RW_{2/\mathrm{HD}} + \cdots + RW_{m-1/\mathrm{SD}} + RW_{m/\mathrm{SD}}$, the receiving end may instead decide to cancel the video stream subscription for window $W_m$, thereby removing the video display in window $W_m$ so as to keep the video in windows $W_1, W_2, \ldots, W_{m-2}$ in high definition.
It should be noted that the above examples use only two definition levels, high definition and standard definition. This application does not limit how definition levels are set. For example, there may be three levels: ultra-high definition, high definition, and standard definition. In that case, in some embodiments, the definition may be lowered step by step in the order ultra-high definition → high definition → standard definition.
(2) Unsubscribing
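The priority-ordered lowering described in this subsection can be sketched in Python as below. This is a minimal sketch under stated assumptions: the `LADDER` list, the function name, and the per-level bandwidth figures are hypothetical, and the choice to lower the lowest-priority window all the way down the ladder before touching the next window is one possible reading of "and so on", not the only one.

```python
LADDER = ["HD", "SD"]  # definition levels, highest first (assumed two levels)


def lower_definitions(required, current, budget_kbps):
    """Lower definitions from the lowest-priority window upward.

    `required[i][level]` is the bandwidth (kbps) window i needs at `level`;
    index 0 is the highest-priority window. Returns the new level list.
    """
    new = list(current)

    def total():
        return sum(required[i][new[i]] for i in range(len(new)))

    for i in reversed(range(len(new))):        # lowest priority first
        while total() > budget_kbps:
            step = LADDER.index(new[i])
            if step == len(LADDER) - 1:        # already at the lowest level
                break
            new[i] = LADDER[step + 1]
        if total() <= budget_kbps:
            break
    return new


# Hypothetical per-level requirements for a large window and two small ones.
REQUIRED = [{"HD": 700, "SD": 400}, {"HD": 500, "SD": 300}, {"HD": 500, "SD": 300}]
```

Extending `LADDER` to three entries would give the step-by-step ultra-high definition → high definition → standard definition behaviour mentioned above. If even the lowest levels do not fit the budget, this sketch returns all windows at the lowest level and leaves unsubscribing to a separate decision.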
For example, assume that the display interface of the receiving end includes windows $W_1, W_2, \ldots, W_m$ (where the subscript indicates the priority of the window). If the bandwidth resources are insufficient to keep windows $W_1, W_2, \ldots, W_m$ displaying at their current definition, but are sufficient to keep windows $W_1, W_2, \ldots, W_{m-1}$ at their current definition while window $W_m$ displays no video, the receiving end decides to cancel the video stream subscription for the lowest-priority window $W_m$, thereby removing the video display in the lowest-priority window so that the other windows can display at the first definition.
For example, the receiving end decides to change from subscribing to an audio-video stream of the second definition for the lowest-priority window to cancelling the video stream subscription for that window, where the second definition is less than or equal to a preset value.
For example, assume that windows $W_1, W_2, \ldots, W_{m-1}$ currently all display high-definition video and window $W_m$ displays standard-definition video (that is, video of the second definition). If $RW_{1/\mathrm{HD}} + RW_{2/\mathrm{HD}} + \cdots + RW_{m-1/\mathrm{HD}} + RW_{m/\mathrm{SD}} > \text{total bandwidth prediction value}$, the receiving end decides to cancel the video stream subscription for window $W_m$, thereby removing the video display in window $W_m$ so as to keep the video in windows $W_1, W_2, \ldots, W_{m-1}$ in high definition. In some embodiments, after the receiving end cancels the video stream subscription for window $W_m$, if $RW_{1/\mathrm{HD}} + RW_{2/\mathrm{HD}} + \cdots + RW_{m-1/\mathrm{HD}} > \text{total bandwidth prediction value} \geq RW_{1/\mathrm{HD}} + RW_{2/\mathrm{HD}} + \cdots + RW_{m-2/\mathrm{HD}} + RW_{m-1/\mathrm{SD}}$, the receiving end may further decide to subscribe to a lower-definition (such as standard-definition) video stream for window $W_{m-1}$, thereby lowering the definition of the video in window $W_{m-1}$; alternatively, the receiving end may further decide to cancel the video stream subscription for window $W_{m-1}$, so as to keep the video in windows $W_1, W_2, \ldots, W_{m-2}$ in high definition, and so on.
In one implementation of the embodiments of this application, an unsubscribed window may display the last frame received before unsubscribing.
In another implementation, an unsubscribed window may display a mask layer. For example, FIG. 12 shows an example in which, in a weak network environment, small window 2, which has a lower priority (priority 2), displays a mask layer after being unsubscribed.
In yet another implementation, an unsubscribed window may display a mask layer superimposed on the last frame received before unsubscribing.
It should be noted that, in the embodiments of this application, video must remain displayed in at least one of windows $W_1, W_2, \ldots, W_m$ so that the third device (such as an SFU) can keep predicting the bandwidth of the downlink transmission. For example, the video in window $W_1$ must at least be displayed at the lowest definition.
It should be noted that this application does not limit the specific strategy that the receiving-end device uses to adjust its audio-video stream subscriptions in a weak network environment. If the bandwidth resources are insufficient for all windows to keep displaying their video streams at the first definition, but are sufficient for the other windows to display at the first definition once the video display in the lowest-priority window is cancelled, the receiving end may also decide to cancel the video stream subscription for the lowest-priority window, thereby removing the video display in that window so that the other windows can display at the first definition.
For example, assume that windows $W_1, W_2, \ldots, W_m$ currently all display high-definition video. If $RW_{1/\mathrm{HD}} + RW_{2/\mathrm{HD}} + \cdots + RW_{m/\mathrm{HD}} > \text{total bandwidth prediction value}$, the receiving end decides to cancel the video stream subscription for window $W_m$, thereby removing the video display in window $W_m$ so as to keep the video in windows $W_1, W_2, \ldots, W_{m-1}$ in high definition.
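The unsubscribe decision can be sketched in Python as follows. The function name and the bandwidth figures are hypothetical; the sketch assumes every window keeps its current definition and that the highest-priority window is never dropped, in line with the note that at least one window must keep displaying video.

```python
def windows_to_unsubscribe(required_kbps, budget_kbps):
    """Pick windows to unsubscribe, lowest priority first.

    `required_kbps[i]` is the bandwidth window i needs at its current
    definition; index 0 is the highest-priority window, which is kept.
    Returns the indices to unsubscribe, in the order they are dropped.
    """
    active = list(range(len(required_kbps)))
    dropped = []
    while len(active) > 1 and sum(required_kbps[i] for i in active) > budget_kbps:
        dropped.append(active.pop())           # remove the lowest priority
    return dropped
```

With hypothetical requirements of 700, 500, and 500 kbps and a budget of 1500 kbps, only the last window is dropped, because the remaining 1200 kbps fits.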
(3) Resuming a subscription
In some embodiments, if the receiving end determines, by monitoring the total bandwidth prediction value (that is, bandwidth resource monitoring), that the latest total bandwidth prediction value meets the bandwidth requirement of resuming the subscription for the highest-priority window among the unsubscribed windows while the other windows keep their current display (that is, the second preset condition), the receiving end decides to resume the video stream subscription for that window, so as to restore display of the corresponding video.
In other embodiments, if the receiving end determines, by monitoring the total bandwidth prediction value (that is, bandwidth resource monitoring), that throughout a preset time (such as 6 seconds) the latest total bandwidth prediction value meets the bandwidth requirement of resuming the subscription for the highest-priority window among the unsubscribed windows while the other windows keep their current display, the receiving end decides to resume the video stream subscription for that window, so as to restore display of the corresponding video.
For example, assume that windows $W_1, W_2, \ldots, W_{m-1}$ currently all display high-definition video (that is, video of the first definition) and window $W_m$ displays no video. If the receiving end determines, by monitoring the total bandwidth prediction value, that for 6 consecutive seconds $\text{total bandwidth prediction value} \geq RW_{1/\mathrm{HD}} + RW_{2/\mathrm{HD}} + \cdots + RW_{m-1/\mathrm{HD}} + RW_{m/\mathrm{SD}}$, the receiving end decides to resume the video stream subscription for window $W_m$, so as to restore the video display in window $W_m$. For example, the receiving end decides to resume the subscription for window $W_m$ at the second definition.
As another example, assume that windows $W_1, W_2, \ldots, W_{m-2}$ currently all display high-definition video (that is, video of the first definition) and windows $W_{m-1}$ and $W_m$ display no video. If the receiving end determines, by monitoring the total bandwidth prediction value, that for 6 consecutive seconds $RW_{1/\mathrm{HD}} + RW_{2/\mathrm{HD}} + \cdots + RW_{m-2/\mathrm{HD}} + RW_{m-1/\mathrm{SD}} + RW_{m/\mathrm{SD}} > \text{total bandwidth prediction value} \geq RW_{1/\mathrm{HD}} + RW_{2/\mathrm{HD}} + \cdots + RW_{m-2/\mathrm{HD}} + RW_{m-1/\mathrm{SD}}$, the receiving end decides to resume the video stream subscription for window $W_{m-1}$, so as to restore the video display in window $W_{m-1}$, while window $W_m$ remains unsubscribed.
(4) Delaying a subscription
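The "for 6 consecutive seconds" requirement in the examples above can be tracked with a small hold timer, sketched below in Python. The `HoldTimer` class and its method names are hypothetical, and timestamps are passed in by the caller rather than read from a clock, which is an assumption made to keep the sketch deterministic.

```python
class HoldTimer:
    """Report True once a condition has held continuously for a set time."""

    def __init__(self, hold_seconds=6.0):
        self.hold_seconds = hold_seconds
        self._since = None  # time at which the condition first became true

    def update(self, now, condition_met):
        """Feed one monitoring sample taken at time `now` (in seconds)."""
        if not condition_met:
            self._since = None                 # any failure restarts the hold
            return False
        if self._since is None:
            self._since = now
        return now - self._since >= self.hold_seconds
```

A receiver could call `update` on every bandwidth report, passing whether the latest total bandwidth prediction value covers the resumed window, and resume the subscription only when it returns True.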
Delaying a subscription means that, while a window with a higher priority than a given window is still unsubscribed, the subscription to the video stream for the given window is postponed, so that restoring the video display in the given window is postponed.
For example, assume that windows $W_1, W_2, \ldots, W_{m-2}$ currently all display high-definition video and windows $W_{m-1}$ and $W_m$ display no video. Window $W_m$ then remains unsubscribed until the subscription for window $W_{m-1}$ has been resumed.
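The interaction between resuming and delaying can be sketched in Python as below. The function name, the priority-keyed dictionary, and the bandwidth figures are hypothetical; the one rule taken from the text is that a lower-priority window waits while a higher-priority window is still unsubscribed.

```python
def window_to_resume(active_kbps, cancelled, budget_kbps):
    """Return the priority of the window to resume, or None.

    `active_kbps` is the bandwidth used by windows that are displaying.
    `cancelled` maps priority (smaller number = higher priority) to the
    bandwidth needed to resume that window. Only the highest-priority
    cancelled window is considered; lower-priority ones are delayed.
    """
    if not cancelled:
        return None
    top = min(cancelled)
    if active_kbps + cancelled[top] <= budget_kbps:
        return top
    return None
```

In the second test case below, the priority-3 window alone would fit within the budget, but it is still delayed because the priority-2 window has not been resumed.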
(5) Increasing the definition
In some embodiments, after the definition has been reduced, if the receiving end determines through monitoring of the total bandwidth prediction value (that is, bandwidth resource monitoring) that the latest total bandwidth prediction value meets the bandwidth requirement of displaying the highest-priority window among the windows whose definition has been reduced at an increased definition while the other windows keep their current display (that is, the first preset condition), the receiving end decides to increase the definition of the video in that highest-priority window, so as to safeguard the definition of the video in high-priority windows.
In other embodiments, after the definition has been reduced, if the receiving end determines through monitoring of the total bandwidth prediction value (that is, bandwidth resource monitoring) that, within a preset time (such as 6 seconds), the latest total bandwidth prediction value meets the bandwidth requirement of displaying the highest-priority window among the windows whose definition has been reduced at an increased definition while the other windows keep their current display, the receiving end decides to increase the definition of the video in that highest-priority window, so as to safeguard the definition of the video in high-priority windows.
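The "within a preset time (such as 6 seconds)" condition can be read as requiring the bandwidth condition to hold continuously for that long. The sketch below shows one way to track this; the class name `SustainedCondition` and the timestamp-based interface are assumptions made for illustration and are not taken from this application.

```python
from typing import Optional


class SustainedCondition:
    """Reports True once a condition has held continuously for hold_seconds."""

    def __init__(self, hold_seconds: float = 6.0) -> None:
        self.hold_seconds = hold_seconds
        self._since: Optional[float] = None  # start of the current satisfied run

    def update(self, now: float, satisfied: bool) -> bool:
        if not satisfied:
            self._since = None  # any failing sample restarts the timer
            return False
        if self._since is None:
            self._since = now
        return now - self._since >= self.hold_seconds


cond = SustainedCondition(hold_seconds=6.0)
for t, ok in [(0.0, True), (3.0, True), (6.0, True), (7.0, False), (8.0, True)]:
    print(t, cond.update(t, ok))
```

The caller would pass `satisfied=True` whenever the latest total bandwidth prediction value meets the relevant bandwidth requirement, and act only when `update` returns `True`.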
For example, assume that windows $W_1$, $W_2$, …, $W_{m-1}$ currently all display HD video (that is, video of the first definition) and window $W_m$ displays standard-definition (SD) video (that is, video of the second definition). If the receiving end, by monitoring the total bandwidth prediction value, determines that for 6 consecutive seconds the total bandwidth prediction value is greater than or equal to $R_{W_1/\mathrm{HD}} + R_{W_2/\mathrm{HD}} + \dots + R_{W_{m-1}/\mathrm{HD}}$, the receiving end decides to increase the definition of window $W_{m-1}$, for example by switching from SD display (that is, display at the second definition) to HD display (that is, display at the first definition).
As another example, assume that windows $W_1$, $W_2$, …, $W_{m-2}$ currently all display HD video (that is, video of the first definition) and windows $W_{m-1}$ and $W_m$ display SD video (that is, video of the second definition). If the receiving end, by monitoring the total bandwidth prediction value, determines that for 6 consecutive seconds $R_{W_1/\mathrm{HD}} + R_{W_2/\mathrm{HD}} + \dots + R_{W_{m-1}/\mathrm{HD}} + R_{W_m/\mathrm{HD}} >$ total bandwidth prediction value $\ge R_{W_1/\mathrm{HD}} + R_{W_2/\mathrm{HD}} + \dots + R_{W_{m-1}/\mathrm{HD}}$, the receiving end decides to increase the definition of window $W_{m-1}$, for example by switching from SD display (that is, display at the second definition) to HD display (that is, display at the first definition), while window $W_m$ continues to display SD video (that is, video of the second definition).
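The definition-increase decision can be sketched in the same way. The sketch assumes, following the first preset condition, that the required bandwidth is the HD requirement of the candidate window plus the current requirement of every other window; the `Window` class, the kbps figures, and the smaller-number-is-higher-priority convention are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Window:
    name: str
    priority: int    # assumption: a smaller number means a higher priority
    definition: str  # "HD" (first definition) or "SD" (second definition)
    hd_kbps: int     # bandwidth requirement at HD
    sd_kbps: int     # bandwidth requirement at SD

    def current_kbps(self) -> int:
        return self.hd_kbps if self.definition == "HD" else self.sd_kbps


def pick_window_to_upgrade(windows: List[Window],
                           total_pred_kbps: int) -> Optional[Window]:
    """Return the highest-priority downgraded window that can go back to HD."""
    downgraded = [w for w in windows if w.definition == "SD"]
    if not downgraded:
        return None
    candidate = min(downgraded, key=lambda w: w.priority)
    # Other windows keep their current display; the candidate needs HD.
    others = sum(w.current_kbps() for w in windows if w is not candidate)
    if total_pred_kbps >= others + candidate.hd_kbps:
        return candidate
    return None


windows = [
    Window("W1", 1, "HD", 1500, 500),
    Window("W2", 2, "HD", 1500, 500),
    Window("W3", 3, "SD", 1500, 500),
    Window("W4", 4, "SD", 1500, 500),
]
chosen = pick_window_to_upgrade(windows, total_pred_kbps=5200)
print(chosen.name if chosen else None)  # W3
```

Here 5200 kbps covers W3 at HD with W4 still at SD (5000 kbps) but not both at HD (6000 kbps), so only W3 is upgraded.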
It should be noted that (1)-(5) above are only several examples of adjusting the subscription policy. In some embodiments, the receiving end may adjust the subscription policy for the audio and video streams in multiple windows according to the priorities corresponding to those windows.
For example, the receiving end may decide to reduce the definition of the video in window $W_{m-1}$ (for example, switching from display at the first definition to display at the second definition) at the same time as it unsubscribes for window $W_m$.
As another example, the receiving end may decide to reduce the definition of the video in window $W_{m-1}$ at the same time as it reduces the definition of the video in window $W_m$, for example switching from display at the first definition to display at the second definition.
Similarly, when the network improves, the receiving end may decide to increase the definition of the video in window $W_{m-1}$ (for example, switching from display at the second definition to display at the first definition) at the same time as it restores window $W_m$ to display at the second definition.
As another example, when the network improves, the receiving end may decide to increase the definition of the video in window $W_{m-1}$ at the same time as it increases the definition of the video in window $W_m$, for example switching from display at the second definition to display at the first definition.
Further, after the receiving end determines the subscription policy for the audio and video streams in one or more windows, as shown in FIG. 13, the method provided by the embodiments of this application further includes step S805:
S805. The receiving end subscribes to audio and video streams from the multiple sending ends according to the latest subscription policy.
Taking the multi-window video communication architecture shown in FIG. 7 as an example, as shown in FIG. 9, the above step S801 may specifically include: receiving end D subscribes to audio and video streams from sending end A, sending end B, and sending end C according to the determined latest subscription policy.
For example, the receiving end may send the following information to the third device to request that the third device subscribe to the corresponding audio and video stream from the first device: the identifier (ID) of the sending end, the priority corresponding to the window, the original subscription parameter, the target subscription parameter, and the window identifier (Surface ID).
Exemplarily, assume that the receiving end determines to switch from subscribing to an HD video stream for window $W_m$ to subscribing to an SD video stream for window $W_m$, where the sending end of the audio and video stream corresponding to window $W_m$ is sending end A and the priority corresponding to window $W_m$ is 3. The receiving end may then send the following information to the third device to request that the third device subscribe to an audio and video stream with the corresponding parameters from the first device: the ID of sending end A, 3 (priority), HD (original subscription parameter), SD (target subscription parameter), and the ID of window $W_m$.
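The five items carried in the subscription request can be modeled as a small record. The sketch below is hypothetical: the text lists only which items are sent, so the field names, the JSON encoding, and the example identifier strings are assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SubscriptionRequest:
    sender_id: str       # ID of the sending end
    priority: int        # priority corresponding to the window
    original_param: str  # original subscription parameter, e.g. "HD"
    target_param: str    # target subscription parameter, e.g. "SD"
    surface_id: str      # window identifier (Surface ID)


def encode(request: SubscriptionRequest) -> str:
    """Serialize the request; JSON is an assumed wire format, not a specified one."""
    return json.dumps(asdict(request), sort_keys=True)


# The example from the text: window W_m, fed by sending end A, priority 3,
# switching from HD to SD.
request = SubscriptionRequest("sender-A", 3, "HD", "SD", "surface-Wm")
print(encode(request))
```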
It should be noted that the embodiments shown in FIG. 8 and FIG. 9 above only take as an example the case in which the third device (such as an SFU) predicts the downlink bandwidth and sends the bandwidth prediction results to the receiving end; this application does not limit which device is responsible for downlink bandwidth prediction. For example, in the embodiments of this application, the receiving end (that is, the second device) may also measure bandwidth prediction results for multiple links while receiving the multiple audio and video streams. The multiple links correspond to the multiple audio and video streams; for example, the multiple links are respectively used to transmit the multiple audio and video streams.
As another implementation, in the embodiments of this application, if a weak-network environment causes a window to have its definition reduced, its subscription canceled, or its subscription delayed, the display of the electronic device may show prompt information to tell the user that the current network is poor. For example, FIG. 14 shows an example in which, in a weak-network environment, a weak-network prompt is displayed after the lower-priority small window 2 (priority 2) is unsubscribed.
In the embodiments of this application, the priority corresponding to a window may be specified by the user or determined by the electronic device itself. Several methods for determining the priority corresponding to a window are described below by way of example with reference to the accompanying drawings:
(a) The priority corresponding to a window is specified by the user.
Exemplarily, as one implementation, the user may customize priorities by changing the size of a window, for example by enlarging a small window into a large window to raise the priority corresponding to that window. As shown in FIG. 15, window A, window B, window C, and window D in (a) of FIG. 15 all have priority 2. In response to operation 1401, in which the user stretches window A into a large window, the electronic device displays the large window D in the middle of the screen, as shown in (b) of FIG. 15, and sets the priority corresponding to window D to 1.
As another implementation, the user may customize priorities by dragging a window to change the ordering of the windows, for example by dragging a window to the middle of the screen to raise the priority corresponding to that window. As shown in FIG. 16, window B, window C, and window D in (a) of FIG. 16 all have priority 2. In response to operation 1501, in which the user drags window D to the middle of the screen, the electronic device displays window D in the middle of the screen, as shown in (b) of FIG. 16, and sets the priority corresponding to window D to 1.
It should be noted that the embodiments of this application do not limit the specific manner in which the user specifies the priority corresponding to a window. For example, the user may also set the priority of a window in a menu.
(b) The priority corresponding to a window is determined by the electronic device according to the volume of the audio corresponding to the multiple windows.
As one implementation, the electronic device may determine the priority corresponding to a window according to the initial volume of the audio corresponding to the multiple windows. The initial volume of the audio represents the original volume of the audio stream at the time the electronic device receives it.
It can be understood that when multiple users are in a multi-party real-time video call, the current speaker is usually relatively loud. For example, in a multi-party conference, the current presenter is usually relatively loud. Likewise, in a group video chat, the user who is currently speaking is usually relatively loud. Therefore, in the embodiments of this application, the electronic device may adaptively adjust the priority corresponding to a window according to the initial volume of the audio corresponding to the multiple windows.
As another implementation, the electronic device may determine the priority corresponding to a window according to the playback volume of the audio corresponding to the multiple windows.
It can be understood that when multiple users are in a multi-party real-time video call, the user at the receiving end may individually set the playback volume of the audio corresponding to different windows according to what the user cares about and is interested in. For example, the user may turn up the playback volume for the window the user cares about most and turn down the playback volume for a window the user does not care about. Based on this, in the embodiments of this application, the electronic device may adaptively adjust the priority corresponding to a window according to the playback volume of the audio corresponding to the multiple windows.
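A volume-based priority assignment could look like the sketch below. The text says only that priorities may be adjusted according to volume; ranking windows from loudest to quietest, using 1 as the highest priority, and breaking ties by window name are assumptions made for this example, and the same function could be fed either initial volumes or playback volumes.

```python
from typing import Dict


def priorities_from_volume(volumes: Dict[str, float]) -> Dict[str, int]:
    """Map each window to a priority rank; the loudest window gets priority 1."""
    # Sort by descending volume, then by name so that ties are deterministic.
    ordered = sorted(volumes, key=lambda name: (-volumes[name], name))
    return {name: rank for rank, name in enumerate(ordered, start=1)}


print(priorities_from_volume({"A": 0.2, "B": 0.9, "C": 0.5}))
# {'B': 1, 'C': 2, 'A': 3}
```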
(c) The priority corresponding to a window is determined by the electronic device according to the functions of the services in the multiple windows.
Taking the online education scenario shown in (c) of FIG. 10 as an example, the interface shown in (c) of FIG. 10 includes a courseware/whiteboard/screen-sharing window and a window containing the lecturer's portrait. The courseware/whiteboard/screen-sharing window is used to display courseware or a whiteboard or to show a shared interface, and the window containing the lecturer's portrait is used to play the lecturer's video in real time. It can be understood that the core function of online education is presenting the class content, and whether the lecturer's portrait is smooth and clear does not affect access to that content. Therefore, the priority corresponding to the courseware/whiteboard/screen-sharing window is higher than that of the window containing the lecturer's portrait.
It should be noted that determining the priority corresponding to a window according to the volume of the audio corresponding to the multiple windows, or according to the functions of the services in the multiple windows, serves only as two examples. The embodiments of this application do not limit the specific rules and methods by which the electronic device determines the priority corresponding to a window. For example, the priority may also be determined by the electronic device according to other factors, such as attributes of the video in the multiple windows.
In the method provided by the embodiments of this application, multiple windows may correspond to different priorities in different service scenarios or under different user requirements. When the downlink bandwidth is limited or fluctuates strongly, the electronic device may downgrade its video stream subscriptions based on the specific priorities, for example by reducing the video definition, unsubscribing from a video stream, or delaying subscription to a video stream, so as to avoid network congestion while safeguarding the smoothness and/or definition of high-priority video.
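One possible priority-based downgrade loop is sketched below. The application describes several combinations of reducing definition and unsubscribing, so the specific order used here (walk windows from lowest to highest priority; for each, first reduce HD to SD, then unsubscribe if that is still not enough) is an assumption, as are the `KBPS` figures, the `Window` class, and the smaller-number-is-higher-priority convention.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical per-stream bandwidth requirements in kbps.
KBPS = {"HD": 1500, "SD": 500, "OFF": 0}


@dataclass
class Window:
    name: str
    priority: int       # assumption: a smaller number means a higher priority
    state: str = "HD"   # "HD", "SD", or "OFF" (unsubscribed)


def total_demand(windows: List[Window]) -> int:
    return sum(KBPS[w.state] for w in windows)


def degrade(windows: List[Window], total_pred_kbps: int) -> List[Tuple[str, str]]:
    """Downgrade lowest-priority windows first until demand fits the prediction."""
    actions = []
    for w in sorted(windows, key=lambda w: w.priority, reverse=True):
        if total_demand(windows) <= total_pred_kbps:
            break
        if w.state == "HD":
            w.state = "SD"
            actions.append((w.name, "reduce to SD"))
        if total_demand(windows) > total_pred_kbps and w.state == "SD":
            w.state = "OFF"
            actions.append((w.name, "unsubscribe"))
    return actions


windows = [Window("W1", 1), Window("W2", 2), Window("W3", 3)]
print(degrade(windows, total_pred_kbps=3600))  # [('W3', 'reduce to SD')]
```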
Further, when the electronic device determines that the network condition has improved, it may restore video that was unsubscribed (that is, resume subscription) or restore the definition of video whose subscription was downgraded (that is, increase the definition), so as to safeguard the smoothness and/or definition of video playback in the multiple windows to the greatest extent.
It should be understood that the solutions of the embodiments of this application may be used in reasonable combination, and that explanations or descriptions of terms appearing in one embodiment may be referred to in the other embodiments; this is not limited.
It should also be understood that, in the various embodiments of this application, the sequence numbers of the above processes do not imply an order of execution. The order of execution of the processes should be determined by their functions and internal logic, and the sequence numbers should not constitute any limitation on the implementation of the embodiments of this application.
It can be understood that, to implement the functions of any of the above embodiments, the electronic device (such as the first device, the second device, or the third device) includes hardware structures and/or software modules corresponding to each function. Those skilled in the art will readily appreciate that, in combination with the units and algorithm steps of the examples described in the embodiments disclosed herein, this application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application and the design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this application.
In the embodiments of this application, the electronic device (such as the first device, the second device, or the third device) may be divided into functional modules. For example, one functional module may be provided for each function, or two or more functions may be integrated into one processing module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. It should be noted that the division into modules in the embodiments of this application is schematic and is merely a division by logical function; other divisions are possible in actual implementations.
For example, where the functional modules are divided in an integrated manner, FIG. 17 is a structural block diagram of an electronic device provided by an embodiment of this application. The electronic device may be, for example, the first device, the second device, or the third device. As shown in FIG. 17, the electronic device may include a transceiver unit 1710, a processing unit 1720, and a storage unit 1730.
When the electronic device is the second device, the transceiver unit 1710 is configured to support the second device in receiving audio and video streams from multiple first devices. For example, the transceiver unit 1710 is configured to support the second device in receiving audio and video streams that come from multiple first devices and are forwarded by the third device. In some embodiments, the transceiver unit 1710 may also be configured to support the second device in receiving, from the third device, the bandwidth prediction values corresponding to the multiple audio and video streams. Further, the transceiver unit 1710 may also be configured to support the second device in subscribing to audio and video streams from the first devices, and/or in other processes related to the embodiments of this application.
The processing unit 1720 is configured to support the second device in obtaining a total bandwidth prediction value from multiple bandwidth prediction results, determining from the total bandwidth prediction value that it is in a weak-network environment, and adjusting the subscription policy for the audio and video streams in one or more windows. In some embodiments, the processing unit 1720 is also configured to support the second device in measuring the bandwidth prediction results corresponding to the multiple audio and video streams, and/or in other processes related to the embodiments of this application.
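The first two duties of the processing unit can be illustrated with a short sketch. It assumes that the total bandwidth prediction value is the plain sum of the per-link predictions and that a weak-network environment means this total is below the summed bandwidth requirement of the subscribed streams; both rules, and all names and figures below, are assumptions for illustration rather than the method defined by this application.

```python
from typing import Dict


def total_bandwidth_prediction(per_link_kbps: Dict[str, int]) -> int:
    # Assumption: the total is the sum of the per-link bandwidth predictions.
    return sum(per_link_kbps.values())


def is_weak_network(per_link_kbps: Dict[str, int],
                    demand_kbps: Dict[str, int]) -> bool:
    # Assumption: weak network = predicted total cannot cover the total demand.
    return total_bandwidth_prediction(per_link_kbps) < sum(demand_kbps.values())


per_link = {"sender-A": 1200, "sender-B": 1000, "sender-C": 800}
demand = {"W1": 1500, "W2": 1500, "W3": 1500}
print(total_bandwidth_prediction(per_link), is_weak_network(per_link, demand))
# 3000 True
```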
The storage unit 1730 is configured to store computer programs and the processing data and/or processing results involved in implementing the method provided by the embodiments of this application.
It should be noted that the transceiver unit 1710 may include a radio frequency circuit. Specifically, the electronic device may receive and send wireless signals through the radio frequency circuit. Typically, a radio frequency circuit includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low-noise amplifier, and a duplexer. In addition, the radio frequency circuit may communicate with other devices through wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile Communications, General Packet Radio Service, Code Division Multiple Access, Wideband Code Division Multiple Access, Long Term Evolution, email, and Short Message Service.
It should be understood that each module in the electronic device may be implemented in software and/or hardware; this is not specifically limited. In other words, the electronic device is presented in the form of functional modules. A "module" here may refer to an application-specific integrated circuit (ASIC), a circuit, a processor and memory that execute one or more software or firmware programs, an integrated logic circuit, and/or another device that can provide the above functions.
In an optional manner, when data transmission is implemented using software, it may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of this application are realized in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a digital video disk (DVD)), or a semiconductor medium (for example, a solid state disk (SSD)).
The steps of the methods or algorithms described in connection with the embodiments of this application may be implemented in hardware or by a processor executing software instructions. The software instructions may consist of corresponding software modules, which may be stored in RAM, flash memory, ROM, EPROM, EEPROM, registers, a hard disk, a removable hard disk, a CD-ROM, or any other form of storage medium well known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. The storage medium may also be an integral part of the processor. The processor and the storage medium may reside in an ASIC, and the ASIC may reside in the electronic device. The processor and the storage medium may also exist in the electronic device as discrete components.
From the above description of the embodiments, those skilled in the art can clearly understand that, for convenience and brevity of description, only the above division into functional modules is used as an example. In practical applications, the above functions may be allocated to different functional modules as needed; that is, the internal structure of the apparatus may be divided into different functional modules to complete all or part of the functions described above.

Claims (20)

  1. A communication system, characterized in that the communication system comprises:
    a plurality of first devices, configured to: send a plurality of audio and video streams to the second device, wherein the audio and video corresponding to the plurality of audio and video streams are respectively played in a plurality of windows on an interface of the second device;
    a second device, configured to: determine that bandwidth resources are insufficient to meet the bandwidth requirements of the plurality of audio and video streams; and
    adjust, according to priorities corresponding to the plurality of windows, a subscription policy for the audio and video streams in one or more windows.
  2. The communication system according to claim 1, characterized in that the communication system further comprises: one or more third devices;
    the one or more third devices are configured to: receive the plurality of audio and video streams from the plurality of first devices; and
    forward the plurality of audio and video streams to the second device.
  3. The communication system according to claim 2, characterized in that the one or more third devices are further configured to: in the process of forwarding the plurality of audio and video streams from the plurality of first devices to the second device, measure and obtain a plurality of bandwidth prediction results for a plurality of links, the plurality of links corresponding to the plurality of audio and video streams;
    the second device is configured to: determine, according to the plurality of bandwidth prediction results, that the bandwidth resources are insufficient to meet the bandwidth requirements of the plurality of audio and video streams.
  4. The communication system according to claim 1, characterized in that the second device is further configured to: in the process of receiving the plurality of audio and video streams, measure and obtain a plurality of bandwidth prediction results for a plurality of links, the plurality of links corresponding to the plurality of audio and video streams;
    the second device is configured to: determine, according to the plurality of bandwidth prediction results, that the bandwidth resources are insufficient to meet the bandwidth requirements of the plurality of audio and video streams.
  5. The communication system according to any one of claims 1-4, characterized in that the second device is configured such that:
    the second device, according to the priorities corresponding to the plurality of windows, reduces the definition of the video streams corresponding to one or more windows and/or unsubscribes from the video streams corresponding to one or more windows.
  6. A multi-window video communication method, characterized in that the method comprises:
    the second device receiving a plurality of audio and video streams respectively from a plurality of first devices, wherein the audio and video corresponding to the plurality of audio and video streams are respectively played in a plurality of windows on an interface of the second device;
    the second device determining that bandwidth resources are insufficient to meet the bandwidth requirements of the plurality of audio and video streams; and
    the second device adjusting, according to priorities corresponding to the plurality of windows, a subscription policy for the audio and video streams in one or more windows.
  7. The method according to claim 6, characterized in that the second device receiving the plurality of audio and video streams respectively from the plurality of first devices comprises:
    the second device receiving the plurality of audio and video streams that come respectively from the plurality of first devices and are forwarded by a third device.
  8. The method according to claim 7, wherein the method further comprises:
    the second device receives, from the third device, multiple bandwidth prediction results for multiple links, the multiple links corresponding to the multiple audio and video streams;
    the second device determining that the bandwidth resources are insufficient to meet the bandwidth requirements of the multiple audio and video streams comprises:
    the second device determines, according to the multiple bandwidth prediction results, that the bandwidth resources are insufficient to meet the bandwidth requirements of the multiple audio and video streams.
  9. The method according to claim 6 or 7, wherein the method further comprises:
    the second device obtains, by measurement, bandwidth prediction results for multiple links, the multiple links corresponding to the multiple audio and video streams;
    the second device determining that the bandwidth resources are insufficient to meet the bandwidth requirements of the multiple audio and video streams comprises:
    the second device determines, according to the multiple bandwidth prediction results, that the bandwidth resources are insufficient to meet the bandwidth requirements of the multiple audio and video streams.
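Claims 8 and 9 differ only in where the per-link bandwidth predictions come from (the third device, or the second device's own measurement); the decision that follows is the same. The sketch below shows one possible decision rule. The claims do not specify the rule, so the per-link comparison, the function name `bandwidth_insufficient`, and the use of stream identifiers as dictionary keys are assumptions.

```python
from typing import Dict


def bandwidth_insufficient(predicted_kbps: Dict[str, float],
                           required_kbps: Dict[str, float]) -> bool:
    """Return True if any stream needs more than its link is predicted to carry.

    Both dictionaries are keyed by a hypothetical stream identifier, because
    each link corresponds to one audio and video stream.
    """
    # Assumption: a stream whose link has no prediction counts as unmet.
    return any(predicted_kbps.get(stream_id, 0.0) < need
               for stream_id, need in required_kbps.items())
```

A sum-based rule (total predicted bandwidth versus total demand) would be an equally plausible reading; the per-link form is used here only because the claims tie each link to one stream.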
  10. The method according to any one of claims 6-9, wherein a first window plays its corresponding audio and video stream at a first definition, the first window being the window with the lowest priority among the multiple windows;
    the second device adjusting, according to the priorities corresponding to the multiple windows, the subscription strategy for the audio and video streams in one or more windows comprises:
    the second device subscribes, for the first window, to an audio and video stream of a second definition, the second definition being lower than the first definition.
  11. The method according to claim 10, wherein after the second device subscribes, for the first window, to the audio and video stream of the second definition, the method further comprises:
    when a first preset condition is met, the second device subscribes, for the first window, to the audio and video stream of the first definition.
  12. The method according to any one of claims 6-9, wherein a second window plays an audio and video stream at a second definition, the second definition being less than or equal to a preset value, and the second window being the window with the lowest priority among the multiple windows; the second device adjusting, according to the priorities corresponding to the multiple windows, the subscription strategy for the audio and video streams in one or more windows comprises:
    the second device unsubscribes from the video stream corresponding to the second window.
  13. The method according to claim 12, wherein after the second device unsubscribes from the video stream corresponding to the second window, the method further comprises:
    the second device displays a mask layer on the second window.
  14. The method according to claim 12 or 13, wherein after the second device unsubscribes from the video stream corresponding to the second window, the method further comprises:
    when a second preset condition is met, the second device resumes, for the second window, the subscription to the video stream of the second definition.
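Claims 10-14 together describe a degrade-and-recover cycle for the lowest-priority window: lower the definition, then drop the video and show a mask layer once the definition is already at or below a preset value, and later undo those steps when a preset condition is met. The following sketch models that cycle for a single window. The definition ladder `LADDER`, the threshold `PRESET_VALUE`, the `WindowState` type, and the function names are hypothetical; the claims also leave the preset conditions unspecified, so the caller decides when to invoke `recover`.

```python
from dataclasses import dataclass, field
from typing import List

LADDER = (1080, 720, 360)  # hypothetical available definitions (vertical pixels)
PRESET_VALUE = 360         # hypothetical threshold from claim 12


@dataclass
class WindowState:
    definition: int                 # definition currently (or last) subscribed
    video_subscribed: bool = True
    masked: bool = False
    previous: List[int] = field(default_factory=list)  # definitions to restore


def degrade(w: WindowState) -> None:
    """Apply one degradation step to the lowest-priority window."""
    if not w.video_subscribed:
        return  # nothing left to give up
    lower = [d for d in LADDER if d < w.definition]
    if w.definition <= PRESET_VALUE or not lower:
        w.video_subscribed = False  # claim 12: unsubscribe from the video stream
        w.masked = True             # claim 13: show a mask layer on the window
    else:
        w.previous.append(w.definition)
        w.definition = max(lower)   # claim 10: subscribe to a lower definition


def recover(w: WindowState) -> None:
    """Undo one degradation step once a preset condition is met."""
    if not w.video_subscribed:
        w.video_subscribed = True   # claim 14: resume at the same definition
        w.masked = False
    elif w.previous:
        w.definition = w.previous.pop()  # claim 11: return to the earlier definition
```

Keeping a small history of earlier definitions lets each `recover` call reverse exactly one earlier `degrade` call, which is one simple way to guarantee that the window ends up back at its first definition.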
  15. The method according to any one of claims 6-14, wherein the priorities corresponding to the multiple windows are determined by the second device according to one or more of the following:
    the initial volume of the audio corresponding to the multiple windows, the initial volume of the audio representing the original volume of the audio stream when the second device receives the audio stream;
    the playback volume of the audio corresponding to the multiple windows;
    the functions of the services in the multiple windows.
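One way to combine the three inputs listed in claim 15 into a single priority is a simple additive score. The claim does not say how the inputs are combined, so the additive form, the 0 to 1 volume scale, the `FUNCTION_WEIGHT` table, and the service-function names below are all assumptions for illustration.

```python
from dataclasses import dataclass

# Hypothetical weights per service function; unknown functions default to 1.0.
FUNCTION_WEIGHT = {"screen_share": 3.0, "speaker": 2.0, "participant": 1.0}


@dataclass
class WindowInfo:
    name: str
    initial_volume: float   # volume of the audio stream as received, assumed 0..1
    playback_volume: float  # volume the user set for this window, assumed 0..1
    function: str           # service function running in the window


def priority_score(info: WindowInfo) -> float:
    """Higher score means higher priority (assumed convention)."""
    weight = FUNCTION_WEIGHT.get(info.function, 1.0)
    return weight + info.initial_volume + info.playback_volume


def lowest_priority(windows):
    """Return the window that would be degraded first."""
    return min(windows, key=priority_score)
```

Under this scoring, a window the user has muted, or one whose remote party is silent, sinks toward the bottom of the ranking and is the first candidate for a lower definition.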
  16. The method according to any one of claims 6-14, wherein the priorities corresponding to the multiple windows are determined by the second device according to a user-defined specifying operation.
  17. The method according to any one of claims 7-16, wherein the third device is a selective forwarding unit (SFU).
  18. An electronic device, wherein the electronic device comprises:
    a memory, configured to store a computer program;
    a processor, configured to execute the computer program, so that the electronic device implements the method according to any one of claims 6-17.
  19. A computer-readable storage medium, wherein computer program code is stored on the computer-readable storage medium, and when the computer program code is executed by a processing circuit, the method according to any one of claims 6-17 is implemented.
  20. A chip system, wherein the chip system comprises a processing circuit and a storage medium, computer program code being stored in the storage medium; when the computer program code is executed by the processing circuit, the method according to any one of claims 6-17 is implemented.
PCT/CN2022/109423 2021-08-03 2022-08-01 Multi-window video communication method, device and system WO2023011408A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110887044.0 2021-08-03
CN202110887044.0A CN115706829A (en) 2021-08-03 2021-08-03 Multi-window video communication method, device and system

Publications (1)

Publication Number Publication Date
WO2023011408A1 (en) 2023-02-09

Family

ID=85154361

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/109423 WO2023011408A1 (en) 2021-08-03 2022-08-01 Multi-window video communication method, device and system

Country Status (2)

Country Link
CN (1) CN115706829A (en)
WO (1) WO2023011408A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117135364A (en) * 2023-10-26 2023-11-28 深圳市宏辉智通科技有限公司 Video decoding method and system

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009033348A (en) * 2007-07-25 2009-02-12 Toshiba Corp Video conference application server, and video conference method, and program
CN101557495A (en) * 2009-05-18 2009-10-14 上海华平信息技术股份有限公司 Bandwidth control method of video conferencing system
US20140317532A1 (en) * 2013-03-15 2014-10-23 Blue Jeans Network User interfaces for presentation of audio/video streams
CN109218653A (en) * 2018-09-30 2019-01-15 广州视源电子科技股份有限公司 A kind of multi-window display method of video conference, device, equipment and system
CN112104880A (en) * 2020-08-31 2020-12-18 广州华多网络科技有限公司 Network connection live broadcast control and display method and device, equipment and storage medium
US10999344B1 (en) * 2020-06-15 2021-05-04 Google Llc Dynamic video resolution and quality for improved video conferencing
CN113014858A (en) * 2021-03-05 2021-06-22 深圳壹秘科技有限公司 Method, system and device for changing resolution

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009033348A (en) * 2007-07-25 2009-02-12 Toshiba Corp Video conference application server, and video conference method, and program
CN101557495A (en) * 2009-05-18 2009-10-14 上海华平信息技术股份有限公司 Bandwidth control method of video conferencing system
US20140317532A1 (en) * 2013-03-15 2014-10-23 Blue Jeans Network User interfaces for presentation of audio/video streams
CN109218653A (en) * 2018-09-30 2019-01-15 广州视源电子科技股份有限公司 A kind of multi-window display method of video conference, device, equipment and system
US10999344B1 (en) * 2020-06-15 2021-05-04 Google Llc Dynamic video resolution and quality for improved video conferencing
CN112104880A (en) * 2020-08-31 2020-12-18 广州华多网络科技有限公司 Network connection live broadcast control and display method and device, equipment and storage medium
CN113014858A (en) * 2021-03-05 2021-06-22 深圳壹秘科技有限公司 Method, system and device for changing resolution

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117135364A (en) * 2023-10-26 2023-11-28 深圳市宏辉智通科技有限公司 Video decoding method and system
CN117135364B (en) * 2023-10-26 2024-02-02 深圳市宏辉智通科技有限公司 Video decoding method and system

Also Published As

Publication number Publication date
CN115706829A (en) 2023-02-17

Similar Documents

Publication Publication Date Title
WO2021004381A1 (en) Screencasting display method, and electronic apparatus
KR101634500B1 (en) Media workload scheduler
US8983555B2 (en) Wireless communication techniques
WO2021244341A1 (en) Picture coding method and apparatus, electronic device and computer readable storage medium
US8947492B2 (en) Combining multiple bit rate and scalable video coding
US20140198838A1 (en) Techniques for managing video streaming
KR20180031547A (en) Method and apparatus for adaptively providing multiple bit rate stream media in server
US8842159B2 (en) Encoding processing for conferencing systems
CN113556598A (en) Multi-window screen projection method and electronic equipment
JP2011176827A (en) Processing method of video conference system, video conference system, program and recording medium
WO2023011408A1 (en) Multi-window video communication method, device and system
JP2017520940A (en) Method and apparatus for multiplexing hierarchically encoded content
CN110865782B (en) Data transmission method, device and equipment
CN116204308A (en) Dynamic adjusting method and device for audio and video computing power and electronic equipment
US9467655B2 (en) Computer readable recording medium, communication terminal device and teleconferencing method
US20220239920A1 (en) Video processing method, related apparatus, storage medium, and program product
WO2021093882A1 (en) Video meeting method, meeting terminal, server, and storage medium
CN114697731B (en) Screen projection method, electronic equipment and storage medium
CN117193685A (en) Screen projection data processing method, electronic equipment and storage medium
CN114257771A (en) Video playback method and device for multi-channel audio and video, storage medium and electronic equipment
US11290680B1 (en) High-fidelity freeze-frame for precision video communication applications
US11936698B2 (en) Systems and methods for adaptive video conferencing
CN117412140A (en) Method and apparatus for video transmission
CN115002484A (en) Video encoding and decoding method for reducing time delay, video conference system and storage medium
CN116866604A (en) Image processing method and device

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 22852120; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)