WO2017173953A1 - Server, conference terminal, and cloud conference processing method - Google Patents


Info

Publication number
WO2017173953A1
Authority
WO
WIPO (PCT)
Prior art keywords
code stream
local video
video code
terminal
audio
Application number
PCT/CN2017/078856
Other languages
French (fr)
Chinese (zh)
Inventor
杨伯辉
孙博
Original Assignee
中兴通讯股份有限公司
Application filed by 中兴通讯股份有限公司
Publication of WO2017173953A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/233 Processing of audio elementary streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/236 Assembling of a multiplex stream, e.g. transport stream, by combining a video stream with other content or additional data, e.g. inserting a URL [Uniform Resource Locator] into a video stream, multiplexing software data into a video stream; Remultiplexing of multiplex streams; Insertion of stuffing bits into the multiplex stream, e.g. to obtain a constant bit-rate; Assembling of a packetised elementary stream
    • H04N21/2368 Multiplexing of audio and video streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/238 Interfacing the downstream path of the transmission network, e.g. adapting the transmission rate of a video stream to network bandwidth; Processing of multiplex streams
    • H04N21/2389 Multiplex stream processing, e.g. multiplex stream encrypting
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs

Definitions

  • the present invention relates to the field of video conference communication, and in particular, to a server, a conference site terminal, and a cloud conference processing method.
  • the current cloud conference system runs on a public cloud or a private cloud.
  • Running video conferences on a public or private cloud occupies the server's central processing unit (CPU), memory, and other resources, and requires resources to be reserved (number of CPU cores / memory).
  • The server's CPU is mainly used for decoding and encoding video. In a conference or multi-screen conference where every end requires decoding and encoding, each server CPU has relatively few cores, so the number of streams it can process is correspondingly small. For example, a server with a 16-core CPU can virtualize 32 virtual cores.
  • The number of H.264 720P (resolution: 1280*720 pixels) HD site terminals that can be added equals the total number of virtual cores, 32; the number of H.264 4CIF (resolution: 704*576 pixels) SD site terminals that can be added is 64. It can be seen that the access capacity of the cloud server is relatively small and its resource utilization is not high.
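  • The capacity figures above can be reproduced with a short calculation. This is only a sketch: the function name is hypothetical, and the per-terminal costs (one virtual core per 720P terminal, half a virtual core per 4CIF terminal) are inferred from the stated totals of 32 and 64 rather than given in the text.

```python
def max_terminals_cpu_bound(physical_cores, vcores_per_core, vcores_per_terminal):
    """Terminals a transcoding server can admit when CPU is the bottleneck.

    All parameters are illustrative assumptions, not values defined by the patent.
    """
    total_vcores = physical_cores * vcores_per_core
    return int(total_vcores / vcores_per_terminal)


# 16 physical cores virtualized into 32 virtual cores (2 per physical core).
hd_720p_terminals = max_terminals_cpu_bound(16, 2, 1.0)  # assumed: 1 vcore per 720P terminal
sd_4cif_terminals = max_terminals_cpu_bound(16, 2, 0.5)  # assumed: 0.5 vcore per 4CIF terminal
print(hd_720p_terminals, sd_4cif_terminals)
```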
  • The prior art offers two solutions: first, directly increasing the number of CPU cores of the server; second, reducing the video quality. The first most obviously increases production cost, and because configuration requirements keep rising it does not solve the problem fundamentally. The second most obviously degrades the user experience and is likewise only a temporary fix.
  • The embodiments of the present invention provide a server, a site terminal, and a cloud conference processing method, intended to solve the prior-art problems of low server access capacity and poor user experience caused by performing video encoding and decoding on the server.
  • the present invention provides a cloud conference processing method, including the following steps:
  • When the local video code stream of each site terminal is received, the method further includes:
  • the first audio code stream of each of the site terminals is processed and sent to each site terminal.
  • Processing the first audio code stream of each site terminal and sending it to each site terminal includes:
  • the preset rule includes: the local video code stream set includes a local video code stream of a broadcast source;
  • Sending the local video code stream included in the local video code stream set to the corresponding site terminal comprises: sending the local video code stream of the broadcast source to the corresponding site terminal.
  • The preset rule further includes: receiving a request of the site terminal, where the local video code stream set includes a local video code stream requested by the site terminal;
  • Sending the local video code stream included in the local video code stream set to the corresponding site terminal further includes: sending the local video code stream requested by the site terminal to the site terminal.
  • the present invention further provides a cloud conference processing method, including the following steps:
  • Receiving the local video code stream included in the local video code stream set sent by the server comprises: receiving a local video code stream of the broadcast source.
  • Receiving the local video code stream included in the local video code stream set sent by the server further includes: sending a request to the server; and receiving the requested local video code stream.
  • When the local video code stream of the site terminal is sent to the server, the method further includes: sending the first audio code stream of the site terminal to the server.
  • the method further includes: receiving the second audio code stream from the server.
  • the processing of the local video code stream to obtain the video includes: decoding the local video code stream to obtain at least one sub video; synthesizing the sub video to obtain a video.
  • the present invention further provides a server, including:
  • the first receiving module is configured to receive the local video code stream reported by each site terminal;
  • a determining module configured to determine, according to a preset rule, a local video code stream set to be sent to each of the site terminals, where the local video code stream set includes at least one of the local video code streams;
  • the first sending module is configured to send the local video code stream included in the local video code stream set to the corresponding conference terminal.
  • In an embodiment of the present invention, the server further includes a first audio receiving module and an audio processing module; the first audio receiving module is configured to receive a first audio code stream from each site terminal; the audio processing module is configured to process the first audio code stream and send it to each site terminal.
  • The audio processing module includes an audio decoding module, a mixing module, an encoding module, and a first audio sending module.
  • The audio decoding module is configured to decode the first audio code stream to obtain at least one audio;
  • the mixing module is configured to mix the audio;
  • the encoding module is configured to encode the mixed audio to obtain a second audio code stream;
  • the first audio sending module is configured to send the second audio code stream to each of the site terminals.
  • The preset rule includes: the local video code stream set includes a local video code stream of a broadcast source; the first sending module includes a video sending module configured to send the local video code stream of the broadcast source to the corresponding site terminal.
  • The preset rule further includes: receiving a request of the site terminal, where the local video code stream set includes a local video code stream requested by the site terminal; the video sending module is further configured to forward the local video code stream requested by the site terminal to the site terminal.
  • the present invention further provides a venue terminal, including:
  • a second sending module configured to send the local video code stream of the site terminal to the server;
  • a second receiving module configured to receive the local video code stream included in the local video code stream set delivered by the server;
  • the video processing module is configured to process the local video code stream to obtain a video.
  • the second receiving module includes a video receiving module, configured to receive a local video code stream of a broadcast source.
  • the second receiving module further includes a request sending module, configured to send a request to the server; the video receiving module is further configured to receive the requested local video code stream.
  • A second audio sending module is further configured to send the first audio code stream of the site terminal to the server when the local video code stream of the site terminal is sent to the server.
  • A second audio receiving module is further configured to receive the second audio code stream from the server.
  • The video processing module includes a video decoding module and a synthesizing module; the video decoding module is configured to decode the local video code stream to obtain at least one sub video, and the synthesizing module is configured to synthesize the sub video to obtain a video.
  • the present invention provides a server, a site terminal, and a cloud conference processing method.
  • The server receives the local video code stream sent by each site terminal and then determines, according to the preset rule, the local video code stream set to be sent to each site terminal, where the set includes at least one local video code stream. The server sends the local video code streams included in that set to the corresponding site terminal. The site terminal receives the local video code streams included in the set delivered by the server and processes them to obtain a video.
  • FIG. 1 is a flowchart of a cloud conference processing method according to an embodiment of the present invention
  • FIG. 2 is a flowchart of a cloud conference processing method according to an embodiment of the present invention.
  • FIG. 3 is a flowchart of a cloud conference processing method according to an embodiment of the present invention.
  • FIG. 4 is a flowchart of a cloud conference processing method according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of a module of a server according to an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of a module of a venue terminal according to an embodiment of the present invention.
  • The idea of the present invention is to have the site terminals perform the decoding, synthesizing, and similar operations on the local video code streams that were originally performed by the server, which also allows customization according to the requirements of different site terminals. Freed from decoding and synthesizing the local video code streams, the server saves CPU and memory resources, so the factor limiting the server's access capacity changes from the server's configuration to the total bandwidth of its network port, greatly increasing the access capacity of the server.
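  • Once the server only forwards streams, its capacity is bounded by port bandwidth rather than CPU. The sketch below illustrates that bound under loudly stated assumptions: the function name, the 1000 Mbit/s port, the 2 Mbit/s stream bitrate, and four forwarded streams per terminal are all hypothetical figures chosen for illustration, a full-duplex port is assumed, and audio traffic is ignored.

```python
def max_terminals_bandwidth_bound(port_mbps, stream_mbps, streams_per_terminal):
    """Terminals a forwarding-only server can admit when the network port is the bottleneck.

    Assumes a full-duplex port, so inbound and outbound traffic are limited separately.
    """
    # Each terminal uploads exactly one local video code stream.
    inbound_limit = port_mbps // stream_mbps
    # Each terminal receives streams_per_terminal forwarded streams.
    outbound_limit = port_mbps // (stream_mbps * streams_per_terminal)
    return min(inbound_limit, outbound_limit)


# Hypothetical figures: 1000 Mbit/s port, 2 Mbit/s per stream, 4 streams viewed per terminal.
capacity = max_terminals_bandwidth_bound(1000, 2, 4)
print(capacity)
```

  • With these assumed figures the outbound direction is the tighter limit, which is why the number of streams each site terminal chooses to view matters for capacity.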
  • This embodiment provides a cloud conference processing method. Referring to FIG. 1, the method includes:
  • S101: Receive the local video code stream reported by each site terminal.
  • S102: Determine, according to a preset rule, the local video code stream set to be sent to each site terminal, where the local video code stream set includes at least one local video code stream.
  • S103: Send the local video code streams included in the local video code stream set to the corresponding site terminal.
  • each site terminal needs to send its own local video stream to the server.
  • The server does not decode and synthesize the local video code streams sent by the site terminals. Instead, it determines, according to the preset rule, the local video code stream set to be sent to each site terminal, where the set includes at least one local video code stream, and then sends the local video code streams included in that set to the site terminal. This effectively saves the server's resources and lets the site terminals share the processing of the local video code streams.
  • The local video code streams are sent to the server by each site terminal independently, and there is no interference between the sites.
  • Sending in step S103 means the following: each site terminal is likely to require more than one local video code stream, and the streams it requires are forwarded to it separately. The server forwards the local video code streams it receives directly, without processing them, so the streams forwarded to each site terminal are the original local video code streams that the site terminals sent to the server.
  • The local video code stream set is a set of local video code streams reported to the server by the respective site terminals. It is a set of original local video code streams, and the sets corresponding to different site terminals may contain the same streams, partially identical streams, or entirely different streams.
  • The foregoing preset rule may include: the local video code stream of the broadcast source is part of the local video code stream set, that is, the set includes the local video code stream of the broadcast source. It may further include: on receiving a request from a site terminal, the local video code stream requested by that site terminal is part of the set, that is, the set includes the local video code stream requested by the site terminal.
  • The local video code stream sent to a given site terminal should come from a different site terminal; a site terminal does not need to receive its own local video code stream from the server. In a cloud conference, any site terminal should display the picture of at least one site terminal, and more often several. Because a conference has more than one site terminal, and each site terminal does not need to see all of the others, some site terminals may view the video of only one site terminal, while others may choose to view the video of all site terminals. The server therefore sends each site terminal the local video code streams of the site terminals it requires.
  • In a cloud conference there is often a broadcast source, which is itself one of the site terminals; that is, the overall course of the cloud conference is controlled by that site terminal. The local video code stream of that site terminal therefore often needs to be sent to every other site terminal. In that case, the local video code stream set includes the local video code stream of the broadcast source, and that stream is forwarded directly to each of the other site terminals.
  • In addition, the broadcast source itself needs to view one or more site terminals in the conference. The broadcast source can view other site terminals, or it can view itself. The site terminal that the broadcast source views is called the end of the broadcast source. The local video code stream of the broadcast source is sent to each site terminal, while the local video code stream of the end of the broadcast source is sent to the broadcast source.
  • The broadcast source itself may also change during a cloud conference. After it changes, the local video code stream of the new broadcast source should likewise be sent to each site terminal.
  • Each site terminal can send a request to the server at any time in the cloud conference, whether at the beginning or in the conference.
  • The request identifies at least the site terminal that the requesting site terminal needs to see; that is, each site terminal sends the server a request naming the site terminal to be viewed. The server determines from the request which local video code stream the site terminal requires, so that the local video code stream set includes the requested stream, and sends the corresponding local video code stream to the site terminal. In other words, each site terminal should receive at least the local video code stream of the broadcast source and can also receive the local video code streams of other site terminals.
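  • The preset rule described in this embodiment can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the dictionary-based data model, and the terminal identifiers are hypothetical. The sketch applies three points from the text: the broadcast source's stream goes to every other site terminal, requested streams are added, and a site terminal never receives its own stream. Streams are forwarded as the original bytes, with no decoding or re-encoding.

```python
def determine_stream_sets(terminals, broadcast_source, requests):
    """Return {terminal: list of source terminals whose streams it should receive}."""
    stream_sets = {}
    for terminal in terminals:
        wanted = []
        # Rule 1: every terminal other than the broadcast source receives its stream.
        if terminal != broadcast_source:
            wanted.append(broadcast_source)
        # Rule 2: add explicitly requested streams, skipping the terminal's own stream,
        # unknown terminals, and duplicates.
        for source in requests.get(terminal, []):
            if source != terminal and source in terminals and source not in wanted:
                wanted.append(source)
        stream_sets[terminal] = wanted
    return stream_sets


def forward_streams(uploaded, stream_sets):
    """Forward the original uploaded byte streams untouched to each terminal."""
    return {
        terminal: {source: uploaded[source] for source in sources}
        for terminal, sources in stream_sets.items()
    }


terminals = ["T1", "T2", "T3"]
uploaded = {"T1": b"video-1", "T2": b"video-2", "T3": b"video-3"}
requests = {"T2": ["T3"], "T1": ["T2"]}  # T1 is the broadcast source and views T2
stream_sets = determine_stream_sets(terminals, "T1", requests)
delivered = forward_streams(uploaded, stream_sets)
```

  • In this sketch a broadcast source that sends no request ends up with an empty set; the text requires at least one stream per set, so a real server would need a default choice for that case.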
  • When receiving the local video code stream of each site terminal, the server also receives the first audio code stream of each site terminal. The server could, admittedly, forward the first audio code streams directly to the site terminals for processing. This is unnecessary, however, because audio processing occupies less of the server's CPU and memory than video processing. Moreover, in practice the site terminals in a conference do not need to see the picture of every site terminal, but they generally do need to hear the voice of all of them for the conference to proceed normally. If a site terminal received first audio code streams selectively, it would miss the speech of some other site terminals, and if the missed speech were especially important, the quality of the conference would suffer directly.
  • In this embodiment, therefore, the first audio code streams sent by the site terminals are processed directly by the server and then sent to each site terminal.
  • Processing the first audio code streams sent by the site terminals includes: decoding the first audio code streams to obtain at least one audio; mixing the audio; encoding the mixed audio to obtain a second audio code stream; and sending the second audio code stream to each site terminal.
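  • The decode, mix, and encode steps above can be sketched as follows. This is an illustration under assumptions, not the codec the patent mandates: it treats each first audio code stream as raw 16-bit little-endian PCM, mixes by summing samples with clipping, and uses hypothetical function names. A real system would substitute whichever audio codec the conference has negotiated.

```python
import struct


def decode_pcm16(payload):
    """Decode raw 16-bit little-endian PCM bytes into a list of samples."""
    count = len(payload) // 2
    return list(struct.unpack("<%dh" % count, payload[:count * 2]))


def encode_pcm16(samples):
    """Encode a list of samples as raw 16-bit little-endian PCM bytes."""
    return struct.pack("<%dh" % len(samples), *samples)


def mix(tracks):
    """Sum the tracks sample by sample, clipping to the 16-bit range."""
    length = max((len(track) for track in tracks), default=0)
    mixed = []
    for i in range(length):
        total = sum(track[i] for track in tracks if i < len(track))
        mixed.append(max(-32768, min(32767, total)))
    return mixed


def process_audio(first_streams):
    """Decode every first audio code stream, mix them, and encode the second stream."""
    tracks = [decode_pcm16(payload) for payload in first_streams.values()]
    return encode_pcm16(mix(tracks))


first_streams = {
    "T1": encode_pcm16([1000, 30000]),
    "T2": encode_pcm16([-200, 10000, 5]),
}
second_stream = process_audio(first_streams)
```

  • Clipping is the simplest way to keep the sum in range; it is used here only to keep the sketch short.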
  • The decoding of the first audio code stream and/or the encoding of the second audio code stream sent to each site terminal may use any feasible waveform, parametric, or hybrid codec. Waveform codecs include Pulse Code Modulation (PCM), Adaptive Differential Pulse Code Modulation (ADPCM), Sub-band Adaptive Differential Pulse Code Modulation (SB-ADPCM), and so on.
  • Parametric codecs include Linear Predictive Coding (LPC) and similar schemes; hybrid codecs include Code Excited Linear Prediction (CELP), Vector Sum Excited Linear Prediction (VSELP), Regular Pulse Excited-Long Term Prediction (RPE-LTP), Low Delay-Code Excited Linear Prediction (LD-CELP), Multi-Pulse Excited (MPE), and so on.
  • Audio Mixing (often referred to as mix) is a step in audio processing that combines sound from multiple sources into one sound. The audio in this embodiment is from different venue terminals.
  • During mixing, the frequency, dynamics, sound quality, positioning, reverberation, and sound field of each audio are adjusted separately to optimize each track, and the tracks are then superimposed into the final product. This processing can produce a layered audio effect.
  • The mix can be handled by mixing software. After the mixed audio is encoded, it can be sent to each site terminal.
  • The server can process all of the audio, but in some cases it can process the audio code streams of the site terminals selectively, for example because a site terminal does not need to speak, or because the broadcast source arranges for participants to speak in an orderly fashion.
  • This embodiment provides a cloud conference processing method. Referring to FIG. 2, the method includes:
  • S202: Receive the local video code stream included in the local video code stream set delivered by the server.
  • S203 Process the local video code stream to obtain a video.
  • Each site terminal that joins the cloud conference should send its own local video code stream to the server.
  • Although the site terminals are in the same cloud conference, each sends its local video code stream independently; that is, each site terminal separately sends its own local video code stream to the server. Likewise, each site terminal receives the local video code stream of at least one other site terminal from the server independently, and the site terminals do not interfere with one another.
  • Receiving the local video code streams included in the local video code stream set delivered by the server means the following: in the cloud conference, the server receives the local video code stream sent by each site terminal and then sends local video code streams to each site terminal separately.
  • The local video code streams received by a site terminal should be those of other site terminals; it does not receive its own. In a cloud conference, any site terminal should generally display the picture of at least one site terminal, and more often several. Because a conference has more than one site terminal, and each site terminal does not need to see all of the others, some site terminals may view the video of only one site terminal, while others may choose to view the video of all site terminals. The server therefore sends each site terminal the local video code streams of the site terminals it requires.
  • The local video code stream set is a set of local video code streams reported to the server by the respective site terminals. It is a set of original local video code streams, and the sets corresponding to different site terminals may contain the same streams, partially identical streams, or entirely different streams.
  • In a cloud conference there is often a broadcast source, which is itself one of the site terminals; that is, the overall course of the cloud conference is controlled by that site terminal. The local video code stream of that site terminal therefore often needs to be sent to every other site terminal. In that case, each site terminal needs to receive the local video code stream of the broadcast source, that is, the local video code stream set includes the local video code stream of the broadcast source; the stream is, of course, still delivered by the server.
  • The broadcast source also needs to view one or more site terminals in the conference, so the local video code streams of those site terminals need to be sent to the broadcast source; that is, the broadcast source needs to receive them. The broadcast source can view other site terminals, or it can view itself. The site terminal that the broadcast source views is called the end of the broadcast source. The local video code stream of the broadcast source is sent to each site terminal, while the local video code stream of the end of the broadcast source is sent to the broadcast source.
  • The end of the broadcast source can change during a cloud conference, possibly frequently. This does not affect the implementation of this solution.
  • The broadcast source itself may also change during a cloud conference. After it changes, the local video code stream of the new broadcast source should likewise be sent to each site terminal.
  • Each site terminal can send a request to the server at any time in the cloud conference, whether at the beginning or in the conference.
  • The request identifies at least the site terminal that the requesting site terminal needs to see; that is, each site terminal sends the server a request naming the site terminal to be viewed, and the server sends the corresponding local video code stream to the site terminal according to the request.
  • The local video code stream set thus includes the local video code stream requested by the site terminal. In other words, each site terminal should receive at least the local video code stream of the broadcast source and can also receive the local video code streams of other site terminals.
  • After at least one local video code stream is received, it can be processed. The processing includes: decoding the local video code stream to obtain at least one sub video, and then synthesizing the sub video to obtain a video.
  • A site terminal, including the broadcast source, may receive multiple local video code streams, and it decodes each of them. The main video codec formats are H.261, H.263, and H.264, and any of them can be used.
  • The screen layout of the synthesized video may be arbitrary. Some commonly used layouts are: first, the sub videos corresponding to the respective site terminals are distributed evenly across the screen at the same size; second, one sub video serves as the main video with the largest picture, and the other sub videos serve as secondary videos distributed around the main picture, where the main video may be that of the broadcast source or of the site terminal that is mainly speaking in the cloud conference; third, the layout of the synthesized video can be dynamic, for example enlarging the video picture of whoever is speaking so that it is more prominent. Any of these methods, or other synthesis methods not mentioned here, is applicable in this embodiment, as long as the video can be displayed on the site terminal to the participating users.
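  • The first layout above (equal-size sub videos spread evenly over the screen) can be sketched as a tile calculation. The function name and the near-square grid heuristic are assumptions made for illustration; the text leaves the exact arrangement open.

```python
import math


def equal_grid_layout(sub_video_count, screen_width, screen_height):
    """Return one (x, y, width, height) tile per sub video, all the same size."""
    if sub_video_count <= 0:
        return []
    # Assumed heuristic: choose a near-square grid large enough for every sub video.
    columns = math.ceil(math.sqrt(sub_video_count))
    rows = math.ceil(sub_video_count / columns)
    tile_width = screen_width // columns
    tile_height = screen_height // rows
    return [
        ((index % columns) * tile_width, (index // columns) * tile_height,
         tile_width, tile_height)
        for index in range(sub_video_count)
    ]


tiles = equal_grid_layout(4, 1280, 720)
print(tiles)
```

  • Because each site terminal runs this step itself, different site terminals can use different layouts without any change on the server.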
  • The decoding and synthesizing of the local video code streams is transferred to the corresponding site terminals. Because each site terminal processes its streams independently, each can be customized individually, which gives a good user experience. CPU and memory usage on the server side is also reduced, so the server can admit more site terminals, improving the efficiency of the conference.
  • The first audio code stream of each site terminal may also be sent to the server while the local video code stream of the site terminal is sent to the server.
  • The site terminal could also process the audio code streams in a manner similar to the local video code streams, but audio has its own particularities. Audio processing occupies less of the server's CPU and memory than video processing, so moving it to the site terminal is unnecessary. Moreover, in practice the site terminals in a conference do not need to see the picture of every site terminal, but they generally do need to hear the sound of all of them so that the conference can proceed normally.
  • The server therefore processes the first audio code streams to obtain the second audio code stream, and then sends the second audio code stream to each site terminal.
  • the processing of the first audio code stream by the server is the same as that in the above embodiment, and details are not described herein again.
  • This embodiment provides a cloud conference processing method. Referring to FIG. 3, the method includes:
  • S301: The server calls the site terminals to join the conference: the server calls site terminals T1-Tn to join the conference.
  • the initiator of the cloud conference is generally the broadcast source, and the broadcast source initiates a call through the server to establish a cloud conference.
  • the site terminals join the conference and send the local video code stream and the first audio code stream to the server: each site terminal T1-Tn joins the conference and sends its local video code stream and first audio code stream to the server;
  • Each site terminal that receives the conference request sends its local video stream and audio stream to the server when joining the conference.
  • each participating conference terminal needs at least a broadcast source picture in the conference
  • the server needs to determine which is the broadcast source; since the broadcast source is the initiator of the conference, this step can also be performed at the beginning.
  • the server sends the local video code stream and the second audio code stream of the broadcast source to each site terminal.
  • the server directly forwards the received local video code stream of the broadcast source Tx to the site terminals T1-Tn, and forwards the received local video code stream of the site terminal Ty to the broadcast source Tx.
  • the cloud server decodes, mixes and re-encodes the first audio stream sent by each site terminal to form a second audio stream, and then sends it back to all the venue terminals.
  • the server does not decode or synthesize the local video code streams sent by the site terminals, but directly forwards to each site terminal the local video code stream of at least one other site terminal. This effectively saves server resources, and the processing of the local video code streams is shared among the site terminals.
  • the local video code stream of the broadcast source should be sent to each other site terminal, that is, the site terminal except Tx in T1-Tn; the local video code stream of the terminal Ty viewed by the broadcast source is sent to the broadcast source.
  • the server also receives the first audio code stream from each site terminal, decodes the first audio code streams to obtain at least one audio, mixes the audio, encodes the mixed audio to obtain a second audio code stream, and then sends the second audio code stream to each of the site terminals.
  • by this point the server has sent the local video code stream of the broadcast source and the second audio code stream to each site terminal. If a site terminal does not need to view the pictures of other site terminals, the server returns to step S304 and the cloud conference continues.
  • S306. Determine the local video code stream requested by the site terminal: on receiving a request sent by a site terminal, the server determines from the content of the request which site terminal's local video code stream is being requested;
  • when a site terminal wants to view the picture of another site terminal, it sends a request to the server, and the server determines from the request which site terminal's local video code stream the requesting site terminal wants to see.
  • that local video code stream can then be sent to the requesting site terminal, which processes the received local video code stream to obtain the desired video picture.
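The forwarding rule of this embodiment (the broadcast source's stream goes to every other site terminal, the stream of the terminal Ty viewed by the broadcast source goes to Tx, and explicitly requested streams are added on top) can be sketched roughly as follows. This is only an illustrative sketch, not the patent's implementation: the function name `build_stream_sets`, its parameters, and the use of terminal IDs as stand-ins for the untouched original code streams are all assumptions. The sketch also assumes a terminal never receives its own stream, and it leaves the broadcast source's set empty when no viewed terminal is given.

```python
from typing import Dict, Iterable, List, Optional


def build_stream_sets(
    terminals: Iterable[str],
    broadcast_source: str,
    viewed_by_source: Optional[str] = None,
    requests: Optional[Dict[str, List[str]]] = None,
) -> Dict[str, List[str]]:
    """For each site terminal, list the terminals whose original
    local video code streams the server would forward to it."""
    terminals = list(terminals)
    known = set(terminals)
    requests = requests or {}
    stream_sets: Dict[str, List[str]] = {}
    for terminal in terminals:
        wanted: List[str] = []
        if terminal == broadcast_source:
            # The broadcast source receives the terminal it views (Ty).
            if viewed_by_source is not None:
                wanted.append(viewed_by_source)
        else:
            # Every other terminal receives the broadcast source (Tx).
            wanted.append(broadcast_source)
        # Streams explicitly requested by this terminal come next.
        wanted.extend(requests.get(terminal, []))
        selected: List[str] = []
        for source in wanted:
            # Skip unknown terminals, the terminal itself, and duplicates.
            if source in known and source != terminal and source not in selected:
                selected.append(source)
        stream_sets[terminal] = selected
    return stream_sets


if __name__ == "__main__":
    sets = build_stream_sets(
        ["T1", "T2", "T3", "T4"],
        broadcast_source="T1",
        viewed_by_source="T3",
        requests={"T2": ["T4"]},
    )
    for terminal, sources in sets.items():
        print(terminal, "receives", sources)
```

Because the server only selects and forwards streams, the per-terminal cost in this sketch is a lookup rather than a decode and re-encode, which is the resource saving the embodiment describes.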
  • This embodiment provides a cloud conference processing method. Referring to FIG. 4, the method includes:
  • the site terminal receives the request to join the conference
  • the broadcast source initiates a cloud conference, and the corresponding site terminal needs to join the conference.
  • the site terminal joins the conference, encodes its local video code stream and first audio code stream, and sends them to the server.
  • the local video code stream and the first audio code stream of the site terminal are sent to the server; each is encoded according to its own coding format.
  • the site terminal receives the local video code stream of the broadcast source and the second audio code stream sent by the cloud server, and receives the list of participating site terminals;
  • the local video code stream of the broadcast source needs to be received by each site terminal, that is, each site terminal needs to see the picture of the broadcast source; after receiving the first audio code stream sent by each site terminal, the server processes these audio code streams, including decoding, mixing, and encoding, and then sends the result to each site terminal.
  • in addition to receiving the local video code stream of the broadcast source and the second audio code stream, the site terminal should also know which site terminals are in the conference, so that it can decide whether to view other site terminals.
  • S404. Process the local video code stream and the second audio code stream to form video and audio: the site terminal processes the received local video code stream and second audio code stream to form video and audio, respectively;
  • the local video code stream received by the site terminal at this point is that of the broadcast source; processing a local video code stream includes decoding and synthesis, but when there is only one local video code stream no synthesis is required, and the video is obtained by decoding that stream directly.
  • the second audio code stream is obtained by the server from the first audio code streams of the site terminals. After decoding it, the site terminal obtains the mixed audio of all site terminals.
  • S405. Whether to send a request: if no request is sent to the server, the site terminal continues with steps S403 and S404;
  • S406. Send a request to the server: the site terminal first determines, from the list of participating site terminals, which site terminals' pictures it needs to view, and then sends a request message to the server, where the request includes the list of site terminals to be viewed;
  • when the site terminal needs to view the pictures of other site terminals, it first determines which site terminals need to be viewed, and then sends a list of these site terminals to the server.
  • the server sends the local video code streams of the requested site terminals to the requesting site terminal.
  • the cloud server finds the corresponding site terminals from the list in the site terminal's request, and forwards their local video code streams to the requesting site terminal.
  • the site terminal receives the video code streams and synthesizes them to obtain a video: the site terminal receives the local video code streams, decodes them, synthesizes them into a multi-picture image, and outputs the image to the display device.
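The terminal-side steps above (pick the site terminals to view from the participant list, request them, then decode and, if there is more than one stream, synthesize) might be sketched as below. The names `build_view_request` and `process_video` are hypothetical, and the decoder and compositor are passed in as plain callables because the text does not fix a particular codec or layout; the request is modeled as a simple list of terminal IDs, which is also an assumption.

```python
from typing import Callable, Dict, Iterable, List, Set


def build_view_request(
    self_id: str,
    participants: Iterable[str],
    wanted: Iterable[str],
    already_received: Set[str],
) -> List[str]:
    """Return the list of site terminals to ask the server for."""
    known = set(participants)
    request: List[str] = []
    for terminal in wanted:
        # A terminal does not request itself or a non-participant.
        if terminal == self_id or terminal not in known:
            continue
        # Streams already being received (e.g. the broadcast source) are skipped.
        if terminal in already_received or terminal in request:
            continue
        request.append(terminal)
    return request


def process_video(
    streams: Dict[str, bytes],
    decode: Callable[[bytes], str],
    compose: Callable[[List[str]], str],
) -> str:
    """Decode every received stream; synthesize only if there are several."""
    sub_videos = [decode(stream) for stream in streams.values()]
    if len(sub_videos) == 1:
        return sub_videos[0]  # a single stream needs no synthesis
    return compose(sub_videos)


if __name__ == "__main__":
    request = build_view_request(
        "T2", ["T1", "T2", "T3", "T4"], ["T3", "T4"], already_received={"T1"}
    )
    print("request:", request)
    # Stand-in decoder and compositor, for illustration only.
    video = process_video(
        {"T1": b"src", "T3": b"t3"},
        decode=lambda stream: stream.decode(),
        compose=lambda subs: " | ".join(subs),
    )
    print("video:", video)
```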
  • This embodiment provides a server. Referring to FIG. 5, the server includes:
  • the first receiving module 101 is configured to receive a local video code stream reported by each site terminal;
  • the determining module 105 is configured to determine, according to a preset rule, a set of local video code streams sent to the site terminals, where the local video code stream set includes at least one local video code stream;
  • the first sending module 102 is configured to send the local video code stream included in the local video code stream set to the corresponding site terminal.
  • each site terminal needs to send its own local video stream to the server.
  • the server does not perform the process of decoding and synthesizing the local video code stream sent by each site terminal, but determines the local video code stream set to be sent to the site terminal according to the preset rule.
  • the local video code stream set includes at least one local video code stream; then, according to the set determined for each site terminal, the local video code streams included in that set are sent to the site terminal. This effectively saves server resources and lets the site terminals share the processing of the local video code streams.
  • the local video code streams forwarded by the server are those sent to it by the individual site terminals, and the site terminals do not interfere with one another.
  • the role of the first sending module 102 is as follows: each site terminal may require more than one local video code stream, in which case the local video code streams it requires are forwarded to it separately; the server forwards the local video code streams without processing them, so what each site terminal receives is the original local video code stream that another site terminal sent to the server.
  • the local video code stream set refers to a set of local video code streams reported by the respective terminals to the server.
  • the set consists of original local video code streams; the local video code streams included in the sets corresponding to different site terminals may be identical, partially identical, or entirely different.
  • the foregoing preset rule may include: the local video code stream of the broadcast source is part of the local video code stream set, that is, the set includes the local video code stream of the broadcast source. It may further include: on receiving a request from a site terminal, the local video code stream requested by that site terminal is part of the local video code stream set, that is, the set includes the local video code stream requested by the site terminal.
  • the local video code stream sent to a certain site terminal should be a local video code stream different from the site terminal.
  • a site terminal does not need to receive its own local video code stream from the server. In a cloud conference, any site terminal should display the picture of at least one other site terminal, and often there are multiple pictures. There is usually more than one site terminal in a conference, and not every site terminal needs to see all the others: some site terminals may choose to view the video of only one particular site terminal, while others may choose to view the video of all site terminals. Therefore, the server sends to each site terminal the local video code streams of the site terminals it requires, according to the requirements of the different site terminals.
  • in a cloud conference there is often a broadcast source, which is itself one of the site terminals; that is, this site terminal controls the overall course of the cloud conference. Therefore, the local video code stream of this site terminal usually needs to be sent to every other site terminal. In this case, the local video code stream set includes the local video code stream of the broadcast source, and the first sending module 102 includes a video sending module 1021 configured to send the local video code stream of the broadcast source to the other site terminals;
  • the broadcast source needs to view one or more site terminals in the conference, so the local video code streams of those site terminals need to be sent to the broadcast source; the broadcast source can view other site terminals or view itself.
  • the site terminal that the broadcast source views is called the end of the broadcast source.
  • the local video code stream of the broadcast source needs to be sent to each site terminal, and the local video code stream of the end of the broadcast source needs to be sent to the broadcast source.
  • the end of the broadcast source can be changed in a cloud conference, and it may change frequently. This does not affect the implementation of this solution.
  • the broadcast source itself may also change in a cloud conference. After the broadcast source changes, the local video code stream of the changed broadcast source should also be sent to each conference terminal.
  • the determining module 105 further includes a request receiving module 1051; each site terminal can send a request to the server at any time in the cloud conference, whether at the beginning or during the conference. The request includes at least the site terminals that the requesting site terminal needs to view;
  • the request receiving module 1051 is configured to receive the request and determine the local video code streams required by the site terminal, that is, the local video code stream set includes the local video code stream requested by the site terminal. The video sending module 1021 sends the corresponding local video code streams to the site terminal according to the request. That is to say, each site terminal should receive at least the local video code stream of the broadcast source, and can also receive the local video code streams of other site terminals.
  • the video sending module 1021 is further configured to send the local video code stream requested by a site terminal to that site terminal.
  • the server further includes a first audio receiving module 103 and an audio processing module 104; when the server receives the local video code stream of each site terminal, the first audio receiving module 103 correspondingly receives the first audio code stream of each site terminal. The server could also forward the first audio code stream of each site terminal directly to the site terminals for processing; however, compared with video processing, audio processing occupies less of the server's CPU and memory, so this is not necessary. In addition, in practice, a participating site terminal does not need to see the picture of every site terminal in the conference, but it generally needs to hear the sound of all site terminals so that the conference can proceed normally.
  • if a site terminal received the first audio code streams selectively, it would miss the speech of some other site terminals, and if the missed speech were particularly important, the quality of the conference would be directly affected. Therefore, in this embodiment the first audio code stream sent by each site terminal is processed by the audio processing module 104 and then sent to each site terminal.
  • the audio processing module 104 includes an audio decoding module 1041, a mixing module 1042, an encoding module 1043, and a first audio sending module 1044, where the audio decoding module 1041 is configured to decode the first audio code stream to obtain at least one audio; the mixing module 1042 is configured to mix the audio; the encoding module 1043 is configured to encode the mixed audio to obtain a second audio code stream; and the first audio sending module 1044 is configured to send the second audio code stream to each site terminal.
  • the decoding of the first audio code stream and/or the encoding of the second audio code stream sent to each site terminal may use any feasible method among waveform, parametric, and hybrid codecs: waveform codecs include PCM, ADPCM, SB-ADPCM, and other codec modes; parametric codecs include LPC and other codec modes; hybrid codecs include CELPC, VSLPC, RPE-LTP, LD-CELP, MPE, and other codec modes.
  • Mixing audio is a step in audio processing that combines sound from multiple sources into one sound. The audio in this embodiment is from different venue terminals.
  • the frequency, dynamics, sound quality, positioning, reverberation, and sound field of each audio are adjusted separately to optimize each track, and the tracks are then superimposed into the final product. This kind of processing can produce a layered audio effect.
  • mixing can be handled by mixing software. After the mixed audio is encoded, it can be sent to each site terminal.
  • the server can process all the audio, but in some cases it can process the audio code streams of the site terminals selectively, for example because a site terminal does not need to speak, or because the broadcast source arranges for the site terminals to speak in turn.
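The decode, mix, encode chain of the audio processing module 104 can be illustrated with a minimal sketch. It assumes every first audio code stream is raw 16-bit little-endian mono PCM at the same sample rate (PCM is one of the waveform codecs listed above; a real deployment would plug in whatever codec the terminals negotiated), and it mixes by plain summation with clipping, without the per-track adjustment the text mentions. All function names are hypothetical.

```python
import struct
from typing import Dict, List

INT16_MIN, INT16_MAX = -32768, 32767


def decode_pcm16(stream: bytes) -> List[int]:
    """Decode 16-bit little-endian PCM bytes into integer samples."""
    count = len(stream) // 2
    return list(struct.unpack(f"<{count}h", stream[: count * 2]))


def mix(tracks: List[List[int]]) -> List[int]:
    """Sum the tracks sample by sample and clip to the int16 range."""
    length = max((len(track) for track in tracks), default=0)
    mixed: List[int] = []
    for i in range(length):
        total = sum(track[i] for track in tracks if i < len(track))
        mixed.append(max(INT16_MIN, min(INT16_MAX, total)))
    return mixed


def encode_pcm16(samples: List[int]) -> bytes:
    """Encode integer samples back into 16-bit little-endian PCM bytes."""
    return struct.pack(f"<{len(samples)}h", *samples)


def make_second_audio_stream(first_streams: Dict[str, bytes]) -> bytes:
    """Build the second audio code stream from all first audio code streams."""
    tracks = [decode_pcm16(stream) for stream in first_streams.values()]
    return encode_pcm16(mix(tracks))


if __name__ == "__main__":
    first = {
        "T1": encode_pcm16([1000, -2000, 30000]),
        "T2": encode_pcm16([500, 500, 30000]),
    }
    print(decode_pcm16(make_second_audio_stream(first)))
```

Selective processing, as described in the last paragraph above, would amount to filtering `first_streams` before calling `make_second_audio_stream`.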
  • This embodiment provides a site terminal. Referring to FIG. 6, the site terminal includes:
  • the second sending module 201 is configured to send the local video code stream of the site terminal to the server;
  • the second receiving module 202 is configured to receive a local video code stream included in the local video code stream set delivered by the server;
  • the video processing module 203 is configured to process the local video code stream to obtain a video.
  • each site terminal that joins the cloud conference should send its own local video stream to the server.
  • although the site terminals are in the same cloud conference, the processes by which they send their local video code streams are independent; that is, each site terminal separately sends its own local video code stream to the server.
  • the processes by which the site terminals receive the local video code stream of at least one other site terminal from the server are also independent, and the site terminals do not interfere with one another.
  • receiving the local video code stream included in the local video code stream set delivered by the server means the following: in the cloud conference, the server receives the local video code stream sent by each site terminal and then sends local video code streams to each site terminal separately.
  • the local video code streams received by each site terminal should be those of other site terminals; a site terminal does not need to receive its own local video code stream. In a cloud conference, any site terminal should display the picture of at least one other site terminal, and often there are multiple pictures. There is usually more than one site terminal in a conference, and not every site terminal needs to see all the others: some site terminals may choose to view the video of only one particular site terminal, while others may choose to view the video of all site terminals. Therefore, the server sends to each site terminal the local video code streams of the site terminals it requires, according to the requirements of the different site terminals.
  • the local video code stream set refers to a set of local video code streams reported by the respective site terminals to the server. The set consists of original local video code streams; the local video code streams included in the sets corresponding to different site terminals may be identical, partially identical, or entirely different.
  • the second receiving module 202 includes a video receiving module 2021 configured to receive the local video code stream of the broadcast source, that is, the local video code stream set includes the local video code stream of the broadcast source; of course, this stream is still received via the server.
  • the broadcast source also needs to view one or more site terminals in the conference, so the local video code streams of those site terminals need to be sent to the broadcast source, that is, the broadcast source needs to receive them.
  • the local video code stream of the end of the broadcast source needs to be sent to the broadcast source.
  • the end of the broadcast source can be changed in a cloud conference, and it may change frequently. This does not affect the implementation of this solution.
  • the broadcast source itself may also change in a cloud conference. After the broadcast source changes, the local video code stream of the changed broadcast source should also be sent to each conference terminal.
  • the second receiving module 202 further includes a request sending module 2022 configured to send a request to the server; each site terminal can send a request to the server at any time in the cloud conference, whether at the beginning or during the conference. The request includes at least the site terminals that the requesting site terminal needs to view; the server sends the corresponding local video code streams to the site terminal according to the request.
  • the local video code stream set includes the local video code stream requested by the site terminal. That is to say, each site terminal should receive at least the local video code stream of the broadcast source, and can also receive the local video code streams of other site terminals.
  • the video receiving module 2021 is further configured to receive the requested local video code stream, that is, the local video code stream of the site terminal named in the request sent by the request sending module 2022.
  • after at least one local video code stream is received, it can be processed; the video processing module 203 includes a video decoding module 2031 and a synthesis module 2032, where the video decoding module 2031 is configured to decode the local video code streams to obtain at least one sub-video, and the synthesis module 2032 then performs a synthesis operation on the sub-videos to obtain a video.
  • the local video code stream received by the site terminal, including the broadcast source, may be multiple.
  • the site terminal decodes the multiple local video code streams.
  • the main video codec formats are H.261, H.263, and H.264; any of these formats can be used.
  • after each local video code stream is decoded, at least one sub-video is obtained, and the sub-videos can be synthesized as the site terminal wishes.
  • the picture layout of the synthesized video may be arbitrary. Some commonly used layouts are: first, the sub-videos corresponding to the respective site terminals are distributed evenly across the picture at the same size; second, one of the sub-videos serves as the main video with the largest picture, while the other sub-videos serve as secondary videos whose pictures are distributed around the main picture. The main video can be that of the broadcast source or that of the site terminal that mainly speaks in the cloud conference; third, the picture of the synthesized video can be dynamic: for example, the video picture of whoever is speaking is enlarged accordingly so that it is more prominent.
  • Any of the above methods or other unmentioned synthesis methods are applicable in this embodiment, as long as they enable the video on the site terminal to be normally displayed to the participating users.
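As a rough illustration of the first two layouts, the sketch below computes where each sub-video would be placed on the output picture. It only produces rectangles (x, y, width, height); scaling and pixel copying are left out. The function names are hypothetical, and placing the secondary videos in a strip along the bottom is just one simple way of arranging them around the main picture, chosen here for brevity.

```python
import math
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # x, y, width, height


def grid_layout(n: int, width: int, height: int) -> List[Rect]:
    """Equal-size tiles in a near-square grid, filled row by row."""
    if n <= 0:
        return []
    cols = math.ceil(math.sqrt(n))
    rows = math.ceil(n / cols)
    tile_w, tile_h = width // cols, height // rows
    return [
        ((i % cols) * tile_w, (i // cols) * tile_h, tile_w, tile_h)
        for i in range(n)
    ]


def main_with_strip_layout(n: int, width: int, height: int) -> List[Rect]:
    """First rectangle is the main video; the rest share a bottom strip."""
    if n <= 0:
        return []
    if n == 1:
        return [(0, 0, width, height)]
    strip_h = height // 4
    main = (0, 0, width, height - strip_h)
    tile_w = width // (n - 1)
    strip = [(i * tile_w, height - strip_h, tile_w, strip_h) for i in range(n - 1)]
    return [main] + strip


if __name__ == "__main__":
    print(grid_layout(4, 1280, 720))
    print(main_with_strip_layout(3, 1280, 720))
```

The dynamic layout could reuse `main_with_strip_layout` by moving the currently speaking terminal's sub-video to the first position each time the speaker changes.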
  • the process of decoding and synthesizing the local video code stream is transferred to the corresponding site terminal. Since the processing at each site terminal is independent, each site terminal can be customized individually and a good user experience is ensured.
  • the server-side CPU and memory usage are also reduced, so that the server can admit more site terminals, thereby improving the efficiency of the conference.
  • the site terminal further includes a second audio sending module 204 configured to send the first audio code stream of the site terminal to the server while the local video code stream of the site terminal is sent to the server.
  • the site terminal could also process the audio code stream in a manner similar to the local video code stream, but the audio code stream has its own particularity. Compared with video processing, audio processing occupies less of the server's CPU and memory, so this is not necessary. In addition, in practice, a participating site terminal does not need to see the picture of every site terminal in the conference, but it generally needs to hear the sound of all site terminals so that the conference can proceed normally.
  • the site terminal further includes a second audio receiving module 205: after each site terminal sends its first audio code stream to the server, the server processes the first audio code streams to obtain the second audio code stream and sends it to each site terminal, where it is received by the second audio receiving module 205.
  • the processing of the first audio code stream by the server is the same as that in the above embodiment, and details are not described herein again.
  • Embodiments of the present invention also provide a storage medium including a stored program, wherein the program, when run, executes any of the methods described above.
  • the above storage medium may be configured to store program code for performing the steps of the method shown in any of FIGS. 1 to 4.
  • the foregoing storage medium may include, but is not limited to, a USB flash drive, a Read-Only Memory (ROM), and a Random Access Memory (RAM).
  • Embodiments of the present invention also provide a processor for running a program, wherein the program is executed to perform the steps of any of the above methods.
  • the above program is used to perform the steps of the method shown in any of FIGS. 1 to 4.
  • the local video code stream set to be sent to the site terminal is determined according to a preset rule, where the local video code stream set includes at least one local video code stream;
  • the local video code stream included in the video code stream set is sent to the corresponding site terminal;
  • the site terminal receives the local video code stream included in the local video code stream set sent by the server, and processes the local video code stream to obtain video.

Abstract

The invention provides a server, conference terminal, and cloud conference processing method. The server receives a local video bitstream transmitted by each conference terminal, and determines, according to a preconfigured rule, a local video bitstream set to be transmitted with respect to each conference terminal, wherein the local video bitstream set comprises at least one local video bitstream, and transmits, to the corresponding conference terminal, the local video bitstream in the local video bitstream set. The conference terminal receives the local video bitstream in the local video bitstream set and transmitted by the server, and processes the local video bitstream to obtain a video. The embodiment overcomes a problem of a low access capacity of a server in the prior art and resulting from the server using large portions of CPU and memory resources to process a local video bitstream, significantly increases a conference terminal access capacity of the server, and ensures favorable user experience.

Description

Server, site terminal, and cloud conference processing method
Technical field
The present invention relates to the field of video conference communication, and in particular to a server, a site terminal, and a cloud conference processing method.
Background
Current cloud conference systems run on a public cloud or a private cloud. Holding a video conference on a public or private cloud occupies server resources such as the central processing unit (CPU) and memory, and these resources (number of CPU cores, amount of memory) must be reserved in advance. In the prior art, the server CPU is mainly used to decode and encode video. In a conference in which the streams of all terminals must be decoded and encoded (full encoding and full decoding), or in a multi-picture conference, the number of code streams that can be processed is correspondingly small because each server CPU has relatively few cores. For example, a server with a 16-core CPU can provide 32 virtual cores; in one cloud video conference, the number of site terminals that can be added with the H.264 codec format at 720P high definition (resolution: 1280*720 pixels) equals the total number of virtual cores, 32, and the number of site terminals that can be added with H.264 4CIF standard definition (resolution: 704*576 pixels) is 64. The access capacity of the cloud server is thus relatively small, and resource utilization is not high.
Two solutions to this problem exist in the prior art: the first is to directly increase the number of CPU cores of the server, and the second is to reduce the video quality. The first solution obviously increases production cost and, because server configurations are continually being upgraded anyway, does not solve the problem fundamentally; the second solution obviously degrades the user experience and likewise treats only the symptoms.
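The capacity figures in the example above can be reproduced with a little arithmetic. In the sketch below, the two virtual cores per physical core and the number of terminals served per virtual core (one at 720P, two at 4CIF) are not stated in the text; they are inferred from the quoted totals of 32 virtual cores, 32 terminals, and 64 terminals, and all names are hypothetical.

```python
# Figures taken from the background example; the per-core ratios are inferred.
PHYSICAL_CORES = 16
VIRTUAL_CORES_PER_PHYSICAL = 2  # 16 physical cores -> 32 virtual cores
TERMINALS_PER_VIRTUAL_CORE = {"720P": 1, "4CIF": 2}  # assumption, see lead-in
RESOLUTION = {"720P": (1280, 720), "4CIF": (704, 576)}


def virtual_cores() -> int:
    """Total virtual cores available on the server."""
    return PHYSICAL_CORES * VIRTUAL_CORES_PER_PHYSICAL


def max_terminals(fmt: str) -> int:
    """Site terminals the server can admit when it transcodes every stream."""
    return virtual_cores() * TERMINALS_PER_VIRTUAL_CORE[fmt]


def pixels(fmt: str) -> int:
    """Pixels per frame for the given format."""
    width, height = RESOLUTION[fmt]
    return width * height


if __name__ == "__main__":
    for fmt in ("720P", "4CIF"):
        print(fmt, max_terminals(fmt), "terminals,", pixels(fmt), "pixels per frame")
```

A 720P frame has 25/11 (roughly 2.3) times as many pixels as a 4CIF frame, which is consistent with a core handling about twice as many 4CIF streams.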
Summary of the invention
Embodiments of the present invention provide a server, a site terminal, and a cloud conference processing method, intended to solve the prior-art problems of low server access capacity and poor user experience caused by excessive reliance on the server for video encoding and decoding.
To solve the above technical problem, the present invention provides a cloud conference processing method, including the following steps:
Receiving the local video code stream reported by each site terminal;
For each site terminal, determining according to a preset rule the set of local video code streams to be delivered to it, where the local video code stream set includes at least one local video code stream;
Sending the local video code streams included in the local video code stream set to the corresponding site terminal.
In an embodiment of the present invention, when the local video code stream of each site terminal is received, the method further includes:
Receiving a first audio code stream from each site terminal;
Processing the first audio code streams of the site terminals and sending the result to each site terminal.
In an embodiment of the present invention, processing the first audio code streams of the site terminals and sending the result to each site terminal includes:
Decoding the first audio code streams to obtain at least one audio;
Mixing the audio;
Encoding the mixed audio to obtain a second audio code stream;
Sending the second audio code stream to each of the site terminals.
In an embodiment of the present invention, the preset rule includes: the local video code stream set includes the local video code stream of a broadcast source;
Sending the local video code streams included in the local video code stream set to the corresponding site terminal includes: sending the local video code stream of the broadcast source to the corresponding site terminal.
In an embodiment of the present invention, the preset rule further includes: on receiving a request from the site terminal, the local video code stream set includes the local video code stream requested by the site terminal;
Sending the local video code streams included in the local video code stream set to the corresponding site terminal further includes: sending the local video code stream requested by the site terminal to that site terminal.
In an embodiment of the present invention, the present invention further provides a cloud conference processing method, including the following steps:
sending the local video code stream of a site terminal to a server;
receiving the local video code streams included in the local video code stream set delivered by the server;
processing the received local video code streams to obtain a video.
In an embodiment of the present invention, receiving the local video code streams included in the local video code stream set delivered by the server includes: receiving the local video code stream of a broadcast source.
In an embodiment of the present invention, receiving the local video code streams included in the local video code stream set delivered by the server further includes: sending a request to the server; and receiving the requested local video code stream.
In an embodiment of the present invention, when the local video code stream of the site terminal is sent to the server, the method further includes: sending a first audio code stream of the site terminal to the server.
In an embodiment of the present invention, after the first audio code stream of the site terminal is sent to the server, the method further includes: receiving a second audio code stream from the server.
In an embodiment of the present invention, processing the local video code streams to obtain a video includes: decoding the local video code streams to obtain at least one sub-video; and compositing the sub-videos to obtain the video.
In an embodiment of the present invention, the present invention further provides a server, including:
a first receiving module, configured to receive the local video code stream reported by each site terminal;
a determining module, configured to determine, for each site terminal and according to a preset rule, the local video code stream set to be delivered to that site terminal, where the local video code stream set includes at least one of the local video code streams;
a first sending module, configured to send the local video code streams included in the local video code stream set to the corresponding site terminal.
In an embodiment of the present invention, the server further includes a first audio receiving module and an audio processing module; the first audio receiving module is configured to receive a first audio code stream from each site terminal; the audio processing module is configured to process the first audio code streams and send the result to each site terminal.
In an embodiment of the present invention, the audio processing module includes an audio decoding module, a mixing module, an encoding module, and a first audio sending module; the audio decoding module is configured to decode the first audio code streams to obtain at least one audio signal; the mixing module is configured to mix the audio signals; the encoding module is configured to encode the mixed audio to obtain a second audio code stream; the first audio sending module is configured to send the second audio code stream to each of the site terminals.
In an embodiment of the present invention, the preset rule includes: the local video code stream set includes the local video code stream of a broadcast source; the first sending module includes a video sending module, configured to send the local video code stream of the broadcast source to the corresponding site terminal.
In an embodiment of the present invention, the preset rule further includes: receiving a request from the site terminal, where the local video code stream set includes the local video code stream requested by the site terminal; the video sending module is further configured to forward the local video code stream requested by the site terminal directly to that site terminal.
In an embodiment of the present invention, the present invention further provides a site terminal, including:
a second sending module, configured to send the local video code stream of the site terminal to a server;
a second receiving module, configured to receive the local video code streams included in the local video code stream set delivered by the server;
a video processing module, configured to process the received local video code streams to obtain a video.
In an embodiment of the present invention, the second receiving module includes a video receiving module, configured to receive the local video code stream of a broadcast source.
In an embodiment of the present invention, the second receiving module further includes a request sending module, configured to send a request to the server; the video receiving module is further configured to receive the requested local video code stream.
In an embodiment of the present invention, the site terminal further includes a second audio sending module, configured to send a first audio code stream of the site terminal to the server when the local video code stream of the site terminal is sent to the server.
In an embodiment of the present invention, the site terminal further includes a second audio receiving module, configured to receive a second audio code stream from the server.
In an embodiment of the present invention, the video processing module includes a video decoding module and a compositing module; the video decoding module is configured to decode the local video code streams to obtain at least one sub-video; the compositing module is configured to composite the sub-videos to obtain the video.
Beneficial effects:
The present invention provides a server, a site terminal, and a cloud conference processing method. The server receives the local video code streams sent by the site terminals and then, for each site terminal, determines according to a preset rule the local video code stream set to be delivered to that terminal, the set including at least one local video code stream; the server sends the local video code streams included in the set to the corresponding site terminal; the site terminal receives these local video code streams and processes them to obtain a video. This overcomes the problem in the prior art that having the server process the local video code streams leads to high server CPU and memory usage and therefore low server access capacity. It greatly increases the number of site terminals a server can admit while preserving the user experience.
Brief description of the drawings
FIG. 1 is a flowchart of a cloud conference processing method according to an embodiment of the present invention;
FIG. 2 is a flowchart of a cloud conference processing method according to an embodiment of the present invention;
FIG. 3 is a flowchart of a cloud conference processing method according to an embodiment of the present invention;
FIG. 4 is a flowchart of a cloud conference processing method according to an embodiment of the present invention;
FIG. 5 is a schematic module diagram of a server according to an embodiment of the present invention;
FIG. 6 is a schematic module diagram of a site terminal according to an embodiment of the present invention.
Detailed description
The idea of the present invention is as follows: operations such as decoding and compositing the local video code streams, which were originally performed by the server, are instead performed by the site terminals, and each site terminal can customize the result to its own needs. The server no longer decodes or composites the local video code streams, which saves server CPU and memory. The factor that limits the server's access capacity thus changes from the server's hardware configuration to the total bandwidth of the server's network ports, which greatly increases the access capacity.
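The bandwidth bound described above can be illustrated with a rough estimate. The following is a minimal sketch, not part of the claimed method: the function name, the single aggregate bandwidth budget covering both directions, and the uniform per-stream bit rate are all assumptions made for illustration.

```python
def max_terminals_by_bandwidth(total_bandwidth_mbps, stream_bitrate_mbps,
                               streams_per_terminal):
    """Rough upper bound on admitted terminals when forwarding is the bottleneck.

    Assumption: each terminal uploads one stream and downloads
    `streams_per_terminal` streams, all at the same bit rate, and both
    directions are charged against one aggregate port-bandwidth budget.
    """
    per_terminal = (1 + streams_per_terminal) * stream_bitrate_mbps
    return int(total_bandwidth_mbps // per_terminal)


# Hypothetical figures: 10 Gbit/s of port bandwidth, 2 Mbit/s streams,
# four forwarded streams per terminal.
print(max_terminals_by_bandwidth(10_000, 2, 4))
```

Under these assumed figures the bound depends only on bit rate and fan-out, not on how many streams the server would otherwise have to decode and composite.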
Specific embodiments of the present invention are further described below with reference to the accompanying drawings.
This embodiment provides a cloud conference processing method. Referring to FIG. 1, the method includes:
S101. Receive the local video code streams of the site terminals: receive the local video code stream reported by each site terminal;
S102. Determine the local video code stream sets to be delivered according to a preset rule: for each site terminal, determine according to the preset rule the local video code stream set to be delivered to that terminal, where the local video code stream set includes at least one local video code stream;
S103. Send the local video code streams respectively to the corresponding site terminals: send the local video code streams included in each local video code stream set to the corresponding site terminal.
During a cloud conference, every site terminal sends its own local video code stream to the server. The server no longer decodes or composites the streams sent by the site terminals. Instead, for each site terminal, it determines according to a preset rule the local video code stream set to be delivered to that terminal, the set including at least one local video code stream, and then sends the streams in that set to the terminal. This saves server resources and lets the site terminals share the work of processing the local video code streams. The server sends local video code streams to each site terminal separately, so the site terminals do not interfere with one another. Because more than one site terminal takes part in the conference, there are many local video code streams. The word "respectively" in step S103 therefore means that a site terminal may well need more than one local video code stream, in which case each stream it needs is forwarded separately. The server forwards the received streams directly, without processing them: the streams delivered to a site terminal are the original local video code streams that site terminals sent to the server. A local video code stream set is a set drawn from the original local video code streams reported to the server by the terminals; the sets corresponding to different site terminals may be identical, partly identical, or entirely different.
The preset rule may include: taking the local video code stream of the broadcast source as part of the local video code stream set, that is, the set includes the local video code stream of the broadcast source. It may further include: receiving a request from a site terminal and taking the local video code stream requested by that site terminal as part of the set, that is, the set includes the local video code stream requested by the site terminal.
In general, the local video code streams sent to a given site terminal are those of other site terminals; a site terminal has no need to receive its own local video code stream back from the server. In a cloud conference, every site terminal should display the picture of at least one site terminal, and often displays several. More than one site terminal takes part, and not every site terminal needs to see all the others: some may view the video of only one site terminal in order to focus on what matters, while others may choose to view the video of every site terminal. The server therefore sends each site terminal the local video code streams of exactly those site terminals that it needs.
A cloud conference often has a broadcast source, which is itself one of the site terminals and directs the conference as a whole. The local video code stream of this site terminal usually has to reach every site terminal. In that case the local video code stream set includes the local video code stream of the broadcast source, and the server forwards that stream directly to every other site terminal. The broadcast source, in turn, needs to watch one or more particular site terminals during the conference, so the local video code streams of those site terminals must be sent to the broadcast source. The broadcast source may watch other site terminals or itself; a site terminal watched by the broadcast source is called the terminal viewed by the broadcast source. A cloud conference thus typically has these two special kinds of site terminal: the local video code stream of the broadcast source is sent to every site terminal, and the local video code stream of the terminal viewed by the broadcast source is sent to the broadcast source. The terminal viewed by the broadcast source can change during a conference, possibly often; this does not affect the implementation of this solution. The broadcast source itself may also change during a conference for various reasons, and after such a change the local video code stream of the new broadcast source should be sent to each site terminal.
At any time during a cloud conference, whether at the start or partway through, a site terminal can send a request to the server. The request identifies at least the site terminals that the requesting terminal wants to view. Based on the request, the server determines the local video code streams the requesting terminal needs, that is, the local video code stream set includes the local video code streams requested by the site terminal, and sends those streams to it. In other words, each site terminal should receive at least the local video code stream of the broadcast source and may additionally receive the local video code streams of other site terminals.
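The preset rule and the direct forwarding described above can be sketched as follows. This is an illustrative sketch only: the function names, the use of terminal IDs as dictionary keys, and the choice to always skip a terminal's own stream are assumptions, not details given in the description.

```python
def select_stream_set(terminal_id, reported, broadcast_source=None, requested=()):
    """Return the IDs of the streams to forward to `terminal_id`.

    `reported` maps terminal ID -> the original (still encoded) stream bytes.
    The set holds the broadcast source's stream plus any requested streams.
    """
    wanted = []
    if broadcast_source is not None:
        wanted.append(broadcast_source)
    wanted.extend(requested)

    selected = []
    for source_id in wanted:
        # Assumption: skip the terminal's own stream, unknown IDs and duplicates.
        if source_id == terminal_id or source_id not in reported or source_id in selected:
            continue
        selected.append(source_id)
    return selected


def forward(reported, broadcast_source, requests):
    """Map each terminal to the untouched payloads it should receive."""
    return {
        terminal_id: [
            reported[source_id]  # forwarded as received: no decoding, no compositing
            for source_id in select_stream_set(
                terminal_id, reported, broadcast_source, requests.get(terminal_id, ())
            )
        ]
        for terminal_id in reported
    }


reported = {"T1": b"stream-1", "T2": b"stream-2", "T3": b"stream-3"}
deliveries = forward(reported, broadcast_source="T1",
                     requests={"T1": ["T2"], "T2": ["T3"]})
print(deliveries)
```

In this sketch the broadcast source T1 receives only the terminal it asked to view, while T2 and T3 each receive the broadcast source's stream plus whatever they requested. A terminal that is the broadcast source and requests nothing would get an empty set here, so a real server would need a default choice to satisfy the "at least one stream" condition.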
When the server receives the local video code streams of the site terminals, it also receives a first audio code stream from each site terminal. The server could, admittedly, forward the first audio code streams directly to the site terminals for processing, but audio processing occupies far less server CPU and memory than video processing, so this is unnecessary. Moreover, in practice a participating site terminal does not need to see the picture of every site terminal but generally does need to hear all of them for the conference to proceed normally. If a site terminal received the first audio code streams selectively, it could miss what some other site terminals say, and if a missed statement is important, the quality of the conference suffers directly. In this embodiment, therefore, the server itself processes the first audio code streams sent by the site terminals and then sends the result to each site terminal. The processing includes: decoding the first audio code streams to obtain at least one audio signal; mixing the audio signals; encoding the mixed audio to obtain a second audio code stream; and sending the second audio code stream to each site terminal.
The decoding of the first audio code streams and/or the encoding of the second audio code stream may use any feasible waveform, parametric, or hybrid codec. Waveform codecs include Pulse Code Modulation (PCM), Adaptive Differential Pulse Code Modulation (ADPCM), Sub-band Adaptive Differential Pulse Code Modulation (SB-ADPCM), and so on. Parametric codecs include Linear Predictive Coding (LPC) and the like. Hybrid codecs include Code Excited Linear Prediction (CELP), Vector Sum Excited Linear Prediction (VSELP), Regular Pulse Excitation with Long-Term Prediction (RPE-LTP), Low-Delay Code Excited Linear Prediction (LD-CELP), Multi-Pulse Excitation (MPE), and so on.
Audio mixing (often shortened to "mix") is a step in audio processing that combines sound from multiple sources into a single sound. In this embodiment each audio signal comes from a different site terminal. During mixing, the frequency, dynamics, sound quality, positioning, reverberation, and sound field of each audio signal are adjusted individually to optimize each track, and the tracks are then superimposed into the final output. This produces audio in which the individual sources remain distinct. Mixing can be performed by mixing software. Once the mixed audio has been encoded, it can be sent to each site terminal. The server may process all of the audio, but in some cases it may process the audio code streams of only some site terminals, for example because a site terminal does not need to speak, or because the broadcast source coordinates the speakers so that they take turns.
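The decode, mix, and encode sequence can be sketched as follows. This is a simplified illustration that assumes every first audio code stream is raw 16-bit little-endian PCM at the same sample rate; the function names are hypothetical, and the per-track adjustment of frequency, dynamics, and so on mentioned above is omitted in favor of plain summation with clipping.

```python
import struct

INT16_MIN, INT16_MAX = -32768, 32767


def decode_pcm16(stream):
    """Decode little-endian 16-bit PCM bytes into a list of sample values."""
    count = len(stream) // 2
    return list(struct.unpack(f"<{count}h", stream[:count * 2]))


def mix(tracks):
    """Sum the tracks sample by sample and clip to the 16-bit range."""
    length = max((len(track) for track in tracks), default=0)
    mixed = []
    for i in range(length):
        total = sum(track[i] for track in tracks if i < len(track))
        mixed.append(max(INT16_MIN, min(INT16_MAX, total)))
    return mixed


def encode_pcm16(samples):
    """Encode sample values back into little-endian 16-bit PCM bytes."""
    return struct.pack(f"<{len(samples)}h", *samples)


def process_audio(first_streams):
    """First audio code streams in, one second audio code stream out."""
    tracks = [decode_pcm16(stream) for stream in first_streams]
    return encode_pcm16(mix(tracks))


stream_a = encode_pcm16([1000, 30000, -5])
stream_b = encode_pcm16([2000, 10000])
second_stream = process_audio([stream_a, stream_b])
print(decode_pcm16(second_stream))
```

A production mixer would typically also exclude each listener's own voice from the mix it receives and apply gain control rather than hard clipping; neither is specified in this embodiment.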
This embodiment provides a cloud conference processing method. Referring to FIG. 2, the method includes:
S201. Send the local video code stream of the site terminal to the server;
S202. Receive local video code streams: receive the local video code streams included in the local video code stream set delivered by the server;
S203. Process the local video code streams to obtain a video.
In a cloud conference, every site terminal that joins the conference should send its own local video code stream to the server. Although the site terminals are in the same cloud conference, each sends its local video code stream independently, that is, each site terminal separately sends its own stream to the server. Likewise, each site terminal independently receives from the server the local video code stream of at least one other site terminal, and the site terminals do not interfere with one another.
Receiving the local video code streams included in the local video code stream set delivered by the server means the following. In a cloud conference, the server receives the local video code stream sent by each site terminal and then sends local video code streams to each site terminal separately. The streams a site terminal receives should be those of other site terminals; it has no need to receive its own. As for which streams the set includes: in general, every site terminal in a cloud conference should display the picture of at least one site terminal, and often displays several. More than one site terminal takes part, and not every site terminal needs to see all the others: some may view the video of only one site terminal in order to focus on what matters, while others may choose to view the video of every site terminal. The server therefore sends each site terminal the local video code streams of exactly those site terminals that it needs. A local video code stream set is a set drawn from the original local video code streams reported to the server by the terminals; the sets corresponding to different site terminals may be identical, partly identical, or entirely different.
A cloud conference often has a broadcast source, which is itself one of the site terminals and directs the conference as a whole. The local video code stream of this site terminal usually has to reach every site terminal, so each site terminal needs to receive the local video code stream of the broadcast source, that is, the local video code stream set includes the local video code stream of the broadcast source. This stream is, of course, still delivered by the server. The broadcast source, in turn, needs to watch one or more particular site terminals during the conference, so the local video code streams of those site terminals must be sent to the broadcast source; in other words, the broadcast source needs to receive the local video code stream of the terminal it views. The broadcast source may watch other site terminals or itself; a site terminal watched by the broadcast source is called the terminal viewed by the broadcast source. A cloud conference thus typically has these two special kinds of site terminal: the local video code stream of the broadcast source is sent to every site terminal, and the local video code stream of the terminal viewed by the broadcast source is sent to the broadcast source. The terminal viewed by the broadcast source can change during a conference, possibly often; this does not affect the implementation of this solution. The broadcast source itself may also change during a conference for various reasons, and after such a change the local video code stream of the new broadcast source should be sent to each site terminal.
At any time during a cloud conference, whether at the start or partway through, a site terminal can send a request to the server. The request identifies at least the site terminals that the requesting terminal wants to view. The server sends the corresponding local video code streams to the requesting terminal according to the request, so the local video code stream set includes the local video code streams requested by the site terminal. In other words, each site terminal should receive at least the local video code stream of the broadcast source and may additionally receive the local video code streams of other site terminals.
After at least one local video code stream has been received, the site terminal can process it. The processing includes: decoding the local video code streams to obtain at least one sub-video, and then compositing the sub-videos to obtain a video. A site terminal may receive several local video code streams, including that of the broadcast source, and decodes each of them. The main video coding formats are H.261, H.263, and H.264, and any of them may be used. Once the streams have been decoded into sub-videos, the site terminal can composite them as it chooses. The composited picture may have any layout. Some common layouts are: first, the sub-videos of the site terminals are tiled evenly across the picture at equal size; second, one sub-video is the main video and occupies the largest area while the other sub-videos are arranged around it as secondary videos, where the main video may be that of the broadcast source or of the site terminal that is currently the main speaker; third, the layout is dynamic, for example the picture of whoever is speaking is enlarged to make it stand out. Any of these layouts, or others not mentioned here, is applicable in this embodiment as long as the video is displayed properly to the participants at that site terminal. Moving the decoding and compositing of the local video code streams to the site terminals in this way means that each site terminal processes its streams independently. Each site terminal can therefore be customized individually, which gives a good user experience, and server CPU and memory usage is reduced so that the server can admit more site terminals, which improves the efficiency of the conference.
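The first two layouts described above reduce to computing a target rectangle for each decoded sub-video before compositing. The following is a minimal sketch under assumed conventions: rectangles are `(x, y, width, height)` in pixels, the grid is as close to square as possible, and the secondary pictures occupy a right-hand strip one quarter of the canvas wide. None of these choices comes from the description.

```python
import math


def grid_layout(count, width, height):
    """Layout 1: tile `count` sub-videos at equal size across the canvas."""
    if count <= 0:
        return []
    cols = math.ceil(math.sqrt(count))
    rows = math.ceil(count / cols)
    tile_w, tile_h = width // cols, height // rows
    return [((i % cols) * tile_w, (i // cols) * tile_h, tile_w, tile_h)
            for i in range(count)]


def main_with_strip(count, width, height, main_index=0):
    """Layout 2: one large main picture, the others stacked in a side strip."""
    if count <= 0:
        return []
    if count == 1:
        return [(0, 0, width, height)]
    strip_w = width // 4              # assumed share of the canvas for the strip
    tile_h = height // (count - 1)
    rects, slot = [], 0
    for i in range(count):
        if i == main_index:
            rects.append((0, 0, width - strip_w, height))
        else:
            rects.append((width - strip_w, slot * tile_h, strip_w, tile_h))
            slot += 1
    return rects


print(grid_layout(4, 1280, 720))
print(main_with_strip(3, 1280, 720))
```

The third, dynamic layout could reuse `main_with_strip` by switching `main_index` to whichever site terminal is currently speaking; how the speaker is detected is left open here.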
While sending its local video code stream to the server, each conference terminal may also send its first audio code stream to the server. Admittedly, the conference terminals could process audio code streams in the same way as local video code streams, but audio has its own characteristics: compared with video processing, audio processing occupies less of the server's CPU and memory, so this is unnecessary. In addition, in practice the participating conference terminals do not need to see the picture of every conference terminal, but they generally need to hear the sound of all conference terminals for the conference to proceed normally. Therefore, after each conference terminal sends its first audio code stream to the server, the server processes the first audio code streams to obtain a second audio code stream and sends the second audio code stream to every conference terminal. The server processes the first audio code streams in the same way as in the embodiment above, which is not repeated here.
This embodiment provides a cloud conference processing method. Referring to FIG. 3, the method includes:
S301. The server calls the conference terminals to join the conference: the server calls the participating conference terminals T1-Tn to join the conference;
The initiator of the cloud conference is generally the broadcast source; the broadcast source initiates the call through the server to set up the cloud conference.
S302. The conference terminals join the conference and send their local video code streams and first audio code streams to the server: each conference terminal T1-Tn joins the conference and sends its local video code stream and first audio code stream to the server;
On joining the conference, each conference terminal that received the conference request sends its own local video code stream and audio code stream to the server.
S303. Determine the broadcast source and the terminal viewed by the broadcast source: the server determines that conference terminal Tx is the broadcast source of the conference and that conference terminal Ty is the terminal viewed by the broadcast source Tx;
Because every participating conference terminal needs at least the picture of the broadcast source, the server needs to determine which conference terminal is the broadcast source. Because the broadcast source is the initiator of the conference, this step may also be performed at the very beginning.
S304. The server sends the local video code stream of the broadcast source and the second audio code stream to each conference terminal: the server passes the received local video code stream of the broadcast source Tx through, without decoding or encoding it, and forwards it to the conference terminals T1-Tn that have joined the conference; it forwards the received local video code stream of conference terminal Ty to the broadcast source Tx. At the same time, the cloud server decodes, mixes and re-encodes the first audio code streams sent by the conference terminals to form the second audio code stream, and sends it back to all conference terminals;
The server no longer decodes and composites the local video code streams sent by the conference terminals. Instead, it directly sends to each conference terminal the local video code stream of at least one other conference terminal. This effectively saves server resources and lets the conference terminals share the work of processing local video code streams.
The local video code stream of the broadcast source should be sent to every other conference terminal, that is, to the conference terminals among T1-Tn other than Tx; the local video code stream of Ty, the terminal viewed by the broadcast source, is sent to the broadcast source.
Meanwhile, the server also receives the first audio code streams from the conference terminals, decodes them to obtain at least one audio signal, mixes these audio signals, encodes the mixed audio to obtain the second audio code stream, and sends the second audio code stream to every conference terminal.
S305. Has a request been received from a conference terminal? If no request has been received from a conference terminal, the flow of S304 continues;
At this point the server has already sent the local video code stream of the broadcast source and the second audio code stream to each conference terminal. If no conference terminal needs to view the picture of another conference terminal, the server continues with step S304 and the cloud conference goes on.
S306. Determine the local video code streams requested by the conference terminal: when a request is received from a conference terminal, the server determines from the content of the request which conference terminal or terminals' local video code streams are being requested;
When a conference terminal wants to view the pictures of certain conference terminals, it sends a request to the server, and the server determines from the request which conference terminal or terminals' local video code streams that conference terminal wants to view.
S307. Send the requested local video code streams to the terminal: the server sends the corresponding local video code streams to the conference terminal.
After the server has determined which conference terminals' local video code streams the requesting conference terminal needs, it sends those local video code streams to that conference terminal, which then processes them to obtain the desired video picture.
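The default video forwarding of step S304 can be sketched as a routing table that maps each receiving conference terminal to the terminals whose unmodified streams it is sent. This is a minimal sketch: the function name s304_video_routes and the use of string identifiers such as "T1" are assumptions made for illustration.

```python
from typing import Dict, List


def s304_video_routes(terminals: List[str], source: str,
                      viewed: str) -> Dict[str, List[str]]:
    """Map each receiver to the senders whose streams are passed through to it."""
    # Every terminal other than the broadcast source Tx receives Tx's stream.
    routes = {t: [source] for t in terminals if t != source}
    # The broadcast source receives the stream of Ty, the terminal it views.
    routes[source] = [viewed]
    return routes
```

For terminals T1 to T3 with T1 as broadcast source and T2 as the terminal it views, T2 and T3 would receive the stream of T1, and T1 would receive the stream of T2. Requests handled in S305 to S307 would then add further entries for the requesting terminal.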
This embodiment provides a cloud conference processing method. Referring to FIG. 4, the method includes:
S401. The local conference terminal receives a request to join the conference;
At this point the broadcast source has initiated the cloud conference, and the corresponding conference terminals need to join.
S402. The local conference terminal joins the conference, encodes its local video code stream and first audio code stream, and sends them to the server;
On joining the conference, the local conference terminal sends its local video code stream and first audio code stream to the server; both are produced by encoding in their respective coding formats.
S403. Receive the local video code stream and the second audio code stream: the local conference terminal receives the local video code stream of the broadcast source and the second audio code stream sent by the cloud server, together with the list of participating conference terminals;
Every conference terminal needs to receive the local video code stream of the broadcast source, that is, every conference terminal needs to see the picture of the broadcast source. After receiving the first audio code streams sent by the conference terminals, the server processes them (decoding, mixing and re-encoding) and sends the result to each conference terminal.
Besides receiving the local video code stream of the broadcast source and the second audio code stream, the local conference terminal should also know which conference terminals are participating, so that it can decide whether to view other conference terminals.
S404. Process the local video code stream and the second audio code stream to form video and audio: the local conference terminal processes the received local video code stream and second audio code stream separately to form video and audio;
The local video code stream received by the conference terminal at this point is that of the broadcast source. Processing a local video code stream includes decoding followed by compositing; when there is only one local video code stream, no compositing is needed and decoding it directly yields the video. The second audio code stream is produced by the server from the first audio code streams of the conference terminals; after decoding it, the local conference terminal obtains the mixed audio of all conference terminals.
S405. Send a request? If no request is sent to the server, the local conference terminal continues with steps S403 and S404;
If the local conference terminal has no need for a multi-picture view, it simply continues to view the picture of the broadcast source.
S406. Send a request to the server: the local conference terminal first determines from the list of participating conference terminals which conference terminals' pictures it needs to view, and then sends the server a request message containing the list of conference terminals to be viewed;
If the local conference terminal needs to view the images of other conference terminals, it first determines which conference terminals it needs to view and then sends a list identifying these conference terminals to the server.
S407. The server sends the local video code streams of these terminals to the conference terminal: the cloud server finds the corresponding conference terminals from the list in the request and forwards their local video code streams to the local conference terminal;
S408. The local conference terminal receives the local video code streams and composites them to obtain the video: after receiving these local video code streams, the local conference terminal decodes each of them, composites them into a multi-picture view, and outputs it to the display device.
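Steps S405 and S406 above can be sketched from the terminal's side as follows. The embodiment does not specify a message format, so the dictionary keys "from" and "view" and the function name build_view_request are hypothetical, chosen only to show that the request carries the list of conference terminals to be viewed.

```python
from typing import Dict, Iterable, List, Optional, Union

Request = Dict[str, Union[str, List[str]]]


def build_view_request(self_id: str, participants: Iterable[str],
                       wanted: Iterable[str]) -> Optional[Request]:
    """Return the S406 request, or None when no request is needed (S405)."""
    known = set(participants)
    # Keep only terminals in the participant list, excluding this terminal
    # itself; dict.fromkeys removes duplicates while preserving order.
    to_view = list(dict.fromkeys(
        t for t in wanted if t in known and t != self_id))
    if not to_view:
        return None  # stay on S403/S404 and keep viewing the broadcast source
    return {"from": self_id, "view": to_view}
```

A terminal T2 that wants to view T3 in a conference of T1 to T3 would produce a request listing only T3; identifiers outside the participant list are dropped before anything is sent.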
This embodiment provides a server. Referring to FIG. 5, the server includes:
a first receiving module 101, configured to receive the local video code streams reported by the conference terminals;
a determining module 105, configured to determine, for each conference terminal and according to a preset rule, the local video code stream set to be delivered to it, the local video code stream set including at least one local video code stream;
a first sending module 102, configured to send the local video code streams included in the local video code stream set to the corresponding conference terminal.
During a cloud conference, every conference terminal needs to send its own local video code stream to the server. The server no longer decodes and composites the local video code streams sent by the conference terminals. Instead, for each conference terminal it determines, according to the preset rule, the local video code stream set to be delivered to that conference terminal, the set including at least one local video code stream, and then sends the local video code streams in that set to the conference terminal. This effectively saves server resources and lets the conference terminals share the work of processing local video code streams. The server sends local video code streams to each conference terminal separately, so the conference terminals do not interfere with one another. Because there is more than one participating conference terminal, there are also many local video code streams. "Separately", as applied to the first sending module 102, therefore means that a conference terminal is likely to need more than one local video code stream, and each of the local video code streams it needs is forwarded individually. The server forwards the received local video code streams directly, without processing them: the local video code streams forwarded to each conference terminal are the original local video code streams that the conference terminals sent to the server. The local video code stream set is a set of local video code streams reported to the server by the terminals; it consists of original local video code streams, and the sets corresponding to different conference terminals may be identical, partly identical, or entirely different.
The preset rule may include: taking the local video code stream of the broadcast source as part of the local video code stream set, that is, the set includes the local video code stream of the broadcast source. It may also include: receiving a request from a conference terminal and taking the local video code streams requested by that conference terminal as part of the set, that is, the set includes the local video code streams requested by the conference terminal.
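The two rules above can be sketched as a small helper that derives the local video code stream set of each conference terminal. The class name StreamSetRule and its methods are hypothetical, and two behaviours are assumptions of this sketch rather than statements of the embodiment: a later request replaces an earlier one, and identifiers of terminals that reported no stream are ignored.

```python
from typing import Dict, Iterable, Set


class StreamSetRule:
    """Hypothetical helper: tracks view requests and derives stream sets."""

    def __init__(self, reported: Iterable[str], broadcast_source: str) -> None:
        self.reported = set(reported)  # terminals whose streams the server holds
        self.broadcast_source = broadcast_source
        self.requested: Dict[str, Set[str]] = {}

    def on_request(self, terminal: str, wanted: Iterable[str]) -> None:
        # Assumption: the most recent request replaces any earlier one.
        self.requested[terminal] = set(wanted)

    def stream_set(self, terminal: str) -> Set[str]:
        # Rule 1: the broadcast source. Rule 2: whatever the terminal requested.
        wanted = {self.broadcast_source} | self.requested.get(terminal, set())
        wanted.discard(terminal)       # a terminal is not sent its own stream
        return wanted & self.reported  # only streams that were actually reported
```

In this sketch the terminal viewed by the broadcast source is modelled as a request made by the broadcast source itself, so until that request arrives the set of the broadcast source is empty.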
Generally, the local video code streams sent to a conference terminal should be those of other conference terminals; the conference terminal does not need to receive its own local video code stream from the server. In a cloud conference, every conference terminal should show the picture of at least one conference terminal, and more often shows several pictures. There is more than one participating conference terminal, and not every conference terminal needs to see all of the others: some conference terminals may view the video of only one conference terminal in order to focus on what matters, while others may choose to view the video of all conference terminals. The server therefore needs to send each conference terminal the local video code streams of the conference terminals it wants, according to the needs of the different conference terminals.
A cloud conference often has a broadcast source, which is itself one of the conference terminals and directs the cloud conference as a whole. The local video code stream of this conference terminal therefore usually needs to be sent to every conference terminal. In that case the local video code stream set includes the local video code stream of the broadcast source, and the first sending module 102 includes a video sending module 1021 configured to send the local video code stream of the broadcast source to each of the other conference terminals. In addition, the broadcast source also needs to view one or more specific conference terminals during the conference, so the local video code streams of those conference terminals need to be sent to the broadcast source. The broadcast source may view other conference terminals or itself; the conference terminal viewed by the broadcast source is called the terminal viewed by the broadcast source. A cloud conference thus usually has these two special conference terminals: the local video code stream of the broadcast source needs to be sent to every conference terminal, and that of the terminal viewed by the broadcast source needs to be sent to the broadcast source. The terminal viewed by the broadcast source may change during a cloud conference, possibly frequently, which does not affect the implementation of this solution. The broadcast source itself may also change during a cloud conference for various reasons; after such a change, the local video code stream of the new broadcast source should be sent to each conference terminal.
In addition, the determining module 105 further includes a request receiving module 1051. At any time during the cloud conference, whether at the start or partway through, a conference terminal may send the server a request that identifies at least the conference terminals it needs to view. The request receiving module 1051 is configured to receive the request and determine the local video code streams needed by that conference terminal, that is, the local video code stream set includes the local video code streams requested by the conference terminal; the video sending module 1021 then sends the corresponding local video code streams to that conference terminal according to the request. In other words, each conference terminal should receive at least the local video code stream of the broadcast source and may also receive the local video code streams of other conference terminals. The video sending module 1021 is further configured to send the local video code streams requested by a conference terminal to that conference terminal.
This embodiment of the present invention further includes a first audio receiving module 103 and an audio processing module 104. While the server receives the local video code streams of the conference terminals, the first audio receiving module 103 receives their first audio code streams. Admittedly, the server could forward the first audio code streams of the conference terminals directly to each conference terminal for processing; however, audio processing occupies less of the server's CPU and memory than video processing, so this is unnecessary. In addition, in practice the participating conference terminals do not need to see the picture of every conference terminal, but they generally need to hear the sound of all conference terminals for the conference to proceed normally. If a conference terminal received first audio code streams selectively, it would miss what some other conference terminals say, and if what it missed were important, the quality of the conference would be directly affected. In this embodiment, therefore, the audio processing module 104 processes the first audio code streams sent by the conference terminals and sends the result to each conference terminal. The audio processing module 104 includes an audio decoding module 1041, a mixing module 1042, an encoding module 1043 and a first audio sending module 1044. The audio decoding module 1041 is configured to decode the first audio code streams to obtain at least one audio signal; the mixing module 1042 is configured to mix the audio signals; the encoding module 1043 is configured to encode the mixed audio to obtain the second audio code stream; and the first audio sending module 1044 is configured to send the second audio code stream to each conference terminal. Decoding of the first audio code streams and/or encoding of the second audio code stream may use any feasible waveform, parametric or hybrid codec. Waveform coding includes PCM, ADPCM, SB-ADPCM and the like; parametric coding includes LPC and the like; hybrid coding includes CELPC, VSLPC, RPE-LTP, LD-CELP, MPE and the like. Mixing is a step in audio processing that combines sound from multiple sources into one sound. In this embodiment the audio signals come from different conference terminals. During mixing, the frequency, dynamics, sound quality, positioning, reverberation and sound field of each audio signal are adjusted individually to optimize each track, and the tracks are then superimposed into the final result. This produces a clearly layered audio effect. Mixing may be performed by mixing software. The mixed audio is then encoded and can be sent to each conference terminal. The server may process all audio, but in some cases it may process the audio code streams of the conference terminals selectively, for example because a conference terminal does not need to speak, or because the broadcast source coordinates the speakers so that they speak in turn.
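The superimposing step performed by the mixing module 1042 can be sketched for already decoded audio as follows. This is a simplified illustration, assuming 16-bit PCM samples at a common sample rate; it only sums and clips the samples and does not perform the per-track adjustment of frequency, dynamics or reverberation described above. The function name mix_pcm16 is hypothetical.

```python
from typing import List, Sequence

INT16_MIN, INT16_MAX = -32768, 32767


def mix_pcm16(tracks: Sequence[Sequence[int]]) -> List[int]:
    """Sum decoded 16-bit PCM tracks sample by sample, clipping to int16."""
    if not tracks:
        return []
    length = max(len(t) for t in tracks)
    mixed = []
    for i in range(length):
        # Tracks shorter than the longest one contribute silence at the end.
        total = sum(t[i] for t in tracks if i < len(t))
        # Clip instead of wrapping around to avoid harsh overflow artifacts.
        mixed.append(max(INT16_MIN, min(INT16_MAX, total)))
    return mixed
```

The mixed samples would then be passed to the encoding module 1043 to produce the second audio code stream. A real mixer would typically also attenuate or normalize the tracks so that clipping is rare.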
This embodiment provides a conference terminal. Referring to FIG. 6, the conference terminal includes:
a second sending module 201, configured to send the local video code stream of the conference terminal to the server;
a second receiving module 202, configured to receive the local video code streams included in the local video code stream set delivered by the server;
a video processing module 203, configured to process the local video code streams to obtain the video.
In a cloud conference, every conference terminal that joins should send its own local video code stream to the server. Although the conference terminals are in the same cloud conference, each sends its local video code stream independently, that is, each conference terminal separately sends its own local video code stream to the server. Likewise, each conference terminal independently receives from the server the local video code stream of at least one other conference terminal, and the conference terminals do not interfere with one another.
Receiving the local video code streams included in the local video code stream set delivered by the server means the following: in a cloud conference, the server receives the local video code streams sent by the individual conference terminals and then sends local video code streams to each conference terminal. The local video code streams that a conference terminal receives should be those of other conference terminals; it does not need to receive its own local video code stream. In a cloud conference, every conference terminal should generally show the picture of at least one conference terminal, and more often shows several pictures. There is more than one participating conference terminal, and not every conference terminal needs to see all of the others: some conference terminals may view the video of only one conference terminal in order to focus on what matters, while others may choose to view the video of all conference terminals. The server therefore needs to send each conference terminal the local video code streams of the conference terminals it wants, according to the needs of the different conference terminals. The local video code stream set is a set of local video code streams reported to the server by the terminals; it consists of original local video code streams, and the sets corresponding to different conference terminals may be identical, partly identical, or entirely different.
A cloud conference often has a broadcast source, which is itself one of the conference terminals and directs the cloud conference as a whole. The local video code stream of this conference terminal therefore usually needs to be sent to every conference terminal. Accordingly, the second receiving module 202 includes a video receiving module 2021 configured to receive the local video code stream of the broadcast source, that is, the local video code stream set includes the local video code stream of the broadcast source; this stream is still received from the server. In addition, the broadcast source also needs to view one or more specific conference terminals during the conference, so the local video code streams of those conference terminals need to be sent to the broadcast source; in other words, the broadcast source needs to receive the local video code stream of the terminal it views. The broadcast source may view other conference terminals or itself; the conference terminal viewed by the broadcast source is called the terminal viewed by the broadcast source. A cloud conference thus usually has these two special conference terminals: the local video code stream of the broadcast source needs to be sent to every conference terminal, and that of the terminal viewed by the broadcast source needs to be sent to the broadcast source. The terminal viewed by the broadcast source may change during a cloud conference, possibly frequently, which does not affect the implementation of this solution. The broadcast source itself may also change during a cloud conference for various reasons; after such a change, the local video code stream of the new broadcast source should be sent to each conference terminal.
In addition, the second receiving module 202 further includes a request sending module 2022, configured to send a request to the server. At any time during the cloud conference, whether at the start or partway through, a conference terminal may send the server a request that identifies at least the conference terminals it needs to view; the server then sends the corresponding local video code streams to that conference terminal according to the request. The local video code stream set includes the local video code streams requested by the conference terminal. In other words, each conference terminal should receive at least the local video code stream of the broadcast source and may also receive the local video code streams of other conference terminals. The video receiving module 2021 is further configured to receive the requested local video code streams, that is, the local video code streams of the conference terminals requested by the request sending module 2022.
After at least one local video code stream has been received, it can be processed. The video processing module 203 includes a video decoding module 2031 and a synthesis module 2032: the video decoding module 2031 is configured to decode the local video code streams to obtain at least one sub-video, and the synthesis module 2032 then composites the sub-videos into a video. A conference terminal may receive several local video code streams, including that of the broadcast source, and it decodes each of them. The main video codec formats are H.261, H.263, and H.264, and any of them may be used. Once the local video code streams have been decoded into at least one sub-video, the sub-videos can be composited according to the conference terminal's own preferences. The picture layout of the composited video may be arbitrary. Some common layouts are: first, the sub-videos of the conference terminals are the same size and evenly distributed across the picture; second, one sub-video serves as the main video and has the largest picture, while the other sub-videos serve as secondary videos arranged around the main picture, where the main video may be that of the broadcast source or that of the conference terminal that is currently the main speaker in the cloud conference; third, the composited picture may be dynamic, for example the video picture of whoever is speaking is enlarged so that it stands out. Any of these approaches, or any other compositing approach not mentioned here, is applicable in this embodiment, provided that it allows the video on the conference terminal to be displayed normally to the participating users. In this way, decoding and compositing of the local video code streams is moved to the respective conference terminals. Because each conference terminal processes its streams independently, each can customize its own view, which gives a good user experience, and CPU and memory usage on the server is reduced, so that the server can admit more conference terminals, which can improve the efficiency of the conference.
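The first two layouts mentioned above can be illustrated by computing where each decoded sub-video would be placed on the terminal's screen. The sketch below is an assumption-laden example: the function names, the bottom-strip arrangement, and the one-quarter strip height are illustrative choices, not details given in the source. Each function returns one `(x, y, width, height)` rectangle per sub-video.

```python
import math


def grid_layout(n, width, height):
    """Equal-size tiles, filled row by row. Returns (x, y, w, h) per sub-video."""
    if n <= 0:
        return []
    cols = math.ceil(math.sqrt(n))
    rows = math.ceil(n / cols)
    tile_w, tile_h = width // cols, height // rows
    return [((i % cols) * tile_w, (i // cols) * tile_h, tile_w, tile_h)
            for i in range(n)]


def main_plus_strip_layout(n, width, height, main_index=0):
    """One large main picture; the other sub-videos share a strip along the bottom."""
    if n <= 0:
        return []
    if n == 1:
        return [(0, 0, width, height)]
    strip_h = height // 4          # assumed share of the screen for secondary videos
    tile_w = width // (n - 1)
    rects, slot = [], 0
    for i in range(n):
        if i == main_index:
            rects.append((0, 0, width, height - strip_h))
        else:
            rects.append((slot * tile_w, height - strip_h, tile_w, strip_h))
            slot += 1
    return rects
```

The third, dynamic layout could reuse `main_plus_strip_layout` by recomputing the rectangles with `main_index` set to the current speaker whenever the speaker changes; how the speaker is detected is outside the scope of this sketch.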
In addition, the conference terminal further includes a second audio sending module 204, configured to send the conference terminal's first audio code stream to the server while the conference terminal's local video code stream is being sent to the server. Admittedly, the first audio code streams could be handled in a manner similar to the local video code streams, with the conference terminals doing the processing. However, audio code streams have their own characteristics: audio processing occupies less of the server's CPU and memory than video processing, so this is unnecessary. Moreover, in practice a participating conference terminal does not need to see the picture of every conference terminal during the conference, but it generally does need to hear the sound of all conference terminals for the conference to proceed normally. A second audio receiving module 205 is therefore also included: after each conference terminal sends its first audio code stream to the server, the server processes these streams to obtain a second audio code stream and then sends the second audio code stream to every conference terminal. The server processes the first audio code streams in the same way as in the embodiments above, which is not repeated here.
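The server-side audio path (decode each first audio code stream, mix, encode once into the second audio code stream) can be sketched as below. This is only an illustration under stated assumptions: decoded audio is modeled as lists of 16-bit PCM samples, mixing is a plain clipped sum, and the `decode` and `encode` callables are placeholders supplied by the caller, since the source does not name an audio codec or a mixing algorithm.

```python
def mix_pcm16(tracks):
    """Sum aligned 16-bit PCM tracks sample by sample and clip to the int16 range."""
    if not tracks:
        return []
    length = max(len(t) for t in tracks)
    mixed = []
    for i in range(length):
        # Shorter tracks simply contribute nothing once they run out of samples.
        total = sum(t[i] for t in tracks if i < len(t))
        mixed.append(max(-32768, min(32767, total)))
    return mixed


def process_audio(first_streams, decode, encode):
    """Hypothetical server pipeline: decode every first stream, mix, encode once."""
    decoded = [decode(stream) for stream in first_streams.values()]
    return encode(mix_pcm16(decoded))
```

Because the mix is encoded only once and the same result is sent to every terminal, the server cost grows with the number of decodes rather than with the number of per-terminal encodes, which is consistent with the text's point that audio processing is comparatively cheap.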
An embodiment of the present invention further provides a storage medium that includes a stored program, wherein the program, when run, performs any of the methods described above.
Optionally, in this embodiment, the storage medium may be configured to store program code for performing the steps of the method shown in any one of FIGS. 1 to 4.
Optionally, in this embodiment, the storage medium may include, but is not limited to, various media that can store program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
An embodiment of the present invention further provides a processor configured to run a program, wherein the program, when run, performs the steps of any of the methods described above.
Optionally, in this embodiment, the program is used to perform the steps of the method shown in any one of FIGS. 1 to 4.
Optionally, for specific examples in this embodiment, reference may be made to the examples described in the foregoing embodiments and optional implementations, which are not repeated here.
Obviously, those skilled in the art should understand that the modules or steps of the present invention described above can be implemented with general-purpose computing devices. They can be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by a computing device. In some cases the steps shown or described can be performed in an order different from the one given here, or the modules or steps can each be made into a separate integrated circuit module, or several of them can be made into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above are only preferred embodiments of the present invention and are not intended to limit it; those skilled in the art may make various modifications and changes to the present invention. Any modification, equivalent replacement, or improvement made within the principles of the present invention shall fall within the scope of protection of the present invention.
Industrial Applicability
Based on the technical solution provided by the embodiments of the present invention, for each conference terminal, the set of local video code streams to be delivered to it is determined according to a preset rule, the set including at least one local video code stream; the local video code streams in the set are sent to the corresponding conference terminal; and the conference terminal receives the local video code streams in the set delivered by the server and processes them to obtain a video. This overcomes the problem in the prior art that having the server process the local video code streams leads to high CPU and memory usage on the server and hence low server access capacity. It greatly increases the number of conference terminals the server can admit while preserving the user experience.

Claims (26)

  1. A cloud conference processing method, comprising the following steps:
    receiving a local video code stream reported by each conference terminal;
    for each conference terminal, determining, according to a preset rule, a local video code stream set to be delivered to that conference terminal, the local video code stream set comprising at least one of the local video code streams; and
    sending the local video code streams comprised in the local video code stream set to the corresponding conference terminal.
  2. The cloud conference processing method according to claim 1, wherein, when receiving the local video code stream of each conference terminal, the method further comprises:
    receiving a first audio code stream from each conference terminal; and
    processing the first audio code streams of the conference terminals and sending the result to each conference terminal.
  3. The cloud conference processing method according to claim 2, wherein processing the first audio code streams of the conference terminals and sending the result to each conference terminal comprises:
    decoding the first audio code streams to obtain at least one audio;
    mixing the audio;
    encoding the mixed audio to obtain a second audio code stream; and
    sending the second audio code stream to each of the conference terminals.
  4. The cloud conference processing method according to any one of claims 1 to 3, wherein the preset rule comprises: the local video code stream set comprises a local video code stream of a broadcast source;
    and sending the local video code streams comprised in the local video code stream set to the corresponding conference terminal comprises: sending the local video code stream of the broadcast source to the corresponding conference terminal.
  5. The cloud conference processing method according to claim 4, wherein the preset rule further comprises: receiving a request from the conference terminal, the local video code stream set comprising the local video code stream requested by the conference terminal;
    and sending the local video code streams comprised in the local video code stream set to the corresponding conference terminal further comprises: sending the local video code stream requested by the conference terminal to the conference terminal.
  6. A cloud conference processing method, comprising the following steps:
    sending a local video code stream of a conference terminal to a server;
    receiving the local video code streams comprised in a local video code stream set delivered by the server; and
    processing the local video code streams to obtain a video.
  7. The cloud conference processing method according to claim 6, wherein receiving the local video code streams comprised in the local video code stream set delivered by the server comprises: receiving a local video code stream of a broadcast source.
  8. The cloud conference processing method according to claim 7, wherein receiving the local video code streams comprised in the local video code stream set delivered by the server further comprises: sending a request to the server; and receiving the requested local video code stream.
  9. The cloud conference processing method according to claim 6, wherein, when sending the local video code stream of the conference terminal to the server, the method further comprises: sending a first audio code stream of the conference terminal to the server.
  10. The cloud conference processing method according to claim 9, wherein, after sending the first audio code stream of the conference terminal to the server, the method further comprises: receiving a second audio code stream from the server.
  11. The cloud conference processing method according to any one of claims 6 to 10, wherein processing the local video code streams to obtain a video comprises: decoding the local video code streams to obtain at least one sub-video; and compositing the sub-videos to obtain the video.
  12. A server, comprising:
    a first receiving module, configured to receive a local video code stream reported by each conference terminal;
    a determining module, configured to determine, for each conference terminal and according to a preset rule, a local video code stream set to be delivered to that conference terminal, the local video code stream set comprising at least one of the local video code streams; and
    a first sending module, configured to send the local video code streams comprised in the local video code stream set to the corresponding conference terminal.
  13. The server according to claim 12, further comprising a first audio receiving module and an audio processing module; the first audio receiving module is configured to receive a first audio code stream from each conference terminal; the audio processing module is configured to process the audio code streams and send the result to each conference terminal.
  14. The server according to claim 13, wherein the audio processing module comprises an audio decoding module, a mixing module, an encoding module, and a first audio sending module; the audio decoding module is configured to decode the first audio code streams to obtain at least one audio; the mixing module is configured to mix the audio; the encoding module is configured to encode the mixed audio to obtain a second audio code stream; the audio sending module is configured to send the second audio code stream to each of the conference terminals.
  15. The server according to any one of claims 12 to 14, wherein the preset rule comprises: the local video code stream set comprises a local video code stream of a broadcast source; the first sending module comprises a video sending module, configured to send the local video code stream of the broadcast source to the corresponding conference terminal.
  16. The server according to claim 15, wherein the preset rule further comprises: receiving a request from the conference terminal, the local video code stream set comprising the local video code stream requested by the conference terminal; the video sending module is further configured to directly forward the local video code stream requested by the conference terminal to the conference terminal.
  17. A conference terminal, comprising:
    a second sending module, configured to send a local video code stream of the conference terminal to a server;
    a second receiving module, configured to receive the local video code streams comprised in a local video code stream set delivered by the server; and
    a video processing module, configured to process the local video code streams to obtain a video.
  18. The conference terminal according to claim 17, wherein the second receiving module comprises a video receiving module, configured to receive a local video code stream of a broadcast source.
  19. The conference terminal according to claim 18, wherein the second receiving module further comprises a request sending module, configured to send a request to the server; the video receiving module is further configured to receive the requested local video code stream.
  20. The conference terminal according to claim 17, further comprising a second audio sending module, configured to send a first audio code stream of the conference terminal to the server when the local video code stream of the conference terminal is sent to the server.
  21. The conference terminal according to claim 20, further comprising a second audio receiving module, configured to receive a second audio code stream from the server.
  22. The conference terminal according to any one of claims 17 to 21, wherein the video processing module comprises a video decoding module and a synthesis module; the decoding module is configured to decode the local video code streams to obtain at least one sub-video; the synthesis module is configured to composite the sub-videos to obtain a video.
  23. A storage medium comprising a stored program, wherein the program, when run, performs the method according to any one of claims 1 to 5.
  24. A storage medium comprising a stored program, wherein the program, when run, performs the method according to any one of claims 6 to 11.
  25. A processor configured to run a program, wherein the program, when run, performs the method according to any one of claims 1 to 5.
  26. A processor configured to run a program, wherein the program, when run, performs the method according to any one of claims 6 to 11.
PCT/CN2017/078856 2016-04-08 2017-03-30 Server, conference terminal, and cloud conference processing method WO2017173953A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610219243.3 2016-04-08
CN201610219243.3A CN107277425A (en) 2016-04-08 2016-04-08 A kind of server, meeting-place terminal and cloud meeting processing method

Publications (1)

Publication Number Publication Date
WO2017173953A1 true WO2017173953A1 (en) 2017-10-12

Family

ID=60000236

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/078856 WO2017173953A1 (en) 2016-04-08 2017-03-30 Server, conference terminal, and cloud conference processing method

Country Status (2)

Country Link
CN (1) CN107277425A (en)
WO (1) WO2017173953A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112788276A (en) * 2019-11-11 2021-05-11 中兴通讯股份有限公司 Video stream display method, transmission method, device, terminal, server and medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101073257A (en) * 2004-12-22 2007-11-14 中兴通讯股份有限公司 Method for transmitting multi-path video for conference television system
CN101094382A (en) * 2007-07-12 2007-12-26 杭州华三通信技术有限公司 Video terminal, user interface, and method for playing back accessorial stream
CN101141616A (en) * 2007-10-18 2008-03-12 华为技术有限公司 Video session method and system, application server and media resource server
CN101668162A (en) * 2009-10-14 2010-03-10 中国电信股份有限公司 Implementation method of video conference and video conference system
CN101753961A (en) * 2008-12-08 2010-06-23 北京中星微电子有限公司 Meeting realizing method in video monitoring system and video monitoring meeting system

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102065264B (en) * 2009-11-18 2015-06-24 深圳市邦彦信息技术有限公司 Video command/session system without MCU and method
US8395654B2 (en) * 2011-01-03 2013-03-12 Alcatel Lucent Offload of server-based videoconference to client-based video conference
CN102447878B (en) * 2011-12-23 2013-07-03 南京超然科技有限公司 Remote packet capturing method for television wall server and recording and broadcasting server
CN102611873A (en) * 2012-03-06 2012-07-25 宋健 Method and system for realizing 2D/3D (two dimension/3 dimension) video communication and transmission optimization
CN102625080B (en) * 2012-04-23 2014-09-10 广东大晋对接信息科技有限公司 P2P-based WEB video conference system
CN103313027A (en) * 2013-06-08 2013-09-18 青岛优视通网络有限公司 Conference terminal strong in network adaptability and method for realizing video conferences
CN104427296B (en) * 2013-09-05 2019-03-01 华为终端(东莞)有限公司 The transmission method and device of Media Stream in video conference
CN104735389B (en) * 2013-12-23 2018-08-31 联想(北京)有限公司 Information processing method and information processing equipment
CN104167210A (en) * 2014-08-21 2014-11-26 华侨大学 Lightweight class multi-side conference sound mixing method and device

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112073810A (en) * 2020-11-16 2020-12-11 全时云商务服务股份有限公司 Multi-layout cloud conference recording method and system and readable storage medium
CN112073810B (en) * 2020-11-16 2021-02-02 全时云商务服务股份有限公司 Multi-layout cloud conference recording method and system and readable storage medium

Also Published As

Publication number Publication date
CN107277425A (en) 2017-10-20

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17778617

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 17778617

Country of ref document: EP

Kind code of ref document: A1