
CN104486518B - Distributed audio mixing method for teleconferences in a bandwidth-limited network environment

Info

Publication number: CN104486518B
Application number: CN201410729699.5A
Authority: CN
Inventors: 陈强; 陈志辉; 王俊; 蒲长春; 刘成; 龙怡翔; 阳洋
Original assignee: CETC 30 Research Institute
Current assignee: CETC 30 Research Institute
Priority date: 2014-12-03
Filing date: 2014-12-03
Publication date: 2017-06-30
Anticipated expiration: 2034-12-03
Also published as: CN104486518A

Abstract

The invention discloses a distributed audio mixing method for teleconferences in a bandwidth-limited network environment. A mixer is deployed at the egress of each network domain that contains participating nodes. One mixer is designated the top-level mixer, in either static mode or dynamic mode, and the remaining mixers are secondary mixers. Each secondary mixer mixes the audio of its own domain and sends the mixing result to the top-level mixer. The top-level mixer takes the voice of the participating nodes in its own domain together with the secondary mixing results as mixing inputs, mixes them again, and sends the result to the participating nodes in its own domain and to the secondary mixers. Each secondary mixer then takes the received top-level mixing result and the voice of the participating nodes in its own domain as inputs and performs the mixing for its own domain. The method greatly reduces the number of media streams transmitted between different network domains. When the number of participants grows, no additional voice media streams are added between network domains, which effectively saves the cost of the bandwidth-limited inter-domain links, reduces mixer load, and makes conferences easier to scale.

Description

Distributed audio mixing method for teleconferences in a bandwidth-limited network environment

Technical field

The present invention relates to a distributed audio mixing method for teleconferences in a bandwidth-limited network environment.

Background art

Teleconference systems provide a convenient way to hold meetings across different locations. Current teleconference systems mainly use either an end-user mixing mode or a centralized mixing mode. In the end-user mixing mode, the server runs the selection algorithm and the terminals perform the voice mixing. The advantage of this mode is that it reduces the processing load on the server. The disadvantages are that it places higher demands on terminal capability, carries redundant voice data in the network, uses bandwidth inefficiently, and easily causes local network congestion. In the centralized mixing mode, mixing is performed only by a central mixer on the server: all media streams are sent to the mixer, which mixes them and then sends N media streams (N being the number of participating terminals) back to the participating terminals. This mode has the advantage that media streams are easy to mix, but it places high demands on server performance, cannot avoid local network congestion, and does not use bandwidth efficiently. Both modes demand a great deal of bandwidth, and as the number of participants grows, the performance and bandwidth of the mixer become a bottleneck that cannot be overcome.

In practice, the members of a teleconference are often locally concentrated and widely distributed. Concentrated means that members of the same region and the same level are located close together in the network, often in the same local area network, so that bandwidth between them is plentiful and they form relatively fixed member groups. Distributed means that member groups of different regions or different levels, or different member groups of the same level, sit at different positions in the network, and the communication links between them are mainly wide area networks or wireless networks such as satellite links, where bandwidth is limited and expensive. Teleconferencing in the end-user mixing mode or the centralized mixing mode cannot adapt to this kind of network topology.

Summary of the invention

To overcome the above disadvantages of the prior art, the invention provides a distributed audio mixing method for teleconferences in a bandwidth-limited network environment. It addresses the low bandwidth utilization and local network congestion caused when audio media streams converge on and are distributed from a mixer, avoids transmitting duplicate voice in the network, improves bandwidth efficiency, and reduces the performance overhead of the central mixer.

The technical solution adopted by the invention is a distributed audio mixing method for teleconferences in a bandwidth-limited network environment. A mixer is deployed at the egress of each network domain that contains participating nodes. One mixer is designated the top-level mixer, in either static mode or dynamic mode, and the remaining mixers are secondary mixers. Each secondary mixer mixes the audio of its own domain and sends the mixing result to the top-level mixer. The top-level mixer takes the voice of the participating nodes in its own domain together with the secondary mixing results as mixing inputs, mixes them again, and sends the result to the participating nodes in its own domain and to the secondary mixers. Each secondary mixer then takes the received top-level mixing result and the voice of the participating nodes in its own domain as inputs and performs the mixing for its own domain. The mixer comprises a receiving and sending module, a voice stream selection module, and a mixing module. When the receiving and sending module receives voice input, it decodes the input according to its voice coding scheme and passes the decoded result to the voice stream selection module; when it sends voice, it encodes the voice according to the coding scheme of the destination and then sends it to the destination. The voice stream selection module calculates the signal energy of each voice stream, filters out silent signals by silence detection, and selects the qualifying voice streams according to the configured maximum number of mixed streams and the principle of keeping speech continuous. The mixing module processes the input voice streams according to a mixing algorithm.

Compared with the prior art, the invention has the following positive effects. It greatly reduces the number of media streams transmitted between different network domains. When the number of participants grows, the preliminary mixing performed within each domain means that no additional voice media streams are added between network domains, which effectively saves the cost of the bandwidth-limited inter-domain links, reduces mixer load, and makes conferences easier to scale. The invention can be used in any network topology that has the concentrated-and-distributed characteristic.

Brief description of the drawings

Embodiments of the invention are described by way of example with reference to the accompanying drawings, in which:

Fig. 1 is a topology diagram of a bandwidth-limited network;

Fig. 2 is a schematic diagram of the distributed mixing architecture;

Fig. 3 is a flow chart of the method of the invention.

Detailed description of the embodiments

The bandwidth-limited network topology is shown in Fig. 1. The concentrated-and-distributed network is divided into multiple network domains according to where the terminals are located. Each network domain deploys a local centralized mixing server, and the mixing servers are peers of one another. The network domains are connected by wide area networks or wireless networks such as satellite links, where bandwidth is limited and expensive.

When a conference is held, one mixer must first be selected as the top-level mixer, which is responsible for the top-level mixing. The distributed mixing method selects the top-level mixer in one of two ways, static mode or dynamic mode. In static mode the top-level mixer is specified in a configuration file and does not change until the configuration is modified. In dynamic mode the top-level mixer is selected dynamically according to a policy when the teleconference is held; the decision factors include mixer load, network topology information, and so on. Dynamic mode is recommended when network resource information such as topology and load can be obtained, because it allows load balancing across the mixers. When network resource information cannot be obtained, static mode can still ensure reasonable and efficient use of network resources. Once the top-level mixer has been determined, the two modes make no difference to the invention, and the same distributed mixing method is used.
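The two selection modes can be sketched as follows. This is only an illustration under stated assumptions: the patent names mixer load and network topology information as decision factors but does not give a scoring policy, so the `MixerInfo` fields, the weighted score, and the `load_weight` parameter are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MixerInfo:
    name: str
    load: float       # hypothetical: fraction of capacity in use, 0.0 to 1.0
    link_cost: float  # hypothetical: summed cost of the links to the other domains


def select_top_mixer(mixers: List[MixerInfo],
                     static_choice: Optional[str] = None,
                     load_weight: float = 0.5) -> str:
    """Return the name of the mixer that should act as top-level mixer."""
    if not mixers:
        raise ValueError("at least one mixer is required")

    if static_choice is not None:
        # Static mode: the configuration names the top-level mixer outright.
        if static_choice not in {m.name for m in mixers}:
            raise ValueError("configured top-level mixer is unknown")
        return static_choice

    # Dynamic mode (assumed policy): prefer a lightly loaded mixer that is
    # cheap to reach from the other domains. Lower score wins.
    max_cost = max(m.link_cost for m in mixers) or 1.0

    def score(m: MixerInfo) -> float:
        return load_weight * m.load + (1.0 - load_weight) * m.link_cost / max_cost

    return min(mixers, key=score).name
```

Whichever branch is taken, the function only returns a name, which mirrors the statement that the mixing procedure itself is the same in both modes.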

After the top-level mixer has been determined, the remaining mixers are secondary mixers. Fig. 2 illustrates the distributed mixing architecture during a conference, taking as an example the case in which the mixer in network domain 1 is selected as the top-level mixer.

The distributed mixing architecture consists of a top-level mixer (Top-Mixer), secondary mixers (Sub-Mixer), and clients (UAC). Each client connects to a secondary mixer or to the top-level mixer, and each secondary mixer connects to the top-level mixer.

The top-level mixer and the secondary mixers have the same functional modules and processing flow and differ only in their input voice streams. The inputs of the top-level mixer are the client voice streams of its own domain and the mixed streams of multiple secondary mixers. The inputs of a secondary mixer are the client voice streams of its own domain and the single mixed stream from the top-level mixer. In dynamic designation mode, multiple teleconferences held at the same time can designate different top-level mixers, so a mixer can hold the top-level and secondary roles simultaneously or switch between the two roles.

The mixer comprises a receiving and sending module, a voice stream selection module, and a mixing module.

Receiving and sending module: when it receives voice input, it decodes the input according to its voice coding scheme and passes the decoded result to the voice stream selection module; when it sends voice, it encodes the voice according to the coding scheme of the destination and then sends it to the destination.

Voice stream selection module: this module calculates the signal energy of each voice stream, filters out silent signals by silence detection, and selects the qualifying voice streams according to the configured maximum number of mixed streams and the principle of keeping speech continuous. The specific selection method is as follows:

When the current mixer has already reached the maximum number of mixed streams n, if the energy of the (n+1)th voice signal is greater than the energy of one of the first n streams, the first n streams that are already speaking are still allowed to continue and the (n+1)th stream is not admitted for the time being. Only when one of the first n speaking streams stops speaking is the (n+1)th stream selected to join the mix and allowed to speak.
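The selection rule above can be sketched as a small stateful selector. It is a minimal illustration, not the patented implementation: the mean-square energy measure, the fixed `silence_threshold`, and the choice to free a slot after a single silent frame are assumptions (a practical silence detector would normally add a hangover period).

```python
def frame_energy(samples):
    """Mean-square energy of one frame of PCM samples (assumed energy measure)."""
    return sum(s * s for s in samples) / len(samples) if samples else 0.0


class StreamSelector:
    """Admit at most max_streams speakers without preempting active ones."""

    def __init__(self, max_streams, silence_threshold):
        self.max_streams = max_streams
        self.silence_threshold = silence_threshold  # hypothetical tuning value
        self.active = []  # admitted stream ids, in order of admission

    def select(self, energies):
        """energies maps stream id -> energy of the current frame."""
        # Silence detection: only streams above the threshold are speaking.
        speaking = {sid for sid, e in energies.items()
                    if e > self.silence_threshold}

        # Speech continuity: an admitted stream keeps its slot while it keeps
        # speaking, even if a waiting stream is louder.
        self.active = [sid for sid in self.active if sid in speaking]

        # Free slots go to the loudest waiting streams.
        waiting = sorted((sid for sid in speaking if sid not in self.active),
                         key=lambda sid: energies[sid], reverse=True)
        free = self.max_streams - len(self.active)
        self.active.extend(waiting[:free])
        return list(self.active)
```

With `max_streams=2`, a third stream that becomes louder than the two admitted speakers is held back until one of them falls silent, which is the behaviour the rule describes.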

Mixing module: this module processes the input voice streams according to a mixing algorithm. Common mixing algorithms include direct clamping, averaging mixing, and alignment mixing. To avoid echo, each destination's own voice must be removed from the mixing result before the result is sent to that destination. The choice of mixing algorithm makes no difference to the invention, because the same distributed mixing flow is performed, so the specific algorithms are not described in detail.
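Two of the named algorithms and the echo-avoidance step can be sketched as below. The patent does not define the algorithms, so this assumes 16-bit signed PCM frames of equal length and the usual textbook meaning of each name: direct clamping sums the samples and clips to the sample range, and averaging divides the sum by the number of inputs. Alignment mixing is not shown, and all function names are hypothetical.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def clamp_mix(frames):
    """Direct clamping: add the samples and clip to the 16-bit range."""
    return [max(INT16_MIN, min(INT16_MAX, sum(samples)))
            for samples in zip(*frames)]


def average_mix(frames):
    """Averaging: add the samples and divide by the number of inputs."""
    return [int(sum(samples) / len(frames)) for samples in zip(*frames)]


def mix_for_each_destination(streams, mix=clamp_mix):
    """Build one output frame per destination, leaving out its own voice.

    streams maps stream id -> frame (list of samples of equal length).
    """
    out = {}
    for dest, own_frame in streams.items():
        others = [frame for sid, frame in streams.items() if sid != dest]
        # A destination with nobody else to hear receives silence.
        out[dest] = mix(others) if others else [0] * len(own_frame)
    return out
```

Passing `mix=average_mix` swaps the algorithm without touching the per-destination logic, in line with the remark that the choice of algorithm does not change the distributed flow.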

The flow of the distributed mixing method of the invention is shown in Fig. 3 and comprises the following steps:

Step 1: when the receiving and sending module receives voice streams from the clients of its own domain and from other mixers, it places them in the corresponding pending queues according to the identifier of each voice stream;

Step 2: the receiving and sending module decodes each voice stream according to its coding scheme and sends the decoded result to the voice stream selection module;

Step 3: the voice stream selection module calculates the energy of each decoded voice stream and, according to the energy values and the principle of keeping speech continuous, selects the qualifying voice streams and passes them to the mixing module;

Step 4: the mixing module mixes the selected voice streams, removes each destination's own voice, and then sends the mixing result to the receiving and sending module;

Step 5: the receiving and sending module encodes the voice according to the coding scheme supported by the destination client or destination mixer, and then sends the encoded voice data to the corresponding destination client or destination mixer.
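The five steps can be strung together in a compact sketch. Everything here is illustrative: the `raw16` "codec" is a stand-in that merely packs 16-bit PCM, `select_nonsilent` is a placeholder for the energy-based selection of step 3, and the class and method names are hypothetical.

```python
import struct
from collections import defaultdict, deque


def decode_raw16(payload):
    """Toy codec: little-endian signed 16-bit PCM bytes -> list of samples."""
    return list(struct.unpack("<%dh" % (len(payload) // 2), payload))


def encode_raw16(samples):
    return struct.pack("<%dh" % len(samples), *samples)


def select_nonsilent(decoded):
    """Placeholder for step 3: keep every stream that is not pure silence."""
    return [sid for sid, frame in decoded.items() if any(frame)]


def clamp_mix(frames):
    return [max(-32768, min(32767, sum(s))) for s in zip(*frames)]


class Mixer:
    def __init__(self, decoders, encoders, select=select_nonsilent, mix=clamp_mix):
        self.decoders = decoders  # codec name -> decode function
        self.encoders = encoders  # codec name -> encode function
        self.select = select
        self.mix = mix
        self.queues = defaultdict(deque)  # one pending queue per stream id

    def receive(self, stream_id, codec, payload):
        # Step 1: queue the packet under the identifier of its stream.
        self.queues[stream_id].append((codec, payload))

    def process_frame(self, destinations):
        """destinations maps destination id -> codec name it supports."""
        # Step 2: decode the oldest pending packet of every stream.
        decoded = {}
        for sid, queue in self.queues.items():
            if queue:
                codec, payload = queue.popleft()
                decoded[sid] = self.decoders[codec](payload)
        # Step 3: choose which streams enter the mix.
        chosen = self.select(decoded)
        out = {}
        for dest, codec in destinations.items():
            # Step 4: mix the chosen streams without the destination's own voice.
            frames = [decoded[sid] for sid in chosen if sid != dest]
            if frames:
                # Step 5: encode in the codec the destination supports.
                out[dest] = self.encoders[codec](self.mix(frames))
        return out
```

A destination may be a local client or another mixer; the sketch treats both alike, which is the point of the shared processing flow.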

A feature of the invention is that the voice stream inputs and destinations of a mixer include not only ordinary local VoIP clients but also other, remote mixers: the voice streams of local clients and the mixed output of remote mixers can both serve as inputs to the current mixer for further mixing. A mixer can hold the top-level and secondary roles at the same time, or switch between the two roles. When a conference is held in static mode, multiple voice streams arrive at the mixer of the local domain and are mixed. If that mixer is configured as a secondary mixer, the mixing result is sent to the configured top-level mixer to be mixed again; if it is configured as the top-level mixer, the mixing result is returned directly through the source port. In dynamic mode, multiple voice streams likewise arrive at the mixer of the local domain and are mixed, while load status and network topology information are used to decide dynamically whether the mixer acts as a secondary mixer or as the top-level mixer for the current session; this decision has no binding effect on the dynamic decision made for the next session. During a conference, only one uplink voice stream and one downlink voice stream pass between the top-level network domain and each secondary network domain, and voice between secondary network domains is mixed and forwarded by the top-level mixer, so no additional voice streams are needed and bandwidth cost is greatly reduced. Each secondary mixer is responsible for mixing in its own domain, and the top-level mixer is responsible for mixing its own domain together with the secondary mixers, which spreads the mixer load and lowers the performance required of each mixer.
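A short calculation illustrates the bandwidth claim. The distributed figure, one uplink and one downlink stream per secondary domain, comes from the text. The centralized figure is an inference from the background description: it assumes the central mixer sits in one domain and every remote participant sends one stream to it and receives one stream from it. The function name and the example domain sizes are hypothetical.

```python
def inter_domain_streams(participants_per_domain, mixer_domain):
    """Count voice streams crossing each inter-domain link.

    Returns (centralized, distributed): two dicts keyed by remote domain.
    """
    centralized = {}
    distributed = {}
    for domain, count in participants_per_domain.items():
        if domain == mixer_domain or count == 0:
            continue
        # Centralized (assumed): one uplink and one downlink per participant.
        centralized[domain] = 2 * count
        # Distributed: one uplink and one downlink per secondary mixer.
        distributed[domain] = 2
    return centralized, distributed


central, distrib = inter_domain_streams({"d1": 5, "d2": 8, "d3": 3}, "d1")
```

Under these assumptions the centralized count grows linearly with the number of participants in a remote domain, while the distributed count stays at two per link.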

The operating principle of the invention is as follows. The invention applies a distributed approach to propose a distributed mixing method. A mixer is deployed at the egress of each network domain that contains participating nodes. One mixer is designated the top-level mixer in one of two ways, by configuration file or by policy, and the remaining mixers are secondary mixers. Each secondary mixer mixes the audio of its own domain and sends the mixing result to the top-level mixer. The top-level mixer takes the voice of the participating nodes in its own domain together with the secondary mixing results as mixing inputs, mixes them again, and sends the result to the participating nodes in its own domain and to the secondary mixers. Each secondary mixer then takes the received top-level mixing result and the voice of the participating nodes in its own domain as inputs and performs the mixing for its own domain.
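The two-level flow can be checked with a toy simulation. It is a sketch under simplifying assumptions: mixing is plain sample addition with no clipping, selection and codecs are ignored, every frame has the same length, and each mixer leaves out a destination's own contribution. The function names are hypothetical.

```python
def add_frames(frames, length):
    """Sum frames sample by sample (plain addition, no clipping)."""
    total = [0] * length
    for frame in frames:
        total = [t + s for t, s in zip(total, frame)]
    return total


def two_level_mix(domains, top, length):
    """domains maps domain -> {client -> frame}; returns client -> frame heard."""
    # Uplink: each secondary mixer mixes its own domain into a single stream.
    uplink = {d: add_frames(clients.values(), length)
              for d, clients in domains.items() if d != top}

    # Inputs of the top-level mixer: its own clients plus one stream per
    # secondary mixer. Tuple keys keep client and mixer names apart.
    top_inputs = {("client", c): f for c, f in domains[top].items()}
    top_inputs.update({("mixer", d): f for d, f in uplink.items()})

    heard = {}
    for client in domains[top]:
        heard[client] = add_frames(
            [f for key, f in top_inputs.items() if key != ("client", client)],
            length)

    for d in uplink:
        # Downlink: the top-level mix without that secondary mixer's own stream.
        downlink = add_frames(
            [f for key, f in top_inputs.items() if key != ("mixer", d)], length)
        for client in domains[d]:
            local_others = [f for c, f in domains[d].items() if c != client]
            heard[client] = add_frames([downlink] + local_others, length)
    return heard
```

Giving each client a distinct power of two as its single sample makes it easy to see that every participant hears the sum of all others, although only one stream crosses each inter-domain link in each direction.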

Claims

1. A distributed audio mixing method for teleconferences in a bandwidth-limited network environment, characterized in that: a mixer is deployed at the egress of each network domain that contains participating nodes; one mixer is designated the top-level mixer using dynamic mode, and the remaining mixers are secondary mixers; each secondary mixer mixes the audio of its own domain and sends the mixing result to the top-level mixer; the top-level mixer takes the voice of the participating nodes in its own domain together with the secondary mixing results as mixing inputs, mixes them again, and sends the result to the participating nodes in its own domain and to the secondary mixers; each secondary mixer takes the received top-level mixing result and the voice of the participating nodes in its own domain as inputs and performs the mixing for its own domain; the mixer comprises a receiving and sending module, a voice stream selection module, and a mixing module, wherein: the receiving and sending module, when receiving voice input, decodes the input according to its voice coding scheme and passes the decoded result to the voice stream selection module, and, when sending voice, encodes the voice according to the coding scheme of the destination and then sends it to the destination; the voice stream selection module calculates the signal energy of each voice stream, filters out silent signals by silence detection, and selects the qualifying voice streams according to the configured maximum number of mixed streams and the principle of keeping speech continuous; the mixing module processes the input voice streams according to a mixing algorithm; in dynamic designation mode, multiple teleconferences held at the same time can designate different top-level mixers, and a mixer can hold the top-level and secondary roles simultaneously or switch between the two roles.

2. The distributed audio mixing method for teleconferences in a bandwidth-limited network environment according to claim 1, characterized in that the voice stream selection module selects voice streams as follows: when the current mixer has already reached the maximum number of mixed streams n, if the energy of the (n+1)th voice signal is greater than the energy of one of the first n streams, the first n streams that are already speaking are still allowed to continue and the (n+1)th stream is not admitted for the time being; when one of the first n speaking streams stops speaking, the (n+1)th stream is selected to join the mix and allowed to speak.

3. The distributed audio mixing method for teleconferences in a bandwidth-limited network environment according to claim 1, characterized in that the mixing method of the mixer comprises the following steps:

Step 1: when the receiving and sending module receives voice streams from the clients of its own domain and from other mixers, it places them in the corresponding pending queues according to the identifier of each voice stream;

Step 2: the receiving and sending module decodes each voice stream according to its coding scheme and sends the decoded result to the voice stream selection module;

Step 3: the voice stream selection module calculates the energy of each decoded voice stream and, according to the energy values and the principle of keeping speech continuous, selects the qualifying voice streams and passes them to the mixing module;

Step 4: the mixing module mixes the selected voice streams, removes each destination's own voice, and then sends the mixing result to the receiving and sending module;

Step 5: the receiving and sending module encodes the voice according to the coding scheme supported by the destination client or destination mixer, and then sends the encoded voice data to the corresponding destination client or destination mixer.

4. The distributed audio mixing method for teleconferences in a bandwidth-limited network environment according to claim 1, characterized in that the mixing algorithm includes direct clamping, averaging mixing, and alignment mixing.