CN107195308A

CN107195308A - Audio mixing method, apparatus and system for an audio/video conference system

Info

Publication number: CN107195308A
Application number: CN201710243769.XA
Authority: CN
Inventors: 肖集华; 顾振华; 周金龙
Original assignee: Suzhou Keda Technology Co Ltd
Current assignee: Suzhou Keda Technology Co Ltd
Priority date: 2017-04-14
Filing date: 2017-04-14
Publication date: 2017-09-22
Anticipated expiration: 2037-04-14
Also published as: CN107195308B

Abstract

The invention discloses an audio mixing method, apparatus and system for an audio/video conference system. The method includes: receiving multiple channels of audio data packets from a first site and audio data packets from a plurality of second sites; mixing the audio data packets; and sending each channel of mixed audio packets obtained by the mixing operation to a third site, where the third site includes a plurality of loudspeakers in one-to-one correspondence with a plurality of microphones at the first site. The multiple channels of audio data packets from the first site are time-synchronized before the mixing operation, and/or the channels of mixed audio packets are time-synchronized before being sent to the third site. As a result, sounds captured at the same instant at the first site are played at the third site with only a small difference in playback time, so the differences in the times at which the individual microphones at the first site captured the same speech are preserved. Users at the third site can use these capture-time differences to tell where a sound comes from, achieving the effect of locating a sound source by ear.

Description

Audio mixing method, apparatus and system for an audio/video conference system

Technical field

The present invention relates to the technical field of audio/video conferencing, and in particular to an audio mixing method, apparatus and system for an audio/video conference system.

Background art

A video conference system lets users at two or more sites talk in real time. Existing video conferencing technology can already present a fairly realistic image of speakers at other sites to the participants at the local site, so that users of the system feel as if they were in a real conference.

As for audio, existing video conferencing technology lets participants at the local site hear speech from several other sites at the same time. Specifically, Chinese patent document CN 102364952 A discloses a method for audio/video synchronization during simultaneous playback of multiple audio/video channels. In that method, one channel of audio data packets is captured for each of N users, the N channels are mixed to form N+1 channels of audio data packets, and these are sent to the respective users. For example, audio data packets are captured at three sites A, B and C and mixed into four audio channels, which are sent to A, B, C and the other sites (for example, sites without the right to speak). The first mix combines the audio data packets of sites A, B and C; sending it to the other sites lets their participants hear the speech from A, B and C at the same time. The second mix combines only the audio data packets of sites B and C; sending it to site A lets the participants at A hear the speech from B and C at the same time. The third mix combines only the audio data packets of sites A and C; sending it to site B lets the participants at B hear the speech from A and C at the same time. The fourth mix combines only the audio data packets of sites A and B; sending it to site C lets the participants at C hear the speech from A and B at the same time.
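The N+1 mixes in this prior-art example can be sketched as follows. This is only a minimal illustration: the function names `mix` and `n_plus_one_mixes`, the representation of a frame as a list of 16-bit PCM samples, and mixing by clipped summation are all assumptions made here, not details taken from CN 102364952 A.

```python
from typing import Dict, List


def mix(frames: List[List[int]]) -> List[int]:
    """Sum equal-length 16-bit PCM frames sample by sample, clipping to int16."""
    if not frames:
        return []
    return [max(-32768, min(32767, sum(samples))) for samples in zip(*frames)]


def n_plus_one_mixes(site_frames: Dict[str, List[int]]) -> Dict[str, List[int]]:
    """Build one mix per site that leaves out that site's own audio,
    plus a full mix (key "others") for sites that contribute no audio."""
    mixes = {
        site: mix([f for other, f in site_frames.items() if other != site])
        for site in site_frames
    }
    mixes["others"] = mix(list(site_frames.values()))
    return mixes


# Hypothetical one-frame example for sites A, B and C.
frames = {"A": [100, 200], "B": [10, 20], "C": [1, 2]}
result = n_plus_one_mixes(frames)
# result["A"] contains only B and C; result["others"] contains A, B and C.
```

With a single channel per site, nothing in these mixes carries information about where in a room a sound originated, which is the limitation discussed next.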

Because the existing approach captures only one channel of audio data packets per site, mixing merely requires combining the sites and mixing their respective audio data packets. Each site can therefore only tell that there is sound at the other sites; when the sound at another site comes from different directions, those directions cannot be distinguished, so locating a sound source by ear is not possible.

Summary of the invention

In view of this, embodiments of the present invention provide an audio mixing method, apparatus and system for an audio/video conference system, to solve the problem that the prior art cannot locate a sound source by ear.

According to a first aspect, an embodiment of the present invention provides an audio mixing method for an audio/video conference system, including: receiving multiple channels of audio data packets from a first site and audio data packets from a plurality of second sites, where the multiple channels of audio data packets from the first site are captured by a plurality of corresponding microphones located at different positions in the first site; mixing the multiple channels of audio data packets from the first site with the audio data packets from the plurality of second sites, respectively; and sending each channel of mixed audio packets obtained by the mixing operation to a third site, where the third site includes a plurality of loudspeakers for respectively playing the channels of mixed audio, and the plurality of loudspeakers are in one-to-one correspondence with the plurality of microphones. The method further includes: before the step of mixing the multiple channels of audio data packets from the first site with the audio data packets from the plurality of second sites, time-synchronizing the multiple channels of audio data packets from the first site; and/or, before the step of sending each channel of mixed audio packets obtained by the mixing operation to the third site, time-synchronizing the channels of mixed audio packets obtained by the mixing operation.

Optionally, each channel of audio data packets from the first site carries a first source identifier and a capture timestamp. The first source identifier indicates that the audio data packet comes from the first site, and the capture timestamp indicates the time at which that channel's audio data packet was captured. The step of time-synchronizing the multiple channels of audio data packets from the first site includes: obtaining the capture timestamp of each channel of audio data packets carrying the first source identifier; and determining whether, among the channels of audio data packets carrying the first source identifier, the number of channels whose packets have the same predetermined capture timestamp has reached a predetermined quantity, the predetermined quantity being the number of microphones in the first site. When the predetermined quantity is reached, time synchronization of the multiple channels of audio data packets from the first site for the instant corresponding to the predetermined capture timestamp is complete. Correspondingly, in the step of mixing the multiple channels of audio data packets from the first site with the audio data packets from the plurality of second sites, the multiple channels of audio data packets from the first site are the predetermined quantity of channels of audio data packets that carry the first source identifier and have the same capture timestamp.

Optionally, each channel of mixed audio packets obtained by the mixing operation carries a mix timestamp, and the mix timestamp is identical to the capture timestamp carried by the first-site audio data packet that corresponds to that channel of mixed audio packets. The step of time-synchronizing the channels of mixed audio packets obtained by the mixing operation includes: obtaining the mix timestamp carried by each channel of mixed audio packets; and determining whether, among the obtained channels of mixed audio packets, the number of channels with the same mix timestamp has reached a predetermined quantity, the predetermined quantity being the number of microphones in the first site. When the predetermined quantity is reached, the step of sending each channel of mixed audio packets obtained by the mixing operation to the third site is performed. Correspondingly, in that sending step, the channels of mixed audio packets obtained by the mixing operation are the predetermined quantity of channels of mixed audio packets that have the same mix timestamp.

Optionally, the number of channels of audio data packets from a second site is the same as the number of channels of audio data packets from the first site. The step of mixing the multiple channels of audio data packets from the first site with the audio data packets from the plurality of second sites includes: mixing each channel of audio data packets from the second site with one channel of audio data packets from the first site, respectively.

According to a second aspect, an embodiment of the present invention provides an audio mixing apparatus for an audio/video conference system, including: a receiving unit, configured to receive multiple channels of audio data packets from a first site and audio data packets from a plurality of second sites, where the multiple channels of audio data packets from the first site are captured by a plurality of corresponding microphones located at different positions in the first site; a first mixing unit, configured to mix the multiple channels of audio data packets from the first site with the audio data packets from the plurality of second sites, respectively; and a sending unit, configured to send each channel of mixed audio packets obtained by the mixing operation to a third site, where the third site includes a plurality of loudspeakers for respectively playing the channels of mixed audio, and the plurality of loudspeakers are in one-to-one correspondence with the plurality of microphones. The apparatus further includes: a first synchronization unit, configured to time-synchronize the multiple channels of audio data packets from the first site before they are mixed with the audio data packets from the plurality of second sites; and/or a second synchronization unit, configured to time-synchronize the channels of mixed audio packets obtained by the mixing operation before they are sent to the third site.

Optionally, each channel of audio data packets from the first site carries a first source identifier and a capture timestamp. The first source identifier indicates that the audio data packet comes from the first site, and the capture timestamp indicates the time at which that channel's audio data packet was captured. The first synchronization unit includes: a first obtaining subunit, configured to obtain the capture timestamp of each channel of audio data packets carrying the first source identifier; and a first determining subunit, configured to determine whether, among the channels of audio data packets carrying the first source identifier, the number of channels whose packets have the same predetermined capture timestamp has reached a predetermined quantity, the predetermined quantity being the number of microphones in the first site. When the predetermined quantity is reached, time synchronization of the multiple channels of audio data packets from the first site for the instant corresponding to the predetermined capture timestamp is complete. Correspondingly, when the multiple channels of audio data packets from the first site are mixed with the audio data packets from the plurality of second sites, the multiple channels of audio data packets from the first site are the predetermined quantity of channels of audio data packets that carry the first source identifier and have the same capture timestamp.

Optionally, each channel of mixed audio packets obtained by the mixing operation carries a mix timestamp, and the mix timestamp is identical to the capture timestamp carried by the first-site audio data packet that corresponds to that channel of mixed audio packets. The second synchronization unit includes: a second obtaining subunit, configured to obtain the mix timestamp carried by each channel of mixed audio packets; and a second determining subunit, configured to determine whether, among the obtained channels of mixed audio packets, the number of channels with the same mix timestamp has reached a predetermined quantity, the predetermined quantity being the number of microphones in the first site. When the predetermined quantity is reached, the step of sending each channel of mixed audio packets obtained by the mixing operation to the third site is performed. Correspondingly, in that sending step, the channels of mixed audio packets obtained by the mixing operation are the predetermined quantity of channels of mixed audio packets that have the same mix timestamp.

Optionally, the number of channels of audio data packets from a second site is the same as the number of channels of audio data packets from the first site. The apparatus further includes: a second mixing unit, configured to mix each channel of audio data packets from the second site with one channel of audio data packets from the first site, respectively.

According to a third aspect, an embodiment of the present invention provides an audio/video conference system, including: a plurality of microphones in a first site; a microphone in a second site; a plurality of loudspeakers in a third site, equal in number to the microphones in the first site and in one-to-one correspondence with them; and an audio mixing server, configured to perform the audio mixing method for an audio/video conference system according to the first aspect or any optional implementation of the first aspect.

Optionally, the audio mixing server includes a plurality of mixers, and the number of mixers is no less than the number of microphones in the first site.

In the audio mixing method, apparatus and system for an audio/video conference system provided by the embodiments of the present invention, a plurality of corresponding microphones located at different positions in the first site capture multiple channels of audio data packets. The audio mixing server receives the multiple channels of audio data packets from the first site and the audio data packets from a plurality of second sites, mixes the multiple channels of audio data packets from the first site with the audio data packets from the plurality of second sites respectively, and finally sends each channel of mixed audio packets obtained by the mixing operation to the third site, where loudspeakers in one-to-one correspondence with the microphones of the first site play the channels of mixed audio packets. The multiple channels of audio data packets from the first site are time-synchronized before the mixing operation, and/or the channels of mixed audio packets are time-synchronized before being sent to the third site. Sounds captured at the same instant in the first site are therefore played at the third site with only a small difference in playback time, which preserves the differences in the times at which the individual microphones in the first site captured the same speech. Users at the third site can use these capture-time differences to tell the position of a sound, achieving the effect of locating a sound source by ear, as if they were themselves in the first site.

Brief description of the drawings

The features and advantages of the present invention will be understood more clearly with reference to the accompanying drawings, which are schematic and should not be construed as limiting the invention in any way. In the drawings:

Fig. 1 is a schematic diagram of an application scenario of an embodiment of the present invention;

Fig. 2 is a flowchart of an audio mixing method for an audio/video conference system according to an embodiment of the present invention;

Fig. 3 is a flowchart of another audio mixing method for an audio/video conference system according to an embodiment of the present invention;

Fig. 4 is a flowchart of another audio mixing method for an audio/video conference system according to an embodiment of the present invention;

Fig. 5 is a flowchart of another audio mixing method for an audio/video conference system according to an embodiment of the present invention;

Fig. 6 is a flowchart of another audio mixing method for an audio/video conference system according to an embodiment of the present invention;

Fig. 7 is a schematic diagram of an audio mixing apparatus for an audio/video conference system according to an embodiment of the present invention;

Fig. 8 is a schematic diagram of another audio mixing apparatus for an audio/video conference system according to an embodiment of the present invention.

Detailed description of the embodiments

To make the purpose, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.

Fig. 1 is a schematic diagram of an application scenario of an embodiment of the present invention. Fig. 1 includes audio capture sites (sites A, B, C and D in Fig. 1), an audio playback site (site E) and an audio mixing server. The first site among the audio capture sites (for example, site A in Fig. 1) uses a plurality of microphones to capture sound in the site. The channels of audio data packets captured at the audio capture sites are transmitted to the audio mixing server for mixing, and the resulting channels of mixed audio packets are sent to the audio playback site for playback. The audio playback site (site E in Fig. 1) uses a plurality of loudspeakers to play the channels of mixed audio packets, and its number of loudspeakers is not less than the number of microphones in the first site among the audio capture sites. The audio mixing server includes a network receiving module, a network sending module and a plurality of mixers, and the number of mixers is not less than the number of microphones in the first site among the audio capture sites.

Because a speaker in site A is at a different distance from each microphone, the microphones capture what the speaker says at a given instant at slightly different times. If the loudspeakers in site E reproduce this first time difference (that is, the difference in capture time when multiple microphones in the first site capture the same sound) when playing the sound captured by their corresponding microphones, users in site E can tell the position of the sound from the first time difference among the sounds they hear from site A, as if they were themselves in site A. For the loudspeakers in site E to reproduce the first time difference, the sounds from site A that were captured at the same instant must be played in site E simultaneously (or with only a small difference in playback time).
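As a rough illustration of the size of the first time difference, it can be written in terms of the distances from the speaker to two microphones. The symbols below are introduced here for illustration only and do not appear in the original text: d_i and d_j are the distances from the speaker to microphones i and j, and c is the speed of sound in air, taken as approximately 343 m/s.

```latex
\Delta t_{ij} = \frac{d_i - d_j}{c}, \qquad c \approx 343\ \mathrm{m/s}

% Worked example with assumed distances d_1 = 3 m and d_2 = 1 m:
\Delta t_{12} = \frac{3\ \mathrm{m} - 1\ \mathrm{m}}{343\ \mathrm{m/s}} \approx 5.8\ \mathrm{ms}
```

Under these assumptions the first time difference is only a few milliseconds, so even modest network jitter between channels could mask it unless the channels are time-synchronized.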

It should be added that the microphones in this application may also be other audio capture devices, and the loudspeakers may also be other audio playback devices.

Embodiment one

Based on the above principle, an embodiment of the present invention provides an audio mixing method for an audio/video conference system. Fig. 2 is a flowchart of an audio mixing method for an audio/video conference system according to an embodiment of the present invention. The method is applied to the audio mixing server shown in Fig. 1. As shown in Fig. 2, the method includes:

S101: Receive multiple channels of audio data packets from a first site and audio data packets from a plurality of second sites. The multiple channels of audio data packets from the first site are captured by a plurality of corresponding microphones located at different positions in the first site.

As shown in Fig. 1, the 3 channels of audio data packets from site A are captured by 3 corresponding microphones at different positions in that site; site A is the first site, and sites B, C and D are second sites. The audio mixing server receives, through the network receiving module, the three channels of audio data packets A1, A2 and A3 from site A, as well as audio data packet B1 from site B, audio data packet C1 from site C and audio data packet D1 from site D.

S102: Time-synchronize the multiple channels of audio data packets from the first site.

For example, owing to network congestion or delay, the 3 audio data packets captured at the same instant by the 3 microphones in the first site do not reach the audio mixing server at the same time. Suppose a speaker in site A says the word "ha", and the 3 audio data packets A1, A2 and A3 captured by the 3 microphones reach the audio mixing server at times t1, t1+Δt and t1+2Δt respectively. A second time difference (that is, for packets captured at the same instant, the time difference caused by transmission delay) then exists between the channels of mixed audio packets obtained after mixing; for example, the second time difference is also Δt. Further delay arises during transmission from the audio mixing server to site E, so the second time difference between the channels of mixed audio packets grows further, and site E is likely to play "ha ha ha".

Step S102 time-synchronizes the 3 channels of audio data packets captured at the same instant by the 3 microphones in the first site; that is, the audio data packets captured at the same instant in the first site enter the mixers and are mixed at the same time. This reduces the second time difference between the channels of first-site audio content (for example, the captured "ha") after mixing, and prevents a sound captured at the same instant by multiple microphones in the first site (for example, "ha") from being played at the audio playback site as several copies of the same sound with a large time difference (for example, "ha ha ha").

S103: Mix the multiple channels of audio data packets from the first site with the audio data packets from the plurality of second sites, respectively.

As shown in Fig. 1, after step S102 the audio data packets of the first site (site A) are AA1, AA2 and AA3. These three audio data packets are mixed with audio data packets B1, C1 and D1 from sites B, C and D respectively; for example, AA1 and B1 are mixed to obtain mixed audio packet H1, AA2 and C1 are mixed to obtain mixed audio packet H2, and AA3 and D1 are mixed to obtain mixed audio packet H3.

It should be added that each channel of audio data packets from the first site is mixed by its own mixer; for example, audio data packet AA1 uses mixer 1, AA2 uses mixer 2 and AA3 uses mixer 3. The audio data packets from a second site may be mixed by any one of the mixers; for example, audio data packets B1, C1 and D1 may all be mixed by mixer 1 shown in Fig. 1. Alternatively, if there is a further second site E, its audio data packet E1 may be mixed by any mixer shown in Fig. 1.
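The mixer assignment described for step S103 might look like the following sketch. It assumes that each packet payload is a list of 16-bit PCM samples and that mixing is a clipped sample-wise sum; the names `mix_frames`, `mix_per_channel` and the `assignment` mapping are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

INT16_MIN, INT16_MAX = -32768, 32767


def mix_frames(frames: List[List[int]]) -> List[int]:
    """Clipped sample-wise sum of equal-length PCM frames."""
    return [max(INT16_MIN, min(INT16_MAX, sum(s))) for s in zip(*frames)]


def mix_per_channel(
    first_site: Dict[str, List[int]],
    second_site: Dict[str, List[int]],
    assignment: Dict[str, str],
) -> Dict[str, List[int]]:
    """One mixer per first-site channel.

    first_site:  first-site channel name -> frame (e.g. "AA1")
    second_site: second-site packet name -> frame (e.g. "B1")
    assignment:  second-site packet name -> first-site channel whose
                 mixer it is fed into (any mixer may be chosen)
    """
    mixer_inputs = {channel: [frame] for channel, frame in first_site.items()}
    for packet_name, channel in assignment.items():
        mixer_inputs[channel].append(second_site[packet_name])
    return {channel: mix_frames(inputs) for channel, inputs in mixer_inputs.items()}


# Pairing from the example: AA1+B1, AA2+C1, AA3+D1 (sample values are made up).
first = {"AA1": [1000, -1000], "AA2": [30000, 0], "AA3": [-5, 5]}
second = {"B1": [200, 300], "C1": [5000, 5], "D1": [5, -5]}
mixed = mix_per_channel(first, second, {"B1": "AA1", "C1": "AA2", "D1": "AA3"})
```

Passing `{"B1": "AA1", "C1": "AA1", "D1": "AA1"}` instead would correspond to the variant in which B1, C1 and D1 all go through mixer 1.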

S104: Send each channel of mixed audio packets obtained by the mixing operation to a third site. The third site includes a plurality of loudspeakers for respectively playing the channels of mixed audio. The loudspeakers in the third site are in one-to-one correspondence with the microphones in the first site.

For example, the 3 channels of mixed audio packets H1, H2 and H3 shown in Fig. 1 are sent directly through the network sending module to the third site, where 3 loudspeakers play the 3 channels of mixed audio.

"One-to-one correspondence" here means that the audio captured by a given microphone is played by its corresponding loudspeaker. Further, so that the sound of the first site is reproduced more faithfully at the third site, the placement of the loudspeakers in the third site may also correspond to the placement of the microphones in the first site.
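The one-to-one correspondence can be pictured as a simple lookup from microphone to loudspeaker. The sketch below is illustrative only; the function name `route_to_loudspeakers` and the identifiers such as `"mic1"` and `"spk1"` are assumptions, not names used in the original text.

```python
from typing import Dict, List


def route_to_loudspeakers(
    mixes: Dict[str, List[int]],
    mic_to_speaker: Dict[str, str],
) -> Dict[str, List[int]]:
    """Deliver each mixed channel to the loudspeaker paired with the
    first-site microphone whose audio that channel contains.

    mixes:          microphone id -> mixed frame for that channel
    mic_to_speaker: microphone id -> loudspeaker id (must be one-to-one)
    """
    if len(set(mic_to_speaker.values())) != len(mic_to_speaker):
        raise ValueError("microphone-to-loudspeaker mapping is not one-to-one")
    return {mic_to_speaker[mic]: frame for mic, frame in mixes.items()}


# Hypothetical pairing for a three-microphone first site.
pairing = {"mic1": "spk1", "mic2": "spk2", "mic3": "spk3"}
playback = route_to_loudspeakers({"mic1": [1], "mic2": [2], "mic3": [3]}, pairing)
```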

In the above audio mixing method for an audio/video conference system, a plurality of corresponding microphones located at different positions in the first site capture multiple channels of audio data packets. The audio mixing server receives the multiple channels of audio data packets from the first site and the audio data packets from a plurality of second sites, time-synchronizes the multiple channels of audio data packets from the first site, mixes them with the audio data packets from the plurality of second sites respectively, and finally sends each channel of mixed audio packets obtained by the mixing operation to the third site, where loudspeakers in one-to-one correspondence with the microphones of the first site play the channels of mixed audio packets. Sounds captured at the same instant in the first site are therefore played at the third site with only a small difference in playback time, which preserves the differences in the times at which the individual microphones in the first site captured the same speech. Users at the third site can use these capture-time differences to tell the position of a sound, achieving the effect of locating a sound source by ear, as if they were themselves in site A.

Embodiment two

Fig. 3 is a flowchart of another audio mixing method for an audio/video conference system according to an embodiment of the present invention. The method is applied to the audio mixing server shown in Fig. 1. As shown in Fig. 3, the method includes:

S201: Receive multiple channels of audio data packets from a first site and audio data packets from a plurality of second sites. The multiple channels of audio data packets from the first site are captured by a plurality of corresponding microphones located at different positions in the first site. This step is similar to step S101 in Embodiment one and is not described again here.

S202: Mix the multiple channels of audio data packets from the first site with the audio data packets from the plurality of second sites, respectively.

As shown in Fig. 1, the audio data packets of the first site (site A) are A1, A2 and A3. These three audio data packets are mixed directly with audio data packets B1, C1 and D1 from sites B, C and D respectively; for example, A1 and B1 are mixed to obtain mixed audio packet H1, A2 and C1 are mixed to obtain mixed audio packet H2, and A3 and D1 are mixed to obtain mixed audio packet H3.

It should be added that each channel of audio data packets from the first site is mixed by its own mixer; for example, audio data packet A1 uses mixer 1, A2 uses mixer 2 and A3 uses mixer 3. The audio data packets from a second site may be mixed by any one of the mixers; for example, audio data packets B1, C1 and D1 may all be mixed by mixer 1 shown in Fig. 1. Alternatively, if there is a further second site E, its audio data packet E1 may be mixed by any mixer shown in Fig. 1.

S203: Time-synchronize the channels of mixed audio packets obtained by the mixing operation.

Step S203 time-synchronizes the channels of mixed audio packets obtained by the mixing operation; that is, before the channels of mixed audio packets are sent to the third site, they are first time-synchronized, and in particular the first-site audio content within the mixed audio packets is time-synchronized. This reduces the second time difference between the channels of first-site audio content (for example, the captured "ha") after mixing, and prevents a sound captured at the same instant by multiple microphones in the first site (for example, "ha") from being played at the audio playback site as several copies of the same sound with a large time difference (for example, "ha ha ha").
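One possible way to realize step S203 is to hold mixed packets back until every channel for a given mix timestamp is available and then release them together. The sketch below assumes each mixed packet carries a mix timestamp equal to the capture timestamp of its first-site packet, as in the optional implementation described in the summary; the class name `MixOutputSynchronizer` and its methods are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Optional


class MixOutputSynchronizer:
    """Release mixed packets only as complete per-timestamp groups."""

    def __init__(self, channel_count: int) -> None:
        # channel_count is assumed to equal the number of first-site microphones.
        self.channel_count = channel_count
        self._held: Dict[int, Dict[str, bytes]] = defaultdict(dict)

    def push(self, mix_timestamp: int, channel: str, payload: bytes) -> Optional[Dict[str, bytes]]:
        """Store one mixed packet. Return the whole group (channel -> payload)
        once all channels for this mix timestamp are present, otherwise None."""
        group = self._held[mix_timestamp]
        group[channel] = payload
        if len(group) == self.channel_count:
            return self._held.pop(mix_timestamp)
        return None


# Hypothetical usage with three mixed channels H1, H2 and H3.
sync = MixOutputSynchronizer(channel_count=3)
sync.push(100, "H1", b"h1")
sync.push(100, "H2", b"h2")
ready = sync.push(100, "H3", b"h3")  # all three channels present: group is released
```

A production implementation would presumably also need a policy for groups that never complete; none is specified in this step, so the sketch omits it.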

S204: Send the channels of mixed audio packets to a third site. The third site includes a plurality of loudspeakers for respectively playing the channels of mixed audio. The loudspeakers in the third site are in one-to-one correspondence with the microphones in the first site. This step is similar to step S104 in Embodiment one and is not described again here.

In the above audio mixing method for an audio/video conference system, a plurality of corresponding microphones located at different positions in the first site capture multiple channels of audio data packets. The audio mixing server receives the multiple channels of audio data packets from the first site and the audio data packets from a plurality of second sites, mixes the multiple channels of audio data packets from the first site with the audio data packets from the plurality of second sites respectively, time-synchronizes the channels of mixed audio packets obtained by the mixing operation, and finally sends the channels of mixed audio packets to the third site, where loudspeakers in one-to-one correspondence with the microphones of the first site play them. Sounds captured at the same instant in the first site are therefore played at the third site with only a small difference in playback time, which preserves the differences in the times at which the individual microphones in the first site captured the same speech. Users at the third site can use these capture-time differences to tell the position of a sound, achieving the effect of locating a sound source by ear, as if they were themselves in site A.

Embodiment three

Fig. 4 shows a flow chart of another sound mixing method for an audio/video conference system according to an embodiment of the present invention. The method is applied to the audio mixing server shown in Fig. 1. As shown in Fig. 4, the method includes:

S301: Receive the multi-channel audio data packets of the first meeting-place and the audio data packets of multiple second meeting-places. The multi-channel audio data packets of the first meeting-place are captured by multiple corresponding microphones at different positions in the first meeting-place. This step is similar to step S101 in Embodiment one and is not described again here.

S302: Obtain the acquisition timestamp of each channel's audio data packet that carries the first source identifier.

Each channel's audio data packet from the first meeting-place carries a first source identifier and an acquisition timestamp. The first source identifier indicates that the audio data packet originates from the first meeting-place. The acquisition timestamp indicates the time at which that channel's audio data packet was captured.

S303: Determine whether, among the audio data packets carrying the first source identifier, the number of channels whose packets have the same predetermined acquisition timestamp has reached a predetermined quantity. The predetermined quantity is the number of microphones in the first meeting-place.

S304: When the predetermined quantity is reached, the time synchronization of the first meeting-place's multi-channel audio data packets for the instant corresponding to the predetermined acquisition timestamp is complete.

In steps S302, S303 and S304 above, after receiving each audio data packet, the audio mixing server checks whether the packet carries the first source identifier. If it does (that is, the packet is an audio data packet from the first meeting-place), the packet needs to be time-synchronized.

Specifically, for an audio data packet carrying the first source identifier, the server obtains the acquisition timestamp it carries and then waits until the number of audio data packets with the same timestamp reaches the predetermined quantity, at which point the time synchronization operation is complete. For example, if there are 3 microphones in the first meeting-place, then for audio data packets whose acquisition timestamp is t2, the time synchronization step is complete once the number of such packets (carrying the first source identifier) reaches 3. Optionally, if the number of audio data packets with the same timestamp has not reached the predetermined quantity after a predetermined waiting period, the audio data packets with that timestamp are discarded.
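
Steps S302 to S304, including the optional discard after a waiting period, can be illustrated with a minimal Python sketch. The class name, the dictionary keys (`first_source`, `capture_ts`, `channel`, `payload`) and the explicit `now` argument are assumptions made for illustration; the description does not prescribe any data layout or API.

```python
from collections import defaultdict


class CaptureTimestampSynchronizer:
    """Groups first-meeting-place packets by acquisition timestamp (hypothetical helper)."""

    def __init__(self, mic_count, max_wait):
        self.mic_count = mic_count        # predetermined quantity = number of microphones
        self.max_wait = max_wait          # predetermined waiting period, in seconds
        self.pending = defaultdict(dict)  # timestamp -> {channel: payload}
        self.first_seen = {}              # timestamp -> arrival time of its first packet

    def push(self, packet, now):
        """Return the complete group for this packet's timestamp, or None while waiting."""
        if not packet.get("first_source"):
            return None  # no first source identifier: not synchronized here
        ts = packet["capture_ts"]
        self.first_seen.setdefault(ts, now)
        self.pending[ts][packet["channel"]] = packet["payload"]
        if len(self.pending[ts]) == self.mic_count:
            del self.first_seen[ts]
            return self.pending.pop(ts)  # synchronization complete for this instant
        return None

    def drop_expired(self, now):
        """Discard incomplete groups that have waited longer than max_wait."""
        expired = [ts for ts, t0 in self.first_seen.items() if now - t0 > self.max_wait]
        for ts in expired:
            del self.first_seen[ts]
            del self.pending[ts]
        return expired
```

With `mic_count=3`, the third packet stamped t2 causes `push` to return the whole group, which would then be handed to the mixing step; groups that never complete are removed by `drop_expired`.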

Steps S302, S303 and S304 above implement step S102 of Embodiment one, "time-synchronizing the multi-channel audio data packets of the first meeting-place".

It should be added that this embodiment time-synchronizes the first meeting-place's audio data packets using the time at which the microphones captured the sound. As a variant of this embodiment, a timestamp added to the audio data packet at any moment before the data captured by the microphone is sent to the audio mixing server may be used instead.

S305: Mix the multi-channel audio data packets of the first meeting-place with the audio data packets of the multiple second meeting-places, respectively. The multi-channel audio data packets of the first meeting-place are the audio data packets that carry the first source identifier, have the same acquisition timestamp, and whose number of channels equals the predetermined quantity. This step is similar to step S103 in Embodiment one and is not described again here.

S306: Send the mixed audio packets of the respective channels obtained by the mixing operation to the third meeting-place. The third meeting-place includes multiple loudspeakers, each playing one channel of mixed audio. The loudspeakers in the third meeting-place correspond one-to-one with the microphones in the first meeting-place. This step is similar to step S104 in Embodiment one and is not described again here.

Embodiment four

Fig. 5 shows a flow chart of another sound mixing method for an audio/video conference system according to an embodiment of the present invention. The method is applied to the audio mixing server shown in Fig. 1. As shown in Fig. 5, the method includes:

S401: Receive the multi-channel audio data packets of the first meeting-place and the audio data packets of multiple second meeting-places. The multi-channel audio data packets of the first meeting-place are captured by multiple corresponding microphones at different positions in the first meeting-place. This step is similar to step S201 in Embodiment two and is not described again here.

S402: Mix the multi-channel audio data packets of the first meeting-place with the audio data packets of the multiple second meeting-places, respectively. This step is similar to step S202 in Embodiment two and is not described again here.

S403: Obtain the mixing timestamp carried by each channel's mixed audio packet.

Each channel's mixed audio packet obtained by the mixing operation carries a mixing timestamp, which is identical to the acquisition timestamp carried by the first-meeting-place audio data packet to which that mixed audio packet corresponds. As shown in Fig. 1, if audio data packet A1 of the first meeting-place is mixed directly with audio data packet B1 of meeting-place B (a second meeting-place) to obtain mixed audio packet H1, then the mixing timestamp carried by H1 is identical to the acquisition timestamp carried by A1.
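
The way H1 inherits its mixing timestamp from A1 can be sketched as follows. The mixing rule itself (sample-wise addition of 16-bit PCM values with clipping) is an assumption, since the description does not specify how samples are combined; the dictionary keys are likewise hypothetical.

```python
def mix_packets(first_site_packet, second_site_packet):
    """Mix two packets of 16-bit PCM samples; the result keeps the first-site timestamp."""
    mixed = [
        max(-32768, min(32767, a + b))  # assumed rule: add sample-wise, clip to 16 bits
        for a, b in zip(first_site_packet["samples"], second_site_packet["samples"])
    ]
    return {
        "channel": first_site_packet["channel"],
        # mixing timestamp of H1 = acquisition timestamp carried by A1
        "mix_ts": first_site_packet["capture_ts"],
        "samples": mixed,
    }
```

Because the timestamp is copied rather than regenerated at mixing time, the later synchronization step can still group mixed packets by the instant at which the first meeting-place audio was captured.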

S404: Determine whether, among the obtained mixed audio packets, the number of channels with the same mixing timestamp has reached a predetermined quantity. The predetermined quantity is the number of microphones in the first meeting-place.

S405: When the predetermined quantity is reached, send the mixed audio packets of the respective channels to the third meeting-place. The third meeting-place includes multiple loudspeakers, each playing one channel of mixed audio. The loudspeakers in the third meeting-place correspond one-to-one with the microphones in the first meeting-place. The sending of the mixed audio packets to the third meeting-place is similar to step S204 in Embodiment two and is not described again here.

The mixed audio packets sent in this step are those that have the same mixing timestamp and whose number of channels equals the predetermined quantity.

In steps S403, S404 and S405 above, before sending the mixed audio packets obtained by the mixing operation to the third meeting-place, the audio mixing server first obtains the mixing timestamp carried by each channel's mixed audio packet and waits until the number of mixed audio packets with the same mixing timestamp reaches the predetermined quantity, at which point the time synchronization operation is complete. For example, if there are 3 microphones in the first meeting-place, then for mixed audio packets whose mixing timestamp is t3, the time synchronization step is complete once the number of such packets reaches 3. Optionally, if the number of packets with the same timestamp has not reached the predetermined quantity after a predetermined waiting period, the packets with that timestamp are discarded.
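
The gating in steps S403 to S405 can be sketched as a stateless function that releases only complete groups of mixed audio packets. The function name and packet keys are hypothetical, and the optional timeout-based discard is deliberately left out of this sketch.

```python
from collections import defaultdict


def release_complete_mixes(mixed_packets, mic_count):
    """Return, in timestamp order, the groups of mixed packets that are ready to send.

    Each released group maps a loudspeaker index to samples, where loudspeaker i
    in the third meeting-place corresponds to microphone i in the first.
    """
    groups = defaultdict(dict)
    for packet in mixed_packets:
        groups[packet["mix_ts"]][packet["channel"]] = packet["samples"]
    return [
        (ts, groups[ts])
        for ts in sorted(groups)
        if len(groups[ts]) == mic_count  # predetermined quantity reached
    ]
```

A group whose count is still below `mic_count` is simply not returned, so in this sketch it would be neither sent nor discarded until a later call sees the missing channels.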

Steps S403, S404 and S405 above implement step S203 of Embodiment two, "time-synchronizing the mixed audio packets of the respective channels obtained by the mixing operation".

Embodiment five

Fig. 6 shows a flow chart of another sound mixing method for an audio/video conference system according to an embodiment of the present invention. The method is applied to the audio mixing server shown in Fig. 1. As shown in Fig. 6, the method includes:

S501: Receive the multi-channel audio data packets of the first meeting-place and the audio data packets of multiple second meeting-places. The multi-channel audio data packets of the first meeting-place are captured by multiple corresponding microphones at different positions in the first meeting-place. This step is similar to step S101 in Embodiment one and is not described again here.

S502: Time-synchronize the multi-channel audio data packets of the first meeting-place. This step is similar to step S102 in Embodiment one and is not described again here.

It should be added that step S502 may also be replaced by steps S302, S303 and S304 of Embodiment three.

S503: Mix the multi-channel audio data packets of the first meeting-place with the audio data packets of the multiple second meeting-places, respectively. This step is similar to step S103 in Embodiment one and is not described again here.

S504: Time-synchronize the mixed audio packets of the respective channels obtained by the mixing operation. This step is similar to step S203 in Embodiment two and is not described again here.

It should be added that step S504 may also be replaced by steps S403, S404 and S405 of Embodiment four.

S505: Send the mixed audio packets of the respective channels to the third meeting-place. The third meeting-place includes multiple loudspeakers, each playing one channel of mixed audio. The loudspeakers in the third meeting-place correspond one-to-one with the microphones in the first meeting-place. This step is similar to step S104 in Embodiment one and is not described again here.

Embodiment six

This embodiment of the invention provides a sound mixing method for an audio/video conference system. It differs from Embodiments one to five in that the number of audio data packet channels of the second meeting-place is the same as the number of audio data packet channels of the first meeting-place.

The step of mixing the multi-channel audio data packets of the first meeting-place with the audio data packets of the multiple second meeting-places respectively includes: mixing each channel's audio data packet of the second meeting-place with one channel's audio data packet of the first meeting-place, respectively.

As shown in Fig. 1, suppose meeting-place B, serving as a second meeting-place, also has 3 microphones that capture 3 channels of audio data packets B1, B2 and B3 (the same number as the audio data packets of the first meeting-place). Each channel's audio data packet of the first meeting-place (meeting-place A) is then mixed with the corresponding channel's audio data packet of the second meeting-place (meeting-place B): A1 with B1, A2 with B2, and A3 with B3. In other words, each audio channel of meeting-place B, like each audio channel of meeting-place A, is mixed by its own mixer.
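
The channel-wise pairing of Embodiment six (A1 with B1, A2 with B2, A3 with B3) can be sketched as below. Each inner list comprehension stands in for one mixer; plain sample addition is an assumed mixing rule, and the function name is hypothetical.

```python
def mix_channelwise(first_site, second_site):
    """Mix A_i with B_i for every channel i; both sites need the same channel count."""
    if len(first_site) != len(second_site):
        raise ValueError("channel counts of the two meeting-places must match")
    return [
        [a + b for a, b in zip(a_samples, b_samples)]  # one mixer per channel pair
        for a_samples, b_samples in zip(first_site, second_site)
    ]
```

Keeping the channels paired, instead of folding all of meeting-place B into every output, is what preserves one output channel per first-meeting-place microphone.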

Embodiment seven

Fig. 7 shows a schematic diagram of a sound mixing apparatus for an audio/video conference system according to an embodiment of the present invention. The apparatus is applied to the audio mixing server shown in Fig. 1. As shown in Fig. 7, the apparatus includes a receiving unit 10, a first mixing unit 20 and a sending unit 30.

The receiving unit 10 is configured to receive the multi-channel audio data packets of the first meeting-place and the audio data packets of multiple second meeting-places. The multi-channel audio data packets of the first meeting-place are captured by multiple corresponding microphones at different positions in the first meeting-place.

The first mixing unit 20 is configured to mix the multi-channel audio data packets of the first meeting-place with the audio data packets of the multiple second meeting-places, respectively.

The sending unit 30 is configured to send the mixed audio packets of the respective channels obtained by the mixing operation to the third meeting-place. The third meeting-place includes multiple loudspeakers, each playing one channel of mixed audio. The positions of the loudspeakers in the third meeting-place correspond one-to-one with the positions of the microphones in the first meeting-place.

The apparatus further includes a first synchronization unit 40 and/or a second synchronization unit 50.

The first synchronization unit 40 is configured to time-synchronize the multi-channel audio data packets of the first meeting-place before the step of mixing the multi-channel audio data packets of the first meeting-place with the audio data packets of the multiple second meeting-places, respectively.

The second synchronization unit 50 is configured to time-synchronize the mixed audio packets of the respective channels obtained by the mixing operation before the step of sending those mixed audio packets to the third meeting-place.

In the above sound mixing apparatus for an audio/video conference system, multiple corresponding microphones at different positions in the first meeting-place capture multi-channel audio data packets. The audio mixing server receives the multi-channel audio data packets from the first meeting-place and the audio data packets of the multiple second meeting-places, mixes the multi-channel audio data packets of the first meeting-place with the audio data packets of the multiple second meeting-places respectively, and finally sends the resulting mixed audio packets of the respective channels to the third meeting-place, which plays each channel's mixed audio packet through loudspeakers that correspond one-to-one with the microphones of the first meeting-place. The multi-channel audio data packets of the first meeting-place are time-synchronized before the mixing operation, and/or the mixed audio packets of the respective channels are time-synchronized before being sent to the third meeting-place. As a result, sounds captured at the same instant in the first meeting-place are played in the third meeting-place with only a small difference in playback time, so the differences in the times at which the individual microphones in the first meeting-place captured the same speech content are preserved. A user in the third meeting-place can therefore locate the sound source from these capture-time differences, achieving the effect of locating a speaker by ear, as if the user were present in meeting-place A.

Embodiment eight

Fig. 8 shows a schematic diagram of another sound mixing apparatus for an audio/video conference system according to an embodiment of the present invention. The apparatus is applied to the audio mixing server shown in Fig. 1. It differs from Embodiment seven in that each channel's audio data packet of the first meeting-place carries a first source identifier and an acquisition timestamp. The first source identifier indicates that the audio data packet originates from the first meeting-place. The acquisition timestamp indicates the time at which that channel's audio data packet was captured. The first synchronization unit 40 includes a first obtaining subunit 41 and a first judging subunit 42.

The first obtaining subunit 41 is configured to obtain the acquisition timestamp of each channel's audio data packet that carries the first source identifier.

The first judging subunit 42 is configured to determine whether, among the audio data packets carrying the first source identifier, the number of channels whose packets have the same predetermined acquisition timestamp has reached a predetermined quantity. The predetermined quantity is the number of microphones in the first meeting-place.

When the predetermined quantity is reached, the time synchronization of the first meeting-place's multi-channel audio data packets for the instant corresponding to the predetermined acquisition timestamp is complete.

Correspondingly, in the step of mixing the multi-channel audio data packets of the first meeting-place with the audio data packets of the multiple second meeting-places respectively, the multi-channel audio data packets of the first meeting-place are the audio data packets that carry the first source identifier, have the same acquisition timestamp, and whose number of channels equals the predetermined quantity.

As an optional implementation of this embodiment, each channel's mixed audio packet obtained by the mixing operation carries a mixing timestamp, which is identical to the acquisition timestamp carried by the first-meeting-place audio data packet to which that mixed audio packet corresponds. The second synchronization unit 50 includes a second obtaining subunit 51 and a second judging subunit 52.

The second obtaining subunit 51 is configured to obtain the mixing timestamp carried by each channel's mixed audio packet.

The second judging subunit 52 is configured to determine whether, among the obtained mixed audio packets, the number of channels with the same mixing timestamp has reached a predetermined quantity. The predetermined quantity is the number of microphones in the first meeting-place.

When the predetermined quantity is reached, the step of sending the mixed audio packets of the respective channels obtained by the mixing operation to the third meeting-place is performed.

Correspondingly, in the step of sending the mixed audio packets of the respective channels obtained by the mixing operation to the third meeting-place, those mixed audio packets are the ones that have the same mixing timestamp and whose number of channels equals the predetermined quantity.

As an optional implementation of this embodiment, the number of audio data packet channels of the second meeting-place is the same as the number of audio data packet channels of the first meeting-place. The apparatus further includes a second mixing unit 60, configured to mix each channel's audio data packet of the second meeting-place with one channel's audio data packet of the first meeting-place, respectively.

Embodiment nine

This embodiment of the invention provides an audio/video conference system which, as shown in Fig. 1, includes multiple microphones in a first meeting-place, a microphone in a second meeting-place, multiple loudspeakers in a third meeting-place equal in number to the microphones in the first meeting-place, and an audio mixing server. The loudspeakers in the third meeting-place correspond one-to-one with the microphones in the first meeting-place.

The audio mixing server is configured to perform the sound mixing method for an audio/video conference system according to any one of Embodiments one to six.

As an optional implementation of this embodiment, the audio mixing server includes multiple mixers, and the number of mixers is no less than the number of microphones in the first meeting-place.

Although embodiments of the present invention have been described with reference to the accompanying drawings, those skilled in the art may make various modifications and variations without departing from the spirit and scope of the invention, and all such modifications and variations fall within the scope defined by the appended claims.

Claims

1. A sound mixing method for an audio/video conference system, characterized by comprising:

receiving multi-channel audio data packets of a first meeting-place and audio data packets of multiple second meeting-places, the multi-channel audio data packets of the first meeting-place being captured by multiple corresponding microphones at different positions in the first meeting-place;

mixing the multi-channel audio data packets of the first meeting-place with the audio data packets of the multiple second meeting-places, respectively;

sending the mixed audio packets of the respective channels obtained by the mixing operation to a third meeting-place, the third meeting-place including multiple loudspeakers for respectively playing the channels of mixed audio, the multiple loudspeakers corresponding one-to-one with the multiple microphones;

the method further comprising: before the step of mixing the multi-channel audio data packets of the first meeting-place with the audio data packets of the multiple second meeting-places respectively, time-synchronizing the multi-channel audio data packets of the first meeting-place; and/or,

before the step of sending the mixed audio packets of the respective channels obtained by the mixing operation to the third meeting-place, time-synchronizing the mixed audio packets of the respective channels obtained by the mixing operation.

2. The sound mixing method for an audio/video conference system according to claim 1, characterized in that each channel's audio data packet of the first meeting-place carries a first source identifier and an acquisition timestamp; the first source identifier indicates that the audio data packet originates from the first meeting-place; the acquisition timestamp indicates the time at which that channel's audio data packet was captured; and the step of time-synchronizing the multi-channel audio data packets of the first meeting-place includes:

obtaining the acquisition timestamp of each channel's audio data packet that carries the first source identifier;

determining whether, among the audio data packets carrying the first source identifier, the number of channels whose packets have the same predetermined acquisition timestamp has reached a predetermined quantity, the predetermined quantity being the number of microphones in the first meeting-place;

when the predetermined quantity is reached, completing the time synchronization of the first meeting-place's multi-channel audio data packets for the instant corresponding to the predetermined acquisition timestamp;

correspondingly, in the step of mixing the multi-channel audio data packets of the first meeting-place with the audio data packets of the multiple second meeting-places respectively, the multi-channel audio data packets of the first meeting-place are the audio data packets that carry the first source identifier, have the same acquisition timestamp, and whose number of channels equals the predetermined quantity.

3. The sound mixing method for an audio/video conference system according to claim 1 or 2, characterized in that each channel's mixed audio packet obtained by the mixing operation carries a mixing timestamp, the mixing timestamp being identical to the acquisition timestamp carried by the first-meeting-place audio data packet to which that mixed audio packet corresponds;

the step of time-synchronizing the mixed audio packets of the respective channels obtained by the mixing operation includes:

obtaining the mixing timestamp carried by each channel's mixed audio packet;

determining whether, among the obtained mixed audio packets, the number of channels with the same mixing timestamp has reached a predetermined quantity, the predetermined quantity being the number of microphones in the first meeting-place;

when the predetermined quantity is reached, performing the step of sending the mixed audio packets of the respective channels obtained by the mixing operation to the third meeting-place;

correspondingly, in the step of sending the mixed audio packets of the respective channels obtained by the mixing operation to the third meeting-place, those mixed audio packets are the ones that have the same mixing timestamp and whose number of channels equals the predetermined quantity.

4. The sound mixing method for an audio/video conference system according to claim 1, characterized in that the number of audio data packet channels of the second meeting-place is the same as the number of audio data packet channels of the first meeting-place;

the step of mixing the multi-channel audio data packets of the first meeting-place with the audio data packets of the multiple second meeting-places respectively includes: mixing each channel's audio data packet of the second meeting-place with one channel's audio data packet of the first meeting-place, respectively.

5. A sound mixing apparatus for an audio/video conference system, characterized by comprising:

a receiving unit, configured to receive multi-channel audio data packets of a first meeting-place and audio data packets of multiple second meeting-places, the multi-channel audio data packets of the first meeting-place being captured by multiple corresponding microphones at different positions in the first meeting-place;

a first mixing unit, configured to mix the multi-channel audio data packets of the first meeting-place with the audio data packets of the multiple second meeting-places, respectively;

a sending unit, configured to send the mixed audio packets of the respective channels obtained by the mixing operation to a third meeting-place, the third meeting-place including multiple loudspeakers for respectively playing the channels of mixed audio, the multiple loudspeakers corresponding one-to-one with the multiple microphones;

the apparatus further comprising: a first synchronization unit, configured to time-synchronize the multi-channel audio data packets of the first meeting-place before the step of mixing the multi-channel audio data packets of the first meeting-place with the audio data packets of the multiple second meeting-places respectively; and/or,

a second synchronization unit, configured to time-synchronize the mixed audio packets of the respective channels obtained by the mixing operation before the step of sending those mixed audio packets to the third meeting-place.

6. The sound mixing apparatus for an audio/video conference system according to claim 5, characterized in that each channel's audio data packet of the first meeting-place carries a first source identifier and an acquisition timestamp; the first source identifier indicates that the audio data packet originates from the first meeting-place; the acquisition timestamp indicates the time at which that channel's audio data packet was captured; and the first synchronization unit includes:

a first obtaining subunit, configured to obtain the acquisition timestamp of each channel's audio data packet that carries the first source identifier;

a first judging subunit, configured to determine whether, among the audio data packets carrying the first source identifier, the number of channels whose packets have the same predetermined acquisition timestamp has reached a predetermined quantity, the predetermined quantity being the number of microphones in the first meeting-place.

7. The sound mixing apparatus for an audio/video conference system according to claim 5 or 6, characterized in that each channel's mixed audio packet obtained by the mixing operation carries a mixing timestamp, the mixing timestamp being identical to the acquisition timestamp carried by the first-meeting-place audio data packet to which that mixed audio packet corresponds;

the second synchronization unit includes:

a second obtaining subunit, configured to obtain the mixing timestamp carried by each channel's mixed audio packet;

a second judging subunit, configured to determine whether, among the obtained mixed audio packets, the number of channels with the same mixing timestamp has reached a predetermined quantity, the predetermined quantity being the number of microphones in the first meeting-place.

8. The sound mixing apparatus for an audio/video conference system according to claim 5, characterized in that the number of audio data packet channels of the second meeting-place is the same as the number of audio data packet channels of the first meeting-place;

the apparatus further comprising: a second mixing unit, configured to mix each channel's audio data packet of the second meeting-place with one channel's audio data packet of the first meeting-place, respectively.

9. An audio/video conference system, characterized by comprising:

multiple microphones in a first meeting-place;

a microphone in a second meeting-place;

multiple loudspeakers in a third meeting-place, equal in number to the microphones in the first meeting-place, the loudspeakers in the third meeting-place corresponding one-to-one with the microphones in the first meeting-place;

an audio mixing server, configured to perform the sound mixing method for an audio/video conference system according to any one of claims 1 to 4.

10. The audio/video conference system according to claim 9, characterized in that the audio mixing server includes multiple mixers, the number of mixers being no less than the number of microphones in the first meeting-place.