CN107195308B - Audio mixing method, device and system of audio and video conference system - Google Patents

Publication number
CN107195308B
Authority
CN
China
Prior art keywords
audio
meeting place
mixing
audio data
sound
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710243769.XA
Other languages
Chinese (zh)
Other versions
CN107195308A (en)
Inventor
肖集华
凡超
顾振华
周金龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Keda Technology Co Ltd
Original Assignee
Suzhou Keda Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Keda Technology Co Ltd filed Critical Suzhou Keda Technology Co Ltd
Priority to CN201710243769.XA priority Critical patent/CN107195308B/en
Publication of CN107195308A publication Critical patent/CN107195308A/en
Application granted granted Critical
Publication of CN107195308B publication Critical patent/CN107195308B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/038 Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008 Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04J MULTIPLEX COMMUNICATION
    • H04J3/00 Time-division multiplex systems
    • H04J3/02 Details
    • H04J3/06 Synchronising arrangements
    • H04J3/0635 Clock or time synchronisation in a network
    • H04J3/0638 Clock or time synchronisation among nodes; Internode synchronisation
    • H04J3/0658 Clock or time synchronisation among packet nodes
    • H04J3/0661 Clock or time synchronisation among packet nodes using timestamps
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/40 Support for services or applications
    • H04L65/403 Arrangements for multi-party communication, e.g. for conferences
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems

Abstract

The invention discloses a sound mixing method, device and system for an audio and video conference system. The method comprises: receiving multiple paths of audio data packets from a first meeting place and audio data packets from a plurality of second meeting places; performing a sound mixing operation on the audio data packets; and sending each path of mixed sound data packet obtained by the sound mixing operation to a third meeting place, where the third meeting place comprises a plurality of loudspeakers that correspond one to one with the microphones of the first meeting place. Before the sound mixing operation, the times of the multiple paths of audio data packets of the first meeting place are synchronized, and/or the times of the paths of mixed sound data packets are synchronized before they are sent to the third meeting place. As a result, sounds collected at the same moment by the microphones of the first meeting place are played in the third meeting place with only a small time difference, so that playback reflects the difference in collection time among the microphones of the first meeting place for the same speech content. A user in the third meeting place can then judge the position of a sound from that difference, that is, locate a sound source by listening.

Description

Audio mixing method, device and system of audio and video conference system
Technical Field
The invention relates to the technical field of audio and video conferences, in particular to a sound mixing method, a device and a system of an audio and video conference system.
Background
Video conferencing systems allow users at two or more sites to converse in real time, and existing video conferencing technologies have been able to more realistically present images of speakers in other sites to participants at the local site, thereby allowing users to appear as if they were in a real conference scene in the video conferencing systems.
In terms of audio, existing video conference technology lets participants in the local meeting place listen to speech from several other meeting places at the same time. Specifically, Chinese patent document CN 102364952 A discloses a method for processing audio and video synchronization when multiple paths of audio and video are played simultaneously, in which one path of audio data packets is collected for each of N users, the N paths are mixed to form N+1 paths of mixed audio data packets, and the mixed packets are then sent to the respective users. For example, the audio data packets of three meeting places A, B and C are collected and mixed into four paths of audio, which are sent to A, B, C and the other meeting places (for example, meeting places without the right to speak). The first mix combines the audio data packets of A, B and C and is sent to the other meeting places, so that their participants hear the speech of A, B and C simultaneously. The second mix combines only B and C and is sent to meeting place A, so that participants in A hear B and C. The third mix combines only A and C and is sent to meeting place B, so that participants in B hear A and C. The fourth mix combines only A and B and is sent to meeting place C, so that participants in C hear A and B.
Because the existing approach collects only one path of audio data packets for each meeting place and simply combines the corresponding packets of the meeting places during mixing, each meeting place can only hear the sound of the other meeting places as a whole. When sound in another meeting place comes from different directions, those directions cannot be distinguished, so the position of a sound source cannot be identified by listening.
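The N+1 mixing scheme of the prior art described above can be sketched as follows. This is a minimal Python illustration, not code from CN 102364952 A: the function names `mix` and `n_plus_one_mixes`, the `"others"` key, and the choice of sample-wise addition with 16-bit clamping are all assumptions made for the example.

```python
from typing import Dict, List


def mix(frames: List[List[int]]) -> List[int]:
    """Add equal-length PCM frames sample by sample and clamp to the 16-bit range."""
    if not frames:
        return []
    return [max(-32768, min(32767, sum(samples))) for samples in zip(*frames)]


def n_plus_one_mixes(site_frames: Dict[str, List[int]]) -> Dict[str, List[int]]:
    """Build one mix per speaking site (without its own audio) plus one full mix.

    The full mix is stored under the hypothetical key "others" and is meant
    for meeting places that only listen.
    """
    mixes = {"others": mix(list(site_frames.values()))}
    for site in site_frames:
        mixes[site] = mix([f for s, f in site_frames.items() if s != site])
    return mixes


# Three speaking sites produce four mixes (N = 3, N + 1 = 4).
frames = {"A": [100, 200], "B": [10, 20], "C": [1, 2]}
mixes = n_plus_one_mixes(frames)
```

Each speaking site receives a mix that leaves out its own audio, which is why N sites require N+1 mixes in total.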
Disclosure of Invention
In view of this, embodiments of the present invention provide a sound mixing method, device and system for an audio/video conference system, so as to solve the problem that listening and location identification cannot be realized in the prior art.
According to a first aspect, an embodiment of the present invention provides a mixing method for an audio/video conference system, including: receiving a plurality of paths of audio data packets of a first meeting place and audio data packets of a plurality of second meeting places; the multi-channel audio data packets of the first meeting place are collected by a plurality of corresponding microphones positioned at different positions in the first meeting place; performing sound mixing operation on the multi-channel audio data packets of the first meeting place and the audio data packets of the plurality of second meeting places respectively; respectively sending each path of mixed sound data packet obtained by the mixed sound operation to a third meeting place; the third meeting place comprises a plurality of loudspeakers and is used for respectively playing each mixed sound; the plurality of loudspeakers correspond to the plurality of microphones one by one; the method further comprises the following steps: before the step of mixing the multiple audio data packets of the first meeting place with the audio data packets of the multiple second meeting places, synchronizing the time of the multiple audio data packets of the first meeting place; and/or before the step of sending each path of mixed sound data packet obtained by the mixed sound operation to the third meeting place, synchronizing the time of each path of mixed sound data packet obtained by the mixed sound operation.
Optionally, each path of audio data packet of the first meeting place carries a first source identifier and a collection timestamp; the first source identifier identifies that the audio data packet originates from the first meeting place; the collection timestamp identifies the time at which each path of audio data packet was collected; the step of synchronizing the time of the multiple paths of audio data packets of the first meeting place comprises: acquiring the collection timestamp of each path of audio data packet carrying the first source identifier; judging whether, among the audio data packets carrying the first source identifier, the number of packets having the same predetermined collection timestamp reaches a predetermined number, the predetermined number being the number of microphones in the first meeting place; when the predetermined number is reached, the time synchronization of the multiple paths of audio data packets of the first meeting place for the moment corresponding to the predetermined collection timestamp is complete; correspondingly, in the step of mixing the multiple paths of audio data packets of the first meeting place with the audio data packets of the plurality of second meeting places, the multiple paths of audio data packets of the first meeting place are the predetermined number of paths of audio data packets that carry the first source identifier and have the same collection timestamp.
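The timestamp-based synchronization described above can be sketched in Python as a buffer that holds first-meeting-place packets until one packet per microphone has arrived for a given collection timestamp. The class and field names (`AudioPacket`, `CaptureSynchronizer`, `channel`) are hypothetical, and keying the buffer by a per-microphone channel index is an assumption added here so that a duplicate packet is not counted twice; the patent itself only speaks of counting packets with the same timestamp.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class AudioPacket:
    source_id: str   # source identifier of the originating meeting place
    channel: int     # microphone index within that meeting place (assumed field)
    capture_ts: int  # collection timestamp
    payload: bytes = b""


class CaptureSynchronizer:
    """Release first-meeting-place packets only as complete same-timestamp groups."""

    def __init__(self, first_site_id: str, mic_count: int) -> None:
        self.first_site_id = first_site_id
        self.mic_count = mic_count  # the predetermined number
        self._pending: Dict[int, Dict[int, AudioPacket]] = defaultdict(dict)

    def push(self, packet: AudioPacket) -> Optional[List[AudioPacket]]:
        """Buffer a packet; return the full group once every microphone has reported."""
        if packet.source_id != self.first_site_id:
            return None  # packets from second meeting places are not grouped here
        group = self._pending[packet.capture_ts]
        group[packet.channel] = packet
        if len(group) < self.mic_count:
            return None
        del self._pending[packet.capture_ts]
        return [group[ch] for ch in sorted(group)]


sync = CaptureSynchronizer(first_site_id="A", mic_count=3)
results = [
    sync.push(AudioPacket("A", 1, 1000)),
    sync.push(AudioPacket("B", 1, 1000)),  # second meeting place, ignored by the grouping
    sync.push(AudioPacket("A", 2, 1000)),
    sync.push(AudioPacket("A", 3, 1000)),  # third microphone completes the group
]
```

A real server would also need a timeout for groups that never complete because a packet is lost; that policy is outside what the text specifies and is left out of the sketch.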
Optionally, each path of mixed sound data packet obtained by the mixed sound operation carries a mixed sound time stamp, and the mixed sound time stamp is consistent with an acquisition time stamp carried by one path of audio data packet of the first meeting place corresponding to each path of mixed sound data packet; the step of synchronizing the time of each mixed data packet obtained by the mixing operation comprises: acquiring a sound mixing timestamp carried by each sound mixing data packet; judging whether the number of the sound mixing paths with the same sound mixing time stamps in the obtained sound mixing data packets reaches a preset number or not; the preset number is the number of the microphones in the first meeting place; when the preset number is reached, executing the step of respectively sending each path of mixed sound data packet obtained by the mixed sound operation to a third meeting place; correspondingly, in the step of sending each path of mixed sound data packet obtained by the mixed sound operation to a third meeting place, each path of mixed sound data packet obtained by the mixed sound operation is the mixed sound data packet with the same mixed sound time stamp and the predetermined number of paths.
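The send-side check described above can be sketched in the same way: mixed packets are held back until the number of paths sharing one mixing timestamp equals the number of microphones in the first meeting place, and only then is the whole group sent. `MixPacket`, `MixSendGate` and the `send` callback are hypothetical names, and the per-path `channel` field is an assumption of this example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class MixPacket:
    channel: int  # mix path, matching one microphone/loudspeaker pair (assumed field)
    mix_ts: int   # mixing timestamp, equal to the first-site collection timestamp
    payload: bytes = b""


class MixSendGate:
    """Send mixed packets only as complete groups sharing one mixing timestamp."""

    def __init__(self, mic_count: int, send: Callable[[List[MixPacket]], None]) -> None:
        self.mic_count = mic_count  # the predetermined number
        self._send = send
        self._pending: Dict[int, Dict[int, MixPacket]] = defaultdict(dict)

    def offer(self, packet: MixPacket) -> bool:
        """Buffer a mixed packet; return True if this packet completed and sent a group."""
        group = self._pending[packet.mix_ts]
        group[packet.channel] = packet
        if len(group) < self.mic_count:
            return False
        del self._pending[packet.mix_ts]
        self._send([group[ch] for ch in sorted(group)])
        return True


sent: List[List[MixPacket]] = []
gate = MixSendGate(mic_count=3, send=sent.append)
outcomes = [
    gate.offer(MixPacket(1, 1000)),
    gate.offer(MixPacket(2, 1000)),
    gate.offer(MixPacket(1, 1020)),  # a later timestamp waits in its own group
    gate.offer(MixPacket(3, 1000)),  # completes the group for timestamp 1000
]
```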
Optionally, the number of the audio data packets in the second conference room is the same as the number of the audio data packets in the first conference room; the step of mixing the multiple audio data packets of the first conference room with the audio data packets of the multiple second conference rooms respectively comprises: and performing sound mixing operation on each path of audio data packet in the second meeting place and one path of audio data packet in the first meeting place respectively.
According to a second aspect, an embodiment of the present invention provides an audio mixing apparatus for an audio and video conference system, including: the receiving unit is used for receiving the multi-channel audio data packets of the first meeting place and the audio data packets of the second meeting places; the multi-channel audio data packets of the first meeting place are collected by a plurality of corresponding microphones positioned at different positions in the first meeting place; the first audio mixing unit is used for mixing the multi-channel audio data packets of the first meeting place with the audio data packets of the plurality of second meeting places respectively; a sending unit, configured to send each audio mixing data packet obtained by the audio mixing operation to a third meeting place respectively; the third meeting place comprises a plurality of loudspeakers and is used for respectively playing each mixed sound; the plurality of loudspeakers correspond to the plurality of microphones one by one; the device further comprises: a first synchronization unit, configured to synchronize time of the multiple audio packets of the first conference room before the step of mixing the multiple audio packets of the first conference room with the audio packets of the multiple second conference rooms, respectively; and/or the second synchronization unit is used for synchronizing the time of each path of mixed data packet obtained by the mixed operation before the step of respectively sending each path of mixed data packet obtained by the mixed operation to the third meeting place.
Optionally, each path of audio data packet of the first meeting place carries a first source identifier and a collection timestamp; the first source identifier identifies that the audio data packet originates from the first meeting place; the collection timestamp identifies the time at which each path of audio data packet was collected; the first synchronization unit includes: a first obtaining subunit, configured to obtain the collection timestamp of each path of audio data packet carrying the first source identifier; and a first judging subunit, configured to judge whether, among the audio data packets carrying the first source identifier, the number of packets having the same predetermined collection timestamp reaches a predetermined number, the predetermined number being the number of microphones in the first meeting place; when the predetermined number is reached, the time synchronization of the multiple paths of audio data packets of the first meeting place for the moment corresponding to the predetermined collection timestamp is complete; correspondingly, in the step of mixing the multiple paths of audio data packets of the first meeting place with the audio data packets of the plurality of second meeting places, the multiple paths of audio data packets of the first meeting place are the predetermined number of paths of audio data packets that carry the first source identifier and have the same collection timestamp.
Optionally, each path of mixed sound data packet obtained by the mixed sound operation carries a mixed sound time stamp, and the mixed sound time stamp is consistent with an acquisition time stamp carried by one path of audio data packet of the first meeting place corresponding to each path of mixed sound data packet; the second synchronization unit includes: the second obtaining subunit is configured to obtain the audio mixing timestamps carried by the audio mixing data packets of each channel; the second judging subunit is configured to judge whether the number of mixing paths with the same mixing timestamp in each acquired mixing data packet reaches a predetermined number; the preset number is the number of the microphones in the first meeting place; when the preset number is reached, executing the step of respectively sending each path of mixed sound data packet obtained by the mixed sound operation to a third meeting place; correspondingly, in the step of sending each path of mixed sound data packet obtained by the mixed sound operation to a third meeting place, each path of mixed sound data packet obtained by the mixed sound operation is the mixed sound data packet with the same mixed sound time stamp and the predetermined number of paths.
Optionally, the number of the audio data packets in the second conference room is the same as the number of the audio data packets in the first conference room; the device further comprises: and the second sound mixing unit is used for mixing each path of audio data packet in the second meeting place with one path of audio data packet in the first meeting place.
According to a third aspect, an embodiment of the present invention provides an audio and video conference system, including: a plurality of microphones in a first venue; a microphone of the second venue; the number of the loudspeakers in the third meeting place is the same as that of the microphones in the first meeting place, and the loudspeakers in the third meeting place correspond to the microphones in the first meeting place one by one; and the mixing server is configured to execute the mixing method of the audio/video conference system according to the first aspect and any one of the optional manners of the first aspect.
Optionally, the mixing server includes a plurality of mixers, and the number of the mixers is not less than the number of the microphones in the first meeting place.
In the sound mixing method, device and system of the audio and video conference system provided by the embodiments of the invention, multiple paths of audio data packets are collected by a plurality of corresponding microphones located at different positions in the first meeting place; the mixing server receives the multiple paths of audio data packets from the first meeting place and the audio data packets from the plurality of second meeting places, mixes the multiple paths of audio data packets of the first meeting place with the audio data packets of the plurality of second meeting places respectively, and sends each path of mixed sound data packet obtained by the mixing operation to the third meeting place, where the mixed sound data packets are played through loudspeakers that correspond one to one with the microphones of the first meeting place. The times of the multiple paths of audio data packets of the first meeting place are synchronized before the mixing operation, and/or the times of the paths of mixed sound data packets are synchronized before they are sent to the third meeting place. The multiple paths of sound collected at the same moment in the first meeting place are therefore played in the third meeting place with only a small time difference, so that playback reflects the difference in collection time among the microphones of the first meeting place for the same speech content. A user in the third meeting place can judge the position of a sound from that difference and thus locate sound sources by listening, as if the user were present in the first meeting place.
Drawings
The features and advantages of the present invention will be more clearly understood by reference to the accompanying drawings, which are illustrative and not to be construed as limiting the invention in any way, and in which:
Fig. 1 shows a schematic diagram of an application scenario of an embodiment of the present invention;
Fig. 2 is a flowchart illustrating a mixing method of an audio-video conference system according to an embodiment of the present invention;
Fig. 3 is a flowchart illustrating a mixing method of another audio-video conference system according to an embodiment of the present invention;
Fig. 4 is a flowchart illustrating a mixing method of an audio-video conference system according to another embodiment of the present invention;
Fig. 5 is a flowchart illustrating a mixing method of an audio-video conference system according to another embodiment of the present invention;
Fig. 6 is a flowchart illustrating a mixing method of an audio-video conference system according to another embodiment of the present invention;
Fig. 7 is a schematic diagram illustrating a mixing apparatus of an audio-video conference system according to an embodiment of the present invention;
Fig. 8 is a schematic diagram of an audio mixing apparatus of another audio-video conference system according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 shows a schematic view of an application scenario of an embodiment of the present invention. Fig. 1 includes audio collection meeting places (meeting places A, B, C and D in Fig. 1), an audio playing meeting place (meeting place E in Fig. 1), and a mixing server. A first meeting place among the audio collection meeting places (e.g., meeting place A in Fig. 1) collects sound with a plurality of microphones. The audio data packets collected in the audio collection meeting places are transmitted to the mixing server for mixing, and the mixed sound data packets obtained are transmitted to the audio playing meeting place for playback. The audio playing meeting place plays each path of mixed sound data packet through a plurality of loudspeakers. The number of loudspeakers in the audio playing meeting place is not less than the number of microphones in the first meeting place. The mixing server comprises a network receiving module, a network sending module and a plurality of mixers, and the number of mixers is not less than the number of microphones in the first meeting place.
Because the person speaking in meeting place A is at a different distance from each microphone, the microphones collect the same speech content at slightly different times. If this first time difference (i.e., the difference between the collection times at which the microphones of the first meeting place collect the same sound) is reflected when each loudspeaker in meeting place E plays the sound collected by its corresponding microphone, a user in meeting place E can judge the position of the sound from the first time difference among the multiple paths of sound from meeting place A, as if the user were in meeting place A. For the loudspeakers in meeting place E to reflect the first time difference, the sound collected at the same moment on the multiple paths of meeting place A must be played at the same moment in meeting place E (or with only a small playing time difference).
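To give a sense of scale for the first time difference, the short Python sketch below converts speaker-to-microphone distances into relative collection delays. The distances are invented for illustration and the speed of sound is taken as roughly 343 m/s (air at about 20 °C); neither figure comes from the patent, and `capture_delays_ms` is a hypothetical helper.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # approximate value for air at about 20 °C (assumption)


def capture_delays_ms(distances_m):
    """Return each microphone's collection delay relative to the nearest microphone, in ms."""
    nearest = min(distances_m)
    return [(d - nearest) / SPEED_OF_SOUND_M_PER_S * 1000.0 for d in distances_m]


# Hypothetical distances from the speaker to three microphones in meeting place A.
delays = capture_delays_ms([1.0, 2.0, 4.43])
```

With these assumed distances the delays are on the order of a few milliseconds, which is small compared with typical network jitter. That is why the playing time difference introduced by transmission has to be kept small for the first time difference to remain audible.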
It should be added that the microphone in the present application may also be other audio collecting devices, and the speaker may also be other audio playing devices.
Example one
Based on the foregoing principle, an embodiment of the present invention provides a sound mixing method for an audio/video conference system, and fig. 2 shows a flowchart of the sound mixing method for the audio/video conference system according to the embodiment of the present invention. The mixing method of the audio and video conference system is suitable for the mixing server shown in figure 1. According to fig. 2, the method comprises:
S101: Receive multiple paths of audio data packets of the first meeting place and audio data packets of a plurality of second meeting places. The multiple paths of audio data packets of the first meeting place are collected by a plurality of corresponding microphones located at different positions in the first meeting place.
As shown in Fig. 1, meeting place A is the first meeting place, and its 3 paths of audio data packets are collected by 3 corresponding microphones located at different positions in meeting place A. Meeting places B, C and D are the second meeting places. The mixing server receives, through the network receiving module, the three paths of audio data packets A1, A2 and A3 of meeting place A, the audio data packet B1 of meeting place B, the audio data packet C1 of meeting place C and the audio data packet D1 of meeting place D.
S102: Synchronize the time of the multiple paths of audio data packets of the first meeting place.
Because of network congestion or delay, the 3 paths of audio data packets collected at the same moment by the 3 microphones of the first meeting place may not arrive at the mixing server at the same time. For example, a speaker in meeting place A says "haha", and the 3 paths of audio data packets A1, A2 and A3 collected by the 3 microphones arrive at the mixing server at times t1, t1 + Δt and t1 + 2Δt respectively. A second time difference (i.e., a time difference caused by transmission delay between data packets with the same collection time) then exists between the mixed sound data packets obtained after mixing, for example a second time difference of Δt. Transmission from the mixing server to meeting place E adds further delay, which can increase the second time difference between the mixed sound data packets, so that "haha" is likely to be heard several times in meeting place E.
Step S102 synchronizes the time of 3 audio packets collected by 3 microphones in the first meeting place at the same time, that is, the audio packets collected at the same time in the first meeting place enter the audio mixer at the same time for audio mixing, so as to reduce a second time difference between audio contents (for example, collected "haha") in the first meeting place after audio mixing, and prevent sounds (for example, "haha") collected by a plurality of microphones in the first meeting place at the same time from being played as a plurality of same sounds (for example, "haha") with a large time difference in the audio playing meeting place.
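The effect of step S102 on the arrival-time example above can be illustrated with a few lines of Python. The values of t1 and Δt are arbitrary, and the rule used here (hold every packet until the slowest one has arrived) is one simple way to realize "entering the mixers at the same time", assumed for the sketch; `entry_times` and `skew` are hypothetical helpers.

```python
def entry_times(arrivals, synchronized):
    """Return the time at which each packet enters its mixer."""
    if synchronized:
        release = max(arrivals)  # hold all packets until the slowest one has arrived
        return [release] * len(arrivals)
    return list(arrivals)


def skew(times):
    """Largest time difference between any two packets of the group."""
    return max(times) - min(times)


t1, dt = 0, 20  # milliseconds, illustrative values only
arrivals = [t1, t1 + dt, t1 + 2 * dt]  # A1, A2 and A3 reach the mixing server

unsynced_skew = skew(entry_times(arrivals, synchronized=False))
synced_skew = skew(entry_times(arrivals, synchronized=True))
```

Without synchronization the transmission skew carries straight into the mixers; with it, the group enters together at the cost of delaying the earlier packets until the last one arrives.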
S103: Mix the multiple paths of audio data packets of the first meeting place with the audio data packets of the plurality of second meeting places respectively.
As shown in fig. 1, after step S102, the audio data packets of the first venue a are AA1, AA2, and AA3, and these three audio data packets are mixed with the audio data packets B1, C1, and D1 of the B, C, D venue respectively, for example, AA1 and B1 are mixed to obtain mixed data packet H1, AA2 and C1 are mixed to obtain mixed data packet H2, and AA3 and D1 are mixed to obtain mixed data packet H3.
It should be added that each audio data packet of the first meeting place is mixed by using one mixer, for example, the audio data packet AA1 is mixed by using the mixer 1, the audio data packet AA2 is mixed by using the mixer 2, and the audio data packet AA3 is mixed by using the mixer 3. The audio data packets of the second meeting place may be mixed by any one of the mixers, for example, the audio data packets B1, C1 and D1 may be mixed by the mixer 1 shown in fig. 1; alternatively, if there is another second meeting place E, the audio data packet E1 can be mixed by any one of the mixers shown in fig. 1.
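The pairing used in step S103 (AA1 with B1, AA2 with C1, AA3 with D1) can be sketched as follows. The patent does not specify how samples are combined, so sample-wise addition of 16-bit PCM with clamping is an assumption of this example, as are the sample values and the helper name `mix_pcm16`.

```python
def mix_pcm16(a, b):
    """Mix two equal-length 16-bit PCM frames by addition with clamping (assumed method)."""
    if len(a) != len(b):
        raise ValueError("frames must have the same number of samples")
    return [max(-32768, min(32767, x + y)) for x, y in zip(a, b)]


# Synchronized packets of the first meeting place, one per microphone (made-up samples).
first_site = {"AA1": [30000, -30000], "AA2": [100, 200], "AA3": [-5, 5]}
# One packet from each second meeting place (made-up samples).
second_sites = {"B1": [5000, -5000], "C1": [1, 2], "D1": [5, -5]}

# Each first-site path goes through its own mixer, paired with one second-site path.
pairs = [("AA1", "B1"), ("AA2", "C1"), ("AA3", "D1")]
mixes = {
    f"H{i}": mix_pcm16(first_site[a], second_sites[b])
    for i, (a, b) in enumerate(pairs, start=1)
}
```

Keeping one mixer per first-site path means the three outputs H1, H2 and H3 stay separate, so each can be routed to its own loudspeaker in the third meeting place.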
S104: Send each path of mixed sound data packet obtained by the mixing operation to the third meeting place. The third meeting place comprises a plurality of loudspeakers for respectively playing each path of mixed sound. The loudspeakers in the third meeting place correspond one to one with the microphones in the first meeting place.
For example, the 3-way mixed data packets H1, H2, and H3 shown in fig. 1 are directly transmitted to the third meeting place through the network transmission module, and the third meeting place plays the 3-way mixed data through 3 speakers.
The one-to-one correspondence referred to herein is that the audio collected by one microphone is played by the corresponding speaker; or further, in order to enable the sound of the first meeting place to be more truly presented in the third meeting place, the placing position of the loudspeaker in the third meeting place also corresponds to the placing position of the microphone in the first meeting place.
The audio mixing method of the audio and video conference system acquires multi-channel audio data packets through a plurality of corresponding microphones positioned at different positions in a first meeting place, the audio mixing server receives the multi-channel audio data packets from the first meeting place and the audio data packets from a plurality of second meeting places, then the time of the multi-channel audio data packets of the first meeting place is synchronized, then the multi-channel audio data packets of the first meeting place are respectively mixed with the audio data packets of the plurality of second meeting places, finally, the multi-channel audio data packets obtained through the audio mixing operation are respectively sent to a third meeting place, the multi-channel audio data packets are played in the third meeting place through loudspeakers corresponding to the microphones of the first meeting place one by one, so that the playing time difference of multi-channel sound acquired at the same time in the first meeting place in the third meeting place is smaller, and further, the acquisition time difference of the microphones in the first meeting place for the same speaking content can be reflected, therefore, the user in the third meeting place can judge the position of the sound according to the acquisition time difference, and the effect of recognizing the position by listening to the sound is achieved, as if the user is positioned in the meeting place A.
Example two
Fig. 3 is a flowchart illustrating a mixing method of an audio-video conference system according to another embodiment of the present invention. The mixing method of the audio and video conference system is suitable for the mixing server shown in figure 1. According to fig. 3, the method comprises:
S201: Receive multiple audio data packets of the first meeting place and audio data packets of a plurality of second meeting places. The multiple audio data packets of the first meeting place are collected by a plurality of corresponding microphones located at different positions within the first meeting place. This step is similar to step S101 in the first embodiment and is not described again here.
S202: Mix the multiple audio data packets of the first meeting place respectively with the audio data packets of the plurality of second meeting places.
As shown in fig. 1, the audio data packets of the first meeting place A are A1, A2 and A3, and these three audio data packets are mixed directly with the audio data packets B1, C1 and D1 of meeting places B, C and D. For example, A1 and B1 are mixed to obtain the mixed data packet H1, A2 and C1 are mixed to obtain the mixed data packet H2, and A3 and D1 are mixed to obtain the mixed data packet H3.
It should be added that each audio data packet path of the first meeting place is mixed by its own mixer: for example, audio data packet A1 is mixed by mixer 1, A2 by mixer 2, and A3 by mixer 3. The audio data packets of the second meeting places may be mixed by any one of the mixers; for example, the audio data packets B1, C1 and D1 may all be mixed by mixer 1 shown in fig. 1. Likewise, if there is a further second meeting place E, its audio data packet E1 may be mixed by any one of the mixers shown in fig. 1.
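The allocation rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the function names `mix_samples` and `allocate_mixers`, the 16-bit PCM sample format, additive mixing with clipping, and the round-robin assignment of second-meeting-place packets are all hypothetical choices made for the example.

```python
from typing import Dict, List

INT16_MIN, INT16_MAX = -32768, 32767


def mix_samples(*streams: List[int]) -> List[int]:
    """Add equal-length 16-bit PCM sample lists and clip to the int16 range."""
    return [max(INT16_MIN, min(INT16_MAX, sum(frame))) for frame in zip(*streams)]


def allocate_mixers(first: Dict[str, List[int]],
                    second: Dict[str, List[int]]) -> Dict[str, List[str]]:
    """Give each first-meeting-place path its own mixer, then spread the
    second-meeting-place packets over those mixers round-robin (assumed policy)."""
    mixers = {f"mixer{i + 1}": [name] for i, name in enumerate(first)}
    mixer_names = list(mixers)
    for i, name in enumerate(second):
        mixers[mixer_names[i % len(mixer_names)]].append(name)
    return mixers


# Made-up sample values for the packets named in fig. 1.
first_place = {"A1": [100, 200], "A2": [300, 400], "A3": [500, 600]}
second_places = {"B1": [1, 2], "C1": [3, 4], "D1": [5, 6]}

plan = allocate_mixers(first_place, second_places)
packets = {**first_place, **second_places}
mixed = {m: mix_samples(*(packets[n] for n in names)) for m, names in plan.items()}
```

With these inputs the plan pairs A1 with B1, A2 with C1 and A3 with D1, matching the H1, H2 and H3 example. Any other assignment of the second-meeting-place packets would be equally consistent with the text, since it allows them to be mixed by any mixer.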
S203: Synchronize the time of each mixed data packet obtained by the mixing operation.
For example, because of network congestion or delay, the 3 audio data packets collected at the same moment by the 3 microphones in the first meeting place do not arrive at the mixing server at the same time. Suppose a speaker says "haha" in meeting place A, and the 3 audio data packets A1, A2 and A3 collected by the 3 microphones arrive at the mixing server at times t1, t1 + Δt and t1 + 2Δt respectively. A second time difference (that is, a time difference caused by transmission delay between data packets with the same collection time) then exists between the mixed data packets obtained after mixing; for example, the second time difference is Δt. In addition, transmission from the mixing server to meeting place E introduces further delay, which increases the second time difference between the mixed data packets, so that "haha" is liable to be played several times in succession in meeting place E.
Step S203 synchronizes the time of each mixed data packet obtained by the mixing operation; that is, the mixed data packets are time-synchronized before they are sent to the third meeting place, and in particular the audio content of the first meeting place within the mixed data packets is time-synchronized. This reduces the second time difference between the mixed copies of each piece of first-meeting-place audio content (for example, "haha"), and prevents a sound collected at the same moment by the multiple microphones in the first meeting place from being played in the playback meeting place as several identical sounds separated by a large time difference.
S204: Send each mixed data packet to the third meeting place. The third meeting place comprises a plurality of loudspeakers, each used to play one of the mixed paths. The plurality of loudspeakers in the third meeting place correspond one to one to the plurality of microphones in the first meeting place. This step is similar to step S104 in the first embodiment and is not described again here.
In this mixing method, multiple audio data packets are collected by a plurality of corresponding microphones located at different positions in the first meeting place. The mixing server receives the multiple audio data packets from the first meeting place and the audio data packets from the plurality of second meeting places, mixes the multiple audio data packets of the first meeting place respectively with the audio data packets of the plurality of second meeting places, and then synchronizes the time of the mixed data packets obtained by the mixing operation. Finally, each mixed data packet is sent to the third meeting place, where it is played by the loudspeakers that correspond one to one to the microphones of the first meeting place. As a result, sounds collected at the same moment in the first meeting place are played in the third meeting place with a smaller time difference, so that playback reflects the differences in the times at which the microphones in the first meeting place collected the same speech content. A user in the third meeting place can therefore judge the position of the sound from these collection time differences and locate the speaker by ear, as if present in meeting place A.
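The effect of step S203 can be illustrated numerically. The sketch below uses hypothetical values (t1 = 0 ms, Δt = 40 ms) and an assumed hold-until-complete release policy; neither the numbers nor the names `spread`, `unsynced_release` and `synced_release` come from the patent text.

```python
# Hypothetical times (ms) at which three mixed packets become available at
# the mixing server, although their first-meeting-place content was
# captured at the same moment.
t1, delta_t = 0, 40  # assumed values for illustration only
arrivals = {"H1": t1, "H2": t1 + delta_t, "H3": t1 + 2 * delta_t}


def spread(times):
    """Largest gap between any two times in the collection."""
    return max(times) - min(times)


# Without synchronization each packet is forwarded as soon as it is ready.
unsynced_release = dict(arrivals)

# With synchronization the server holds packets until all paths are ready,
# then forwards them together.
release_time = max(arrivals.values())
synced_release = {name: release_time for name in arrivals}

print(spread(unsynced_release.values()))  # 80
print(spread(synced_release.values()))    # 0
```

The cost of removing the skew is added latency: every path waits for the slowest one, which is why a bounded waiting period is a natural companion to this kind of synchronization.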
Example three
Fig. 4 is a flowchart illustrating a mixing method of an audio-video conference system according to an embodiment of the present invention. The mixing method of the audio and video conference system is suitable for the mixing server shown in figure 1. According to fig. 4, the method comprises:
S301: Receive multiple audio data packets of the first meeting place and audio data packets of a plurality of second meeting places. The multiple audio data packets of the first meeting place are collected by a plurality of corresponding microphones located at different positions within the first meeting place. This step is similar to step S101 in the first embodiment and is not described again here.
S302: Acquire the collection timestamp of each audio data packet carrying the first source identifier.
Each audio data packet of the first meeting place carries a first source identifier and a collection timestamp. The first source identifier identifies the audio data packet as originating from the first meeting place. The collection timestamp identifies the time at which the audio data packet was collected.
S303: Judge whether the number of audio data packets that carry the first source identifier and have the same predetermined collection timestamp reaches a predetermined number. The predetermined number is the number of microphones in the first meeting place.
S304: When the predetermined number is reached, the time synchronization of the multiple audio data packets of the first meeting place for the moment corresponding to the predetermined collection timestamp is complete.
In steps S302, S303 and S304 above, after receiving each audio data packet the mixing server checks whether the packet carries the first source identifier; if it does (that is, the packet is an audio data packet of the first meeting place), the packet must be time-synchronized.
Specifically, after the audio data packets carrying the first source identifier are acquired, the time synchronization operation is complete once the number of audio data packets with the same collection timestamp reaches the predetermined number. For example, if the first meeting place has 3 microphones, then for the collection timestamp t2 the time synchronization step is complete when the number of audio data packets (carrying the first source identifier) with collection timestamp t2 reaches 3. Optionally, if after waiting for a predetermined period the audio data packets with a given timestamp still have not reached the predetermined number, the audio data packets with that timestamp are discarded.
The above steps S302, S303 and S304 implement step S102 "synchronize the time of the multiple audio packets at the first meeting place" in the first embodiment.
It should be noted that, in this embodiment, the time of collecting the sound by the microphone is used to perform time synchronization on each audio data packet of the first meeting place, and as a variation of this embodiment, a timestamp added to the audio data packet at any time before the data collected by the microphone is sent to the mixing server may also be used.
S305: Mix the multiple audio data packets of the first meeting place respectively with the audio data packets of the plurality of second meeting places. Here the multiple audio data packets of the first meeting place are the predetermined number of audio data packet paths that carry the first source identifier and share the same collection timestamp. This step is similar to step S103 in the first embodiment and is not described again here.
S306: Send each mixed data packet obtained by the mixing operation to the third meeting place. The third meeting place comprises a plurality of loudspeakers, each used to play one of the mixed paths. The plurality of loudspeakers in the third meeting place correspond one to one to the plurality of microphones in the first meeting place. This step is similar to step S104 in the first embodiment and is not described again here.
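Steps S302 to S304 of this embodiment can be sketched as a small grouping buffer. The sketch is illustrative only: the `AudioPacket` fields, the `CaptureSynchronizer` class and its `push` method are hypothetical names, and the packet layout is an assumption, since the text only requires that each first-meeting-place packet carry a source identifier and a collection timestamp.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class AudioPacket:
    source_id: str      # identifies the meeting place the packet came from
    channel: int        # which microphone path produced the packet
    capture_ts: int     # collection timestamp
    payload: bytes = b""


class CaptureSynchronizer:
    """Holds first-meeting-place packets until every microphone path has
    delivered a packet with the same collection timestamp (S302 to S304)."""

    def __init__(self, first_source_id: str, microphone_count: int) -> None:
        self.first_source_id = first_source_id
        self.microphone_count = microphone_count  # the predetermined number
        self._pending: Dict[int, List[AudioPacket]] = defaultdict(list)

    def push(self, packet: AudioPacket) -> Optional[List[AudioPacket]]:
        """Return the complete group once it is ready, otherwise None."""
        if packet.source_id != self.first_source_id:
            return None  # only packets with the first source identifier are synchronized
        group = self._pending[packet.capture_ts]
        group.append(packet)
        if len(group) < self.microphone_count:
            return None
        del self._pending[packet.capture_ts]
        return sorted(group, key=lambda p: p.channel)


sync = CaptureSynchronizer(first_source_id="A", microphone_count=3)
results = [
    sync.push(AudioPacket("A", 1, capture_ts=2)),
    sync.push(AudioPacket("B", 1, capture_ts=2)),   # second meeting place: not grouped
    sync.push(AudioPacket("A", 3, capture_ts=2)),
    sync.push(AudioPacket("A", 2, capture_ts=2)),
]
```

Only the fourth call returns a group, because that is when the third first-meeting-place packet with timestamp 2 arrives. The sketch counts packets rather than distinct channels and omits the optional discard after a waiting period, both of which a fuller version would need to handle.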
Example four
Fig. 5 is a flowchart illustrating a mixing method of an audio-video conference system according to another embodiment of the present invention. The mixing method of the audio and video conference system is suitable for the mixing server shown in figure 1. According to fig. 5, the method comprises:
S401: Receive multiple audio data packets of the first meeting place and audio data packets of a plurality of second meeting places. The multiple audio data packets of the first meeting place are collected by a plurality of corresponding microphones located at different positions within the first meeting place. This step is similar to step S201 in the second embodiment and is not described again here.
S402: Mix the multiple audio data packets of the first meeting place respectively with the audio data packets of the plurality of second meeting places. This step is similar to step S202 in the second embodiment and is not described again here.
S403: Acquire the mixing timestamp carried by each mixed data packet.
Each mixed data packet obtained by the mixing operation carries a mixing timestamp, which is identical to the collection timestamp carried by the first-meeting-place audio data packet from which that mixed data packet was produced. As shown in fig. 1, if the audio data packet A1 of the first meeting place is mixed directly with the audio data packet B1 of the second meeting place B to obtain the mixed data packet H1, the mixing timestamp carried by H1 is identical to the collection timestamp carried by A1.
S404: Judge whether the number of mixed data packet paths with the same mixing timestamp among the acquired mixed data packets reaches a predetermined number. The predetermined number is the number of microphones in the first meeting place.
S405: When the predetermined number is reached, send each mixed data packet to the third meeting place. The third meeting place comprises a plurality of loudspeakers, each used to play one of the mixed paths. The plurality of loudspeakers in the third meeting place correspond one to one to the plurality of microphones in the first meeting place. The sending of each mixed data packet to the third meeting place is similar to step S204 in the second embodiment and is not described again here.
The mixed data packets sent in this step are the predetermined number of mixed data packet paths that share the same mixing timestamp.
In steps S403, S404 and S405 above, before sending the mixed data packets obtained by the mixing operation to the third meeting place, the mixing server first acquires the mixing timestamp carried by each mixed data packet, and the time synchronization operation is complete once the number of mixed data packets with the same mixing timestamp reaches the predetermined number. For example, if the first meeting place has 3 microphones, then for the mixing timestamp t3 the time synchronization step is complete when the number of mixed data packets with mixing timestamp t3 reaches 3. Optionally, if after waiting for a predetermined period the mixed data packets with a given timestamp still have not reached the predetermined number, the mixed data packets with that timestamp are discarded.
The above steps S403, S404 and S405 implement step S203 "synchronize the time of each mixed data packet obtained by mixing operation" in the second embodiment.
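Steps S403 to S405, together with the optional discard after a waiting period, can be sketched as a gate placed in front of the sending step. The names `mix`, `SendGate`, `offer` and `max_wait`, the tuple packet representation, and the use of an explicit `now` argument instead of a real clock are all assumptions made to keep the example small and deterministic.

```python
from typing import Dict, List, Tuple

MixedPacket = Tuple[str, int]  # (name, mixing timestamp)


def mix(first_packet: Tuple[str, int], other_names: List[str], out_name: str) -> MixedPacket:
    """Stand-in for the mixing operation: the mixed packet inherits the
    collection timestamp of its first-meeting-place packet as its mixing timestamp."""
    _, capture_ts = first_packet
    return (out_name, capture_ts)


class SendGate:
    """Releases mixed packets to the third meeting place only as complete groups."""

    def __init__(self, predetermined_number: int, max_wait: float) -> None:
        self.predetermined_number = predetermined_number
        self.max_wait = max_wait  # assumed waiting period, same unit as `now`
        self._groups: Dict[int, List[str]] = {}
        self._first_seen: Dict[int, float] = {}

    def offer(self, packet: MixedPacket, now: float) -> List[str]:
        """Add one mixed packet; return the group to send, or [] if not ready."""
        self._drop_stale(now)
        name, mix_ts = packet
        self._first_seen.setdefault(mix_ts, now)
        group = self._groups.setdefault(mix_ts, [])
        group.append(name)
        if len(group) < self.predetermined_number:
            return []
        del self._groups[mix_ts], self._first_seen[mix_ts]
        return group

    def _drop_stale(self, now: float) -> None:
        """Discard incomplete groups that have waited longer than max_wait."""
        for ts in [t for t, seen in self._first_seen.items() if now - seen > self.max_wait]:
            del self._groups[ts], self._first_seen[ts]


gate = SendGate(predetermined_number=3, max_wait=100.0)
h1 = mix(("A1", 3), ["B1"], "H1")
h2 = mix(("A2", 3), ["C1"], "H2")
h3 = mix(("A3", 3), ["D1"], "H3")
sent = [gate.offer(h1, now=0.0), gate.offer(h2, now=10.0), gate.offer(h3, now=20.0)]

late = SendGate(predetermined_number=3, max_wait=100.0)
late.offer(h1, now=0.0)
late_result = late.offer(h2, now=500.0)  # H1 was discarded after the wait expired
```

In the first run the group H1, H2, H3 is released on the third call. In the second run H1 is dropped before H2 arrives, so nothing is sent. A production version would also need to reject stragglers whose group has already been discarded, which this sketch does not do.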
Example five
Fig. 6 is a flowchart illustrating a mixing method of an audio-video conference system according to another embodiment of the present invention. The mixing method of the audio and video conference system is suitable for the mixing server shown in figure 1. According to fig. 6, the method comprises:
S501: Receive multiple audio data packets of the first meeting place and audio data packets of a plurality of second meeting places. The multiple audio data packets of the first meeting place are collected by a plurality of corresponding microphones located at different positions within the first meeting place. This step is similar to step S101 in the first embodiment and is not described again here.
S502: Synchronize the time of the multiple audio data packets of the first meeting place. This step is similar to step S102 in the first embodiment and is not described again here.
It should be noted that step S502 may be replaced by steps S302, S303 and S304 of the third embodiment.
S503: Mix the multiple audio data packets of the first meeting place respectively with the audio data packets of the plurality of second meeting places. This step is similar to step S103 in the first embodiment and is not described again here.
S504: Synchronize the time of each mixed data packet obtained by the mixing operation. This step is similar to step S203 in the second embodiment and is not described again here.
It should be noted that step S504 may be replaced by steps S403, S404 and S405 of the fourth embodiment.
S505: Send each mixed data packet to the third meeting place. The third meeting place comprises a plurality of loudspeakers, each used to play one of the mixed paths. The plurality of loudspeakers in the third meeting place correspond one to one to the plurality of microphones in the first meeting place. This step is similar to step S104 in the first embodiment and is not described again here.
Example six
This embodiment of the invention provides a mixing method of an audio and video conference system which differs from the first to fifth embodiments in that the number of audio data packet paths of a second meeting place is the same as that of the first meeting place.
The step of mixing the multiple audio data packets of the first meeting place respectively with the audio data packets of the plurality of second meeting places comprises: mixing each audio data packet path of the second meeting place with one audio data packet path of the first meeting place.
As shown in fig. 1, assume that meeting place B, as a second meeting place, has 3 microphones that collect 3 audio data packet paths B1, B2 and B3 (the same number of paths as the first meeting place). Each audio data packet path of the first meeting place A is then mixed with the corresponding audio data packet path of the second meeting place B: A1 with B1, A2 with B2, and A3 with B3. That is, each audio path of meeting place B, like each audio path of the first meeting place A, is mixed by its own mixer.
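The path-by-path pairing of this embodiment can be sketched in a few lines. The helper names `mix_pair` and `mix_by_path`, the 16-bit PCM sample format and the additive mixing with clipping are assumptions for illustration; the text specifies only which paths are mixed together.

```python
from typing import List, Sequence

INT16_MIN, INT16_MAX = -32768, 32767


def mix_pair(a: Sequence[int], b: Sequence[int]) -> List[int]:
    """Add two equal-length 16-bit PCM sample sequences, clipping to int16."""
    return [max(INT16_MIN, min(INT16_MAX, x + y)) for x, y in zip(a, b)]


def mix_by_path(first_place: Sequence[Sequence[int]],
                second_place: Sequence[Sequence[int]]) -> List[List[int]]:
    """Mix path i of the second meeting place with path i of the first."""
    if len(first_place) != len(second_place):
        raise ValueError("both meeting places must supply the same number of paths")
    return [mix_pair(a, b) for a, b in zip(first_place, second_place)]


a_paths = [[10, 20], [30, 40], [50, 60]]   # A1, A2, A3 (made-up samples)
b_paths = [[1, 1], [2, 2], [3, 3]]         # B1, B2, B3 (made-up samples)
mixed_paths = mix_by_path(a_paths, b_paths)
```

Keeping the paths separate through the mix is what lets the spatial layout of meeting place B, as well as that of meeting place A, survive to the loudspeakers of the third meeting place.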
Example seven
Fig. 7 is a schematic diagram illustrating a mixing apparatus of an audio-video conference system according to an embodiment of the present invention. The mixing device of the audio and video conference system is suitable for the mixing server shown in figure 1. According to fig. 7, the apparatus includes a receiving unit 10, a first mixing unit 20, and a transmitting unit 30.
The receiving unit 10 is configured to receive multiple audio data packets of a first meeting place and audio data packets of a plurality of second meeting places. The multiple audio data packets of the first meeting place are collected by a plurality of corresponding microphones located at different positions within the first meeting place.
The first mixing unit 20 is configured to mix the multiple audio data packets of the first meeting place respectively with the audio data packets of the plurality of second meeting places.
The sending unit 30 is configured to send each mixed data packet obtained by the mixing operation to the third meeting place. The third meeting place comprises a plurality of loudspeakers, each used to play one of the mixed paths. The directions of the plurality of loudspeakers in the third meeting place correspond one to one to the directions of the plurality of microphones in the first meeting place.
The apparatus further comprises a first synchronization unit 40 and/or a second synchronization unit 50.
The first synchronization unit 40 is configured to synchronize the time of the multiple audio data packets of the first meeting place before the multiple audio data packets of the first meeting place are mixed respectively with the audio data packets of the plurality of second meeting places.
The second synchronization unit 50 is configured to synchronize the time of each mixed data packet obtained by the mixing operation before each mixed data packet is sent to the third meeting place.
In this mixing device, multiple audio data packets are collected by a plurality of corresponding microphones located at different positions in the first meeting place. The mixing server receives the multiple audio data packets from the first meeting place and the audio data packets from the plurality of second meeting places, mixes the multiple audio data packets of the first meeting place respectively with the audio data packets of the plurality of second meeting places, and finally sends each mixed data packet obtained by the mixing operation to the third meeting place, where it is played by the loudspeakers that correspond one to one to the microphones of the first meeting place. The time of the multiple audio data packets of the first meeting place is synchronized before the mixing operation, and/or the time of the mixed data packets is synchronized before they are sent to the third meeting place. As a result, sounds collected at the same moment in the first meeting place are played in the third meeting place with a smaller time difference, so that playback reflects the differences in the times at which the microphones in the first meeting place collected the same speech content. A user in the third meeting place can therefore judge the position of the sound from these collection time differences and locate the speaker by ear, as if present in meeting place A.
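The unit structure of this embodiment, including the "and/or" relationship between the two synchronization units, can be sketched as a composition of callables. The class name `MixingDevice`, its field names, the `run_once` method and the `traced` helper are hypothetical; the units are reduced to plain functions over lists of packet names so that only the ordering is shown.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

Packets = List[str]
Stage = Callable[[Packets], Packets]


@dataclass
class MixingDevice:
    receive: Callable[[], Packets]            # receiving unit 10
    mix: Stage                                # first mixing unit 20
    send: Callable[[Packets], None]           # sending unit 30
    sync_before_mix: Optional[Stage] = None   # first synchronization unit 40
    sync_after_mix: Optional[Stage] = None    # second synchronization unit 50

    def __post_init__(self) -> None:
        # "and/or": at least one of the two synchronization units is present.
        if self.sync_before_mix is None and self.sync_after_mix is None:
            raise ValueError("at least one synchronization unit is required")

    def run_once(self) -> None:
        packets = self.receive()
        if self.sync_before_mix is not None:
            packets = self.sync_before_mix(packets)
        packets = self.mix(packets)
        if self.sync_after_mix is not None:
            packets = self.sync_after_mix(packets)
        self.send(packets)


log: List[str] = []
sent: List[Packets] = []


def traced(name: str, stage: Stage) -> Stage:
    """Wrap a stage so that each call is recorded in `log`."""
    def wrapper(packets: Packets) -> Packets:
        log.append(name)
        return stage(packets)
    return wrapper


device = MixingDevice(
    receive=lambda: ["A1", "A2", "A3", "B1", "C1", "D1"],
    mix=traced("mix", lambda p: ["H1", "H2", "H3"]),
    send=sent.append,
    sync_before_mix=traced("sync40", lambda p: p),
    sync_after_mix=traced("sync50", lambda p: p),
)
device.run_once()
```

Passing only one of the two synchronization stages gives the arrangements of the first and second method embodiments respectively, and passing both gives the fifth.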
Example eight
Fig. 8 is a schematic diagram of a mixing device of another audio and video conference system according to an embodiment of the present invention. The mixing device is suitable for the mixing server shown in fig. 1 and differs from the seventh embodiment in that each audio data packet of the first meeting place carries a first source identifier and a collection timestamp. The first source identifier identifies the audio data packet as originating from the first meeting place. The collection timestamp identifies the time at which the audio data packet was collected. The first synchronization unit 40 includes a first obtaining subunit 41 and a first judging subunit 42.
The first obtaining subunit 41 is configured to obtain the collection timestamp of each audio data packet carrying the first source identifier.
The first judging subunit 42 is configured to judge whether the number of audio data packets that carry the first source identifier and have the same predetermined collection timestamp reaches a predetermined number. The predetermined number is the number of microphones in the first meeting place.
When the predetermined number is reached, the time synchronization of the multiple audio data packets of the first meeting place for the moment corresponding to the predetermined collection timestamp is complete.
Correspondingly, when the multiple audio data packets of the first meeting place are mixed with the audio data packets of the plurality of second meeting places, the multiple audio data packets of the first meeting place are the predetermined number of audio data packet paths that carry the first source identifier and share the same collection timestamp.
As an optional implementation of this embodiment, each mixed data packet obtained by the mixing operation carries a mixing timestamp, which is identical to the collection timestamp carried by the first-meeting-place audio data packet from which that mixed data packet was produced. The second synchronization unit 50 includes a second obtaining subunit 51 and a second judging subunit 52.
The second obtaining subunit 51 is configured to obtain the mixing timestamp carried by each mixed data packet.
The second judging subunit 52 is configured to judge whether the number of mixed data packet paths with the same mixing timestamp among the obtained mixed data packets reaches a predetermined number. The predetermined number is the number of microphones in the first meeting place.
When the predetermined number is reached, the mixed data packets obtained by the mixing operation are sent to the third meeting place.
Correspondingly, the mixed data packets sent to the third meeting place are the predetermined number of mixed data packet paths that share the same mixing timestamp.
As an optional implementation of this embodiment, the number of audio data packet paths of the second meeting place is the same as the number of audio data packet paths of the first meeting place. The device further includes a second mixing unit 60, configured to mix each audio data packet path of the second meeting place with one audio data packet path of the first meeting place.
Example nine
An embodiment of the present invention provides an audio and video conference system which, as shown in fig. 1, includes a plurality of microphones in a first meeting place, a microphone in a second meeting place, a plurality of loudspeakers in a third meeting place, and a mixing server. The number of loudspeakers in the third meeting place is the same as the number of microphones in the first meeting place, and the loudspeakers of the third meeting place correspond one to one to the microphones of the first meeting place.
The audio and video conference system is used to execute the mixing method of the audio and video conference system of any one of the first to sixth embodiments.
As an optional implementation manner of this embodiment, the mixing server includes a plurality of mixers, and the number of the mixers is not less than the number of the microphones in the first meeting place.
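The counting constraints of this embodiment lend themselves to a simple consistency check. The sketch below is a hypothetical illustration: the `ConferenceLayout` class, its field names and the wording of the messages are not from the patent, and the mixer constraint is the optional one described in the preceding paragraph.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ConferenceLayout:
    first_place_microphones: int
    third_place_speakers: int
    mixers: int

    def problems(self) -> List[str]:
        """Return the violated constraints; an empty list means the layout is consistent."""
        issues = []
        if self.first_place_microphones < 2:
            issues.append("first meeting place needs a plurality of microphones")
        if self.third_place_speakers != self.first_place_microphones:
            issues.append("loudspeakers must match microphones one to one")
        if self.mixers < self.first_place_microphones:
            issues.append("fewer mixers than first-meeting-place microphones")
        return issues


ok = ConferenceLayout(first_place_microphones=3, third_place_speakers=3, mixers=3)
bad = ConferenceLayout(first_place_microphones=3, third_place_speakers=2, mixers=1)
```

Here `ok.problems()` is empty, while `bad.problems()` reports both the loudspeaker mismatch and the shortage of mixers.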
Although the embodiments of the present invention have been described in conjunction with the accompanying drawings, those skilled in the art may make various modifications and variations without departing from the spirit and scope of the invention, and such modifications and variations fall within the scope defined by the appended claims.

Claims (10)

1. A sound mixing method of an audio and video conference system is characterized by comprising the following steps:
receiving a plurality of paths of audio data packets of a first meeting place and audio data packets of a plurality of second meeting places; the multi-channel audio data packets of the first meeting place are collected by a plurality of corresponding microphones positioned at different positions in the first meeting place;
performing sound mixing operation on the multi-channel audio data packets of the first meeting place and the audio data packets of the plurality of second meeting places respectively;
respectively sending each path of mixed sound data packet obtained by the mixed sound operation to a third meeting place; the third meeting place comprises a plurality of loudspeakers and is used for respectively playing each mixed sound; the plurality of loudspeakers correspond to the plurality of microphones one by one;
the method further comprises the following steps: before the step of mixing the multiple audio data packets of the first meeting place with the audio data packets of the plurality of second meeting places respectively, synchronizing the time of the multiple audio data packets of the first meeting place so that a user at the third meeting place can distinguish the position of the sound according to the sound played by the loudspeaker, wherein the position of the sound corresponds to the position of the microphone in the first meeting place; and/or,
before the step of sending each path of sound mixing data packet obtained by the sound mixing operation to a third meeting place respectively, synchronizing the time of each path of sound mixing data packet obtained by the sound mixing operation so that a user at the third meeting place can distinguish the position of sound according to the sound played by a loudspeaker, wherein the position of sound corresponds to the position of a microphone in the first meeting place.
2. The audio mixing method of an audio/video conference system according to claim 1, wherein each audio data packet of the first meeting place carries a first source identifier and a collection timestamp; the first source identification is used for identifying that the audio data packet originates from the first meeting place; the acquisition time stamp is used for identifying the time when each path of audio data packet is acquired; the step of synchronizing the time of the multiple audio data packets of the first meeting place comprises:
acquiring a collection time stamp of each path of audio data packet carrying the first source identifier;
judging whether the number of the audio data packets with the same preset acquisition time stamp in each audio data packet carrying the first source identifier reaches a preset number or not; the preset number is the number of the microphones in the first meeting place;
when the preset number is reached, the time synchronization operation of the multi-channel audio data packets of the first meeting place at the moment corresponding to the preset acquisition timestamp is completed;
correspondingly, in the step of performing the audio mixing operation on the multiple audio data packets of the first meeting place and the multiple audio data packets of the multiple second meeting places, the multiple audio data packets of the first meeting place are the audio data packets of the predetermined number of paths which carry the first source identifier and have the same collection timestamp.
3. The audio mixing method of an audio and video conference system according to claim 1 or 2, wherein each path of audio mixing data packet obtained by the audio mixing operation carries an audio mixing timestamp, and the audio mixing timestamp is consistent with an acquisition timestamp carried by one path of audio data packet of the first meeting place corresponding to each path of audio mixing data packet;
the step of synchronizing the time of each mixed data packet obtained by the mixing operation comprises:
acquiring a sound mixing timestamp carried by each sound mixing data packet;
judging whether the number of the sound mixing paths with the same sound mixing time stamps in the obtained sound mixing data packets reaches a preset number or not; the preset number is the number of the microphones in the first meeting place;
when the preset number is reached, executing the step of respectively sending each path of mixed sound data packet obtained by the mixed sound operation to a third meeting place;
correspondingly, in the step of sending each path of mixed sound data packet obtained by the mixed sound operation to a third meeting place, each path of mixed sound data packet obtained by the mixed sound operation is the mixed sound data packet with the same mixed sound time stamp and the predetermined number of paths.
4. The audio mixing method of an audio/video conference system according to claim 1, wherein the number of the audio data packets in the second conference room is the same as the number of the audio data packets in the first conference room;
the step of mixing the multiple audio data packets of the first conference room with the audio data packets of the multiple second conference rooms respectively comprises: and performing sound mixing operation on each path of audio data packet in the second meeting place and one path of audio data packet in the first meeting place respectively.
5. An audio mixing apparatus of an audio-video conference system, comprising:
the receiving unit is used for receiving the multi-channel audio data packets of the first meeting place and the audio data packets of the second meeting places; the multi-channel audio data packets of the first meeting place are collected by a plurality of corresponding microphones positioned at different positions in the first meeting place;
the first audio mixing unit is used for mixing the multi-channel audio data packets of the first meeting place with the audio data packets of the plurality of second meeting places respectively;
a sending unit, configured to send each audio mixing data packet obtained by the audio mixing operation to a third meeting place respectively; the third meeting place comprises a plurality of loudspeakers and is used for respectively playing each mixed sound; the plurality of loudspeakers correspond to the plurality of microphones one by one;
the device further comprises: a first synchronization unit, configured to synchronize the time of the multiple audio data packets of the first meeting place before the step of mixing the multiple audio data packets of the first meeting place with the audio data packets of the plurality of second meeting places, so that a user at the third meeting place can distinguish the position of the sound according to the sound played by the loudspeaker, wherein the position of the sound corresponds to the position of the microphone in the first meeting place; and/or,
and the second synchronization unit is used for synchronizing the time of each path of sound mixing data packet obtained by the sound mixing operation before the step of respectively sending each path of sound mixing data packet obtained by the sound mixing operation to the third meeting place, so that the user at the third meeting place can distinguish the position of the sound according to the sound played by the loudspeaker, and the position of the sound corresponds to the position of the microphone in the first meeting place.
6. The audio mixing apparatus of an audio and video conference system according to claim 5, wherein each path of audio data packets of the first meeting place carries a first source identifier and an acquisition timestamp; the first source identifier is used for identifying that the audio data packet originates from the first meeting place; the acquisition timestamp is used for identifying the time at which each path of audio data packets was acquired; the first synchronization unit comprises:
a first obtaining subunit, configured to obtain the acquisition timestamp of each path of audio data packets carrying the first source identifier;
a first judging subunit, configured to judge whether, among the paths of audio data packets carrying the first source identifier, the number of audio data packets having the same preset acquisition timestamp reaches a preset number; the preset number is the number of the microphones in the first meeting place;
when the preset number is reached, the time synchronization operation on the multi-channel audio data packets of the first meeting place at the moment corresponding to the preset acquisition timestamp is complete;
correspondingly, in the step of mixing the multi-channel audio data packets of the first meeting place with the audio data packets of the plurality of second meeting places respectively, the multi-channel audio data packets of the first meeting place are the preset number of paths of audio data packets that carry the first source identifier and have the same acquisition timestamp.
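The timestamp check performed by the first synchronization unit might be sketched as below: packets carrying the first source identifier are buffered per acquisition timestamp, and a group is released for mixing only once it holds one packet per microphone. The `AudioPacket` fields, the `channel` index, and the `FirstVenueSynchronizer` class are hypothetical names introduced for illustration; the claims do not define a packet layout. Keying each group by channel, so that a retransmitted packet is not counted twice, is likewise an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class AudioPacket:
    source_id: str   # hypothetical field holding the source identifier
    channel: int     # hypothetical index of the capturing microphone
    timestamp: int   # acquisition timestamp
    payload: bytes


class FirstVenueSynchronizer:
    """Releases first-venue packets only when every microphone path has arrived."""

    def __init__(self, first_source_id: str, mic_count: int) -> None:
        self.first_source_id = first_source_id
        self.mic_count = mic_count  # the preset number from the claim
        # acquisition timestamp -> {channel -> packet}
        self._pending: Dict[int, Dict[int, AudioPacket]] = defaultdict(dict)

    def push(self, packet: AudioPacket) -> Optional[List[AudioPacket]]:
        """Buffer one packet; return the complete group for its timestamp, if any."""
        if packet.source_id != self.first_source_id:
            return None  # not from the first meeting place
        group = self._pending[packet.timestamp]
        group[packet.channel] = packet
        if len(group) < self.mic_count:
            return None  # preset number not yet reached
        del self._pending[packet.timestamp]
        return [group[ch] for ch in sorted(group)]
```

A real mixer would also need a policy for groups that never complete (for example a timeout), which the claims leave open and this sketch omits.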
7. The audio mixing apparatus of an audio and video conference system according to claim 5 or 6, wherein each path of audio mixing data packets obtained by the audio mixing operation carries an audio mixing timestamp, and the audio mixing timestamp is consistent with the acquisition timestamp carried by the path of audio data packets of the first meeting place that corresponds to that path of audio mixing data packets;
the second synchronization unit comprises:
a second obtaining subunit, configured to obtain the audio mixing timestamp carried by each path of audio mixing data packets;
a second judging subunit, configured to judge whether, among the obtained audio mixing data packets, the number of paths having the same audio mixing timestamp reaches a preset number; the preset number is the number of the microphones in the first meeting place;
when the preset number is reached, the step of sending each path of audio mixing data packets obtained by the audio mixing operation to the third meeting place respectively is executed;
correspondingly, in the step of sending each path of audio mixing data packets obtained by the audio mixing operation to the third meeting place respectively, the audio mixing data packets sent are the preset number of paths of audio mixing data packets having the same audio mixing timestamp.
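The send-side check of the second synchronization unit can be illustrated with a small stateless function: given the mixed packets produced so far, it returns only the groups whose audio mixing timestamp is shared by as many paths as there are microphones in the first meeting place. `MixedPacket`, its fields, and `ready_to_send` are hypothetical names, and counting distinct channels rather than raw packets is an assumption of this sketch.

```python
from typing import Dict, List, NamedTuple


class MixedPacket(NamedTuple):
    channel: int        # hypothetical index of the first-venue path this mix belongs to
    mix_timestamp: int  # equal to the acquisition timestamp of that first-venue packet
    payload: bytes


def ready_to_send(mixed: List[MixedPacket], mic_count: int) -> List[List[MixedPacket]]:
    """Return complete groups of mixed packets, ordered by timestamp, then channel."""
    by_timestamp: Dict[int, List[MixedPacket]] = {}
    for packet in mixed:
        by_timestamp.setdefault(packet.mix_timestamp, []).append(packet)

    complete = []
    for _, group in sorted(by_timestamp.items()):
        # A group is sendable once every path has produced a mix for this timestamp.
        if len({p.channel for p in group}) == mic_count:
            complete.append(sorted(group, key=lambda p: p.channel))
    return complete
```

Sending the paths of one timestamp together keeps the loudspeakers of the third meeting place playing sound captured at the same instant.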
8. The audio mixing apparatus of an audio and video conference system according to claim 5, wherein the number of the audio data packets of the second meeting place is the same as the number of the audio data packets of the first meeting place;
the device further comprises: a second audio mixing unit, configured to mix each path of audio data packets of the second meeting place with one path of audio data packets of the first meeting place.
9. An audio-video conferencing system, comprising:
a plurality of microphones in a first venue;
a microphone of the second venue;
a plurality of loudspeakers in a third meeting place, the number of the loudspeakers in the third meeting place being the same as the number of the microphones in the first meeting place, and the loudspeakers in the third meeting place corresponding to the microphones in the first meeting place one to one;
a mixing server for executing the mixing method of the audio-video conference system according to any one of claims 1 to 4.
10. The audio-video conference system according to claim 9, wherein the mixing server comprises a plurality of mixers, and the number of the mixers is not less than the number of the microphones in the first meeting place.
CN201710243769.XA 2017-04-14 2017-04-14 Audio mixing method, device and system of audio and video conference system Active CN107195308B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710243769.XA CN107195308B (en) 2017-04-14 2017-04-14 Audio mixing method, device and system of audio and video conference system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710243769.XA CN107195308B (en) 2017-04-14 2017-04-14 Audio mixing method, device and system of audio and video conference system

Publications (2)

Publication Number Publication Date
CN107195308A CN107195308A (en) 2017-09-22
CN107195308B true CN107195308B (en) 2021-03-16

Family

ID=59871181

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710243769.XA Active CN107195308B (en) 2017-04-14 2017-04-14 Audio mixing method, device and system of audio and video conference system

Country Status (1)

Country Link
CN (1) CN107195308B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110708432B (en) * 2019-10-12 2021-01-12 浙江大华技术股份有限公司 Method, system, device and storage medium for audio output in audio conference
CN112435649A (en) * 2020-11-09 2021-03-02 合肥名阳信息技术有限公司 Multi-user dubbing sound effect mixing method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102270456A (en) * 2010-06-07 2011-12-07 华为终端有限公司 Method and device for audio signal mixing processing
CN102655584A (en) * 2011-03-04 2012-09-05 中兴通讯股份有限公司 Media data transmitting and playing method and system in tele-presence technology
CN103248774A (en) * 2012-02-13 2013-08-14 陈剑勇 VoIP server synchronous sound mixing method and system
US9519306B2 (en) * 2013-06-12 2016-12-13 Fuji Electric Co., Ltd. Distribution device, distribution system, and distribution method

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101001485A (en) * 2006-10-23 2007-07-18 中国传媒大学 Finite sound source multi-channel sound field system and sound field analogy method
CN101731011B (en) * 2007-05-11 2014-05-28 奥迪耐特有限公司 Systems, methods and computer-readable media for configuring receiver latency
JP5012387B2 (en) * 2007-10-05 2012-08-29 ヤマハ株式会社 Speech processing system
CN101282386B (en) * 2008-05-22 2010-11-10 中山大学 Method for forwarding synchronous mixed audio of VOIP server terminal
CN102364952B (en) * 2011-10-25 2013-12-25 浙江万朋网络技术有限公司 Method for processing audio and video synchronization in simultaneous playing of plurality of paths of audio and video
CN103634561A (en) * 2012-08-21 2014-03-12 徐丙川 Conference communication device and system
CN103310776B (en) * 2013-05-29 2015-12-09 亿览在线网络技术(北京)有限公司 A kind of method and apparatus of real-time sound mixing

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
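Title
"Conference mixer system based on VOIP"; Y Zhou; IET International Conference on Smart and Sustainable City 2013 (ICSSC 2013); 20140213; entire document *
"Research and Implementation of Audio Synthesis Technology in a Virtual Space Conference System"; 贺宝权; 小型微型计算机系统; 20000630; Vol. 21 (No. 6); entire document *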

Also Published As

Publication number Publication date
CN107195308A (en) 2017-09-22

Similar Documents

Publication Publication Date Title
US8208664B2 (en) Audio transmission system and communication conference device
CN102572369B (en) Voice volume prompting method and terminal as well as video communication system
US6850496B1 (en) Virtual conference room for voice conferencing
US8606249B1 (en) Methods and systems for enhancing audio quality during teleconferencing
US9313336B2 (en) Systems and methods for processing audio signals captured using microphones of multiple devices
US20130022189A1 (en) Systems and methods for receiving and processing audio signals captured using multiple devices
US7539486B2 (en) Wireless teleconferencing system
US9025002B2 (en) Method and apparatus for playing audio of attendant at remote end and remote video conference system
KR20090091243A (en) Method and device for data capture for push over cellular
US20110103624A1 (en) Systems and Methods for Providing Directional Audio in a Video Teleconference Meeting
CN111147362B (en) Multi-user instant messaging method, system, device and electronic equipment
CN107195308B (en) Audio mixing method, device and system of audio and video conference system
EP2337328A1 (en) Method, system and apparatus for processing 3d audio signal
WO2013083133A1 (en) System for multimedia broadcasting
CN110662206B (en) Bluetooth-based high-definition music and voice transmission operation method
WO2015078105A1 (en) Method and system for processing audio of synchronous classroom
WO2012055291A1 (en) Method and system for transmitting audio data
EP2207311A1 (en) Voice communication device
EP4085661A1 (en) Audio representation and associated rendering
JP2009246528A (en) Voice communication system with image, voice communication method with image, and program
CN108109630B (en) Audio processing method and device and media server
Hardman et al. Enhanced reality audio in interactive networked environments
KR20170013860A (en) Object-based teleconferencing protocol
JP2010288114A (en) Telephone conference device, and telephone conference system using the same
JP2005341202A (en) Portable terminal unit, program and method for switching communication, and television conference system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information
Inventor after: Xiao Jihua, Fan Chao, Gu Zhenhua, Zhou Jinlong
Inventor before: Xiao Jihua, Gu Zhenhua, Zhou Jinlong


GR01 Patent grant